
Parsing directly ArrayBuffer #347

Open
lpatiny opened this issue May 25, 2021 · 6 comments
Labels
Feature-Request New features suggested by users

Comments

@lpatiny commented May 25, 2021

In the project https://github.com/cheminfo/mzData we are using fast-xml-parser to parse scientific data (mass spectra).

These data can be quite large, and parsing works perfectly even with files of 400 MB.

However, we may have files of 1 GB or more, and JavaScript in Chrome currently limits the length of a single string to about 512 MB.

I wonder if it would be possible to accept an ArrayBuffer directly, and not only a string. The current code works almost exclusively on the array of characters, so most of it could be made compatible with an ArrayBuffer, but it would need to deal with multi-byte characters.
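
To illustrate the multi-byte issue, here is a minimal sketch (not fast-xml-parser code) of how a browser could read such a file as raw bytes and decode it chunk by chunk with TextDecoder, so that no single JavaScript string ever has to hold the whole document; the chunk size is arbitrary:

```js
// Minimal sketch, assuming a browser File/Blob as input.
// TextDecoder with { stream: true } correctly handles UTF-8 sequences
// that are split across chunk boundaries.
async function* decodeChunks(file, chunkSize = 16 * 1024 * 1024) {
  const decoder = new TextDecoder('utf-8');
  for (let offset = 0; offset < file.size; offset += chunkSize) {
    const buffer = await file.slice(offset, offset + chunkSize).arrayBuffer();
    yield decoder.decode(buffer, { stream: true });
  }
  yield decoder.decode(); // flush any buffered trailing bytes
}

// Usage: for await (const text of decodeChunks(file)) { /* feed the parser */ }
```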

@github-actions

I'm glad you find this repository helpful. I'll try to address your issue ASAP. You can watch the repo for new changes or star it.

@amitguptagwl (Member)

Such big data might not be a good fit for a web application. And if this is being used on the backend, I'm not sure this library is really the right choice for your project. I believe a library that works on streams would be a better option; an ArrayBuffer might not help completely.

@lpatiny (Author) commented Jun 25, 2021

Big data works pretty well in the browser for us. We process 1.5 GB TIFF images (electron microscopy) in JavaScript in the browser without problems.

Indeed, some libraries work on streams, but this one is faster, which is why I was interested in this improvement.

@amitguptagwl (Member)

Okay. To make it work well with big data, we will have to process streams. That is achievable, but it complicates the code and impacts overall performance. I'm tagging this as a feature request.
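
As an illustration of the stream-based approach mentioned above (not fast-xml-parser code), here is a rough sketch using the sax package in Node.js; the file name and the spectrum tag name are placeholders:

```js
// Rough sketch of stream-based XML processing with the sax package:
// the file is piped through the parser chunk by chunk, so the whole
// document never has to be held in memory as a single string.
const fs = require('fs');
const sax = require('sax');

const saxStream = sax.createStream(true); // strict mode

saxStream.on('opentag', (node) => {
  // React to each element as it is encountered; 'spectrum' is a placeholder.
  if (node.name === 'spectrum') {
    // accumulate or process the current spectrum here
  }
});

saxStream.on('error', (error) => {
  console.error(error);
});

fs.createReadStream('data.mzdata.xml').pipe(saxStream);
```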

amitguptagwl added the Feature-Request label on Jun 26, 2021
@lpatiny (Author) commented Sep 1, 2021

We adapted the code to suit our needs so that it parses a large ArrayBuffer or Uint8Array directly.

We had to change many things so that we could also decode a base64-encoded value (received as a typed array) into a Float64Array (we still have a little work left on this).
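
As an aside, here is a minimal sketch of that conversion in the browser, starting from a base64 string for simplicity (in our case the base64 text itself arrives as a typed array, so an extra decoding step is needed); this is not necessarily how arraybuffer-xml-parser implements it:

```js
// Minimal sketch: decode a base64 string into the Float64Array it encodes.
// Assumes the decoded byte length is a multiple of 8 (one float64 = 8 bytes).
function base64ToFloat64Array(base64) {
  const binary = atob(base64);               // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);         // binary string -> raw bytes
  }
  // bytes.buffer starts at offset 0, so the 8-byte alignment required by
  // Float64Array is satisfied.
  return new Float64Array(bytes.buffer);
}
```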

Anyway, the new parser works, and on my Mac mini M1 I can parse a 1 GB file in 4.5 s, which is reasonable.

https://www.npmjs.com/package/arraybuffer-xml-parser
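
For context, a hypothetical usage sketch of that package, assuming it exports a parse function that accepts an ArrayBuffer or Uint8Array (the URL and file name below are placeholders; see the package README for the actual API and options):

```js
import { parse } from 'arraybuffer-xml-parser';

// Hypothetical example: fetch the raw XML bytes and hand them to the parser
// without ever building one huge JavaScript string.
const response = await fetch('./spectrum.mzdata.xml');
const arrayBuffer = await response.arrayBuffer();
const object = parse(new Uint8Array(arrayBuffer));
console.log(object);
```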

@amitguptagwl As far as I'm concerned, you may close this issue.

@amitguptagwl (Member)

That's nice to hear. I'm keeping this issue open so the feature can be incorporated in a future release.
