
Parsing directly ArrayBuffer #347

Open
lpatiny opened this issue May 25, 2021 · 6 comments
Labels
Feature-Request New features suggested by users

Comments

@lpatiny commented May 25, 2021

In the project https://github.com/cheminfo/mzData we are using fast-xml-parser to parse scientific data (mass spectra).

These data can be quite large, and parsing works perfectly even with files of 400 MB.

However, we may have files of 1 GB or more, and JavaScript in Chrome currently limits the length of a single string to about 512 MB.

I wonder if it would be possible to accept an ArrayBuffer directly, and not only a string. The current code works almost exclusively on the array of characters, so most of it could be made compatible with an ArrayBuffer, but it would need to deal with multi-byte characters.
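
To illustrate the multi-byte issue, here is a minimal sketch (not fast-xml-parser code) of how a browser could read such a file as raw bytes and decode it chunk by chunk with TextDecoder, so that no single JavaScript string ever has to hold the whole document; the chunk size is arbitrary:

```js
// Minimal sketch, assuming a browser File/Blob as input.
// TextDecoder with { stream: true } correctly handles UTF-8 sequences
// that are split across chunk boundaries.
async function* decodeChunks(file, chunkSize = 16 * 1024 * 1024) {
  const decoder = new TextDecoder('utf-8');
  for (let offset = 0; offset < file.size; offset += chunkSize) {
    const buffer = await file.slice(offset, offset + chunkSize).arrayBuffer();
    yield decoder.decode(buffer, { stream: true });
  }
  yield decoder.decode(); // flush any buffered trailing bytes
}

// Usage: for await (const text of decodeChunks(file)) { /* feed the parser */ }
```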

@github-actions

I'm glad you find this repository helpful. I'll try to address your issue ASAP. You can watch the repo for new changes or star it.

@amitguptagwl (Member)

Such big data might not be a good fit for a web application. And if this is being used on the backend, I'm not sure this library is really the right choice for your project. I believe a library that works on streams would be a better option; an ArrayBuffer might not help completely.

@lpatiny (Author) commented Jun 25, 2021

Big data works pretty well in the browser for us. We process 1.5 GB TIFF images (electron microscopy) in JavaScript in the browser without problems.

Indeed, some libraries work on streams, but this one is faster, which is why I was interested in this improvement.

@amitguptagwl (Member)

Okay. To make it work well with big data, we will have to process streams. That is achievable, but it complicates the code and impacts overall performance. I'm tagging this as a feature request.
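
As an illustration of the stream-based approach mentioned above (not fast-xml-parser code), here is a rough sketch using the sax package in Node.js; the file name and the spectrum tag name are placeholders:

```js
// Rough sketch of stream-based XML processing with the sax package:
// the file is piped through the parser chunk by chunk, so the whole
// document never has to be held in memory as a single string.
const fs = require('fs');
const sax = require('sax');

const saxStream = sax.createStream(true); // strict mode

saxStream.on('opentag', (node) => {
  // React to each element as it is encountered; 'spectrum' is a placeholder.
  if (node.name === 'spectrum') {
    // accumulate or process the current spectrum here
  }
});

saxStream.on('error', (error) => {
  console.error(error);
});

fs.createReadStream('data.mzdata.xml').pipe(saxStream);
```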

amitguptagwl added the Feature-Request label on Jun 26, 2021
@lpatiny (Author) commented Sep 1, 2021

We adapted the code to suit our needs so that it parses a large ArrayBuffer or Uint8Array directly.

We had to change many things so that we could also decode a base64-encoded value (received as a typed array) into a Float64Array (we still have a little work left on this).
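
As an aside, here is a minimal sketch of that conversion in the browser, starting from a base64 string for simplicity (in our case the base64 text itself arrives as a typed array, so an extra decoding step is needed); this is not necessarily how arraybuffer-xml-parser implements it:

```js
// Minimal sketch: decode a base64 string into the Float64Array it encodes.
// Assumes the decoded byte length is a multiple of 8 (one float64 = 8 bytes).
function base64ToFloat64Array(base64) {
  const binary = atob(base64);               // base64 -> binary string
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);         // binary string -> raw bytes
  }
  // bytes.buffer starts at offset 0, so the 8-byte alignment required by
  // Float64Array is satisfied.
  return new Float64Array(bytes.buffer);
}
```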

Anyway, the new parser works, and on my Mac mini M1 I can parse a 1 GB file in 4.5 s, which is reasonable.

https://www.npmjs.com/package/arraybuffer-xml-parser
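
For context, a hypothetical usage sketch of that package, assuming it exports a parse function that accepts an ArrayBuffer or Uint8Array (the URL and file name below are placeholders; see the package README for the actual API and options):

```js
import { parse } from 'arraybuffer-xml-parser';

// Hypothetical example: fetch the raw XML bytes and hand them to the parser
// without ever building one huge JavaScript string.
const response = await fetch('./spectrum.mzdata.xml');
const arrayBuffer = await response.arrayBuffer();
const object = parse(new Uint8Array(arrayBuffer));
console.log(object);
```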

@amitguptagwl As far as I'm concerned, you may close this issue.

@amitguptagwl (Member)

That's nice to hear. I'm keeping this issue open so the feature can be incorporated in a future release.
