RangeError - when adding just over 2,400 files #343
I am getting a RangeError when I add more than 2,400 files. The total size of the final zip file would be around 1.3GB. It works fine for anything less than that.
To generate a 1.3GB zip file with
The first one could be mitigated by fixing #290. The second one could be the easiest improvement. The third one is needed by

Could you describe your use case?
Thanks for the info. Basically we are writing a medical app that downloads patient CT, ultrasound, X-ray, MRI etc. We have written all our back-end code in AWS Lambda, and Lambda has a limitation of 500MB of disk space, so I am unable to compress the files there and serve them. Our only other option is to have a server running, which I really don't want to do after all the work we have done to go serverless.
Do you have a constraint on the browser? Using StreamSaver:

```js
var writeStream = streamSaver.createWriteStream('output.zip');

zip
  .generateInternalStream({ type: 'uint8array' })
  .on('data', function (data, metadata) {
    writeStream.write(data);
  })
  .on('error', function (e) {
    writeStream.abort(e);
  })
  .on('end', function () {
    writeStream.close();
  })
  .resume();
```

Without streams, I don't see any magical solution. I'm modifying the
I have tried it with the latest Chrome and I keep getting "WritableStream is not defined"; even using the examples of StreamSaver doesn't work. I think our only other option is to zip part by part, e.g. every 1,000 images we create one zip file and give the user part1.zip, part2.zip etc. It is an inconvenience to the user but I don't see any other way at this stage.
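For what it's worth, a minimal sketch of that part-by-part fallback, assuming JSZip plus FileSaver's `saveAs` are available and `urls` is the list of file URLs (all other names here are illustrative):

```js
var BATCH = 1000;
var batches = [];
for (var i = 0; i < urls.length; i += BATCH) {
  batches.push(urls.slice(i, i + BATCH));
}

// Process one batch at a time so only ~one batch sits in memory.
batches.reduce(function (chain, batch, idx) {
  return chain.then(function () {
    var part = new JSZip();
    return Promise.all(batch.map(function (url) {
      return fetch(url)
        .then(function (res) { return res.arrayBuffer(); })
        .then(function (buf) { part.file(url.split('/').pop(), buf); });
    }))
      .then(function () { return part.generateAsync({ type: 'blob' }); })
      .then(function (blob) { saveAs(blob, 'part' + (idx + 1) + '.zip'); });
  });
}, Promise.resolve());
```

Each batch still has to fit in memory, which is what caps the batch size.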
@hashitha you need to include the web-streams-polyfill for it to work
Thanks @jimmywarting, I tried that, now getting `StreamSaver.js:122 Something went wrong! TypeError: writeStream.write is not a function`. After reading about mitm I am now using 127.0.0.1 to test, still not working. This is the code I am using:
Hmm, they have changed the spec since I wrote this... going to work on updating the docs. Looks like you now have to call `getWriter()`. So something along this:

```js
var writeStream = streamSaver.createWriteStream('output.zip').getWriter();
```
Updated the readme/example/dependencies with how you should deal with the new spec:

```js
var writeStream = streamSaver.createWriteStream('output.zip').getWriter();
```
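Putting that together with the earlier snippet, a sketch of how the whole example would look under the new writer-based spec (untested; `write`/`abort`/`close` follow the whatwg writer API):

```js
// Same generateInternalStream pipeline, adapted to the
// getWriter()-based StreamSaver API.
var writer = streamSaver.createWriteStream('output.zip').getWriter();

zip
  .generateInternalStream({ type: 'uint8array' })
  .on('data', function (data) {
    writer.write(data);
  })
  .on('error', function (e) {
    writer.abort(e);
  })
  .on('end', function () {
    writer.close();
  })
  .resume();
```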
I have tried it and it seems to work, except it's very unstable. The browser tab crashes at least 50% of the time.
Hmm, need to try it myself to figure out what is wrong. Do you have some sample code/files I can try out?
This is a snippet similar to what I use; the only difference is that you need to specify the urls. I had to remove the links because they were patient files. You can add around 5 links and run it.
This is what I played around with: https://jsfiddle.net/Lxwvco3u/
When I added node stream + node buffer into the fiddle I managed to get rid of most of the memory usage; I was able to start saving the zip file right away and could see the progress as it downloaded: https://jsfiddle.net/Lxwvco3u/1/

It would truly help if jszip supported web streams (or at least the

The key feature here is that you want to fetch the next binary file when jszip is ready to accept more data; the only way I was able to do that was with

Then the only thing you are going to keep in memory is the current file you are compressing.
Your fiddle code works, but I can't seem to get it working with my urls. I am no javascript expert, so I'm most probably doing something wrong; this is the code:

With the commented code it creates a zip file with empty files inside, and with the other it creates a zip with no files inside. Also I am guessing that in this "for loop" it downloads one url at a time and then adds it to the zip, whereas the code I posted earlier using JSZipUtils.getBinaryContent downloads the urls asynchronously and adds them to the zip. The only issue here is that if it downloads 1 file at a time, it would take a very long time to download 2,500 files of around 300kb each. I am saying this because over the last couple of days I wrote an Amazon Elastic Beanstalk Worker Tier application to zip the files and upload them to an S3 bucket, which worked, but it took around 15 mins; when I changed the code to use a thread pool it finished in around 3 mins. I am still hoping to use this library so the user can start downloading straight away and not wait for a Beanstalk app to finish processing.
jszip don't support web streams yet... so you can't do this:

```js
fetch(url).then(res => {
  // Same as above but another way to get a readable stream
  zip.file('dummyfile.txt', res.body)
})
```

This was only a proposal I suggested for jszip to implement.
You can not call `generateInternalStream` before you have added all the entries. That is why I added a node stream that can act as a pointer to all the entries, so all the header/metainfo gets created first.
Yea, I am only calling it when everything is added. Not sure if that's the correct way of doing it but it works:
Seems like you need some temporary client-side storage to cache everything first.

```js
let rs = streamBrowserify.Readable()

// Start downloading right away;
// the browser will make 6 concurrent downloads and the rest will be pending
let p = fetch(url).then(res => res.body.getReader())

rs._read = () => {
  p.then(reader => reader.read().then(({ value, done }) =>
    rs.push(done ? null : new Buffer(value))))
}
```

I'm uncertain if it will have a negative or positive effect on the download.
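A sketch of how that snippet could be wrapped up and handed to jszip. jszip v3 accepts a node readable stream as file content, so nothing gets buffered until the entry is actually pulled (the helper name and the `urls` array are illustrative):

```js
// Hypothetical helper: one lazy node stream per remote file.
function lazyFetchStream (url) {
  let rs = streamBrowserify.Readable()
  let p = fetch(url).then(res => res.body.getReader())
  rs._read = () => {
    p.then(reader => reader.read().then(({ value, done }) =>
      rs.push(done ? null : new Buffer(value))))
  }
  return rs
}

// Add every entry first, then do the single generate pass.
urls.forEach(url => zip.file(url.split('/').pop(), lazyFetchStream(url)))
```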
Can I try this code now, or do I have to wait till jszip supports web streams?
My version with node-stream + node-buffer can be used right now. You can try it yourself. I tried that modification on my second fiddle and I didn't like what I saw: jszip didn't read the files in the same order I added the ajax calls, but they kept on downloading (so they get cached somewhere on the disk, I suppose). The effect was that the saving part didn't happen right away, since it didn't get the first piece first, and the progress was jumping from 87 MB downloaded to 122 MB downloaded, to 155 MB downloaded. It would be great if you somehow knew what order jszip would read the files in, so you could make the ajax calls in the same order.
Are you able to provide me with a snippet for node-stream + node-buffer, maybe to download and zip these 2 urls?
Sure: https://jsfiddle.net/Lxwvco3u/2/ The highlighted code is what turns a web stream into a node stream. Also added a CORS proxy since those files didn't have CORS headers.
Thanks @jimmywarting. Sorry, forgot to add CORS to the test bucket. I tried this code and it downloaded 2,400 files (1.2GB) in about 2 mins, and all files seem to be intact inside the zip file. Just 2 more questions:
If your site is running https then you won't notice the mitm (a hidden iframe will be used instead). mitm.html is required over http in order to send a message channel to a service worker, and a service worker can only be registered if the top domain is secure; that is why the StreamSaver service is actually a part of github.io. The mitm can also be abstracted away altogether, but that means more configuration for the developer to set it up in their own environment, and it requires that the site use https.

Think Firefox is close to implementing streams, so it will come eventually. It's a whatwg standard, so it's not a Chrome-specific thing.
Thanks, it should be okay then as our production site is https. Thanks again for all your help.
Think I would like to borrow this thread and reopen it if possible/necessary.

Together with StreamSaver, generateInternalStream and node streams I have been able to give data to jszip when required and save data when I receive some from jszip, to get a good memory boost. That's great at the beginning and the end, but during the time I seeded a file to jszip I got nothing back. I think it's cuz jszip tries to compress the file first? This works okay for many smaller files. When running this jsfiddle: https://jsfiddle.net/Lxwvco3u/3/

For this purpose compressing doesn't matter: the file is not going to be uploaded/downloaded, or save any few extra bytes on the hard drive. This is about saving multiple files and folders from the browser, since it's a lot easier to save one zip and decompress it rather than getting multiple save dialogs to the same folder. So you want the read/write, compress/decompress to be as fast as possible.

So my question is more like: can you turn off the compression and pipe the chunks directly to StreamSaver, so you get more like this in the console log?

Ideally I would like to maybe save a whole 4GB file + a readme or something.
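For reference, jszip accepts a `compression` option on its generate calls, so a no-compression variant would look roughly like this (a sketch; `writer` is assumed to be the StreamSaver writer from earlier):

```js
// "STORE" writes entries without running DEFLATE,
// so chunks should pass straight through to StreamSaver.
zip
  .generateInternalStream({ type: 'uint8array', compression: 'STORE' })
  .on('data', data => writer.write(data))
  .on('error', e => writer.abort(e))
  .on('end', () => writer.close())
  .resume()
```

If I read the jszip docs right, STORE is already the default unless files were added with their own compression option, so any delay may come from elsewhere.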
On some computers, the download speed goes to 0 kbps after a few seconds, where it looks like it has stopped downloading, but then after a minute or so the file has completed downloading. Maybe what you are suggesting is causing this as well?
@hashitha, that is cuz you are not downloading the remote files in the same order as jszip wants to read them
@jimmywarting is there any way around this? I am using the code snippet you wrote before.
Maybe someone from the developers can tell you what order they will be read in. Or maybe you can try to figure it out yourself by logging the filename in `rs._read`. Then you should try to order the array of files to be downloaded in the same order.
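A sketch of that logging idea: start the fetch lazily on the first `_read`, which both reveals and follows jszip's read order (the helper name is illustrative):

```js
function orderedFetchStream (url) {
  let rs = streamBrowserify.Readable()
  let p = null
  rs._read = () => {
    if (!p) {
      console.log('jszip started reading', url) // shows jszip's read order
      p = fetch(url).then(res => res.body.getReader())
    }
    p.then(reader => reader.read().then(({ value, done }) =>
      rs.push(done ? null : new Buffer(value))))
  }
  return rs
}
```

Deferring the fetch to the first `_read` also means a file only starts downloading when jszip asks for it, at the cost of less parallelism.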
My first fiddle https://jsfiddle.net/Lxwvco3u/1/ only uses 1 ajax call at a time and won't randomly buffer up the memory. But it will also be slower.
Since I found out about
I came across this thread while working on a similar problem. I am downloading from a number of urls; the solution works fine, but generateInternalStream works only after adding all the files to the zip object. Can we download and stream at the same time instead of relying on system memory?
@jimmywarting Thanks for your implementation example in the fiddle. I was able to get it working for my use case in Chrome. Do you know why this wouldn't work in Firefox (since it seems like they implemented the stream API from what I can tell)? I'm getting an error when trying to run the fiddle:
Have kinda left jszip for my own streaming implementation now. That fiddle is also very old.

Better streaming support, less RAM usage, made for downloading/saving. Also helped out to build https://github.com/transcend-io/conflux/ which has support for both reading and creating zip files using whatwg streams.
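For anyone landing here, conflux's readme shows roughly this shape for creating a zip with whatwg streams (a sketch based on that readme; `Writer` is conflux's zip-writing stream and the entry fields follow its documented shape):

```js
import { Writer } from '@transcend-io/conflux'
import streamSaver from 'streamsaver'

const { readable, writable } = new Writer()
const writer = writable.getWriter()

// Pipe the zip output straight to disk as it is produced.
readable.pipeTo(streamSaver.createWriteStream('example.zip'))

writer.write({
  name: '/readme.txt',
  lastModified: new Date(),
  stream: () => new Response('Hello world').body
})

writer.close()
```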
Fantastic, thank you! The example you gave here is more or less what I need to do (breaking up the file list into another zip stream if it will go over 4GB, to get around that limitation).
Hi @jimmywarting, I've used the code from https://github.com/transcend-io/conflux/ to download zip files. Is there any way I can get a progress bar for how many files or how much data has been zipped?
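One generic way that isn't conflux-specific, just whatwg streams: count bytes in a TransformStream placed before the save stream (a sketch, assuming `readable` comes from conflux's `Writer` as above):

```js
let bytesZipped = 0

// Counts every chunk flowing out of the zip stream.
const progress = new TransformStream({
  transform (chunk, controller) {
    bytesZipped += chunk.byteLength
    console.log(`zipped ${bytesZipped} bytes`) // drive a progress bar here
    controller.enqueue(chunk)
  }
})

readable
  .pipeThrough(progress)
  .pipeTo(streamSaver.createWriteStream('example.zip'))
```

Per-file progress could similarly be tracked by counting each `writer.write({ name, ... })` call as it resolves.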