what's the normal time for downloading a paper? #41

shizhediao · 2020-04-10T16:27:58Z

Hi,
Thanks for your great work!
I was wondering what the normal download time for a paper is.
I would like to download as many papers as possible for some research, perhaps 10K to 100K papers.
Right now each paper takes about 10 seconds to download. Is it possible to speed this up?
Thanks very much!

lukasschwab · 2020-04-11T05:36:29Z

Hmm, this depends on how you're finding the papers to download.

Does the 10-second operation include a query, or is it just the call to arxiv.download? It may be possible to improve the query performance.
arxiv.download uses urlretrieve; I don't know if this is the quickest solution for downloads in bulk. You might be interested in building your own bulk-download function.
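
A minimal sketch of what such a bulk-download function might look like, assuming you already have a list of direct PDF URLs. The names `download_one` and `bulk_download` and the `fetch` parameter are hypothetical and not part of this package; the sketch only uses the standard library's `urlretrieve` and a thread pool so that several downloads overlap instead of running one after another.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlretrieve


def download_one(url, out_dir, fetch=urlretrieve):
    """Download one URL into out_dir and return the local file path.

    `fetch` is a hypothetical hook: any callable taking (url, path).
    It defaults to urllib's urlretrieve.
    """
    # Use the last path segment of the URL as the file name.
    filename = url.rstrip("/").split("/")[-1] or "download"
    path = os.path.join(out_dir, filename)
    fetch(url, path)
    return path


def bulk_download(urls, out_dir, max_workers=4, fetch=urlretrieve):
    """Download URLs concurrently.

    Returns two dicts: {url: local_path} for successes and
    {url: exception} for failures, so failed URLs can be retried later.
    """
    os.makedirs(out_dir, exist_ok=True)
    successes, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            pool.submit(download_one, url, out_dir, fetch): url for url in urls
        }
        for future in as_completed(futures):
            url = futures[future]
            try:
                successes[url] = future.result()
            except Exception as exc:  # record the error and keep going
                failures[url] = exc
    return successes, failures


# Example call (placeholder URL; not executed here):
# ok, failed = bulk_download(["https://arxiv.org/pdf/1605.08386v1"], "./papers")
```

Threads help here because each download spends most of its time waiting on the network. It is probably wise to keep `max_workers` small so the load on arXiv's servers stays reasonable.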

Probably the most useful: if you just want as many papers as possible, arXiv offers bulk access to tarfiles of PDFs and source files via S3: https://arxiv.org/help/bulk_data_s3

I'll close this issue for the time being; feel free to reopen it if this doesn't answer your question!

shizhediao · 2020-04-11T05:40:02Z

Thanks for your reply!
In my experiment, the 10 seconds is for the call to arxiv.download alone.
Thanks for pointing out the bulk access; I'll take a look.
Thanks very much!

shizhediao added the enhancement label · Apr 10, 2020

lukasschwab added the question label and removed the enhancement label · Apr 11, 2020

lukasschwab closed this as completed Apr 11, 2020
