Improve speed of source downloads #89

Closed
Tracked by #54
bfhealy opened this issue Sep 2, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

bfhealy (Collaborator) commented Sep 2, 2022

This is complementary to #88, but tracked separately because the download script runs differently. Iterating over the pages of existing sources takes a non-negligible amount of time before the download loop begins. Once the download loop starts, it can download a few thousand sources per hour.

Feature Summary
To prepare for the need to download large source lists, we should streamline scope_download_classification.py to run as quickly as possible.

Implementation details
Both the page-by-page loop that runs before downloads begin and the download loop that follows are good areas to focus on for this enhancement.
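One possible way to shorten the page-by-page step is to request several pages at once instead of one after another. The sketch below is only an illustration under stated assumptions: `fetch_page` is a hypothetical stand-in that returns fake source IDs, not the function used in scope_download_classification.py, and the real script may need rate limiting or a different pagination scheme.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_page(page_number):
    """Hypothetical stand-in for one paginated request.

    A real version would query the sources API; this one returns fake IDs
    so the sketch runs without network access.
    """
    page_size = 3  # assumed page size, for illustration only
    start = (page_number - 1) * page_size
    return [f"source_{i}" for i in range(start, start + page_size)]


def collect_sources(n_pages, max_workers=8):
    """Fetch pages 1..n_pages concurrently and return IDs in page order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of its inputs,
        # so the combined list keeps the original page order.
        pages = pool.map(fetch_page, range(1, n_pages + 1))
        return [source_id for page in pages for source_id in page]


if __name__ == "__main__":
    print(collect_sources(n_pages=4))
```

Threads suit this step because it is dominated by waiting on network responses rather than by computation. The same pattern could apply to the download loop, provided the server tolerates concurrent requests.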

bfhealy added the enhancement label Sep 2, 2022
This was referenced Sep 2, 2022
bfhealy (Collaborator, Author) commented Sep 9, 2022

With #90 implemented, if the user specifies a group from which to download rather than individual sources, it now takes a few minutes to download a few thousand sources from that group.
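The comment does not say how #90 achieves this speedup. One plausible factor, stated here purely as an assumption, is that a group query can return many sources per request, while per-source downloads need at least one request each. The sketch below compares request counts under that assumption; the function names and the page size are hypothetical.

```python
import math


def requests_per_source(n_sources):
    """One request per source (assumed model of the individual-source path)."""
    return n_sources


def requests_per_group(n_sources, page_size):
    """One request per page of results (assumed model of the group path)."""
    return math.ceil(n_sources / page_size)


if __name__ == "__main__":
    n = 3000  # "a few thousand" sources, illustrative value only
    print(requests_per_source(n), requests_per_group(n, page_size=100))
```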

bfhealy closed this as completed Sep 13, 2022