Parallelize get_latest_dumps script with xargs #92

Open
wants to merge 2 commits into
base: master

@rayrrr rayrrr commented Jul 31, 2018

To make fuller use of the available bandwidth and make the download marginally faster, run wget in parallel through xargs.

Also, remove USER_AGENT: its parentheses don't play nice with xargs, and wget works just fine without it.

The output is a bit hectic, with the progress bars from the parallel processes constantly overwriting each other...but totally worth it IMHO :)
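For readers who have not opened the diff, the approach described above could look roughly like the sketch below. It is not the script from this PR: `download_parallel`, `JOBS`, and `DOWNLOADER` are hypothetical names chosen for illustration, and the job count of 4 is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Sketch only: fan a list of URLs out to parallel wget processes via xargs.
# download_parallel, JOBS and DOWNLOADER are hypothetical names, not from the PR.

JOBS="${JOBS:-4}"                  # assumed number of simultaneous downloads
DOWNLOADER="${DOWNLOADER:-wget}"   # overridable, e.g. "echo" for a dry run

download_parallel() {
  # Nothing to do when no URLs were given.
  [ "$#" -gt 0 ] || return 0
  # Print one URL per line, then let xargs start the downloads:
  #   -n 1  passes a single URL to each invocation
  #   -P N  keeps up to N invocations running at once
  printf '%s\n' "$@" | xargs -n 1 -P "$JOBS" "$DOWNLOADER"
}
```

Calling `download_parallel "$url1" "$url2"` would start one process per URL, at most `JOBS` at a time. Setting `DOWNLOADER=echo` gives a dry run that only prints the URLs, which is a cheap way to check the list before downloading anything.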

@philipmat
Owner

The Discogs team requires a User-Agent for their API.

I don't think it's required to download the zip files - is there a way we can make sure of it?

@rayrrr
Author

rayrrr commented Jul 31, 2018

@philipmat thanks for the heads up! I will put it back. Is it cool if we make the User-Agent something simpler, like DiscogsXml2Db/1.0? That's what the Discogs API documentation recommends.
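A minimal sketch of how a simpler User-Agent might be kept alongside the parallel download. The value `DiscogsXml2Db/1.0` is the one proposed in this comment; `fetch_with_agent` and `WGET` are hypothetical names, and the job count of 4 is an assumption. `--user-agent` is wget's standard option for setting the header.

```shell
#!/usr/bin/env bash
# Sketch only: pass a User-Agent to every wget process started by xargs.
# fetch_with_agent and WGET are hypothetical names, not from the PR.

USER_AGENT="${USER_AGENT:-DiscogsXml2Db/1.0}"   # value proposed in this thread
WGET="${WGET:-wget}"                            # overridable for a dry run

fetch_with_agent() {
  # Nothing to do when no URLs were given.
  [ "$#" -gt 0 ] || return 0
  # The quoted option reaches xargs as one argument and is handed to
  # each wget invocation unchanged, followed by a single URL.
  printf '%s\n' "$@" | xargs -n 1 -P 4 "$WGET" --user-agent="$USER_AGENT"
}
```

Because the option is quoted where xargs is invoked, it arrives at wget as a single argument. A token such as `DiscogsXml2Db/1.0` also contains no spaces or parentheses, so it leaves little room for quoting trouble.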

@rayrrr
Author

rayrrr commented Aug 23, 2018

Of interest: if we remove the USER_AGENT parameter, wget sends its own default User-Agent, Wget/1.0 or whatever the installed version is, which is arguably what it technically should be anyway. https://superuser.com/questions/495855/what-is-the-default-user-agent-when-using-wget-on-linux

2 participants