Make the scraper mimic normal user to avoid IP block #109

Open
BruceJohnJennerLawso opened this issue Jan 19, 2017 · 0 comments

@BruceJohnJennerLawso
Owner

Turns out hock-ref (Hockey Reference) doesn't like being looked at too much, so I need to be careful to avoid getting my IP blocked. This is mostly not an issue at the moment, since the team CSVs for the NHL and WHA are stored in the dataBackup, but it could become one in the future with day-to-day scraping of the current season's scores.

This hypothetically could be avoided by modifying the scraper's loop to randomize both the time between requests and the order in which teams are requested. Even better, it hypothetically could mimic a user who starts from the season page, jumps to a team page, spends an appropriate amount of time looking at it, then moves on to the next one. This would extend the scraper's runtime quite a bit, but that probably isn't a big deal for a server with nothing else to do. Of course, if anyone from Hockey Reference is reading this, this is all hypothetical, I would never do such a thing...
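
A minimal sketch of that loop, not the project's actual scraper: the function names (`plan_visits`, `crawl_season`), the `fetch_season` and `fetch_team` callables, and the 20 to 90 second pause range are all hypothetical placeholders. The existing request code would be passed in as the two callables.

```python
import random
import time
from typing import Callable, Dict, Iterable, List, Optional, Tuple


def plan_visits(
    teams: Iterable[str],
    min_delay: float = 20.0,   # assumed lower bound on the pause, in seconds
    max_delay: float = 90.0,   # assumed upper bound on the pause, in seconds
    rng: Optional[random.Random] = None,
) -> List[Tuple[str, float]]:
    """Return (team, pause_seconds) pairs in a shuffled order."""
    if min_delay < 0 or max_delay < min_delay:
        raise ValueError("need 0 <= min_delay <= max_delay")
    rng = rng or random.Random()
    order = list(teams)
    rng.shuffle(order)  # randomize the order of teams requested
    # Draw an independent pause for each team so the spacing is irregular.
    return [(team, rng.uniform(min_delay, max_delay)) for team in order]


def crawl_season(
    teams: Iterable[str],
    fetch_season: Callable[[], object],
    fetch_team: Callable[[str], object],
    sleep: Callable[[float], None] = time.sleep,
    **plan_kwargs,
) -> Dict[str, object]:
    """Load the season page first, then each team page after a random pause."""
    fetch_season()  # start where a reader would: the season page
    results: Dict[str, object] = {}
    for team, pause in plan_visits(teams, **plan_kwargs):
        sleep(pause)  # "read" the current page before moving to the next one
        results[team] = fetch_team(team)
    return results
```

Passing `sleep` as a parameter keeps the pacing logic testable without waiting, and the long pauses have the side effect of keeping the request rate low for the site.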
