Fake User-Agent if response is 403? #159
Comments
https://www.un.org/ is another example.
In my opinion, this would be justified. Since we try to minimize traffic by checking links only once no matter how often they occur in the content and by only checking URLs once every
Thanks, I appreciate your point of view. It makes a lot of sense.
I have some sites that return a 403 response for both HEAD and GET requests when the User-Agent is not one of a set of whitelisted strings. Here's an example: https://www.cairn.info/revue-l-economie-politique-2005-3-page-60.htm It's probably a measure to avoid some bot traffic.
If you set the User-Agent to look like a browser (e.g. 'Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0'), it returns a 200 status code.
Would it be acceptable (in the ethical sense) to try with a "fake" user agent in the case of a 403 response?
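To make the proposal concrete, here is a minimal Python sketch of the fallback I have in mind. It is not this project's code: `DEFAULT_UA`, `fetch_status` and `check_link` are hypothetical names, and the default User-Agent string is a placeholder. The only part taken from the report above is the browser-like string. The idea is to identify honestly first and retry with the browser-like User-Agent only when the answer is 403.

```python
import urllib.error
import urllib.request

# Placeholder identity for the checker (hypothetical value).
DEFAULT_UA = "example-link-checker/0.1"
# Browser-like string quoted in the report above.
BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; rv:91.0) Gecko/20100101 Firefox/91.0"


def fetch_status(url, user_agent, method="HEAD", timeout=10):
    """Return the HTTP status code for `url` using the given User-Agent.

    Connection-level failures (urllib.error.URLError) are not handled here
    and propagate to the caller.
    """
    request = urllib.request.Request(
        url, method=method, headers={"User-Agent": user_agent}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still available.
        return err.code


def check_link(url, fetch=fetch_status):
    """Check `url` once; on a 403, retry a single time with a browser-like UA."""
    status = fetch(url, DEFAULT_UA)
    if status == 403:
        status = fetch(url, BROWSER_UA)
    return status
```

Calling `check_link(url)` sends at most two requests per URL, so the extra traffic stays small. The `fetch` parameter exists only so the retry logic can be exercised without touching the network.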