Set a User-Agent for HTTP requests from :elinks #1417

Merged
merged 1 commit into from Apr 27, 2019

@da2x
Contributor

commented Apr 27, 2019

Detailed description

  • The default “Ruby” User-Agent is widely blocked, so replace it.
  • Include “Mozilla/5.0” because many blocking filters expect it; the token is a legacy convention of the web.
  • Include Nanoc software name and version to identify the software making the request.
  • Describe the purpose of the request to help webmasters make informed blocking decisions.

Proposed User-Agent string:

Mozilla/5.0 Nanoc/4.11.2 (link rot checker)
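
For illustration, here is a minimal sketch of how a string like this could be attached to a request with Ruby's standard `Net::HTTP`. The helper names `link_checker_user_agent` and `build_head_request` are hypothetical and are not taken from the Nanoc code base; the version is passed in as an argument rather than read from Nanoc itself. The request is only built, not sent.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: assembles the proposed User-Agent string.
# The version is an argument here; real code would likely read it from the gem.
def link_checker_user_agent(version)
  "Mozilla/5.0 Nanoc/#{version} (link rot checker)"
end

# Hypothetical helper: builds (but does not send) a HEAD request
# that carries the custom User-Agent instead of Net::HTTP's default "Ruby".
def build_head_request(url, version)
  uri = URI.parse(url)
  req = Net::HTTP::Head.new(uri)
  req['User-Agent'] = link_checker_user_agent(version)
  req
end

req = build_head_request('https://example.com/', '4.11.2')
puts req['User-Agent']
```

Assigning the header on the request object overrides the default for that request only, so other HTTP clients in the same process are unaffected.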

To do

  • Agree on User-Agent string.
  • Add to change log.

More details

In my own set of 1800 links, this change reduces the number of HTTP 403 Forbidden responses from 482 to 17. (Most of these are probably hosted by a single large provider like Cloudflare.) This could probably be reduced even further by faking a mainstream web browser’s User-Agent string. However, I’d rather live with a handful of blocked tests than start faking User-Agents.

@ddfreyne

Member

commented Apr 27, 2019

Ooh, I like it. I think the User-Agent that you propose is good, and I suppose it can be changed in the future if needed.

Could be useful to have a test for this, but probably not important. (I also realised that I haven’t really set up proper testing for the external links checker yet anyway…)
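
One way such a test might look, as a rough sketch only: start a throwaway server on the loopback interface, send a request with the custom header, and check what the server received. Nothing here uses Nanoc's checker; the variable names and the one-shot `TCPServer` are assumptions made for the example, and a real test would go through the `:elinks` check itself.

```ruby
require 'net/http'
require 'socket'

user_agent = 'Mozilla/5.0 Nanoc/4.11.2 (link rot checker)'

# One-shot local server on an OS-assigned port that records the request head.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
received = Queue.new

thread = Thread.new do
  client = server.accept
  lines = []
  # Read header lines until the blank line that ends the request head.
  while (line = client.gets) && line != "\r\n"
    lines << line.chomp
  end
  received << lines
  client.write("HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")
  client.close
end

# Send a HEAD request carrying the custom User-Agent.
Net::HTTP.start('127.0.0.1', port) do |http|
  req = Net::HTTP::Head.new('/')
  req['User-Agent'] = user_agent
  http.request(req)
end

thread.join
server.close

# Pick out the User-Agent header line as the server saw it.
ua_line = received.pop.find { |l| l.downcase.start_with?('user-agent:') }
puts ua_line
```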

@ddfreyne ddfreyne merged commit 7410fed into nanoc:master Apr 27, 2019

21 checks passed

ci/circleci: check_style_cruby26 Your tests passed on CircleCI!
ci/circleci: setup_cruby24 Your tests passed on CircleCI!
ci/circleci: setup_cruby25 Your tests passed on CircleCI!
ci/circleci: setup_cruby26 Your tests passed on CircleCI!
ci/circleci: test_guard_nanoc_cruby24 Your tests passed on CircleCI!
ci/circleci: test_guard_nanoc_cruby25 Your tests passed on CircleCI!
ci/circleci: test_guard_nanoc_cruby26 Your tests passed on CircleCI!
ci/circleci: test_nanoc_core_cruby24 Your tests passed on CircleCI!
ci/circleci: test_nanoc_core_cruby25 Your tests passed on CircleCI!
ci/circleci: test_nanoc_core_cruby26 Your tests passed on CircleCI!
ci/circleci: test_nanoc_cruby24 Your tests passed on CircleCI!
ci/circleci: test_nanoc_cruby25 Your tests passed on CircleCI!
ci/circleci: test_nanoc_cruby26 Your tests passed on CircleCI!
ci/circleci: test_nanoc_external_cruby24 Your tests passed on CircleCI!
ci/circleci: test_nanoc_external_cruby25 Your tests passed on CircleCI!
ci/circleci: test_nanoc_external_cruby26 Your tests passed on CircleCI!
ci/circleci: test_nanoc_live_cruby24 Your tests passed on CircleCI!
ci/circleci: test_nanoc_live_cruby25 Your tests passed on CircleCI!
ci/circleci: test_nanoc_live_cruby26 Your tests passed on CircleCI!
codecov/patch Coverage not affected when comparing e3a75e2...f0a8a6a
codecov/project 97.93% (-0.01%) compared to e3a75e2