[MRG+1] Added: Making it case-insensitive when extracting sitemap URLs from a robots.txt #1902

Merged

Contributor

@starrify starrify commented Apr 1, 2016

Reasons:

  • The robots.txt proposal has never been standardized.
  • The "Sitemap:" field is itself only a de facto standard, since the original robots.txt draft did not mention it at all.
  • Many websites currently use all-lowercase field names, e.g. "disallow" and "user-agent".
  • Some big companies, such as Google, treat the field names as case-insensitive.
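
To illustrate the behavior argued for above, here is a minimal sketch of case-insensitive sitemap extraction. It is not the code from this pull request: the function name `extract_sitemap_urls` and the sample robots.txt content are hypothetical, and it assumes the robots.txt body is already decoded to text.

```python
def extract_sitemap_urls(robots_text):
    """Yield sitemap URLs from robots.txt text, ignoring field-name case.

    Hypothetical helper for illustration only; not Scrapy's implementation.
    """
    for line in robots_text.splitlines():
        stripped = line.strip()
        # Lowercase only for the field-name comparison; the URL keeps its case.
        if stripped.lower().startswith("sitemap:"):
            url = stripped.split(":", 1)[1].strip()
            if url:
                yield url


# Made-up sample mixing the capitalizations seen in the wild.
sample = """\
User-agent: *
disallow: /private
Sitemap: http://example.com/sitemap.xml
sitemap: http://example.com/sitemap-lower.xml
SITEMAP: http://example.com/sitemap-upper.xml
"""

for url in extract_sitemap_urls(sample):
    print(url)
```

Only the field name is compared in lowercase, so URL paths that are case-sensitive on the server are passed through unchanged.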

@codecov-io codecov-io commented Apr 1, 2016

Current coverage is 83.19%

Merging #1902 into master will not affect coverage as of 27d4128

Powered by Codecov. Updated on successful CI builds.

@kmike kmike changed the title Added: Making it case-insensitive when extracting sitemap URLs from a robots.txt [MRG+1] Added: Making it case-insensitive when extracting sitemap URLs from a robots.txt Apr 1, 2016
@redapple redapple merged commit 642fedb into scrapy:master Apr 4, 2016
2 checks passed
3 participants