[MRG+1] PY3: Fix SitemapSpider to extract sitemap urls from robots.txt properly #1767

Merged
merged 1 commit into from Feb 8, 2016

Conversation

orangain
Contributor

@orangain orangain commented Feb 6, 2016

Purpose

Fixes #1766: SitemapSpider fails to extract sitemap URLs from robots.txt on Python 3.

Changes

  • Pass response.text as an argument of sitemap_urls_from_robots() instead of response.body.
  • Add a unit test.
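
For context, here is a minimal sketch of why the bytes/text distinction matters on Python 3. It is not Scrapy's actual code: `extract_sitemap_urls` and the sample robots.txt are hypothetical stand-ins, and the UTF-8 decode is an assumption standing in for what `response.text` provides.

```python
def extract_sitemap_urls(robots_text):
    """Hypothetical helper: yield the URLs found on 'Sitemap:' lines.

    Expects text (str). The prefix check below uses a str literal, so it
    cannot be applied to bytes on Python 3.
    """
    for line in robots_text.splitlines():
        stripped = line.lstrip()
        if stripped.lower().startswith("sitemap:"):
            # Split only on the first colon so the URL's own "://" survives.
            yield stripped.split(":", 1)[1].strip()


# Made-up robots.txt body, as raw bytes (analogous to response.body).
body = b"User-agent: *\nDisallow: /admin\nSitemap: http://example.com/sitemap.xml\n"

# Decoded text (analogous to response.text; UTF-8 is assumed here).
text = body.decode("utf-8")
urls = list(extract_sitemap_urls(text))

# Passing the raw bytes instead fails on Python 3, because
# bytes.startswith() rejects a str prefix.
try:
    list(extract_sitemap_urls(body))
    bytes_failed = False
except TypeError:
    bytes_failed = True
```

On Python 2 `str` and bytes were the same type, so this kind of mismatch only shows up after porting to Python 3.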

@codecov-io

Current coverage is 83.33%

Merging #1767 into master will increase coverage by +0.04% as of f19c27b


@eliasdorneles eliasdorneles changed the title PY3: Fix SitemapSpider to extract sitemap urls from robots.txt properly [MRG+1] PY3: Fix SitemapSpider to extract sitemap urls from robots.txt properly Feb 6, 2016
@eliasdorneles
Member

Thanks for the patch @orangain, looks good! 👍

@kmike
Member

kmike commented Feb 8, 2016

Thanks @orangain!

kmike added a commit that referenced this pull request Feb 8, 2016
[MRG+1] PY3: Fix SitemapSpider to extract sitemap urls from robots.txt properly
@kmike kmike merged commit 44bc4c0 into scrapy:master Feb 8, 2016
@redapple
Contributor

redapple commented Feb 8, 2016

Needs backporting to the 1.1 branch.
