Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] Separate building request from _requests_to_follow in CrawlSpider #2562

merged 1 commit into from Feb 20, 2017


Copy link

@terut terut commented Feb 14, 2017

I use scrapy-splash for javascript and sometimes want to change the request to another request like splash one because it should be convenient using same logic for html after runing javascript.

scrapy-splash's request is following:

What do you think about that? Should I implement new spider?

You just overwrite buiding request if you can use another request class
because of something like splash-plugin.
Copy link

@kmike kmike commented Feb 14, 2017

This change makes sense to me. I guess the main question is whether we want this method to be public & documented & tested or not.

See also: #1495.

Copy link
Contributor Author

@terut terut commented Feb 15, 2017

Thank you for your reply. I saw #1495.

I agree with you and I think this method should be private for now because this method isn't referred to by other object in many case. In addition, someone who wants to change request object would be able to override this method and start_requests method. That is so reasonable for me.
Or, two private methods makes sense to me. For example, it's following:

def _make_first_request(self, url):
     return Request(url, dont_filter=True)

def _make_following_request(self, url):
     return Request(url=url, callback=self._response_downloaded)

Maybe it is better to have a method to build many kinds of requests but that method would be too complicated because of many arguments.

@kmike kmike changed the title Separate building request from _requests_to_follow in CrawlSpider [MRG+1] Separate building request from _requests_to_follow in CrawlSpider Feb 15, 2017
Copy link

@kmike kmike commented Feb 15, 2017

Makes sense. I think this change is fine if we look at it as a simple refactoring / code cleanup.

@dangra dangra merged commit b2d66dc into scrapy:master Feb 20, 2017
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
@dangra dangra added this to the v1.4 milestone Feb 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants