restore request's callback after calling custom callback #3129

Open · wants to merge 9 commits into base: master
3 changes: 3 additions & 0 deletions scrapy/commands/parse.py
@@ -192,6 +192,9 @@ def callback(response):
# parse items and requests
depth = response.meta['_depth']

if response.request.callback:
Hi,
I am new to the project and to open source development as well, but could you explain why the `if response.request.callback:` check is necessary? Also, if I am not wrong, you could just use `cb` instead of `request.meta['_callback']`.

Member Author

Hey @gekco , I believe there's a comment in the review changes that has more detail about this!

    response.request.callback = response.meta['_callback']

items, requests = self.run_callback(response, cb)
if opts.pipelines:
itemproc = self.pcrawler.engine.scraper.itemproc
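The check and the restore can be seen together in a minimal, self-contained sketch (hypothetical stand-in classes, not Scrapy's actual ones): the parse command swaps its own wrapper callback into the request and stashes the spider's original in `meta['_callback']`; restoring it means a re-yielded `response.request` carries the spider's callback rather than going through the wrapper again.

```python
class Request:
    """Minimal stand-in for scrapy.Request (illustration only)."""
    def __init__(self, url, callback=None):
        self.url = url
        self.callback = callback
        self.meta = {}


def spider_parse(request):
    """The spider's real callback: retry once, then yield an item."""
    if request.meta.get('retried'):
        return [{'foo': 'bar'}]
    request.meta['retried'] = True
    return [request]  # re-yield the same request


def command_callback(request):
    """Mimics the patched parse command: restore the saved callback
    before delegating, so a re-yielded request bypasses this wrapper."""
    if request.callback:
        request.callback = request.meta['_callback']
    return spider_parse(request)


req = Request('http://example.com/html', callback=command_callback)
req.meta['_callback'] = spider_parse  # original saved before the swap

items, pending, steps = [], [req], 0
while pending and steps < 10:  # step cap guards against an infinite loop
    steps += 1
    r = pending.pop()
    for out in r.callback(r):
        (pending if isinstance(out, Request) else items).append(out)

print(items, steps)  # the re-yielded request terminates after two steps
```

Without the restore, the re-yielded request would still point at the wrapper, which is the loop this PR is addressing.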
35 changes: 35 additions & 0 deletions tests/test_command_parse.py
@@ -70,6 +70,26 @@ class MyBadCrawlSpider(CrawlSpider):

def parse(self, response):
return [scrapy.Item(), dict(foo='bar')]


class MyInfiniteCallbackSpider(scrapy.Spider):

name = 'infinitecb{0}'

def parse(self, response):
if response.meta.get('retried'):
yield dict(foo='bar')
return

response.meta['retried'] = True
yield response.request.replace(dont_filter=True)
Contributor

Maybe you can add another test case that tests non-default callbacks, like the example in this comment?


def custom_parse(self, response):
if response.meta.get('retried'):
yield dict(anotherfoo='anotherbar')
return
response.meta['retried'] = True
yield response.request.replace(dont_filter=True)
""".format(self.spider_name))

fname = abspath(join(self.proj_mod_path, 'pipelines.py'))
@@ -193,3 +213,18 @@ def test_crawlspider_no_matching_rule(self):
)
self.assertRegexpMatches(to_native_str(out), r"""# Scraped Items -+\n\[\]""")
self.assertIn("""Cannot find a rule that matches""", to_native_str(stderr))

@defer.inlineCallbacks
def test_parse_item_callback_infinite_loop(self):
status, out, stderr = yield self.execute(
['--spider', 'infinitecb'+self.spider_name, '-d', '2', self.url('/html')]
)
self.assertIn("""[{'foo': 'bar'}]""", to_native_str(out))

@defer.inlineCallbacks
def test_parse_item_callback_infinite_loop_custom_callback(self):
status, out, stderr = yield self.execute(['--spider',
'infinitecb'+self.spider_name,
'-d', '2', '-c', 'custom_parse',
self.url('/html')])
self.assertIn("""[{'anotherfoo': 'anotherbar'}]""", to_native_str(out))