First Spider Middleware does not process exception for generator callback #4260
Comments
Could you provide some code and logs? I could not reproduce the issue, either with

import scrapy


class ExceptionHandlerMiddleware:
    def process_spider_exception(self, response, exception, spider):
        print('Exception caught:', exception)
        # Returning an iterable marks the exception as handled.
        return []


class ExceptionSpider(scrapy.Spider):
    name = 'exception_spider'
    start_urls = ['https://example.org']
    custom_settings = {
        'SPIDER_MIDDLEWARES': {
            __name__ + '.ExceptionHandlerMiddleware': 1,
        }
    }

    def parse(self, response):
        # Not a generator: the exception is raised as soon as parse() is called.
        raise Exception('foo')
If you add |
Ok, that changes things. Indeed, this is what I'm getting:

import scrapy


class ExceptionHandlerMiddleware:
    def process_spider_exception(self, response, exception, spider):
        print('Exception caught:', exception)
        return []


class ExceptionSpider(scrapy.Spider):
    name = 'exception_spider'
    start_urls = ['https://example.org']
    custom_settings = {
        'SPIDER_MIDDLEWARES': {
            # An order above 900 makes this the middleware closest to the spider.
            __name__ + '.ExceptionHandlerMiddleware': 901,
        }
    }

    def parse(self, response):
        # The bare yield turns parse() into a generator, so the exception
        # is raised only when the result is iterated.
        yield
        raise Exception('foo')
It seems to me like this is an edge case of #220. By the time the middleware that's closest to the spider is executed, the iterable from the spider callback has not been evaluated yet. The exception is raised in the first
import scrapy


class ExceptionHandlerMiddleware:
    def process_spider_exception(self, response, exception, spider):
        print('Exception caught:', exception)
        return []


class DummyMiddleware:
    # Pass-through middleware placed closer to the spider (order 902).
    def process_spider_output(self, response, result, spider):
        yield from result


class ExceptionSpider(scrapy.Spider):
    name = 'exception_spider'
    start_urls = ['https://example.org']
    custom_settings = {
        'SPIDER_MIDDLEWARES': {
            __name__ + '.DummyMiddleware': 902,
            __name__ + '.ExceptionHandlerMiddleware': 901,
        }
    }

    def parse(self, response):
        yield
        raise Exception('foo')
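The lazy-evaluation point can be illustrated without Scrapy at all. The following is a minimal plain-Python sketch, not Scrapy's internal code: every name in it (`plain_callback`, `generator_callback`, `call_guarded`, `iterate_guarded`, `demo`) is hypothetical. It only shows that guarding the call to a generator function cannot catch an exception raised later during iteration, while guarding the iteration can.

```python
def plain_callback():
    """Non-generator callback: raises as soon as it is called."""
    raise ValueError("foo")


def generator_callback():
    """Generator callback: yields one item, then fails during iteration."""
    yield "item"
    raise ValueError("foo")


def call_guarded(func):
    """Guard only the call. For a generator function no body code runs here."""
    try:
        return func()
    except ValueError as exc:
        return ["handled at call time: %s" % exc]


def iterate_guarded(iterable):
    """Guard the iteration, which is where a generator's body actually runs."""
    try:
        yield from iterable
    except ValueError as exc:
        yield "handled during iteration: %s" % exc


def demo():
    # The plain callback fails inside the guarded call, so it is handled.
    eager = call_guarded(plain_callback)

    # The generator callback returns normally; the error escapes later,
    # when the returned generator is consumed outside the guard.
    lazy = call_guarded(generator_callback)
    try:
        unguarded = list(lazy)
    except ValueError as exc:
        unguarded = "escaped: %s" % exc

    # Wrapping the iteration itself catches the late exception.
    guarded = list(iterate_guarded(generator_callback()))
    return eager, unguarded, guarded
```

Calling `demo()` shows the three cases side by side: the eager failure is handled at call time, the generator's failure escapes the call-time guard, and the iteration-time guard catches it after the first item has been produced.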
Description
The process_spider_exception method of a spider middleware is ignored when the spider middleware is first and the callback is a generator.

Steps to Reproduce
Register a spider middleware that defines process_spider_exception in SPIDER_MIDDLEWARES with a number greater than 900 (to make it first), then raise an exception in a generator callback.

Expected behavior:
process_spider_exception is called for this exception.

Actual behavior:
process_spider_exception is not called.

Versions