[MRG+1] Return non-zero exit code from commands in case of errors in spiders constructor #3226
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #3226      +/-   ##
==========================================
+ Coverage   82.14%   82.15%   +<.01%
==========================================
  Files         228      228
  Lines        9599     9609      +10
  Branches     1385     1387       +2
==========================================
+ Hits         7885     7894       +9
- Misses       1456     1458       +2
+ Partials      258      257       -1
def is_crawlers_has_spider(self):
    return reduce(lambda x, y: x and y,
                  self.crawlers_has_spiders,
                  True)
kmike
Apr 27, 2018
Member
any(self.crawlers_has_spiders) ?
kmike
May 4, 2018
Member
As @VMRuiz noted, it is actually all(...), not any(...), which is an additional argument to change it, as the code is not obvious :P
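For illustration, a small standalone snippet showing that the reduce in question folds booleans with and, i.e. it behaves like all(...) rather than any(...) (the flags list below is hypothetical):

from functools import reduce

# Hypothetical per-crawler flags: did each crawler get its spider created?
crawlers_has_spiders = [True, True, False]

via_reduce = reduce(lambda x, y: x and y, crawlers_has_spiders, True)
via_all = all(crawlers_has_spiders)
via_any = any(crawlers_has_spiders)

assert via_reduce == via_all   # both False: one crawler failed to bootstrap
assert via_reduce != via_any   # any() would hide that failure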
Tests failed due to a new Twisted release; the fix is in Twisted trunk (twisted/twisted#1008) but not yet released to PyPI. All PRs are broken.
@@ -1,3 +1,5 @@
import pdb
import inspect
kmike
Jun 5, 2018
Member
pdb import is unneeded
from scrapy.extensions.throttle import AutoThrottle
from twisted.internet import defer
import twisted.trial.unittest
kmike
Jun 5, 2018
Member
as per PEP 8, twisted imports should go in a separate group, between stdlib imports and scrapy imports
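For illustration, the grouping being asked for would look roughly like this, using only the imports that appear in this diff (a sketch, not the final file layout):

# stdlib
import inspect

# third-party (twisted)
from twisted.internet import defer
import twisted.trial.unittest

# project (scrapy)
from scrapy.extensions.throttle import AutoThrottle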
@@ -220,6 +222,16 @@ def test_runspider(self):
    self.assertIn("INFO: Closing spider (finished)", log)
    self.assertIn("INFO: Spider closed (finished)", log)
def test_run_fail_spider(self):
    proc = self.runspider("import scrapy\n" + inspect.getsource(ExceptionSpider))
kmike
Jun 6, 2018
Member
nice trick!
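For context, the trick being praised: runspider executes a spider from a standalone source file, and inspect.getsource() produces that source directly from a class defined in the test module, so the failing spider does not have to be duplicated as an inline string literal. A minimal sketch (this ExceptionSpider body is an assumed stand-in, not the exact class from the test suite):

import inspect

import scrapy


class ExceptionSpider(scrapy.Spider):
    # Assumed stand-in: a spider whose constructor raises, so bootstrapping fails.
    name = 'exception'

    def __init__(self, *args, **kwargs):
        raise ValueError('spider constructor failed')


# Build the standalone script that the runspider() test helper writes out and runs.
script = "import scrapy\n" + inspect.getsource(ExceptionSpider)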
Looks good to me, thanks @whalebot-helmsman! I'd remove
@@ -221,6 +226,9 @@ def join(self):
    while self._active:
        yield defer.DeferredList(self._active)

def _is_spider_created_for_every_crawler(self):
    return all(self.crawlers_has_spiders)
dangra
Jun 13, 2018
Member
I'd like to propose the removal of all these methods and the use of a single boolean flag that summarizes the outcome of bootstrapping the spider.
The public attribute name would be bootstrap_failed and we would set it in _crawl() / _done(), like:

self.bootstrap_failed |= not getattr(crawler, 'spider', None)

This public attribute replaces the _is_spider_created_for_every_crawler() method and the crawlers_has_spiders attribute, but also removes the need for the private _is_spider_created() method.
Then the check in runspider.py and crawl.py looks like:

if self.crawler_process.bootstrap_failed:
    self.exitcode = 1
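A rough, self-contained sketch of the proposal above (the class and crawler below are simplified stand-ins, not the merged Scrapy implementation):

class FakeCrawler:
    # Hypothetical crawler whose spider was never created (constructor raised).
    spider = None


class CrawlerRunnerSketch:
    # Simplified stand-in for scrapy.crawler.CrawlerRunner.

    def __init__(self):
        # Single public flag replacing crawlers_has_spiders and the helper methods.
        self.bootstrap_failed = False

    def _done(self, crawler):
        # A crawler that never got a spider instance means its bootstrap failed.
        self.bootstrap_failed |= not getattr(crawler, 'spider', None)


runner = CrawlerRunnerSketch()
runner._done(FakeCrawler())
assert runner.bootstrap_failed  # the crawl/runspider command can now set self.exitcode = 1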
Thanks @whalebot-helmsman and @dangra!
Thanks @whalebot-helmsman and @kmike
I restarted the failed build to be sure it was a transient failure.
Return a non-zero exit code from commands in case of errors in the spider constructor.