Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a non-zero exit code when a pipeline's open_spider method throws an exception #4207

Merged
merged 25 commits into from Feb 25, 2020

Conversation

gunblues
Copy link
Contributor

@gunblues gunblues commented Dec 4, 2019

Fixes #4175

scrapy/commands/crawl.py Outdated Show resolved Hide resolved
scrapy/commands/crawl.py Show resolved Hide resolved
@Gallaecio Gallaecio requested a review from kmike Dec 4, 2019
@codecov
Copy link

@codecov codecov bot commented Dec 4, 2019

Codecov Report

Merging #4207 into master will increase coverage by 0.18%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #4207      +/-   ##
==========================================
+ Coverage   83.73%   83.92%   +0.18%     
==========================================
  Files         165      165              
  Lines        9714     9883     +169     
  Branches     1446     1471      +25     
==========================================
+ Hits         8134     8294     +160     
- Misses       1332     1336       +4     
- Partials      248      253       +5
Impacted Files Coverage Δ
scrapy/commands/crawl.py 26.66% <0%> (-1.25%) ⬇️
scrapy/utils/ftp.py 23.8% <0%> (-6.2%) ⬇️
scrapy/utils/test.py 49.35% <0%> (-5.2%) ⬇️
scrapy/pipelines/files.py 61.66% <0%> (-4.25%) ⬇️
scrapy/utils/display.py 31.25% <0%> (-4.05%) ⬇️
scrapy/utils/boto.py 71.42% <0%> (-3.58%) ⬇️
scrapy/extensions/memdebug.py 47.61% <0%> (-2.39%) ⬇️
scrapy/commands/list.py 77.77% <0%> (-2.23%) ⬇️
scrapy/utils/ossignal.py 73.33% <0%> (-1.67%) ⬇️
scrapy/core/downloader/__init__.py 89.31% <0%> (-1.6%) ⬇️
... and 77 more

Copy link
Member

@Gallaecio Gallaecio left a comment

A change like this should have a test. Could you please add one?

Also, after your latest change to remove redundant code, I don’t think it will work. res is a deferred that will not be processed until start is called, and you are now checking it for errors before start is called.

scrapy/commands/crawl.py Outdated Show resolved Hide resolved
scrapy/commands/crawl.py Outdated Show resolved Hide resolved
@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Dec 5, 2019

A change like this should have a test. Could you please add one?

Also, after your latest change to remove redundant code, I don’t think it will work. res is a deferred that will not be processed until start is called, and you are now checking it for errors before start is called.

I've tried it and it works. We can get res.result.type and it belongs Exception when the exception is raised at open_spider in the pipeline.

Otherwise, I'll try to think about how to add a test for this, thanks

@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Dec 7, 2019

@Gallaecio @wRAR I've added the test, could you help to review it, thanks

Copy link
Member

@Gallaecio Gallaecio left a comment

Great work with the test code, I don’t think there was much existing test code to help figuring out how to implement it.

However, I do think we can omit some of the new files created as part of a project. See #4218

@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Dec 9, 2019

Great work with the test code, I don’t think there was much existing test code to help figuring out how to implement it.

However, I do think we can omit some of the new files created as part of a project. See #4218

@Gallaecio thanks for your reply. I think I need to run scrapy/commands/crawl.py for testing but I don't know how to run it by the way you provided #4218

Copy link
Member

@Gallaecio Gallaecio left a comment

Well, the tests seem to work, so I think we can go ahead with this implementation.

But I think there is still some 💄 cleanup we can do before merging:

  • Sort those imports at tests/test_cmdline_crawl_with_pipeline/__init__.py
  • Remove the unused setUp method.
  • Remove comments generated by Scrapy’s project generation tool.
  • Remove the [deploy] section from the scrapy.cfg file (I don’t think it’s needed here)
  • Remove BOT_NAME and NEWSPIDER_MODULE from settings.py (I think there are not needed either, although I’m less sure about NEWSPIDER_MODULE)

kevin-sb added 2 commits Dec 10, 2019
・Sort those imports at tests/test_cmdline_crawl_with_pipeline/__init__.py
・Remove the unused setUp method.
・Remove comments generated by Scrapy’s project generation tool.
・Remove the [deploy] section from the scrapy.cfg file (I don’t think it’s needed here)
・Remove BOT_NAME and NEWSPIDER_MODULE from settings.py (I think there are not needed either, although I’m less sure about NEWSPIDER_MODULE)
@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Dec 10, 2019

Well, the tests seem to work, so I think we can go ahead with this implementation.

But I think there is still some 💄 cleanup we can do before merging:

  • Sort those imports at tests/test_cmdline_crawl_with_pipeline/__init__.py
  • Remove the unused setUp method.
  • Remove comments generated by Scrapy’s project generation tool.
  • Remove the [deploy] section from the scrapy.cfg file (I don’t think it’s needed here)
  • Remove BOT_NAME and NEWSPIDER_MODULE from settings.py (I think there are not needed either, although I’m less sure about NEWSPIDER_MODULE)

@Gallaecio I've modified the code via your suggestion but I have to reserve BOT_NAME in settings to let crawling function work normally, thanks for your review

@elacuesta elacuesta changed the title fix issue 4175 - Scrapy does not use a non-zero exit code when pipeli… Use a non-zero exit code when a pipeline's open_spider method throws an exception Dec 24, 2019
@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Jan 15, 2020

hello, may I know the approximate time of merging this pr?

@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Feb 6, 2020

@Gallaecio @wRAR may I know when will merge this pull request back?

@Gallaecio
Copy link
Member

@Gallaecio Gallaecio commented Feb 6, 2020

Could you fix the issue reported by Travis CI?

/home/travis/build/scrapy/scrapy/tests/test_cmdline_crawl_with_pipeline/test_spider/spiders/__init__.py:1:1: W391 blank line at end of file

@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Feb 7, 2020

Could you fix the issue reported by Travis CI?

/home/travis/build/scrapy/scrapy/tests/test_cmdline_crawl_with_pipeline/test_spider/spiders/__init__.py:1:1: W391 blank line at end of file

Hi @Gallaecio , thanks for pointing out this issue, I've fixed it
May I know when will merge the PR If Travis CI passes?

@Gallaecio
Copy link
Member

@Gallaecio Gallaecio commented Feb 7, 2020

There is no fixed time. We usually merge once two maintainers have had a chance to review the changes and approve them.

@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Feb 17, 2020

There is no fixed time. We usually merge once two maintainers have had a chance to review the changes and approve them.

got it, thanks a lot. so I still need one more maintainer to approve it. @wRAR could you help to review? thanks a lot

@gunblues
Copy link
Contributor Author

@gunblues gunblues commented Feb 25, 2020

@kmike, @wRAR could you help to review my pull request? it needs two maintainers to approve, thank you

wRAR
wRAR approved these changes Feb 25, 2020
@wRAR wRAR merged commit 034e2c3 into scrapy:master Feb 25, 2020
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants