[MRG+1] Use dont_filter=True for contracts requests #3381
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master    #3381      +/-   ##
==========================================
+ Coverage   84.48%   84.49%   +<.01%
==========================================
  Files         167      167
  Lines        9371     9376       +5
  Branches     1392     1392
==========================================
+ Hits         7917     7922       +5
  Misses       1199     1199
  Partials      255      255
```
Looks good! Would you mind adding a test case?
@kmike I can't start

```python
class TestSameUrlSpider(Spider):
    def parse_first(self, response):
        """first callback
        @url http://scrapy.org
        """
        pass

    def parse_second(self, response):
        """second callback
        @url http://scrapy.org
        """
        pass


class ContractsManagerTest(unittest.TestCase):
    contracts = [UrlContract, ReturnsContract, ScrapesContract]

    def setUp(self):
        self.conman = ContractsManager(self.contracts)
        self.results = TextTestResult(stream=None, descriptions=False, verbosity=0)

    def test_same_url(self):
        crawler_process = CrawlerProcess()
        TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
        crawler_process.crawl(TestSameUrlSpider)
        crawler_process.start()
        self.assertEqual(self.results.testsRun, 2)
```
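To illustrate why this test needs the PR's change: Scrapy's scheduler normally deduplicates requests by URL fingerprint, so the second contract request to the same `@url` would be dropped and only one callback would run. The following is a minimal, hypothetical sketch (not Scrapy's actual `RFPDupeFilter`) of that behavior and how `dont_filter=True` bypasses it:

```python
# Hypothetical stand-in objects to illustrate duplicate filtering;
# real Scrapy uses Request and RFPDupeFilter with URL fingerprints.
class FakeRequest:
    def __init__(self, url, dont_filter=False):
        self.url = url
        self.dont_filter = dont_filter


class FakeDupeFilter:
    def __init__(self):
        self.seen = set()

    def should_schedule(self, request):
        # Requests marked dont_filter bypass deduplication entirely.
        if request.dont_filter:
            return True
        if request.url in self.seen:
            return False
        self.seen.add(request.url)
        return True


# Without dont_filter: the second request to the same URL is dropped.
dupefilter = FakeDupeFilter()
plain = [FakeRequest("http://scrapy.org"), FakeRequest("http://scrapy.org")]
scheduled_plain = [r for r in plain if dupefilter.should_schedule(r)]

# With dont_filter=True (what this PR sets on contract requests): both pass.
dupefilter = FakeDupeFilter()
contracts = [FakeRequest("http://scrapy.org", dont_filter=True),
             FakeRequest("http://scrapy.org", dont_filter=True)]
scheduled_contracts = [r for r in contracts if dupefilter.should_schedule(r)]

print(len(scheduled_plain), len(scheduled_contracts))  # 1 2
```

With filtering in place only one of the two callbacks above would be tested, which is exactly the bug the `testsRun == 2` assertion guards against.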
tests/test_contracts.py (Outdated)

```python
@defer.inlineCallbacks
def test_same_url(self):
    TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
```
if TestSameUrlSpider can be used only in this test (as its start_requests is monkey-patched), what do you think about just defining the spider inside the method?
Makes sense to me!
tests/test_contracts.py (Outdated)

```python
def parse_first(self, response):
    """first callback
    @url http://scrapy.org
```
Is it going to be a real HTTP request to scrapy.org? If so, this would make tests flaky. We have a mockserver available in tests, so it is better to use it:

Line 16 in 91f986e:

```python
class CrawlTestCase(TestCase):
```
tests/test_contracts.py (Outdated)

```python
self.visited += 1
return TestItem()

TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
```
you can define a method, no need to monkey-patch it
tests/test_contracts.py (Outdated)

```python
crawler = CrawlerRunner().create_crawler(TestSameUrlSpider)
with MockServer() as mockserver:
    yield crawler.crawl(mockserver=mockserver)
```
Unfortunately mockserver doesn't make Scrapy magically make local requests :) MockServer just starts a process which listens to HTTP, and has a few endpoints you can use for testing. To actually use it you need to instruct Scrapy to make requests to it.
In Scrapy Contracts the URL to fetch is defined in a docstring (`@url ...`), so if you define `http://scrapy.org` there, this test will fetch the scrapy.org website. A sanity check is to try running this test with the Internet disabled (e.g. WiFi turned off): the Scrapy testing suite shouldn't need external resources to run. To make it work, you need to use a URL of some mockserver endpoint instead of scrapy.org.
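The "no external resources" principle the reviewer describes can be sketched with only the standard library (Scrapy's MockServer is a separate process with its own endpoints; this hypothetical example just shows the idea of pointing a test at a locally started HTTP endpoint instead of a real site):

```python
# Hedged sketch: spin up a throwaway local HTTP server for a test,
# instead of fetching an external site such as scrapy.org.
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        # Keep test output quiet.
        pass


server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The test fetches the local URL; it works with the Internet disabled.
url = "http://127.0.0.1:%d/" % server.server_port
body = urlopen(url).read()
server.shutdown()
print(body)  # b'ok'
```

For contracts specifically, this means the `@url` in the callback docstring would need to point at such a local endpoint rather than a hardcoded external URL.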
`crawler.crawl` arguments are passed to `Spider.from_crawler`, which eventually sets them as attributes. So this line does the following: it starts crawling using TestSameUrlSpider, and this spider has a `mockserver` attribute set to the `mockserver` instance, which is not useful, as you're not using it in the spider.
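The kwargs-to-attributes behavior described above can be shown with a minimal stand-in (not Scrapy itself; Scrapy's `Spider.__init__` applies extra keyword arguments to the instance in essentially this way):

```python
# Hypothetical stand-in for a Scrapy spider, showing how keyword
# arguments passed down from crawler.crawl(...) become attributes.
class FakeSpider:
    def __init__(self, name=None, **kwargs):
        self.name = name
        # Extra kwargs end up as instance attributes:
        self.__dict__.update(kwargs)


spider = FakeSpider(name="test", mockserver="http://127.0.0.1:8998")
print(spider.mockserver)  # http://127.0.0.1:8998
```

Which is why passing `mockserver=mockserver` is a no-op unless the spider's callbacks actually read `self.mockserver`.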
Force-pushed from 90d3da8 to 0467737 (Compare)
@kmike thanks for your comments. Now it works without an internet connection.
@kmike could you please give it a pass? Hopefully, final one :)
@StasDeep yeah, looks good, thanks for the fix!
Fixes #3380.