[MRG+1] Use dont_filter=True for contracts requests #3381
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##           master    #3381      +/-   ##
==========================================
+ Coverage   84.48%   84.49%   +<.01%
==========================================
  Files         167      167
  Lines        9371     9376       +5
  Branches     1392     1392
==========================================
+ Hits         7917     7922       +5
  Misses       1199     1199
  Partials      255      255
```
Looks good! Would you mind adding a test case?
@kmike I can't start

```python
class TestSameUrlSpider(Spider):
    def parse_first(self, response):
        """first callback
        @url http://scrapy.org
        """
        pass

    def parse_second(self, response):
        """second callback
        @url http://scrapy.org
        """
        pass


class ContractsManagerTest(unittest.TestCase):
    contracts = [UrlContract, ReturnsContract, ScrapesContract]

    def setUp(self):
        self.conman = ContractsManager(self.contracts)
        self.results = TextTestResult(stream=None, descriptions=False, verbosity=0)

    def test_same_url(self):
        crawler_process = CrawlerProcess()
        TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
        crawler_process.crawl(TestSameUrlSpider)
        crawler_process.start()
        self.assertEqual(self.results.testsRun, 2)
```
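The failure this PR addresses can be illustrated with a toy duplicate filter. This is a simplified stand-in, not Scrapy's actual `RFPDupeFilter`, but it shows why two contract requests for the same `@url` need `dont_filter=True`: otherwise the scheduler's duplicate filter drops the second request and only one contract test runs.

```python
class ToyDupeFilter:
    """A simplified stand-in for a seen-set duplicate filter."""

    def __init__(self):
        self.seen = set()

    def request_seen(self, url, dont_filter=False):
        # Requests marked dont_filter bypass the filter entirely.
        if dont_filter:
            return False
        if url in self.seen:
            return True
        self.seen.add(url)
        return False


urls = ['http://scrapy.org', 'http://scrapy.org']  # one request per callback

# Without dont_filter only the first request survives:
dupefilter = ToyDupeFilter()
kept_default = [u for u in urls if not dupefilter.request_seen(u)]

# With dont_filter both contract requests are scheduled:
dupefilter = ToyDupeFilter()
kept_dont_filter = [u for u in urls
                    if not dupefilter.request_seen(u, dont_filter=True)]

print(len(kept_default), len(kept_dont_filter))
```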
```python
@defer.inlineCallbacks
def test_same_url(self):
    TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
```
kmike
Aug 15, 2018
Member
if TestSameUrlSpider can be used only in this test (as its start_requests is monkey-patched), what do you think about just defining the spider inside the method?
StasDeep
Aug 15, 2018
Author
Contributor
Makes sense to me!
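A sketch of the layout kmike suggests, using hypothetical stand-ins so it runs without Scrapy installed (`Spider` and `from_spider` here are NOT the real Scrapy/ContractsManager APIs):

```python
import unittest


class Spider:
    """Hypothetical stand-in for scrapy.Spider."""


def from_spider(spider):
    # Pretend ContractsManager.from_spider: one request per contract callback.
    return [spider.parse_first, spider.parse_second]


class ContractsManagerTest(unittest.TestCase):
    def test_same_url(self):
        # The spider lives inside the test, so patching start_requests
        # (or any other attribute) cannot leak into other tests.
        class TestSameUrlSpider(Spider):
            def start_requests(self):
                return from_spider(self)

            def parse_first(self, response):
                """first callback
                @url http://scrapy.org
                """

            def parse_second(self, response):
                """second callback
                @url http://scrapy.org
                """

        requests = TestSameUrlSpider().start_requests()
        self.assertEqual(len(requests), 2)


suite = unittest.TestLoader().loadTestsFromTestCase(ContractsManagerTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```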
```python
def parse_first(self, response):
    """first callback
    @url http://scrapy.org
```
kmike
Aug 15, 2018
Member
Is it going to be a real http request to scrapy.org? If so, this would make tests flaky. We have a mockserver available in tests, so it is better to use it; Line 16 in 91f986e is a usage example.
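The idea behind a mockserver can be sketched with the standard library: spin up a real local HTTP server on a free port and point the test at it, so the suite never touches the internet. This is an illustration only, not Scrapy's `tests.mockserver.MockServer`:

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve a fixed page for every request, like a test endpoint would.
        body = b'<html>mock page</html>'
        self.send_response(200)
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging in test output


server = HTTPServer(('127.0.0.1', 0), Handler)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

url = 'http://127.0.0.1:%d/' % server.server_port
body = urllib.request.urlopen(url).read()
server.shutdown()
print(body)
```

A test using this pattern passes with WiFi turned off, which is exactly the sanity check suggested below.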
```python
self.visited += 1
return TestItem()
```
```python
TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
```
kmike
Aug 15, 2018
Member
you can define a method, no need to monkey-patch it
```python
crawler = CrawlerRunner().create_crawler(TestSameUrlSpider)
with MockServer() as mockserver:
    yield crawler.crawl(mockserver=mockserver)
```
kmike
Aug 17, 2018
Member
Unfortunately mockserver doesn't make Scrapy magically make local requests :) MockServer just starts a process which listens to HTTP, and has a few endpoints you can use for testing. To actually use it you need to instruct Scrapy to make requests to it.
In Scrapy Contracts the URL to fetch is defined in a docstring (`@url ...`), so if you define `http://scrapy.org` there, this test will fetch the scrapy.org website. A sanity check is to try running this test with Internet disabled (e.g. WiFi turned off): the Scrapy testing suite shouldn't need external resources to run. To make it work you need to use a URL of some mockserver endpoint instead of scrapy.org.
kmike
Aug 17, 2018
Member
`crawler.crawl` arguments are passed to `Spider.from_crawler`, which eventually sets them as attributes. So this line does the following: it starts crawling using TestSameUrlSpider, and this spider gets a `mockserver` attribute set to the `mockserver` instance, which is not useful, as you're not using it in the spider.
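The kwargs-to-attributes behavior described above can be sketched with minimal stand-ins (these are simplified hypothetical classes, not Scrapy's real `Spider`/`Crawler`):

```python
class Spider:
    """Simplified stand-in: Scrapy's Spider.__init__ copies kwargs onto self."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        # Extra crawl() arguments flow through here into the constructor,
        # ending up as plain instance attributes.
        return cls(*args, **kwargs)


# crawl(mockserver=...) would effectively do this:
spider = Spider.from_crawler(crawler=None, mockserver='<MockServer instance>')
print(spider.mockserver)
```

So passing `mockserver=mockserver` to `crawl()` only helps if the spider's callbacks actually read `self.mockserver` (e.g. to build request URLs against it).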
@kmike thanks for your comments. Now it works without an internet connection.
@kmike could you please give it a pass? Hopefully the final one :)
@StasDeep yeah, looks good, thanks for the fix!
Fixes #3380.