
[MRG+1] Use dont_filter=True for contracts requests #3381

Merged: 8 commits merged into scrapy:master from StasDeep:fix/issue-3380 on Sep 6, 2018
Conversation

@StasDeep (Contributor) commented Aug 11, 2018

Fixes #3380.
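For readers landing here without the diff: contract requests for two callbacks that declare the same @url were being collapsed by Scrapy's duplicate-request filter, so only one contract actually ran. The fix is to build contract requests with dont_filter=True. A minimal sketch of the idea (the real change lives in ContractsManager.from_method in scrapy/contracts/__init__.py; request_for_contract is illustrative, not a function from the PR):

from scrapy import Request

def request_for_contract(url, callback):
    # dont_filter=True tells the scheduler's dupe filter to let this
    # request through even if an identical URL was already scheduled -
    # exactly what contract checks need when two callbacks share a @url.
    return Request(url, callback=callback, dont_filter=True)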

@codecov (bot) commented Aug 11, 2018

Codecov Report

Merging #3381 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3381      +/-   ##
==========================================
+ Coverage   84.48%   84.49%   +<.01%     
==========================================
  Files         167      167              
  Lines        9371     9376       +5     
  Branches     1392     1392              
==========================================
+ Hits         7917     7922       +5     
  Misses       1199     1199              
  Partials      255      255
Impacted Files                 Coverage Δ
scrapy/contracts/__init__.py   81.3% <100%> (+0.79%) ⬆️
@kmike (Member) commented Aug 15, 2018

Looks good! Would you mind adding a test case?

@StasDeep (Contributor, Author) commented Aug 15, 2018

@kmike I can't start CrawlerProcess in the test. Maybe you know what's wrong with this?

import unittest
from unittest import TextTestResult

from scrapy import Spider
from scrapy.contracts import ContractsManager
from scrapy.contracts.default import UrlContract, ReturnsContract, ScrapesContract
from scrapy.crawler import CrawlerProcess


class TestSameUrlSpider(Spider):
    name = 'test_same_url'  # Spider subclasses must define a name

    def parse_first(self, response):
        """first callback
        @url http://scrapy.org
        """
        pass

    def parse_second(self, response):
        """second callback
        @url http://scrapy.org
        """
        pass


class ContractsManagerTest(unittest.TestCase):
    contracts = [UrlContract, ReturnsContract, ScrapesContract]

    def setUp(self):
        self.conman = ContractsManager(self.contracts)
        self.results = TextTestResult(stream=None, descriptions=False, verbosity=0)

    def test_same_url(self):
        crawler_process = CrawlerProcess()

        TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)
        crawler_process.crawl(TestSameUrlSpider)
        crawler_process.start()

        self.assertEqual(self.results.testsRun, 2)
@kmike (Member) commented Aug 15, 2018

@StasDeep these tests need to be run in an event loop, the code is async. See the CrawlerRunnerHasSpider test class for an example - you need to use twisted.trial.unittest and @defer.inlineCallbacks if you want to run CrawlerProcess (or better, CrawlerRunner).


Review comment on:

    @defer.inlineCallbacks
    def test_same_url(self):
        TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)

@kmike (Member) commented Aug 15, 2018

If TestSameUrlSpider can be used only in this test (as its start_requests is monkey-patched), what do you think about just defining the spider inside the method?

@StasDeep (Author, Contributor) replied Aug 15, 2018

Makes sense to me!


Review comment on:

    def parse_first(self, response):
        """first callback
        @url http://scrapy.org

@kmike (Member) commented Aug 15, 2018

Is it going to be a real HTTP request to scrapy.org? If so, this would make tests flaky. We have a mockserver available in the test suite, so it is better to use it - see the CrawlTestCase class for a usage example.

Review comment on:

            self.visited += 1
            return TestItem()

        TestSameUrlSpider.start_requests = lambda s: self.conman.from_spider(s, self.results)

@kmike (Member) commented Aug 15, 2018

You can define a method, no need to monkey-patch it.


Review comment on:

        crawler = CrawlerRunner().create_crawler(TestSameUrlSpider)
        with MockServer() as mockserver:
            yield crawler.crawl(mockserver=mockserver)

@kmike (Member) commented Aug 17, 2018

Unfortunately mockserver doesn't make Scrapy magically make local requests :) MockServer just starts a process which listens for HTTP and has a few endpoints you can use for testing. To actually use it you need to instruct Scrapy to make requests to it.

In Scrapy contracts the URL to fetch is defined in a docstring (@url ...), so if you put http://scrapy.org there, this test will fetch the scrapy.org website. A sanity check is to try running the test with the Internet disabled (e.g. WiFi turned off) - the Scrapy test suite shouldn't need external resources to run. To make it work you need to use the URL of some mockserver endpoint instead of scrapy.org.
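One way to follow this advice is to define the spider inside the test and rewrite the contract docstrings to point at the local server before crawling. A sketch, assuming the test suite's MockServer works as a context manager and exposes a url() helper for building endpoint URLs (the /status?n=200 endpoint is likewise an assumption about the mockserver):

from unittest import TextTestResult

from twisted.internet import defer
from twisted.trial.unittest import TestCase

from scrapy import Spider
from scrapy.contracts import ContractsManager
from scrapy.contracts.default import UrlContract, ReturnsContract, ScrapesContract
from scrapy.crawler import CrawlerRunner
from tests.mockserver import MockServer  # part of the Scrapy test suite


class ContractsCrawlTest(TestCase):

    def setUp(self):
        self.conman = ContractsManager([UrlContract, ReturnsContract, ScrapesContract])
        self.results = TextTestResult(stream=None, descriptions=False, verbosity=0)

    @defer.inlineCallbacks
    def test_same_url(self):
        conman, results = self.conman, self.results

        class TestSameUrlSpider(Spider):
            name = 'test_same_url'

            # A real method instead of monkey-patching, per the review.
            def start_requests(self):
                return conman.from_spider(self, results)

            def parse_first(self, response):
                """first callback
                @url placeholder
                """

            def parse_second(self, response):
                """second callback
                @url placeholder
                """

        with MockServer() as mockserver:
            # Point both @url contracts at the local mockserver instead of
            # a real website, so the test runs without Internet access.
            contract_doc = '@url ' + mockserver.url('/status?n=200')
            TestSameUrlSpider.parse_first.__doc__ = contract_doc
            TestSameUrlSpider.parse_second.__doc__ = contract_doc

            crawler = CrawlerRunner().create_crawler(TestSameUrlSpider)
            yield crawler.crawl()

        self.assertEqual(self.results.testsRun, 2)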

@kmike (Member) commented Aug 17, 2018

crawler.crawl arguments are passed to Spider.from_crawler, which eventually sets them as spider attributes. So this line does the following: it starts crawling using TestSameUrlSpider, and the spider gets a mockserver attribute set to the mockserver instance - which is not useful, as you're not using it in the spider.
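The attribute mechanism is easy to verify without a crawl (a tiny sketch; EchoSpider and the value are made up): Spider.__init__ copies unrecognized keyword arguments into the instance dict, and crawler.crawl() forwards its arguments there via Spider.from_crawler.

from scrapy import Spider


class EchoSpider(Spider):
    name = 'echo'


# Spider.__init__ does self.__dict__.update(kwargs), so any keyword
# argument passed through crawler.crawl() / from_crawler ends up as a
# plain attribute on the spider instance:
spider = EchoSpider(mockserver='http://localhost:8998')
assert spider.mockserver == 'http://localhost:8998'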

@StasDeep force-pushed the StasDeep:fix/issue-3380 branch from 90d3da8 to 0467737 on Aug 18, 2018
@StasDeep (Contributor, Author) commented Aug 18, 2018

@kmike thanks for your comments. Now it works without an Internet connection.

@StasDeep (Contributor, Author) commented Sep 3, 2018

@kmike could you please give it a review pass? Hopefully the final one :)

@dangra added this to the v1.6 milestone on Sep 5, 2018
@kmike changed the title from "Use dont_filter=True for contracts requests" to "[MRG+1] Use dont_filter=True for contracts requests" on Sep 5, 2018
@kmike (Member) commented Sep 5, 2018

@StasDeep yeah, looks good, thanks for the fix!

@dangra merged commit 7223978 into scrapy:master on Sep 6, 2018
3 checks passed:
- codecov/patch: 100% of diff hit (target 84.48%)
- continuous-integration/appveyor/pr: AppVeyor build succeeded
- continuous-integration/travis-ci/pr: The Travis CI build passed