Expose certificate for HTTPS responses #4054
Codecov Report
@@            Coverage Diff             @@
##           master    #4054      +/-   ##
==========================================
- Coverage   83.92%   83.73%   -0.19%
==========================================
  Files         166      165       -1
  Lines        9877     9900      +23
  Branches     1470     1470
==========================================
+ Hits         8289     8290       +1
- Misses       1334     1354      +20
- Partials      254      256       +2
We can have an HTTPS mockserver as well if we want; e.g. https://github.com/scrapinghub/splash/blob/master/splash/tests/mockserver.py has one (if I'm not mistaken, it was originally based on Scrapy's and has evolved since then). Hmm, actually, https://github.com/scrapy/scrapy/blob/master/tests/mockserver.py also has an HTTPS option.
@kmike I'm having trouble with the mockserver in HTTPS mode. Sadly, I don't have a lot of experience with self-signed certificates (I assume that is the case).

diff --git a/tests/test_crawl.py b/tests/test_crawl.py
index 3fc13eeb..94d2d8ec 100644
--- a/tests/test_crawl.py
+++ b/tests/test_crawl.py
@@ -3,6 +3,7 @@ import logging
 from testfixtures import LogCapture
 from twisted.internet import defer
+from twisted.internet.ssl import Certificate
 from twisted.trial.unittest import TestCase
 from scrapy.http import Request
@@ -277,3 +278,19 @@ with multiples lines
         self._assert_retried(log)
         self.assertIn("Got response 200", str(log))
+
+    @defer.inlineCallbacks
+    def test_response_ssl_certificate_none(self):
+        crawler = self.runner.create_crawler(SingleRequestSpider)
+        url = self.mockserver.url("/status?n=200", is_secure=False)
+        yield crawler.crawl(seed=url, mockserver=self.mockserver)
+        self.assertIsNone(crawler.spider.meta['responses'][0].certificate)
+
+    @defer.inlineCallbacks
+    def test_response_ssl_certificate(self):
+        crawler = self.runner.create_crawler(SingleRequestSpider)
+        url = self.mockserver.url("/status?n=200", is_secure=True)
+        yield crawler.crawl(seed=url, mockserver=self.mockserver)
+        cert = crawler.spider.meta['responses'][0].certificate
+        print('Response.certificate:', cert)
+        self.assertIsInstance(cert, Certificate)
@elacuesta Could you push your WIP for certificate tests?
@Gallaecio Pushed 3f76c85, with the changes from #4054 (comment). Sorry I missed this comment.
The issue is in the Scrapy code, not the mock server. When the response body is empty, the certificate is set to […]. I tried reading it from […]. So: […]
Nice, I updated the tests to use […].
I think we should also keep tests with […].
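To make the empty-body case concrete, here is a toy model in plain Python. It is not Scrapy's download handler: it only assumes that the certificate is captured inside a data-received callback, which is one way a response with no body could end up without a certificate. `FakeTransport`, `BodyReader`, `read_response`, and the `peer_certificate` attribute are all hypothetical names.

```python
class FakeTransport:
    """Stand-in for a TLS transport (hypothetical, not a Twisted class)."""

    def __init__(self, peer_certificate):
        self.peer_certificate = peer_certificate


class BodyReader:
    """Toy body reader that only looks at the certificate when data arrives."""

    def __init__(self, transport):
        self.transport = transport
        self.certificate = None
        self.body = b""

    def data_received(self, chunk):
        # Assumption: the certificate is captured here, on the first chunk.
        if self.certificate is None:
            self.certificate = self.transport.peer_certificate
        self.body += chunk

    def connection_lost(self):
        # Called once the body is complete; returns whatever was collected.
        return self.body, self.certificate


def read_response(chunks, peer_certificate="<cert>"):
    """Feed body chunks to a reader and return (body, certificate)."""
    reader = BodyReader(FakeTransport(peer_certificate))
    for chunk in chunks:
        reader.data_received(chunk)
    return reader.connection_lost()


print(read_response([b"hello"]))  # (b'hello', '<cert>')
print(read_response([]))          # (b'', None)
```

With no chunks, `data_received` never runs, so the certificate stays `None` even though the connection was secure. Under that assumption, tests that cover both an empty and a non-empty body would exercise both paths.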
Fixes #2726
The approach I took here is very similar to the one in #3940.
No tests at the moment (not sure how to test this without requesting external pages, since the mockserver uses HTTP; any help is appreciated ❤️). Posting a scrapy shell session instead:
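Separately from that session, here is a minimal sketch of how the exposed attribute might be summarised in a callback. It assumes `response.certificate` is either `None` (plain HTTP) or a `twisted.internet.ssl.Certificate`, as the tests in this PR check. `describe_certificate` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins so the sketch runs without Scrapy or Twisted installed.

```python
from types import SimpleNamespace


def describe_certificate(response):
    """Return a small dict describing response.certificate, or None for plain HTTP."""
    cert = getattr(response, "certificate", None)
    if cert is None:
        return None
    # getSubject() and getIssuer() are methods of twisted.internet.ssl.Certificate;
    # the names they return expose fields such as commonName when present.
    return {
        "subject": getattr(cert.getSubject(), "commonName", None),
        "issuer": getattr(cert.getIssuer(), "commonName", None),
    }


# Stand-ins that mimic only the parts of the real objects used above.
fake_name = SimpleNamespace(commonName=b"localhost")
fake_cert = SimpleNamespace(getSubject=lambda: fake_name, getIssuer=lambda: fake_name)
https_response = SimpleNamespace(url="https://localhost/", certificate=fake_cert)
http_response = SimpleNamespace(url="http://localhost/", certificate=None)

print(describe_certificate(https_response))  # {'subject': b'localhost', 'issuer': b'localhost'}
print(describe_certificate(http_response))   # None
```

Using `getattr` with a default keeps the helper usable on response objects that predate the attribute; whether that fallback is needed depends on the Scrapy version in use.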