headers_received signal #4897
Nice.
(not merging yet due to on-going tests)
c7c6316 to b2ee4b9
Codecov Report
@@            Coverage Diff            @@
##           master    #4897    +/-    ##
==========================================
+ Coverage   87.82%   87.83%   +0.01%
==========================================
  Files         158      158
  Lines        9723     9736     +13
  Branches     1433     1435      +2
==========================================
+ Hits         8539     8552     +13
  Misses        929      929
  Partials      255      255
852bbd4 to 740d8a4
740d8a4 to ea14172
…able in Twisted>=18.4.0
Absent in https://twistedmatrix.com/documents/17.9.0/api/twisted.web._newclient.TransportProxyProducer.html
Available in https://twistedmatrix.com/documents/18.4.0/api/twisted.web._newclient.TransportProxyProducer.html
So glad you managed to find the issue affecting the tests!
Co-authored-by: Adrián Chaves <adrian@chaves.io>
@Gallaecio please review again after 84e91b6
I’m undecided on the right approach to provide the content length.

My initial reaction was: we should first fix Scrapy responses missing the Content-Length header. Then I saw https://github.com/Almad/twisted/blob/d9dccb23ffa63184e376e98bebe6a5cee32953f3/twisted/web2/channel/http.py#L202 and I thought it made sense: Twisted is just following the standard. Then I thought of cases where the header and the content length do not actually match, and how in Scrapy we allow our users to still process those requests. I think exposing the Content-Length header to users would go in line with that, since Scrapy users sometimes need that low-level stuff.

So I’m +0.5 on this. I think the code looks great, I’m just not sure about the length parameter. Adding it as a parameter would require a messy deprecation later if we eventually remove it. But the only alternative I can think of would be to add the header to responses, which I think would be out of the scope of this change, and I’m unsure whether it would be the easiest thing (pick the content length like you did here and set that into the header) or require Twisted monkey patching. If it’s not easy, it’s probably better to have a messy deprecation later than to postpone this further.
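To make the two sources of length in this discussion concrete, here is a minimal sketch of a headers_received handler that looks at both the Content-Length header and a separate length argument, and stops the download when either one exceeds a limit. The handler signature (headers, body_length, request, spider), the MAX_BODY_BYTES limit, and the helper names are assumptions for illustration, not taken from this PR. The fallback StopDownload class only exists so the sketch runs without Scrapy installed.

```python
try:
    from scrapy.exceptions import StopDownload
except ImportError:
    # Stand-in with the same constructor, only so this sketch runs without Scrapy.
    class StopDownload(Exception):
        def __init__(self, *, fail=True):
            super().__init__()
            self.fail = fail

MAX_BODY_BYTES = 1024 * 1024  # hypothetical limit, not from the PR


def declared_length(headers):
    """Return the Content-Length header as an int, or None if absent or invalid."""
    raw = headers.get("Content-Length")
    if raw is None:
        return None
    try:
        return int(raw)  # int() accepts both str and bytes
    except ValueError:
        return None


def should_stop(headers, body_length):
    """True if the header value or the separately reported length is over the limit."""
    sizes = (declared_length(headers), body_length)
    return any(n is not None and n > MAX_BODY_BYTES for n in sizes)


def on_headers_received(headers, body_length, request=None, spider=None):
    # Assumed signature: the parameter names here are illustrative only.
    if should_stop(headers, body_length):
        # fail=False asks for the response to go to the callback, not the errback.
        raise StopDownload(fail=False)
```

Checking both values covers the mismatch case mentioned above, where the header and the actual content length disagree. In a real project the handler would be connected through the crawler's signal manager (crawler.signals.connect) once the signal is available.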
Closes #1772
A new headers_received signal that fires when the headers for a response are received, before starting the download of the body. Handlers for this signal can stop the download of a response by raising a scrapy.exceptions.StopDownload exception, just like the handlers for the bytes_received signal.
Tasks: