Default downloader fails to get page #355
It looks like the default downloader, which is implemented with the Twisted library, can't fetch the URL used below. I ran 'scrapy shell http://autos.msn.com/research/userreviews/reviewlist.aspx?ModelID=14749' and got the following output.
However, both urllib2's urlopen and requests.get download the page without problems.
The initial cause of the error is that there is a cookie header line that is too long:
This is caught by
But the Scrapy implementation of the transport does not have a
Which is caught here:
By the infamous catch-all
The code in
Even if this has been fixed upstream, backporting the fix to the old pre-13 xlib/tx code may be more trouble than it is worth. So I would propose closing this issue (and reporting it to Twisted if the problem persists).
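For anyone who wants to check whether a response runs into the over-long header line described above, here is a minimal sketch using only the standard library. It is not the code Scrapy or Twisted runs. The function name `find_overlong_header_lines` is hypothetical, and the 16384-byte limit is an assumption based on the default `MAX_LENGTH` of Twisted's `LineReceiver`; the parser involved in this issue may use a different value.

```python
# Assumption: 16384 bytes is the default MAX_LENGTH of Twisted's LineReceiver.
# The HTTP parser involved in this issue may be configured with another limit.
ASSUMED_MAX_LINE_LENGTH = 16384


def find_overlong_header_lines(raw_headers, max_length=ASSUMED_MAX_LINE_LENGTH):
    """Return (header name, line length) pairs for lines longer than max_length.

    raw_headers is the raw header block as bytes, with CRLF line endings.
    """
    overlong = []
    for line in raw_headers.split(b"\r\n"):
        if len(line) > max_length:
            # Everything before the first colon is the header name.
            name = line.split(b":", 1)[0].decode("latin-1")
            overlong.append((name, len(line)))
    return overlong
```

Feeding it the raw headers captured with another client (for example a dump from curl) should show whether a cookie line exceeds the assumed limit.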