Support huge_tree=False? #5700
Comments
Here in scrapy/scrapy/selector/unified.py Lines 67 to 82 in 4af5a06
Only as long as |
As for alternative approaches, I think we may need to make it so that Maybe we can make |
The upcoming parsel 1.7.0 exposes, and flips, the lxml flag that controls the protection described here, so it's now possible to scrape certain large pages, but presumably malicious pages can now DoS the parser. So it would make sense to be able to disable `huge_tree`, re-enabling the protection. However, as it's an argument for `Selector.__init__()`, it's unclear how to do that in Scrapy: `response.xpath()` uses a hidden `self._cached_selector = Selector(response=self)` and there is nowhere to pass custom arguments.
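One possible interim workaround, sketched below, is to skip `response.xpath()` and build the selector explicitly in the callback. This assumes that Scrapy's `Selector` forwards extra keyword arguments such as `huge_tree` to parsel's `Selector.__init__()`, which I have not verified against every Scrapy version. The helper name `build_selector` and its `selector_cls` parameter are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Optional


def build_selector(
    response: Any,
    selector_cls: Optional[Callable[..., Any]] = None,
    huge_tree: bool = False,
) -> Any:
    """Build a selector for ``response`` with an explicit ``huge_tree`` value.

    Hypothetical helper: it bypasses response.xpath() / response.css(),
    which create their selector internally and accept no extra arguments.
    """
    if selector_cls is None:
        # Imported lazily so the helper can be exercised without Scrapy.
        # Assumption: Scrapy is installed alongside parsel >= 1.7.0 and its
        # Selector passes unknown keyword arguments through to parsel.
        from scrapy.selector import Selector

        selector_cls = Selector
    # huge_tree=False asks lxml to keep its size limits (the protection).
    return selector_cls(response=response, huge_tree=huge_tree)


# Possible use inside a spider callback (not executed here):
#
#     def parse(self, response):
#         sel = build_selector(response)
#         for href in sel.xpath("//a/@href").getall():
#             yield response.follow(href, callback=self.parse)
```

The downside is that every callback has to remember to use the helper, and the result is not cached on the response, so this does not replace a proper way to configure the selector that `response.xpath()` creates.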