IPv6 addresses not correctly recognized #1832
Comments
This second example result requires running a webserver on localhost and having a working IPv6 stack.
If anyone wonders why this
@nyov, FYI, I tried with #1874, contacting www.google.com over IPv6, and the host was passed correctly to Twisted. What Wireshark sniffed (it did contact the correct endpoint at
Scrapy shell showing HTTP 400 (presumably for a bad Host header)
Sounds like a big improvement. Thanks for the work! :)
I was about to close this, as all the linked patches seem to be merged. But I can't follow all the issues and backports spawned from #1874 well enough to figure out which commit might be missing on master but made it into 1.1.
This currently doesn't work (with "ValueError: invalid hostname: :") because of scrapy/w3lib#193, but if I downgrade w3lib to 1.22.0, the URL is parsed correctly and not escaped. So no further changes in Scrapy are needed.
As a follow-up to #1116: Scrapy does not recognize IPv6 addresses correctly.
IPv6 address notation should be written inside brackets, as [<ip>]. (Check browser behavior for http://::1/ and http://[::1]/, but beware of the wrongly URL-escaped [] when copying the second link.)
Scrapy seems to do the exact opposite:
...without the brackets it seems to work. Where it shouldn't, IMO.
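For reference, here is a minimal sketch of the bracket convention using only the Python standard library. It is not Scrapy's or w3lib's code. `format_host` is a hypothetical helper invented for this illustration, and port 8080 is an arbitrary choice.

```python
import ipaddress
from urllib.parse import quote, urlsplit


def format_host(host: str) -> str:
    """Hypothetical helper: render a host as it should appear in a URL authority."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return host  # not an IP literal, e.g. a DNS name such as "localhost"
    # IPv6 literals need brackets so their colons are not read as a port separator.
    return f"[{host}]" if ip.version == 6 else host


url = f"http://{format_host('::1')}:8080/"
parts = urlsplit(url)

print(url)             # http://[::1]:8080/
print(parts.hostname)  # ::1  (brackets are stripped from the parsed hostname)
print(parts.port)      # 8080

# Percent-encoding the bracketed host yields the "wrongly urlescaped" form
# mentioned above, which is no longer an IPv6 literal.
print(quote("[::1]"))  # %5B%3A%3A1%5D
```

The brackets belong to the URL syntax, not to the address itself, which is why the parsed hostname comes back without them while the URL string has to keep them unescaped.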