Hardening follow-up to #1664. No known exploit in the wild.
_validate_http_url calls socket.getaddrinfo and rejects the URL if any returned IP is private, loopback, link-local, or multicast. That runs at _HTTPSource.__init__ and on every redirect's Location.
Then _HTTPSource._request calls pool.request('GET', current_url, ...). urllib3 resolves the hostname a second time at connect time. The two resolutions are independent. Between them, a hostile DNS server can return a public IP to the first lookup (so validation passes) and a private IP to the second (the one the TCP connection actually uses). The existing code comment labels this "best-effort" for exactly this reason.
Concrete scenarios:
- DNS rebinding: attacker controls a domain, hands 8.8.8.8 to the validator, then 127.0.0.1 (or 169.254.169.254) to the connect call.
- DNS cache races: low-TTL record flips between validate-time and connect-time.
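The gap can be reproduced without a real DNS server. This is a minimal sketch, not the project's code: validate_and_pick_ip is a hypothetical stand-in for _validate_http_url, and make_rebinding_resolver is an invented fake that returns a different answer on each lookup.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(ip_text):
    """True unless the address is private, loopback, link-local, or multicast."""
    ip = ipaddress.ip_address(ip_text)
    return not (ip.is_private or ip.is_loopback
                or ip.is_link_local or ip.is_multicast)


def validate_and_pick_ip(url, resolver=socket.getaddrinfo):
    """Resolve the URL's host, reject it if ANY answer is non-public,
    and return the first validated IP (hypothetical helper)."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"unsupported URL: {url!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = resolver(parts.hostname, port, type=socket.SOCK_STREAM)
    ips = [info[4][0] for info in infos]
    if not ips:
        raise ValueError(f"no addresses for {parts.hostname!r}")
    for ip in ips:
        if not is_public_ip(ip):
            raise ValueError(f"{parts.hostname!r} resolves to non-public {ip}")
    return ips[0]


def make_rebinding_resolver(answers):
    """Fake getaddrinfo: each call consumes the next IP from `answers`."""
    pending = iter(answers)

    def resolver(host, port, *args, **kwargs):
        ip = next(pending)
        return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", (ip, port))]

    return resolver


if __name__ == "__main__":
    resolver = make_rebinding_resolver(["8.8.8.8", "127.0.0.1"])
    # First lookup: what the validator sees.
    print(validate_and_pick_ip("http://attacker.example/x.tif", resolver))  # 8.8.8.8
    # Second lookup: what an independent connect-time resolution would get.
    print(resolver("attacker.example", 80)[0][4][0])  # 127.0.0.1
```

The validator only ever sees the first answer. A client that resolves again at connect time dials whatever the second answer is.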
Fix:
- _validate_http_url returns the first validated public IP instead of None.
- _HTTPSource opens the TCP socket to that IP literal while the HTTP Host header and TLS SNI stay set to the original hostname (so cert verification still works against the name).
- urllib3's HTTPConnection._dns_host is the right hook. _new_conn uses _dns_host for socket.create_connection, while self.host and self.server_hostname keep the original name for Host header and SNI.
- Each redirect hop re-resolves and re-pins independently. A redirect to a new hostname gets its own fresh validate-and-pin pair.
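The pinning idea can be illustrated with the standard library alone. This sketch uses http.client rather than urllib3, so it is an analogue of the approach and not the code in this PR. PinnedHTTPConnection and its pinned_ip argument are hypothetical names.

```python
import http.client
import socket


class PinnedHTTPConnection(http.client.HTTPConnection):
    """Hypothetical: dial a pre-validated IP, keep the hostname for Host."""

    def __init__(self, host, pinned_ip, **kwargs):
        super().__init__(host, **kwargs)
        self.pinned_ip = pinned_ip

    def connect(self):
        # Connect to the IP literal, so no second DNS lookup happens here.
        # self.host is untouched and still feeds the Host header.
        self.sock = socket.create_connection(
            (self.pinned_ip, self.port), self.timeout
        )


# Usage (needs network access; validated_ip comes from the validation step):
# conn = PinnedHTTPConnection("example.com", validated_ip, timeout=10)
# conn.request("GET", "/")  # sends "Host: example.com"
```

For HTTPS the same shape would apply, assuming the socket is wrapped with ssl.SSLContext.wrap_socket(sock, server_hostname=host) so that SNI and certificate checks use the name, not the IP.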
Residual limitations:
- Within a single hop the IP is pinned. Across redirects, each hop is resolved independently.
- An attacker who controls a hostname that legitimately resolves to multiple public IPs can still influence which one we pick. They can't bend us to a private IP.
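The per-hop behaviour can be sketched as a loop that validates and pins before every request, including each redirect target. Everything here is an assumption for illustration: fetch_with_per_hop_pin, the injected validate and fetch_pinned callables, and the redirect cap of 5 are not taken from the project.

```python
from urllib.parse import urljoin

REDIRECT_STATUSES = (301, 302, 303, 307, 308)


def fetch_with_per_hop_pin(url, validate, fetch_pinned, max_redirects=5):
    """Follow redirects, running validate-and-pin once per hop.

    validate(url) -> pinned IP string, or raises ValueError.
    fetch_pinned(url, ip) -> (status, location_or_None, body).
    Returns (body, hops), where hops lists each (url, pinned_ip) pair.
    """
    hops = []
    current = url
    for _ in range(max_redirects + 1):
        # A fresh resolution and pin for this hop only.
        pinned_ip = validate(current)
        hops.append((current, pinned_ip))
        status, location, body = fetch_pinned(current, pinned_ip)
        if status in REDIRECT_STATUSES and location:
            # Relative Location values resolve against the current URL.
            current = urljoin(current, location)
            continue
        return body, hops
    raise RuntimeError("too many redirects")
```

A redirect to a host that fails validation raises before any connection to that host is attempted, because validate runs at the top of each iteration.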
XRSPATIAL_GEOTIFF_ALLOW_PRIVATE_HOSTS=1 still skips the pin (and the validation) for dev/test against localhost servers.
Refs #1664.