Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ProxyManager doesn't honor HTTPConnection.default_socket_options #2936

Open
jbrunette opened this issue Mar 14, 2023 · 10 comments
Open

ProxyManager doesn't honor HTTPConnection.default_socket_options #2936

jbrunette opened this issue Mar 14, 2023 · 10 comments
Labels
💰 Bounty $100 If you complete this issue we'll pay you $100 on OpenCollective! Contributor Friendly ♥

Comments

@jbrunette
Copy link

jbrunette commented Mar 14, 2023

ProxyManager doesn't honor HTTPConnection.default_socket_options. I'm attempting to enable TCP Keep Alive which I monitor via "netstat -tnope" on Linux. Only the third block below works when using a Proxy.

Without a proxy, default_socket_options enables TCP Keep Alive successfully
from urllib3.connection import HTTPConnection
import urllib3
HTTPConnection.default_socket_options = HTTPConnection.default_socket_options + [
(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
]
http = urllib3.PoolManager()
http.request('GET', url)

With a proxy, default_socket_options does not enable TCP Keep Alive (or any other socket option)
from urllib3.connection import HTTPConnection
import urllib3
HTTPConnection.default_socket_options = HTTPConnection.default_socket_options + [
(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1),
]
http = urllib3.ProxyManager(proxy_url)
http.request('GET', url)

With a proxy, ProxyManager constructor param "socket_options" does enable TCP Keep Alive
from urllib3.connection import HTTPConnection
import urllib3
http = urllib3.ProxyManager(proxy_url, socket_options=HTTPConnection.default_socket_options + [
(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
])
http.request('GET', url)

This was observed while using the "requests" module to connect to a URL via a proxy that has a short idle timeout, and the common "HTTPConnection.default_socket_options" workaround did nothing. I couldn't find a way to pass "socket_options" to ProxyManager through "requests" methods.

@jbrunette
Copy link
Author

urllib3 1.26.15

@sethmlarson
Copy link
Member

Is your proxy HTTP or HTTPS? If the proxy is HTTPS you'd need to set the socket options on HTTPSConnection instead of HTTPConnection.

@jbrunette
Copy link
Author

No change when assigning to HTTPSConnection.default_socket_options.

Is it related to src/urllib3/connectionpool.py's "if self.proxy:" check, which states "We cannot know if the user has added default socket options, so we cannot replace the list." ?

@sethmlarson sethmlarson added Contributor Friendly ♥ 💰 Bounty $100 If you complete this issue we'll pay you $100 on OpenCollective! labels Jan 10, 2024
@sethmlarson
Copy link
Member

I went digging for why we even disable Nagle for proxies, remembering that we recently did some work to reduce packet fragmentation for proxies, specifically the CONNECT to initiate a tunnel.

I followed this trail:

@shazow @sigmavirus24 did either of you have insight into this? I'm kinda wondering if we should be universally disabling Nagle's and trying to combine request headers into a single send call instead of slowing down small proxy requests.

@sigmavirus24
Copy link
Contributor

So, to be clear, in general Nagle's algorithm is disabled for non-proxy traffic. The socket options are kind of confusing. I think we want the algorithm enabled though for proxies. Depending on the layer the proxy acts at, it may have behavior that isn't ideal if we break things up over more packets. Remember not all proxies act at the same layer of the network.

@sigmavirus24
Copy link
Contributor

Also, I'm not certain that at the level urllib3 is operating we can guarantee that headers are all in one packet.

Also, we're still supporting 3.8, right? So the fix for cpython is only so helpful until we drop that. Given that not everyone might have that fix in 3.9 or 3.10 I don't think we can meaningfully do anything until the better behavior is guaranteed for all users

@sethmlarson
Copy link
Member

sethmlarson commented Jan 17, 2024

@sigmavirus24

Remember not all proxies act at the same layer of the network.

This is the piece that I was missing mentally, thank you! We should continue using Nagle for proxies.

@nicoepp
Copy link

nicoepp commented Jan 30, 2024

Are the socket_options set correctly from HTTPConnection.default_socket_options?

I'm looking at the following code:

class HTTPConnection(_HTTPConnection):

    # ...

    #: Disable Nagle's algorithm by default.
    #: ``[(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)]``
    default_socket_options: typing.ClassVar[connection._TYPE_SOCKET_OPTIONS] = [
        (socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    ]

    # ...

    def __init__(
        self,
        host: str,
        port: int | None = None,
        *,
        timeout: _TYPE_TIMEOUT = _DEFAULT_TIMEOUT,
        source_address: tuple[str, int] | None = None,
        blocksize: int = 16384,
        socket_options: None
        | (connection._TYPE_SOCKET_OPTIONS) = default_socket_options,
        proxy: Url | None = None,
        proxy_config: ProxyConfig | None = None,
    ) -> None:
    
    # ...

and if I do conn = HTTPConnection('httpbin.org', 80), the output between conn.socket_options and conn.default_socket_options always seems to differ even if I have HTTPConnection.default_socket_options set to the keep alive configuration suggested by @jbrunette. It seems like socket_options doesn't take the value of default_socket_options here.

I might be missing something obvious though.

@nicoepp
Copy link

nicoepp commented Jan 30, 2024

Setting the default options should have worked before this commit: 287052a

Something like this should fix it:

def __init__(
    self,
    # ...
    socket_options: None
    | (connection._TYPE_SOCKET_OPTIONS) = HTTPConnection.default_socket_options,
    # ...
) -> None:
    # ...

@nicoepp
Copy link

nicoepp commented Feb 15, 2024

@sethmlarson Thoughts?

I'd be happy to create a PR addressing this issue but would need some guidance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💰 Bounty $100 If you complete this issue we'll pay you $100 on OpenCollective! Contributor Friendly ♥
Projects
None yet
Development

No branches or pull requests

4 participants