Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal: net/http: add .MaxConnLifespan or .SetMaxConnLifespan() to Transport #54429

Open
josephcopenhaver opened this issue Aug 13, 2022 · 7 comments
Labels
Milestone

Comments

@josephcopenhaver
Copy link

josephcopenhaver commented Aug 13, 2022

Currently there is no way to signal to a client via the http.Transport struct that a connection should not be used beyond a specific timeout since it was originally created.

Such a feature is useful to some authors as upstreams referenced via dns gain more ip addresses over time and scale-out horizontally. When this occurs it may be in the client's best interest to rebalance connections amongst the available upstream nodes to avoid creating hot nodes in the upstream

It is a naive implementation as really there is no need to reset connections in such a situation until there are indeed more upstreams available to connect to. In addition there's no need to kill all connections in such topologies repeatedly, just the ones we intend to move over to the newly available upstream ips. ( more discussion on the connection pool FIFO behavior being problematic for maintaining the fewest number of connections necessary can be found here )

Despite these drawbacks the benefits are clear for persons who accept the overhead of creating new connections in such a fashion if they wish to avoid hot partitions and do not have load-balancing capabilities offered by traditional expensive cloud provider appliances and mesh framework sidecar proxies. In short this is beneficial for the exact same reasons it's beneficial in https://pkg.go.dev/database/sql 's package via the SetConnMaxLifetime function. An argument can be made this is a tradeoff, but it appears to be one people need. It seems safest to provide something in the net/http.Transport struct rather than encourage people to implement their own transport or adopt an alternative.

This is also beneficial in situations where the maintainers know the upstream server will attempt to close the connections after a max lifetime / idle rate but for some edge cases such as network congestion or an event equivalent to complete packet loss. When the latter occurs and the client has only idle connections it can take quite a few seconds to recover and finally start sending requests again to the new upstream instances ( depending on the idle connection pool max size, keep alive settings, and the client's throughput/burst behaviors )


This would successfully address #23427 's concerns.

An example toy implementation adding MaxConnLifespan as a data member has already been created and a PR is open with passing tests: #46714 .

Minor tweaking to the above is required if exposing a way to mutate the MaxConnLifetime value over time is a desirable feature. By adding this functionality to the existing mechanisms for computing idle timeouts we also make sure the clients close the connections without maintaining them actively in the connection pool just to close them whenever the client finally gets around to using the connection again assuming we have a high idle connection timeout.

@gopherbot gopherbot added this to the Proposal milestone Aug 13, 2022
@ianlancetaylor
Copy link
Contributor

ianlancetaylor commented Aug 13, 2022

CC @neild @bradfitz

@szuecs
Copy link

szuecs commented Sep 15, 2022

Just a small correction:
Also cloud load balancers will not completely solve the problem of reusing forever old connections. For example NLB with preserving source-ip will forward the traffic always to the same set of old LB members. I observed this in a wrong configure apache java http client.

@josephcopenhaver
Copy link
Author

josephcopenhaver commented Sep 15, 2022

In general, if the user was able to listen to dns inquiries + results for a http client, set a policy to choose which ip to connect to for a dns response when constructing a new connection, observe when connections are closed, and alter an internal max usage deadline for the connection they could have MUCH more power to control their destiny around discovery, fan-out, fan-in, etc without us making a potentially too limited feature.

I might make a separate proposal covering all the other edge cases if this one is for some reason denied.

@szuecs
Copy link

szuecs commented Sep 15, 2022

@josephcopenhaver I agree with your proposal but do not have load-balancing capabilities offered by traditional expensive cloud provider appliances and mesh framework sidecar Proxies seems not correct. If you take for example Google's maglev algorithm it won't fix the problem without your proposed fix. The same is true for NLB in AWS. I am sure if you have a big and efficient enough client that you can observe the latency issue in cloud layer 7 load balancers, too. I saw it already in aws alb and I see it in my own metrics in skipper in zalando infrastructure.

@josephcopenhaver
Copy link
Author

josephcopenhaver commented Sep 16, 2022

Understood. @szuecs Is there a change you would like me to make? I think my opening statement is still accurate if the right appliance config and http protocol version is utilized on some cloud vendors. I am purposefully not trying to be too specific. Most people are likely not observing hot nodes because their clients are using a default MaxIdleConnsPerHost value which is only two so they are constantly opening new/closing old connections at high throughputs.

I hope it's clear my prior github comment was not in response to your original clarifying comment on this proposal ( which was appreciated ).

@szuecs
Copy link

szuecs commented Sep 16, 2022

@josephcopenhaver I would personally drop the mentioned sentence, but now we have a discussion which clarifies the point anyway, so from my side everything is good. Thanks for your proposal and PR work!

@josephcopenhaver
Copy link
Author

josephcopenhaver commented Sep 19, 2022

Please let me know if a much more detailed proposal document is warranted. What docs I could find said it's not required until requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Incoming
Development

No branches or pull requests

4 participants