Optional rate limiting of new connections to help prevent churn #79
Force-pushed from 309426e to 475b492.
Updated to remove the signature change.
I wrote the following code to test the behavior with and without my changes, and with and without a limited pool: https://gist.github.com/fastest963/1976bf549c91998f9f37435663eb66cf

before:
after (not limited):
after (limited):

To summarize:
Force-pushed from 4609d14 to f2acfb0.
I implemented some of your changes in the pool in radix.v3. I think the option pattern is good here, especially since radix.v2 is split up into separate packages, so it's even a little cleaner. By making
Force-pushed from 2cdb82e to 79b3e87.
I implemented the

Here are some new numbers from the test:

after (buffer size: 10*poolSize) (no creation limit)
after (buffer size: 10*poolSize) (limit headroom: 10, interval: 100ms)
Some small changes in the main commit. Did you mean to include those other two commits that you'd already submitted PRs for?
pool/doc.go (outdated):

```go
// Rate limited pool
//
// A standard pool creates new connections on the fly when necessary. You can
// create a "limited" pool using NewLimited if you want to rate limit those new
```
NewLimited is no longer a thing
pool/pool.go (outdated):

```diff
-func NewCustom(network, addr string, size int, df DialFunc) (*Pool, error) {
+func NewCustom(network, addr string, size int, df DialFunc, os ...Opt) (*Pool, error) {
+	defaultPoolOpts := []Opt{
+		PingInterval(10 * time.Second / time.Duration(size)),
```
a size of 0 must be supported
pool/pool.go
Outdated
} | ||
|
||
// additionally, if there are any connections in the reserve pool, they're closed | ||
// periodically so that within 100 seconds a full reserve will be emptied |
this comment isn't quite right, no longer always 100 seconds
pool/pool.go (outdated):

```go
// remove one from the reservePool if there is any
select {
case conn := <-p.reservePool:
	conn.Close()
```
might as well try and put it in the regular pool if there's room there
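The suggestion above can be sketched with nested selects: when evicting a connection from the reserve, first attempt a non-blocking send into the main pool, and only close it if the main pool is also full. The struct shape and field names here are assumptions based on the diff snippets, not the PR's actual code:

```go
package main

import "fmt"

// conn stands in for the pool's client type.
type conn struct{ id int }

func (c *conn) Close() { fmt.Printf("closed conn %d\n", c.id) }

type pool struct {
	pool        chan *conn // main pool
	reservePool chan *conn // overflow reserve
}

// drainReserveOnce removes one connection from the reserve, if any,
// and tries to move it into the main pool before falling back to
// closing it (the reviewer's suggestion).
func (p *pool) drainReserveOnce() {
	select {
	case c := <-p.reservePool:
		select {
		case p.pool <- c:
			// there was room in the main pool; keep the connection
		default:
			c.Close()
		}
	default:
		// reserve is empty; nothing to do
	}
}

func main() {
	p := &pool{pool: make(chan *conn, 1), reservePool: make(chan *conn, 2)}
	p.reservePool <- &conn{1}
	p.reservePool <- &conn{2}
	p.drainReserveOnce() // conn 1 moves into the main pool
	p.drainReserveOnce() // main pool is full, so conn 2 is closed
	fmt.Println("main pool size:", len(p.pool))
}
```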
Yes, unfortunately I made this PR from a branch that already included those commits. When you merge those PRs, I'll rebase this one.
I updated the docs file, added a test for the 0-sized pool (with a buffer), and added a test to ensure that reserve-evicted connections are added back to the main pool.
@mediocregopher Is there any chance to merge it fast? It would be helpful in our project and I don't want to change the library that we are using :)
Sorry that took so long, @fastest963. I'm gonna just close those other PRs, it was easier to merge them all here.
@fastest963 @mediocregopher It would be nice to have some timeout in

should be something like:

What do you think about that? Or maybe context.Context support and a deadline/timeout? But it would require more changes in the code...
@mediocregopher did something similar in v3 and I can add that as a separate option function, but it's probably better to instead make a GetContext method that accepts a context.
Might as well add that third argument; I'd rather break the API now, when it's only been a few days since merging and probably not that many people are using it, than add an ugly extra method. I don't think messing around with contexts is a good idea, as nothing else in this project uses them.
@mieczkowski @mediocregopher added a new option here: #88 |
We've been having an issue in production where thousands of connections are created/closed/repeat within a few seconds and the service runs out of file descriptors (because the old connections are left in TIME_WAIT), leaving it in an unusable state. This PR adds a few things:

First, a "reserve pool" was added to `Pool` with a capacity of `10*size`. If the main pool is full, this reserve pool is used to temporarily hold onto a client in case it is immediately needed again. This should help prevent connection churn if more than `size` connections are needed temporarily. Over a period of 100 seconds, the reserve pool is drained, re-using the ping timer.

Second, instead of potentially creating a new connection just to ping, we only look for a client in the pool.

Third, an optional "limited" pool can be created with a token bucket system that only allows `size` clients to be created roughly every 10 seconds. The bucket is refilled at a rate of one token every `10/size` seconds. This also re-uses the ping timer. I left this default to off since technically it's backward-incompatible, but we plan on turning it on in almost all cases. A `PoolLimited` bool was added to the `cluster.Opts` struct to enable this functionality.
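The token bucket scheme described above can be sketched as follows. The standalone shape, names, and dedicated refill goroutine are mine (the actual PR re-uses the ping timer for refills), but the rates match the description: bucket capacity `size`, one token back every `10/size` seconds:

```go
package main

import (
	"fmt"
	"time"
)

// limiter is a minimal token bucket: creating a new connection takes a
// token, and a background ticker puts one back every refill interval.
type limiter struct {
	tokens chan struct{}
}

func newLimiter(size int) *limiter {
	l := &limiter{tokens: make(chan struct{}, size)}
	for i := 0; i < size; i++ {
		l.tokens <- struct{}{} // start with a full bucket
	}
	// one token every 10/size seconds => at most ~size new
	// connections per 10 seconds once the bucket is drained
	refill := 10 * time.Second / time.Duration(size)
	go func() {
		for range time.Tick(refill) {
			select {
			case l.tokens <- struct{}{}:
			default: // bucket already full
			}
		}
	}()
	return l
}

// allow reports whether a new connection may be created right now.
func (l *limiter) allow() bool {
	select {
	case <-l.tokens:
		return true
	default:
		return false
	}
}

func main() {
	l := newLimiter(3)
	for i := 0; i < 5; i++ {
		fmt.Println(l.allow()) // first 3 true, then false
	}
}
```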