Do not use infinite slow write timers #1670
Merged
Summary
References #1629
This PR removes the use of 'infinite' slow write timers (i.e. timers which expire at the end of time). Previously these slow write timers were leaked and would slowly build up over time as connections were recycled.
Background
This 'bug' is a bit of a weird feature: the perpetual reference to the `net.Conn` ensures that the connection is not closed when library users don't correctly close it themselves, and this in turn becomes visible when the connection limit is hit.

We discovered this while debugging why some of our tests were failing after #1629. The commit that triggered our issues was cd2986e, so we started looking into what in those changes could cause the problem.

The `slowWriteTimer` keeps a perpetual reference to the `net.Conn` (via the background reader). This ensures that the `net.Conn` is never garbage collected, which in turn means that its finalizer is never run. The finalizer for a `net.Conn` under Unix closes the file descriptor.

What we were doing wrong in our code was that we did not close some rows returned by `QueryContext`, although we did stop referencing everything related to the `*sql.DB` (after calling `Close` on it). However, `*sql.DB` works with a reference counting mechanism, and because we did not close the `sql.Rows`, the underlying connection was never properly closed.

Our application code works correctly both with and without the fix proposed in this PR, but I would still suggest not using these timers, as they are inherently leaky.
Changes
`Stop` the timers when they should not fire, and `Reset` them otherwise.
Notes for reviewers
I use `gofumpt`, which is a bit stricter than `gofmt`, and it picked up some whitespace changes. Let me know if I should remove these.