-
Notifications
You must be signed in to change notification settings - Fork 2.5k
fix(pool): double freeturn bug #3625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
96d0e65 to
4274cff
Compare
17acb71 to
c80f50b
Compare
|
@cyningsun feel free to review if you can. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a critical double-free bug in the connection pool that caused the pool to get stuck with aggressive timeout configurations. The bug occurred when a dial goroutine created a connection, the original waiter timed out, and the connection was delivered to another waiter - both the dial goroutine and the recipient would call freeTurn(), causing the semaphore to be released twice and allowing more concurrent operations than the pool size permits.
Key Changes
- Added
gotConnatomic boolean to track whether a waiter successfully received a connection - Modified dial goroutine logic to conditionally call
freeTurn()based on whether the original waiter received a connection - Enhanced context timeout handling in
redis.goto respect caller deadlines during connection initialization waits - Added comprehensive test coverage for the double-free scenario
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
redis.go |
Enhanced timeout logic to use minimum of remaining command timeout and DialTimeout when waiting for connection initialization |
internal/pool/want_conn.go |
Added atomic gotConn field to track whether waiter received a connection, enabling proper turn management |
internal/pool/pool.go |
Modified dial goroutine to conditionally free turns based on waiter's connection receipt status, preventing double-free |
internal/pool/double_freeturn_test.go |
Added test demonstrating the double-free bug scenario with slow dials and timeouts |
internal/pool/double_freeturn_simple_test.go |
Added simple test to detect double-free by checking queue length during critical window |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
1b6e627 to
9710b24
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Thanks for the excellent work @ndyakov! Your test coverage and solution look comprehensive. I've prepared #3626 as an alternative approach focusing on turn ownership semantics. It's offered in case it provides any additional useful insights, but please feel free to close it if #3625 already covers all the necessary ground. |
|
#3626 was merged instead |
A regression was reported where with aggressive timeout configuration the pool was stuck. Traced back to #3518
This PR fixes #3619 and #3623