Skip to content

Conversation

@cyningsun
Copy link
Contributor

Bug Description

A critical defect was identified in the connection pool's turn management mechanism within the putIdleConn method, breaking the fundamental 1:1 correspondence between turns and connections. ( Introduced by #3518)

Root Cause Analysis

Turn Release Scenarios (Expected Behavior):

- new connection failed
  → freeTurn: allow other goroutines to create connections
- pool.Put 
  → freeTurn: allow acquiring existing idle connections  
- pool.putIdleConn
  ├── deliver connection to waiting goroutine
  │   → ❌ DEFECT: turn should be released by receiving goroutine via Put
  └── place connection in idle pool
      → freeTurn: allow acquiring existing idle connections

The Defect: In putIdleConn, when delivering connections to waiting goroutines, the turn was incorrectly released immediately instead of being transferred to the receiving goroutine for proper release via Put.

Test Gap: The test case "should not leak turn when delivering connection via putIdleConn" contained flawed assertions due to misunderstanding of turn ownership mechanism in the context of connection delivery, failing to detect the leakage.

Relationship to #3625:
This PR addresses the same turn management issue identified in #3625. I want to acknowledge and thank @ndyakov for:

This PR offers an additional perspective on the solution, focusing specifically on turn ownership semantics. It's intended to provide more options for review and discussion as we work towards the best resolution for this issue.

Note to Reviewers:
This PR is optional and can be closed if #3625 covers all needs. It's just here to offer another perspective.

This commit fixes a critical race condition where freeTurn() could be
called twice in the connection pool's queuedNewConn flow, causing turn
counter inconsistency.

Problem:
- When a new connection creation failed in queuedNewConn, both the
  defer handler and the dialing goroutine could call freeTurn()
- This led to turn counter underflow and queue length inconsistency

Solution:
- Modified putIdleConn to return a boolean indicating whether the
  caller needs to call freeTurn()
  - Returns true: connection was put back to pool, caller must free turn
  - Returns false: connection was delivered to a waiting request,
    turn will be freed by the receiving goroutine
- Updated queuedNewConn to only call freeTurn() when putIdleConn
  returns true
- Improved error handling flow in the dialing goroutine

Changes:
- putIdleConn now returns bool instead of void
- Added comprehensive documentation for putIdleConn behavior
- Refactored error handling in queuedNewConn goroutine
- Updated test cases to reflect correct turn state expectations

This ensures each turn is freed exactly once, preventing resource
leaks and maintaining correct queue state.
@jit-ci
Copy link

jit-ci bot commented Nov 29, 2025

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

@ndyakov ndyakov requested a review from Copilot November 29, 2025 21:37
Copilot finished reviewing on behalf of ndyakov November 29, 2025 21:42
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a critical bug in the connection pool's turn management system that could cause turn leaks and incorrect semaphore accounting. The fix ensures proper transfer of turn ownership when connections are delivered to waiting goroutines.

Key changes:

  • Modified putIdleConn to return a boolean indicating whether the caller should free the turn
  • Updated queuedNewConn logic to conditionally free turns based on putIdleConn's return value
  • Corrected test assertions to reflect the proper turn ownership semantics

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
internal/pool/pool.go Modified putIdleConn to return boolean for turn management, refactored queuedNewConn error handling to properly free turns only when connections are added to idle pool rather than delivered to waiting goroutines
internal/pool/pool_test.go Updated test expectations to correctly validate that turns are held during connection delivery and only freed after the receiving goroutine completes its own dial or returns the connection via Put()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@ndyakov
Copy link
Member

ndyakov commented Nov 29, 2025

I do personally like this approach better, it is easier to follow and read. Thank you @cyningsun. Will let @ofekshenawa decide which one to include. Could you please sync the rest of the changes (the context calculation and the tests).

@ndyakov ndyakov requested a review from ofekshenawa November 29, 2025 21:53
@cyningsun cyningsun force-pushed the bugfix/double_freeturn branch 3 times, most recently from e6b0ac5 to 8e24b4a Compare November 30, 2025 02:21
Synced from https://github.com/redis/go-redis/tree/ndyakov/freeturn-fix

Changes include:
- Add comprehensive tests for double freeTurn bug detection
- Improve context timeout calculation using min(remaining time, DialTimeout)
- Prevent goroutines from waiting longer than necessary

Co-authored-by: Nedyalko Dyakov <1547186+ndyakov@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@cyningsun cyningsun force-pushed the bugfix/double_freeturn branch from 8e24b4a to be8071f Compare November 30, 2025 02:31
@cyningsun
Copy link
Contributor Author

@ndyakov, thank you! I'm glad you like the approach. I've gone ahead and synced the context calculation and tests from #3625. All tests are passing.

@ndyakov ndyakov added the bug label Dec 1, 2025
@ndyakov ndyakov merged commit d4ae523 into redis:master Dec 1, 2025
21 checks passed
ndyakov added a commit that referenced this pull request Dec 1, 2025
…on leaks (#3626)

* fix(pool): prevent double freeTurn in queuedNewConn

This commit fixes a critical race condition where freeTurn() could be
called twice in the connection pool's queuedNewConn flow, causing turn
counter inconsistency.

Problem:
- When a new connection creation failed in queuedNewConn, both the
  defer handler and the dialing goroutine could call freeTurn()
- This led to turn counter underflow and queue length inconsistency

Solution:
- Modified putIdleConn to return a boolean indicating whether the
  caller needs to call freeTurn()
  - Returns true: connection was put back to pool, caller must free turn
  - Returns false: connection was delivered to a waiting request,
    turn will be freed by the receiving goroutine
- Updated queuedNewConn to only call freeTurn() when putIdleConn
  returns true
- Improved error handling flow in the dialing goroutine

Changes:
- putIdleConn now returns bool instead of void
- Added comprehensive documentation for putIdleConn behavior
- Refactored error handling in queuedNewConn goroutine
- Updated test cases to reflect correct turn state expectations

This ensures each turn is freed exactly once, preventing resource
leaks and maintaining correct queue state.

* fix: sync double freeturn bug fix and context calculation from upstream

Synced from https://github.com/redis/go-redis/tree/ndyakov/freeturn-fix

Changes include:
- Add comprehensive tests for double freeTurn bug detection
- Improve context timeout calculation using min(remaining time, DialTimeout)
- Prevent goroutines from waiting longer than necessary

Co-authored-by: Nedyalko Dyakov <1547186+ndyakov@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Nedyalko Dyakov <nedyalko.dyakov@gmail.com>
Co-authored-by: Nedyalko Dyakov <1547186+ndyakov@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants