Fix flaky test #1587

Merged
dgageot merged 1 commit into docker:main from dgageot:fix-flaky-test
Feb 4, 2026

@dgageot dgageot commented Feb 4, 2026

No description provided.

@dgageot dgageot requested a review from a team as a code owner February 4, 2026 14:27
Signed-off-by: David Gageot <david.gageot@docker.com>

@github-actions github-actions bot left a comment

Code Review Summary

Found 3 issues in the changed code that should be addressed:

  1. Path construction portability issue (line 71) - Using string concatenation instead of filepath.Join
  2. Potential race condition (line 91) - Server goroutine lacks synchronization
  3. Silent error handling (line 91) - Server errors are ignored in goroutine

These issues could affect test reliability and cross-platform compatibility.


tmpDir := t.TempDir()
t.Chdir(tmpDir)
t.Chdir(tmpDir) // Use relative socket path to avoid Unix socket path length limit

Path construction uses string concatenation instead of filepath.Join

The code uses tmpDir + "/session.db" which hardcodes a forward slash, creating non-portable paths that may fail on Windows where backslashes are expected. This is confirmed as a bug because:

  1. The filepath package is already imported and used correctly elsewhere in the same file (line 40)
  2. This creates inconsistency within the same file
  3. It may cause test failures on Windows CI environments

Recommendation: Use filepath.Join(tmpDir, "session.db") instead to ensure correct path separators on all platforms.


srv, err := server.New(t.Context(), sessionStore, &config.RuntimeConfig{}, 0, nil)
require.NoError(t, err)

Server goroutine lacks synchronization, creating potential race condition

The function starts a goroutine with srv.Serve(t.Context(), ln) and immediately returns "cagent.sock" with no synchronization to ensure the server is ready to accept connections. While the Unix socket is created and bound by Listen() before the goroutine starts, there's still a race window between when the goroutine spawns and when http.Server.Serve() enters its accept loop.

If the test's HTTP client (lines 43-49) attempts to connect during this window, the connection could fail intermittently. This is a classic race condition in test setup that could cause test flakiness, which is ironically the issue this PR aims to fix.

Recommendation: Add a readiness check before returning, such as:

  • Attempting to connect in a retry loop
  • Using a channel to signal when the server is ready
  • Using a WaitGroup or similar synchronization primitive

srv, err := server.New(t.Context(), sessionStore, &config.RuntimeConfig{}, 0, nil)
require.NoError(t, err)

go func() {

Server error is silently ignored in goroutine

The goroutine ignores the error from srv.Serve(t.Context(), ln) using _ = srv.Serve(...). If the server fails to start or encounters an error during operation, the test will fail with a confusing "connection refused" error rather than showing the actual server error.

Problems:

  1. Test output visibility: Errors may not be visible in test output depending on log configuration
  2. Race condition: No way to detect if server failed to start before the test proceeds
  3. Error attribution: Test failures will show indirect symptoms rather than root causes
  4. Testing best practice: Errors should be explicitly handled in tests using t.Errorf() or similar

Recommendation: Either send the error through a channel and check it after the test, use t.Errorf() to report the error, or at minimum use t.Logf() to ensure errors are visible in test output.

@dgageot dgageot merged commit 2015d7c into docker:main Feb 4, 2026
5 checks passed