Skip to content

fix(util): Fix flaky TestWorkerPool_TestMetric race condition#7497

Merged
yeya24 merged 1 commit intomasterfrom
fix/worker-pool-flaky-test
May 11, 2026
Merged

fix(util): Fix flaky TestWorkerPool_TestMetric race condition#7497
yeya24 merged 1 commit intomasterfrom
fix/worker-pool-flaky-test

Conversation

@yeya24
Copy link
Copy Markdown
Contributor

@yeya24 yeya24 commented May 10, 2026

What this PR does

Fixes the flaky TestWorkerPool_TestMetric test in pkg/util.

Root Cause

The test was racy because the first Submit() could hit the select default case (fallback path) if the worker goroutine hadn't started receiving on the channel yet. Since the channel is unbuffered and Submit uses a non-blocking select:

select {
case s.serverWorkerChannel <- f:
default:
    s.fallbackTotal.Inc()
    go f()
}

If the single worker goroutine wasn't ready to receive when the first job was submitted, both jobs would go through the fallback path, incrementing the counter to 2 instead of the expected 1.

Fix

Add a started channel that the first job closes once it begins executing. The test waits on this channel before submitting the second job, ensuring the worker is actually busy.

Verification

Ran the test 50 times with -count=50 and with -race — all pass consistently.

Copy link
Copy Markdown
Member

@SungJin1212 SungJin1212 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, can you get the rebase?

@dosubot dosubot Bot added the lgtm This PR has been approved by a maintainer label May 10, 2026
The test was racy because the first Submit() could hit the select
default case (fallback path) if the worker goroutine had not yet
started receiving on the channel. This caused the fallback counter
to be 2 instead of the expected 1.

Fix by adding a started channel that the first job closes once it
begins executing, ensuring the worker is actually busy before the
second job is submitted.

Signed-off-by: Ben Ye <benye@amazon.com>
@yeya24 yeya24 force-pushed the fix/worker-pool-flaky-test branch from 40e9e6a to 2aa711f Compare May 11, 2026 00:57
@yeya24 yeya24 merged commit 3d38903 into master May 11, 2026
37 checks passed
@yeya24 yeya24 deleted the fix/worker-pool-flaky-test branch May 11, 2026 04:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm This PR has been approved by a maintainer size/XS type/flaky-test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants