Update agent code review guidelines with flaky test findings #9669
Conversation
tests/activity_test.go
Outdated
select {
case <-cancelCh:
case <-ctx.Done():
	s.Fail("Test timed out after 90s waiting for activity cancellation", ctx.Err())
nit: let's not hard-code timeouts (90s) as they might change
case <-listReqDone:
case <-ctx.Done():
	s.Fail("timed out waiting for list nexus endpoints request to complete")
}
non-blocking musings: I wonder if this could be a helper on TestEnv, like s.WaitForChn, that uses the right context automatically. I didn't think it happened that often, but now I'm seeing multiple examples ...
.github/copilot-instructions.md
Outdated
- Use `require.ErrorAs(t, err, &specificErr)` for specific error type checks
- Prefer `require` over `assert` - it's rarely useful to continue a test after a failed assertion
- Add comments explaining why `Eventually` is needed (e.g., eventual consistency)
- Do not use single-value type assertions on errors (`err.(*T)`); this panics instead of failing the test when the type doesn't match. Use `errors.As` with a guarded return, or the two-value form `v, ok := err.(*T)` followed by an explicit `require.True(t, ok)`.
I'd suggest recommending only `errors.As`, and not the longer/less helpful checked type assertion.
This 1-liner is now possible in latest Go:
`s.ErrorAs(err, new(*serviceerror.NotFound))`
That might actually be the best advice IMO.
.github/copilot-instructions.md
Outdated
- Never call testify assertions (`s.NoError`, `s.Equal`, `require.NoError`, even `assert.NoError`) inside a `go func()` — if the goroutine outlives the test, the assertion panics the binary with `panic: Fail in goroutine after TestXxx has completed`. Move assertions to the test goroutine or use a buffered error channel.
- Any `<-ch` that isn't inside a `select` with `ctx.Done()` will hang indefinitely if the sender never sends. Always provide a context cancellation fallback.
- Never write to package-level or global variables in tests — parallel tests share the same process; thread values through function parameters instead.
- Proto message fields accessed outside the workflow lock must be cloned, not aliased: use `common.CloneProto(...)` rather than returning the pointer directly.
Is this test-specific, or does it belong in Concurrency and Safety?
.github/copilot-instructions.md
Outdated
- Never write to package-level or global variables in tests — parallel tests share the same process; thread values through function parameters instead.
- Proto message fields accessed outside the workflow lock must be cloned, not aliased: use `common.CloneProto(...)` rather than returning the pointer directly.
- Never use `time.Sleep` or `time.Since(start) > threshold` to enforce ordering — use channels, `sync.WaitGroup`, or `EventuallyWithT` instead.
- Do not use `RealTimeSource` with a zero-duration delay in unit tests — it creates an infinite spin loop. Use a mock/event time source with a non-zero delay and advance the clock explicitly.
I don't quite follow that one; is that referring to NewTimer?
I'm removing this one; it was a one-off that I don't think happens frequently enough to warrant a rule.
.github/copilot-instructions.md
Outdated
- Proto message fields accessed outside the workflow lock must be cloned, not aliased: use `common.CloneProto(...)` rather than returning the pointer directly.
- Never use `time.Sleep` or `time.Since(start) > threshold` to enforce ordering — use channels, `sync.WaitGroup`, or `EventuallyWithT` instead.
- Do not use `RealTimeSource` with a zero-duration delay in unit tests — it creates an infinite spin loop. Use a mock/event time source with a non-zero delay and advance the clock explicitly.
- When combining a short-lived background operation with a propagation wait, verify the timeouts are compatible: if the background op can time out before the condition is satisfied, the propagation wait will never succeed.
Maybe AI will get it, but I don't :D What does "propagation wait" mean here?
Meaning an Eventually that is waiting for an async operation to take effect. Rewording to be clearer.
.github/copilot-instructions.md
Outdated
- Prefer `sync.Mutex` over `sync.RWMutex` almost always, except when reads are much more common than writes (>1000×) or readers hold the lock for significant time
- Don't do IO while holding locks - use side effect tasks
- Clone data before releasing locks if it might be modified
- Be suspicious of `go s.someHelper(ctx, ...)` calls where the goroutine runs exactly once and the test then immediately waits for something that helper was supposed to cause. If the operation can fail transiently (network, tight deadline, busy CI), the single attempt may fail silently and the wait will never succeed. Either loop the goroutine until `ctx.Done()`, or check that the operation succeeded before proceeding.
nit: should this be in ## 3. Testify Suite Correctness and Reliability?
## What changed?
Add new review guidelines for test reliability and fix instances found in the current codebase.

## Why?
I have found these to be the most common patterns across the last two weeks of targeting flaky tests.

## How did you test it?
- [ ] built
- [ ] run locally and tested manually
- [ ] covered by existing tests
- [ ] added new unit test(s)
- [ ] added new functional test(s)

## Potential risks
Minimal