Fix race condition in ProcessBufferedRenderBatches_WritesRenders#65783
Conversation
7cb7ae2 to
d1a68f9
Compare
There was a problem hiding this comment.
Pull request overview
This PR re-enables and stabilizes the previously quarantined ProcessBufferedRenderBatches_WritesRenders test by changing how it waits for the renderer to finish processing the first render batch, aiming to eliminate intermittent CI failures in the Components Server circuit rendering tests.
Changes:
- Removes the
[QuarantinedTest]attribute fromProcessBufferedRenderBatches_WritesRendersso it runs in CI again. - Replaces an unused
ManualResetEventSlimwait with a timeout-bounded loop intended to wait for render-batch processing to complete.
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Pull request overview
This PR fixes an intermittent CI failure in the Components Server RemoteRendererTest by replacing a broken synchronization mechanism with a dispatcher-queued no-op, and re-enables the previously quarantined test.
Changes:
- Remove the quarantine marker from
ProcessBufferedRenderBatches_WritesRendersso it runs in CI again. - Replace the unused/unsignaled
ManualResetEventSlimwait withawait renderer.Dispatcher.InvokeAsync(() => { })to correctly sequence dispatcher work afterSetResult().
You can also share your feedback on Copilot code review. Take the survey.
The test used a ManualResetEventSlim that was never signaled, relying on a 10-second timeout to elapse before proceeding. This masked a race where firstBatchTCS.SetResult() triggers an async continuation in the renderer, but TriggerRender() was called before that continuation completed. On loaded CI agents, TriggerRender's assertion that the render task completes synchronously would fail because the renderer's dispatcher was still busy processing the previous batch. Replace the broken event wait with polling on the renderer's observable state (_unacknowledgedRenderBatches.Count) to ensure the first batch is fully processed before triggering more renders. Fixes #61807 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d1a68f9 to
bf50163
Compare
The previous fix used a no-op dispatcher drain followed by TriggerRender() from the test thread. This had a race: InvokeAsync(Action) returns a task (t) that completes inside PostAsync's callback, but PostAsync's own task (stored as _taskQueue) completes slightly later. When the test's await resumed, _taskQueue could still be non-completed, causing the next TriggerRender's InvokeAsync to chain instead of running synchronously. Fix by running TriggerRender inside the dispatcher callback. When CheckAccess() is true, Dispatcher.InvokeAsync bypasses _taskQueue entirely and runs the action synchronously, returning Task.CompletedTask. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ilonatommy
left a comment
There was a problem hiding this comment.
The fix looks fine.
Also removes the quarantine attribute so the test runs in CI again.
The process is to keep it in quarantined. We should merge this PR and add test-fixed label to the original issue (that we keep open - remove fixes keyword from the PR description). During the CI duty these tests that have been passing for a month will get moved from the quarantine.
Per review feedback: the test should remain quarantined and the original issue should get the 'test-fixed' label. After passing in quarantine CI for a month, it will be moved out. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Fun fact, copilot decided to add "fixes" keyword to squash message anyway because it s created by concatenating all commit messages of the PR and the 1st commit contained it. |
Problem
ProcessBufferedRenderBatches_WritesRendersfails intermittently in CI (#61807), quarantined since Sep 2025.Root Cause
The test had a
ManualResetEventSlimthat was never signaled — it was created, reset, and waited on with a 10-second timeout, but nothing ever calledSet(). The comment acknowledged that "continuations of SetResult will not execute synchronously" but the synchronization mechanism was broken.After
firstBatchTCS.SetResult(), the renderer processes the batch completion asynchronously on its dispatcher. The test proceeded after the 10s timeout expired, then calledTriggerRender()which asserts the render task completes synchronously. On loaded CI agents, the renderer's dispatcher was still busy processing the previous batch's continuation, causing the assertion to fail.Fix
Move the post-
SetResult()test setup (SetDisconnected+TriggerRendercalls) inside arenderer.Dispatcher.InvokeAsynccallback. This ensures:SetResultcontinuations) drains before our callback executesTriggerRenderruns withCheckAccess() == true, bypassing the internal task queue and executing the render synchronouslyA single
await InvokeAsync(() => {})no-op was insufficient becauseInvokeAsync(Action)returns a task that completes insidePostAsync's callback, butPostAsync's own task (stored as_taskQueue) completes slightly later — creating a race where the nextTriggerRenderwould chain instead of running synchronously.Related to #61807