Fix flaky NRE in ConcurrentLfuSoakTests scheduler teardown #758
Closed
…eduler type before cast
Agent-Logs-Url: https://github.com/bitfaster/BitFaster.Caching/sessions/d56560c8-1ab4-47a6-8acc-3d850c94848b
Co-authored-by: bitfaster <12851828+bitfaster@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix flaky CI failure in BitFaster.Caching" to "Fix flaky NRE in ConcurrentLfuSoakTests scheduler teardown" on Apr 13, 2026
`RunIntegrityCheckAsync` used `as BackgroundThreadScheduler` and then immediately dereferenced the result, causing a spurious `NullReferenceException` if the cast returned null. This masked the real failure and made the soak test flaky.

Changes
- `ConcurrentLfuSoakTests.cs`: in both `RunIntegrityCheckAsync` overloads (`ConcurrentLfu<int,string>` and `ConcurrentLfu<string,string>`), replace the silent `as`-cast-and-dereference with a FluentAssertions type assertion followed by a direct cast.

If the scheduler type ever changes unexpectedly, the test now produces a clear assertion failure rather than an unattributed `NullReferenceException`.

Original prompt
Create a fix for a flaky CI failure in BitFaster.Caching.
Context:
- `System.NullReferenceException` in `BitFaster.Caching.UnitTests.Lfu.ConcurrentLfuSoakTests.WhenConcurrentGetCacheEndsInConsistentState(iteration: 5)`.
- `BitFaster.Caching.UnitTests/Lfu/ConcurrentLfuSoakTests.cs` at the call site `await RunIntegrityCheckAsync(lfu, iteration);` (around line 164).
- `RunIntegrityCheckAsync` doing `var scheduler = lfu.Scheduler as BackgroundThreadScheduler; scheduler.Dispose(); await scheduler.Completion;`, which can dereference null if the scheduler is not a `BackgroundThreadScheduler` at runtime (or if it is unexpectedly null), leading to a flaky NRE.

Goal:
- … `NullReferenceException` during scheduler teardown.

Implementation guidance:
- In `BitFaster.Caching.UnitTests/Lfu/ConcurrentLfuSoakTests.cs`, update BOTH overloads of `RunIntegrityCheckAsync` (the `ConcurrentLfu<int,string>` and `ConcurrentLfu<string,string>` versions).
- Replace `as BackgroundThreadScheduler` + dereference with either:
  - … `lfu.Scheduler` is of type `BackgroundThreadScheduler`, then cast and dispose/await completion; OR
  - …
  Prefer the assertion approach to catch unexpected scheduler changes.
- … `RunIntegrityCheck(lfu, output)` after stopping the scheduler.

Reference:
Please open a PR with the fix and include a concise description in the PR body.
The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Please find a solution for this failing job. Use the logs, job definition, and any referenced files where the failure occurred. Keep your response focused on the solution and include code suggestions when appropriate.
Assistant: The job is failing due to a flaky `NullReferenceException` in the soak test `ConcurrentLfuSoakTests.WhenConcurrentGetCacheEndsInConsistentState(iteration: 5)` on macOS (`dotnet test -f net6.0`). The stack points at the integrity-check call site in `ConcurrentLfuSoakTests.cs:164` (ref `49e105fd56fd3b880ba7024c3c2f8ce4948591bd`), and the most likely null is the background scheduler cast in `RunIntegrityCheckAsync`, where the result of `lfu.Scheduler as BackgroundThreadScheduler` is dereferenced without a null check.

Even though the test constructs the cache with a `BackgroundThreadScheduler`, under certain race/teardown conditions (or if the scheduler gets swapped/disposed unexpectedly), that cast can return `null`, turning the test failure into a spurious `NullReferenceException` instead of a meaningful assertion failure.

Fix: make the test robust and fail with a clear message (or handle non-background scheduler)
Update both overloads of `RunIntegrityCheckAsync` to (1) avoid `as` + dereference, and (2) produce a clear failure if the scheduler isn’t the expected type.

Suggested patch: apply the same change to both the `int` and `string` overloads. If you prefer not to assert the type, then at least guard and still run the integrity check.
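
Both variants can be sketched roughly as follows. This is a minimal, hedged sketch and not the merged patch: the method signature is inferred from the call site `await RunIntegrityCheckAsync(lfu, iteration)`, the `using` namespaces for `ConcurrentLfu` and `BackgroundThreadScheduler` are assumptions, `output` and `RunIntegrityCheck` are assumed to be existing members of the test class, and `RunIntegrityCheckGuardedAsync` is a hypothetical name used only to show the guard variant alongside the assertion variant.

```csharp
using System.Threading.Tasks;
using BitFaster.Caching.Lfu;        // ConcurrentLfu (namespace assumed)
using BitFaster.Caching.Scheduler;  // BackgroundThreadScheduler (namespace assumed)
using FluentAssertions;

// Members intended to live inside the existing ConcurrentLfuSoakTests class.
// `output` and RunIntegrityCheck(...) are assumed to be defined there already.

// Variant 1 (assertion): fail with a readable message if the scheduler is null
// or not a BackgroundThreadScheduler, instead of throwing an NRE.
private async Task RunIntegrityCheckAsync(ConcurrentLfu<int, string> lfu, int iteration)
{
    lfu.Scheduler.Should().BeOfType<BackgroundThreadScheduler>(
        "iteration {0} must stop the background scheduler before the integrity check",
        iteration);

    // Safe after the assertion above: a direct cast, not `as`.
    var scheduler = (BackgroundThreadScheduler)lfu.Scheduler;
    scheduler.Dispose();
    await scheduler.Completion;

    RunIntegrityCheck(lfu, output);
}

// Variant 2 (guard, hypothetical name): stop the scheduler only when it is the
// expected type, and run the integrity check either way.
private async Task RunIntegrityCheckGuardedAsync(ConcurrentLfu<int, string> lfu, int iteration)
{
    if (lfu.Scheduler is BackgroundThreadScheduler scheduler)
    {
        scheduler.Dispose();
        await scheduler.Completion;
    }

    RunIntegrityCheck(lfu, output);
}
```

The assertion variant surfaces an unexpected scheduler type as a test failure, while the guard variant silently tolerates it. That difference is why the prompt prefers the assertion. The `ConcurrentLfu<string, string>` overload would take the same change.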
Why this solves the CI failure
- … `NullReferenceException` caused by dereferencing a potentially-null `scheduler` variable.
- … `BackgroundThreadScheduler`, the test will now fail with a diagnostic assertion explaining what went wrong, instead of an opaque NRE.

Optional CI hardening (if you still see rare integrity flakiness)
These are “soak” tests (`[Collection("Soak")]`, r...

This pull request was created from Copilot chat.