Merged
Conversation
Ensures data consistency between local and distributed caches.
The local cache is now only updated when the distributed cache operations succeed.
If a distributed cache operation fails, the corresponding key is removed from the local cache to force a re-fetch, preventing stale data.
Specifically addresses scenarios where `Set`, `SetAll`, `Replace`, `ReplaceIfEqual`, `Increment`, `ListAdd`, `ListRemove`, `SetIfHigher`, `SetIfLower` may result in inconsistent state.
Also refactors the local cache initialization and removes the local cache expired event handler, relying solely on the message bus for invalidation.
Remove L1 Expiration Notification
Current Code (HybridCacheClient.cs lines 61-68)
private Task OnLocalCacheItemExpiredAsync(object sender, ItemExpiredEventArgs args)
{
if (!args.SendNotification)
return Task.CompletedTask;
_logger.LogTrace("Local cache expired event: key={Key}", args.Key);
return _messageBus.PublishAsync(new InvalidateCache { CacheId = _cacheId, Keys = [args.Key], Expired = true });
}
Why Remove It
L1 expiration is a local concern - When L1 expires, L2 has either already expired (same TTL) or still has the value (TTL skew). Either way, notifying other instances is unnecessary.
Industry standard - Microsoft HybridCache (.NET 9+) and EasyCaching do NOT publish L1 expiration events. They only publish on write operations.
Unnecessary traffic - Every L1 expiration generates a message to ALL instances, even though they will naturally expire around the same time.
Potential harm - If L1 expires before L2 (clock drift), this forces other instances to re-fetch even though L2 still has valid data.
Contributor
There was a problem hiding this comment.
Pull request overview
This pull request addresses critical cache consistency issues in the HybridCacheClient by ensuring the local cache (L1) only contains data that successfully exists in the distributed cache (L2). The changes implement a "write-to-L2-first, then-L1-on-success" pattern to prevent stale data.
Key Changes:
- Removed L1 expiration event handler that unnecessarily notified other instances when local cache entries expired
- Modified all write operations (
Set,Replace,Increment,ListAdd/Remove, etc.) to update L2 first, then L1 only on success - Added defensive local cache removal when L2 operations fail or partially succeed
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
randylsu
approved these changes
Jan 8, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Ensures data consistency between local and distributed caches.
The local cache is now only updated when the distributed cache operations succeed. If a distributed cache operation fails, the corresponding key is removed from the local cache to force a re-fetch, preventing stale data.
Specifically addresses scenarios where
Set,SetAll,Replace,ReplaceIfEqual,Increment,ListAdd,ListRemove,SetIfHigher,SetIfLowermay result in inconsistent state.Also refactors the local cache initialization and removes the local cache expired event handler, relying solely on the message bus for invalidation.
Remove L1 Expiration Notification
Why Remove It
Now I have a complete picture. Let me analyze why the tests didn't catch these issues:
Analysis: Why Tests Didn't Catch These Issues
Based on my review of the test coverage and the changes made, here's the analysis:
The Changes Made (Summary)
OnLocalCacheItemExpiredAsyncwas publishing invalidation messages when local cache items expiredexpiresInis null, now removes from L1 instead of setting with null expirationWhy Tests Didn't Catch These
1. These weren't "bugs" in the traditional sense - they were design decisions that could cause subtle consistency issues
The old code "worked" - it just had potential for stale data in edge cases:
2. The test infrastructure uses
InMemoryCacheClientfor both L1 and L2Looking at
HybridCacheClientTestBase.cs:With in-memory implementations:
SetAsync,ReplaceAsync, etc. always succeedSetAllAsyncalways sets all items3. The skipped test reveals the limitation
This test was skipped because using two in-memory caches sharing the same message bus creates invalidation loops - the exact scenario where the L1 expiration notification was problematic.
4. No tests for failure scenarios
The test suite doesn't have:
SetAsyncreturnsfalseSetAllAsyncpartially succeedsReplaceAsyncfails because key doesn't existIncrementAsyncwith null expiration on a key that has existing TTL5. The consistency issues are timing-dependent
The stale data scenarios require:
Bottom Line
You're correct - these issues weren't caught because:
The changes we made are defensive improvements to ensure L1 never has data that L2 doesn't have, and to eliminate unnecessary message bus traffic. They're not fixing test failures - they're preventing potential production issues that the test suite architecture couldn't detect.