Skip to content

docs: expand CACHING.md (non-Redis L2, sizing/cost, SRE coordination)#7736

Merged
withinfocus merged 4 commits into
mainfrom
docs/extended-cache-non-redis-l2
May 28, 2026
Merged

docs: expand CACHING.md (non-Redis L2, sizing/cost, SRE coordination)#7736
withinfocus merged 4 commits into
mainfrom
docs/extended-cache-non-redis-l2

Conversation

@withinfocus
Copy link
Copy Markdown
Contributor

@withinfocus withinfocus commented May 28, 2026

🎟️ Tracking

Follow-up to #6682, expanded in review.

📔 Objective

#6682 added support for using any IDistributedCache (Cosmos DB, SQL Server, EF) as the L2 store under ExtendedCache, but the docs in src/Core/Utilities/CACHING.md only got a one-line setting rename. This PR closes that gap and, in the course of review, expands the doc to cover two adjacent areas that were also missing or hand-wavy: sizing/throughput/cost, and coordination with the infrastructure team.

Non-Redis L2 documentation (original scope)

  • Overview table and deployment-modes list reflect Cosmos / SQL / EF as valid L2 backends.
  • New Option 5 example wires ExtendedCache with a non-Redis L2, including the keyed-alias pattern for pairing with the "persistent" Cosmos cache.
  • "Backend Configuration" subsection rewritten as shared-mode vs keyed-mode resolution, with a prominent backplane caveat (cross-instance L1 invalidation is Redis-only).
  • "When NOT to Use" explains why ExtendedCache may not fit long-lived data (no backplane → cross-instance L1 staleness) instead of the prior blanket "days/weeks → persistent cache."
  • New worked example: Long-Lived Per-Tenant Computed Aggregates for the ExtendedCache + Cosmos pairing, exercising every FusionCache feature honestly.
  • Decision tree and Configuration Priority table updated to match.

Review feedback applied

  • L1/L2 vocabulary defined once in the ExtendedCache intro and dropped from the decision tree, which was using it before introduction.
  • Cosmos TTL precision: DurationAbsoluteExpirationRelativeToNow → per-document ttl; container DefaultTimeToLive must be enabled for it to apply.
  • RU profile / eager-refresh tradeoff called out for the ExtendedCache + Cosmos pairing.
  • Note that ExtendedCache prefixes every key with cacheName:, so aliased consumers cannot collide with raw "persistent" namespace.
  • Fixed Option 5 inline comment that incorrectly implied AddDistributedCache registers Cosmos under "persistent" in self-hosted (it aliases to the unnamed IDistributedCache there).
  • Footnoted the Configuration Priority table to point at the opt-in Cosmos pairing path.

Sizing, throughput, and cost (new)

A new section after the overview table acknowledges that the decision tree silently assumed the dataset fit in the backends it pointed to. Covers:

  • Working-set size: L1 bounded by process heap; per-backend L2 ceilings (Redis memory cap and shared-instance eviction, Cosmos RU economics, SQL/EF table growth).
  • Read rate: when L1 absorbs reads vs. becomes pure overhead at high cardinality with low per-key reuse.
  • Write rate: backplane saturation under sustained writes; eager-refresh's per-key background write cadence.
  • Cost shape per backend (fixed-ceiling vs. per-RU vs. shared-DB load).
  • A 3-question check and a workload-vs-tree-vs-reality table for when sizing overrides the tree.

Decision tree adds gates for working-set size and sustained-high write rate. ExtendedCache Cons and "When NOT to Use" expanded with the two main outs: working sets too large for L1, and sustained-high write rates that saturate the Redis backplane. SSO grants and Org/Provider Abilities "Why" lists now include the size/rate profile that makes ExtendedCache appropriate there.

Coordinating with Infrastructure (new)

A new section calling out that services.AddExtendedCache(...) does not stand up the runtime. Covers:

  • Loop in SRE / infrastructure early — they have visibility into budgets, utilization, and incident history the code does not.
  • Defaults are not production configuration. Uses CreateIfNotExists = false + container-level DefaultTimeToLive as the worked example.
  • The shared application Redis already has tenants — a noisy new cache can starve unrelated callers.
  • New backends need provisioning, dashboards, secrets, cost approval — not just code.
  • Self-hosted reality differs from cloud; "works in cloud" ≠ "works in every customer's environment."

No code changes — src/Core/Utilities/CACHING.md only. Net diff is ~+230 / -45.

📸 Screenshots

N/A — docs-only.

PR #6682 added support for any IDistributedCache (Cosmos, SQL, EF) as the
L2 store under ExtendedCache, but the docs continued to read as Redis-only.
Update CACHING.md to document the new modes, show wiring for pairing with
Cosmos, and explain the backplane / L1-staleness tradeoff.
@withinfocus withinfocus added the ai-review Request a Claude code review label May 28, 2026
@withinfocus withinfocus changed the title docs: follow up to #6682 — cover non-Redis L2 support in ExtendedCache docs: cover non-Redis L2 support in ExtendedCache May 28, 2026
@withinfocus withinfocus marked this pull request as ready for review May 28, 2026 13:28
@codecov
Copy link
Copy Markdown

codecov Bot commented May 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.58%. Comparing base (e10eb5e) to head (a280b54).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7736      +/-   ##
==========================================
- Coverage   65.01%   60.58%   -4.43%     
==========================================
  Files        2145     2147       +2     
  Lines       95028    95086      +58     
  Branches     8504     8507       +3     
==========================================
- Hits        61781    57607    -4174     
- Misses      31126    35453    +4327     
+ Partials     2121     2026      -95     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 28, 2026

🤖 Bitwarden Claude Code Review

Overall Assessment: APPROVE

Reviewed the docs-only follow-up to #6682 that fills in non-Redis L2 coverage for ExtendedCache in src/Core/Utilities/CACHING.md. Verified each technical claim against ExtendedCacheServiceCollectionExtensions.cs (shared-mode and keyed-mode IDistributedCache resolution, backplane-Redis-only constraint) and SharedWeb/Utilities/ServiceCollectionExtensions.AddDistributedCache (the "persistent" keyed service, the CreateIfNotExists = false Cosmos container, and the self-hosted SQL/EF fallback alias). The "30 minutes" expiration-scan figure matches EntityFrameworkCache._expiredItemsDeletionInterval. Internal anchors, the updated overview table, the four deployment modes, Option 5, the Cosmos pairing subsection, and the Configuration Priority footnote all check out.

Code Review Details

No findings. All seven prior review threads from the previous round are resolved with substantive follow-up commits, including the most recent addition of an explicit SRE / infrastructure coordination section.

Copy link
Copy Markdown
Contributor

@JaredSnider-Bitwarden JaredSnider-Bitwarden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding these doc changes! Claude and I found a few things below (included suggestions for fixes). Happy to discuss!

Comment thread src/Core/Utilities/CACHING.md Outdated
Comment thread src/Core/Utilities/CACHING.md Outdated
Comment thread src/Core/Utilities/CACHING.md Outdated
Comment thread src/Core/Utilities/CACHING.md
Comment thread src/Core/Utilities/CACHING.md Outdated
Comment thread src/Core/Utilities/CACHING.md
Comment thread src/Core/Utilities/CACHING.md
- Drop L1/L2 shorthand from decision tree; define vocabulary once in the
  ExtendedCache section.
- Fix Option 5 inline comment about the "persistent" keyed cache —
  Cosmos in cloud, aliased to the unnamed IDistributedCache in self-hosted.
- Note that ExtendedCache prefixes every key with cacheName: so aliased
  consumers cannot collide with the raw "persistent" namespace.
- Pairing-with-Cosmos: clarify that FusionCache writes per-document ttl
  (not container TTL) and call out the RU / eager-refresh tradeoff.
- Add a worked Specific Example for ExtendedCache + Cosmos
  (Long-Lived Per-Tenant Computed Aggregates).
- Footnote the Configuration Priority table to point at the opt-in
  Cosmos pairing path.
Volume, write rate, and cache size are first-order constraints that gate which
backend can hold a workload — but the prior decision tree silently assumed the
answer fit. Address that:

- New "Sizing, Throughput, and Cost" section after the overview table, covering
  working-set size (L1 bounded by heap, L2 ceilings per backend), read rate
  (when L1 absorbs vs. becomes overhead), write rate (backplane saturation,
  eager-refresh cost), and per-backend cost shape.
- Callout above the decision tree pointing at the new section.
- Decision-tree branches now include sizing/write-rate gates that can override
  the API recommendation.
- SSO grants and Org/Provider Abilities "Why" lists now include the size and
  rate profile that makes ExtendedCache appropriate there.
- ExtendedCache Cons and "When NOT to Use" expanded with the two main outs:
  working sets too large for L1, and sustained-high write rates that saturate
  the Redis backplane.
Picking a caching option in code is only half of standing up a new cache.
The runtime — Redis sizing, Cosmos provisioning, RU budgets, eviction
policies, monitoring — is owned by SRE and is not stood up by the
AddExtendedCache / AddDistributedCache registrations.

- New "Coordinating with Infrastructure" section after "Sizing, Throughput,
  and Cost", covering: loop SRE in early, defaults are not production
  configuration, shared infrastructure has unseen tenants, new backends
  need provisioning beyond code, and self-hosted environments differ from
  cloud.
- Decision-tree preamble now points at both Sizing and Coordinating
  with Infrastructure as prerequisites before using the tree.
@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@JaredSnider-Bitwarden JaredSnider-Bitwarden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are really helpful improvements. Thank you very much!

@withinfocus withinfocus changed the title docs: cover non-Redis L2 support in ExtendedCache docs: expand CACHING.md (non-Redis L2, sizing/cost, SRE coordination) May 28, 2026
@withinfocus withinfocus merged commit b72edb5 into main May 28, 2026
67 of 68 checks passed
@withinfocus withinfocus deleted the docs/extended-cache-non-redis-l2 branch May 28, 2026 20:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-review Request a Claude code review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants