(feat:) Cache monitoring & automatic bust root-cause analysis (Dashboard + Doctor)

# Feature: Cache monitoring & automatic bust root-cause analysis (Dashboard + Doctor)

**Type:** Feature request / Idea  
**Components:** `packages/dashboard`, `packages/plugin/scripts/context-dump`, `doctor` CLI, optional `opencode-interceptor` integration  

---

## Summary

Magic Context Dashboard already surfaces cache hit ratios and severity (`stable` / `warning` / `bust` / `full_bust`), but users cannot easily answer **why** a step busted, **what changed** in the prompt between steps, or whether the cause is **MC-side** (transform flush, system hash change, heuristics) vs **provider-side** (routing, implicit cache semantics, eviction). For openai-compatible providers (GLM, DeepSeek, etc.) this is especially painful because MC does not see wire payloads.

This issue proposes closing that gap with: (1) richer persisted cache diagnostics, (2) automatic root-cause classification with confidence, (3) Dashboard drill-down and export, and (4) a `doctor check-cache <session_id>` workflow built on existing `context-dump` + interceptor dumps.

---

## Problem

### User story

> On one provider where I use GLM 5.1 I get **streaks of busts** but also **streaks of working fine**. I need to figure out what's going on there.

Users want to:

1. **Inspect cache events** with enough detail to find bust causes (not just red bars).
2. **See per-message token deltas** — what grew in `input` vs what was served from `cache.read`.
3. **Correlate** MC transform decisions (`defer` / `execute`, `ctx-flush`, system prompt hash changes) with provider usage.
4. **Export** diagnostics for bug reports (provider routing, model comparison).
5. *(Related)* Edit **[User]** notes and some **[Hist]** fields (e.g. notes) in the Dashboard — separate but adjacent UX gap.

### Real-world example

Investigation of `ses_*` on OneRouter showed:

| Model | Unique API steps | Stable (≥90% hit) | Bust (<50%) | Median `cache.read` | Max `cache.read` |
|-------|------------------|-------------------|-------------|---------------------|------------------|
| DeepSeek v4 Pro | 110 | 84 | 20 | 139,136 | 168,064 |
| GLM 5.1 | 220 | 6 | 214 | 2,688 | 98,496 |

GLM **can** hit ~99% cache ratio when `input` is tiny (e.g. 131 tokens, 95,936 cached) but tool-loop steps with ~90k `input` and ~2.6k `cache.read` classify as **bust on every step** — indistinguishable in the Dashboard from a true MC-induced invalidation without deeper context.

Captured requests had **no `cache_control`**; OpenCode does not apply caching hints to `@ai-sdk/openai-compatible` today. Dashboard showed bust; logs showed `decision=defer` + small cache hits — **metric vs root cause** are different questions.

---

## Current state (what already exists)

### Dashboard — Cache tab & Cache Diagnostics page

- **`DbCacheEvent`** (`packages/dashboard/src-tauri/src/db.rs`): `input_tokens`, `cache_read`, `cache_write`, `hit_ratio`, `severity`, `cause`, `turn_id`, `agent`.
- **`log_parser.rs`**: severity from `hit_ratio = cache_read / (input + cache_read + cache_write)`; `detect_bust_cause()` scans ±10 log lines for strings like `system prompt hash`, `Execute pass`, `Historian output`, `decision=defer`.
- **`CacheDiagnostics.tsx`**: session timeline, turn grouping, bust counts.
- **Limitation:** causes are **log-string heuristics** with a narrow window; no message-level diff, no wire dump linkage, no provider/model dimension on events.

### Plugin — `context-dump` script

- **`metrics.ts`**: `perMessageCache` with `cache_bust` when `prevCacheRead > 0 && cacheRead < prevCacheRead`.
- **`run-context-dump.ts` → `classifyBust()`**: richer MC-aware labels (`historian injection`, `system/variant flush`, `nudge anchor / same turn`, etc.) using OpenCode DB + `context.db` metadata.
- **Output:** JSON dump with original/transformed messages, stats, last 10 classified busts.
- **Limitation:** CLI-only; not integrated into Dashboard or Doctor; no interceptor request correlation.

### `opencode-interceptor`

- Writes `{seq}-{provider}-{timestamp}.{request,response,meta}.json` per API call (`src/intercept/dump.ts`).
- Includes model, messages, headers (e.g. `x-session-affinity`), `stream_options.include_usage`.
- **Limitation:** responses often empty for SSE; MC does not consume these files today.

---

## Gaps

| Gap | Impact |
|-----|--------|
| No unified **step ID** linking log event ↔ OpenCode message ↔ interceptor dump | Cannot drill from red bar to request body |
| **`classifyBust()`** (context-dump) not used by Dashboard | Dashboard causes are weaker than script |
| **Low hit ratio** on openai-compatible routes treated same as true invalidation | False alarm fatigue (GLM tool loops) |
| No **delta view**: Δinput, Δcache_read, Δtransformed_chars between steps | Cannot see what grew |
| No **streak detection** (N consecutive busts, model-specific baselines) | User's "streaks" pattern invisible |
| No **export** (JSON/CSV) from Cache tab | Hard to share with provider/MC issues |
| MC transform diagnostics not **persisted** per API step | Must grep `magic-context.log` manually |
| Non-Anthropic wire capture depends on **optional** interceptor | Doctor cannot mandate prerequisites |

---

## Proposed solution (phased)

### Phase 1 — Doctor: `check-cache <session_id>` (ship existing script)

Expose `runContextDump()` as a first-class doctor subcommand:

```bash
bunx @cortexkit/opencode-magic-context doctor check-cache ses_xxx
# Optional: --interceptor-dir ~/.local/share/.../opencode-interceptor/ses_xxx
# Optional: --json-out ./cache-report.json
```

**Behavior:**

- Run current context-dump pipeline (OpenCode DB + context.db + transform replay).
- Print summary table: bust count, last N busts with `classification` + `detail` (already in `scripts/context-dump.ts`).
- **Preflight:** if provider is not Anthropic, check for interceptor dump dir; warn if missing with setup instructions.
- Exit non-zero if bust rate exceeds threshold (optional `--fail-on-bust-rate 0.5`).

**Acceptance:**

- Parity with `bun run scripts/context-dump.ts <session_id>`.
- Documented in doctor help and README.

---

### Phase 2 — Persist cache diagnostics in `context.db`

New table (name TBD, e.g. `cache_step_diagnostics`):

| Column | Description |
|--------|-------------|
| `session_id`, `message_id`, `timestamp` | Join keys |
| `provider`, `model` | From `message.updated` |
| `input_tokens`, `cache_read`, `cache_write`, `hit_ratio` | Usage snapshot |
| `severity` | Same rules as dashboard |
| `mc_transform_decision` | `defer` \| `execute` |
| `mc_bust_signals` | JSON: `system_prompt_hash_changed`, `rematerialized`, `ctx_flush`, `heuristics_ran`, … |
| `classification`, `classification_detail` | From `classifyBust()` logic |
| `classification_confidence` | `high` \| `medium` \| `low` |
| `interceptor_dump_basename` | Optional link to capture file |
| `delta_input`, `delta_cache_read` | vs previous assistant step |

Populate on `message.updated` when `hasUsageTokens=true` (plugin hook), reusing classify logic from context-dump.

**Acceptance:**

- Dashboard reads from DB first; log parser remains fallback for historical sessions.
- Schema migrated via existing plugin migration path.

---

### Phase 3 — Dashboard: bust drill-down & export

#### 3a. Cache event detail panel

When user expands a cache step/turn:

- Usage breakdown bar: `input` vs `cache.read` vs `cache.write`
- **Classification** + confidence + linked MC log excerpts (timestamp-matched)
- **Delta row** vs previous step: `Δinput`, `Δcache_read`, hit ratio change
- **Transform summary:** defer/execute, rematerialized m[0]/m[1], tag message count delta
- **Link:** "Open interceptor dump" (if basename present and file exists)

#### 3b. Streak & model-aware views

- Detect **consecutive busts** ≥ N; highlight in timeline
- Optional filter: "show only provider/model X"
- **Baseline hint** for openai-compatible: if `input > 50k` and `hit_ratio < 10%`, label as `provider_low_cache_yield` (informational) vs `mc_likely_bust` when execute+hash change in same window

#### 3c. Export

- **Export session cache report** (JSON): all `DbCacheEvent` + diagnostics + bust classifications
- **Export bust-only CSV** for spreadsheets
- Button on Cache tab + Cache Diagnostics page

**Acceptance:**

- User can answer "what changed on this bust?" without leaving Dashboard for typical MC causes.
- Export opens in GitHub issue template for provider bugs.

---

### Phase 4 — Wire-level root cause (interceptor integration)

For non-Anthropic providers, correlate interceptor dumps with cache steps:

1. Match by timestamp ± duration (`meta.json` `durationMs`) and session affinity header.
2. Compute **request-level** signals:
   - Message count, total request bytes
   - System prompt hash (first 2 system blocks)
   - Presence of `cache_control` / `providerOptions`
   - Model string
3. **Diff vs previous dump:** which message indices changed (length/hash), new tools, reasoning blocks added
4. Surface in Dashboard: "Request grew by +12,400 tokens in messages [45–47] (tool results)"

**Dependencies:**

- `opencode-interceptor` enabled and capturing session
- Future: optional MC auth-plugin hook for OpenAI websocket providers (out of scope for v1)

**Acceptance:**

- `doctor check-cache` accepts `--interceptor-dir` and includes wire diff in output.
- Dashboard shows "wire diff unavailable" with setup link when dumps missing.

---

### Phase 5 — Automatic root-cause engine (unify classifiers)

Merge and extend:

- `log_parser.rs` `detect_bust_cause()`
- `run-context-dump.ts` `classifyBust()`
- New rules from production cases:

| Cause ID | Detection signal |
|----------|------------------|
| `mc_system_prompt_hash` | log: `system prompt hash changed` within Δt |
| `mc_explicit_flush` | `ctx-flush` / `explicit_flush` / `rematerialized=true` |
| `mc_execute_heuristics` | `decision=execute` + `heuristics WILL RUN` |
| `mc_historian` | historian write within 5m window (existing) |
| `mc_storage_fatal` | `storage fatal` (schema skew — annotate, not blame provider) |
| `provider_cold_start` | first event or `cache_read=0` after long gap |
| `provider_low_yield` | high `input`, low `cache.read`, defer pass, no MC mutation |
| `provider_routing_change` | interceptor model/provider header changed (if exposed) |
| `client_no_cache_hints` | openai-compatible + no `cache_control` in dump |
| `unknown` | fallback |

Output: `{ cause_id, summary, evidence: [{source, snippet, timestamp}], confidence }`.

**Acceptance:**

- Same engine in doctor, plugin persistence, and dashboard.
- Unit tests with fixtures from anonymized real sessions (GLM streak, DeepSeek stable chain).

---

## Related: editable User / Historian notes (separate scope)

Discord also asked to edit **[User]** notes and **[Hist]** note fields in Dashboard. Recommend **separate issue** to avoid scope creep, but shared UX pattern: inline edit → RPC → `context.db` update with audit timestamp.

---

## Non-goals (v1)

- MC becoming a MITM proxy for all providers (maintainer constraint)
- Fixing OpenCode/provider caching behavior (document as `client_no_cache_hints` finding only)
- Real-time websocket capture without auth plugin

---

## Acceptance criteria (overall)

- [ ] `doctor check-cache <session_id>` ships with bust table + JSON export
- [ ] Dashboard cache step shows classification, deltas, and MC evidence
- [ ] Session cache report exportable from UI
- [ ] Interceptor dumps linked when present; clear setup path when absent
- [ ] GLM-style "chronic low hit ratio" distinguishable from MC flush busts
- [ ] Docs: "Diagnosing cache busts" covering Anthropic dumps, interceptor, limitations

---

## Implementation notes

- Reuse `buildDumpStats()` / `classifyBust()` — avoid duplicating logic in Rust; consider shared WASM or JSON schema generated from TS.
- Dashboard already lazy-loads cache events (`cacheActivated`); extend `get_session_cache_events` to join diagnostics table.
- Respect `PER_SESSION` / `TOTAL_CAP` limits in `CacheDiagnostics.tsx` when adding fields.

---

## References

- `packages/plugin/scripts/context-dump/` — metrics, classifyBust, runContextDump
- `packages/dashboard/src-tauri/src/log_parser.rs` — severity + detect_bust_cause
- `packages/dashboard/src/components/CacheDiagnostics/CacheDiagnostics.tsx`
- `opencode-interceptor` — `src/intercept/dump.ts`, per-session capture dirs
- Community reproduction: GLM 5.1 / OneRouter, `hit_ratio` ~2–3% on tool-loop steps vs ~99% on tiny-input continuations


Column	Description
`session_id`, `message_id`, `timestamp`	Join keys
`provider`, `model`	From `message.updated`
`input_tokens`, `cache_read`, `cache_write`, `hit_ratio`	Usage snapshot
`severity`	Same rules as dashboard
`mc_transform_decision`	`defer` \| `execute`
`mc_bust_signals`	JSON: `system_prompt_hash_changed`, `rematerialized`, `ctx_flush`, `heuristics_ran`, …
`classification`, `classification_detail`	From `classifyBust()` logic
`classification_confidence`	`high` \| `medium` \| `low`
`interceptor_dump_basename`	Optional link to capture file
`delta_input`, `delta_cache_read`	vs previous assistant step

Cause ID	Detection signal
`mc_system_prompt_hash`	log: `system prompt hash changed` within Δt
`mc_explicit_flush`	`ctx-flush` / `explicit_flush` / `rematerialized=true`
`mc_execute_heuristics`	`decision=execute` + `heuristics WILL RUN`
`mc_historian`	historian write within 5m window (existing)
`mc_storage_fatal`	`storage fatal` (schema skew — annotate, not blame provider)
`provider_cold_start`	first event or `cache_read=0` after long gap
`provider_low_yield`	high `input`, low `cache.read`, defer pass, no MC mutation
`provider_routing_change`	interceptor model/provider header changed (if exposed)
`client_no_cache_hints`	openai-compatible + no `cache_control` in dump
`unknown`	fallback

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(feat:) Cache monitoring & automatic bust root-cause analysis (Dashboard + Doctor) #130

Feature: Cache monitoring & automatic bust root-cause analysis (Dashboard + Doctor)

Summary

Problem

User story

Real-world example

Current state (what already exists)

Dashboard — Cache tab & Cache Diagnostics page

Plugin — `context-dump` script

`opencode-interceptor`

Gaps

Proposed solution (phased)

Phase 1 — Doctor: `check-cache <session_id>` (ship existing script)

Phase 2 — Persist cache diagnostics in `context.db`

Phase 3 — Dashboard: bust drill-down & export

3a. Cache event detail panel

3b. Streak & model-aware views

3c. Export

Phase 4 — Wire-level root cause (interceptor integration)

Phase 5 — Automatic root-cause engine (unify classifiers)

Related: editable User / Historian notes (separate scope)

Non-goals (v1)

Acceptance criteria (overall)

Implementation notes

References

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Model	Unique API steps	Stable (≥90% hit)	Bust (<50%)	Median `cache.read`	Max `cache.read`
DeepSeek v4 Pro	110	84	20	139,136	168,064
GLM 5.1	220	6	214	2,688	98,496

Gap	Impact
No unified step ID linking log event ↔ OpenCode message ↔ interceptor dump	Cannot drill from red bar to request body
`classifyBust()` (context-dump) not used by Dashboard	Dashboard causes are weaker than script
Low hit ratio on openai-compatible routes treated same as true invalidation	False alarm fatigue (GLM tool loops)
No delta view: Δinput, Δcache_read, Δtransformed_chars between steps	Cannot see what grew
No streak detection (N consecutive busts, model-specific baselines)	User's "streaks" pattern invisible
No export (JSON/CSV) from Cache tab	Hard to share with provider/MC issues
MC transform diagnostics not persisted per API step	Must grep `magic-context.log` manually
Non-Anthropic wire capture depends on optional interceptor	Doctor cannot mandate prerequisites

(feat:) Cache monitoring & automatic bust root-cause analysis (Dashboard + Doctor) #130

Description

Feature: Cache monitoring & automatic bust root-cause analysis (Dashboard + Doctor)

Summary

Problem

User story

Real-world example

Current state (what already exists)

Dashboard — Cache tab & Cache Diagnostics page

Plugin — context-dump script

opencode-interceptor

Gaps

Proposed solution (phased)

Phase 1 — Doctor: check-cache <session_id> (ship existing script)

Phase 2 — Persist cache diagnostics in context.db

Phase 3 — Dashboard: bust drill-down & export

3a. Cache event detail panel

3b. Streak & model-aware views

3c. Export

Phase 4 — Wire-level root cause (interceptor integration)

Phase 5 — Automatic root-cause engine (unify classifiers)

Related: editable User / Historian notes (separate scope)

Non-goals (v1)

Acceptance criteria (overall)

Implementation notes

References

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

Plugin — `context-dump` script

`opencode-interceptor`

Phase 1 — Doctor: `check-cache <session_id>` (ship existing script)

Phase 2 — Persist cache diagnostics in `context.db`