Wire remaining burn hotspots CLI modes (#376)#387
Conversation
Closes the 2.x parity gap on `burn hotspots`: - `--session <id>` now flows the existing `HotspotsOptions.session` through the SDK instead of erroring out. - `--workflow <id>` and `--provider <csv>` are added to `HotspotsOptions`, threaded into `build_query` enrichment + a post-query provider filter matching the shape `compare()` already uses, and exposed through the napi facade and `@relayburn/sdk` types. - `--patterns [csv]` and `--findings` dispatch to the SDK's existing `run_hotspots_findings` path, with a CLI-side detector validator and a human renderer (per-detector grouping by default; `--findings` swaps in the unified severity-ranked table). - `--group-by` and `--patterns/--findings` are explicitly mutually exclusive. - The per-session aggregate view (`--session` with no id) and `--explain-drift` stay stubs but now exit 2 with directed messaging pointing at the supported flag forms. Tests: 3 new SDK integration tests for `--session`, `--workflow`, `--provider` filter behaviors; 4 new CLI smoke tests covering the stub exits and the new validation paths. Existing golden snapshots remain unchanged.
|
ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (4)
✅ Files skipped from review due to trivial changes (1)
📝 WalkthroughWalkthroughAdds workflow and provider filters to HotspotsOptions across SDK and Node facade, wires CLI flags (--workflow, --provider, --patterns, --findings, --session ), enforces CLI gating (stubbed --session aggregate and --explain-drift exit code 2), implements pattern validation, renders findings (unified or grouped), and adds regression/unit tests and docs. ChangesHotspots CLI Modes Expansion
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5a4667b227
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "retry-loop", | ||
| "failure-run", | ||
| "cancellation-run", | ||
| "compaction", |
There was a problem hiding this comment.
Use SDK finding-kind identifiers for pattern selection
PATTERN_KINDS is validating and forwarding detector IDs like compaction / opencode-system-prompt, but run_hotspots_findings() in the SDK filters pattern-derived findings by f.kind (e.g. compaction-loss, skill-recall-dup, skill-pruning-protection, system-prompt-tax). As a result, burn hotspots --patterns (including the empty "all detectors" form) silently drops those finding families, and users cannot pass the matching kind names because this validator rejects them.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
62-73:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd
--explain-driftto the hotspots options table.It’s currently only mentioned in the note below the examples, so the README still under-documents the actual CLI surface users see in
--help. A one-line table entry keeps the stubbed flag discoverable and aligned with the PR objective to document the implemented surface.Also applies to: 85-86
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 62 - 73, Add a one-line table entry for the new CLI flag `--explain-drift` to the hotspots options table (the same table that lists `--since`, `--project`, `--patterns`, etc.) so the flag appears in `--help`; include a brief description like "Emit explanations for detected drift in hotspots" and also add the same one-line entry to the second hotspots/options table referenced later in the README (the table noted below the examples) so both locations are consistent.
🧹 Nitpick comments (2)
packages/sdk-node/CHANGELOG.md (1)
17-19: ⚡ Quick winDrop the PR reference from the Unreleased entry.
This entry is already clear without
(#376), and removing the issue reference keeps the package changelog aligned with the repo’s Unreleased-format rule.As per coding guidelines,
[Unreleased]changelog entries should be “concise and impact-first” and should “Drop issue/PR links, internal review notes, implementation backstory”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/sdk-node/CHANGELOG.md` around lines 17 - 19, Remove the PR reference "(`#376`)" from the Unreleased changelog entry that documents the hotspots() options; specifically edit the line mentioning "hotspots() options now accept `workflow` ... and `provider` ... — same shape the `compare()` options expose." to drop the trailing PR/issue link so the entry is concise and follows the Unreleased-format rule.CHANGELOG.md (1)
9-15: ⚡ Quick winTrim the Unreleased note to the user-visible effect.
The
(#376)suffix and thevia napiimplementation detail make this entry less curated than the repo changelog rule expects. Please keep it impact-first and drop issue/PR references/internal wiring details.As per coding guidelines,
[Unreleased]changelog entries should be “concise and impact-first” and should “Drop issue/PR links, internal review notes, implementation backstory”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CHANGELOG.md` around lines 9 - 15, Update the Unreleased changelog entry for relayburn-cli / relayburn-sdk to be concise and impact-first: remove the internal details "(via napi)" and the " (`#376`)" PR/issue reference, and drop implementation/backstory like "wires ... over the existing SDK surface"; keep only the user-visible effects such as that `burn hotspots` now accepts `--session <id>`, `--workflow <id>`, `--provider <csv>`, `--patterns [csv]`, and `--findings`, and that `HotspotsOptions` gains `workflow` and `provider` filters while noting that the per-session aggregate view and `--explain-drift` remain explicit stubs that exit 2.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/relayburn-cli/src/commands/hotspots.rs`:
- Around line 585-660: Sort the incoming findings by severity before rendering
so the views are severity-ranked and stable: in both emit_findings_unified and
emit_findings_grouped, create a sorted collection (e.g. collect findings.iter()
into a Vec<&WasteFinding>), sort it by severity (highest first) and a stable
tiebreaker (e.g. session_id or title), then iterate over that sorted collection
instead of the original findings slice; for grouped output, build groups from
the sorted collection so items within each kind preserve the severity order when
capping by limit.
In `@crates/relayburn-sdk/src/query_verbs.rs`:
- Around line 1069-1085: The side-data query builders (used by
run_hotspots_attribution and run_hotspots_findings) are still using the original
q (built by build_query) and only session/since, so they can pull unrelated
turns when a workflowId enrichment or provider slice was applied to turns (see
q.enrichment handling, collect_turns and normalize_provider_filter). Update the
code so the same slice is applied to the auxiliary queries: either propagate the
modified q (with enrichment including "workflowId") and/or a provider constraint
into the side-data query construction, or pass the already-filtered turns list
into run_hotspots_attribution/run_hotspots_findings so their
user-turn/tool-result-event queries are restricted to the same workflowId and
provider set as used for turns. Ensure the symbols build_query, q.enrichment,
collect_turns, normalize_provider_filter, run_hotspots_attribution and
run_hotspots_findings are updated accordingly.
---
Outside diff comments:
In `@README.md`:
- Around line 62-73: Add a one-line table entry for the new CLI flag
`--explain-drift` to the hotspots options table (the same table that lists
`--since`, `--project`, `--patterns`, etc.) so the flag appears in `--help`;
include a brief description like "Emit explanations for detected drift in
hotspots" and also add the same one-line entry to the second hotspots/options
table referenced later in the README (the table noted below the examples) so
both locations are consistent.
---
Nitpick comments:
In `@CHANGELOG.md`:
- Around line 9-15: Update the Unreleased changelog entry for relayburn-cli /
relayburn-sdk to be concise and impact-first: remove the internal details "(via
napi)" and the " (`#376`)" PR/issue reference, and drop implementation/backstory
like "wires ... over the existing SDK surface"; keep only the user-visible
effects such as that `burn hotspots` now accepts `--session <id>`, `--workflow
<id>`, `--provider <csv>`, `--patterns [csv]`, and `--findings`, and that
`HotspotsOptions` gains `workflow` and `provider` filters while noting that the
per-session aggregate view and `--explain-drift` remain explicit stubs that exit
2.
In `@packages/sdk-node/CHANGELOG.md`:
- Around line 17-19: Remove the PR reference "(`#376`)" from the Unreleased
changelog entry that documents the hotspots() options; specifically edit the
line mentioning "hotspots() options now accept `workflow` ... and `provider` ...
— same shape the `compare()` options expose." to drop the trailing PR/issue link
so the entry is concise and follows the Unreleased-format rule.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 86013be6-096a-4b2e-a8ad-e6c601d2da11
📒 Files selected for processing (8)
CHANGELOG.mdREADME.mdcrates/relayburn-cli/src/commands/hotspots.rscrates/relayburn-cli/tests/smoke.rscrates/relayburn-sdk-node/src/lib.rscrates/relayburn-sdk/src/query_verbs.rspackages/sdk-node/CHANGELOG.mdpackages/sdk-node/src/index.d.ts
| fn emit_findings_unified(findings: &[WasteFinding]) { | ||
| let mut out: Vec<String> = Vec::new(); | ||
| out.push(String::new()); | ||
| out.push(format!("findings: {}", format_uint(findings.len() as u64))); | ||
| out.push(String::new()); | ||
| if findings.is_empty() { | ||
| out.push(" (no hotspot findings)".to_string()); | ||
| out.push(String::new()); | ||
| print!("{}", out.join("\n")); | ||
| return; | ||
| } | ||
| let mut rows: Vec<Vec<String>> = vec![vec![ | ||
| "severity".into(), | ||
| "kind".into(), | ||
| "session".into(), | ||
| "usd".into(), | ||
| "title".into(), | ||
| ]]; | ||
| for f in findings { | ||
| let usd = f | ||
| .estimated_savings | ||
| .usd_per_session | ||
| .map(format_usd) | ||
| .unwrap_or_else(|| "—".to_string()); | ||
| rows.push(vec![ | ||
| severity_label(f.severity).to_string(), | ||
| f.kind.clone(), | ||
| f.session_id.chars().take(8).collect(), | ||
| usd, | ||
| truncate(&f.title, 80), | ||
| ]); | ||
| } | ||
| out.push(render_table(&rows)); | ||
| out.push(String::new()); | ||
| print!("{}", out.join("\n")); | ||
| } | ||
|
|
||
| fn emit_findings_grouped(findings: &[WasteFinding], limit: usize) { | ||
| let mut out: Vec<String> = Vec::new(); | ||
| out.push(String::new()); | ||
| out.push(format!("findings: {}", format_uint(findings.len() as u64))); | ||
| out.push(String::new()); | ||
| if findings.is_empty() { | ||
| out.push(" (no hotspot findings)".to_string()); | ||
| out.push(String::new()); | ||
| print!("{}", out.join("\n")); | ||
| return; | ||
| } | ||
| // Group by detector kind, preserving severity-sorted order of the | ||
| // sdk-emitted slice. Within each group we cap at `limit`. | ||
| use std::collections::BTreeMap; | ||
| let mut groups: BTreeMap<&str, Vec<&WasteFinding>> = BTreeMap::new(); | ||
| for f in findings { | ||
| groups.entry(f.kind.as_str()).or_default().push(f); | ||
| } | ||
| for (kind, items) in &groups { | ||
| out.push(format!("{} ({})", kind, format_uint(items.len() as u64))); | ||
| let mut rows: Vec<Vec<String>> = | ||
| vec![vec!["severity".into(), "session".into(), "usd".into(), "title".into()]]; | ||
| for f in items.iter().take(limit) { | ||
| let usd = f | ||
| .estimated_savings | ||
| .usd_per_session | ||
| .map(format_usd) | ||
| .unwrap_or_else(|| "—".to_string()); | ||
| rows.push(vec![ | ||
| severity_label(f.severity).to_string(), | ||
| f.session_id.chars().take(8).collect(), | ||
| usd, | ||
| truncate(&f.title, 70), | ||
| ]); | ||
| } | ||
| out.push(render_table(&rows)); | ||
| out.push(String::new()); | ||
| } | ||
| print!("{}", out.join("\n")); |
There was a problem hiding this comment.
Sort findings before rendering the “severity-ranked” views.
Both renderers consume the slice as-is. The SDK currently appends detector batches without a global severity sort, so a later high finding can appear below earlier warn/info rows. Sorting once here by severity, with a stable tiebreaker, would make --findings match the documented ranking and keep grouped output consistent too.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/relayburn-cli/src/commands/hotspots.rs` around lines 585 - 660, Sort
the incoming findings by severity before rendering so the views are
severity-ranked and stable: in both emit_findings_unified and
emit_findings_grouped, create a sorted collection (e.g. collect findings.iter()
into a Vec<&WasteFinding>), sort it by severity (highest first) and a stable
tiebreaker (e.g. session_id or title), then iterate over that sorted collection
instead of the original findings slice; for grouped output, build groups from
the sorted collection so items within each kind preserve the severity order when
capping by limit.
- Use SDK finding-kind names in hotspots `--patterns` validator (`compaction-loss`, `skill-recall-dup`, `skill-pruning-protection`, `system-prompt-tax`) so the empty "all detectors" form actually forwards every family the SDK can emit. (Codex P1) - Sort findings once at the SDK boundary in `run_hotspots_findings` so the unified `--findings` view stays severity-descending after appending tool-output-bloat / ghost-surface / tool-call-pattern batches. (CodeRabbit) - Propagate `q.enrichment` into the side-data query in `run_hotspots_attribution` and `run_hotspots_findings` so a partial- session `workflowId` stamp doesn't pull unrelated user-turns / tool-result events into the per-session buckets. (CodeRabbit) - Trim Unreleased changelog entries to be impact-first per project guidelines.
Closes #376.
Summary
Wires the remaining
burn hotspotsflags from the 1.x surface over the existing SDK plumbing (and lights up the few SDK fields that were missing).--session <id>now flows the existingHotspotsOptions.sessionthrough the SDK instead of erroring out at the CLI guard.--workflow <id>and--provider <csv>are added toHotspotsOptions, threaded intobuild_queryenrichment + a post-query provider filter matching the shapecompare()already uses, and surfaced through the napi facade and@relayburn/sdktypes.--patterns [csv]and--findingsdispatch to the SDK's existingrun_hotspots_findingspath. A CLI-side detector validator rejects unknown kinds, and a human renderer renders per-detector groupings by default —--findingsswaps in the unified severity-ranked table.--group-byand--patterns/--findingsare explicitly mutually exclusive (group-by selects an attribution rollup; patterns/findings drive the detector view).--sessionwith no id) and--explain-driftremain stubs but now exit 2 with directed messaging — the relationship-drift / chronology query verbs aren't yet exposed by the SDK.Test plan
cargo build --workspacecargo test --workspace(3 new SDK integration tests for--session,--workflow,--provider; 4 new CLI smoke tests)BURN_GOLDEN=1 cargo test --test golden -- --include-ignored(existing snapshots remain byte-identical)burn hotspots --patterns retry-loop --jsonandburn hotspots --findingsagainst a populated ledgerGenerated by Claude Code