Skip to content

Wire remaining burn hotspots CLI modes (#376)#387

Open
willwashburn wants to merge 2 commits intomainfrom
claude/fix-issue-376-Vp2Ju
Open

Wire remaining burn hotspots CLI modes (#376)#387
willwashburn wants to merge 2 commits intomainfrom
claude/fix-issue-376-Vp2Ju

Conversation

@willwashburn
Copy link
Copy Markdown
Member

Closes #376.

Summary

Wires the remaining burn hotspots flags from the 1.x surface over the existing SDK plumbing (and lights up the few SDK fields that were missing).

  • --session <id> now flows the existing HotspotsOptions.session through the SDK instead of erroring out at the CLI guard.
  • --workflow <id> and --provider <csv> are added to HotspotsOptions, threaded into build_query enrichment + a post-query provider filter matching the shape compare() already uses, and surfaced through the napi facade and @relayburn/sdk types.
  • --patterns [csv] and --findings dispatch to the SDK's existing run_hotspots_findings path. A CLI-side detector validator rejects unknown kinds, and a human renderer renders per-detector groupings by default — --findings swaps in the unified severity-ranked table.
  • --group-by and --patterns/--findings are explicitly mutually exclusive (group-by selects an attribution rollup; patterns/findings drive the detector view).
  • The per-session aggregate view (--session with no id) and --explain-drift remain stubs but now exit 2 with directed messaging — the relationship-drift / chronology query verbs aren't yet exposed by the SDK.

Test plan

  • cargo build --workspace
  • cargo test --workspace (3 new SDK integration tests for --session, --workflow, --provider; 4 new CLI smoke tests)
  • BURN_GOLDEN=1 cargo test --test golden -- --include-ignored (existing snapshots remain byte-identical)
  • Manual: burn hotspots --patterns retry-loop --json and burn hotspots --findings against a populated ledger

Generated by Claude Code

Closes the 2.x parity gap on `burn hotspots`:

- `--session <id>` now flows the existing `HotspotsOptions.session` through
  the SDK instead of erroring out.
- `--workflow <id>` and `--provider <csv>` are added to `HotspotsOptions`,
  threaded into `build_query` enrichment + a post-query provider filter
  matching the shape `compare()` already uses, and exposed through the
  napi facade and `@relayburn/sdk` types.
- `--patterns [csv]` and `--findings` dispatch to the SDK's existing
  `run_hotspots_findings` path, with a CLI-side detector validator and a
  human renderer (per-detector grouping by default; `--findings` swaps in
  the unified severity-ranked table).
- `--group-by` and `--patterns/--findings` are explicitly mutually
  exclusive.
- The per-session aggregate view (`--session` with no id) and
  `--explain-drift` stay stubs but now exit 2 with directed messaging
  pointing at the supported flag forms.

Tests: 3 new SDK integration tests for `--session`, `--workflow`,
`--provider` filter behaviors; 4 new CLI smoke tests covering the stub
exits and the new validation paths. Existing golden snapshots remain
unchanged.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 8, 2026

Review Change Stack
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: e5af7953-00c3-4985-bb2c-32d5a712ffb6

📥 Commits

Reviewing files that changed from the base of the PR and between 5a4667b and 191b372.

📒 Files selected for processing (4)
  • CHANGELOG.md
  • crates/relayburn-cli/src/commands/hotspots.rs
  • crates/relayburn-sdk/src/query_verbs.rs
  • packages/sdk-node/CHANGELOG.md
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md

📝 Walkthrough

Walkthrough

Adds workflow and provider filters to HotspotsOptions across SDK and Node facade, wires CLI flags (--workflow, --provider, --patterns, --findings, --session ), enforces CLI gating (stubbed --session aggregate and --explain-drift exit code 2), implements pattern validation, renders findings (unified or grouped), and adds regression/unit tests and docs.

Changes

Hotspots CLI Modes Expansion

Layer / File(s) Summary
Node & Type Declarations
crates/relayburn-sdk-node/src/lib.rs, packages/sdk-node/src/index.d.ts
Node wrapper and TypeScript declarations expose workflow and provider on HotspotsOptions and forward them to the SDK.
Core SDK Type
crates/relayburn-sdk/src/query_verbs.rs
HotspotsOptions gains workflow and provider fields in the Rust SDK public type.
SDK Query Implementation
crates/relayburn-sdk/src/query_verbs.rs
LedgerHandle::hotspots injects workflowId enrichment when opts.workflow set, preserves enrichment in side queries, post-filters turns by derived provider allowlist, re-sorts combined findings; unit tests added for session/workflow/provider.
CLI Pattern Selection Resolution
crates/relayburn-cli/src/commands/hotspots.rs
Adds PATTERN_KINDS and resolve_pattern_selection() to validate/deduplicate pattern names and default to all detectors when none provided.
CLI Argument Gating & Validation
crates/relayburn-cli/src/commands/hotspots.rs
run_inner stubs per-session aggregate (--session without id) and --explain-drift (exit code 2), treats --findings as selecting all detectors, enforces mutual exclusivity of --group-by with --patterns/--findings, and parses --provider into a list.
CLI Options Wiring to SDK
crates/relayburn-cli/src/commands/hotspots.rs
Constructs sdk::HotspotsOptions with resolved patterns, workflow, provider, session filter etc., and calls the SDK hotspots verb.
CLI Output Rendering for Findings
crates/relayburn-cli/src/commands/hotspots.rs
Renderer now supports HotspotsResult::Findings as a unified findings table or grouped-by-kind view; adds severity_label() and per-kind capping.
CLI & SDK Tests
crates/relayburn-cli/tests/smoke.rs, crates/relayburn-sdk/src/query_verbs.rs
Adds CLI smoke tests for stub/validation cases and SDK unit tests validating session, workflow, and provider filtering.
Documentation & Changelogs
README.md, CHANGELOG.md, packages/sdk-node/CHANGELOG.md
Updates CLI options table, documents new flags and stub behavior, and records HotspotsOptions shape change in Node SDK changelog.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Possibly related PRs

  • AgentWorkforce/burn#384: Modifies the HotspotsOptions surface (Node SDK changelog/type changes); related to Node-facing option shape changes.
  • AgentWorkforce/burn#315: Modifies the hotspots CLI presenter and argument wiring; related to CLI parsing/rendering changes here.
  • AgentWorkforce/burn#370: Also touches hotspots CLI and SDK surfaces and result shaping, making it relevant at the code level.

Poem

🐰 I hopped through flags and fields today,
workflow tunnels led the way,
provider lists and pattern signs,
findings sorted into lines—
stubs wait patient for their play.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 31.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely summarizes the main change: wiring the remaining burn hotspots CLI modes, which aligns with the primary objective of the PR.
Description check ✅ Passed The description provides a detailed summary of changes across multiple components, clearly mapping to the CLI flag wiring objectives and test coverage mentioned.
Linked Issues check ✅ Passed The PR comprehensively implements all objectives from #376: wires --session, --explain-drift, --patterns, --findings, --workflow, and --provider with proper error handling and test coverage.
Out of Scope Changes check ✅ Passed All changes are scoped to wiring hotspots CLI modes and supporting SDK fields; documentation, type definitions, and test additions all directly support the stated objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/fix-issue-376-Vp2Ju
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch claude/fix-issue-376-Vp2Ju

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5a4667b227

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

"retry-loop",
"failure-run",
"cancellation-run",
"compaction",
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Use SDK finding-kind identifiers for pattern selection

PATTERN_KINDS is validating and forwarding detector IDs like compaction / opencode-system-prompt, but run_hotspots_findings() in the SDK filters pattern-derived findings by f.kind (e.g. compaction-loss, skill-recall-dup, skill-pruning-protection, system-prompt-tax). As a result, burn hotspots --patterns (including the empty "all detectors" form) silently drops those finding families, and users cannot pass the matching kind names because this validator rejects them.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
README.md (1)

62-73: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add --explain-drift to the hotspots options table.

It’s currently only mentioned in the note below the examples, so the README still under-documents the actual CLI surface users see in --help. A one-line table entry keeps the stubbed flag discoverable and aligned with the PR objective to document the implemented surface.

Also applies to: 85-86

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@README.md` around lines 62 - 73, Add a one-line table entry for the new CLI
flag `--explain-drift` to the hotspots options table (the same table that lists
`--since`, `--project`, `--patterns`, etc.) so the flag appears in `--help`;
include a brief description like "Emit explanations for detected drift in
hotspots" and also add the same one-line entry to the second hotspots/options
table referenced later in the README (the table noted below the examples) so
both locations are consistent.
🧹 Nitpick comments (2)
packages/sdk-node/CHANGELOG.md (1)

17-19: ⚡ Quick win

Drop the PR reference from the Unreleased entry.

This entry is already clear without (#376), and removing the issue reference keeps the package changelog aligned with the repo’s Unreleased-format rule.

As per coding guidelines, [Unreleased] changelog entries should be “concise and impact-first” and should “Drop issue/PR links, internal review notes, implementation backstory”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/sdk-node/CHANGELOG.md` around lines 17 - 19, Remove the PR reference
"(`#376`)" from the Unreleased changelog entry that documents the hotspots()
options; specifically edit the line mentioning "hotspots() options now accept
`workflow` ... and `provider` ... — same shape the `compare()` options expose."
to drop the trailing PR/issue link so the entry is concise and follows the
Unreleased-format rule.
CHANGELOG.md (1)

9-15: ⚡ Quick win

Trim the Unreleased note to the user-visible effect.

The (#376) suffix and the via napi implementation detail make this entry less curated than the repo changelog rule expects. Please keep it impact-first and drop issue/PR references/internal wiring details.

As per coding guidelines, [Unreleased] changelog entries should be “concise and impact-first” and should “Drop issue/PR links, internal review notes, implementation backstory”.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.md` around lines 9 - 15, Update the Unreleased changelog entry for
relayburn-cli / relayburn-sdk to be concise and impact-first: remove the
internal details "(via napi)" and the " (`#376`)" PR/issue reference, and drop
implementation/backstory like "wires ... over the existing SDK surface"; keep
only the user-visible effects such as that `burn hotspots` now accepts
`--session <id>`, `--workflow <id>`, `--provider <csv>`, `--patterns [csv]`, and
`--findings`, and that `HotspotsOptions` gains `workflow` and `provider` filters
while noting that the per-session aggregate view and `--explain-drift` remain
explicit stubs that exit 2.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/relayburn-cli/src/commands/hotspots.rs`:
- Around line 585-660: Sort the incoming findings by severity before rendering
so the views are severity-ranked and stable: in both emit_findings_unified and
emit_findings_grouped, create a sorted collection (e.g. collect findings.iter()
into a Vec<&WasteFinding>), sort it by severity (highest first) and a stable
tiebreaker (e.g. session_id or title), then iterate over that sorted collection
instead of the original findings slice; for grouped output, build groups from
the sorted collection so items within each kind preserve the severity order when
capping by limit.

In `@crates/relayburn-sdk/src/query_verbs.rs`:
- Around line 1069-1085: The side-data query builders (used by
run_hotspots_attribution and run_hotspots_findings) are still using the original
q (built by build_query) and only session/since, so they can pull unrelated
turns when a workflowId enrichment or provider slice was applied to turns (see
q.enrichment handling, collect_turns and normalize_provider_filter). Update the
code so the same slice is applied to the auxiliary queries: either propagate the
modified q (with enrichment including "workflowId") and/or a provider constraint
into the side-data query construction, or pass the already-filtered turns list
into run_hotspots_attribution/run_hotspots_findings so their
user-turn/tool-result-event queries are restricted to the same workflowId and
provider set as used for turns. Ensure the symbols build_query, q.enrichment,
collect_turns, normalize_provider_filter, run_hotspots_attribution and
run_hotspots_findings are updated accordingly.

---

Outside diff comments:
In `@README.md`:
- Around line 62-73: Add a one-line table entry for the new CLI flag
`--explain-drift` to the hotspots options table (the same table that lists
`--since`, `--project`, `--patterns`, etc.) so the flag appears in `--help`;
include a brief description like "Emit explanations for detected drift in
hotspots" and also add the same one-line entry to the second hotspots/options
table referenced later in the README (the table noted below the examples) so
both locations are consistent.

---

Nitpick comments:
In `@CHANGELOG.md`:
- Around line 9-15: Update the Unreleased changelog entry for relayburn-cli /
relayburn-sdk to be concise and impact-first: remove the internal details "(via
napi)" and the " (`#376`)" PR/issue reference, and drop implementation/backstory
like "wires ... over the existing SDK surface"; keep only the user-visible
effects such as that `burn hotspots` now accepts `--session <id>`, `--workflow
<id>`, `--provider <csv>`, `--patterns [csv]`, and `--findings`, and that
`HotspotsOptions` gains `workflow` and `provider` filters while noting that the
per-session aggregate view and `--explain-drift` remain explicit stubs that exit
2.

In `@packages/sdk-node/CHANGELOG.md`:
- Around line 17-19: Remove the PR reference "(`#376`)" from the Unreleased
changelog entry that documents the hotspots() options; specifically edit the
line mentioning "hotspots() options now accept `workflow` ... and `provider` ...
— same shape the `compare()` options expose." to drop the trailing PR/issue link
so the entry is concise and follows the Unreleased-format rule.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 86013be6-096a-4b2e-a8ad-e6c601d2da11

📥 Commits

Reviewing files that changed from the base of the PR and between 0f2dc86 and 5a4667b.

📒 Files selected for processing (8)
  • CHANGELOG.md
  • README.md
  • crates/relayburn-cli/src/commands/hotspots.rs
  • crates/relayburn-cli/tests/smoke.rs
  • crates/relayburn-sdk-node/src/lib.rs
  • crates/relayburn-sdk/src/query_verbs.rs
  • packages/sdk-node/CHANGELOG.md
  • packages/sdk-node/src/index.d.ts

Comment on lines +585 to +660
fn emit_findings_unified(findings: &[WasteFinding]) {
let mut out: Vec<String> = Vec::new();
out.push(String::new());
out.push(format!("findings: {}", format_uint(findings.len() as u64)));
out.push(String::new());
if findings.is_empty() {
out.push(" (no hotspot findings)".to_string());
out.push(String::new());
print!("{}", out.join("\n"));
return;
}
let mut rows: Vec<Vec<String>> = vec![vec![
"severity".into(),
"kind".into(),
"session".into(),
"usd".into(),
"title".into(),
]];
for f in findings {
let usd = f
.estimated_savings
.usd_per_session
.map(format_usd)
.unwrap_or_else(|| "—".to_string());
rows.push(vec![
severity_label(f.severity).to_string(),
f.kind.clone(),
f.session_id.chars().take(8).collect(),
usd,
truncate(&f.title, 80),
]);
}
out.push(render_table(&rows));
out.push(String::new());
print!("{}", out.join("\n"));
}

fn emit_findings_grouped(findings: &[WasteFinding], limit: usize) {
let mut out: Vec<String> = Vec::new();
out.push(String::new());
out.push(format!("findings: {}", format_uint(findings.len() as u64)));
out.push(String::new());
if findings.is_empty() {
out.push(" (no hotspot findings)".to_string());
out.push(String::new());
print!("{}", out.join("\n"));
return;
}
// Group by detector kind, preserving severity-sorted order of the
// sdk-emitted slice. Within each group we cap at `limit`.
use std::collections::BTreeMap;
let mut groups: BTreeMap<&str, Vec<&WasteFinding>> = BTreeMap::new();
for f in findings {
groups.entry(f.kind.as_str()).or_default().push(f);
}
for (kind, items) in &groups {
out.push(format!("{} ({})", kind, format_uint(items.len() as u64)));
let mut rows: Vec<Vec<String>> =
vec![vec!["severity".into(), "session".into(), "usd".into(), "title".into()]];
for f in items.iter().take(limit) {
let usd = f
.estimated_savings
.usd_per_session
.map(format_usd)
.unwrap_or_else(|| "—".to_string());
rows.push(vec![
severity_label(f.severity).to_string(),
f.session_id.chars().take(8).collect(),
usd,
truncate(&f.title, 70),
]);
}
out.push(render_table(&rows));
out.push(String::new());
}
print!("{}", out.join("\n"));
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Sort findings before rendering the “severity-ranked” views.

Both renderers consume the slice as-is. The SDK currently appends detector batches without a global severity sort, so a later high finding can appear below earlier warn/info rows. Sorting once here by severity, with a stable tiebreaker, would make --findings match the documented ranking and keep grouped output consistent too.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/relayburn-cli/src/commands/hotspots.rs` around lines 585 - 660, Sort
the incoming findings by severity before rendering so the views are
severity-ranked and stable: in both emit_findings_unified and
emit_findings_grouped, create a sorted collection (e.g. collect findings.iter()
into a Vec<&WasteFinding>), sort it by severity (highest first) and a stable
tiebreaker (e.g. session_id or title), then iterate over that sorted collection
instead of the original findings slice; for grouped output, build groups from
the sorted collection so items within each kind preserve the severity order when
capping by limit.

Comment thread crates/relayburn-sdk/src/query_verbs.rs
- Use SDK finding-kind names in hotspots `--patterns` validator
  (`compaction-loss`, `skill-recall-dup`, `skill-pruning-protection`,
  `system-prompt-tax`) so the empty "all detectors" form actually
  forwards every family the SDK can emit. (Codex P1)
- Sort findings once at the SDK boundary in `run_hotspots_findings` so
  the unified `--findings` view stays severity-descending after
  appending tool-output-bloat / ghost-surface / tool-call-pattern
  batches. (CodeRabbit)
- Propagate `q.enrichment` into the side-data query in
  `run_hotspots_attribution` and `run_hotspots_findings` so a partial-
  session `workflowId` stamp doesn't pull unrelated user-turns /
  tool-result events into the per-session buckets. (CodeRabbit)
- Trim Unreleased changelog entries to be impact-first per project
  guidelines.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2.x gap: port remaining burn hotspots CLI modes

2 participants