Conversation
…usage, and time diffs - Add AuditDiff top-level struct combining FirewallDiff, MCPToolsDiff, and RunMetricsDiff - Add MCPToolDiffEntry, MCPToolsDiff, MCPToolsDiffSummary types for MCP tool comparison - Add RunMetricsDiff type for token usage, duration, and turns comparison - Add computeAuditDiff, computeMCPToolsDiff, computeRunMetricsDiff functions - Add formatCountChange helper for absolute delta formatting - Replace loadFirewallAnalysisForRun with loadRunSummaryForDiff (returns full RunSummary) - Update render functions (JSON, markdown, pretty) for all new diff sections - Add isEmptyMCPToolsDiff and isEmptyAuditDiff helpers - Add 17 new tests covering all new compute functions and edge cases - Update command help text to describe new diff capabilities Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d24c8923-28dd-4082-98b1-435af2aa7d2a Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d24c8923-28dd-4082-98b1-435af2aa7d2a Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Enhances audit diff to compare additional run behavior beyond firewall domains by introducing a unified AuditDiff output that can include MCP tool invocation changes and run-level metrics when cached summaries are available.
Changes:
- Introduces
AuditDiffand computes MCP tool usage diffs and run metrics diffs alongside the existing firewall diff. - Updates CLI command flow to load full cached
RunSummarywhen possible and fall back to firewall-only analysis on cache miss. - Extends JSON/markdown/pretty renderers to display MCP tool changes and run metrics sections.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pkg/cli/audit_diff.go | Adds new diff data models and computation logic for MCP tools and run metrics; replaces firewall-only loader with loadRunSummaryForDiff. |
| pkg/cli/audit_diff_command.go | Updates command help text and switches runtime behavior to compute/render the new AuditDiff. |
| pkg/cli/audit_diff_render.go | Reworks renderers to output a full audit diff across firewall, MCP tools, and run metrics. |
| pkg/cli/audit_diff_test.go | Adds unit tests covering MCP tools diffing, run metrics diffs, audit diff composition, and JSON serialization. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if s2.ErrorCount > s1.ErrorCount { | ||
| entry.IsAnomaly = true | ||
| entry.AnomalyNote = "error count increased" | ||
| anomalyCount++ | ||
| } |
There was a problem hiding this comment.
The PR description calls out anomaly flags for "increased error rates", but the current anomaly detection only checks ErrorCount increasing (absolute count). This can miss cases where call volume increases and the error rate increases even if error count is flat, or where error count increases but rate improves. Consider computing and comparing error rate (ErrorCount/CallCount) when both call counts are >0 and flagging anomalies based on rate deltas (in addition to absolute error count if desired).
| fmt.Fprintln(os.Stderr, console.FormatInfoMessage("Changes: "+strings.Join(summaryParts, " | "))) | ||
| } | ||
| if anomalyCount > 0 { | ||
| fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("⚠️ %d anomalies detected", anomalyCount))) |
There was a problem hiding this comment.
console.FormatWarningMessage already prefixes the message with a warning icon (see pkg/console/console.go:156-159). Including "
| fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("⚠️ %d anomalies detected", anomalyCount))) | |
| fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("%d anomalies detected", anomalyCount))) |
| type AuditDiff struct { | ||
| Run1ID int64 `json:"run1_id"` | ||
| Run2ID int64 `json:"run2_id"` | ||
| FirewallDiff *FirewallDiff `json:"firewall_diff,omitempty"` |
There was a problem hiding this comment.
AuditDiff.FirewallDiff is tagged with omitempty, but computeAuditDiff always assigns a non-nil FirewallDiff (even when there are no firewall changes). This means firewall_diff will always be serialized, defeating omitempty and making the JSON shape less consistent with the intent of optional sections. Either set FirewallDiff to nil when isEmptyDiff(...) is true, or remove omitempty if you want it always present.
| FirewallDiff *FirewallDiff `json:"firewall_diff,omitempty"` | |
| FirewallDiff *FirewallDiff `json:"firewall_diff"` |
Document the `gh aw audit diff <run-id-1> <run-id-2>` command, which compares firewall behavior, MCP tool invocations, and run metrics between two workflow runs. Added as part of coverage for #23118 (MCP tool/token/ duration diffs) and the original audit diff addition. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
audit diffonly compared firewall domain behavior. This upgrades it to also surface MCP tool invocation changes and run-level metrics when cached run summaries are available.New diff sections
Data model
Introduces
AuditDiffas the new top-level output, wrapping existingFirewallDiffplus two new sections:MCPToolsDifftracks new/removed/changed(server, tool)pairs with call counts and error counts.RunMetricsDiffis omitted when all values are zero (i.e., no cached summary available).Loading strategy
loadFirewallAnalysisForRunis replaced byloadRunSummaryForDiff, which returns the fullRunSummaryfrom cache (giving MCP + metrics data) and falls back to a firewall-only partial summary on cache miss — preserving backward compatibility.Output formats
All three output formats (pretty, markdown, JSON) render the new sections as sub-sections after the existing firewall diff. The pretty format aggregates a cross-section summary line.