Fix: firewall proxy cap inflating blocked-request severity to "high"#28890
Fix: firewall proxy cap inflating blocked-request severity to "high"#28890
Conversation
…count is at cap Add `firewallBlockedRequestCap = 50` constant documenting the Squid proxy truncation limit. When any contributing run's `FirewallAnalysis.BlockedRequests` equals or exceeds this value, `EpisodeData.BlockedRequestAtCap` is set to true so consumers can distinguish a real high count from a proxy-truncated sentinel. In both `buildAuditObservabilityInsights` (single run) and `buildLogsObservabilityInsights` (multi-run hotspot), the absolute-count upgrade path (`blocked >= 10 → severity = "high"`) is suppressed when the count is at cap. The rate-based upgrade path (`>= 50% block rate`) is unaffected, so a genuinely high block rate still surfaces as high severity even at cap. Adds 5 new unit tests covering cap detection, below-cap false-negative, and both severity paths for audit and logs insights. Agent-Logs-Url: https://github.com/github/gh-aw/sessions/f5658b1d-9be4-445c-908c-ab434c633611 Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Adjusts network observability severity so Squid firewall proxy truncation at 50 blocked requests no longer automatically inflates insights to "high" based on absolute blocked counts.
Changes:
- Adds “at-cap” tracking (
blockedAtCap/BlockedRequestAtCap) whenFirewallAnalysis.BlockedRequests >= 50. - Gates the
"high"severity absolute-count trigger (blocked >= 10) on!blockedAtCapin both audit and logs insights; keeps the rate-based"high"path unchanged. - Adds unit tests covering cap detection and the updated severity behavior.
Show a summary per file
| File | Description |
|---|---|
| pkg/cli/observability_insights.go | Tracks firewall cap and gates absolute-count escalation for network severity in audit and aggregated logs insights. |
| pkg/cli/observability_insights_test.go | Adds tests ensuring severity is suppressed at cap (unless block rate is high). |
| pkg/cli/logs_episode.go | Introduces firewallBlockedRequestCap constant and records EpisodeData.BlockedRequestAtCap. |
| pkg/cli/logs_episode_test.go | Adds tests validating episode cap detection behavior. |
Copilot's findings
Tip
Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comments suppressed due to low confidence (1)
pkg/cli/observability_insights.go:287
- The new
blockedAtCapgating fixes severity inflation, but the emitted hotspot insight still reportsblockedas an exact count inSummary/Evidenceeven whenblockedAtCapis true (the underlying per-run counts may have been truncated at the proxy cap). Consider adjusting the hotspot message to reflect that the blocked total is a lower bound whenblockedAtCapis set.
if networkRate >= 0.5 || (networkHotspot.blocked >= 10 && !networkHotspot.blockedAtCap) {
severity = "high"
}
insights = append(insights, ObservabilityInsight{
Category: "network",
- Files reviewed: 4/4 changed files
- Comments generated: 2
| // firewallBlockedRequestCap is the maximum blocked-request count the Squid firewall | ||
| // proxy reports before truncating. A BlockedRequestCount equal to this value may | ||
| // represent any number of requests >= 50 and should not be treated as an exact figure. |
There was a problem hiding this comment.
The comment for firewallBlockedRequestCap refers to BlockedRequestCount, but this cap is about the per-run firewall metric (FirewallAnalysis.BlockedRequests) being truncated by Squid. As written it can be confused with the aggregated episode field EpisodeData.BlockedRequestCount (which can exceed 50 across runs). Consider rewording to explicitly reference FirewallAnalysis.BlockedRequests (and that 50 means “>= 50”, not an exact count).
| // firewallBlockedRequestCap is the maximum blocked-request count the Squid firewall | |
| // proxy reports before truncating. A BlockedRequestCount equal to this value may | |
| // represent any number of requests >= 50 and should not be treated as an exact figure. | |
| // firewallBlockedRequestCap is the maximum per-run blocked-request count that the | |
| // Squid firewall proxy reports in FirewallAnalysis.BlockedRequests before truncating. | |
| // A FirewallAnalysis.BlockedRequests value equal to this cap means ">= 50", not an | |
| // exact count of 50. |
| summary := fmt.Sprintf("The firewall observed %d request(s) with %d blocked, for a %.0f%% block rate.", processedRun.FirewallAnalysis.TotalRequests, processedRun.FirewallAnalysis.BlockedRequests, blockedRate*100) | ||
| if processedRun.FirewallAnalysis.BlockedRequests > 0 { | ||
| title = "Network friction detected" | ||
| severity = "medium" | ||
| if blockedRate >= 0.5 || processedRun.FirewallAnalysis.BlockedRequests >= 10 { | ||
| blockedAtCap := processedRun.FirewallAnalysis.BlockedRequests >= firewallBlockedRequestCap | ||
| if blockedRate >= 0.5 || (processedRun.FirewallAnalysis.BlockedRequests >= 10 && !blockedAtCap) { | ||
| severity = "high" |
There was a problem hiding this comment.
When blockedAtCap is true, the blocked count is a lower bound (Squid truncates at the cap), but the insight Summary/Evidence still presents the blocked count as an exact number (e.g., “50 blocked”). Consider adjusting the wording when at cap (e.g., “>= 50 blocked” and/or appending a note that the count may be truncated) to avoid misleading output.
This issue also appears on line 283 of the same file.
🧪 Test Quality Sentinel ReportTest Quality Score: 80/100✅ Excellent test quality
Test Classification Details
Flagged Tests — Requires ReviewNo tests flagged. All new tests demonstrate strong behavioral coverage. Test Inflation NoteThe test files added 138 lines total (62 + 76) versus 16 production lines (9 + 7), a ratio well above 2:1. However, this is expected and appropriate here — the fix targets nuanced boundary conditions in the severity-classification logic that require several explicit scenario tests to cover the cap/rate matrix. The inflation penalty (−10 pts) is applied but does not indicate a quality problem in this case. Language SupportTests analyzed: Verdict
📖 Understanding Test ClassificationsDesign Tests (High Value) verify what the system does:
Implementation Tests (Low Value) verify how the system does it:
Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators. References: §25045413707
|
🏗️ Design Decision Gate — ADR RequiredThis PR makes significant changes to core business logic (154 new lines in AI has analyzed the PR diff and generated a draft ADR to help you get started: 📄 Draft ADR: What to do next
Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision. Why ADRs Matter
ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you. 📋 Michael Nygard ADR Format ReferenceAn ADR must contain these four sections to be considered complete:
All ADRs are stored in
References: §25045413687
|
The Squid firewall proxy truncates
blocked_requestsat 50. Because the network insight'sseverity = "high"upgrade fired onblocked >= 10, any run hitting the cap was guaranteed high severity regardless of actual block rate, inflating risk scores across the fleet.Changes
EpisodeData: newBlockedRequestAtCap boolfield (blocked_request_at_cap,omitempty); set when any contributing run'sFirewallAnalysis.BlockedRequests >= firewallBlockedRequestCap (50).workflowObservabilityStats: newblockedAtCap boolfield, propagated during firewall accumulation inbuildLogsObservabilityInsights.buildAuditObservabilityInsightsand multi-runbuildLogsObservabilityInsights): the absolute-count path (blocked >= 10 → "high") is gated on!blockedAtCap. The rate-based path (>= 50% block rate) is unchanged — a genuinely high block rate still produces"high"even when at cap.