feat: auto-enrich gh aw audit with firewall policy rule attribution#22756
feat: auto-enrich gh aw audit with firewall policy rule attribution#22756
Conversation
Add PolicyManifest, PolicyRule Go structs and audit.jsonl JSONL parser. Port enrichWithPolicyRules/domainMatchesRule from TypeScript to Go. Auto-detect firewall audit artifacts (policy-manifest.json + audit.jsonl). Integrate enriched policy analysis into audit console and JSON output. Add comprehensive unit tests for all new functionality. Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/36ac2fc6-303e-452b-b36f-cadde5ac87f8
Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/36ac2fc6-303e-452b-b36f-cadde5ac87f8
Mossaka
left a comment
There was a problem hiding this comment.
Code Review — PR #22756
Good first implementation! The overall structure is clean, follows gh-aw patterns well (logger namespace, table-driven tests, struct tags, console.RenderTable, stringutil.Truncate), and the JSONL parser + artifact detection are solid. However, there are several behavioral divergences from the TypeScript source in gh-aw-firewall/src/logs/audit-enricher.ts that will cause incorrect rule attribution. These need fixing before merge.
🔴 Critical: findMatchingRule is missing three key behaviors from the TypeScript version
The TypeScript findMatchingRule() in audit-enricher.ts has three behaviors that the Go port omits:
1. No aclName === "all" handling for default deny rules
The TS version (lines 100-104) has:
if (rule.aclName === 'all') {
if (expectedAction === 'deny') return rule;
continue;
}The Go version has no equivalent. In a real policy-manifest.json, the default deny rule has aclName: "all" with empty domains. Your current code will skip it (empty domains → no match), falling through to (implicit-deny). This means all denied traffic will be attributed to "implicit deny" instead of the actual deny-default rule, making the rule hit table inaccurate.
Your test works around this by giving the deny-all rule Domains: []string{"."}, but "." won't match arbitrary domains in the current matching logic either (confirmed: strings.HasSuffix("evil.com", ".") == false). So even in the test, evil.com falls through to implicit deny.
2. No protocol matching
The TS version checks protocolMatches(rule, entry.isHttps) (lines 78-83, 98). HTTPS-only rules shouldn't match HTTP requests, and vice versa. The Go version ignores protocol entirely.
You'll need to determine isHttps from the audit entry — the method === "CONNECT" check is a reliable signal (CONNECT = HTTPS tunneling).
3. No observed-decision validation
The TS version (line 95, 113) only attributes a rule if its action matches the observed outcome:
const expectedAction = entry.isAllowed ? 'allow' : 'deny';
// ...
if (rule.action === expectedAction) return rule;The Go version attributes based purely on domain match, so an allow rule could be "credited" for denied traffic if the domain matches but the request was actually blocked (e.g., by a port-based rule).
Suggested fix: Change findMatchingRule to take the full AuditLogEntry (not just host string) so it has access to method/status for protocol and decision matching.
🟡 Important: Missing regex domain matching
The TS domainMatchesRule (lines 44-57) detects regex-based rules (aclName.includes('regex') or regex metacharacters in domains) and applies regex matching. The Go version only handles plain domain patterns. This will cause regex rules (like dst_ipv4 patterns with ^[0-9]+ entries) to never match.
At minimum, add the regex detection + regexp.MustCompile path, or document that regex rules are intentionally unsupported with a comment explaining why.
🟡 Important: Missing fields in Go PolicyManifest
The TypeScript PolicyManifest has two additional fields:
hostAccessEnabled: boolean;
allowHostPorts: string | null;These should be added to the Go struct for forward compatibility, even if not used in the current rendering. Without them, round-tripping (unmarshal → marshal) loses data, and --json output will be incomplete.
HostAccessEnabled bool `json:"hostAccessEnabled"`
AllowHostPorts *string `json:"allowHostPorts"` // nullable → pointer🟡 Minor: Missing error:transaction-end-before-headers filter
The TS computeRuleStats (line 163) skips entries where url === 'error:transaction-end-before-headers' — these are benign Squid operational entries. The Go enrichWithPolicyRules counts them, which may inflate totals.
✅ What looks good
- Clean file organization (
firewall_policy.go+firewall_policy_test.go) - Logger namespace
cli:firewall_policyfollows convention - JSONL parser with proper buffer sizing and malformed-line tolerance
- Artifact detection with both
sandbox/firewall/audit/andfirewall-audit-*fallback paths - Good test coverage: domain matching edge cases (suffix collision, case insensitivity, port stripping), manifest loading with sorting, JSONL parsing with empty/malformed lines
- Proper integration into
AuditDatapipeline andrenderConsoleflow --jsonoutput automatically works via struct tagsformatUnixTimestamphelper is clean
@copilot Please address the critical findMatchingRule issues (aclName "all" handling, protocol matching, and observed-decision validation) — these are needed for correct rule attribution. The regex and missing-fields issues are important but lower priority. Happy to discuss any of these further!
…alidation, regex, missing fields, squid filter - findMatchingRule now takes full AuditLogEntry instead of just host string - Add aclName "all" catch-all handling for default deny rules - Add protocol matching (CONNECT = HTTPS, else HTTP) - Add observed-decision validation (rule action must match observed outcome) - Add regex domain matching via aclName "regex" detection or metachar detection - Add missing PolicyManifest fields (hostAccessEnabled, allowHostPorts) - Filter error:transaction-end-before-headers Squid operational entries - Add new tests: TestProtocolMatching, TestIsEntryHTTPS, TestIsEntryAllowed, TestContainsRegexMeta, hostAccessEnabled manifest test, squid filter test - Update existing tests to use full AuditLogEntry with method/status/decision Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/8a6f8f24-239c-4326-b935-1553bf8d7487
All issues addressed in 89e65a6: 🔴 Critical
🟡 Regex domain matching:
🟡 Missing PolicyManifest fields:
🟡 Squid filter:
Tests: Added |
Mossaka
left a comment
There was a problem hiding this comment.
Follow-up Review — Commit 89e65a6
@copilot Great job addressing the review feedback! All critical and important issues have been resolved:
✅ Critical issues fixed
-
aclName "all"handling —findMatchingRulenow correctly handles catch-all rules withrule.ACLName == "all", matching any domain when the action matches the observed outcome. Tests verify both the match case (denied evil.com → deny-all) and the skip case (allowed unknown.com → nil). -
Protocol matching —
protocolMatches()andisEntryHTTPS()are correctly implemented. CONNECT → HTTPS, everything else → HTTP. Tests cover HTTPS-only rule matching/skipping. -
Observed-decision validation —
isEntryAllowed()mirrors the existing firewall log classifier (200/206/304 = allowed, 403/407 = denied, fallback to decision field).findMatchingRulenow only attributes whenrule.Action == expectedAction. Test "allow rule skipped for denied traffic" confirms this.
✅ Important issues fixed
-
Regex domain matching —
containsRegexMeta()+domainMatchesRegex()correctly detect and handle regex-based rules. Tests cover IP pattern and metacharacter detection. -
Missing PolicyManifest fields —
HostAccessEnabled boolandAllowHostPorts *stringadded with proper JSON tags. Test round-trips these fields. -
error:transaction-end-before-headersfilter — Added inenrichWithPolicyRuleswith test coverage.
✅ Test coverage is comprehensive
TestFindMatchingRulenow uses fullAuditLogEntrywith method/status/decisionTestProtocolMatchingcovers HTTPS-only rulesTestIsEntryHTTPSandTestIsEntryAllowedcover classification edge casesTestContainsRegexMetatests metacharacter detectionTestEnrichWithPolicyRulesnow verifies deny-all attribution and Squid error filteringTestAnalyzeFirewallPolicyfull-enrichment test uses properaclName: "all"deny rule
The implementation now faithfully ports the TypeScript audit-enricher.ts logic. This looks good to merge once CI passes! 👍
|
@copilot update the daily firewall report agentic workflow to leverage this data |
…ttribution data Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d0283a81-24a3-4d01-bf28-71ac57201e36
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d0283a81-24a3-4d01-bf28-71ac57201e36
Updated the daily firewall report workflow in
The section is conditionally included — only when at least one run has Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
|
@copilot Use formatters from console pkg |
There was a problem hiding this comment.
Pull request overview
Adds automatic firewall policy rule attribution to gh aw audit when policy-manifest.json and audit.jsonl are present, enriching the report with per-rule hit counts and denied request details while preserving the existing fallback behavior when artifacts are absent.
Changes:
- Introduces
pkg/cli/firewall_policy.goto detect artifacts, parse JSONL, match requests to policy rules, and compute enriched policy analysis. - Wires
PolicyAnalysisinto audit processing, JSON output, console rendering, and cached run summaries. - Updates the daily firewall report workflow guide to document the new
policy_analysisoutput shape and suggested aggregations.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
pkg/cli/firewall_policy.go |
Implements policy manifest loading, audit JSONL parsing, rule matching, and enrichment output (PolicyAnalysis). |
pkg/cli/firewall_policy_test.go |
Adds unit tests for domain/rule matching, parsing, artifact detection, enrichment, and timestamp formatting. |
pkg/cli/audit.go |
Runs policy analysis during audit and attaches it to ProcessedRun / RunSummary. |
pkg/cli/audit_report.go |
Adds policy_analysis to the structured JSON output (AuditData). |
pkg/cli/audit_report_render.go |
Renders a new “Firewall Policy Analysis” console section and formats timestamps. |
pkg/cli/logs_models.go |
Persists PolicyAnalysis in the cached run summary model. |
.github/workflows/daily-firewall-report.md |
Documents how to consume policy_analysis in the daily report process. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/cli/firewall_policy.go
Outdated
| // No matching rule — implicit deny | ||
| enriched.RuleID = "(implicit-deny)" | ||
| enriched.Action = "deny" | ||
| enriched.Reason = "No matching policy rule" | ||
| deniedRequests = append(deniedRequests, enriched) | ||
| deniedCount++ |
There was a problem hiding this comment.
When no policy rule matches an entry, the code unconditionally treats the request as an implicit deny (adds it to DeniedRequests and increments deniedCount). This will misclassify requests that were actually allowed but couldn’t be attributed to a specific allow rule (e.g., due to non-domain ACLs, IP-based rules, or gaps in manifest rule parsing). Consider deriving allowed/denied from the observed entry outcome (reusing isEntryAllowed(entry)) and only classifying as "(implicit-deny)" when the entry was actually denied; for unmatched-but-allowed traffic, keep it allowed and attribute it to a separate placeholder like "(unattributed-allow)" (and add a test for this case).
| // No matching rule — implicit deny | |
| enriched.RuleID = "(implicit-deny)" | |
| enriched.Action = "deny" | |
| enriched.Reason = "No matching policy rule" | |
| deniedRequests = append(deniedRequests, enriched) | |
| deniedCount++ | |
| // No matching rule — derive outcome from entry | |
| if isEntryAllowed(entry) { | |
| // Allowed but cannot be attributed to a specific manifest rule | |
| enriched.RuleID = "(unattributed-allow)" | |
| enriched.Action = "allow" | |
| allowedCount++ | |
| } else { | |
| // No matching rule and the entry was actually denied — implicit deny | |
| enriched.RuleID = "(implicit-deny)" | |
| enriched.Action = "deny" | |
| enriched.Reason = "No matching policy rule" | |
| deniedRequests = append(deniedRequests, enriched) | |
| deniedCount++ | |
| } |
pkg/cli/firewall_policy.go
Outdated
| // Strip port for domain tracking | ||
| domain := host | ||
| if idx := strings.LastIndex(host, ":"); idx != -1 { | ||
| domain = host[:idx] | ||
| } |
There was a problem hiding this comment.
UniqueDomains is tracked using the raw host-derived domain without normalizing case. Since matching logic lowercases domains, this can over-count unique domains if the audit log contains mixed-case hosts. Consider lowercasing the stripped domain before inserting into uniqueDomains (and ideally share the same host→domain normalization helper used by domainMatchesRule).
| // Strip port for domain tracking | |
| domain := host | |
| if idx := strings.LastIndex(host, ":"); idx != -1 { | |
| domain = host[:idx] | |
| } | |
| // Strip port for domain tracking and normalize case | |
| domain := host | |
| if idx := strings.LastIndex(host, ":"); idx != -1 { | |
| domain = host[:idx] | |
| } | |
| domain = strings.ToLower(domain) |
| // domainMatchesRegex checks if a domain matches any regex pattern in the list. | ||
| func domainMatchesRegex(domain string, patterns []string) bool { | ||
| for _, pattern := range patterns { | ||
| re, err := regexp.Compile(pattern) | ||
| if err != nil { | ||
| firewallPolicyLog.Printf("Invalid regex pattern %q: %v", pattern, err) | ||
| continue | ||
| } | ||
| if re.MatchString(domain) { | ||
| return true | ||
| } | ||
| } |
There was a problem hiding this comment.
domainMatchesRegex recompiles each regex pattern on every match attempt. For large audit logs this can become a noticeable hotspot and will also spam logs repeatedly for any invalid pattern. Consider compiling regexes once (e.g., during loadPolicyManifest) into a cached []*regexp.Regexp on the rule (json:"-"), or caching compiled patterns in a map keyed by pattern string, so matching is just re.MatchString().
Mossaka
left a comment
There was a problem hiding this comment.
Review — Commits 5-6 (Workflow Update)
@copilot The daily firewall report workflow update looks good overall. The documentation accurately reflects the PolicyAnalysis struct fields and provides useful jq examples.
✅ Verified
- JSON field names in workflow docs match Go struct tags (
policy_summary,rule_hits,denied_requests, etc.) - Conditional "when
policy_analysisis present" phrasing is correct — graceful fallback tofirewall_analysis - Section numbering correctly updated (4→5→6→7)
- jq examples are syntactically valid
⚠️ Issues from Copilot Reviewer worth addressing
The copilot-pull-request-reviewer raised 3 points that I agree with:
- Implicit deny misclassification (important): In
enrichWithPolicyRules, whenfindMatchingRulereturns nil, the code unconditionally classifies it as(implicit-deny). But in the TS source,enrichWithPolicyRuleschecksentry.isAllowedto determine the fallback reason:"Allowed (rule not identified)"vs"Denied (rule not identified)". The Go version should do the same:
} else {
// No matching rule — check observed decision
if isEntryAllowed(entry) {
enriched.RuleID = "(unattributed-allow)"
enriched.Action = "allow"
enriched.Reason = "Allowed (rule not identified)"
allowedCount++
} else {
enriched.RuleID = "(implicit-deny)"
enriched.Action = "deny"
enriched.Reason = "No matching policy rule"
deniedRequests = append(deniedRequests, enriched)
deniedCount++
}
}-
Unique domains case normalization (minor):
uniqueDomainsmap should usestrings.ToLower(domain)since domain matching is case-insensitive. -
Regex recompilation (performance, low priority):
domainMatchesRegexrecompiles patterns on every call. Could cache compiled regexes inloadPolicyManifest. Low priority for Phase 1 since audit logs are typically small.
📋 Pending: pelikhan's "Use formatters from console pkg"
I noticed the summary stats in renderPolicyAnalysis use raw fmt.Fprintf for the counts section. The existing pattern elsewhere is console.RenderStruct for key-value displays — you should switch to that for consistency.
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/6842f536-c9de-4410-8020-70b30473730c
Done in 2b29236.
Also fixed a |
…alize domain case Co-authored-by: Mossaka <5447827+Mossaka@users.noreply.github.com> Agent-Logs-Url: https://github.com/github/gh-aw/sessions/f7144292-b3b2-4f1f-ab3d-d759920e039f
All three issues addressed in f991b87:
|
Mossaka
left a comment
There was a problem hiding this comment.
Follow-up Review — Commit f991b87
@copilot All three issues resolved correctly:
- ✅ Unattributed-allow: Matches TS behavior —
isEntryAllowed(entry)check on the fallback path - ✅ Case normalization:
strings.ToLower(domain)before inserting intouniqueDomains - ✅ Tests: Both new test cases are well-structured and cover the exact scenarios
The commit 7 (console formatters) also looks clean — PolicySummaryDisplay with console tags and RenderStruct follows the established pattern.
No further issues from me on commits 5-8. The remaining item from the Copilot reviewer (regex caching) is a performance optimization that's fine to skip for Phase 1.
When
policy-manifest.jsonandaudit.jsonlare present in downloaded firewall artifacts,gh aw auditnow automatically enriches the report with policy rule attribution — rule hit counts, denied request details, and summary statistics. Falls back to existing domain-count-only behavior when artifacts are absent. No new flags.New:
pkg/cli/firewall_policy.goPolicyManifest,PolicyRule,AuditLogEntry,EnrichedRequest,RuleHitStats,PolicyAnalysisparseAuditJSONL()— one JSON object per line, skips malformed linesdomainMatchesRule()—.github.commatches bothgithub.comand*.github.comper AWF conventions; regex-based rules detected viacontainsRegexMeta()+domainMatchesRegex()findMatchingRule()— top-to-bottom byorder, first match wins; includesaclName "all"catch-all handling, protocol matching (protocolMatches()/isEntryHTTPS()), and observed-decision validation (isEntryAllowed())enrichWithPolicyRules()— computes rule hits, denied request list, summary stats; filterserror:transaction-end-before-headersSquid entriesdetectFirewallAuditArtifacts()— discovers files insandbox/firewall/audit/orfirewall-audit-logs/Modified
audit.go— callsanalyzeFirewallPolicy(), wiresPolicyAnalysisintoProcessedRunandRunSummaryaudit_report.go— addsPolicyAnalysisfield toAuditDataaudit_report_render.go—renderPolicyAnalysis()renders rule hit table and denied request details;formatUnixTimestamp()helperlogs_models.go— addsPolicyAnalysistoProcessedRunandRunSummaryfor cache persistenceUpdated:
.github/workflows/daily-firewall-report.mdUpdated the daily firewall report agentic workflow to leverage the new
policy_analysisdata from the audit tool:policy_analysisfields returned by the audit tool (rule_hits,denied_requests,policy_summary, etc.)policy_analysisdata is available; falls back gracefully to domain-count-only reportingExample console output
Tests
pkg/cli/firewall_policy_test.go— domain matching, rule matching (includingaclName "all", protocol matching, observed-decision validation), regex metacharacter detection, manifest loading with round-trip forHostAccessEnabled/AllowHostPorts, JSONL parsing, enrichment with Squid error filtering, artifact detection, end-to-end analysis, timestamp formatting.📍 Connect Copilot coding agent with Jira, Azure Boards or Linear to delegate work to Copilot in one click without leaving your project management tool.