feat: enhance gh aw audit with firewall policy analysis, diff, and report commands #22736

## Problem Statement

`gh aw audit <run-id>` already downloads firewall artifacts (`audit.jsonl`, `policy-manifest.json`, `access.log`) but doesn't leverage the policy manifest for **rule attribution** — the ability to show *which policy rule* caused each allow/deny decision. The AWF CLI already has this capability (`awf logs audit` with `audit-enricher.ts`), but `gh aw audit` only shows basic domain-level allow/block counts.

Additionally, there's no way to:
- **Compare** firewall behavior across two runs (detect regressions, new denied domains, policy drift)
- **Generate comprehensive reports** across multiple runs for security review

## Proposed Changes (MVP-scoped)

### Phase 1: Enhance `gh aw audit <run-id>` with automatic policy enrichment

When firewall artifacts are present in the downloaded run directory (detected automatically — no new flags), enrich the audit output with:

- **Per-domain breakdown** with allow/deny counts (already exists in `firewall_log.go`)
- **Policy rule attribution** — for each domain, show which rule caused the decision
  - Requires parsing `policy-manifest.json` and replaying ACL evaluation order (port the logic from `audit-enricher.ts` to Go)
  - Rules are evaluated top-to-bottom by `order` field; first matching rule wins
  - Domain matching: a rule domain with a leading dot, such as `.github.com`, matches both the apex `github.com` and any subdomain (`*.github.com`)
- **Denied request details** with the specific rule ID, description, and reason
- **Summary statistics** — total requests, allowed, denied, unique domains, rule hit counts
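The evaluation order described above could be sketched in Go roughly as follows. This is a minimal sketch and not the actual port of `audit-enricher.ts`: the struct only carries the manifest fields it needs, the rule IDs in `main` are made up for illustration, and the helpers `hostOnly` and `attributeRule` are hypothetical names. It also assumes that rules with an empty `domains` list (port ACLs, catch-all rules) are handled elsewhere, since the proposal does not spell out how those are attributed.

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"strings"
)

// PolicyRule holds the subset of policy-manifest.json fields this sketch uses.
type PolicyRule struct {
	ID      string
	Order   int
	Action  string // "allow" or "deny"
	Domains []string
}

// hostOnly strips an optional ":port" suffix, e.g. "api.github.com:443".
func hostOnly(hostport string) string {
	if h, _, err := net.SplitHostPort(hostport); err == nil {
		return h
	}
	return hostport
}

// domainMatchesRule reports whether host matches one rule domain.
// A leading dot (".github.com") matches the apex and every subdomain.
func domainMatchesRule(host, ruleDomain string) bool {
	host = strings.ToLower(host)
	ruleDomain = strings.ToLower(ruleDomain)
	if strings.HasPrefix(ruleDomain, ".") {
		return host == ruleDomain[1:] || strings.HasSuffix(host, ruleDomain)
	}
	return host == ruleDomain
}

// attributeRule walks the rules in ascending Order and returns the first
// one that lists a domain matching the host (first match wins).
// ASSUMPTION: rules with an empty Domains list are skipped; how the real
// enricher treats port ACLs and catch-all rules is not modeled here.
func attributeRule(rules []PolicyRule, hostport string) (PolicyRule, bool) {
	sorted := append([]PolicyRule(nil), rules...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool { return sorted[i].Order < sorted[j].Order })
	host := hostOnly(hostport)
	for _, r := range sorted {
		for _, d := range r.Domains {
			if domainMatchesRule(host, d) {
				return r, true
			}
		}
	}
	return PolicyRule{}, false
}

func main() {
	// Hypothetical rules, deliberately listed out of order.
	rules := []PolicyRule{
		{ID: "deny-tracking", Order: 2, Action: "deny", Domains: []string{"tracker.io"}},
		{ID: "allow-github", Order: 1, Action: "allow", Domains: []string{".github.com"}},
	}
	for _, h := range []string{"api.github.com:443", "github.com:443", "tracker.io:443", "evil.com:80"} {
		if r, ok := attributeRule(rules, h); ok {
			fmt.Printf("%s -> %s (%s)\n", h, r.ID, r.Action)
		} else {
			fmt.Printf("%s -> no domain rule matched\n", h)
		}
	}
}
```

Sorting a copy keeps the manifest slice in file order, which may matter if the same slice is later used to render the rule table.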

**Data sources** (in priority order):
1. `audit.jsonl` — structured JSON lines, preferred
2. `access.log` — space-separated text format, fallback

**Policy manifest location**: `sandbox/firewall/policy/policy-manifest.json` under the run's artifact directory (see Artifact directory structure)

**Output**: Integrated into the existing audit markdown report (no separate flag). When `policy-manifest.json` is absent, fall back to current behavior (domain counts only).

<details>
<summary>Example enriched output section</summary>

```markdown
### Firewall Policy Analysis

**Policy**: 12 rules, SSL Bump disabled, DLP disabled

| Rule | Action | Description | Hits |
|------|--------|-------------|------|
| allow-both-plain | ✅ allow | Allow HTTP/HTTPS to whitelisted domains | 47 |
| deny-blocked-plain | ❌ deny | Deny all other HTTP/HTTPS traffic | 3 |
| deny-unsafe-ports | ❌ deny | Deny requests to unsafe ports | 0 |

**Denied Requests (3)**

| Time | Domain | Rule | Reason |
|------|--------|------|--------|
| 14:23:01 | evil.com:443 | deny-blocked-plain | Domain not in allowlist |
| 14:23:05 | tracker.io:443 | deny-blocked-plain | Domain not in allowlist |
| 14:24:12 | evil.com:80 | deny-blocked-plain | Domain not in allowlist |
```
</details>

#### Implementation notes

- **Reimplement in Go** — do NOT invoke `awf logs audit` as a subprocess (requires Docker/sudo)
- Port `enrichWithPolicyRules()` and `domainMatchesRule()` logic from `src/logs/audit-enricher.ts`
- Add `PolicyManifest` and `PolicyRule` structs mirroring the TypeScript types in `src/types.ts`
- Parse `audit.jsonl` as JSON lines; each line is one object with the fields `ts`, `client`, `host`, `dest`, `method`, `status`, `decision`, and `url` (see Data Format Reference)
- Extend existing `pkg/cli/firewall_log.go` with JSONL parsing + policy manifest loading
- Follow existing patterns: `console.Format*()` for output, `--json` support for structured output
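As a starting point for the structs and the JSONL parser, something like the following might work. It is a sketch under assumptions: the field names and JSON tags are taken from the samples in the Data Format Reference rather than from `src/types.ts`, only a subset of manifest fields is modeled, and `AuditEntry`, `parseAuditLine`, `parseAuditJSONL`, and `parseManifest` are hypothetical names.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// PolicyRule mirrors one entry of "rules" in policy-manifest.json.
type PolicyRule struct {
	ID          string   `json:"id"`
	Order       int      `json:"order"`
	Action      string   `json:"action"`
	ACLName     string   `json:"aclName"`
	Protocol    string   `json:"protocol"`
	Domains     []string `json:"domains"`
	Description string   `json:"description"`
}

// PolicyManifest models a subset of the manifest; unknown fields are ignored.
type PolicyManifest struct {
	Version        int          `json:"version"`
	GeneratedAt    string       `json:"generatedAt"`
	Rules          []PolicyRule `json:"rules"`
	SSLBumpEnabled bool         `json:"sslBumpEnabled"`
	DLPEnabled     bool         `json:"dlpEnabled"`
}

// AuditEntry is one line of audit.jsonl.
type AuditEntry struct {
	Timestamp float64 `json:"ts"`
	Client    string  `json:"client"`
	Host      string  `json:"host"`
	Dest      string  `json:"dest"`
	Method    string  `json:"method"`
	Status    int     `json:"status"`
	Decision  string  `json:"decision"`
	URL       string  `json:"url"`
}

// parseManifest decodes the raw bytes of policy-manifest.json.
func parseManifest(data []byte) (PolicyManifest, error) {
	var m PolicyManifest
	err := json.Unmarshal(data, &m)
	return m, err
}

// parseAuditLine decodes a single JSON line.
func parseAuditLine(line string) (AuditEntry, error) {
	var e AuditEntry
	err := json.Unmarshal([]byte(line), &e)
	return e, err
}

// parseAuditJSONL reads entries line by line, skipping blank lines and
// reporting the 1-based line number of the first malformed entry.
func parseAuditJSONL(r io.Reader) ([]AuditEntry, error) {
	var entries []AuditEntry
	sc := bufio.NewScanner(r)
	for n := 1; sc.Scan(); n++ {
		text := strings.TrimSpace(sc.Text())
		if text == "" {
			continue
		}
		e, err := parseAuditLine(text)
		if err != nil {
			return nil, fmt.Errorf("audit.jsonl line %d: %w", n, err)
		}
		entries = append(entries, e)
	}
	return entries, sc.Err()
}

func main() {
	sample := `{"ts":1761074374.646,"client":"172.30.0.20","host":"api.github.com:443","dest":"140.82.114.22:443","method":"CONNECT","status":200,"decision":"TCP_TUNNEL","url":"api.github.com:443"}` + "\n"
	entries, err := parseAuditJSONL(strings.NewReader(sample))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%d entry, host=%s decision=%s\n", len(entries), entries[0].Host, entries[0].Decision)
}
```

Failing on the first malformed line is one possible policy; skipping bad lines with a warning would be an equally defensible choice if partial logs are expected.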

#### Tasks
- [ ] Add `PolicyManifest`, `PolicyRule` Go structs
- [ ] Add `audit.jsonl` JSONL parser (complement existing `access.log` parser)
- [ ] Port `enrichWithPolicyRules()` / `domainMatchesRule()` to Go
- [ ] Detect firewall artifacts automatically in `AuditWorkflowRun()`
- [ ] Integrate enriched policy analysis into audit markdown output
- [ ] Add `--json` support for enriched firewall data
- [ ] Unit tests for rule matching, JSONL parsing, policy manifest loading

---

### Phase 2: `gh aw audit diff <run-id-1> <run-id-2>`

Compare firewall behavior across two workflow runs.

**Output includes**:
- **New domains** — domains contacted in run-2 but not run-1
- **Removed domains** — domains in run-1 but not run-2
- **Status changes** — domains that changed from allowed→denied or denied→allowed
- **Request volume changes** — significant increase/decrease in request counts per domain
- **Anomaly flags** — new denied domains, previously-denied domains now allowed

**Output formats**: `pretty` (default), `markdown` (`--format markdown`), `json` (`--json`)

**Usage**:
```bash
# Compare two runs
gh aw audit diff 12345 12346

# Markdown output for PR comments
gh aw audit diff 12345 12346 --format markdown

# JSON for CI integration
gh aw audit diff 12345 12346 --json
```

<details>
<summary>Example diff output</summary>

```markdown
### Firewall Diff: Run #12345 → Run #12346

**New domains (2)**
- ✅ `registry.npmjs.org` (15 requests, allowed)
- ❌ `telemetry.example.com` (2 requests, denied)

**Removed domains (1)**
- `old-api.internal.com` (was allowed, 8 requests in previous run)

**Status changes (1)**
- `staging.api.com`: ✅ allowed → ❌ denied (policy change?)

**Volume changes**
- `api.github.com`: 23 → 89 requests (+287%)
```
</details>

#### Implementation notes

- Reuse Phase 1's artifact downloading and parsing
- Diff logic operates on aggregated `DomainRequestStats` from both runs
- Cache downloaded artifacts to avoid re-downloading when diffing against a previously audited run
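The domain-level diff could look roughly like the sketch below. `DomainStats` is a hypothetical stand-in for the aggregated `DomainRequestStats` (whose actual fields may differ), `Change` and `diffRuns` are illustrative names, and the relative-change `threshold` is an assumption, since the proposal does not define what counts as a significant volume change. The request counts for `staging.api.com` in `main` are invented for the example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// DomainStats is a hypothetical stand-in for the aggregated per-domain
// stats; the real DomainRequestStats fields may differ.
type DomainStats struct {
	Allowed int
	Denied  int
}

// Change describes one difference between two runs for a single domain.
type Change struct {
	Domain string
	Kind   string // "new", "removed", "status", or "volume"
	Before DomainStats
	After  DomainStats
}

// status collapses counts into "allowed", "denied", or "mixed".
func status(s DomainStats) string {
	switch {
	case s.Denied == 0:
		return "allowed"
	case s.Allowed == 0:
		return "denied"
	default:
		return "mixed"
	}
}

// diffRuns reports at most one change per domain, sorted by domain name.
// threshold is the minimum relative change in total requests (0.5 = 50%).
func diffRuns(before, after map[string]DomainStats, threshold float64) []Change {
	var changes []Change
	for domain, a := range after {
		b, seen := before[domain]
		switch {
		case !seen:
			changes = append(changes, Change{Domain: domain, Kind: "new", After: a})
		case status(b) != status(a):
			changes = append(changes, Change{Domain: domain, Kind: "status", Before: b, After: a})
		default:
			bt, at := b.Allowed+b.Denied, a.Allowed+a.Denied
			if bt > 0 && math.Abs(float64(at-bt))/float64(bt) >= threshold {
				changes = append(changes, Change{Domain: domain, Kind: "volume", Before: b, After: a})
			}
		}
	}
	for domain, b := range before {
		if _, ok := after[domain]; !ok {
			changes = append(changes, Change{Domain: domain, Kind: "removed", Before: b})
		}
	}
	sort.Slice(changes, func(i, j int) bool { return changes[i].Domain < changes[j].Domain })
	return changes
}

func main() {
	before := map[string]DomainStats{
		"api.github.com":       {Allowed: 23},
		"old-api.internal.com": {Allowed: 8},
		"staging.api.com":      {Allowed: 5},
	}
	after := map[string]DomainStats{
		"api.github.com":        {Allowed: 89},
		"staging.api.com":       {Denied: 4},
		"registry.npmjs.org":    {Allowed: 15},
		"telemetry.example.com": {Denied: 2},
	}
	for _, c := range diffRuns(before, after, 0.5) {
		fmt.Printf("%-8s %s\n", c.Kind, c.Domain)
	}
}
```

Anomaly flags can then be derived from the same slice, for example a `new` change whose `After` status is `denied`, or a `status` change from `denied` to `allowed`.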

#### Tasks
- [ ] Add `audit diff` subcommand with two positional run-id arguments
- [ ] Implement domain-level diff logic (new, removed, changed status, volume delta)
- [ ] Add anomaly detection (new denied, status flips)
- [ ] Pretty, markdown, and JSON formatters
- [ ] Unit tests for diff logic

---

### Phase 3: `gh aw audit report [--workflow <name>] [--last <N>]`

Generate a comprehensive audit report across multiple runs for security review.

**Output includes**:
- **Executive summary** — total runs analyzed, overall denial rate, unique domains across all runs
- **Domain inventory** — all domains contacted across runs, with per-run allow/deny status
- **Anomaly detection** — runs with unusual patterns (spike in denials, new domains)
- **Recommendations** — frequently denied domains that might need allowlisting, unused allowed domains
- **Per-run breakdown** — summary row per run with key metrics

**Usage**:
```bash
# Report on last 10 runs of a workflow
gh aw audit report --workflow "agent-task" --last 10

# Report on all recent runs (default: last 20)
gh aw audit report

# JSON for dashboards
gh aw audit report --workflow "agent-task" --last 5 --json
```

**Output**: Markdown by default (suitable for security reviews, piping to files, or `$GITHUB_STEP_SUMMARY`).

#### Implementation notes

- Fetch run list via GitHub API filtered by workflow name
- Download and parse artifacts for each run (with caching)
- Aggregate stats across runs into a cross-run summary
- Follow existing `addRepoFlag()`, `addJSONFlag()` patterns
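The cross-run aggregation step might be sketched as below. All type and function names (`RunStats`, `Report`, `aggregate`) are hypothetical, `DomainStats` again stands in for the real per-domain stats, and the rule "denied in at least `minRuns` runs" is only one assumed way to pick allowlist candidates; the proposal leaves the exact heuristic open.

```go
package main

import (
	"fmt"
	"sort"
)

// DomainStats is a hypothetical stand-in for the per-domain stats of one run.
type DomainStats struct {
	Allowed int
	Denied  int
}

// RunStats holds the parsed firewall stats of a single workflow run.
type RunStats struct {
	RunID   int64
	Domains map[string]DomainStats
}

// Report is the cross-run summary used for the executive summary section.
type Report struct {
	Runs             int
	TotalRequests    int
	TotalDenied      int
	DenialRate       float64  // TotalDenied / TotalRequests, 0 when there is no traffic
	UniqueDomains    int
	FrequentlyDenied []string // domains denied in at least minRuns runs, sorted
}

// aggregate folds per-run stats into one report.
func aggregate(runs []RunStats, minRuns int) Report {
	rep := Report{Runs: len(runs)}
	seen := map[string]bool{}
	deniedRuns := map[string]int{} // domain -> number of runs with a denial
	for _, run := range runs {
		for domain, s := range run.Domains {
			seen[domain] = true
			rep.TotalRequests += s.Allowed + s.Denied
			rep.TotalDenied += s.Denied
			if s.Denied > 0 {
				deniedRuns[domain]++
			}
		}
	}
	rep.UniqueDomains = len(seen)
	if rep.TotalRequests > 0 {
		rep.DenialRate = float64(rep.TotalDenied) / float64(rep.TotalRequests)
	}
	for domain, n := range deniedRuns {
		if n >= minRuns {
			rep.FrequentlyDenied = append(rep.FrequentlyDenied, domain)
		}
	}
	sort.Strings(rep.FrequentlyDenied)
	return rep
}

func main() {
	runs := []RunStats{
		{RunID: 1, Domains: map[string]DomainStats{"api.github.com": {Allowed: 10}, "telemetry.example.com": {Denied: 2}}},
		{RunID: 2, Domains: map[string]DomainStats{"api.github.com": {Allowed: 5}, "telemetry.example.com": {Denied: 3}}},
	}
	rep := aggregate(runs, 2)
	fmt.Printf("runs=%d requests=%d denial rate=%.1f%% candidates=%v\n",
		rep.Runs, rep.TotalRequests, rep.DenialRate*100, rep.FrequentlyDenied)
}
```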

#### Tasks
- [ ] Add `audit report` subcommand with `--workflow` and `--last` flags
- [ ] Implement cross-run aggregation logic
- [ ] Anomaly detection (denial rate spikes, new domain appearances)
- [ ] Recommendation engine (frequently denied → suggest allowlist, unused allowed → suggest removal)
- [ ] Markdown and JSON formatters
- [ ] Unit tests for aggregation and anomaly detection

---

## Data Format Reference

**`audit.jsonl`** (one JSON object per line):
```json
{"ts":1761074374.646,"client":"172.30.0.20","host":"api.github.com:443","dest":"140.82.114.22:443","method":"CONNECT","status":200,"decision":"TCP_TUNNEL","url":"api.github.com:443"}
```

**`policy-manifest.json`**:
```json
{
  "version": 1,
  "generatedAt": "2024-12-20T15:30:45.123Z",
  "rules": [
    {
      "id": "deny-unsafe-ports",
      "order": 1,
      "action": "deny",
      "aclName": "!Safe_ports",
      "protocol": "both",
      "domains": [],
      "description": "Deny requests to ports not in Safe_ports ACL"
    }
  ],
  "dangerousPorts": [22, 25, 109, 110, 143, 389, 465, 587],
  "dnsServers": ["8.8.8.8", "8.8.4.4"],
  "sslBumpEnabled": false,
  "dlpEnabled": false,
  "hostAccessEnabled": false,
  "allowHostPorts": null
}
```

**Decision codes**: `TCP_TUNNEL`/`TCP_HIT`/`TCP_MISS` = allowed, `TCP_DENIED`/`NONE_NONE` = denied
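A small Go sketch of this mapping, together with the summary counts it feeds, is shown here. The names `Verdict`, `classifyDecision`, and `summarize` are hypothetical, and reporting unrecognized codes as a separate "unknown" bucket is an assumption; the proposal only lists the five codes above.

```go
package main

import "fmt"

// Verdict is the allow/deny outcome derived from a decision code.
type Verdict int

const (
	VerdictUnknown Verdict = iota // code not listed in the proposal
	VerdictAllowed
	VerdictDenied
)

// classifyDecision maps a decision code to a verdict.
func classifyDecision(code string) Verdict {
	switch code {
	case "TCP_TUNNEL", "TCP_HIT", "TCP_MISS":
		return VerdictAllowed
	case "TCP_DENIED", "NONE_NONE":
		return VerdictDenied
	default:
		return VerdictUnknown
	}
}

// Summary holds request counts per verdict.
type Summary struct {
	Total   int
	Allowed int
	Denied  int
	Unknown int
}

// summarize counts verdicts over a list of decision codes.
func summarize(codes []string) Summary {
	var s Summary
	for _, c := range codes {
		s.Total++
		switch classifyDecision(c) {
		case VerdictAllowed:
			s.Allowed++
		case VerdictDenied:
			s.Denied++
		default:
			s.Unknown++
		}
	}
	return s
}

func main() {
	s := summarize([]string{"TCP_TUNNEL", "TCP_DENIED", "TCP_MISS"})
	fmt.Printf("total=%d allowed=%d denied=%d unknown=%d\n", s.Total, s.Allowed, s.Denied, s.Unknown)
}
```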

## Artifact directory structure

```
run-{id}/
├── sandbox/
│   └── firewall/
│       ├── logs/
│       │   ├── audit.jsonl            # Structured log entries (preferred)
│       │   └── access.log             # Text format (fallback)
│       └── policy/
│           └── policy-manifest.json   # Policy rules for enrichment
```
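Automatic detection against this layout, including the `audit.jsonl` to `access.log` fallback, could be sketched as follows. `FirewallArtifacts` and `locateFirewallArtifacts` are hypothetical names, and passing the existence check as a function is only a choice made here to keep the sketch testable without touching the filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// FirewallArtifacts records which firewall files were found for a run.
type FirewallArtifacts struct {
	LogPath      string // audit.jsonl when present, otherwise access.log
	LogFormat    string // "jsonl" or "text"
	ManifestPath string // empty when absent: fall back to domain counts only
}

// fileExists reports whether path is an existing regular file or symlink target.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

// locateFirewallArtifacts probes the layout shown above under runDir.
// It returns false when neither log file is present.
func locateFirewallArtifacts(runDir string, exists func(string) bool) (FirewallArtifacts, bool) {
	base := filepath.Join(runDir, "sandbox", "firewall")
	var found FirewallArtifacts
	if p := filepath.Join(base, "logs", "audit.jsonl"); exists(p) {
		found.LogPath, found.LogFormat = p, "jsonl"
	} else if p := filepath.Join(base, "logs", "access.log"); exists(p) {
		found.LogPath, found.LogFormat = p, "text"
	} else {
		return FirewallArtifacts{}, false
	}
	if p := filepath.Join(base, "policy", "policy-manifest.json"); exists(p) {
		found.ManifestPath = p
	}
	return found, true
}

func main() {
	// "run-12345" is an illustrative directory name following run-{id}.
	a, ok := locateFirewallArtifacts("run-12345", fileExists)
	if !ok {
		fmt.Println("no firewall logs found")
		return
	}
	fmt.Printf("log=%s (%s) manifest=%q\n", a.LogPath, a.LogFormat, a.ManifestPath)
}
```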
