feat: sandbox-initiated policy recommendations with human-in-the-loop approval #153

@johntmyers

Description

Implementation Plan

This feature is split into two issues for incremental delivery:

  1. feat: policy recommendation plumbing — denial aggregation, transport, approval pipeline, and mechanistic recommendations #204 — Plumbing: Denial aggregation, gRPC transport, gateway persistence, approval workflow, policy merge, CLI/TUI, and a deterministic (no-LLM) chunk generator for end-to-end testing (~15-18 days)
  2. feat: LLM-powered PolicyAdvisor agent harness for intelligent policy recommendations #205 — PolicyAdvisor agent harness: LLM-powered analysis via inference.local, skill context extraction, OPA validation loop, context window management, progressive L7 intelligence (~5-6 days)

Issue 1 is fully functional on its own — mechanistic recommendations that turn a denied host:port into an allow rule, with L7 audit-then-refine. Issue 2 upgrades the analysis to produce intelligent, grouped, security-aware recommendations.

Full design: https://gitlab-master.nvidia.com/-/snippets/12930


Problem Statement

When a sandbox denies a network connection, the current remediation loop is entirely client-side: an agent skill must parse logs, detect denials, generate a policy update, and push it via openshell sandbox policy set. This requires significant client-side work, relies on correct agent skill execution, and provides no system-level enforcement for the approval step (no HITL gate). A better UX: sandboxes auto-detect when network policies are blocking connections, generate structured policy change recommendations, and send them through the existing sandbox → gateway → user communication channel, with an explicit human-in-the-loop approval step before any policy change takes effect.

Architecture: Sandbox Aggregator + Sandbox PolicyAdvisor + Gateway Persistence

The sandbox runs both a DenialAggregator and a lightweight PolicyAdvisor agent harness. The aggregator groups denials locally (typed data, precise timing, L7 samples). The PolicyAdvisor analyzes summaries using the cluster-wide inference model via inference.local, validates proposals against the local OPA engine, and submits proposed chunks to the gateway. The gateway is a thin persistence + validation + approval layer — it never calls an LLM.

SANDBOX                                           GATEWAY               USER
+-----------------------------------------------+ +-------------------+   +----------+
|                                               | |                   |   |          |
| Deny Event --> DenialAggregator               | |                   |   |          |
| (proxy.rs,    (host,port,binary + L7 samples) | |                   |   |          |
|  relay.rs)         |                          | |                   |   |          |
|                    v                          | |                   |   |          |
|           PolicyAdvisor (agent harness)       | |                   |   |          |
|              |                                | |                   |   |          |
|              | 1. Skill context (bundled)     | |                   |   |          |
|              | 2. LLM call via inference.local| |                   |   |          |
|              | 3. Validate against local OPA  | |                   |   |          |
|              | 4. Fix call if invalid         | |                   |   |          |
|              v                                | |                   |   |          |
|     SubmitPolicyAnalysis RPC ---------------->| | Validate + persist|   |          |
|     (summaries + proposed chunks)             | |     |             |   |          |
|                                               | |     v             |   |          |
+-----------------------------------------------+ | DraftPolicyUpdate |-->| draft    |
                                                  | (stream event)    |   | approve  |
                                                  | <-- ApproveChunk  |---| reject   |
                                                  |   merge policy    |   |          |
                                                  +-------------------+   +----------+

Why Sandbox-Side LLM

  1. Zero LLM configuration. Uses existing cluster inference (openshell cluster inference set). Every sandbox already has inference.local access via the proxy fast path (proxy.rs:217, bypasses OPA). No new env vars, API keys, or network policy entries.
  2. Distributed scaling. Each sandbox makes its own LLM calls via inference.local. N sandboxes = N independent analysis pipelines. No gateway bottleneck.
  3. Pre-validated proposals. The sandbox has the OPA engine — it validates proposed rules locally before submission. The gateway doesn't run OPA. Only the sandbox can catch complete-rule conflicts, L7 config inconsistencies, and schema errors before they reach the user.
  4. Deep policy knowledge. The PolicyAdvisor embeds extracted knowledge from the generate-sandbox-policy skill — validation rules, decision trees, access presets, glob patterns, private IP handling, 20+ reference examples (~4000-5000 tokens, bundled in sandbox binary).
  5. Thin gateway. Gateway is purely persistence + validation + approval workflow. No LLM client, no analysis triggers, no context window management.

Trust Model

The sandbox proposes policy chunks, but dual control ensures safety:

  1. The PolicyAdvisor is our code (baked into the sandbox binary), not untrusted user code
  2. The gateway validates all proposed chunks (rejects loopback/link-local, rate-limits, format checks)
  3. The user approves every chunk (human-in-the-loop is mandatory)
  4. LLM calls traverse the audited inference.local path
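A minimal sketch of the gateway-side target check described in point 2, using only the standard library; the function name and exact rule set are illustrative assumptions, not the shipped validator:

```rust
use std::net::IpAddr;

/// Reject proposed allowed_ips entries that point at loopback, link-local
/// (which covers the 169.254.169.254 cloud metadata endpoint), broadcast,
/// or unspecified addresses. Hypothetical helper, not the real gateway code.
fn is_forbidden_target(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback()
                || v4.is_link_local()
                || v4.is_broadcast()
                || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}
```

In the real flow this check would run on SubmitPolicyAnalysis receipt, before a chunk is ever persisted or shown to the user.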

Core Concepts

Living Draft Policy

The system maintains a continuously-evolving draft policy per sandbox — not a queue of individual recommendations. The draft is composed of granular chunks, each a proposed rule addition tied to specific denial events. Users can view the full draft, inspect individual chunks with rationale, and selectively approve/reject at any time.

PolicyChunk Lifecycle

  pending --> approved --> (superseded by Stage 2 refinement)
         \-> rejected --> (superseded if re-analysis produces new chunk)

Each chunk carries: proposed rule, rationale, security notes, confidence score, denial references, stage (initial/refined), and optional supersession pointer.
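The lifecycle above can be modeled roughly as follows; all type and field names here are illustrative assumptions, not the actual proto or module definitions:

```rust
/// Hypothetical model of the chunk lifecycle sketched above.
#[derive(Debug, Clone, PartialEq)]
enum ChunkState {
    Pending,
    Approved,
    Rejected,
    /// Replaced by a newer chunk (e.g. a Stage 2 refinement).
    Superseded { by_chunk_id: String },
}

#[derive(Debug)]
struct PolicyChunk {
    id: String,
    stage: u8, // 1 = initial, 2 = refined
    state: ChunkState,
    rationale: String,
}

/// Mark an older chunk as superseded, recording the supersession pointer.
fn supersede(old: &mut PolicyChunk, new_id: &str) {
    old.state = ChunkState::Superseded { by_chunk_id: new_id.to_string() };
}
```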

Progressive L7 Visibility

Policy recommendations follow a two-stage pipeline that leverages L7 inspection for data-driven refinement:

Stage 1 (L4 denial → initial recommendation): When a new host:port is first denied and the port supports HTTP inspection (80, 443, 8080, etc.), the system recommends a rule with L7 audit mode: protocol: rest, tls: terminate (for 443), enforcement: audit, broad access. Traffic flows immediately while every HTTP request is logged with method + path by the existing L7 relay.

Stage 2 (L7 audit data → refined recommendation): The aggregator collects L7 audit events with (method, path) samples. Once enough data accumulates, a refined chunk replaces the audit-mode rule with specific access presets or explicit L7 rules based on observed traffic patterns. The refined chunk supersedes the Stage 1 chunk.

This eliminates the two-approval-cycle problem (read-only approved, then write needed) and produces data-driven L7 recommendations instead of guessing access levels. For L4-only protocols (SSH, databases, Kafka, etc.), Stage 1 produces a plain L4 rule with no Stage 2.
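The port-based Stage 1 branch might look like this sketch; the port list and rule representation are assumptions for illustration, not the shipped logic:

```rust
/// Stage 1 decision (sketch): HTTP-capable ports get an L7 audit-mode rule
/// (with TLS termination for HTTPS ports) so traffic flows while being
/// logged; everything else gets a plain L4 allow with no Stage 2.
#[derive(Debug, PartialEq)]
enum Stage1Rule {
    L7Audit { tls_terminate: bool },
    L4Allow,
}

fn stage1_rule(port: u16) -> Stage1Rule {
    match port {
        80 | 8000 | 8080 => Stage1Rule::L7Audit { tls_terminate: false },
        443 | 8443 => Stage1Rule::L7Audit { tls_terminate: true },
        _ => Stage1Rule::L4Allow, // SSH, databases, Kafka, ...
    }
}
```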

DenialAggregator

Groups by primary key (host, port, binary) with dedup windows. Key features:

  • Accurate counts: suppressed_count + total_count (not just threshold value)
  • Memory bounds: max_keys=1000 cap with overflow detection
  • Slow-drip detection: persistent_threshold + periodic stale-flush
  • Credential sanitization: strips Authorization, API keys, cookies from cmdline before storage
  • L7 event ingestion: collects per-request (method, path, decision) samples from L7 relay events (capped at 50 per entry)
  • DNS probe: one-shot lookup_host() at entry creation to detect private IPs early, enabling allowed_ips in initial chunks
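A condensed sketch of the grouping and capping behavior above; names and limits are illustrative, and the real module additionally handles dedup windows, expiry, stale-flush, and credential sanitization:

```rust
use std::collections::HashMap;

const MAX_KEYS: usize = 1000;      // memory bound on distinct keys
const MAX_L7_SAMPLES: usize = 50;  // per-entry cap on (method, path) samples

#[derive(Default)]
struct Entry {
    total_count: u64,
    l7_samples: Vec<(String, String)>, // (method, path)
}

#[derive(Default)]
struct DenialAggregator {
    entries: HashMap<(String, u16, String), Entry>, // (host, port, binary)
    overflowed: bool,
}

impl DenialAggregator {
    /// Record one L4 denial, grouping by the (host, port, binary) key.
    fn record(&mut self, host: &str, port: u16, binary: &str) {
        let key = (host.to_string(), port, binary.to_string());
        if !self.entries.contains_key(&key) && self.entries.len() >= MAX_KEYS {
            self.overflowed = true; // overflow detection instead of silent drop
            return;
        }
        self.entries.entry(key).or_default().total_count += 1;
    }

    /// Attach an L7 (method, path) sample to an existing entry, capped.
    fn record_l7(&mut self, host: &str, port: u16, binary: &str, method: &str, path: &str) {
        let key = (host.to_string(), port, binary.to_string());
        if let Some(e) = self.entries.get_mut(&key) {
            if e.l7_samples.len() < MAX_L7_SAMPLES {
                e.l7_samples.push((method.to_string(), path.to_string()));
            }
        }
    }
}
```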

PolicyAdvisor Agent Harness

A lightweight agent — not a general-purpose tool-calling loop, but a fixed one- or two-call pattern:

  1. First call: Analyze denials + propose chunks (using embedded skill context as system prompt via inference.local)
  2. Validate each proposed chunk against the local OPA engine (check conflicts, L7 config, breadth warnings)
  3. Second call (if needed): Fix chunks that failed validation, including the error messages
  4. Submit validated chunks + summaries to gateway via SubmitPolicyAnalysis RPC

The skill context (~4000-5000 tokens) is extracted from the existing generate-sandbox-policy skill and bundled in the sandbox binary. It includes: validation rules, access preset definitions, L4 vs L7 decision tree, glob pattern translation, private IP / SSRF rules, protocol reference table, auth chain patterns, and reference examples.

Mechanistic mode: When cluster inference is not configured, the PolicyAdvisor skips LLM calls and runs rule-based analysis. Stage 1 produces L7 audit-mode rules for HTTP ports; Stage 2 computes access level from observed methods (all GET/HEAD/OPTIONS → read-only, POST but no DELETE → read-write, etc.).
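The mechanistic Stage 2 access computation reduces to a small rule over observed methods; a sketch, with level names assumed for illustration:

```rust
/// Derive an access level from HTTP methods observed during L7 audit:
/// only safe methods -> read-only; mutating but no DELETE -> read-write;
/// otherwise full access. Level names are illustrative.
fn access_level(methods: &[&str]) -> &'static str {
    let safe = ["GET", "HEAD", "OPTIONS"];
    if methods.iter().all(|m| safe.contains(m)) {
        "read-only"
    } else if !methods.contains(&"DELETE") {
        "read-write"
    } else {
        "full"
    }
}
```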

DNS Resolution: Sandbox Probe + Gateway Verification

The DenialAggregator performs a speculative DNS lookup when it first observes a new (host, port) key. This enables the PolicyAdvisor to include allowed_ips in Stage 1 chunks for hosts that resolve to private IPs — without a second approval cycle.

  • Sandbox probe (untrusted, best-effort): One-shot tokio::net::lookup_host() at entry creation. If DNS resolves to RFC1918, the proposed rule includes allowed_ips: ["x.x.x.x/32"].
  • Gateway verification (trusted): Re-resolves independently on SubmitPolicyAnalysis receipt. If sandbox and gateway DNS diverge, gateway's resolution wins and a security warning is added.
  • Trusted CIDRs: --trusted-cidr at sandbox creation pre-authorizes known-good ranges, replacing per-host /32s with subnet CIDRs for environments with many internal services on the same subnet.
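Once DNS has resolved, the probe's private-IP test reduces to a standard-library check. The real probe uses tokio::net::lookup_host; this hypothetical helper shows only the decision on an already-resolved address:

```rust
use std::net::IpAddr;

/// If the resolved address is RFC1918 private space, propose an
/// allowed_ips entry pinning the host to a /32. Sketch only; the helper
/// name and string format are assumptions.
fn allowed_ips_entry(ip: IpAddr) -> Option<String> {
    match ip {
        // Ipv4Addr::is_private covers 10/8, 172.16/12, 192.168/16 (RFC1918)
        IpAddr::V4(v4) if v4.is_private() => Some(format!("{v4}/32")),
        _ => None,
    }
}
```

On the gateway side, the same check would run again on independently re-resolved addresses, with the gateway's result winning on divergence.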

Affected Components

| Component | Role |
|---|---|
| proto/navigator.proto, proto/sandbox.proto | New messages (L7RequestSample, DenialSummary, PolicyChunk, SubmitPolicyAnalysis), new RPCs, DraftPolicyUpdate stream event |
| crates/navigator-server/src/grpc.rs, persistence/ | DraftPolicy Store, SubmitPolicyAnalysis handler, draft query/approval RPCs, DNS re-verification |
| crates/navigator-sandbox/src/proxy.rs, l7/relay.rs | Feed L4 deny + L7 audit/deny events to aggregator |
| crates/navigator-sandbox/src/ (new modules) | DenialAggregator, PolicyAdvisor agent harness, SubmitPolicyAnalysis client |
| crates/navigator-cli/src/main.rs, run.rs | New sandbox draft commands, cluster policy-advisor toggle |
| crates/navigator-tui/src/app.rs, ui/ | Draft panel with chunk inspection and approval keybindings |

Proto Changes

New messages:

  • L7RequestSample — observed HTTP method+path pattern from L7 inspection
  • DenialSummary — with l7_request_samples, l7_inspection_active, denial_stage (l4_deny|l7_deny|l7_audit|ssrf), sandbox-probed resolved_ips
  • PolicyChunk — with stage (initial/refined), supersedes_chunk_id, security_notes, confidence
  • SubmitPolicyAnalysisRequest/Response — atomic submission of summaries + proposed chunks + analysis mode
  • DraftPolicyUpdate — new SandboxStreamEvent variant for real-time notifications

New RPCs:

  • SubmitPolicyAnalysis — sandbox → gateway
  • GetDraftPolicy, ApproveDraftChunk, RejectDraftChunk, ApproveAllDraftChunks, EditDraftChunk, UndoDraftChunk, GetDraftHistory — CLI/TUI → gateway

CLI UX

openshell sandbox draft <name>                        # View the full living draft policy
openshell sandbox draft <name> --chunks               # View individual chunks with rationale
openshell sandbox draft <name> --chunk <id>           # View a specific chunk in detail
openshell sandbox draft approve <name> <chunk_id>     # Approve a specific chunk
openshell sandbox draft approve <name> --all          # Approve all (skips security-flagged)
openshell sandbox draft reject <name> <chunk_id>      # Reject a chunk
openshell sandbox draft reject <name> <id> --reason   # Reject with reason (fed to LLM)
openshell sandbox draft edit <name> <chunk_id>        # Edit a pending chunk
openshell sandbox draft undo <name> <chunk_id>        # Reverse an approval
openshell sandbox draft history <name>                # View decision history

openshell cluster policy-advisor enable               # Enable for all sandboxes
openshell cluster policy-advisor disable              # Disable

Phased Implementation

| Phase | Scope | Effort |
|---|---|---|
| Phase 0 | Policy templates (--policy-template python-dev/node-dev/agent-coding) + trusted networks (--trusted-cidr) | 2-3 days |
| Phase 1 | Gateway persistence (DraftPolicy Store + SubmitPolicyAnalysis RPC + draft query/approval RPCs) + basic CLI (sandbox draft, draft approve, draft reject) + gateway-side DNS verification + cmdline sanitization | 5-6 days |
| Phase 2 | Sandbox-side DenialAggregator (counts, caps, expiry, stale-flush, L7 event ingestion, DNS probe) + mechanistic PolicyAdvisor (Stage 1 L7 audit + Stage 2 access refinement, no LLM) + SubmitPolicyAnalysis client | 5-6 days |
| Phase 3 | Sandbox-side LLM PolicyAdvisor agent harness: skill context extraction + inference.local LLM calls + OPA validation loop + fix-and-retry + adaptive triggers + pre-filtering + context window management. CLI: cluster policy-advisor enable/disable | 5-6 days |
| Phase 4 | Full CLI (draft edit, draft reject --reason, draft undo, draft history) + TUI draft panel with keybindings + chunk supersession UX | 4-5 days |
| Phase 5 | Pre-merge conflict detection + approve --all safety gate + rejection backoff + hostname normalization + sensitive endpoint blocklist | 3-4 days |

Total: ~24-30 days. Phases 0-1 ship independently. Phase 0 = cold-start friction reduction. Phase 1 = gateway MVP. Phase 2 = sandbox MVP with mechanistic recommendations. Phase 3 = LLM intelligence. Phase 4 = full UX. Phase 5 = security hardening.

Key Design Decisions

| Decision | Resolution |
|---|---|
| LLM location | Sandbox-side via inference.local (not gateway) |
| Model selection | Cluster-wide inference config — no new env vars |
| Aggregation key | (host, port, binary) primary + L7 (method, path) sub-samples |
| Trust model | Sandbox proposes → gateway validates → user approves |
| Without LLM | Mechanistic mode: rule-based Stage 1/Stage 2 chunks |
| L7 access levels | Data-driven via progressive L7 visibility (Stage 1 audit → Stage 2 enforce) |
| Private IPs | Sandbox DNS probe at aggregation + gateway re-verification |
| Chunk granularity | Coarse by default (LLM groups related services) |
| Auto-expiry | 1h default for pending chunks |
| Rejection semantics | Backoff after 2 rejections for same (host, port); draft retry to re-queue |

Risks & Open Questions

Trust & Security

  • Privilege escalation pathway: A compromised sandbox could craft denial patterns to manipulate operators into approving access to sensitive endpoints. Mitigation: gateway validates proposals (reject loopback/link-local/metadata IPs), dual-control approval, audit trail.
  • Sandbox DNS untrusted: Speculative DNS probe results could be poisoned. Mitigation: gateway re-verifies independently; gateway's resolution is the trust anchor.
  • Rate limiting: Max 10 pending chunks per sandbox, adaptive analysis intervals (10s cold-start → 2m steady-state). Gateway-enforced.
  • Strictly additive: Recommendations can only propose adding entries to network_policies. Cannot modify static fields, remove restrictions, or change network mode.

Open Questions

| # | Question | Recommendation |
|---|---|---|
| 1 | Chunk granularity — fine (per host:port) or coarse (per activity pattern)? | Coarse by default; LLM groups related services. |
| 2 | Auto-expiry for pending chunks? | Yes, 1h default with configurable TTL. |
| 3 | Cross-sandbox learning? | Defer to v2. |
| 4 | Agent workflow UX — autonomous agents can't pause for approval | Policy templates + audit mode as interim; needs further design. |
| 5 | Wildcard host matching (*.googleapis.com)? | Defer to v2; each unique hostname is a separate rule entry for now. |

Test Considerations

  • Unit tests: DenialAggregator dedup/threshold/cooldown/L7 sample logic, OPA validation of proposed chunks, cmdline sanitization, DNS probe
  • Integration tests: Full flow — sandbox deny → aggregation → PolicyAdvisor analysis → submission → gateway persistence → approval → policy merge → sandbox reload
  • E2E tests: CLI sandbox draft / draft approve flow, TUI draft panel rendering
  • Security tests: SSRF protections after approved endpoints, static field immutability through merge path, rate limiting, gateway DNS re-verification vs sandbox probe divergence
