Merged
38 changes: 30 additions & 8 deletions README.md
@@ -71,12 +71,29 @@ Every request and response passes through the DLP engine before leaving your mac
```toml
[dlp]
enabled = true
-secrets = "redact" # API keys, tokens, credentials → [REDACTED]
-pii = "warn" # Emails, phone numbers → logged
-names = "pseudonymize" # Real names → consistent pseudonyms
-injection = "block" # Prompt injection attempts → 400
-url_exfil = "block" # Data exfiltration URLs → stripped
-canary = true # Inject canary tokens to detect leaks
+
+[[dlp.secrets]]
+name = "custom_token"
+prefix = "tok_"
+pattern = "tok_[A-Za-z0-9]{40}"
+action = "redact" # API keys, tokens, credentials → [REDACTED]
+
+[dlp.pii]
+credit_cards = true
+iban = true
+action = "redact" # Emails, phone numbers → redacted
+
+[[dlp.names]]
+term = "Acme Corp"
+action = "pseudonym" # Real names → consistent pseudonyms
+
+[dlp.prompt_injection]
+enabled = true
+action = "block" # Prompt injection attempts → 400
+
+[dlp.url_exfil]
+enabled = true
+action = "block" # Data exfiltration URLs → stripped
```

No other LLM proxy does this. LiteLLM, Bifrost, Portkey, Kong -- none have inline DLP on the hot path.
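
To make the `[[dlp.secrets]]` rule above concrete, here is a minimal std-only sketch of what a `redact` action could do for the pattern `tok_[A-Za-z0-9]{40}`. It is not Grob's implementation: `SecretRule` and `redact` are hypothetical names, and the regex is replaced by a hand-written scan (a literal prefix followed by 40 ASCII alphanumerics) so the example needs no external crate.

```rust
/// Hypothetical stand-in for one `[[dlp.secrets]]` rule: a literal, non-empty
/// prefix followed by `body_len` ASCII alphanumerics (`tok_[A-Za-z0-9]{40}`).
pub struct SecretRule {
    pub prefix: &'static str,
    pub body_len: usize,
}

/// Replaces every match of `rule` in `input` with `[REDACTED]`.
/// Assumption: the real engine uses a regex; this scan only mimics that one pattern shape.
pub fn redact(input: &str, rule: &SecretRule) -> String {
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(pos) = rest.find(rule.prefix) {
        let after = &rest[pos + rule.prefix.len()..];
        // Count the run of ASCII alphanumerics that follows the prefix.
        let run = after
            .bytes()
            .take_while(|b| b.is_ascii_alphanumeric())
            .count();
        if run >= rule.body_len {
            // Match: keep the text before it, drop prefix + body, emit the marker.
            out.push_str(&rest[..pos]);
            out.push_str("[REDACTED]");
            rest = &after[rule.body_len..];
        } else {
            // No match here: copy one character past `pos` and keep scanning.
            let step = rule.prefix.chars().next().map_or(1, |c| c.len_utf8());
            out.push_str(&rest[..pos + step]);
            rest = &rest[pos + step..];
        }
    }
    out.push_str(rest);
    out
}
```

Calling `redact` with `SecretRule { prefix: "tok_", body_len: 40 }` leaves short look-alikes such as `tok_short` untouched and replaces only full-length tokens, which mirrors the fixed-length quantifier in the configured pattern.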
@@ -267,6 +284,8 @@ src/
│ ├── dispatch/ Core dispatch: DLP, cache, route, provider loop
│ ├── openai_compat/ OpenAI /v1/chat/completions translation
│ ├── responses_compat/ OpenAI Responses API translation
+│ ├── rpc/ JSON-RPC endpoint
+│ ├── watch_sse.rs Live traffic inspector SSE backend
│ └── fan_out.rs Parallel multi-provider dispatch
├── providers/ Provider implementations and registry
├── router/ Regex-based request routing engine
@@ -279,9 +298,12 @@ src/
│ ├── token_pricing/ Pricing, spend tracking, budgets
│ ├── mcp/ MCP tool matrix, JSON-RPC server
│ ├── tap/ Webhook event emission
-│ └── harness/ Record & replay sandwich testing
+│ ├── harness/ Record & replay sandwich testing
+│ ├── tool_layer/ Tool-calling abstraction layer
+│ └── pledge/ Pledge-based capability restrictions
├── security/ Circuit breakers, rate limiting, audit log
-└── storage/ Unified redb storage backend
+├── storage/ Unified redb storage backend
+└── preset/ Preset management system
```

## Development
10 changes: 5 additions & 5 deletions docs/ARCHITECTURE.md
@@ -25,13 +25,13 @@ flowchart TB
end

subgraph router["Router"]
-r0["0. Auto-map regex — transform model name"]
r1["1. WebSearch — web_search tool detected"]
r2["2. Background — model matches background_regex"]
-r3["3. Subagent — GROB-SUBAGENT-MODEL tag"]
-r4["4. Prompt rules — regex on user message"]
-r5["5. Think — thinking/reasoning enabled"]
-r6["6. Default model — fallback"]
+r3["3. Auto-map regex — transform model name"]
+r4["4. Subagent — GROB-SUBAGENT-MODEL tag"]
+r5["5. Prompt rules — regex on user message"]
+r6["6. Think — thinking/reasoning enabled"]
+r7["7. Default model — fallback"]
rd["RouteDecision { model, route_type }"]
end
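
The renumbered router steps read as a first-match priority chain that ends in a `RouteDecision { model, route_type }`. The sketch below illustrates that ordering only; it is not taken from the Grob source. `Request`, `RouterConfig`, `RouteType`, and `route` are hypothetical shapes, substring checks stand in for `background_regex` and the prompt-rule regexes, and step 3 (auto-map) is left out because it rewrites the model name instead of choosing a route.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum RouteType { WebSearch, Background, Subagent, PromptRule, Think, Default }

#[derive(Debug, PartialEq)]
pub struct RouteDecision {
    pub model: String,
    pub route_type: RouteType,
}

/// Hypothetical, simplified view of an incoming request.
pub struct Request<'a> {
    pub model: &'a str,
    pub has_web_search_tool: bool,
    pub subagent_model: Option<&'a str>, // value of the GROB-SUBAGENT-MODEL tag, if present
    pub user_message: &'a str,
    pub thinking: bool,
}

/// Hypothetical router settings; all target model names are placeholders.
pub struct RouterConfig<'a> {
    pub websearch_model: &'a str,
    pub background_marker: &'a str, // non-empty substring standing in for background_regex
    pub background_model: &'a str,
    pub prompt_rules: Vec<(&'a str, &'a str)>, // (needle in user message, target model)
    pub think_model: &'a str,
    pub default_model: &'a str,
}

fn decide(model: &str, route_type: RouteType) -> RouteDecision {
    RouteDecision { model: model.to_string(), route_type }
}

/// Evaluates the steps in diagram order; the first step that matches wins.
pub fn route(req: &Request, cfg: &RouterConfig) -> RouteDecision {
    // 1. WebSearch: a web_search tool was detected on the request.
    if req.has_web_search_tool {
        return decide(cfg.websearch_model, RouteType::WebSearch);
    }
    // 2. Background: the requested model matches the background pattern.
    if req.model.contains(cfg.background_marker) {
        return decide(cfg.background_model, RouteType::Background);
    }
    // 4. Subagent: an explicit subagent model tag overrides later steps.
    if let Some(model) = req.subagent_model {
        return decide(model, RouteType::Subagent);
    }
    // 5. Prompt rules: first rule whose needle occurs in the user message.
    if let Some((_, target)) = cfg
        .prompt_rules
        .iter()
        .find(|(needle, _)| req.user_message.contains(*needle))
    {
        return decide(target, RouteType::PromptRule);
    }
    // 6. Think: thinking/reasoning is enabled.
    if req.thinking {
        return decide(cfg.think_model, RouteType::Think);
    }
    // 7. Default model: fallback.
    decide(cfg.default_model, RouteType::Default)
}
```

Early returns keep the priority order visible in the code itself: a request that both carries a `web_search` tool and enables thinking resolves at step 1 and never reaches step 6.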

2 changes: 1 addition & 1 deletion docs/CONFIGURATION.md
@@ -211,7 +211,7 @@ max_body_size = 10485760 # Max request body in bytes (default: 10MB)
security_headers = true # Apply OWASP security headers (default: true)
circuit_breaker = true # Enable circuit breaker per provider (default: true)
audit_dir = "" # Audit log directory, empty = disabled (default: "")
-audit_signing_algorithm = "" # "ecdsa-p256" (default) or "hmac-sha256"
+audit_signing_algorithm = "" # "ecdsa-p256" (default), "hmac-sha256", or "ed25519"
audit_hmac_key_path = "" # Path to HMAC key file (for hmac-sha256; default: <audit_dir>/audit_hmac.key)

# Adaptive provider scoring (opt-in)
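
The audit settings changed in this hunk can be read as a small set of defaulting rules. The sketch below is one way they might be resolved, under stated assumptions: the enum and function names are hypothetical, an empty `audit_signing_algorithm` is assumed to select the documented default (`ecdsa-p256`), and rejecting unknown values is a guess about behavior that the diff does not confirm.

```rust
use std::path::PathBuf;

/// The three values named in the config comment.
#[derive(Debug, PartialEq)]
pub enum AuditSigningAlgorithm {
    EcdsaP256,
    HmacSha256,
    Ed25519,
}

/// Maps the config string to an algorithm.
/// Assumption: "" means "use the default", which the comment gives as ecdsa-p256.
pub fn parse_audit_signing_algorithm(value: &str) -> Result<AuditSigningAlgorithm, String> {
    match value {
        "" | "ecdsa-p256" => Ok(AuditSigningAlgorithm::EcdsaP256),
        "hmac-sha256" => Ok(AuditSigningAlgorithm::HmacSha256),
        "ed25519" => Ok(AuditSigningAlgorithm::Ed25519),
        other => Err(format!("unknown audit_signing_algorithm: {other:?}")),
    }
}

/// Resolves the HMAC key path: an explicit path wins, otherwise
/// `<audit_dir>/audit_hmac.key` as stated in the config comment.
pub fn hmac_key_path(audit_dir: &str, audit_hmac_key_path: &str) -> PathBuf {
    if audit_hmac_key_path.is_empty() {
        PathBuf::from(audit_dir).join("audit_hmac.key")
    } else {
        PathBuf::from(audit_hmac_key_path)
    }
}
```

Keeping the default inside the parser means the sample value `""` and an explicit `"ecdsa-p256"` behave identically, which matches how the comment describes the option.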
2 changes: 1 addition & 1 deletion docs/DCI-REPORT.md
@@ -1,6 +1,6 @@
# Documentation Completeness Index (DCI) Report

-**Project**: Grob v0.30.0
+**Project**: Grob v0.35.1
**Date**: 2026-03-18
**Auditor**: Doc Forge (automated)
**Scope**: Full project audit (9th pass)
2 changes: 1 addition & 1 deletion docs/QUICKSTART.md
@@ -7,7 +7,7 @@ Get up and running in 30 seconds.
```bash
brew install azerozero/tap/grob
# or without Homebrew:
-curl -fsSL https://raw.githubusercontent.com/azerozero/grob/main/scripts/install.sh | sh
+curl -fsSL https://grob.sh | sh
```

## 2. Apply a preset
29 changes: 26 additions & 3 deletions docs/index.md
@@ -46,9 +46,20 @@ Grob accepts requests in both Anthropic and OpenAI API formats, normalizes them,
| Topic | Document |
|-------|----------|
| All config options | [Configuration Reference](CONFIGURATION.md) |
| CLI commands | [CLI Reference](reference/cli.md) |
| DLP engine | [DLP Reference](reference/dlp.md) |
| Routing engine | [Routing Reference](reference/routing.md) |
| Authentication | [Authentication Reference](reference/authentication.md) |
| Caching | [Caching Reference](reference/caching.md) |
| Fan-out racing | [Fan-out Reference](reference/fan-out.md) |
| Security middleware | [Security Reference](reference/security.md) |
| Storage backend | [Storage Reference](reference/storage.md) |
| Observability | [Observability Reference](reference/observability.md) |
| Operations | [Operations Reference](reference/operations.md) |
| Setup wizard | [Setup Wizard Reference](reference/setup-wizard.md) |
| Feature matrix | [Feature Matrix](reference/features.md) |
| Benchmarks | [Benchmarks](reference/benchmarks.md) |
| OWASP LLM Top 10 | [OWASP Coverage](reference/owasp-llm-top10.md) |
| CLI commands | [CLI Reference](reference/cli.md) |
| Provider internals | [Provider Reference](reference/providers.md) |
| API compatibility | [API Compatibility Reference](reference/api-compatibility.md) |
| API endpoints | [OpenAPI Spec](openapi.yaml) |
Expand All @@ -62,11 +73,23 @@ Grob accepts requests in both Anthropic and OpenAI API formats, normalizes them,
|-------|----------|
| Architecture | [Architecture Overview](ARCHITECTURE.md) |
| Security model | [Security Model](explanation/security.md) |
| Policy engine | [Policy Engine](explanation/policies.md) |
| Design philosophy | [Design Principles](design-principles.md) |
| Gemini specifics | [Gemini Integration](gemini-integration.md) |
| Architecture decisions | [ADRs](decisions/) |
| Design doc template | [Design Doc Template](design/000-template.md) |

### Architecture decisions (ADRs)

| ADR | Title |
|-----|-------|
| [0001](decisions/0001-static-config-no-hot-reload.md) | Static config, no hot reload |
| [0002](decisions/0002-custom-oauth-no-crate.md) | Custom OAuth, no crate |
| [0003](decisions/0003-regex-routing-engine.md) | Regex routing engine |
| [0004](decisions/0004-persistent-spend-tracking.md) | Persistent spend tracking |
| [0005](decisions/0005-anthropic-native-provider-trait.md) | Anthropic-native provider trait |
| [0006](decisions/0006-policy-engine-encrypted-audit-hit-gateway.md) | Policy engine, encrypted audit, HIT gateway |
| [0008](decisions/0008-wizard-lifecycle.md) | Wizard lifecycle |

## Version

Current release: **v0.30.0** -- see [CHANGELOG](../CHANGELOG.md) for history.
Current release: **v0.35.1** -- see [CHANGELOG](../CHANGELOG.md) for history.
4 changes: 2 additions & 2 deletions docs/reference/benchmarks.md
@@ -2,7 +2,7 @@

> **90 µs overhead** with routing + auth + rate limiting + cache + DLP on 4 vCPU ARM — 40x faster than LiteLLM, with more features than Bifrost.

-Grob v0.30.0 — 2026-03-27 — `grob bench --concurrent` (c=vCPU, 5 sec/scenario, mock TCP backend on localhost).
+Grob v0.35.1 — 2026-04-09 — `grob bench --concurrency` (c=vCPU, 5 sec/scenario, mock TCP backend on localhost).

> v0.26.0 added the HIT policy engine (SSE stream interception + approval channel). For requests without tool_use blocks the overhead is unchanged. For tool_use blocks requiring human approval, stream latency includes the approval wait time (not a grob bottleneck).
>
Expand All @@ -19,7 +19,7 @@ Grob v0.30.0 — 2026-03-27 — `grob bench --concurrent` (c=vCPU, 5 sec/scenari

All within the ADR-0006 target of < 10 µs for 20 rules.

**Combined proxy+policy overhead** (routing + DLP + policy matcher, from `grob bench --concurrent`):
**Combined proxy+policy overhead** (routing + DLP + policy matcher, from `grob bench --concurrency`):

| Scenario | P50 overhead | Notes |
|----------|-------------:|-------|
2 changes: 1 addition & 1 deletion docs/reference/cli.md
@@ -270,7 +270,7 @@ Pull config from a remote grob instance and save it as a local preset. Fetches t
grob preset pull --from https://grob-prod.example.com --save prod-snapshot
```

-### `grob watch` (requires `--features watch`)
+### `grob watch`

Live traffic inspector TUI. Connects to the running server's SSE endpoint (`/api/events`) and displays a ratatui dashboard with provider health, live request stream, and DLP alerts.
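
For readers unfamiliar with Server-Sent Events, the following std-only sketch shows how a client might split a raw `/api/events` stream into events before rendering them. It follows the generic SSE wire format (`event:` and `data:` fields, comment lines starting with `:`, a blank line ending each event); it is not the `grob watch` code, and the event names and payloads Grob actually emits are not shown in this diff. `SseEvent` and `parse_sse` are hypothetical names.

```rust
/// One parsed Server-Sent Event.
#[derive(Debug, PartialEq)]
pub struct SseEvent {
    pub event: Option<String>, // value of the `event:` field, if any
    pub data: String,          // `data:` lines joined with '\n'
}

/// Splits an already-received SSE text stream into events.
/// Only the `event` and `data` fields are handled; `id` and `retry` are ignored.
pub fn parse_sse(stream: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event: Option<String> = None;
    let mut data: Vec<&str> = Vec::new();

    for line in stream.lines() {
        if line.is_empty() {
            // A blank line dispatches the pending event, if it carries data.
            if !data.is_empty() {
                events.push(SseEvent { event: event.take(), data: data.join("\n") });
                data.clear();
            } else {
                event = None;
            }
            continue;
        }
        if line.starts_with(':') {
            continue; // comment line, often used as a keep-alive
        }
        // "field: value" with one optional space after the colon.
        let (field, value) = match line.split_once(':') {
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event = Some(value.to_string()),
            "data" => data.push(value),
            _ => {}
        }
    }
    events
}
```

A real client would feed this from a streaming HTTP response and buffer partial lines across reads; parsing a complete string keeps the sketch focused on the framing rules.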

6 changes: 3 additions & 3 deletions docs/tutorials/getting-started.md
@@ -15,13 +15,13 @@ Choose one of three methods:
**Option A: Install script (recommended)**

```bash
-curl -fsSL https://raw.githubusercontent.com/azerozero/grob/main/scripts/install.sh | sh
+curl -fsSL https://grob.sh | sh
```

-**Option B: cargo-binstall (pre-built binary)**
+**Option B: Homebrew (macOS / Linux)**

```bash
-cargo binstall grob
+brew install azerozero/tap/grob
```

**Option C: Build from source**