| title | AuditGuard MCP |
|---|---|
| emoji | 🔒 |
| colorFrom | blue |
| colorTo | gray |
| sdk | docker |
| pinned | false |
| license | mit |
A production-grade, compliance-aware Model Context Protocol (MCP) server that wraps LLM tool use with four primitives: PII safety, role-based access control (RBAC), configurable policy enforcement, and structured audit logging.
Built on OpenAI's newly released Privacy Filter (1.5B params, April 2026), running locally on CPU — no data leaves your infrastructure.
This project serves as a reference implementation for agentic systems in highly regulated industries.
Live demo:
Prerequisites: Python 3.11+, uv
# Set your OpenAI API key for the demo's LangGraph agent
export OPENAI_API_KEY="sk-..."
git clone https://github.com/ree2raz/auditguard-mcp.git
cd auditguard-mcp
make install
make seed
MOCK_PII=1 make demo
Most PII detection in production pipelines is regex-based. Regex catches SSNs and email addresses but misses the entire class of contextual PII: a sentence like "the Henderson trust's primary contact" contains no syntactic PII pattern, but a token classifier trained on privacy data identifies "Henderson" as a private_person span with high confidence.
OpenAI released Privacy Filter on April 22, 2026. It's a 1.5B-parameter bidirectional token classifier supporting 8 PII categories. We built robust BIOES span decoding to map token predictions back to character offsets in the original text — the unglamorous part most integrations skip. The model runs locally on CPU. No data is sent to OpenAI APIs. This matches the deployment requirement of every regulated buyer: PII detection that works without trusting a third-party vendor with the data being detected.
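The span decoding step can be illustrated with a minimal sketch. It assumes tags of the form B-/I-/E-/S- plus a category name, "O" for non-PII tokens, and per-token character offsets as a fast tokenizer would typically supply them. The function name, the tag layout, and the choice to drop malformed sequences are assumptions made for illustration, not the repository's actual decoder.

```python
from typing import List, Optional, Tuple

Span = Tuple[str, int, int]  # (category, char_start, char_end)


def decode_bioes(tags: List[str], offsets: List[Tuple[int, int]]) -> List[Span]:
    """Merge per-token BIOES tags into character-level spans."""
    spans: List[Span] = []
    open_cat: Optional[str] = None  # category of the span being built
    open_start = 0                  # character offset where it began
    for tag, (tok_start, tok_end) in zip(tags, offsets):
        prefix, _, cat = tag.partition("-")
        if prefix == "S":
            # Single-token span: emit immediately.
            spans.append((cat, tok_start, tok_end))
            open_cat = None
        elif prefix == "B":
            # Begin a multi-token span.
            open_cat, open_start = cat, tok_start
        elif prefix == "I" and open_cat == cat:
            continue  # still inside the open span
        elif prefix == "E" and open_cat == cat:
            # Close the open span at this token's end offset.
            spans.append((cat, open_start, tok_end))
            open_cat = None
        else:
            # "O", or an I/E tag without a matching B: drop any partial span
            # (an assumed policy; a stricter decoder could raise instead).
            open_cat = None
    return spans


text = "Jane Doe owns account 4821"
offsets = [(0, 4), (5, 8), (9, 13), (14, 21), (22, 26)]
tags = ["B-private_person", "E-private_person", "O", "O", "S-account_number"]
spans = decode_bioes(tags, offsets)
print([(cat, text[s:e]) for cat, s, e in spans])
```

Working in character offsets rather than token indices is what lets later redaction steps slice the original string directly.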
Because Privacy Filter was trained to aggressively protect personal data, it occasionally over-redacts public entities. For example, if a query returns a transaction counterparty named "Bennett Group", the model may tag it as three separate private_person spans.
We also observe phone number false positives on numeric financial values. A balance like 496959.67 is flagged as private_phone because the digit sequence resembles a phone number pattern. The live demo ships a post-detection numeric guard that checks whether a phone detection falls on a purely numeric value inside a JSON number context — suppressing the redaction when the span is unlikely to be a real phone number. This guard is documented in policy.py under _is_numeric_json_value().
We leave this behavior intact in the demo to illustrate a core design philosophy: detection is a primitive, not a pipeline. If your use case requires suppressing company name false-positives, the correct place to do so is in a post-detection filter (e.g., dropping spans that match known company suffixes), rather than trying to force the model to behave differently.
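As a rough illustration of a post-detection filter, the sketch below drops phone detections that sit on a bare JSON number and person detections that are, or are directly followed by, a company suffix. Everything here is hypothetical: the Span type, the suffix list, and the helper names are invented for the example, and the numeric check is not the repository's _is_numeric_json_value() implementation.

```python
import re
from typing import List, NamedTuple


class Span(NamedTuple):
    category: str
    start: int
    end: int


# Hypothetical suffix list; a real deployment would maintain its own.
COMPANY_SUFFIXES = ("group", "inc", "llc", "ltd", "corp", "holdings")
_NUMBER = re.compile(r"-?\d+(\.\d+)?")


def looks_like_json_number(text: str, span: Span) -> bool:
    """True when the span is a bare number in a JSON value position."""
    if not _NUMBER.fullmatch(text[span.start:span.end]):
        return False
    before = text[:span.start].rstrip()
    after = text[span.end:].lstrip()
    return before.endswith((":", "[", ",")) and after[:1] in (",", "}", "]")


def touches_company_suffix(text: str, span: Span) -> bool:
    """True when the span ends with, or is followed by, a company suffix."""
    words = text[span.start:span.end].lower().split()
    if words and words[-1].strip(".,") in COMPANY_SUFFIXES:
        return True
    nxt = re.match(r"\s*([A-Za-z]+)", text[span.end:])
    return bool(nxt) and nxt.group(1).lower() in COMPANY_SUFFIXES


def filter_spans(text: str, spans: List[Span]) -> List[Span]:
    """Keep only detections that survive both suppression rules."""
    kept = []
    for span in spans:
        if span.category == "private_phone" and looks_like_json_number(text, span):
            continue
        if span.category == "private_person" and touches_company_suffix(text, span):
            continue
        kept.append(span)
    return kept


text = '{"balance": 496959.67, "counterparty": "Bennett Group", "phone": "555-0100"}'


def span_of(category: str, needle: str) -> Span:
    start = text.index(needle)
    return Span(category, start, start + len(needle))


detected = [
    span_of("private_phone", "496959.67"),
    span_of("private_person", "Bennett"),
    span_of("private_person", "Group"),
    span_of("private_phone", "555-0100"),
]
kept = filter_spans(text, detected)
print([text[s.start:s.end] for s in kept])
```

Because the filter runs after detection, the model's output stays untouched and each suppression rule can be audited or removed on its own.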
When an LLM (the client) calls a tool (e.g., sql_query), that call passes through a strict compliance pipeline before execution, and the result passes through the pipeline again before returning to the LLM.
- RBAC Gate: Fails fast if the user's role lacks access to the tool or specific data fields.
- Inbound PII Scan: Detects sensitive data in the query using OpenAI's 1.5B parameter Privacy Filter.
- Inbound Policy: Applies role-specific rules (Allow, Redact, Hash, Vault, Review, Block).
- Tool Execution: Executes the bounded tool (with timeouts).
- Outbound PII Scan: Scans the canonical JSON output.
- Outbound Policy: Applies redaction/hashing rules to the results.
- Audit Logging: Writes a structured JSONL record of the entire trace.
Every tool call flows through a 7-step pipeline. Cheapest checks first, most expensive last.
User Query
│
▼
[1] RBAC Gate ────── fail-closed set membership check (O(1))
│
▼
[2] Inbound PII Scan ── 1.5B Privacy Filter detects 8 categories
│
▼
[3] Inbound Policy ──── ALLOW | REDACT | HASH | VAULT | REVIEW | BLOCK
│
▼
[4] Tool Execution ──── bounded, with timeouts
│
▼
[5] Outbound PII Scan ── canonical JSON scan (sort_keys=True)
│
▼
[6] Outbound Policy ──── redact/hash/vault results before returning
│
▼
[7] Audit Logging ───── append-only JSONL (SHA-256, no raw PII)
│
▼
LLM Response
The server uses FastMCP with stdio transport. The core pipeline is in auditguard_mcp/server.py:process_request().
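The ordering of the steps can be sketched as follows. This is a simplified illustration, not the contents of process_request(): the role table, the AccessDenied exception, and the scan/policy/execute callables are all hypothetical stand-ins, and audit logging (step 7) is left out for brevity.

```python
import json
import re
from typing import Any, Callable, Dict, FrozenSet, List, Tuple

Spans = List[Tuple[int, int]]

# Hypothetical role table: each role maps to the tools it may call.
ROLE_TOOLS: Dict[str, FrozenSet[str]] = {
    "analyst": frozenset({"sql_query"}),
}


class AccessDenied(Exception):
    """Raised when the role is not allowed to call the tool."""


def rbac_gate(role: str, tool: str) -> None:
    # Fail closed: an unknown role resolves to an empty set of tools.
    if tool not in ROLE_TOOLS.get(role, frozenset()):
        raise AccessDenied(f"{role!r} may not call {tool!r}")


def run_pipeline(
    role: str,
    tool: str,
    query: str,
    scan: Callable[[str], Spans],
    apply_policy: Callable[[str, Spans], str],
    execute: Callable[[str], Any],
) -> str:
    rbac_gate(role, tool)                           # [1] cheap set lookup
    safe_query = apply_policy(query, scan(query))   # [2] + [3] inbound
    result = execute(safe_query)                    # [4] tool call
    canonical = json.dumps(result, sort_keys=True)  # stable text to scan
    return apply_policy(canonical, scan(canonical)) # [5] + [6] outbound


def stub_scan(text: str) -> Spans:
    # Stand-in detector that only knows one name.
    return [(m.start(), m.end()) for m in re.finditer("Henderson", text)]


def stub_redact(text: str, spans: Spans) -> str:
    # Replace right to left so earlier offsets stay valid.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + "[private_person]" + text[end:]
    return text


output = run_pipeline(
    "analyst", "sql_query", "find Henderson",
    stub_scan, stub_redact, lambda q: {"rows": 1, "query": q},
)
print(output)
```

Serializing with sort_keys=True before the outbound scan means the same result always yields the same text, so span offsets are reproducible across runs.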
We use openai/privacy-filter (model card: April 22, 2026), a 1.5B parameter bidirectional token classifier that supports 8 PII categories (e.g., private_person, account_number, secret).
- It runs locally on CPU by default. No data is sent to OpenAI APIs.
- We implemented robust BIOES span decoding to accurately map token predictions back to character offsets in original text.
- For fast local testing, setting MOCK_PII=1 bypasses the model and uses a regex stub.
  - Note: The mock is a fast stub for local iteration and CI. Real Privacy Filter detection is qualitatively different and handles complex semantics. See benchmarks in benchmarks/ for latency comparisons.
At runtime, the server checks the PRIVACY_FILTER_LOCAL_PATH environment variable:
- If set and directory exists: Loads model from local disk (no network call)
- Otherwise: Downloads model from Hugging Face Hub (cached to HF_HOME or TRANSFORMERS_CACHE)
This enables air-gapped deployments or faster startups with pre-downloaded models.
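The lookup described above could be sketched like this. The function name and the returned tuple are assumptions for illustration; only the environment variable name, the model ID, and the local/huggingface source labels come from this README.

```python
import os
from pathlib import Path
from typing import Mapping, Tuple

DEFAULT_MODEL_ID = "openai/privacy-filter"


def resolve_model_source(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (source, location) for the Privacy Filter weights."""
    local_path = env.get("PRIVACY_FILTER_LOCAL_PATH", "")
    # Use the local copy only when the variable is set AND the directory exists.
    if local_path and Path(local_path).is_dir():
        return "local", local_path
    # Otherwise fall back to the Hub model ID, which the loader would download.
    return "huggingface", DEFAULT_MODEL_ID


source, location = resolve_model_source()
print(f"source={source} location={location}")
```

Falling back silently when the directory is missing keeps startup simple, though an air-gapped deployment may prefer to fail loudly instead of attempting a download.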
Docker usage with local model:
# Mount a local model directory
docker run -p 7860:7860 \
  -v /path/to/privacy-filter:/app/model \
  -e PRIVACY_FILTER_LOCAL_PATH=/app/model \
  auditguard-mcp:local
Local development:
# Point to a downloaded model
export PRIVACY_FILTER_LOCAL_PATH=/path/to/privacy-filter
uv run uvicorn web_app:app --port 7860
Startup logs indicate the source: source=local or source=huggingface.
For known limitations, see What Privacy Filter gets wrong above.
Policies are defined as strict Pydantic models, not loose YAML files. This ensures type safety and IDE autocomplete.
There are six policy actions:
- ALLOW: Pass through unchanged.
- REDACT: Replace with [category].
- HASH: Replace with [category:sha256-first-8]. Preserves identity consistency.
- VAULT: Store raw text in vault.jsonl and replace with a UUID reference.
- REVIEW: Leave intact but flag for human review (writes to review_queue.jsonl).
- BLOCK: Halt the request immediately and raise a PolicyViolation.
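A minimal sketch of how three of these actions (ALLOW, REDACT, HASH) might rewrite detected spans is shown below. It uses a plain Enum and a dict rather than the Pydantic models the project describes, and it reads "sha256-first-8" as the first eight hex characters of an unsalted SHA-256 digest. Both choices, along with the function names and the default of REDACT for unlisted categories, are assumptions.

```python
import hashlib
from enum import Enum
from typing import Dict, List, Tuple

Span = Tuple[str, int, int]  # (category, char_start, char_end)


class Action(str, Enum):
    ALLOW = "allow"
    REDACT = "redact"
    HASH = "hash"


def replacement(category: str, raw: str, action: Action) -> str:
    """Return the text that should stand in for one detected span."""
    if action is Action.REDACT:
        return f"[{category}]"
    if action is Action.HASH:
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:8]
        return f"[{category}:{digest}]"
    return raw  # ALLOW passes the text through unchanged


def apply_policy(text: str, spans: List[Span], policy: Dict[str, Action]) -> str:
    # Rewrite right to left so earlier offsets remain valid.
    for category, start, end in sorted(spans, key=lambda s: s[1], reverse=True):
        # Assumed default: categories missing from the policy are redacted.
        action = policy.get(category, Action.REDACT)
        text = text[:start] + replacement(category, text[start:end], action) + text[end:]
    return text


text = "Jane Doe paid Jane Doe"
spans = [("private_person", 0, 8), ("private_person", 14, 22)]

hashed = apply_policy(text, spans, {"private_person": Action.HASH})
redacted = apply_policy(text, spans, {"private_person": Action.REDACT})
print(hashed)
print(redacted)
```

Both occurrences of the name hash to the same token, which is the property that lets an analyst correlate records, while the redacted form carries no such link.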
Policy Philosophies: The repo includes two bundled policies that demonstrate different compliance philosophies:
- permissive_analyst: Prioritizes data usability. Replaces names and emails with a HASH so analysts can still correlate records (e.g., GROUP BY) belonging to the same entity across tables without knowing the entity's true identity.
- strict_financial: Prioritizes absolute privacy. Replaces names and emails with a generic REDACT (e.g., [private_person]), preventing even statistical correlation.
What's happening here?
- permissive_analyst_v1 keeps redacted text inline as [private_person:sha256-first-8]. An analyst can still GROUP BY that hash to correlate records belonging to the same person, without ever seeing the person's name.
- strict_financial_v1 replaces all PII with a generic [private_person] tag, making even statistical correlation impossible.
Same detection pipeline, opposite compliance philosophies. The policy is a config object, not a code change.
The audit log (audit.jsonl) is the ultimate source of truth. A single request produces a single JSONL record containing:
- The actor (role, user_id, session_id)
- SHA-256 hashes of the raw input and raw output
- Inbound and outbound detections (with raw text stripped)
- The exact policy config version and model version used
- Latency and terminal status
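To make the record shape concrete, here is a hedged sketch of building and appending one such entry. The field names, the "text" key on detections, and the helper functions are assumptions chosen for the example; the real schema lives in the repository.

```python
import hashlib
import json
from typing import Any, Dict, List


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def strip_raw_text(detections: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Keep category and offsets, drop the matched text itself.
    return [{k: v for k, v in d.items() if k != "text"} for d in detections]


def build_audit_record(
    actor: Dict[str, str],
    raw_input: str,
    raw_output: str,
    inbound: List[Dict[str, Any]],
    outbound: List[Dict[str, Any]],
    policy_version: str,
    model_version: str,
    latency_ms: float,
    status: str,
) -> Dict[str, Any]:
    return {
        "actor": actor,
        "input_sha256": sha256_hex(raw_input),    # hash only, never the raw text
        "output_sha256": sha256_hex(raw_output),
        "inbound_detections": strip_raw_text(inbound),
        "outbound_detections": strip_raw_text(outbound),
        "policy_version": policy_version,
        "model_version": model_version,
        "latency_ms": latency_ms,
        "status": status,
    }


def append_audit(path: str, record: Dict[str, Any]) -> None:
    # Append mode plus one JSON object per line gives a JSONL log.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


record = build_audit_record(
    actor={"role": "analyst", "user_id": "u-1", "session_id": "s-1"},
    raw_input="find the Henderson trust",
    raw_output='{"rows": 1}',
    inbound=[{"category": "private_person", "start": 9, "end": 18, "text": "Henderson"}],
    outbound=[],
    policy_version="permissive_analyst_v1",
    model_version="openai/privacy-filter",
    latency_ms=12.5,
    status="ok",
)
```

Storing hashes instead of payloads lets a reviewer confirm that a given input produced a given record without the log itself becoming a store of PII.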
The scripts/seed_data.py script generates a realistic SQLite database with customers, accounts, transactions, and advisors. Crucially, the transaction descriptions include deliberate edge cases for the PII scanner, such as compound identifiers ("account ending in 4821") and aliases.
Run make eval to execute the evaluation harness against a golden set of 15 test cases. It measures:
- RBAC accuracy (did it correctly allow/deny?)
- Status accuracy (did the request end in the expected state?)
- Inbound PII detection (were the expected categories caught?)
- Audit completeness (are all required fields present?)
- sql_query: Read-only SQLite execution with RBAC-enforced column filtering.
- customer_api: A separate FastAPI process simulating a REST backend, demonstrating how the pipeline handles internal service boundaries.
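One way to get read-only SQLite execution with per-role column filtering is sketched below. It is not the project's sql_query tool: the ROLE_COLUMNS table and read_only_query function are hypothetical, and filtering on result-column names is only illustrative, since a query that aliases a restricted column to an allowed name would slip through.

```python
import sqlite3
from pathlib import Path
from typing import Any, Dict, FrozenSet, List

# Hypothetical mapping from role to the columns it may see.
ROLE_COLUMNS: Dict[str, FrozenSet[str]] = {
    "analyst": frozenset({"id", "balance"}),
}


def read_only_query(db_path: str, sql: str, role: str) -> List[Dict[str, Any]]:
    """Run a query on a read-only connection and keep only allowed columns."""
    # mode=ro makes SQLite itself reject writes, whatever the SQL says.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        cursor = conn.execute(sql)
        names = [col[0] for col in cursor.description]
        # Fail closed: an unknown role is allowed no columns at all.
        allowed = ROLE_COLUMNS.get(role, frozenset())
        return [
            {name: value for name, value in zip(names, row) if name in allowed}
            for row in cursor.fetchall()
        ]
    finally:
        conn.close()
```

Called as read_only_query("demo.db", "SELECT * FROM accounts", "analyst"), this would return rows containing only id and balance, and a DELETE or UPDATE would raise sqlite3.OperationalError because the connection is opened read-only.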
To take this from a reference implementation to production:
- Async Review Queue: Currently, the REVIEW action is synchronous (the request completes but is flagged). In production, REVIEW should hold the request, return a "pending" status to the LLM, and wait for an out-of-band human approval webhook.
- KMS Vaulting: Replace the local vault.jsonl writer with a call to AWS KMS or HashiCorp Vault.
- Client-side TLS: The stdio transport assumes a trusted local client. If moving to SSE transport, add mTLS client certificate validation to strictly identify the actor.