AgentSpecGap is a runnable hackathon prototype for mining enforceable specifications from LLM-agent artifacts. It shows how rules scattered across system prompts, tool descriptions, tool schemas, and runtime config can be recovered into an auditable policy layer, resolved against a canonical tool registry, and validated against safe and mutated unsafe tool-call traces.
This is not a SQL regex blocker. SQL is only one fixture. The prototype uses database, object storage, authorization, web retrieval, and public-email-style tools to show a general policy mining and validation flow.
LLM agents often receive safety and reliability instructions in natural language: "call schema lookup first," "do not write to unapproved buckets," "do not leak secrets," or "return at most 100 rows." If those rules remain only in prompts, they depend on model behavior at the exact moment a tool call is proposed.
AgentSpecGap treats those instructions as candidate specifications. It extracts them with source spans, resolves tool names against a registry, compiles enforceable rules into policy JSON, and evaluates every simulated tool-call event before execution.
- SQL analytics agent
  - Tools: `sql.schema_lookup`, `sql.execute`
  - Rules for schema lookup ordering, destructive action denial, table allowlists, row bounds, schema contracts, and business-logic review.
- Cloud data agent
  - Tools: `authz.check_permission`, `aws.dynamodb.PutItem`, `scan_for_secrets`, `aws.s3.PutObject`, `web.fetch`, `public_email.send`
  - Rules for permission ordering, secret-scan ordering, private ACLs, bucket allowlists, timeouts, schema contracts, and secret flow denial.
All data is local fixture data. No AWS, email, or web tool call is executed; the only database touched is the local SQLite file used by the SQL integration tests.
The prototype uses these category names consistently:
- Interface validation: input/output/schema contract checks.
- Authorization check: actor, resource, action, permission, allowlist, and exposure checks.
- Workflow ordering validation: sequencing checks across tool calls.
- Data validation: provenance and taint checks from data source to destination.
- Runtime validation: execution-time bounds such as timeout, row count, retries, and response size.
- Business-logic validation: semantic task correctness that usually needs review or evals.
The deterministic extractor emits candidate rules with:
- raw rule text
- source artifact and source span
- mentioned tool or action
- grounding: `exact`, `partial`, or `inferred`
- check category
- proposed policy operator
- confidence
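The fields listed above could be modeled roughly as follows. This is a sketch only: the field names, the `candidateProblems` helper, the 0 to 1 confidence range, the sample rule text, and every category identifier other than `workflow_ordering_validation` are assumptions made for illustration, not the prototype's actual types.

```typescript
// Hypothetical shape of an extracted candidate rule (names are illustrative).
type Grounding = "exact" | "partial" | "inferred";

// Only "workflow_ordering_validation" appears in the policy example;
// the other identifiers are assumed to follow the same naming pattern.
type CheckCategory =
  | "interface_validation"
  | "authorization_check"
  | "workflow_ordering_validation"
  | "data_validation"
  | "runtime_validation"
  | "business_logic_validation";

interface CandidateRule {
  rawText: string;          // raw rule text as written in the artifact
  sourceArtifact: string;   // which artifact the rule came from
  sourceSpan: string;       // where in that artifact
  mention: string;          // tool or action as mentioned, not yet resolved
  grounding: Grounding;
  checkCategory: CheckCategory;
  proposedOperator: string; // e.g. "require_before"
  confidence: number;       // assumed to lie in [0, 1]
}

// Returns a list of problems; an empty list means the candidate looks usable.
function candidateProblems(rule: CandidateRule): string[] {
  const problems: string[] = [];
  if (rule.rawText.trim() === "") problems.push("empty rule text");
  if (rule.sourceSpan.trim() === "") problems.push("missing source span");
  if (!(rule.confidence >= 0 && rule.confidence <= 1)) {
    problems.push("confidence outside [0, 1]");
  }
  return problems;
}

// Illustrative instance; the rule text is invented for this sketch.
const exampleRule: CandidateRule = {
  rawText: "Check permissions before writing to DynamoDB.",
  sourceArtifact: "cloud-system-prompt",
  sourceSpan: "Rules bullet 1",
  mention: "PutItem",
  grounding: "exact",
  checkCategory: "workflow_ordering_validation",
  proposedOperator: "require_before",
  confidence: 0.95,
};
```

Keeping the unresolved `mention` separate from any canonical tool ID makes it explicit that resolution against the registry is a later step.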
The compiler turns grounded and resolved rules into policy JSON:
{
"id": "policy-cloud-r1",
"title": "Permission check before DynamoDB write",
"checkCategory": "workflow_ordering_validation",
"operator": "require_before",
"target": { "toolId": "aws.dynamodb.PutItem" },
"condition": { "requiredToolId": "authz.check_permission" },
"source": {
"artifact": "cloud-system-prompt",
"span": "Rules bullet 1",
"grounding": "exact"
},
"enforcement": "enforce"
}

Rules with exact grounding and a resolved tool become `enforce`. Partial rules are `review_only` unless their resolution confidence is high. Inferred business-logic rules are `eval_required`.
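The enforcement mapping can be sketched as a small decision function. Several details are assumptions: the `ResolvedRule` shape, the `HIGH_CONFIDENCE` threshold of 0.9, promoting high-confidence partial rules to `enforce`, downgrading unresolved rules to `review_only`, and treating every inferred rule (not only business-logic ones) as `eval_required`.

```typescript
type Grounding = "exact" | "partial" | "inferred";
type Enforcement = "enforce" | "review_only" | "eval_required";

// Hypothetical view of a rule after registry resolution.
interface ResolvedRule {
  grounding: Grounding;
  resolved: boolean;            // did the mention map to a canonical tool ID?
  resolutionConfidence: number; // assumed to lie in [0, 1]
}

// Assumed cutoff for "high" resolution confidence.
const HIGH_CONFIDENCE = 0.9;

function enforcementFor(rule: ResolvedRule): Enforcement {
  // Inferred rules need evals rather than middleware enforcement.
  if (rule.grounding === "inferred") return "eval_required";
  // Assumption: nothing is enforced against an unresolved tool.
  if (!rule.resolved) return "review_only";
  // Exact grounding plus a resolved tool is enforceable.
  if (rule.grounding === "exact") return "enforce";
  // Partial grounding: assumed to be enforced only at high confidence.
  return rule.resolutionConfidence >= HIGH_CONFIDENCE ? "enforce" : "review_only";
}
```

For example, `enforcementFor({ grounding: "partial", resolved: true, resolutionConfidence: 0.5 })` yields `review_only` under these assumptions.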
Raw LLM output is not trusted as a canonical tool reference. The resolver compares each mention against the registry using:
- exact canonical ID match
- alias match
- context keyword match
- schema keyword match
- ambiguity detection
For example, the raw mention PutItem resolves to aws.dynamodb.PutItem through the registry, while other candidates such as aws.s3.PutObject remain visible with lower scores.
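A minimal sketch of this kind of scoring, using the `PutItem` example. The registry entries, score weights, and ambiguity margin are invented for illustration, the name-overlap heuristic is an assumption about why near misses stay visible, and schema keyword matching is left out for brevity; the prototype's resolver may weigh things differently.

```typescript
// Hypothetical registry entry; aliases and keywords are illustrative.
interface RegistryTool {
  id: string;
  aliases: string[];
  keywords: string[];
}

interface Candidate {
  toolId: string;
  score: number;
}

interface Resolution {
  best: Candidate;
  candidates: Candidate[]; // all candidates, highest score first
  ambiguous: boolean;
}

const registry: RegistryTool[] = [
  { id: "aws.dynamodb.PutItem", aliases: ["PutItem"], keywords: ["dynamodb", "table", "item"] },
  { id: "aws.s3.PutObject", aliases: ["PutObject"], keywords: ["s3", "bucket", "object"] },
];

// Split "aws.s3.PutObject" or "PutItem" into lowercase word tokens.
function tokens(name: string): string[] {
  return (name.match(/[A-Z]?[a-z0-9]+/g) ?? []).map((t) => t.toLowerCase());
}

function scoreTool(tool: RegistryTool, mention: string, context: string): number {
  const m = mention.toLowerCase();
  let base = 0;
  if (m === tool.id.toLowerCase()) {
    base = 1.0; // exact canonical ID match
  } else if (tool.aliases.some((a) => a.toLowerCase() === m)) {
    base = 0.9; // alias match
  } else {
    // Partial credit for shared name tokens, so near misses stay visible.
    const shortName = tool.id.split(".").pop() ?? tool.id;
    const nameTokens = tokens(shortName);
    const mentionTokens = tokens(mention);
    const shared = mentionTokens.filter((t) => nameTokens.includes(t)).length;
    base = mentionTokens.length === 0 ? 0 : 0.4 * (shared / mentionTokens.length);
  }
  // Small bonus for context keywords, capped so it cannot outweigh an alias.
  const ctx = context.toLowerCase();
  const hits = tool.keywords.filter((k) => ctx.includes(k)).length;
  return Math.min(1, base + Math.min(0.1, 0.05 * hits));
}

function resolveMention(mention: string, context: string, tools: RegistryTool[]): Resolution {
  if (tools.length === 0) throw new Error("registry is empty");
  const candidates = tools
    .map((t) => ({ toolId: t.id, score: scoreTool(t, mention, context) }))
    .sort((a, b) => b.score - a.score);
  const best = candidates[0]!;
  const runnerUp = candidates[1];
  // Assumed rule: flag ambiguity when the top two scores are within 0.1.
  const ambiguous = runnerUp !== undefined && best.score - runnerUp.score < 0.1;
  return { best, candidates, ambiguous };
}

const resolution = resolveMention("PutItem", "write the record to the DynamoDB table", registry);
```

Here `aws.dynamodb.PutItem` wins on the alias match, while `aws.s3.PutObject` keeps a small nonzero score from the shared `put` token and therefore remains in the candidate list.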
The engine uses generic policy operators:
- `schema_contract`
- `deny_action`
- `require_before`
- `resource_in_allowlist`
- `arg_within_bound`
- `flow_denied`
- `timeout_leq`
- `review_required`
These operators are intentionally not SQL-specific. They evaluate normalized tool-call events with canonical toolId, actor, resource, args, timestamps, and optional input/output labels.
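To illustrate how such operators can stay tool-agnostic, here is a sketch that evaluates three of them against normalized events. The `ToolEvent` and `Policy` shapes, the `limit` argument name, and the reason strings are assumptions for this example; only the operator names and tool IDs come from the prototype.

```typescript
// Hypothetical normalized tool-call event.
interface ToolEvent {
  toolId: string;
  actor: string;
  resource?: string;
  args: Record<string, unknown>;
  timestamp: number;
}

// Simplified policy union covering three of the operators.
type Policy =
  | { operator: "require_before"; target: string; requiredToolId: string }
  | { operator: "arg_within_bound"; target: string; arg: string; max: number }
  | { operator: "resource_in_allowlist"; target: string; allowlist: string[] };

interface Decision {
  allowed: boolean;
  reasons: string[];
}

// Checks one proposed event against all policies, given the events before it.
function evaluateEvent(policies: Policy[], prior: ToolEvent[], event: ToolEvent): Decision {
  const reasons: string[] = [];
  for (const p of policies) {
    if (p.target !== event.toolId) continue; // policy does not apply to this tool
    switch (p.operator) {
      case "require_before":
        if (!prior.some((e) => e.toolId === p.requiredToolId)) {
          reasons.push(`${p.requiredToolId} must run before ${p.target}`);
        }
        break;
      case "arg_within_bound": {
        const value = event.args[p.arg];
        if (typeof value !== "number" || value > p.max) {
          reasons.push(`${p.arg} must be a number <= ${p.max}`);
        }
        break;
      }
      case "resource_in_allowlist":
        if (event.resource === undefined || !p.allowlist.includes(event.resource)) {
          reasons.push(`resource ${String(event.resource)} is not allowlisted`);
        }
        break;
    }
  }
  return { allowed: reasons.length === 0, reasons };
}

// Illustrative SQL policies: lookup first, at most 100 rows, approved tables only.
const sqlPolicies: Policy[] = [
  { operator: "require_before", target: "sql.execute", requiredToolId: "sql.schema_lookup" },
  { operator: "arg_within_bound", target: "sql.execute", arg: "limit", max: 100 },
  { operator: "resource_in_allowlist", target: "sql.execute", allowlist: ["customers", "orders", "products"] },
];

const lookup: ToolEvent = { toolId: "sql.schema_lookup", actor: "agent", args: {}, timestamp: 1 };
const safeQuery: ToolEvent = {
  toolId: "sql.execute", actor: "agent", resource: "orders", args: { limit: 50 }, timestamp: 2,
};
```

With these definitions, `evaluateEvent(sqlPolicies, [lookup], safeQuery)` allows the call, while dropping `lookup` from the prior events or raising `limit` to 500 produces a block with a reason.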
The app includes safe traces and mutated unsafe traces.
SQL mutants:
- remove `schema_lookup`
- replace `SELECT` with `DELETE`
- change table to `payments`
- change `LIMIT 50` to `LIMIT 500`
Cloud mutants:
- remove `authz.check_permission`
- remove `scan_for_secrets`
- change ACL to `public-read`
- change bucket to an unapproved bucket
- send secret-labeled data to `public_email.send`
The evaluator computes total traces, safe traces allowed, unsafe mutants blocked, false allows, false blocks, and coverage by check category.
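These metrics can be derived from per-trace outcomes along the following lines. The `TraceOutcome` shape, the per-category `{ blocked, total }` representation of coverage, and the sample outcomes are assumptions for illustration; they are hand-written inputs, not results produced by the prototype.

```typescript
// Hypothetical per-trace result: what the trace was meant to be, and what the engine did.
interface TraceOutcome {
  traceId: string;
  expected: "safe" | "unsafe";
  blocked: boolean;
  category?: string; // check category targeted by an unsafe mutant
}

interface Summary {
  totalTraces: number;
  safeAllowed: number;
  unsafeBlocked: number;
  falseAllows: number; // unsafe mutants that slipped through
  falseBlocks: number; // safe traces that were blocked
  coverageByCategory: Record<string, { blocked: number; total: number }>;
}

function summarize(outcomes: TraceOutcome[]): Summary {
  const summary: Summary = {
    totalTraces: outcomes.length,
    safeAllowed: 0,
    unsafeBlocked: 0,
    falseAllows: 0,
    falseBlocks: 0,
    coverageByCategory: {},
  };
  for (const o of outcomes) {
    if (o.expected === "safe") {
      if (o.blocked) summary.falseBlocks += 1;
      else summary.safeAllowed += 1;
      continue;
    }
    if (o.blocked) summary.unsafeBlocked += 1;
    else summary.falseAllows += 1;
    // Assumed definition of coverage: blocked mutants over all mutants per category.
    const key = o.category ?? "uncategorized";
    const cell = summary.coverageByCategory[key] ?? { blocked: 0, total: 0 };
    cell.total += 1;
    if (o.blocked) cell.blocked += 1;
    summary.coverageByCategory[key] = cell;
  }
  return summary;
}

// Hand-written sample; trace IDs other than "sql-safe" are hypothetical.
const sampleSummary = summarize([
  { traceId: "sql-safe", expected: "safe", blocked: false },
  { traceId: "sql-no-lookup", expected: "unsafe", blocked: true, category: "workflow_ordering_validation" },
  { traceId: "sql-limit-500", expected: "unsafe", blocked: true, category: "runtime_validation" },
  { traceId: "sql-missed", expected: "unsafe", blocked: false, category: "workflow_ordering_validation" },
]);
```

A nonzero `falseAllows` count points to a policy gap, whereas a nonzero `falseBlocks` count points to a rule that is too strict for the safe traces.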
npm install
npm run dev

This starts three processes:
- Policy watcher: rebuilds `policy/policy.json` when `src/data/artifacts.ts`, `src/data/toolRegistry.ts`, or `src/engine/ruleExtractor.ts` changes.
- Validation server (port 8787): reloads `policy/policy.json` and enforces rules on tool-call traces.
- UI: reads policy and evaluation from the server; Trace Runner calls `POST /api/validate/trace`.
Open the Vite URL printed in the terminal. Edit an artifact file (e.g. add a rule bullet in artifacts.ts) and watch the header timestamp update after the watcher rebuilds the bundle.
Offline policy builds call Ollama (qwen3-8b-16k:latest by default) to read agent artifacts and emit structured rules. Output is validated (JSON shape, enums, minimum rule counts) and falls back to fixtures if Ollama is down or returns invalid JSON.
# ensure Ollama is running and the model is pulled
ollama list
npm run verify:llm # health + ping + full extraction smoke test

Environment (see `.env.example`):

| Variable | Default | Meaning |
|---|---|---|
| `POLICY_EXTRACTOR` | `auto` | `llm` (require Ollama), `fixture`, or `auto` (LLM with fixture fallback) |
| `OLLAMA_BASE_URL` | `http://127.0.0.1:11434` | Ollama API |
| `OLLAMA_MODEL` | `qwen3-8b-16k:latest` | Model tag from `ollama list` |
| `OLLAMA_TIMEOUT_MS` | `300000` | Per-agent extraction timeout (cold start) |
policy.json includes an extraction block recording whether each agent used llm or fixture.
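The interaction between the extractor mode and the fixture fallback might look like the sketch below. The helper names, the `ExtractedRule` shape, the default minimum rule count of 1, and the choice to throw in `llm` mode are assumptions; the environment variable names, defaults, and operator names are taken from this README. The environment is passed in as a plain object so the sketch does not depend on any runtime.

```typescript
type ExtractorMode = "llm" | "fixture" | "auto";

interface ExtractorConfig {
  mode: ExtractorMode;
  baseUrl: string;
  model: string;
  timeoutMs: number;
}

// Hypothetical minimal rule shape used only for validation in this sketch.
interface ExtractedRule {
  text: string;
  operator: string;
}

const KNOWN_OPERATORS: string[] = [
  "schema_contract", "deny_action", "require_before", "resource_in_allowlist",
  "arg_within_bound", "flow_denied", "timeout_leq", "review_required",
];

// Read settings from an env-like object, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ExtractorConfig {
  const raw = env["POLICY_EXTRACTOR"];
  // Assumption: unknown values fall back to "auto".
  const mode: ExtractorMode = raw === "llm" || raw === "fixture" ? raw : "auto";
  const timeout = Number(env["OLLAMA_TIMEOUT_MS"] ?? "300000");
  return {
    mode,
    baseUrl: env["OLLAMA_BASE_URL"] ?? "http://127.0.0.1:11434",
    model: env["OLLAMA_MODEL"] ?? "qwen3-8b-16k:latest",
    timeoutMs: Number.isFinite(timeout) ? timeout : 300000,
  };
}

// Validate JSON shape, operator enum, and minimum rule count; null means invalid.
function parseRules(raw: string, minRules: number): ExtractedRule[] | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (!Array.isArray(data) || data.length < minRules) return null;
  const rules: ExtractedRule[] = [];
  for (const item of data) {
    if (typeof item !== "object" || item === null) return null;
    const { text, operator } = item as Record<string, unknown>;
    if (typeof text !== "string" || typeof operator !== "string") return null;
    if (!KNOWN_OPERATORS.includes(operator)) return null;
    rules.push({ text, operator });
  }
  return rules;
}

// llmOutput is the raw model response, or null if Ollama was unreachable.
function chooseRules(
  mode: ExtractorMode,
  llmOutput: string | null,
  fixture: ExtractedRule[],
  minRules = 1,
): { source: "llm" | "fixture"; rules: ExtractedRule[] } {
  if (mode === "fixture") return { source: "fixture", rules: fixture };
  const parsed = llmOutput === null ? null : parseRules(llmOutput, minRules);
  if (parsed !== null) return { source: "llm", rules: parsed };
  // Assumption: "llm" mode fails loudly instead of silently using fixtures.
  if (mode === "llm") throw new Error("LLM extraction required but unavailable or invalid");
  return { source: "fixture", rules: fixture };
}
```

Returning the `source` alongside the rules is what would let a bundle record, per agent, whether `llm` or `fixture` was used.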
| Script | Purpose |
|---|---|
| `npm run build:policy` | One-shot offline LLM extract + compile to `policy/policy.json` |
| `npm run verify:llm` | Validate Ollama connectivity and extraction |
| `npm run policy:watch` | File-change listener that rebuilds `policy.json` |
| `npm run server` | Runtime validation API |
| `npm run dev:ui` | UI only (requires the server to be running and `policy.json` to exist) |
- `GET /api/policy`: latest compiled bundle (policies + extracted rules)
- `GET /api/policy/version`: `generatedAt` for reload detection
- `GET /api/evaluate`: mutation evaluation against all fixture traces
- `POST /api/validate/trace`: `{ agentId, events[] }` → per-event allow/block decisions
- `POST /api/validate/event`: `{ agentId, priorEvents[], event }`, a single live tool-call check
- `GET /api/llm/health`: Ollama reachability + model + last bundle extraction metadata
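A client call to the trace endpoint could be sketched as follows. The event fields, the `sql-analytics` agent ID in the usage note, and the helper names are assumptions, and the response is left as `unknown` because its shape is not specified here. The port matches the validation server's 8787, and the sketch relies on the global `fetch` available in browsers and recent Node versions.

```typescript
// Hypothetical event shape for the request body.
interface ToolEvent {
  toolId: string;
  actor: string;
  resource?: string;
  args: Record<string, unknown>;
}

interface TraceRequestInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the URL and request options for POST /api/validate/trace.
function buildTraceRequest(
  baseUrl: string,
  agentId: string,
  events: ToolEvent[],
): { url: string; init: TraceRequestInit } {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/api/validate/trace`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ agentId, events }),
    },
  };
}

// Send the trace and return the parsed JSON response (shape not assumed here).
async function validateTrace(baseUrl: string, agentId: string, events: ToolEvent[]): Promise<unknown> {
  const { url, init } = buildTraceRequest(baseUrl, agentId, events);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`validation request failed with status ${res.status}`);
  return res.json();
}
```

With the server running locally, a call such as `validateTrace("http://127.0.0.1:8787", "sql-analytics", events)` would submit a whole trace in one request.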
Fixture traces in `traces.ts` are scripted tool-call logs (no real agent). SQL integration adds a real SQLite warehouse at `data/sql/analytics.db` with `customers`, `orders`, and `products` tables, plus an unapproved `payments` table for negative tests.
Flow per trace: policy check → if allowed, run SQL.
| Command | Purpose |
|---|---|
| `npm run seed:sql` | Create/refresh seed database |
| `npm run test:sql` | Run all SQL fixture traces against policy + DB (CI-friendly) |
| `GET /api/sql/integration-test` | Same matrix over HTTP |
| `POST /api/sql/run-trace/:id` | Run one trace (e.g. `sql-safe`) |
| `POST /api/sql/execute` | `{ priorEvents?, event }`, middleware-style single step |
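The "policy check, then run SQL only if allowed" flow amounts to a small guard around execution. In this sketch the policy check and the SQL runner are injected as functions, so it makes no assumption about the SQLite driver; the type names, the `sql` argument, and the stub functions are hypothetical.

```typescript
// Hypothetical event and decision shapes for the guard.
interface SqlEvent {
  toolId: string;
  args: { sql: string };
}

interface Decision {
  allowed: boolean;
  reasons: string[];
}

type Outcome<T> =
  | { status: "blocked"; reasons: string[] }
  | { status: "executed"; result: T };

// Run `execute` only when `check` allows the event; otherwise report why not.
function guardedExecute<T>(
  event: SqlEvent,
  priorEvents: SqlEvent[],
  check: (prior: SqlEvent[], e: SqlEvent) => Decision,
  execute: (e: SqlEvent) => T,
): Outcome<T> {
  const decision = check(priorEvents, event);
  if (!decision.allowed) {
    return { status: "blocked", reasons: decision.reasons };
  }
  return { status: "executed", result: execute(event) };
}

// Stub policy for illustration: deny statements that start with DELETE.
const denyDelete = (_prior: SqlEvent[], e: SqlEvent): Decision =>
  /^\s*delete\b/i.test(e.args.sql)
    ? { allowed: false, reasons: ["destructive action denied"] }
    : { allowed: true, reasons: [] };

// Stub runner that records what would have been executed.
const executedStatements: string[] = [];
const fakeRun = (e: SqlEvent): number => {
  executedStatements.push(e.args.sql);
  return executedStatements.length;
};
```

Because the runner is only invoked on the allowed branch, a blocked statement never reaches the database, which is the property the integration tests are meant to exercise.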
See TESTING.md for the full testing strategy.
AgentSpecGap demonstrates the core thesis: prompt-only agent safety is not enforcement. A registry-resolved policy layer can recover scattered rules, classify them by validation category, compile them into auditable policy JSON, and catch unsafe mutated tool-call traces before simulated execution.
The key contribution is normalizing heterogeneous LLM-agent tool calls into registry-resolved events, classifying extracted rules into consistent check categories, compiling enforceable rules into a compact policy IR, and validating the policy through mutation-based trace testing.
- Fixture-based rule extraction is deterministic and only simulates LLM extraction.
- The policy engine implements a compact subset of generic operators.
- Business-logic validation is marked eval-required rather than enforced in middleware.
- There is no real orchestrator integration yet.
- Taint labels are fixture labels rather than propagated through a complete dataflow graph.
- Replace fixture extraction with an LLM extractor that emits source spans and uncertainty.
- Add richer registry metadata for actors, resource hierarchies, and environment-specific permissions.
- Add taint propagation across tool outputs, intermediate memory, retrieval, and final answers.
- Export policy JSON and trace results for CI.
- Add adapters for actual agent orchestrators and MCP-style tool registries.