Ayiru

A verified, machine-readable knowledge layer for AI agents.

Wikipedia tells humans what things are. Ayiru tells AI agents what tools can do, how to use them, and whether they're safe.

Quick Start · What's Built · MCP Integration · Architecture · Roadmap

The Problem

AI agents are about to take real actions on real systems — deleting repositories, deploying to production, sending emails, charging cards. Right now they have no reliable place to look up what a command actually does, whether it's safe, or whether anyone has verified that knowledge.

Today an agent has two options:

Guess from training data — hallucinated flags, outdated commands, fabricated behaviour.
Read scraped docs — including any prompt-injection attacks dressed up as instructions.

Both fail the same way: when the agent gets it wrong, you find out after the production database is gone.

# Without a verified knowledge layer:
agent.run("gh repo delete my-org/production-critical --yes")
# (the LLM "thought" this was safe. it wasn't.)

What Ayiru Does

A structured, evidence-backed knowledge graph that an agent queries before it acts.

verdict = atlas.validate_command(
    tool_id="github-cli",
    command="gh repo delete my-org/production-critical --yes",
)
# {
#   "safe_to_auto_execute": false,
#   "risk_level": "critical",
#   "requires_human_confirmation": true,
#   "verification_level": "L2_source_verified",
#   "confidence": 1.0,
#   "confidence_band": "strong",
#   "matched_claim_id": "claim_…",
#   "match_method": "prefix",
#   "reasons": [
#     "Deleting a GitHub repository is an irreversible remote mutation.",
#     "Safety policy blocks auto-execution at risk level 'critical'."
#   ],
#   "evidence": [
#     {"evidence_type": "official_docs",
#      "source_uri": "https://docs.github.com/en/manual/gh_repo_delete",
#      "trust_level": "high"}
#   ]
# }

Every fact Ayiru serves is backed by cited, captured evidence — not LLM reasoning. Every command is classified for risk by a deterministic engine, not a chatbot. Every claim is traceable to the byte of the source document that grounded it.
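
As a minimal sketch of how an agent might act on such a verdict, the helper below refuses to run anything unless the verdict explicitly allows it. The function name may_auto_execute is hypothetical and not part of Ayiru; only the verdict field names come from the example above.

```python
from typing import Any, Mapping


def may_auto_execute(verdict: Mapping[str, Any]) -> bool:
    """Return True only when the verdict explicitly allows unattended execution.

    Missing keys are treated as a refusal (an assumption of this sketch,
    chosen to mirror a default-deny stance).
    """
    if verdict.get("requires_human_confirmation", True):
        return False
    return verdict.get("safe_to_auto_execute", False) is True


# Verdict fields copied from the example above (trimmed).
verdict = {
    "safe_to_auto_execute": False,
    "risk_level": "critical",
    "requires_human_confirmation": True,
    "reasons": [
        "Deleting a GitHub repository is an irreversible remote mutation.",
        "Safety policy blocks auto-execution at risk level 'critical'.",
    ],
}

if may_auto_execute(verdict):
    print("run the command")
else:
    print("blocked:", "; ".join(verdict["reasons"]))
```

Treating an empty or partial verdict as a refusal keeps the agent on the safe side if a field is ever absent.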

How It Works

flowchart LR
    subgraph Sources["Sources"]
        CLI[CLI --help]
        DOCS[Docs]
        OAS[OpenAPI]
        JSON[JSON Schema]
        GQL[GraphQL SDL]
        MCP[MCP Servers]
    end

    subgraph Ingestion["Ingestion (SSRF-safe)"]
        L1[CLI runner]
        L2[Docs fetcher]
        L3[OpenAPI fetcher]
        L4[JSON Schema fetcher]
        L5[GraphQL fetcher]
        L6[MCP stdio runner]
    end

    subgraph Pipeline["Verification Pipeline"]
        ORCH[Canon Orchestrator]
        RISK[Risk Engine]
        CONF[Confidence Scorer]
        SAND[Runtime Sandbox]
    end

    GRAPH[(Knowledge Graph<br/>L0 through L3 claims)]

    AGENT[AI Agent]

    CLI --> L1 --> ORCH
    DOCS --> L2 --> ORCH
    OAS --> L3 --> ORCH
    JSON --> L4 --> ORCH
    GQL --> L5 --> ORCH
    MCP --> L6 --> ORCH

    ORCH --> RISK
    ORCH --> CONF
    ORCH --> GRAPH
    SAND -.->|L3 promotion| GRAPH

    AGENT -->|validate_command| GRAPH
    GRAPH -->|structured verdict| AGENT

Six ingestion lanes pull evidence from trusted sources. A deterministic orchestrator validates schema, classifies risk, scores confidence, deduplicates, and detects conflicts. Accepted claims compile into canonical ToolSpec and WorkflowSpec records. A runtime sandbox verifies safe checks (e.g. git --version) and promotes claims to L3_runtime_verified. Agents query the result.

Quick Start

Stage 14 ships three supported install paths, all of which produce a working ayiru binary with no manual setup.

From a checkout (recommended for development)

git clone https://github.com/ruth411/ayiru.git
cd ayiru
python3.12 -m venv backend/.venv
source backend/.venv/bin/activate
pip install -e 'backend[dev]'

ayiru seed --reset       # populate the demo graph (~5s, offline-safe)
ayiru serve --reload     # API on http://localhost:8000

OpenAPI docs at http://localhost:8000/docs.

After ayiru seed --reset, the local graph holds 47 claims across 5 tools (git, github-cli, docker, vercel-cli, openai-api) and 4 published ToolSpecs (git, github-cli, docker, vercel-cli — openai-api's OpenAPI-derived claims stay pending review). The headline demo then resolves immediately:

ayiru query --tool github-cli --command 'gh repo delete my-org/x --yes'
# BLOCK  risk=critical  confidence=1.00
#   matched_claim=claim_…  verification_level=L2_source_verified
#   - Matched claim 'gh repo delete' by prefix.
#   - Safety policy blocks auto-execution at risk level 'critical'.
#   - Deleting a GitHub repository is an irreversible remote mutation.

Standalone wheel install (Stage 14)

The wheel bundles the trust contracts, seed artifacts, and alembic migrations, so an isolated pip install produces a fully-functional package — no checkout required.

python3.12 -m venv ~/venv-ayiru
~/venv-ayiru/bin/pip install ayiru    # once published to PyPI
ayiru seed --reset                         # uses the bundled demo data
ayiru serve                                # auto-migrates the schema first

Until the package is on PyPI, build and install the wheel from a local checkout (pip install /path/to/ayiru/backend). The PyPI upload itself lands with the v1.0 release tag.

With Docker

docker build -t ayiru .
docker run --rm -p 8000:8000 ayiru         # serve the API
docker run --rm -i ayiru mcp               # MCP stdio bridge

The image bundles the same seed + contracts as the wheel and ships the ayiru binary as its entrypoint.

CLI reference

Command	Purpose
`ayiru serve [--host --port --reload --no-migrate]`	Run the FastAPI app under uvicorn; auto-migrates the schema on first start unless `--no-migrate` is passed
`ayiru mcp`	Speak MCP/JSON-RPC over stdio (for Claude Desktop, Cursor, …)
`ayiru seed [--reset --database-url URL]`	Replay `data/seed_artifacts/` into the DB
`ayiru migrate [--database-url URL]`	`alembic upgrade head`
`ayiru query --tool ID --command STR [--json]`	Ask the engine if a command is safe (exits 0 on ALLOW, 2 on BLOCK)
`ayiru verify --claim-id ID`	Run the runtime verifier; promote L2 → L3 when it passes
`ayiru tools [--json]`	List every published tool spec
`ayiru --version`	Print the package version
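
Because ayiru query signals its decision through the exit status (0 on ALLOW, 2 on BLOCK), a script can gate on it without parsing output. The wrapper below is a sketch under that assumption; the function names are hypothetical, and mapping every other exit status to "ERROR" is a choice made here, not documented behaviour.

```python
import subprocess

# Exit codes documented in the CLI reference: 0 on ALLOW, 2 on BLOCK.
EXIT_ALLOW = 0
EXIT_BLOCK = 2


def decision_from_exit_code(returncode: int) -> str:
    """Map an `ayiru query` exit status to a decision label.

    Any status other than 0 or 2 is reported as "ERROR" (assumption of this
    sketch); callers should treat it like a block.
    """
    if returncode == EXIT_ALLOW:
        return "ALLOW"
    if returncode == EXIT_BLOCK:
        return "BLOCK"
    return "ERROR"


def query(tool_id: str, command: str) -> str:
    """Run `ayiru query` (which must be on PATH) and return the decision label."""
    result = subprocess.run(
        ["ayiru", "query", "--tool", tool_id, "--command", command],
        capture_output=True,
        text=True,
        check=False,  # a BLOCK is a normal outcome, not an exception
    )
    return decision_from_exit_code(result.returncode)
```

For example, query("github-cli", "gh repo delete my-org/x --yes") would be expected to return "BLOCK" against the seeded demo graph.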

Run the dashboard (optional)

Stage 11b ships a minimal Next.js dashboard for visual exploration and the demo video. Requires Node.js 18+.

cd frontend
npm install
npm run dev

Open http://localhost:3000. The dashboard proxies /api/* to the FastAPI backend on localhost:8000 (override with AYIRU_API_URL), so the browser only talks to its own origin and no CORS configuration is needed.

Core Principles

These are non-negotiable. They're tested.

Principle	What it means
Evidence before publication	No claim enters the canonical graph without cited evidence. LLM reasoning is never primary evidence.
Structured over prose	Agents submit typed `KnowledgeClaim` objects, not free-form articles.
Safety is first-class	Every command is classified by side effects, risk, auth requirements, and destructive potential.
Verification levels are explicit	Claims expose `L0_unverified` through `L5_human_audited`. The orchestrator refuses to inflate.
Provenance is preserved	Every canonical spec traces back to the source claims and the source bytes.
Sources are data, not instructions	Docs, CLI output, and MCP metadata are scanned as data; any instructions they contain are never executed.

What's Built

Stage	Capability	Status
0 — Trust contract	Locked tool scope, evidence types, risk model	✓
1 — Persistence	Pydantic + SQLAlchemy + Alembic, drift-locked	✓
2 — Claim API	Submit / list / retrieve with evidence policy	✓
3 — Orchestrator	Schema validation, dedup, conflict detection	✓
4 — Confidence	Weighted scoring, caps, conflict penalties	✓
5 — Risk engine	Deterministic, dimension-based, contract-backed	✓
6 — Canonical specs	`ToolSpec` / `WorkflowSpec` compilation with provenance	✓
7a — CLI ingestion	Safe subprocess capture with argv allowlist	✓
7b — Docs ingestion	HTTPS-only fetch with SSRF guard + sanitization	✓
7c.1 — OpenAPI	Per-endpoint claims with JSON Pointer provenance	✓
7c.2 — JSON Schema	Per-field claims, dialect-aware validation	✓
7c.3 — GraphQL SDL	Per-root-field claims with destructive detection	✓
7d — MCP metadata	Local stdio spawn + `tools/list` capture	✓
8 — Runtime verification	L2 → L3 promotion via safe sandboxed checks	✓
9 — Agent query surface	`validate_command`, `search_tools`, `explain_risk`, `safe_workflow`, `get_tool_spec`	✓
10 — MCP server (outbound)	Expose 6 query / write tools to Claude Desktop / Cursor over stdio JSON-RPC	✓
11a — Seed dataset	`scripts/seed_examples.py` replays pre-captured artifacts; ~47 claims across 5 tools	✓
11b — Demo dashboard	Minimal Next.js UI: landing + tools list + tool detail + query playground	✓
12 — CLI + Docker	One `ayiru` binary on PATH; one-stage Dockerfile	✓
13 — Human review + audit	`L5_human_audited` promotion path; append-only audit log; review queue; per-claim history endpoint	✓
14 — Hardening	Wheel bundles contracts + seed + migrations; `/v1/` API versioning; observability + request ids; optional API-key auth + reviewer registry; auto-init on serve; Postgres dialect smoke; GitHub Actions CI; LICENSE / CONTRIBUTING / SECURITY	✓

See docs/stage_report.md for the full per-stage report including required artifacts, pass-case audit, quality bar, audit log, and what each stage explicitly defers.

Architecture

Ayiru/
├── backend/
│   ├── app/
│   │   ├── api/             # FastAPI routes (claims, canonical, ingestion, verification, query)
│   │   ├── mcp_server/      # Stage 10 — stdio JSON-RPC MCP server (6 tools)
│   │   ├── schemas/         # Pydantic v2 typed models
│   │   ├── services/        # Orchestrator, risk engine, ingestion lanes, runtime verifier, query engine
│   │   ├── db/              # SQLAlchemy 2.0 models + session
│   │   ├── cli.py           # Stage 12 — the `ayiru` console script
│   │   └── main.py          # FastAPI app + middleware (body size guard, structured errors)
│   ├── alembic/             # Migrations (drift-locked against models)
│   └── tests/               # 693 tests; ruff clean; hermetic
├── frontend/                # Stage 11b — Next.js demo dashboard (4 pages)
├── data/seed_artifacts/     # Stage 11a — pre-captured artifacts for offline-safe seeding
├── scripts/                 # seed_examples.py and other operator tools
├── contracts/               # Versioned JSON contracts (trust sources, ingestion allowlists, risk taxonomy)
├── docs/                    # Stage report, product lock, trust contract, demo scenarios
├── Dockerfile               # Stage 12 — one-stage image
└── ROADMAP.md

Key design decisions

Contracts as ground truth. Trust allowlists, ingestion sources, and risk taxonomies are versioned JSON files in contracts/. They're loaded once, validated, and cached. They cannot drift from the code without a test failing.
Protocol-based dependency injection. Every external dependency (HTTP client, MCP runner, sandbox runner) is a typing.Protocol. Tests inject fakes; production injects the real thing. The suite is hermetic — no real network or subprocess execution in CI.
Migrations match models. tests/test_alembic_metadata_alignment.py fails on any drift.
Structured errors everywhere. All API errors return {"error": {"code": "…", "message": "…", "details": {…}}} with a typed ErrorCode enum.
Adversarial tests, not happy-path tests. Every ingestion lane has tests for SSRF, redirect attacks, oversized responses, malformed inputs, content-type bypasses, cache hits with deleted artifacts, and structured 422s.

API Surface

Every endpoint returns typed JSON; errors are structured.
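
A client can lean on the structured error envelope ({"error": {"code", "message", "details"}}) instead of string-matching messages. The parser below is a sketch: ApiError and parse_error are hypothetical names, and the code value "claim_not_found" in the sample is invented for illustration rather than taken from Ayiru's ErrorCode enum.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ApiError:
    code: str
    message: str
    details: dict[str, Any]


def parse_error(body: str) -> ApiError:
    """Parse the {"error": {...}} envelope; raise ValueError if the shape is off."""
    payload = json.loads(body)
    try:
        err = payload["error"]
        return ApiError(
            code=err["code"],
            message=err["message"],
            details=err.get("details", {}),
        )
    except (KeyError, TypeError, AttributeError) as exc:
        raise ValueError("response is not a structured error envelope") from exc


# Sample body; the code value is hypothetical.
sample = '{"error": {"code": "claim_not_found", "message": "No such claim.", "details": {}}}'
error = parse_error(sample)
print(error.code, "-", error.message)
```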

Claims

POST /claims — submit a structured claim with evidence
GET /claims — paginated list with filters
GET /claims/{claim_id} — retrieve a single claim
POST /claims/{claim_id}/verify — re-run the orchestrator
GET /claims/{claim_id}/verification — latest verification result

Canonical Specs

POST /canonical/tools/{tool_id}/publish — compile accepted claims into a ToolSpec
GET /canonical/tools/{tool_id} — retrieve the published spec
POST /canonical/workflows/{workflow_id}/publish
GET /canonical/workflows/{workflow_id}

Ingestion

POST /ingestion/cli · POST /ingestion/cli/tools/{tool_id} — Stage 7a
POST /ingestion/docs · POST /ingestion/docs/tools/{tool_id} — Stage 7b
POST /ingestion/openapi · POST /ingestion/openapi/tools/{tool_id} — Stage 7c.1
POST /ingestion/json_schema · POST /ingestion/json_schema/tools/{tool_id} — Stage 7c.2
POST /ingestion/graphql · POST /ingestion/graphql/tools/{tool_id} — Stage 7c.3
POST /ingestion/mcp · POST /ingestion/mcp/publishers/{publisher} — Stage 7d
GET /ingestion/runs/{run_id} — inspect a run
GET /ingestion/artifacts/{artifact_id} — byte-stable raw evidence

Runtime Verification

POST /verification/runtime — promote a claim to L3 via a safe check
POST /verification/runtime/tools/{tool_id} — bulk-verify all claims for a tool

Human Review (Stage 13)

POST /verification/human-review — file an APPROVED / REJECTED / NEEDS_CHANGES decision; APPROVED against an L3+ claim promotes it to L5_human_audited.
GET /verification/review-queue — paginated list of claims awaiting a human decision (verification_status=requires_human_review).
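
The promotion rule for human review can be sketched as a pure function. This is an illustration only: level_rank and level_after_review are hypothetical helpers, the numeric rank is read from the L<n>_ prefix of the level name, and leaving the level unchanged in every non-promoting case is an assumption rather than documented behaviour.

```python
import re

_LEVEL = re.compile(r"^L(\d+)_")


def level_rank(level: str) -> int:
    """Extract the numeric rank from a name such as 'L3_runtime_verified'."""
    match = _LEVEL.match(level)
    if match is None:
        raise ValueError(f"not a verification level: {level!r}")
    return int(match.group(1))


def level_after_review(current: str, decision: str) -> str:
    """Return the level a claim holds after a human-review decision.

    Only APPROVED against an L3+ claim promotes to L5_human_audited.
    Every other combination leaves the level unchanged (assumption).
    """
    if decision == "APPROVED" and level_rank(current) >= 3:
        return "L5_human_audited"
    return current


print(level_after_review("L3_runtime_verified", "APPROVED"))
print(level_after_review("L2_source_verified", "APPROVED"))
```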

Audit Log (Stage 13 — append-only)

GET /audit/events — paginated query with filters by entity_type, entity_id, event_type, actor, and timestamp range.
GET /audit/claims/{claim_id} — full chronological history of every event recorded against one claim.

Agent Query Surface (the agent-facing API)

POST /query/validate-command — the headline endpoint. Returns a structured {safe_to_auto_execute, risk_level, requires_human_confirmation, reasons, evidence, verification_level, confidence} verdict. Default-deny on no match.
GET /query/tools/{tool_id} — canonical ToolSpec retrieval; 404 if no spec published.
GET /query/search-tools?q=&limit=&offset= — tiered substring search across published tools.
POST /query/explain-risk — deterministic risk classification with dimensions + citing claim ids.
POST /query/safe-workflow — published workflows matching a goal, sorted safest-first.

Live interactive docs at http://localhost:8000/docs when the server is running.
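
For agents that talk HTTP directly, a call to the headline endpoint might look like the sketch below, using only the standard library. The request body shape {tool_id, command} is assumed to match the MCP validate_command tool, and the path is used exactly as listed above; a deployment using the Stage 14 /v1/ versioning may need a prefix. Both function names are hypothetical.

```python
import json
import urllib.request


def build_validate_request(base_url: str, tool_id: str, command: str) -> urllib.request.Request:
    """Build the POST request for /query/validate-command (body shape assumed)."""
    body = json.dumps({"tool_id": tool_id, "command": command}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/query/validate-command",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def validate_command(base_url: str, tool_id: str, command: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON verdict. Requires a running server."""
    request = build_validate_request(base_url, tool_id, command)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from Quick Start running, validate_command("http://localhost:8000", "github-cli", "gh repo delete my-org/x --yes") would return the verdict as a dict.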

MCP Integration

Ayiru ships with a built-in MCP server that exposes the query surface plus claim submission to any MCP-aware agent client (Claude Desktop, Cursor, Cline, Continue, …). One config block in the client and the agent can ask Ayiru about safety before acting.

Tools exposed

Tool	What it does
`validate_command`	Safety verdict for `{tool_id, command}`. Default-deny on no match.
`get_tool_spec`	Full canonical `ToolSpec` for a known tool.
`search_tools`	Tiered substring search across published tools.
`explain_risk`	Deterministic risk classification + six dimensions + citing claims.
`get_safe_workflow`	Goal-matched workflows, safest-first.
`submit_claim`	The only write tool — submits a `KnowledgeClaim` and runs it through the orchestrator.

Run it

ayiru mcp

Speaks JSON-RPC over stdio. Closing stdin (Ctrl-D) exits cleanly.
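
To make the wire format concrete, here is a sketch of a single tools/call request as it might be written to the server's stdin. The message layout follows the MCP specification (JSON-RPC 2.0, one JSON message per line on the stdio transport); the helper name tools_call is hypothetical, and a real session would first perform the initialize handshake, which is omitted here.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids, unique within one session


def tools_call(name: str, arguments: dict) -> str:
    """Serialise one MCP `tools/call` request as a newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # Compact separators keep the message on a single line.
    return json.dumps(message, separators=(",", ":")) + "\n"


line = tools_call(
    "validate_command",
    {"tool_id": "github-cli", "command": "gh repo delete my-org/x --yes"},
)
print(line, end="")
```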

Wire it into Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):

{
  "mcpServers": {
    "ayiru": {
      "command": "/absolute/path/to/ayiru",
      "args": ["mcp"],
      "env": {
        "AYIRU_DATABASE_URL": "sqlite:////absolute/path/to/ayiru.db"
      }
    }
  }
}

Run `which ayiru` inside the activated venv to get the absolute path. Restart Claude Desktop and the 6 Ayiru tools appear in the tool list. Ask Claude "Is it safe to run gh repo delete?" and the verdict comes back inline with cited evidence.

Cursor / other MCP clients — the config shape is the same; consult the client's docs for where to register MCP servers.

Design notes

The server is hand-rolled, with no mcp SDK dependency, because the codebase is synchronous throughout and the SDK is async. Stage 7d already implemented the inverse (an MCP client), so the protocol is well understood.
Every tool's inputSchema declares additionalProperties: false so client typos surface as clean rejections instead of silently dropping fields.
Tool execution failures surface as MCP tool results flagged isError: true (per the MCP spec). Protocol-level failures (parse error, unknown method, unknown tool) surface as JSON-RPC error responses. The two paths are kept distinct so clients can write defensive code that treats them differently.
The server returns both a content[] text block (for older clients that string-parse) AND a structuredContent object (for newer clients that natively understand structured tool results).
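
A client that respects the two failure paths and the dual result format might unwrap responses roughly as follows. This is a sketch, not Ayiru client code: ProtocolError, ToolError, and unwrap_tool_result are hypothetical names, and the field names (error, result, isError, content, structuredContent) follow the MCP and JSON-RPC conventions referenced above.

```python
from typing import Any


class ProtocolError(Exception):
    """JSON-RPC level failure: parse error, unknown method, unknown tool."""


class ToolError(Exception):
    """The tool ran but reported a failure via isError."""


def _text_of(result: dict[str, Any]) -> str:
    """Join the text blocks of a tool result's content[] array."""
    blocks = result.get("content", [])
    return " ".join(b.get("text", "") for b in blocks if b.get("type") == "text")


def unwrap_tool_result(response: dict[str, Any]) -> Any:
    """Return structuredContent when present, else the joined text blocks."""
    if "error" in response:
        err = response["error"]
        raise ProtocolError(f"{err.get('code')}: {err.get('message')}")
    result = response["result"]
    if result.get("isError"):
        raise ToolError(_text_of(result))
    if "structuredContent" in result:
        return result["structuredContent"]
    return _text_of(result)
```

Keeping the two exception types separate lets a caller retry or reconfigure on ProtocolError while treating ToolError as a definitive answer from the tool.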

Submitting claims and ingesting docs

Submit a claim directly:

curl -X POST http://localhost:8000/claims \
  -H 'Content-Type: application/json' \
  -d '{
    "claim_type": "destructive_action",
    "subject": "gh repo delete",
    "statement": "Deletes a GitHub repository.",
    "tool_id": "github-cli",
    "submitted_by": "demo-agent",
    "risk_level": "critical",
    "evidence": [{
      "evidence_type": "official_docs",
      "source_uri": "https://docs.github.com/en/github-cli/github-cli/github-cli-reference",
      "excerpt": "gh repo delete deletes a repository.",
      "hash": "sha256:0000000000000000000000000000000000000000000000000000000000000000",
      "captured_at": "2026-05-18T00:00:00+00:00",
      "trust_level": "high"
    }]
  }'

Or ingest a whole documentation page automatically:

curl -X POST http://localhost:8000/ingestion/docs \
  -H 'Content-Type: application/json' \
  -d '{"tool_id": "git", "url": "https://git-scm.com/docs/git-status"}'
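
The evidence entry in the claim example carries a hash of the form sha256:<hex>, shown there with a placeholder of zeros. A real value could be produced as sketched below; which bytes the server expects to be hashed (the excerpt or the full captured artifact) is not stated here, so hashing the captured bytes is an assumption, and evidence_hash is a hypothetical helper name.

```python
import hashlib


def evidence_hash(captured: bytes) -> str:
    """Return a 'sha256:<hex>' digest string for the captured evidence bytes."""
    return "sha256:" + hashlib.sha256(captured).hexdigest()


excerpt = "gh repo delete deletes a repository."
print(evidence_hash(excerpt.encode("utf-8")))
```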

Validation

cd backend

# Test suite (693 tests, hermetic, ~30s)
.venv/bin/python -m pytest -q

# Lint
.venv/bin/ruff check app tests

# Migration upgrade / downgrade / upgrade cycle
rm -f /tmp/ayiru-smoke.db
DATABASE_URL=sqlite:////tmp/ayiru-smoke.db .venv/bin/alembic upgrade head
DATABASE_URL=sqlite:////tmp/ayiru-smoke.db .venv/bin/alembic downgrade -5
DATABASE_URL=sqlite:////tmp/ayiru-smoke.db .venv/bin/alembic upgrade head

Configuration

Variable	Default	Purpose
`AYIRU_DATABASE_URL`	`sqlite:///./ayiru.db`	SQLAlchemy URL. SQLite is the test-matrix dialect; Stage 14 verified the schema also compiles cleanly under the Postgres dialect (no live Postgres in CI yet).
`AYIRU_ALEMBIC_INI`	autodetected	Optional override path to `alembic.ini`. The CLI resolver tries env-var → source-tree `backend/alembic.ini` → bundled `app/_alembic/` (wheel install) in order.
`AYIRU_SEED_SCRIPT`	autodetected	Optional override path to a `seed_examples.py` fork. Without it, `ayiru seed` uses the in-package `app.seed_data.runner`.
`AYIRU_API_KEY`	unset (auth off)	When set, Stage 14 enables Bearer-token auth on every state-changing endpoint. Read endpoints stay public regardless. Health endpoints stay public.
`AYIRU_REVIEWER_REGISTRY`	unset (open)	Comma-separated allowlist of `reviewer_id` values for `POST /verification/human-review`. When set, unlisted reviewers receive a structured 403.
`AYIRU_API_URL` (frontend)	`http://localhost:8000`	Where the dashboard's `/api/*` rewrite points.
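
The two auth-related variables can be illustrated with a short sketch. It assumes the Bearer token is sent in a standard Authorization header and that the reviewer registry tolerates whitespace around commas; both are assumptions, and all three helper names are hypothetical.

```python
import os
from typing import Mapping, Optional


def auth_headers(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Build request headers for state-changing calls; empty when auth is off."""
    key = env.get("AYIRU_API_KEY")
    return {"Authorization": f"Bearer {key}"} if key else {}


def parse_reviewer_registry(raw: Optional[str]) -> Optional[frozenset[str]]:
    """Parse the comma-separated allowlist; None means the registry is open."""
    if not raw:
        return None
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def reviewer_allowed(reviewer_id: str, registry: Optional[frozenset[str]]) -> bool:
    """An unset registry admits everyone; otherwise the id must be listed."""
    return registry is None or reviewer_id in registry


registry = parse_reviewer_registry("alice, bob")
print(reviewer_allowed("alice", registry), reviewer_allowed("mallory", registry))
```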

What This Isn't (Yet)

Being honest about the gaps that remain after Stage 14:

Stage 0 scope is narrow. Initial tool coverage: git, github-cli, docker, vercel-cli, openai-api plus 5 MCP servers. Adding tools is a contract change, not code.
No PyPI upload yet. The wheel works (Stage 14 bundled everything), but the pip install ayiru invocation still points at a local build. The tagged release lands as part of v1.0.
SQLite is the only tested backend. The schema is dialect-portable and SQLAlchemy can target Postgres (Stage 14 added an offline DDL smoke test), but no testcontainers-style live Postgres tests run in CI. v1.1 adds them.
No external auth provider integration. Stage 14 ships an optional API-key gate via env var — solid for protecting a deployment behind a reverse proxy, not a substitute for SSO. OAuth / OIDC integration is v1.1.
No rate limiting. Deploy behind a reverse proxy (nginx, Caddy) that enforces rate limits. Native rate-limiting is v1.1.
Reviewer auth is identity-by-string. AYIRU_REVIEWER_REGISTRY is a name allowlist; per-reviewer cryptographic identity (Ed25519 keys, signed reviews) is v1.1.

Documentation

Stage report — per-stage audit with quality bar, pass cases, deferred items, and audit log
Roadmap — what each stage exists to prove
Product lock — what Ayiru is, what it isn't, and the principles that hold for v1
Trust contract — claim taxonomy, evidence taxonomy, verification rules, risk semantics
Demo scenarios — the headline queries the project promises to answer correctly
Contributing — local setup, PR checklist, code style
Security policy — vulnerability reporting, what we treat as a vuln, disclosure timeline

Contributing

This is an early-stage open-source project. Contributions welcome — please open an issue to discuss before sending a large PR.

Local dev:

cd backend
python3.12 -m venv .venv
.venv/bin/python -m pip install -e '.[dev]'
.venv/bin/alembic upgrade head
.venv/bin/python -m pytest      # must stay green
.venv/bin/ruff check app tests  # must stay clean

Non-negotiables for any PR:

New domain rules require tests.
Migrations stay reversible (alembic downgrade -1 must work).
Contract changes are versioned (*.v1.json is locked; new versions get a new file).
Safety rules never weaken: never expand allowed_commands, never loosen SSRF guards, never demote evidence-trust requirements.

See CONTRIBUTING.md for the full PR checklist and SECURITY.md for the vulnerability-reporting path.

License

MIT — see LICENSE.

Acknowledgements

Built on FastAPI, Pydantic v2, SQLAlchemy 2, Alembic, httpx, openapi-spec-validator, jsonschema, and graphql-core. The MCP protocol implementation follows the Model Context Protocol specification.
