AIN-205 Ed25519 audit + harden public surfaces (marketing launch) by hizrianraz · Pull Request #58 · ainfera-ai/api

hizrianraz · 2026-05-21T13:04:42Z

Summary

The marketing launch on ainfera.ai asserts on /audit, /security, and /docs that the public chain is signed with Ed25519. Before this PR, however, the well-known endpoint served the HMAC secret labelled as a "public verify key", so anyone who fetched it could forge events. This PR rolls the chain forward to asymmetric Ed25519 signatures and closes the remaining public-surface leaks that the marketing site previously hid client-side.

Ed25519 (AIN-205)

alembic 0023 — add sig_alg (default 'hmac') + signature columns; loosen hmac_signature NOT NULL; check constraint binds payload to alg. Append-only trigger from 0001 stays in force.
services/audit — compute_ed25519_signature / verify_ed25519_signature derive the keypair from AUDIT_ED25519_PRIVATE_SEED_B64 (Doppler; deterministic dev seed when empty so tests run without secrets). append_event signs new rows with Ed25519; verify_chain dispatches on sig_alg so legacy HMAC rows still verify.
/v1/audit/public-key — now returns the Ed25519 PUBLIC half only (raw b64 + PEM + fingerprint).
scripts/gen_audit_ed25519_key.py — one-shot keypair generator. Already run; prod seed is set in Doppler ainfera-os/prd. Public key fingerprint: e141b7503518a7a722e2be6e5e0e519c730a7634057420e4b893d36c6c4e6049.
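
The sig_alg dispatch described above might look roughly like the sketch below. It is an illustration only, not the code in services/audit: the `AuditRow` shape, the `verify_row` helper, and the use of HMAC-SHA256 over the raw payload bytes are assumptions, and the Ed25519 check is injected as a callable so the sketch stays standard-library only.

```python
import hashlib
import hmac
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical row shape. Only sig_alg, signature and hmac_signature are
# named in the PR; everything else here is an assumption.
@dataclass(frozen=True)
class AuditRow:
    payload: bytes
    sig_alg: str                           # 'hmac' (legacy default) or 'ed25519'
    hmac_signature: Optional[str] = None   # hex digest, legacy rows only
    signature: Optional[bytes] = None      # raw signature, Ed25519 rows only


# (payload, signature) -> True when the signature is valid.
Ed25519Verifier = Callable[[bytes, bytes], bool]


def verify_row(row: AuditRow, hmac_secret: bytes, verify_ed25519: Ed25519Verifier) -> bool:
    """Verify one row by dispatching on its sig_alg column."""
    if row.sig_alg == "hmac":
        if row.hmac_signature is None:
            return False
        # Assumption: legacy rows store HMAC-SHA256 as a hex string.
        expected = hmac.new(hmac_secret, row.payload, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, row.hmac_signature)
    if row.sig_alg == "ed25519":
        if row.signature is None:
            return False
        return verify_ed25519(row.payload, row.signature)
    # Unknown algorithms never verify.
    return False
```

Keeping the legacy branch means rows written before migration 0023 still verify, while every unknown sig_alg value fails closed.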

Public-surface gates (marketing v15 launch)

/v1/audit/public: the server-side filter now excludes rows where tenant_handle LIKE 'internal-%' or agent_name = 'manwe' (the founder's private brain, which runs under the same tenant as public agents, so the internal-* filter alone doesn't catch it).
/v1/audit/height (new) — true append-only length under the same filters, so marketing footer shows real chain height instead of counting a 200-row sample.
/v1/heartbeat/latest — was open; leaked mac-studio-01 + internal agent fleet. Now matches POST: internal-key gated.
/v1/stats/public/leaderboard — drop ats_* from public projection (ATS is internal vocabulary).
/v1/models — drop aa_index_source (the value 'aamc_v1_lock' leaks internal AAMC vocabulary). Column stays in the DB for internal tools.
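
As a rough illustration of the two audit gates above, the sketch below applies the same exclusion rule to an in-memory list. The real filter runs server-side in SQL; the event dictionaries and the helper names (`is_public`, `public_page`, `public_height`) are hypothetical.

```python
from typing import Iterable, List, Mapping

INTERNAL_TENANT_PREFIX = "internal-"        # mirrors tenant_handle LIKE 'internal-%'
PRIVATE_AGENT_NAMES = frozenset({"manwe"})  # mirrors agent_name = 'manwe'


def is_public(event: Mapping[str, str]) -> bool:
    """True when an event may appear on the public audit surface."""
    tenant = event.get("tenant_handle", "")
    agent = event.get("agent_name", "")
    return not tenant.startswith(INTERNAL_TENANT_PREFIX) and agent not in PRIVATE_AGENT_NAMES


def public_page(events: Iterable[Mapping[str, str]], limit: int) -> List[Mapping[str, str]]:
    """Stand-in for /v1/audit/public: filtered rows, capped at the page size."""
    return [e for e in events if is_public(e)][:limit]


def public_height(events: Iterable[Mapping[str, str]]) -> int:
    """Stand-in for /v1/audit/height: count of all public rows, not one page."""
    return sum(1 for e in events if is_public(e))
```

Counting under the same predicate is what lets the height differ from the length of any single page, which is the problem the 200-row sample had.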

Test plan

449 unit + smoke tests pass locally; mypy --strict clean on every touched module.
Pre-commit (ruff + mypy + pytest -x) clean.
CI green.
After merge — Railway redeploy applies migration 0023 + new code picks up prod seed from Doppler.
Live smoke post-deploy:
- curl https://api.ainfera.ai/v1/audit/public-key → alg: Ed25519
- curl https://api.ainfera.ai/v1/audit/public?limit=500 → 0 internal-* / 0 manwe rows
- curl https://api.ainfera.ai/v1/heartbeat/latest → 403 (no key)
- curl https://api.ainfera.ai/v1/audit/height → real count
- curl https://api.ainfera.ai/v1/models → no aa_index_source field
- curl https://api.ainfera.ai/v1/stats/public/leaderboard → no ats_* fields
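
The field-level smoke checks above could be automated with a small helper along these lines. This is a sketch under assumptions: `find_leaked_keys` is a hypothetical name, and it expects a JSON body that has already been fetched and parsed, so no network calls appear here.

```python
from typing import Any, Iterable, List


def find_leaked_keys(payload: Any, exact: Iterable[str] = (), prefixes: Iterable[str] = ()) -> List[str]:
    """Return every key in a parsed JSON payload that matches a forbidden name or prefix."""
    exact_set = set(exact)
    prefix_tuple = tuple(prefixes)
    leaked: List[str] = []

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key in exact_set or (prefix_tuple and key.startswith(prefix_tuple)):
                    leaked.append(key)
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(payload)
    return leaked
```

For example, running it over the /v1/models response with `exact=["aa_index_source"]`, or over the leaderboard response with `prefixes=["ats_"]`, should return an empty list after deploy.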

…icy_version, cell

Adds the four §16 fields the methodology v1.1 schema lock requires on every routed inference.

· alembic 0022: additive columns on `inferences` (all nullable), two Pg enums (inference_task_type, inference_task_type_source), partial indexes on task_type + policy_version for the dashboard cell-coverage gauge (AIN-210) and policy-drift telemetry.
· services/section16: pure helpers (policy_version, cell, resolve_task_type, constraint_band) + v0 Anthropic-haiku classifier with 1.5s timeout and best-effort fallback to "general" / source="default" on any failure.
· services/routing.dispatch_inference: resolve task_type (caller → classifier → default), look up tenant's active routing policy, populate all four fields on the InferenceORM row AND on the inference.routed audit payload. Hash-chain invariant preserved: old events keep their old payloads + hashes; new events carry the richer payload and hash over it.
· routers/inference: accept optional task_type on InferenceRequest, thread it through both the ainfera-auto and the direct dispatch paths, expose the four fields on GET /v1/inferences/{id}.

Backfill: deferred (separate migration after Phase B Manwe traffic produces enough rows). NOT NULL tightening also deferred.

Tests: 16 new unit tests cover policy_version determinism, ruleset_hash drift detection, cell format, resolve_task_type precedence + enum guard, classifier response parsing. 448/448 unit+smoke green.

Closes AIN-218 Phase 1 (code). DDL run is the next founder tap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…empotency mock

The §16 classifier in services/section16.py calls api.anthropic.com/v1/messages, the same URL the idempotency test mocks for the provider. The respx route matches by URL only (not by model), so without an explicit task_type the classifier fires on the first request and inflates call_count to 2.

Providing task_type="chat" in the request body routes resolve_task_type down the source="caller" branch, skipping the classifier entirely. The test's intent ("provider hit once despite idempotent replay") is then accurately measured, and we incidentally cover the §16 caller-supplied path.

Production code is unchanged. The classifier is already gated behind the idempotency check in dispatch_inference, so idempotent replays never re-classify.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…verage endpoint

Two AIN-218 § follow-ups surfaced by the prod verification pass:

1. GET /v1/inferences/{id} returned provider=null because the code read it from response_payload (the raw upstream body), which has no `provider` key. The model row is already joined for `model.slug`; extend the join to ProviderORM and surface `provider.slug` directly. The "no stealth substitution" rule guarantees the routed model's provider answered, so no extra reconciliation is needed.
2. New endpoint GET /v1/users/{handle}/cell-coverage, the AIN-210 KPI surface. Aggregates distinct §16 cells observed across a handle's agents and splits by Tier-1 (reasoning|code|extraction|chat) vs Tier-2 (the rest). Pre-§16-migration rows are excluded via cell IS NOT NULL; they're permanent gaps in the chain, not bugs to backfill.

Tests: empty-fleet zero-summary + mixed Tier-1/Tier-2 aggregation + openapi contract allowlist update.

Closes the §16-render dependency for AIN-182 inference-detail + AIN-210 seed-readiness gauge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
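
The cell-coverage aggregation described in this commit can be sketched as below. The tuple representation of a cell and the `cell_coverage` name and return shape are assumptions for illustration; only the Tier-1 task types and the "skip rows with no cell" rule come from the commit message.

```python
from typing import Dict, Iterable, Optional, Tuple

# Tier-1 task types as listed in the commit message; everything else is Tier-2.
TIER_1 = frozenset({"reasoning", "code", "extraction", "chat"})

# Hypothetical in-memory form of a cell: (task_type, model, constraint_band).
Cell = Tuple[str, str, str]


def cell_coverage(cells: Iterable[Optional[Cell]]) -> Dict[str, int]:
    """Count distinct cells, split by tier. None stands for a pre-§16 row (cell IS NULL)."""
    distinct = {c for c in cells if c is not None}
    tier_1 = sum(1 for c in distinct if c[0] in TIER_1)
    return {"total": len(distinct), "tier_1": tier_1, "tier_2": len(distinct) - tier_1}
```

An empty fleet yields all zeros, which matches the empty-fleet zero-summary case the tests cover.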

…ULL)

The InferenceORM.request_payload column is NOT NULL by schema; the direct-insert test path needs a non-null value. Use a minimal stub ({model, messages: []}) so the aggregation test exercises only the cell-coverage logic and not unrelated payload validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Marketing launch on ainfera.ai asserts on /audit, /security, /docs that the public chain is signed with Ed25519, but pre-this-commit the well-known endpoint served the HMAC secret labelled as a "public verify key", i.e. anyone fetching it could forge events. This rolls the audit chain forward to asymmetric Ed25519 signatures + closes the remaining public-surface leaks the marketing site previously hid client-side.

## Ed25519 (AIN-205)

- alembic 0023: add `sig_alg` (default 'hmac') + `signature` columns; loosen `hmac_signature` NOT NULL; check constraint binds payload to alg. Append-only trigger from 0001 stays in force.
- services/audit: compute_ed25519_signature / verify_ed25519_signature derive the keypair at load time from AUDIT_ED25519_PRIVATE_SEED_B64 (Doppler; deterministic dev seed when empty so tests stay reproducible). append_event signs new rows with Ed25519; verify_chain dispatches on sig_alg so legacy hmac rows still verify.
- routers/audit: /v1/audit/public-key now returns the Ed25519 PUBLIC half only (raw b64 + PEM + fingerprint). Pre-rollout this endpoint leaked the signing secret.
- scripts/gen_audit_ed25519_key.py: one-shot keypair generator. Seed to stdout, instructions to stderr; operator pipes stdout into Doppler. Public key derived at runtime; no separate var.

## Public-surface gates

- /v1/audit/public: server-side filter for tenant_handle LIKE 'internal-%' (CI/sacrificial leaks) + agent_name = 'manwe' (founder's private brain, which runs under the same tenant as public agents so the internal-* filter alone doesn't catch it).
- /v1/audit/height (new): true append-only length under the same public filters, so marketing footer shows real chain height instead of counting a 200-row sample that caps at the page size.
- /v1/heartbeat/latest: was open; leaked `mac-studio-01` + the internal agent fleet to anyone. Now matches POST: internal-key gated.
- /v1/stats/public/leaderboard: drop ats_overall + ats_reliability + ats_quality + ats_cost_efficiency + ats_latency + ats_compliance from the public projection (ATS is internal vocabulary).
- /v1/models: drop aa_index_source (the value 'aamc_v1_lock' leaks internal AAMC vocabulary). Column stays in the DB for internal tools.

449 unit+smoke tests pass; mypy --strict clean on the touched modules. Integration tests verified against the new heartbeat lockdown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

linear-code · 2026-05-21T13:04:49Z

AIN-205 [verify] Ed25519 Migration & Trust Model Hardening

Shipped (`verify@757aa3c`)

VER-01: trust banners + pinned fingerprint
VER-02: verify_ed25519, optional ed25519_signature on chain, --agent-pubkey-file, tests, docs/ed25519-migration.md

Deferred

API producer signs every event
Rekor / Sigstore online verify
Bundle embeds agent PEM
P0: /.well-known/ainfera-public-key.json still 404 on marketing (blocks full HMAC verify in prod)

Crosscheck (Linear ↔ Notion ↔ HQ)

Field	Value
Linear	Done @ 2026-05-20 (VER-02 slice)
Notion	—
HQ commit	`verify@757aa3c`
Prod / tests	34 pytest pass; well-known 404 noted
Receipt	`.launch-snapshots/DONE-LINEAR-NOTION-STANDARD-20260520.md`

AIN-218 [§16 P0] Outcome-capture schema migration — task_type + task_type_source + policy_version + cell

§16 schema LOCKED 2026-05-21 — one-shot immutable decision

The audit chain is append-only + hash-chained → no backfill. Every routed call captured without these fields is a permanent gap. Manwe pipe is currently dead (zero traffic), so this is a lucky near-zero-loss window. Land this migration BEFORE the pipe is fixed and traffic resumes.

Full spec: Methodology v1.1 §16 schema section

Add to `inference.routed` / audit payload

task_type — enum: reasoning|code|extraction|chat|tool_use|embed|general
task_type_source — enum: caller|classifier|default
policy_version — string {policy_name}@{semver}+{ruleset_hash[:8]} e.g. balanced@1.0.0+a3f9c2e1
cell — derived (task_type × model × constraint_band) for coverage tracking

task_type resolution (hybrid, locked)

task_type = caller_supplied ?? classifier_inference(prompt) ?? "general"

v0: classifier fallback = hot-model LLM classify (~200-500ms, cheapest to ship)
post-traffic: upgrade to fine-tuned ModernBERT on CPU
record task_type_source so q_empirical can weight caller vs inferred labels
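
The resolution order above (caller, then classifier, then default) can be sketched as follows. This is a minimal illustration, not the services/section16 code: the function signature, the injected `classify` callable, and the choice to raise on an invalid caller value are assumptions.

```python
from typing import Callable, Optional, Tuple

# Enum values as listed in the schema section above.
TASK_TYPES = frozenset(
    {"reasoning", "code", "extraction", "chat", "tool_use", "embed", "general"}
)


def resolve_task_type(
    caller_supplied: Optional[str],
    prompt: str,
    classify: Callable[[str], Optional[str]],
) -> Tuple[str, str]:
    """Return (task_type, task_type_source) using caller -> classifier -> default."""
    if caller_supplied is not None:
        # Assumption: an out-of-enum caller value is rejected rather than ignored.
        if caller_supplied not in TASK_TYPES:
            raise ValueError(f"unknown task_type: {caller_supplied!r}")
        return caller_supplied, "caller"
    try:
        inferred = classify(prompt)
    except Exception:
        # Best-effort: any classifier failure (timeout, bad response) falls through.
        inferred = None
    if inferred in TASK_TYPES:
        return inferred, "classifier"
    return "general", "default"
```

Because the caller branch returns before `classify` is ever invoked, supplying task_type explicitly skips the classifier call entirely.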

policy_version (locked)

{policy_name}@{semver}+{ruleset_hash[:8]} — name + intended version + drift-catching hash. Enables deterministic replay (methodology §7).
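
One way to produce a string in this format is sketched below. The hashing scheme is an assumption: the spec only fixes the `{policy_name}@{semver}+{ruleset_hash[:8]}` layout, so SHA-256 over canonical JSON and the helper names are illustrative choices.

```python
import hashlib
import json
from typing import Any, Mapping


def ruleset_hash(ruleset: Mapping[str, Any]) -> str:
    """Hash a ruleset deterministically (assumed: SHA-256 of key-sorted compact JSON)."""
    canonical = json.dumps(ruleset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def policy_version(policy_name: str, semver: str, ruleset: Mapping[str, Any]) -> str:
    """Format name + intended version + the first 8 hex chars of the ruleset hash."""
    return f"{policy_name}@{semver}+{ruleset_hash(ruleset)[:8]}"
```

Sorting keys before hashing keeps the value stable across dict orderings, while any change to the rules themselves changes the suffix even if the semver was not bumped, which is the drift the hash is meant to catch.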

Acceptance

Migration adds 4 fields to routed/audit payload, additive only
Caller-supplied task_type honored; classifier fills when absent; default "general" last
task_type_source correctly tagged on every record
policy_version emitted with name@semver+hash on every routing decision
cell derivable for coverage dashboard (AIN-210 / AIN-182)
Verified: one real routed call writes all 4 fields (curl /v1/inferences/{id})
Dashboard inference-detail + workflow pages render real values (replace — from PR feat(api): AIN-182 Phase 2 · templates backend + 6 system seeds #52)

Blocks

Real traffic should not resume until this lands. Unblocks: dashboard §16 surfacing (AIN-182), cell-coverage gauge (AIN-210), q_empirical training (AIN-208 downstream).

Review in Linear

…ublic surfaces

Pairs with ainfera-ai/api#58 (AIN-205 Ed25519 + public-surface lockdown). Brings the marketing site's copy in line with what the API now actually serves and bakes the single 8% Ainfera fee across every surface that talks about pricing.

## Real chain height (kills the fake ticker)

- New components/v15/BlockTicker reads /v1/audit/height (added in api#58), the true append-only chain length. SSR seeds via layout fetch so first paint has a real number; client polls every 30s with silent failure (last value sticks on network blip).
- TopNav: drop the local `BlockTicker` that started at 8_432_189 and randomly incremented every 2.4s. Now uses the shared real ticker.
- SiteFooter: drop the hardcoded `const block = "8,432,189"` constant. Now uses the shared real ticker too.
- layout.tsx: async + fetches getAuditHeight() once on render, passes to both surfaces as `initialBlock`.

## Internal Linear IDs out of public copy

- /audit: drop "AIN-205" from the "verify a call" paragraph and the "Ed25519 · AIN-205" badge in the signature scheme block.
- /privacy: drop "AIN-205" reference in section 5 (audit chain).
- /changelog: rename every entry's `version` from AIN-XXX to a semantic external label (design-v15, audit-v2, feed-v1.1, models-v1, feed-v1, intelligence-v1, coverage-v1). Brand v1.3.1 row unchanged.

## 8% flat margin (replaces drifted tier-by-tier copy)

Tier-specific fees on /pricing had drifted to Builder 4% / Studio 2.5% / Scale "Volume Ainfera fee" while the design + dashboard claimed something else. Founder lock: single 8% fee across every plan.

- /pricing tiers: all three now read "Pass-through Provider cost + 8% Ainfera fee". "The fee tier changes" intro rewritten: same fee across plans, tier only changes the controls/SLA/support.
- components/v13/PricingMath: ainferaMargin constant 0.04 → 0.08; worked-example line "4% on inference" → "8% on inference".
- /compare: pricing-row margin 5% → 8%.
- dashboard /billing: "Margin model 5%" → "8%" + caption update.
- dashboard /settings Builder card: "5–10% margin on inference" → "Flat 8% margin on inference".

## API type / scrubbing follow-throughs

- lib/v15/api.ts: drop ats_* fields from LeaderboardRow and aa_index_source from CatalogModel; these are no longer projected on the public api after api#58. Adds PublicAuditHeight type + getAuditHeight() helper.
- app/api/audit-ticker/route.ts: belt-and-suspenders scrub adds the manwe denylist alongside the existing internal-* tenant filter. api#58 makes both no-ops against a fixed upstream, but the local filter survives upstream regressions.

next build clean on both apps. No fabricated numbers, no internal Linear IDs, no manwe references in rendered output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
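
As a worked example of the flat fee described in this commit, the sketch below computes a billed amount as provider cost plus 8%. It assumes the fee is charged on the pass-through provider cost and rounded to cents; the `billed_amount` helper is hypothetical, and the real PricingMath component is a web component in a different language.

```python
from decimal import Decimal

# Mirrors the ainferaMargin constant after the 0.04 -> 0.08 change.
AINFERA_MARGIN = Decimal("0.08")


def billed_amount(provider_cost: Decimal) -> Decimal:
    """Pass-through provider cost plus the flat Ainfera fee (assumed rounded to cents)."""
    fee = (provider_cost * AINFERA_MARGIN).quantize(Decimal("0.01"))
    return provider_cost + fee
```

Under these assumptions, a provider cost of 100.00 is billed as 108.00 on every plan.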

Aule and others added 5 commits May 21, 2026 14:07

hizrianraz merged commit 1cfc7f5 into main May 21, 2026
3 checks passed

hizrianraz mentioned this pull request May 21, 2026

Marketing v15 launch: real chain height + 8% margin + scrubbed public copy ainfera-ai/web#54

Merged

5 tasks

hizrianraz merged 5 commits into main from feat/ain-218-followup-provider-cell-coverage
