feat(api): AIN-218 follow-up — provider on /v1/inferences/{id} + cell-coverage endpoint by hizrianraz · Pull Request #57 · ainfera-ai/api

hizrianraz · 2026-05-21T08:48:09Z

Summary

Two AIN-218 §16 follow-ups surfaced during the prod verification pass after PR #56 merged:

Fix provider null on inference detail. GET /v1/inferences/{id} was reading provider from response_payload (the raw upstream body, which never carries that key). The fix: join ProviderORM in the resolution query and surface provider.slug directly. The "no stealth substitution" rule guarantees the routed model's provider answered, so no extra reconciliation logic is needed.
Add GET /v1/users/{handle}/cell-coverage — the AIN-210 KPI surface. Aggregates distinct §16 cells observed across the handle's agents and splits by Tier-1 (reasoning, code, extraction, chat) vs Tier-2 (the rest). Pre-§16-migration rows are excluded via cell IS NOT NULL — they're permanent gaps in the chain, not bugs to backfill. Returns summary counts + top 100 cells by frequency.
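The aggregation described for the cell-coverage endpoint can be sketched roughly as below. This is a minimal in-memory illustration, not the actual query: the function name, the response field names, and the example cell strings are all hypothetical, and the real endpoint presumably does this work in SQL.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

# Tier-1 task types as listed in the PR summary; everything else is Tier-2.
TIER_1 = frozenset({"reasoning", "code", "extraction", "chat"})


def summarize_cell_coverage(
    rows: Iterable[Tuple[Optional[str], Optional[str]]],
    top_n: int = 100,
) -> dict:
    """Summarize (task_type, cell) pairs taken from a handle's inferences.

    Field names in the returned dict are assumptions made for illustration.
    """
    counts: Counter = Counter()
    tier_of: Dict[str, str] = {}
    for task_type, cell in rows:
        if cell is None:
            # Mirrors `cell IS NOT NULL`: pre-§16 rows are skipped, not backfilled.
            continue
        counts[cell] += 1
        tier_of[cell] = "tier_1" if task_type in TIER_1 else "tier_2"
    tier_1_cells = sum(1 for tier in tier_of.values() if tier == "tier_1")
    return {
        "distinct_cells": len(counts),
        "tier_1_cells": tier_1_cells,
        "tier_2_cells": len(counts) - tier_1_cells,
        "top_cells": [
            {"cell": cell, "count": n} for cell, n in counts.most_common(top_n)
        ],
    }


# Hypothetical rows; the cell string format here is a placeholder.
example_rows = [
    ("chat", "chat|model-a|band-x"),
    ("chat", "chat|model-a|band-x"),
    ("embed", "embed|model-b|band-x"),
    (None, None),  # pre-§16 row, excluded
]
print(summarize_cell_coverage(example_rows))
```

An empty fleet yields all-zero counts and an empty `top_cells` list, which matches the "empty-fleet zero-summary" case named in the test plan.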

Why now

The Phase A verification pass against prod confirmed provider: null on a freshly-routed inference where the model + provider were both populated in the DB. As a result, the dashboard inference-detail page renders "—" for provider.
The AIN-210 cell-coverage dashboard gauge is one of the two §16 hooks called out in AIN-182's acceptance criteria. This PR unblocks the dashboard PR that follows it.

Test plan

Local pre-commit: ruff check/format + mypy --strict + pytest unit + smoke all green
CI integration: new tests cover empty-fleet zero-summary + Tier-1/Tier-2 split aggregation
Post-merge prod: curl https://api.ainfera.ai/v1/users/hizrianraz/cell-coverage -H "Authorization: Bearer …" returns the expected shape
Post-merge prod: GET /v1/inferences/{id} on the Phase A verification inference id f0e31cd3-… now returns provider: "anthropic" (was null)

🤖 Generated with Claude Code

…icy_version, cell

Adds the four §16 fields the methodology v1.1 schema lock requires on every routed inference.

· alembic 0022: additive columns on `inferences` (all nullable), two Pg enums (inference_task_type, inference_task_type_source), partial indexes on task_type + policy_version for the dashboard cell-coverage gauge (AIN-210) and policy-drift telemetry.
· services/section16: pure helpers (policy_version, cell, resolve_task_type, constraint_band) + v0 Anthropic-haiku classifier with 1.5s timeout and best-effort fallback to "general" / source="default" on any failure.
· services/routing.dispatch_inference: resolve task_type (caller → classifier → default), look up tenant's active routing policy, populate all four fields on the InferenceORM row AND on the inference.routed audit payload. Hash-chain invariant preserved: old events keep their old payloads + hashes; new events carry the richer payload and hash over it.
· routers/inference: accept optional task_type on InferenceRequest, thread it through both the ainfera-auto and the direct dispatch paths, expose the four fields on GET /v1/inferences/{id}.

Backfill: deferred (separate migration after Phase B Manwe traffic produces enough rows). NOT NULL tightening also deferred.

Tests: 16 new unit tests cover policy_version determinism, ruleset_hash drift detection, cell format, resolve_task_type precedence + enum guard, classifier response parsing. 448/448 unit+smoke green.

Closes AIN-218 Phase 1 (code). DDL run is the next founder tap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
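The classifier behaviour described in this commit (1.5s timeout, best-effort fallback to "general" / source="default") might look roughly like the sketch below. The function name and the injected `classify` coroutine are hypothetical stand-ins for the real Anthropic call, and treating an out-of-enum label as a failure is an assumption.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

# Enum values taken from the AIN-218 issue text.
TASK_TYPES = frozenset(
    {"reasoning", "code", "extraction", "chat", "tool_use", "embed", "general"}
)


async def classify_with_fallback(
    prompt: str,
    classify: Callable[[str], Awaitable[str]],  # hypothetical classifier coroutine
    timeout_s: float = 1.5,
) -> Tuple[str, str]:
    """Return (task_type, task_type_source), never raising on classifier failure."""
    try:
        label = await asyncio.wait_for(classify(prompt), timeout=timeout_s)
    except Exception:
        # Timeouts, network errors, and parse errors all collapse to the default.
        return "general", "default"
    if label not in TASK_TYPES:
        # Assumption: an unknown label is handled like any other failure.
        return "general", "default"
    return label, "classifier"


async def _demo() -> None:
    async def fast(_: str) -> str:
        return "code"

    async def slow(_: str) -> str:
        await asyncio.sleep(1.0)
        return "code"

    print(await classify_with_fallback("write a parser", fast))
    print(await classify_with_fallback("write a parser", slow, timeout_s=0.01))


asyncio.run(_demo())
```

Catching `Exception` rather than `BaseException` leaves task cancellation untouched, so a cancelled request still propagates instead of being recorded as a default classification.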

…empotency mock

The §16 classifier in services/section16.py calls api.anthropic.com/v1/messages, the same URL the idempotency test mocks for the provider. The respx route matches by URL only (not by model), so without an explicit task_type the classifier fires on the first request and inflates call_count to 2.

Providing task_type="chat" in the request body routes resolve_task_type down the source="caller" branch, skipping the classifier entirely. The test's intent ("provider hit once despite idempotent replay") is then accurately measured, and we incidentally cover the §16 caller-supplied path.

Production code is unchanged. The classifier is already gated behind the idempotency check in dispatch_inference, so idempotent replays never re-classify.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…verage endpoint

Two AIN-218 §16 follow-ups surfaced by the prod verification pass:

1. GET /v1/inferences/{id} returned provider=null because the code read it from response_payload (the raw upstream body), which has no `provider` key. The model row is already joined for `model.slug`; extend the join to ProviderORM and surface `provider.slug` directly. The "no stealth substitution" rule guarantees the routed model's provider answered, so no extra reconciliation is needed.
2. New endpoint GET /v1/users/{handle}/cell-coverage, the AIN-210 KPI surface. Aggregates distinct §16 cells observed across a handle's agents and splits by Tier-1 (reasoning|code|extraction|chat) vs Tier-2 (the rest). Pre-§16-migration rows are excluded via cell IS NOT NULL: they're permanent gaps in the chain, not bugs to backfill.

Tests: empty-fleet zero-summary + mixed Tier-1/Tier-2 aggregation + openapi contract allowlist update.

Closes the §16-render dependency for AIN-182 inference-detail + AIN-210 seed-readiness gauge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

linear-code · 2026-05-21T08:48:14Z

AIN-218 [§16 P0] Outcome-capture schema migration — task_type + task_type_source + policy_version + cell

§16 schema LOCKED 2026-05-21 — one-shot immutable decision

The audit chain is append-only + hash-chained → no backfill. Every routed call captured without these fields is a permanent gap. Manwe pipe is currently dead (zero traffic), so this is a lucky near-zero-loss window. Land this migration BEFORE the pipe is fixed and traffic resumes.
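To illustrate why a hash-chained log rules out backfill, here is a minimal sketch. The hashing scheme (SHA-256 over the previous hash plus canonical JSON) and all names are assumptions for illustration only; the actual audit chain may be constructed differently.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # hypothetical starting value for the chain


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash one event over its predecessor's hash and its own payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: List[dict]) -> List[str]:
    hashes, prev = [], GENESIS
    for payload in payloads:
        prev = event_hash(prev, payload)
        hashes.append(prev)
    return hashes


def verify_chain(payloads: List[dict], hashes: List[str]) -> bool:
    return build_chain(payloads) == hashes


# An old event without §16 fields, followed by a new event that carries them.
old_event = {"event": "inference.routed", "model": "model-a"}
new_event = {
    "event": "inference.routed",
    "model": "model-a",
    "task_type": "chat",
    "task_type_source": "caller",
    "policy_version": "balanced@1.0.0+a3f9c2e1",
}
stored_hashes = build_chain([old_event, new_event])

# Appending richer events is fine; rewriting an old payload is not.
backfilled = [dict(old_event, task_type="general"), new_event]
print(verify_chain([old_event, new_event], stored_hashes))
print(verify_chain(backfilled, stored_hashes))
```

Because each hash covers the previous one, editing any earlier payload invalidates every later hash, which is why rows captured without the fields stay as gaps.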

Full spec: Methodology v1.1 §16 schema section

Add to `inference.routed` / audit payload

task_type — enum: reasoning|code|extraction|chat|tool_use|embed|general
task_type_source — enum: caller|classifier|default
policy_version — string {policy_name}@{semver}+{ruleset_hash[:8]} e.g. balanced@1.0.0+a3f9c2e1
cell — derived (task_type × model × constraint_band) for coverage tracking

task_type resolution (hybrid, locked)

task_type = caller_supplied ?? classifier_inference(prompt) ?? "general"

v0: classifier fallback = hot-model LLM classify (~200-500ms, cheapest to ship)
post-traffic: upgrade to fine-tuned ModernBERT on CPU
record task_type_source so q_empirical can weight caller vs inferred labels
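The resolution order above (caller, then classifier, then "general") together with the recorded source could be sketched as follows. The signature is hypothetical, and rejecting an invalid caller-supplied value with `ValueError` is an assumption; the issue only says an enum applies.

```python
from typing import Callable, Optional, Tuple

TASK_TYPES = frozenset(
    {"reasoning", "code", "extraction", "chat", "tool_use", "embed", "general"}
)


def resolve_task_type(
    caller_supplied: Optional[str],
    prompt: str,
    classify: Callable[[str], Optional[str]],  # hypothetical; returns None on failure
) -> Tuple[str, str]:
    """Return (task_type, task_type_source) following caller ?? classifier ?? default."""
    if caller_supplied is not None:
        if caller_supplied not in TASK_TYPES:
            # Assumption: bad caller input is rejected rather than silently replaced.
            raise ValueError(f"unknown task_type: {caller_supplied!r}")
        return caller_supplied, "caller"
    inferred = classify(prompt)
    if inferred in TASK_TYPES:
        return inferred, "classifier"
    return "general", "default"


print(resolve_task_type("chat", "hello", lambda _: "code"))
print(resolve_task_type(None, "refactor this function", lambda _: "code"))
print(resolve_task_type(None, "???", lambda _: None))
```

Note that a caller-supplied value short-circuits before `classify` is ever invoked, so the classifier costs nothing on that path.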

policy_version (locked)

{policy_name}@{semver}+{ruleset_hash[:8]} — name + intended version + drift-catching hash. Enables deterministic replay (methodology §7).
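A minimal sketch of how such a string could be built is shown below. How the real `ruleset_hash` is computed is not stated here; hashing canonical JSON with SHA-256 is an assumption, as are the function signatures and the example ruleset keys.

```python
import hashlib
import json


def ruleset_hash(ruleset: dict) -> str:
    """Hash a ruleset deterministically (assumed scheme: SHA-256 of canonical JSON)."""
    canonical = json.dumps(ruleset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def policy_version(policy_name: str, semver: str, ruleset: dict) -> str:
    """Format {policy_name}@{semver}+{ruleset_hash[:8]}."""
    return f"{policy_name}@{semver}+{ruleset_hash(ruleset)[:8]}"


# Hypothetical ruleset contents, used only to exercise the formatting.
rules = {"max_cost": 1.0, "prefer": "quality"}
same_rules_reordered = {"prefer": "quality", "max_cost": 1.0}
drifted_rules = {"max_cost": 2.0, "prefer": "quality"}

print(policy_version("balanced", "1.0.0", rules))
print(policy_version("balanced", "1.0.0", same_rules_reordered))
print(policy_version("balanced", "1.0.0", drifted_rules))
```

Sorting keys before hashing makes the first two outputs identical, while the third differs even though the declared semver did not change, which is the drift the hash suffix is meant to catch.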

Acceptance

Migration adds 4 fields to routed/audit payload, additive only
Caller-supplied task_type honored; classifier fills when absent; default "general" last
task_type_source correctly tagged on every record
policy_version emitted with name@semver+hash on every routing decision
cell derivable for coverage dashboard (AIN-210 / AIN-182)
Verified: one real routed call writes all 4 fields (curl /v1/inferences/{id})
Dashboard inference-detail + workflow pages render real values (replacing the "—" placeholders from PR #52, feat(api): AIN-182 Phase 2 · templates backend + 6 system seeds)

Blocks

Real traffic should not resume until this lands. Unblocks: dashboard §16 surfacing (AIN-182), cell-coverage gauge (AIN-210), q_empirical training (AIN-208 downstream).

cursor · 2026-05-21T08:48:14Z

You have used all Bugbot PR reviews included in your free trial for your GitHub account on this workspace.

To continue using Bugbot reviews, enable Bugbot for your team in the Cursor dashboard.

…ULL) The InferenceORM.request_payload column is NOT NULL by schema; the direct-insert test path needs a non-null value. Use a minimal stub ({model, messages: []}) so the aggregation test exercises only the cell-coverage logic and not unrelated payload validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Aule and others added 3 commits May 21, 2026 14:07

hizrianraz merged commit 92511ed into main May 21, 2026
3 checks passed

hizrianraz merged 4 commits into main from feat/ain-218-followup-provider-cell-coverage
