feat(api): AIN-218 §16 schema lock — task_type, task_type_source, policy_version, cell#56

Merged: hizrianraz merged 2 commits into main from feat/ain-218-section16-fields on May 21, 2026

@hizrianraz
Contributor

Summary

Implements the methodology v1.1 §16 schema lock on every routed inference. Audit chain stays valid: old events keep old payloads + old hashes; new events hash over the richer payload.
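
The hash-chain invariant described above can be illustrated with a minimal sketch. The hashing scheme here (SHA-256 over the previous hash plus canonical JSON), the `GENESIS` sentinel, and the function names are assumptions for illustration, not the project's actual audit code. The point it shows is that events written before the migration verify unchanged, while rewriting an old payload to add the new fields would break verification.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed sentinel used as prev_hash for the first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Append a new event whose hash covers exactly the payload given."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev_hash": prev, "hash": event_hash(prev, payload)}
    )


def verify(chain: list) -> bool:
    """Recompute every hash in order and check the links."""
    prev = GENESIS
    for event in chain:
        if event["prev_hash"] != prev:
            return False
        if event["hash"] != event_hash(prev, event["payload"]):
            return False
        prev = event["hash"]
    return True


chain: list = []

# Event written before the migration: no §16 fields in the payload.
append_event(chain, {"type": "inference.routed", "model": "model-a"})

# Event written after the migration: richer payload, hashed as-is.
# The cell string format is a placeholder, not the real encoding.
append_event(
    chain,
    {
        "type": "inference.routed",
        "model": "model-a",
        "task_type": "chat",
        "task_type_source": "caller",
        "policy_version": "balanced@1.0.0+a3f9c2e1",
        "cell": "chat|model-a|default",
    },
)
```

Under this scheme, mixing old and new payload shapes is harmless because each hash covers only its own payload, whereas editing an old payload in place invalidates the chain from that event onward.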

  • alembic 0022 — additive: 4 nullable columns on inferences, 2 Pg enums (inference_task_type, inference_task_type_source), partial indexes on task_type + policy_version for the AIN-210 cell-coverage gauge + policy-drift telemetry.
  • services/section16.py — pure helpers (policy_version, cell, constraint_band, resolve_task_type) + v0 Anthropic-haiku classifier with 1.5s timeout, best-effort None-on-failure (caller falls through to general / source=default).
  • services/routing.dispatch_inference — resolves task_type (caller → classifier → default), looks up tenant's active routing policy, populates all 4 fields on the InferenceORM row AND on the inference.routed audit payload.
  • routers/inference — accepts optional task_type on InferenceRequest, threads through both the ainfera-auto and direct dispatch paths, exposes the 4 fields on GET /v1/inferences/{id}.

Backfill + NOT NULL tightening deferred (depends on Phase B Manwe traffic producing enough rows).
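
A rough sketch of the best-effort classifier wrapper described in the `services/section16.py` bullet, assuming an async `classify` callable. The wrapper name, the stub classifiers, and the decision to discard labels outside the enum are all hypothetical; only the 1.5s timeout and the None-on-failure behaviour come from the summary.

```python
import asyncio
from typing import Awaitable, Callable, Optional

CLASSIFIER_TIMEOUT_S = 1.5  # timeout stated in the PR summary

TASK_TYPES = frozenset(
    {"reasoning", "code", "extraction", "chat", "tool_use", "embed", "general"}
)


async def classify_best_effort(
    prompt: str,
    classify: Callable[[str], Awaitable[str]],
    timeout_s: float = CLASSIFIER_TIMEOUT_S,
) -> Optional[str]:
    """Return a valid task_type label, or None on any failure."""
    try:
        raw = await asyncio.wait_for(classify(prompt), timeout=timeout_s)
        label = str(raw).strip().lower()
    except Exception:
        # Timeouts, provider errors, and malformed responses all collapse
        # to None so the caller can fall through to the default.
        return None
    # Assumption: labels outside the enum are treated as a failed parse.
    return label if label in TASK_TYPES else None


# Hypothetical stand-ins for the real LLM call.
async def fast_stub(prompt: str) -> str:
    return " Code\n"


async def slow_stub(prompt: str) -> str:
    await asyncio.sleep(0.2)
    return "chat"


async def failing_stub(prompt: str) -> str:
    raise RuntimeError("provider unavailable")
```

Returning None rather than raising keeps the routing path free of classifier-specific error handling: the caller only has to distinguish "got a label" from "did not".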

Test plan

  • 16 new unit tests cover policy_version determinism, ruleset_hash drift detection, cell format, resolve_task_type precedence + enum guard, classifier response parse tolerance
  • 448/448 unit + smoke green locally
  • ruff check clean
  • mypy --strict clean (71 files)
  • After merge + Railway deploy + alembic upgrade head on prod Supabase: real inference call writes all 4 fields, verify with curl /v1/inferences/<id> | jq '{task_type, task_type_source, policy_version, cell}' — all non-null

Closes AIN-218 Phase 1.

…icy_version, cell

Adds the four §16 fields the methodology v1.1 schema lock requires on
every routed inference.

· alembic 0022: additive columns on `inferences` (all nullable), two
  Pg enums (inference_task_type, inference_task_type_source), partial
  indexes on task_type + policy_version for the dashboard cell-coverage
  gauge (AIN-210) and policy-drift telemetry.

· services/section16: pure helpers (policy_version, cell, resolve_task_type,
  constraint_band) + v0 Anthropic-haiku classifier with 1.5s timeout and
  best-effort fallback to "general" / source="default" on any failure.

· services/routing.dispatch_inference: resolve task_type (caller →
  classifier → default), look up tenant's active routing policy,
  populate all four fields on the InferenceORM row AND on the
  inference.routed audit payload. Hash-chain invariant preserved:
  old events keep their old payloads + hashes; new events carry the
  richer payload and hash over it.

· routers/inference: accept optional task_type on InferenceRequest,
  thread it through both the ainfera-auto and the direct dispatch
  paths, expose the four fields on GET /v1/inferences/{id}.

Backfill: deferred (separate migration after Phase B Manwe traffic
produces enough rows). NOT NULL tightening also deferred.
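
For orientation, the additive shape of such a migration can be spelled out as the PostgreSQL statements it would roughly need to issue. This is not the contents of alembic 0022: the real revision would use Alembic operations, and the column types for `policy_version` and `cell`, the index names, and the partial-index predicates below are assumptions. The enum values are the ones listed in the AIN-218 spec.

```python
TASK_TYPES = ("reasoning", "code", "extraction", "chat", "tool_use", "embed", "general")
TASK_TYPE_SOURCES = ("caller", "classifier", "default")


def enum_ddl(name: str, values: tuple) -> str:
    """Build a CREATE TYPE ... AS ENUM statement for a Postgres enum."""
    quoted = ", ".join(f"'{v}'" for v in values)
    return f"CREATE TYPE {name} AS ENUM ({quoted})"


def upgrade_statements() -> list:
    """Additive-only DDL: new enums, nullable columns, partial indexes."""
    return [
        enum_ddl("inference_task_type", TASK_TYPES),
        enum_ddl("inference_task_type_source", TASK_TYPE_SOURCES),
        # All four columns are nullable so existing rows remain valid.
        "ALTER TABLE inferences ADD COLUMN task_type inference_task_type NULL",
        "ALTER TABLE inferences ADD COLUMN task_type_source inference_task_type_source NULL",
        "ALTER TABLE inferences ADD COLUMN policy_version TEXT NULL",
        "ALTER TABLE inferences ADD COLUMN cell TEXT NULL",
        # Partial indexes skip legacy rows where the new fields are NULL
        # (predicate and index names are assumed, not taken from the PR).
        "CREATE INDEX ix_inferences_task_type ON inferences (task_type) "
        "WHERE task_type IS NOT NULL",
        "CREATE INDEX ix_inferences_policy_version ON inferences (policy_version) "
        "WHERE policy_version IS NOT NULL",
    ]
```

Keeping every column nullable is what makes the later NOT NULL tightening a separate, deferrable step.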

Tests: 16 new unit tests cover policy_version determinism, ruleset_hash
drift detection, cell format, resolve_task_type precedence + enum
guard, classifier response parsing. 448/448 unit+smoke green.

Closes AIN-218 Phase 1 (code). DDL run is the next founder tap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@linear-code

linear-code Bot commented May 21, 2026

AIN-218 [§16 P0] Outcome-capture schema migration — task_type + task_type_source + policy_version + cell

§16 schema LOCKED 2026-05-21 — one-shot immutable decision

The audit chain is append-only + hash-chained → no backfill. Every routed call captured without these fields is a permanent gap. Manwe pipe is currently dead (zero traffic), so this is a lucky near-zero-loss window. Land this migration BEFORE the pipe is fixed and traffic resumes.

Full spec: Methodology v1.1 §16 schema section

Add to inference.routed / audit payload

  • task_type — enum: reasoning|code|extraction|chat|tool_use|embed|general
  • task_type_source — enum: caller|classifier|default
  • policy_version — string {policy_name}@{semver}+{ruleset_hash[:8]} e.g. balanced@1.0.0+a3f9c2e1
  • cell — derived (task_type × model × constraint_band) for coverage tracking
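
A minimal sketch of how `cell` could be derived and used for coverage tracking. The `|` separator, the `coverage` helper, and the example model and band names are assumptions; the spec only states that a cell is the combination of task_type, model, and constraint_band.

```python
from itertools import product

TASK_TYPES = ("reasoning", "code", "extraction", "chat", "tool_use", "embed", "general")


def cell(task_type: str, model: str, constraint_band: str) -> str:
    """Join the three dimensions into one key (separator is an assumption)."""
    return f"{task_type}|{model}|{constraint_band}"


def coverage(observed: set, models: tuple, bands: tuple) -> float:
    """Fraction of the task_type x model x band grid seen at least once."""
    universe = {cell(t, m, b) for t, m, b in product(TASK_TYPES, models, bands)}
    if not universe:
        return 0.0
    return len(observed & universe) / len(universe)


# Hypothetical models and constraint bands, for illustration only.
models = ("model-a", "model-b")
bands = ("low", "high")
observed = {cell("chat", "model-a", "low"), cell("code", "model-b", "high")}
ratio = coverage(observed, models, bands)
```

With 7 task types, 2 models, and 2 bands the grid has 28 cells, so two observed cells give a coverage of 2/28.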

task_type resolution (hybrid, locked)

task_type = caller_supplied ?? classifier_inference(prompt) ?? "general"
  • v0: classifier fallback = hot-model LLM classify (~200-500ms, cheapest to ship)
  • post-traffic: upgrade to fine-tuned ModernBERT on CPU
  • record task_type_source so q_empirical can weight caller vs inferred labels
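
The resolution order above can be sketched as a small pure function. The signature is hypothetical, and raising on an invalid caller value is an assumption (the real enum guard may reject or fall through differently). The classifier is passed as a zero-argument callable so it only runs when the caller supplied nothing.

```python
from typing import Callable, Optional, Tuple

TASK_TYPES = frozenset(
    {"reasoning", "code", "extraction", "chat", "tool_use", "embed", "general"}
)


def resolve_task_type(
    caller_supplied: Optional[str],
    classify: Callable[[], Optional[str]],
) -> Tuple[str, str]:
    """Return (task_type, task_type_source) using caller > classifier > default."""
    if caller_supplied is not None:
        if caller_supplied not in TASK_TYPES:
            # Assumed guard behaviour: reject values outside the enum.
            raise ValueError(f"unknown task_type: {caller_supplied!r}")
        return caller_supplied, "caller"

    inferred = classify()  # only invoked when the caller gave no label
    if inferred in TASK_TYPES:
        return inferred, "classifier"

    # Classifier failed (None) or returned something outside the enum.
    return "general", "default"
```

For example, `resolve_task_type("chat", classify)` returns `("chat", "caller")` without ever calling `classify`, which is what lets a test pin the source and avoid the extra LLM call.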

policy_version (locked)

{policy_name}@{semver}+{ruleset_hash[:8]} — name + intended version + drift-catching hash. Enables deterministic replay (methodology §7).
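
A short sketch of the format, assuming the ruleset hash is a SHA-256 over a canonical JSON encoding of the rules. The hashing choice and the example ruleset keys are assumptions; only the `{policy_name}@{semver}+{ruleset_hash[:8]}` layout is taken from the spec.

```python
import hashlib
import json


def ruleset_hash(ruleset: dict) -> str:
    """Hex digest of the ruleset, stable across key ordering."""
    canonical = json.dumps(ruleset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def policy_version(policy_name: str, semver: str, ruleset: dict) -> str:
    """Format name@semver+first-8-hex-chars-of-ruleset-hash."""
    return f"{policy_name}@{semver}+{ruleset_hash(ruleset)[:8]}"


# Hypothetical rulesets: same declared version, one rule silently changed.
rules_a = {"max_cost_usd": 0.01, "prefer": "latency"}
rules_a_reordered = {"prefer": "latency", "max_cost_usd": 0.01}
rules_b = {"max_cost_usd": 0.02, "prefer": "latency"}

v_a = policy_version("balanced", "1.0.0", rules_a)
v_a_reordered = policy_version("balanced", "1.0.0", rules_a_reordered)
v_b = policy_version("balanced", "1.0.0", rules_b)
```

The hash suffix is what catches drift: `v_a` and `v_b` share the same name and semver but differ in their suffix because one rule changed without a version bump, while `v_a_reordered` matches `v_a` because key order does not affect the canonical encoding.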

Acceptance

  • Migration adds 4 fields to routed/audit payload, additive only
  • Caller-supplied task_type honored; classifier fills when absent; default "general" last
  • task_type_source correctly tagged on every record
  • policy_version emitted with name@semver+hash on every routing decision
  • cell derivable for coverage dashboard (AIN-210 / AIN-182)
  • Verified: one real routed call writes all 4 fields (curl /v1/inferences/{id})
  • Dashboard inference-detail + workflow pages render real values (replace from PR feat(api): AIN-182 Phase 2 · templates backend + 6 system seeds #52)

Blocks

Real traffic should not resume until this lands. Unblocks: dashboard §16 surfacing (AIN-182), cell-coverage gauge (AIN-210), q_empirical training (AIN-208 downstream).

@cursor

cursor Bot commented May 21, 2026

You have used all Bugbot PR reviews included in your free trial for your GitHub account on this workspace.

To continue using Bugbot reviews, enable Bugbot for your team in the Cursor dashboard.

…empotency mock

The §16 classifier in services/section16.py calls api.anthropic.com/v1/messages —
the same URL the idempotency test mocks for the provider. The respx route matches
by URL only (not by model), so without an explicit task_type the classifier fires
on the first request and inflates call_count to 2.

Providing task_type="chat" in the request body routes resolve_task_type down the
source="caller" branch, skipping the classifier entirely. The test's intent
("provider hit once despite idempotent replay") is then accurately measured,
and we incidentally cover the §16 caller-supplied path.

Production code is unchanged. The classifier is already gated behind the
idempotency check in dispatch_inference, so idempotent replays never re-classify.
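
The interaction can be reproduced with a small stand-in that uses no mocking library. `UrlOnlyRoute` and this `dispatch` function are hypothetical simplifications, not the respx API or the real `dispatch_inference`; they only model a route that counts every request to one URL and a dispatcher that checks the idempotency cache before classifying.

```python
from typing import Dict, Optional

ANTHROPIC_URL = "https://api.anthropic.com/v1/messages"


class UrlOnlyRoute:
    """Counts every request to one URL, regardless of body or model."""

    def __init__(self, url: str) -> None:
        self.url = url
        self.call_count = 0

    def request(self, url: str) -> None:
        if url == self.url:
            self.call_count += 1


def dispatch(
    route: UrlOnlyRoute,
    cache: Dict[str, str],
    idempotency_key: str,
    task_type: Optional[str] = None,
) -> str:
    # Idempotency check comes first: replays never classify or call out.
    if idempotency_key in cache:
        return cache[idempotency_key]
    if task_type is None:
        route.request(ANTHROPIC_URL)  # classifier call, same URL as provider
    route.request(ANTHROPIC_URL)  # provider call
    cache[idempotency_key] = "response"
    return cache[idempotency_key]


# Without task_type: classifier + provider both hit the mocked URL.
route_without = UrlOnlyRoute(ANTHROPIC_URL)
cache_without: Dict[str, str] = {}
dispatch(route_without, cache_without, "key-1")
dispatch(route_without, cache_without, "key-1")  # idempotent replay

# With task_type="chat": classifier skipped, only the provider call counts.
route_with = UrlOnlyRoute(ANTHROPIC_URL)
cache_with: Dict[str, str] = {}
dispatch(route_with, cache_with, "key-1", task_type="chat")
dispatch(route_with, cache_with, "key-1", task_type="chat")  # idempotent replay
```

In this model the first scenario ends with a count of 2 and the second with a count of 1, matching the inflated and corrected `call_count` values described in the commit message.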

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hizrianraz merged commit f405f10 into main on May 21, 2026
3 checks passed