feat(api): AIN-183 P0-2 · AAMC voter aa_index_source lock by hizrianraz · Pull Request #46 · ainfera-ai/api

hizrianraz · 2026-05-19T09:31:52Z

Summary

Two AAMC v1.0 voter slugs (`gemini-3-1-pro`, `mistral-large-3`) shipped with `aa_index_source='estimate_2026_q2'` while the other three (`claude-opus-4-7`, `gpt-5-5`, `grok-4`) shipped with `aamc_v1_lock`. Marketing /models advertises 'AAMC v1.0' across the whole pool, so two voters carrying a provisional source label = discipline #11 mismatch (spec / runtime / copy disagree).

This PR adds an alembic migration to flip the two stragglers, plus an invariant test that prevents regressions.

What this changes

`alembic/versions/20260519_0017_aamc_voter_source_lock.py`

Single UPDATE filtered on the current value (`aa_index_source='estimate_2026_q2'`) so re-applying is a no-op. Rowcount assertion accepts 0 (fresh DB / re-apply) or 2 (first prod apply) and raises on any other value — per Memory #20 (silent no-ops ship as green CI).

Auto-applies on next Railway deploy via the Dockerfile's `alembic upgrade head` (committed in 5ed0e2b).

`tests/integration/test_aamc_invariants.py`

New test `test_aamc_5_voters_use_v1_lock_source` sits next to the existing active-flag invariant. Any future migration that regresses a voter back onto a provisional source label fails CI with a discipline #11 explanation.

Test plan

CI green (ruff + mypy --strict + pytest passed locally pre-commit)
After Railway deploy, `curl -s https://api.ainfera.ai/v1/models | jq '.[] | select(.id | test("gemini-3-1-pro|mistral-large-3")) | .aa_index_source'` returns `["aamc_v1_lock", "aamc_v1_lock"]`
All 5 voters `{claude-opus-4-7, gpt-5-5, gemini-3-1-pro, grok-4, mistral-large-3}` show `aamc_v1_lock`
Migration is idempotent — second `alembic upgrade head` call (in case of redeploy) succeeds with 0 rows affected

Friction notes

Audit prompt called the column `aamc_index_source`. Real ORM column is `aa_index_source` (api/ainfera_api/orm.py:332). Migration uses the real name.
Prompt's original plan was `mcp__claude_ai_Supabase__apply_migration` against project `dftfpwzqxoebwzepygzl`. That project doesn't appear in our authenticated Supabase MCP scope; the one that does (`mezyeiwufubdojlxpqou`) has zero application tables. Memory `project_supabase_is_prod.md` updated to reflect that schema work ships as alembic in api/, not via Supabase MCP — regardless of where DATABASE_URL points (Railway-managed PG or a Supabase project we can't see).

Closes: AIN-183 P0-2 (AAMC source label drift)
Discipline: #1, #11, Memory #20

…-3-1-pro + mistral-large-3 Two of the five AAMC v1.0 voter slugs landed in prod with aa_index_source='estimate_2026_q2' because at the time the Artificial Analysis Intelligence Index didn't have a stable reading for either model. The other three (claude-opus-4-7, gpt-5-5, grok-4) shipped with 'aamc_v1_lock'. AA Index has since locked readings for both, and the marketing /models page calls out 'AAMC v1.0' across the whole 5-voter pool — leaving two on a provisional source label reads as a discipline #11 mismatch (spec / runtime / copy disagree). ## Migration `20260519_0017_aamc_voter_source_lock.py` runs a single UPDATE filtered on the current value, so re-applying is a no-op. The rowcount assertion (Memory #20: silent no-ops ship as green CI) accepts 0 (fresh DB / re-apply) or 2 (first prod apply) but raises on any other value. Downgrade reverses the flip — safe because no row-level dependency hangs on the source label. ## Invariant test `test_aamc_5_voters_use_v1_lock_source` lives next to the existing `test_aamc_5_canonical_voters_always_active` and asserts the same lock shape but for `aa_index_source`. Any future migration that regresses one of the five voters back onto a provisional source label fails CI here, with the discipline #11 reasoning surfaced in the assertion message. ## Friction The audit prompt called the column `aamc_index_source`. Actual ORM declaration is `aa_index_source` (api/ainfera_api/orm.py:332) and the public API response field is the same. Migration uses the real name. Closes: AIN-183 P0-2 (AAMC source label drift) Discipline: #1 (no-op no longer ships green), #11 (catalog labels align across DB / API / marketing), Memory #20 (rowcount assertion).

linear-code · 2026-05-19T09:31:56Z

AIN-183 🟠 All-repos audit sweep — 14 repos × spec-vs-built + Discipline #3/#4/#11/#17 verification

Severity: 🟠 HIGH — comprehensive repo-level Discipline #1 + #17 audit

Filed 2026-05-18 PM after AIN-153 + AIN-158 spec-vs-built audit revealed Aule shipping less than spec'd then marking parents Done. Pattern requires systematic verification across ALL 14 repos in the ainfera-ai GitHub org.

Founder directive: "Hard revert to In Progress and force the missing work. Also check all repos."

Scope

For each of the 14 repos in ainfera-ai org, Aule audits:

Spec vs built — does the repo's stated purpose match what's actually in main?
Tickets marked Done — for each Done ticket touching this repo in last 14d, verify the deliverables match the ticket's acceptance gates
Surfaces live in production — does the prod deployment match what main branch's app/ or src/ declares?
Founder PII grep — 0 matches for "Hizrian|Izzy|Raz|founder|fibromyalgia|ADHD|snowboard|Julius Baer" in any public surface or repo README/docs
Internal agent name grep — public-facing surfaces should NOT mention Manwe/Yavanna/Namo/Aule/Tulkas (Varda is acceptable per PUBLIC build(deps): Bump astral-sh/setup-uv from 5 to 7 #1 lock)
Lock compliance — D7-D37 locks honored in any new code
Discipline fix(orm): use postgresql.ARRAY so capabilities.contains() emits @> #11 — Notion canonical vs code drift

14 repos to audit

Production-facing (Phase 1, urgent)

#	Repo	Audit focus
1	`ainfera-ai/web`	Marketing + dashboard pages — match D14 + Part 2 spec
2	`ainfera-ai/api`	FastAPI surfaces, /v1/* endpoints match docs.ainfera.ai claims
3	`ainfera-ai/ainfera-os`	Monorepo root, docs/ folder, SKILL.md files, package versions
4	`ainfera-ai/mcp`	mcp.ainfera.ai FastMCP server, tool surface matches docs
5	`ainfera-ai/sdk` (or python-sdk if named differently)	PyPI `ainfera` package matches API surface, exported symbols

Agent-implementation (Phase 2)

#	Repo	Audit focus
6	`ainfera-ai/varda` (or similar)	NemoClaw orchestrator config, GPT-5.5 binding, public surface compliance
7	`ainfera-ai/aule`	Claude SDK + Opus 4.7 xhigh config, author override in commit history
8	`ainfera-ai/yavanna`	LangGraph + Sonnet 4.6 + Grok 4, response_review heuristics
9	`ainfera-ai/namo`	Letta + Gemini 3.1 Pro, memory consolidation
10	`ainfera-ai/tulkas`	Garak + Mistral Large 3, probe batteries

Customer + tooling (Phase 3)

#	Repo	Audit focus
11	`ainfera-ai/hermes-agent` (Manwe Customer #1 fork)	v0.14.0 SHA a91a57fa matches reported state
12	`ainfera-ai/examples`	5 example agent repos per AIN-78
13	`ainfera-ai/specs` (or similar)	CC-BY 4.0 spec docs match Notion canonical
14	`ainfera-ai/.github` or org-level	Issue templates, PR templates, CODEOWNERS, branch protection

Audit deliverable per repo

For each repo, Aule produces a comment on this ticket with:

## Repo: ainfera-ai/<name>

### Spec match
- Stated purpose: <from README>
- Actual state: <from main branch traversal>
- Drift: <none / specifics>

### Recent Done tickets touching this repo
- AIN-XYZ — claimed: "..." → actual state: ✅ matches / ⚠️ partial / ❌ missing

### Production-vs-main drift
- Last deploy SHA: <sha>
- main HEAD: <sha>
- Drift: <none / files differ / etc>

### Grep results
- Founder PII: <count> matches → <files>
- Internal agent names in public: <count> matches → <files>

### Lock compliance
- D7-D37 references: <count> 
- Discipline #6 corollary violations: <count>

### Recommendation
- ✅ Clean / ⚠️ Cleanup needed / 🔴 Active violation

### Tickets to file
- <list of child tickets needed if cleanup work surfaces>

Audit commands Aule runs per repo

cd ~/code/ainfera-ai/<repo>
git pull origin main

# Discipline #3 grep
rg -i "hizrian|izzy|raz|fibromyalgia|adhd|snowboard|julius baer|sommelier" \
   --type-not lock \
   --type-not log \
   -l

# Internal agent naming in public surfaces
rg -i "manwe|yavanna|namo|aule|tulkas" \
   src/ app/ public/ docs/ README.md \
   --type-not lock \
   -l

# Discipline #4 author override check (last 50 commits)
git log -50 --pretty=format:'%h %an <%ae>' | rg -v "Aule <aule@" | head -20

# Production-vs-main drift (for deployed repos)
gh api repos/ainfera-ai/<repo>/deployments --jq '.[0:3] | .[].sha'
# Compare against `git rev-parse main`

# Spec vs files
ls -la docs/
cat README.md | head -30

Acceptance gates

All 14 repos audited
Audit comment posted per repo on this ticket
Any 🔴 Active violation gets immediate child ticket filed
Any ⚠️ Cleanup ticket gets queued for AIN-179 delivery wave
Summary comment at end: total findings + per-severity counts + recommended next actions
Aule author override on the audit branch
No remediation in this ticket — only findings. Remediation = follow-up tickets.

Out of scope

Code refactoring (only audit-and-report)
Linter sweeps (separate cleanup pass)
Test coverage analysis (covered by AIN-118 or successor)
Performance audit (separate concern)

Connection

Triggered by: AIN-153 + AIN-158 spec-vs-built mismatch finding (2026-05-18 PM session 3.5 audit)
Authority: AIN-179 delivery wave + "Also check all repos" directive
Discipline references: build(deps): Bump astral-sh/setup-uv from 5 to 7 #1 (no Done without proof) + fix(orm): use postgresql.ARRAY so capabilities.contains() emits @> #11 (no drift) + feat(phase-6): PR-J6a · signing-material endpoint + JWS verify middleware #17 (verify before claim)
Pattern reference: This is Discipline feat(phase-6): PR-J6a · signing-material endpoint + JWS verify middleware #17 operationalized at repo level

Founder authorization

Per "Hard revert to In Progress and force the missing work. Also check all repos" (2026-05-18 session 3.5 PM).

Review in Linear

cursor · 2026-05-19T09:31:56Z

You have used all Bugbot PR reviews included in your free trial for your GitHub account on this workspace.

To continue using Bugbot reviews, enable Bugbot for your team in the Cursor dashboard.

…s not exist) CI integration failed with: asyncpg.exceptions.UndefinedColumnError: column "updated_at" of relation "models" does not exist The models table has only created_at, not updated_at. My original sketch carried updated_at over from a generic timestamp-touch pattern that doesn't apply here. Drop the clause from both upgrade() and downgrade() — the aa_index_source flip is the only data change. Locally: alembic upgrade head succeeds; the data-integrity assertion (0 or 2 rows affected) still guards against silent no-ops.

cursor · 2026-05-19T10:09:09Z

You have used all Bugbot PR reviews included in your free trial for your GitHub account on this workspace.

To continue using Bugbot reviews, enable Bugbot for your team in the Cursor dashboard.

…AAMC voters (#48) The invariant test test_aamc_5_voters_use_v1_lock_source landed in PR #46 (AAMC source-lock) and main's integration CI now fails because: 1. alembic upgrade head runs migration 20260516_0008 which inserts the T2/T3 catalog rows but NOT the 5 AAMC voters. 2. scripts.seed_dev then upserts the AAMC voter rows — without setting aa_index_source. Column lands NULL. 3. Migration 20260519_0017 (AAMC source-lock) only flips rows where aa_index_source='estimate_2026_q2'. NULL rows aren't touched. 4. Test asserts aa_index_source='aamc_v1_lock'. Fails for all 5 voters because they're NULL post-seed. Two-part fix: - PROVIDERS table declares aa_index_source='aamc_v1_lock' on each of the five AAMC voter model_specs. - _upsert_model reads model_spec.get('aa_index_source') and sets it on both insert and upsert. Idempotent — only writes when the seed declares a value, so re-running seed doesn't stomp the locked label with NULL on rows that came in from elsewhere. Net effect: CI's post-seed state for the 5 voters now matches prod's post-migration state (aa_index_source='aamc_v1_lock' on all of them). Test passes; the source-lock invariant is exercised every integration run. Discipline: #1 (CI state matches prod state), #11 (seed/migration/test all agree on the lock value). Co-authored-by: Aule <aule@ainfera-internal.local>

hizrianraz merged commit 93c48b2 into main May 19, 2026
2 of 3 checks passed

hizrianraz mentioned this pull request May 19, 2026

fix(api): AIN-183 P0-2 follow-up · seed_dev sets aa_index_source for AAMC voters #48

Merged

2 tasks

This was referenced May 23, 2026

chore(api): AIN-243 · purge sweep · retire AAMC vocab from code surface #69

Merged

[test] AIN-243 W6: grep-gate for retired ATS/AAMC/TrustScore terms #92

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(api): AIN-183 P0-2 · AAMC voter aa_index_source lock#46

feat(api): AIN-183 P0-2 · AAMC voter aa_index_source lock#46
hizrianraz merged 2 commits into
mainfrom
feat/ain-183-aamc-voter-source-lock

hizrianraz commented May 19, 2026

Uh oh!

linear-code Bot commented May 19, 2026 •

edited

Loading

Severity: 🟠 HIGH — comprehensive repo-level Discipline #1 + #17 audit

Scope

14 repos to audit

Production-facing (Phase 1, urgent)

Agent-implementation (Phase 2)

Customer + tooling (Phase 3)

Audit deliverable per repo

Audit commands Aule runs per repo

Acceptance gates

Out of scope

Connection

Founder authorization

Uh oh!

cursor Bot commented May 19, 2026

Uh oh!

cursor Bot commented May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hizrianraz commented May 19, 2026

Summary

What this changes

`alembic/versions/20260519_0017_aamc_voter_source_lock.py`

`tests/integration/test_aamc_invariants.py`

Test plan

Friction notes

Uh oh!

linear-code Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Severity: 🟠 HIGH — comprehensive repo-level Discipline #1 + #17 audit

Scope

14 repos to audit

Production-facing (Phase 1, urgent)

Agent-implementation (Phase 2)

Customer + tooling (Phase 3)

Audit deliverable per repo

Audit commands Aule runs per repo

Acceptance gates

Out of scope

Connection

Founder authorization

Uh oh!

cursor Bot commented May 19, 2026

Uh oh!

cursor Bot commented May 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

linear-code Bot commented May 19, 2026 •

edited

Loading