feat(api): AIN-183 P0-2 · AAMC voter aa_index_source lock #46
…-3-1-pro + mistral-large-3

Two of the five AAMC v1.0 voter slugs landed in prod with aa_index_source='estimate_2026_q2' because, at the time, the Artificial Analysis Intelligence Index didn't have a stable reading for either model. The other three (claude-opus-4-7, gpt-5-5, grok-4) shipped with 'aamc_v1_lock'. AA Index has since locked readings for both, and the marketing /models page calls out 'AAMC v1.0' across the whole 5-voter pool. Leaving two on a provisional source label reads as a discipline #11 mismatch (spec / runtime / copy disagree).

## Migration

`20260519_0017_aamc_voter_source_lock.py` runs a single UPDATE filtered on the current value, so re-applying is a no-op. The rowcount assertion (Memory #20: silent no-ops ship as green CI) accepts 0 (fresh DB / re-apply) or 2 (first prod apply) but raises on any other value. Downgrade reverses the flip, which is safe because no row-level dependency hangs on the source label.

## Invariant test

`test_aamc_5_voters_use_v1_lock_source` lives next to the existing `test_aamc_5_canonical_voters_always_active` and asserts the same lock shape but for `aa_index_source`. Any future migration that regresses one of the five voters back onto a provisional source label fails CI here, with the discipline #11 reasoning surfaced in the assertion message.

## Friction

The audit prompt called the column `aamc_index_source`. Actual ORM declaration is `aa_index_source` (api/ainfera_api/orm.py:332) and the public API response field is the same. Migration uses the real name.

Closes: AIN-183 P0-2 (AAMC source label drift)
Discipline: #1 (no-op no longer ships green), #11 (catalog labels align across DB / API / marketing), Memory #20 (rowcount assertion).
AIN-183 🟠 All-repos audit sweep — 14 repos × spec-vs-built + Discipline #3/#4/#11/#17 verification
Severity: 🟠 HIGH, comprehensive repo-level Discipline #1 + #17 audit
Filed 2026-05-18 PM after AIN-153 + AIN-158 spec-vs-built audit revealed Aule shipping less than spec'd then marking parents Done. Pattern requires systematic verification across ALL 14 repos in the Founder directive: "Hard revert to In Progress and force the missing work. Also check all repos."
Scope
For each of the 14 repos in
14 repos to audit
Production-facing (Phase 1, urgent)
Agent-implementation (Phase 2)
Customer + tooling (Phase 3)
Audit deliverable per repo
For each repo, Aule produces a comment on this ticket with:
## Repo: ainfera-ai/<name>
### Spec match
- Stated purpose: <from README>
- Actual state: <from main branch traversal>
- Drift: <none / specifics>
### Recent Done tickets touching this repo
- AIN-XYZ — claimed: "..." → actual state: ✅ matches / ⚠️ partial / ❌ missing
### Production-vs-main drift
- Last deploy SHA: <sha>
- main HEAD: <sha>
- Drift: <none / files differ / etc>
### Grep results
- Founder PII: <count> matches → <files>
- Internal agent names in public: <count> matches → <files>
### Lock compliance
- D7-D37 references: <count>
- Discipline #6 corollary violations: <count>
### Recommendation
- ✅ Clean / ⚠️ Cleanup needed / 🔴 Active violation
### Tickets to file
- <list of child tickets needed if cleanup work surfaces>
Audit commands Aule runs per repo
cd ~/code/ainfera-ai/<repo>
git pull origin main
# Discipline #3 grep
rg -i "hizrian|izzy|raz|fibromyalgia|adhd|snowboard|julius baer|sommelier" \
--type-not lock \
--type-not log \
-l
# Internal agent naming in public surfaces
rg -i "manwe|yavanna|namo|aule|tulkas" \
src/ app/ public/ docs/ README.md \
--type-not lock \
-l
# Discipline #4 author override check (last 50 commits)
git log -50 --pretty=format:'%h %an <%ae>' | rg -v "Aule <aule@" | head -20
# Production-vs-main drift (for deployed repos)
gh api repos/ainfera-ai/<repo>/deployments --jq '.[0:3] | .[].sha'
# Compare against `git rev-parse main`
# Spec vs files
ls -la docs/
cat README.md | head -30
Acceptance gates
Out of scope
Connection
Founder authorization
Per "Hard revert to In Progress and force the missing work. Also check all repos" (2026-05-18 session 3.5 PM).
…s not exist)

CI integration failed with:

asyncpg.exceptions.UndefinedColumnError: column "updated_at" of relation "models" does not exist

The models table has only created_at, not updated_at. My original sketch carried updated_at over from a generic timestamp-touch pattern that doesn't apply here. Drop the clause from both upgrade() and downgrade(); the aa_index_source flip is the only data change.

Locally: alembic upgrade head succeeds; the data-integrity assertion (0 or 2 rows affected) still guards against silent no-ops.
…AAMC voters (#48)

The invariant test test_aamc_5_voters_use_v1_lock_source landed in PR #46 (AAMC source-lock) and main's integration CI now fails because:

1. alembic upgrade head runs migration 20260516_0008, which inserts the T2/T3 catalog rows but NOT the 5 AAMC voters.
2. scripts.seed_dev then upserts the AAMC voter rows without setting aa_index_source. The column lands NULL.
3. Migration 20260519_0017 (AAMC source-lock) only flips rows where aa_index_source='estimate_2026_q2'. NULL rows aren't touched.
4. The test asserts aa_index_source='aamc_v1_lock'. It fails for all 5 voters because they're NULL post-seed.

Two-part fix:

- PROVIDERS table declares aa_index_source='aamc_v1_lock' on each of the five AAMC voter model_specs.
- _upsert_model reads model_spec.get('aa_index_source') and sets it on both insert and update. This is idempotent: it only writes when the seed declares a value, so re-running seed doesn't stomp the locked label with NULL on rows that came in from elsewhere.

Net effect: CI's post-seed state for the 5 voters now matches prod's post-migration state (aa_index_source='aamc_v1_lock' on all of them). Test passes; the source-lock invariant is exercised every integration run.

Discipline: #1 (CI state matches prod state), #11 (seed/migration/test all agree on the lock value).

Co-authored-by: Aule <aule@ainfera-internal.local>
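The "only write when the seed declares a value" rule from the commit above can be illustrated with a minimal sketch. It is not the repo's `_upsert_model`: the function name `upsert_model`, the in-memory `table` dict standing in for the `models` table, and the `slug` and `active` fields are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for the models table: slug -> row dict.
Table = Dict[str, Dict[str, Any]]


def upsert_model(table: Table, model_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Insert or update one row, leaving aa_index_source alone unless declared."""
    slug = model_spec["slug"]
    # Insert path: a new row starts with a NULL-like source label.
    row = table.setdefault(slug, {"slug": slug, "aa_index_source": None})
    # Ordinary fields are always overwritten by the seed (hypothetical field).
    row["active"] = model_spec.get("active", True)
    # Source label: only written when the seed spec declares one, so a spec
    # without the key cannot overwrite an existing label with None.
    source: Optional[str] = model_spec.get("aa_index_source")
    if source is not None:
        row["aa_index_source"] = source
    return row
```

Under these assumptions, seeding `{"slug": "grok-4", "aa_index_source": "aamc_v1_lock"}` and then re-seeding `{"slug": "grok-4"}` leaves the label at `aamc_v1_lock`, while a brand-new row seeded without the key keeps a `None` label.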
Summary
Two AAMC v1.0 voter slugs (`gemini-3-1-pro`, `mistral-large-3`) shipped with `aa_index_source='estimate_2026_q2'` while the other three (`claude-opus-4-7`, `gpt-5-5`, `grok-4`) shipped with `aamc_v1_lock`. Marketing /models advertises 'AAMC v1.0' across the whole pool, so two voters carrying a provisional source label is a discipline #11 mismatch (spec / runtime / copy disagree).
This PR adds an alembic migration to flip the two stragglers, plus an invariant test that prevents regressions.
What this changes
`alembic/versions/20260519_0017_aamc_voter_source_lock.py`
Single UPDATE filtered on the current value (`aa_index_source='estimate_2026_q2'`) so re-applying is a no-op. Rowcount assertion accepts 0 (fresh DB / re-apply) or 2 (first prod apply) and raises on any other value — per Memory #20 (silent no-ops ship as green CI).
Auto-applies on next Railway deploy via the Dockerfile's `alembic upgrade head` (committed in 5ed0e2b).
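A rough sketch of the filtered UPDATE plus rowcount guard described above. The real migration runs under alembic against Postgres; this version uses the standard-library `sqlite3` module so it can run on its own. The function names `lock_voter_sources` and `demo`, and the `slug` column, are assumptions for illustration only.

```python
import sqlite3

PROVISIONAL = "estimate_2026_q2"
LOCKED = "aamc_v1_lock"
# 0 rows: fresh DB or re-apply. 2 rows: first prod apply. Anything else is unexpected.
ALLOWED_ROWCOUNTS = {0, 2}


def lock_voter_sources(conn: sqlite3.Connection) -> int:
    """Flip provisional source labels to the locked label; return rows changed."""
    cur = conn.execute(
        "UPDATE models SET aa_index_source = ? WHERE aa_index_source = ?",
        (LOCKED, PROVISIONAL),
    )
    if cur.rowcount not in ALLOWED_ROWCOUNTS:
        # Fail loudly instead of letting an unexpected row count pass as green.
        raise RuntimeError(f"expected 0 or 2 rows affected, got {cur.rowcount}")
    return cur.rowcount


def demo():
    """Apply the flip twice to an in-memory table shaped like the five voters."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE models (slug TEXT PRIMARY KEY, aa_index_source TEXT)")
    conn.executemany(
        "INSERT INTO models VALUES (?, ?)",
        [
            ("claude-opus-4-7", LOCKED),
            ("gpt-5-5", LOCKED),
            ("grok-4", LOCKED),
            ("gemini-3-1-pro", PROVISIONAL),
            ("mistral-large-3", PROVISIONAL),
        ],
    )
    first = lock_voter_sources(conn)   # flips the two provisional rows
    second = lock_voter_sources(conn)  # nothing left to flip
    return first, second
```

Because the WHERE clause matches only the provisional value, the second call in `demo()` finds no rows and the guard accepts it as a re-apply.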
`tests/integration/test_aamc_invariants.py`
New test `test_aamc_5_voters_use_v1_lock_source` sits next to the existing active-flag invariant. Any future migration that regresses a voter back onto a provisional source label fails CI with a discipline #11 explanation.
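The shape of such an invariant can be sketched as follows. The real test is an integration test against the database; this sketch checks a plain slug-to-label mapping instead, and the helper names `unlocked_voters` and `assert_voters_locked` are hypothetical.

```python
from typing import Dict, Optional

# The five AAMC v1.0 voter slugs named in this PR.
AAMC_VOTERS = (
    "claude-opus-4-7",
    "gpt-5-5",
    "grok-4",
    "gemini-3-1-pro",
    "mistral-large-3",
)
LOCKED = "aamc_v1_lock"


def unlocked_voters(sources: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Return every voter whose label is missing, NULL, or not the locked value."""
    return {
        slug: sources.get(slug)
        for slug in AAMC_VOTERS
        if sources.get(slug) != LOCKED
    }


def assert_voters_locked(sources: Dict[str, Optional[str]]) -> None:
    """Fail with the offending slugs so the mismatch is visible in CI output."""
    offenders = unlocked_voters(sources)
    assert not offenders, (
        "discipline #11: AAMC voters must carry aa_index_source="
        f"{LOCKED!r}; offenders: {offenders}"
    )
```

Treating a missing or `None` label as a failure matters here: it is the case a check for only the provisional value would miss.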
Test plan
Friction notes
Closes: AIN-183 P0-2 (AAMC source label drift)
Discipline: #1, #11, Memory #20