docs(adr): ADR-012/013/014 - AA integration roadmap #4
Merged
Three ADRs documenting AA-integration work deferred from D6 to Sprint v1.7. No code changes today.

- ADR-012: AgentCard `qualityFloor` field. Agents can declare an AA Intelligence Index minimum; the L2 router enforces it pre-dispatch.
- ADR-013: Audit receipt AA enrichment. Embed AA Intelligence and blended-cost snapshots in every AuditEvent.
- ADR-014: Live AA API integration. Replace the static May-12 snapshot with cached live readings; gated on partnership outreach post-D8.

Note: the prompt suggested ADR-011/012/013 numbering; ADR-011 was already taken (user-handle-tenant-model). Continuing the sequence at 012-014.

Co-Authored-By: Claude <noreply@anthropic.com>
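To make the ADR-012 idea concrete (a declared minimum enforced before dispatch), here is a minimal sketch. Every name in it (`AgentCard`, `Model`, `intelligence_index`, `eligible_models`) and the sample scores used with it are hypothetical illustrations, not the schema or router code from the ADR itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    intelligence_index: float  # hypothetical stand-in for an AA Intelligence Index score


@dataclass(frozen=True)
class AgentCard:
    agent_id: str
    quality_floor: Optional[float] = None  # None means the agent declared no floor


def eligible_models(card: AgentCard, candidates: list[Model]) -> list[Model]:
    """Drop candidates below the agent's declared floor before dispatch."""
    if card.quality_floor is None:
        return list(candidates)
    return [m for m in candidates if m.intelligence_index >= card.quality_floor]
```

Treating a missing floor as "no filtering" keeps existing agent cards valid; whether the actual ADR makes the field optional is an assumption here.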
hizrianraz added a commit that referenced this pull request May 18, 2026
…e diagnostics (#32)

Per Manwe v0.14.0 production dogfood 2026-05-18 BUG 1. Hypothesis C fix only: addresses over-aggressive inference where descriptive system prompts triggered the `code` capability requirement that filtered out all candidate models.

## Changes

### 1. Role-weighted code detection

`detect_capabilities` now extracts text into (system_text, other_text) buckets and only triggers `code` when the syntax markers appear in user/assistant content. System-only mentions of code keywords are treated as descriptive context, not capability requests.

Rationale: SOUL system prompts (Manwe's 2KB pattern) often describe "this agent can review code / write scripts / etc." with example snippets. The user's actual turn might be "hi": no need to filter to code-capable models for a greeting.

`_CODE_SYNTAX_MARKERS` unchanged; only matching substrate moves from "full_text" to "other_text".

### 2. Empty-route diagnostic logging

`auto_route` now emits a structured WARNING with per-filter elimination counts when returning empty:

    auto_route_empty agent=<id> required_caps=[code, text] quality_floor=good per_call_cap=2.00 rows_inspected=12 dropped_by={capability=8, quality=2, cost=2}

Manwe's bug surfaced as a generic 400 with no per-filter breakdown; this fixes the visibility gap.

## Deferred with explicit notes

- Hypothesis A catalog audit (model capability flag verification against provider docs): founder lock required per Memory #5
- Hypothesis B cost-estimate recalibration: needs 30d rolling data via AIN-154 Phase E
- Recommendation #4 downgrade header (200 + x-ainfera-capability-downgraded): behavioral contract change deserves wider review

## Tests added

4 regression cases in tests/unit/test_t9_auto_routing.py:

- code NOT inferred from system-only mention (Manwe pattern)
- code IS inferred when user turn has code (regression guard)
- code IS inferred when assistant turn has code (multi-turn)
- long_context still role-agnostic (scope guard)

Pre-commit hooks green.

## Refs

- AIN-173 (parent · BUG 1)
- AIN-154 (router hardening epic)
- AIN-178 (Tulkas activation: verify 10 capability-trigger prompts)

Co-authored-by: Aule <aule@ainfera-internal.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
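The two changes described in the commit above can be sketched roughly as follows. This is not the repository's code: the marker tuple, the message and row shapes, and the `split_by_role` helper are assumptions made for illustration, and the sketch covers only the capability and cost filters (the quality filter is omitted for brevity).

```python
import logging
from collections import Counter

logger = logging.getLogger("router")

# Illustrative markers only; the real _CODE_SYNTAX_MARKERS set is not shown in the commit.
CODE_SYNTAX_MARKERS = ("```", "def ", "import ", "function ")


def split_by_role(messages):
    """Return (system_text, other_text) from a list of {'role', 'content'} dicts."""
    system_parts, other_parts = [], []
    for msg in messages:
        bucket = system_parts if msg["role"] == "system" else other_parts
        bucket.append(msg["content"])
    return "\n".join(system_parts), "\n".join(other_parts)


def detect_capabilities(messages):
    """Infer required capabilities; `code` is triggered only by non-system text."""
    _system_text, other_text = split_by_role(messages)
    caps = {"text"}
    if any(marker in other_text for marker in CODE_SYNTAX_MARKERS):
        caps.add("code")
    return caps


def auto_route(agent_id, required_caps, rows, per_call_cap):
    """Filter candidate rows, counting how many each filter eliminates."""
    dropped = Counter()
    survivors = []
    for row in rows:
        if not required_caps <= row["caps"]:
            dropped["capability"] += 1
        elif row["est_cost"] > per_call_cap:
            dropped["cost"] += 1
        else:
            survivors.append(row)
    if not survivors:
        # Emit the per-filter breakdown so an empty route is diagnosable.
        logger.warning(
            "auto_route_empty agent=%s required_caps=%s per_call_cap=%.2f "
            "rows_inspected=%d dropped_by=%s",
            agent_id, sorted(required_caps), per_call_cap, len(rows), dict(dropped),
        )
    return survivors, dropped
```

With a system prompt that contains a code snippet and a user turn of "hi", `detect_capabilities` in this sketch returns only `{"text"}`, which is the behavior the first regression case guards.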
This was referenced May 19, 2026
Three ADRs documenting deferred AA-integration work for Sprint v1.7. No code changes today.
The prompt suggested numbering 011/012/013, but ADR-011 was already taken (user-handle-tenant-model). Continuing the sequence at 012-014.
Pairs with the D6 web PR at ainfera-ai/web.
Note
Low Risk
Low risk because this PR only adds documentation (new ADRs) and does not change runtime code, APIs, or data handling.
Overview
Adds three proposed ADRs (012–014) documenting the planned Artificial Analysis (AA) integration work for Sprint v1.7: introducing an AgentCard `qualityFloor`, enriching audit receipts with an AA `quality_snapshot`, and integrating a live AA API feed (with caching and fallback). No implementation changes are included; these are decision records outlining future spec/SDK/API/verify updates and open questions.

Reviewed by Cursor Bugbot for commit a654181.
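The "caching and fallback" behavior mentioned for the live feed could look something like the sketch below. It is only an illustration under stated assumptions: `CachedReading`, its parameters, the TTL value, and the choice to fall back first to the last good reading and then to a static snapshot are all hypothetical, and no real AA API call is shown.

```python
import time
from typing import Callable, Optional


class CachedReading:
    """Serve a cached value, refresh it after a TTL, and fall back when the fetch fails."""

    def __init__(
        self,
        fetch: Callable[[], float],          # hypothetical live-feed call
        static_snapshot: float,              # hypothetical static fallback value
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._fallback = static_snapshot
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[float] = None
        self._fetched_at = 0.0

    def get(self) -> float:
        now = self._clock()
        if self._value is not None and (now - self._fetched_at) < self._ttl:
            return self._value  # cache is still fresh
        try:
            self._value = self._fetch()
            self._fetched_at = now
            return self._value
        except Exception:
            # Live feed unavailable: prefer the last good reading, else the static snapshot.
            return self._value if self._value is not None else self._fallback
```

Injecting the clock keeps expiry testable without sleeping; a production version would likely also want to log fetch failures and bound how stale a reading may become.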