docs(adr): ADR-012/013/014 — AA integration roadmap #4

Merged
hizrianraz merged 1 commit into main from docs/d6-deferred-adrs on May 15, 2026

@hizrianraz hizrianraz commented May 15, 2026

Three ADRs documenting deferred AA-integration work for Sprint v1.7. No code changes today.

  • ADR-012: AgentCard qualityFloor field
  • ADR-013: Audit receipt AA enrichment
  • ADR-014: Live Artificial Analysis API integration

The prompt suggested numbering 011/012/013, but ADR-011 was already taken (user-handle-tenant-model). Continuing the sequence at 012-014.

Pairs with the D6 web PR at ainfera-ai/web.


Note

Low Risk
Low risk because this PR only adds documentation (new ADRs) and does not change runtime code, APIs, or data handling.

Overview
Adds three proposed ADRs (012–014) documenting the planned Artificial Analysis (AA) integration work for Sprint v1.7: introducing an AgentCard qualityFloor, enriching audit receipts with an AA quality_snapshot, and integrating a live AA API feed (with caching and fallback). No implementation changes are included—these are decision records outlining future spec/SDK/API/verify updates and open questions.
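
As a rough illustration of the "caching and fallback" idea for the live AA feed, a minimal sketch might look like the following. The class name `CachedIndexFeed`, the `fetch_live` callable, the one-hour TTL, and the contents of `STATIC_SNAPSHOT` are all assumptions made for this example; the ADRs themselves are not shown in this PR and may specify something different.

```python
import time
from typing import Callable, Dict, Optional

# Placeholder snapshot; the values are illustrative, not real AA readings.
STATIC_SNAPSHOT: Dict[str, float] = {"example-model": 50.0}


class CachedIndexFeed:
    """Serve a cached live reading, falling back to a static snapshot."""

    def __init__(
        self,
        fetch_live: Callable[[], Dict[str, float]],  # hypothetical live-API client call
        ttl_seconds: float = 3600.0,                 # assumed cache lifetime
        fallback: Dict[str, float] = STATIC_SNAPSHOT,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_live = fetch_live
        self._ttl = ttl_seconds
        self._fallback = fallback
        self._clock = clock
        self._cached: Optional[Dict[str, float]] = None
        self._fetched_at = 0.0

    def get(self) -> Dict[str, float]:
        now = self._clock()
        if self._cached is not None and (now - self._fetched_at) < self._ttl:
            return self._cached  # cache is still fresh
        try:
            self._cached = self._fetch_live()
            self._fetched_at = now
        except Exception:
            # Live feed unavailable: keep serving stale data if we have any,
            # otherwise fall back to the static snapshot.
            if self._cached is None:
                return self._fallback
        return self._cached
```

Serving stale data ahead of the static snapshot is one possible policy; preferring the snapshot once the cache expires would be equally easy to express.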

Reviewed by Cursor Bugbot for commit a654181. Bugbot is set up for automated code reviews on this repo.

Three ADRs documenting AA-integration work deferred from D6 to Sprint v1.7.
No code changes today.

- ADR-012: AgentCard qualityFloor field — agents can declare an AA
  Intelligence Index minimum; L2 router enforces pre-dispatch.
- ADR-013: Audit receipt AA enrichment — embed AA Intelligence and
  blended-cost snapshots in every AuditEvent.
- ADR-014: Live AA API integration — replace static May-12 snapshot
  with cached live readings; gated on partnership outreach post-D8.
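
The pre-dispatch check described for ADR-012 could be sketched roughly as below. This assumes the floor is a numeric AA Intelligence Index value; the `AgentCard` and `ModelRow` shapes, the `quality_floor` and `aa_intelligence` field names, and the `enforce_quality_floor` helper are hypothetical and are not taken from the ADR text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AgentCard:
    agent_id: str
    # Hypothetical field: minimum AA Intelligence Index the agent accepts.
    quality_floor: Optional[float] = None


@dataclass(frozen=True)
class ModelRow:
    name: str
    # Hypothetical field: AA Intelligence Index taken from the snapshot.
    aa_intelligence: float


def enforce_quality_floor(card: AgentCard, candidates: List[ModelRow]) -> List[ModelRow]:
    """Drop candidates below the agent's declared floor before dispatch."""
    if card.quality_floor is None:
        return list(candidates)  # no floor declared: nothing to enforce
    return [m for m in candidates if m.aa_intelligence >= card.quality_floor]
```

Under these assumptions, a card with `quality_floor=50.0` would keep a model scored 62.0 and drop one scored 35.0, while a card with no floor leaves the candidate list untouched.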

Note: prompt suggested ADR-011/012/013 numbering; ADR-011 was already
taken (user-handle-tenant-model). Continuing the sequence at 012-014.

Co-Authored-By: Claude <noreply@anthropic.com>
@hizrianraz hizrianraz merged commit b314a9e into main May 15, 2026
4 checks passed
@hizrianraz hizrianraz deleted the docs/d6-deferred-adrs branch May 15, 2026 15:26
hizrianraz added a commit that referenced this pull request May 18, 2026
…e diagnostics (#32)

Per Manwe v0.14.0 production dogfood 2026-05-18 BUG 1. Hypothesis C
fix only — addresses over-aggressive inference where descriptive
system prompts triggered the `code` capability requirement that
filtered out all candidate models.

## Changes

### 1. Role-weighted code detection

`detect_capabilities` now extracts text into (system_text, other_text)
buckets and only triggers `code` when the syntax markers appear in
user/assistant content. System-only mentions of code keywords are
treated as descriptive context, not capability requests.

Rationale: SOUL system prompts (Manwe's 2KB pattern) often describe
"this agent can review code / write scripts / etc." with example
snippets. The user's actual turn might be "hi" — no need to filter
to code-capable models for a greeting.

`_CODE_SYNTAX_MARKERS` is unchanged; only the text being matched
moves from `full_text` to `other_text`.
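
A minimal sketch of the role-weighted split, assuming OpenAI-style message dicts with string `content`. The marker tuple, the `split_by_role` helper, and the long-context threshold are placeholders for illustration; the real `_CODE_SYNTAX_MARKERS` and the actual `detect_capabilities` signature are not shown in this commit message.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical marker list; the real _CODE_SYNTAX_MARKERS is not shown here.
_CODE_SYNTAX_MARKERS = ("def ", "import ", "#include", "=> {")

# Hypothetical threshold for the role-agnostic long_context check.
_LONG_CONTEXT_CHARS = 100_000


def split_by_role(messages: List[Dict[str, str]]) -> Tuple[str, str]:
    """Return (system_text, other_text) from a list of chat messages."""
    system_parts, other_parts = [], []
    for msg in messages:
        content = msg.get("content") or ""
        if msg.get("role") == "system":
            system_parts.append(content)
        else:
            other_parts.append(content)
    return "\n".join(system_parts), "\n".join(other_parts)


def detect_capabilities(messages: List[Dict[str, str]]) -> Set[str]:
    """Infer required capabilities; `code` is triggered by non-system turns only."""
    system_text, other_text = split_by_role(messages)
    caps = {"text"}
    # Code markers in the system prompt are treated as descriptive context.
    if any(marker in other_text for marker in _CODE_SYNTAX_MARKERS):
        caps.add("code")
    # Length is measured over all roles, so long_context stays role-agnostic.
    if len(system_text) + len(other_text) > _LONG_CONTEXT_CHARS:
        caps.add("long_context")
    return caps
```

With this sketch, a system prompt containing `import os` followed by a user turn of "hi" yields only `text`, while the same marker in a user or assistant turn adds `code`.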

### 2. Empty-route diagnostic logging

`auto_route` now emits a structured WARNING with per-filter
elimination counts when returning empty:

    auto_route_empty agent=<id> required_caps=[code, text]
    quality_floor=good per_call_cap=2.00 rows_inspected=12
    dropped_by={capability=8, quality=2, cost=2}

Manwe's bug surfaced as a generic 400 with no per-filter breakdown
— this fixes the visibility gap.
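
One way such a per-filter breakdown might be produced is sketched below. It assumes each row is charged to the first filter that rejects it, which would be consistent with the sample above where the three counts sum to `rows_inspected`. The row keys (`caps`, `quality`, `est_cost`), the tier ranking, the logger name, and the `filter_rows` helper are hypothetical.

```python
import logging

logger = logging.getLogger("auto_route")  # hypothetical logger name

# Hypothetical ordering of quality tiers; only "good" appears in the log sample.
_TIER_RANK = {"basic": 0, "good": 1, "best": 2}


def filter_rows(rows, required_caps, quality_floor, per_call_cap):
    """Return (survivors, dropped_by), charging each row to its first failing filter."""
    dropped_by = {"capability": 0, "quality": 0, "cost": 0}
    survivors = []
    for row in rows:
        if not set(required_caps) <= set(row["caps"]):
            dropped_by["capability"] += 1
        elif _TIER_RANK[row["quality"]] < _TIER_RANK[quality_floor]:
            dropped_by["quality"] += 1
        elif row["est_cost"] > per_call_cap:
            dropped_by["cost"] += 1
        else:
            survivors.append(row)
    return survivors, dropped_by


def auto_route(agent_id, rows, required_caps, quality_floor, per_call_cap):
    """Route as usual, but emit a structured WARNING when nothing survives."""
    survivors, dropped_by = filter_rows(rows, required_caps, quality_floor, per_call_cap)
    if not survivors:
        breakdown = ", ".join(f"{name}={count}" for name, count in dropped_by.items())
        logger.warning(
            "auto_route_empty agent=%s required_caps=[%s] quality_floor=%s "
            "per_call_cap=%.2f rows_inspected=%d dropped_by={%s}",
            agent_id, ", ".join(sorted(required_caps)), quality_floor,
            per_call_cap, len(rows), breakdown,
        )
    return survivors
```

Keeping the counting in a separate pure function makes the breakdown straightforward to assert on in unit tests without capturing log output.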

## Deferred with explicit notes

- Hypothesis A catalog audit (model capability flag verification
  against provider docs) — founder lock required per Memory #5
- Hypothesis B cost-estimate recalibration — needs 30d rolling data
  via AIN-154 Phase E
- Recommendation #4 downgrade header (200 +
  x-ainfera-capability-downgraded); behavioral contract change
  deserves wider review

## Tests added

4 regression cases in tests/unit/test_t9_auto_routing.py:
- code NOT inferred from system-only mention (Manwe pattern)
- code IS inferred when user turn has code (regression guard)
- code IS inferred when assistant turn has code (multi-turn)
- long_context still role-agnostic (scope guard)

Pre-commit hooks green.

## Refs

- AIN-173 (parent · BUG 1)
- AIN-154 (router hardening epic)
- AIN-178 (Tulkas activation — verify 10 capability-trigger prompts)

Co-authored-by: Aule <aule@ainfera-internal.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>