fix(api): AIN-173 · soften code-capability inference + empty-route diagnostics #32
…e diagnostics
Per Manwe v0.14.0 production dogfood 2026-05-18 BUG 1. Hypothesis C
fix only — addresses over-aggressive inference where descriptive
system prompts triggered the `code` capability requirement that
filtered out all candidate models.
## Changes
### 1. Role-weighted code detection
`detect_capabilities` now extracts text into (system_text, other_text)
buckets and only triggers `code` when the syntax markers appear in
user/assistant content. System-only mentions of code keywords are
treated as descriptive context, not capability requests.
Rationale: SOUL system prompts (Manwe's 2KB pattern) often describe
"this agent can review code / write scripts / etc." with example
snippets. The user's actual turn might be "hi" — no need to filter
to code-capable models for a greeting.
`_CODE_SYNTAX_MARKERS` unchanged; only matching substrate moves
from "full_text" to "other_text".
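
A minimal sketch of the role-weighted check described above, under stated assumptions: the marker tuple below is a hypothetical stand-in (the real `_CODE_SYNTAX_MARKERS` contents are not shown in this PR), `split_text_by_role` is an illustrative helper name, and message `content` is assumed to be a plain string. It only illustrates the idea of matching against `other_text` instead of the full text; the actual `detect_capabilities` may differ.

```python
from typing import Iterable, Mapping

# Hypothetical stand-in: the real _CODE_SYNTAX_MARKERS is not shown in this PR.
_CODE_SYNTAX_MARKERS = ("def ", "import ", "#include", "function(", "=> {")


def split_text_by_role(messages: Iterable[Mapping[str, str]]) -> tuple[str, str]:
    """Return (system_text, other_text) for a chat-style message list.

    Assumes each message's "content" is a plain string.
    """
    system_parts: list[str] = []
    other_parts: list[str] = []
    for message in messages:
        bucket = system_parts if message.get("role") == "system" else other_parts
        bucket.append(message.get("content", ""))
    return "\n".join(system_parts), "\n".join(other_parts)


def detect_capabilities(messages: Iterable[Mapping[str, str]]) -> set[str]:
    """Infer required capabilities; `code` is matched on non-system text only."""
    _system_text, other_text = split_text_by_role(messages)
    capabilities = {"text"}
    # System-only mentions are treated as descriptive context, so the
    # markers are searched in user/assistant content only.
    if any(marker in other_text for marker in _CODE_SYNTAX_MARKERS):
        capabilities.add("code")
    return capabilities


soul = "This agent can write scripts, e.g. def add(a, b): return a + b"
greeting = [
    {"role": "system", "content": soul},
    {"role": "user", "content": "hi"},
]
code_request = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Fix this: def f(x): return x +"},
]
print(sorted(detect_capabilities(greeting)))      # ['text']
print(sorted(detect_capabilities(code_request)))  # ['code', 'text']
```

The greeting case mirrors the Manwe pattern: the system prompt contains a code snippet, but the user turn does not, so `code` is not required.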
### 2. Empty-route diagnostic logging
`auto_route` now emits a structured WARNING with per-filter
elimination counts when returning empty:
    auto_route_empty agent=<id> required_caps=[code, text]
      quality_floor=good per_call_cap=2.00 rows_inspected=12
      dropped_by={capability=8, quality=2, cost=2}
Manwe's bug surfaced as a generic 400 with no per-filter breakdown
— this fixes the visibility gap.
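
One way to produce per-filter elimination counts like the line above is to attribute each catalog row to the first filter that rejects it. The sketch below is illustrative only: `ModelRow`, `QUALITY_ORDER`, the logger name, and the filter order (capability, then quality, then cost) are assumptions, not the router's actual schema or implementation.

```python
import logging
from collections import Counter
from dataclasses import dataclass

logger = logging.getLogger("router")  # hypothetical logger name

# Hypothetical quality tiers; only "good" appears in the PR text.
QUALITY_ORDER = {"basic": 0, "good": 1, "best": 2}


@dataclass(frozen=True)
class ModelRow:
    """Hypothetical catalog row used only for this illustration."""
    name: str
    capabilities: frozenset
    quality: str
    est_cost_usd: float


def auto_route(agent_id, rows, required_caps, quality_floor, per_call_cap):
    """Return rows passing every filter; log a breakdown when none survive."""
    dropped = Counter()
    survivors = []
    for row in rows:
        # Each row is counted once, against the first filter it fails.
        if not required_caps <= row.capabilities:
            dropped["capability"] += 1
        elif QUALITY_ORDER[row.quality] < QUALITY_ORDER[quality_floor]:
            dropped["quality"] += 1
        elif row.est_cost_usd > per_call_cap:
            dropped["cost"] += 1
        else:
            survivors.append(row)
    if not survivors:
        logger.warning(
            "auto_route_empty agent=%s required_caps=[%s] quality_floor=%s "
            "per_call_cap=%.2f rows_inspected=%d dropped_by={%s}",
            agent_id,
            ", ".join(sorted(required_caps)),
            quality_floor,
            per_call_cap,
            len(rows),
            ", ".join(f"{name}={count}" for name, count in dropped.items()),
        )
    return survivors
```

With first-failure attribution the `dropped_by` counts sum to `rows_inspected`, which matches the example line (8 + 2 + 2 = 12). A row that fails several filters is only visible under the first one, so the breakdown depends on filter order.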
## Deferred with explicit notes
- Hypothesis A catalog audit (model capability flag verification
against provider docs) — founder lock required per Memory #5
- Hypothesis B cost-estimate recalibration — needs 30d rolling data
via AIN-154 Phase E
- Recommendation #4 downgrade header
(200 + x-ainfera-capability-downgraded): behavioral contract change
deserves wider review
## Tests added
4 regression cases in tests/unit/test_t9_auto_routing.py:
- code NOT inferred from system-only mention (Manwe pattern)
- code IS inferred when user turn has code (regression guard)
- code IS inferred when assistant turn has code (multi-turn)
- long_context still role-agnostic (scope guard)
Pre-commit hooks green.
## Refs
- AIN-173 (parent · BUG 1)
- AIN-154 (router hardening epic)
- AIN-178 (Tulkas activation — verify 10 capability-trigger prompts)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AIN-173 🔴 BUG 1: /v1/inference no_models_match_constraints over-rejects code-capable requests (hard-fails turn)
Severity: URGENT 🔴
Filed from Manwe (hermes-agent v0.14.0) production dogfood 2026-05-18. Customer #1 evidence: hard-fails agent turns, framework retries 3× then aborts with empty response. Opaque error from consumer perspective.
Symptom
{
"detail": {
"code": "no_models_match_constraints",
"message": "no active model satisfies the requested capabilities, quality_floor, and per_call_cap_usd",
"context": {
"capabilities_required": ["code", "text"],
"quality_floor": "good",
"per_call_cap_usd": "2"
}
}
}
Reproducer
# Yesterday's request worked
curl -X POST https://api.ainfera.ai/v1/inference \
-H "Authorization: Bearer $MANWE_KEY" \
-d '{
"model": "ainfera-auto",
"messages": [
{"role": "system", "content": "<512-byte SOUL without code keywords>"},
{"role": "user", "content": "hi"}
]
}'
# → 200 OK with claude-opus-4-7
# Today's request fails
curl -X POST https://api.ainfera.ai/v1/inference \
-H "Authorization: Bearer $MANWE_KEY" \
-d '{
"model": "ainfera-auto",
"messages": [
{"role": "system", "content": "<2KB SOUL mentioning coding/system admin/code execution>"},
{"role": "user", "content": "hi"}
]
}'
# → 400 no_models_match_constraints
Root cause hypothesis
Auto-router infers A. B. C. Routing layer's capability inference is too aggressive: promotes "coding" →
Cross-framework impact
Net: 3 of 6 fleet agents at production risk. Manwe confirmed broken. Varda + Aule pending validation.
Fix recommendation (pick one OR combine)
Recommended: Combined fix (defense in depth)
Acceptance gates
Connection to existing tickets
Workaround (Manwe already using)
Pin model via consumer env var:
Founder authorization
Per "Fix this error for and check from all frameworks. Tulkas need to start working now." (2026-05-18 PM)
Summary
`code` capability only inferred from user/assistant content, not system role. Addresses Manwe's 2026-05-18 over-rejection (2KB SOUL → 400 no_models_match_constraints even though user just said "hi").
Hypothesis C fix only (Hypothesis A/B deferred)
Per ticket §"Recommended: Combined fix (defense in depth)" — this PR ships #3 (soften capability inference) only.
Why not A (catalog audit): Per Memory #5 verify-the-plan, verifying `code` capability flags on top 5 frontier models against provider docs needs founder lock. SQL UPDATE without that lock is too risky. Filing as founder-action recommended for next session.
Why not B (cost recalibration): Needs 30-day rolling actual-cost data which lives in AIN-154 Phase E (ATS scoring) pipeline. Deferred to that phase.
Why not #4 (downgrade header): Changing 400 → 200 + `x-ainfera-capability-downgraded` is a behavioral contract change deserving wider review.
Tests added (4 regressions)
- `test_code_not_detected_when_only_in_system_prompt`: exact Manwe reproducer pattern
- `test_code_detected_when_user_turn_has_code_even_with_neutral_system`: legitimate code request guard
- `test_code_detected_when_assistant_turn_has_code`: multi-turn (only system is descriptive-only)
- `test_long_context_still_role_agnostic`: scope guard (fix is `code` only)
Test plan
Refs
🤖 Generated with Claude Code