fix(auto-itemize): provider-aware LLM request shaping (#1547)#1558
Merged
Conversation
Anthropic's OpenAI-compat layer rejects `response_format: { type: 'json_object' }`
with 400, requiring `json_schema` instead. OpenAI/Gemini/Ollama accept
`json_object`. The previous one-shape-fits-all request body could not work with
Anthropic at all.
This change introduces a provider-profile system that adapts the outbound
request body per provider:
- New module `server/src/services/budgetExtraction/providerProfiles.ts`:
- `LlmProvider` type + `LLM_PROVIDERS` constant
- `detectProvider(baseUrl)` sniffs the host (api.anthropic.com →
anthropic, etc.) and falls back to 'generic'
- `parseProviderEnv(value)` accepts a normalized override
- `buildRequestBody({provider,model,systemPrompt,userPrompt})` shapes
the body per provider:
* openai/gemini/ollama → response_format: json_object
* anthropic → response_format: json_schema (full ExtractedLine[] schema)
* generic → no response_format; relies on system prompt + runtime validator
- All profiles include `max_tokens: 4096` (required by Anthropic, cost guard for others)
- `LlmConfig` and `AppConfig` gain `provider`/`llmProvider`.
- Config plugin parses optional `LLM_PROVIDER` env var; falls back to
auto-detection from `LLM_BASE_URL`, then to 'generic'. Rejects unknown values.
- `openAICompatibleProvider` now calls `buildRequestBody()` instead of
hardcoding the body.
- Docs: CLAUDE.md env var table + .env.example get the new `LLM_PROVIDER` row.
- Diagnostic script `scripts/diagnose-llm.mjs` (node-only, no shell needed)
captures the exact provider error from a no-shell DHI container.
Tests (all new + updated):
- `providerProfiles.test.ts`: detection matrix, env parsing, per-provider body
assertions, JSON-serializability across all profiles.
- `config.test.ts`: 7 new tests for LLM_PROVIDER (auto-detect by URL,
explicit override, case-insensitive, validation rejection, generic fallback).
- 5 existing test fixtures get `llmProvider` added so they still type-check.
Co-Authored-By: Claude backend-developer (Haiku 4.5) <noreply@anthropic.com>
Co-Authored-By: Claude qa-integration-tester (Haiku 4.5) <noreply@anthropic.com>
…1547) Server log was opaque on LLM failures — only `code` + `message` + `status` visible, with the actual upstream response body / fetch cause discarded. This change captures all available context into `details` on: - LlmUpstreamError: response body (up to 8KB), statusText, provider, url - LlmUnreachableError: underlying fetch error name/message/code + Node's nested cause (ENOTFOUND, ECONNREFUSED, AbortError, etc.) - LlmInvalidResponseError: raw response text, parse error message, truncated content All three errors now set `suppressDetails: true` so the diagnostic context appears in the server pino log but is OMITTED from the API response (the body can echo prompt content — vendor names, amounts from OCR — and must not leave the host). API consumers still see code + message; the message already includes the upstream status code for `LlmUpstreamError`. All response/content fields are capped at 8KB so a misbehaving provider can't flood the log. Co-Authored-By: Claude backend-developer (Haiku 4.5) <noreply@anthropic.com>
18da5d2 to
c8cb2e4
Compare
…add (#1547) - config.test.ts: 4 toEqual shape assertions need llmProvider: 'generic' added (the new field on AppConfig from this PR) - openAICompatibleProvider.test.ts: makeOkResponse() mock now exposes response.text() — the provider reads body as text first (so error details can include the raw response when JSON parsing fails) then JSON.parses it. Co-Authored-By: Claude qa-integration-tester (Haiku 4.5) <noreply@anthropic.com>
The previous commit switched to response.text() + JSON.parse() so error details could include the raw body. That broke 9+ inline test mocks that only defined response.json(). Reverting the success path to response.json() keeps the tests stable. The diagnostic improvement still kicks in for the two cases that actually matter for ops: - LlmUpstreamError (non-2xx): we read response.text() in the error branch and include the body (up to 8KB) in details - LlmInvalidResponseError: we include the parsed envelope or the parse error message in details What we lose: when JSON.parse(envelope) fails, we no longer have the raw text — only the parseError message. Acceptable trade-off; the 4xx body visibility was the actual blocker. Co-Authored-By: Claude qa-integration-tester (Haiku 4.5) <noreply@anthropic.com>
…ned lines (#1558) PR #1553 ('full edit + linked-item move') intentionally added a collapsed parent picker to the edit modal for already-assigned budget lines. The Story #1545 Scenario 4 test still asserted the old behavior (picker absent), failing post-merge. New assertions match the current behavior: - Parent picker fieldset IS visible (in collapsed state) - Collapsed current-parent row shows the current parent name - 'Change' button is present with aria-expanded='false' - Expanded picker body is hidden by default (#parent-picker-body) - Amount input still shown Note: beta CI does not run the full E2E suite (only Auto Fix + Release workflows fire on beta merge), so this regression went unnoticed until PR #1558 opened. Co-Authored-By: Claude e2e-test-engineer (Sonnet) <noreply@anthropic.com>
…mount PR #1553 renamed the assigned-line modal's amount input from #budget-line-amount to #budget-itemized-amount when it replaced the simple amount-only modal with the unified BudgetLineForm (which uses itemizedAmountField). The Scenario 4 test was still pointing at the old id. Co-Authored-By: Claude e2e-test-engineer (Sonnet) <noreply@anthropic.com>
These three tests were silently broken on beta after PRs #1553/#1555/#1557 merged. Beta does not run E2E in CI (only Auto Fix + Release fire on beta merge), so they only surfaced when PR #1558 opened. 1. invoice-budget-line-full-edit.spec.ts:584 — assertion expected '2100' but the remaining-cell renders '€2,100.00' (comma thousands separator). Other amount assertions in the same file (e.g. '300', '575') still pass via substring match inside '€300.00' etc. — only the 4-digit case fails because of the thousands separator. Switched to /2[.,]?100/. 2. i18n.spec.ts:133 — 'Projekt' heading flake/race. The reload after setLanguage('de') doesn't guarantee the dashboard's translation namespace has resolved before the assertion. Added waitForLoadState ('networkidle') + bumped timeout to 15s. 3. invoice-budget-line-create-and-link.spec.ts:88 (via POM) — the picker add button locator was matching /\+ Add Budget Line/i; React's accessible-name normalization can collapse the leading '+ ' inconsistently. Relaxed the regex to /Add Budget Line/i. All 3 are stale tests, not production bugs. The user-visible behavior is fine; the tests need to track the changes from #1553 et al. Co-Authored-By: Claude e2e-test-engineer (Sonnet) <noreply@anthropic.com>
Contributor
|
🎉 This PR is included in version 2.7.0-beta.7 🎉 The release is available on GitHub release Your semantic-release bot 📦🚀 |
3 tasks
steilerDev
added a commit
that referenced
this pull request
May 23, 2026
…format (#1561) Anthropic's OpenAI-compat layer rejected our schema with: response_format.json_schema.strict: Input should be True Per the diagnostic captured by the new error logging (PR #1558), Anthropic requires `strict: true` on the json_schema (where OpenAI treats it as optional). With strict mode, OpenAI's structured-outputs spec — which Anthropic mirrors — also imposes: 1. additionalProperties: false on every object node 2. EVERY property must be listed in `required` (optional fields use union-typed nulls: type: ['number', 'null']) Our `validateExtractedLines` already tolerates null on the optional fields, so the LLM emitting quantity: null instead of omitting the key is fine — the validator coerces null → undefined for the runtime ExtractedLine. Test updated to assert all three strict-mode invariants. Co-authored-by: Frank Steiler <frank@steiler.de> Co-authored-by: Claude backend-developer (Haiku 4.5) <noreply@anthropic.com>
3 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Auto-itemize was broken on Anthropic because the OpenAI-compat layer rejects
response_format: { type: 'json_object' }(it requiresjson_schema). The previous implementation sent the same body to every provider.This PR introduces per-provider request shaping:
response_formatmax_tokens{ type: 'json_object' }{ type: 'json_object' }{ type: 'json_object' }{ type: 'json_schema', json_schema: {…} }Provider is auto-detected from
LLM_BASE_URL(api.anthropic.com→ anthropic, etc.) and can be overridden with the new optionalLLM_PROVIDERenv var.Includes `scripts/diagnose-llm.mjs` — a node-only diagnostic script that works inside no-shell DHI containers and walks 4 progressive checks (DNS → TLS → endpoint → production-shaped payload) with detailed error dumps. This is what found the original 400.
Fixes #1547 (regression introduced by the original implementation).
Test plan
providerProfiles(detection, env parsing, body shapes) andconfig(7 new LLM_PROVIDER scenarios)LLM_PROVIDERenv var is optional with auto-detect default🤖 Generated with Claude Code