Anthropic backend + tightened keyword fallback + LLM call hardening #4
Closed
Conalh wants to merge 1 commit into
Layers on top of the codex slice that landed scope-llm support, PR-body
ingestion, and the .taskbound.yml config. Three independent improvements:
1. Anthropic Messages API as a second LLM scope-extraction backend,
auto-routed by model-id prefix. 'claude-*' models go to Anthropic
(with prompt caching on the static system prompt via cache_control
on the system content block); anything else stays on the existing
OpenAI Responses backend. Both paths return the same normalized
InferredScope and share a single normalizeLlmScope helper, so the
review pipeline doesn't know or care which provider answered.
Structured output is forced via 'tool_choice: { type: tool, name:
report_scope }' so the response is always JSON-shaped against the
shared SCOPE_SCHEMA.
2. isFileInScope keyword fallback was 'substring anywhere in the path,'
which over-matched: a task saying 'fix header' would pull
src/auth/header-injection-fix.ts into scope. Now keywords must
appear as a substring of a path segment (split on '/' and '.'), so
src/components/Header.tsx and src/styles/header.css still match
while unrelated files don't.
3. LLM calls now share a callLlm wrapper with a 30-second
AbortSignal.timeout (a hung Anthropic/OpenAI call cannot hang the
GitHub Action) and a 64KiB content-length cap (a runaway response
cannot OOM the runner).
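A minimal sketch of what a wrapper like the one in item 3 might look like. Only the 30-second `AbortSignal.timeout` and the 64KiB content-length cap come from the description; the signature of `callLlm`, the helper `assertWithinCap`, and the constant names are assumptions for illustration, not the actual implementation.

```typescript
// Hypothetical constants; the values (30 s, 64 KiB) are the ones stated above.
const LLM_TIMEOUT_MS = 30_000;
const MAX_RESPONSE_BYTES = 64 * 1024;

// Hypothetical helper: rejects a response whose declared Content-Length
// exceeds the cap. A missing or non-numeric header passes through here;
// this sketch does not cover chunked responses without Content-Length.
function assertWithinCap(contentLength: string | null): void {
  const declared = contentLength === null ? NaN : Number(contentLength);
  if (Number.isFinite(declared) && declared > MAX_RESPONSE_BYTES) {
    throw new Error(`LLM response too large: ${declared} bytes`);
  }
}

// Assumed signature: both backends would pass their own URL and request init.
async function callLlm(url: string, init: RequestInit = {}): Promise<unknown> {
  const response = await fetch(url, {
    ...init,
    // Aborts the request if it has not completed within the timeout.
    signal: AbortSignal.timeout(LLM_TIMEOUT_MS),
  });
  if (!response.ok) {
    throw new Error(`LLM request failed with HTTP ${response.status}`);
  }
  assertWithinCap(response.headers.get("content-length"));
  return response.json();
}
```

Checking the header before reading the body means an oversized response is rejected without buffering it.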
test/scope-anthropic-and-keyword.test.mjs locks the new behavior with
four cases: Anthropic routing+caching, OpenAI regression, Anthropic
failure fallback, and keyword segment-matching. Total suite 22/22 green.
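The segment-matching rule can be sketched as follows, taking the description literally: split the path on '/' and '.', then look for the keyword inside any single segment. The function name `keywordMatchesPath` is hypothetical, and case-insensitive comparison is an assumption inferred from `Header.tsx` matching "header". Under this literal reading a segment such as `header-injection-fix` still contains "header", so the real `isFileInScope` presumably applies a stricter rule than the one shown here.

```typescript
// Hypothetical helper illustrating the described rule; not the real isFileInScope.
function keywordMatchesPath(keyword: string, filePath: string): boolean {
  const needle = keyword.toLowerCase();
  if (needle === "") return false; // an empty keyword should never match
  return filePath
    .toLowerCase()
    .split(/[/.]/) // segments: directories, basename, extension
    .some((segment) => segment.includes(needle));
}

const examplePaths = [
  "src/components/Header.tsx",
  "src/styles/header.css",
  "src/auth/login.ts",
];
// Keeps only the paths with a segment containing "header".
const matched = examplePaths.filter((p) => keywordMatchesPath("header", p));
```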
Summary
Layers on top of the codex slice on `codey/taskbound-action-credibility` (LLM scope-llm support + PR-body ingestion + `.taskbound.yml` config). Three independent improvements with regression tests.
1. Anthropic Messages API as a second LLM backend
Codex went OpenAI-only. Adding Anthropic, auto-routed by model-id prefix:
- `claude-*` → Anthropic Messages API (`ANTHROPIC_API_KEY`), with prompt caching on the static system prompt (`cache_control: { type: 'ephemeral' }` on the system content block). Repeat invocations on the same Action job are cheap and fast.
Structured output is forced via `tool_choice: { type: 'tool', name: 'report_scope' }`. Both providers return the same normalized `InferredScope`, sharing a single `normalizeLlmScope` helper, so the review pipeline doesn't know or care which one answered.
2. Tighter keyword fallback in `isFileInScope`
Previous behavior was "keyword appears anywhere in the path." A task saying "fix header" pulled `src/auth/header-injection-fix.ts` into scope.
Fix: split the path on `/` and `.`, require the keyword to appear as a substring of a segment. `src/components/Header.tsx` and `src/styles/header.css` still match (the basename segment contains "header"); unrelated paths don't.
3. LLM call hardening
A new shared `callLlm` wrapper:
- `AbortSignal.timeout`: a hung Anthropic/OpenAI call cannot hang the GitHub Action.
- `Content-Length` check: a runaway response cannot OOM the runner.
Both backends go through it.
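For illustration, the prefix routing and the Anthropic request shape from section 1 might look roughly like this. The `claude-` prefix rule, the `cache_control` block, and the forced `tool_choice` come from the description; `pickProvider`, `buildAnthropicRequest`, the `max_tokens` value, the tool description, and the stand-in schema are assumptions, since the real `SCOPE_SCHEMA` is not shown in this PR text.

```typescript
type Provider = "anthropic" | "openai";

// Route by model-id prefix: 'claude-*' goes to Anthropic, everything else to OpenAI.
function pickProvider(modelId: string): Provider {
  return modelId.startsWith("claude-") ? "anthropic" : "openai";
}

// Placeholder standing in for the shared SCOPE_SCHEMA (actual fields unknown).
const PLACEHOLDER_SCOPE_SCHEMA = {
  type: "object",
  properties: {
    paths: { type: "array", items: { type: "string" } },
  },
  required: ["paths"],
} as const;

// Builds a Messages API request body with a cached system prompt and a forced tool call.
function buildAnthropicRequest(model: string, systemPrompt: string, task: string) {
  return {
    model,
    max_tokens: 1024, // assumed value
    system: [
      // cache_control on the static system block enables prompt caching.
      { type: "text", text: systemPrompt, cache_control: { type: "ephemeral" } },
    ],
    tools: [
      {
        name: "report_scope",
        description: "Report the inferred scope for the task.", // assumed wording
        input_schema: PLACEHOLDER_SCOPE_SCHEMA,
      },
    ],
    // Forcing this tool makes the model answer with JSON matching input_schema.
    tool_choice: { type: "tool", name: "report_scope" },
    messages: [{ role: "user", content: task }],
  };
}
```

Keeping the system prompt byte-identical between calls is what lets the cached prefix be reused across invocations.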
Verification
Test plan