Skip to content

feat(models): Phase 6 — anthropic + bedrock backends#707

Merged
heskew merged 8 commits into
mainfrom
feat/models-anthropic-bedrock-backends
May 22, 2026
Merged

feat(models): Phase 6 — anthropic + bedrock backends#707
heskew merged 8 commits into
mainfrom
feat/models-anthropic-bedrock-backends

Conversation

@heskew
Copy link
Copy Markdown
Member

@heskew heskew commented May 21, 2026

Summary

Phase 6 of #510 — adds the anthropic + bedrock backends, the
third and fourth real ModelBackend implementations. Also extracts
shared helpers out of ollama+openai now that the duplication crossed
the threshold Kris flagged on #698.

Closes #633
Tracking: #510

Note

Opened as Draft until CI clears, per the engineering-guidelines
step 18. Will flip to Ready when green.

What ships

Shared helper extraction (resources/models/backendHelpers.ts)

composeSignal, assignFiniteTokenCount, parseJsonResponse,
requireModel, requireCredential, normalizeOrigin. Backends pass
their error class as a constructor arg so per-backend names are
preserved for instanceof matching in tests. Ollama + OpenAI
migrated; local duplicates removed (~140 LOC deleted across the two).

components/anthropic/index.ts (native fetch)

Native fetch (no SDK dep) against the Anthropic Messages API. SSE
streaming. embed: false (Anthropic has no embed API), tools: true.

  • x-api-key + anthropic-version headers.
  • system extracted from Message[] or the object form's system and
    combined into Anthropic's top-level field (Anthropic forbids system
    as a message role).
  • Tool-use content-block parsing in non-streaming responses.
  • Tool-call delta accumulator for streaming (input_json_delta keyed by
    content_block_index; yields each call once at content_block_stop).
  • Mid-stream upstream errors (overloaded, rate_limit) raised as
    AnthropicBackendError, not silently swallowed.

components/bedrock/index.ts (AWS SDK via optional peerDependency)

@aws-sdk/client-bedrock-runtime is declared as an optional
peerDependency
in package.json. Harper doesn't depend on it
directly — users who want the bedrock backend add the SDK to their
own project's package.json. BedrockBackend dynamic-imports the
SDK on first call with a clear "add to your project" error if missing.

AWS credentials resolve via the SDK chain (env / shared profile /
IAM / IRSA). No apiKey field in config.

Per-family request dispatch on the model id prefix:

  • anthropic.* → Anthropic Messages body (Claude on Bedrock)
  • meta.* → Llama flat-prompt body with Llama 3 header tags
  • amazon.* → Titan body (chat + embed v2)
  • mistral.*, cohere.* → flat-prompt body

Streaming: per-family dispatch — Anthropic event-block parser for
Claude, flat per-chunk parser for the others.

Cohere embed honors EmbedOpts.inputType:
'query' → search_query, 'document' → search_document.

Non-Anthropic generative families REJECT GenerateInput.tools rather
than silently dropping them. Without that guard the caller couldn't
distinguish "model chose not to call" from "model never saw the tool".
Phase 1's capability negotiation is backend-level not model-level;
model-aware capabilities is a follow-up.

Mid-stream upstream errors from Claude-on-Bedrock surface as
BedrockBackendError.

Config schema + bootstrap

validation/configValidator.ts — discriminated schema extended with
anthropic + bedrock branches, both .unknown(false) so cross-backend
field typos block boot.

resources/models/bootstrap.ts — factory map adds anthropic + bedrock.
expandEnvVarsDeep already handled ${VAR} placeholders.

package.json

"peerDependencies": {
  "@aws-sdk/client-bedrock-runtime": "^3.0.0"
},
"peerDependenciesMeta": {
  "@aws-sdk/client-bedrock-runtime": { "optional": true }
}

Pre-PR verifier cadence

Per the agreed cadence (Phase 6 = deep-review single-pass per the
Flair memory): invoked Skill({ skill: 'deep-review', args: ... })
this time — NOT hand-rolled Agent calls.

  • api: 3 Tier-3 findings, all confirmed, all fixed before commit:
    1. Anthropic + Bedrock-Claude streams silently swallowed upstream
      error events. Fixed: both parsers now throw on type: 'error'
      with the upstream message capped at 500 chars.
    2. Bedrock Cohere embed hardcoded input_type: 'search_document',
      ignoring EmbedOpts.inputType. Fixed: maps 'query'
      search_query, defaults to search_document.
    3. Bedrock advertised tools: true but silently dropped tools on
      Llama/Titan/Mistral/Cohere. Fixed: non-Anthropic body builders
      throw with a structured error when tools are present.
  • concurrency: No blockers.
  • config: No blockers.

What to put attention on

  • The dynamic-import + optional peerDep pattern in
    components/bedrock/index.ts. New shape for Harper; happy to discuss
    if it's not the right precedent.
  • Per-family dispatch on model prefix in bedrock. Will fragment
    as new Bedrock model families land; could become a registry pattern
    later. v1 is mechanical.
  • Tool-call rejection on non-anthropic Bedrock families — option
    (b) from the deep-review recommendation. Option (a) would make
    capabilities() model-aware, which is a Phase 1 contract change.

Files

Path Change
resources/models/backendHelpers.ts new — extracted shared helpers
components/anthropic/index.ts new — native fetch
components/bedrock/index.ts new — SDK via dynamic-import peerDep
resources/models/bootstrap.ts + anthropic + bedrock factory entries
validation/configValidator.ts + anthropic + bedrock schema branches
package.json + optional peerDep
components/ollama/index.ts migrated to extracted helpers
components/openai/index.ts migrated to extracted helpers
unitTests/components/anthropic/index.test.js new
unitTests/components/bedrock/index.test.js new — with fake-SDK injection
unitTests/resources/models/backendHelpers.test.js new
unitTests/resources/models/bootstrap.test.js + anthropic + bedrock cases
unitTests/validation/configValidator.test.js + anthropic + bedrock schema cases
integrationTests/server/anthropic-backend.test.ts new — gated on ANTHROPIC_API_KEY
integrationTests/server/bedrock-backend.test.ts new — gated on RUN_BEDROCK_INTEGRATION + SDK installed

Acceptance criteria

  • AnthropicBackend + BedrockBackend implement ModelBackend per Phase 1
  • Bedrock dispatches on model field prefix to per-family request shape
  • harper.models.generate() produces completions from each provider
  • harper.models.generateStream() yields content + tool-call deltas per Phase 1 contract
  • Tool calls in 'return' mode work on Anthropic + Anthropic-on-Bedrock
  • harper.models.embed() via Bedrock embedding models (Titan / Cohere)
  • embed: false on anthropic → capability negotiation rejects
  • Per-call accounting records backend: 'anthropic' / 'bedrock'
  • AbortSignal cancels in-flight calls for both
  • AWS credentials resolved via SDK chain for Bedrock
  • SDK declared as optional peerDep; documented bump cadence in commit messages
  • Unit tests cover translation logic + the optional-SDK path
  • Integration tests gated on real creds
  • CI green across full matrix (Draft → wait → Ready)

Test plan

  • npm run build clean
  • npm run lint:required clean
  • npx prettier --check . clean (PR files)
  • 299 unit tests pass locally
  • Phase 1-3 model tests still pass (no regression)
  • CI green across full matrix
  • Manual smoke against real Anthropic + Bedrock with creds set

Out of scope (per #633)


🤖 Generated with Claude Code

heskew and others added 5 commits May 21, 2026 14:46
`resources/models/backendHelpers.ts` (new): pulls the byte-identical
helpers out of ollama and openai into a single module so anthropic and
bedrock (Phase 6) can reuse them and the surface stays consistent across
backends.

Extracted helpers:
- `composeSignal(caller?, timeoutMs?)` — combine caller AbortSignal with
  per-call timeout via `AbortSignal.any`.
- `assignFiniteTokenCount(usage, key, value)` — drop NaN / Infinity /
  negative / non-integer token counts so they don't poison
  `SUM(prompt_tokens)`-style aggregates over `hdb_model_calls`.
- `parseJsonResponse<T>(res, endpoint, Err)` — wrap `res.json()`,
  throw the backend's error class on parse failure with a static
  message (no upstream-bytes leak).
- `requireModel(model, op, Err)` — assert a model was specified.
- `requireCredential(value, backendLabel, fieldName, Err)` — validate
  apiKey-style fields and reject literal `${VAR}` placeholders that
  survived `expandEnvVarsDeep` because the env var was unset.
- `normalizeOrigin(value, { host, secure })` — fold the per-backend
  host/baseUrl normalization (Ollama's `http://localhost:11434` vs
  OpenAI's `https://api.openai.com/v1`) into one helper parameterized
  by scheme default.

Migrates `components/ollama/` and `components/openai/` to use them;
local duplicates removed. No behavior change.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Third real `ModelBackend` for #510, against the Anthropic Messages API.
Native fetch — no SDK dep — consistent with `components/openai/`.

What ships:

- `AnthropicBackend` implementing `generate` + `generateStream` (no
  `embed` — Anthropic doesn't ship an embedding API; capabilities
  advertise `embed: false` and Phase 1's `Models.embed` rejects the
  call before reaching the backend).
- `tools: true` — first-class tool support via Anthropic's
  `tool_use` / `tool_result` content blocks.
- SSE streaming with `x-api-key` + `anthropic-version` headers.
  Dispatches on `data.type` rather than the `event:` line for
  reliability against compat shims.
- Tool-call delta accumulation: `input_json_delta` strings keyed by
  `content_block_index`; each call yielded exactly once at its
  `content_block_stop` (Phase 1 contract: `ToolCall.arguments: object`,
  never a partial string).
- `system` role messages are extracted (from `Message[]` input or the
  object form's `system` field) and combined into Anthropic's
  top-level `system` parameter — Anthropic forbids `system` as a
  message role.
- Per-call accounting via the Phase 1 writer; native-fetch +
  `composeSignal` for AbortSignal + per-call timeout.

Mid-stream upstream `error` events (overloaded, rate_limit, etc.)
are explicitly raised as `AnthropicBackendError` rather than silently
ending the stream — surfaced as a deep-review finding pre-merge.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fourth real `ModelBackend` for #510, against AWS Bedrock. Unlike the
other three backends, Bedrock requires SigV4-signed requests which is
not worth rolling ourselves — the AWS SDK does it correctly. To avoid
shipping a ~5 MB SDK dependency for every Harper install:

- `@aws-sdk/client-bedrock-runtime` is declared as an **optional
  `peerDependency`** in `package.json`. Modern npm / pnpm / yarn skip
  installing it automatically and don't warn.
- `BedrockBackend` dynamic-imports the SDK on first call; throws with
  a clear "add to your project's package.json" message if missing.
- AWS credentials resolve via the SDK's standard chain (env / shared
  profile / IAM / IRSA) — no `apiKey` field in Harper config.

Per-family request dispatch on the model id prefix:

- `anthropic.*` → Anthropic Messages body (Claude on Bedrock; same
  shape as the direct anthropic backend with `anthropic_version: bedrock-2023-05-31`).
- `meta.*` → Llama flat-prompt body with Llama 3 header tags.
- `amazon.*` → Titan body (chat + embed v2; embed loops per-text).
- `mistral.*`, `cohere.*` → flat-prompt body.

Cohere embed honors `EmbedOpts.inputType` — `'query'` maps to
`search_query`, `'document'` to `search_document`. Cohere produces
materially different vectors for the two; the prior hardcoded
`search_document` would have silently degraded retrieval recall.

Non-Anthropic generative families REJECT `GenerateInput.tools` with a
structured error. Without that guard, Llama/Titan/Mistral/Cohere body
builders silently drop tools — caller can't distinguish "model chose
not to call" from "model never saw the tool". (Phase 1's capability
negotiation is backend-level not model-level; making it model-aware
is a follow-up.)

Mid-stream upstream errors from Claude-on-Bedrock surface as
`BedrockBackendError` rather than silently truncating the stream.

`_resetSdkCacheForTests` / `_injectSdkForTests` exported for test
injection of a fake SDK — keeps unit tests dependency-free.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`resources/models/bootstrap.ts` — factory map adds `anthropic` and
`bedrock` entries alongside `ollama` and `openai`. `expandEnvVarsDeep`
already runs on each entry before factory dispatch, so anthropic's
`apiKey: ${ANTHROPIC_API_KEY}` placeholder resolves in the same path.

`validation/configValidator.ts` — discriminated Joi schema extended
with two new branches:

- `anthropic`: required `apiKey`, optional `baseUrl` + `model` +
  `requestTimeoutMs`. `.unknown(false)` so OpenAI-only fields like
  `organization` blow up at validation rather than silently surviving.
- `bedrock`: optional `region` + `model` + `requestTimeoutMs`. No
  `apiKey` field — `.unknown(false)` rejects it if mistakenly added
  (operator probably copy/pasted from openai), pointing them at the
  AWS SDK credential chain instead of confusing first-call 401s.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Unit tests:

- `unitTests/components/anthropic/index.test.js` — full coverage of
  capability shape, baseUrl normalization, x-api-key + anthropic-version
  headers, message translation (string / array / object with system
  extraction), tool definitions to `input_schema` shape, tool_use
  content-block parsing, finishReason mapping, response error envelope
  passthrough, streaming with content + tool-call deltas, mid-stream
  upstream-error surfacing, SSE buffer + tool-call argument caps,
  AbortSignal propagation.
- `unitTests/components/bedrock/index.test.js` — capability shape,
  SDK-missing error path (default — no fake injected → real import
  fails), per-family generate dispatch (anthropic / meta / amazon /
  mistral / cohere / unknown), per-family embed dispatch
  (Titan / Cohere with inputType mapping), Anthropic-streaming chunk
  + finishReason flow, Llama flat-stream content + finishReason,
  tool-call rejection on non-anthropic families, tool-call routing on
  anthropic.* models, mid-stream upstream-error surfacing in the
  Anthropic-on-Bedrock stream parser, tool-call argument cap,
  AbortSignal composition with timeout.
- `unitTests/resources/models/bootstrap.test.js` — extended: anthropic
  + bedrock registrations under various kinds, "all four backends side
  by side" smoke, anthropic-embedding rejection (no Anthropic embed
  API).
- `unitTests/validation/configValidator.test.js` — extended: anthropic
  schema (apiKey required, ${VAR} placeholder, cross-backend field
  rejection), bedrock schema (region + model only, apiKey rejected),
  "all four side by side" smoke.

Integration tests (gated on real credentials):

- `integrationTests/server/anthropic-backend.test.ts` — exercises the
  real Anthropic API behind `ANTHROPIC_API_KEY`. Covers generate
  (string + messages + tools + system), streaming with content +
  optional tool_use, finishReason mapping.
- `integrationTests/server/bedrock-backend.test.ts` — exercises the
  real AWS Bedrock API behind `RUN_BEDROCK_INTEGRATION=1` plus AWS
  credentials. Double-gated: also skips when the optional peerDep
  isn't installed locally. Covers Anthropic-on-Bedrock generate +
  stream, Titan embed.

Dynamic imports inside `before()` blocks — same workaround as Phase 2
+ Phase 3 for the pre-existing CJS require cycle when integration
tests directly import Harper source (`utility/common_utils.ts` ↔
`utility/logging/harper_logger.ts`).

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@heskew heskew requested a review from kriszyp May 21, 2026 21:49
@socket-security
Copy link
Copy Markdown

socket-security Bot commented May 21, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff Package Supply Chain
Security
Vulnerability Quality Maintenance License
Added@​aws-sdk/​client-bedrock-runtime@​3.1052.09910010098100

View full report

Comment thread package.json
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 21, 2026

Reviewed; no blockers found.

heskew and others added 3 commits May 21, 2026 15:02
…dencies.md

Required by `harper-engineering-guidelines/rules/dependencies.md` — every
new third-party package gets a justification entry covering size,
security track record, environment interaction, deferral, binary
compilation, and a removal plan. Flagged by claude-bot on PR #707.

Entry documents the optional-peerDep classification, the dynamic-import
boundary, the SigV4 rationale (the genuine reason we use an SDK on
Bedrock and not on the other three native-fetch backends), and the
removal plan if SDK version churn becomes a maintenance burden.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`npm ci` on CI rejected the prior commit's lockfile — it carried both
my Bedrock peerDependency addition AND ~3000 lines of stale entries
(react-native, metro, hermes-parser, etc.) that drifted into the
lockfile before this session started.

Running `npm install --package-lock-only` regenerates the file
hermetically: keeps the new `peerDependencies` /
`peerDependenciesMeta` blocks at the root entry that `npm ci`
validates, drops the stale transitive entries that don't correspond
to anything in `package.json` (they were optional-peer-of-peer
graphs npm decided weren't reachable).

The bulk of the diff is removals of those stale entries, not new
content. The actual functional change to the lockfile is the
~13-line root-entry peerDeps addition.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous attempt at regenerating the lockfile after the peerDependency
addition pulled in spurious removals from my local box's npm resolution
state (different from CI's). Result: CI's npm ci kept failing on
missing react-native/metro/hermes transitive entries that exist in
main's lockfile.

Restore main's lockfile byte-for-byte. The peerDependencies declared
in package.json don't require corresponding entries in the lockfile's
root for `npm ci` to pass (no install actually happens — the SDK is
opt-in via the user's own project), so the minimal correct lockfile
delta vs main is zero.

Tracking: #633, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@kriszyp kriszyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling a little deja vu here, but looks good to me.

@heskew heskew marked this pull request as ready for review May 22, 2026 16:45
Copy link
Copy Markdown
Member Author

@heskew heskew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No blockers.


🤖 Posted by Antigravity on Nathan's behalf

@heskew heskew merged commit 07a3349 into main May 22, 2026
36 of 39 checks passed
@heskew heskew deleted the feat/models-anthropic-bedrock-backends branch May 22, 2026 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Models] Phase 6 — anthropic + bedrock backends

2 participants