From 8ef82ea1efc5dcced65c03a133761d20a390552e Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sat, 2 May 2026 14:19:39 +0200 Subject: [PATCH 01/20] docs(plans): scope pi-ai migration spike (#1205) Capture the actual call graph before any provider port: graders consume provider.asLanguageModel() (Vercel LanguageModel) directly, not provider.invoke(), so the migration needs either a Vercel LanguageModelV2 shim over pi-ai (Path A) or a richer Provider API that drops asLanguageModel (Path B). Document the trade-offs so the spike implementation path is decided before code lands. Refs #1205 Co-Authored-By: Claude Opus 4.7 --- docs/plans/1205-pi-ai-spike.md | 96 ++++++++++++++++++++++++++++++++++ 1 file changed, 96 insertions(+) create mode 100644 docs/plans/1205-pi-ai-spike.md diff --git a/docs/plans/1205-pi-ai-spike.md b/docs/plans/1205-pi-ai-spike.md new file mode 100644 index 00000000..44e1b229 --- /dev/null +++ b/docs/plans/1205-pi-ai-spike.md @@ -0,0 +1,96 @@ +# Spike: pi-ai migration — path selection + +Tracks #1205. This doc captures the spike findings before any provider is ported. +Once the implementation path is chosen and the spike port lands, delete this file. + +## Initial assumption (wrong) + +Original assumption: `Provider.invoke(request) -> response` is the contract every +grader call site uses, so we can swap the implementation behind `invoke()` from +Vercel `generateText` to pi-ai `complete()` and call it a day. + +## Actual call graph + +`asLanguageModel(): import('ai').LanguageModel` is part of the `Provider` +interface (`providers/types.ts:307-309`) and is the load-bearing entry point for +every real grader path. The consumers don't go through `provider.invoke()`: + +| Consumer | What it does | +| --- | --- | +| `graders/llm-grader.ts:485` | `provider.asLanguageModel()` → `generateText({ model, tools: fsTools, stopWhen: stepCountIs(...) 
})` (built-in agent mode with sandboxed filesystem tools and multi-step) | +| `graders/llm-grader.ts:1106` | `asLanguageModel()` → `generateText({ model, messages })` (LLM-judge mode) | +| `graders/composite.ts:343` | `asLanguageModel()` → `generateText({ model, messages })` | +| `generators/rubric-generator.ts:35` | `asLanguageModel()` → `generateText({ model, messages })` | +| `providers/agentv-provider.ts:73-84` | `invoke()` throws; `asLanguageModel()` is the only supported path | + +`provider.invoke()` exists and is implemented in `ai-sdk.ts:invokeModel`, but the +grader hot paths bypass it. They depend on the Vercel `LanguageModel` *type*, +not on AgentV's `Provider` abstraction. + +This means a pi-ai migration is a real refactor, not a one-file swap. + +## Two viable paths + +### Path A — Vercel LanguageModelV2 shim over pi-ai + +Implement Vercel's `LanguageModelV2` interface (the contract `generateText` +expects) as an adapter around pi-ai's `complete()` / `stream()`. `asLanguageModel()` +keeps returning a `LanguageModel`; no consumer changes. + +**Pros** +- Zero changes to `llm-grader`, `composite`, `rubric-generator`, `agentv-provider`. +- Migration is incremental — port one provider at a time, others keep using ai-sdk. +- Tool-definition shape (Zod via `tool()`) stays as-is in graders. + +**Cons** +- Have to implement Vercel's V2 spec faithfully — stream parts, tool-call deltas, + finish reasons, usage metadata, provider-specific options pass-through. +- `ai` and `@ai-sdk/*` peer types stay as a dev/runtime dep (we still import the + V2 interface) — partial dependency reduction, not full removal. +- Adapter layer is non-trivial code to maintain; bugs in the shim show up as + weird grader behavior. + +**Spike work to validate**: build a minimal `LanguageModelV2` shim around +pi-ai's `complete()` for non-streaming, non-tool calls. Run the rubric-generator +through it against the existing baselines. 
If that works, the shim is viable; +streaming + tool-call deltas are the next risk areas. + +### Path B — Replace `asLanguageModel` with a richer `Provider` API + +Drop `asLanguageModel()` from the `Provider` interface. Add what consumers +actually need to `invoke()`: tool calling, multi-step (`stopWhen`-equivalent), +structured-output bias. Migrate the four consumers to call `provider.invoke()`. + +**Pros** +- Removes `ai` / `@ai-sdk/*` from the internal type surface entirely. +- `Provider` becomes a real abstraction, not a thin Vercel passthrough. +- Tool definitions can move to TypeBox (pi-ai native) and stop dragging Zod via + the AI-SDK `tool()` helper. + +**Cons** +- All four consumers change. Bigger blast radius, more baseline runs needed. +- Have to design a tool-calling shape that survives ai-sdk → pi-ai mapping + (and any future provider lib swap). +- More surface for behavior drift between old and new code paths. + +**Spike work to validate**: sketch the new `Provider.invoke()` signature +covering the multi-step + tools case used by `llm-grader.ts:485-540`, and port +*one* consumer (rubric-generator is the simplest — no tools, no multi-step) to +prove the ergonomics. + +## Recommendation + +Lean **Path A** for the spike, with a clear exit criterion: if the +`LanguageModelV2` shim explodes in scope (tool-call deltas, stream parts), pivot +to Path B before merging. Path A's appeal is that it lets the migration happen +in one provider at a time without churning grader code; Path B is a cleaner +endpoint but a much larger initial PR. + +## Out-of-scope for this spike + +- Anthropic thinking-budget mapping (numeric → bucket) — design separately. +- Custom retry/backoff (`ai-sdk.ts:520-559`) — port wholesale, evaluate + trimming in a follow-up. +- Token-usage object shape changes — preserve the current `tokenUsage` fields + for JSONL compatibility regardless of which path we pick. +- Streaming support — current consumers don't stream; defer. 
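
The Path A spike step above can be sketched roughly as follows. Everything here is illustrative: `piComplete` is a stand-in for pi-ai's `complete()` (its real signature may differ), and `MinimalV2Model` only mimics the slice of Vercel's `LanguageModelV2` contract that a plain non-streaming, non-tool text call would exercise; the real V2 spec has more required members (stream parts, warnings, raw call metadata).

```typescript
// Sketch only: a minimal Path-A-style adapter. `piComplete` stands in for
// pi-ai's complete(); field names on both sides are assumptions, not real APIs.

interface PiMessage { role: "system" | "user" | "assistant"; content: string }
interface PiResult { text: string; usage: { input: number; output: number } }

// Stand-in for pi-ai's complete() so the sketch is self-contained.
async function piComplete(_model: string, messages: PiMessage[]): Promise<PiResult> {
  return {
    text: `echo:${messages[messages.length - 1].content}`,
    usage: { input: 3, output: 2 },
  };
}

// The slice of a LanguageModelV2-style surface this sketch pretends to satisfy.
interface MinimalV2Model {
  readonly specificationVersion: "v2";
  readonly modelId: string;
  doGenerate(options: { prompt: PiMessage[] }): Promise<{
    text: string;
    finishReason: "stop";
    usage: { inputTokens: number; outputTokens: number };
  }>;
}

function piAiAsLanguageModel(modelId: string): MinimalV2Model {
  return {
    specificationVersion: "v2",
    modelId,
    async doGenerate({ prompt }) {
      const result = await piComplete(modelId, prompt);
      // Map pi-ai's {input, output} usage onto the V2-style token fields.
      return {
        text: result.text,
        finishReason: "stop",
        usage: { inputTokens: result.usage.input, outputTokens: result.usage.output },
      };
    },
  };
}

const model = piAiAsLanguageModel("some-model");
const generated = await model.doGenerate({ prompt: [{ role: "user", content: "hi" }] });
```

The point of the spike is exactly the mapping in `doGenerate`: if usage, finish reasons, and text survive the round-trip for the rubric-generator baselines, the shim is worth extending toward streaming and tool-call deltas.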
From ee9ccbd9e2335c4ac1487a7dd4e92d15388ebfec Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sat, 2 May 2026 14:22:42 +0200 Subject: [PATCH 02/20] chore: fix pre-existing import order in targets-validator MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pre-push lint was failing on a Biome organizeImports rule for targets-validator.ts (introduced in #1203). Reorder the imports so the lint passes — unblocks pushing from this branch. Co-Authored-By: Claude Opus 4.7 --- packages/core/src/evaluation/validation/targets-validator.ts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/packages/core/src/evaluation/validation/targets-validator.ts b/packages/core/src/evaluation/validation/targets-validator.ts index 7d9fc74f..524b6654 100644 --- a/packages/core/src/evaluation/validation/targets-validator.ts +++ b/packages/core/src/evaluation/validation/targets-validator.ts @@ -1,12 +1,12 @@ import { readFile } from 'node:fs/promises'; import path from 'node:path'; +import { interpolateEnv } from '../interpolation.js'; import { CLI_PLACEHOLDERS, COMMON_TARGET_SETTINGS, findDeprecatedCamelCaseTargetWarnings, } from '../providers/targets.js'; -import { interpolateEnv } from '../interpolation.js'; import { KNOWN_PROVIDERS, PROVIDER_ALIASES } from '../providers/types.js'; import { parseYamlValue } from '../yaml-loader.js'; import type { ValidationError, ValidationResult } from './types.js'; From 55f8d0614939470d5a942f71a83397f366e935c9 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sat, 2 May 2026 14:30:11 +0200 Subject: [PATCH 03/20] docs(plans): commit to Path B for pi-ai migration Drop asLanguageModel() from the Provider interface; enrich Provider.invoke() with optional `tools` + `maxSteps` and `steps` in the response so it covers the hardest consumer (llm-grader built-in agent mode). Tools use JSON Schema on the wire (provider-library-neutral). 
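
The agent loop the enriched `invoke()` has to run can be sketched as follows. Names (`ProviderTool`, `maxSteps`, the `steps` counts) follow the shape this patch proposes; `makeScriptedModel` is a scripted stand-in for the underlying pi-ai call, not a real API.

```typescript
// Sketch of the agent loop behind the enriched invoke(). The model call is a
// scripted stand-in; ProviderTool/maxSteps mirror the shape proposed here.

interface ProviderTool {
  readonly name: string;
  execute(input: unknown): Promise<unknown>;
}
interface ModelTurn { text?: string; toolCalls: Array<{ name: string; input: unknown }> }

// Scripted model: first turn requests a tool call, second turn answers.
function makeScriptedModel(): (toolResults: unknown[]) => ModelTurn {
  let turn = 0;
  return (toolResults) => {
    turn += 1;
    if (turn === 1) return { toolCalls: [{ name: "read_file", input: { path: "a.txt" } }] };
    return { text: `done after ${toolResults.length} tool result(s)`, toolCalls: [] };
  };
}

async function runAgentLoop(tools: readonly ProviderTool[], maxSteps: number) {
  const callModel = makeScriptedModel();
  let toolResults: unknown[] = [];
  let toolCallCount = 0;
  for (let step = 1; step <= maxSteps; step += 1) {
    const turn = callModel(toolResults);
    if (turn.toolCalls.length === 0) {
      // Model produced a final answer: report counts like the proposed steps info.
      return { output: turn.text ?? "", steps: { count: step, toolCallCount } };
    }
    toolResults = [];
    for (const call of turn.toolCalls) {
      const tool = tools.find((t) => t.name === call.name);
      // Unknown tools surface as error results rather than crashing the loop.
      toolResults.push(tool ? await tool.execute(call.input) : { error: "unknown tool" });
      toolCallCount += 1;
    }
  }
  return { output: "", steps: { count: maxSteps, toolCallCount } };
}

const result = await runAgentLoop(
  [{ name: "read_file", execute: async () => "file contents" }],
  5,
);
```

The loop terminates either when the model stops requesting tools or when `maxSteps` is hit, which is the `stopWhen: stepCountIs(...)` behavior the built-in grader mode relies on today.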
Document consumer migration order (simplest first), provider port order, and open questions (Anthropic thinking budget mapping, retry placement, token-usage shape). Refs #1205 Co-Authored-By: Claude Opus 4.7 --- docs/plans/1205-pi-ai-spike.md | 254 +++++++++++++++++++++++---------- 1 file changed, 175 insertions(+), 79 deletions(-) diff --git a/docs/plans/1205-pi-ai-spike.md b/docs/plans/1205-pi-ai-spike.md index 44e1b229..f4ffa2b0 100644 --- a/docs/plans/1205-pi-ai-spike.md +++ b/docs/plans/1205-pi-ai-spike.md @@ -1,7 +1,26 @@ -# Spike: pi-ai migration — path selection +# Spike: pi-ai migration — Path B selected -Tracks #1205. This doc captures the spike findings before any provider is ported. -Once the implementation path is chosen and the spike port lands, delete this file. +Tracks #1205. This doc captures the spike findings and the chosen migration +path. Once the spike port lands, delete this file and fold any user-relevant +content into module headers / the issue. + +## Decision: Path B + +We're going with Path B — drop `asLanguageModel()` from the `Provider` interface +and enrich `Provider.invoke()` to cover the full grader hot path (multi-step + +tools). The four consumers migrate to the new API. + +**Why not Path A** (Vercel `LanguageModelV2` shim over pi-ai): A is a shim, not +an abstraction. With A our `Provider` interface stays a thin facade — we'd be +implementing Vercel's contract on top of pi-ai, and every consumer would still +depend on Vercel's API surface. The next time we want to swap LLM libs, A leaves +the consumer-side coupling untouched. B fixes the coupling: `Provider` becomes +the real boundary, consumers depend on AgentV's own API, and only provider +implementations change when the underlying lib changes. + +The cost is honest: bigger initial PR (4 consumer changes vs. 1 shim), more +baseline runs. But if we're spending the migration budget anyway, spend it on +the change that leaves the codebase better. 
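
At the consumer end the Path B change is mechanical: drop the `asLanguageModel()` plus `generateText` pair and call the provider directly. A hypothetical sketch for a single-shot caller like rubric-generator, with field names (`chatPrompt`, `output`, `tokenUsage`) taken from the request/response shapes described in this doc and a stub standing in for a real provider:

```typescript
// Hypothetical Path B consumer. Field names follow the doc's description of
// ProviderRequest/ProviderResponse; the provider implementation is a stub.

interface ChatMessage { role: "system" | "user"; content: string }
interface ProviderResponse { output: string; tokenUsage: { input: number; output: number } }
interface Provider {
  invoke(request: { chatPrompt: ChatMessage[] }): Promise<ProviderResponse>;
}

// The consumer now depends only on AgentV's Provider abstraction; no Vercel
// `generateText` import, no asLanguageModel() escape hatch.
async function generateRubric(provider: Provider, instructions: string): Promise<string> {
  const response = await provider.invoke({
    chatPrompt: [
      { role: "system", content: "You write grading rubrics." },
      { role: "user", content: instructions },
    ],
  });
  return response.output;
}

const stubProvider: Provider = {
  async invoke(request) {
    return {
      output: `rubric for: ${request.chatPrompt[1].content}`,
      tokenUsage: { input: 10, output: 5 },
    };
  },
};
const rubric = await generateRubric(stubProvider, "grade Python style");
```

This is the coupling fix in miniature: swapping the LLM library later means swapping `stubProvider`'s real counterpart, while `generateRubric` stays untouched.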
## Initial assumption (wrong) @@ -12,85 +31,162 @@ Vercel `generateText` to pi-ai `complete()` and call it a day. ## Actual call graph `asLanguageModel(): import('ai').LanguageModel` is part of the `Provider` -interface (`providers/types.ts:307-309`) and is the load-bearing entry point for +interface (`providers/types.ts:309`) and is the load-bearing entry point for every real grader path. The consumers don't go through `provider.invoke()`: -| Consumer | What it does | -| --- | --- | -| `graders/llm-grader.ts:485` | `provider.asLanguageModel()` → `generateText({ model, tools: fsTools, stopWhen: stepCountIs(...) })` (built-in agent mode with sandboxed filesystem tools and multi-step) | -| `graders/llm-grader.ts:1106` | `asLanguageModel()` → `generateText({ model, messages })` (LLM-judge mode) | -| `graders/composite.ts:343` | `asLanguageModel()` → `generateText({ model, messages })` | -| `generators/rubric-generator.ts:35` | `asLanguageModel()` → `generateText({ model, messages })` | -| `providers/agentv-provider.ts:73-84` | `invoke()` throws; `asLanguageModel()` is the only supported path | - -`provider.invoke()` exists and is implemented in `ai-sdk.ts:invokeModel`, but the -grader hot paths bypass it. They depend on the Vercel `LanguageModel` *type*, -not on AgentV's `Provider` abstraction. - -This means a pi-ai migration is a real refactor, not a one-file swap. - -## Two viable paths - -### Path A — Vercel LanguageModelV2 shim over pi-ai - -Implement Vercel's `LanguageModelV2` interface (the contract `generateText` -expects) as an adapter around pi-ai's `complete()` / `stream()`. `asLanguageModel()` -keeps returning a `LanguageModel`; no consumer changes. - -**Pros** -- Zero changes to `llm-grader`, `composite`, `rubric-generator`, `agentv-provider`. -- Migration is incremental — port one provider at a time, others keep using ai-sdk. -- Tool-definition shape (Zod via `tool()`) stays as-is in graders. 
- -**Cons** -- Have to implement Vercel's V2 spec faithfully — stream parts, tool-call deltas, - finish reasons, usage metadata, provider-specific options pass-through. -- `ai` and `@ai-sdk/*` peer types stay as a dev/runtime dep (we still import the - V2 interface) — partial dependency reduction, not full removal. -- Adapter layer is non-trivial code to maintain; bugs in the shim show up as - weird grader behavior. - -**Spike work to validate**: build a minimal `LanguageModelV2` shim around -pi-ai's `complete()` for non-streaming, non-tool calls. Run the rubric-generator -through it against the existing baselines. If that works, the shim is viable; -streaming + tool-call deltas are the next risk areas. - -### Path B — Replace `asLanguageModel` with a richer `Provider` API - -Drop `asLanguageModel()` from the `Provider` interface. Add what consumers -actually need to `invoke()`: tool calling, multi-step (`stopWhen`-equivalent), -structured-output bias. Migrate the four consumers to call `provider.invoke()`. - -**Pros** -- Removes `ai` / `@ai-sdk/*` from the internal type surface entirely. -- `Provider` becomes a real abstraction, not a thin Vercel passthrough. -- Tool definitions can move to TypeBox (pi-ai native) and stop dragging Zod via - the AI-SDK `tool()` helper. - -**Cons** -- All four consumers change. Bigger blast radius, more baseline runs needed. -- Have to design a tool-calling shape that survives ai-sdk → pi-ai mapping - (and any future provider lib swap). -- More surface for behavior drift between old and new code paths. - -**Spike work to validate**: sketch the new `Provider.invoke()` signature -covering the multi-step + tools case used by `llm-grader.ts:485-540`, and port -*one* consumer (rubric-generator is the simplest — no tools, no multi-step) to -prove the ergonomics. 
- -## Recommendation - -Lean **Path A** for the spike, with a clear exit criterion: if the -`LanguageModelV2` shim explodes in scope (tool-call deltas, stream parts), pivot -to Path B before merging. Path A's appeal is that it lets the migration happen -in one provider at a time without churning grader code; Path B is a cleaner -endpoint but a much larger initial PR. +| Consumer | What it does | Tools? | Multi-step? | +| --- | --- | --- | --- | +| `graders/llm-grader.ts:485` (built-in agent) | `asLanguageModel()` → `generateText({ model, system, prompt, tools, stopWhen, temperature })` | yes (3 sandboxed FS tools) | yes (`stepCountIs(maxSteps)`) | +| `graders/llm-grader.ts:1106` (LLM-judge) | `asLanguageModel()` → `generateText({ model, messages })` | no | no | +| `graders/composite.ts:343` | `asLanguageModel()` → `generateText({ model, messages })` | no | no | +| `generators/rubric-generator.ts:35` | `asLanguageModel()` → `generateText({ model, messages })` | no | no | +| `providers/agentv-provider.ts:73-84` | `invoke()` actively throws "use asLanguageModel() instead" | — | — | + +The `built-in agent` case in `llm-grader.ts:485` is the hardest consumer — any +new `Provider` API has to cover its full surface or we end up with two ways to +call providers. + +## New `Provider.invoke()` design + +### Goals + +- One `invoke()` shape covers single-shot, judged-message, and tool-using + multi-step calls. +- Tool schema language is provider-library-neutral (JSON Schema on the wire). +- Existing fields (`question`, `chatPrompt`, `temperature`, `maxOutputTokens`, + `signal`, `evalCaseId`, `attempt`, etc.) stay as-is — additive change. +- Existing `ProviderResponse` fields (`output`, `tokenUsage`, `costUsd`, + `durationMs`, `startTime`, `endTime`) stay as-is. + +### Additions to `ProviderRequest` + +```ts +export interface ProviderTool { + /** Tool name as shown to the model. */ + readonly name: string; + /** Tool description as shown to the model. 
*/ + readonly description: string; + /** JSON Schema for the tool's input. Pi-ai TypeBox compiles to JSON Schema; Zod + * compiles via zod-to-json-schema. Provider implementations translate to the + * underlying lib's native shape (TypeBox object for pi-ai). */ + readonly parameters: JsonObject; + /** Executes the tool. Receives parsed JSON input, returns a JSON-serializable + * result. Errors are caught and surfaced to the model as tool-error results. */ + execute(input: unknown): Promise; +} + +export interface ProviderRequest { + // ...existing fields unchanged... + + /** Tools the model may call. Provider runs the agent loop, calling + * tool.execute() for each tool call until either the model returns no + * further tool calls or `maxSteps` is reached. */ + readonly tools?: readonly ProviderTool[]; + + /** Maximum number of agent loop iterations (model turn + tool execution = + * one step). Required when `tools` is non-empty. Ignored otherwise. */ + readonly maxSteps?: number; +} +``` + +### Additions to `ProviderResponse` + +```ts +export interface ProviderStepInfo { + /** Number of agent loop steps executed (0 if no tools were used). */ + readonly count: number; + /** Total tool calls across all steps. */ + readonly toolCallCount: number; +} + +export interface ProviderResponse { + // ...existing fields unchanged... + + /** Populated when the request used tools. Undefined for single-shot calls. */ + readonly steps?: ProviderStepInfo; +} +``` + +This is the minimum llm-grader's `built-in` mode actually needs from +`generateText`'s richer `steps[]` array (see `llm-grader.ts:524`). If a future +consumer needs per-step detail (which tool, what input, what output), promote +`ProviderStepInfo` then — YAGNI for now. + +### Removed + +- `Provider.asLanguageModel?(): import('ai').LanguageModel` — gone. +- `import('ai').LanguageModel` reference in `providers/types.ts:309` — gone. 
+- `agentv-provider.ts`'s `invoke()`-throws-by-design — `agentv` becomes a + normal `Provider` that runs through `invoke()` like the others. + +### Tool schema neutrality + +JSON Schema on the wire keeps consumers free to author tools with whatever +schema lib they want. The two grader call sites today use Zod via ai-sdk's +`tool()` helper; under Path B they'd switch to **TypeBox** (pi-ai native, no +extra conversion step). That's a small port — three filesystem tools in +`llm-grader.ts:1473-1554`. Provider implementations are responsible for +translating `ProviderTool.parameters` (JSON Schema) → the underlying lib's +expected shape. + +## Consumer migration order + +Smallest blast radius first so we can flush the design through real code before +touching the hardest case: + +1. **`rubric-generator.ts`** — single-shot, no tools. Simplest possible exercise + of `provider.invoke({ chatPrompt: [...] })`. Validates token usage + response + text plumbing. +2. **`composite.ts`** — same shape as rubric-generator. Smoke test that the API + works for a second consumer. +3. **`llm-grader.ts:1106`** (LLM-judge mode) — same shape again, different + prompt construction. +4. **`llm-grader.ts:485`** (built-in agent mode) — exercises `tools` + + `maxSteps`. The whole point of the new API. +5. **`agentv-provider.ts`** — collapse the `invoke()`-throws path. Provider + becomes a normal pi-ai-backed implementation. + +After step 5, `asLanguageModel?` can be removed from the `Provider` interface +and `import { generateText } from 'ai'` disappears from grader code. + +## Provider implementation order + +After consumers compile against the new interface, port providers one at a time: + +1. **OpenAIProvider** — pi-ai native, simplest. Run grader-score baselines. +2. **OpenRouterProvider** — pi-ai treats it as an OpenAI-compatible endpoint; + should fall out of step 1 with config differences only. +3. **GeminiProvider** — pi-ai native (`google` provider). +4. 
**AnthropicProvider** — pi-ai native, but thinking-budget mapping needs + design (see open question below). +5. **AzureProvider** — pi-ai has `azure-openai-responses.js`; verify the + `useDeploymentBasedUrls` + `apiFormat` cases. + +Each step ends with: build green, lint green, baselines re-run for an eval that +exercises that provider. + +## Open design questions + +- **Anthropic thinking-budget mapping.** ai-sdk takes a numeric `budgetTokens`; + pi-ai exposes a 5-bucket `reasoning` enum (`minimal|low|medium|high|xhigh`). + Lossy. Pick one of: (a) coerce numeric → bucket via thresholds, (b) drop the + knob to a bucket-only YAML field with deprecation warning, (c) bypass pi-ai's + abstraction and pass through to its Anthropic provider directly. Decide + before porting `AnthropicProvider`. +- **Retry/backoff.** `ai-sdk.ts:520-559` has bespoke exponential backoff with + configurable status-code list. pi-ai's behavior differs. Likely answer: keep + the existing `withRetry` wrapper around `provider.invoke()`'s underlying + pi-ai call — the retry logic is library-agnostic. Confirm in step 1. +- **Token-usage object shape.** pi-ai returns `{input, output, cost}`; ai-sdk + surfaces `{inputTokens, outputTokens, cachedInputTokens, reasoningTokens}`. + Map to the existing `ProviderTokenUsage` shape (`input`, `output`, optional + `cached`, optional `reasoning`) — which is already what consumers see today. + Cost goes to the existing `costUsd` field. ## Out-of-scope for this spike -- Anthropic thinking-budget mapping (numeric → bucket) — design separately. -- Custom retry/backoff (`ai-sdk.ts:520-559`) — port wholesale, evaluate - trimming in a follow-up. -- Token-usage object shape changes — preserve the current `tokenUsage` fields - for JSONL compatibility regardless of which path we pick. +- Anthropic thinking-budget mapping resolution (call it out, design separately). - Streaming support — current consumers don't stream; defer. 
+- Adding new providers exposed by pi-ai (Bedrock, Vertex, Mistral, etc.) — this + PR ports the existing 5, no more. +- Orchestrator-side changes (agent provider kinds, batching) — untouched. From 3eaca6ed6ffeefca53e8dc189f1fc6d4056fd485 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sat, 2 May 2026 15:58:09 +0200 Subject: [PATCH 04/20] refactor(core): port OpenAI provider + rubric-generator to pi-ai (step 1) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit First consumer + first provider on Path B of #1205: - OpenAIProvider.invoke() now calls @mariozechner/pi-ai's complete() instead of Vercel AI SDK's generateText. asLanguageModel() still returns the Vercel model so llm-grader, composite, and agentv-provider keep working until later steps migrate them. - rubric-generator.ts switches from provider.asLanguageModel() + generateText() to provider.invoke(). It is the simplest consumer (single-shot, no tools) and validates the new shape end-to-end. - pi-ai loaded via dynamic import + `any` casts, mirroring the pattern in pi-coding-agent.ts:250 — pi-ai's published d.ts files do not statically resolve named exports under NodeNext or Bundler module resolution. - @mariozechner/pi-ai added as a regular dependency (was transitive via pi-coding-agent peer dep). - chatPromptToPiContext only handles system + user roles; assistant / tool / function paths throw with a pointer to #1205. YAGNI for step 1 — later consumers (llm-grader multi-turn, tools) will add what they need. - targets.test.ts: openai test now mocks pi-ai's complete/getModel and asserts those are called instead of ai-sdk's generateText. 
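
The role restriction called out above can be sketched like this. It is a hypothetical reconstruction of the step-1 behavior, not the shipped code: the `PiContext` shape and field names are assumptions, and only the system/user-pass-through plus throw-on-everything-else contract comes from this commit message.

```typescript
// Hypothetical sketch of the step-1 chatPromptToPiContext contract described
// above: system + user roles map through; assistant/tool/function throw with
// a pointer to #1205. The PiContext shape is an assumption, not pi-ai's type.

interface ChatMessage { role: string; content: string }
interface PiContext {
  systemPrompt?: string;
  messages: Array<{ role: "user"; content: string }>;
}

function chatPromptToPiContext(chatPrompt: ChatMessage[]): PiContext {
  const context: PiContext = { messages: [] };
  for (const message of chatPrompt) {
    if (message.role === "system") {
      // Later system messages append rather than overwrite.
      context.systemPrompt = context.systemPrompt
        ? `${context.systemPrompt}\n${message.content}`
        : message.content;
    } else if (message.role === "user") {
      context.messages.push({ role: "user", content: message.content });
    } else {
      // YAGNI for step 1: later consumers (multi-turn, tools) add these paths.
      throw new Error(
        `chatPromptToPiContext: unsupported role "${message.role}" (see #1205)`,
      );
    }
  }
  return context;
}

const context = chatPromptToPiContext([
  { role: "system", content: "be brief" },
  { role: "user", content: "hello" },
]);
```

Throwing loudly on unsupported roles keeps the step-1 scope honest: a later consumer hitting the assistant or tool path fails fast at the seam instead of silently dropping messages.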
Refs #1205 Co-Authored-By: Claude Opus 4.7 --- bun.lock | 325 +++++++++++++++++- packages/core/package.json | 1 + .../evaluation/generators/rubric-generator.ts | 19 +- .../core/src/evaluation/providers/ai-sdk.ts | 229 +++++++++++- .../test/evaluation/providers/targets.test.ts | 46 ++- 5 files changed, 601 insertions(+), 19 deletions(-) diff --git a/bun.lock b/bun.lock index 50d2b5bd..1d7ece4f 100644 --- a/bun.lock +++ b/bun.lock @@ -23,7 +23,7 @@ }, "apps/cli": { "name": "agentv", - "version": "4.16.0", + "version": "4.25.1", "bin": { "agentv": "./dist/cli.js", }, @@ -87,7 +87,7 @@ }, "packages/core": { "name": "@agentv/core", - "version": "4.16.0", + "version": "4.25.1", "dependencies": { "@agentclientprotocol/sdk": "^0.14.1", "@agentv/eval": "workspace:*", @@ -96,6 +96,7 @@ "@ai-sdk/google": "^3.0.0", "@ai-sdk/openai": "^3.0.0", "@github/copilot-sdk": "^0.1.25", + "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", "@openrouter/ai-sdk-provider": "^2.3.1", "ai": "^6.0.0", @@ -128,7 +129,7 @@ }, "packages/eval": { "name": "@agentv/eval", - "version": "4.16.0", + "version": "4.25.1", "dependencies": { "zod": "^3.23.8", }, @@ -161,6 +162,8 @@ "@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.2.49", "", { "optionalDependencies": { "@img/sharp-darwin-arm64": "^0.34.2", "@img/sharp-darwin-x64": "^0.34.2", "@img/sharp-linux-arm": "^0.34.2", "@img/sharp-linux-arm64": "^0.34.2", "@img/sharp-linux-x64": "^0.34.2", "@img/sharp-linuxmusl-arm64": "^0.34.2", "@img/sharp-linuxmusl-x64": "^0.34.2", "@img/sharp-win32-arm64": "^0.34.2", "@img/sharp-win32-x64": "^0.34.2" }, "peerDependencies": { "zod": "^4.0.0" } }, "sha512-3avi409dwuGkPEETpWa0gyJvRMr3b6LxeuW5/sAPCOtLD9WxH9fYltbA5wZoazxTw5mlbXmjDp7JqO1rlmpaIQ=="], + "@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.73.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, 
"sha512-URURVzhxXGJDGUGFunIOtBlSl7KWvZiAAKY/ttTkZAkXT9bTPqdk2eK0b8qqSxXpikh3QKPnPYpiyX98zf5ebw=="], + "@astrojs/compiler": ["@astrojs/compiler@2.13.0", "", {}, "sha512-mqVORhUJViA28fwHYaWmsXSzLO9osbdZ5ImUfxBarqsYdMlPbqAqGJCxsNzvppp1BEzc1mJNjOVvQqeDN8Vspw=="], "@astrojs/internal-helpers": ["@astrojs/internal-helpers@0.7.5", "", {}, "sha512-vreGnYSSKhAjFJCWAwe/CNhONvoc5lokxtRoZims+0wa3KbHBdPHSSthJsKxPd8d/aic6lWKpRTYGY/hsgK6EA=="], @@ -177,6 +180,78 @@ "@astrojs/telemetry": ["@astrojs/telemetry@3.3.0", "", { "dependencies": { "ci-info": "^4.2.0", "debug": "^4.4.0", "dlv": "^1.1.3", "dset": "^3.1.4", "is-docker": "^3.0.0", "is-wsl": "^3.1.0", "which-pm-runs": "^1.1.0" } }, "sha512-UFBgfeldP06qu6khs/yY+q1cDAaArM2/7AEIqQ9Cuvf7B1hNLq0xDrZkct+QoIGyjq56y8IaE2I3CTvG99mlhQ=="], + "@aws-crypto/crc32": ["@aws-crypto/crc32@5.2.0", "", { "dependencies": { "@aws-crypto/util": "^5.2.0", "@aws-sdk/types": "^3.222.0", "tslib": "^2.6.2" } }, "sha512-nLbCWqQNgUiwwtFsen1AdzAtvuLRsQS8rYgMuxCrdKf9kOssamGLuPwyTY9wyYblNr9+1XM8v6zoDTPPSIeANg=="], + + "@aws-crypto/sha256-browser": ["@aws-crypto/sha256-browser@5.2.0", "", { "dependencies": { "@aws-crypto/sha256-js": "^5.2.0", "@aws-crypto/supports-web-crypto": "^5.2.0", "@aws-crypto/util": "^5.2.0", "@aws-sdk/types": "^3.222.0", "@aws-sdk/util-locate-window": "^3.0.0", "@smithy/util-utf8": "^2.0.0", "tslib": "^2.6.2" } }, "sha512-AXfN/lGotSQwu6HNcEsIASo7kWXZ5HYWvfOmSNKDsEqC4OashTp8alTmaz+F7TC2L083SFv5RdB+qU3Vs1kZqw=="], + + "@aws-crypto/sha256-js": ["@aws-crypto/sha256-js@5.2.0", "", { "dependencies": { "@aws-crypto/util": "^5.2.0", "@aws-sdk/types": "^3.222.0", "tslib": "^2.6.2" } }, "sha512-FFQQyu7edu4ufvIZ+OadFpHHOt+eSTBaYaki44c+akjg7qZg9oOQeLlk77F6tSYqjDAFClrHJk9tMf0HdVyOvA=="], + + "@aws-crypto/supports-web-crypto": ["@aws-crypto/supports-web-crypto@5.2.0", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-iAvUotm021kM33eCdNfwIN//F77/IADDSs58i+MDaOqFrVjZo9bAal0NK7HurRuWLLpF1iLX7gbWrjHjeo+YFg=="], + + "@aws-crypto/util": 
["@aws-crypto/util@5.2.0", "", { "dependencies": { "@aws-sdk/types": "^3.222.0", "@smithy/util-utf8": "^2.0.0", "tslib": "^2.6.2" } }, "sha512-4RkU9EsI6ZpBve5fseQlGNUWKMa1RLPQ1dnjnQoe07ldfIzcsGb5hC5W0Dm7u423KWzawlrpbjXBrXCEv9zazQ=="], + + "@aws-sdk/client-bedrock-runtime": ["@aws-sdk/client-bedrock-runtime@3.1041.0", "", { "dependencies": { "@aws-crypto/sha256-browser": "5.2.0", "@aws-crypto/sha256-js": "5.2.0", "@aws-sdk/core": "^3.974.8", "@aws-sdk/credential-provider-node": "^3.972.39", "@aws-sdk/eventstream-handler-node": "^3.972.14", "@aws-sdk/middleware-eventstream": "^3.972.10", "@aws-sdk/middleware-host-header": "^3.972.10", "@aws-sdk/middleware-logger": "^3.972.10", "@aws-sdk/middleware-recursion-detection": "^3.972.11", "@aws-sdk/middleware-user-agent": "^3.972.38", "@aws-sdk/middleware-websocket": "^3.972.16", "@aws-sdk/region-config-resolver": "^3.972.13", "@aws-sdk/token-providers": "3.1041.0", "@aws-sdk/types": "^3.973.8", "@aws-sdk/util-endpoints": "^3.996.8", "@aws-sdk/util-user-agent-browser": "^3.972.10", "@aws-sdk/util-user-agent-node": "^3.973.24", "@smithy/config-resolver": "^4.4.17", "@smithy/core": "^3.23.17", "@smithy/eventstream-serde-browser": "^4.2.14", "@smithy/eventstream-serde-config-resolver": "^4.3.14", "@smithy/eventstream-serde-node": "^4.2.14", "@smithy/fetch-http-handler": "^5.3.17", "@smithy/hash-node": "^4.2.14", "@smithy/invalid-dependency": "^4.2.14", "@smithy/middleware-content-length": "^4.2.14", "@smithy/middleware-endpoint": "^4.4.32", "@smithy/middleware-retry": "^4.5.7", "@smithy/middleware-serde": "^4.2.20", "@smithy/middleware-stack": "^4.2.14", "@smithy/node-config-provider": "^4.3.14", "@smithy/node-http-handler": "^4.6.1", "@smithy/protocol-http": "^5.3.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "@smithy/url-parser": "^4.2.14", "@smithy/util-base64": "^4.3.2", "@smithy/util-body-length-browser": "^4.2.2", "@smithy/util-body-length-node": "^4.2.3", "@smithy/util-defaults-mode-browser": 
"^4.3.49", "@smithy/util-defaults-mode-node": "^4.2.54", "@smithy/util-endpoints": "^3.4.2", "@smithy/util-middleware": "^4.2.14", "@smithy/util-retry": "^4.3.6", "@smithy/util-stream": "^4.5.25", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-1QehYO3jhdvNQ5mOKtwIiNV04y4aywaNZw9HzCp7SSYCX4yy+AGXc2hhYjCiMDUvQPIELuvbR8MXw81NGAj8ZQ=="], + + "@aws-sdk/core": ["@aws-sdk/core@3.974.8", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@aws-sdk/xml-builder": "^3.972.22", "@smithy/core": "^3.23.17", "@smithy/node-config-provider": "^4.3.14", "@smithy/property-provider": "^4.2.14", "@smithy/protocol-http": "^5.3.14", "@smithy/signature-v4": "^5.3.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "@smithy/util-base64": "^4.3.2", "@smithy/util-middleware": "^4.2.14", "@smithy/util-retry": "^4.3.6", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-njR2qoG6ZuB0kvAS2FyICsFZJ6gmCcf2X/7JcD14sUvGDm26wiZ5BrA6LOiUxKFEF+IVe7kdroxyE00YlkiYsw=="], + + "@aws-sdk/credential-provider-env": ["@aws-sdk/credential-provider-env@3.972.34", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/types": "^3.973.8", "@smithy/property-provider": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-XT0jtf8Fw9JE6ppsQeoNnZRiG+jqRixMT1v1ZR17G60UvVdsQmTG8nbEyHuEPfMxDXEhfdARaM/XiEhca4lGHQ=="], + + "@aws-sdk/credential-provider-http": ["@aws-sdk/credential-provider-http@3.972.36", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/types": "^3.973.8", "@smithy/fetch-http-handler": "^5.3.17", "@smithy/node-http-handler": "^4.6.1", "@smithy/property-provider": "^4.2.14", "@smithy/protocol-http": "^5.3.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "@smithy/util-stream": "^4.5.25", "tslib": "^2.6.2" } }, "sha512-DPoGWfy7J7RKxvbf5kOKIGQkD2ek3dbKgzKIGrnLuvZBz5myU+Im/H6pmc14QcnFbqHMqxvtWSgRDSJW3qXLQg=="], + + "@aws-sdk/credential-provider-ini": ["@aws-sdk/credential-provider-ini@3.972.38", "", { 
"dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/credential-provider-env": "^3.972.34", "@aws-sdk/credential-provider-http": "^3.972.36", "@aws-sdk/credential-provider-login": "^3.972.38", "@aws-sdk/credential-provider-process": "^3.972.34", "@aws-sdk/credential-provider-sso": "^3.972.38", "@aws-sdk/credential-provider-web-identity": "^3.972.38", "@aws-sdk/nested-clients": "^3.997.6", "@aws-sdk/types": "^3.973.8", "@smithy/credential-provider-imds": "^4.2.14", "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-oDzUBu2MGJFgoar05sPMCwSrhw44ASyccrHzj66vO69OZqi7I6hZZxXfuPLC8OCzW7C+sU+bI73XHij41yekgQ=="], + + "@aws-sdk/credential-provider-login": ["@aws-sdk/credential-provider-login@3.972.38", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/nested-clients": "^3.997.6", "@aws-sdk/types": "^3.973.8", "@smithy/property-provider": "^4.2.14", "@smithy/protocol-http": "^5.3.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-g1NosS8qe4OF++G2UFCM5ovSkgipC7YYor5KCWatG0UoMSO5YFj9C8muePlyVmOBV/WTI16Jo3/s1NUo/o1Bww=="], + + "@aws-sdk/credential-provider-node": ["@aws-sdk/credential-provider-node@3.972.39", "", { "dependencies": { "@aws-sdk/credential-provider-env": "^3.972.34", "@aws-sdk/credential-provider-http": "^3.972.36", "@aws-sdk/credential-provider-ini": "^3.972.38", "@aws-sdk/credential-provider-process": "^3.972.34", "@aws-sdk/credential-provider-sso": "^3.972.38", "@aws-sdk/credential-provider-web-identity": "^3.972.38", "@aws-sdk/types": "^3.973.8", "@smithy/credential-provider-imds": "^4.2.14", "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-HEswDQyxUtadoZ/bJsPPENHg7R0Lzym5LuMksJeHvqhCOpP+rtkDLKI4/ZChH4w3cf5kG8n6bZuI8PzajoiqMg=="], + + "@aws-sdk/credential-provider-process": 
["@aws-sdk/credential-provider-process@3.972.34", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/types": "^3.973.8", "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-T3IFs4EVmVi1dVN5RciFnklCANSzvrQd/VuHY9ThHSQmYkTogjcGkoJEr+oNUPQZnso52183088NqysMPji1/Q=="], + + "@aws-sdk/credential-provider-sso": ["@aws-sdk/credential-provider-sso@3.972.38", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/nested-clients": "^3.997.6", "@aws-sdk/token-providers": "3.1041.0", "@aws-sdk/types": "^3.973.8", "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-5ZxG+t0+3Q3QPh8KEjX6syskhgNf7I0MN7oGioTf6Lm1NTjfP7sIcYGNsthXC2qR8vcD3edNZwCr2ovfSSWuRA=="], + + "@aws-sdk/credential-provider-web-identity": ["@aws-sdk/credential-provider-web-identity@3.972.38", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/nested-clients": "^3.997.6", "@aws-sdk/types": "^3.973.8", "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-lYHFF30DGI20jZcYX8cm6Ns0V7f1dDN6g/MBDLTyD/5iw+bXs3yBr2iAiHDkx4RFU5JgsnZvCHYKiRVPRdmOgw=="], + + "@aws-sdk/eventstream-handler-node": ["@aws-sdk/eventstream-handler-node@3.972.14", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/eventstream-codec": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-m4X56gxG76/CKfxNVbOFuYwnAZcHgS6HOH8lgp15HoGHIAVTcZfZrXvcYzJFOMLEJgVn+JHBu6EiNV+xSNXXFg=="], + + "@aws-sdk/middleware-eventstream": ["@aws-sdk/middleware-eventstream@3.972.10", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-QUqLs7Af1II9X4fCRAu+EGHG3KHyOp4RkuLhRKoA3NuFlh6TL8i+zXBl8w2LUxqm44B/Kom45hgSlwA1SpTsXQ=="], + + 
"@aws-sdk/middleware-host-header": ["@aws-sdk/middleware-host-header@3.972.10", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-IJSsIMeVQ8MMCPbuh1AbltkFhLBLXn7aejzfX5YKT/VLDHn++Dcz8886tXckE+wQssyPUhaXrJhdakO2VilRhg=="], + + "@aws-sdk/middleware-logger": ["@aws-sdk/middleware-logger@3.972.10", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-OOuGvvz1Dm20SjZo5oEBePFqxt5nf8AwkNDSyUHvD9/bfNASmstcYxFAHUowy4n6Io7mWUZ04JURZwSBvyQanQ=="], + + "@aws-sdk/middleware-recursion-detection": ["@aws-sdk/middleware-recursion-detection@3.972.11", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@aws/lambda-invoke-store": "^0.2.2", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-+zz6f79Kj9V5qFK2P+D8Ehjnw4AhphAlCAsPjUqEcInA9umtSSKMrHbSagEeOIsDNuvVrH98bjRHcyQukTrhaQ=="], + + "@aws-sdk/middleware-sdk-s3": ["@aws-sdk/middleware-sdk-s3@3.972.37", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/types": "^3.973.8", "@aws-sdk/util-arn-parser": "^3.972.3", "@smithy/core": "^3.23.17", "@smithy/node-config-provider": "^4.3.14", "@smithy/protocol-http": "^5.3.14", "@smithy/signature-v4": "^5.3.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "@smithy/util-config-provider": "^4.2.2", "@smithy/util-middleware": "^4.2.14", "@smithy/util-stream": "^4.5.25", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-Km7M+i8DrLArVzrid1gfxeGhYHBd3uxvE77g0s5a52zPSVosxzQBnJ0gwWb6NIp/DOk8gsBMhi7V+cpJG0ndTA=="], + + "@aws-sdk/middleware-user-agent": ["@aws-sdk/middleware-user-agent@3.972.38", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/types": "^3.973.8", "@aws-sdk/util-endpoints": "^3.996.8", "@smithy/core": "^3.23.17", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "@smithy/util-retry": "^4.3.6", "tslib": "^2.6.2" } 
}, "sha512-iz+B29TXcAZsJpwB+AwG/TTGA5l/VnmMZ2UxtiySOZjI6gCdmviXPwdgzcmuazMy16rXoPY4mYCGe7zdNKfx5A=="], + + "@aws-sdk/middleware-websocket": ["@aws-sdk/middleware-websocket@3.972.16", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@aws-sdk/util-format-url": "^3.972.10", "@smithy/eventstream-codec": "^4.2.14", "@smithy/eventstream-serde-browser": "^4.2.14", "@smithy/fetch-http-handler": "^5.3.17", "@smithy/protocol-http": "^5.3.14", "@smithy/signature-v4": "^5.3.14", "@smithy/types": "^4.14.1", "@smithy/util-base64": "^4.3.2", "@smithy/util-hex-encoding": "^4.2.2", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-86+S9oCyRVGzoMRpQhxkArp7kD2K75GPmaNevd9B6EyNhWoNvnCZZ3WbgN4j7ZT+jvtvBCGZvI2XHsWZJ+BRIg=="], + + "@aws-sdk/nested-clients": ["@aws-sdk/nested-clients@3.997.6", "", { "dependencies": { "@aws-crypto/sha256-browser": "5.2.0", "@aws-crypto/sha256-js": "5.2.0", "@aws-sdk/core": "^3.974.8", "@aws-sdk/middleware-host-header": "^3.972.10", "@aws-sdk/middleware-logger": "^3.972.10", "@aws-sdk/middleware-recursion-detection": "^3.972.11", "@aws-sdk/middleware-user-agent": "^3.972.38", "@aws-sdk/region-config-resolver": "^3.972.13", "@aws-sdk/signature-v4-multi-region": "^3.996.25", "@aws-sdk/types": "^3.973.8", "@aws-sdk/util-endpoints": "^3.996.8", "@aws-sdk/util-user-agent-browser": "^3.972.10", "@aws-sdk/util-user-agent-node": "^3.973.24", "@smithy/config-resolver": "^4.4.17", "@smithy/core": "^3.23.17", "@smithy/fetch-http-handler": "^5.3.17", "@smithy/hash-node": "^4.2.14", "@smithy/invalid-dependency": "^4.2.14", "@smithy/middleware-content-length": "^4.2.14", "@smithy/middleware-endpoint": "^4.4.32", "@smithy/middleware-retry": "^4.5.7", "@smithy/middleware-serde": "^4.2.20", "@smithy/middleware-stack": "^4.2.14", "@smithy/node-config-provider": "^4.3.14", "@smithy/node-http-handler": "^4.6.1", "@smithy/protocol-http": "^5.3.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "@smithy/url-parser": "^4.2.14", 
"@smithy/util-base64": "^4.3.2", "@smithy/util-body-length-browser": "^4.2.2", "@smithy/util-body-length-node": "^4.2.3", "@smithy/util-defaults-mode-browser": "^4.3.49", "@smithy/util-defaults-mode-node": "^4.2.54", "@smithy/util-endpoints": "^3.4.2", "@smithy/util-middleware": "^4.2.14", "@smithy/util-retry": "^4.3.6", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-WBDnqatJl+kGObpfmfSxqnXeYTu3Me8wx8WCtvoxX3pfWrrTv8I4WTMSSs7PZqcRcVh8WeUKMgGFjMG+52SR1w=="], + + "@aws-sdk/region-config-resolver": ["@aws-sdk/region-config-resolver@3.972.13", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/config-resolver": "^4.4.17", "@smithy/node-config-provider": "^4.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-CvJ2ZIjK/jVD/lbOpowBVElJyC1YxLTIJ13yM0AEo0t2v7swOzGjSA6lJGH+DwZXQhcjUjoYwc8bVYCX5MDr1A=="], + + "@aws-sdk/signature-v4-multi-region": ["@aws-sdk/signature-v4-multi-region@3.996.25", "", { "dependencies": { "@aws-sdk/middleware-sdk-s3": "^3.972.37", "@aws-sdk/types": "^3.973.8", "@smithy/protocol-http": "^5.3.14", "@smithy/signature-v4": "^5.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-+CMIt3e1VzlklAECmG+DtP1sV8iKq25FuA0OKpnJ4KA0kxUtd7CgClY7/RU6VzJBQwbN4EJ9Ue6plvqx1qGadw=="], + + "@aws-sdk/token-providers": ["@aws-sdk/token-providers@3.1041.0", "", { "dependencies": { "@aws-sdk/core": "^3.974.8", "@aws-sdk/nested-clients": "^3.997.6", "@aws-sdk/types": "^3.973.8", "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-Th7kPI6YPtvJUcdznooXJMy+9rQWjmEF81LxaJssngBzuysK4a/x+l8kjm1zb7nYsUPbndnBdUnwng/3PLvtGw=="], + + "@aws-sdk/types": ["@aws-sdk/types@3.973.8", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-gjlAdtHMbtR9X5iIhVUvbVcy55KnznpC6bkDUWW9z915bi0ckdUr5cjf16Kp6xq0bP5HBD2xzgbL9F9Quv5vUw=="], + + "@aws-sdk/util-arn-parser": ["@aws-sdk/util-arn-parser@3.972.3", "", { "dependencies": { 
"tslib": "^2.6.2" } }, "sha512-HzSD8PMFrvgi2Kserxuff5VitNq2sgf3w9qxmskKDiDTThWfVteJxuCS9JXiPIPtmCrp+7N9asfIaVhBFORllA=="], + + "@aws-sdk/util-endpoints": ["@aws-sdk/util-endpoints@3.996.8", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/types": "^4.14.1", "@smithy/url-parser": "^4.2.14", "@smithy/util-endpoints": "^3.4.2", "tslib": "^2.6.2" } }, "sha512-oOZHcRDihk5iEe5V25NVWg45b3qEA8OpHWVdU/XQh8Zj4heVPAJqWvMphQnU7LkufmUo10EpvFPZuQMiFLJK3g=="], + + "@aws-sdk/util-format-url": ["@aws-sdk/util-format-url@3.972.10", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/querystring-builder": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-DEKiHNJVtNxdyTeQspzY+15Po/kHm6sF0Cs4HV9Q2+lplB63+DrvdeiSoOSdWEWAoO2RcY1veoXVDz2tWxWCgQ=="], + + "@aws-sdk/util-locate-window": ["@aws-sdk/util-locate-window@3.965.5", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-WhlJNNINQB+9qtLtZJcpQdgZw3SCDCpXdUJP7cToGwHbCWCnRckGlc6Bx/OhWwIYFNAn+FIydY8SZ0QmVu3xTQ=="], + + "@aws-sdk/util-user-agent-browser": ["@aws-sdk/util-user-agent-browser@3.972.10", "", { "dependencies": { "@aws-sdk/types": "^3.973.8", "@smithy/types": "^4.14.1", "bowser": "^2.11.0", "tslib": "^2.6.2" } }, "sha512-FAzqXvfEssGdSIz8ejatan0bOdx1qefBWKF/gWmVBXIP1HkS7v/wjjaqrAGGKvyihrXTXW00/2/1nTJtxpXz7g=="], + + "@aws-sdk/util-user-agent-node": ["@aws-sdk/util-user-agent-node@3.973.24", "", { "dependencies": { "@aws-sdk/middleware-user-agent": "^3.972.38", "@aws-sdk/types": "^3.973.8", "@smithy/node-config-provider": "^4.3.14", "@smithy/types": "^4.14.1", "@smithy/util-config-provider": "^4.2.2", "tslib": "^2.6.2" }, "peerDependencies": { "aws-crt": ">=1.0.0" }, "optionalPeers": ["aws-crt"] }, "sha512-ZWwlkjcIp7cEL8ZfTpTAPNkwx25p7xol0xlKoWVVf22+nsjwmLcHYtTPjIV1cSpmB/b6DaK4cb1fSkvCXHgRdw=="], + + "@aws-sdk/xml-builder": ["@aws-sdk/xml-builder@3.972.22", "", { "dependencies": { "@nodable/entities": "2.1.0", "@smithy/types": "^4.14.1", "fast-xml-parser": "5.7.2", "tslib": "^2.6.2" } 
}, "sha512-PMYKKtJd70IsSG0yHrdAbxBr+ZWBKLvzFZfD3/urxgf6hXVMzuU5M+3MJ5G67RpOmLBu1fAUN65SbWuKUCOlAA=="], + + "@aws/lambda-invoke-store": ["@aws/lambda-invoke-store@0.2.4", "", {}, "sha512-iY8yvjE0y651BixKNPgmv1WrQc+GZ142sb0z4gYnChDDY2YqI4P/jsSopBWrKfAt7LOJAkOXt7rC/hms+WclQQ=="], + "@babel/code-frame": ["@babel/code-frame@7.29.0", "", { "dependencies": { "@babel/helper-validator-identifier": "^7.28.5", "js-tokens": "^4.0.0", "picocolors": "^1.1.1" } }, "sha512-9NhCeYjq9+3uxgdtp20LSiJXJvN0FeCtNGpJxuMFZ1Kv3cWUNb6DOhJwUvcVCzKGR66cw4njwM6hrJLqgOwbcw=="], "@babel/compat-data": ["@babel/compat-data@7.29.0", "", {}, "sha512-T1NCJqT/j9+cn8fvkt7jtwbLBfLC/1y1c7NtCeXFRgzGTsafi68MRv8yzkYSapBnFA6L3U2VSc02ciDzoAJhJg=="], @@ -321,6 +396,8 @@ "@github/copilot-win32-x64": ["@github/copilot-win32-x64@0.0.411", "", { "os": "win32", "cpu": "x64", "bin": { "copilot-win32-x64": "copilot.exe" } }, "sha512-xmOgi1lGvUBHQJWmq5AK1EP95+Y8xR4TFoK9OCSOaGbQ+LFcX2jF7iavnMolfWwddabew/AMQjsEHlXvbgMG8Q=="], + "@google/genai": ["@google/genai@1.51.0", "", { "dependencies": { "google-auth-library": "^10.3.0", "p-retry": "^4.6.2", "protobufjs": "^7.5.4", "ws": "^8.18.0" }, "peerDependencies": { "@modelcontextprotocol/sdk": "^1.25.2" }, "optionalPeers": ["@modelcontextprotocol/sdk"] }, "sha512-vTZZF3CSimN7cn2zsLpW2p5WF0eZa5Gz69ITMPCNHpPrDlAstOfGifSfi0p/s9Z9400f7xJRkgvkQNrcM7pJ6w=="], + "@hono/node-server": ["@hono/node-server@1.19.11", "", { "peerDependencies": { "hono": "^4" } }, "sha512-dr8/3zEaB+p0D2n/IUrlPF1HZm586qgJNXK1a9fhg/PzdtkK7Ksd5l312tJX2yBuALqDYBlG20QEbayqPyxn+g=="], "@img/colour": ["@img/colour@1.0.0", "", {}, "sha512-A5P/LfWGFSl6nsckYtjw9da+19jB8hkJ6ACTGcDfEJ0aE+l2n2El7dsVM7UVHZQ9s2lmYMWlrS21YLy2IR1LUw=="], @@ -421,12 +498,18 @@ "@jridgewell/trace-mapping": ["@jridgewell/trace-mapping@0.3.31", "", { "dependencies": { "@jridgewell/resolve-uri": "^3.1.0", "@jridgewell/sourcemap-codec": "^1.4.14" } }, 
"sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw=="], + "@mariozechner/pi-ai": ["@mariozechner/pi-ai@0.62.0", "", { "dependencies": { "@anthropic-ai/sdk": "^0.73.0", "@aws-sdk/client-bedrock-runtime": "^3.983.0", "@google/genai": "^1.40.0", "@mistralai/mistralai": "1.14.1", "@sinclair/typebox": "^0.34.41", "ajv": "^8.17.1", "ajv-formats": "^3.0.1", "chalk": "^5.6.2", "openai": "6.26.0", "partial-json": "^0.1.7", "proxy-agent": "^6.5.0", "undici": "^7.19.1", "zod-to-json-schema": "^3.24.6" }, "bin": { "pi-ai": "dist/cli.js" } }, "sha512-mJgryZ5RgBQG++tiETMtCQQJoH2MAhKetCfqI98NMvGydu7L9x2qC2JekQlRaAgIlTgv4MRH1UXHMEs4UweE/Q=="], + "@mdx-js/mdx": ["@mdx-js/mdx@3.1.1", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdx": "^2.0.0", "acorn": "^8.0.0", "collapse-white-space": "^2.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-util-scope": "^1.0.0", "estree-walker": "^3.0.0", "hast-util-to-jsx-runtime": "^2.0.0", "markdown-extensions": "^2.0.0", "recma-build-jsx": "^1.0.0", "recma-jsx": "^1.0.0", "recma-stringify": "^1.0.0", "rehype-recma": "^1.0.0", "remark-mdx": "^3.0.0", "remark-parse": "^11.0.0", "remark-rehype": "^11.0.0", "source-map": "^0.7.0", "unified": "^11.0.0", "unist-util-position-from-estree": "^2.0.0", "unist-util-stringify-position": "^4.0.0", "unist-util-visit": "^5.0.0", "vfile": "^6.0.0" } }, "sha512-f6ZO2ifpwAQIpzGWaBQT2TXxPv6z3RBzQKpVftEWN78Vl/YweF1uwussDx8ECAXVtr3Rs89fKyG9YlzUs9DyGQ=="], + "@mistralai/mistralai": ["@mistralai/mistralai@1.14.1", "", { "dependencies": { "ws": "^8.18.0", "zod": "^3.25.0 || ^4.0.0", "zod-to-json-schema": "^3.24.1" } }, "sha512-IiLmmZFCCTReQgPAT33r7KQ1nYo5JPdvGkrkZqA8qQ2qB1GHgs5LoP5K2ICyrjnpw2n8oSxMM/VP+liiKcGNlQ=="], + "@monaco-editor/loader": ["@monaco-editor/loader@1.7.0", "", { "dependencies": { "state-local": "^1.0.6" } }, 
"sha512-gIwR1HrJrrx+vfyOhYmCZ0/JcWqG5kbfG7+d3f/C1LXk2EvzAbHSg3MQ5lO2sMlo9izoAZ04shohfKLVT6crVA=="], "@monaco-editor/react": ["@monaco-editor/react@4.7.0", "", { "dependencies": { "@monaco-editor/loader": "^1.5.0" }, "peerDependencies": { "monaco-editor": ">= 0.25.0 < 1", "react": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0", "react-dom": "^16.8.0 || ^17.0.0 || ^18.0.0 || ^19.0.0" } }, "sha512-cyzXQCtO47ydzxpQtCGSQGOC8Gk3ZUeBXFAxD+CWXYFo5OqZyZUonFl0DwUlTyAfRHntBfw2p3w4s9R6oe1eCA=="], + "@nodable/entities": ["@nodable/entities@2.1.0", "", {}, "sha512-nyT7T3nbMyBI/lvr6L5TyWbFJAI9FTgVRakNoBqCD+PmID8DzFrrNdLLtHMwMszOtqZa8PAOV24ZqDnQrhQINA=="], + "@nodelib/fs.scandir": ["@nodelib/fs.scandir@2.1.5", "", { "dependencies": { "@nodelib/fs.stat": "2.0.5", "run-parallel": "^1.1.9" } }, "sha512-vq24Bq3ym5HEQm2NKCr3yXDwjc7vTsEThRDnkp2DK9p1uqLR+DHurm/NOTo0KG7HYHU7eppKZj3MyqYuMBf62g=="], "@nodelib/fs.stat": ["@nodelib/fs.stat@2.0.5", "", {}, "sha512-RkhPPp2zrqDAQA/2jNhnztcPAlv64XdhIp7a7454A5ovI7Bukxgt7MX7udwAu3zg1DcpPU0rz3VV1SeaqvY4+A=="], @@ -579,8 +662,98 @@ "@shikijs/vscode-textmate": ["@shikijs/vscode-textmate@10.0.2", "", {}, "sha512-83yeghZ2xxin3Nj8z1NMd/NCuca+gsYXswywDy5bHvwlWL8tpTQmzGeUuHd9FC3E/SBEMvzJRwWEOz5gGes9Qg=="], + "@sinclair/typebox": ["@sinclair/typebox@0.34.49", "", {}, "sha512-brySQQs7Jtn0joV8Xh9ZV/hZb9Ozb0pmazDIASBkYKCjXrXU3mpcFahmK/z4YDhGkQvP9mWJbVyahdtU5wQA+A=="], + "@sindresorhus/merge-streams": ["@sindresorhus/merge-streams@4.0.0", "", {}, "sha512-tlqY9xq5ukxTUZBmoOp+m61cqwQD5pHJtFY3Mn8CA8ps6yghLH/Hw8UPdqg4OLmFW3IFlcXnQNmo/dh8HzXYIQ=="], + "@smithy/config-resolver": ["@smithy/config-resolver@4.4.17", "", { "dependencies": { "@smithy/node-config-provider": "^4.3.14", "@smithy/types": "^4.14.1", "@smithy/util-config-provider": "^4.2.2", "@smithy/util-endpoints": "^3.4.2", "@smithy/util-middleware": "^4.2.14", "tslib": "^2.6.2" } }, "sha512-TzDZcAnhTyAHbXVxWZo7/tEcrIeFq20IBk8So3OLOetWpR8EwY/yEqBMBFaJMeyEiREDq4NfEl+qO3OAUD+vbQ=="], + + "@smithy/core": 
["@smithy/core@3.23.17", "", { "dependencies": { "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "@smithy/url-parser": "^4.2.14", "@smithy/util-base64": "^4.3.2", "@smithy/util-body-length-browser": "^4.2.2", "@smithy/util-middleware": "^4.2.14", "@smithy/util-stream": "^4.5.25", "@smithy/util-utf8": "^4.2.2", "@smithy/uuid": "^1.1.2", "tslib": "^2.6.2" } }, "sha512-x7BlLbUFL8NWCGjMF9C+1N5cVCxcPa7g6Tv9B4A2luWx3be3oU8hQ96wIwxe/s7OhIzvoJH73HAUSg5JXVlEtQ=="], + + "@smithy/credential-provider-imds": ["@smithy/credential-provider-imds@4.2.14", "", { "dependencies": { "@smithy/node-config-provider": "^4.3.14", "@smithy/property-provider": "^4.2.14", "@smithy/types": "^4.14.1", "@smithy/url-parser": "^4.2.14", "tslib": "^2.6.2" } }, "sha512-Au28zBN48ZAoXdooGUHemuVBrkE+Ie6RPmGNIAJsFqj33Vhb6xAgRifUydZ2aY+M+KaMAETAlKk5NC5h1G7wpg=="], + + "@smithy/eventstream-codec": ["@smithy/eventstream-codec@4.2.14", "", { "dependencies": { "@aws-crypto/crc32": "5.2.0", "@smithy/types": "^4.14.1", "@smithy/util-hex-encoding": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-erZq0nOIpzfeZdCyzZjdJb4nVSKLUmSkaQUVkRGQTXs30gyUGeKnrYEg+Xe1W5gE3aReS7IgsvANwVPxSzY6Pw=="], + + "@smithy/eventstream-serde-browser": ["@smithy/eventstream-serde-browser@4.2.14", "", { "dependencies": { "@smithy/eventstream-serde-universal": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-8IelTCtTctWRbb+0Dcy+C0aICh1qa0qWXqgjcXDmMuCvPJRnv26hiDZoAau2ILOniki65mCPKqOQs/BaWvO4CQ=="], + + "@smithy/eventstream-serde-config-resolver": ["@smithy/eventstream-serde-config-resolver@4.3.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-sqHiHpYRYo3FJlaIxD1J8PhbcmJAm7IuM16mVnwSkCToD7g00IBZzKuiLNMGmftULmEUX6/UAz8/NN5uMP8bVA=="], + + "@smithy/eventstream-serde-node": ["@smithy/eventstream-serde-node@4.2.14", "", { "dependencies": { "@smithy/eventstream-serde-universal": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, 
"sha512-Ht/8BuGlKfFTy0H3+8eEu0vdpwGztCnaLLXtpXNdQqiR7Hj4vFScU3T436vRAjATglOIPjJXronY+1WxxNLSiw=="], + + "@smithy/eventstream-serde-universal": ["@smithy/eventstream-serde-universal@4.2.14", "", { "dependencies": { "@smithy/eventstream-codec": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-lWyt4T2XQZUZgK3tQ3Wn0w3XBvZsK/vjTuJl6bXbnGZBHH0ZUSONTYiK9TgjTTzU54xQr3DRFwpjmhp0oLm3gg=="], + + "@smithy/fetch-http-handler": ["@smithy/fetch-http-handler@5.3.17", "", { "dependencies": { "@smithy/protocol-http": "^5.3.14", "@smithy/querystring-builder": "^4.2.14", "@smithy/types": "^4.14.1", "@smithy/util-base64": "^4.3.2", "tslib": "^2.6.2" } }, "sha512-bXOvQzaSm6MnmLaWA1elgfQcAtN4UP3vXqV97bHuoOrHQOJiLT3ds6o9eo5bqd0TJfRFpzdGnDQdW3FACiAVdw=="], + + "@smithy/hash-node": ["@smithy/hash-node@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "@smithy/util-buffer-from": "^4.2.2", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-8ZBDY2DD4wr+GGjTpPtiglEsqr0lUP+KHqgZcWczFf6qeZ/YRjMIOoQWVQlmwu7EtxKTd8YXD8lblmYcpBIA1g=="], + + "@smithy/invalid-dependency": ["@smithy/invalid-dependency@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-c21qJiTSb25xvvOp+H2TNZzPCngrvl5vIPqPB8zQ/DmJF4QWXO19x1dWfMJZ6wZuuWUPPm0gV8C0cU3+ifcWuw=="], + + "@smithy/is-array-buffer": ["@smithy/is-array-buffer@4.2.2", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-n6rQ4N8Jj4YTQO3YFrlgZuwKodf4zUFs7EJIWH86pSCWBaAtAGBFfCM7Wx6D2bBJ2xqFNxGBSrUWswT3M0VJow=="], + + "@smithy/middleware-content-length": ["@smithy/middleware-content-length@4.2.14", "", { "dependencies": { "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-xhHq7fX4/3lv5NHxLUk3OeEvl0xZ+Ek3qIbWaCL4f9JwgDZEclPBElljaZCAItdGPQl/kSM4LPMOpy1MYgprpw=="], + + "@smithy/middleware-endpoint": ["@smithy/middleware-endpoint@4.4.32", "", { "dependencies": { "@smithy/core": "^3.23.17", "@smithy/middleware-serde": "^4.2.20", 
"@smithy/node-config-provider": "^4.3.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "@smithy/url-parser": "^4.2.14", "@smithy/util-middleware": "^4.2.14", "tslib": "^2.6.2" } }, "sha512-ZZkgyjnJppiZbIm6Qbx92pbXYi1uzenIvGhBSCDlc7NwuAkiqSgS75j1czAD25ZLs2FjMjYy1q7gyRVWG6JA0Q=="], + + "@smithy/middleware-retry": ["@smithy/middleware-retry@4.5.7", "", { "dependencies": { "@smithy/core": "^3.23.17", "@smithy/node-config-provider": "^4.3.14", "@smithy/protocol-http": "^5.3.14", "@smithy/service-error-classification": "^4.3.1", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "@smithy/util-middleware": "^4.2.14", "@smithy/util-retry": "^4.3.6", "@smithy/uuid": "^1.1.2", "tslib": "^2.6.2" } }, "sha512-bRt6ZImqVSeTk39Nm81K20ObIiAZ3WefY7G6+iz/0tZjs4dgRRjvRX2sgsH+zi6iDCRR/aQvQofLKxxz4rPBZg=="], + + "@smithy/middleware-serde": ["@smithy/middleware-serde@4.2.20", "", { "dependencies": { "@smithy/core": "^3.23.17", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-Lx9JMO9vArPtiChE3wbEZ5akMIDQpWQtlu90lhACQmNOXcGXRbaDywMHDzuDZ2OkZzP+9wQfZi3YJT9F67zTQQ=="], + + "@smithy/middleware-stack": ["@smithy/middleware-stack@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-2dvkUKLuFdKsCRmOE4Mn63co0Djtsm+JMh0bYZQupN1pJwMeE8FmQmRLLzzEMN0dnNi7CDCYYH8F0EVwWiPBeA=="], + + "@smithy/node-config-provider": ["@smithy/node-config-provider@4.3.14", "", { "dependencies": { "@smithy/property-provider": "^4.2.14", "@smithy/shared-ini-file-loader": "^4.4.9", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-S+gFjyo/weSVL0P1b9Ts8C/CwIfNCgUPikk3sl6QVsfE/uUuO+QsF+NsE/JkpvWqqyz1wg7HFdiaZuj5CoBMRg=="], + + "@smithy/node-http-handler": ["@smithy/node-http-handler@4.6.1", "", { "dependencies": { "@smithy/protocol-http": "^5.3.14", "@smithy/querystring-builder": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, 
"sha512-iB+orM4x3xrr57X3YaXazfKnntl0LHlZB1kcXSGzMV1Tt0+YwEjGlbjk/44qEGtBzXAz6yFDzkYTKSV6Pj2HUg=="], + + "@smithy/property-provider": ["@smithy/property-provider@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-WuM31CgfsnQ/10i7NYr0PyxqknD72Y5uMfUMVSniPjbEPceiTErb4eIqJQ+pdxNEAUEWrewrGjIRjVbVHsxZiQ=="], + + "@smithy/protocol-http": ["@smithy/protocol-http@5.3.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-dN5F8kHx8RNU0r+pCwNmFZyz6ChjMkzShy/zup6MtkRmmix4vZzJdW+di7x//b1LiynIev88FM18ie+wwPcQtQ=="], + + "@smithy/querystring-builder": ["@smithy/querystring-builder@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "@smithy/util-uri-escape": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-XYA5Z0IqTeF+5XDdh4BBmSA0HvbgVZIyv4cmOoUheDNR57K1HgBp9ukUMx3Cr3XpDHHpLBnexPE3LAtDsZkj2A=="], + + "@smithy/querystring-parser": ["@smithy/querystring-parser@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-hr+YyqBD23GVvRxGGrcc/oOeNlK3PzT5Fu4dzrDXxzS1LpFiuL2PQQqKPs87M79aW7ziMs+nvB3qdw77SqE7Lw=="], + + "@smithy/service-error-classification": ["@smithy/service-error-classification@4.3.1", "", { "dependencies": { "@smithy/types": "^4.14.1" } }, "sha512-aUQuDGh760ts/8MU+APjIZhlLPKhIIfqyzZaJikLEIMrdxFvxuLYD0WxWzaYWpmLbQlXDe9p7EWM3HsBe0K6Gw=="], + + "@smithy/shared-ini-file-loader": ["@smithy/shared-ini-file-loader@4.4.9", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-495/V2I15SHgedSJoDPD23JuSfKAp726ZI1V0wtjB07Wh7q/0tri/0e0DLefZCHgxZonrGKt/OCTpAtP1wE1kQ=="], + + "@smithy/signature-v4": ["@smithy/signature-v4@5.3.14", "", { "dependencies": { "@smithy/is-array-buffer": "^4.2.2", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "@smithy/util-hex-encoding": "^4.2.2", "@smithy/util-middleware": "^4.2.14", "@smithy/util-uri-escape": "^4.2.2", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, 
"sha512-1D9Y/nmlVjCeSivCbhZ7hgEpmHyY1h0GvpSZt3l0xcD9JjmjVC1CHOozS6+Gh+/ldMH8JuJ6cujObQqfayAVFA=="], + + "@smithy/smithy-client": ["@smithy/smithy-client@4.12.13", "", { "dependencies": { "@smithy/core": "^3.23.17", "@smithy/middleware-endpoint": "^4.4.32", "@smithy/middleware-stack": "^4.2.14", "@smithy/protocol-http": "^5.3.14", "@smithy/types": "^4.14.1", "@smithy/util-stream": "^4.5.25", "tslib": "^2.6.2" } }, "sha512-y/Pcj1V9+qG98gyu1gvftHB7rDpdh+7kIBIggs55yGm3JdtBV8GT8IFF3a1qxZ79QnaJHX9GXzvBG6tAd+czJA=="], + + "@smithy/types": ["@smithy/types@4.14.1", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-59b5HtSVrVR/eYNei3BUj3DCPKD/G7EtDDe7OEJE7i7FtQFugYo6MxbotS8mVJkLNVf8gYaAlEBwwtJ9HzhWSg=="], + + "@smithy/url-parser": ["@smithy/url-parser@4.2.14", "", { "dependencies": { "@smithy/querystring-parser": "^4.2.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-p06BiBigJ8bTA3MgnOfCtDUWnAMY0YfedO/GRpmc7p+wg3KW8vbXy1xwSu5ASy0wV7rRYtlfZOIKH4XqfhjSQQ=="], + + "@smithy/util-base64": ["@smithy/util-base64@4.3.2", "", { "dependencies": { "@smithy/util-buffer-from": "^4.2.2", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-XRH6b0H/5A3SgblmMa5ErXQ2XKhfbQB+Fm/oyLZ2O2kCUrwgg55bU0RekmzAhuwOjA9qdN5VU2BprOvGGUkOOQ=="], + + "@smithy/util-body-length-browser": ["@smithy/util-body-length-browser@4.2.2", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-JKCrLNOup3OOgmzeaKQwi4ZCTWlYR5H4Gm1r2uTMVBXoemo1UEghk5vtMi1xSu2ymgKVGW631e2fp9/R610ZjQ=="], + + "@smithy/util-body-length-node": ["@smithy/util-body-length-node@4.2.3", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-ZkJGvqBzMHVHE7r/hcuCxlTY8pQr1kMtdsVPs7ex4mMU+EAbcXppfo5NmyxMYi2XU49eqaz56j2gsk4dHHPG/g=="], + + "@smithy/util-buffer-from": ["@smithy/util-buffer-from@4.2.2", "", { "dependencies": { "@smithy/is-array-buffer": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-FDXD7cvUoFWwN6vtQfEta540Y/YBe5JneK3SoZg9bThSoOAC/eGeYEua6RkBgKjGa/sz6Y+DuBZj3+YEY21y4Q=="], + + "@smithy/util-config-provider": 
["@smithy/util-config-provider@4.2.2", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-dWU03V3XUprJwaUIFVv4iOnS1FC9HnMHDfUrlNDSh4315v0cWyaIErP8KiqGVbf5z+JupoVpNM7ZB3jFiTejvQ=="], + + "@smithy/util-defaults-mode-browser": ["@smithy/util-defaults-mode-browser@4.3.49", "", { "dependencies": { "@smithy/property-provider": "^4.2.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-a5bNrdiONYB/qE2BuKegvUMd/+ZDwdg4vsNuuSzYE8qs2EYAdK9CynL+Rzn29PbPiUqoz/cbpRbcLzD5lEevHw=="], + + "@smithy/util-defaults-mode-node": ["@smithy/util-defaults-mode-node@4.2.54", "", { "dependencies": { "@smithy/config-resolver": "^4.4.17", "@smithy/credential-provider-imds": "^4.2.14", "@smithy/node-config-provider": "^4.3.14", "@smithy/property-provider": "^4.2.14", "@smithy/smithy-client": "^4.12.13", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-g1cvrJvOnzeJgEdf7AE4luI7gp6L8weE0y9a9wQUSGtjb8QRHDbCJYuE4Sy0SD9N8RrnNPFsPltAz/OSoBR9Zw=="], + + "@smithy/util-endpoints": ["@smithy/util-endpoints@3.4.2", "", { "dependencies": { "@smithy/node-config-provider": "^4.3.14", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-a55Tr+3OKld4TTtnT+RhKOQHyPxm3j/xL4OR83WBUhLJaKDS9dnJ7arRMOp3t31dcLhApwG9bgvrRXBHlLdIkg=="], + + "@smithy/util-hex-encoding": ["@smithy/util-hex-encoding@4.2.2", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-Qcz3W5vuHK4sLQdyT93k/rfrUwdJ8/HZ+nMUOyGdpeGA1Wxt65zYwi3oEl9kOM+RswvYq90fzkNDahPS8K0OIg=="], + + "@smithy/util-middleware": ["@smithy/util-middleware@4.2.14", "", { "dependencies": { "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, "sha512-1Su2vj9RYNDEv/V+2E+jXkkwGsgR7dc4sfHn9Z7ruzQHJIEni9zzw5CauvRXlFJfmgcqYP8fWa0dkh2Q2YaQyw=="], + + "@smithy/util-retry": ["@smithy/util-retry@4.3.8", "", { "dependencies": { "@smithy/service-error-classification": "^4.3.1", "@smithy/types": "^4.14.1", "tslib": "^2.6.2" } }, 
"sha512-LUIxbTBi+OpvXpg91poGA6BdyoleMDLnfXjVDqyi2RvZmTveY5loE/FgYUBCR5LU2BThW2SoZRh8dTIIy38IPw=="], + + "@smithy/util-stream": ["@smithy/util-stream@4.5.25", "", { "dependencies": { "@smithy/fetch-http-handler": "^5.3.17", "@smithy/node-http-handler": "^4.6.1", "@smithy/types": "^4.14.1", "@smithy/util-base64": "^4.3.2", "@smithy/util-buffer-from": "^4.2.2", "@smithy/util-hex-encoding": "^4.2.2", "@smithy/util-utf8": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-/PFpG4k8Ze8Ei+mMKj3oiPICYekthuzePZMgZbCqMiXIHHf4n2aZ4Ps0aSRShycFTGuj/J6XldmC0x0DwednIA=="], + + "@smithy/util-uri-escape": ["@smithy/util-uri-escape@4.2.2", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-2kAStBlvq+lTXHyAZYfJRb/DfS3rsinLiwb+69SstC9Vb0s9vNWkRwpnj918Pfi85mzi42sOqdV72OLxWAISnw=="], + + "@smithy/util-utf8": ["@smithy/util-utf8@4.2.2", "", { "dependencies": { "@smithy/util-buffer-from": "^4.2.2", "tslib": "^2.6.2" } }, "sha512-75MeYpjdWRe8M5E3AW0O4Cx3UadweS+cwdXjwYGBW5h/gxxnbeZ877sLPX/ZJA9GVTlL/qG0dXP29JWFCD1Ayw=="], + + "@smithy/uuid": ["@smithy/uuid@1.1.2", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-O/IEdcCUKkubz60tFbGA7ceITTAJsty+lBjNoorP4Z6XRqaFb/OjQjZODophEcuq68nKm6/0r+6/lLQ+XVpk8g=="], + "@standard-schema/spec": ["@standard-schema/spec@1.1.0", "", {}, "sha512-l2aFy5jALhniG5HgqrD6jXLi/rUWrKvqN/qJx6yoJsgKhblVd+iqqU4RCXavm/jPityDo5TCvKMnpjKnOriy0w=="], "@standard-schema/utils": ["@standard-schema/utils@0.3.0", "", {}, "sha512-e7Mew686owMaPJVNNLs55PUvgz371nKgwsc4vxE49zsODpJEnxgxRo2y/OKrqueavXgZNMDVj3DdHFlaSAeU8g=="], @@ -637,6 +810,8 @@ "@tanstack/virtual-file-routes": ["@tanstack/virtual-file-routes@1.161.7", "", { "bin": { "intent": "bin/intent.js" } }, "sha512-olW33+Cn+bsCsZKPwEGhlkqS6w3M2slFv11JIobdnCFKMLG97oAI2kWKdx5/zsywTL8flpnoIgaZZPlQTFYhdQ=="], + "@tootallnate/quickjs-emscripten": ["@tootallnate/quickjs-emscripten@0.23.0", "", {}, "sha512-C5Mc6rdnsaJDjO3UpGW/CQTHtCKaYlScZTly4JIu97Jxo/odCiH0ITnDXSJPTOrEKk/ycSZ0AOgTmkDtkOsvIA=="], + "@types/babel__core": 
["@types/babel__core@7.20.5", "", { "dependencies": { "@babel/parser": "^7.20.7", "@babel/types": "^7.20.7", "@types/babel__generator": "*", "@types/babel__template": "*", "@types/babel__traverse": "*" } }, "sha512-qoQprZvz5wQFJwMDqeseRXWv3rqMvhgpbXFfVyWhbx9X47POIA6i/+dXefEmZKoAgOaTdaIgNSMqMIU61yRyzA=="], "@types/babel__generator": ["@types/babel__generator@7.27.0", "", { "dependencies": { "@babel/types": "^7.0.0" } }, "sha512-ufFd2Xi92OAVPYsy+P4n7/U7e68fex0+Ee8gSG9KX7eo084CWiQ4sdxktvdl0bOPupXtVJPY19zk6EwWqUQ8lg=="], @@ -693,6 +868,8 @@ "@types/react-dom": ["@types/react-dom@19.2.3", "", { "peerDependencies": { "@types/react": "^19.2.0" } }, "sha512-jp2L/eY6fn+KgVVQAOqYItbF0VY/YApe5Mz2F0aykSO8gx31bYCZyvSeYxCHKvzHG5eZjc+zyaS5BrBWya2+kQ=="], + "@types/retry": ["@types/retry@0.12.0", "", {}, "sha512-wWKOClTTiizcZhXnPY4wikVAwmdYHp8q6DmC+EJUzAMsycb7HB32Kh9RN4+0gExjmPmZSAQjgURXIGATPegAvA=="], + "@types/sax": ["@types/sax@1.2.7", "", { "dependencies": { "@types/node": "*" } }, "sha512-rO73L89PJxeYM3s3pPPjiPgVVcymqU490g0YO5n5By0k2Erzj6tay/4lr1CHAAU4JyOWd1rpQ8bCf6cZfHU96A=="], "@types/semver": ["@types/semver@7.7.1", "", {}, "sha512-FmgJfu+MOcQ370SD0ev7EI8TlCAfKYU+B4m5T3yXc1CiRN94g/SZPtsCkk506aUDtlMnFZvasDwHHUcZUEaYuA=="], @@ -713,10 +890,16 @@ "acorn-jsx": ["acorn-jsx@5.3.2", "", { "peerDependencies": { "acorn": "^6.0.0 || ^7.0.0 || ^8.0.0" } }, "sha512-rq9s+JNhf0IChjtDXxllJ7g41oZk5SlXtp0LHwyA5cejwn7vKmKp4pPri6YEePv2PU65sAsegbXtIinmDFDXgQ=="], + "agent-base": ["agent-base@7.1.4", "", {}, "sha512-MnA+YT8fwfJPgBx3m60MNqakm30XOkyIoH1y6huTQvC0PwZG7ki8NacLBcrPbNoo8vEZy7Jpuk7+jMO+CUovTQ=="], + "agentv": ["agentv@workspace:apps/cli"], "ai": ["ai@6.0.116", "", { "dependencies": { "@ai-sdk/gateway": "3.0.66", "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19", "@opentelemetry/api": "1.9.0" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-7yM+cTmyRLeNIXwt4Vj+mrrJgVQ9RMIW5WO0ydoLoYkewIvsMcvUmqS4j2RJTUXaF1HphwmSKUMQ/HypNRGOmA=="], + "ajv": ["ajv@8.20.0", 
"", { "dependencies": { "fast-deep-equal": "^3.1.3", "fast-uri": "^3.0.1", "json-schema-traverse": "^1.0.0", "require-from-string": "^2.0.2" } }, "sha512-Thbli+OlOj+iMPYFBVBfJ3OmCAnaSyNn4M1vz9T6Gka5Jt9ba/HIR56joy65tY6kx/FCF5VXNB819Y7/GUrBGA=="], + + "ajv-formats": ["ajv-formats@3.0.1", "", { "dependencies": { "ajv": "^8.0.0" } }, "sha512-8iUql50EUR+uUcdRQ3HDqa6EVyo3docL8g5WJ3FNcWmu62IbkGUue/pEyLBW8VGKKucTPgqeks4fIU1DA4yowQ=="], + "ansi-align": ["ansi-align@3.0.1", "", { "dependencies": { "string-width": "^4.1.0" } }, "sha512-IOfwwBF5iczOjp/WeY4YxyjqAFMQoZufdQWDd19SEExbVLNXqvpzSJ/M7Za4/sCPmQ0+GRquoA7bGcINcxew6w=="], "ansi-regex": ["ansi-regex@6.2.2", "", {}, "sha512-Bq3SmSpyFHaWjPk8If9yc6svM8c56dB5BAtW4Qbw5jHTwwXXcTLoRMkpDJp6VL0XzlWaCHTXrkFURMYmD0sLqg=="], @@ -761,22 +944,32 @@ "base-64": ["base-64@1.0.0", "", {}, "sha512-kwDPIFCGx0NZHog36dj+tHiwP4QMzsZ3AgMViUBKI0+V5n4U0ufTCUMhnQ04diaRI8EX/QcPfql7zlhZ7j4zgg=="], + "base64-js": ["base64-js@1.5.1", "", {}, "sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA=="], + "baseline-browser-mapping": ["baseline-browser-mapping@2.10.11", "", { "bin": { "baseline-browser-mapping": "dist/cli.cjs" } }, "sha512-DAKrHphkJyiGuau/cFieRYhcTFeK/lBuD++C7cZ6KZHbMhBrisoi+EvhQ5RZrIfV5qwsW8kgQ07JIC+MDJRAhg=="], + "basic-ftp": ["basic-ftp@5.3.1", "", {}, "sha512-bopVNp6ugyA150DDuZfPFdt1KZ5a94ZDiwX4hMgZDzF+GttD80lEy8kj98kbyhLXnPvhtIo93mdnLIjpCAeeOw=="], + "bcp-47": ["bcp-47@2.1.0", "", { "dependencies": { "is-alphabetical": "^2.0.0", "is-alphanumerical": "^2.0.0", "is-decimal": "^2.0.0" } }, "sha512-9IIS3UPrvIa1Ej+lVDdDwO7zLehjqsaByECw0bu2RRGP73jALm6FYbzI5gWbgHLvNdkvfXB5YrSbocZdOS0c0w=="], "bcp-47-match": ["bcp-47-match@2.0.3", "", {}, "sha512-JtTezzbAibu8G0R9op9zb3vcWZd9JF6M0xOYGPn0fNCd7wOpRB1mU2mH9T8gaBGbAAyIIVgB2G7xG0GP98zMAQ=="], + "bignumber.js": ["bignumber.js@9.3.1", "", {}, "sha512-Ko0uX15oIUS7wJ3Rb30Fs6SkVbLmPBAKdlm7q9+ak9bbIeFf0MwuBsQV6z7+X768/cHsfg+WlysDWJcmthjsjQ=="], + 
"binary-extensions": ["binary-extensions@2.3.0", "", {}, "sha512-Ceh+7ox5qe7LJuLHoY0feh3pHuUDHAcRUeyL2VYghZwfpkNIy/+8Ocg0a3UuSoYzavmylwuLWQOf3hl0jjMMIw=="], "boolbase": ["boolbase@1.0.0", "", {}, "sha512-JZOSA7Mo9sNGB8+UjSgzdLtokWAky1zbztM3WRLCbZ70/3cTANmQmOdR7y2g+J0e2WXywy1yS468tY+IruqEww=="], + "bowser": ["bowser@2.14.1", "", {}, "sha512-tzPjzCxygAKWFOJP011oxFHs57HzIhOEracIgAePE4pqB3LikALKnSzUyU4MGs9/iCEUuHlAJTjTc5M+u7YEGg=="], + "boxen": ["boxen@8.0.1", "", { "dependencies": { "ansi-align": "^3.0.1", "camelcase": "^8.0.0", "chalk": "^5.3.0", "cli-boxes": "^3.0.0", "string-width": "^7.2.0", "type-fest": "^4.21.0", "widest-line": "^5.0.0", "wrap-ansi": "^9.0.0" } }, "sha512-F3PH5k5juxom4xktynS7MoFY+NUWH5LC4CnH11YB8NPew+HLpmBLCybSAEyb2F+4pRXhuhWqFesoQd6DAyc2hw=="], "braces": ["braces@3.0.3", "", { "dependencies": { "fill-range": "^7.1.1" } }, "sha512-yQbXgO/OSZVD2IsiLlro+7Hf6Q18EJrKSEsdoMzKePKXct3gvD8oLcOQdIzGupr5Fj+EDe8gO/lxc1BzfMpxvA=="], "browserslist": ["browserslist@4.28.1", "", { "dependencies": { "baseline-browser-mapping": "^2.9.0", "caniuse-lite": "^1.0.30001759", "electron-to-chromium": "^1.5.263", "node-releases": "^2.0.27", "update-browserslist-db": "^1.2.0" }, "bin": { "browserslist": "cli.js" } }, "sha512-ZC5Bd0LgJXgwGqUknZY/vkUQ04r8NXnJZ3yYi4vDmSiZmC/pdSN0NbNRPxZpbtO4uAfDUAFffO8IZoM3Gj8IkA=="], + "buffer-equal-constant-time": ["buffer-equal-constant-time@1.0.1", "", {}, "sha512-zRpUiDwd/xk6ADqPMATG8vc9VPrkck7T07OIx0gnjmJAnHnTVXNQG3vfvWNuiZIkwu9KrKdA1iJKfsfTVxE6NA=="], + "bun-types": ["bun-types@1.3.4", "", { "dependencies": { "@types/node": "*" } }, "sha512-5ua817+BZPZOlNaRgGBpZJOSAQ9RQ17pkwPD0yR7CfJg+r8DgIILByFifDTa+IPDDxzf5VNhtNlcKqFzDgJvlQ=="], "bundle-require": ["bundle-require@5.1.0", "", { "dependencies": { "load-tsconfig": "^0.2.3" }, "peerDependencies": { "esbuild": ">=0.18" } }, "sha512-3WrrOuZiyaaZPWiEt4G3+IffISVC9HYlWueJEBWED4ZH4aIAC2PnkdnuRrR94M+w6yGWn4AglWtJtBI8YqvgoA=="], @@ -877,6 +1070,8 @@ "d3-timer": ["d3-timer@3.0.1", "", {}, 
"sha512-ndfJ/JxxMd3nw31uyKoY2naivF+r29V+Lc0svZxe1JvvIRmi8hUsrMvdOwgS1o6uBHmiz91geQ0ylPP0aj1VUA=="], + "data-uri-to-buffer": ["data-uri-to-buffer@6.0.2", "", {}, "sha512-7hvf7/GW8e86rW0ptuwS3OcBGDjIi6SZva7hCyWC0yYry2cOPmLIjXAUHI6DK2HsnwJd9ifmt57i8eV2n4YNpw=="], + "debug": ["debug@4.4.3", "", { "dependencies": { "ms": "^2.1.3" } }, "sha512-RGwwWnwQvkVfavKVt22FGLw+xYSdzARwm0ru6DhTVA3umU5hZc28V3kO4stgYryrTlLpuvgI9GiijltAjNbcqA=="], "decimal.js-light": ["decimal.js-light@2.5.1", "", {}, "sha512-qIMFpTMZmny+MMIitAB6D7iVPEorVw6YQRWkvarTkT4tBeSLLiHzcwj6q0MmYSFCiVpiqPJTJEYIrpcPzVEIvg=="], @@ -887,6 +1082,8 @@ "defu": ["defu@6.1.4", "", {}, "sha512-mEQCMmwJu317oSz8CwdIOdwf3xMif1ttiM8LTufzc3g6kR+9Pe236twL8j3IYT1F7GfRgGcW6MWxzZjLIkuHIg=="], + "degenerator": ["degenerator@5.0.1", "", { "dependencies": { "ast-types": "^0.13.4", "escodegen": "^2.1.0", "esprima": "^4.0.1" } }, "sha512-TllpMR/t0M5sqCXfj85i4XaAzxmS5tVA16dqvdkMwGmzI+dXLXnw3J+3Vdv7VKw+ThlTMboK6i9rnZ6Nntj5CQ=="], + "delayed-stream": ["delayed-stream@1.0.0", "", {}, "sha512-ZySD7Nf91aLB0RxL4KGrKHBXl7Eds1DAmEdcoVawXnLD7SDhpNgtuII2aAkg7a7QS41jxPSZ17p4VdGnMHk3MQ=="], "dequal": ["dequal@2.0.3", "", {}, "sha512-0je+qPKHEMohvfRTCEo3CrPG6cAzAYgmzKyxRiYSSDkS6eGJdyVJm7WaYA5ECaAD9wLB2T4EEeymA5aFVcYXCA=="], @@ -927,6 +1124,8 @@ "easy-table": ["easy-table@1.1.0", "", { "optionalDependencies": { "wcwidth": ">=1.0.1" } }, "sha512-oq33hWOSSnl2Hoh00tZWaIPi1ievrD9aFG82/IgjlycAnW9hHx5PkJiXpxPsgEE+H7BsbVQXFVFST8TEXS6/pA=="], + "ecdsa-sig-formatter": ["ecdsa-sig-formatter@1.0.11", "", { "dependencies": { "safe-buffer": "^5.0.1" } }, "sha512-nagl3RYrbNv6kQkeJIpt6NJZy8twLB/2vtz6yN9Z4vRKHN4/QZJIEbqohALSgwKdnksuY3k5Addp5lg8sVoVcQ=="], + "electron-to-chromium": ["electron-to-chromium@1.5.328", "", {}, "sha512-QNQ5l45DzYytThO21403XN3FvK0hOkWDG8viNf6jqS42msJ8I4tGDSpBCgvDRRPnkffafiwAym2X2eHeGD2V0w=="], "emoji-regex": ["emoji-regex@10.6.0", "", {}, "sha512-toUI84YS5YmxW219erniWD0CIVOo46xGKColeNQRgOzDorgBi1v4D71/OFzgD9GO2UGKIv1C3Sp8DAn0+j5w7A=="], 
@@ -957,8 +1156,12 @@ "escape-string-regexp": ["escape-string-regexp@5.0.0", "", {}, "sha512-/veY75JbMK4j1yjvuUxuVsiS/hr/4iHs9FTT6cgTexxdE0Ly/glccBAkloH/DofkjRbZU3bnoj38mOmhkZ0lHw=="], + "escodegen": ["escodegen@2.1.0", "", { "dependencies": { "esprima": "^4.0.1", "estraverse": "^5.2.0", "esutils": "^2.0.2" }, "optionalDependencies": { "source-map": "~0.6.1" }, "bin": { "esgenerate": "bin/esgenerate.js", "escodegen": "bin/escodegen.js" } }, "sha512-2NlIDTwUWJN0mRPQOdtQBzbUHvdGY2P1VXSyU83Q3xKxM7WHX2Ql8dKq782Q9TgQUNOLEzEYu9bzLNj1q88I5w=="], + "esprima": ["esprima@4.0.1", "", { "bin": { "esparse": "./bin/esparse.js", "esvalidate": "./bin/esvalidate.js" } }, "sha512-eGuFFw7Upda+g4p+QHvnW0RyTX/SVeJBDM/gCtMARO0cLuT2HcEKnTPvhjV6aGeqrCB/sbNop0Kszm0jsaWU4A=="], + "estraverse": ["estraverse@5.3.0", "", {}, "sha512-MMdARuVEQziNTeJD8DgMqmhwR11BRQ/cBP+pLtYdSTnf3MIO8fFeiINEbX36ZdNlfU/7A9f3gUw49B3oQsvwBA=="], + "estree-util-attach-comments": ["estree-util-attach-comments@3.0.0", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-cKUwm/HUcTDsYh/9FgnuFqpfquUbwIqwKM26BVCGDPVgvaCl/nDCCjUfiLlx6lsEZ3Z4RFxNbOQ60pkaEwFxGw=="], "estree-util-build-jsx": ["estree-util-build-jsx@3.0.1", "", { "dependencies": { "@types/estree-jsx": "^1.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-walker": "^3.0.0" } }, "sha512-8U5eiL6BTrPxp/CHbs2yMgP8ftMhR5ww1eIKoWRMlqvltHF8fZn5LRDvTKuxD3DUn+shRbLGqXemcP51oFCsGQ=="], @@ -973,6 +1176,8 @@ "estree-walker": ["estree-walker@3.0.3", "", { "dependencies": { "@types/estree": "^1.0.0" } }, "sha512-7RUKfXgSMMkzt6ZuXmqapOurLGPPfgj6l9uRZ7lRGolvk0y2yocc35LdcxKC5PQZdn2DMqioAQ2NoWcrTKmm6g=="], + "esutils": ["esutils@2.0.3", "", {}, "sha512-kVscqXk4OCp68SZ0dkgEKVi6/8ij300KBWTJq32P/dYeWTSwK41WyTxalN1eRmA5Z9UU/LX9D7FWSmV9SAYx6g=="], + "eventemitter3": ["eventemitter3@5.0.4", "", {}, "sha512-mlsTRyGaPBjPedk6Bvw+aqbsXDtoAyAzm5MO7JgU+yVRyMQ5O8bD4Kcci7BS85f93veegeCPkL8R4GLClnjLFw=="], "eventsource-parser": ["eventsource-parser@3.0.6", 
"", {}, "sha512-Vo1ab+QXPzZ4tCa8SwIHJFaSzy4R6SHf7BY79rFBDf0idraZWAkYrDjDj8uWaSm3S2TK+hJ7/t1CEmZ7jXw+pg=="], @@ -983,18 +1188,28 @@ "extend": ["extend@3.0.2", "", {}, "sha512-fjquC59cD7CyW6urNXK0FBufkZcoiGG80wTuPujX590cB5Ttln20E2UB4S/WARVqhXffZl2LNgS+gQdPIIim/g=="], + "fast-deep-equal": ["fast-deep-equal@3.1.3", "", {}, "sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q=="], + "fast-glob": ["fast-glob@3.3.3", "", { "dependencies": { "@nodelib/fs.stat": "^2.0.2", "@nodelib/fs.walk": "^1.2.3", "glob-parent": "^5.1.2", "merge2": "^1.3.0", "micromatch": "^4.0.8" } }, "sha512-7MptL8U0cqcFdzIzwOTHoilX9x5BrNqye7Z/LuC7kCMRio1EMSyqRK3BEAUD7sXRq4iT4AzTVuZdhgQ2TCvYLg=="], "fast-string-truncated-width": ["fast-string-truncated-width@3.0.3", "", {}, "sha512-0jjjIEL6+0jag3l2XWWizO64/aZVtpiGE3t0Zgqxv0DPuxiMjvB3M24fCyhZUO4KomJQPj3LTSUnDP3GpdwC0g=="], "fast-string-width": ["fast-string-width@3.0.2", "", { "dependencies": { "fast-string-truncated-width": "^3.0.2" } }, "sha512-gX8LrtNEI5hq8DVUfRQMbr5lpaS4nMIWV+7XEbXk2b8kiQIizgnlr12B4dA3ZEx3308ze0O4Q1R+cHts8kyUJg=="], + "fast-uri": ["fast-uri@3.1.0", "", {}, "sha512-iPeeDKJSWf4IEOasVVrknXpaBV0IApz/gp7S2bb7Z4Lljbl2MGJRqInZiUrQwV16cpzw/D3S5j5Julj/gT52AA=="], + "fast-wrap-ansi": ["fast-wrap-ansi@0.2.0", "", { "dependencies": { "fast-string-width": "^3.0.2" } }, "sha512-rLV8JHxTyhVmFYhBJuMujcrHqOT2cnO5Zxj37qROj23CP39GXubJRBUFF0z8KFK77Uc0SukZUf7JZhsVEQ6n8w=="], + "fast-xml-builder": ["fast-xml-builder@1.1.5", "", { "dependencies": { "path-expression-matcher": "^1.1.3" } }, "sha512-4TJn/8FKLeslLAH3dnohXqE3QSoxkhvaMzepOIZytwJXZO69Bfz0HBdDHzOTOon6G59Zrk6VQ2bEiv1t61rfkA=="], + + "fast-xml-parser": ["fast-xml-parser@5.7.2", "", { "dependencies": { "@nodable/entities": "^2.1.0", "fast-xml-builder": "^1.1.5", "path-expression-matcher": "^1.5.0", "strnum": "^2.2.3" }, "bin": { "fxparser": "src/cli/cli.js" } }, 
"sha512-P7oW7tLbYnhOLQk/Gv7cZgzgMPP/XN03K02/Jy6Y/NHzyIAIpxuZIM/YqAkfiXFPxA2CTm7NtCijK9EDu09u2w=="], + "fastq": ["fastq@1.19.1", "", { "dependencies": { "reusify": "^1.0.4" } }, "sha512-GwLTyxkCXjXbxqIhTsMI2Nui8huMPtnxg7krajPJAjnEG/iiOS7i+zCtWGZR9G0NBKbXKh6X9m9UIsYX/N6vvQ=="], "fdir": ["fdir@6.5.0", "", { "peerDependencies": { "picomatch": "^3 || ^4" }, "optionalPeers": ["picomatch"] }, "sha512-tIbYtZbucOs0BRGqPJkshJUYdL+SDH7dVM8gjy+ERp3WAUjLEFJE+02kanyHtwjWOnwrKYBiwAmM0p4kLJAnXg=="], + "fetch-blob": ["fetch-blob@3.2.0", "", { "dependencies": { "node-domexception": "^1.0.0", "web-streams-polyfill": "^3.0.3" } }, "sha512-7yAQpD2UMJzLi1Dqv7qFYnPbaPx7ZfFK6PiIxQ4PfkGPyNyl2Ugx+a/umUonmKqjhM4DnfbMvdX6otXq83soQQ=="], + "figures": ["figures@6.1.0", "", { "dependencies": { "is-unicode-supported": "^2.0.0" } }, "sha512-d+l3qxjSesT4V7v2fh+QnmFnUWv9lSpjarhShNTgBOfA0ttejbQUAlHLitbjkoRiDulW0OPoQPYIGhIC8ohejg=="], "fill-range": ["fill-range@7.1.1", "", { "dependencies": { "to-regex-range": "^5.0.1" } }, "sha512-YsGpe3WHLK8ZYi4tWDg2Jy3ebRz2rXowDxnld4bkQB00cc/1Zw9AWnC0i9ztDJitivtQvaI9KaLyKrc+hBW0yg=="], @@ -1009,10 +1224,16 @@ "form-data": ["form-data@4.0.5", "", { "dependencies": { "asynckit": "^0.4.0", "combined-stream": "^1.0.8", "es-set-tostringtag": "^2.1.0", "hasown": "^2.0.2", "mime-types": "^2.1.12" } }, "sha512-8RipRLol37bNs2bhoV67fiTEvdTrbMUYcFTiy3+wuuOnUog2QBHCZWXDRijWQfAkhBj2Uf5UnVaiWwA5vdd82w=="], + "formdata-polyfill": ["formdata-polyfill@4.0.10", "", { "dependencies": { "fetch-blob": "^3.1.2" } }, "sha512-buewHzMvYL29jdeQTVILecSaZKnt/RJWjoZCF5OW60Z67/GmSLBkOFM7qh1PI3zFNtJbaZL5eQu1vLfazOwj4g=="], + "fsevents": ["fsevents@2.3.3", "", { "os": "darwin" }, "sha512-5xoDfX+fL7faATnagmWPpbFtwh/R77WmMMqqHGS65C3vvB0YHrgF+B1YmZ3441tMj5n63k0212XNoJwzlhffQw=="], "function-bind": ["function-bind@1.1.2", "", {}, "sha512-7XHNxH7qX9xG5mIwxkhumTox/MIRNcOgDrxWsMt2pAr23WHp6MrRlN7FBSFpCpr+oVO0F744iUgR82nJMfG2SA=="], + "gaxios": ["gaxios@7.1.4", "", { "dependencies": { "extend": "^3.0.2", 
"https-proxy-agent": "^7.0.1", "node-fetch": "^3.3.2" } }, "sha512-bTIgTsM2bWn3XklZISBTQX7ZSddGW+IO3bMdGaemHZ3tbqExMENHLx6kKZ/KlejgrMtj8q7wBItt51yegqalrA=="], + + "gcp-metadata": ["gcp-metadata@8.1.2", "", { "dependencies": { "gaxios": "^7.0.0", "google-logging-utils": "^1.0.0", "json-bigint": "^1.0.0" } }, "sha512-zV/5HKTfCeKWnxG0Dmrw51hEWFGfcF2xiXqcA3+J90WDuP0SvoiSO5ORvcBsifmx/FoIjgQN3oNOGaQ5PhLFkg=="], + "gensync": ["gensync@1.0.0-beta.2", "", {}, "sha512-3hN7NaskYvMDLQY55gnW3NQ+mesEAepTqlg+VEbj7zzqEMBVNhzcGYYeqFo/TlYz6eQiFcp1HcsCZO+nGgS8zg=="], "get-east-asian-width": ["get-east-asian-width@1.4.0", "", {}, "sha512-QZjmEOC+IT1uk6Rx0sX22V6uHWVwbdbxf1faPqJ1QhLdGgsRGCZoyaQBm/piRdJy/D2um6hM1UP7ZEeQ4EkP+Q=="], @@ -1025,12 +1246,18 @@ "get-tsconfig": ["get-tsconfig@4.13.0", "", { "dependencies": { "resolve-pkg-maps": "^1.0.0" } }, "sha512-1VKTZJCwBrvbd+Wn3AOgQP/2Av+TfTCOlE4AcRJE72W1ksZXbAx8PPBR9RzgTeSPzlPMHrbANMH3LbltH73wxQ=="], + "get-uri": ["get-uri@6.0.5", "", { "dependencies": { "basic-ftp": "^5.0.2", "data-uri-to-buffer": "^6.0.2", "debug": "^4.3.4" } }, "sha512-b1O07XYq8eRuVzBNgJLstU6FYc1tS6wnMtF1I1D9lE8LxZSOGZ7LhxN54yPP6mGw5f2CkXY2BQUL9Fx41qvcIg=="], + "github-slugger": ["github-slugger@2.0.0", "", {}, "sha512-IaOQ9puYtjrkq7Y0Ygl9KDZnrf/aiUJYUpVf89y8kyaxbRG7Y1SrX/jaumrv81vc61+kiMempujsM3Yw7w5qcw=="], "glob": ["glob@13.0.0", "", { "dependencies": { "minimatch": "^10.1.1", "minipass": "^7.1.2", "path-scurry": "^2.0.0" } }, "sha512-tvZgpqk6fz4BaNZ66ZsRaZnbHvP/jG3uKJvAZOwEVUL4RTA5nJeeLYfyN9/VA8NX/V3IBG+hkeuGpKjvELkVhA=="], "glob-parent": ["glob-parent@5.1.2", "", { "dependencies": { "is-glob": "^4.0.1" } }, "sha512-AOIgSQCepiJYwP3ARnGx+5VnTu2HBYdzbGP45eLw1vr3zB3vZLeyed1sC9hnbcOc9/SrMyM5RPQrkGz4aS9Zow=="], + "google-auth-library": ["google-auth-library@10.6.2", "", { "dependencies": { "base64-js": "^1.3.0", "ecdsa-sig-formatter": "^1.0.11", "gaxios": "^7.1.4", "gcp-metadata": "8.1.2", "google-logging-utils": "1.1.3", "jws": "^4.0.0" } }, 
"sha512-e27Z6EThmVNNvtYASwQxose/G57rkRuaRbQyxM2bvYLLX/GqWZ5chWq2EBoUchJbCc57eC9ArzO5wMsEmWftCw=="], + + "google-logging-utils": ["google-logging-utils@1.1.3", "", {}, "sha512-eAmLkjDjAFCVXg7A1unxHsLf961m6y17QFqXqAXGj/gVkKFrEICfStRfwUlGNfeCEjNRa32JEWOUTlYXPyyKvA=="], + "gopd": ["gopd@1.2.0", "", {}, "sha512-ZUKRh6/kUFoAiTAtTYPZJ3hw9wNxx+BIBOijnlG9PnrJsCcSjs1wyyD6vJpaYtgnzDrKYRSqf3OO6Rfa93xsRg=="], "graceful-fs": ["graceful-fs@4.2.11", "", {}, "sha512-RbJ5/jmFcNNCcDV5o9eTnBLJ/HszWV0P73bc+Ff4nS/rJj+YaS6IGyiOL0VoBYX+l1Wrl3k63h/KrH+nhJ0XvQ=="], @@ -1093,6 +1320,10 @@ "http-cache-semantics": ["http-cache-semantics@4.2.0", "", {}, "sha512-dTxcvPXqPvXBQpq5dUr6mEMJX4oIEFv6bwom3FDwKRDsuIjjJGANqhBuoAn9c1RQJIdAKav33ED65E2ys+87QQ=="], + "http-proxy-agent": ["http-proxy-agent@7.0.2", "", { "dependencies": { "agent-base": "^7.1.0", "debug": "^4.3.4" } }, "sha512-T1gkAiYYDWYx3V5Bmyu7HcfcvL7mUrTWiM6yOfa3PIphViJ/gFPbvidQ+veqSOHci/PxBcDabeUNCzpOODJZig=="], + + "https-proxy-agent": ["https-proxy-agent@7.0.6", "", { "dependencies": { "agent-base": "^7.1.2", "debug": "4" } }, "sha512-vK9P5/iUfdl95AI+JVyUuIcVtd4ofvtrOr3HNtM2yxC9bnMbEdp3x01OhQNnjb8IJYi38VlTE3mBXwcfvywuSw=="], + "human-signals": ["human-signals@8.0.1", "", {}, "sha512-eKCa6bwnJhvxj14kZk5NCPc6Hb6BdsU9DZcOnmQKSnO1VKrfV0zCvtttPZUsBvjmNDn8rpcJfpwSYnHBjc95MQ=="], "i18next": ["i18next@23.16.8", "", { "dependencies": { "@babel/runtime": "^7.23.2" } }, "sha512-06r/TitrM88Mg5FdUXAKL96dJMzgqLE5dv3ryBAra4KCwD9mJ4ndOTS95ZuymIGoE+2hzfdaMak2X11/es7ZWg=="], @@ -1107,6 +1338,8 @@ "internmap": ["internmap@2.0.3", "", {}, "sha512-5Hh7Y1wQbvY5ooGgPbDaL5iYLAPzMTUrjMulskHLH6wnv/A+1q5rgEaiuqEjB+oxGXIVZs1FF+R/KPN3ZSQYYg=="], + "ip-address": ["ip-address@10.2.0", "", {}, "sha512-/+S6j4E9AHvW9SWMSEY9Xfy66O5PWvVEJ08O0y5JGyEKQpojb0K0GKpz/v5HJ/G0vi3D2sjGK78119oXZeE0qA=="], + "iron-webcrypto": ["iron-webcrypto@1.2.1", "", {}, "sha512-feOM6FaSr6rEABp/eDfVseKyTMDt+KGpeB35SkVn9Tyn0CqvVsY3EwI0v5i8nMHyJnzCIQf7nsy3p41TPkJZhg=="], "is-alphabetical": 
["is-alphabetical@2.0.1", "", {}, "sha512-FWyyY60MeTNyeSRpkM2Iry0G9hpr7/9kD40mD/cGQEuilcZYS4okz8SN2Q6rLCJ8gbCt6fN+rC+6tMGS99LaxQ=="], @@ -1153,10 +1386,20 @@ "jsesc": ["jsesc@3.1.0", "", { "bin": { "jsesc": "bin/jsesc" } }, "sha512-/sM3dO2FOzXjKQhJuo0Q173wf2KOo8t4I8vHy6lF9poUp7bKT0/NHE8fPX23PwfhnykfqnC2xRxOnVw5XuGIaA=="], + "json-bigint": ["json-bigint@1.0.0", "", { "dependencies": { "bignumber.js": "^9.0.0" } }, "sha512-SiPv/8VpZuWbvLSMtTDU8hEfrZWg/mH/nV/b4o0CYbSxu1UIQPLdwKOCIyLQX+VIPO5vrLX3i8qtqFyhdPSUSQ=="], + "json-schema": ["json-schema@0.4.0", "", {}, "sha512-es94M3nTIfsEPisRafak+HDLfHXnKBhV3vU5eqPcS3flIWqcxJWgXHXiey3YrpaNsanY5ei1VoYEbOzijuq9BA=="], + "json-schema-to-ts": ["json-schema-to-ts@3.1.1", "", { "dependencies": { "@babel/runtime": "^7.18.3", "ts-algebra": "^2.0.0" } }, "sha512-+DWg8jCJG2TEnpy7kOm/7/AxaYoaRbjVB4LFZLySZlWn8exGs3A4OLJR966cVvU26N7X9TWxl+Jsw7dzAqKT6g=="], + + "json-schema-traverse": ["json-schema-traverse@1.0.0", "", {}, "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="], + "json5": ["json5@2.2.3", "", { "bin": { "json5": "lib/cli.js" } }, "sha512-XmOWe7eyHYH14cLdVPoyg+GOH3rYX++KpzrylJwSW98t3Nk+U8XOl8FWKOgwtzdb8lXGf6zYwDUzeHMWfxasyg=="], + "jwa": ["jwa@2.0.1", "", { "dependencies": { "buffer-equal-constant-time": "^1.0.1", "ecdsa-sig-formatter": "1.0.11", "safe-buffer": "^5.0.1" } }, "sha512-hRF04fqJIP8Abbkq5NKGN0Bbr3JxlQ+qhZufXVr0DvujKy93ZCbXZMHDL4EOtodSbCWxOqR8MS1tXA5hwqCXDg=="], + + "jws": ["jws@4.0.1", "", { "dependencies": { "jwa": "^2.0.1", "safe-buffer": "^5.0.1" } }, "sha512-EKI/M/yqPncGUUh44xz0PxSidXFr/+r0pA70+gIYhjv+et7yxM+s29Y+VGDkovRofQem0fs7Uvf4+YmAdyRduA=="], + "kleur": ["kleur@3.0.3", "", {}, "sha512-eTIzlVOSUR+JxdDFepEYcBMtZ9Qqdef+rnzWdRZuMbOywu5tO2w2N7rqjoANZ5k9vywhL6Br1VRjUIgTQx4E8w=="], "klona": ["klona@2.0.6", "", {}, "sha512-dhG34DXATL5hSxJbIexCft8FChFXtmskoZYnoPWjXQuebWYCNkVeV3KkGegCK9CP1oswI/vQibS2GY7Em/sJJA=="], @@ -1197,7 +1440,7 @@ "longest-streak": 
["longest-streak@3.1.0", "", {}, "sha512-9Ri+o0JYgehTaVBBDoMqIl8GXtbWg711O3srftcHhZ0dqnETqLaoIK0x17fUw9rFSlK/0NlsKe0Ahhyl5pXE2g=="], - "lru-cache": ["lru-cache@11.2.5", "", {}, "sha512-vFrFJkWtJvJnD5hg+hJvVE8Lh/TcMzKnTgCWmtBipwI5yLX/iX+5UB2tfuyODF5E7k9xEzMdYgGqaSb1c0c5Yw=="], + "lru-cache": ["lru-cache@7.18.3", "", {}, "sha512-jumlc0BIUrS3qJGgIkWZsyfAM7NCWiBcCDhnd+3NNM5KbBmLTgHVfWBcg6W+rLUsIpzpERPsvwUP7CckAQSOoA=="], "magic-string": ["magic-string@0.30.21", "", { "dependencies": { "@jridgewell/sourcemap-codec": "^1.5.5" } }, "sha512-vd2F4YUyEXKGcLHoq+TEyCjxueSeHnFxyyjNp80yg0XV4vUhnDer/lvvlqM/arB5bXQN5K2/3oinyCRyx8T2CQ=="], @@ -1347,8 +1590,14 @@ "neotraverse": ["neotraverse@0.6.18", "", {}, "sha512-Z4SmBUweYa09+o6pG+eASabEpP6QkQ70yHj351pQoEXIs8uHbaU2DWVmzBANKgflPa47A50PtB2+NgRpQvr7vA=="], + "netmask": ["netmask@2.1.1", "", {}, "sha512-eonl3sLUha+S1GzTPxychyhnUzKyeQkZ7jLjKrBagJgPla13F+uQ71HgpFefyHgqrjEbCPkDArxYsjY8/+gLKA=="], + "nlcst-to-string": ["nlcst-to-string@4.0.0", "", { "dependencies": { "@types/nlcst": "^2.0.0" } }, "sha512-YKLBCcUYKAg0FNlOBT6aI91qFmSiFKiluk655WzPF+DDMA02qIyy8uiRqI8QXtcFpEvll12LpL5MXqEmAZ+dcA=="], + "node-domexception": ["node-domexception@1.0.0", "", {}, "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ=="], + + "node-fetch": ["node-fetch@3.3.2", "", { "dependencies": { "data-uri-to-buffer": "^4.0.0", "fetch-blob": "^3.1.4", "formdata-polyfill": "^4.0.10" } }, "sha512-dRB78srN/l6gqWulah9SrxeYnxeddIG30+GOqK/9OlLVyLg3HPnr6SqOWTWOXKRwC2eGYCkZ59NNuSgvSrpgOA=="], + "node-fetch-native": ["node-fetch-native@1.6.7", "", {}, "sha512-g9yhqoedzIUm0nTnTqAQvueMPVOuIY16bqgAJJC8XOOubYFNwz6IER9qs0Gq2Xd0+CecCKFjtdDTMA4u4xG06Q=="], "node-mock-http": ["node-mock-http@1.0.4", "", {}, "sha512-8DY+kFsDkNXy1sJglUfuODx1/opAGJGyrTuFqEoN90oRc2Vk0ZbD4K2qmKXBBEhZQzdKHIVfEJpDU8Ak2NJEvQ=="], @@ -1371,12 +1620,20 @@ "oniguruma-to-es": ["oniguruma-to-es@4.3.4", "", { "dependencies": { "oniguruma-parser": "^0.12.1", "regex": 
"^6.0.1", "regex-recursion": "^6.0.2" } }, "sha512-3VhUGN3w2eYxnTzHn+ikMI+fp/96KoRSVK9/kMTcFqj1NRDh2IhQCKvYxDnWePKRXY/AqH+Fuiyb7VHSzBjHfA=="], + "openai": ["openai@6.26.0", "", { "peerDependencies": { "ws": "^8.18.0", "zod": "^3.25 || ^4.0" }, "optionalPeers": ["ws", "zod"], "bin": { "openai": "bin/cli" } }, "sha512-zd23dbWTjiJ6sSAX6s0HrCZi41JwTA1bQVs0wLQPZ2/5o2gxOJA5wh7yOAUgwYybfhDXyhwlpeQf7Mlgx8EOCA=="], + "p-limit": ["p-limit@6.2.0", "", { "dependencies": { "yocto-queue": "^1.1.1" } }, "sha512-kuUqqHNUqoIWp/c467RI4X6mmyuojY5jGutNU0wVTmEOOfcuwLqyMVoAi9MKi2Ak+5i9+nhmrK4ufZE8069kHA=="], "p-queue": ["p-queue@8.1.1", "", { "dependencies": { "eventemitter3": "^5.0.1", "p-timeout": "^6.1.2" } }, "sha512-aNZ+VfjobsWryoiPnEApGGmf5WmNsCo9xu8dfaYamG5qaLP7ClhLN6NgsFe6SwJ2UbLEBK5dv9x8Mn5+RVhMWQ=="], + "p-retry": ["p-retry@4.6.2", "", { "dependencies": { "@types/retry": "0.12.0", "retry": "^0.13.1" } }, "sha512-312Id396EbJdvRONlngUx0NydfrIQ5lsYu0znKVUzVvArzEIt08V1qhtyESbGVd1FGX7UKtiFp5uwKZdM8wIuQ=="], + "p-timeout": ["p-timeout@6.1.4", "", {}, "sha512-MyIV3ZA/PmyBN/ud8vV9XzwTrNtR4jFrObymZYnZqMmW0zA8Z17vnT0rBgFE/TlohB+YCHqXMgZzb3Csp49vqg=="], + "pac-proxy-agent": ["pac-proxy-agent@7.2.0", "", { "dependencies": { "@tootallnate/quickjs-emscripten": "^0.23.0", "agent-base": "^7.1.2", "debug": "^4.3.4", "get-uri": "^6.0.1", "http-proxy-agent": "^7.0.0", "https-proxy-agent": "^7.0.6", "pac-resolver": "^7.0.1", "socks-proxy-agent": "^8.0.5" } }, "sha512-TEB8ESquiLMc0lV8vcd5Ql/JAKAoyzHFXaStwjkzpOpC5Yv+pIzLfHvjTSdf3vpa2bMiUQrg9i6276yn8666aA=="], + + "pac-resolver": ["pac-resolver@7.0.1", "", { "dependencies": { "degenerator": "^5.0.0", "netmask": "^2.0.2" } }, "sha512-5NPgf87AT2STgwa2ntRMr45jTKrYBGkVU36yT0ig/n/GMAa3oPqhZfIQ2kMEimReg0+t9kZViDVZ83qfVUlckg=="], + "package-json-from-dist": ["package-json-from-dist@1.0.1", "", {}, "sha512-UEZIS3/by4OC8vL3P2dTXRETpebLI2NiI5vIrjaD/5UtrkFX/tNbwjTSRAGC/+7CAo2pIcBaRgWmcBBHcsaCIw=="], "package-manager-detector": 
["package-manager-detector@1.6.0", "", {}, "sha512-61A5ThoTiDG/C8s8UMZwSorAGwMJ0ERVGj2OjoW5pAalsNOg15+iQiPzrLJ4jhZ1HJzmC2PIHT2oEiH3R5fzNA=="], @@ -1391,6 +1648,10 @@ "parse5": ["parse5@7.3.0", "", { "dependencies": { "entities": "^6.0.0" } }, "sha512-IInvU7fabl34qmi9gY8XOVxhYyMyuH2xUNpb2q8/Y+7552KlejkRvqvD19nMoUW/uQGGbqNpA6Tufu5FL5BZgw=="], + "partial-json": ["partial-json@0.1.7", "", {}, "sha512-Njv/59hHaokb/hRUjce3Hdv12wd60MtM9Z5Olmn+nehe0QDAsRtRbJPvJ0Z91TusF0SuZRIvnM+S4l6EIP8leA=="], + + "path-expression-matcher": ["path-expression-matcher@1.5.0", "", {}, "sha512-cbrerZV+6rvdQrrD+iGMcZFEiiSrbv9Tfdkvnusy6y0x0GKBXREFg/Y65GhIfm0tnLntThhzCnfKwp1WRjeCyQ=="], + "path-key": ["path-key@3.1.1", "", {}, "sha512-ojmeN0qd+y0jszEtoY48r0Peq5dwMEkIlCOu6Q5f41lfkswXuKtYrhgoTpLnyIcHm24Uhqx+5Tqm2InSwLhE6Q=="], "path-scurry": ["path-scurry@2.0.1", "", { "dependencies": { "lru-cache": "^11.0.0", "minipass": "^7.1.2" } }, "sha512-oWyT4gICAu+kaA7QWk/jvCHWarMKNs6pXOGWKDTr7cw4IGcUbW+PeTfbaQiLGheFRpjo6O9J0PmyMfQPjH71oA=="], @@ -1425,6 +1686,8 @@ "protobufjs": ["protobufjs@8.0.0", "", { "dependencies": { "@protobufjs/aspromise": "^1.1.2", "@protobufjs/base64": "^1.1.2", "@protobufjs/codegen": "^2.0.4", "@protobufjs/eventemitter": "^1.1.0", "@protobufjs/fetch": "^1.1.0", "@protobufjs/float": "^1.0.2", "@protobufjs/inquire": "^1.1.0", "@protobufjs/path": "^1.1.2", "@protobufjs/pool": "^1.1.0", "@protobufjs/utf8": "^1.1.0", "@types/node": ">=13.7.0", "long": "^5.0.0" } }, "sha512-jx6+sE9h/UryaCZhsJWbJtTEy47yXoGNYI4z8ZaRncM0zBKeRqjO2JEcOUYwrYGb1WLhXM1FfMzW3annvFv0rw=="], + "proxy-agent": ["proxy-agent@6.5.0", "", { "dependencies": { "agent-base": "^7.1.2", "debug": "^4.3.4", "http-proxy-agent": "^7.0.1", "https-proxy-agent": "^7.0.6", "lru-cache": "^7.14.1", "pac-proxy-agent": "^7.1.0", "proxy-from-env": "^1.1.0", "socks-proxy-agent": "^8.0.5" } }, "sha512-TmatMXdr2KlRiA2CyDu8GqR8EjahTG3aY3nXjdzFyoZbmB8hrBsTyMezhULIXKnC0jpfjlmiZ3+EaCzoInSu/A=="], + "proxy-from-env": ["proxy-from-env@1.1.0", 
"", {}, "sha512-D+zkORCbA9f1tdWRK0RaCR3GPv50cMxcrz4X8k5LTSUD1Dkw47mKJEZQNunItRTkWwgtaUSo1RVFRIG9ZXiFYg=="], "punycode": ["punycode@2.3.1", "", {}, "sha512-vYt7UD1U9Wg6138shLtLOvdAu+8DsC/ilFtEVHcH+wydcSpNE20AfSOduf6MkRFahL5FY7X1oU7nKVZFtfq8Fg=="], @@ -1495,6 +1758,8 @@ "remark-stringify": ["remark-stringify@11.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-to-markdown": "^2.0.0", "unified": "^11.0.0" } }, "sha512-1OSmLd3awB/t8qdoEOMazZkNsfVTeY4fTsgzcQFdXNq8ToTN4ZGwrMnlda4K6smTFKD+GRV6O48i6Z4iKgPPpw=="], + "require-from-string": ["require-from-string@2.0.2", "", {}, "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw=="], + "reselect": ["reselect@5.1.1", "", {}, "sha512-K/BG6eIky/SBpzfHZv/dd+9JBFiS4SWV7FIujVyJRux6e45+73RaUHXLmIR1f7WOMaQ0U1km6qwklRQxpJJY0w=="], "resolve-from": ["resolve-from@5.0.0", "", {}, "sha512-qYg9KP24dD5qka9J47d0aVky0N+b4fTU89LN9iDnjB5waksiC49rvMB0PrUJQGoTmH50XPiqOvAjDfaijGxYZw=="], @@ -1509,6 +1774,8 @@ "retext-stringify": ["retext-stringify@4.0.0", "", { "dependencies": { "@types/nlcst": "^2.0.0", "nlcst-to-string": "^4.0.0", "unified": "^11.0.0" } }, "sha512-rtfN/0o8kL1e+78+uxPTqu1Klt0yPzKuQ2BfWwwfgIUSayyzxpM1PJzkKt4V8803uB9qSy32MvI7Xep9khTpiA=="], + "retry": ["retry@0.13.1", "", {}, "sha512-XQBQ3I8W1Cge0Seh+6gjj03LbmRFWuoszgK9ooCpwYIrhhoO80pfq4cUkU5DkknwfOfFteRwlZ56PYOGYyFWdg=="], + "reusify": ["reusify@1.1.0", "", {}, "sha512-g6QUff04oZpHs0eG5p83rFLhHeV00ug/Yf9nZM6fLeUrPguBTkTQOdpAWWspMh55TZfVQDPaN3NQJfbVRAxdIw=="], "rimraf": ["rimraf@6.1.2", "", { "dependencies": { "glob": "^13.0.0", "package-json-from-dist": "^1.0.1" }, "bin": { "rimraf": "dist/esm/bin.mjs" } }, "sha512-cFCkPslJv7BAXJsYlK1dZsbP8/ZNLkCAQ0bi1hf5EKX2QHegmDFEFA6QhuYJlk7UDdc+02JjO80YSOrWPpw06g=="], @@ -1517,6 +1784,8 @@ "run-parallel": ["run-parallel@1.2.0", "", { "dependencies": { "queue-microtask": "^1.2.2" } }, "sha512-5l4VyZR86LZ/lDxZTR6jqL8AFE2S0IFLMP26AbjsLVADxHdhB/c0GUsH+y39UfCi3dzz8OlQuPmnaJOMoDHQBA=="], + 
"safe-buffer": ["safe-buffer@5.2.1", "", {}, "sha512-rp3So07KcdmmKbGvgaNxQSJr7bGVSVk5S9Eq1F+ppbRo70+YeaDxkw5Dd8NPN+GD6bjnYm2VuPuCXmpuYvmCXQ=="], + "safer-buffer": ["safer-buffer@2.1.2", "", {}, "sha512-YZo3K82SD7Riyi0E1EQPojLz7kpepnSQI9IyPbHHg1XXXevb5dJI7tpyN2ADxGcQbHG7vcyRHk0cbwqcQriUtg=="], "sax": ["sax@1.4.4", "", {}, "sha512-1n3r/tGXO6b6VXMdFT54SHzT9ytu9yr7TaELowdYpMqY/Ao7EnlQGmAQ1+RatX7Tkkdm6hONI2owqNx2aZj5Sw=="], @@ -1543,8 +1812,14 @@ "sitemap": ["sitemap@8.0.2", "", { "dependencies": { "@types/node": "^17.0.5", "@types/sax": "^1.2.1", "arg": "^5.0.0", "sax": "^1.4.1" }, "bin": { "sitemap": "dist/cli.js" } }, "sha512-LwktpJcyZDoa0IL6KT++lQ53pbSrx2c9ge41/SeLTyqy2XUNA6uR4+P9u5IVo5lPeL2arAcOKn1aZAxoYbCKlQ=="], + "smart-buffer": ["smart-buffer@4.2.0", "", {}, "sha512-94hK0Hh8rPqQl2xXc3HsaBoOXKV20MToPkcXvwbISWLEs+64sBq5kFgn2kJDHb1Pry9yrP0dxrCI9RRci7RXKg=="], + "smol-toml": ["smol-toml@1.6.0", "", {}, "sha512-4zemZi0HvTnYwLfrpk/CF9LOd9Lt87kAt50GnqhMpyF9U3poDAP2+iukq2bZsO/ufegbYehBkqINbsWxj4l4cw=="], + "socks": ["socks@2.8.8", "", { "dependencies": { "ip-address": "^10.1.1", "smart-buffer": "^4.2.0" } }, "sha512-NlGELfPrgX2f1TAAcz0WawlLn+0r3FyhhCRpFFK2CemXenPYvzMWWZINv3eDNo9ucdwme7oCHRY0Jnbs4aIkog=="], + + "socks-proxy-agent": ["socks-proxy-agent@8.0.5", "", { "dependencies": { "agent-base": "^7.1.2", "debug": "^4.3.4", "socks": "^2.8.3" } }, "sha512-HehCEsotFqbPW9sJ8WVYB6UbmIMv7kUUORIF2Nncq4VQvBfNBLibW9YZR5dlYCSUhwcD628pRllm7n+E+YTzJw=="], + "source-map": ["source-map@0.8.0-beta.0", "", { "dependencies": { "whatwg-url": "^7.0.0" } }, "sha512-2ymg6oRBpebeZi9UUNsgQ89bhx01TcTkmNTGnNO88imTmbSgy4nfujrgVEFKWpMTEGA11EDkTt7mqObTPdigIA=="], "source-map-js": ["source-map-js@1.2.1", "", {}, "sha512-UXWMKhLOwVKb728IUtQPXxfYU+usdybtUrK/8uGE8CQMvrhOpwvzDBwj0QhSL7MQc7vIsISBG8VQ8+IDQxpfQA=="], @@ -1563,6 +1838,8 @@ "strip-final-newline": ["strip-final-newline@4.0.0", "", {}, "sha512-aulFJcD6YK8V1G7iRB5tigAP4TsHBZZrOV8pjV++zdUwmeV8uzbY7yn6h9MswN62adStNZFuCIx4haBnRuMDaw=="], + 
"strnum": ["strnum@2.2.3", "", {}, "sha512-oKx6RUCuHfT3oyVjtnrmn19H1SiCqgJSg+54XqURKp5aCMbrXrhLjRN9TjuwMjiYstZ0MzDrHqkGZ5dFTKd+zg=="], + "style-to-js": ["style-to-js@1.1.21", "", { "dependencies": { "style-to-object": "1.0.14" } }, "sha512-RjQetxJrrUJLQPHbLku6U/ocGtzyjbJMP9lCNK7Ag0CNh690nSH8woqWH9u16nMjYBAok+i7JO1NP2pOy8IsPQ=="], "style-to-object": ["style-to-object@1.0.14", "", { "dependencies": { "inline-style-parser": "0.2.7" } }, "sha512-LIN7rULI0jBscWQYaSswptyderlarFkjQ+t79nzty8tcIAceVomEVlLzH5VP4Cmsv6MtKhs7qaAiwlcp+Mgaxw=="], @@ -1597,6 +1874,8 @@ "trough": ["trough@2.2.0", "", {}, "sha512-tmMpK00BjZiUyVyvrBK7knerNgmgvcV/KLVyuma/SC+TQN167GrMRciANTz09+k3zW8L8t60jWO1GpfkZdjTaw=="], + "ts-algebra": ["ts-algebra@2.0.0", "", {}, "sha512-FPAhNPFMrkwz76P7cdjdmiShwMynZYN6SgOujD1urY4oNm80Ou9oMdmbR45LotcKOXoy7wSmHkRFE6Mxbrhefw=="], + "ts-interface-checker": ["ts-interface-checker@0.1.13", "", {}, "sha512-Y/arvbn+rrz3JCKl9C4kVNfTfSm2/mEp5FSz5EsZSANGPSlQrpRI5M4PKF+mJnE52jOO90PnPSc3Ur3bTQw0gA=="], "tsconfck": ["tsconfck@3.1.6", "", { "peerDependencies": { "typescript": "^5.0.0" }, "optionalPeers": ["typescript"], "bin": { "tsconfck": "bin/tsconfck.js" } }, "sha512-ks6Vjr/jEw0P1gmOVwutM3B7fWxoWBL2KRDb1JfqGVawBmO5UsvmWOQFGHBPl5yxYz4eERr19E6L7NMv+Fej4w=="], @@ -1619,6 +1898,8 @@ "uncrypto": ["uncrypto@0.1.3", "", {}, "sha512-Ql87qFHB3s/De2ClA9e0gsnS6zXG27SkTiSJwjCc9MebbfapQfuPzumMIUMi38ezPZVNFcHI9sUIepeQfw8J8Q=="], + "undici": ["undici@7.25.0", "", {}, "sha512-xXnp4kTyor2Zq+J1FfPI6Eq3ew5h6Vl0F/8d9XU5zZQf1tX9s2Su1/3PiMmUANFULpmksxkClamIZcaUqryHsQ=="], + "undici-types": ["undici-types@7.8.0", "", {}, "sha512-9UJ2xGDvQ43tYyVMpuHlsgApydB8ZKfVYTsLDhXkFL/6gfkp+U8xTGdh8pMJv1SpZna0zxG1DwsKZsreLbXBxw=="], "unicorn-magic": ["unicorn-magic@0.3.0", "", {}, "sha512-+QBBXBCvifc56fsbuxZQ6Sic3wqqc3WWaqxs58gvJrcOuN83HGTCwz3oS5phzU9LthRNE9VrJCFCLUgHeeFnfA=="], @@ -1675,6 +1956,8 @@ "web-namespaces": ["web-namespaces@2.0.1", "", {}, 
"sha512-bKr1DkiNa2krS7qxNtdrtHAmzuYGFQLiQ13TsorsdT6ULTkPLKuu5+GsFpDlg6JFjUTwX2DyhMPG2be8uPrqsQ=="], + "web-streams-polyfill": ["web-streams-polyfill@3.3.3", "", {}, "sha512-d2JWLCivmZYTSIoge9MsgFCZrt571BikcWGYkjC1khllbTeDlGqZ2D8vD8E/lJa8WGWbb7Plm8/XJYV7IJHZZw=="], + "webidl-conversions": ["webidl-conversions@4.0.2", "", {}, "sha512-YQ+BmxuTgd6UXZW3+ICGfyqRyHXVlD5GtQr5+qjiNW7bF0cqrzX500HVXPBOvgXb5YnzDd+h0zqyv61KUD7+Sg=="], "webpack-virtual-modules": ["webpack-virtual-modules@0.6.2", "", {}, "sha512-66/V2i5hQanC51vBQKPH4aI8NMAcBW59FVBs+rC7eGHupMyfn34q7rZIE+ETlJ+XTevqfUhVVBgSUNSW2flEUQ=="], @@ -1689,6 +1972,8 @@ "wrap-ansi": ["wrap-ansi@9.0.2", "", { "dependencies": { "ansi-styles": "^6.2.1", "string-width": "^7.0.0", "strip-ansi": "^7.1.0" } }, "sha512-42AtmgqjV+X1VpdOfyTGOYRi0/zsoLqtXQckTmqTeybT+BDIbM/Guxo7x3pE2vtpr1ok6xRqM9OpBe+Jyoqyww=="], + "ws": ["ws@8.20.0", "", { "peerDependencies": { "bufferutil": "^4.0.1", "utf-8-validate": ">=5.0.2" }, "optionalPeers": ["bufferutil", "utf-8-validate"] }, "sha512-sAt8BhgNbzCtgGbt2OxmpuryO63ZoDk/sqaB/znQm94T4fCEsy/yV+7CdC1kJhOU9lboAEU7R3kquuycDoibVA=="], + "xxhash-wasm": ["xxhash-wasm@1.1.0", "", {}, "sha512-147y/6YNh+tlp6nd/2pWq38i9h6mz/EuQ6njIrmW8D1BS5nCqs0P6DG+m6zTGnNz5I+uhZ0SHxBs9BsPrwcKDA=="], "yallist": ["yallist@3.1.1", "", {}, "sha512-a4UGQaWPH59mOXUYnAG2ewncQS4i4F43Tv3JoAM+s2VDAmS9NsK8GpDMLrCHPksFT7h3K6TOoUNn2pb7RoXx4g=="], @@ -1715,6 +2000,10 @@ "@astrojs/mdx/source-map": ["source-map@0.7.6", "", {}, "sha512-i5uvt8C3ikiWeNZSVZNWcfZPItFQOsYTUAOkcUPGd8DqDy1uOUikjt5dG+uRlwyvR108Fb9DOd4GvXfT0N2/uQ=="], + "@aws-crypto/sha256-browser/@smithy/util-utf8": ["@smithy/util-utf8@2.3.0", "", { "dependencies": { "@smithy/util-buffer-from": "^2.2.0", "tslib": "^2.6.2" } }, "sha512-R8Rdn8Hy72KKcebgLiv8jQcQkXoLMOGGv5uI1/k0l+snqkOzQ1R0ChUBCxWMlBsFMekWjq0wRudIweFs7sKT5A=="], + + "@aws-crypto/util/@smithy/util-utf8": ["@smithy/util-utf8@2.3.0", "", { "dependencies": { "@smithy/util-buffer-from": "^2.2.0", "tslib": "^2.6.2" } }, 
"sha512-R8Rdn8Hy72KKcebgLiv8jQcQkXoLMOGGv5uI1/k0l+snqkOzQ1R0ChUBCxWMlBsFMekWjq0wRudIweFs7sKT5A=="], + "@babel/core/@babel/types": ["@babel/types@7.29.0", "", { "dependencies": { "@babel/helper-string-parser": "^7.27.1", "@babel/helper-validator-identifier": "^7.28.5" } }, "sha512-LwdZHpScM4Qz8Xw2iKSzS+cfglZzJGvofQICy7W7v4caru4EaAmyUuO6BGrbyQ2mYV11W0U8j5mBhd14dd3B0A=="], "@babel/core/semver": ["semver@6.3.1", "", { "bin": { "semver": "bin/semver.js" } }, "sha512-BR7VvDCVHO+q2xBEWskxS6DJE1qRnb7DxzUrogb71CWoSficBxYsiAGd+Kl0mmq/MprG9yArRkyrQxTO6XjMzA=="], @@ -1737,8 +2026,12 @@ "@github/copilot-sdk/zod": ["zod@4.3.6", "", {}, "sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg=="], + "@google/genai/protobufjs": ["protobufjs@7.5.6", "", { "dependencies": { "@protobufjs/aspromise": "^1.1.2", "@protobufjs/base64": "^1.1.2", "@protobufjs/codegen": "^2.0.5", "@protobufjs/eventemitter": "^1.1.0", "@protobufjs/fetch": "^1.1.0", "@protobufjs/float": "^1.0.2", "@protobufjs/inquire": "^1.1.1", "@protobufjs/path": "^1.1.2", "@protobufjs/pool": "^1.1.0", "@protobufjs/utf8": "^1.1.1", "@types/node": ">=13.7.0", "long": "^5.0.0" } }, "sha512-M71sTMB146U3u0di3yup8iM+zv8yPRNQVr1KK4tyBitl3qFvEGucq/rGDRShD2rsJhtN02RJaJ7j5X5hmy8SJg=="], + "@mdx-js/mdx/source-map": ["source-map@0.7.6", "", {}, "sha512-i5uvt8C3ikiWeNZSVZNWcfZPItFQOsYTUAOkcUPGd8DqDy1uOUikjt5dG+uRlwyvR108Fb9DOd4GvXfT0N2/uQ=="], + "@mistralai/mistralai/zod": ["zod@4.3.6", "", {}, "sha512-rftlrkhHZOcjDwkGlnUtZZkvaPHCsDATp4pGpuOOMDaTdDDXF91wuVDJoWoPsKX/3YPQ5fHuF3STjcYyKr+Qhg=="], + "@reduxjs/toolkit/immer": ["immer@11.1.4", "", {}, "sha512-XREFCPo6ksxVzP4E0ekD5aMdf8WMwmdNaz6vuvxgI40UaEiu6q3p8X52aU6GdyvLY3XXX/8R7JOTXStz/nBbRw=="], "@rollup/pluginutils/estree-walker": ["estree-walker@2.0.2", "", {}, "sha512-Rfkk/Mp/DL7JVje3u18FxFujQlTNR2q6QfMSMB7AvCBx91NGj/ba3kCfza0f6dVDbw7YlRf/nDrn7pQrCCyQ/w=="], @@ -1785,18 +2078,26 @@ "csso/css-tree": ["css-tree@2.2.1", "", { "dependencies": { 
"mdn-data": "2.0.28", "source-map-js": "^1.0.1" } }, "sha512-OA0mILzGc1kCOCSJerOeqDxDQ4HOh+G8NbOJFOTgOCzpw7fCBubk0fEyxp8AgOL/jvLgYA/uV0cMbe43ElF1JA=="], + "degenerator/ast-types": ["ast-types@0.13.4", "", { "dependencies": { "tslib": "^2.0.1" } }, "sha512-x1FCFnFifvYDDzTaLII71vG5uvDwgtmDTEVWAxrgeiR8VjMONcCXJx7E+USjDtHlwFmt9MysbqgF9b9Vjr6w+w=="], + "dom-serializer/entities": ["entities@4.5.0", "", {}, "sha512-V0hjH4dGPh9Ao5p0MoRY6BVqtwCjhz6vI5LT8AJ55H+4g9/4vbHx1I54fS0XuclLhDHArPQCiMjDxjaL8fPxhw=="], + "escodegen/source-map": ["source-map@0.6.1", "", {}, "sha512-UjgapumWlbMhkBgzT7Ykc5YXUT46F0iKu8SGXq0bcwP5dz/h0Plj6enJqjz1Zbq2l5WaqYnrVbwWOWMyF3F47g=="], + "estree-util-to-js/source-map": ["source-map@0.7.6", "", {}, "sha512-i5uvt8C3ikiWeNZSVZNWcfZPItFQOsYTUAOkcUPGd8DqDy1uOUikjt5dG+uRlwyvR108Fb9DOd4GvXfT0N2/uQ=="], "h3/cookie-es": ["cookie-es@1.2.2", "", {}, "sha512-+W7VmiVINB+ywl1HGXJXmrqkOhpKrIiVZV6tQuV54ZyQC7MMuBt81Vc336GMLoHBq5hV/F9eXgt5Mnx0Rha5Fg=="], "magicast/@babel/parser": ["@babel/parser@7.28.6", "", { "dependencies": { "@babel/types": "^7.28.6" }, "bin": "./bin/babel-parser.js" }, "sha512-TeR9zWR18BvbfPmGbLampPMW+uW1NZnJlRuuHso8i87QZNq2JRF9i6RgxRqtEq+wQGsS19NNTWr2duhnE49mfQ=="], + "node-fetch/data-uri-to-buffer": ["data-uri-to-buffer@4.0.1", "", {}, "sha512-0R9ikRb668HB7QDxT1vkpuUBtqc53YyAwMwGeUFKRojY/NWKvdZ+9UYtRfGmhqNbRkTSVpMbmyhXipFFv2cb/A=="], + "npm-run-path/path-key": ["path-key@4.0.0", "", {}, "sha512-haREypq7xkM7ErfgIyA0z+Bj4AGKlMSdlQE2jvJo6huWD1EdkKYV+G/T4nq0YEF2vgTT8kqMFKo1uHn950r4SQ=="], "parse-entities/@types/unist": ["@types/unist@2.0.11", "", {}, "sha512-CmBKiL6NNo/OqgmMn95Fk9Whlp2mtvIv+KNpQKN2F4SjvrEesubTRWGYSg+BnWZOnlCaSTU1sMpsBOzgbYhnsA=="], + "path-scurry/lru-cache": ["lru-cache@11.2.5", "", {}, "sha512-vFrFJkWtJvJnD5hg+hJvVE8Lh/TcMzKnTgCWmtBipwI5yLX/iX+5UB2tfuyODF5E7k9xEzMdYgGqaSb1c0c5Yw=="], + "recast/source-map": ["source-map@0.6.1", "", {}, 
"sha512-UjgapumWlbMhkBgzT7Ykc5YXUT46F0iKu8SGXq0bcwP5dz/h0Plj6enJqjz1Zbq2l5WaqYnrVbwWOWMyF3F47g=="], "sharp/semver": ["semver@7.7.3", "", { "bin": { "semver": "bin/semver.js" } }, "sha512-SdsKMrI9TdgjdweUSR9MweHA4EJ8YxHn8DFaDisvhVlUOe4BF1tLD7GAj0lIqWVl+dPb/rExr0Btby5loQm20Q=="], @@ -1813,10 +2114,22 @@ "unstorage/chokidar": ["chokidar@5.0.0", "", { "dependencies": { "readdirp": "^5.0.0" } }, "sha512-TQMmc3w+5AxjpL8iIiwebF73dRDF4fBIieAqGn9RGCWaEVwQ6Fb2cGe31Yns0RRIzii5goJ1Y7xbMwo1TxMplw=="], + "unstorage/lru-cache": ["lru-cache@11.2.5", "", {}, "sha512-vFrFJkWtJvJnD5hg+hJvVE8Lh/TcMzKnTgCWmtBipwI5yLX/iX+5UB2tfuyODF5E7k9xEzMdYgGqaSb1c0c5Yw=="], + "vite/esbuild": ["esbuild@0.25.12", "", { "optionalDependencies": { "@esbuild/aix-ppc64": "0.25.12", "@esbuild/android-arm": "0.25.12", "@esbuild/android-arm64": "0.25.12", "@esbuild/android-x64": "0.25.12", "@esbuild/darwin-arm64": "0.25.12", "@esbuild/darwin-x64": "0.25.12", "@esbuild/freebsd-arm64": "0.25.12", "@esbuild/freebsd-x64": "0.25.12", "@esbuild/linux-arm": "0.25.12", "@esbuild/linux-arm64": "0.25.12", "@esbuild/linux-ia32": "0.25.12", "@esbuild/linux-loong64": "0.25.12", "@esbuild/linux-mips64el": "0.25.12", "@esbuild/linux-ppc64": "0.25.12", "@esbuild/linux-riscv64": "0.25.12", "@esbuild/linux-s390x": "0.25.12", "@esbuild/linux-x64": "0.25.12", "@esbuild/netbsd-arm64": "0.25.12", "@esbuild/netbsd-x64": "0.25.12", "@esbuild/openbsd-arm64": "0.25.12", "@esbuild/openbsd-x64": "0.25.12", "@esbuild/openharmony-arm64": "0.25.12", "@esbuild/sunos-x64": "0.25.12", "@esbuild/win32-arm64": "0.25.12", "@esbuild/win32-ia32": "0.25.12", "@esbuild/win32-x64": "0.25.12" }, "bin": { "esbuild": "bin/esbuild" } }, "sha512-bbPBYYrtZbkt6Os6FiTLCTFxvq4tt3JKall1vRwshA3fdVztsLAatFaZobhkBC8/BrPetoa0oksYoKXoG4ryJg=="], "vite/picomatch": ["picomatch@4.0.3", "", {}, "sha512-5gTmgEY/sqK6gFXLIsQNH19lWb4ebPDLA4SdLP7dsWkIXHWlG66oPuVvXSGFPppYZz8ZDZq0dYYrbHfBCVUb1Q=="], + "@aws-crypto/sha256-browser/@smithy/util-utf8/@smithy/util-buffer-from": 
["@smithy/util-buffer-from@2.2.0", "", { "dependencies": { "@smithy/is-array-buffer": "^2.2.0", "tslib": "^2.6.2" } }, "sha512-IJdWBbTcMQ6DA0gdNhh/BwrLkDR+ADW5Kr1aZmd4k3DIF6ezMV4R2NIAmT08wQJ3yUK82thHWmC/TnK/wpMMIA=="], + + "@aws-crypto/util/@smithy/util-utf8/@smithy/util-buffer-from": ["@smithy/util-buffer-from@2.2.0", "", { "dependencies": { "@smithy/is-array-buffer": "^2.2.0", "tslib": "^2.6.2" } }, "sha512-IJdWBbTcMQ6DA0gdNhh/BwrLkDR+ADW5Kr1aZmd4k3DIF6ezMV4R2NIAmT08wQJ3yUK82thHWmC/TnK/wpMMIA=="], + + "@google/genai/protobufjs/@protobufjs/codegen": ["@protobufjs/codegen@2.0.5", "", {}, "sha512-zgXFLzW3Ap33e6d0Wlj4MGIm6Ce8O89n/apUaGNB/jx+hw+ruWEp7EwGUshdLKVRCxZW12fp9r40E1mQrf/34g=="], + + "@google/genai/protobufjs/@protobufjs/inquire": ["@protobufjs/inquire@1.1.1", "", {}, "sha512-mnzgDV26ueAvk7rsbt9L7bE0SuAoqyuys/sMMrmVcN5x9VsxpcG3rqAUSgDyLp0UZlmNfIbQ4fHfCtreVBk8Ew=="], + + "@google/genai/protobufjs/@protobufjs/utf8": ["@protobufjs/utf8@1.1.1", "", {}, "sha512-oOAWABowe8EAbMyWKM0tYDKi8Yaox52D+HWZhAIJqQXbqe0xI/GV7FhLWqlEKreMkfDjshR5FKgi3mnle0h6Eg=="], + "@tanstack/router-plugin/chokidar/readdirp": ["readdirp@3.6.0", "", { "dependencies": { "picomatch": "^2.2.1" } }, "sha512-hOS089on8RduqdbhvQ5Z37A0ESjsqz6qnRcffsMU3495FuTdqSm+7bhJ29JvIOsBDEEnan5DPu9t3To9VRlMzA=="], "ansi-align/string-width/emoji-regex": ["emoji-regex@8.0.0", "", {}, "sha512-MSjYzcWNOA0ewAHpz0MxpYFvwg6yjy1NG3xteoqz644VCo/RPgnr1/GGt+ic3iJTzQ8Eu3TdM14SawnVUmGE6A=="], @@ -1979,6 +2292,10 @@ "vite/esbuild/@esbuild/win32-x64": ["@esbuild/win32-x64@0.25.12", "", { "os": "win32", "cpu": "x64" }, "sha512-alJC0uCZpTFrSL0CCDjcgleBXPnCrEAhTBILpeAp7M/OFgoqtAetfBzX0xM00MUsVVPpVjlPuMbREqnZCXaTnA=="], + "@aws-crypto/sha256-browser/@smithy/util-utf8/@smithy/util-buffer-from/@smithy/is-array-buffer": ["@smithy/is-array-buffer@2.2.0", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-GGP3O9QFD24uGeAXYUjwSTXARoqpZykHadOmA8G5vfJPK0/DC67qa//0qvqrJzL1xc8WQWX7/yc7fwudjPHPhA=="], + + 
"@aws-crypto/util/@smithy/util-utf8/@smithy/util-buffer-from/@smithy/is-array-buffer": ["@smithy/is-array-buffer@2.2.0", "", { "dependencies": { "tslib": "^2.6.2" } }, "sha512-GGP3O9QFD24uGeAXYUjwSTXARoqpZykHadOmA8G5vfJPK0/DC67qa//0qvqrJzL1xc8WQWX7/yc7fwudjPHPhA=="], + "ansi-align/string-width/strip-ansi/ansi-regex": ["ansi-regex@5.0.1", "", {}, "sha512-quJQXlTSUGL2LH9SUXo8VwsY4soanhgo6LNSm84E1LBcE8s3O0wpdiRzyR9z/ZZJMlMWv37qOOb9pdJlMUEKFQ=="], } } diff --git a/packages/core/package.json b/packages/core/package.json index 287b6e7b..b8f23a80 100644 --- a/packages/core/package.json +++ b/packages/core/package.json @@ -47,6 +47,7 @@ "@ai-sdk/google": "^3.0.0", "@ai-sdk/openai": "^3.0.0", "@github/copilot-sdk": "^0.1.25", + "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", "@openrouter/ai-sdk-provider": "^2.3.1", "ai": "^6.0.0", diff --git a/packages/core/src/evaluation/generators/rubric-generator.ts b/packages/core/src/evaluation/generators/rubric-generator.ts index b1255eb5..fba83ce6 100644 --- a/packages/core/src/evaluation/generators/rubric-generator.ts +++ b/packages/core/src/evaluation/generators/rubric-generator.ts @@ -1,7 +1,7 @@ -import { generateText } from 'ai'; import { z } from 'zod'; import type { Provider } from '../providers/types.js'; +import { extractLastAssistantContent } from '../providers/types.js'; import type { RubricItem } from '../types.js'; const rubricItemSchema = z.object({ @@ -24,6 +24,10 @@ export interface GenerateRubricsOptions { /** * Generate rubrics from expected outcome using an LLM. + * + * Calls the provider through `Provider.invoke()` — the LLM call itself is + * a single non-streaming, non-tool-using completion. JSON output is parsed + * with up to 3 retries to absorb model formatting variance. 
*/ export async function generateRubrics( options: GenerateRubricsOptions, @@ -32,11 +36,6 @@ export async function generateRubrics( const prompt = buildPrompt(criteria, question, referenceAnswer); - const model = provider.asLanguageModel?.(); - if (!model) { - throw new Error('Provider does not support language model interface'); - } - const system = `You are an expert at creating evaluation rubrics. You must return a valid JSON object matching this schema: { @@ -55,12 +54,12 @@ You must return a valid JSON object matching this schema: for (let attempt = 1; attempt <= 3; attempt++) { try { - const { text } = await generateText({ - model, - system, - prompt, + const response = await provider.invoke({ + question: prompt, + systemPrompt: system, }); + const text = extractLastAssistantContent(response.output); const cleaned = text.replace(/```json\n?|```/g, '').trim(); result = rubricGenerationSchema.parse(JSON.parse(cleaned)); break; diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts index 97bf3128..044f4845 100644 --- a/packages/core/src/evaluation/providers/ai-sdk.ts +++ b/packages/core/src/evaluation/providers/ai-sdk.ts @@ -58,8 +58,12 @@ export class OpenAIProvider implements Provider { } async invoke(request: ProviderRequest): Promise { - return invokeModel({ - model: this.model, + return invokePiAi({ + providerName: 'openai', + apiId: this.config.apiFormat === 'responses' ? 
'openai-responses' : 'openai-completions', + modelId: this.config.model, + apiKey: this.config.apiKey, + baseUrl: this.config.baseURL, request, defaults: this.defaults, retryConfig: this.retryConfig, @@ -517,6 +521,227 @@ async function sleep(ms: number): Promise { return new Promise((resolve) => setTimeout(resolve, ms)); } +// --------------------------------------------------------------------------- +// pi-ai migration (issue #1205) +// --------------------------------------------------------------------------- +// +// invokePiAi runs a single non-streaming, non-tool-using completion through +// @mariozechner/pi-ai. It is the new code path; the existing invokeModel +// (Vercel AI SDK) above is still in use for the four providers we have not +// ported yet (Azure, OpenRouter, Anthropic, Gemini). +// +// Why dynamic import + `any` casts: pi-ai ships .d.ts files whose top-level +// named exports do not resolve through TypeScript's NodeNext/Bundler module +// resolution (same issue worked around in pi-coding-agent.ts:250). The runtime +// exports are correct; only the static type graph is broken. We mirror the +// pi-coding-agent.ts pattern: load the module dynamically once, cast through +// `any`, and rely on the local invokePiAi shape for type safety. +// +// To port a provider: +// 1. Map its config to the invokePiAi options below (api id, baseUrl, key). +// 2. Replace the provider's invoke() to call invokePiAi. +// 3. Drop the createX() / this.model build from the constructor when +// asLanguageModel() is no longer used by any consumer. + +// biome-ignore lint/suspicious/noExplicitAny: pi-ai type defs do not statically resolve named exports; mirrors the existing workaround in pi-coding-agent.ts. 
+let piAiSdk: any | null = null; +let piAiLoading: Promise | null = null; + +async function loadPiAi(): Promise { + if (piAiSdk) return; + if (!piAiLoading) { + piAiLoading = (async () => { + const mod = await import('@mariozechner/pi-ai'); + // biome-ignore lint/suspicious/noExplicitAny: see comment above + const m = mod as any; + m.registerBuiltInApiProviders?.(); + piAiSdk = m; + })().catch((err) => { + piAiLoading = null; + throw err; + }); + } + await piAiLoading; +} + +interface InvokePiAiOptions { + /** pi-ai provider name (matches `KnownProvider`, e.g. 'openai'). */ + readonly providerName: string; + /** pi-ai api id, picks which provider impl runs the call. */ + readonly apiId: string; + /** Model id from user config (may or may not exist in pi-ai's registry). */ + readonly modelId: string; + readonly apiKey: string; + /** Optional baseUrl override; falls back to the registry's default. */ + readonly baseUrl?: string; + readonly request: ProviderRequest; + readonly defaults: ProviderDefaults; + readonly retryConfig?: RetryConfig; +} + +async function invokePiAi(options: InvokePiAiOptions): Promise { + const { providerName, apiId, modelId, apiKey, baseUrl, request, defaults, retryConfig } = options; + + await loadPiAi(); + const sdk = piAiSdk; + + const model = resolvePiModel(sdk, { providerName, apiId, modelId, baseUrl }); + const { systemPrompt, messages } = chatPromptToPiContext(buildChatPrompt(request)); + const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); + + const startTime = new Date().toISOString(); + const startMs = Date.now(); + + const result = await withRetry( + () => + sdk.complete( + model, + { systemPrompt, messages }, + { + apiKey, + temperature, + ...(maxOutputTokens !== undefined ? 
{ maxTokens: maxOutputTokens } : {}), + signal: request.signal, + }, + ), + retryConfig, + request.signal, + ); + + const endTime = new Date().toISOString(); + const durationMs = Date.now() - startMs; + + return mapPiResponse(result as PiAssistantMessage, { durationMs, startTime, endTime }); +} + +function resolvePiModel( + // biome-ignore lint/suspicious/noExplicitAny: pi-ai SDK module — see top-of-section comment + sdk: any, + args: { + providerName: string; + apiId: string; + modelId: string; + baseUrl?: string; + }, + // biome-ignore lint/suspicious/noExplicitAny: pi-ai Model shape +): any { + const { providerName, apiId, modelId, baseUrl } = args; + + // pi-ai's getModel returns a strongly-typed Model when the (provider, + // modelId) pair is in its generated registry. For runtime-string configs or + // unknown model ids we construct a minimal descriptor with the same shape + // the providers consume — every field below is required. + let model: { api: string; baseUrl: string } | undefined; + try { + model = sdk.getModel(providerName, modelId); + } catch { + model = undefined; + } + + if (!model) { + model = { + id: modelId, + name: modelId, + api: apiId, + provider: providerName, + baseUrl: baseUrl ?? '', + reasoning: false, + input: ['text'], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 128000, + maxTokens: 16384, + // biome-ignore lint/suspicious/noExplicitAny: minimal Model descriptor + } as any; + } + + // model is always defined past this point. 
+ // biome-ignore lint/style/noNonNullAssertion: see comment above + let m = model!; + if (m.api !== apiId) { + m = { ...m, api: apiId }; + } + if (baseUrl) { + m = { ...m, baseUrl }; + } + + return m; +} + +interface PiContext { + readonly systemPrompt: string | undefined; + // biome-ignore lint/suspicious/noExplicitAny: pi-ai Message shape + readonly messages: any[]; +} + +function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { + // Step 1 of the pi-ai migration only ports rubric-generator, which sends + // a single system prompt + a single user turn. We intentionally don't + // handle assistant/tool/function roles here — when later consumer steps + // need them (llm-grader's multi-turn / tool-use paths), add the cases + // alongside the work that exercises them. YAGNI today. + const systemSegments: string[] = []; + // biome-ignore lint/suspicious/noExplicitAny: pi-ai Message shape + const messages: any[] = []; + + for (const message of chatPrompt) { + if (message.role === 'system') { + systemSegments.push(message.content); + continue; + } + if (message.role !== 'user') { + throw new Error( + `pi-ai adapter received unsupported message role '${message.role}'. Only system + user are wired up in step 1 of the pi-ai migration (#1205).`, + ); + } + messages.push({ role: 'user', content: message.content, timestamp: Date.now() }); + } + + return { + systemPrompt: systemSegments.length > 0 ? 
systemSegments.join('\n\n') : undefined, + messages, + }; +} + +interface PiUsage { + readonly input: number; + readonly output: number; + readonly cacheRead: number; + readonly cost: { readonly total: number }; +} + +interface PiAssistantMessage { + readonly content: ReadonlyArray<{ type: string; text?: string }>; + readonly usage: PiUsage; +} + +function mapPiResponse( + result: PiAssistantMessage, + timing: { durationMs: number; startTime: string; endTime: string }, +): ProviderResponse { + const text = result.content + .filter((b) => b.type === 'text') + .map((b) => b.text ?? '') + .join(''); + + const cached = result.usage.cacheRead > 0 ? result.usage.cacheRead : undefined; + const tokenUsage = { + input: result.usage.input, + output: result.usage.output, + ...(cached !== undefined ? { cached } : {}), + }; + + return { + raw: result, + usage: toJsonObject(result.usage), + output: [{ role: 'assistant' as const, content: text }], + tokenUsage, + costUsd: result.usage.cost.total, + durationMs: timing.durationMs, + startTime: timing.startTime, + endTime: timing.endTime, + }; +} + async function withRetry( fn: () => Promise, retryConfig?: RetryConfig, diff --git a/packages/core/test/evaluation/providers/targets.test.ts b/packages/core/test/evaluation/providers/targets.test.ts index fba010d3..0e806120 100644 --- a/packages/core/test/evaluation/providers/targets.test.ts +++ b/packages/core/test/evaluation/providers/targets.test.ts @@ -38,6 +38,43 @@ const createOpenRouterMock = mock((options: unknown) => () => ({ const createAnthropicMock = mock(() => () => ({ provider: 'anthropic' })); const createGeminiMock = mock(() => () => ({ provider: 'gemini' })); +const piCompleteMock = mock(async () => ({ + content: [{ type: 'text', text: 'ok' }], + usage: { + input: 1, + output: 1, + cacheRead: 0, + cacheWrite: 0, + totalTokens: 2, + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 }, + }, + api: 'openai-completions', + provider: 'openai', + model: 
'gpt-test', + stopReason: 'stop', + timestamp: Date.now(), + role: 'assistant', +})); +const piGetModelMock = mock((provider: string, modelId: string) => ({ + id: modelId, + name: modelId, + api: 'openai-completions', + provider, + baseUrl: '', + reasoning: false, + input: ['text'], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 128000, + maxTokens: 16384, +})); +const piRegisterMock = mock(() => {}); + +mock.module('@mariozechner/pi-ai', () => ({ + complete: (...args: unknown[]) => piCompleteMock(...(args as [])), + getModel: (provider: string, modelId: string) => piGetModelMock(provider, modelId), + registerBuiltInApiProviders: () => piRegisterMock(), +})); + mock.module('ai', () => ({ generateText: () => generateTextMock(), })); @@ -1183,7 +1220,7 @@ describe('createProvider', () => { expect(extractLastAssistantContent(response.output)).toBe('ok'); }); - it('creates an openai provider that calls the Vercel AI SDK', async () => { + it('creates an openai provider that calls @mariozechner/pi-ai', async () => { const env = { OPENAI_ENDPOINT: 'https://llm-gateway.example.com/v1', OPENAI_API_KEY: 'openai-key', @@ -1201,13 +1238,16 @@ describe('createProvider', () => { env, ); + piCompleteMock.mockClear(); + piGetModelMock.mockClear(); + const provider = createProvider(resolved); expect(provider.kind).toBe('openai'); const response = await provider.invoke({ question: 'Hello from OpenAI' }); - expect(createOpenAIMock).toHaveBeenCalledTimes(1); - expect(generateTextMock).toHaveBeenCalledTimes(1); + expect(piGetModelMock).toHaveBeenCalledWith('openai', 'gpt-5.4'); + expect(piCompleteMock).toHaveBeenCalledTimes(1); expect(extractLastAssistantContent(response.output)).toBe('ok'); }); From dba722eee554d1480f211cee95a9409ff26cdbd5 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 00:31:01 +0200 Subject: [PATCH 05/20] fix(core): handle assistant/tool roles + safer baseUrl/cost in pi-ai adapter Address three review findings on 
the pi-ai adapter (#1205 step 1): 1. chatPromptToPiContext now passes assistant messages through and folds tool/function roles into prefixed assistant text, mirroring the Vercel path's toModelMessages. Previously turn 2+ of any multi-turn eval against an openai target threw on the prior turn's assistant message. 2. resolvePiModel falls back to https://api.openai.com/v1 for the openai provider when getModel misses and no baseUrl is configured, and throws a clear error otherwise. Empty baseUrl was forwarded into pi-ai's OpenAI client and failed opaquely. 3. mapPiResponse omits costUsd when pi-ai reports 0 (typically the fallback model descriptor with no pricing) instead of surfacing 0 as "free". Matches the Vercel path, which never sets costUsd. Co-Authored-By: Claude Opus 4.7 --- .../core/src/evaluation/providers/ai-sdk.ts | 74 ++++++++++++++++--- 1 file changed, 62 insertions(+), 12 deletions(-) diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts index 044f4845..919c7ae7 100644 --- a/packages/core/src/evaluation/providers/ai-sdk.ts +++ b/packages/core/src/evaluation/providers/ai-sdk.ts @@ -639,12 +639,22 @@ function resolvePiModel( } if (!model) { + // pi-ai's getModel didn't recognize this (provider, modelId) — typical when + // the user is on a custom gateway, a brand-new model, or an Azure deployment + // name. We must still hand pi-ai a non-empty baseUrl: pi-ai forwards it to + // `new OpenAI({ baseURL })` which misbehaves on empty string. + const fallbackBaseUrl = baseUrl ?? defaultBaseUrlFor(providerName); + if (!fallbackBaseUrl) { + throw new Error( + `pi-ai adapter cannot resolve a baseUrl for provider '${providerName}' / model '${modelId}'. Either set the target's baseUrl/endpoint or use a model id pi-ai recognizes.`, + ); + } model = { id: modelId, name: modelId, api: apiId, provider: providerName, - baseUrl: baseUrl ?? 
'', + baseUrl: fallbackBaseUrl, reasoning: false, input: ['text'], cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, @@ -667,6 +677,18 @@ function resolvePiModel( return m; } +/** + * Default baseUrl when `getModel` misses and the caller didn't supply one. + * Returning `undefined` makes resolvePiModel throw — preferable to passing an + * empty string into pi-ai's OpenAI client, which fails opaquely. + */ +function defaultBaseUrlFor(providerName: string): string | undefined { + if (providerName === 'openai') { + return 'https://api.openai.com/v1'; + } + return undefined; +} + interface PiContext { readonly systemPrompt: string | undefined; // biome-ignore lint/suspicious/noExplicitAny: pi-ai Message shape @@ -674,11 +696,13 @@ interface PiContext { } function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { - // Step 1 of the pi-ai migration only ports rubric-generator, which sends - // a single system prompt + a single user turn. We intentionally don't - // handle assistant/tool/function roles here — when later consumer steps - // need them (llm-grader's multi-turn / tool-use paths), add the cases - // alongside the work that exercises them. YAGNI today. + // OpenAIProvider.invoke() is reached from the orchestrator's multi-turn + // and single-turn paths, so the chatPrompt may legitimately contain + // `assistant` (prior turn output) and `tool`/`function` (rare — most callers + // remap these upstream in prompt-builder). We mirror the Vercel path's + // toModelMessages: pass assistant through as-is; fold tool/function back + // into assistant text with a `@[name]:` prefix so pi-ai sees a clean + // user/assistant alternation. 
const systemSegments: string[] = []; // biome-ignore lint/suspicious/noExplicitAny: pi-ai Message shape const messages: any[] = []; @@ -688,12 +712,32 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { systemSegments.push(message.content); continue; } - if (message.role !== 'user') { - throw new Error( - `pi-ai adapter received unsupported message role '${message.role}'. Only system + user are wired up in step 1 of the pi-ai migration (#1205).`, - ); + if (message.role === 'user') { + messages.push({ role: 'user', content: message.content, timestamp: Date.now() }); + continue; } - messages.push({ role: 'user', content: message.content, timestamp: Date.now() }); + if (message.role === 'assistant') { + // pi-ai's AssistantMessage type carries api/provider/model/usage/stopReason + // for round-trip continuity, but its OpenAI-completions converter only + // reads role + content blocks for replayed history. Omitting them is safe + // at runtime — `messages` is typed `any[]` to absorb the type mismatch. + messages.push({ + role: 'assistant', + content: [{ type: 'text', text: message.content }], + timestamp: Date.now(), + }); + continue; + } + if (message.role === 'tool' || message.role === 'function') { + const prefix = message.name ? `@[${message.name}]: ` : '@[Tool]: '; + messages.push({ + role: 'assistant', + content: [{ type: 'text', text: `${prefix}${message.content}` }], + timestamp: Date.now(), + }); + continue; + } + throw new Error(`pi-ai adapter received unsupported message role '${message.role}'.`); } return { @@ -730,12 +774,18 @@ function mapPiResponse( ...(cached !== undefined ? { cached } : {}), }; + // pi-ai always populates `cost.total`, but it computes 0 when the model + // descriptor lacks pricing (fallback descriptor for unknown ids, or pi-ai's + // registry simply not having rates yet). Surface 0 as "unknown" by leaving + // costUsd undefined — matches the Vercel path, which never sets it. 
+ const costUsd = result.usage.cost.total > 0 ? result.usage.cost.total : undefined; + return { raw: result, usage: toJsonObject(result.usage), output: [{ role: 'assistant' as const, content: text }], tokenUsage, - costUsd: result.usage.cost.total, + ...(costUsd !== undefined ? { costUsd } : {}), durationMs: timing.durationMs, startTime: timing.startTime, endTime: timing.endTime, From a7d51f03fe58e4153054dc7a288fe6a4db309ab9 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 06:12:43 +0200 Subject: [PATCH 06/20] =?UTF-8?q?refactor(core):=20treat=20pi-ai=20as=20a?= =?UTF-8?q?=20normal=20dep=20=E2=80=94=20drop=20dynamic-import=20dance?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Make pi-ai a first-class static dependency, like ai-sdk: - Add @sinclair/typebox as a direct dep so pi-ai's transitive types resolve. - Add packages/core/src/evaluation/providers/pi-ai-shim.d.ts that augments '@mariozechner/pi-ai' with the subset we use. Pi-ai's published d.ts has cross-module re-exports that don't surface at the package root under NodeNext (and Bundler) — only direct primary declarations leak through. Re-declaring just what we call gives us static imports + real types. - ai-sdk.ts: replace `let piAiSdk: any | null` + lazy `loadPiAi()` + `as any` casts with plain top-level imports of `complete`, `getModel`, `registerBuiltInApiProviders`, and the Model/Message/AssistantMessage types. registerBuiltInApiProviders() runs once at module load. The previous dynamic-import + any-cast pattern was inherited from pi-coding-agent.ts where pi-ai is an optional peer dep. Now that pi-ai is a real dep, that workaround was earning nothing and costing readability — this PR drops it across the new code path. (pi-coding-agent.ts itself keeps the lazy-load because the pi-coding-agent peer dep can be uninstalled.) 
Refs #1205 Co-Authored-By: Claude Opus 4.7 --- bun.lock | 1 + packages/core/package.json | 1 + .../core/src/evaluation/providers/ai-sdk.ts | 160 +++++++++--------- .../src/evaluation/providers/pi-ai-shim.d.ts | 113 +++++++++++++ 4 files changed, 191 insertions(+), 84 deletions(-) create mode 100644 packages/core/src/evaluation/providers/pi-ai-shim.d.ts diff --git a/bun.lock b/bun.lock index 1d7ece4f..ef7c0db7 100644 --- a/bun.lock +++ b/bun.lock @@ -99,6 +99,7 @@ "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", "@openrouter/ai-sdk-provider": "^2.3.1", + "@sinclair/typebox": "^0.34.41", "ai": "^6.0.0", "fast-glob": "^3.3.3", "json5": "^2.2.3", diff --git a/packages/core/package.json b/packages/core/package.json index b8f23a80..39308f31 100644 --- a/packages/core/package.json +++ b/packages/core/package.json @@ -50,6 +50,7 @@ "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", "@openrouter/ai-sdk-provider": "^2.3.1", + "@sinclair/typebox": "^0.34.41", "ai": "^6.0.0", "fast-glob": "^3.3.3", "json5": "^2.2.3", diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts index 919c7ae7..aa366d60 100644 --- a/packages/core/src/evaluation/providers/ai-sdk.ts +++ b/packages/core/src/evaluation/providers/ai-sdk.ts @@ -2,9 +2,21 @@ import { createAnthropic } from '@ai-sdk/anthropic'; import { type AzureOpenAIProviderSettings, createAzure } from '@ai-sdk/azure'; import { createGoogleGenerativeAI } from '@ai-sdk/google'; import { createOpenAI } from '@ai-sdk/openai'; +import { + type AssistantMessage as PiAssistantMessage, + type Message as PiMessage, + type Model as PiModel, + complete as piComplete, + getModel as piGetModel, + registerBuiltInApiProviders, +} from '@mariozechner/pi-ai'; import { createOpenRouter } from '@openrouter/ai-sdk-provider'; import { type LanguageModel, type ModelMessage, generateText } from 'ai'; +// pi-ai routes complete()/stream() by Model.api; the built-in 
providers must be +// registered once at module load. Cheap; idempotent across repeated imports. +registerBuiltInApiProviders(); + import type { JsonObject } from '../types.js'; import type { AnthropicResolvedConfig, @@ -530,12 +542,10 @@ async function sleep(ms: number): Promise { // (Vercel AI SDK) above is still in use for the four providers we have not // ported yet (Azure, OpenRouter, Anthropic, Gemini). // -// Why dynamic import + `any` casts: pi-ai ships .d.ts files whose top-level -// named exports do not resolve through TypeScript's NodeNext/Bundler module -// resolution (same issue worked around in pi-coding-agent.ts:250). The runtime -// exports are correct; only the static type graph is broken. We mirror the -// pi-coding-agent.ts pattern: load the module dynamically once, cast through -// `any`, and rely on the local invokePiAi shape for type safety. +// Types come through `@mariozechner/pi-ai` plus our local `pi-ai-shim.d.ts` +// ambient augmentation. Pi-ai's published d.ts re-exports do not surface at +// the package root under NodeNext, so the shim re-declares the small subset +// we use (Model, Message, complete, getModel, ...). See pi-ai-shim.d.ts. // // To port a provider: // 1. Map its config to the invokePiAi options below (api id, baseUrl, key). @@ -543,27 +553,6 @@ async function sleep(ms: number): Promise { // 3. Drop the createX() / this.model build from the constructor when // asLanguageModel() is no longer used by any consumer. -// biome-ignore lint/suspicious/noExplicitAny: pi-ai type defs do not statically resolve named exports; mirrors the existing workaround in pi-coding-agent.ts. 
-let piAiSdk: any | null = null; -let piAiLoading: Promise | null = null; - -async function loadPiAi(): Promise { - if (piAiSdk) return; - if (!piAiLoading) { - piAiLoading = (async () => { - const mod = await import('@mariozechner/pi-ai'); - // biome-ignore lint/suspicious/noExplicitAny: see comment above - const m = mod as any; - m.registerBuiltInApiProviders?.(); - piAiSdk = m; - })().catch((err) => { - piAiLoading = null; - throw err; - }); - } - await piAiLoading; -} - interface InvokePiAiOptions { /** pi-ai provider name (matches `KnownProvider`, e.g. 'openai'). */ readonly providerName: string; @@ -582,10 +571,7 @@ interface InvokePiAiOptions { async function invokePiAi(options: InvokePiAiOptions): Promise { const { providerName, apiId, modelId, apiKey, baseUrl, request, defaults, retryConfig } = options; - await loadPiAi(); - const sdk = piAiSdk; - - const model = resolvePiModel(sdk, { providerName, apiId, modelId, baseUrl }); + const model = resolvePiModel({ providerName, apiId, modelId, baseUrl }); const { systemPrompt, messages } = chatPromptToPiContext(buildChatPrompt(request)); const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); @@ -594,7 +580,7 @@ async function invokePiAi(options: InvokePiAiOptions): Promise const result = await withRetry( () => - sdk.complete( + piComplete( model, { systemPrompt, messages }, { @@ -611,29 +597,26 @@ async function invokePiAi(options: InvokePiAiOptions): Promise const endTime = new Date().toISOString(); const durationMs = Date.now() - startMs; - return mapPiResponse(result as PiAssistantMessage, { durationMs, startTime, endTime }); + return mapPiResponse(result, { durationMs, startTime, endTime }); } -function resolvePiModel( - // biome-ignore lint/suspicious/noExplicitAny: pi-ai SDK module — see top-of-section comment - sdk: any, - args: { - providerName: string; - apiId: string; - modelId: string; - baseUrl?: string; - }, - // biome-ignore lint/suspicious/noExplicitAny: pi-ai Model 
shape -): any { +function resolvePiModel(args: { + providerName: string; + apiId: string; + modelId: string; + baseUrl?: string; +}): PiModel { const { providerName, apiId, modelId, baseUrl } = args; - // pi-ai's getModel returns a strongly-typed Model when the (provider, - // modelId) pair is in its generated registry. For runtime-string configs or - // unknown model ids we construct a minimal descriptor with the same shape - // the providers consume — every field below is required. - let model: { api: string; baseUrl: string } | undefined; + // pi-ai's getModel returns a Model when the (provider, modelId) pair is in + // its generated registry. For runtime-string configs or unknown model ids + // we construct a minimal descriptor — every field is required by Model. + // piGetModel's upstream signature is generic over a typed model registry; at + // runtime the strings flow through and it returns a plain Model. The cast + // converts the unresolved generic return type to the shim's Model. + let model: PiModel | undefined; try { - model = sdk.getModel(providerName, modelId); + model = piGetModel(providerName, modelId) as PiModel; } catch { model = undefined; } @@ -660,21 +643,17 @@ function resolvePiModel( cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, contextWindow: 128000, maxTokens: 16384, - // biome-ignore lint/suspicious/noExplicitAny: minimal Model descriptor - } as any; + }; } - // model is always defined past this point. 
- // biome-ignore lint/style/noNonNullAssertion: see comment above - let m = model!; - if (m.api !== apiId) { - m = { ...m, api: apiId }; + if (model.api !== apiId) { + model = { ...model, api: apiId }; } if (baseUrl) { - m = { ...m, baseUrl }; + model = { ...model, baseUrl }; } - return m; + return model; } /** @@ -691,8 +670,7 @@ function defaultBaseUrlFor(providerName: string): string | undefined { interface PiContext { readonly systemPrompt: string | undefined; - // biome-ignore lint/suspicious/noExplicitAny: pi-ai Message shape - readonly messages: any[]; + readonly messages: PiMessage[]; } function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { @@ -703,9 +681,15 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { // toModelMessages: pass assistant through as-is; fold tool/function back // into assistant text with a `@[name]:` prefix so pi-ai sees a clean // user/assistant alternation. + // + // Pi-ai's AssistantMessage type carries api/provider/model/usage/stopReason + // for round-trip continuity, but its OpenAI-completions converter only reads + // role + content blocks for replayed history. We synthesize a minimal + // assistant turn with placeholder metadata — pi-ai ignores those fields when + // converting to the wire format. 
const systemSegments: string[] = []; - // biome-ignore lint/suspicious/noExplicitAny: pi-ai Message shape - const messages: any[] = []; + const messages: PiMessage[] = []; + const now = Date.now(); for (const message of chatPrompt) { if (message.role === 'system') { @@ -713,18 +697,26 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { continue; } if (message.role === 'user') { - messages.push({ role: 'user', content: message.content, timestamp: Date.now() }); + messages.push({ role: 'user', content: message.content, timestamp: now }); continue; } if (message.role === 'assistant') { - // pi-ai's AssistantMessage type carries api/provider/model/usage/stopReason - // for round-trip continuity, but its OpenAI-completions converter only - // reads role + content blocks for replayed history. Omitting them is safe - // at runtime — `messages` is typed `any[]` to absorb the type mismatch. messages.push({ role: 'assistant', content: [{ type: 'text', text: message.content }], - timestamp: Date.now(), + api: '', + provider: '', + model: '', + usage: { + input: 0, + output: 0, + cacheRead: 0, + cacheWrite: 0, + totalTokens: 0, + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 }, + }, + stopReason: 'stop', + timestamp: now, }); continue; } @@ -733,7 +725,19 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { messages.push({ role: 'assistant', content: [{ type: 'text', text: `${prefix}${message.content}` }], - timestamp: Date.now(), + api: '', + provider: '', + model: '', + usage: { + input: 0, + output: 0, + cacheRead: 0, + cacheWrite: 0, + totalTokens: 0, + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 }, + }, + stopReason: 'stop', + timestamp: now, }); continue; } @@ -746,25 +750,13 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { }; } -interface PiUsage { - readonly input: number; - readonly output: number; - readonly cacheRead: number; - readonly cost: { readonly total: 
number }; -} - -interface PiAssistantMessage { - readonly content: ReadonlyArray<{ type: string; text?: string }>; - readonly usage: PiUsage; -} - function mapPiResponse( result: PiAssistantMessage, timing: { durationMs: number; startTime: string; endTime: string }, ): ProviderResponse { const text = result.content - .filter((b) => b.type === 'text') - .map((b) => b.text ?? '') + .filter((b): b is { type: 'text'; text: string } => b.type === 'text') + .map((b) => b.text) .join(''); const cached = result.usage.cacheRead > 0 ? result.usage.cacheRead : undefined; diff --git a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts new file mode 100644 index 00000000..0a5f2ec4 --- /dev/null +++ b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts @@ -0,0 +1,113 @@ +// Augments '@mariozechner/pi-ai' types with the subset we use. +// Pi-ai's published d.ts has cross-module re-exports (`export * from`, +// `export { X } from`) that TypeScript's NodeNext resolution does not surface +// at the top-level — only direct primary declarations make it through (e.g. +// `getModel` from models.d.ts is fine; `complete` from stream.d.ts isn't). +// This shim re-declares the surface we depend on so our code can use plain +// static imports and real types instead of dynamic-import + any casts. +// +// Keep this minimal: only what we actively call. Mirror the upstream shape +// from node_modules/.bun/@mariozechner+pi-ai@*/dist/*.d.ts. 
+ +declare module '@mariozechner/pi-ai' { + // ---- types/types.d.ts ---- + export type Api = string; + export type KnownProvider = string; + export type Provider = string; + export type ThinkingLevel = 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'; + + export interface TextContent { + type: 'text'; + text: string; + } + export interface ThinkingContent { + type: 'thinking'; + thinking: string; + } + export interface ImageContent { + type: 'image'; + [k: string]: unknown; + } + export interface ToolCall { + type: 'toolCall'; + id: string; + name: string; + arguments: unknown; + } + + export interface Usage { + input: number; + output: number; + cacheRead: number; + cacheWrite: number; + totalTokens: number; + cost: { + input: number; + output: number; + cacheRead: number; + cacheWrite: number; + total: number; + }; + } + + export interface UserMessage { + role: 'user'; + content: string | Array<TextContent | ImageContent>; + timestamp: number; + } + export interface AssistantMessage { + role: 'assistant'; + content: Array<TextContent | ThinkingContent | ToolCall>; + api: Api; + provider: Provider; + model: string; + usage: Usage; + stopReason: 'stop' | 'length' | 'toolUse' | 'error' | 'aborted'; + timestamp: number; + } + export interface ToolResultMessage { + role: 'toolResult'; + toolCallId: string; + toolName: string; + content: Array<TextContent | ImageContent>; + isError: boolean; + timestamp: number; + } + export type Message = UserMessage | AssistantMessage | ToolResultMessage; + + export interface Model { + id: string; + name: string; + api: Api; + provider: Provider; + baseUrl: string; + reasoning: boolean; + input: ReadonlyArray<'text' | 'image'>; + cost: { input: number; output: number; cacheRead: number; cacheWrite: number }; + contextWindow: number; + maxTokens: number; + } + + export interface Context { + systemPrompt?: string; + messages: Message[]; + } + + export interface StreamOptions { + temperature?: number; + maxTokens?: number; + apiKey?: string; + signal?: AbortSignal; + headers?: Record<string, string>; + } + + // ---- stream.d.ts ---- + export
function complete( + model: Model, + context: Context, + options?: StreamOptions, + ): Promise<AssistantMessage>; + + // ---- providers/register-builtins.d.ts ---- + export function registerBuiltInApiProviders(): void; +} From ca907df237b8b688f201bb821f75b82c43786170 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 06:25:03 +0200 Subject: [PATCH 07/20] refactor(core): pre-resolve pi-ai Model in OpenAIProvider constructor MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Lean into pi-ai's design rather than papering over it. Pi-ai treats Model as plain data and apiKey as a per-call StreamOptions field — model and credentials are orthogonal. Reflect that in the adapter: - Add `private readonly piModel: PiModel` field; resolved once in the constructor via resolvePiModel(). - invoke() passes the prebuilt model + apiKey to invokePiAi(); no per-call registry lookup or field merge. - InvokePiAiOptions shrinks from 8 fields to 5 — model is data, the call needs the model + auth + the request. The previous shape rebuilt the model on every invoke from raw config strings, conflating "what model" with "construction details" at the call site. The new shape is both more efficient (resolve once) and more faithful to pi-ai's API: a Model object you carry around, an apiKey you pass when you actually call.
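The resolve-once / auth-per-call shape can be sketched like this (hypothetical minimal types, not the real AgentV or pi-ai interfaces — `PiModelLike` and `resolveModelOnce` stand in for the actual `Model` and `resolvePiModel`):

```typescript
// Hypothetical stand-in for a pi-ai Model descriptor: plain data, no auth.
interface PiModelLike {
  id: string;
  api: string;
  baseUrl: string;
}

// Stand-in for resolvePiModel(): one registry lookup at construction time.
function resolveModelOnce(provider: string, modelId: string): PiModelLike {
  return {
    id: modelId,
    api: `${provider}-completions`,
    baseUrl: `https://api.${provider}.example/v1`, // illustrative URL only
  };
}

class ProviderSketch {
  // Resolved once; reused on every invoke without a registry lookup.
  private readonly piModel: PiModelLike;

  constructor(private readonly apiKey: string, modelId: string) {
    this.piModel = resolveModelOnce('openai', modelId);
  }

  // Each call carries the prebuilt descriptor plus the per-call credential,
  // mirroring pi-ai's StreamOptions.apiKey.
  invoke(): { model: PiModelLike; options: { apiKey: string } } {
    return { model: this.piModel, options: { apiKey: this.apiKey } };
  }
}

const p = new ProviderSketch('sk-test', 'some-model-id');
const a = p.invoke();
const b = p.invoke();
// Same descriptor object on every call — construction details stay in the
// constructor, credentials stay with the call.
```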
Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .../core/src/evaluation/providers/ai-sdk.ts | 34 +++++++++++-------- 1 file changed, 20 insertions(+), 14 deletions(-) diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts index aa366d60..3f1ccd52 100644 --- a/packages/core/src/evaluation/providers/ai-sdk.ts +++ b/packages/core/src/evaluation/providers/ai-sdk.ts @@ -45,7 +45,15 @@ export class OpenAIProvider implements Provider { readonly kind = 'openai' as const; readonly targetName: string; + // Vercel LanguageModel kept only for asLanguageModel() callers (llm-grader, + // composite, agentv-provider) until they migrate off it in #1205. Once gone, + // delete this field and the createOpenAI build below. private readonly model: LanguageModel; + // pi-ai's Model is plain data — what model, where it lives — with no auth. + // We resolve once at construction (registry lookup + field merges) and pass + // it on each invoke. apiKey stays a per-call StreamOptions field, mirroring + // pi-ai's own API: model and credentials are orthogonal concerns. + private readonly piModel: PiModel; private readonly defaults: ProviderDefaults; private readonly retryConfig?: RetryConfig; @@ -67,15 +75,19 @@ export class OpenAIProvider implements Provider { }); this.model = config.apiFormat === 'responses' ? openai(config.model) : openai.chat(config.model); + + this.piModel = resolvePiModel({ + providerName: 'openai', + apiId: config.apiFormat === 'responses' ? 'openai-responses' : 'openai-completions', + modelId: config.model, + baseUrl: config.baseURL, + }); } async invoke(request: ProviderRequest): Promise { return invokePiAi({ - providerName: 'openai', - apiId: this.config.apiFormat === 'responses' ? 
'openai-responses' : 'openai-completions', - modelId: this.config.model, + model: this.piModel, apiKey: this.config.apiKey, - baseUrl: this.config.baseURL, request, defaults: this.defaults, retryConfig: this.retryConfig, @@ -554,24 +566,18 @@ async function sleep(ms: number): Promise<void> { // asLanguageModel() is no longer used by any consumer. interface InvokePiAiOptions { - /** pi-ai provider name (matches `KnownProvider`, e.g. 'openai'). */ - readonly providerName: string; - /** pi-ai api id, picks which provider impl runs the call. */ - readonly apiId: string; - /** Model id from user config (may or may not exist in pi-ai's registry). */ - readonly modelId: string; + /** Pre-resolved pi-ai model (built once in the provider constructor). */ + readonly model: PiModel; + /** Per-call credential — pi-ai treats apiKey as a StreamOptions field. */ readonly apiKey: string; - /** Optional baseUrl override; falls back to the registry's default. */ - readonly baseUrl?: string; readonly request: ProviderRequest; readonly defaults: ProviderDefaults; readonly retryConfig?: RetryConfig; } async function invokePiAi(options: InvokePiAiOptions): Promise<ProviderResponse> { - const { providerName, apiId, modelId, apiKey, baseUrl, request, defaults, retryConfig } = options; + const { model, apiKey, request, defaults, retryConfig } = options; - const model = resolvePiModel({ providerName, apiId, modelId, baseUrl }); const { systemPrompt, messages } = chatPromptToPiContext(buildChatPrompt(request)); const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); From 18236ac9917a37b831b4a0ec1af725e350abc501 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 10:31:24 +0200 Subject: [PATCH 08/20] fix(cli): declare @mariozechner/pi-ai as a runtime dep MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The CLI bundles @agentv/core (noExternal), and core now imports pi-ai directly.
tsup keeps pi-ai external in the bundle (correct — it has dynamic requires), so the published CLI needs pi-ai resolvable at runtime. apps/cli/package.json wasn't listing it, which surfaced as "Cannot find module '@mariozechner/pi-ai'" in CI's Validate Evals job. Reproduces locally with `bun apps/cli/dist/cli.js validate ...`; passes after adding the dep. Refs #1205 Co-Authored-By: Claude Opus 4.7 --- apps/cli/package.json | 1 + bun.lock | 1 + 2 files changed, 2 insertions(+) diff --git a/apps/cli/package.json b/apps/cli/package.json index 28484055..ea34b2e1 100644 --- a/apps/cli/package.json +++ b/apps/cli/package.json @@ -33,6 +33,7 @@ "@github/copilot-sdk": "^0.1.25", "@hono/node-server": "^1.19.11", "@inquirer/prompts": "^8.2.1", + "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", "cmd-ts": "^0.14.3", "dotenv": "^16.4.5", diff --git a/bun.lock b/bun.lock index ef7c0db7..fe9c6c59 100644 --- a/bun.lock +++ b/bun.lock @@ -33,6 +33,7 @@ "@github/copilot-sdk": "^0.1.25", "@hono/node-server": "^1.19.11", "@inquirer/prompts": "^8.2.1", + "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", "cmd-ts": "^0.14.3", "dotenv": "^16.4.5", From 6dbb106b481f607297659e06944b42f1150cfb40 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 12:01:44 +0200 Subject: [PATCH 09/20] feat(core): teach Provider.invoke about tools + multi-step MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Extend the Provider interface so invoke() can replace asLanguageModel() across every grader call site. The new fields are additive — single-shot consumers keep their current shape. 
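The loop this commit adds can be sketched roughly as follows (hypothetical minimal types — `Turn`, `ToolLike`, and `runAgentLoop` are illustrative stand-ins, not the real pi-ai or AgentV API):

```typescript
// One "step" = a model turn plus execution of any tool calls it emitted.
type ToolCallSketch = { id: string; name: string; arguments: unknown };
type Turn = { text: string; toolCalls: ToolCallSketch[] };

interface ToolLike {
  name: string;
  execute(input: unknown): unknown;
}

function runAgentLoop(
  modelTurn: (history: unknown[]) => Turn,
  tools: ToolLike[],
  maxSteps: number,
): { finalText: string; stepCount: number; toolCallCount: number } {
  const history: unknown[] = [];
  let result = modelTurn(history);
  let stepCount = 1;
  let toolCallCount = 0;

  // Re-invoke until the model stops requesting tools or maxSteps is hit.
  while (result.toolCalls.length > 0 && stepCount < maxSteps) {
    toolCallCount += result.toolCalls.length;
    for (const call of result.toolCalls) {
      const tool = tools.find((t) => t.name === call.name);
      // In the real adapter, errors are surfaced to the model as
      // tool-error results rather than thrown.
      history.push({ role: 'toolResult', id: call.id, output: tool?.execute(call.arguments) });
    }
    result = modelTurn(history);
    stepCount += 1;
  }
  return { finalText: result.text, stepCount, toolCallCount };
}

// Fake model: one tool request, then a final answer.
const fakeTurns: Turn[] = [
  { text: '', toolCalls: [{ id: '1', name: 'ping', arguments: {} }] },
  { text: 'done', toolCalls: [] },
];
let i = 0;
const out = runAgentLoop(() => fakeTurns[i++], [{ name: 'ping', execute: () => 'pong' }], 5);
// out: { finalText: 'done', stepCount: 2, toolCallCount: 1 }
```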
types.ts: - Add ProviderTool: { name, description, parameters: JsonObject (JSON Schema), execute(input): unknown } - ProviderRequest: optional tools, maxSteps - ProviderResponse: optional steps: { count, toolCallCount } ai-sdk.ts (invokePiAi): - Run the agent loop when tools are provided: model turn → execute tool calls → next model turn, until the model stops requesting tools or maxSteps is hit. - Aggregate token usage and cost across all turns; surface step + tool counts on the response. - Tool parameters flow as JSON Schema — pi-ai's openai-completions converter passes them through to the wire format unchanged. pi-ai-shim.d.ts: - Declare Tool, Context.tools so the loop typechecks. - Declare ToolCall.thoughtSignature (set by some providers, optional). No consumer changes yet; next commit migrates llm-grader / composite / agentv-provider / rubric-generator off asLanguageModel onto invoke(). Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .../core/src/evaluation/providers/ai-sdk.ts | 125 +++++++++++++++--- .../src/evaluation/providers/pi-ai-shim.d.ts | 15 +++ .../core/src/evaluation/providers/types.ts | 45 +++++++ 3 files changed, 167 insertions(+), 18 deletions(-) diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts index 3f1ccd52..cb88e772 100644 --- a/packages/core/src/evaluation/providers/ai-sdk.ts +++ b/packages/core/src/evaluation/providers/ai-sdk.ts @@ -6,6 +6,8 @@ import { type AssistantMessage as PiAssistantMessage, type Message as PiMessage, type Model as PiModel, + type Tool as PiTool, + type ToolCall as PiToolCall, complete as piComplete, getModel as piGetModel, registerBuiltInApiProviders, @@ -577,33 +579,111 @@ interface InvokePiAiOptions { async function invokePiAi(options: InvokePiAiOptions): Promise<ProviderResponse> { const { model, apiKey, request, defaults, retryConfig } = options; + const tools = request.tools && request.tools.length > 0 ? request.tools : undefined; + const maxSteps = tools ?
Math.max(1, request.maxSteps ?? 1) : 1; const { systemPrompt, messages } = chatPromptToPiContext(buildChatPrompt(request)); + const piTools: PiTool[] | undefined = tools + ? tools.map((t) => ({ + name: t.name, + description: t.description, + parameters: t.parameters, + })) + : undefined; + const ctx = { systemPrompt, messages, ...(piTools ? { tools: piTools } : {}) }; const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); + const callOptions = { + apiKey, + temperature, + ...(maxOutputTokens !== undefined ? { maxTokens: maxOutputTokens } : {}), + signal: request.signal, + }; const startTime = new Date().toISOString(); const startMs = Date.now(); - const result = await withRetry( - () => - piComplete( - model, - { systemPrompt, messages }, - { - apiKey, - temperature, - ...(maxOutputTokens !== undefined ? { maxTokens: maxOutputTokens } : {}), - signal: request.signal, - }, - ), + const aggregateUsage: AggregatedUsage = { input: 0, output: 0, cacheRead: 0, cost: 0 }; + let stepCount = 0; + let toolCallCount = 0; + let result: PiAssistantMessage = await withRetry( + () => piComplete(model, ctx, callOptions), retryConfig, request.signal, ); + ctx.messages.push(result); + stepCount = 1; + accumulateUsage(aggregateUsage, result.usage); + + // Agent loop: run tool calls and re-invoke until the model stops requesting + // tools or we hit maxSteps. Single-shot calls (no tools) skip this entirely. 
+ while (tools) { + const calls = result.content.filter( + (b: PiAssistantMessage['content'][number]): b is PiToolCall => b.type === 'toolCall', + ); + if (calls.length === 0) break; + if (stepCount >= maxSteps) break; + + toolCallCount += calls.length; + + for (const call of calls) { + const tool = tools.find((t) => t.name === call.name); + let output: unknown; + let isError = false; + try { + if (!tool) { + throw new Error(`pi-ai adapter: model called unknown tool '${call.name}'`); + } + output = await tool.execute(call.arguments); + } catch (err) { + output = err instanceof Error ? err.message : String(err); + isError = true; + } + ctx.messages.push({ + role: 'toolResult', + toolCallId: call.id, + toolName: call.name, + content: [ + { type: 'text', text: typeof output === 'string' ? output : JSON.stringify(output) }, + ], + isError, + timestamp: Date.now(), + }); + } + + result = await withRetry( + () => piComplete(model, ctx, callOptions), + retryConfig, + request.signal, + ); + ctx.messages.push(result); + stepCount += 1; + accumulateUsage(aggregateUsage, result.usage); + } const endTime = new Date().toISOString(); const durationMs = Date.now() - startMs; - return mapPiResponse(result, { durationMs, startTime, endTime }); + return mapPiResponse(result, { + durationMs, + startTime, + endTime, + aggregateUsage, + steps: tools ? 
{ count: stepCount, toolCallCount } : undefined, + }); +} + +interface AggregatedUsage { + input: number; + output: number; + cacheRead: number; + cost: number; +} + +function accumulateUsage(agg: AggregatedUsage, u: PiAssistantMessage['usage']): void { + agg.input += u.input; + agg.output += u.output; + agg.cacheRead += u.cacheRead; + agg.cost += u.cost.total; } function resolvePiModel(args: { @@ -758,17 +838,25 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { function mapPiResponse( result: PiAssistantMessage, - timing: { durationMs: number; startTime: string; endTime: string }, + timing: { + durationMs: number; + startTime: string; + endTime: string; + aggregateUsage: AggregatedUsage; + steps?: { count: number; toolCallCount: number }; + }, ): ProviderResponse { const text = result.content .filter((b): b is { type: 'text'; text: string } => b.type === 'text') .map((b) => b.text) .join(''); - const cached = result.usage.cacheRead > 0 ? result.usage.cacheRead : undefined; + // Token usage is aggregated across all model turns in the agent loop, not + // just the final turn. Single-shot calls have aggregateUsage == lastTurnUsage. + const cached = timing.aggregateUsage.cacheRead > 0 ? timing.aggregateUsage.cacheRead : undefined; const tokenUsage = { - input: result.usage.input, - output: result.usage.output, + input: timing.aggregateUsage.input, + output: timing.aggregateUsage.output, ...(cached !== undefined ? { cached } : {}), }; @@ -776,7 +864,7 @@ function mapPiResponse( // descriptor lacks pricing (fallback descriptor for unknown ids, or pi-ai's // registry simply not having rates yet). Surface 0 as "unknown" by leaving // costUsd undefined — matches the Vercel path, which never sets it. - const costUsd = result.usage.cost.total > 0 ? result.usage.cost.total : undefined; + const costUsd = timing.aggregateUsage.cost > 0 ? 
timing.aggregateUsage.cost : undefined; return { raw: result, @@ -787,6 +875,7 @@ function mapPiResponse( durationMs: timing.durationMs, startTime: timing.startTime, endTime: timing.endTime, + ...(timing.steps ? { steps: timing.steps } : {}), }; } diff --git a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts index 0a5f2ec4..780e7d0b 100644 --- a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts +++ b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts @@ -33,6 +33,7 @@ declare module '@mariozechner/pi-ai' { id: string; name: string; arguments: unknown; + thoughtSignature?: string; } export interface Usage { @@ -88,9 +89,23 @@ declare module '@mariozechner/pi-ai' { maxTokens: number; } + /** + * Pi-ai's Tool wraps a TypeBox schema; we send JSON Schema directly via the + * adapter, so the relaxed `parameters: object` here lets us pass plain + * JSON-Schema objects without round-tripping through TypeBox builders. Pi-ai + * forwards `parameters` to the provider's wire format unchanged (it + * stringifies it for OpenAI completions, etc.) so this is safe at runtime. + */ + export interface Tool { + name: string; + description: string; + parameters: object; + } + export interface Context { systemPrompt?: string; messages: Message[]; + tools?: Tool[]; } export interface StreamOptions { diff --git a/packages/core/src/evaluation/providers/types.ts b/packages/core/src/evaluation/providers/types.ts index 573f42df..f25a4abf 100644 --- a/packages/core/src/evaluation/providers/types.ts +++ b/packages/core/src/evaluation/providers/types.ts @@ -139,6 +139,26 @@ export interface ProviderStreamCallbacks { getActiveSpanIds?: () => { parentSpanId: string; rootSpanId: string } | null; } +/** + * A tool the model may call during multi-step provider execution. 
Pi-ai-shaped: + * the parameter shape is JSON Schema (provider-library-neutral wire format), + * and execute() is invoked by the provider once the model emits a tool call. + */ +export interface ProviderTool { + /** Tool name as shown to the model. */ + readonly name: string; + /** Tool description as shown to the model. */ + readonly description: string; + /** JSON Schema for the tool's input. */ + readonly parameters: JsonObject; + /** + * Executes the tool. Receives the parsed input the model produced. Errors + * are caught and surfaced to the model as tool-error results; the loop + * continues unless `maxSteps` is reached. + */ + execute(input: unknown): Promise<unknown> | unknown; +} + export interface ProviderRequest { readonly question: string; readonly systemPrompt?: string; @@ -160,6 +180,18 @@ export interface ProviderRequest { readonly streamCallbacks?: ProviderStreamCallbacks; /** Braintrust span IDs for trace-claude-code plugin (optional) */ readonly braintrustSpanIds?: { readonly parentSpanId: string; readonly rootSpanId: string }; + /** + * Tools the model may call. When provided, the provider runs the agent loop: + * model turn → tool execution → model turn, repeated until the model returns + * no further tool calls or `maxSteps` is reached. Required for built-in agent + * grader mode (filesystem-introspection rubrics). + */ + readonly tools?: readonly ProviderTool[]; + /** + * Maximum number of agent loop iterations (model turn + tool execution = one + * step). Required when `tools` is non-empty. Ignored otherwise. + */ + readonly maxSteps?: number; } /** @@ -225,6 +257,17 @@ export interface ProviderTokenUsage { readonly reasoning?: number; } +/** + * Per-step trace summary for tool-using provider calls. Populated only when + * the request had `tools`. Single-shot calls leave `steps` undefined. + */ +export interface ProviderStepInfo { + /** Number of agent loop steps executed (1 = single model turn, no tool calls).
*/ + readonly count: number; + /** Total tool calls across all steps. */ + readonly toolCallCount: number; +} + export interface ProviderResponse { readonly raw?: unknown; readonly usage?: JsonObject; @@ -240,6 +283,8 @@ export interface ProviderResponse { readonly startTime?: string; /** ISO 8601 timestamp when execution ended (optional) */ readonly endTime?: string; + /** Multi-step trace summary; populated only when the request used `tools`. */ + readonly steps?: ProviderStepInfo; /** * Synthetic unified diff of files generated by the provider outside the * eval workspace_path (e.g. copilot session-state artifacts in From 43f6076dda25f2752be08c96fd3fe1887ac6aacd Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 12:14:39 +0200 Subject: [PATCH 10/20] refactor(core): migrate all grader consumers off asLanguageModel MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Every grader call site now goes through Provider.invoke(). The Vercel LanguageModel branches are gone; provider.invoke() is the single API. composite.ts: - Drop the asLanguageModel + generateText branch; rely on provider.invoke() (which used to be the fallback path). llm-grader.ts: - LLM-judge mode (generateStructuredResponse): single invoke() call. Image inputs flow as ProviderRequest.images instead of ai-sdk image parts. - Built-in agent mode (evaluateBuiltIn): replace generateText({tools, stopWhen}) with provider.invoke({tools, maxSteps}); read step + tool counts off ProviderResponse.steps. - Filesystem tools (createFilesystemTools) now return ProviderTool[] with JSON Schema parameters — no zod, no ai-sdk tool() helper. - Drop ai-sdk imports (generateText, stepCountIs, tool); drop toAiSdkImageParts. agentv-provider.ts: - Was: throws on invoke(), exposes Vercel asLanguageModel(). - Now: parses provider:model into pi-ai (providerName, apiId), resolves the PiModel in the constructor, and routes invoke() through invokePiAi(). 
API keys come from pi-ai's env-var fallback (OPENAI_API_KEY, ANTHROPIC_API_KEY, GOOGLE_GENERATIVE_AI_API_KEY, ...). ai-sdk.ts: - Export resolvePiModel, invokePiAi, ProviderDefaults so other providers can be ported without copying the adapter. - InvokePiAiOptions.apiKey is now optional (agentv provider relies on env fallback). - invokePiAi handles the agent loop: tool calls → execute → next model turn, bounded by maxSteps. Aggregates token usage and cost across turns. types.ts: - ProviderRequest.images: optional ContentImage[] for multimodal grader inputs. Tests: - agentv-provider.test.ts: rewritten — mocks pi-ai, asserts the new provider:model → (providerName, modelId) routing and that invoke() calls pi-ai's complete(). - llm-grader-multimodal.test.ts: rewritten — verifies images flow through ProviderRequest.images instead of ai-sdk message parts. Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .../core/src/evaluation/graders/composite.ts | 26 --- .../core/src/evaluation/graders/llm-grader.ts | 173 +++++++----------- .../evaluation/providers/agentv-provider.ts | 121 ++++++------ .../core/src/evaluation/providers/ai-sdk.ts | 55 +++++- .../src/evaluation/providers/pi-ai-shim.d.ts | 5 +- .../core/src/evaluation/providers/types.ts | 8 +- .../evaluation/llm-grader-multimodal.test.ts | 157 ++++++---------- .../providers/agentv-provider.test.ts | 134 ++++++-------- 8 files changed, 302 insertions(+), 377 deletions(-) diff --git a/packages/core/src/evaluation/graders/composite.ts b/packages/core/src/evaluation/graders/composite.ts index aa497438..66c88fe2 100644 --- a/packages/core/src/evaluation/graders/composite.ts +++ b/packages/core/src/evaluation/graders/composite.ts @@ -1,5 +1,3 @@ -import { generateText } from 'ai'; - import { extractLastAssistantContent } from '../providers/types.js'; import type { AssertionEntry, @@ -340,30 +338,6 @@ export class CompositeGrader implements Grader { }; try { - const model = graderProvider.asLanguageModel?.(); - if (model) { - const 
{ text } = await generateText({ - model, - system: systemPrompt, - prompt: userPrompt, - }); - - const data = freeformEvaluationSchema.parse(parseJsonFromText(text)); - const score = clampScore(data.score); - const assertions: AssertionEntry[] = Array.isArray(data.assertions) - ? data.assertions.slice(0, 8) - : []; - - return { - score, - verdict: scoreToVerdict(score), - assertions, - expectedAspectCount: Math.max(assertions.length, 1), - graderRawRequest, - scores, - }; - } - const response = await graderProvider.invoke({ question: userPrompt, systemPrompt, diff --git a/packages/core/src/evaluation/graders/llm-grader.ts b/packages/core/src/evaluation/graders/llm-grader.ts index 4e192d78..08840dea 100644 --- a/packages/core/src/evaluation/graders/llm-grader.ts +++ b/packages/core/src/evaluation/graders/llm-grader.ts @@ -1,7 +1,6 @@ import fs from 'node:fs/promises'; import path from 'node:path'; -import { generateText, stepCountIs, tool } from 'ai'; import { z } from 'zod'; import { @@ -10,7 +9,7 @@ import { } from '../content-preprocessor.js'; import type { ContentImage } from '../content.js'; import { isContentArray } from '../content.js'; -import type { Message, Provider, ProviderResponse } from '../providers/types.js'; +import type { Message, Provider, ProviderResponse, ProviderTool } from '../providers/types.js'; import { extractLastAssistantContent, isAgentProvider } from '../providers/types.js'; import { DEPRECATED_TEMPLATE_VARIABLES, TEMPLATE_VARIABLES } from '../template-variables.js'; import type { TokenUsage } from '../trace.js'; @@ -482,13 +481,6 @@ export class LlmGrader implements Grader { context: EvaluationContext, graderProvider: Provider, ): Promise { - const model = graderProvider.asLanguageModel?.(); - if (!model) { - throw new Error( - `Grader provider '${graderProvider.targetName}' does not support asLanguageModel() — required for built-in agent mode`, - ); - } - const workspacePath = context.workspacePath; if (!workspacePath) { throw new 
Error( @@ -512,20 +504,23 @@ export class LlmGrader implements Grader { }; try { - const { text, steps } = await generateText({ - model, - system: systemPrompt, - prompt: userPrompt, - tools: fsTools, - stopWhen: stepCountIs(this.maxSteps), + const response = await graderProvider.invoke({ + question: userPrompt, + systemPrompt, + evalCaseId: context.evalCase.id, + attempt: context.attempt, temperature: this.temperature ?? 0, + tools: fsTools, + maxSteps: this.maxSteps, }); - const toolCallCount = steps.reduce((count, step) => count + (step.toolCalls?.length ?? 0), 0); + const text = extractLastAssistantContent(response.output); + const stepCount = response.steps?.count ?? 1; + const toolCallCount = response.steps?.toolCallCount ?? 0; const details: JsonObject = { mode: 'built-in', - steps: steps.length, + steps: stepCount, tool_calls: toolCallCount, }; @@ -1103,44 +1098,6 @@ export class LlmGrader implements Grader { }): Promise { const { context, graderProvider, systemPrompt, userPrompt, images } = options; - const model = graderProvider.asLanguageModel?.(); - if (model) { - const modelOptions = { - ...(this.maxOutputTokens ? { maxTokens: this.maxOutputTokens } : {}), - ...(typeof this.temperature === 'number' ? { temperature: this.temperature } : {}), - }; - - const hasImages = images && images.length > 0; - const result = hasImages - ? await generateText({ - model, - system: systemPrompt, - messages: [ - { - role: 'user' as const, - content: [ - { type: 'text' as const, text: userPrompt }, - ...toAiSdkImageParts(images), - ], - }, - ], - ...modelOptions, - }) - : await generateText({ - model, - system: systemPrompt, - prompt: userPrompt, - ...modelOptions, - }); - - const rawUsage = result.usage; - const tokenUsage = - rawUsage?.inputTokens != null && rawUsage?.outputTokens != null - ? 
{ input: rawUsage.inputTokens, output: rawUsage.outputTokens } - : undefined; - return { text: result.text, tokenUsage }; - } - const response = await graderProvider.invoke({ question: userPrompt, systemPrompt, @@ -1148,6 +1105,7 @@ export class LlmGrader implements Grader { attempt: context.attempt, maxOutputTokens: this.maxOutputTokens, temperature: this.temperature, + ...(images && images.length > 0 ? { images } : {}), }); return { @@ -1434,23 +1392,6 @@ export function extractImageBlocks(messages: readonly Message[]): ContentImage[] return images; } -/** - * Convert AgentV `ContentImage` blocks to Vercel AI SDK image content parts. - * - * The AI SDK `ImagePart` expects `{ type: 'image', image: string | URL, mediaType?: string }`. - * `ContentImage.source` may be a URL, data URI, or base64 string — all are passed through - * as the `image` field which the SDK handles natively. - */ -function toAiSdkImageParts( - images: readonly ContentImage[], -): Array<{ type: 'image'; image: string; mediaType?: string }> { - return images.map((img) => ({ - type: 'image' as const, - image: img.source, - mediaType: img.media_type || undefined, - })); -} - // --------------------------------------------------------------------------- // Sandboxed filesystem tools for built-in agent mode // --------------------------------------------------------------------------- @@ -1468,19 +1409,32 @@ function resolveSandboxed(basePath: string, relativePath: string): string { } /** - * Create sandboxed filesystem tools for the AI SDK agent loop. + * Create sandboxed filesystem tools for the built-in grader agent loop. + * + * Tools are returned as plain `ProviderTool` records with JSON Schema + * `parameters`. The provider serializes them to whatever wire format the + * underlying API expects (OpenAI: tools[], Anthropic: tools, ...). 
*/ -function createFilesystemTools(workspacePath: string) { - return { - list_files: tool({ +function createFilesystemTools(workspacePath: string): ProviderTool[] { + return [ + { + name: 'list_files', description: 'List files and directories at a relative path within the workspace. Returns names only (single level, no recursion).', - inputSchema: z.object({ - path: z.string().describe('Relative path within workspace (use "." for root)').default('.'), - }), - execute: async (input: { path: string }) => { + parameters: { + type: 'object', + properties: { + path: { + type: 'string', + description: 'Relative path within workspace (use "." for root)', + default: '.', + }, + }, + }, + execute: async (input: unknown) => { + const args = (input ?? {}) as { path?: string }; try { - const resolved = resolveSandboxed(workspacePath, input.path); + const resolved = resolveSandboxed(workspacePath, args.path ?? '.'); const entries = await fs.readdir(resolved, { withFileTypes: true }); return entries .map((e) => ({ @@ -1492,20 +1446,26 @@ function createFilesystemTools(workspacePath: string) { return { error: error instanceof Error ? error.message : String(error) }; } }, - }), - - read_file: tool({ + }, + { + name: 'read_file', description: 'Read the content of a file at a relative path within the workspace. Large files are truncated at 50KB.', - inputSchema: z.object({ - path: z.string().describe('Relative path to file within workspace'), - }), - execute: async (input: { path: string }) => { + parameters: { + type: 'object', + properties: { + path: { type: 'string', description: 'Relative path to file within workspace' }, + }, + required: ['path'], + }, + execute: async (input: unknown) => { + const args = (input ?? {}) as { path?: string }; + const relPath = args.path ?? 
''; try { - const resolved = resolveSandboxed(workspacePath, input.path); + const resolved = resolveSandboxed(workspacePath, relPath); const stat = await fs.stat(resolved); if (stat.isDirectory()) { - return { error: `'${input.path}' is a directory, not a file` }; + return { error: `'${relPath}' is a directory, not a file` }; } const buffer = Buffer.alloc(Math.min(stat.size, MAX_FILE_SIZE)); const fd = await fs.open(resolved, 'r'); @@ -1521,21 +1481,30 @@ function createFilesystemTools(workspacePath: string) { return { error: error instanceof Error ? error.message : String(error) }; } }, - }), - - search_files: tool({ + }, + { + name: 'search_files', description: 'Search for a regex pattern across files in the workspace. Returns up to 20 matches. Skips binary files and node_modules/.git.', - inputSchema: z.object({ - pattern: z.string().describe('Regex pattern to search for'), - path: z.string().describe('Relative path to search within (use "." for root)').default('.'), - }), - execute: async (input: { pattern: string; path: string }) => { + parameters: { + type: 'object', + properties: { + pattern: { type: 'string', description: 'Regex pattern to search for' }, + path: { + type: 'string', + description: 'Relative path to search within (use "." for root)', + default: '.', + }, + }, + required: ['pattern'], + }, + execute: async (input: unknown) => { + const args = (input ?? {}) as { pattern?: string; path?: string }; try { - const resolved = resolveSandboxed(workspacePath, input.path); + const resolved = resolveSandboxed(workspacePath, args.path ?? '.'); let regex: RegExp; try { - regex = new RegExp(input.pattern, 'gi'); + regex = new RegExp(args.pattern ?? '', 'gi'); } catch (regexErr) { return { error: `Invalid regex pattern: ${regexErr instanceof Error ? regexErr.message : String(regexErr)}`, @@ -1550,8 +1519,8 @@ function createFilesystemTools(workspacePath: string) { return { error: error instanceof Error ? 
error.message : String(error) }; } }, - }), - }; + }, + ]; } /** diff --git a/packages/core/src/evaluation/providers/agentv-provider.ts b/packages/core/src/evaluation/providers/agentv-provider.ts index e8bc1b25..7097524a 100644 --- a/packages/core/src/evaluation/providers/agentv-provider.ts +++ b/packages/core/src/evaluation/providers/agentv-provider.ts @@ -1,87 +1,82 @@ -import { createAnthropic } from '@ai-sdk/anthropic'; -import { createAzure } from '@ai-sdk/azure'; -import { createGoogleGenerativeAI } from '@ai-sdk/google'; -import { createOpenAI } from '@ai-sdk/openai'; -import type { LanguageModel } from 'ai'; - +import { invokePiAi, resolvePiModel } from './ai-sdk.js'; import type { AgentVResolvedConfig } from './targets.js'; import type { Provider, ProviderRequest, ProviderResponse } from './types.js'; /** - * Parse a model string like "openai:gpt-5-mini" into provider prefix and model name. - */ -function parseModelString(model: string): { provider: string; modelName: string } { - const colonIndex = model.indexOf(':'); - if (colonIndex === -1) { - throw new Error( - `Invalid model string "${model}". Expected format "provider:model" (e.g., "openai:gpt-5-mini")`, - ); - } - return { - provider: model.slice(0, colonIndex), - modelName: model.slice(colonIndex + 1), - }; -} - -/** - * Create a LanguageModel from a model string using the appropriate AI SDK provider. - */ -function createLanguageModel(modelString: string): LanguageModel { - const { provider, modelName } = parseModelString(modelString); - - switch (provider) { - case 'openai': - return createOpenAI()(modelName); - case 'anthropic': - return createAnthropic()(modelName); - case 'azure': - return createAzure().chat(modelName); - case 'google': - return createGoogleGenerativeAI()(modelName); - default: - throw new Error( - `Unsupported AI SDK provider "${provider}" in model string "${modelString}". 
Supported providers: openai, anthropic, azure, google`,
-      );
-  }
-}
-
-/**
- * AgentV built-in provider for LLM grader evaluation.
+ * AgentV built-in grader provider.
  *
- * Resolves an AI SDK model string (e.g., "openai:gpt-5-mini", "anthropic:claude-sonnet-4-20250514")
- * to a Vercel AI SDK LanguageModel by parsing the provider prefix and creating the appropriate
- * AI SDK provider directly. This provider is used exclusively for grader evaluation — it does not
- * support direct agent invocation.
+ * Resolves a `provider:model` string (e.g. `openai:gpt-5-mini`,
+ * `anthropic:claude-sonnet-4-20250514`) into a pi-ai Model and runs the call
+ * through the shared invokePiAi adapter. API keys are read from the
+ * provider-specific env var (OPENAI_API_KEY, ANTHROPIC_API_KEY, ...) by pi-ai;
+ * we don't carry credentials in this provider's config.
  *
- * Usage: `--grader-target agentv --model openai:gpt-5-mini`
+ * Used as `--grader-target agentv --model openai:gpt-5-mini`.
  */
 export class AgentvProvider implements Provider {
   readonly id: string;
   readonly kind = 'agentv' as const;
   readonly targetName: string;
-  private readonly model: LanguageModel;
+  private readonly piModel: ReturnType<typeof resolvePiModel>;
+  private readonly defaults: { temperature: number };
 
   constructor(targetName: string, config: AgentVResolvedConfig) {
     this.id = `agentv:${targetName}`;
     this.targetName = targetName;
-    this.model = createLanguageModel(config.model);
+    const { providerName, apiId, modelId } = parseAgentvModel(config.model);
+    this.piModel = resolvePiModel({ providerName, apiId, modelId });
+    this.defaults = { temperature: config.temperature };
+  }
+
+  async invoke(request: ProviderRequest): Promise<ProviderResponse> {
+    return invokePiAi({
+      model: this.piModel,
+      request,
+      defaults: this.defaults,
+    });
  }
}
 
-  /**
-   * Direct invoke is not supported for the agentv provider.
-   * Use asLanguageModel() with generateText() instead. 
- */
-  async invoke(_request: ProviderRequest): Promise<ProviderResponse> {
+/**
+ * Parse `provider:model` into the pi-ai routing fields. Each ai-sdk-style
+ * provider name maps to a pi-ai (providerName, apiId) pair:
+ *
+ *   openai:    → ('openai', 'openai-completions')
+ *   anthropic: → ('anthropic', 'anthropic-messages')
+ *   azure:     → ('azure-openai-responses', 'azure-openai-responses')
+ *   google:    → ('google', 'google-generative-ai')
+ */
+function parseAgentvModel(model: string): {
+  providerName: string;
+  apiId: string;
+  modelId: string;
+} {
+  const colonIndex = model.indexOf(':');
+  if (colonIndex === -1) {
     throw new Error(
-      'AgentvProvider does not support direct invoke(). Use asLanguageModel() with generateText() instead.',
+      `Invalid agentv model "${model}". Expected "provider:model" (e.g., "openai:gpt-5-mini").`,
     );
   }
+  const provider = model.slice(0, colonIndex);
+  const modelId = model.slice(colonIndex + 1);
 
-  /**
-   * Returns the resolved AI SDK LanguageModel for use with generateText/generateObject.
-   */
-  asLanguageModel(): LanguageModel {
-    return this.model;
+  switch (provider) {
+    case 'openai':
+      return { providerName: 'openai', apiId: 'openai-completions', modelId };
+    case 'anthropic':
+      return { providerName: 'anthropic', apiId: 'anthropic-messages', modelId };
+    case 'azure':
+      return {
+        providerName: 'azure-openai-responses',
+        apiId: 'azure-openai-responses',
+        modelId,
+      };
+    case 'google':
+      return { providerName: 'google', apiId: 'google-generative-ai', modelId };
+    default:
+      throw new Error(
+        `Unsupported agentv provider "${provider}" in "${model}". 
Supported: openai, anthropic, azure, google.`,
+      );
  }
}
diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts
index cb88e772..8150e1db 100644
--- a/packages/core/src/evaluation/providers/ai-sdk.ts
+++ b/packages/core/src/evaluation/providers/ai-sdk.ts
@@ -36,7 +36,7 @@ const DEFAULT_SYSTEM_PROMPT =
 type TextResult = Awaited<ReturnType<typeof generateText>>;
 type GenerateTextOptions = Parameters<typeof generateText>[0];
 
-interface ProviderDefaults {
+export interface ProviderDefaults {
   readonly temperature?: number;
   readonly maxOutputTokens?: number;
   readonly thinkingBudget?: number;
@@ -567,22 +567,29 @@ async function sleep(ms: number): Promise<void> {
 // 3. Drop the createX() / this.model build from the constructor when
 //    asLanguageModel() is no longer used by any consumer.
 
-interface InvokePiAiOptions {
+export interface InvokePiAiOptions {
   /** Pre-resolved pi-ai model (built once in the provider constructor). */
   readonly model: PiModel;
-  /** Per-call credential — pi-ai treats apiKey as a StreamOptions field. */
-  readonly apiKey: string;
+  /**
+   * Per-call credential — pi-ai treats apiKey as a StreamOptions field. When
+   * omitted, pi-ai falls back to the provider-specific env var (OPENAI_API_KEY,
+   * ANTHROPIC_API_KEY, ...). The agentv provider relies on that fallback.
+   */
+  readonly apiKey?: string;
   readonly request: ProviderRequest;
   readonly defaults: ProviderDefaults;
   readonly retryConfig?: RetryConfig;
 }
 
-async function invokePiAi(options: InvokePiAiOptions): Promise<ProviderResponse> {
+export async function invokePiAi(options: InvokePiAiOptions): Promise<ProviderResponse> {
   const { model, apiKey, request, defaults, retryConfig } = options;
   const tools = request.tools && request.tools.length > 0 ? request.tools : undefined;
   const maxSteps = tools ? Math.max(1, request.maxSteps ?? 
1) : 1; const { systemPrompt, messages } = chatPromptToPiContext(buildChatPrompt(request)); + if (request.images && request.images.length > 0) { + attachImagesToLastUserMessage(messages, request.images); + } const piTools: PiTool[] | undefined = tools ? tools.map((t) => ({ name: t.name, @@ -593,7 +600,7 @@ async function invokePiAi(options: InvokePiAiOptions): Promise const ctx = { systemPrompt, messages, ...(piTools ? { tools: piTools } : {}) }; const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); const callOptions = { - apiKey, + ...(apiKey !== undefined ? { apiKey } : {}), temperature, ...(maxOutputTokens !== undefined ? { maxTokens: maxOutputTokens } : {}), signal: request.signal, @@ -686,7 +693,7 @@ function accumulateUsage(agg: AggregatedUsage, u: PiAssistantMessage['usage']): agg.cost += u.cost.total; } -function resolvePiModel(args: { +export function resolvePiModel(args: { providerName: string; apiId: string; modelId: string; @@ -759,6 +766,40 @@ interface PiContext { readonly messages: PiMessage[]; } +function attachImagesToLastUserMessage( + messages: PiMessage[], + images: ProviderRequest['images'], +): void { + if (!images || images.length === 0) return; + for (let i = messages.length - 1; i >= 0; i--) { + const m = messages[i]; + if (m.role !== 'user') continue; + const text = typeof m.content === 'string' ? m.content : ''; + messages[i] = { + ...m, + content: [ + ...(text ? [{ type: 'text' as const, text }] : []), + ...images.map((img) => ({ + type: 'image' as const, + data: img.source, + mimeType: img.media_type, + })), + ], + }; + return; + } + // No user message to attach images to — synthesize one. 
+ messages.push({ + role: 'user', + content: images.map((img) => ({ + type: 'image' as const, + data: img.source, + mimeType: img.media_type, + })), + timestamp: Date.now(), + }); +} + function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { // OpenAIProvider.invoke() is reached from the orchestrator's multi-turn // and single-turn paths, so the chatPrompt may legitimately contain diff --git a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts index 780e7d0b..41464eba 100644 --- a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts +++ b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts @@ -26,7 +26,10 @@ declare module '@mariozechner/pi-ai' { } export interface ImageContent { type: 'image'; - [k: string]: unknown; + /** Base64 data, data URL, or absolute URL. */ + data: string; + /** MIME type, e.g. "image/png". */ + mimeType: string; } export interface ToolCall { type: 'toolCall'; diff --git a/packages/core/src/evaluation/providers/types.ts b/packages/core/src/evaluation/providers/types.ts index f25a4abf..0caf874e 100644 --- a/packages/core/src/evaluation/providers/types.ts +++ b/packages/core/src/evaluation/providers/types.ts @@ -1,4 +1,4 @@ -import type { Content } from '../content.js'; +import type { Content, ContentImage } from '../content.js'; import { getTextContent, isContentArray } from '../content.js'; import type { JsonObject } from '../types.js'; @@ -192,6 +192,12 @@ export interface ProviderRequest { * step). Required when `tools` is non-empty. Ignored otherwise. */ readonly maxSteps?: number; + /** + * Image inputs appended to the last user turn. Used by graders that judge + * screenshot/image content (e.g. red-team UI evals). Providers that do not + * support multimodal input should drop these gracefully. 
+ */ + readonly images?: readonly ContentImage[]; } /** diff --git a/packages/core/test/evaluation/llm-grader-multimodal.test.ts b/packages/core/test/evaluation/llm-grader-multimodal.test.ts index 7c94014e..99e23587 100644 --- a/packages/core/test/evaluation/llm-grader-multimodal.test.ts +++ b/packages/core/test/evaluation/llm-grader-multimodal.test.ts @@ -1,11 +1,11 @@ /** * Tests for LLM grader multimodal support — auto-appending image content blocks - * from agent output to the judge message. + * from agent output to the judge invocation. * * Verifies: - * - Images from assistant messages are extracted and sent to the judge + * - Images from assistant messages are extracted and threaded through provider.invoke * - Text-only output is unchanged (backward compatible) - * - Multiple images are all appended + * - Multiple images are all forwarded * - Images in non-assistant messages are ignored */ @@ -15,42 +15,11 @@ import { tmpdir } from 'node:os'; import { join } from 'node:path'; import type { ResolvedTarget } from '../../src/evaluation/providers/targets.js'; -import type { Message } from '../../src/evaluation/providers/types.js'; +import type { Message, ProviderRequest } from '../../src/evaluation/providers/types.js'; import type { EvalTest } from '../../src/evaluation/types.js'; -// --------------------------------------------------------------------------- -// Mock generateText to capture what the LLM grader sends to the judge. -// Must be set up before importing the module under test. 
-// --------------------------------------------------------------------------- - -let capturedGenerateTextArgs: Record | undefined; - -function graderJsonResponse(score: number): string { - return JSON.stringify({ - score, - assertions: [{ text: 'Checked output', passed: score >= 0.5 }], - }); -} - -mock.module('ai', () => { - const actual = require('ai'); - return { - ...actual, - generateText: mock(async (args: Record) => { - capturedGenerateTextArgs = args; - return { - text: graderJsonResponse(0.85), - usage: { inputTokens: 10, outputTokens: 20 }, - finishReason: 'stop', - response: { id: 'test', timestamp: new Date(), modelId: 'test' }, - }; - }), - }; -}); - -// Import AFTER mock is set up -const { extractImageBlocks } = await import('../../src/evaluation/graders/llm-grader.js'); -const { LlmGrader } = await import('../../src/evaluation/graders.js'); +import { LlmGrader } from '../../src/evaluation/graders.js'; +import { extractImageBlocks } from '../../src/evaluation/graders/llm-grader.js'; // --------------------------------------------------------------------------- // Test helpers @@ -74,18 +43,33 @@ const baseTarget: ResolvedTarget = { config: { response: '{}' }, }; +function graderJsonResponse(score: number): string { + return JSON.stringify({ + score, + assertions: [{ text: 'Checked output', passed: score >= 0.5 }], + }); +} + /** - * Creates a provider with a fake asLanguageModel() that returns a sentinel - * object. The actual model behavior is handled by the mocked generateText. + * Creates a provider whose invoke() returns a canned grader response and + * records the request it was called with. 
*/ -function createLmProvider() { - const fakeModel = { modelId: 'test-model', provider: 'test' }; +function createCapturingProvider() { + const captured: { request?: ProviderRequest } = {}; return { - id: 'test-lm', - kind: 'mock' as const, - targetName: 'test-lm', - invoke: mock(async () => ({ output: [] })), - asLanguageModel: () => fakeModel as never, + captured, + provider: { + id: 'test-lm', + kind: 'mock' as const, + targetName: 'test-lm', + invoke: mock(async (request: ProviderRequest) => { + captured.request = request; + return { + output: [{ role: 'assistant' as const, content: graderJsonResponse(0.85) }], + tokenUsage: { input: 10, output: 20 }, + }; + }), + }, }; } @@ -200,7 +184,7 @@ describe('LlmGrader multimodal', () => { let tempDir: string | undefined; beforeEach(() => { - capturedGenerateTextArgs = undefined; + // no-op; each test uses its own capturing provider }); afterEach(async () => { @@ -210,8 +194,8 @@ describe('LlmGrader multimodal', () => { } }); - it('sends plain text prompt when output has no images', async () => { - const provider = createLmProvider(); + it('omits images when assistant output has none', async () => { + const { captured, provider } = createCapturingProvider(); const evaluator = new LlmGrader({ resolveGraderProvider: async () => provider, @@ -229,15 +213,12 @@ describe('LlmGrader multimodal', () => { }); expect(result.score).toBe(0.85); - expect(capturedGenerateTextArgs).toBeDefined(); - - // When no images, generateText should receive `prompt` (string), not `messages` - expect(capturedGenerateTextArgs?.prompt).toBeTypeOf('string'); - expect(capturedGenerateTextArgs?.messages).toBeUndefined(); + expect(captured.request).toBeDefined(); + expect(captured.request?.images).toBeUndefined(); }); - it('sends multi-part messages when output contains images', async () => { - const provider = createLmProvider(); + it('forwards images on the invoke request when assistant output contains them', async () => { + const { captured, 
provider } = createCapturingProvider(); const evaluator = new LlmGrader({ resolveGraderProvider: async () => provider, @@ -265,32 +246,16 @@ describe('LlmGrader multimodal', () => { }); expect(result.score).toBe(0.85); - expect(capturedGenerateTextArgs).toBeDefined(); - - // When images exist, generateText should receive `messages` with multi-part content - expect(capturedGenerateTextArgs?.messages).toBeDefined(); - expect(capturedGenerateTextArgs?.prompt).toBeUndefined(); - - const messages = capturedGenerateTextArgs?.messages as Array>; - expect(messages).toHaveLength(1); - expect(messages[0].role).toBe('user'); - - const content = messages[0].content as Array>; - - // Should contain text part + image part - const textParts = content.filter((p) => p.type === 'text'); - const imageParts = content.filter((p) => p.type === 'image'); - - expect(textParts.length).toBeGreaterThanOrEqual(1); - expect(imageParts).toHaveLength(1); - - // Verify image data is passed through - expect(imageParts[0].image).toBe('data:image/png;base64,CATIMAGE'); - expect(imageParts[0].mediaType).toBe('image/png'); + expect(captured.request?.images).toHaveLength(1); + expect(captured.request?.images?.[0]).toEqual({ + type: 'image', + media_type: 'image/png', + source: 'data:image/png;base64,CATIMAGE', + }); }); - it('appends multiple images from output', async () => { - const provider = createLmProvider(); + it('forwards multiple images', async () => { + const { captured, provider } = createCapturingProvider(); const evaluator = new LlmGrader({ resolveGraderProvider: async () => provider, @@ -318,18 +283,13 @@ describe('LlmGrader multimodal', () => { output: outputMessages, }); - expect(capturedGenerateTextArgs).toBeDefined(); - const messages = capturedGenerateTextArgs?.messages as Array>; - const content = messages[0].content as Array>; - - const imageParts = content.filter((p) => p.type === 'image'); - expect(imageParts).toHaveLength(2); - 
expect(imageParts[0].image).toBe('https://example.com/img1.png'); - expect(imageParts[1].image).toBe('data:image/jpeg;base64,IMG2DATA'); + expect(captured.request?.images).toHaveLength(2); + expect(captured.request?.images?.[0].source).toBe('https://example.com/img1.png'); + expect(captured.request?.images?.[1].source).toBe('data:image/jpeg;base64,IMG2DATA'); }); it('ignores images in user/tool messages (only assistant)', async () => { - const provider = createLmProvider(); + const { captured, provider } = createCapturingProvider(); const evaluator = new LlmGrader({ resolveGraderProvider: async () => provider, @@ -342,10 +302,7 @@ describe('LlmGrader multimodal', () => { { type: 'image', media_type: 'image/png', source: 'data:image/png;base64,USERIMG' }, ], }, - { - role: 'assistant', - content: 'Just text, no images', - }, + { role: 'assistant', content: 'Just text, no images' }, ]; await evaluator.evaluate({ @@ -359,14 +316,10 @@ describe('LlmGrader multimodal', () => { output: outputMessages, }); - expect(capturedGenerateTextArgs).toBeDefined(); - - // No images in assistant messages → should use plain prompt - expect(capturedGenerateTextArgs?.prompt).toBeTypeOf('string'); - expect(capturedGenerateTextArgs?.messages).toBeUndefined(); + expect(captured.request?.images).toBeUndefined(); }); - it('injects preprocessed file text into the plain prompt', async () => { + it('injects preprocessed file text into the user prompt', async () => { tempDir = await mkdtemp(join(tmpdir(), 'agentv-llm-file-')); const filePath = join(tempDir, 'report.xlsx'); const scriptPath = join(tempDir, 'xlsx-to-text.js'); @@ -380,7 +333,7 @@ console.log('spreadsheet:' + path.basename(payload.original_path));`, 'utf8', ); - const provider = createLmProvider(); + const { captured, provider } = createCapturingProvider(); const evaluator = new LlmGrader({ resolveGraderProvider: async () => provider, }); @@ -412,7 +365,7 @@ console.log('spreadsheet:' + path.basename(payload.original_path));`, ], 
}); - expect(capturedGenerateTextArgs?.prompt).toBeTypeOf('string'); - expect(String(capturedGenerateTextArgs?.prompt)).toContain('spreadsheet:report.xlsx'); + expect(captured.request?.question).toBeTypeOf('string'); + expect(String(captured.request?.question)).toContain('spreadsheet:report.xlsx'); }); }); diff --git a/packages/core/test/evaluation/providers/agentv-provider.test.ts b/packages/core/test/evaluation/providers/agentv-provider.test.ts index f617dca3..2f26c03a 100644 --- a/packages/core/test/evaluation/providers/agentv-provider.test.ts +++ b/packages/core/test/evaluation/providers/agentv-provider.test.ts @@ -1,39 +1,41 @@ import { describe, expect, it, vi } from 'vitest'; -// Mock AI SDK provider packages before importing the provider. -// Each createXxx() returns a callable factory: createXxx()(modelName) => model stub. -vi.mock('@ai-sdk/openai', () => ({ - createOpenAI: () => (modelId: string) => ({ - modelId, - specificationVersion: 'v3', - provider: 'openai', - }), +// Mock pi-ai's runtime exports. AgentvProvider now resolves a pi-ai Model in +// the constructor and routes invoke() through the shared invokePiAi adapter. 
+const piGetModelMock = vi.fn((provider: string, modelId: string) => ({ + id: modelId, + name: modelId, + api: 'openai-completions', + provider, + baseUrl: 'https://example.test/v1', + reasoning: false, + input: ['text'], + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, + contextWindow: 128000, + maxTokens: 16384, })); - -vi.mock('@ai-sdk/anthropic', () => ({ - createAnthropic: () => (modelId: string) => ({ - modelId, - specificationVersion: 'v3', - provider: 'anthropic', - }), -})); - -vi.mock('@ai-sdk/azure', () => ({ - createAzure: () => ({ - chat: (modelId: string) => ({ - modelId, - specificationVersion: 'v3', - provider: 'azure', - }), - }), +const piCompleteMock = vi.fn(async () => ({ + role: 'assistant' as const, + content: [{ type: 'text', text: 'ok' }], + usage: { + input: 1, + output: 1, + cacheRead: 0, + cacheWrite: 0, + totalTokens: 2, + cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 }, + }, + api: 'openai-completions', + provider: 'openai', + model: 'gpt-test', + stopReason: 'stop' as const, + timestamp: Date.now(), })); -vi.mock('@ai-sdk/google', () => ({ - createGoogleGenerativeAI: () => (modelId: string) => ({ - modelId, - specificationVersion: 'v3', - provider: 'google', - }), +vi.mock('@mariozechner/pi-ai', () => ({ + complete: (...args: unknown[]) => piCompleteMock(...(args as [])), + getModel: (provider: string, modelId: string) => piGetModelMock(provider, modelId), + registerBuiltInApiProviders: () => undefined, })); import { AgentvProvider } from '../../../src/evaluation/providers/agentv-provider.js'; @@ -63,73 +65,55 @@ describe('AgentvProvider', () => { expect(provider.id).toBe('agentv:test-grader'); }); - it('asLanguageModel() returns a defined LanguageModel', () => { - const provider = new AgentvProvider('test-grader', { - model: 'openai:gpt-5-mini', - temperature: 0, - }); - const model = provider.asLanguageModel(); - expect(model).toBeDefined(); - expect((model as unknown as { modelId: string 
}).modelId).toBe('gpt-5-mini'); + it('resolves openai model strings via pi-ai', () => { + piGetModelMock.mockClear(); + new AgentvProvider('test', { model: 'openai:gpt-5-mini', temperature: 0 }); + expect(piGetModelMock).toHaveBeenCalledWith('openai', 'gpt-5-mini'); }); - it('asLanguageModel() works with anthropic model strings', () => { - const provider = new AgentvProvider('test-grader', { - model: 'anthropic:claude-sonnet-4-20250514', - temperature: 0, - }); - const model = provider.asLanguageModel(); - expect(model).toBeDefined(); - expect((model as unknown as { modelId: string }).modelId).toBe('claude-sonnet-4-20250514'); + it('resolves anthropic model strings via pi-ai', () => { + piGetModelMock.mockClear(); + new AgentvProvider('test', { model: 'anthropic:claude-sonnet-4', temperature: 0 }); + expect(piGetModelMock).toHaveBeenCalledWith('anthropic', 'claude-sonnet-4'); }); - it('asLanguageModel() works with google model strings', () => { - const provider = new AgentvProvider('test-grader', { - model: 'google:gemini-2.5-flash', - temperature: 0, - }); - const model = provider.asLanguageModel(); - expect(model).toBeDefined(); - expect((model as unknown as { modelId: string }).modelId).toBe('gemini-2.5-flash'); + it('resolves google model strings via pi-ai', () => { + piGetModelMock.mockClear(); + new AgentvProvider('test', { model: 'google:gemini-2.5-flash', temperature: 0 }); + expect(piGetModelMock).toHaveBeenCalledWith('google', 'gemini-2.5-flash'); }); - it('asLanguageModel() works with azure model strings', () => { - const provider = new AgentvProvider('test-grader', { - model: 'azure:gpt-4o-deployment', - temperature: 0, - }); - const model = provider.asLanguageModel(); - expect(model).toBeDefined(); - expect((model as unknown as { modelId: string }).modelId).toBe('gpt-4o-deployment'); + it('resolves azure model strings via pi-ai (azure-openai-responses provider)', () => { + piGetModelMock.mockClear(); + new AgentvProvider('test', { model: 
'azure:gpt-4o-deployment', temperature: 0 }); + expect(piGetModelMock).toHaveBeenCalledWith('azure-openai-responses', 'gpt-4o-deployment'); }); it('throws for unsupported provider prefix', () => { expect( () => - new AgentvProvider('test-grader', { + new AgentvProvider('test', { model: 'unsupported:some-model', temperature: 0, }), - ).toThrow('Unsupported AI SDK provider "unsupported"'); + ).toThrow('Unsupported agentv provider "unsupported"'); }); it('throws for model string without colon separator', () => { expect( () => - new AgentvProvider('test-grader', { + new AgentvProvider('test', { model: 'gpt-5-mini', temperature: 0, }), - ).toThrow('Invalid model string "gpt-5-mini"'); + ).toThrow('Invalid agentv model "gpt-5-mini"'); }); - it('invoke() throws an error', async () => { - const provider = new AgentvProvider('test-grader', { - model: 'openai:gpt-5-mini', - temperature: 0, - }); - await expect(provider.invoke({ question: 'test' })).rejects.toThrow( - 'AgentvProvider does not support direct invoke()', - ); + it('invoke() routes through pi-ai complete()', async () => { + piCompleteMock.mockClear(); + const provider = new AgentvProvider('test', { model: 'openai:gpt-5-mini', temperature: 0 }); + const response = await provider.invoke({ question: 'hello' }); + expect(piCompleteMock).toHaveBeenCalledTimes(1); + expect(response.output?.[0]).toMatchObject({ role: 'assistant', content: 'ok' }); }); }); From c0fe2b2bcfabcc4d577f1e508dc049767f3f7231 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 12:31:02 +0200 Subject: [PATCH 11/20] refactor(core): drop ai-sdk entirely; all providers on pi-ai MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Complete the #1205 migration. ai-sdk.ts no longer imports from @ai-sdk/* or 'ai'; all five direct-API providers (OpenAI, Azure, OpenRouter, Anthropic, Gemini) route through the same invokePiAi() adapter. 
Provider classes (ai-sdk.ts):
- All five resolve a pi-ai PiModel in their constructor and delegate invoke()
  to invokePiAi.
- Vercel `this.model` field, createOpenAI()/createAzure()/etc., and
  asLanguageModel() are gone.
- AnthropicProvider passes thinkingBudget through pi-ai's Anthropic-specific
  options as { thinkingEnabled, thinkingBudgetTokens } — no lossy bucket
  mapping for older models. Newer models (Opus/Sonnet 4.6) ignore it in favour
  of adaptive thinking, same as before.
- AzureProvider routes through pi-ai's azure-openai-responses for both
  apiFormat values. Behavior change: the legacy Vercel path used
  /chat/completions for apiFormat='chat' (default); pi-ai uses /responses for
  everything. Functionally equivalent for grader use cases. Users who hit a
  deployment that only exposes /chat/completions can route through
  `provider: openai` with a deployment-scoped baseURL instead.

Provider interface (types.ts):
- Drop asLanguageModel?(); the Vercel LanguageModel reference is gone.

invokePiAi:
- Now accepts providerOptions: Record<string, unknown> for provider-specific
  knobs (Anthropic thinking, Azure URL config). Pi-ai's
  ProviderStreamOptions = StreamOptions & Record<string, unknown> forwards
  these to the underlying provider impl.

Tests:
- targets.test.ts: dropped @ai-sdk/* / ai / @openrouter/ai-sdk-provider module
  mocks. createProvider tests now assert pi-ai routing (providerName + apiId +
  baseUrl + provider-specific options).

Dependencies removed:
- packages/core: @ai-sdk/anthropic, @ai-sdk/azure, @ai-sdk/google,
  @ai-sdk/openai, ai
- apps/cli: @ai-sdk/openai
- root: @openrouter/ai-sdk-provider

Verification:
- Build / typecheck / lint / 1741 unit tests all green.
- Live eval: examples/features/rubric/evals/dataset.eval.yaml run with
  target=openai routed via OpenRouter. 
All 3 grader-score baselines pass: ✓ code-quality-multi-eval / rubrics: 0.5 ∈ [0.3, 1] ✓ code-explanation-simple / rubrics: 1.0 ∈ [0.8, 1] ✓ technical-writing-detailed / rubrics: 1.0 ∈ [0.8, 1] Refs #1205 Co-Authored-By: Claude Opus 4.7 --- apps/cli/package.json | 1 - bun.lock | 34 - package.json | 3 - packages/core/package.json | 6 - .../core/src/evaluation/providers/ai-sdk.ts | 791 +++++++----------- .../core/src/evaluation/providers/types.ts | 5 - .../evaluation/registry/builtin-graders.ts | 4 +- .../test/evaluation/providers/targets.test.ts | 245 ++---- 8 files changed, 412 insertions(+), 677 deletions(-) diff --git a/apps/cli/package.json b/apps/cli/package.json index ea34b2e1..578c9d31 100644 --- a/apps/cli/package.json +++ b/apps/cli/package.json @@ -28,7 +28,6 @@ "test:watch": "bun test --watch" }, "dependencies": { - "@ai-sdk/openai": "^3.0.0", "@anthropic-ai/claude-agent-sdk": "^0.2.49", "@github/copilot-sdk": "^0.1.25", "@hono/node-server": "^1.19.11", diff --git a/bun.lock b/bun.lock index fe9c6c59..029ea860 100644 --- a/bun.lock +++ b/bun.lock @@ -4,9 +4,6 @@ "workspaces": { "": { "name": "@agentv/workspace", - "dependencies": { - "@openrouter/ai-sdk-provider": "^2.3.3", - }, "devDependencies": { "@agentv/core": "workspace:*", "@agentv/eval": "workspace:*", @@ -28,7 +25,6 @@ "agentv": "./dist/cli.js", }, "dependencies": { - "@ai-sdk/openai": "^3.0.0", "@anthropic-ai/claude-agent-sdk": "^0.2.49", "@github/copilot-sdk": "^0.1.25", "@hono/node-server": "^1.19.11", @@ -92,16 +88,10 @@ "dependencies": { "@agentclientprotocol/sdk": "^0.14.1", "@agentv/eval": "workspace:*", - "@ai-sdk/anthropic": "^3.0.0", - "@ai-sdk/azure": "^3.0.0", - "@ai-sdk/google": "^3.0.0", - "@ai-sdk/openai": "^3.0.0", "@github/copilot-sdk": "^0.1.25", "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", - "@openrouter/ai-sdk-provider": "^2.3.1", "@sinclair/typebox": "^0.34.41", - "ai": "^6.0.0", "fast-glob": "^3.3.3", "json5": "^2.2.3", "micromatch": "^4.0.8", @@ 
-148,20 +138,6 @@ "@agentv/web": ["@agentv/web@workspace:apps/web"], - "@ai-sdk/anthropic": ["@ai-sdk/anthropic@3.0.58", "", { "dependencies": { "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-/53SACgmVukO4bkms4dpxpRlYhW8Ct6QZRe6sj1Pi5H00hYhxIrqfiLbZBGxkdRvjsBQeP/4TVGsXgH5rQeb8Q=="], - - "@ai-sdk/azure": ["@ai-sdk/azure@3.0.42", "", { "dependencies": { "@ai-sdk/openai": "3.0.41", "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-BGg0e3GEI7KHkwUv7d5f9rXzDlTiWhQ4xzVakdHLV/OP24jvXes5X7fI3QZ0rbKBop6URq0yaxomBfwEqqRlzw=="], - - "@ai-sdk/gateway": ["@ai-sdk/gateway@3.0.66", "", { "dependencies": { "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19", "@vercel/oidc": "3.1.0" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-SIQ0YY0iMuv+07HLsZ+bB990zUJ6S4ujORAh+Jv1V2KGNn73qQKnGO0JBk+w+Res8YqOFSycwDoWcFlQrVxS4A=="], - - "@ai-sdk/google": ["@ai-sdk/google@3.0.43", "", { "dependencies": { "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-NGCgP5g8HBxrNdxvF8Dhww+UKfqAkZAmyYBvbu9YLoBkzAmGKDBGhVptN/oXPB5Vm0jggMdoLycZ8JReQM8Zqg=="], - - "@ai-sdk/openai": ["@ai-sdk/openai@3.0.41", "", { "dependencies": { "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-IZ42A+FO+vuEQCVNqlnAPYQnnUpUfdJIwn1BEDOBywiEHa23fw7PahxVtlX9zm3/zMvTW4JKPzWyvAgDu+SQ2A=="], - - "@ai-sdk/provider": ["@ai-sdk/provider@3.0.8", "", { "dependencies": { "json-schema": "^0.4.0" } }, "sha512-oGMAgGoQdBXbZqNG0Ze56CHjDZ1IDYOwGYxYjO5KLSlz5HiNQ9udIXsPZ61VWaHGZ5XW/jyjmr6t2xz2jGVwbQ=="], - - "@ai-sdk/provider-utils": ["@ai-sdk/provider-utils@4.0.19", "", { "dependencies": { "@ai-sdk/provider": "3.0.8", "@standard-schema/spec": "^1.1.0", "eventsource-parser": "^3.0.6" }, 
"peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-3eG55CrSWCu2SXlqq2QCsFjo3+E7+Gmg7i/oRVoSZzIodTuDSfLb3MRje67xE9RFea73Zao7Lm4mADIfUETKGg=="], - "@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.2.49", "", { "optionalDependencies": { "@img/sharp-darwin-arm64": "^0.34.2", "@img/sharp-darwin-x64": "^0.34.2", "@img/sharp-linux-arm": "^0.34.2", "@img/sharp-linux-arm64": "^0.34.2", "@img/sharp-linux-x64": "^0.34.2", "@img/sharp-linuxmusl-arm64": "^0.34.2", "@img/sharp-linuxmusl-x64": "^0.34.2", "@img/sharp-win32-arm64": "^0.34.2", "@img/sharp-win32-x64": "^0.34.2" }, "peerDependencies": { "zod": "^4.0.0" } }, "sha512-3avi409dwuGkPEETpWa0gyJvRMr3b6LxeuW5/sAPCOtLD9WxH9fYltbA5wZoazxTw5mlbXmjDp7JqO1rlmpaIQ=="], "@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.73.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-URURVzhxXGJDGUGFunIOtBlSl7KWvZiAAKY/ttTkZAkXT9bTPqdk2eK0b8qqSxXpikh3QKPnPYpiyX98zf5ebw=="], @@ -534,8 +510,6 @@ "@openai/codex-win32-x64": ["@openai/codex@0.104.0-win32-x64", "", { "os": "win32", "cpu": "x64" }, "sha512-awyNLtfbTbj+2JzgsAIm+KFrxeAmxe/Fuqw/ZwBj8ljtO7SQWTT3kxDbf7iuA7E7IErGlQw/plgFgq/LJdsacg=="], - "@openrouter/ai-sdk-provider": ["@openrouter/ai-sdk-provider@2.3.3", "", { "peerDependencies": { "ai": "^6.0.0", "zod": "^3.25.0 || ^4.0.0" } }, "sha512-4fVteGkVedc7fGoA9+qJs4tpYwALezMq14m2Sjub3KmyRlksCbK+WJf67NPdGem8+NZrV2tAN42A1NU3+SiV3w=="], - "@opentelemetry/api": ["@opentelemetry/api@1.9.0", "", {}, "sha512-3giAOQvZiH5F9bMlMiv8+GSPMeqg0dbaeo58/0SlA9sxSqZhnUtxzX9/2FzyhS9sWQf5S0GJE0AKBrFqjpeYcg=="], "@opentelemetry/api-logs": ["@opentelemetry/api-logs@0.212.0", "", { "dependencies": { "@opentelemetry/api": "^1.3.0" } }, "sha512-TEEVrLbNROUkYY51sBJGk7lO/OLjuepch8+hmpM6ffMJQ2z/KVCjdHuCFX6fJj8OkJP2zckPjrJzQtXU3IAsFg=="], @@ -884,8 +858,6 @@ "@ungap/structured-clone": 
["@ungap/structured-clone@1.3.0", "", {}, "sha512-WmoN8qaIAo7WTYWbAZuG8PYEhn5fkz7dZrqTBZ7dtt//lL2Gwms1IcnQ5yHqjDfX8Ft5j4YzDM23f87zBfDe9g=="], - "@vercel/oidc": ["@vercel/oidc@3.1.0", "", {}, "sha512-Fw28YZpRnA3cAHHDlkt7xQHiJ0fcL+NRcIqsocZQUSmbzeIKRpwttJjik5ZGanXP+vlA4SbTg+AbA3bP363l+w=="], - "@vitejs/plugin-react": ["@vitejs/plugin-react@4.7.0", "", { "dependencies": { "@babel/core": "^7.28.0", "@babel/plugin-transform-react-jsx-self": "^7.27.1", "@babel/plugin-transform-react-jsx-source": "^7.27.1", "@rolldown/pluginutils": "1.0.0-beta.27", "@types/babel__core": "^7.20.5", "react-refresh": "^0.17.0" }, "peerDependencies": { "vite": "^4.2.0 || ^5.0.0 || ^6.0.0 || ^7.0.0" } }, "sha512-gUu9hwfWvvEDBBmgtAowQCojwZmJ5mcLn3aufeCsitijs3+f2NsrPtlAWIR6OPiqljl96GVCUbLe0HyqIpVaoA=="], "acorn": ["acorn@8.15.0", "", { "bin": { "acorn": "bin/acorn" } }, "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg=="], @@ -896,8 +868,6 @@ "agentv": ["agentv@workspace:apps/cli"], - "ai": ["ai@6.0.116", "", { "dependencies": { "@ai-sdk/gateway": "3.0.66", "@ai-sdk/provider": "3.0.8", "@ai-sdk/provider-utils": "4.0.19", "@opentelemetry/api": "1.9.0" }, "peerDependencies": { "zod": "^3.25.76 || ^4.1.8" } }, "sha512-7yM+cTmyRLeNIXwt4Vj+mrrJgVQ9RMIW5WO0ydoLoYkewIvsMcvUmqS4j2RJTUXaF1HphwmSKUMQ/HypNRGOmA=="], - "ajv": ["ajv@8.20.0", "", { "dependencies": { "fast-deep-equal": "^3.1.3", "fast-uri": "^3.0.1", "json-schema-traverse": "^1.0.0", "require-from-string": "^2.0.2" } }, "sha512-Thbli+OlOj+iMPYFBVBfJ3OmCAnaSyNn4M1vz9T6Gka5Jt9ba/HIR56joy65tY6kx/FCF5VXNB819Y7/GUrBGA=="], "ajv-formats": ["ajv-formats@3.0.1", "", { "dependencies": { "ajv": "^8.0.0" } }, "sha512-8iUql50EUR+uUcdRQ3HDqa6EVyo3docL8g5WJ3FNcWmu62IbkGUue/pEyLBW8VGKKucTPgqeks4fIU1DA4yowQ=="], @@ -1182,8 +1152,6 @@ "eventemitter3": ["eventemitter3@5.0.4", "", {}, "sha512-mlsTRyGaPBjPedk6Bvw+aqbsXDtoAyAzm5MO7JgU+yVRyMQ5O8bD4Kcci7BS85f93veegeCPkL8R4GLClnjLFw=="], - "eventsource-parser": 
["eventsource-parser@3.0.6", "", {}, "sha512-Vo1ab+QXPzZ4tCa8SwIHJFaSzy4R6SHf7BY79rFBDf0idraZWAkYrDjDj8uWaSm3S2TK+hJ7/t1CEmZ7jXw+pg=="], - "execa": ["execa@9.6.1", "", { "dependencies": { "@sindresorhus/merge-streams": "^4.0.0", "cross-spawn": "^7.0.6", "figures": "^6.1.0", "get-stream": "^9.0.0", "human-signals": "^8.0.1", "is-plain-obj": "^4.1.0", "is-stream": "^4.0.1", "npm-run-path": "^6.0.0", "pretty-ms": "^9.2.0", "signal-exit": "^4.1.0", "strip-final-newline": "^4.0.0", "yoctocolors": "^2.1.1" } }, "sha512-9Be3ZoN4LmYR90tUoVu2te2BsbzHfhJyfEiAVfz7N5/zv+jduIfLrV2xdQXOHbaD6KgpGdO9PRPM1Y4Q9QkPkA=="], "expressive-code": ["expressive-code@0.41.6", "", { "dependencies": { "@expressive-code/core": "^0.41.6", "@expressive-code/plugin-frames": "^0.41.6", "@expressive-code/plugin-shiki": "^0.41.6", "@expressive-code/plugin-text-markers": "^0.41.6" } }, "sha512-W/5+IQbrpCIM5KGLjO35wlp1NCwDOOVQb+PAvzEoGkW1xjGM807ZGfBKptNWH6UECvt6qgmLyWolCMYKh7eQmA=="], @@ -1390,8 +1358,6 @@ "json-bigint": ["json-bigint@1.0.0", "", { "dependencies": { "bignumber.js": "^9.0.0" } }, "sha512-SiPv/8VpZuWbvLSMtTDU8hEfrZWg/mH/nV/b4o0CYbSxu1UIQPLdwKOCIyLQX+VIPO5vrLX3i8qtqFyhdPSUSQ=="], - "json-schema": ["json-schema@0.4.0", "", {}, "sha512-es94M3nTIfsEPisRafak+HDLfHXnKBhV3vU5eqPcS3flIWqcxJWgXHXiey3YrpaNsanY5ei1VoYEbOzijuq9BA=="], - "json-schema-to-ts": ["json-schema-to-ts@3.1.1", "", { "dependencies": { "@babel/runtime": "^7.18.3", "ts-algebra": "^2.0.0" } }, "sha512-+DWg8jCJG2TEnpy7kOm/7/AxaYoaRbjVB4LFZLySZlWn8exGs3A4OLJR966cVvU26N7X9TWxl+Jsw7dzAqKT6g=="], "json-schema-traverse": ["json-schema-traverse@1.0.0", "", {}, "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="], diff --git a/package.json b/package.json index ddbefff8..19b26f17 100644 --- a/package.json +++ b/package.json @@ -39,8 +39,5 @@ "tsup": "8.3.5", "typescript": "5.8.3", "yaml": "^2.8.3" - }, - "dependencies": { - "@openrouter/ai-sdk-provider": "^2.3.3" } } diff --git 
a/packages/core/package.json b/packages/core/package.json index 39308f31..40cc1524 100644 --- a/packages/core/package.json +++ b/packages/core/package.json @@ -42,16 +42,10 @@ "dependencies": { "@agentclientprotocol/sdk": "^0.14.1", "@agentv/eval": "workspace:*", - "@ai-sdk/anthropic": "^3.0.0", - "@ai-sdk/azure": "^3.0.0", - "@ai-sdk/google": "^3.0.0", - "@ai-sdk/openai": "^3.0.0", "@github/copilot-sdk": "^0.1.25", "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", - "@openrouter/ai-sdk-provider": "^2.3.1", "@sinclair/typebox": "^0.34.41", - "ai": "^6.0.0", "fast-glob": "^3.3.3", "json5": "^2.2.3", "micromatch": "^4.0.8", diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/ai-sdk.ts index 8150e1db..df81c4f4 100644 --- a/packages/core/src/evaluation/providers/ai-sdk.ts +++ b/packages/core/src/evaluation/providers/ai-sdk.ts @@ -1,7 +1,27 @@ -import { createAnthropic } from '@ai-sdk/anthropic'; -import { type AzureOpenAIProviderSettings, createAzure } from '@ai-sdk/azure'; -import { createGoogleGenerativeAI } from '@ai-sdk/google'; -import { createOpenAI } from '@ai-sdk/openai'; +/** + * LLM provider classes for the five direct-API providers AgentV supports: + * OpenAI, Azure OpenAI, OpenRouter, Anthropic, Google (Gemini). + * + * All five route through @mariozechner/pi-ai. Each provider class: + * 1. Resolves a pi-ai Model in its constructor (registry lookup + field + * merges; one-time work). + * 2. Implements invoke() by delegating to invokePiAi(), which runs the + * stateless single-shot path or the multi-step agent loop depending on + * whether the request carries `tools`. + * 3. Holds no Vercel AI SDK references. + * + * To add a new provider: + * 1. Add a config interface in targets.ts. + * 2. Add a class here that resolves a PiModel + maps config to invokePiAi + * options. 
Pi-ai's KnownProvider list (see types.d.ts) is the source of + truth for `providerName`; pi-ai's KnownApi list is the source of + truth for `apiId`. + 3. Register it in providers/index.ts. + * + * File name: kept as ai-sdk.ts for now to minimize diff churn during the + * pi-ai migration — rename in a follow-up once the dust settles. + */ + import { type AssistantMessage as PiAssistantMessage, type Message as PiMessage, @@ -12,8 +32,6 @@ import { getModel as piGetModel, registerBuiltInApiProviders, } from '@mariozechner/pi-ai'; -import { createOpenRouter } from '@openrouter/ai-sdk-provider'; -import { type LanguageModel, type ModelMessage, generateText } from 'ai'; // pi-ai routes complete()/stream() by Model.api; the built-in providers must be // registered once at module load. Cheap; idempotent across repeated imports. @@ -33,51 +51,35 @@ import type { ChatPrompt, Provider, ProviderRequest, ProviderResponse } from './ const DEFAULT_SYSTEM_PROMPT = 'You are a careful assistant. Follow all provided instructions and do not fabricate results.'; -type TextResult = Awaited<ReturnType<typeof generateText>>; -type GenerateTextOptions = Parameters<typeof generateText>[0]; - export interface ProviderDefaults { readonly temperature?: number; readonly maxOutputTokens?: number; readonly thinkingBudget?: number; } +// --------------------------------------------------------------------------- +// Provider classes — model is resolved in the constructor, invoke() is thin. +// --------------------------------------------------------------------------- + export class OpenAIProvider implements Provider { readonly id: string; readonly kind = 'openai' as const; readonly targetName: string; - // Vercel LanguageModel kept only for asLanguageModel() callers (llm-grader, - // composite, agentv-provider) until they migrate off it in #1205. Once gone, - // delete this field and the createOpenAI build below. - private readonly model: LanguageModel; - // pi-ai's Model is plain data — what model, where it lives — with no auth.
- // We resolve once at construction (registry lookup + field merges) and pass - // it on each invoke. apiKey stays a per-call StreamOptions field, mirroring - // pi-ai's own API: model and credentials are orthogonal concerns. private readonly piModel: PiModel; private readonly defaults: ProviderDefaults; private readonly retryConfig?: RetryConfig; + private readonly apiKey: string; - constructor( - targetName: string, - private readonly config: OpenAIResolvedConfig, - ) { + constructor(targetName: string, config: OpenAIResolvedConfig) { this.id = `openai:${targetName}`; this.targetName = targetName; + this.apiKey = config.apiKey; this.defaults = { temperature: config.temperature, maxOutputTokens: config.maxOutputTokens, }; this.retryConfig = config.retry; - - const openai = createOpenAI({ - apiKey: config.apiKey, - baseURL: config.baseURL, - }); - this.model = - config.apiFormat === 'responses' ? openai(config.model) : openai.chat(config.model); - this.piModel = resolvePiModel({ providerName: 'openai', apiId: config.apiFormat === 'responses' ? 
'openai-responses' : 'openai-completions', @@ -89,483 +91,220 @@ export class OpenAIProvider implements Provider { async invoke(request: ProviderRequest): Promise<ProviderResponse> { return invokePiAi({ model: this.piModel, - apiKey: this.config.apiKey, + apiKey: this.apiKey, request, defaults: this.defaults, retryConfig: this.retryConfig, }); } - - asLanguageModel(): LanguageModel { - return this.model; - } } -export class AzureProvider implements Provider { +export class OpenRouterProvider implements Provider { readonly id: string; - readonly kind = 'azure' as const; + readonly kind = 'openrouter' as const; readonly targetName: string; - private readonly model: LanguageModel; + private readonly piModel: PiModel; private readonly defaults: ProviderDefaults; private readonly retryConfig?: RetryConfig; + private readonly apiKey: string; - constructor( - targetName: string, - private readonly config: AzureResolvedConfig, - ) { - this.id = `azure:${targetName}`; + constructor(targetName: string, config: OpenRouterResolvedConfig) { + this.id = `openrouter:${targetName}`; this.targetName = targetName; + this.apiKey = config.apiKey; this.defaults = { temperature: config.temperature, maxOutputTokens: config.maxOutputTokens, }; this.retryConfig = config.retry; - - const azure = createAzure(buildAzureOptions(config)); - this.model = - config.apiFormat === 'responses' - ? azure(config.deploymentName) - : azure.chat(config.deploymentName); + // OpenRouter exposes an OpenAI-compatible endpoint; pi-ai routes it through + // openai-completions with a fixed baseUrl.
+ this.piModel = resolvePiModel({ + providerName: 'openrouter', + apiId: 'openai-completions', + modelId: config.model, + baseUrl: 'https://openrouter.ai/api/v1', + }); } async invoke(request: ProviderRequest): Promise<ProviderResponse> { - return invokeModel({ - model: this.model, + return invokePiAi({ + model: this.piModel, + apiKey: this.apiKey, request, defaults: this.defaults, retryConfig: this.retryConfig, }); } - - asLanguageModel(): LanguageModel { - return this.model; - } } -export class OpenRouterProvider implements Provider { +export class AnthropicProvider implements Provider { readonly id: string; - readonly kind = 'openrouter' as const; + readonly kind = 'anthropic' as const; readonly targetName: string; - private readonly model: LanguageModel; + private readonly piModel: PiModel; private readonly defaults: ProviderDefaults; private readonly retryConfig?: RetryConfig; + private readonly apiKey: string; + private readonly thinkingBudget?: number; - constructor( - targetName: string, - private readonly config: OpenRouterResolvedConfig, - ) { - this.id = `openrouter:${targetName}`; + constructor(targetName: string, config: AnthropicResolvedConfig) { + this.id = `anthropic:${targetName}`; this.targetName = targetName; + this.apiKey = config.apiKey; + this.thinkingBudget = config.thinkingBudget; this.defaults = { temperature: config.temperature, maxOutputTokens: config.maxOutputTokens, + thinkingBudget: config.thinkingBudget, }; this.retryConfig = config.retry; - - const openrouter = createOpenRouter({ - apiKey: config.apiKey, + this.piModel = resolvePiModel({ + providerName: 'anthropic', + apiId: 'anthropic-messages', + modelId: config.model, }); - this.model = openrouter(config.model); } async invoke(request: ProviderRequest): Promise<ProviderResponse> { - return invokeModel({ - model: this.model, + // Pi-ai's Anthropic provider takes the same numeric thinking budget as the + // legacy Vercel path — no lossy bucket mapping needed for older models.
+ // Newer models (Opus 4.6, Sonnet 4.6) ignore thinkingBudgetTokens in favor + // of adaptive thinking; we still pass it for forward-compat. + const providerOptions = + this.thinkingBudget !== undefined + ? { thinkingEnabled: true, thinkingBudgetTokens: this.thinkingBudget } + : undefined; + + return invokePiAi({ + model: this.piModel, + apiKey: this.apiKey, request, defaults: this.defaults, retryConfig: this.retryConfig, + ...(providerOptions ? { providerOptions } : {}), }); } - - asLanguageModel(): LanguageModel { - return this.model; - } } -export class AnthropicProvider implements Provider { +export class GeminiProvider implements Provider { readonly id: string; - readonly kind = 'anthropic' as const; + readonly kind = 'gemini' as const; readonly targetName: string; - private readonly model: LanguageModel; + private readonly piModel: PiModel; private readonly defaults: ProviderDefaults; private readonly retryConfig?: RetryConfig; + private readonly apiKey: string; - constructor( - targetName: string, - private readonly config: AnthropicResolvedConfig, - ) { - this.id = `anthropic:${targetName}`; + constructor(targetName: string, config: GeminiResolvedConfig) { + this.id = `gemini:${targetName}`; this.targetName = targetName; + this.apiKey = config.apiKey; this.defaults = { temperature: config.temperature, maxOutputTokens: config.maxOutputTokens, - thinkingBudget: config.thinkingBudget, }; this.retryConfig = config.retry; - - const anthropic = createAnthropic({ - apiKey: config.apiKey, + this.piModel = resolvePiModel({ + providerName: 'google', + apiId: 'google-generative-ai', + modelId: config.model, }); - this.model = anthropic(config.model); } async invoke(request: ProviderRequest): Promise<ProviderResponse> { - const providerOptions = buildAnthropicProviderOptions(this.defaults); - - return invokeModel({ - model: this.model, + return invokePiAi({ + model: this.piModel, + apiKey: this.apiKey, request, defaults: this.defaults, retryConfig: this.retryConfig, - providerOptions,
}); } - - asLanguageModel(): LanguageModel { - return this.model; - } } -export class GeminiProvider implements Provider { +export class AzureProvider implements Provider { readonly id: string; - readonly kind = 'gemini' as const; + readonly kind = 'azure' as const; readonly targetName: string; - private readonly model: LanguageModel; + private readonly piModel: PiModel; private readonly defaults: ProviderDefaults; private readonly retryConfig?: RetryConfig; + private readonly apiKey: string; + private readonly providerOptions: Record<string, unknown>; - constructor( - targetName: string, - private readonly config: GeminiResolvedConfig, - ) { - this.id = `gemini:${targetName}`; + constructor(targetName: string, config: AzureResolvedConfig) { + this.id = `azure:${targetName}`; this.targetName = targetName; + this.apiKey = config.apiKey; this.defaults = { temperature: config.temperature, maxOutputTokens: config.maxOutputTokens, }; this.retryConfig = config.retry; - const google = createGoogleGenerativeAI({ - apiKey: config.apiKey, + // Pi-ai's azure-openai-responses provider handles the Azure-specific URL + // shape and api-version query param. We pass either a full base URL or a + // resource name + apiVersion via providerOptions; pi-ai does the rest. + // + // apiFormat is intentionally not branched here: pi-ai uses Azure's + // Responses API for both chat-style and responses-style calls. Users who + // hit an Azure deployment that only exposes /chat/completions can route + // through `provider: openai` with a deployment-scoped baseURL instead. + const trimmed = config.resourceName.trim(); + const isFullUrl = /^https?:\/\//i.test(trimmed); + const baseUrl = isFullUrl ? buildAzureBaseUrl(trimmed) : undefined; + + this.providerOptions = { + ...(baseUrl ? { azureBaseUrl: baseUrl } : { azureResourceName: trimmed }), + ...(config.version ?
{ azureApiVersion: config.version } : {}), + }; + + this.piModel = resolvePiModel({ + providerName: 'azure-openai-responses', + apiId: 'azure-openai-responses', + // The "model id" for Azure is the deployment name. + modelId: config.deploymentName, + ...(baseUrl ? { baseUrl } : {}), }); - this.model = google(config.model); } async invoke(request: ProviderRequest): Promise<ProviderResponse> { - return invokeModel({ - model: this.model, + return invokePiAi({ + model: this.piModel, + apiKey: this.apiKey, request, defaults: this.defaults, retryConfig: this.retryConfig, + providerOptions: this.providerOptions, }); } - - asLanguageModel(): LanguageModel { - return this.model; - } -} - -function buildAzureOptions(config: AzureResolvedConfig): AzureOpenAIProviderSettings { - const options: AzureOpenAIProviderSettings = { - apiKey: config.apiKey, - apiVersion: config.version, - // Chat completions still use deployment-scoped Azure URLs for compatibility - // with existing deployments. Responses API should use the SDK's v1 path. - useDeploymentBasedUrls: config.apiFormat !== 'responses', - }; - - const baseURL = normalizeAzureBaseUrl(config.resourceName); - if (baseURL) { - options.baseURL = baseURL; - } else { - options.resourceName = config.resourceName; - } - - return options; -} - -function normalizeAzureBaseUrl(resourceName: string): string | undefined { - const trimmed = resourceName.trim(); - if (!/^https?:\/\//i.test(trimmed)) { - return undefined; - } - - const withoutSlash = trimmed.replace(/\/+$/, ''); - const normalized = withoutSlash.endsWith('/openai') ?
withoutSlash : `${withoutSlash}/openai`; - return normalized; -} - -function buildAnthropicProviderOptions( - defaults: ProviderDefaults, -): GenerateTextOptions['providerOptions'] | undefined { - if (defaults.thinkingBudget === undefined) { - return undefined; - } - - return { - anthropic: { - thinking: { - type: 'enabled', - budgetTokens: defaults.thinkingBudget, - }, - }, - }; -} - -function buildChatPrompt(request: ProviderRequest): ChatPrompt { - const provided = request.chatPrompt?.length ? request.chatPrompt : undefined; - if (provided) { - const hasSystemMessage = provided.some((message) => message.role === 'system'); - if (hasSystemMessage) { - return provided; - } - - const systemContent = resolveSystemContent(request); - return [{ role: 'system', content: systemContent }, ...provided]; - } - - const systemContent = resolveSystemContent(request); - const userContent = request.question.trim(); - - const prompt: ChatPrompt = [ - { role: 'system', content: systemContent }, - { role: 'user', content: userContent }, - ]; - - return prompt; -} - -function resolveSystemContent(request: ProviderRequest): string { - const systemSegments: string[] = []; - - if (request.systemPrompt && request.systemPrompt.trim().length > 0) { - systemSegments.push(request.systemPrompt.trim()); - } else { - systemSegments.push(DEFAULT_SYSTEM_PROMPT); - } - - return systemSegments.join('\n\n'); -} - -function toModelMessages(chatPrompt: ChatPrompt): ModelMessage[] { - return chatPrompt.map((message) => { - if (message.role === 'tool' || message.role === 'function') { - const prefix = message.name ? 
`@[${message.name}]: ` : '@[Tool]: '; - return { - role: 'assistant', - content: `${prefix}${message.content}`, - } satisfies ModelMessage; - } - - if (message.role === 'assistant' || message.role === 'system' || message.role === 'user') { - return { - role: message.role, - content: message.content, - } satisfies ModelMessage; - } - - return { - role: 'user', - content: message.content, - } satisfies ModelMessage; - }); -} - -function resolveModelSettings( - request: ProviderRequest, - defaults: ProviderDefaults, -): { temperature?: number; maxOutputTokens?: number } { - const temperature = request.temperature ?? defaults.temperature; - const maxOutputTokens = request.maxOutputTokens ?? defaults.maxOutputTokens; - return { - temperature, - maxOutputTokens, - }; -} - -async function invokeModel(options: { - readonly model: LanguageModel; - readonly request: ProviderRequest; - readonly defaults: ProviderDefaults; - readonly retryConfig?: RetryConfig; - readonly providerOptions?: GenerateTextOptions['providerOptions']; -}): Promise<ProviderResponse> { - const { model, request, defaults, retryConfig, providerOptions } = options; - const chatPrompt = buildChatPrompt(request); - const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); - - const startTime = new Date().toISOString(); - const startMs = Date.now(); - - const result = await withRetry( - () => - generateText({ - model, - messages: toModelMessages(chatPrompt), - temperature, - maxOutputTokens, - maxRetries: 0, - abortSignal: request.signal, - ...(providerOptions ? { providerOptions } : {}), - }), - retryConfig, - request.signal, - ); - - const endTime = new Date().toISOString(); - const durationMs = Date.now() - startMs; - - return mapResponse(result, { durationMs, startTime, endTime }); } -function mapResponse( - result: TextResult, - timing?: { durationMs: number; startTime: string; endTime: string }, -): ProviderResponse { - const content = result.text ?? ''; - const rawUsage = result.totalUsage ??
result.usage; - const reasoning = rawUsage?.outputTokenDetails?.reasoningTokens ?? undefined; - const cached = rawUsage?.inputTokenDetails?.cacheReadTokens ?? undefined; - const tokenUsage = - rawUsage?.inputTokens != null && rawUsage?.outputTokens != null - ? { - input: rawUsage.inputTokens, - output: rawUsage.outputTokens, - ...(reasoning != null ? { reasoning } : {}), - ...(cached != null ? { cached } : {}), - } - : undefined; - - return { - raw: result, - usage: toJsonObject(rawUsage), - output: [{ role: 'assistant' as const, content }], - tokenUsage, - durationMs: timing?.durationMs, - startTime: timing?.startTime, - endTime: timing?.endTime, - }; -} - -function toJsonObject(value: unknown): JsonObject | undefined { - if (!value || typeof value !== 'object') { - return undefined; - } - - try { - return JSON.parse(JSON.stringify(value)) as JsonObject; - } catch { - return undefined; - } -} - -function extractStatus(error: unknown): number | undefined { - if (!error || typeof error !== 'object') { - return undefined; - } - - const candidate = error as Record<string, unknown>; - const directStatus = candidate.status ?? candidate.statusCode; - if (typeof directStatus === 'number' && Number.isFinite(directStatus)) { - return directStatus; - } - - const responseStatus = - typeof candidate.response === 'object' && candidate.response - ? (candidate.response as { status?: unknown }).status - : undefined; - if (typeof responseStatus === 'number' && Number.isFinite(responseStatus)) { - return responseStatus; - } - - const message = typeof candidate.message === 'string' ?
candidate.message : undefined; - if (message) { - const match = message.match(/HTTP\s+(\d{3})/i); - if (match) { - const parsed = Number.parseInt(match[1], 10); - if (Number.isFinite(parsed)) { - return parsed; - } - } - } - - return undefined; -} - -function isNetworkError(error: unknown): boolean { - if (!error || typeof error !== 'object') { - return false; - } - - const candidate = error as Record<string, unknown>; - if (candidate.name === 'AbortError') { - return false; - } - - const code = candidate.code; - if (typeof code === 'string' && /^E(AI|CONN|HOST|NET|PIPE|TIME|REFUSED|RESET)/i.test(code)) { - return true; - } - - const message = typeof candidate.message === 'string' ? candidate.message : undefined; - if ( - message && - /(network|fetch failed|ECONNRESET|ENOTFOUND|EAI_AGAIN|ETIMEDOUT|ECONNREFUSED)/i.test(message) - ) { - return true; - } - - return false; -} - -function isRetryableError(error: unknown, retryableStatusCodes: readonly number[]): boolean { - const status = extractStatus(error); - if (status === 401 || status === 403) { - return false; - } - if (typeof status === 'number') { - return retryableStatusCodes.includes(status); - } - - return isNetworkError(error); -} - -function calculateRetryDelay(attempt: number, config: Required<RetryConfig>): number { - const delay = Math.min( - config.maxDelayMs, - config.initialDelayMs * config.backoffFactor ** attempt, - ); - return delay * (0.75 + Math.random() * 0.5); -} - -async function sleep(ms: number): Promise<void> { - return new Promise((resolve) => setTimeout(resolve, ms)); +/** + * Normalize a user-supplied Azure URL to pi-ai's expected base. + * + * Pi-ai's azure-openai-responses appends `/responses?api-version=...` to the + * baseUrl, so the URL we hand it should end at the `/openai/v1` segment.
+ Accept either: + * - https://<resource>.openai.azure.com → add `/openai/v1` + * - https://<resource>.openai.azure.com/openai → replace `/openai` with `/openai/v1` + * - https://<resource>.openai.azure.com/openai/v1 → keep as-is + */ +function buildAzureBaseUrl(input: string): string { + const trimmed = input.replace(/\/+$/, ''); + if (trimmed.endsWith('/openai/v1')) return trimmed; + if (trimmed.endsWith('/openai')) return `${trimmed}/v1`; + return `${trimmed}/openai/v1`; } // --------------------------------------------------------------------------- -// pi-ai migration (issue #1205) +// Shared adapter — invokePiAi runs the model call (single-shot or agent loop) // --------------------------------------------------------------------------- -// -// invokePiAi runs a single non-streaming, non-tool-using completion through -// @mariozechner/pi-ai. It is the new code path; the existing invokeModel -// (Vercel AI SDK) above is still in use for the four providers we have not -// ported yet (Azure, OpenRouter, Anthropic, Gemini). -// -// Types come through `@mariozechner/pi-ai` plus our local `pi-ai-shim.d.ts` -// ambient augmentation. Pi-ai's published d.ts re-exports do not surface at -// the package root under NodeNext, so the shim re-declares the small subset -// we use (Model, Message, complete, getModel, ...). See pi-ai-shim.d.ts. -// -// To port a provider: -// 1. Map its config to the invokePiAi options below (api id, baseUrl, key). -// 2. Replace the provider's invoke() to call invokePiAi. -// 3. Drop the createX() / this.model build from the constructor when -// asLanguageModel() is no longer used by any consumer. export interface InvokePiAiOptions { /** Pre-resolved pi-ai model (built once in the provider constructor). */ @@ -579,10 +318,17 @@ export interface InvokePiAiOptions { readonly request: ProviderRequest; readonly defaults: ProviderDefaults; readonly retryConfig?: RetryConfig; + /** + * Provider-specific options merged into pi-ai's call options.
Pi-ai's + * ProviderStreamOptions is `StreamOptions & Record`, so + * extra keys flow through to the underlying provider impl. Example: + * Anthropic accepts `{ thinkingEnabled: true, thinkingBudgetTokens: 8000 }`. + */ + readonly providerOptions?: Record; } export async function invokePiAi(options: InvokePiAiOptions): Promise { - const { model, apiKey, request, defaults, retryConfig } = options; + const { model, apiKey, request, defaults, retryConfig, providerOptions } = options; const tools = request.tools && request.tools.length > 0 ? request.tools : undefined; const maxSteps = tools ? Math.max(1, request.maxSteps ?? 1) : 1; @@ -604,6 +350,7 @@ export async function invokePiAi(options: InvokePiAiOptions): Promise= 0; i--) { - const m = messages[i]; - if (m.role !== 'user') continue; - const text = typeof m.content === 'string' ? m.content : ''; - messages[i] = { - ...m, - content: [ - ...(text ? [{ type: 'text' as const, text }] : []), - ...images.map((img) => ({ - type: 'image' as const, - data: img.source, - mimeType: img.media_type, - })), - ], - }; - return; - } - // No user message to attach images to — synthesize one. - messages.push({ - role: 'user', - content: images.map((img) => ({ - type: 'image' as const, - data: img.source, - mimeType: img.media_type, - })), - timestamp: Date.now(), - }); -} - function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { - // OpenAIProvider.invoke() is reached from the orchestrator's multi-turn - // and single-turn paths, so the chatPrompt may legitimately contain - // `assistant` (prior turn output) and `tool`/`function` (rare — most callers - // remap these upstream in prompt-builder). We mirror the Vercel path's - // toModelMessages: pass assistant through as-is; fold tool/function back - // into assistant text with a `@[name]:` prefix so pi-ai sees a clean - // user/assistant alternation. 
- // - // Pi-ai's AssistantMessage type carries api/provider/model/usage/stopReason - // for round-trip continuity, but its OpenAI-completions converter only reads - // role + content blocks for replayed history. We synthesize a minimal - // assistant turn with placeholder metadata — pi-ai ignores those fields when - // converting to the wire format. const systemSegments: string[] = []; const messages: PiMessage[] = []; const now = Date.now(); @@ -877,6 +569,40 @@ function chatPromptToPiContext(chatPrompt: ChatPrompt): PiContext { }; } +function attachImagesToLastUserMessage( + messages: PiMessage[], + images: ProviderRequest['images'], +): void { + if (!images || images.length === 0) return; + for (let i = messages.length - 1; i >= 0; i--) { + const m = messages[i]; + if (m.role !== 'user') continue; + const text = typeof m.content === 'string' ? m.content : ''; + messages[i] = { + ...m, + content: [ + ...(text ? [{ type: 'text' as const, text }] : []), + ...images.map((img) => ({ + type: 'image' as const, + data: img.source, + mimeType: img.media_type, + })), + ], + }; + return; + } + // No user message to attach images to — synthesize one. + messages.push({ + role: 'user', + content: images.map((img) => ({ + type: 'image' as const, + data: img.source, + mimeType: img.media_type, + })), + timestamp: Date.now(), + }); +} + function mapPiResponse( result: PiAssistantMessage, timing: { @@ -904,7 +630,7 @@ function mapPiResponse( // pi-ai always populates `cost.total`, but it computes 0 when the model // descriptor lacks pricing (fallback descriptor for unknown ids, or pi-ai's // registry simply not having rates yet). Surface 0 as "unknown" by leaving - // costUsd undefined — matches the Vercel path, which never sets it. + // costUsd undefined — matches the legacy ai-sdk path, which never set it. const costUsd = timing.aggregateUsage.cost > 0 ? 
timing.aggregateUsage.cost : undefined; return { @@ -920,6 +646,132 @@ function mapPiResponse( }; } +// --------------------------------------------------------------------------- +// Chat-prompt construction (shared with old paths; not pi-ai-specific) +// --------------------------------------------------------------------------- + +function buildChatPrompt(request: ProviderRequest): ChatPrompt { + const provided = request.chatPrompt?.length ? request.chatPrompt : undefined; + if (provided) { + const hasSystemMessage = provided.some((message) => message.role === 'system'); + if (hasSystemMessage) { + return provided; + } + const systemContent = resolveSystemContent(request); + return [{ role: 'system', content: systemContent }, ...provided]; + } + + const systemContent = resolveSystemContent(request); + const userContent = request.question.trim(); + + return [ + { role: 'system', content: systemContent }, + { role: 'user', content: userContent }, + ]; +} + +function resolveSystemContent(request: ProviderRequest): string { + if (request.systemPrompt && request.systemPrompt.trim().length > 0) { + return request.systemPrompt.trim(); + } + return DEFAULT_SYSTEM_PROMPT; +} + +function resolveModelSettings( + request: ProviderRequest, + defaults: ProviderDefaults, +): { temperature?: number; maxOutputTokens?: number } { + return { + temperature: request.temperature ?? defaults.temperature, + maxOutputTokens: request.maxOutputTokens ?? 
defaults.maxOutputTokens,
+  };
+}
+
+function toJsonObject(value: unknown): JsonObject | undefined {
+  if (!value || typeof value !== 'object') {
+    return undefined;
+  }
+  try {
+    return JSON.parse(JSON.stringify(value)) as JsonObject;
+  } catch {
+    return undefined;
+  }
+}
+
+// ---------------------------------------------------------------------------
+// Retry / backoff — library-agnostic; wraps any async fn that may transient-fail
+// ---------------------------------------------------------------------------
+
+function extractStatus(error: unknown): number | undefined {
+  if (!error || typeof error !== 'object') return undefined;
+
+  const candidate = error as Record<string, unknown>;
+  const directStatus = candidate.status ?? candidate.statusCode;
+  if (typeof directStatus === 'number' && Number.isFinite(directStatus)) {
+    return directStatus;
+  }
+
+  const responseStatus =
+    typeof candidate.response === 'object' && candidate.response
+      ? (candidate.response as { status?: unknown }).status
+      : undefined;
+  if (typeof responseStatus === 'number' && Number.isFinite(responseStatus)) {
+    return responseStatus;
+  }
+
+  const message = typeof candidate.message === 'string' ? candidate.message : undefined;
+  if (message) {
+    const match = message.match(/HTTP\s+(\d{3})/i);
+    if (match) {
+      const parsed = Number.parseInt(match[1], 10);
+      if (Number.isFinite(parsed)) return parsed;
+    }
+  }
+
+  return undefined;
+}
+
+function isNetworkError(error: unknown): boolean {
+  if (!error || typeof error !== 'object') return false;
+
+  const candidate = error as Record<string, unknown>;
+  if (candidate.name === 'AbortError') return false;
+
+  const code = candidate.code;
+  if (typeof code === 'string' && /^E(AI|CONN|HOST|NET|PIPE|TIME|REFUSED|RESET)/i.test(code)) {
+    return true;
+  }
+
+  const message = typeof candidate.message === 'string' ? candidate.message : undefined;
+  if (
+    message &&
+    /(network|fetch failed|ECONNRESET|ENOTFOUND|EAI_AGAIN|ETIMEDOUT|ECONNREFUSED)/i.test(message)
+  ) {
+    return true;
+  }
+
+  return false;
+}
+
+function isRetryableError(error: unknown, retryableStatusCodes: readonly number[]): boolean {
+  const status = extractStatus(error);
+  if (status === 401 || status === 403) return false;
+  if (typeof status === 'number') return retryableStatusCodes.includes(status);
+  return isNetworkError(error);
+}
+
+function calculateRetryDelay(attempt: number, config: Required<RetryConfig>): number {
+  const delay = Math.min(
+    config.maxDelayMs,
+    config.initialDelayMs * config.backoffFactor ** attempt,
+  );
+  return delay * (0.75 + Math.random() * 0.5);
+}
+
+async function sleep(ms: number): Promise<void> {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
 async function withRetry<T>(
   fn: () => Promise<T>,
   retryConfig?: RetryConfig,
@@ -945,13 +797,8 @@ async function withRetry<T>(
     } catch (error) {
       lastError = error;
 
-      if (attempt >= config.maxRetries) {
-        break;
-      }
-
-      if (!isRetryableError(error, config.retryableStatusCodes)) {
-        throw error;
-      }
+      if (attempt >= config.maxRetries) break;
+      if (!isRetryableError(error, config.retryableStatusCodes)) throw error;
 
       const delay = calculateRetryDelay(attempt, config);
       await sleep(delay);
diff --git a/packages/core/src/evaluation/providers/types.ts b/packages/core/src/evaluation/providers/types.ts
index 0caf874e..3670705e 100644
--- a/packages/core/src/evaluation/providers/types.ts
+++ b/packages/core/src/evaluation/providers/types.ts
@@ -353,11 +353,6 @@ export interface Provider {
    * the orchestrator may send multiple requests in a single provider session.
    */
   invokeBatch?(requests: readonly ProviderRequest[]): Promise<ProviderResponse[]>;
-  /**
-   * Optional method to get a Vercel AI SDK LanguageModel instance for structured output generation.
-   * Used by evaluators that need generateObject/generateText from the AI SDK.
-   */
-  asLanguageModel?(): import('ai').LanguageModel;
 }
 
 export type EnvLookup = Readonly<Record<string, string | undefined>>;
 
diff --git a/packages/core/src/evaluation/registry/builtin-graders.ts b/packages/core/src/evaluation/registry/builtin-graders.ts
index b24eb20b..4133f9d4 100644
--- a/packages/core/src/evaluation/registry/builtin-graders.ts
+++ b/packages/core/src/evaluation/registry/builtin-graders.ts
@@ -95,8 +95,8 @@ export const llmGraderFactory: GraderFactoryFn = (config, context) => {
   }
   // Only pass graderTargetProvider for agent providers (delegate mode).
   // LLM providers use the normal resolveGraderProvider path for structured JSON mode.
-  // Note: agentv uses asLanguageModel() not invoke(), so it's not in AGENT_PROVIDER_KINDS;
-  // check it explicitly here for built-in agent mode.
+  // The agentv provider drives the built-in agent loop directly, so include
+  // it alongside AGENT_PROVIDER_KINDS even though it doesn't spawn a subprocess.
   const isAgent = isAgentProvider(graderTargetProvider) || graderTargetProvider.kind === 'agentv';
   evaluator = new LlmGrader({
     resolveGraderProvider: async (evalContext) => {
diff --git a/packages/core/test/evaluation/providers/targets.test.ts b/packages/core/test/evaluation/providers/targets.test.ts
index 0e806120..5b663234 100644
--- a/packages/core/test/evaluation/providers/targets.test.ts
+++ b/packages/core/test/evaluation/providers/targets.test.ts
@@ -1,44 +1,6 @@
 import { beforeEach, describe, expect, it, mock, spyOn } from 'bun:test';
 
-const generateTextMock = mock(async () => ({
-  text: 'ok',
-  reasoningText: undefined,
-  usage: { promptTokens: 1, completionTokens: 1, totalTokens: 2 },
-  totalUsage: { promptTokens: 1, completionTokens: 1, totalTokens: 2 },
-  content: [],
-  reasoning: [],
-  files: [],
-  sources: [],
-  toolCalls: [],
-  staticToolCalls: [],
-  dynamicToolCalls: [],
-  toolResults: [],
-  staticToolResults: [],
-  dynamicToolResults: [],
-  finishReason: 'stop',
-  warnings: undefined,
-  providerMetadata: undefined,
-}));
-
-const
createAzureMock = mock((options: unknown) => { - const fn = () => ({ provider: 'azure', options, apiFormat: 'responses' }); - fn.chat = () => ({ provider: 'azure', options, apiFormat: 'chat' }); - fn.responses = () => ({ provider: 'azure', options, apiFormat: 'responses' }); - return fn; -}); -const createOpenAIMock = mock((options: unknown) => { - const fn = () => ({ provider: 'openai', options }); - fn.chat = () => ({ provider: 'openai', options }); - fn.responses = () => ({ provider: 'openai', options }); - return fn; -}); -const createOpenRouterMock = mock((options: unknown) => () => ({ - provider: 'openrouter', - options, -})); -const createAnthropicMock = mock(() => () => ({ provider: 'anthropic' })); -const createGeminiMock = mock(() => () => ({ provider: 'gemini' })); -const piCompleteMock = mock(async () => ({ +const piCompleteMock = mock(async (model: { provider: string }) => ({ content: [{ type: 'text', text: 'ok' }], usage: { input: 1, @@ -49,8 +11,8 @@ const piCompleteMock = mock(async () => ({ cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 }, }, api: 'openai-completions', - provider: 'openai', - model: 'gpt-test', + provider: model.provider, + model: 'mock', stopReason: 'stop', timestamp: Date.now(), role: 'assistant', @@ -70,35 +32,11 @@ const piGetModelMock = mock((provider: string, modelId: string) => ({ const piRegisterMock = mock(() => {}); mock.module('@mariozechner/pi-ai', () => ({ - complete: (...args: unknown[]) => piCompleteMock(...(args as [])), + complete: (...args: unknown[]) => piCompleteMock(...(args as [{ provider: string }])), getModel: (provider: string, modelId: string) => piGetModelMock(provider, modelId), registerBuiltInApiProviders: () => piRegisterMock(), })); -mock.module('ai', () => ({ - generateText: () => generateTextMock(), -})); - -mock.module('@ai-sdk/azure', () => ({ - createAzure: (options: unknown) => createAzureMock(options), -})); - -mock.module('@ai-sdk/openai', () => ({ - createOpenAI: (options: 
unknown) => createOpenAIMock(options),
-}));
-
-mock.module('@openrouter/ai-sdk-provider', () => ({
-  createOpenRouter: (options: unknown) => createOpenRouterMock(options),
-}));
-
-mock.module('@ai-sdk/anthropic', () => ({
-  createAnthropic: () => createAnthropicMock(),
-}));
-
-mock.module('@ai-sdk/google', () => ({
-  createGoogleGenerativeAI: () => createGeminiMock(),
-}));
-
 const providerModule = await import('../../../src/evaluation/providers/index.js');
 const { resolveDelegatedTargetDefinition, resolveTargetDefinition, createProvider } =
   providerModule;
@@ -145,7 +83,8 @@ describe('resolveDelegatedTargetDefinition', () => {
 
 describe('resolveTargetDefinition', () => {
   beforeEach(() => {
-    generateTextMock.mockClear();
+    piCompleteMock.mockClear();
+    piGetModelMock.mockClear();
   });
 
   it("throws when settings don't use ${{ }} syntax", () => {
@@ -1131,150 +1070,148 @@ describe('resolveTargetDefinition', () => {
 
 describe('createProvider', () => {
   beforeEach(() => {
-    generateTextMock.mockClear();
-    createAzureMock.mockClear();
-    createOpenAIMock.mockClear();
-    createOpenRouterMock.mockClear();
-    createAnthropicMock.mockClear();
-    createGeminiMock.mockClear();
+    piCompleteMock.mockClear();
+    piGetModelMock.mockClear();
   });
 
-  it('creates an azure provider that calls the Vercel AI SDK', async () => {
+  it('routes openai targets through pi-ai openai-completions', async () => {
     const env = {
-      AZURE_OPENAI_ENDPOINT: 'https://example.openai.azure.com',
-      AZURE_OPENAI_API_KEY: 'key',
-      AZURE_DEPLOYMENT_NAME: 'gpt-4o',
+      OPENAI_ENDPOINT: 'https://llm-gateway.example.com/v1',
+      OPENAI_API_KEY: 'openai-key',
+      OPENAI_MODEL: 'gpt-5.4',
     } satisfies Record<string, string>;
 
     const resolved = resolveTargetDefinition(
       {
-        name: 'azure-target',
-        provider: 'azure',
-        endpoint: '${{ AZURE_OPENAI_ENDPOINT }}',
-        api_key: '${{ AZURE_OPENAI_API_KEY }}',
-        model: '${{ AZURE_DEPLOYMENT_NAME }}',
+        name: 'openai-target',
+        provider: 'openai',
+        endpoint: '${{ OPENAI_ENDPOINT }}',
+        api_key: '${{ OPENAI_API_KEY }}',
+        model: '${{ OPENAI_MODEL }}',
       },
       env,
     );
 
     const provider = createProvider(resolved);
-    const response = await provider.invoke({ question: 'Hello' });
+    expect(provider.kind).toBe('openai');
 
-    expect(createAzureMock).toHaveBeenCalledTimes(1);
-    expect(createAzureMock.mock.calls[0]?.[0]).toMatchObject({ useDeploymentBasedUrls: true });
-    expect(provider.asLanguageModel()).toMatchObject({ apiFormat: 'chat' });
-    expect(generateTextMock).toHaveBeenCalledTimes(1);
+    const response = await provider.invoke({ question: 'Hello from OpenAI' });
+
+    expect(piGetModelMock).toHaveBeenCalledWith('openai', 'gpt-5.4');
+    expect(piCompleteMock).toHaveBeenCalledTimes(1);
     expect(extractLastAssistantContent(response.output)).toBe('ok');
   });
 
-  it('creates an azure provider using the responses api when requested', async () => {
+  it('routes openai targets with apiFormat=responses through pi-ai openai-responses', async () => {
     const env = {
-      AZURE_OPENAI_ENDPOINT: 'https://example.openai.azure.com',
-      AZURE_OPENAI_API_KEY: 'key',
-      AZURE_DEPLOYMENT_NAME: 'gpt-4o',
+      OPENAI_ENDPOINT: 'https://api.openai.com/v1',
+      OPENAI_API_KEY: 'k',
+      OPENAI_MODEL: 'gpt-5',
     } satisfies Record<string, string>;
-
     const resolved = resolveTargetDefinition(
       {
-        name: 'azure-responses-target',
-        provider: 'azure',
-        endpoint: '${{ AZURE_OPENAI_ENDPOINT }}',
-        api_key: '${{ AZURE_OPENAI_API_KEY }}',
-        model: '${{ AZURE_DEPLOYMENT_NAME }}',
+        name: 'openai-resp',
+        provider: 'openai',
+        endpoint: '${{ OPENAI_ENDPOINT }}',
+        api_key: '${{ OPENAI_API_KEY }}',
+        model: '${{ OPENAI_MODEL }}',
         api_format: 'responses',
       },
       env,
     );
-
     const provider = createProvider(resolved);
-    const response = await provider.invoke({ question: 'Hello' });
-
-    expect(createAzureMock).toHaveBeenCalledTimes(1);
-    expect(createAzureMock.mock.calls[0]?.[0]).toMatchObject({ useDeploymentBasedUrls: false });
-    expect(provider.asLanguageModel()).toMatchObject({ apiFormat: 'responses' });
-    expect(generateTextMock).toHaveBeenCalledTimes(1);
-    expect(extractLastAssistantContent(response.output)).toBe('ok');
+    await provider.invoke({ question: 'Hello' });
+    // The model passed to pi-ai's complete() should carry api='openai-responses'
+    const modelArg = piCompleteMock.mock.calls[0]?.[0] as { api: string };
+    expect(modelArg.api).toBe('openai-responses');
   });
-  it('creates a gemini provider that calls the Vercel AI SDK', async () => {
+
+  it('routes openrouter targets through pi-ai openai-completions with the OpenRouter baseUrl', async () => {
     const env = {
-      GOOGLE_API_KEY: 'gemini-key',
+      OPENROUTER_API_KEY: 'openrouter-key',
+      OPENROUTER_MODEL: 'openai/gpt-5-mini',
     } satisfies Record<string, string>;
-
     const resolved = resolveTargetDefinition(
       {
-        name: 'gemini-target',
-        provider: 'gemini',
-        api_key: '${{ GOOGLE_API_KEY }}',
+        name: 'openrouter-target',
+        provider: 'openrouter',
+        api_key: '${{ OPENROUTER_API_KEY }}',
+        model: '${{ OPENROUTER_MODEL }}',
      },
      env,
    );
-
     const provider = createProvider(resolved);
-    expect(provider.kind).toBe('gemini');
-    expect(provider.targetName).toBe('gemini-target');
-
-    const response = await provider.invoke({ question: 'Test prompt' });
+    expect(provider.kind).toBe('openrouter');
+    await provider.invoke({ question: 'Hello' });
 
-    expect(createGeminiMock).toHaveBeenCalled();
-    expect(generateTextMock).toHaveBeenCalled();
-    expect(extractLastAssistantContent(response.output)).toBe('ok');
+    expect(piGetModelMock).toHaveBeenCalledWith('openrouter', 'openai/gpt-5-mini');
+    const modelArg = piCompleteMock.mock.calls[0]?.[0] as { baseUrl: string };
+    expect(modelArg.baseUrl).toBe('https://openrouter.ai/api/v1');
   });
 
-  it('creates an openai provider that calls @mariozechner/pi-ai', async () => {
+  it('routes anthropic targets through pi-ai anthropic-messages and forwards thinkingBudget', async () => {
     const env = {
-      OPENAI_ENDPOINT: 'https://llm-gateway.example.com/v1',
-      OPENAI_API_KEY: 'openai-key',
-      OPENAI_MODEL: 'gpt-5.4',
+      ANTHROPIC_API_KEY: 'k',
+      ANTHROPIC_MODEL: 'claude-sonnet-4',
     } satisfies Record<string, string>;
-
     const resolved = resolveTargetDefinition(
       {
-        name: 'openai-target',
-        provider: 'openai',
-        endpoint: '${{ OPENAI_ENDPOINT }}',
-        api_key: '${{ OPENAI_API_KEY }}',
-        model: '${{ OPENAI_MODEL }}',
+        name: 'anthropic-target',
+        provider: 'anthropic',
+        api_key: '${{ ANTHROPIC_API_KEY }}',
+        model: '${{ ANTHROPIC_MODEL }}',
+        thinking_budget: 4096,
       },
       env,
     );
-
-    piCompleteMock.mockClear();
-    piGetModelMock.mockClear();
-
     const provider = createProvider(resolved);
-    expect(provider.kind).toBe('openai');
-
-    const response = await provider.invoke({ question: 'Hello from OpenAI' });
+    expect(provider.kind).toBe('anthropic');
+    await provider.invoke({ question: 'Hello' });
+
+    expect(piGetModelMock).toHaveBeenCalledWith('anthropic', 'claude-sonnet-4');
+    const callOptions = piCompleteMock.mock.calls[0]?.[2] as Record<string, unknown>;
+    expect(callOptions).toMatchObject({
+      thinkingEnabled: true,
+      thinkingBudgetTokens: 4096,
+    });
+  });
 
-    expect(piGetModelMock).toHaveBeenCalledWith('openai', 'gpt-5.4');
-    expect(piCompleteMock).toHaveBeenCalledTimes(1);
-    expect(extractLastAssistantContent(response.output)).toBe('ok');
+  it('routes gemini targets through pi-ai google-generative-ai', async () => {
+    const env = { GOOGLE_API_KEY: 'gemini-key' } satisfies Record<string, string>;
+    const resolved = resolveTargetDefinition(
+      { name: 'gemini-target', provider: 'gemini', api_key: '${{ GOOGLE_API_KEY }}' },
+      env,
+    );
+    const provider = createProvider(resolved);
+    expect(provider.kind).toBe('gemini');
+    await provider.invoke({ question: 'Hello' });
+    expect(piGetModelMock.mock.calls[0]?.[0]).toBe('google');
+  });
 
-  it('creates an openrouter provider that calls the Vercel AI SDK', async () => {
+  it('routes azure targets through pi-ai azure-openai-responses and forwards azureBaseUrl', async () => {
     const env = {
-      OPENROUTER_API_KEY: 'openrouter-key',
-      OPENROUTER_MODEL: 'openai/gpt-5-mini',
+      AZURE_OPENAI_ENDPOINT: 'https://example.openai.azure.com',
+      AZURE_OPENAI_API_KEY: 'key',
+      AZURE_DEPLOYMENT_NAME: 'gpt-4o',
     } satisfies Record<string, string>;
-
     const resolved = resolveTargetDefinition(
       {
-        name: 'openrouter-target',
-        provider: 'openrouter',
-        api_key: '${{ OPENROUTER_API_KEY }}',
-        model: '${{ OPENROUTER_MODEL }}',
+        name: 'azure-target',
+        provider: 'azure',
+        endpoint: '${{ AZURE_OPENAI_ENDPOINT }}',
+        api_key: '${{ AZURE_OPENAI_API_KEY }}',
+        model: '${{ AZURE_DEPLOYMENT_NAME }}',
      },
      env,
    );
-
     const provider = createProvider(resolved);
-    expect(provider.kind).toBe('openrouter');
+    await provider.invoke({ question: 'Hello' });
 
-    const response = await provider.invoke({ question: 'Hello from OpenRouter' });
-
-    expect(createOpenRouterMock).toHaveBeenCalledTimes(1);
-    expect(generateTextMock).toHaveBeenCalledTimes(1);
-    expect(extractLastAssistantContent(response.output)).toBe('ok');
+    expect(piGetModelMock).toHaveBeenCalledWith('azure-openai-responses', 'gpt-4o');
+    const callOptions = piCompleteMock.mock.calls[0]?.[2] as Record<string, unknown>;
+    expect(callOptions).toMatchObject({
+      azureBaseUrl: 'https://example.openai.azure.com/openai/v1',
+    });
   });
 
   it('resolves pi-coding-agent with azure subprovider and base_url', () => {

From 5b31b75bbb64149d9af6d6fecb9939a4aadf6825 Mon Sep 17 00:00:00 2001
From: Christopher Tso
Date: Sun, 3 May 2026 14:36:30 +0200
Subject: [PATCH 12/20] =?UTF-8?q?chore(core):=20rename=20ai-sdk.ts=20?=
 =?UTF-8?q?=E2=86=92=20llm-providers.ts;=20add=20pi-ai-shim=20sync=20check?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two cleanups closing out the #1205 migration:

1. Rename providers/ai-sdk.ts → providers/llm-providers.ts. The file is
   no longer the Vercel AI SDK adapter; it owns the five direct-API LLM
   provider classes (OpenAI, OpenRouter, Anthropic, Gemini, Azure) and
   delegates to pi-ai. Keeping the old name was misleading. The
   `llm-providers.ts` name also distinguishes these classes from the
   agent providers (claude.ts, codex.ts, etc.)
in the same directory. Updated callers in agentv-provider.ts and providers/index.ts. 2. Add scripts/check-pi-ai-shim.ts + a pre-push prek hook + bun script alias. The shim re-declares pi-ai's public surface so our static imports resolve under NodeNext (pi-ai's cross-module re-exports don't bubble up through `export * from`). If pi-ai ships a breaking change — renamed field, removed function — TypeScript stays happy against the shim while the runtime drifts. The check parses both d.ts files (regex + brace counting), confirms every interface name + field name in our shim exists upstream, and likewise for exported function names. Field types are not compared — too much surface for too little value; type-level breakage would surface in llm-providers.ts compilation, and runtime presence is exercised by the unit-test suite. Wired into .pre-commit-config.yaml as `check-pi-ai-shim` (pre-push) and exposed as `bun run check:pi-ai-shim` for manual runs. Verified the failure path by injecting a fake field into the shim — the script exits non-zero with a clear "interface X declares field Y not in upstream" message. 
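The regex-plus-brace-counting parse the check relies on can be sketched roughly as
below. This is an illustrative sketch only — `listInterfaceNames` is a hypothetical
helper, not the shipped code; scripts/check-pi-ai-shim.ts remains the source of truth:

```typescript
// Sketch (assumption: simplified relative to the real script). Find each
// `export interface Name {` header with a regex, then brace-count forward to
// the matching close so nested type literals don't end the body scan early.
function listInterfaceNames(source: string): string[] {
  const names: string[] = [];
  const header = /export\s+interface\s+(\w+)[^{]*\{/g;
  let m: RegExpExecArray | null;
  while ((m = header.exec(source)) !== null) {
    names.push(m[1]);
    let depth = 1; // counts the `{` the header regex just consumed
    let i = m.index + m[0].length;
    while (i < source.length && depth > 0) {
      if (source[i] === '{') depth++;
      else if (source[i] === '}') depth--;
      i++;
    }
    header.lastIndex = i; // resume scanning after the interface body
  }
  return names;
}
```

Diffing the shim's name set against the upstream set produced this way is what
turns silent runtime drift into a failing pre-push check.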
Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .pre-commit-config.yaml | 7 + .../results/export-e2e-providers.test.ts | 2 +- package.json | 1 + .../evaluation/providers/agentv-provider.ts | 6 +- .../core/src/evaluation/providers/index.ts | 14 +- .../providers/{ai-sdk.ts => llm-providers.ts} | 7 +- .../src/evaluation/providers/pi-ai-shim.d.ts | 4 + scripts/check-pi-ai-shim.ts | 199 ++++++++++++++++++ 8 files changed, 224 insertions(+), 16 deletions(-) rename packages/core/src/evaluation/providers/{ai-sdk.ts => llm-providers.ts} (98%) create mode 100644 scripts/check-pi-ai-shim.ts diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index d6e1bec8..effe728e 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -39,3 +39,10 @@ repos: language: system pass_filenames: false stages: [pre-push] + + - id: check-pi-ai-shim + name: Validate pi-ai shim sync + entry: bun run check:pi-ai-shim + language: system + pass_filenames: false + stages: [pre-push] diff --git a/apps/cli/test/commands/results/export-e2e-providers.test.ts b/apps/cli/test/commands/results/export-e2e-providers.test.ts index 5cd562b4..2d8cd1df 100644 --- a/apps/cli/test/commands/results/export-e2e-providers.test.ts +++ b/apps/cli/test/commands/results/export-e2e-providers.test.ts @@ -3,7 +3,7 @@ * * Validates that reasoning tokens, cached tokens, duration, cost, * and other metrics survive the JSONL → artifact conversion pipeline - * for: claude-cli, codex, copilot-cli, pi-coding-agent, and llm (ai-sdk). + * for: claude-cli, codex, copilot-cli, pi-coding-agent, and llm (pi-ai). 
*/ import { afterEach, beforeEach, describe, expect, it } from 'bun:test'; import { existsSync, mkdtempSync, readFileSync, rmSync } from 'node:fs'; diff --git a/package.json b/package.json index 19b26f17..fa4c3477 100644 --- a/package.json +++ b/package.json @@ -19,6 +19,7 @@ "agentv": "bun apps/cli/src/cli.ts", "agentv:buildrun": "bun run build && bun apps/cli/dist/cli.js", "validate:examples": "EVAL_CRITERIA=placeholder CUSTOM_SYSTEM_PROMPT=placeholder bun scripts/validate-example-evals.ts", + "check:pi-ai-shim": "bun scripts/check-pi-ai-shim.ts", "eval:baseline-check": "bun scripts/check-eval-baselines.ts", "release": "bun scripts/release.ts", "release:next": "bun scripts/release.ts next", diff --git a/packages/core/src/evaluation/providers/agentv-provider.ts b/packages/core/src/evaluation/providers/agentv-provider.ts index 7097524a..b89d1523 100644 --- a/packages/core/src/evaluation/providers/agentv-provider.ts +++ b/packages/core/src/evaluation/providers/agentv-provider.ts @@ -1,4 +1,4 @@ -import { invokePiAi, resolvePiModel } from './ai-sdk.js'; +import { invokePiAi, resolvePiModel } from './llm-providers.js'; import type { AgentVResolvedConfig } from './targets.js'; import type { Provider, ProviderRequest, ProviderResponse } from './types.js'; @@ -39,8 +39,8 @@ export class AgentvProvider implements Provider { } /** - * Parse `provider:model` into the pi-ai routing fields. Each ai-sdk-style - * provider name maps to a pi-ai (providerName, apiId) pair: + * Parse `provider:model` into the pi-ai routing fields. 
Each provider
+ * shorthand maps to a pi-ai (providerName, apiId) pair:
  *
  *   openai:<model>     → ('openai', 'openai-completions')
  *   anthropic:<model>  → ('anthropic', 'anthropic-messages')
diff --git a/packages/core/src/evaluation/providers/index.ts b/packages/core/src/evaluation/providers/index.ts
index 1c7cee7d..c7b18bda 100644
--- a/packages/core/src/evaluation/providers/index.ts
+++ b/packages/core/src/evaluation/providers/index.ts
@@ -1,11 +1,4 @@
 import { AgentvProvider } from './agentv-provider.js';
-import {
-  AnthropicProvider,
-  AzureProvider,
-  GeminiProvider,
-  OpenAIProvider,
-  OpenRouterProvider,
-} from './ai-sdk.js';
 import { ClaudeCliProvider } from './claude-cli.js';
 import { ClaudeSdkProvider } from './claude-sdk.js';
 import { ClaudeProvider } from './claude.js';
@@ -14,6 +7,13 @@ import { CodexProvider } from './codex.js';
 import { CopilotCliProvider } from './copilot-cli.js';
 import { CopilotLogProvider } from './copilot-log.js';
 import { CopilotSdkProvider } from './copilot-sdk.js';
+import {
+  AnthropicProvider,
+  AzureProvider,
+  GeminiProvider,
+  OpenAIProvider,
+  OpenRouterProvider,
+} from './llm-providers.js';
 import { MockProvider } from './mock.js';
 import { PiCliProvider } from './pi-cli.js';
 import { PiCodingAgentProvider } from './pi-coding-agent.js';
diff --git a/packages/core/src/evaluation/providers/ai-sdk.ts b/packages/core/src/evaluation/providers/llm-providers.ts
similarity index 98%
rename from packages/core/src/evaluation/providers/ai-sdk.ts
rename to packages/core/src/evaluation/providers/llm-providers.ts
index df81c4f4..823c7c7a 100644
--- a/packages/core/src/evaluation/providers/ai-sdk.ts
+++ b/packages/core/src/evaluation/providers/llm-providers.ts
@@ -8,7 +8,6 @@
  * 2. Implements invoke() by delegating to invokePiAi(), which runs the
  *    stateless single-shot path or the multi-step agent loop depending on
  *    whether the request carries `tools`.
- * 3. Holds no Vercel AI SDK references.
  *
  * To add a new provider:
  * 1.
Add a config interface in targets.ts. @@ -17,9 +16,6 @@ * truth for `providerName`; pi-ai's KnownApi list is the source of * truth for `apiId`. * 3. Register it in providers/index.ts. - * - * File name: kept as ai-sdk.ts for now to minimize diff churn during the - * pi-ai migration — rename in a follow-up once the dust settles. */ import { @@ -630,7 +626,8 @@ function mapPiResponse( // pi-ai always populates `cost.total`, but it computes 0 when the model // descriptor lacks pricing (fallback descriptor for unknown ids, or pi-ai's // registry simply not having rates yet). Surface 0 as "unknown" by leaving - // costUsd undefined — matches the legacy ai-sdk path, which never set it. + // costUsd undefined — keeps parity with consumers that previously saw it + // unset. const costUsd = timing.aggregateUsage.cost > 0 ? timing.aggregateUsage.cost : undefined; return { diff --git a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts index 41464eba..0885477b 100644 --- a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts +++ b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts @@ -8,6 +8,10 @@ // // Keep this minimal: only what we actively call. Mirror the upstream shape // from node_modules/.bun/@mariozechner+pi-ai@*/dist/*.d.ts. +// +// `bun run check:pi-ai-shim` (also runs on pre-push) compares this file +// against pi-ai's published types and fails when interfaces or fields drift. +// Run it locally after editing this shim or bumping pi-ai. declare module '@mariozechner/pi-ai' { // ---- types/types.d.ts ---- diff --git a/scripts/check-pi-ai-shim.ts b/scripts/check-pi-ai-shim.ts new file mode 100644 index 00000000..cd6c654d --- /dev/null +++ b/scripts/check-pi-ai-shim.ts @@ -0,0 +1,199 @@ +/** + * check-pi-ai-shim.ts + * + * Validates that packages/core/src/evaluation/providers/pi-ai-shim.d.ts stays + * structurally compatible with the published types of @mariozechner/pi-ai. 
+ *
+ * The shim re-declares pi-ai's public surface so our static imports resolve
+ * (pi-ai's published d.ts has cross-module re-exports that don't surface
+ * under NodeNext). If pi-ai ships a breaking change — renamed field, removed
+ * function — the shim stays valid TypeScript while our runtime drifts.
+ * This script catches that drift.
+ *
+ * Checks performed:
+ *   - Every interface declared in the shim exists in pi-ai's published .d.ts
+ *     files, and every field name we declare is also declared upstream.
+ *   - Every function declared in the shim is exported by pi-ai's d.ts.
+ *
+ * Field types are NOT compared — too much surface and rarely the source of
+ * silent drift. Type-level breakage would surface as a TypeScript error in
+ * llm-providers.ts; the unit-test suite covers runtime export presence.
+ *
+ * Usage:
+ *   bun scripts/check-pi-ai-shim.ts
+ *
+ * Wired into the pre-push hook (see .pre-commit-config.yaml).
+ */
+
+import { existsSync, readFileSync, readdirSync } from 'node:fs';
+import { dirname, join, resolve } from 'node:path';
+
+// ---------------------------------------------------------------------------
+// Locate pi-ai's installed dist directory and our shim source.
+//
+// Bun's package layout keeps each version under
+// node_modules/.bun/<scope>+<name>@<version>/ rather than hoisting to
+// node_modules/<name>/. require.resolve from this
+// script's location can't reach it (we're not inside packages/core's resolution
+// path). Walk node_modules/.bun directly — first match wins, since we only
+// install one pi-ai version.
+// --------------------------------------------------------------------------- + +function findPiAiDistDir(): string { + const bunDir = resolve('node_modules/.bun'); + if (!existsSync(bunDir)) { + throw new Error(`node_modules/.bun does not exist at ${bunDir} — run \`bun install\`?`); + } + for (const entry of readdirSync(bunDir)) { + if (entry.startsWith('@mariozechner+pi-ai@')) { + const dist = join(bunDir, entry, 'node_modules', '@mariozechner', 'pi-ai', 'dist'); + if (existsSync(dist)) return dist; + } + } + throw new Error('Could not locate @mariozechner/pi-ai under node_modules/.bun.'); +} + +const piAiDistDir = findPiAiDistDir(); +const shimPath = resolve('packages/core/src/evaluation/providers/pi-ai-shim.d.ts'); + +// --------------------------------------------------------------------------- +// Read all .d.ts files under pi-ai/dist into one concatenated source string. +// Pi-ai re-exports across modules; concatenating lets us search for any +// declaration regardless of which file it lives in. +// --------------------------------------------------------------------------- + +function readDtsRecursive(dir: string): string { + const parts: string[] = []; + for (const entry of readdirSync(dir, { withFileTypes: true })) { + const path = join(dir, entry.name); + if (entry.isDirectory()) { + parts.push(readDtsRecursive(path)); + } else if (entry.name.endsWith('.d.ts')) { + parts.push(readFileSync(path, 'utf8')); + } + } + return parts.join('\n'); +} + +const upstreamSource = readDtsRecursive(piAiDistDir); +const shimSource = readFileSync(shimPath, 'utf8'); + +// --------------------------------------------------------------------------- +// Lightweight d.ts parser: extract interface names + their top-level field +// names, and exported function names. Not a full TS parser — enough for the +// shapes we care about. Uses brace-counting so multi-line bodies and nested +// type literals don't trip it. 
+// --------------------------------------------------------------------------- + +function extractInterfaces(source: string): Map<string, Set<string>> { + const interfaces = new Map<string, Set<string>>(); + // Match: `export interface Name(?: extends ...)? {` + const startPattern = /export\s+interface\s+(\w+)(?:<[^>]+>)?(?:\s+extends\s+[^{]+)?\s*\{/g; + let m: RegExpExecArray | null; + while (true) { + m = startPattern.exec(source); + if (!m) break; + const name = m[1]; + const bodyStart = m.index + m[0].length; + let depth = 1; + let i = bodyStart; + while (i < source.length && depth > 0) { + const c = source[i]; + if (c === '{') depth++; + else if (c === '}') depth--; + i++; + } + const body = source.slice(bodyStart, i - 1); + interfaces.set(name, extractTopLevelFieldNames(body)); + } + return interfaces; +} + +function extractTopLevelFieldNames(body: string): Set<string> { + const fields = new Set<string>(); + let depth = 0; + let lineStart = 0; + // Walk the body splitting on `;` `,` and newlines but only at depth 0 so + // nested type literals (e.g. `cost: { input: number; ... }`) stay together.
+ for (let j = 0; j <= body.length; j++) { + const c = body[j]; + if (c === '{' || c === '[' || c === '(') depth++; + else if (c === '}' || c === ']' || c === ')') depth--; + else if ((c === '\n' || c === ';' || c === ',' || j === body.length) && depth === 0) { + const line = body.slice(lineStart, j).trim(); + const fieldMatch = line.match(/^(?:readonly\s+)?(\w+)\s*\??\s*:/); + if (fieldMatch) fields.add(fieldMatch[1]); + lineStart = j + 1; + } + } + return fields; +} + +function extractFunctions(source: string): Set<string> { + const fns = new Set<string>(); + const re = /export\s+(?:declare\s+)?function\s+(\w+)\s*[(<]/g; + let m: RegExpExecArray | null; + while (true) { + m = re.exec(source); + if (!m) break; + fns.add(m[1]); + } + return fns; +} + +// --------------------------------------------------------------------------- +// Run the checks +// --------------------------------------------------------------------------- + +const errors: string[] = []; + +// Type structure +const shimInterfaces = extractInterfaces(shimSource); +const upstreamInterfaces = extractInterfaces(upstreamSource); + +for (const [name, fields] of shimInterfaces) { + const upstreamFields = upstreamInterfaces.get(name); + if (!upstreamFields) { + errors.push( + `interface '${name}' is declared in pi-ai-shim.d.ts but not found in pi-ai's published types`, + ); + continue; + } + for (const field of fields) { + if (!upstreamFields.has(field)) { + errors.push(`interface '${name}': shim declares field '${field}' that is not in upstream`); + } + } +} + +const shimFns = extractFunctions(shimSource); +const upstreamFns = extractFunctions(upstreamSource); +for (const fn of shimFns) { + if (!upstreamFns.has(fn)) { + errors.push( + `function '${fn}' is declared in pi-ai-shim.d.ts but not in pi-ai's published types`, + ); + } +} + +// --------------------------------------------------------------------------- +// Report +// --------------------------------------------------------------------------- + +if
(errors.length > 0) { + console.error( + 'pi-ai-shim drift detected. Update packages/core/src/evaluation/providers/pi-ai-shim.d.ts to match pi-ai:', + ); + console.error(''); + for (const e of errors) { + console.error(` ✗ ${e}`); + } + console.error(''); + console.error(`pi-ai d.ts location: ${piAiDistDir}`); + process.exit(1); +} + +const interfaceCount = shimInterfaces.size; +const fnCount = shimFns.size; +console.log( + `✓ pi-ai-shim is in sync with @mariozechner/pi-ai (${interfaceCount} interfaces, ${fnCount} functions checked)`, +); From 64aeab9cb524d7bf3238ad25cf3c74bf58fc3975 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 16:45:18 +0200 Subject: [PATCH 13/20] fix(core): root-cause pi-ai type resolution; delete shim MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The pi-ai-shim.d.ts wasn't working around a pi-ai bug — it was working around a stale `declare module '@mariozechner/pi-ai'` block in our own src/types/pi-sdk.d.ts that declared just `getModel(...): unknown`. That stub was added when pi-ai was an optional peer-dep accessed via dynamic import in pi-coding-agent.ts. When pi-ai became a direct dep with its own published types, the stub started colliding: TypeScript merged our `declare module` block with the real one and shadowed/dropped most of pi-ai's exports (complete, Model, AssistantMessage, ...) — but only when the full src/ tree was compiled, which is why it didn't reproduce in a minimal project. Confirmed the diagnosis by removing the stub block and watching pi-ai's imports resolve cleanly with no other changes. The pi-ai-shim.d.ts and the @sinclair/typebox direct dep we added were both unnecessary workarounds for this self-inflicted issue. Changes: - src/types/pi-sdk.d.ts: drop the `declare module '@mariozechner/pi-ai'` block entirely. Keep the pi-coding-agent block (still a real optional peer-dep stub). Header comment now warns against re-adding a pi-ai block. 
- src/evaluation/providers/pi-ai-shim.d.ts: deleted. - src/evaluation/providers/llm-providers.ts: import pi-ai's real types. Add boundary casts where pi-ai's typed registry meets our runtime strings (PiKnownProvider for getModel's provider arg, `as never` for modelId, `as unknown as PiTool[]` for our JSON-Schema tools fed into pi-ai's TypeBox-typed parameters slot — pi-ai's openai-completions converter passes parameters through as JSON Schema unchanged). - packages/core/package.json: drop @sinclair/typebox direct dep. - scripts/check-pi-ai-shim.ts: deleted (no shim to validate). - .pre-commit-config.yaml: drop the check-pi-ai-shim hook. - package.json: drop the check:pi-ai-shim script. Verified: typecheck / lint / 1741 unit tests / live UAT through OpenRouter all green with no shim and pi-ai's real types in use. Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .pre-commit-config.yaml | 7 - bun.lock | 1 - package.json | 1 - packages/core/package.json | 1 - .../src/evaluation/providers/llm-providers.ts | 27 ++- .../src/evaluation/providers/pi-ai-shim.d.ts | 135 ------------ packages/core/src/types/pi-sdk.d.ts | 13 +- scripts/check-pi-ai-shim.ts | 199 ------------------ 8 files changed, 29 insertions(+), 355 deletions(-) delete mode 100644 packages/core/src/evaluation/providers/pi-ai-shim.d.ts delete mode 100644 scripts/check-pi-ai-shim.ts diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index effe728e..d6e1bec8 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -39,10 +39,3 @@ repos: language: system pass_filenames: false stages: [pre-push] - - - id: check-pi-ai-shim - name: Validate pi-ai shim sync - entry: bun run check:pi-ai-shim - language: system - pass_filenames: false - stages: [pre-push] diff --git a/bun.lock b/bun.lock index 029ea860..f099d396 100644 --- a/bun.lock +++ b/bun.lock @@ -91,7 +91,6 @@ "@github/copilot-sdk": "^0.1.25", "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", - "@sinclair/typebox": 
"^0.34.41", "fast-glob": "^3.3.3", "json5": "^2.2.3", "micromatch": "^4.0.8", diff --git a/package.json b/package.json index fa4c3477..19b26f17 100644 --- a/package.json +++ b/package.json @@ -19,7 +19,6 @@ "agentv": "bun apps/cli/src/cli.ts", "agentv:buildrun": "bun run build && bun apps/cli/dist/cli.js", "validate:examples": "EVAL_CRITERIA=placeholder CUSTOM_SYSTEM_PROMPT=placeholder bun scripts/validate-example-evals.ts", - "check:pi-ai-shim": "bun scripts/check-pi-ai-shim.ts", "eval:baseline-check": "bun scripts/check-eval-baselines.ts", "release": "bun scripts/release.ts", "release:next": "bun scripts/release.ts next", diff --git a/packages/core/package.json b/packages/core/package.json index 40cc1524..a6b30d85 100644 --- a/packages/core/package.json +++ b/packages/core/package.json @@ -45,7 +45,6 @@ "@github/copilot-sdk": "^0.1.25", "@mariozechner/pi-ai": "^0.62.0", "@openai/codex-sdk": "^0.104.0", - "@sinclair/typebox": "^0.34.41", "fast-glob": "^3.3.3", "json5": "^2.2.3", "micromatch": "^4.0.8", diff --git a/packages/core/src/evaluation/providers/llm-providers.ts b/packages/core/src/evaluation/providers/llm-providers.ts index 823c7c7a..8b8ce269 100644 --- a/packages/core/src/evaluation/providers/llm-providers.ts +++ b/packages/core/src/evaluation/providers/llm-providers.ts @@ -19,9 +19,11 @@ */ import { + type Api as PiApi, type AssistantMessage as PiAssistantMessage, + type KnownProvider as PiKnownProvider, type Message as PiMessage, - type Model as PiModel, + type Model as PiModelBase, type Tool as PiTool, type ToolCall as PiToolCall, complete as piComplete, @@ -29,6 +31,11 @@ import { registerBuiltInApiProviders, } from '@mariozechner/pi-ai'; +// Pi-ai's `Model` is generic over the api id. Every site that passes a +// model around treats it as `Model<Api>` (the runtime-string variant), so +// alias once here. +type PiModel = PiModelBase<PiApi>; + +// pi-ai routes complete()/stream() by Model.api; the built-in providers must be +// registered once at module load.
Cheap; idempotent across repeated imports. registerBuiltInApiProviders(); @@ -332,12 +339,17 @@ export async function invokePiAi(options: InvokePiAiOptions): Promise<…> if (request.images.length > 0) { attachImagesToLastUserMessage(messages, request.images); } + // Pi-ai's `Tool.parameters` is typed as a TypeBox `TSchema` (Symbol-branded + // for TS-level inference), but at runtime its OpenAI-completions converter + // forwards `parameters` to the wire format unchanged — see pi-ai's + // openai-completions.js convertTools(): "TypeBox already generates JSON + // Schema". We pass plain JSON Schema and cast at the boundary. const piTools: PiTool[] | undefined = tools - ? tools.map((t) => ({ + ? (tools.map((t) => ({ name: t.name, description: t.description, parameters: t.parameters, - })) + })) as unknown as PiTool[]) : undefined; const ctx = { systemPrompt, messages, ...(piTools ? { tools: piTools } : {}) }; const { temperature, maxOutputTokens } = resolveModelSettings(request, defaults); @@ -444,12 +456,13 @@ export function resolvePiModel(args: { }): PiModel { const { providerName, apiId, modelId, baseUrl } = args; - // pi-ai's getModel returns a Model when (provider, modelId) is in its - // registry; otherwise we synthesize a minimal descriptor — every field is - // required by the Model interface. + // pi-ai's getModel is generic over a typed registry of (provider, modelId) + // pairs; runtime strings need a cast at the boundary. Returns a Model when + // the pair is in its registry, throws otherwise; we synthesize a minimal + // descriptor below for unknown pairs (custom gateways, Azure deployments).
let model: PiModel | undefined; try { - model = piGetModel(providerName, modelId) as PiModel; + model = piGetModel(providerName as PiKnownProvider, modelId as never) as PiModel; } catch { model = undefined; } diff --git a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts b/packages/core/src/evaluation/providers/pi-ai-shim.d.ts deleted file mode 100644 index 0885477b..00000000 --- a/packages/core/src/evaluation/providers/pi-ai-shim.d.ts +++ /dev/null @@ -1,135 +0,0 @@ -// Augments '@mariozechner/pi-ai' types with the subset we use. -// Pi-ai's published d.ts has cross-module re-exports (`export * from`, -// `export { X } from`) that TypeScript's NodeNext resolution does not surface -// at the top-level — only direct primary declarations make it through (e.g. -// `getModel` from models.d.ts is fine; `complete` from stream.d.ts isn't). -// This shim re-declares the surface we depend on so our code can use plain -// static imports and real types instead of dynamic-import + any casts. -// -// Keep this minimal: only what we actively call. Mirror the upstream shape -// from node_modules/.bun/@mariozechner+pi-ai@*/dist/*.d.ts. -// -// `bun run check:pi-ai-shim` (also runs on pre-push) compares this file -// against pi-ai's published types and fails when interfaces or fields drift. -// Run it locally after editing this shim or bumping pi-ai. - -declare module '@mariozechner/pi-ai' { - // ---- types/types.d.ts ---- - export type Api = string; - export type KnownProvider = string; - export type Provider = string; - export type ThinkingLevel = 'minimal' | 'low' | 'medium' | 'high' | 'xhigh'; - - export interface TextContent { - type: 'text'; - text: string; - } - export interface ThinkingContent { - type: 'thinking'; - thinking: string; - } - export interface ImageContent { - type: 'image'; - /** Base64 data, data URL, or absolute URL. */ - data: string; - /** MIME type, e.g. "image/png". 
*/ - mimeType: string; - } - export interface ToolCall { - type: 'toolCall'; - id: string; - name: string; - arguments: unknown; - thoughtSignature?: string; - } - - export interface Usage { - input: number; - output: number; - cacheRead: number; - cacheWrite: number; - totalTokens: number; - cost: { - input: number; - output: number; - cacheRead: number; - cacheWrite: number; - total: number; - }; - } - - export interface UserMessage { - role: 'user'; - content: string | Array<TextContent | ImageContent>; - timestamp: number; - } - export interface AssistantMessage { - role: 'assistant'; - content: Array<TextContent | ThinkingContent | ToolCall>; - api: Api; - provider: Provider; - model: string; - usage: Usage; - stopReason: 'stop' | 'length' | 'toolUse' | 'error' | 'aborted'; - timestamp: number; - } - export interface ToolResultMessage { - role: 'toolResult'; - toolCallId: string; - toolName: string; - content: Array<TextContent | ImageContent>; - isError: boolean; - timestamp: number; - } - export type Message = UserMessage | AssistantMessage | ToolResultMessage; - - export interface Model { - id: string; - name: string; - api: Api; - provider: Provider; - baseUrl: string; - reasoning: boolean; - input: ReadonlyArray<'text' | 'image'>; - cost: { input: number; output: number; cacheRead: number; cacheWrite: number }; - contextWindow: number; - maxTokens: number; - } - - /** - * Pi-ai's Tool wraps a TypeBox schema; we send JSON Schema directly via the - * adapter, so the relaxed `parameters: object` here lets us pass plain - * JSON-Schema objects without round-tripping through TypeBox builders. Pi-ai - * forwards `parameters` to the provider's wire format unchanged (it - * stringifies it for OpenAI completions, etc.) so this is safe at runtime.
- */ - export interface Tool { - name: string; - description: string; - parameters: object; - } - - export interface Context { - systemPrompt?: string; - messages: Message[]; - tools?: Tool[]; - } - - export interface StreamOptions { - temperature?: number; - maxTokens?: number; - apiKey?: string; - signal?: AbortSignal; - headers?: Record<string, string>; - } - - // ---- stream.d.ts ---- - export function complete( - model: Model, - context: Context, - options?: StreamOptions, - ): Promise<AssistantMessage>; - - // ---- providers/register-builtins.d.ts ---- - export function registerBuiltInApiProviders(): void; -} diff --git a/packages/core/src/types/pi-sdk.d.ts b/packages/core/src/types/pi-sdk.d.ts index a1fbafeb..dbff7b91 100644 --- a/packages/core/src/types/pi-sdk.d.ts +++ b/packages/core/src/types/pi-sdk.d.ts @@ -1,3 +1,12 @@ +// pi-coding-agent is an optional peerDependency (loaded lazily by +// pi-coding-agent.ts when the user explicitly opts in to pi as an agent +// target). It is not always installed, so we declare a minimal type stub +// here to keep TypeScript happy in the common path. +// +// Do NOT add a parallel `declare module '@mariozechner/pi-ai'` block — +// pi-ai is a regular dependency with proper published types, and a stub +// here would shadow them and break named imports. + declare module '@mariozechner/pi-coding-agent' { interface PiEvent { type: string; @@ -33,7 +42,3 @@ declare module '@mariozechner/pi-coding-agent' { }; }>; } - -declare module '@mariozechner/pi-ai' { - export function getModel(...args: unknown[]): unknown; -} diff --git a/scripts/check-pi-ai-shim.ts b/scripts/check-pi-ai-shim.ts deleted file mode 100644 index cd6c654d..00000000 --- a/scripts/check-pi-ai-shim.ts +++ /dev/null @@ -1,199 +0,0 @@ -/** - * check-pi-ai-shim.ts - * - * Validates that packages/core/src/evaluation/providers/pi-ai-shim.d.ts stays - * structurally compatible with the published types of @mariozechner/pi-ai.
- * - * The shim re-declares pi-ai's public surface so our static imports resolve - * (pi-ai's published d.ts has cross-module re-exports that don't surface - * under NodeNext). If pi-ai ships a breaking change — renamed field, removed - * function — the shim stays valid TypeScript while our runtime drifts. - * This script catches that drift. - * - * Checks performed: - * - Every interface declared in the shim exists in pi-ai's published .d.ts - * files, and every field name we declare is also declared upstream. - * - Every function declared in the shim is exported by pi-ai's d.ts. - * - * Field types are NOT compared — too much surface and rarely the source of - * silent drift. Type-level breakage would surface as a TypeScript error in - * llm-providers.ts; the unit-test suite covers runtime export presence. - * - * Usage: - * bun scripts/check-pi-ai-shim.ts - * - * Wired into the pre-push hook (see .pre-commit-config.yaml). - */ - -import { existsSync, readFileSync, readdirSync } from 'node:fs'; -import { dirname, join, resolve } from 'node:path'; - -// --------------------------------------------------------------------------- -// Locate pi-ai's installed dist directory and our shim source. -// -// Bun's package layout keeps each version under node_modules/.bun/@<scope>+<name@version>/ -// rather than hoisting to node_modules/<pkg>/. require.resolve from this -// script's location can't reach it (we're not inside packages/core's resolution -// path). Walk node_modules/.bun directly — first match wins, since we only -// install one pi-ai version.
-// --------------------------------------------------------------------------- - -function findPiAiDistDir(): string { - const bunDir = resolve('node_modules/.bun'); - if (!existsSync(bunDir)) { - throw new Error(`node_modules/.bun does not exist at ${bunDir} — run \`bun install\`?`); - } - for (const entry of readdirSync(bunDir)) { - if (entry.startsWith('@mariozechner+pi-ai@')) { - const dist = join(bunDir, entry, 'node_modules', '@mariozechner', 'pi-ai', 'dist'); - if (existsSync(dist)) return dist; - } - } - throw new Error('Could not locate @mariozechner/pi-ai under node_modules/.bun.'); -} - -const piAiDistDir = findPiAiDistDir(); -const shimPath = resolve('packages/core/src/evaluation/providers/pi-ai-shim.d.ts'); - -// --------------------------------------------------------------------------- -// Read all .d.ts files under pi-ai/dist into one concatenated source string. -// Pi-ai re-exports across modules; concatenating lets us search for any -// declaration regardless of which file it lives in. -// --------------------------------------------------------------------------- - -function readDtsRecursive(dir: string): string { - const parts: string[] = []; - for (const entry of readdirSync(dir, { withFileTypes: true })) { - const path = join(dir, entry.name); - if (entry.isDirectory()) { - parts.push(readDtsRecursive(path)); - } else if (entry.name.endsWith('.d.ts')) { - parts.push(readFileSync(path, 'utf8')); - } - } - return parts.join('\n'); -} - -const upstreamSource = readDtsRecursive(piAiDistDir); -const shimSource = readFileSync(shimPath, 'utf8'); - -// --------------------------------------------------------------------------- -// Lightweight d.ts parser: extract interface names + their top-level field -// names, and exported function names. Not a full TS parser — enough for the -// shapes we care about. Uses brace-counting so multi-line bodies and nested -// type literals don't trip it. 
-// --------------------------------------------------------------------------- - -function extractInterfaces(source: string): Map<string, Set<string>> { - const interfaces = new Map<string, Set<string>>(); - // Match: `export interface Name(?: extends ...)? {` - const startPattern = /export\s+interface\s+(\w+)(?:<[^>]+>)?(?:\s+extends\s+[^{]+)?\s*\{/g; - let m: RegExpExecArray | null; - while (true) { - m = startPattern.exec(source); - if (!m) break; - const name = m[1]; - const bodyStart = m.index + m[0].length; - let depth = 1; - let i = bodyStart; - while (i < source.length && depth > 0) { - const c = source[i]; - if (c === '{') depth++; - else if (c === '}') depth--; - i++; - } - const body = source.slice(bodyStart, i - 1); - interfaces.set(name, extractTopLevelFieldNames(body)); - } - return interfaces; -} - -function extractTopLevelFieldNames(body: string): Set<string> { - const fields = new Set<string>(); - let depth = 0; - let lineStart = 0; - // Walk the body splitting on `;` `,` and newlines but only at depth 0 so - // nested type literals (e.g. `cost: { input: number; ... }`) stay together.
- for (let j = 0; j <= body.length; j++) { - const c = body[j]; - if (c === '{' || c === '[' || c === '(') depth++; - else if (c === '}' || c === ']' || c === ')') depth--; - else if ((c === '\n' || c === ';' || c === ',' || j === body.length) && depth === 0) { - const line = body.slice(lineStart, j).trim(); - const fieldMatch = line.match(/^(?:readonly\s+)?(\w+)\s*\??\s*:/); - if (fieldMatch) fields.add(fieldMatch[1]); - lineStart = j + 1; - } - } - return fields; -} - -function extractFunctions(source: string): Set<string> { - const fns = new Set<string>(); - const re = /export\s+(?:declare\s+)?function\s+(\w+)\s*[(<]/g; - let m: RegExpExecArray | null; - while (true) { - m = re.exec(source); - if (!m) break; - fns.add(m[1]); - } - return fns; -} - -// --------------------------------------------------------------------------- -// Run the checks -// --------------------------------------------------------------------------- - -const errors: string[] = []; - -// Type structure -const shimInterfaces = extractInterfaces(shimSource); -const upstreamInterfaces = extractInterfaces(upstreamSource); - -for (const [name, fields] of shimInterfaces) { - const upstreamFields = upstreamInterfaces.get(name); - if (!upstreamFields) { - errors.push( - `interface '${name}' is declared in pi-ai-shim.d.ts but not found in pi-ai's published types`, - ); - continue; - } - for (const field of fields) { - if (!upstreamFields.has(field)) { - errors.push(`interface '${name}': shim declares field '${field}' that is not in upstream`); - } - } -} - -const shimFns = extractFunctions(shimSource); -const upstreamFns = extractFunctions(upstreamSource); -for (const fn of shimFns) { - if (!upstreamFns.has(fn)) { - errors.push( - `function '${fn}' is declared in pi-ai-shim.d.ts but not in pi-ai's published types`, - ); - } -} - -// --------------------------------------------------------------------------- -// Report -// --------------------------------------------------------------------------- - -if
(errors.length > 0) { - console.error( - 'pi-ai-shim drift detected. Update packages/core/src/evaluation/providers/pi-ai-shim.d.ts to match pi-ai:', - ); - console.error(''); - for (const e of errors) { - console.error(` ✗ ${e}`); - } - console.error(''); - console.error(`pi-ai d.ts location: ${piAiDistDir}`); - process.exit(1); -} - -const interfaceCount = shimInterfaces.size; -const fnCount = shimFns.size; -console.log( - `✓ pi-ai-shim is in sync with @mariozechner/pi-ai (${interfaceCount} interfaces, ${fnCount} functions checked)`, -); From 88586cb03a6c12aad9a6d3bb8f42049e9eb98d08 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 17:32:32 +0200 Subject: [PATCH 14/20] chore(core): freshen comments + per-provider fallback metadata + cast docs MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three small follow-ups on the pi-ai migration: 1. llm-grader.ts: comments at line 208/474/478 still referenced "AI SDK generateText" / "Vercel AI SDK generateText()". Updated to describe the actual code path: provider.invoke() with filesystem tools, agent loop driven by pi-ai through the agentv provider. 2. llm-providers.ts: `resolvePiModel`'s synthesized fallback Model used a single hardcoded `contextWindow: 128000 / maxTokens: 16384` for every unknown (provider, modelId). These fields are metadata only — pi-ai uses them for cost telemetry, not to cap the API call (the real request size comes from StreamOptions.maxTokens, which we omit unless the caller set request.maxOutputTokens). Replaced with per-provider defaults via `defaultModelMetadata()`: - openai / azure-openai-responses: 400K / 128K (gpt-5 family) - anthropic: 200K / 32K (claude 4.x) - google: 1M / 64K (gemini 2.5) - openrouter: 200K / 32K - default: 128K / 16K Bump these if a custom gateway routes to bigger windows. 3. 
llm-providers.ts: tightened the two boundary casts with one-line "why safe" explanations citing the upstream proof: - `as unknown as PiTool[]` — pi-ai/dist/providers/openai-completions.js convertTools() forwards `parameters` unchanged ("TypeBox already generates JSON Schema"). - `piGetModel(... as PiKnownProvider, ... as never)` — pi-ai/dist/models.js getModel() is a plain Map lookup that accepts any string and returns undefined on miss; the casts satisfy the generic constraint without changing runtime behavior. Also fixed the comment's "throws otherwise" → returns undefined, and made the cast `PiModel | undefined` to match. Verified: typecheck / lint / 1741 unit tests / live UAT through OpenRouter all green. Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .../core/src/evaluation/graders/llm-grader.ts | 9 ++- .../src/evaluation/providers/llm-providers.ts | 60 +++++++++++++++---- 2 files changed, 54 insertions(+), 15 deletions(-) diff --git a/packages/core/src/evaluation/graders/llm-grader.ts b/packages/core/src/evaluation/graders/llm-grader.ts index 08840dea..985f574b 100644 --- a/packages/core/src/evaluation/graders/llm-grader.ts +++ b/packages/core/src/evaluation/graders/llm-grader.ts @@ -205,7 +205,7 @@ export class LlmGrader implements Grader { throw new Error('No grader provider available for LLM grading'); } - // Built-in agent mode: agentv provider → AI SDK generateText with filesystem tools + // Built-in agent mode: agentv provider → provider.invoke() with filesystem tools if (graderProvider.kind === 'agentv') { return this.evaluateBuiltIn(preparedContext, graderProvider); } @@ -471,11 +471,14 @@ export class LlmGrader implements Grader { } // --------------------------------------------------------------------------- - // Built-in agent mode (agentv provider — AI SDK generateText with filesystem tools) + // Built-in agent mode (agentv provider — provider.invoke() with filesystem tools) // 
--------------------------------------------------------------------------- /** - * Built-in mode: Uses Vercel AI SDK generateText() with sandboxed filesystem tools. + * Built-in mode: drives the grader through provider.invoke() with the + * sandboxed filesystem tools and a step budget. The pi-ai-backed agentv + * provider runs the agent loop (tool call → tool execute → next model + * turn) until the model stops requesting tools or maxSteps is hit. */ private async evaluateBuiltIn( context: EvaluationContext, diff --git a/packages/core/src/evaluation/providers/llm-providers.ts b/packages/core/src/evaluation/providers/llm-providers.ts index 8b8ce269..154cad5a 100644 --- a/packages/core/src/evaluation/providers/llm-providers.ts +++ b/packages/core/src/evaluation/providers/llm-providers.ts @@ -339,11 +339,12 @@ export async function invokePiAi(options: InvokePiAiOptions): Promise<…> if (request.images.length > 0) { attachImagesToLastUserMessage(messages, request.images); } - // Pi-ai's `Tool.parameters` is typed as a TypeBox `TSchema` (Symbol-branded - // for TS-level inference), but at runtime its OpenAI-completions converter - // forwards `parameters` to the wire format unchanged — see pi-ai's - // openai-completions.js convertTools(): "TypeBox already generates JSON - // Schema". We pass plain JSON Schema and cast at the boundary. + // Cast safety: pi-ai types `Tool.parameters` as a TypeBox `TSchema` for + // TS-level inference, but its OpenAI-completions converter forwards + // `parameters` to the wire format as-is — see pi-ai/dist/providers/openai- + // completions.js `convertTools` which annotates `parameters: tool.parameters + // // TypeBox already generates JSON Schema`. Plain JSON Schema works at + // runtime; the cast bridges the TS-only Symbol-branding gap. const piTools: PiTool[] | undefined = tools ?
(tools.map((t) => ({ name: t.name, @@ -456,13 +457,17 @@ export function resolvePiModel(args: { }): PiModel { const { providerName, apiId, modelId, baseUrl } = args; - // pi-ai's getModel is generic over a typed registry of (provider, modelId) - // pairs; runtime strings need a cast at the boundary. Returns a Model when - // the pair is in its registry, throws otherwise; we synthesize a minimal - // descriptor below for unknown pairs (custom gateways, Azure deployments). + // Cast safety: pi-ai's `getModel` is generic over a + // generated registry, but its implementation in pi-ai/dist/models.js is a + // plain Map lookup — `modelRegistry.get(provider)?.get(modelId)` — that + // accepts any string and returns `undefined` on miss. The PiKnownProvider / + // `as never` casts satisfy the type-level constraint without changing + // runtime behavior; the try/catch is defensive in case a future pi-ai + // version starts throwing. We synthesize a minimal descriptor below for + // unknown pairs (custom gateways, Azure deployments). let model: PiModel | undefined; try { - model = piGetModel(providerName as PiKnownProvider, modelId as never) as PiModel; + model = piGetModel(providerName as PiKnownProvider, modelId as never) as PiModel | undefined; } catch { model = undefined; } @@ -474,6 +479,7 @@ export function resolvePiModel(args: { `pi-ai adapter cannot resolve a baseUrl for provider '${providerName}' / model '${modelId}'. 
Either set the target's baseUrl/endpoint or use a model id pi-ai recognizes.`, ); } + const { contextWindow, maxTokens } = defaultModelMetadata(providerName); model = { id: modelId, name: modelId, @@ -483,8 +489,8 @@ export function resolvePiModel(args: { reasoning: false, input: ['text'], cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow: 128000, - maxTokens: 16384, + contextWindow, + maxTokens, }; } @@ -509,6 +515,36 @@ function defaultBaseUrlFor(providerName: string): string | undefined { return undefined; } +/** + * Generous per-provider context-window / output-token metadata used in the + * synthesized fallback Model when pi-ai's registry doesn't recognize the + * (provider, modelId) pair. These values are *metadata only* — pi-ai uses + * them for cost telemetry and display, not to cap the API call (the actual + * request size comes from StreamOptions.maxTokens, which we omit unless + * the caller set request.maxOutputTokens). Numbers track the largest + * commonly-deployed model family per provider; bump them if a custom + * gateway routes to bigger windows. 
+ */ +function defaultModelMetadata(providerName: string): { + contextWindow: number; + maxTokens: number; +} { + switch (providerName) { + case 'openai': + return { contextWindow: 400_000, maxTokens: 128_000 }; + case 'azure-openai-responses': + return { contextWindow: 400_000, maxTokens: 128_000 }; + case 'anthropic': + return { contextWindow: 200_000, maxTokens: 32_000 }; + case 'google': + return { contextWindow: 1_000_000, maxTokens: 64_000 }; + case 'openrouter': + return { contextWindow: 200_000, maxTokens: 32_000 }; + default: + return { contextWindow: 128_000, maxTokens: 16_000 }; + } +} + interface PiContext { readonly systemPrompt: string | undefined; readonly messages: PiMessage[]; From 14564a23d6a41bf1337f911bfa12d1e408d4f891 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Sun, 3 May 2026 23:18:52 +0200 Subject: [PATCH 15/20] chore(core): simplify resolvePiModel fallback to universal 128K/16K MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The per-provider defaultModelMetadata table was over-engineered. On the complete()/streamOpenAICompletions code path we use, pi-ai only sets max_tokens when the caller passes StreamOptions.maxTokens — model.maxTokens is not consulted. Pi-ai's *simple* options builder (simple-options.js:buildBaseOptions) does fall back to Math.min(model.maxTokens, 32000) for the completeSimple/streamSimple path, but we don't currently call that path. Replace the switch statement with a universal { contextWindow: 128000, maxTokens: 16384 } matching pi-coding-agent's ModelRegistry choice for custom models — same numbers across both shims keeps behavior consistent when callers eventually mix the two SDKs. Comment now honestly describes pi-ai's actual maxTokens consumption: not "metadata only", but "metadata on our path; would be a fallback ceiling on the *Simple path we don't use". 
Refs #1205 Co-Authored-By: Claude Opus 4.7 --- .../src/evaluation/providers/llm-providers.ts | 44 +++++-------------- 1 file changed, 11 insertions(+), 33 deletions(-) diff --git a/packages/core/src/evaluation/providers/llm-providers.ts b/packages/core/src/evaluation/providers/llm-providers.ts index 154cad5a..3d6f0190 100644 --- a/packages/core/src/evaluation/providers/llm-providers.ts +++ b/packages/core/src/evaluation/providers/llm-providers.ts @@ -479,7 +479,15 @@ export function resolvePiModel(args: { `pi-ai adapter cannot resolve a baseUrl for provider '${providerName}' / model '${modelId}'. Either set the target's baseUrl/endpoint or use a model id pi-ai recognizes.`, ); } - const { contextWindow, maxTokens } = defaultModelMetadata(providerName); + // Universal fallback matching pi-coding-agent's ModelRegistry. These + // numbers are mostly metadata: on the complete() / streamOpenAICompletions + // path we use, pi-ai only sets max_tokens when the caller passes + // StreamOptions.maxTokens (we omit it unless request.maxOutputTokens is + // set). pi-ai's *simple* options builder (buildBaseOptions in + // simple-options.js) does fall back to Math.min(model.maxTokens, 32000) + // when maxTokens is omitted — we don't currently call that path, but if + // a future caller switches to completeSimple, the 16384 here keeps the + // fallback ceiling sane. model = { id: modelId, name: modelId, @@ -489,8 +497,8 @@ export function resolvePiModel(args: { reasoning: false, input: ['text'], cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }, - contextWindow, - maxTokens, + contextWindow: 128_000, + maxTokens: 16_384, }; } @@ -515,36 +523,6 @@ function defaultBaseUrlFor(providerName: string): string | undefined { return undefined; } -/** - * Generous per-provider context-window / output-token metadata used in the - * synthesized fallback Model when pi-ai's registry doesn't recognize the - * (provider, modelId) pair. 
These values are *metadata only* — pi-ai uses - * them for cost telemetry and display, not to cap the API call (the actual - * request size comes from StreamOptions.maxTokens, which we omit unless - * the caller set request.maxOutputTokens). Numbers track the largest - * commonly-deployed model family per provider; bump them if a custom - * gateway routes to bigger windows. - */ -function defaultModelMetadata(providerName: string): { - contextWindow: number; - maxTokens: number; -} { - switch (providerName) { - case 'openai': - return { contextWindow: 400_000, maxTokens: 128_000 }; - case 'azure-openai-responses': - return { contextWindow: 400_000, maxTokens: 128_000 }; - case 'anthropic': - return { contextWindow: 200_000, maxTokens: 32_000 }; - case 'google': - return { contextWindow: 1_000_000, maxTokens: 64_000 }; - case 'openrouter': - return { contextWindow: 200_000, maxTokens: 32_000 }; - default: - return { contextWindow: 128_000, maxTokens: 16_000 }; - } -} - interface PiContext { readonly systemPrompt: string | undefined; readonly messages: PiMessage[]; From 52275fca4b39ab1e98504f294362d4abf78ab27f Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Mon, 4 May 2026 01:18:42 +0200 Subject: [PATCH 16/20] docs: add MiMo targets to targets.yaml; document max_output_tokens for custom providers --- .agentv/targets.yaml | 13 +++++++++ apps/cli/src/templates/.agentv/targets.yaml | 31 +++++++++++++++++++++ 2 files changed, 44 insertions(+) diff --git a/.agentv/targets.yaml b/.agentv/targets.yaml index 067a75bf..f9a1ee79 100644 --- a/.agentv/targets.yaml +++ b/.agentv/targets.yaml @@ -151,3 +151,16 @@ targets: provider: openrouter api_key: ${{ OPENROUTER_API_KEY }} model: ${{ OPENROUTER_MODEL }} + + # ── MiMo (Xiaomi) via OpenRouter ─────────────────────────────────── + - name: mimo + provider: openrouter + api_key: ${{ OPENROUTER_API_KEY }} + model: xiaomi/mimo-v2.5-pro + grader_target: grader + + - name: mimo-flash + provider: openrouter + api_key: ${{ 
OPENROUTER_API_KEY }} + model: xiaomi/mimo-v2-flash + grader_target: grader diff --git a/apps/cli/src/templates/.agentv/targets.yaml b/apps/cli/src/templates/.agentv/targets.yaml index 3c36b808..ab6d3e25 100644 --- a/apps/cli/src/templates/.agentv/targets.yaml +++ b/apps/cli/src/templates/.agentv/targets.yaml @@ -63,3 +63,34 @@ targets: cwd: ${{ CLI_EVALS_DIR }} healthcheck: command: uv run ./mock_cli.py --healthcheck + + # ── MiMo (Xiaomi) via OpenRouter ─────────────────────────────────── + # All MiMo models are available through OpenRouter with OpenAI-compatible API. + # See https://openrouter.ai/xiaomi/mimo-v2.5-pro for pricing and limits. + # + # Models: + # mimo-v2.5-pro — 1M context, 131K output, flagship + # mimo-v2-pro — 1M context, ~131K output + # mimo-v2.5 — 1M context, ~131K output, multimodal + # mimo-v2-flash — 262K context, 65K output, fast MoE (open-source) + # mimo-v2-omni — 262K context, 65K output, omni-modal + - name: mimo + provider: openrouter + api_key: ${{ OPENROUTER_API_KEY }} + model: xiaomi/mimo-v2.5-pro + + - name: mimo-flash + provider: openrouter + api_key: ${{ OPENROUTER_API_KEY }} + model: xiaomi/mimo-v2-flash + + # ── Direct provider (not through OpenRouter) ─────────────────────── + # For providers not in pi-ai's model registry, set max_output_tokens + # to match your model's actual output limit. Without this, the default + # is 16K which may cap output below the model's capability. 
+ # - name: mimo-direct + # provider: openai + # base_url: ${{ MIMO_API_ENDPOINT }} + # api_key: ${{ MIMO_API_KEY }} + # model: ${{ MIMO_MODEL }} + # max_output_tokens: 131072 From 32af2aa860a82b4b0244178d7bdc0db32a3c73f2 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Mon, 4 May 2026 02:51:45 +0200 Subject: [PATCH 17/20] docs: add MiMo direct API target with Bitwarden key; update targets.yaml template --- .agentv/targets.yaml | 9 +++++++++ apps/cli/src/templates/.agentv/targets.yaml | 9 ++++++--- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/.agentv/targets.yaml b/.agentv/targets.yaml index f9a1ee79..5426204c 100644 --- a/.agentv/targets.yaml +++ b/.agentv/targets.yaml @@ -164,3 +164,12 @@ targets: api_key: ${{ OPENROUTER_API_KEY }} model: xiaomi/mimo-v2-flash grader_target: grader + + # MiMo direct API. Fetch key from Bitwarden: bws secret get xiaomi-mimo + - name: mimo-direct + provider: openai + base_url: https://token-plan-sgp.xiaomimimo.com/v1 + api_key: ${{ XIAOMI_MIMO_API_KEY }} + model: xiaomi/mimo-v2.5-pro + max_output_tokens: 131072 + grader_target: grader diff --git a/apps/cli/src/templates/.agentv/targets.yaml b/apps/cli/src/templates/.agentv/targets.yaml index ab6d3e25..77d3ab51 100644 --- a/apps/cli/src/templates/.agentv/targets.yaml +++ b/apps/cli/src/templates/.agentv/targets.yaml @@ -88,9 +88,12 @@ targets: # For providers not in pi-ai's model registry, set max_output_tokens # to match your model's actual output limit. Without this, the default # is 16K which may cap output below the model's capability. + # MiMo direct API (not through OpenRouter). 
+ # API key: fetch from Bitwarden with `bws secret get xiaomi-mimo` + # Endpoints: openai=https://token-plan-sgp.xiaomimimo.com/v1, anthropic=https://token-plan-sgp.xiaomimimo.com/anthropic # - name: mimo-direct # provider: openai - # base_url: ${{ MIMO_API_ENDPOINT }} - # api_key: ${{ MIMO_API_KEY }} - # model: ${{ MIMO_MODEL }} + # base_url: https://token-plan-sgp.xiaomimimo.com/v1 + # api_key: ${{ XIAOMI_MIMO_API_KEY }} + # model: xiaomi/mimo-v2.5-pro # max_output_tokens: 131072 From 55ab661b4029fbd4668ceb8f173aeb12731525c2 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Mon, 4 May 2026 02:54:39 +0200 Subject: [PATCH 18/20] docs: remove bws references from targets.yaml templates --- .agentv/targets.yaml | 1 - apps/cli/src/templates/.agentv/targets.yaml | 3 --- 2 files changed, 4 deletions(-) diff --git a/.agentv/targets.yaml b/.agentv/targets.yaml index 5426204c..a1b426dd 100644 --- a/.agentv/targets.yaml +++ b/.agentv/targets.yaml @@ -165,7 +165,6 @@ targets: model: xiaomi/mimo-v2-flash grader_target: grader - # MiMo direct API. Fetch key from Bitwarden: bws secret get xiaomi-mimo - name: mimo-direct provider: openai base_url: https://token-plan-sgp.xiaomimimo.com/v1 diff --git a/apps/cli/src/templates/.agentv/targets.yaml b/apps/cli/src/templates/.agentv/targets.yaml index 77d3ab51..e90eae50 100644 --- a/apps/cli/src/templates/.agentv/targets.yaml +++ b/apps/cli/src/templates/.agentv/targets.yaml @@ -88,9 +88,6 @@ targets: # For providers not in pi-ai's model registry, set max_output_tokens # to match your model's actual output limit. Without this, the default # is 16K which may cap output below the model's capability. - # MiMo direct API (not through OpenRouter). 
- # API key: fetch from Bitwarden with `bws secret get xiaomi-mimo` - # Endpoints: openai=https://token-plan-sgp.xiaomimimo.com/v1, anthropic=https://token-plan-sgp.xiaomimimo.com/anthropic # - name: mimo-direct # provider: openai # base_url: https://token-plan-sgp.xiaomimimo.com/v1 From ced0b2b9d5e6dca66841759bab73b15dfe852a38 Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Mon, 4 May 2026 03:13:38 +0200 Subject: [PATCH 19/20] =?UTF-8?q?chore(deps):=20bump=20@mariozechner/pi-ai?= =?UTF-8?q?=20^0.62.0=20=E2=86=92=20^0.72.1?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Pinned in both packages/core/package.json and apps/cli/package.json (the two places that consume pi-ai's runtime). 10 minor versions of upstream fixes and additions; no breaking changes for our adapter — index.d.ts shape is unchanged on the named exports we use (complete, getModel, registerBuiltInApiProviders) and the Model / Tool / Message / AssistantMessage types still match our cast assumptions in llm-providers.ts. 
Verified: - typecheck / lint / 1741 unit tests all green - live UAT: generateRubrics through OpenAIProvider routed at OpenRouter returns 6 valid rubrics Co-Authored-By: Claude Opus 4.7 --- apps/cli/package.json | 2 +- bun.lock | 26 +++++++------------------- packages/core/package.json | 2 +- 3 files changed, 9 insertions(+), 21 deletions(-) diff --git a/apps/cli/package.json b/apps/cli/package.json index 578c9d31..a9d5d74b 100644 --- a/apps/cli/package.json +++ b/apps/cli/package.json @@ -32,7 +32,7 @@ "@github/copilot-sdk": "^0.1.25", "@hono/node-server": "^1.19.11", "@inquirer/prompts": "^8.2.1", - "@mariozechner/pi-ai": "^0.62.0", + "@mariozechner/pi-ai": "^0.72.1", "@openai/codex-sdk": "^0.104.0", "cmd-ts": "^0.14.3", "dotenv": "^16.4.5", diff --git a/bun.lock b/bun.lock index f099d396..0d222b1b 100644 --- a/bun.lock +++ b/bun.lock @@ -29,7 +29,7 @@ "@github/copilot-sdk": "^0.1.25", "@hono/node-server": "^1.19.11", "@inquirer/prompts": "^8.2.1", - "@mariozechner/pi-ai": "^0.62.0", + "@mariozechner/pi-ai": "^0.72.1", "@openai/codex-sdk": "^0.104.0", "cmd-ts": "^0.14.3", "dotenv": "^16.4.5", @@ -89,7 +89,7 @@ "@agentclientprotocol/sdk": "^0.14.1", "@agentv/eval": "workspace:*", "@github/copilot-sdk": "^0.1.25", - "@mariozechner/pi-ai": "^0.62.0", + "@mariozechner/pi-ai": "^0.72.1", "@openai/codex-sdk": "^0.104.0", "fast-glob": "^3.3.3", "json5": "^2.2.3", @@ -139,7 +139,7 @@ "@anthropic-ai/claude-agent-sdk": ["@anthropic-ai/claude-agent-sdk@0.2.49", "", { "optionalDependencies": { "@img/sharp-darwin-arm64": "^0.34.2", "@img/sharp-darwin-x64": "^0.34.2", "@img/sharp-linux-arm": "^0.34.2", "@img/sharp-linux-arm64": "^0.34.2", "@img/sharp-linux-x64": "^0.34.2", "@img/sharp-linuxmusl-arm64": "^0.34.2", "@img/sharp-linuxmusl-x64": "^0.34.2", "@img/sharp-win32-arm64": "^0.34.2", "@img/sharp-win32-x64": "^0.34.2" }, "peerDependencies": { "zod": "^4.0.0" } }, "sha512-3avi409dwuGkPEETpWa0gyJvRMr3b6LxeuW5/sAPCOtLD9WxH9fYltbA5wZoazxTw5mlbXmjDp7JqO1rlmpaIQ=="], - 
"@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.73.0", "", { "dependencies": { "json-schema-to-ts": "^3.1.1" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-URURVzhxXGJDGUGFunIOtBlSl7KWvZiAAKY/ttTkZAkXT9bTPqdk2eK0b8qqSxXpikh3QKPnPYpiyX98zf5ebw=="], + "@anthropic-ai/sdk": ["@anthropic-ai/sdk@0.91.1", "", { "dependencies": { "json-schema-to-ts": "^3.1.1" }, "peerDependencies": { "zod": "^3.25.0 || ^4.0.0" }, "optionalPeers": ["zod"], "bin": { "anthropic-ai-sdk": "bin/cli" } }, "sha512-LAmu761tSN9r66ixvmciswUj/ZC+1Q4iAfpedTfSVLeswRwnY3n2Nb6Tsk+cLPP28aLOPWeMgIuTuCcMC6W/iw=="], "@astrojs/compiler": ["@astrojs/compiler@2.13.0", "", {}, "sha512-mqVORhUJViA28fwHYaWmsXSzLO9osbdZ5ImUfxBarqsYdMlPbqAqGJCxsNzvppp1BEzc1mJNjOVvQqeDN8Vspw=="], @@ -475,11 +475,11 @@ "@jridgewell/trace-mapping": ["@jridgewell/trace-mapping@0.3.31", "", { "dependencies": { "@jridgewell/resolve-uri": "^3.1.0", "@jridgewell/sourcemap-codec": "^1.4.14" } }, "sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw=="], - "@mariozechner/pi-ai": ["@mariozechner/pi-ai@0.62.0", "", { "dependencies": { "@anthropic-ai/sdk": "^0.73.0", "@aws-sdk/client-bedrock-runtime": "^3.983.0", "@google/genai": "^1.40.0", "@mistralai/mistralai": "1.14.1", "@sinclair/typebox": "^0.34.41", "ajv": "^8.17.1", "ajv-formats": "^3.0.1", "chalk": "^5.6.2", "openai": "6.26.0", "partial-json": "^0.1.7", "proxy-agent": "^6.5.0", "undici": "^7.19.1", "zod-to-json-schema": "^3.24.6" }, "bin": { "pi-ai": "dist/cli.js" } }, "sha512-mJgryZ5RgBQG++tiETMtCQQJoH2MAhKetCfqI98NMvGydu7L9x2qC2JekQlRaAgIlTgv4MRH1UXHMEs4UweE/Q=="], + "@mariozechner/pi-ai": ["@mariozechner/pi-ai@0.72.1", "", { "dependencies": { "@anthropic-ai/sdk": "^0.91.1", "@aws-sdk/client-bedrock-runtime": "^3.1030.0", "@google/genai": "^1.40.0", "@mistralai/mistralai": "^2.2.0", "chalk": "^5.6.2", "openai": "6.26.0", "partial-json": "^0.1.7", "proxy-agent": 
"^6.5.0", "typebox": "^1.1.24", "undici": "^7.19.1", "zod-to-json-schema": "^3.24.6" }, "bin": { "pi-ai": "dist/cli.js" } }, "sha512-mOq71Pjnu72xxzwrh52VIiNwt+/a+Wpa11k5segi01/zTZJt8eMDc5Q2z6GhczYMr5+6EpZ8T+BaeHqq0jk5ag=="], "@mdx-js/mdx": ["@mdx-js/mdx@3.1.1", "", { "dependencies": { "@types/estree": "^1.0.0", "@types/estree-jsx": "^1.0.0", "@types/hast": "^3.0.0", "@types/mdx": "^2.0.0", "acorn": "^8.0.0", "collapse-white-space": "^2.0.0", "devlop": "^1.0.0", "estree-util-is-identifier-name": "^3.0.0", "estree-util-scope": "^1.0.0", "estree-walker": "^3.0.0", "hast-util-to-jsx-runtime": "^2.0.0", "markdown-extensions": "^2.0.0", "recma-build-jsx": "^1.0.0", "recma-jsx": "^1.0.0", "recma-stringify": "^1.0.0", "rehype-recma": "^1.0.0", "remark-mdx": "^3.0.0", "remark-parse": "^11.0.0", "remark-rehype": "^11.0.0", "source-map": "^0.7.0", "unified": "^11.0.0", "unist-util-position-from-estree": "^2.0.0", "unist-util-stringify-position": "^4.0.0", "unist-util-visit": "^5.0.0", "vfile": "^6.0.0" } }, "sha512-f6ZO2ifpwAQIpzGWaBQT2TXxPv6z3RBzQKpVftEWN78Vl/YweF1uwussDx8ECAXVtr3Rs89fKyG9YlzUs9DyGQ=="], - "@mistralai/mistralai": ["@mistralai/mistralai@1.14.1", "", { "dependencies": { "ws": "^8.18.0", "zod": "^3.25.0 || ^4.0.0", "zod-to-json-schema": "^3.24.1" } }, "sha512-IiLmmZFCCTReQgPAT33r7KQ1nYo5JPdvGkrkZqA8qQ2qB1GHgs5LoP5K2ICyrjnpw2n8oSxMM/VP+liiKcGNlQ=="], + "@mistralai/mistralai": ["@mistralai/mistralai@2.2.1", "", { "dependencies": { "ws": "^8.18.0", "zod": "^3.25.0 || ^4.0.0", "zod-to-json-schema": "^3.25.0" } }, "sha512-uKU8CZmL2RzYKmplsU01hii4p3pe4HqJefpWNRWXm1Tcm0Sm4xXfwSLIy4k7ZCPlbETCGcp69E7hZs+WOJ5itQ=="], "@monaco-editor/loader": ["@monaco-editor/loader@1.7.0", "", { "dependencies": { "state-local": "^1.0.6" } }, "sha512-gIwR1HrJrrx+vfyOhYmCZ0/JcWqG5kbfG7+d3f/C1LXk2EvzAbHSg3MQ5lO2sMlo9izoAZ04shohfKLVT6crVA=="], @@ -637,8 +637,6 @@ "@shikijs/vscode-textmate": ["@shikijs/vscode-textmate@10.0.2", "", {}, 
"sha512-83yeghZ2xxin3Nj8z1NMd/NCuca+gsYXswywDy5bHvwlWL8tpTQmzGeUuHd9FC3E/SBEMvzJRwWEOz5gGes9Qg=="], - "@sinclair/typebox": ["@sinclair/typebox@0.34.49", "", {}, "sha512-brySQQs7Jtn0joV8Xh9ZV/hZb9Ozb0pmazDIASBkYKCjXrXU3mpcFahmK/z4YDhGkQvP9mWJbVyahdtU5wQA+A=="], - "@sindresorhus/merge-streams": ["@sindresorhus/merge-streams@4.0.0", "", {}, "sha512-tlqY9xq5ukxTUZBmoOp+m61cqwQD5pHJtFY3Mn8CA8ps6yghLH/Hw8UPdqg4OLmFW3IFlcXnQNmo/dh8HzXYIQ=="], "@smithy/config-resolver": ["@smithy/config-resolver@4.4.17", "", { "dependencies": { "@smithy/node-config-provider": "^4.3.14", "@smithy/types": "^4.14.1", "@smithy/util-config-provider": "^4.2.2", "@smithy/util-endpoints": "^3.4.2", "@smithy/util-middleware": "^4.2.14", "tslib": "^2.6.2" } }, "sha512-TzDZcAnhTyAHbXVxWZo7/tEcrIeFq20IBk8So3OLOetWpR8EwY/yEqBMBFaJMeyEiREDq4NfEl+qO3OAUD+vbQ=="], @@ -867,10 +865,6 @@ "agentv": ["agentv@workspace:apps/cli"], - "ajv": ["ajv@8.20.0", "", { "dependencies": { "fast-deep-equal": "^3.1.3", "fast-uri": "^3.0.1", "json-schema-traverse": "^1.0.0", "require-from-string": "^2.0.2" } }, "sha512-Thbli+OlOj+iMPYFBVBfJ3OmCAnaSyNn4M1vz9T6Gka5Jt9ba/HIR56joy65tY6kx/FCF5VXNB819Y7/GUrBGA=="], - - "ajv-formats": ["ajv-formats@3.0.1", "", { "dependencies": { "ajv": "^8.0.0" } }, "sha512-8iUql50EUR+uUcdRQ3HDqa6EVyo3docL8g5WJ3FNcWmu62IbkGUue/pEyLBW8VGKKucTPgqeks4fIU1DA4yowQ=="], - "ansi-align": ["ansi-align@3.0.1", "", { "dependencies": { "string-width": "^4.1.0" } }, "sha512-IOfwwBF5iczOjp/WeY4YxyjqAFMQoZufdQWDd19SEExbVLNXqvpzSJ/M7Za4/sCPmQ0+GRquoA7bGcINcxew6w=="], "ansi-regex": ["ansi-regex@6.2.2", "", {}, "sha512-Bq3SmSpyFHaWjPk8If9yc6svM8c56dB5BAtW4Qbw5jHTwwXXcTLoRMkpDJp6VL0XzlWaCHTXrkFURMYmD0sLqg=="], @@ -1157,16 +1151,12 @@ "extend": ["extend@3.0.2", "", {}, "sha512-fjquC59cD7CyW6urNXK0FBufkZcoiGG80wTuPujX590cB5Ttln20E2UB4S/WARVqhXffZl2LNgS+gQdPIIim/g=="], - "fast-deep-equal": ["fast-deep-equal@3.1.3", "", {}, 
"sha512-f3qQ9oQy9j2AhBe/H9VC91wLmKBCCU/gDOnKNAYG5hswO7BLKj09Hc5HYNz9cGI++xlpDCIgDaitVs03ATR84Q=="], - "fast-glob": ["fast-glob@3.3.3", "", { "dependencies": { "@nodelib/fs.stat": "^2.0.2", "@nodelib/fs.walk": "^1.2.3", "glob-parent": "^5.1.2", "merge2": "^1.3.0", "micromatch": "^4.0.8" } }, "sha512-7MptL8U0cqcFdzIzwOTHoilX9x5BrNqye7Z/LuC7kCMRio1EMSyqRK3BEAUD7sXRq4iT4AzTVuZdhgQ2TCvYLg=="], "fast-string-truncated-width": ["fast-string-truncated-width@3.0.3", "", {}, "sha512-0jjjIEL6+0jag3l2XWWizO64/aZVtpiGE3t0Zgqxv0DPuxiMjvB3M24fCyhZUO4KomJQPj3LTSUnDP3GpdwC0g=="], "fast-string-width": ["fast-string-width@3.0.2", "", { "dependencies": { "fast-string-truncated-width": "^3.0.2" } }, "sha512-gX8LrtNEI5hq8DVUfRQMbr5lpaS4nMIWV+7XEbXk2b8kiQIizgnlr12B4dA3ZEx3308ze0O4Q1R+cHts8kyUJg=="], - "fast-uri": ["fast-uri@3.1.0", "", {}, "sha512-iPeeDKJSWf4IEOasVVrknXpaBV0IApz/gp7S2bb7Z4Lljbl2MGJRqInZiUrQwV16cpzw/D3S5j5Julj/gT52AA=="], - "fast-wrap-ansi": ["fast-wrap-ansi@0.2.0", "", { "dependencies": { "fast-string-width": "^3.0.2" } }, "sha512-rLV8JHxTyhVmFYhBJuMujcrHqOT2cnO5Zxj37qROj23CP39GXubJRBUFF0z8KFK77Uc0SukZUf7JZhsVEQ6n8w=="], "fast-xml-builder": ["fast-xml-builder@1.1.5", "", { "dependencies": { "path-expression-matcher": "^1.1.3" } }, "sha512-4TJn/8FKLeslLAH3dnohXqE3QSoxkhvaMzepOIZytwJXZO69Bfz0HBdDHzOTOon6G59Zrk6VQ2bEiv1t61rfkA=="], @@ -1359,8 +1349,6 @@ "json-schema-to-ts": ["json-schema-to-ts@3.1.1", "", { "dependencies": { "@babel/runtime": "^7.18.3", "ts-algebra": "^2.0.0" } }, "sha512-+DWg8jCJG2TEnpy7kOm/7/AxaYoaRbjVB4LFZLySZlWn8exGs3A4OLJR966cVvU26N7X9TWxl+Jsw7dzAqKT6g=="], - "json-schema-traverse": ["json-schema-traverse@1.0.0", "", {}, "sha512-NM8/P9n3XjXhIZn1lLhkFaACTOURQXjWhV4BA/RnOv8xvgqtqpAX9IO4mRQxSx1Rlo4tqzeqb0sOlruaOy3dug=="], - "json5": ["json5@2.2.3", "", { "bin": { "json5": "lib/cli.js" } }, "sha512-XmOWe7eyHYH14cLdVPoyg+GOH3rYX++KpzrylJwSW98t3Nk+U8XOl8FWKOgwtzdb8lXGf6zYwDUzeHMWfxasyg=="], "jwa": ["jwa@2.0.1", "", { "dependencies": { 
"buffer-equal-constant-time": "^1.0.1", "ecdsa-sig-formatter": "1.0.11", "safe-buffer": "^5.0.1" } }, "sha512-hRF04fqJIP8Abbkq5NKGN0Bbr3JxlQ+qhZufXVr0DvujKy93ZCbXZMHDL4EOtodSbCWxOqR8MS1tXA5hwqCXDg=="], @@ -1725,8 +1713,6 @@ "remark-stringify": ["remark-stringify@11.0.0", "", { "dependencies": { "@types/mdast": "^4.0.0", "mdast-util-to-markdown": "^2.0.0", "unified": "^11.0.0" } }, "sha512-1OSmLd3awB/t8qdoEOMazZkNsfVTeY4fTsgzcQFdXNq8ToTN4ZGwrMnlda4K6smTFKD+GRV6O48i6Z4iKgPPpw=="], - "require-from-string": ["require-from-string@2.0.2", "", {}, "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw=="], - "reselect": ["reselect@5.1.1", "", {}, "sha512-K/BG6eIky/SBpzfHZv/dd+9JBFiS4SWV7FIujVyJRux6e45+73RaUHXLmIR1f7WOMaQ0U1km6qwklRQxpJJY0w=="], "resolve-from": ["resolve-from@5.0.0", "", {}, "sha512-qYg9KP24dD5qka9J47d0aVky0N+b4fTU89LN9iDnjB5waksiC49rvMB0PrUJQGoTmH50XPiqOvAjDfaijGxYZw=="], @@ -1857,6 +1843,8 @@ "type-fest": ["type-fest@4.41.0", "", {}, "sha512-TeTSQ6H5YHvpqVwBRcnLDCBnDOHWYu7IvGbHT6N8AOymcr9PJGjc1GTtiWZTYg0NCgYwvnYWEkVChQAr9bjfwA=="], + "typebox": ["typebox@1.1.37", "", {}, "sha512-jb7jp6KvOvvy5sd+11AfJ0/e0F0AS9RcOXd55oGi2ZnRHIGmFvrTaNF+ZidRmGBmmNTkM5KKl0Z37KzxJ+owEQ=="], + "typescript": ["typescript@5.8.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ=="], "ufo": ["ufo@1.6.3", "", {}, "sha512-yDJTmhydvl5lJzBmy/hyOAA0d+aqCBuwl818haVdYCRrWV84o7YyeVm4QlVHStqNrrJSTb6jKuFAVqAFsr+K3Q=="], diff --git a/packages/core/package.json b/packages/core/package.json index a6b30d85..711d4126 100644 --- a/packages/core/package.json +++ b/packages/core/package.json @@ -43,7 +43,7 @@ "@agentclientprotocol/sdk": "^0.14.1", "@agentv/eval": "workspace:*", "@github/copilot-sdk": "^0.1.25", - "@mariozechner/pi-ai": "^0.62.0", + "@mariozechner/pi-ai": "^0.72.1", "@openai/codex-sdk": "^0.104.0", "fast-glob": "^3.3.3", "json5": 
"^2.2.3", From 9f28c3f8fabb47d2d3f1019f4023631681b1105a Mon Sep 17 00:00:00 2001 From: Christopher Tso Date: Mon, 4 May 2026 03:17:16 +0200 Subject: [PATCH 20/20] docs: remove spike plan doc (content moved to #1205) --- docs/plans/1205-pi-ai-spike.md | 192 --------------------------------- 1 file changed, 192 deletions(-) delete mode 100644 docs/plans/1205-pi-ai-spike.md diff --git a/docs/plans/1205-pi-ai-spike.md b/docs/plans/1205-pi-ai-spike.md deleted file mode 100644 index f4ffa2b0..00000000 --- a/docs/plans/1205-pi-ai-spike.md +++ /dev/null @@ -1,192 +0,0 @@ -# Spike: pi-ai migration — Path B selected - -Tracks #1205. This doc captures the spike findings and the chosen migration -path. Once the spike port lands, delete this file and fold any user-relevant -content into module headers / the issue. - -## Decision: Path B - -We're going with Path B — drop `asLanguageModel()` from the `Provider` interface -and enrich `Provider.invoke()` to cover the full grader hot path (multi-step + -tools). The four consumers migrate to the new API. - -**Why not Path A** (Vercel `LanguageModelV2` shim over pi-ai): A is a shim, not -an abstraction. With A our `Provider` interface stays a thin facade — we'd be -implementing Vercel's contract on top of pi-ai, and every consumer would still -depend on Vercel's API surface. The next time we want to swap LLM libs, A leaves -the consumer-side coupling untouched. B fixes the coupling: `Provider` becomes -the real boundary, consumers depend on AgentV's own API, and only provider -implementations change when the underlying lib changes. - -The cost is honest: bigger initial PR (4 consumer changes vs. 1 shim), more -baseline runs. But if we're spending the migration budget anyway, spend it on -the change that leaves the codebase better. 
- -## Initial assumption (wrong) - -Original assumption: `Provider.invoke(request) -> response` is the contract every -grader call site uses, so we can swap the implementation behind `invoke()` from -Vercel `generateText` to pi-ai `complete()` and call it a day. - -## Actual call graph - -`asLanguageModel(): import('ai').LanguageModel` is part of the `Provider` -interface (`providers/types.ts:309`) and is the load-bearing entry point for -every real grader path. The consumers don't go through `provider.invoke()`: - -| Consumer | What it does | Tools? | Multi-step? | -| --- | --- | --- | --- | -| `graders/llm-grader.ts:485` (built-in agent) | `asLanguageModel()` → `generateText({ model, system, prompt, tools, stopWhen, temperature })` | yes (3 sandboxed FS tools) | yes (`stepCountIs(maxSteps)`) | -| `graders/llm-grader.ts:1106` (LLM-judge) | `asLanguageModel()` → `generateText({ model, messages })` | no | no | -| `graders/composite.ts:343` | `asLanguageModel()` → `generateText({ model, messages })` | no | no | -| `generators/rubric-generator.ts:35` | `asLanguageModel()` → `generateText({ model, messages })` | no | no | -| `providers/agentv-provider.ts:73-84` | `invoke()` actively throws "use asLanguageModel() instead" | — | — | - -The `built-in agent` case in `llm-grader.ts:485` is the hardest consumer — any -new `Provider` API has to cover its full surface or we end up with two ways to -call providers. - -## New `Provider.invoke()` design - -### Goals - -- One `invoke()` shape covers single-shot, judged-message, and tool-using - multi-step calls. -- Tool schema language is provider-library-neutral (JSON Schema on the wire). -- Existing fields (`question`, `chatPrompt`, `temperature`, `maxOutputTokens`, - `signal`, `evalCaseId`, `attempt`, etc.) stay as-is — additive change. -- Existing `ProviderResponse` fields (`output`, `tokenUsage`, `costUsd`, - `durationMs`, `startTime`, `endTime`) stay as-is. 
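As a hedged illustration of the "one `invoke()` shape" goal above, the three call shapes might look like the sketch below. Field names follow the existing `ProviderRequest`; the narrowed interface, prompts, and `read_file` body are made-up stand-ins for illustration, not AgentV's real graders or sandboxed tools.

```ts
// Sketch only: a narrowed stand-in for ProviderRequest covering the three
// call shapes the goals list. `tools` / `maxSteps` are the proposed
// additions; everything else already exists on the interface.
interface SketchRequest {
  question?: string;
  chatPrompt?: Array<{ role: 'system' | 'user' | 'assistant'; content: string }>;
  temperature?: number;
  maxOutputTokens?: number;
  tools?: Array<{
    name: string;
    description: string;
    parameters: Record<string, unknown>; // JSON Schema on the wire
    execute(input: unknown): Promise<unknown>;
  }>;
  maxSteps?: number;
}

// 1. Single-shot (rubric-generator, composite): messages only.
const singleShot: SketchRequest = {
  chatPrompt: [{ role: 'user', content: 'Generate rubrics for this eval.' }],
};

// 2. Judged-message (llm-grader LLM-judge mode): messages + sampling knobs.
const judged: SketchRequest = {
  chatPrompt: [
    { role: 'system', content: 'You are a strict grader.' },
    { role: 'user', content: 'Score this transcript.' },
  ],
  temperature: 0,
  maxOutputTokens: 1024,
};

// 3. Tool-using multi-step (llm-grader built-in agent mode).
const agentic: SketchRequest = {
  question: 'Inspect the workspace and grade the output files.',
  tools: [
    {
      name: 'read_file',
      description: 'Read a file from the sandboxed workspace',
      // Placeholder schema/body — NOT the real sandboxed FS tool.
      parameters: { type: 'object', properties: { path: { type: 'string' } }, required: ['path'] },
      execute: async (input) => `contents of ${(input as { path: string }).path}`,
    },
  ],
  maxSteps: 8,
};
```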
- -### Additions to `ProviderRequest` - -```ts -export interface ProviderTool { - /** Tool name as shown to the model. */ - readonly name: string; - /** Tool description as shown to the model. */ - readonly description: string; - /** JSON Schema for the tool's input. Pi-ai TypeBox compiles to JSON Schema; Zod - * compiles via zod-to-json-schema. Provider implementations translate to the - * underlying lib's native shape (TypeBox object for pi-ai). */ - readonly parameters: JsonObject; - /** Executes the tool. Receives parsed JSON input, returns a JSON-serializable - * result. Errors are caught and surfaced to the model as tool-error results. */ - execute(input: unknown): Promise; -} - -export interface ProviderRequest { - // ...existing fields unchanged... - - /** Tools the model may call. Provider runs the agent loop, calling - * tool.execute() for each tool call until either the model returns no - * further tool calls or `maxSteps` is reached. */ - readonly tools?: readonly ProviderTool[]; - - /** Maximum number of agent loop iterations (model turn + tool execution = - * one step). Required when `tools` is non-empty. Ignored otherwise. */ - readonly maxSteps?: number; -} -``` - -### Additions to `ProviderResponse` - -```ts -export interface ProviderStepInfo { - /** Number of agent loop steps executed (0 if no tools were used). */ - readonly count: number; - /** Total tool calls across all steps. */ - readonly toolCallCount: number; -} - -export interface ProviderResponse { - // ...existing fields unchanged... - - /** Populated when the request used tools. Undefined for single-shot calls. */ - readonly steps?: ProviderStepInfo; -} -``` - -This is the minimum llm-grader's `built-in` mode actually needs from -`generateText`'s richer `steps[]` array (see `llm-grader.ts:524`). If a future -consumer needs per-step detail (which tool, what input, what output), promote -`ProviderStepInfo` then — YAGNI for now. 
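A minimal sketch of the agent loop a Provider implementation would run for `tools` + `maxSteps`. Here `callModel` stands in for a single pi-ai model turn (an assumption, not pi-ai's API), and error handling is elided — a real implementation would surface `execute()` failures to the model as tool-error results, per the `ProviderTool` doc comment.

```ts
interface ToolCall { name: string; input: unknown }
interface ModelTurn { text: string; toolCalls: ToolCall[] }
interface LoopTool {
  name: string;
  execute(input: unknown): Promise<unknown>;
}

// Runs model turns until the model stops calling tools or maxSteps is hit.
// Mirrors the proposed ProviderStepInfo: count is 0 when no tools were used.
async function runAgentLoop(
  callModel: (toolResults: unknown[]) => Promise<ModelTurn>,
  tools: readonly LoopTool[],
  maxSteps: number,
): Promise<{ output: string; steps: { count: number; toolCallCount: number } }> {
  let toolResults: unknown[] = [];
  let toolCallCount = 0;
  for (let step = 0; step < maxSteps; step++) {
    const turn = await callModel(toolResults);
    if (turn.toolCalls.length === 0) {
      // Final answer: `step` loops executed so far.
      return { output: turn.text, steps: { count: step, toolCallCount } };
    }
    toolCallCount += turn.toolCalls.length;
    // Real implementations catch execute() errors and hand the model a
    // tool-error result instead of throwing.
    toolResults = await Promise.all(
      turn.toolCalls.map((call) => {
        const tool = tools.find((t) => t.name === call.name);
        if (!tool) throw new Error(`unknown tool: ${call.name}`);
        return tool.execute(call.input);
      }),
    );
  }
  // Step budget exhausted without a final answer.
  return { output: '', steps: { count: maxSteps, toolCallCount } };
}
```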
- -### Removed - -- `Provider.asLanguageModel?(): import('ai').LanguageModel` — gone. -- `import('ai').LanguageModel` reference in `providers/types.ts:309` — gone. -- `agentv-provider.ts`'s `invoke()`-throws-by-design — `agentv` becomes a - normal `Provider` that runs through `invoke()` like the others. - -### Tool schema neutrality - -JSON Schema on the wire keeps consumers free to author tools with whatever -schema lib they want. The two grader call sites today use Zod via ai-sdk's -`tool()` helper; under Path B they'd switch to **TypeBox** (pi-ai native, no -extra conversion step). That's a small port — three filesystem tools in -`llm-grader.ts:1473-1554`. Provider implementations are responsible for -translating `ProviderTool.parameters` (JSON Schema) → the underlying lib's -expected shape. - -## Consumer migration order - -Smallest blast radius first so we can flush the design through real code before -touching the hardest case: - -1. **`rubric-generator.ts`** — single-shot, no tools. Simplest possible exercise - of `provider.invoke({ chatPrompt: [...] })`. Validates token usage + response - text plumbing. -2. **`composite.ts`** — same shape as rubric-generator. Smoke test that the API - works for a second consumer. -3. **`llm-grader.ts:1106`** (LLM-judge mode) — same shape again, different - prompt construction. -4. **`llm-grader.ts:485`** (built-in agent mode) — exercises `tools` + - `maxSteps`. The whole point of the new API. -5. **`agentv-provider.ts`** — collapse the `invoke()`-throws path. Provider - becomes a normal pi-ai-backed implementation. - -After step 5, `asLanguageModel?` can be removed from the `Provider` interface -and `import { generateText } from 'ai'` disappears from grader code. - -## Provider implementation order - -After consumers compile against the new interface, port providers one at a time: - -1. **OpenAIProvider** — pi-ai native, simplest. Run grader-score baselines. -2. 
**OpenRouterProvider** — pi-ai treats it as an OpenAI-compatible endpoint; - should fall out of step 1 with config differences only. -3. **GeminiProvider** — pi-ai native (`google` provider). -4. **AnthropicProvider** — pi-ai native, but thinking-budget mapping needs - design (see open question below). -5. **AzureProvider** — pi-ai has `azure-openai-responses.js`; verify the - `useDeploymentBasedUrls` + `apiFormat` cases. - -Each step ends with: build green, lint green, baselines re-run for an eval that -exercises that provider. - -## Open design questions - -- **Anthropic thinking-budget mapping.** ai-sdk takes a numeric `budgetTokens`; - pi-ai exposes a 5-bucket `reasoning` enum (`minimal|low|medium|high|xhigh`). - Lossy. Pick one of: (a) coerce numeric → bucket via thresholds, (b) drop the - knob to a bucket-only YAML field with deprecation warning, (c) bypass pi-ai's - abstraction and pass through to its Anthropic provider directly. Decide - before porting `AnthropicProvider`. -- **Retry/backoff.** `ai-sdk.ts:520-559` has bespoke exponential backoff with - configurable status-code list. pi-ai's behavior differs. Likely answer: keep - the existing `withRetry` wrapper around `provider.invoke()`'s underlying - pi-ai call — the retry logic is library-agnostic. Confirm in step 1. -- **Token-usage object shape.** pi-ai returns `{input, output, cost}`; ai-sdk - surfaces `{inputTokens, outputTokens, cachedInputTokens, reasoningTokens}`. - Map to the existing `ProviderTokenUsage` shape (`input`, `output`, optional - `cached`, optional `reasoning`) — which is already what consumers see today. - Cost goes to the existing `costUsd` field. - -## Out-of-scope for this spike - -- Anthropic thinking-budget mapping resolution (call it out, design separately). -- Streaming support — current consumers don't stream; defer. -- Adding new providers exposed by pi-ai (Bedrock, Vertex, Mistral, etc.) — this - PR ports the existing 5, no more. 
-- Orchestrator-side changes (agent provider kinds, batching) — untouched.
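For open question (a) on the thinking-budget mapping, a threshold-based coercion from ai-sdk's numeric `budgetTokens` to pi-ai's five reasoning buckets could look like this. The cutoff values are placeholder assumptions to be tuned against real configs — neither pi-ai nor Anthropic prescribes them.

```ts
type PiReasoningEffort = 'minimal' | 'low' | 'medium' | 'high' | 'xhigh';

// Lossy by design: many budgets collapse into one bucket. Thresholds are
// illustrative assumptions, not settled values.
function coerceBudgetToReasoning(
  budgetTokens: number | undefined,
): PiReasoningEffort | undefined {
  if (budgetTokens === undefined || budgetTokens <= 0) return undefined;
  if (budgetTokens < 2_000) return 'minimal';
  if (budgetTokens < 8_000) return 'low';
  if (budgetTokens < 16_000) return 'medium';
  if (budgetTokens < 32_000) return 'high';
  return 'xhigh';
}
```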