diff --git a/.codex/ledger/2026-05-06-MEMORY-sqlite-migration-drift.md b/.codex/ledger/2026-05-06-MEMORY-sqlite-migration-drift.md
new file mode 100644
index 000000000..713efa33e
--- /dev/null
+++ b/.codex/ledger/2026-05-06-MEMORY-sqlite-migration-drift.md
@@ -0,0 +1,60 @@
+Goal (incl. success criteria):
+
+- Fix global SQLite migration identity drift that prevents seamless `./bin/agh daemon stop/start`.
+- Success: canonical migration registry preserves already-recorded versions 17-20, observed-history DB upgrades to v22, guardrail lesson/instructions land, focused tests and `make verify` pass or blockers are reported with evidence.
+
+Constraints/Assumptions:
+
+- Use root-cause fix only; do not weaken migration integrity checks and do not manually edit the live `~/.agh/agh.db`.
+- Persist accepted Plan Mode plan under `.codex/plans/` before execution.
+- Conversation in BR-PT; code/docs/artifacts in English.
+- Use RTK for shell commands and avoid destructive git commands.
+
+Key decisions:
+
+- Restore registry order to observed DB history: v17 task orchestration profile, v18 review gate, v19 notification cursors, v20 bridge task subscriptions, v21 network conversation containers, v22 memory v2 events.
+- No one-pass repair unless evidence appears for DBs created by the inverse broken order.
+- Add durable guardrails in `docs/_memory/lessons`, root `AGENTS.md`/`CLAUDE.md`, and internal `AGENTS.md`/`CLAUDE.md`.
+
+State:
+
+- Registry, tests, lesson, and instruction guardrails are patched. Focused Go tests, isolated daemon restart proof, and full `make verify` passed.
+
+Done:
+
+- Confirmed live DB records v17-v20 as task/bridge migrations.
+- Confirmed current code expects network migration at v17 and shifts v17-v20 to v18-v21.
+- Accepted plan produced in chat.
+- Persisted accepted plan in `.codex/plans/2026-05-06-sqlite-migration-append-only.md`.
+- Restored `internal/store/globaldb.globalSchemaMigrations` append-only order: v17 task orchestration profile, v18 review gate, v19 notification cursors, v20 bridge subscriptions, v21 network conversations, v22 memory events.
+- Added migration identity/order contract helpers and observed-history upgrade coverage for the real v17-v20 prefix.
+- Added lesson `docs/_memory/lessons/L-021-schema-migration-identity-is-append-only.md` and updated lessons index.
+- Added guardrails to root/internal `AGENTS.md` and `CLAUDE.md`.
+- Ran `gofmt` on touched Go files.
+- Focused verification passed: `go test ./internal/store ./internal/store/globaldb -count=1 -race`.
+- Attempted `scripts/check-test-conventions.py`, but the helper script is absent in this repository.
+- Built local `./bin/agh` with the fixed registry (`make build` exit 0; Vite chunk-size warning emitted).
+- Isolated daemon proof passed against a SQLite backup of `/Users/pedronauck/.agh/agh.db`: `AGH_HOME=/tmp/agh-migration-proof.HvgC9J`, HTTP `127.0.0.1:63350`, socket `/tmp/agh-migration-proof.HvgC9J/daemon.sock`; first start appended v21 `rebuild_network_conversation_containers` and v22 `memv2_memory_events`, stop/start repeated without integrity mismatch.
+- Full verification passed: `make verify` exit 0. Output included the existing Vite chunk-size warning and macOS linker warning from `golangci-lint`; no command failed.
+
+Now:
+
+- Prepare final report with verification evidence.
+
+Next:
+
+- None.
+
+Open questions (UNCONFIRMED if needed):
+
+- None.
+
+Working set (files/ids/commands):
+
+- `.codex/plans/2026-05-06-sqlite-migration-append-only.md`
+- `.codex/ledger/2026-05-06-MEMORY-sqlite-migration-drift.md`
+- `internal/store/globaldb/global_db.go`
+- `internal/store/globaldb/global_db*_test.go`
+- `docs/_memory/lessons/L-021-schema-migration-identity-is-append-only.md`
+- `docs/_memory/lessons/README.md`
+- `AGENTS.md`, `CLAUDE.md`, `internal/AGENTS.md`, `internal/CLAUDE.md`
diff --git a/.codex/plans/2026-05-06-sqlite-migration-append-only.md b/.codex/plans/2026-05-06-sqlite-migration-append-only.md
new file mode 100644
index 000000000..163409f2d
--- /dev/null
+++ b/.codex/plans/2026-05-06-sqlite-migration-append-only.md
@@ -0,0 +1,50 @@
+# Fix Global SQLite Migration Order Drift And Add Guardrails
+
+## Summary
+
+- Root cause confirmed: the daemon correctly refuses to start because `globalSchemaMigrations` changed the identity of an already-applied migration version.
+- The observed `/Users/pedronauck/.agh/agh.db` records `17=add_task_orchestration_profile_schema`, `18=add_task_review_gate_schema`, `19=add_notification_cursors`, `20=add_bridge_task_subscriptions`.
+- Current code incorrectly expects `17=rebuild_network_conversation_containers` and shifted the previously recorded task/bridge migrations to `18..21`.
+- The fix restores append-only migration identity, keeps strict integrity mismatch failures, adds regression coverage for this exact history, and documents the rule in durable project memory plus active agent instructions.
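Reviewer note: the failure mode in the summary above can be sketched as a small identity check. This is a minimal illustration only. The `Migration` type and `verifyApplied` function are hypothetical names, the registry is trimmed to the versions quoted in the plan, and checksums are omitted; it is not the actual `store.RunMigrations` logic.

```go
package main

import (
	"fmt"
	"sort"
)

// Migration is a hypothetical registry entry; the real AGH type may differ.
type Migration struct {
	Version int
	Name    string
}

// verifyApplied reports the first recorded version whose name is missing from
// the registry or no longer matches it. Checksums are left out of this sketch.
func verifyApplied(registry []Migration, applied map[int]string) error {
	names := make(map[int]string, len(registry))
	for _, m := range registry {
		names[m.Version] = m.Name
	}
	versions := make([]int, 0, len(applied))
	for v := range applied {
		versions = append(versions, v)
	}
	sort.Ints(versions) // check in version order so the reported error is deterministic
	for _, v := range versions {
		want, ok := names[v]
		if !ok {
			return fmt.Errorf("applied migration %d (%s) is missing from the registry", v, applied[v])
		}
		if want != applied[v] {
			return fmt.Errorf("migration %d identity mismatch: recorded %q, registry has %q", v, applied[v], want)
		}
	}
	return nil
}

func main() {
	// History recorded in the observed database (versions 17-20 only).
	applied := map[int]string{
		17: "add_task_orchestration_profile_schema",
		18: "add_task_review_gate_schema",
		19: "add_notification_cursors",
		20: "add_bridge_task_subscriptions",
	}
	// Drifted registry: the network migration was inserted at v17.
	drifted := []Migration{
		{17, "rebuild_network_conversation_containers"},
		{18, "add_task_orchestration_profile_schema"},
		{19, "add_task_review_gate_schema"},
		{20, "add_notification_cursors"},
		{21, "add_bridge_task_subscriptions"},
	}
	// Restored registry: recorded identities kept, new migrations appended.
	restored := []Migration{
		{17, "add_task_orchestration_profile_schema"},
		{18, "add_task_review_gate_schema"},
		{19, "add_notification_cursors"},
		{20, "add_bridge_task_subscriptions"},
		{21, "rebuild_network_conversation_containers"},
		{22, "memv2_memory_events"},
	}
	fmt.Println("drifted:", verifyApplied(drifted, applied))
	fmt.Println("restored:", verifyApplied(restored, applied))
}
```

With the drifted registry the check fails at version 17; with the restored order it returns `nil`, which is the behavior the plan wants to keep strict rather than relax.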
+
+## Key Changes
+
+- Restore the canonical global migration order in `internal/store/globaldb/global_db.go`:
+  - `17 add_task_orchestration_profile_schema`
+  - `18 add_task_review_gate_schema`
+  - `19 add_notification_cursors`
+  - `20 add_bridge_task_subscriptions`
+  - `21 rebuild_network_conversation_containers`
+  - `22 memv2_memory_events`
+- Update network conversation migration tests to use `networkConversationMigrationVersion = 21` and seed legacy network DBs from the corrected pre-network history.
+- Add a regression test that seeds a DB matching the observed local history through migration `20`, with legacy `network_timeline_log.interaction_id`, then opens it through `OpenGlobalDB` and asserts no integrity mismatch, network migration v21, memory migration v22, intact task/bridge schema, and idempotent reopen.
+- Add an append-only registry contract test for the known global migration sequence, emphasizing versions `17..22`.
+- Preserve strict integrity behavior in `store.RunMigrations`; do not accept arbitrary mismatches or edit `schema_migrations` in place.
+- Do not add one-pass repair unless real DBs are found with the broken inverse sequence.
+
+## Documentation Guardrails
+
+- Add `docs/_memory/lessons/L-021-schema-migration-identity-is-append-only.md`.
+- Update `docs/_memory/lessons/README.md` with `L-021`.
+- Update root `AGENTS.md` and `CLAUDE.md` under `### Schema Migrations` with the append-only registry rule.
+- Update `internal/AGENTS.md` and `internal/CLAUDE.md` with an `internal/store` migration invariant.
+
+## Public Interfaces / Data Contract
+
+- No HTTP, UDS, CLI, OpenAPI, web, or config contract changes.
+- The internal data contract is made explicit: global SQLite migration numbers, names, and checksums are immutable once applied anywhere meaningful.
+- Fresh DB final schema remains the same. Existing DBs with the observed `17..20` history upgrade by applying only missing migrations `21` and `22`.
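Reviewer note: the append-only contract and the "apply only missing migrations" rule above can be sketched together. The helper names `checkAppendOnly` and `pendingAfter` are hypothetical, and the registry is shortened to start at v17 for brevity; the real contract test in `internal/store/globaldb` may be structured differently.

```go
package main

import "fmt"

// Migration is a hypothetical registry entry used only for this sketch.
type Migration struct {
	Version int
	Name    string
}

// checkAppendOnly verifies that registry still begins with the frozen
// sequence, so new migrations can only be appended after it.
func checkAppendOnly(frozen, registry []Migration) error {
	if len(registry) < len(frozen) {
		return fmt.Errorf("registry has %d entries, frozen prefix has %d", len(registry), len(frozen))
	}
	for i, want := range frozen {
		if registry[i] != want {
			return fmt.Errorf("entry %d changed: want %d %s, got %d %s",
				i, want.Version, want.Name, registry[i].Version, registry[i].Name)
		}
	}
	return nil
}

// pendingAfter returns the registry entries above the highest applied version.
func pendingAfter(registry []Migration, highestApplied int) []Migration {
	var pending []Migration
	for _, m := range registry {
		if m.Version > highestApplied {
			pending = append(pending, m)
		}
	}
	return pending
}

func main() {
	// Identities already recorded by the observed database.
	frozen := []Migration{
		{17, "add_task_orchestration_profile_schema"},
		{18, "add_task_review_gate_schema"},
		{19, "add_notification_cursors"},
		{20, "add_bridge_task_subscriptions"},
	}
	// Restored registry with the two newer migrations appended.
	registry := append(append([]Migration(nil), frozen...),
		Migration{21, "rebuild_network_conversation_containers"},
		Migration{22, "memv2_memory_events"},
	)
	if err := checkAppendOnly(frozen, registry); err != nil {
		fmt.Println("contract violated:", err)
		return
	}
	for _, m := range pendingAfter(registry, 20) {
		fmt.Println("would apply:", m.Version, m.Name)
	}
}
```

A database recorded through v20 would receive only v21 and v22 here, matching the upgrade path the plan describes.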
+
+## Test Plan
+
+- Run `go test ./internal/store ./internal/store/globaldb -count=1 -race`.
+- Run an isolated daemon upgrade proof using a temp copy of `/Users/pedronauck/.agh/agh.db`.
+- Verify lesson and instruction guardrails landed in the intended files.
+- Run `make verify`.
+
+## Assumptions
+
+- The selected implementation scope is registry and tests, not generic migration-runner redesign.
+- The observed local DB history is valid and must be preserved.
+- The live `/Users/pedronauck/.agh/agh.db` will not be manually mutated during validation.
+- Persistent artifacts are written in English.
diff --git a/.compozy/tasks/provider-model-catalog/analysis/acp-sdk-breaking-changes.md b/.compozy/tasks/provider-model-catalog/analysis/acp-sdk-breaking-changes.md
new file mode 100644
index 000000000..8c74348c2
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/analysis/acp-sdk-breaking-changes.md
@@ -0,0 +1,203 @@
+# ACP SDK v0.6.3 to v0.12.2 Breaking-Change Audit
+
+Task: provider-model-catalog Task 06
+
+This audit was produced before migrating AGH code from `github.com/coder/acp-go-sdk` `v0.6.3` to `v0.12.2`.
+
+## Sources Checked
+
+- Current AGH usage under `internal/acp`, `internal/session`, and API contract conversion code.
+- Local module cache for `github.com/coder/acp-go-sdk@v0.6.3`.
+- Local module cache for `github.com/coder/acp-go-sdk@v0.12.2`.
+- Zed ACP references under `.resources/zed/crates/agent_ui/src/config_options.rs`, `.resources/zed/crates/acp_thread/src/connection.rs`, and `.resources/zed/crates/agent_servers/src/acp.rs`.
+- Harnss ACP config cache/set reference under `.resources/harnss/src/types/window.d.ts`.
+
+## AGH ACP Symbols Currently Used
+
+AGH production and test code currently uses these ACP SDK symbols directly:
+
+- `acpsdk.Agent`, `acpsdk.AgentSideConnection`, `acpsdk.NewAgentSideConnection`
+- `acpsdk.ClientCapabilities`, `acpsdk.AgentCapabilities`, `acpsdk.InitializeRequest`, `acpsdk.InitializeResponse`
+- `acpsdk.FileSystemCapability`
+- `acpsdk.NewSessionRequest`, `acpsdk.NewSessionResponse`
+- `acpsdk.LoadSessionResponse`
+- `acpsdk.SessionId`
+- `acpsdk.SessionModeId`, `acpsdk.SessionModeState`, `acpsdk.AvailableSessionMode`
+- `acpsdk.SessionModelState`, `acpsdk.ModelInfo`, `acpsdk.ModelId`
+- `acpsdk.SetSessionModeRequest`, `acpsdk.SetSessionModeResponse`
+- `acpsdk.SetSessionModelRequest`, `acpsdk.SetSessionModelResponse`
+- `acpsdk.CancelNotification`
+- `acpsdk.PromptRequest`, `acpsdk.PromptResponse`, `acpsdk.PromptStopReason`, `acpsdk.PromptResponseStopReasonEndTurn`
+- `acpsdk.SessionNotification`, `acpsdk.SessionUpdate`
+- `acpsdk.RequestError`
+- `acpsdk.RequestPermissionRequest`, `acpsdk.RequestPermissionToolCall`
+- `acpsdk.KillTerminalCommandRequest`, `acpsdk.KillTerminalCommandResponse`
+- `acpsdk.ContentBlock`, `acpsdk.ContentBlockText`
+- `acpsdk.AgentMethodSessionLoad`, `acpsdk.AgentMethodSessionPrompt`, `acpsdk.AgentMethodSessionCancel`
+- `acpsdk.AgentMethodSessionSetMode`, `acpsdk.AgentMethodSessionSetModel`
+- `acpsdk.ClientMethodSessionUpdate`
+
+AGH also intentionally uses a local `wireLoadSessionRequest` wrapper because AGH needs to preserve the existing `additional_dirs` wire field used by its ACP integration.
+
+## Changed Symbols and Required AGH Impact
+
+### Session Creation and Loading
+
+`NewSessionResponse` and `LoadSessionResponse` now include:
+
+- `ConfigOptions []SessionConfigOption json:"configOptions,omitempty"`
+- `Meta map[string]any json:"meta,omitempty"` instead of an unconstrained `any` meta field.
+
+AGH impact:
+
+- `captureCaps` must accept and store `ConfigOptions` from both `session/new` and `session/load`.
+- Resume/load paths must capture config options exactly like new-session paths.
+- Existing mode/model capture remains valid, but config options take precedence for active session model/reasoning controls.
+
+### Config Option Types
+
+`v0.12.2` introduces these wire types:
+
+- `SessionConfigOption`
+- `SessionConfigOptionSelect`
+- `SessionConfigOptionBoolean`
+- `SessionConfigId`
+- `SessionConfigValueId`
+- `SessionConfigSelectOptions`
+- `SessionConfigSelectOptionsUngrouped`
+- `SessionConfigSelectOptionsGrouped`
+- `SessionConfigSelectOption`
+- `SessionConfigOptionCategory`
+- `SessionConfigOptionUpdate`
+
+AGH impact:
+
+- Add AGH-owned session config option state rather than leaking SDK union types into public session state.
+- Convert only known option shapes needed by AGH consumers. Select options are required for model/reasoning changes; boolean options should be preserved for contract visibility but not used for model/reasoning selection.
+- Flatten grouped and ungrouped select values into a stable payload while preserving each option ID, label, current value, and valid values.
+
+### Config Option Mutation Method
+
+`v0.12.2` adds:
+
+- `AgentMethodSessionSetConfigOption = "session/set_config_option"`
+- `SetSessionConfigOptionRequest`
+- `SetSessionConfigOptionResponse`
+- `SetSessionConfigOptionValueId`
+- `SetSessionConfigOptionBoolean`
+- `ClientSideConnection.SetSessionConfigOption`
+
+AGH impact:
+
+- Model changes must prefer `session/set_config_option` when a conservative model config option exists and contains the requested value.
+- Reasoning effort must prefer `session/set_config_option` when a conservative reasoning config option exists and contains the requested value.
+- AGH should update active session config option state from the response's returned `configOptions`.
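Reviewer note: the flattening requirement listed under "Config Option Types" above (grouped and ungrouped select values into one stable payload) could look roughly like the sketch below. All types here are local stand-ins invented for illustration. They are not the `acp-go-sdk` union types, whose exact Go shapes are not reproduced in this audit, and skipping blank or duplicate value IDs is an assumption rather than a documented rule.

```go
package main

import "fmt"

// SelectValue is a hypothetical AGH-owned value entry.
type SelectValue struct {
	ID    string
	Label string
}

// SelectGroup is a hypothetical labelled group of values.
type SelectGroup struct {
	Label  string
	Values []SelectValue
}

// SelectSource stands in for the SDK's grouped/ungrouped union: a real
// option would populate only one of the two fields.
type SelectSource struct {
	Ungrouped []SelectValue
	Grouped   []SelectGroup
}

// OptionPayload is a hypothetical flat representation kept in session state.
type OptionPayload struct {
	ID           string
	Label        string
	CurrentValue string
	Values       []SelectValue
}

// flattenSelect keeps declaration order and drops blank or repeated value
// IDs, so the resulting payload is stable across calls.
func flattenSelect(id, label, current string, src SelectSource) OptionPayload {
	seen := make(map[string]bool)
	values := make([]SelectValue, 0, len(src.Ungrouped))
	add := func(v SelectValue) {
		if v.ID == "" || seen[v.ID] {
			return
		}
		seen[v.ID] = true
		values = append(values, v)
	}
	for _, v := range src.Ungrouped {
		add(v)
	}
	for _, g := range src.Grouped {
		for _, v := range g.Values {
			add(v)
		}
	}
	return OptionPayload{ID: id, Label: label, CurrentValue: current, Values: values}
}

func main() {
	src := SelectSource{Grouped: []SelectGroup{
		{Label: "Fast", Values: []SelectValue{{ID: "low", Label: "Low"}}},
		{Label: "Thorough", Values: []SelectValue{{ID: "high", Label: "High"}}},
	}}
	payload := flattenSelect("reasoning_effort", "Reasoning effort", "low", src)
	for _, v := range payload.Values {
		fmt.Println(payload.ID, v.ID, v.Label)
	}
}
```

Keeping the flat payload in AGH-owned types is one way to avoid leaking SDK union types into public session state, as the impact list asks.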
+
+### Legacy Model Mutation
+
+`AgentMethodSessionSetModel` remains on the wire as `session/set_model`, but the request/response symbols changed:
+
+- Removed or renamed: `SetSessionModelRequest`
+- Removed or renamed: `SetSessionModelResponse`
+- New names: `UnstableSetSessionModelRequest`, `UnstableSetSessionModelResponse`
+- The agent-side interface moved this handler to `AgentExperimental.UnstableSetSessionModel`.
+
+AGH impact:
+
+- Production fallback must use `UnstableSetSessionModelRequest` and `UnstableSetSessionModelResponse`.
+- ACP test helper agents must implement `UnstableSetSessionModel` instead of `SetSessionModel`.
+- Fallback must only run when config options are absent and legacy `SessionModelState.AvailableModels` advertises the requested model.
+
+### Session Update Notifications
+
+`SessionUpdate` now includes:
+
+- `ConfigOptionUpdate *SessionConfigOptionUpdate`
+- `SessionInfoUpdate`
+- Existing `AvailableCommandsUpdate`, `CurrentModeUpdate`, and `UsageUpdate` remain.
+
+AGH impact:
+
+- `config_option_update` notifications must mutate the active process/session config option state even when no prompt event is currently being emitted.
+- Notification translation can continue producing a system event, but state capture must not depend on prompt activity.
+
+### Session Model and Mode State
+
+`SessionModeState`, `AvailableSessionMode`, `SessionModelState`, and `ModelInfo` remain structurally compatible for AGH's existing needs, with meta fields now represented as `map[string]any`.
+
+AGH impact:
+
+- Existing mode capture and `session/set_mode` behavior can remain.
+- Existing model list capture can remain as the legacy fallback surface.
+
+### Client Capabilities
+
+`ClientCapabilities.Fs` remains, but the filesystem capability type was renamed:
+
+- Removed or renamed: `FileSystemCapability`
+- New name: `FileSystemCapabilities`
+
+AGH impact:
+
+- Initialization must construct `acpsdk.FileSystemCapabilities`.
+
+### Prompt Metadata
+
+`PromptRequest.Meta` now uses `map[string]any`.
+
+AGH impact:
+
+- AGH's structured `PromptMeta` must be converted to a map before being assigned to the SDK prompt request.
+- The conversion must avoid ignored marshal/unmarshal errors.
+
+### Terminal Kill Request Types
+
+The client-side terminal kill symbols were renamed:
+
+- Removed or renamed: `KillTerminalCommandRequest`
+- Removed or renamed: `KillTerminalCommandResponse`
+- New names: `KillTerminalRequest`, `KillTerminalResponse`
+
+AGH impact:
+
+- The AGH terminal kill handler signature and return values must use the new symbols.
+
+### Permission Tool Call Type
+
+`RequestPermissionRequest.ToolCall` is now the existing `ToolCallUpdate` type instead of a dedicated `RequestPermissionToolCall` type.
+
+AGH impact:
+
+- Permission display helpers must accept `acpsdk.ToolCallUpdate`.
+
+### Cancellation and Request Errors
+
+`CancelNotification` and `RequestError` remain available. `v0.12.2` adds `NewRequestCancelled` and uses a cancellation error code constant.
+
+AGH impact:
+
+- Existing request error handling should continue compiling unless code referenced removed field names.
+- No AGH production code currently depends on renamed cancellation fields in the audited ACP paths.
+
+## Conservative Matching Rules for AGH
+
+Model config option detection:
+
+- Prefer exact config option ID `model`.
+- Allow only explicitly documented local fixture IDs after `model`; do not match display names or categories.
+- Send only values present in the select option's advertised values.
+
+Reasoning config option detection:
+
+- Prefer exact config option ID `reasoning_effort`.
+- Then allow exact `effort`.
+- Send only values present in the select option's advertised values.
+- Never derive valid reasoning levels from catalog metadata such as `supports_reasoning`.
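Reviewer note: the conservative matching rules above translate into a small lookup. The sketch below is only one possible reading. `ConfigOption`, `resolveModel`, and `resolveReasoning` are hypothetical names over an already-flattened option list, and the "documented local fixture IDs" allowance is left out because those IDs are not listed in this audit.

```go
package main

import "fmt"

// ConfigOption is a hypothetical flattened select option: an ID plus the
// values the agent advertised for it.
type ConfigOption struct {
	ID     string
	Values []string
}

// findByExactID returns the first option whose ID equals one of ids, trying
// ids in order. Display names and categories are deliberately never consulted.
func findByExactID(options []ConfigOption, ids ...string) (ConfigOption, bool) {
	for _, id := range ids {
		for _, o := range options {
			if o.ID == id {
				return o, true
			}
		}
	}
	return ConfigOption{}, false
}

// resolve reports which option ID to mutate, or false when no exact-ID
// option exists or the chosen option does not advertise the requested value.
func resolve(options []ConfigOption, requested string, ids ...string) (string, bool) {
	opt, ok := findByExactID(options, ids...)
	if !ok {
		return "", false
	}
	for _, v := range opt.Values {
		if v == requested {
			return opt.ID, true
		}
	}
	return "", false
}

// resolveModel matches only the exact ID "model".
func resolveModel(options []ConfigOption, requested string) (string, bool) {
	return resolve(options, requested, "model")
}

// resolveReasoning prefers "reasoning_effort" and then allows "effort".
func resolveReasoning(options []ConfigOption, requested string) (string, bool) {
	return resolve(options, requested, "reasoning_effort", "effort")
}

func main() {
	options := []ConfigOption{
		{ID: "model", Values: []string{"gpt-5.4"}},
		{ID: "effort", Values: []string{"low", "medium", "high"}},
	}
	fmt.Println(resolveModel(options, "gpt-5.4"))
	fmt.Println(resolveReasoning(options, "high"))
	fmt.Println(resolveReasoning(options, "xhigh")) // not advertised, so nothing is sent
}
```

Because valid values come only from the option's advertised list, a level such as `xhigh` is never sent unless the agent itself offers it, regardless of what catalog metadata claims.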
+
+## Required Verification After Migration
+
+- Focused ACP tests for `session/new`, `session/load`, `config_option_update`, set-config model, set-config reasoning, legacy fallback, and no invented reasoning levels.
+- Focused session manager tests for start option propagation and legacy resume behavior.
+- Contract tests proving a named `SessionConfigOptionPayload` is exposed.
+- `make codegen` and `make codegen-check` if the public API contract changes.
+- Full `make verify` before completion and again after the local commit.
diff --git a/.compozy/tasks/provider-model-catalog/qa/issues/BUG-NNN-template.md b/.compozy/tasks/provider-model-catalog/qa/issues/BUG-NNN-template.md
new file mode 100644
index 000000000..68767cf75
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/issues/BUG-NNN-template.md
@@ -0,0 +1,72 @@
+# BUG-NNN:
+
+**Severity:** Critical | High | Medium | Low
+**Priority:** P0 | P1 | P2 | P3
+**Type:** Functional | UI | Performance | Security | Data | Crash
+**Status:** Open
+**Discovered During:** TC-FUNC-NNN | TC-INT-NNN | TC-PERF-NNN | TC-SEC-NNN | TC-UI-NNN | TC-REG-NNN | TC-SCEN-NNN
+**Reporter:**
+**Created:** YYYY-MM-DD
+**Last Updated:** YYYY-MM-DD
+
+## Environment
+
+- **Build:**
+- **OS:**
+- **Browser:** (only for UI bugs)
+- **URL / Endpoint:**
+- **Bootstrap manifest:**
+- **Lab root / runtime home / ports:**
+- **Live provider/LLM:**
+
+## Summary
+
+
+
+## Behavioral Impact
+
+- **Operator/User Goal:**
+- **Agent Behavior:**
+- **Business Outcome:**
+- **Cross-Surface State:**
+
+## Reproduction
+
+```bash
+# Verbatim commands (paths from bootstrap manifest)
+```
+
+Observed before fix:
+
+-
+
+## Expected
+
+
+
+## Root Cause
+
+
+
+## Fix
+
+
+
+## Verification
+
+-
+-
+-
+
+## Impact
+
+- **Users Affected:**
+- **Frequency:**
+- **Workaround:**
+
+## Related
+
+- Test Case:
+- TechSpec Invariant:
+- ADR:
+- Logs / artifacts:
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/SMOKE-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/SMOKE-001.md
new file mode 100644
index 000000000..80b8da370
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/SMOKE-001.md
@@ -0,0 +1,57 @@
+# SMOKE-001: Provider Model Catalog Smoke Readiness
+
+**Priority:** P0
+**Type:** Smoke
+**Status:** Not Run
+**Estimated Time:** 15 minutes
+**Created:** 2026-05-07
+**Last Updated:** 2026-05-07
+
+---
+
+## Objective
+
+Confirm the isolated QA lab is healthy enough to execute release-grade catalog scenarios. Smoke is **entry criteria only**; passing this case proves nothing about feature behavior.
+
+## Preconditions
+
+- [ ] `agh-qa-bootstrap` produced `bootstrap-manifest.json` for the run.
+- [ ] Unique `AGH_HOME`, ports, and tmux socket allocated.
+- [ ] `AGH_WEB_API_PROXY_TARGET` exported from manifest.
+- [ ] No production code changes pending beyond Task 12 / Task 13 QA artifacts.
+
+## Test Steps
+
+1. **Verify daemon binary builds.**
+   - Command: `make build`.
+   - **Expected:** Exit 0; binary present at `bin/agh`.
+2. **Verify codegen contracts are clean.**
+   - Command: `make codegen-check`.
+   - **Expected:** No drift in `openapi/agh.json` or `web/src/generated/agh-openapi.d.ts`.
+3. **Verify Bun typecheck and unit tests.**
+   - Command: `make bun-typecheck && make bun-test`.
+   - **Expected:** All workspaces pass; vitest catches no regression.
+4. **Verify focused Go gates compile and pass.**
+   - Command: `go test -race -count=1 ./internal/config ./internal/store/globaldb ./internal/modelcatalog/... ./internal/acp ./internal/api/... ./internal/cli ./internal/extension/...`.
+   - **Expected:** Exit 0.
+5. **Boot the daemon and request status.**
+   - Command (in lab): `agh daemon start --foreground &` then `agh provider models status -o json`.
+   - **Expected:** JSON payload includes `sources` array with `idle` or `succeeded` `refresh_state`.
+
+## Audit Coverage
+
+- Smoke entry only. Does **not** satisfy any release-grade audit minimum.
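Reviewer note: the step 5 expectation in the smoke case above could be asserted mechanically. The sketch below assumes a payload shape of `{"sources":[{"refresh_state":"..."}]}`; only the `sources` and `refresh_state` names come from the test case, and treating the rule as "every source must be `idle` or `succeeded`" is an interpretation, not a documented contract.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// statusPayload models only the fields the smoke step mentions; the real
// `agh provider models status -o json` output may carry more.
type statusPayload struct {
	Sources []struct {
		RefreshState string `json:"refresh_state"`
	} `json:"sources"`
}

// checkStatus returns an error when the payload has no sources or any source
// reports a refresh_state other than "idle" or "succeeded".
func checkStatus(raw []byte) error {
	var p statusPayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return fmt.Errorf("decode status payload: %w", err)
	}
	if len(p.Sources) == 0 {
		return fmt.Errorf("status payload has no sources")
	}
	for i, s := range p.Sources {
		switch s.RefreshState {
		case "idle", "succeeded":
			// Accepted smoke states.
		default:
			return fmt.Errorf("sources[%d].refresh_state=%q, want idle or succeeded", i, s.RefreshState)
		}
	}
	return nil
}

func main() {
	healthy := []byte(`{"sources":[{"refresh_state":"idle"},{"refresh_state":"succeeded"}]}`)
	failing := []byte(`{"sources":[{"refresh_state":"failed"}]}`)
	fmt.Println(checkStatus(healthy))
	fmt.Println(checkStatus(failing))
}
```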
+
+## Pass Criteria
+
+- All five steps exit 0.
+- Daemon responds within 5s.
+
+## Failure Criteria
+
+- Any step exits non-zero.
+- Daemon hangs or returns OS-level error.
+
+## Notes
+
+If smoke fails, halt the QA run and report the failing step in `qa/verification-report.md` before any TC-FUNC/INT execution.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-001.md
new file mode 100644
index 000000000..51a6a6813
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-001.md
@@ -0,0 +1,84 @@
+# TC-FUNC-001: Provider Config Hard-Cut - Old Keys Rejected
+
+**Priority:** P0
+**Type:** Functional
+**Module:** `internal/config`
+**Requirement:** ADR-002, TechSpec Delete Targets, Task 01.
+**Status:** Not Run
+**Created:** 2026-05-07
+**Last Updated:** 2026-05-07
+
+## Objective
+
+Verify that any `config.toml` containing the deleted flat provider model fields fails validation with deterministic, path-scoped errors and that no compatibility fallback rehydrates the values.
+
+## Preconditions
+
+- [ ] Fresh isolated `AGH_HOME` (no prior config cache).
+- [ ] Daemon binary built from current branch.
+
+## Test Steps
+
+1. **Write `config.toml` with the deleted `default_model` key.**
+   - Input:
+     ```toml
+     [providers.codex]
+     command = "/bin/true"
+     default_model = "gpt-5.4"
+     ```
+   - **Expected:** `agh config validate` (and daemon boot) returns an error referencing path `providers.codex.default_model` and explicitly stating the key is removed.
+2. **Replace with deleted `supported_models` key.**
+   - Input:
+     ```toml
+     [providers.codex]
+     command = "/bin/true"
+     supported_models = ["gpt-5.4"]
+     ```
+   - **Expected:** Error references `providers.codex.supported_models`.
+3. **Replace with deleted `supports_reasoning_effort` key.**
+   - Input:
+     ```toml
+     [providers.codex]
+     command = "/bin/true"
+     supports_reasoning_effort = true
+     ```
+   - **Expected:** Error references `providers.codex.supports_reasoning_effort`.
+4. **Confirm new nested shape parses cleanly.**
+   - Input:
+     ```toml
+     [providers.codex]
+     command = "/bin/true"
+     [providers.codex.models]
+     default = "gpt-5.4"
+     [[providers.codex.models.curated]]
+     id = "gpt-5.4"
+     supports_reasoning = true
+     reasoning_efforts = ["minimal", "low", "medium", "high", "xhigh"]
+     default_reasoning_effort = "medium"
+     ```
+   - **Expected:** Validation succeeds; daemon starts; `agh provider models list codex -o json` returns rows tagged with `source_id="config"` and priority `120`.
+
+## Negative / Boundary Tests
+
+- Empty curated array with valid `default` → must succeed (manual default model is valid, SI-6).
+- `default = ""` → must fail with explicit path `providers.codex.models.default`.
+- Curated model `id` blank → must fail.
+- `default_reasoning_effort = "extreme"` not in `reasoning_efforts` → must fail.
+
+## Audit Coverage
+
+- C6 task tree (Task 01).
+- C8 cross-surface truth: rendered `agh config show` and persisted SQLite catalog row both reflect new shape.
+- TechSpec Safety Invariants: SI-6 (manual entry valid), SI-8 (only `internal/modelcatalog.Store` writes catalog rows).
+
+## Pass Criteria
+
+- Steps 1-3 fail with the documented error path; no silent hydration of legacy fields.
+- Step 4 produces catalog rows attributed to the `config` source.
+- `agh config show` does not emit any of the deleted keys.
+
+## Failure Criteria
+
+- Any deleted key parses without error.
+- Error path lacks the offending key name.
+- Catalog row attributes the data to a source other than `config` (priority 120).
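Reviewer note: the hard-cut behavior exercised by TC-FUNC-001 above can be illustrated with a path-scoped check over already-decoded provider tables. This is a sketch under assumptions: `rejectDeletedKeys` and the error wording are hypothetical, TOML decoding is out of scope, and the actual `internal/config` validator may report errors differently.

```go
package main

import (
	"fmt"
	"sort"
)

// deletedProviderKeys lists the flat keys the test case expects to be rejected.
var deletedProviderKeys = []string{
	"default_model",
	"supported_models",
	"supports_reasoning_effort",
}

// rejectDeletedKeys takes provider tables keyed by provider ID and returns one
// path-scoped error per deleted key found. Providers are visited in sorted
// order so the error list is deterministic.
func rejectDeletedKeys(providers map[string]map[string]any) []error {
	ids := make([]string, 0, len(providers))
	for id := range providers {
		ids = append(ids, id)
	}
	sort.Strings(ids)

	var errs []error
	for _, id := range ids {
		for _, key := range deletedProviderKeys {
			if _, present := providers[id][key]; present {
				errs = append(errs, fmt.Errorf("providers.%s.%s: key has been removed", id, key))
			}
		}
	}
	return errs
}

func main() {
	legacy := map[string]map[string]any{
		"codex": {
			"command":          "/bin/true",
			"default_model":    "gpt-5.4",
			"supported_models": []string{"gpt-5.4"},
		},
	}
	for _, err := range rejectDeletedKeys(legacy) {
		fmt.Println(err)
	}
}
```

Returning every offending path instead of stopping at the first one keeps the errors deterministic and names each key explicitly, which is what the failure criteria check for.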
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-002.md
new file mode 100644
index 000000000..84a32f55e
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-002.md
@@ -0,0 +1,52 @@
+# TC-FUNC-002: Provider Config - Curated Validation Rules
+
+**Priority:** P1
+**Type:** Functional
+**Module:** `internal/config`
+**Requirement:** TechSpec Config Lifecycle.
+**Status:** Not Run
+
+## Objective
+
+Verify the nested `[providers..models]` block enforces the documented validation rules and accepts manual default models.
+
+## Preconditions
+
+- [ ] Fresh `AGH_HOME`.
+- [ ] Daemon binary built from current branch.
+
+## Test Steps
+
+1. **Manual default model outside curated list is accepted.**
+   - Input: `[providers.codex.models] default = "manual-gpt-9000"` with empty `curated`.
+   - **Expected:** Validation succeeds; `agh provider models list codex -o json` includes `manual-gpt-9000` only when sources later report it; manual selection at session creation succeeds.
+2. **Duplicate curated `id` is rejected.**
+   - Input: two `[[providers.codex.models.curated]]` entries with `id = "gpt-5.4"`.
+   - **Expected:** Error references both occurrences.
+3. **Blank reasoning effort is rejected.**
+   - Input: `reasoning_efforts = ["high", ""]`.
+   - **Expected:** Error references the empty entry.
+4. **`default_reasoning_effort` must be present in `reasoning_efforts`.**
+   - Input: `reasoning_efforts = ["low", "medium"]`, `default_reasoning_effort = "high"`.
+   - **Expected:** Error references the curated entry's effort path.
+5. **`[model_catalog.sources.models_dev]` defaults populate.**
+   - Input: omit the section entirely.
+   - **Expected:** `agh config show` resolves `enabled=true`, `endpoint="https://models.dev/api.json"`, `ttl="24h"`, `timeout="10s"`.
+6. **`models.discovery.command` and `.endpoint` are mutually exclusive when both set without adapter override.**
+   - Input: `[providers.openclaw.models.discovery] command = "x" endpoint = "https://"`.
+   - **Expected:** Error states only one of the two is allowed unless the provider adapter documents both.
+
+## Audit Coverage
+
+- C6 task tree (Task 01, Task 03 sources, Task 05 daemon wiring).
+- SI-6 (manual model entry remains valid).
+
+## Pass Criteria
+
+- All validation cases produce the documented error or success.
+- Defaults appear when omitted.
+
+## Failure Criteria
+
+- Any blank/duplicate/invalid combination is silently accepted.
+- Defaults differ from the TechSpec values.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-003.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-003.md
new file mode 100644
index 000000000..4a567b0c7
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-003.md
@@ -0,0 +1,48 @@
+# TC-FUNC-003: Builtin Source Converts Defaults to Priority-10 Rows
+
+**Priority:** P1
+**Type:** Functional
+**Module:** `internal/modelcatalog` (`builtin` source)
+**Requirement:** TechSpec Source Implementations.
+**Status:** Not Run
+
+## Objective
+
+Verify the `builtin` source emits source rows with priority 10, supports offline first-run, and never wins against config or live sources.
+
+## Preconditions
+
+- [ ] Fresh `AGH_HOME` with no overrides for built-in providers.
+- [ ] Network disabled (no `models.dev`, no live discovery).
+
+## Test Steps
+
+1. **Boot daemon offline.**
+   - Command: `agh daemon start --foreground` with `AGH_DISABLE_OUTBOUND=1` (or stubbed transport).
+   - **Expected:** Daemon starts; no errors logged that block startup.
+2. **List catalog for a built-in provider (e.g. `codex`).**
+   - Command: `agh provider models list codex -o json`.
+   - **Expected:** Models present with `sources[0].source_id="builtin"` and `priority=10`; `availability_state="unknown"`.
+3. **Add a config curated model that overrides display name.**
+   - Update `config.toml` with curated metadata for the same `model_id`.
+   - **Expected:** Merged projection shows the config-source `display_name` because priority 120 > 10; builtin row remains addressable as a separate source via `--source builtin`.
+4. **Disable the builtin source via internal API.**
+   - Programmatically remove builtin source registration in tests.
+   - **Expected:** Catalog falls back to remaining sources without panicking; no orphan rows remain in `model_catalog_rows` for the removed source after replace.
+
+## Audit Coverage
+
+- C6 task tree (Task 03).
+- SI-13 (partial-source success).
+
+## Pass Criteria
+
+- Builtin rows appear at priority 10.
+- Config wins on conflict; builtin survives as second source.
+- Removing builtin source does not corrupt rows.
+
+## Failure Criteria
+
+- Builtin priority differs from 10.
+- Builtin overrides higher-priority sources.
+- Daemon panics or fails to boot offline.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-004.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-004.md
new file mode 100644
index 000000000..86be1e886
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-004.md
@@ -0,0 +1,58 @@
+# TC-FUNC-004: Catalog Merge Determinism (Priority + Freshness + Source-ID Tie-Break)
+
+**Priority:** P0
+**Type:** Functional
+**Module:** `internal/modelcatalog` merge
+**Requirement:** TechSpec Proposed Design / Architectural Boundaries.
+**Status:** Not Run
+
+## Objective
+
+Verify that the merge projection is deterministic and follows the documented priority order, freshness tie-break, and source-id tie-break, with lower-priority sources filling missing fields.
+
+## Preconditions
+
+- [ ] Catalog seeded via test harness with crafted source rows for one provider/model.
+- [ ] All rows written through `internal/modelcatalog.Store.ReplaceSourceRows`.
+
+## Test Steps
+
+1. **Higher-priority source wins conflicting non-empty field.**
+   - Seed: `config` (priority 120) `display_name="Config Name"`, `models_dev` (priority 50) `display_name="DevName"`.
+   - **Expected:** Projected `display_name="Config Name"`.
+2. **Lower-priority source fills missing field.**
+   - Seed: `config` row sets only `default_reasoning_effort`; `models_dev` row sets `cost_input_per_million`.
+   - **Expected:** Projected model exposes both fields.
+3. **Freshness tie-break.**
+   - Seed: two rows with identical priority but different `refreshed_at`.
+   - **Expected:** Fresher row wins.
+4. **Source-id tie-break.**
+   - Seed: two rows with identical priority and `refreshed_at`.
+   - **Expected:** Ascending `source_id` wins.
+5. **Sources array sorted deterministically.**
+   - **Expected:** `sources` ordered `(priority DESC, refreshed_at DESC, source_id ASC)`.
+6. **Projection top-level sorted by `(provider_id ASC, model_id ASC)`.**
+   - **Expected:** Stable across repeated calls.
+7. **Availability state derivation.**
+   - Seed: live row `available=true stale=false` + models_dev row.
+   - **Expected:** `availability_state="available_live"`.
+   - Replace: live row `available=true stale=true` → `available_stale`.
+   - Replace: live row `available=false stale=true` → `unavailable_stale`.
+   - Remove live/extension row → `unknown`.
+8. **`models.dev` and `builtin` never elevate availability above `unknown`.**
+   - Seed only `models_dev` + `builtin`.
+   - **Expected:** `availability_state="unknown"` and `available=null`.
+
+## Audit Coverage
+
+- C6 task tree (Task 03).
+- SI-5 (`models.dev` not authority), SI-13 (partial success).
+
+## Pass Criteria
+
+- Every assertion holds across two consecutive runs (determinism).
+
+## Failure Criteria
+
+- Any tie-break diverges from the documented order.
+- `models.dev`/`builtin` ever yield `available=true` directly.
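Reviewer note: the ordering that TC-FUNC-004 asserts, `(priority DESC, refreshed_at DESC, source_id ASC)` with lower-priority sources filling blank fields, can be sketched as below. `SourceRow`, `orderSources`, and `projectDisplayName` are hypothetical names, and `RefreshedAt` is modelled as Unix seconds purely for brevity; the real `internal/modelcatalog` merge may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// SourceRow is a hypothetical, trimmed-down catalog source row.
type SourceRow struct {
	SourceID    string
	Priority    int
	RefreshedAt int64 // Unix seconds; the real column type is an assumption
	DisplayName string
}

// orderSources returns a copy sorted by
// (priority DESC, refreshed_at DESC, source_id ASC).
func orderSources(rows []SourceRow) []SourceRow {
	sorted := append([]SourceRow(nil), rows...)
	sort.SliceStable(sorted, func(i, j int) bool {
		a, b := sorted[i], sorted[j]
		if a.Priority != b.Priority {
			return a.Priority > b.Priority
		}
		if a.RefreshedAt != b.RefreshedAt {
			return a.RefreshedAt > b.RefreshedAt
		}
		return a.SourceID < b.SourceID
	})
	return sorted
}

// projectDisplayName takes the first non-empty value in merge order, so a
// lower-priority source only fills the field when higher ones leave it blank.
func projectDisplayName(rows []SourceRow) string {
	for _, r := range orderSources(rows) {
		if r.DisplayName != "" {
			return r.DisplayName
		}
	}
	return ""
}

func main() {
	rows := []SourceRow{
		{SourceID: "models_dev", Priority: 50, RefreshedAt: 200, DisplayName: "DevName"},
		{SourceID: "config", Priority: 120, RefreshedAt: 100, DisplayName: "Config Name"},
	}
	fmt.Println(projectDisplayName(rows)) // prints "Config Name"
}
```

Sorting a copy rather than the caller's slice keeps the projection free of side effects, which makes the "two consecutive runs" determinism check straightforward to assert.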
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-005.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-005.md
new file mode 100644
index 000000000..494c497f6
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-005.md
@@ -0,0 +1,57 @@
+# TC-FUNC-005: `models.dev` Source - TTL, Disable, Legacy Aliases
+
+**Priority:** P1
+**Type:** Functional
+**Module:** `internal/modelcatalog/modelsdev.go`
+**Requirement:** TechSpec `Models.dev Source` + Config Lifecycle.
+**Status:** Not Run
+
+## Objective
+
+Verify the `models.dev` source honors the configurable TTL, endpoint, timeout, and disable switch; tolerates current and legacy schema aliases; and never proves account-level availability.
+
+## Preconditions
+
+- [ ] `httptest`-based stub server mirroring `models.dev/api.json`.
+- [ ] Config writes `[model_catalog.sources.models_dev]` with stub endpoint.
+
+## Test Steps
+
+1. **Current-schema parse.**
+   - Stub returns canonical fields (`reasoning`, `tool_call`, `limit.context`, `cost.input`, `cost.output`).
+   - **Expected:** Rows include `supports_reasoning`, `supports_tools`, `context_window`, `cost_*` populated.
+2. **Legacy-schema parse.**
+   - Stub returns `supportsReasoning`, `supports_reasoning`, `supportsTools`, `supports_tools`, `contextWindow`, `maxInputTokens`, `maxOutputTokens`, `pricing.input`, `pricing.output`.
+   - **Expected:** All fields parse identically; tolerant aliases tested.
+3. **TTL respected.**
+   - Trigger refresh; immediately call list with `Refresh=false`.
+   - **Expected:** Cached rows returned without HTTP call within TTL.
+4. **Disable switch.**
+   - Set `[model_catalog.sources.models_dev] enabled = false`.
+   - **Expected:** Source status `refresh_state="idle"`, no outbound HTTP, rows absent for the source.
+5. **Override endpoint and timeout.**
+   - Set `endpoint = "http://127.0.0.1:0/api.json"`, `timeout = "1ms"`.
+ - **Expected:** Source status records timeout error; redacted `last_error`; prior stale rows preserved. +6. **No account availability.** + - Stub returns models for `codex` provider with `available=true` field. + - **Expected:** Projection ignores `available` from `models.dev` (kind keeps `available=null`); availability remains `unknown` unless live/extension says otherwise. +7. **Provider-scoped status row.** + - Stub spans 3 AGH providers; refresh once. + - **Expected:** `model_catalog_sources` has 3 rows (one per provider) for `models_dev`; no blank-provider sentinel row. + +## Audit Coverage + +- C6 task tree (Task 03 + Task 05 wiring). +- SI-5, SI-13. + +## Pass Criteria + +- All schema variants parse. +- TTL/disable/override honored. +- Provider-scoped status rows preserved. + +## Failure Criteria + +- Any legacy alias fails parse. +- Disabled source still calls HTTP. +- Account availability inferred from `models.dev`. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-006.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-006.md new file mode 100644 index 000000000..943d2f1ca --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-006.md @@ -0,0 +1,49 @@ +# TC-FUNC-006: Stale Fallback Preserves Prior Successful Rows + +**Priority:** P1 +**Type:** Functional +**Module:** `internal/modelcatalog` +**Requirement:** TechSpec Safety Invariants SI-4. +**Status:** Not Run + +## Objective + +Verify that when a source refresh fails after at least one prior successful refresh, AGH preserves the previously stored rows, marks them stale, and surfaces the redacted `last_error` through projection and status. + +## Preconditions + +- [ ] Catalog seeded via successful refresh of a stub source. +- [ ] Stub source can be flipped to fail on demand. + +## Test Steps + +1. **Successful refresh.** + - Trigger refresh; assert `model_catalog_rows` has rows for the source with `stale=0`. 
+ - **Expected:** Source status `last_success_at` populated; rows readable via projection. +2. **Force failure on next refresh.** + - Stub returns 5xx; trigger `agh provider models refresh codex --source models_dev`. + - **Expected:** Source status records `refresh_state="failed"`, `last_error` redacted; previous rows now flagged `stale=1`. +3. **List after failure returns stale rows with markers.** + - Command: `agh provider models list codex --include-stale -o json`. + - **Expected:** Rows present with `stale=true`; `availability_state` either `available_stale` or `unavailable_stale` if previous live row existed; `unknown` otherwise. +4. **Without `--include-stale`, projection still includes stale rows but flags them.** + - **Expected:** Default behavior surfaces stale rows tagged `stale=true` (TechSpec keeps stale rows usable as fallback). +5. **Daemon restart preserves stale rows.** + - Restart daemon; reissue list. + - **Expected:** Same rows present, still flagged stale; no row loss. + +## Audit Coverage + +- C6 task tree (Task 03, Task 05). +- SI-4, SI-13. + +## Pass Criteria + +- Stale rows persist across refresh failure and daemon restart. +- `last_error` redacted (no API key / OAuth / env secret string). + +## Failure Criteria + +- Failure clears prior rows. +- Status loses `last_success_at`. +- Stale rows missing the `stale` flag. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-007.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-007.md new file mode 100644 index 000000000..23ffd6bcb --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-007.md @@ -0,0 +1,51 @@ +# TC-FUNC-007: Partial Source Success vs All-Source Failure + +**Priority:** P0 +**Type:** Functional +**Module:** `internal/modelcatalog.Service.ListModels` +**Requirement:** TechSpec Safety Invariants SI-13. 
+**Status:** Not Run + +## Objective + +Verify that: +- A list call succeeds when at least one source delivers usable rows or a stale cache exists. +- A list call fails (deterministic error) only when every usable source fails AND no stale cache exists. + +## Preconditions + +- [ ] Catalog with multiple sources registered for one provider. +- [ ] Stub control over each source's success/failure. + +## Test Steps + +1. **Partial success.** + - Force `models.dev` 5xx; let `builtin` return rows. + - Command: `agh provider models list codex -o json`. + - **Expected:** Exit 0; rows from `builtin` returned; status reports `models_dev` as failed; `last_error` redacted. +2. **All-source failure with stale cache.** + - Run a successful refresh first; then force every source to fail. + - **Expected:** List returns stale rows with `stale=true`; no error to operator. +3. **All-source failure with no stale cache.** + - Wipe SQLite catalog tables (test harness only); force every source to fail. + - **Expected:** List returns deterministic error referencing the failed sources; CLI exit non-zero with structured JSON error in `-o json` mode. +4. **Refresh during all-source failure remains coalesced.** + - Issue two concurrent refreshes for the same provider. + - **Expected:** One subprocess/network attempt per source; status batch returned identically to both callers. + +## Audit Coverage + +- C6 task tree (Task 03, Task 05). +- SI-4, SI-13. + +## Pass Criteria + +- Steps 1-2 succeed without error. +- Step 3 fails with structured error and non-zero exit. +- Step 4 shows single underlying call. + +## Failure Criteria + +- Partial failure reported as global failure. +- All-source failure with stale cache returns error. +- Coalescing breaks under concurrent refresh. 
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-008.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-008.md new file mode 100644 index 000000000..3e44a25bd --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-008.md @@ -0,0 +1,51 @@ +# TC-FUNC-008: Live Provider Source Timeout + Effective Auth/Home/Env + +**Priority:** P1 +**Type:** Functional +**Module:** `internal/modelcatalog/live_sources.go` +**Requirement:** TechSpec Live Provider Sources, SI-3. +**Status:** Not Run + +## Objective + +Verify each registered live provider source is timeout-bound, uses the provider's effective auth/home/env policy, never inherits the request context's deadline implicitly, and records source status (not session blockers) on failure. + +## Preconditions + +- [ ] Stub or fake provider subprocess and HTTP endpoints. +- [ ] Provider config with `home_policy`, `env_policy`, `auth_mode` set per provider. +- [ ] Daemon base env injected for live discovery. + +## Test Steps + +1. **Timeout enforcement.** + - Stub server delays 30s; provider discovery timeout 1s. + - **Expected:** Source status records `failed` with redacted timeout message; no panic; coalescing serializes per provider (TC-PERF-001 covers concurrent storms). +2. **Provider home policy honored.** + - Set `home_policy=isolated`; spawn live discovery subprocess. + - **Expected:** Subprocess `HOME` matches provider isolated home; daemon does not leak operator `HOME`. +3. **Auth status command non-zero.** + - Stub `auth_status_command` returns exit 2. + - **Expected:** Source status `failed`; daemon does not raise an operator error; manual model entry still works. +4. **Provider secret resolver exposes redacted env.** + - Resolver injects `OPENAI_API_KEY=secret-xyzzy`. + - **Expected:** Source error log entries do not contain `secret-xyzzy`; refer to TC-SEC-001 for cross-surface redaction. +5. 
**Source IDs are `provider_live:` with priority 110.** + - **Expected:** SQLite rows match the documented IDs and priority (Task 04 invariant). + +## Audit Coverage + +- C6 task tree (Task 04). +- SI-1 (no session blocker), SI-3, SI-9. + +## Pass Criteria + +- Timeouts enforced. +- Effective home/env honored. +- Source IDs and priority match Task 04 contract. + +## Failure Criteria + +- Subprocess inherits operator `HOME`. +- Source error contains raw secret material. +- Timeout exceeds configured timeout. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-009.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-009.md new file mode 100644 index 000000000..f76b2ffff --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-009.md @@ -0,0 +1,46 @@ +# TC-FUNC-009: Live Discovery Never Touches ACP Sessions + +**Priority:** P1 +**Type:** Functional +**Module:** `internal/modelcatalog` live sources + `internal/acp` +**Requirement:** TechSpec Safety Invariants SI-2, ADR-001. +**Status:** Not Run + +## Objective + +Verify live provider discovery never calls `session/new`, `session/load`, `session/set_model`, or `session/set_config_option`, and that unavailable side-effect-free discovery paths surface as source-status failures, not session blockers. + +## Preconditions + +- [ ] ACP fake driver instrumented to assert it is not invoked from discovery code paths. +- [ ] Discovery sources registered for built-in providers and adapter-config providers (OpenClaw, Hermes, Pi). + +## Test Steps + +1. **Run a refresh storm against every provider.** + - Command: `for p in codex anthropic openrouter ollama opencode openclaw hermes pi; do agh provider models refresh $p; done`. + - **Expected:** ACP fake driver records zero invocations. +2. **Provider without `discovery.command`/`discovery.endpoint`.** + - Configure OpenClaw with `discovery.enabled=true` but no command/endpoint. 
+ - **Expected:** Source status `refresh_state="failed"`, `last_error` references missing discovery contract; session creation for that provider remains usable; manual model entry still valid. +3. **Provider discovery enabled with invalid HTTP endpoint.** + - Set `endpoint = "http://127.0.0.1:0"`. + - **Expected:** Source status `failed` with redacted error; ACP driver still untouched. +4. **Concurrent session creation while discovery refresh runs.** + - Trigger refresh and a session create simultaneously for the same provider. + - **Expected:** Session creation completes without waiting on discovery; ACP fake records only `session/new` from the session caller, not from discovery. + +## Audit Coverage + +- C6 task tree (Task 04, Task 06). +- SI-1, SI-2, SI-3. + +## Pass Criteria + +- Zero ACP `session/*` calls originate from discovery code. +- Missing discovery configuration produces source status, never blocks sessions. + +## Failure Criteria + +- Discovery code path invokes any ACP session method. +- Failure to discover blocks session creation. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-010.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-010.md new file mode 100644 index 000000000..c764540cf --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-010.md @@ -0,0 +1,50 @@ +# TC-FUNC-010: ACP `session/set_config_option` Precedence + +**Priority:** P0 +**Type:** Functional +**Module:** `internal/acp` (Driver.applySessionModel) +**Requirement:** TechSpec ACP Session Config Options, SI-7. +**Status:** Not Run + +## Objective + +Verify the upgraded SDK driver prefers `session/set_config_option` for model and reasoning effort and only falls back to `session/set_model` when no matching config option exists. Reasoning never sent when no matching control is advertised. + +## Preconditions + +- [ ] `coder/acp-go-sdk@v0.12.2` upgraded. 
+- [ ] ACP fake driver fixtures expose `configOptions` for `model` and `reasoning_effort` (and the documented synonyms). + +## Test Steps + +1. **`session/new` advertises a `model` config option matching the requested model.** + - **Expected:** Driver issues `session/set_config_option` with `id="model"`, `value=`; never invokes `session/set_model` for this case. +2. **`session/new` advertises a reasoning option.** + - **Expected:** Driver applies reasoning via `session/set_config_option`; legacy `set_model` not invoked. +3. **`config_option_update` event arrives mid-session.** + - **Expected:** Driver updates session state; HTTP/UDS session capability surfaces reflect new options on next read. +4. **No matching config option present, but legacy model state advertises the model.** + - **Expected:** Driver falls back to `session/set_model`; debug log notes fallback reason. +5. **Neither config option nor legacy model state.** + - **Expected:** Driver does not send any model mutation; reasoning effort is silently skipped (SI-7); session creation succeeds with default state. +6. **Conservative ID matching.** + - Stub option ID `model_v2` (not in known list). + - **Expected:** Driver does not assume it is a model option; treats as opaque; falls back as in step 5 if no exact `model` ID. + +## Audit Coverage + +- C6 task tree (Task 06). +- SI-7. + +## Pass Criteria + +- Steps 1-3 use `session/set_config_option`. +- Step 4 falls back to `session/set_model`. +- Step 5 sends no mutation. +- Step 6 never invents reasoning levels from `supports_reasoning=true`. + +## Failure Criteria + +- Driver invokes `session/set_model` when a matching config option exists. +- Reasoning effort fired without an advertised control. +- Unknown option IDs treated as model option. 
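The precedence in steps 1, 4, 5, and 6 above can be sketched as a pure planning function. This is a hedged illustration only: `mutation` and `planModelMutation` are hypothetical and do not reflect the `coder/acp-go-sdk` API or the real `Driver.applySessionModel`; config options are modeled as a map from option ID to advertised values, and the documented synonyms are deliberately omitted.

```go
package main

import "fmt"

// mutation is a hypothetical description of the single request the driver
// would send; an empty Method means "send nothing".
type mutation struct {
	Method   string
	OptionID string
	Value    string
}

// contains reports whether want is present in values.
func contains(values []string, want string) bool {
	for _, v := range values {
		if v == want {
			return true
		}
	}
	return false
}

// planModelMutation prefers session/set_config_option when an option with the
// exact ID "model" advertises the requested value, falls back to
// session/set_model when only legacy model state advertises it, and otherwise
// plans no mutation. IDs such as "model_v2" are treated as opaque.
func planModelMutation(configOptions map[string][]string, legacyModels []string, requested string) mutation {
	if contains(configOptions["model"], requested) {
		return mutation{Method: "session/set_config_option", OptionID: "model", Value: requested}
	}
	if contains(legacyModels, requested) {
		return mutation{Method: "session/set_model", Value: requested}
	}
	return mutation{}
}

func main() {
	opts := map[string][]string{"model_v2": {"model-a"}}
	fmt.Printf("%+v\n", planModelMutation(opts, []string{"model-a"}, "model-a"))
}
```

Keeping the decision separate from the transport call makes each branch assertable without an ACP fake; a reasoning-effort planner would follow the same shape but has no legacy fallback.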
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-011.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-011.md new file mode 100644 index 000000000..8f9be6609 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-011.md @@ -0,0 +1,50 @@ +# TC-FUNC-011: Extension `model.source` Manifest + Row Validation + +**Priority:** P1 +**Type:** Functional +**Module:** `internal/extension` +**Requirement:** ADR-003, TechSpec Extension Sources. +**Status:** Not Run + +## Objective + +Verify extension manifests can declare a `model.source` capability with a normalizable slug; non-normalizable slugs are rejected; `models/list` results pass through `internal/modelcatalog` validation; invalid rows are dropped with deterministic source-status errors. + +## Preconditions + +- [ ] Extension fixture with manifest declaring `model.source` capability for one provider. +- [ ] Daemon configured to register that extension. + +## Test Steps + +1. **Manifest accepts normalizable slug.** + - Manifest declares `name = "Acme Models"` mapped to slug `acme-models`. + - **Expected:** Daemon registers `source_id="extension:acme-models"`; manifest validation passes. +2. **Manifest rejects unmappable slug.** + - Manifest declares `name = "??"`. + - **Expected:** Validation fails with deterministic error referencing the manifest field. +3. **Extension returns valid rows.** + - `models/list` returns rows with provider/model IDs the extension declares. + - **Expected:** Rows persist; merge applies extension priority 100; status `succeeded`. +4. **Extension returns invalid rows.** + - Stub returns row with empty `model_id`. + - **Expected:** Row rejected; remaining valid rows persist; source status records redacted error referencing the offending field. +5. **Extension declares provider it has no grant for.** + - **Expected:** Source status reports `failed` with capability-missing error; no rows persisted. 
+ +## Audit Coverage + +- C6 task tree (Task 08). +- SI-8 (only `internal/modelcatalog.Store` writes rows), SI-9 (redaction). + +## Pass Criteria + +- Manifest validation matches Task 08 fixtures. +- Invalid rows do not pollute persisted catalog. +- Capability gate enforced. + +## Failure Criteria + +- Invalid manifest passes validation. +- Invalid row corrupts persisted state. +- Capability gate bypassed. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-012.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-012.md new file mode 100644 index 000000000..549943f84 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-012.md @@ -0,0 +1,44 @@ +# TC-FUNC-012: Extension Capability Missing / Revoked = Denial + +**Priority:** P1 +**Type:** Functional +**Module:** `internal/extension/host_api_models.go` +**Requirement:** ADR-003. +**Status:** Not Run + +## Objective + +Verify extension Host API methods (`models/list`, `models/refresh`, `models/status`) honor capability grants and surface deterministic denial errors when grants are missing or revoked, without leaking daemon internals. + +## Preconditions + +- [ ] Extension fixture with grants toggleable per method. + +## Test Steps + +1. **All three grants present.** + - **Expected:** Host API succeeds; payload matches daemon-owned projection (not raw extension payload). +2. **`models/list` grant missing.** + - **Expected:** Host API returns deterministic capability error; no rows leaked. +3. **`models/refresh` grant missing.** + - **Expected:** Refresh denied; no source status changes; no subprocess executed. +4. **`models/status` grant missing.** + - **Expected:** Status request denied; no source status read. +5. **Grant revoked mid-run.** + - Trigger list, then revoke grant, then trigger again. + - **Expected:** Second call denied; no cached payload returned. + +## Audit Coverage + +- C6 task tree (Task 08). +- SI-8, SI-9. 
+ +## Pass Criteria + +- Capability gate enforced on every call. +- Errors deterministic. + +## Failure Criteria + +- Missing grant still returns rows or status. +- Error surface leaks daemon internals. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-013.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-013.md new file mode 100644 index 000000000..65b0f2432 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-013.md @@ -0,0 +1,44 @@ +# TC-FUNC-013: Source Error Redaction at Persistence + Projection + +**Priority:** P0 +**Type:** Functional +**Module:** `internal/modelcatalog/redact.go`, projection helpers. +**Requirement:** TechSpec SI-9. +**Status:** Not Run + +## Objective + +Verify source errors are redacted at both persistence time and at every public projection boundary so that secrets cannot leak through alternate surfaces. + +## Preconditions + +- [ ] Stub source whose error message contains an API key (`sk-test-1234567890abcdef`), an OAuth token (`Bearer secret.token`), and an env-shaped secret (`OPENAI_API_KEY=secret-xyzzy`). +- [ ] Daemon log capture available. + +## Test Steps + +1. **Trigger refresh failure with the seeded error string.** + - **Expected:** SQLite `model_catalog_sources.last_error` contains a redacted summary; raw secret strings absent. +2. **List status via HTTP / UDS / CLI / Host API.** + - **Expected:** `last_error` field redacted in every surface; payload byte-equal between HTTP and UDS for the same status row. +3. **List status via web app.** + - **Expected:** Web component renders the redacted string only; no secret visible in DOM, network response, or React Query cache (TC-UI-001 covers UI rendering). +4. **Daemon log capture.** + - **Expected:** Structured log entry omits secret strings; correlation keys (`refresh_request_id`, `provider_id`, `source_id`, `source_kind`) present. +5. 
**Inject error at projection time only (bypassing persistence redaction).** + - **Expected:** Projection helper still redacts before serialization (defense in depth at HTTP/UDS/Host API/SSE boundary). + +## Audit Coverage + +- C6 task tree (Task 11), C11 disruption probe. +- SI-9. + +## Pass Criteria + +- No surface emits raw secret material. +- Both persistence and projection redaction functions invoked. + +## Failure Criteria + +- Any surface (logs, status, API, web, Host API, SSE) reveals a secret. +- Projection skips redaction when persistence layer is bypassed. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-014.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-014.md new file mode 100644 index 000000000..8f697584e --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-014.md @@ -0,0 +1,48 @@ +# TC-FUNC-014: Refresh Deadline Detached From Request Context + +**Priority:** P1 +**Type:** Functional +**Module:** `internal/modelcatalog.Service.Refresh` +**Requirement:** TechSpec SI-11. +**Status:** Not Run + +## Objective + +Verify refresh work uses `context.WithoutCancel(ctx)` plus an explicit `context.WithDeadline`, so HTTP/UDS request cancellation does not abort refresh prematurely and refresh deadlines do not leak from the request context. + +## Preconditions + +- [ ] Refresh stub configured to take longer than the request timeout. +- [ ] Test harness with deterministic clock or sleep-based assertion. + +## Test Steps + +1. **Cancel the HTTP request mid-refresh.** + - Trigger `POST /api/providers/codex/models/refresh` with a 100ms client timeout while the source takes 2s. + - **Expected:** Client receives canceled response; daemon completes refresh through the configured deadline; `model_catalog_sources` records the refresh outcome. +2. **Configured deadline applied.** + - Configure a refresh deadline (default 60s in TechSpec; configurable per source). 
+ - **Expected:** Refresh completes within the configured deadline regardless of the request lifetime. +3. **Daemon shutdown joins outstanding refresh workers.** + - Initiate refresh; gracefully shut daemon. + - **Expected:** Daemon waits for refresh worker to finish (or hits configured shutdown timeout) before exit; no orphan goroutine; SQLite rows consistent. +4. **Repeated cancellation under storm.** + - Cancel 100 sequential refresh calls within 50ms. + - **Expected:** Coalescing prevents storm; one underlying refresh completes; status reflects single outcome. + +## Audit Coverage + +- C6 task tree (Task 05, Task 11). +- SI-11. + +## Pass Criteria + +- Refresh outcome recorded after request cancellation. +- Deadlines respected. +- Daemon shutdown clean. + +## Failure Criteria + +- Refresh aborts when request cancels. +- Deadlines inherited implicitly from request context. +- Goroutine leak after shutdown. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-015.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-015.md new file mode 100644 index 000000000..116cfd426 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-FUNC-015.md @@ -0,0 +1,48 @@ +# TC-FUNC-015: Generated Contracts and Docs Drift Gate + +**Priority:** P1 +**Type:** Functional +**Module:** Codegen + Docs +**Requirement:** TechSpec Web/Docs Impact, Task 10. +**Status:** Not Run + +## Objective + +Verify `make codegen` regenerates `openapi/agh.json`, `web/src/generated/agh-openapi.d.ts`, and CLI references; the docs vitest enforces hard-cut copy in `packages/site`. + +## Preconditions + +- [ ] Working tree clean except QA artifacts. +- [ ] `make` toolchain available. + +## Test Steps + +1. **Run codegen.** + - Command: `make codegen`. + - **Expected:** `git status` shows no diff (committed state already matches generated). +2. **Run codegen-check.** + - Command: `make codegen-check`. + - **Expected:** Exit 0. +3. 
**Run docs vitest.** + - Command: `cd packages/site && bun run test -- provider-model-catalog-docs`. + - **Expected:** Suite passes; no flat-field claims (`default_model`, `supported_models`, `supports_reasoning_effort`) in narrative copy outside the hard-cut warning. +4. **CLI docs regenerated.** + - Command: `make cli-docs`. + - **Expected:** `packages/site/content/runtime/cli/provider/models/{list,refresh,status}.mdx` reflects current cobra exports. +5. **Inspect MDX sources.** + - Command: `grep -R "default_model\|supported_models\|supports_reasoning_effort" packages/site/content/runtime`. + - **Expected:** No matches outside hard-cut warning copy. + +## Audit Coverage + +- C6 task tree (Task 10). + +## Pass Criteria + +- Codegen idempotent; docs vitest green. + +## Failure Criteria + +- Codegen produces diff. +- Docs vitest fails. +- Hard-cut residue in narrative copy. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-001.md new file mode 100644 index 000000000..834702b86 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-001.md @@ -0,0 +1,52 @@ +# TC-INT-001: Global Migration v23 - Fresh DB + Reopen-After-Restart + +**Priority:** P0 +**Type:** Integration +**Systems:** `internal/store/globaldb` schema, `internal/modelcatalog.Store`. +**Requirement:** TechSpec Data Model, SI-10, Task 02. +**Status:** Not Run + +## Objective + +Verify the migration registry creates `model_catalog_sources`, `model_catalog_rows`, `model_catalog_reasoning_efforts`, and the documented indexes on a fresh DB; that the `BEGIN IMMEDIATE` write transaction is honored; that reopening the DB after a daemon restart keeps the row identity stable; and that the migration registry append-only contract still passes after v23. + +## Preconditions + +- [ ] Test isolated `globaldb` instance. +- [ ] No prior migrations. + +## Test Steps + +1. 
**Fresh DB migration.** + - Run migrator end-to-end. + - **Expected:** `schema_migrations` ends at v23 with the documented `name`/`checksum` for the model catalog migration; previous v1-v22 unchanged. +2. **Tables and indexes exist.** + - Inspect SQLite schema. + - **Expected:** Three tables and the indexes `idx_model_catalog_rows_provider_model`, `idx_model_catalog_rows_source_provider`, `idx_model_catalog_sources_provider` exist; foreign-key cascade on reasoning efforts present. +3. **Insert + read round-trip.** + - Use `Store.ReplaceSourceRows` with one row including reasoning efforts and a stale flag. + - **Expected:** `ListRows`/`ListSourceStatus` returns identical data; reasoning efforts ordered by `rank`. +4. **Reopen after restart.** + - Close DB; reopen. + - **Expected:** Rows present; reasoning efforts still ordered; status row preserved. +5. **WAL/SHM companion handling.** + - Simulate stale `-wal`/`-shm` companions; reopen. + - **Expected:** Migrator recovers cleanly; no migration mismatch. +6. **Append-only contract guarded.** + - Modify migration v23 hash and reopen. + - **Expected:** Migrator fails fast with mismatch error; never silently rewrites history. + +## Audit Coverage + +- C6 task tree (Task 02), C8 cross-surface persistence truth. +- SI-8, SI-10. + +## Pass Criteria + +- All steps pass with deterministic data. + +## Failure Criteria + +- Schema differs from TechSpec. +- Append-only contract rewritable. +- Reopen loses rows. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-002.md new file mode 100644 index 000000000..7189a83f8 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-002.md @@ -0,0 +1,48 @@ +# TC-INT-002: HTTP/UDS Native Catalog Handlers Serve Daemon-Owned Projection + +**Priority:** P0 +**Type:** Integration +**Systems:** `internal/api/core`, `internal/api/httpapi`, `internal/api/udsapi`, `internal/modelcatalog`. 
+**Requirement:** TechSpec Public Interfaces, ADR-001, Task 07. +**Status:** Not Run + +## Objective + +Verify the native catalog HTTP and UDS routes (registered via the `/api/providers/*catalog_path` dispatcher) return identical daemon-owned payloads for list / refresh / status across both transports. + +## Preconditions + +- [ ] Daemon running with seeded catalog state from TC-INT-001 fixture. +- [ ] Bearer token configured for HTTP; UDS connected via local socket from bootstrap manifest. + +## Test Steps + +1. **HTTP `GET /api/providers/models`.** + - **Expected:** 200; payload `ProviderModelPayload` shape; deterministic sort order. +2. **HTTP `GET /api/providers/{provider_id}/models`.** + - **Expected:** Returns subset filtered by provider; same deterministic sort. +3. **HTTP `POST /api/providers/{provider_id}/models/refresh`.** + - **Expected:** Returns `[]SourceStatus` with `refresh_request_id`; status reflects new `last_refresh_at`. +4. **HTTP `GET /api/providers/models/status` and `/api/providers/{provider_id}/models/status`.** + - **Expected:** Status payload includes redacted `last_error`; rows match SQLite source rows. +5. **UDS parity.** + - Repeat each call via UDS client (`internal/cli/client_provider_models.go` exposes the parity surface). + - **Expected:** Same shape; UDS responses match HTTP byte-equally for steady-state list payloads (TC-INT-003 validates byte equality). +6. **Refresh failure path.** + - Force a source to fail; refresh again. + - **Expected:** HTTP and UDS responses both surface failed status with redacted error. + +## Audit Coverage + +- C5 channel coverage, C8 cross-surface truth. +- SI-4, SI-9, SI-13. + +## Pass Criteria + +- All routes respond with documented payloads on both transports. +- Refresh failures surface consistently. + +## Failure Criteria + +- Any route differs in shape between HTTP and UDS. +- Refresh failure exposes raw error. 
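The shape-parity assertion in step 5 above could be approximated by decoding both transport bodies and comparing the resulting values. This is a minimal sketch under stated assumptions: `sameJSON` is a hypothetical helper, the sample field names are illustrative rather than the real `ProviderModelPayload`, and strict byte equality (the TC-INT-003 concern) is intentionally not checked here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// sameJSON reports whether two response bodies decode to structurally equal
// JSON values. Object key order and whitespace are ignored; array order is
// significant, so a sort-order drift between transports is still detected.
func sameJSON(a, b []byte) (bool, error) {
	var va, vb any
	if err := json.Unmarshal(a, &va); err != nil {
		return false, fmt.Errorf("decode first body: %w", err)
	}
	if err := json.Unmarshal(b, &vb); err != nil {
		return false, fmt.Errorf("decode second body: %w", err)
	}
	return reflect.DeepEqual(va, vb), nil
}

func main() {
	// Illustrative bodies only; the field names are assumptions.
	httpBody := []byte(`{"provider_id":"codex","models":[{"model_id":"a"},{"model_id":"b"}]}`)
	udsBody := []byte(`{ "models": [{"model_id":"a"},{"model_id":"b"}], "provider_id": "codex" }`)
	ok, err := sameJSON(httpBody, udsBody)
	fmt.Println(ok, err)
}
```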
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-003.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-003.md new file mode 100644 index 000000000..b77888f9a --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-003.md @@ -0,0 +1,45 @@ +# TC-INT-003: HTTP/UDS Canonical JSON Byte Equality + CLI Parity + +**Priority:** P0 +**Type:** Integration +**Systems:** `internal/api/core` deterministic encoder, `internal/api/testutil/model_catalog_parity_test.go`. +**Requirement:** TechSpec Testing Approach, Task 11. +**Status:** Not Run + +## Objective + +Verify the native catalog payload bytes match exactly between HTTP and UDS for at least one deterministic catalog state, that CLI structured JSON output covers the same persisted state, and that the Host API projection is structurally equivalent. + +## Preconditions + +- [ ] Daemon seeded with deterministic catalog state. +- [ ] Bearer auth + UDS socket from bootstrap manifest. + +## Test Steps + +1. **Capture HTTP `GET /api/providers/models` response body bytes.** +2. **Capture UDS `GET /api/providers/models` response body bytes.** + - **Expected:** Byte equal after canonical sort. +3. **Capture CLI `agh provider models list -o json` output.** + - **Expected:** Structurally equivalent (same provider/model rows, sources, availability) after JSON normalization; CLI may add wrapper metadata but core list matches. +4. **Capture Host API `models/list` (extension capability granted) response.** + - **Expected:** Same provider/model rows; daemon-owned projection (not raw extension payload). +5. **Repeat for status (`GET /api/providers/models/status`) and refresh (one cycle).** + - **Expected:** Same parity holds. +6. **Modify state via Settings > Providers (TC-UI-001).** + - **Expected:** Subsequent CLI/HTTP/UDS/Host API calls reflect the change uniformly. + +## Audit Coverage + +- C5, C8. +- TC-INT-002 covers shape; TC-INT-003 enforces byte/structural identity. 
+
+## Pass Criteria
+
+- All four surfaces agree for at least the steady-state list payload.
+
+## Failure Criteria
+
+- Any drift between HTTP and UDS bytes.
+- CLI loses fields.
+- Host API exposes raw extension payload.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-004.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-004.md
new file mode 100644
index 000000000..99092f352
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-004.md
@@ -0,0 +1,49 @@
+# TC-INT-004: `/api/openai/v1/models` HTTP-Only Registration + Filter
+
+**Priority:** P0
+**Type:** Integration
+**Systems:** `internal/api/httpapi/routes.go`, `internal/api/udsapi/routes.go`, `internal/api/core/model_catalog.go`.
+**Requirement:** TechSpec OpenAI-Compatible Projection, Task 07.
+**Status:** Not Run
+
+## Objective
+
+Verify `/api/openai/v1/models` is registered only on HTTP, returns the OpenAI-shaped projection with `agh` metadata, accepts `provider_id` filter, and is absent from UDS routes.
+
+## Preconditions
+
+- [ ] Daemon running with seeded catalog and bearer auth.
+
+## Test Steps
+
+1. **HTTP `GET /api/openai/v1/models`.**
+   - **Expected:** 200; body `{"object":"list","data":[...]}`; each item has `id`, `object="model"`, `created=0`, `owned_by=`, `agh.{provider_id, display_name, supports_tools, supports_reasoning, availability_state, reasoning_efforts, context_window, max_output_tokens, sources}`.
+2. **`provider_id` filter.**
+   - Command: `GET /api/openai/v1/models?provider_id=codex`.
+   - **Expected:** Subset filtered; deterministic order.
+3. **Unknown `provider_id`.**
+   - Command: `GET /api/openai/v1/models?provider_id=unknown-xyz`.
+   - **Expected:** 200 with empty `data` array; no error.
+4. **UDS does not expose the route.**
+   - Command: hit UDS path `/api/openai/v1/models`.
+   - **Expected:** 404 (route not registered); UDS routes table only includes the native catalog dispatcher.
+5. **Refresh route absent for OpenAI projection.**
+   - Command: HTTP `POST /api/openai/v1/models`.
+   - **Expected:** 404 / method not allowed; refresh remains exclusive to native catalog routes.
+6. **Source identity exposed in `agh.sources`.**
+   - **Expected:** Array of `source_id` strings ordered consistently with native projection.
+
+## Audit Coverage
+
+- C5, C8.
+- SI-9 (no secret in OpenAI payload).
+
+## Pass Criteria
+
+- All steps match documented behavior.
+
+## Failure Criteria
+
+- UDS exposes the OpenAI route.
+- Filter ignored.
+- `agh` metadata missing.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-005.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-005.md
new file mode 100644
index 000000000..9a1b89de8
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-005.md
@@ -0,0 +1,48 @@
+# TC-INT-005: Extension Source - Success and Denial Through Host API
+
+**Priority:** P0
+**Type:** Integration
+**Systems:** `internal/extension`, `internal/modelcatalog`, Host API.
+**Requirement:** ADR-003, TechSpec Extensibility Integration Plan, Task 08.
+**Status:** Not Run
+
+## Objective
+
+Verify the extension `model.source` end-to-end path: AGH calls extension `models/list`, validates and persists rows, and surfaces the daemon-owned projection through Host API; capability denial is deterministic.
+
+## Preconditions
+
+- [ ] Extension fixture with `model.source` capability for provider `codex`.
+- [ ] Capability grants toggleable.
+
+## Test Steps
+
+1. **Extension grant present, valid rows.**
+   - Trigger Host API `models/refresh` for `codex`.
+   - **Expected:** Extension subprocess invoked; AGH validates rows; SQLite catalog updated; status `succeeded`; `models/list` returns rows including extension priority 100.
+2. **Extension returns invalid row.**
+   - **Expected:** Row dropped; source status records redacted error referencing the offending field; valid rows persist.
+3. **Capability missing for `models/list`.**
+   - **Expected:** Deterministic capability error returned; no rows leaked.
+4. **Capability missing for `models/refresh`.**
+   - **Expected:** No subprocess invoked; source status unchanged.
+5. **Capability missing for `models/status`.**
+   - **Expected:** Deterministic capability error.
+6. **Extension declares provider it has no grant for.**
+   - **Expected:** Refresh fails closed with capability error; valid grants for other providers unaffected.
+
+## Audit Coverage
+
+- C5, C6 (Task 08), C11 disruption probe.
+- SI-8, SI-9.
+
+## Pass Criteria
+
+- Steps 1-2 produce correct catalog state.
+- Steps 3-6 deterministically denied.
+
+## Failure Criteria
+
+- Denied call returns rows.
+- Invalid extension row breaks persistence.
+- Subprocess invoked without grant.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-006.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-006.md
new file mode 100644
index 000000000..a7d96a6a3
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-INT-006.md
@@ -0,0 +1,44 @@
+# TC-INT-006: ACP SDK v0.12.2 - Create / Load / Resume Coverage
+
+**Priority:** P0
+**Type:** Integration
+**Systems:** `internal/acp` driver + ACP fake fixtures.
+**Requirement:** TechSpec ACP Session Config Options, Task 06.
+**Status:** Not Run
+
+## Objective
+
+Verify upgrade to `coder/acp-go-sdk@v0.12.2` keeps create/load/resume/mode behavior intact, exposes captured `configOptions`, and propagates `ACPCapsPayload.config_options` / `SessionConfigOptionPayload`.
+
+## Preconditions
+
+- [ ] ACP fake driver with fixtures for `session/new`, `session/load`, `config_option_update`, mode events.
+
+## Test Steps
+
+1. **`session/new` returns `configOptions`.**
+   - **Expected:** Driver records options; HTTP/UDS capability payload includes `config_options` with the documented shape.
+2. **`session/load` reuses captured options.**
+   - **Expected:** No duplicate model mutations.
+3. **`session/set_config_option` applied for model and reasoning.**
+   - **Expected:** Driver issues the call when matching IDs are advertised; legacy `session/set_model` only used as fallback (TC-FUNC-010 covers exact behavior).
+4. **`config_option_update` event mid-session.**
+   - **Expected:** Capability payload updated on next read.
+5. **Mode/cancellation/error fields renamed in v0.12.2.**
+   - **Expected:** Driver compiles and tests prove old behavior intact (existing create/load/resume coverage).
+6. **Resume flow.**
+   - **Expected:** Resumed session retains `configOptions` from prior load; no new `session/set_*` calls if state matches.
+
+## Audit Coverage
+
+- C5, C6 (Task 06).
+
+## Pass Criteria
+
+- All ACP flows pass on the upgraded SDK.
+- `config_options` surface populated.
+
+## Failure Criteria
+
+- ACP driver regresses on create/load/resume.
+- `config_options` missing in payload.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-PERF-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-PERF-001.md
new file mode 100644
index 000000000..aa7c5cd22
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-PERF-001.md
@@ -0,0 +1,50 @@
+# TC-PERF-001: Refresh Concurrency - Per-Provider Serialization + Cross-Provider Parallelism
+
+**Priority:** P0
+**Type:** Performance
+**Module:** `internal/modelcatalog.Service.Refresh`, refresh wrapper.
+**Requirement:** TechSpec SI-12, Task 11.
+**Status:** Not Run
+
+## Objective
+
+Verify per-provider refresh requests serialize before any subprocess or provider-home work, identical concurrent requests for one provider coalesce, refresh storms across providers proceed in parallel, and SQLite write contention is avoided (no `BUSY` errors).
+
+## Preconditions
+
+- [ ] Stub live sources for `codex`, `anthropic`, `gemini`, `openrouter`, `ollama` with measurable subprocess latency.
+- [ ] Test harness counts subprocess invocations and SQLite write attempts.
+
+## Test Steps
+
+1. **N concurrent same-provider refreshes.**
+   - Issue 32 simultaneous `POST /api/providers/codex/models/refresh` requests.
+   - **Expected:** Exactly one subprocess invocation; all 32 callers receive the same status batch with the same `refresh_request_id`.
+2. **N cross-provider refreshes.**
+   - Issue refreshes for all 5 providers concurrently.
+   - **Expected:** 5 underlying subprocess invocations run in parallel; total wall time approximates the slowest provider, not the sum.
+3. **Mixed storm.**
+   - Issue 32 same-provider + 32 cross-provider concurrently.
+   - **Expected:** Same-provider coalesced; cross-provider parallel; no SQLite `BUSY` error escapes coalescing.
+4. **Repeated coalescing returns identical statuses.**
+   - **Expected:** Two callers in the same coalesce window see byte-equal status payloads; refresh request id correlated.
+5. **SQLite contention.**
+   - Drive 100 refreshes/second across 5 providers for 30s.
+   - **Expected:** Zero `SQLITE_BUSY` propagated; per-provider serialization holds; no row corruption.
+
+## Audit Coverage
+
+- C5, C6 (Task 11), C11 disruption probe.
+- SI-12, SI-13.
+
+## Pass Criteria
+
+- Same-provider coalescing observed.
+- Cross-provider parallelism observed.
+- No `SQLITE_BUSY` escapes.
+
+## Failure Criteria
+
+- Multiple subprocess invocations per coalesced batch.
+- Cross-provider serialized.
+- SQLite errors observed.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-PERF-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-PERF-002.md
new file mode 100644
index 000000000..a4844d13a
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-PERF-002.md
@@ -0,0 +1,48 @@
+# TC-PERF-002: Detached Refresh Lifetime + Daemon Shutdown Join
+
+**Priority:** P0
+**Type:** Performance
+**Module:** `internal/modelcatalog` refresh wrapper.
+**Requirement:** TechSpec SI-11.
+**Status:** Not Run
+
+## Objective
+
+Verify request-cancellation does not abort detached refresh work, the configured deadline binds refresh, daemon shutdown joins outstanding refresh workers, and no goroutine leaks remain.
+
+## Preconditions
+
+- [ ] Stub source with controllable latency.
+- [ ] Goroutine leak detector or `runtime.NumGoroutine` snapshot harness.
+
+## Test Steps
+
+1. **Cancel mid-flight HTTP refresh.**
+   - Configure stub latency = 2s; client timeout = 100ms.
+   - **Expected:** Client gets canceled error; daemon completes refresh; SQLite reflects success; goroutine count returns to baseline.
+2. **Override request context deadline.**
+   - Submit refresh under a context with 50ms deadline.
+   - **Expected:** Refresh ignores caller deadline; uses configured deadline.
+3. **Daemon shutdown.**
+   - Trigger refresh; immediately call daemon shutdown.
+   - **Expected:** Shutdown waits for refresh to complete (or hits configured shutdown timeout); SQLite consistent; no orphan goroutine; `Close` on store happens after refresh worker join.
+4. **Goroutine leak check.**
+   - After 100 cancellation cycles, snapshot `runtime.NumGoroutine`.
+   - **Expected:** No monotonic growth.
+
+## Audit Coverage
+
+- C11.
+- SI-11, SI-12.
+
+## Pass Criteria
+
+- Refresh completes under cancellation.
+- Daemon shuts down cleanly.
+- No goroutine leak.
+
+## Failure Criteria
+
+- Refresh aborts when request cancels.
+- Goroutine count grows.
+- Daemon exits before refresh completes.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-REG-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-REG-001.md
new file mode 100644
index 000000000..12bd116e4
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-REG-001.md
@@ -0,0 +1,45 @@
+# TC-REG-001: Hard-Cut Residue Repository Scan
+
+**Priority:** P1
+**Type:** Regression
+**Surface:** Repository.
+**Requirement:** ADR-002, Task 11.1.
+**Status:** Not Run
+
+## Objective
+
+Verify no production code or generated artifact references `default_model`, `supported_models`, or `supports_reasoning_effort` outside the documented hard-cut warning copy and historical migration text.
+
+## Preconditions
+
+- [ ] Working tree clean except QA artifacts.
+
+## Test Steps
+
+1. **Repository grep.**
+   - Command: `grep -nE "default_model|supported_models|supports_reasoning_effort" -r --include="*.go" --include="*.ts" --include="*.tsx" --include="*.json" --include="*.toml" .`
+   - **Expected:** Only known allowlisted matches appear:
+     - `internal/modelcatalog/hardcut_residue_test.go` and related tests asserting the residue scan.
+     - `packages/site` warning copy (`provider-model-catalog-docs.test.ts`).
+     - QA artifacts under `.compozy/tasks/provider-model-catalog/qa/`.
+   - No production source under `internal/`, `web/src/`, `cmd/`, `openapi/`, or generated TS/openapi files contain the literal strings.
+2. **Generated contracts.**
+   - Inspect `openapi/agh.json` and `web/src/generated/agh-openapi.d.ts`.
+   - **Expected:** No occurrences of the deleted fields.
+3. **Web E2E fixtures.**
+   - Inspect `web/e2e/fixtures/`.
+   - **Expected:** No references to deleted keys.
+4. **Site narrative copy.**
+   - **Expected:** Only hard-cut warning copy mentions the deleted keys; the docs vitest enforces this.
+
+## Audit Coverage
+
+- C6 (Task 11), C8.
+
+## Pass Criteria
+
+- Grep produces only allowlisted matches.
+
+## Failure Criteria
+
+- Any unexpected reference.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-REG-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-REG-002.md
new file mode 100644
index 000000000..7a2078835
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-REG-002.md
@@ -0,0 +1,42 @@
+# TC-REG-002: Generated Docs and CLI Reference Stay in Sync
+
+**Priority:** P1
+**Type:** Regression
+**Surface:** `packages/site`, `make cli-docs`, `make codegen-check`.
+**Requirement:** TechSpec Docs Impact, Task 10.
+**Status:** Not Run
+
+## Objective
+
+Verify generated CLI docs, generated OpenAPI/TS types, and narrative MDX align with current production behavior.
+
+## Preconditions
+
+- [ ] Branch up-to-date.
+
+## Test Steps
+
+1. **Run `make cli-docs`.**
+   - **Expected:** No diff against committed `packages/site/content/runtime/cli/provider/models/{list,refresh,status}.mdx`.
+2. **Run `make codegen-check`.**
+   - **Expected:** No diff in `openapi/agh.json` or `web/src/generated/agh-openapi.d.ts`.
+3. **Run `cd packages/site && bun run test -- provider-model-catalog-docs`.**
+   - **Expected:** Suite passes; no flat-field claims outside warning copy.
+4. **Open `packages/site/content/runtime/core/agents/model-catalog.mdx`.**
+   - **Expected:** Documents native HTTP/UDS catalog endpoints, `/api/openai/v1/models`, refresh lifetime/coalescing, extension `model.source`. Merge priority table reflects current source priorities (config 120 / live 110 / extension 100 / models_dev 50 / builtin 10).
+5. **Open `packages/site/content/runtime/core/configuration/config-toml.mdx`.**
+   - **Expected:** `[model_catalog.sources.models_dev]`, `models.discovery`, and nested `[providers..models]` documented; defaults and validation rules match TechSpec.
+
+## Audit Coverage
+
+- C6 (Task 10).
+
+## Pass Criteria
+
+- All gates green; no diff.
+
+## Failure Criteria
+
+- Codegen diff.
+- Docs vitest fails.
+- Narrative copy contradicts daemon behavior.
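TC-REG-002 step 4 quotes the source priorities (config 120 / live 110 / extension 100 / models_dev 50 / builtin 10). The sketch below shows one way a highest-priority-wins merge over those numbers could look. It is only an illustration: `Row`, `sourcePriority`, and `mergeByPriority` are hypothetical names, and this diff does not state whether the real merge in `internal/modelcatalog` works per row or per field.

```go
package main

import "fmt"

// sourcePriority mirrors the priorities quoted in TC-REG-002 step 4.
var sourcePriority = map[string]int{
	"config":     120,
	"live":       110,
	"extension":  100,
	"models_dev": 50,
	"builtin":    10,
}

// Row is a hypothetical catalog row used only for this sketch.
type Row struct {
	ModelID     string
	Source      string
	DisplayName string
}

// mergeByPriority keeps, for every model ID, the row whose source has the
// highest priority. Unknown sources resolve to priority 0, and on a tie the
// row seen first is kept.
func mergeByPriority(rows []Row) map[string]Row {
	winners := make(map[string]Row)
	for _, r := range rows {
		cur, seen := winners[r.ModelID]
		if !seen || sourcePriority[r.Source] > sourcePriority[cur.Source] {
			winners[r.ModelID] = r
		}
	}
	return winners
}

func main() {
	rows := []Row{
		{ModelID: "manual-gpt", Source: "builtin", DisplayName: "Builtin GPT"},
		{ModelID: "manual-gpt", Source: "config", DisplayName: "Manual GPT"},
		{ModelID: "manual-gpt", Source: "models_dev", DisplayName: "Models.dev GPT"},
	}
	winner := mergeByPriority(rows)["manual-gpt"]
	fmt.Println(winner.Source, winner.DisplayName)
}
```

With these inputs the `config` row wins, which matches the expectation that a curated entry outranks discovered metadata.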
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SCEN-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SCEN-001.md
new file mode 100644
index 000000000..5da8d83ef
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SCEN-001.md
@@ -0,0 +1,83 @@
+# TC-SCEN-001: Operator Edits Provider Catalog and Starts a Session
+
+**Priority:** P0
+**Type:** Real Scenario
+**Status:** Not Run
+**Estimated Time:** 25 minutes
+**Created:** 2026-05-07
+
+---
+
+## Behavioral Scenario Charter
+
+- **Startup situation**: Operator runs an isolated AGH lab (unique `AGH_HOME`, ports, tmux socket) provisioned by `agh-qa-bootstrap`. At least one ACP-capable provider is configured with synthetic credentials; live discovery uses stub HTTP servers and fake subprocesses by default.
+- **Operator intent**: Adjust the curated metadata for one provider, refresh its catalog, then start a new session against a chosen model with reasoning effort.
+- **Expected business outcome**: Operator perceives a coherent catalog with deterministic source attribution; the new session creates against the chosen model and reasoning effort; selection persists across surfaces.
+- **AGH surfaces used**: Web (Settings > Providers, new session dialog), HTTP (`/api/providers/...`), SQLite (`model_catalog_*` tables), ACP (`session/new`, `session/set_config_option`).
+- **Real provider/LLM expectation**: ACP fake driver acts as the provider unless `MODELCATALOG_LIVE=1` is set; in that case one ACP-backed provider must produce a real `session/new` response.
+- **Blocked live-provider boundary**: Default run uses fake ACP; `MODELCATALOG_LIVE=1` annex documents the real-provider boundary in the verification report.
+- **Scenario contract minimums covered**: operator role, web channel, HTTP channel, SQLite truth, ACP control, manual entry, stale fallback observation.
+
+## Actors and Agent Roles
+
+| Actor / Agent | Role | Expected Behavior | Evidence Source |
+|---------------|------|-------------------|-----------------|
+| Operator | Catalog editor + session creator | Edits curated metadata, refreshes catalog, starts session | Web screenshots + DOM snapshot |
+| ACP fake driver | Provider | Returns `configOptions` for model + reasoning | ACP fixture transcript |
+| Daemon | Catalog authority | Persists curated edit, refreshes, projects to surfaces | SQLite + HTTP responses |
+
+## Preconditions
+
+- [ ] Bootstrap manifest exists; `AGH_WEB_API_PROXY_TARGET` exported.
+- [ ] Daemon running; web app reachable; ACP fake driver registered.
+- [ ] Catalog seeded with `models.dev` + `builtin` rows for `codex`.
+
+## Journey Steps
+
+1. **Operator opens Settings > Providers.**
+   - Surface: Web.
+   - Input: Browser navigation to `/settings/providers` via `browser-use:browser` (fallback `agent-browser`).
+   - **Expected:** Provider cards render with source status; redacted `last_error` shown for any failed source; `default_model`/`supported_models`/`supports_reasoning_effort` strings absent in DOM and React Query cache.
+2. **Operator adds a curated entry with reasoning efforts.**
+   - Surface: Web form.
+   - Input: `id="manual-gpt"`, `display_name="Manual GPT"`, `reasoning_efforts=["medium","high"]`, `default_reasoning_effort="medium"`.
+   - **Expected:** PUT request matches generated TS contract; daemon persists; SQLite `model_catalog_rows` has a `config` row at priority 120 with snapshot-preserved metadata; CLI / HTTP / UDS / Host API agree (TC-INT-003).
+3. **Operator refreshes catalog.**
+   - Surface: Web refresh button.
+   - Input: Click refresh on `codex` card.
+   - **Expected:** UI shows pending state; on completion `last_refresh_at` updates; if a stub source is failing, `stale=true` flag visible with redacted error.
+4. **Operator opens new session dialog and selects manual model.**
+   - Surface: Web dialog.
+   - Input: select provider `codex`, model `manual-gpt`, reasoning `medium`.
+   - **Expected:** Dialog renders catalog rows from `useProviderModels`; manual entry valid; submission triggers `session/new` and `session/set_config_option` (TC-FUNC-010 invariant).
+5. **Operator confirms session is live with chosen model.**
+   - Surface: Web active session panel.
+   - **Expected:** Session controls switch to ACP `configOptions`; chosen model + reasoning effort reflected; catalog metadata never overrides current option value (SI-7).
+6. **Disruption probe - stale catalog while session lives.**
+   - Probe: stub `models.dev` 5xx; trigger refresh.
+   - **Expected:** Catalog rows flagged stale; running session unaffected; manual model selection still valid.
+
+## Required Evidence
+
+- Browser screenshots: settings page, dialog, active session controls.
+- HTTP request/response logs (network panel exports).
+- SQLite snapshots (rows + status) before and after edit.
+- ACP fake driver transcript showing `session/new` + `session/set_config_option`.
+- Daemon log capture with `refresh_request_id` correlation.
+
+## Audit Coverage
+
+- C4 (operator), C5 (Web + HTTP), C8 (cross-surface), C10 (artifact reuse: catalog row reused by TC-SCEN-002), C11 (stale probe), C14.
+
+## Pass Criteria
+
+- Operator goal achieved end-to-end without manual workaround.
+- Catalog row visible across CLI/HTTP/UDS/Host API.
+- ACP control matches TC-FUNC-010 invariants.
+- Stale probe surfaces redacted error; session unaffected.
+
+## Failure Criteria
+
+- Settings form emits legacy fields.
+- ACP control regresses to `session/set_model` despite advertised config option.
+- Stale state hidden or session aborted.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SCEN-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SCEN-002.md
new file mode 100644
index 000000000..550f72c76
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SCEN-002.md
@@ -0,0 +1,83 @@
+# TC-SCEN-002: Agent Manages Catalog Through CLI/HTTP/UDS/Host API
+
+**Priority:** P0
+**Type:** Real Scenario
+**Status:** Not Run
+**Estimated Time:** 30 minutes
+**Created:** 2026-05-07
+
+---
+
+## Behavioral Scenario Charter
+
+- **Startup situation**: Same isolated lab as TC-SCEN-001. Catalog state already includes the curated row produced in TC-SCEN-001 (artifact reuse).
+- **Operator intent**: An external agent (script or another AGH agent) manages the catalog without web UI: list rows, refresh, inspect status, select a manual model, and start a session via API.
+- **Expected business outcome**: The agent's CLI/HTTP/UDS/Host API operations are deterministic, structured, byte-equal between transports for steady-state list payloads, and reflect the same persisted state as the web UI.
+- **AGH surfaces used**: CLI (`agh provider models {list|refresh|status}`), HTTP (`/api/providers/...`, `/api/openai/v1/models`), UDS (`/api/providers/...`), Host API (`models/list|refresh|status`).
+- **Real provider/LLM expectation**: ACP fake driver creates the session; opt-in `MODELCATALOG_LIVE=1` annex covers a real provider.
+- **Blocked live-provider boundary**: documented in the verification report when not running live.
+- **Scenario contract minimums covered**: agent role, CLI/HTTP/UDS/Host API channels, refresh storm, redaction, deterministic JSON output, capability gating.
+
+## Actors and Agent Roles
+
+| Actor / Agent | Role | Expected Behavior | Evidence Source |
+|---------------|------|-------------------|-----------------|
+| Remote agent | Catalog reader/operator | Drives CLI + HTTP + UDS + Host API | CLI transcripts |
+| Daemon | Catalog authority | Serves identical projection | HTTP/UDS/Host API responses |
+| Extension | `model.source` provider | Returns valid + invalid rows on demand | Extension subprocess transcript |
+
+## Preconditions
+
+- [ ] Bootstrap manifest exists; `AGH_HOME`, ports, sockets unique.
+- [ ] Daemon running with extension fixture installed.
+- [ ] TC-SCEN-001 catalog state present (`manual-gpt` curated row).
+
+## Journey Steps
+
+1. **Agent runs `agh provider models list -o json`.**
+   - Surface: CLI.
+   - Input: no flags.
+   - **Expected:** JSON includes `manual-gpt` row from TC-SCEN-001; sources sorted deterministically; output structurally equivalent to HTTP `GET /api/providers/models`.
+2. **Agent triggers refresh storm.**
+   - Surface: CLI + HTTP concurrently for `codex`, `anthropic`, `gemini`.
+   - **Expected:** Same-provider coalescing observed (TC-PERF-001); cross-provider parallel; redacted errors only.
+3. **Agent inspects status via UDS.**
+   - Surface: UDS client.
+   - **Expected:** Same byte-equal status payload as HTTP for steady-state list; `refresh_request_id`, `last_refresh_at`, `last_error` redacted.
+4. **Agent inspects OpenAI projection.**
+   - Surface: HTTP `GET /api/openai/v1/models?provider_id=codex`.
+   - **Expected:** OpenAI shape with `agh` metadata; UDS does NOT expose this route.
+5. **Agent calls Host API `models/list` (with grant).**
+   - **Expected:** Daemon-owned projection; structurally equivalent to HTTP/CLI; raw extension payload not leaked.
+6. **Agent revokes Host API grant and retries.**
+   - **Expected:** Deterministic capability error; no rows leaked.
+7. **Agent creates a session via HTTP `POST /api/sessions` selecting manual model.**
+   - **Expected:** Session creation succeeds; ACP control uses `session/set_config_option`; session fixture confirms.
+8. **Disruption probe - extension returns invalid row mid-storm.**
+   - **Expected:** Invalid row dropped; valid rows persist; redacted error surfaced; refresh request id correlated in logs.
+
+## Required Evidence
+
+- CLI transcripts with structured JSON output.
+- HTTP/UDS response bodies (canonical sort) for byte-equality check.
+- Host API response bodies before and after grant revoke.
+- Daemon log entries with `refresh_request_id`, `provider_id`, `source_id`, `source_kind`, `extension_name` correlation keys.
+- Extension subprocess transcript.
+
+## Audit Coverage
+
+- C4 (agent), C5 (CLI + HTTP + UDS + Host API), C8 (parity), C9 (provider boundary), C10 (artifact reuse from TC-SCEN-001), C11 (refresh storm + extension denial + invalid row), C14.
+
+## Pass Criteria
+
+- All four surfaces show identical persisted state.
+- Refresh storm coalesced.
+- Capability gate enforced.
+- Manual model session created via API.
+
+## Failure Criteria
+
+- CLI/HTTP/UDS/Host API drift.
+- OpenAI projection registered on UDS.
+- Capability gate bypassed.
+- Refresh storm causes SQLite `BUSY`.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SEC-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SEC-001.md
new file mode 100644
index 000000000..db62b537d
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SEC-001.md
@@ -0,0 +1,43 @@
+# TC-SEC-001: No Secret Material Leaks Across Surfaces
+
+**Priority:** P0
+**Type:** Security
+**OWASP Category:** A09 (logging) / A02 (cryptographic failures)
+**Risk Level:** Critical
+**Requirement:** TechSpec SI-9.
+**Status:** Not Run
+
+## Objective
+
+Verify API keys, OAuth tokens, secret-shaped env vars, and provider credential material never appear in any source error, log, status payload, SSE event, web-visible payload, or Host API response.
+
+## Preconditions
+
+- [ ] Daemon running; structured logs captured.
+- [ ] Source stubs configured to return errors that include API key, OAuth token, env-shaped secrets.
+- [ ] Provider env explicitly seeds `OPENAI_API_KEY=sk-test-1234567890abcdef`, `ANTHROPIC_API_KEY=sk-ant-secret`, `OAUTH_REFRESH_TOKEN=oauth.refresh.secret`.
+
+## Test Steps
+
+1. **Trigger refresh failures with seeded errors for `models.dev`, live providers, extension source.**
+2. **Capture logs (stdout + structured), HTTP/UDS status responses, CLI output, Host API response, web `network` traffic from Settings > Providers, SSE events.**
+3. **Grep all captured payloads for the seeded secret strings.**
+   - **Expected:** Zero matches.
+4. **Reduce redaction helper to no-op (test harness override) and re-run.**
+   - **Expected:** Projection-time redaction still catches secrets; defense-in-depth confirmed.
+5. **Restore redaction helper; introduce a new secret-looking string in error.**
+   - **Expected:** Redacted summary remains readable but obfuscates secret-shaped substrings.
+
+## Audit Coverage
+
+- C11 disruption probe, C14.
+- SI-9.
+
+## Pass Criteria
+
+- No secret leak across any surface.
+
+## Failure Criteria
+
+- Secret string appears in any captured surface.
+- Redaction toggleable from outside redact helper.
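The scan in TC-SEC-001 step 3 could be automated along the following lines. This is a harness-side sketch under stated assumptions: the secret values are the ones seeded in the preconditions, while `findLeaks`, the surface names, and the sample payloads are hypothetical and do not describe the repository's actual test utilities or redaction helper.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// seededSecrets are the values the TC-SEC-001 preconditions place in the
// provider environment.
var seededSecrets = []string{
	"sk-test-1234567890abcdef",
	"sk-ant-secret",
	"oauth.refresh.secret",
}

// findLeaks returns the names of captured surfaces whose payload contains at
// least one seeded secret. The result is sorted so failures are reported in
// a stable order.
func findLeaks(captured map[string][]byte, secrets []string) []string {
	var leaks []string
	for surface, payload := range captured {
		for _, s := range secrets {
			if bytes.Contains(payload, []byte(s)) {
				leaks = append(leaks, surface)
				break // one hit is enough to flag the surface
			}
		}
	}
	sort.Strings(leaks)
	return leaks
}

func main() {
	// Hypothetical captures: one redacted status payload, one leaking log line.
	captured := map[string][]byte{
		"http_status": []byte(`{"last_error":"upstream rejected key [REDACTED]"}`),
		"daemon_log":  []byte(`refresh failed: 401 for key sk-ant-secret`),
	}
	fmt.Println(findLeaks(captured, seededSecrets))
}
```

An empty result corresponds to the "Zero matches" expectation; any surface name in the output is a failure under the test case's Failure Criteria.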
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SEC-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SEC-002.md
new file mode 100644
index 000000000..6b3d654fc
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-SEC-002.md
@@ -0,0 +1,50 @@
+# TC-SEC-002: `/api/openai/v1/models` Auth + OpenAI-Shaped Errors
+
+**Priority:** P0
+**Type:** Security
+**OWASP Category:** A01 (broken access control)
+**Risk Level:** High
+**Requirement:** TechSpec OpenAI-Compatible Projection.
+**Status:** Not Run
+
+## Objective
+
+Verify the OpenAI projection enforces bearer auth like every `/api/*` route, returns OpenAI-shaped error envelope on auth failure, and remains absent from UDS where authentication semantics differ.
+
+## Preconditions
+
+- [ ] Daemon running with bearer auth enforced.
+- [ ] Test client without token.
+
+## Test Steps
+
+1. **Unauthenticated HTTP request.**
+   - Command: `GET /api/openai/v1/models` without `Authorization`.
+   - **Expected:** 401 / 403 with OpenAI-shaped error envelope: `{"error":{"message":"...","type":"...","code":"..."}}`; AGH HTTP status code matches `/api/*` semantics; no catalog data leaked.
+2. **Bad bearer token.**
+   - **Expected:** Same shape; rate limiting and CORS middleware applied if enabled.
+3. **CORS preflight.**
+   - Send OPTIONS with allowed origin.
+   - **Expected:** CORS responds per `/api/*` policy.
+4. **Authenticated `provider_id` filter for unknown provider.**
+   - **Expected:** 200 with empty `data`; no error.
+5. **Method not supported for refresh.**
+   - Command: `POST /api/openai/v1/models`.
+   - **Expected:** 404/405 with OpenAI-shaped error if applicable.
+6. **UDS does not expose the route.**
+   - Command: hit UDS path.
+   - **Expected:** 404; auth boundary respected.
+
+## Audit Coverage
+
+- C5, C9 boundary, C11.
+
+## Pass Criteria
+
+- All steps return documented behavior.
+
+## Failure Criteria
+
+- Unauthenticated call returns data.
+- Error envelope diverges from OpenAI shape.
+- UDS exposes the route.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-001.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-001.md
new file mode 100644
index 000000000..9396364a4
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-001.md
@@ -0,0 +1,57 @@
+# TC-UI-001: Settings > Providers - Source Status + Refresh
+
+**Priority:** P1
+**Type:** UI
+**Surface:** `web/src/routes/_app/settings/providers.tsx`, `web/src/systems/model-catalog/`
+**Requirement:** TechSpec Web, Task 09.
+**Status:** Not Run
+
+## Objective
+
+Verify each provider card surfaces source status (id, kind, last refresh, next refresh, redacted last error, stale flag), exposes a refresh control, and reflects daemon-served catalog state including curated metadata snapshot preservation.
+
+## Preconditions
+
+- [ ] Daemon running with seeded catalog state.
+- [ ] Web app served under `AGH_WEB_API_PROXY_TARGET` from bootstrap manifest.
+- [ ] Browser via `browser-use:browser` or `agent-browser` fallback.
+
+## Test Steps
+
+1. **Open Settings > Providers.**
+   - **Expected:** Each provider card lists every catalog source with status; loading skeleton replaced by data; no console errors.
+2. **Trigger refresh for one provider.**
+   - **Expected:** Refresh button enters pending state; on completion the card updates source rows, `last_refresh_at`, and `refresh_state`; no other provider is impacted.
+3. **Force a source error and refresh.**
+   - **Expected:** Card shows redacted `last_error`; stale flag visible; manual entry control still available.
+4. **Curated metadata snapshot preserved on save.**
+   - Edit curated entry; save settings.
+   - **Expected:** Catalog adapters use snapshot-preserved metadata so unrelated rows are not mutated; daemon `config` source rows reflect only the edited fields.
+5. **Visual conformance.**
+   - **Expected:** Card uses `DESIGN.md` tokens (no shadows, warm-dark palette, signal palette for refresh state colors). Default state matches `Paper` artboards in `DESIGN.md`. No invented metrics shown.
+
+## Visual Specifications
+
+- Background: `oklch` warm-dark token from `DESIGN.md`.
+- Refresh button states: idle (neutral), running (warning `#FFD60A`), success (`#30D158`), failure (`#FF453A`).
+- Stale label uses warning palette.
+
+## Responsive Checks
+
+- Desktop 1280px, Tablet 768px, Mobile 375px - layout legible at each breakpoint.
+
+## Audit Coverage
+
+- C5, C8, C11.
+
+## Pass Criteria
+
+- Source status renders correctly with redacted errors.
+- Refresh updates only target provider.
+- Curated edit preserves snapshot.
+
+## Failure Criteria
+
+- Stale flag missing.
+- Console error during refresh.
+- Curated edit corrupts other models.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-002.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-002.md
new file mode 100644
index 000000000..82bcea959
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-002.md
@@ -0,0 +1,43 @@
+# TC-UI-002: Settings > Providers - Manual Entry + Curated Edit
+
+**Priority:** P1
+**Type:** UI
+**Surface:** `web/src/routes/_app/settings/providers.tsx`
+**Requirement:** TechSpec SI-6, Task 09.
+**Status:** Not Run
+
+## Objective
+
+Verify the new settings form edits `models.default` and `models.curated`, allows manual model IDs (curated is not an allowlist), and emits payloads matching the new nested contract (no `default_model`, `supported_models`, or `supports_reasoning_effort`).
+
+## Preconditions
+
+- [ ] Daemon and web app running.
+
+## Test Steps
+
+1. **Add curated model with reasoning efforts.**
+   - **Expected:** Form accepts metadata; submits payload using `models.default`/`models.curated`; daemon persists; CLI/HTTP/UDS reflect change.
+2. **Set default to a model NOT in curated list.**
+   - **Expected:** Form accepts; payload shows `models.default = "manual-id"`; manual model becomes selectable in session create dialog.
+3. **Reject duplicate curated id.**
+   - **Expected:** Form shows inline validation error.
+4. **Reject blank reasoning effort.**
+   - **Expected:** Inline validation error referencing the empty entry.
+5. **Inspect payload network request.**
+   - **Expected:** No legacy keys; matches generated TS contract `web/src/generated/agh-openapi.d.ts`.
+
+## Audit Coverage
+
+- C5, C8.
+- SI-6.
+
+## Pass Criteria
+
+- Form contract matches generated types.
+- Manual entry accepted.
+
+## Failure Criteria
+
+- Legacy fields appear in payload.
+- Manual default rejected.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-003.md b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-003.md
new file mode 100644
index 000000000..3b5cf715d
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-cases/TC-UI-003.md
@@ -0,0 +1,51 @@
+# TC-UI-003: New Session Dialog - Catalog vs ACP Config Options
+
+**Priority:** P1
+**Type:** UI
+**Surface:** `web/src/systems/session/components/session-create-dialog.tsx`, `web/src/systems/session/hooks/use-session-create-dialog.ts`, `web/src/systems/model-catalog/lib/derive-active-session-options.ts`.
+**Requirement:** TechSpec SI-7, Task 09.
+**Status:** Not Run
+
+## Objective
+
+Verify the new session dialog uses the daemon catalog (not legacy `supported_models`) for pre-session model selection, supports manual entry, surfaces stale/error/empty states, and switches to ACP `configOptions` once the session is active.
+
+## Preconditions
+
+- [ ] Daemon with seeded catalog.
+- [ ] Web app under bootstrap manifest proxy target.
+
+## Test Steps
+
+1. **Open new session dialog with seeded catalog.**
+   - **Expected:** Model picker lists rows from `useProviderModels`; sources/availability badges visible; manual entry input present.
+2. **Catalog stale.**
+   - Force stale flag in seed.
+   - **Expected:** Stale models render with stale label; selection still allowed.
+3. **Catalog empty.**
+   - Seed with zero rows.
+   - **Expected:** Empty state shown; manual entry remains valid; submitting manual model creates session successfully.
+4. **Refresh in dialog.**
+   - **Expected:** `useRefreshProviderModels` triggers; loading state visible; rows update on completion.
+5. **Switch to active session.**
+   - After session creates, open active session settings panel.
+   - **Expected:** Controls switch to ACP `configOptions` (model + reasoning) via `deriveActiveSessionOptions`; catalog metadata never overrides session option current value (SI-7).
+6. **No legacy field reads.**
+   - Inspect React Query cache + network responses.
+   - **Expected:** No `supported_models` / `default_model` / `supports_reasoning_effort` references.
+
+## Audit Coverage
+
+- C5, C7, C11.
+- SI-7.
+
+## Pass Criteria
+
+- Catalog drives picker; ACP overrides post-creation.
+- Manual entry valid in all states.
+
+## Failure Criteria
+
+- Picker reads legacy field.
+- ACP override missing.
+- Manual entry blocked.
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-plans/00-coverage-matrix.md b/.compozy/tasks/provider-model-catalog/qa/test-plans/00-coverage-matrix.md
new file mode 100644
index 000000000..8aa31bd1b
--- /dev/null
+++ b/.compozy/tasks/provider-model-catalog/qa/test-plans/00-coverage-matrix.md
@@ -0,0 +1,124 @@
+# Provider Model Catalog - QA Coverage Matrix
+
+This matrix maps every TechSpec safety invariant, ADR decision, and implementation task to the concrete test cases authored under `qa/test-cases/`. Task 13 must run every case listed here. A blank row is a Task 13 blocker.
+
+## Source Authorities
+
+- TechSpec: `.compozy/tasks/provider-model-catalog/_techspec.md` (Safety Invariants 1-13, Testing Approach, Observability).
+- ADRs: `adrs/adr-001-daemon-owned-provider-model-catalog.md`, `adrs/adr-002-provider-model-config-hard-cut.md`, `adrs/adr-003-extension-model-source-contract.md`. +- Tasks: `task_01.md` through `task_11.md`. +- QA tail template: `.agents/skills/cy-tasks-tail-qa-pair/references/hermes-tail-template.md`. + +## TechSpec Safety Invariants + +| Invariant | Description | Test Cases | +|-----------|-------------|------------| +| SI-1 | Session creation never depends on successful network model discovery. | TC-SCEN-001, TC-FUNC-008, TC-INT-005 | +| SI-2 | Discovery must not create, load, mutate, or stop ACP sessions. | TC-FUNC-009, TC-INT-006 | +| SI-3 | Live discovery uses provider effective auth/home/env policy and explicit timeouts. | TC-FUNC-009, TC-FUNC-014 | +| SI-4 | Source refresh failure records source status and preserves prior stale rows. | TC-FUNC-006, TC-FUNC-013, TC-INT-002 | +| SI-5 | `models.dev` rows never prove account-level availability. | TC-FUNC-005, TC-INT-002 | +| SI-6 | `models.curated` is never an allowlist; manual model IDs remain valid. | TC-FUNC-002, TC-SCEN-002, TC-UI-002 | +| SI-7 | Active ACP `configOptions` override catalog metadata for that session only. | TC-FUNC-010, TC-INT-006, TC-UI-003 | +| SI-8 | Global catalog rows are only written through `internal/modelcatalog.Store`. | TC-FUNC-004, TC-INT-001 | +| SI-9 | No raw secrets, API keys, OAuth data, or credential material in source errors / logs / status / SSE / web / Host API. | TC-SEC-001, TC-FUNC-013, TC-INT-002 | +| SI-10 | SQLite schema changes append a new migration at the registry tail and pass fresh DB plus reopen-after-restart tests. | TC-INT-001 | +| SI-11 | HTTP/UDS request lifetime does not own background refresh; refresh uses `context.WithoutCancel(ctx)` + explicit deadline. | TC-FUNC-014, TC-PERF-002 | +| SI-12 | Live refresh work is serialized/coalesced per `provider_id` before touching `HOME`, native CLI auth state, cache files, or SQLite. 
| TC-PERF-001, TC-PERF-002 | +| SI-13 | Partial-source success is success; list fails only when every usable source fails and no stale cache exists. | TC-FUNC-007, TC-INT-002 | + +## ADR Decisions + +| ADR | Decision | Test Cases | +|-----|----------|------------| +| ADR-001 | Daemon-owned catalog with HTTP/UDS/CLI/Host API/web parity. | TC-INT-002, TC-INT-003, TC-INT-004, TC-SCEN-002 | +| ADR-002 | Hard cut of `default_model`/`supported_models`/`supports_reasoning_effort`. | TC-FUNC-001, TC-REG-001 | +| ADR-003 | Extension `model.source` capability + Host API `models/list|refresh|status`. | TC-FUNC-011, TC-FUNC-012, TC-INT-005 | + +## Task Coverage + +| Task | Title | Test Cases | +|------|-------|------------| +| 01 | Provider Config and Builtin Model Hard Cut | TC-FUNC-001, TC-FUNC-002, TC-REG-001 | +| 02 | Model Catalog Persistence | TC-INT-001 | +| 03 | Catalog Service and Catalog Sources | TC-FUNC-003, TC-FUNC-004, TC-FUNC-005, TC-FUNC-006, TC-FUNC-007 | +| 04 | Live Provider Discovery Sources | TC-FUNC-008, TC-FUNC-009 | +| 05 | Daemon Catalog Wiring | TC-INT-002, TC-PERF-001, TC-PERF-002 | +| 06 | ACP SDK Upgrade and Config Options | TC-FUNC-010, TC-INT-006, TC-UI-003 | +| 07 | HTTP, UDS, CLI, OpenAI Model Projection | TC-INT-002, TC-INT-003, TC-INT-004, TC-SEC-002 | +| 08 | Extension Model Source Contract | TC-FUNC-011, TC-FUNC-012, TC-INT-005 | +| 09 | Web Model Catalog Experience | TC-UI-001, TC-UI-002, TC-UI-003, TC-SCEN-001, TC-SCEN-002 | +| 10 | Generated Contracts and Runtime Docs | TC-FUNC-015, TC-REG-002 | +| 11 | Cross-Surface Regression Hardening | TC-FUNC-013, TC-FUNC-014, TC-PERF-001, TC-PERF-002 | + +## Public Surface Coverage + +| Surface | Endpoints / Commands | Test Cases | +|---------|----------------------|------------| +| HTTP native catalog | `GET /api/providers/models`, `GET /api/providers/{provider_id}/models`, `POST /api/providers/models/refresh`, `POST /api/providers/{provider_id}/models/refresh`, `GET /api/providers/models/status`, 
`GET /api/providers/{provider_id}/models/status` | TC-INT-002, TC-INT-003 | +| HTTP-only OpenAI projection | `GET /api/openai/v1/models`, `GET /api/openai/v1/models?provider_id=` | TC-INT-004, TC-SEC-002 | +| UDS native catalog | Same path family registered on UDS group, **never** the OpenAI projection. | TC-INT-002, TC-INT-003, TC-INT-004 | +| CLI | `agh provider models list [provider]`, `agh provider models refresh [provider]`, `agh provider models status [provider]`, with `--source`, `--refresh`, `--include-stale`, `-o json`. | TC-INT-002, TC-INT-003, TC-SCEN-002 | +| Extension Host API | `models/list`, `models/refresh`, `models/status` | TC-FUNC-011, TC-FUNC-012, TC-INT-005 | +| AGH -> extension | `models/list` request shape, capability gate. | TC-FUNC-011, TC-FUNC-012 | +| Web (Settings > Providers) | `web/src/routes/_app/settings/providers.tsx`, source status cards, refresh button, curated/default editor. | TC-UI-001, TC-UI-002 | +| Web (Session create dialog) | `web/src/systems/session/components/session-create-dialog.tsx`, model picker pulled from catalog, manual entry fallback. | TC-UI-003, TC-SCEN-001 | +| Web TanStack adapter | `web/src/systems/model-catalog/` query keys, hooks, adapter, `deriveActiveSessionOptions`. | TC-UI-003 | +| Generated contracts | `openapi/agh.json`, `web/src/generated/agh-openapi.d.ts`, extension TS types. | TC-FUNC-015, TC-REG-002 | +| Docs | `packages/site/content/runtime/core/agents/model-catalog.mdx`, `providers.mdx`, `config-toml.mdx`, `cli/provider/models/*.mdx`, extension authoring docs. | TC-FUNC-015 | +| `config.toml` | `[providers..models]` (default, curated, discovery), `[model_catalog.sources.models_dev]`. | TC-FUNC-001, TC-FUNC-002, TC-INT-001 | +| Observability | Structured logs/events with `refresh_request_id`, `provider_id`, `source_id`, `source_kind`, `model_id`, `extension_name`. 
| TC-FUNC-013, TC-INT-005, TC-PERF-002 | +| Persistence | `model_catalog_sources`, `model_catalog_rows`, `model_catalog_reasoning_efforts` tables (global migration v23). | TC-INT-001 | + +## Failure-Mode Coverage + +| Failure / Edge Case | Cases | +|---------------------|-------| +| Old TOML keys present (`default_model`, `supported_models`, `supports_reasoning_effort`). | TC-FUNC-001, TC-REG-001 | +| Curated default not in curated list. | TC-FUNC-002 | +| Curated duplicate IDs / blank reasoning efforts / `default_reasoning_effort` not in list. | TC-FUNC-002 | +| `models.dev` HTTP 5xx, network timeout, JSON malformed, legacy field aliases. | TC-FUNC-005, TC-FUNC-006, TC-FUNC-013 | +| `models.dev` disabled via config. | TC-FUNC-005 | +| Live provider source timeout, subprocess failure, missing auth. | TC-FUNC-008, TC-FUNC-009 | +| Live provider source attempts ACP `session/new`/`set_*`. | TC-FUNC-009 | +| Stale source rows preserved across daemon restart. | TC-FUNC-006, TC-INT-001 | +| All sources fail, no stale cache exists. | TC-FUNC-007 | +| Source error contains API key / OAuth token / env secret. | TC-SEC-001, TC-FUNC-013 | +| Source error shape leaks beyond redaction at HTTP/UDS/Web/Host API. | TC-SEC-001, TC-INT-002 | +| Concurrent same-provider refresh. | TC-PERF-001 | +| Concurrent cross-provider refresh storm. | TC-PERF-001 | +| Repeated coalesced refresh returns same status batch. | TC-PERF-001 | +| Request cancellation during refresh detaches refresh lifetime. | TC-PERF-002, TC-FUNC-014 | +| SQLite `BUSY` write contention. | TC-PERF-001 | +| Extension capability missing or revoked. | TC-FUNC-012 | +| Extension manifest declares non-normalizable `model.source` slug. | TC-FUNC-011 | +| Extension `models/list` returns invalid rows. | TC-FUNC-011 | +| `/api/openai/v1/models` registered on UDS by mistake. | TC-INT-004 | +| `/api/openai/v1/models` unauthenticated request. | TC-SEC-002 | +| `/api/openai/v1/models?provider_id=unknown`. 
| TC-INT-004 | +| ACP `session/set_config_option` succeeds; `session/set_model` fallback only when config option absent. | TC-FUNC-010, TC-INT-006 | +| ACP session exposes no model option; reasoning never sent. | TC-FUNC-010 | +| Web: Settings > Providers refresh button surfaces stale state and last error. | TC-UI-001 | +| Web: New session dialog uses ACP `configOptions` after creation. | TC-UI-003 | +| Web: Manual model entry remains valid when curated empty. | TC-UI-002, TC-SCEN-001 | +| Generated docs / OpenAPI / TS types drift. | TC-FUNC-015, TC-REG-002 | + +## Real-Scenario Mapping (TC-SCEN) + +| TC-SCEN | Operator Journey | Surfaces | TechSpec Anchors | +|---------|-------------------|----------|-------------------| +| TC-SCEN-001 | Operator opens Settings > Providers, edits curated metadata, refreshes models, then creates a session and selects a model. | Web + HTTP + SQLite + ACP | SI-1, SI-6, SI-7 | +| TC-SCEN-002 | Agent driving CLI/HTTP/UDS lists, refreshes, and inspects model status without using the web UI. | CLI + HTTP + UDS + SQLite | SI-4, SI-12, SI-13 | + +## Auditor Coverage + +The TC-SCEN cases must satisfy: + +- C4 actor/role coverage: operator + agent both exercise catalog surfaces. +- C5 channels: HTTP, UDS, CLI, web, Host API. +- C6 task tree: TC-SCEN cases reference Tasks 01-11. +- C8 cross-surface truth: TC-INT-003 / TC-SCEN-002 compare CLI/HTTP/UDS/Host API/web payloads. +- C9 live provider: TC-FUNC-008 documents the live discovery boundary; real-provider runs are opt-in. +- C10 artifact reuse: catalog rows produced in TC-SCEN-001 are reused by TC-SCEN-002. +- C11 disruption probes: stale, timeout, redaction, denial, and SQLite contention. +- C14 final verification: TC-SCEN-001 and TC-SCEN-002 require `make verify` evidence in `qa/verification-report.md`. 
diff --git a/.compozy/tasks/provider-model-catalog/qa/test-plans/provider-model-catalog-regression.md b/.compozy/tasks/provider-model-catalog/qa/test-plans/provider-model-catalog-regression.md new file mode 100644 index 000000000..4b8110844 --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-plans/provider-model-catalog-regression.md @@ -0,0 +1,79 @@ +# Provider Model Catalog - Regression Suite + +This suite drives the existing test cases in `qa/test-cases/` through tiered execution that Task 13 follows. + +## Tiered Execution + +| Suite | Duration | Frequency | Cases | +|-------|----------|-----------|-------| +| Smoke | ≤15 min | Per change | SMOKE-001 (daemon start, focused gates compile, web build, docs vitest, codegen-check) | +| Targeted | 30-60 min | Per task PR | All TC-FUNC-* + relevant TC-INT-* for changed surfaces | +| Full Release | 2-3 h | Release / Task 13 | All TC-FUNC, TC-INT, TC-PERF, TC-SEC, TC-UI, TC-REG, TC-SCEN | +| Sanity | 10-15 min | After hotfix | TC-FUNC-001, TC-INT-002, TC-SCEN-001 happy path only | + +## P0 Cases (must always pass) + +- TC-FUNC-001: Old TOML keys rejected. +- TC-FUNC-004: Catalog merge + tie-break determinism. +- TC-FUNC-007: Partial-source success / all-source failure. +- TC-FUNC-010: ACP `session/set_config_option` precedence over `session/set_model`. +- TC-FUNC-013: Source error redaction at projection boundary. +- TC-INT-001: Global migration v23 fresh DB + reopen-after-restart. +- TC-INT-002: HTTP/UDS native catalog handlers serve daemon-owned projection. +- TC-INT-003: HTTP/UDS canonical JSON byte equality + CLI parity. +- TC-INT-004: `/api/openai/v1/models` HTTP-only registration + auth + provider filter. +- TC-INT-005: Extension `model.source` + Host API `models/list|refresh|status`. +- TC-INT-006: ACP fixtures from upgraded SDK keep create/load/resume covered. +- TC-PERF-001: Per-provider refresh serialization + coalescing under concurrency. 
+- TC-PERF-002: Detached refresh lifetime survives request cancellation. +- TC-SEC-001: No raw secrets across logs / status / API / SSE / web / Host API. +- TC-SEC-002: `/api/openai/v1/models` rejects unauthenticated calls with OpenAI-shaped error. +- TC-SCEN-001: Operator real journey through web → HTTP → SQLite → ACP. +- TC-SCEN-002: Agent real journey through CLI → HTTP/UDS → Host API. + +## P1 Cases (≥90% pass required) + +- TC-FUNC-002: Curated config validation rules. +- TC-FUNC-003: Builtin source converts defaults to priority-10 rows. +- TC-FUNC-005: `models.dev` source TTL, disable, legacy alias parsing. +- TC-FUNC-006: Stale fallback when refresh fails after prior success. +- TC-FUNC-008: Live provider source timeout + per-provider env/home policy. +- TC-FUNC-009: Live discovery never calls ACP `session/*` mutators. +- TC-FUNC-011: Extension manifest validation + invalid row rejection. +- TC-FUNC-012: Extension capability missing/revoked is treated as denial. +- TC-FUNC-014: Refresh deadline detached from request context. +- TC-FUNC-015: Codegen drift gate for OpenAPI / TS contracts / docs. +- TC-REG-001: Hard-cut residue scan. +- TC-REG-002: Generated docs and CLI reference stay in sync. +- TC-UI-001: Settings > Providers source status + refresh state. +- TC-UI-002: Settings > Providers manual entry + curated edit. +- TC-UI-003: New session dialog uses ACP `configOptions` post-creation. + +## P2 / Exploratory + +- Manual exploratory probes documented in `qa/verification-report.md`: + - Toggle `[model_catalog.sources.models_dev].enabled = false` and verify status reflects disabled state without outbound HTTP. + - Disable extension grant mid-run and observe CLI/Host API states. + - Force `models.dev` 5xx and observe stale rows persisted. + +## Pass/Fail Criteria + +- **PASS**: All P0 cases pass; ≥90% P1 pass; remaining P1 failures have BUGs filed with root cause + fix; `make verify` clean. 
+- **FAIL**: Any P0 fails; secret material leaks anywhere; cross-surface parity diverges; SQLite contention causes BUSY errors that escape coalescing; ACP regression fallback path executes when config option exists. +- **CONDITIONAL**: P1 failure only with documented workaround AND scheduled fix in `qa/verification-report.md`. + +## Execution Order + +1. Smoke (SMOKE-001) — block on failure. +2. P0 unit + integration cases (TC-FUNC + TC-INT). +3. P0 perf + security cases (TC-PERF, TC-SEC). +4. P1 cases. +5. UI cases (TC-UI) under Playwright. +6. Real-scenario cases (TC-SCEN). +7. Final `make verify`. + +## Reporting + +- Update `qa/verification-report.md` after each case batch. +- File `qa/issues/BUG-NNN.md` for every reproduced defect with TC-ID linkage. +- Update `qa/test-cases/.md` execution history table. diff --git a/.compozy/tasks/provider-model-catalog/qa/test-plans/provider-model-catalog-test-plan.md b/.compozy/tasks/provider-model-catalog/qa/test-plans/provider-model-catalog-test-plan.md new file mode 100644 index 000000000..4aaee01aa --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/test-plans/provider-model-catalog-test-plan.md @@ -0,0 +1,190 @@ +# Provider Model Catalog - Master QA Plan + +## Executive Summary + +The provider model catalog program (Tasks 01-11) replaces the flat provider model fields with a daemon-owned, persisted, refreshable, agent-manageable catalog. It hard-cuts `default_model`, `supported_models`, and `supports_reasoning_effort`; introduces nested `[providers..models]`, `[model_catalog.sources.models_dev]`, and discovery config; persists rows in three new SQLite tables; exposes HTTP/UDS/CLI/Host API/web/`/api/openai/v1/models` projections; and upgrades ACP to `coder/acp-go-sdk@v0.12.2` with `session/set_config_option` semantics. + +This plan defines the QA contract that Task 13 must execute. 
Every TechSpec safety invariant (SI-1..SI-13), every ADR decision, every public surface, every failure mode, and every cross-surface parity boundary has a concrete test case in `qa/test-cases/`. + +### Objectives + +- Prove the hard cut is complete: no production code reads `default_model`, `supported_models`, or `supports_reasoning_effort`; old TOML keys fail with deterministic errors. +- Prove the catalog merge policy is deterministic: priority ordering, freshness tie-break, source-id tie-break, lower-priority enrichment, merged availability states, partial success. +- Prove HTTP, UDS, CLI, Host API, web, and the OpenAI projection serve the same persisted catalog state. +- Prove refresh stays correct under concurrency, request cancellation, SQLite write contention, and source failure. +- Prove redaction is enforced at persistence, projection, and log boundaries. +- Prove ACP sessions respect `configOptions` and only fall back to `session/set_model` when config options are absent. +- Prove operator and agent can manage the catalog without web UI through CLI/HTTP/UDS/Host API. + +### Out of Scope + +- Droid discovery. +- Fake ACP sessions for discovery. +- `models.dev` as account-level availability proof. +- `models.curated` as an allowlist. +- Real-provider `models.dev`, OpenAI, Anthropic, Gemini, OpenRouter, Vercel, Ollama, OpenCode HTTP calls — opt-in only via env tags, not gated by `make verify`. + +## Scope + +### In-Scope Surfaces + +- Go runtime: `internal/config`, `internal/store/globaldb`, `internal/modelcatalog`, `internal/acp`, `internal/api/core`, `internal/api/httpapi`, `internal/api/udsapi`, `internal/cli`, `internal/extension`, `internal/daemon`. +- Generated contracts: `openapi/agh.json`, `web/src/generated/agh-openapi.d.ts`, extension TS types. +- Web app: `web/src/systems/model-catalog/`, `web/src/systems/session/`, `web/src/systems/settings/`, `web/src/routes/_app/settings/providers.tsx`, web E2E fixtures. 
+- Docs: `packages/site/content/runtime/core/agents/{providers.mdx,model-catalog.mdx,extensions/*.mdx}`, `packages/site/content/runtime/core/configuration/config-toml.mdx`, generated CLI docs under `packages/site/content/runtime/cli/provider/models/*.mdx`. + +### Out-of-Scope + +- Real-provider live discovery validation (covered as opt-in scenario with explicit boundaries). +- Pricing/cost rendering changes outside the catalog payload. +- AGH Network protocol changes. + +## Behavioral Scenario Charter + +- **Startup situation**: Greenfield AGH alpha (no production users). Operator runs daemon locally with isolated `AGH_HOME`, custom ports, and a tmux-bridge socket. Provider env may include real or stubbed credentials per scenario. +- **Operator intent**: Add or refine a provider, see which models AGH knows about, refresh catalog state, and start a session against a chosen model with optional reasoning effort. +- **Expected business outcome**: The operator sees a coherent, deterministic, source-attributed catalog; manual model entry remains valid; sessions start without depending on network discovery; agent and operator perceive the same catalog state across surfaces. +- **AGH surfaces used**: HTTP (`/api/providers/...`, `/api/openai/v1/models`), UDS, CLI (`agh provider models {list|refresh|status}`), web Settings > Providers, web new-session dialog, extension Host API, ACP `session/set_config_option`. +- **Real provider/LLM expectation**: The daemon must function with stubbed live discovery (default in `make verify`); opt-in real-provider runs (`MODELCATALOG_LIVE=1`) document a single end-to-end refresh against `models.dev` and one configured ACP provider. +- **Blocked live-provider boundary**: `make verify` and CI runs use stub HTTP servers and fake subprocesses. Real-provider runs are opt-in; missing credentials are reported as source status, not failures. 
+- **Scenario contract minimums covered**: TC-SCEN-001 + TC-SCEN-002 collectively satisfy operator and agent journeys, cross-surface parity, manual entry, refresh under stress, and stale-state observation. + +## Test Strategy + +1. **Smoke readiness (entry criteria only)**: SMOKE-001 verifies daemon starts, web build succeeds, codegen is clean, focused Go gates compile. Smoke is not release-grade evidence. +2. **Unit tests** cover pure logic per package: config validation, schema migrations, catalog merge, redaction, source parsing, conversion helpers, ACP config option capture/apply. +3. **Integration tests** cover daemon-served HTTP/UDS handlers, CLI client, Host API capability gating, deterministic JSON byte parity, and migration boot reconciliation against a real `globaldb` instance. +4. **E2E (runtime + browser)** cover operator journeys end-to-end through `make test-e2e-runtime` and `make test-e2e-web` with fresh QA labs created via `agh-qa-bootstrap`. +5. **Failure / chaos** cover stale fallback, all-source failure, SQLite contention, request cancellation, concurrent refresh coalescing, and credential redaction. +6. **Codegen and docs** are gated through `make codegen-check`, `make bun-typecheck`, and the `provider-model-catalog-docs` vitest suite. + +Each test case in `qa/test-cases/` declares Audit Coverage IDs that map back to `qa/test-plans/00-coverage-matrix.md`. + +## Environment Requirements + +- Go 1.23.x with `CGO_ENABLED=1` (`-race` parity). +- Bun and Node toolchain compatible with the repo `.tool-versions` / `.nvmrc`. +- macOS 15+ or Linux x86_64; SQLite 3.45+. +- `coder/acp-go-sdk@v0.12.2` available through `go mod`. +- Isolated lab via `agh-qa-bootstrap`: unique `AGH_HOME`, daemon ports, `AGH_WEB_API_PROXY_TARGET`, tmux-bridge socket. +- Browser: Chromium under Playwright; `browser-use:browser` primary, `agent-browser` fallback. +- Provider env: synthetic credentials by default; opt-in real credentials only under `MODELCATALOG_LIVE=1`. 
+ +## Entry Criteria + +- `git status` clean for production code under test (only QA artifacts may be uncommitted). +- `make verify` passed at the previous commit. +- `agh-qa-bootstrap` produced a fresh `bootstrap-manifest.json` for the run. +- Unique `AGH_HOME`, ports, and `tmux-bridge` socket allocated per worktree. +- Bootstrap manifest exports `AGH_WEB_API_PROXY_TARGET` for any web QA. + +## Exit Criteria + +- All P0 cases pass. +- ≥90% of P1 cases pass; remaining failures have `qa/issues/BUG-NNN.md` with root-cause + fix. +- Cross-surface parity test (TC-INT-003) shows byte-equal canonical JSON between native HTTP and UDS, and structurally equivalent CLI / Host API rows. +- Redaction tests (TC-SEC-001, TC-FUNC-013) show no API key, OAuth token, or env-shaped secret in any logged or projected payload. +- `make verify` passes after any QA-driven fixes. +- `qa/verification-report.md` records bootstrap manifest path, lab root, runtime home, base URL, commands, results, bug links, and residual risk. + +## Risk Assessment + +| Risk | Probability | Impact | Mitigation | +|------|-------------|--------|------------| +| Hard-cut residue silently rehydrates old fields. | Medium | High | TC-FUNC-001 + TC-REG-001 + repository scan. | +| Refresh under concurrency corrupts SQLite rows or status. | Medium | Critical | TC-PERF-001 + per-provider serialization assertions. | +| Refresh request cancellation cancels detached work. | Medium | High | TC-FUNC-014 + TC-PERF-002 + `context.WithoutCancel` assertions. | +| Source error leaks credentials into logs/UI/Host API. | Low | Critical | TC-SEC-001 + redaction at persistence and projection. | +| Generated contracts drift from runtime payload. | Medium | High | TC-FUNC-015, `make codegen-check`. | +| ACP `session/set_config_option` regresses to legacy `set_model`. | Low | High | TC-FUNC-010 + TC-INT-006 fixtures from upgraded SDK. | +| `/api/openai/v1/models` accidentally registered on UDS. 
| Low | High | TC-INT-004 explicit registration check. | +| `models.dev` becomes account availability proof under UI label drift. | Low | Medium | TC-FUNC-005 + TC-UI-001 stale label assertions. | +| Browser E2E flake on slow runners. | Medium | Medium | Use Playwright retries, deterministic seed via `web/e2e/fixtures/runtime-seed.ts`. | + +## Timeline and Deliverables + +- Day 1: Bootstrap fresh lab, run focused gates, replay TC-FUNC and TC-INT cases. +- Day 2: TC-PERF, TC-SEC, TC-UI cases; file BUGs as discovered. +- Day 3: TC-SCEN cases, fix loops with regression tests, finalize `verification-report.md`, commit. + +Deliverables are listed in Task 12 / Task 13 specs and in `qa/verification-report.md`. + +## Scenario Contract + +The following minimums must collectively be satisfied by the P0/P1 real-scenario cases (`TC-SCEN-001`, `TC-SCEN-002`): + +- Agents: operator (human) + remote agent (CLI/HTTP/Host API consumer). +- Roles: catalog editor, catalog reader, session creator, extension model source provider. +- Channels: HTTP, UDS, CLI, web, Host API, generated docs, generated TS types. +- Task tree: every public surface that Tasks 07-09 touched. +- Provider-backed sessions: at least one ACP-backed session uses `session/set_config_option` semantics (mock SDK fixture acceptable when real provider is blocked). +- Cross-surface objects: catalog row, source status, refresh request id, model availability state, source error. +- Artifacts used later: catalog row written via Settings > Providers (TC-SCEN-001) is read by CLI in TC-SCEN-002. +- Disruption probes: stale fallback, refresh coalescing, redaction, extension denial, request cancellation. +- Required surfaces: HTTP, UDS, CLI, web, Host API, OpenAI projection. + +## Auditor Mapping + +- C4 actor/role coverage → TC-SCEN-001 (operator) + TC-SCEN-002 (agent). +- C5 channels → TC-INT-002, TC-INT-003, TC-INT-004, TC-INT-005, TC-UI-001..003. +- C6 task tree → TC-FUNC + TC-INT cover Tasks 01-11. 
+- C8 cross-surface truth → TC-INT-003. +- C9 live provider → TC-FUNC-008 (stub) + opt-in `MODELCATALOG_LIVE=1` annex. +- C10 artifact reuse → TC-SCEN-001 → TC-SCEN-002 catalog row hand-off. +- C11 disruption probes → TC-PERF-001, TC-PERF-002, TC-FUNC-013, TC-FUNC-014. +- C14 final verification → `qa/verification-report.md` records `make verify` output. + +## Verification Commands (Required) + +Task 13 must run all of the following from a clean isolated lab. Substitute paths with the bootstrap manifest output where applicable. + +```bash +# 1. Activate isolated lab +.agents/skills/agh-qa-bootstrap/scripts/bootstrap.sh \ + --scenario provider-model-catalog \ + --output .compozy/tasks/provider-model-catalog/qa/lab +export AGH_HOME=$(jq -r '.runtime_home' .compozy/tasks/provider-model-catalog/qa/lab/bootstrap-manifest.json) +export AGH_WEB_API_PROXY_TARGET=$(jq -r '.web_api_proxy_target' .compozy/tasks/provider-model-catalog/qa/lab/bootstrap-manifest.json) + +# 2. Codegen + docs gates +make codegen +make codegen-check +cd packages/site && bun run test -- provider-model-catalog-docs && cd - + +# 3. Focused Go gates +go test -race ./internal/config ./internal/store/globaldb ./internal/modelcatalog/... \ + ./internal/acp ./internal/api/... ./internal/cli ./internal/extension/... + +# 4. Bun gates +make bun-typecheck +make bun-test +make web-build + +# 5. E2E lanes +make test-e2e-runtime +make test-e2e-web + +# 6. Optional live-provider annex (opt-in) +MODELCATALOG_LIVE=1 go test -tags=live ./internal/modelcatalog/... -run TestLive + +# 7. Repo-wide gate +make verify +``` + +`make verify` is the final blocking gate. It must run last and pass with zero warnings. + +## Bug Report Template + +Every reproduced defect must use `assets/issue-template.md` (see `qa/issues/BUG-NNN-template.md`). Each bug records reproduction, root cause, fix, verification, and links the failing TC-ID. 
+ +## Verification Report Template + +Task 13 closes the run by writing `qa/verification-report.md` (template at `qa/verification-report-template.md`) with: + +- Bootstrap manifest path. +- Lab root, runtime home, base URL, ports, tmux socket. +- Commands executed (verbatim) with results and durations. +- Test case index with pass/fail/blocked status. +- Bug links and root-cause summaries. +- Residual risk + recommended follow-up. +- Final `make verify` evidence. diff --git a/.compozy/tasks/provider-model-catalog/qa/verification-report-template.md b/.compozy/tasks/provider-model-catalog/qa/verification-report-template.md new file mode 100644 index 000000000..1e2b5599b --- /dev/null +++ b/.compozy/tasks/provider-model-catalog/qa/verification-report-template.md @@ -0,0 +1,110 @@ +# Provider Model Catalog - Verification Report Template + +> Task 13 must rename this file to `verification-report.md` and fill every section before reporting completion. + +## Run Metadata + +- **Date:** YYYY-MM-DD +- **Operator:** +- **Branch:** +- **Commit:** +- **Bootstrap manifest path:** `.compozy/tasks/provider-model-catalog/qa/lab/bootstrap-manifest.json` +- **Lab root:** +- **Runtime home (`AGH_HOME`):** +- **Daemon ports:** , , +- **`AGH_WEB_API_PROXY_TARGET`:** +- **tmux-bridge socket:** + +## Smoke Readiness + +| Step | Command | Result | Notes | +|------|---------|--------|-------| +| 1 | `make build` | | | +| 2 | `make codegen-check` | | | +| 3 | `make bun-typecheck && make bun-test` | | | +| 4 | Focused Go gates | | | +| 5 | `agh daemon start --foreground` | | | + +## Test Case Results + +| TC | Title | Priority | Result | Notes / BUG-IDs | +|----|-------|----------|--------|-----------------| +| TC-FUNC-001 | Provider config hard cut | P0 | | | +| TC-FUNC-002 | Curated validation rules | P1 | | | +| TC-FUNC-003 | Builtin source priority 10 | P1 | | | +| TC-FUNC-004 | Merge determinism | P0 | | | +| TC-FUNC-005 | `models.dev` TTL/disable/aliases | P1 | | | +| TC-FUNC-006 | 
Stale fallback | P1 | | | +| TC-FUNC-007 | Partial vs all-source failure | P0 | | | +| TC-FUNC-008 | Live provider timeout | P1 | | | +| TC-FUNC-009 | No ACP session calls from discovery | P1 | | | +| TC-FUNC-010 | ACP `set_config_option` precedence | P0 | | | +| TC-FUNC-011 | Extension manifest validation | P1 | | | +| TC-FUNC-012 | Extension capability denial | P1 | | | +| TC-FUNC-013 | Source error redaction | P0 | | | +| TC-FUNC-014 | Detached refresh deadline | P1 | | | +| TC-FUNC-015 | Codegen + docs drift | P1 | | | +| TC-INT-001 | Migration v23 fresh + reopen | P0 | | | +| TC-INT-002 | HTTP/UDS handler payloads | P0 | | | +| TC-INT-003 | Canonical JSON parity | P0 | | | +| TC-INT-004 | OpenAI projection HTTP-only | P0 | | | +| TC-INT-005 | Extension success/denial | P0 | | | +| TC-INT-006 | ACP SDK upgrade flows | P0 | | | +| TC-PERF-001 | Refresh concurrency + coalesce | P0 | | | +| TC-PERF-002 | Detached refresh + shutdown join | P0 | | | +| TC-SEC-001 | Secret redaction across surfaces | P0 | | | +| TC-SEC-002 | OpenAI auth + envelope | P0 | | | +| TC-UI-001 | Settings source status + refresh | P1 | | | +| TC-UI-002 | Manual entry + curated edit | P1 | | | +| TC-UI-003 | New session ACP override | P1 | | | +| TC-REG-001 | Hard-cut residue scan | P1 | | | +| TC-REG-002 | Generated docs + CLI sync | P1 | | | +| TC-SCEN-001 | Operator real journey | P0 | | | +| TC-SCEN-002 | Agent real journey | P0 | | | + +## Verification Commands Executed + +For each command record verbatim invocation, exit code, and duration. Attach full logs under `qa/lab/logs/`. + +```bash +# Example +make codegen-check # exit 0, 12s +make bun-test # exit 0, 1m24s +go test -race ./internal/modelcatalog/... 
# exit 0, 2m11s +make test-e2e-runtime # exit 0, 4m02s +make test-e2e-web # exit 0, 6m18s +make verify # exit 0, 14m37s +``` + +## Filed Bugs + +| BUG | Severity | TC | Status | Fix Commit | +|-----|----------|----|--------|------------| +| | | | | | + +## Live-Provider Annex (Optional) + +If `MODELCATALOG_LIVE=1` was set, document the real-provider boundary here: + +- Provider exercised: +- Credential source: +- Endpoints hit: `models.dev/api.json` (real), `` (real) +- Result: + +If not run, state explicitly: "Live-provider annex not executed; default run uses stub HTTP servers and fake subprocesses." + +## Residual Risk + +- + +## Final Verification + +- `make verify` exit code: <0 / non-zero> +- Duration: +- Log path: `qa/lab/logs/make-verify.log` + +## Sign-Off + +- Reporter: +- Date: +- Decision: PASS | FAIL | CONDITIONAL (with documented workaround) diff --git a/AGENTS.md b/AGENTS.md index a9a2392a2..71a5adb79 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -220,6 +220,8 @@ Backend architecture, autonomy contracts, security invariants, package layout, a - **Skill**: `agh-schema-migration`. - **When**: any SQLite column, index, or constraint change. - **Mandatory**: numbered migration in the registry — `EnsureSchema`-style boot reconciliation is forbidden for column changes. +- **Append-only identity**: migration `version`, `name`, and `checksum` are persisted data contracts. Never insert, reorder, rename, renumber, or change an existing migration after it may have reached any developer, QA, or release database; append new migrations at the registry tail. +- **Integrity mismatch response**: stop and investigate the recorded history. Fix the registry order or write an ADR-backed one-pass repair; never weaken mismatch checks and never manually edit a live `schema_migrations` row as the fix. 
- **Covers**: numbered registry, transactional wrap (`BEGIN IMMEDIATE`), `-wal` / `-shm` companion handling on recovery, `ORDER BY 0` pitfall, fresh-DB + reopen-after-restart tests. ## Vocabulary & Product Strategy @@ -236,7 +238,7 @@ Repo-wide rules backed by RFC 001 / RFC 002. Runtime implementation details (pre - **Standing directives** — `docs/_memory/standing_directives.md`. Perpetually-active engineering posture (SD-001..SD-011): long-running session supervision, greenfield-delete, BR-PT/EN, multi-LLM pipeline, real-scenario QA, forensic-first bug fixes, truthful UI, composition-root discipline, detached lifetime, extensible-and-agent-manageable design. Read before opening a TechSpec, defending an architecture pivot, or whenever someone proposes a compat shim. - **Spec authoring playbook** — `docs/_memory/spec-authoring-playbook.md`. Mandatory preflight for `cy-create-prd` / `cy-create-techspec` / `cy-create-tasks`, with phase-by-phase MUST / MUST-NOT and evidence references. The `cy-spec-preflight` skill enforces this — always read before producing any `_idea.md` / `_prd.md` / `_techspec.md` / `_tasks.md`. -- **Lessons learned** — `docs/_memory/lessons/` (`L-001..L-015`, plus `README.md` index). One file per durable lesson with confirmed root cause + fix + evidence (ADR, commit, review issue, or QA bug). Scan the index whenever you hit a class of issue: concurrency / API, testing discipline, autonomy architecture, persistence, spec authoring. +- **Lessons learned** — `docs/_memory/lessons/` (`L-001..L-021`, plus `README.md` index). One file per durable lesson with confirmed root cause + fix + evidence (ADR, commit, review issue, or QA bug). Scan the index whenever you hit a class of issue: concurrency / API, testing discipline, autonomy architecture, persistence, spec authoring. - **Glossary** — `docs/_memory/glossary.md`. Canonical vocabulary (`capability` vs `recipe`, `AGENT.md` vs `AGENTS.md`, Peer Card vs Agent Card, autonomy primitives). 
Authoritative when older RFCs / ledgers conflict. Read when naming anything new, reviewing a rename PR, or when a term feels overloaded. - **Cross-source synthesis** — `docs/_memory/_synthesis.md`. Cross-referenced findings from 8 forensic analyses, ranked by source count — the evidence corpus behind every rule in CLAUDE.md and the standing directives. Read when challenging or evolving a rule. - **Forensic analyses** — `docs/_memory/analysis/analysis_*.md`. Per-source raw analyses (codex sessions / plans / ledger, compozy tasks, qmd collections, local / global runs, existing surfaces) feeding `_synthesis.md`. Read when synthesis cites a finding and you need the underlying evidence. diff --git a/CLAUDE.md b/CLAUDE.md index 3eb4eb21d..fd53e7fa1 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -220,6 +220,8 @@ Backend architecture, autonomy contracts, security invariants, package layout, a - **Skill**: `agh-schema-migration`. - **When**: any SQLite column, index, or constraint change. - **Mandatory**: numbered migration in the registry — `EnsureSchema`-style boot reconciliation is forbidden for column changes. +- **Append-only identity**: migration `version`, `name`, and `checksum` are persisted data contracts. Never insert, reorder, rename, renumber, or change an existing migration after it may have reached any developer, QA, or release database; append new migrations at the registry tail. +- **Integrity mismatch response**: stop and investigate the recorded history. Fix the registry order or write an ADR-backed one-pass repair; never weaken mismatch checks and never manually edit a live `schema_migrations` row as the fix. - **Covers**: numbered registry, transactional wrap (`BEGIN IMMEDIATE`), `-wal` / `-shm` companion handling on recovery, `ORDER BY 0` pitfall, fresh-DB + reopen-after-restart tests. ## Vocabulary & Product Strategy @@ -236,7 +238,7 @@ Repo-wide rules backed by RFC 001 / RFC 002. 
Runtime implementation details (pre - **Standing directives** — `docs/_memory/standing_directives.md`. Perpetually-active engineering posture (SD-001..SD-011): long-running session supervision, greenfield-delete, BR-PT/EN, multi-LLM pipeline, real-scenario QA, forensic-first bug fixes, truthful UI, composition-root discipline, detached lifetime, extensible-and-agent-manageable design. Read before opening a TechSpec, defending an architecture pivot, or whenever someone proposes a compat shim. - **Spec authoring playbook** — `docs/_memory/spec-authoring-playbook.md`. Mandatory preflight for `cy-create-prd` / `cy-create-techspec` / `cy-create-tasks`, with phase-by-phase MUST / MUST-NOT and evidence references. The `cy-spec-preflight` skill enforces this — always read before producing any `_idea.md` / `_prd.md` / `_techspec.md` / `_tasks.md`. -- **Lessons learned** — `docs/_memory/lessons/` (`L-001..L-015`, plus `README.md` index). One file per durable lesson with confirmed root cause + fix + evidence (ADR, commit, review issue, or QA bug). Scan the index whenever you hit a class of issue: concurrency / API, testing discipline, autonomy architecture, persistence, spec authoring. +- **Lessons learned** — `docs/_memory/lessons/` (`L-001..L-021`, plus `README.md` index). One file per durable lesson with confirmed root cause + fix + evidence (ADR, commit, review issue, or QA bug). Scan the index whenever you hit a class of issue: concurrency / API, testing discipline, autonomy architecture, persistence, spec authoring. - **Glossary** — `docs/_memory/glossary.md`. Canonical vocabulary (`capability` vs `recipe`, `AGENT.md` vs `AGENTS.md`, Peer Card vs Agent Card, autonomy primitives). Authoritative when older RFCs / ledgers conflict. Read when naming anything new, reviewing a rename PR, or when a term feels overloaded. - **Cross-source synthesis** — `docs/_memory/_synthesis.md`. 
Cross-referenced findings from 8 forensic analyses, ranked by source count — the evidence corpus behind every rule in CLAUDE.md and the standing directives. Read when challenging or evolving a rule. - **Forensic analyses** — `docs/_memory/analysis/analysis_*.md`. Per-source raw analyses (codex sessions / plans / ledger, compozy tasks, qmd collections, local / global runs, existing surfaces) feeding `_synthesis.md`. Read when synthesis cites a finding and you need the underlying evidence. diff --git a/config.toml b/config.toml index cebe1a0fe..c5218d6b8 100644 --- a/config.toml +++ b/config.toml @@ -29,14 +29,14 @@ external_default = "disabled" approval_timeout_seconds = 120 trusted_sources = [] -[providers.claude] -default_model = "claude-sonnet-4-6" +[providers.claude.models] +default = "claude-sonnet-4-6" -[providers.codex] -default_model = "gpt-5.4" +[providers.codex.models] +default = "gpt-5.4" -[providers.gemini] -default_model = "gemini-3.1-pro-preview" +[providers.gemini.models] +default = "gemini-3.1-pro-preview" [observability] enabled = true diff --git a/docs/_memory/lessons/L-021-schema-migration-identity-is-append-only.md b/docs/_memory/lessons/L-021-schema-migration-identity-is-append-only.md new file mode 100644 index 000000000..251339be1 --- /dev/null +++ b/docs/_memory/lessons/L-021-schema-migration-identity-is-append-only.md @@ -0,0 +1,81 @@ +# L-021 — Schema migration identity is append-only + +**Class:** Persistence +**Date discovered:** 2026-05-06 (daemon restart migration integrity failure) +**Evidence sources:** Local daemon restart failure, observed `~/.agh/agh.db` `schema_migrations` +rows, `0b371eaa feat: add network threads (#105)`, `08eedb32 feat: orchestration +improvements (#106)`, and L-008 schema migration discipline + +## Context + +Restarting the local daemon failed before readiness with: + +```text +store: migration 17 integrity mismatch: recorded 
"add_task_orchestration_profile_schema"/2026-05-05-add-task-orchestration-profile-schema, current "rebuild_network_conversation_containers"/2026-05-05-rebuild-network-conversation-containers +``` + +The live global database had already recorded: + +```text +17 add_task_orchestration_profile_schema +18 add_task_review_gate_schema +19 add_notification_cursors +20 add_bridge_task_subscriptions +``` + +Current code had inserted `rebuild_network_conversation_containers` at version 17 and shifted the +existing task/bridge migrations to later numbers. The migration runner correctly refused to boot: +the persisted version/name/checksum identity no longer matched the binary. + +## Root cause + +Migration numbers were treated as a local ordering convenience instead of persisted contract data. +Fresh database tests still passed because the final schema could be built from the new order, but +an existing database carries the historical identity in `schema_migrations`. Once any developer, +QA, or release database can record a migration version/name/checksum, that identity is immutable. +Reordering the registry after that point breaks upgrades even when the end-state schema is valid. + +## Rule + +> SQLite migration identity is append-only. After a migration may have been applied anywhere +> meaningful, do not insert before it, reorder it, rename it, renumber it, or change its checksum. +> New schema work appends the next migration number at the registry tail. + +If an existing database reports an integrity mismatch, treat it as a safety signal. Do not weaken +the runner, do not accept arbitrary mismatches, and do not manually edit `schema_migrations`. +Investigate which identity is historically valid, restore the append-only sequence, and add +observed-history upgrade coverage. + +## Operationalization + +- Before choosing a migration number, inspect the current registry, recent commits touching the + registry, and relevant ledgers/tasks for concurrently landed migrations. 
+- New schema work appends after the highest registered version. Chronological neatness is not a + reason to insert into the middle. +- Migration tests must include fresh database coverage and upgrade/reopen coverage. For drift + fixes, add an observed-history regression seeded with the real `schema_migrations` prefix that + failed in the operator database. +- Keep integrity mismatch failures strict. A mismatch means the binary and database disagree about + history; fixing that disagreement belongs in the registry or in an ADR-backed one-pass repair. +- One-pass repair is allowed only under the existing greenfield exception: bounded to one boot, + documented in an ADR, and followed immediately by strict semantics. + +## Anti-pattern + +- Inserting a new migration at an older number because it "belongs" earlier in feature chronology. +- Renumbering already-recorded migrations to make a branch merge look sequential. +- Updating tests to the new fresh-DB order without seeding an old DB and reopening it. +- Handling an integrity mismatch by allowing multiple names/checksums for one version. +- Manually updating rows in a live `schema_migrations` table to match the current binary. 
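The rule and anti-patterns above can be illustrated with a small strict history check. This is a minimal sketch, not the project's `RunMigrations`: the `Migration` type, the `verifyHistory` function, and the `sum-NN` checksum values are hypothetical, and only the migration names come from the observed history. Versions before 17 are left out for brevity.

```go
package main

import "fmt"

// Migration is a hypothetical stand-in for one registry entry. The real
// type in internal/store may carry more fields.
type Migration struct {
	Version  int
	Name     string
	Checksum string
}

// verifyHistory compares the rows recorded in schema_migrations (ordered
// by version) with the registry compiled into the binary. It returns the
// migrations that are still pending, or an error on the first identity
// mismatch. It never rewrites or tolerates a mismatching row.
func verifyHistory(recorded, registry []Migration) ([]Migration, error) {
	if len(recorded) > len(registry) {
		return nil, fmt.Errorf("database records %d migrations, binary knows %d",
			len(recorded), len(registry))
	}
	for i, rec := range recorded {
		if cur := registry[i]; rec != cur {
			return nil, fmt.Errorf(
				"migration %d integrity mismatch: recorded %q/%s, current %q/%s",
				rec.Version, rec.Name, rec.Checksum, cur.Name, cur.Checksum)
		}
	}
	return registry[len(recorded):], nil
}

func main() {
	// Observed history; checksum values are placeholders.
	recorded := []Migration{
		{17, "add_task_orchestration_profile_schema", "sum-17"},
		{18, "add_task_review_gate_schema", "sum-18"},
		{19, "add_notification_cursors", "sum-19"},
		{20, "add_bridge_task_subscriptions", "sum-20"},
	}

	// Append-only registry: the recorded prefix is untouched and the new
	// migration lands at the tail.
	appendOnly := append(append([]Migration{}, recorded...),
		Migration{21, "rebuild_network_conversation_containers", "sum-21"})

	// Drifted registry: the new migration was inserted at version 17 and
	// the recorded ones were renumbered.
	drifted := []Migration{
		{17, "rebuild_network_conversation_containers", "sum-21"},
		{18, "add_task_orchestration_profile_schema", "sum-17"},
		{19, "add_task_review_gate_schema", "sum-18"},
		{20, "add_notification_cursors", "sum-19"},
		{21, "add_bridge_task_subscriptions", "sum-20"},
	}

	pending, err := verifyHistory(recorded, appendOnly)
	fmt.Println("append-only:", len(pending), "pending, err:", err)

	_, err = verifyHistory(recorded, drifted)
	fmt.Println("drifted:", err)
}
```

The drifted registry builds the same end-state schema on a fresh database, which is why fresh-DB tests alone cannot catch this class of bug. Only a check seeded with recorded history fails.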
+ +## Source + +- Observed local database: + `sqlite3 /Users/pedronauck/.agh/agh.db 'SELECT version, name, checksum FROM schema_migrations ORDER BY version;'` +- Failing daemon startup: + `error: daemon: open global database "/Users/pedronauck/.agh/agh.db": store: initialize sqlite database "/Users/pedronauck/.agh/agh.db": store: migration 17 integrity mismatch` +- `internal/store/globaldb/global_db.go` — `globalSchemaMigrations` registry +- `internal/store/schema.go` — strict `RunMigrations` version/name/checksum validation +- `docs/_memory/lessons/L-008-schema-migrations-mandatory.md` +- `0b371eaa feat: add network threads (#105)` +- `08eedb32 feat: orchestration improvements (#106)` diff --git a/docs/_memory/lessons/README.md b/docs/_memory/lessons/README.md index d082ff6fb..375f12c9a 100644 --- a/docs/_memory/lessons/README.md +++ b/docs/_memory/lessons/README.md @@ -28,6 +28,7 @@ These are NOT speculative warnings — every lesson here has either an ADR, a co | [L-018](L-018-delegated-docs-runtime-truth-audit.md) | Delegated docs lanes need a runtime-truth audit before acceptance | Documentation | | [L-019](L-019-diagnostic-data-outlives-primary-record.md) | Diagnostic data must outlive its primary record when audit/replay matters | Architecture / Persistence | | [L-020](L-020-dense-typed-records-need-pointer-boundaries.md) | Dense typed orchestration records need pointer boundaries | Architecture / Code style | +| [L-021](L-021-schema-migration-identity-is-append-only.md) | Schema migration identity is append-only | Persistence | ## How to use diff --git a/go.mod b/go.mod index f80f7e860..594cd8df6 100644 --- a/go.mod +++ b/go.mod @@ -7,7 +7,7 @@ require ( github.com/Masterminds/semver/v3 v3.4.0 github.com/charmbracelet/bubbles v1.0.0 github.com/charmbracelet/bubbletea v1.3.10 - github.com/coder/acp-go-sdk v0.6.3 + github.com/coder/acp-go-sdk v0.12.2 github.com/creativeprojects/go-selfupdate v1.5.2 github.com/daytonaio/daytona/libs/sdk-go v0.166.0 
github.com/getkin/kin-openapi v0.135.0 diff --git a/go.sum b/go.sum index a25164cf9..ea9f28427 100644 --- a/go.sum +++ b/go.sum @@ -74,6 +74,8 @@ github.com/cloudwego/base64x v0.1.6 h1:t11wG9AECkCDk5fMSoxmufanudBtJ+/HemLstXDLI github.com/cloudwego/base64x v0.1.6/go.mod h1:OFcloc187FXDaYHvrNIjxSe8ncn0OOM8gEHfghB2IPU= github.com/coder/acp-go-sdk v0.6.3 h1:LsXQytehdjKIYJnoVWON/nf7mqbiarnyuyE3rrjBsXQ= github.com/coder/acp-go-sdk v0.6.3/go.mod h1:yKzM/3R9uELp4+nBAwwtkS0aN1FOFjo11CNPy37yFko= +github.com/coder/acp-go-sdk v0.12.2 h1:fpRJ8Z5HMSr5cZ5IywzFlFZcIxZOsto+laNVu7XelFA= +github.com/coder/acp-go-sdk v0.12.2/go.mod h1:yKzM/3R9uELp4+nBAwwtkS0aN1FOFjo11CNPy37yFko= github.com/cpuguy83/go-md2man/v2 v2.0.6 h1:XJtiaUW6dEEqVuZiMTn1ldk455QWwEIsMIJlo5vtkx0= github.com/cpuguy83/go-md2man/v2 v2.0.6/go.mod h1:oOW0eioCTA6cOiMLiUPZOpcVxMig6NIQQ7OS05n1F4g= github.com/cpuguy83/go-md2man/v2 v2.0.7 h1:zbFlGlXEAKlwXpmvle3d8Oe3YnkKIK4xSRTd3sHPnBo= diff --git a/internal/AGENTS.md b/internal/AGENTS.md index d34be4fc5..24a88b807 100644 --- a/internal/AGENTS.md +++ b/internal/AGENTS.md @@ -51,6 +51,12 @@ Generic Go concurrency patterns (goroutine ownership, channels vs mutexes, `sele - Append-only event store (`runtime.db`) is the canonical operational ledger; session DBs are projections, not authority. - Live broadcasters publish only after durable append; reconnect/replay uses `after_seq`. +### Persistence + +- **SQLite migration registries are append-only.** `internal/store/globaldb.globalSchemaMigrations` and equivalent registries persist `version`, `name`, and `checksum` in `schema_migrations`; never insert, reorder, rename, renumber, or change an existing migration identity after it may have been applied. +- **Migration drift fixes require observed-history tests.** Cover fresh DB, upgrade/reopen, and the real recorded migration prefix that failed. Integrity mismatch is a safety signal to preserve, not an error to suppress. 
+- **New schema work appends at the registry tail.** If a migration appears to need an earlier slot, stop and write an ADR-backed repair plan instead of silently shifting recorded history. + ## Security Invariants - **`claim_token` redaction is non-negotiable.** Raw `claim_token` (`agh_claim_*`), MCP auth tokens, OAuth codes, PKCE verifiers, and secret bindings MUST NEVER appear in logs, status APIs, settings views, error payloads, channel messages, SSE, web UI, or memory. Use hash forms (`claim_token_hash`) over the wire. Network layer rejects raw `claim_token` in metadata. diff --git a/internal/CLAUDE.md b/internal/CLAUDE.md index 3f6e05e22..30ae6317c 100644 --- a/internal/CLAUDE.md +++ b/internal/CLAUDE.md @@ -51,6 +51,12 @@ Generic Go concurrency patterns (goroutine ownership, channels vs mutexes, `sele - Append-only event store (`runtime.db`) is the canonical operational ledger; session DBs are projections, not authority. - Live broadcasters publish only after durable append; reconnect/replay uses `after_seq`. +### Persistence + +- **SQLite migration registries are append-only.** `internal/store/globaldb.globalSchemaMigrations` and equivalent registries persist `version`, `name`, and `checksum` in `schema_migrations`; never insert, reorder, rename, renumber, or change an existing migration identity after it may have been applied. +- **Migration drift fixes require observed-history tests.** Cover fresh DB, upgrade/reopen, and the real recorded migration prefix that failed. Integrity mismatch is a safety signal to preserve, not an error to suppress. +- **New schema work appends at the registry tail.** If a migration appears to need an earlier slot, stop and write an ADR-backed repair plan instead of silently shifting recorded history. 
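One way to pin the persistence rules above in a test is a registry contract check driven by an observed-history fixture. This is a sketch under stated assumptions: `entry`, `checkRegistry`, and the fixture map are hypothetical names for illustration and are not the helpers added in `internal/store/globaldb`. The migration names are the ones recorded for v17 through v22.

```go
package main

import "fmt"

// entry is a hypothetical registry row holding only what the check needs.
type entry struct {
	Version int
	Name    string
}

// checkRegistry enforces two contracts: versions increase by exactly one
// from entry to entry, and every version pinned in observed keeps the
// name that real databases have already recorded.
func checkRegistry(registry []entry, observed map[int]string) error {
	seen := make(map[int]string, len(registry))
	for i, e := range registry {
		if i > 0 && e.Version != registry[i-1].Version+1 {
			return fmt.Errorf("version %d follows %d, want %d",
				e.Version, registry[i-1].Version, registry[i-1].Version+1)
		}
		seen[e.Version] = e.Name
	}
	for version, name := range observed {
		if got, ok := seen[version]; !ok || got != name {
			return fmt.Errorf("version %d: registry has %q, databases recorded %q",
				version, got, name)
		}
	}
	return nil
}

func main() {
	// Fixture seeded from the recorded schema_migrations prefix.
	observed := map[int]string{
		17: "add_task_orchestration_profile_schema",
		18: "add_task_review_gate_schema",
		19: "add_notification_cursors",
		20: "add_bridge_task_subscriptions",
	}

	registry := []entry{
		{17, "add_task_orchestration_profile_schema"},
		{18, "add_task_review_gate_schema"},
		{19, "add_notification_cursors"},
		{20, "add_bridge_task_subscriptions"},
		{21, "rebuild_network_conversation_containers"},
		{22, "memv2_memory_events"},
	}
	fmt.Println("tail append:", checkRegistry(registry, observed))

	// Moving the network migration into slot 17 shifts recorded history.
	shifted := []entry{
		{17, "rebuild_network_conversation_containers"},
		{18, "add_task_orchestration_profile_schema"},
		{19, "add_task_review_gate_schema"},
		{20, "add_notification_cursors"},
		{21, "add_bridge_task_subscriptions"},
		{22, "memv2_memory_events"},
	}
	fmt.Println("shifted:", checkRegistry(shifted, observed))
}
```

Keeping the fixture as literal data, separate from the registry, is deliberate: a change that reorders the registry cannot silently update the expectation along with it.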
+ ## Security Invariants - **`claim_token` redaction is non-negotiable.** Raw `claim_token` (`agh_claim_*`), MCP auth tokens, OAuth codes, PKCE verifiers, and secret bindings MUST NEVER appear in logs, status APIs, settings views, error payloads, channel messages, SSE, web UI, or memory. Use hash forms (`claim_token_hash`) over the wire. Network layer rejects raw `claim_token` in metadata. diff --git a/internal/acp/client.go b/internal/acp/client.go index b2aeadeea..a1e56af1c 100644 --- a/internal/acp/client.go +++ b/internal/acp/client.go @@ -339,7 +339,7 @@ func (d *Driver) initializeConnection(ctx context.Context, process *AgentProcess initRequest := acpsdk.InitializeRequest{ ProtocolVersion: acpsdk.ProtocolVersionNumber, ClientCapabilities: acpsdk.ClientCapabilities{ - Fs: acpsdk.FileSystemCapability{ + Fs: acpsdk.FileSystemCapabilities{ ReadTextFile: true, WriteTextFile: true, }, @@ -364,9 +364,9 @@ func (d *Driver) initializeConnection(ctx context.Context, process *AgentProcess ) } - process.Caps = Caps{ + process.setCaps(Caps{ SupportsLoadSession: initializeResponse.AgentCapabilities.LoadSession, - } + }) return nil } @@ -378,7 +378,7 @@ func (d *Driver) negotiateSession(ctx context.Context, process *AgentProcess, no } func (d *Driver) loadSession(ctx context.Context, process *AgentProcess, normalized StartOpts) error { - if !process.Caps.SupportsLoadSession { + if !process.CapsSnapshot().SupportsLoadSession { return WrapFailure(store.FailureLoad, "ACP session/load is not supported", fmt.Errorf( "%w: agent %q does not support session/load for resume %q", ErrAgentDoesNotSupportSession, @@ -418,7 +418,12 @@ func (d *Driver) loadSession(ctx context.Context, process *AgentProcess, normali if err := process.checkpointProcessOwner(ctx); err != nil { return err } - process.Caps = captureCaps(process.Caps.SupportsLoadSession, loadResponse.Modes, loadResponse.Models) + process.setCaps(captureCaps( + process.CapsSnapshot().SupportsLoadSession, + loadResponse.Modes, + 
loadResponse.Models, + loadResponse.ConfigOptions, + )) if err := d.applySessionMode(ctx, process, normalized.Permissions); err != nil { return WrapFailure( store.FailureProtocol, @@ -433,6 +438,13 @@ func (d *Driver) loadSession(ctx context.Context, process *AgentProcess, normali fmt.Errorf("acp: set session model for %q: %w", normalized.AgentName, err), ) } + if err := d.applySessionReasoningEffort(ctx, process, normalized.ReasoningEffort); err != nil { + return WrapFailure( + store.FailureProtocol, + "ACP session reasoning negotiation failed", + fmt.Errorf("acp: set session reasoning effort for %q: %w", normalized.AgentName, err), + ) + } return nil } @@ -464,7 +476,12 @@ func (d *Driver) createSession(ctx context.Context, process *AgentProcess, norma if err := process.checkpointProcessOwner(ctx); err != nil { return err } - process.Caps = captureCaps(process.Caps.SupportsLoadSession, newResponse.Modes, newResponse.Models) + process.setCaps(captureCaps( + process.CapsSnapshot().SupportsLoadSession, + newResponse.Modes, + newResponse.Models, + newResponse.ConfigOptions, + )) if err := d.applySessionMode(ctx, process, normalized.Permissions); err != nil { return WrapFailure( store.FailureProtocol, @@ -479,6 +496,13 @@ func (d *Driver) createSession(ctx context.Context, process *AgentProcess, norma fmt.Errorf("acp: set session model for %q: %w", normalized.AgentName, err), ) } + if err := d.applySessionReasoningEffort(ctx, process, normalized.ReasoningEffort); err != nil { + return WrapFailure( + store.FailureProtocol, + "ACP session reasoning negotiation failed", + fmt.Errorf("acp: set session reasoning effort for %q: %w", normalized.AgentName, err), + ) + } return nil } @@ -491,7 +515,7 @@ func (d *Driver) applySessionMode( return nil } - modeID := preferredSessionMode(process.Caps.SupportedModes, permissions, process.toolGateway != nil) + modeID := preferredSessionMode(process.CapsSnapshot().SupportedModes, permissions, process.toolGateway != nil) if modeID == 
"" { return nil } @@ -517,18 +541,82 @@ func (d *Driver) applySessionModel(ctx context.Context, process *AgentProcess, p return nil } - _, err := acpsdk.SendRequest[acpsdk.SetSessionModelResponse]( + caps := process.CapsSnapshot() + if len(caps.ConfigOptions) > 0 { + option, ok := findModelConfigOption(caps.ConfigOptions) + if !ok { + return fmt.Errorf("acp: model config option is not available for requested model %q", modelID) + } + if !configOptionAllowsValue(option, modelID) { + return fmt.Errorf("acp: model %q is not available in config option %q", modelID, option.ID) + } + return d.applySessionConfigOption(ctx, process, option.ID, modelID) + } + + if !legacyModelStateAllows(caps, modelID) { + return fmt.Errorf("acp: model %q is not available in legacy ACP model state", modelID) + } + + _, err := acpsdk.SendRequest[acpsdk.UnstableSetSessionModelResponse]( process.conn, ctx, acpsdk.AgentMethodSessionSetModel, - acpsdk.SetSessionModelRequest{ + acpsdk.UnstableSetSessionModelRequest{ SessionId: acpsdk.SessionId(process.SessionID), - ModelId: acpsdk.ModelId(modelID), + ModelId: acpsdk.UnstableModelId(modelID), }, ) return err } +func (d *Driver) applySessionReasoningEffort(ctx context.Context, process *AgentProcess, effort string) error { + if ctx == nil || process == nil || process.conn == nil { + return nil + } + effortID := strings.TrimSpace(effort) + if effortID == "" { + return nil + } + + caps := process.CapsSnapshot() + if len(caps.ConfigOptions) == 0 { + return nil + } + option, ok := findReasoningConfigOption(caps.ConfigOptions) + if !ok { + return nil + } + if !configOptionAllowsValue(option, effortID) { + return fmt.Errorf("acp: reasoning effort %q is not available in config option %q", effortID, option.ID) + } + return d.applySessionConfigOption(ctx, process, option.ID, effortID) +} + +func (d *Driver) applySessionConfigOption( + ctx context.Context, + process *AgentProcess, + optionID string, + valueID string, +) error { + response, err := 
acpsdk.SendRequest[acpsdk.SetSessionConfigOptionResponse]( + process.conn, + ctx, + acpsdk.AgentMethodSessionSetConfigOption, + acpsdk.SetSessionConfigOptionRequest{ + ValueId: &acpsdk.SetSessionConfigOptionValueId{ + SessionId: acpsdk.SessionId(process.SessionID), + ConfigId: acpsdk.SessionConfigId(strings.TrimSpace(optionID)), + Value: acpsdk.SessionConfigValueId(strings.TrimSpace(valueID)), + }, + }, + ) + if err != nil { + return err + } + process.setConfigOptions(sessionConfigOptionsFromSDK(response.ConfigOptions)) + return nil +} + func preferredSessionMode( supported []string, permissions aghconfig.PermissionMode, @@ -831,12 +919,10 @@ func (d *Driver) runPrompt(ctx context.Context, proc *AgentProcess, active *acti } }() - promptRequest := acpsdk.PromptRequest{ - SessionId: acpsdk.SessionId(proc.SessionID), - Prompt: []acpsdk.ContentBlock{acpsdk.TextBlock(proc.nextPromptText(req.Message))}, - } - if meta := req.Meta.Normalize(); !meta.IsZero() { - promptRequest.Meta = meta + promptRequest, err := buildWirePromptRequest(proc, req) + if err != nil { + emitPromptBuildError(proc, req, err) + return } response, err := acpsdk.SendRequest[wirePromptResponse]( proc.conn, @@ -878,6 +964,31 @@ func (d *Driver) runPrompt(ctx context.Context, proc *AgentProcess, active *acti proc.emitPromptEvent(doneEvent) } +func buildWirePromptRequest(proc *AgentProcess, req PromptRequest) (acpsdk.PromptRequest, error) { + promptRequest := acpsdk.PromptRequest{ + SessionId: acpsdk.SessionId(proc.SessionID), + Prompt: []acpsdk.ContentBlock{acpsdk.TextBlock(proc.nextPromptText(req.Message))}, + } + if meta := req.Meta.Normalize(); !meta.IsZero() { + metaMap, err := meta.ToMap() + if err != nil { + return acpsdk.PromptRequest{}, err + } + promptRequest.Meta = metaMap + } + return promptRequest, nil +} + +func emitPromptBuildError(proc *AgentProcess, req PromptRequest, err error) { + proc.emitPromptEvent(AgentEvent{ + Type: EventTypeError, + SessionID: proc.SessionID, + TurnID: 
req.TurnID, + Timestamp: timeNowUTC(), + Error: err.Error(), + }) +} + func shouldSuppressPromptErrorOnStop(err error) bool { if err == nil { return false @@ -1024,6 +1135,7 @@ func normalizeStartOpts(opts StartOpts) (StartOpts, error) { } normalized.SystemPrompt = strings.TrimSpace(normalized.SystemPrompt) normalized.PreferredModel = strings.TrimSpace(normalized.PreferredModel) + normalized.ReasoningEffort = strings.TrimSpace(normalized.ReasoningEffort) return normalized, nil } @@ -1207,7 +1319,12 @@ func toSDKMCPServers(servers []aghconfig.MCPServer) []acpsdk.McpServer { return converted } -func captureCaps(loadSession bool, modes *acpsdk.SessionModeState, models *acpsdk.SessionModelState) Caps { +func captureCaps( + loadSession bool, + modes *acpsdk.SessionModeState, + models *acpsdk.SessionModelState, + configOptions []acpsdk.SessionConfigOption, +) Caps { caps := Caps{SupportsLoadSession: loadSession} if modes != nil { caps.SupportedModes = make([]string, 0, len(modes.AvailableModes)) @@ -1221,6 +1338,7 @@ func captureCaps(loadSession bool, modes *acpsdk.SessionModeState, models *acpsd caps.SupportedModels = append(caps.SupportedModels, string(model.ModelId)) } } + caps.ConfigOptions = sessionConfigOptionsFromSDK(configOptions) return caps } diff --git a/internal/acp/client_test.go b/internal/acp/client_test.go index 482204720..a11fcb453 100644 --- a/internal/acp/client_test.go +++ b/internal/acp/client_test.go @@ -12,6 +12,7 @@ import ( "slices" "strconv" "strings" + "sync" "testing" "time" @@ -731,14 +732,14 @@ func TestStartSetsPreferredSessionModelWhenProvided(t *testing.T) { { name: "Should set preferred model for new sessions", scenario: "stream_updates", - preferred: "openrouter/openai/gpt-5.4", + preferred: "new-model", wantSession: "sess-new", }, { name: "Should set preferred model for resumed sessions", scenario: "load_session", resumeSession: "sess-existing", - preferred: "anthropic/claude-opus-4-7", + preferred: "loaded-model", wantSession: 
"sess-existing", }, } @@ -768,6 +769,228 @@ func TestStartSetsPreferredSessionModelWhenProvided(t *testing.T) { } } +func TestStartCapturesSessionConfigOptions(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + scenario string + resumeSession string + wantModel string + wantReasoning string + }{ + { + name: "Should capture config options from session new", + scenario: "config_options", + wantModel: "new-model", + wantReasoning: "medium", + }, + { + name: "Should capture config options from session load", + scenario: "load_config_options", + resumeSession: "sess-existing", + wantModel: "loaded-model", + wantReasoning: "high", + }, + } + + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + driver := New() + proc := startHelperProcess(t, driver, tc.scenario, "", StartOpts{ + ResumeSessionID: tc.resumeSession, + }) + defer stopProcess(t, driver, proc) + + caps := proc.CapsSnapshot() + assertConfigOption(t, caps.ConfigOptions, "model", tc.wantModel, "new-model", "loaded-model", "other-model") + assertConfigOption(t, caps.ConfigOptions, "reasoning_effort", tc.wantReasoning, "minimal", "high", "xhigh") + }) + } +} + +func TestStartUsesSetConfigOptionForPreferredModelWhenAvailable(t *testing.T) { + t.Parallel() + + driver := New() + captureFile := filepath.Join(t.TempDir(), "session-set-config-model.jsonl") + proc := startHelperProcess(t, driver, "config_options", "", StartOpts{ + PreferredModel: "other-model", + Env: helperEnvWithCapture("config_options", "", captureFile), + }) + defer stopProcess(t, driver, proc) + + request := decodeCapturedSetSessionConfigOptionRequest( + t, + captureRequestParams(t, captureFile, acpsdk.AgentMethodSessionSetConfigOption), + ) + if got := request.SessionID; got != "sess-new" { + t.Fatalf("set-config session id = %q, want sess-new", got) + } + if got := request.ConfigID; got != "model" { + t.Fatalf("set-config config id = %q, want model", got) + } + if got := request.Value; got != 
"other-model" { + t.Fatalf("set-config value = %q, want other-model", got) + } + if captureMethodExists(t, captureFile, acpsdk.AgentMethodSessionSetModel) { + t.Fatal("legacy set_model was sent when model config option was available") + } + assertConfigOption(t, proc.CapsSnapshot().ConfigOptions, "model", "other-model", "other-model") +} + +func TestStartUsesSetConfigOptionForReasoningEffortWhenAvailable(t *testing.T) { + t.Parallel() + + driver := New() + captureFile := filepath.Join(t.TempDir(), "session-set-config-reasoning.jsonl") + proc := startHelperProcess(t, driver, "config_options", "", StartOpts{ + ReasoningEffort: "high", + Env: helperEnvWithCapture("config_options", "", captureFile), + }) + defer stopProcess(t, driver, proc) + + request := decodeCapturedSetSessionConfigOptionRequest( + t, + captureRequestParams(t, captureFile, acpsdk.AgentMethodSessionSetConfigOption), + ) + if got := request.ConfigID; got != "reasoning_effort" { + t.Fatalf("set-config config id = %q, want reasoning_effort", got) + } + if got := request.Value; got != "high" { + t.Fatalf("set-config value = %q, want high", got) + } + assertConfigOption(t, proc.CapsSnapshot().ConfigOptions, "reasoning_effort", "high", "high") +} + +func TestStartDoesNotInventReasoningConfigOptionWhenAbsent(t *testing.T) { + t.Parallel() + + driver := New() + captureFile := filepath.Join(t.TempDir(), "session-no-reasoning-config.jsonl") + proc := startHelperProcess(t, driver, "config_options_no_reasoning", "", StartOpts{ + ReasoningEffort: "xhigh", + Env: helperEnvWithCapture("config_options_no_reasoning", "", captureFile), + }) + defer stopProcess(t, driver, proc) + + if captureMethodExists(t, captureFile, acpsdk.AgentMethodSessionSetConfigOption) { + t.Fatal("set_config_option was sent without a reasoning config option") + } + if captureMethodExists(t, captureFile, acpsdk.AgentMethodSessionSetModel) { + t.Fatal("legacy set_model was sent for a reasoning-only override") + } +} + +func 
TestStartRejectsUnavailableSessionConfigOptionValues(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + opts StartOpts + wantError string + forbiddenMethod string + }{ + { + name: "Should reject preferred model absent from model config option values", + opts: StartOpts{ + PreferredModel: "missing-model", + }, + wantError: `model "missing-model" is not available in config option "model"`, + forbiddenMethod: acpsdk.AgentMethodSessionSetModel, + }, + { + name: "Should reject reasoning effort absent from reasoning config option values", + opts: StartOpts{ + ReasoningEffort: "turbo", + }, + wantError: `reasoning effort "turbo" is not available in config option "reasoning_effort"`, + forbiddenMethod: acpsdk.AgentMethodSessionSetConfigOption, + }, + } + + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + driver := New() + captureFile := filepath.Join(t.TempDir(), "session-unavailable-config-option.jsonl") + opts := StartOpts{ + AgentName: "helper", + Command: helperCommand(t), + Cwd: t.TempDir(), + Env: helperEnvWithCapture("config_options", "", captureFile), + Permissions: aghconfig.PermissionModeApproveAll, + } + opts.PreferredModel = tc.opts.PreferredModel + opts.ReasoningEffort = tc.opts.ReasoningEffort + proc, err := driver.Start(testutil.Context(t), opts) + if proc != nil { + defer stopProcess(t, driver, proc) + } + if err == nil { + t.Fatal("Start() error = nil, want unavailable config option error") + } + if !strings.Contains(err.Error(), tc.wantError) { + t.Fatalf("Start() error = %v, want containing %q", err, tc.wantError) + } + if captureMethodExists(t, captureFile, tc.forbiddenMethod) { + t.Fatalf("forbidden method %q was sent after unavailable config value", tc.forbiddenMethod) + } + }) + } +} + +func TestStartRejectsUnsupportedLegacyPreferredModel(t *testing.T) { + t.Parallel() + + driver := New() + captureFile := filepath.Join(t.TempDir(), "session-unsupported-legacy-model.jsonl") + proc, err := 
driver.Start(testutil.Context(t), StartOpts{ + AgentName: "helper", + Command: helperCommand(t), + Cwd: t.TempDir(), + Env: helperEnvWithCapture("stream_updates", "", captureFile), + Permissions: aghconfig.PermissionModeApproveAll, + PreferredModel: "missing-model", + }) + if proc != nil { + defer stopProcess(t, driver, proc) + } + if err == nil { + t.Fatal("Start() error = nil, want unsupported legacy model error") + } + if !strings.Contains(err.Error(), `model "missing-model" is not available in legacy ACP model state`) { + t.Fatalf("Start() error = %v", err) + } + if captureMethodExists(t, captureFile, acpsdk.AgentMethodSessionSetModel) { + t.Fatal("legacy set_model was sent for an unsupported legacy model") + } +} + +func TestSessionConfigOptionUpdateMutatesCaps(t *testing.T) { + t.Parallel() + + driver := New() + proc := startHelperProcess(t, driver, "config_option_update", "", StartOpts{}) + defer stopProcess(t, driver, proc) + + events, err := driver.Prompt(testutil.Context(t), proc, PromptRequest{ + TurnID: "turn-config-options", + Message: "update config", + }) + if err != nil { + t.Fatalf("Prompt() error = %v", err) + } + collectEvents(t, events) + + caps := proc.CapsSnapshot() + assertConfigOption(t, caps.ConfigOptions, "model", "other-model", "other-model") + assertConfigOption(t, caps.ConfigOptions, "reasoning_effort", "xhigh", "xhigh") +} + func TestStartWithEmptyAdditionalDirsKeepsBaselinePayload(t *testing.T) { t.Parallel() @@ -1209,7 +1432,7 @@ func TestDriverApprovePermissionValidationAndForwarding(t *testing.T) { proc := newDirectProcess(t, aghconfig.PermissionModeDenyAll) requestID, pending := proc.registerPendingPermission("turn-1", acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ToolCallId: "tool-1"}, + ToolCall: acpsdk.ToolCallUpdate{ToolCallId: "tool-1"}, }) if err := driver.ApprovePermission(context.Background(), proc, ApproveRequest{ @@ -1288,6 +1511,9 @@ func startHelperProcess( if overrides.PreferredModel != 
"" { opts.PreferredModel = overrides.PreferredModel } + if overrides.ReasoningEffort != "" { + opts.ReasoningEffort = overrides.ReasoningEffort + } opts.ResumeSessionID = overrides.ResumeSessionID opts.Launcher = overrides.Launcher opts.ToolHost = overrides.ToolHost @@ -1541,14 +1767,38 @@ type capturedSetSessionModelRequest struct { ModelID string `json:"modelId"` } +type capturedSetSessionConfigOptionRequest struct { + SessionID string `json:"sessionId"` + ConfigID string `json:"configId"` + Value string `json:"value"` +} + func captureRequestParams(t *testing.T, path string, method string) map[string]json.RawMessage { t.Helper() + matches := captureRequestParamsForMethod(t, path, method) + if len(matches) > 0 { + return matches[0] + } + t.Fatalf("capture file %q does not contain method %q", path, method) + return nil +} + +func captureMethodExists(t *testing.T, path string, method string) bool { + t.Helper() + + return len(captureRequestParamsForMethod(t, path, method)) > 0 +} + +func captureRequestParamsForMethod(t *testing.T, path string, method string) []map[string]json.RawMessage { + t.Helper() + data, err := os.ReadFile(path) if err != nil { t.Fatalf("os.ReadFile(%q) error = %v", path, err) } + matches := make([]map[string]json.RawMessage, 0) lines := strings.SplitSeq(strings.TrimSpace(string(data)), "\n") for line := range lines { if strings.TrimSpace(line) == "" { @@ -1567,11 +1817,9 @@ func captureRequestParams(t *testing.T, path string, method string) map[string]j if err := json.Unmarshal(envelope.Params, &params); err != nil { t.Fatalf("json.Unmarshal(captured params) error = %v", err) } - return params + matches = append(matches, params) } - - t.Fatalf("capture file %q does not contain method %q", path, method) - return nil + return matches } func decodeCapturedNewSessionRequest(t *testing.T, params map[string]json.RawMessage) capturedNewSessionRequest { @@ -1636,6 +1884,56 @@ func decodeCapturedSetSessionModelRequest( return request } +func 
decodeCapturedSetSessionConfigOptionRequest( + t *testing.T, + params map[string]json.RawMessage, +) capturedSetSessionConfigOptionRequest { + t.Helper() + + raw, err := json.Marshal(params) + if err != nil { + t.Fatalf("json.Marshal(set-session-config-option params) error = %v", err) + } + var request capturedSetSessionConfigOptionRequest + if err := json.Unmarshal(raw, &request); err != nil { + t.Fatalf("json.Unmarshal(set-session-config-option request) error = %v", err) + } + return request +} + +func assertConfigOption( + t *testing.T, + options []SessionConfigOption, + id string, + current string, + wantValues ...string, +) { + t.Helper() + + var found *SessionConfigOption + for index := range options { + if options[index].ID == id { + found = &options[index] + break + } + } + if found == nil { + t.Fatalf("config option %q not found in %#v", id, options) + } + if got := found.Current; got != current { + t.Fatalf("config option %q current = %q, want %q", id, got, current) + } + values := make([]string, 0, len(found.Values)) + for _, value := range found.Values { + values = append(values, value.Value) + } + for _, want := range wantValues { + if !slices.Contains(values, want) { + t.Fatalf("config option %q values = %#v, want value %q", id, values, want) + } + } +} + func mustCanonicalDir(t *testing.T, path string) string { t.Helper() @@ -1661,9 +1959,11 @@ func assertPermissionResult(t *testing.T, err error, wantOK bool) { } type helperACPAgent struct { - conn *acpsdk.AgentSideConnection - scenario string - filePath string + conn *acpsdk.AgentSideConnection + scenario string + filePath string + configOptionsMu sync.Mutex + configOptions []acpsdk.SessionConfigOption } func (a *helperACPAgent) Authenticate( @@ -1678,7 +1978,7 @@ func (a *helperACPAgent) Initialize(context.Context, acpsdk.InitializeRequest) ( ProtocolVersion: acpsdk.ProtocolVersionNumber, AgentCapabilities: acpsdk.AgentCapabilities{ LoadSession: a.scenario == "load_session" || a.scenario == 
"load_session_error" || - a.scenario == "load_mode_mapping", + a.scenario == "load_mode_mapping" || a.scenario == "load_config_options", }, AuthMethods: []acpsdk.AuthMethod{}, }, nil @@ -1688,6 +1988,27 @@ func (a *helperACPAgent) Cancel(context.Context, acpsdk.CancelNotification) erro return nil } +func (a *helperACPAgent) CloseSession( + context.Context, + acpsdk.CloseSessionRequest, +) (acpsdk.CloseSessionResponse, error) { + return acpsdk.CloseSessionResponse{}, nil +} + +func (a *helperACPAgent) ListSessions( + context.Context, + acpsdk.ListSessionsRequest, +) (acpsdk.ListSessionsResponse, error) { + return acpsdk.ListSessionsResponse{}, nil +} + +func (a *helperACPAgent) ResumeSession( + context.Context, + acpsdk.ResumeSessionRequest, +) (acpsdk.ResumeSessionResponse, error) { + return acpsdk.ResumeSessionResponse{}, nil +} + func (a *helperACPAgent) NewSession(context.Context, acpsdk.NewSessionRequest) (acpsdk.NewSessionResponse, error) { if a.scenario == "mode_mapping" { return acpsdk.NewSessionResponse{ @@ -1696,6 +2017,21 @@ func (a *helperACPAgent) NewSession(context.Context, acpsdk.NewSessionRequest) ( Models: helperModelState("new-model"), }, nil } + if a.scenario == "config_options" || + a.scenario == "config_options_no_reasoning" || + a.scenario == "config_option_update" { + configOptions := helperConfigOptions("new-model", "medium") + if a.scenario == "config_options_no_reasoning" { + configOptions = helperModelConfigOptions("new-model") + } + a.setHelperConfigOptions(configOptions) + return acpsdk.NewSessionResponse{ + SessionId: "sess-new", + Modes: helperModeState("new-mode"), + Models: helperModelState("new-model"), + ConfigOptions: configOptions, + }, nil + } return acpsdk.NewSessionResponse{ SessionId: "sess-new", Modes: helperModeState("new-mode"), @@ -1713,6 +2049,15 @@ func (a *helperACPAgent) LoadSession(context.Context, acpsdk.LoadSessionRequest) Models: helperModelState("loaded-model"), }, nil } + if a.scenario == "load_config_options" { 
+ configOptions := helperConfigOptions("loaded-model", "high") + a.setHelperConfigOptions(configOptions) + return acpsdk.LoadSessionResponse{ + Modes: helperModeState("loaded-mode"), + Models: helperModelState("loaded-model"), + ConfigOptions: configOptions, + }, nil + } return acpsdk.LoadSessionResponse{ Modes: helperModeState("loaded-mode"), Models: helperModelState("loaded-model"), @@ -1754,6 +2099,17 @@ func (a *helperACPAgent) Prompt(ctx context.Context, params acpsdk.PromptRequest }); sendErr != nil { return acpsdk.PromptResponse{}, sendErr } + case "config_option_update": + if sendErr := a.conn.SessionUpdate(ctx, acpsdk.SessionNotification{ + SessionId: params.SessionId, + Update: acpsdk.SessionUpdate{ + ConfigOptionUpdate: &acpsdk.SessionConfigOptionUpdate{ + ConfigOptions: helperConfigOptions("other-model", "xhigh"), + }, + }, + }); sendErr != nil { + return acpsdk.PromptResponse{}, sendErr + } case "fs_read": response, err := a.conn.ReadTextFile(ctx, acpsdk.ReadTextFileRequest{ SessionId: params.SessionId, @@ -1836,7 +2192,7 @@ func (a *helperACPAgent) Prompt(ctx context.Context, params acpsdk.PromptRequest {OptionId: "reject-once", Name: "reject once", Kind: acpsdk.PermissionOptionKindRejectOnce}, {OptionId: "reject-always", Name: "reject always", Kind: acpsdk.PermissionOptionKindRejectAlways}, }, - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: "tool-1", Title: &title, Locations: []acpsdk.ToolCallLocation{ @@ -1965,11 +2321,85 @@ func (a *helperACPAgent) SetSessionMode( return acpsdk.SetSessionModeResponse{}, nil } -func (a *helperACPAgent) SetSessionModel( +func (a *helperACPAgent) SetSessionConfigOption( + _ context.Context, + request acpsdk.SetSessionConfigOptionRequest, +) (acpsdk.SetSessionConfigOptionResponse, error) { + a.configOptionsMu.Lock() + defer a.configOptionsMu.Unlock() + if request.ValueId != nil { + configID := string(request.ValueId.ConfigId) + value := 
acpsdk.SessionConfigValueId(strings.TrimSpace(string(request.ValueId.Value))) + for index := range a.configOptions { + if a.configOptions[index].Select == nil || string(a.configOptions[index].Select.Id) != configID { + continue + } + a.configOptions[index].Select.CurrentValue = value + } + } + return acpsdk.SetSessionConfigOptionResponse{ + ConfigOptions: append([]acpsdk.SessionConfigOption(nil), a.configOptions...), + }, nil +} + +func (a *helperACPAgent) UnstableSetSessionModel( context.Context, - acpsdk.SetSessionModelRequest, -) (acpsdk.SetSessionModelResponse, error) { - return acpsdk.SetSessionModelResponse{}, nil + acpsdk.UnstableSetSessionModelRequest, +) (acpsdk.UnstableSetSessionModelResponse, error) { + return acpsdk.UnstableSetSessionModelResponse{}, nil +} + +func (a *helperACPAgent) setHelperConfigOptions(options []acpsdk.SessionConfigOption) { + a.configOptionsMu.Lock() + defer a.configOptionsMu.Unlock() + a.configOptions = append([]acpsdk.SessionConfigOption(nil), options...) 
+} + +func helperConfigOptions(modelCurrent string, reasoningCurrent string) []acpsdk.SessionConfigOption { + options := helperModelConfigOptions(modelCurrent) + options = append(options, helperSelectConfigOption( + "reasoning_effort", + "Reasoning effort", + reasoningCurrent, + "minimal", + "low", + "medium", + "high", + "xhigh", + )) + return options +} + +func helperModelConfigOptions(current string) []acpsdk.SessionConfigOption { + return []acpsdk.SessionConfigOption{ + helperSelectConfigOption("model", "Model", current, "new-model", "loaded-model", "other-model"), + } +} + +func helperSelectConfigOption( + id string, + name string, + current string, + values ...string, +) acpsdk.SessionConfigOption { + selectOptions := make(acpsdk.SessionConfigSelectOptionsUngrouped, 0, len(values)) + for _, value := range values { + selectOptions = append(selectOptions, acpsdk.SessionConfigSelectOption{ + Value: acpsdk.SessionConfigValueId(value), + Name: value, + }) + } + return acpsdk.SessionConfigOption{ + Select: &acpsdk.SessionConfigOptionSelect{ + Id: acpsdk.SessionConfigId(id), + Name: name, + CurrentValue: acpsdk.SessionConfigValueId(current), + Options: acpsdk.SessionConfigSelectOptions{ + Ungrouped: &selectOptions, + }, + Type: "select", + }, + } } func helperModeState(id string) *acpsdk.SessionModeState { diff --git a/internal/acp/config_options.go b/internal/acp/config_options.go new file mode 100644 index 000000000..05f4c3f4a --- /dev/null +++ b/internal/acp/config_options.go @@ -0,0 +1,141 @@ +package acp + +import ( + "strings" + + acpsdk "github.com/coder/acp-go-sdk" +) + +func sessionConfigOptionsFromSDK(options []acpsdk.SessionConfigOption) []SessionConfigOption { + if len(options) == 0 { + return nil + } + converted := make([]SessionConfigOption, 0, len(options)) + for _, option := range options { + if convertedOption, ok := sessionConfigOptionFromSDK(option); ok { + converted = append(converted, convertedOption) + } + } + return converted +} + +func 
sessionConfigOptionFromSDK(option acpsdk.SessionConfigOption) (SessionConfigOption, bool) { + switch { + case option.Select != nil: + selectOption := option.Select + id := strings.TrimSpace(string(selectOption.Id)) + if id == "" { + return SessionConfigOption{}, false + } + return SessionConfigOption{ + ID: id, + Label: strings.TrimSpace(selectOption.Name), + Description: trimStringPointer(selectOption.Description), + Kind: SessionConfigOptionKindSelect, + Current: strings.TrimSpace(string(selectOption.CurrentValue)), + Values: sessionConfigValuesFromSDK(selectOption.Options), + }, true + case option.Boolean != nil: + booleanOption := option.Boolean + id := strings.TrimSpace(string(booleanOption.Id)) + if id == "" { + return SessionConfigOption{}, false + } + current := "false" + if booleanOption.CurrentValue { + current = "true" + } + return SessionConfigOption{ + ID: id, + Label: strings.TrimSpace(booleanOption.Name), + Description: trimStringPointer(booleanOption.Description), + Kind: SessionConfigOptionKindBoolean, + Current: current, + }, true + default: + return SessionConfigOption{}, false + } +} + +func sessionConfigValuesFromSDK(options acpsdk.SessionConfigSelectOptions) []SessionConfigOptionValue { + values := make([]SessionConfigOptionValue, 0) + if options.Ungrouped != nil { + for _, value := range *options.Ungrouped { + values = append(values, sessionConfigValueFromSDK(value)) + } + } + if options.Grouped != nil { + for _, group := range *options.Grouped { + for _, value := range group.Options { + values = append(values, sessionConfigValueFromSDK(value)) + } + } + } + if len(values) == 0 { + return nil + } + return values +} + +func sessionConfigValueFromSDK(value acpsdk.SessionConfigSelectOption) SessionConfigOptionValue { + return SessionConfigOptionValue{ + Value: strings.TrimSpace(string(value.Value)), + Label: strings.TrimSpace(value.Name), + Description: trimStringPointer(value.Description), + } +} + +func trimStringPointer(value *string) string 
{ + if value == nil { + return "" + } + return strings.TrimSpace(*value) +} + +func findModelConfigOption(options []SessionConfigOption) (SessionConfigOption, bool) { + return findSelectConfigOption(options, "model") +} + +func findReasoningConfigOption(options []SessionConfigOption) (SessionConfigOption, bool) { + return findSelectConfigOption(options, "reasoning_effort", "effort") +} + +func findSelectConfigOption(options []SessionConfigOption, candidateIDs ...string) (SessionConfigOption, bool) { + for _, candidateID := range candidateIDs { + for _, option := range options { + if option.Kind != SessionConfigOptionKindSelect { + continue + } + if strings.TrimSpace(option.ID) == candidateID { + return option, true + } + } + } + return SessionConfigOption{}, false +} + +func configOptionAllowsValue(option SessionConfigOption, value string) bool { + value = strings.TrimSpace(value) + if value == "" || option.Kind != SessionConfigOptionKindSelect { + return false + } + for _, candidate := range option.Values { + if strings.TrimSpace(candidate.Value) == value { + return true + } + } + return false +} + +func legacyModelStateAllows(caps Caps, modelID string) bool { + modelID = strings.TrimSpace(modelID) + if modelID == "" { + return false + } + for _, candidate := range caps.SupportedModels { + if strings.TrimSpace(candidate) == modelID { + return true + } + } + return false +} diff --git a/internal/acp/config_options_test.go b/internal/acp/config_options_test.go new file mode 100644 index 000000000..99fb995b4 --- /dev/null +++ b/internal/acp/config_options_test.go @@ -0,0 +1,139 @@ +package acp + +import ( + "slices" + "testing" + + acpsdk "github.com/coder/acp-go-sdk" +) + +func TestSessionConfigOptionsFromSDK(t *testing.T) { + t.Run("Should convert select boolean and grouped config options", func(t *testing.T) { + t.Parallel() + + description := "Choose active model" + booleanDescription := "Enable verbose output" + grouped := 
acpsdk.SessionConfigSelectOptionsGrouped{ + { + Group: "frontier", + Name: "Frontier", + Options: []acpsdk.SessionConfigSelectOption{ + {Value: "model-a", Name: "Model A", Description: &description}, + }, + }, + } + + options := sessionConfigOptionsFromSDK([]acpsdk.SessionConfigOption{ + { + Select: &acpsdk.SessionConfigOptionSelect{ + Id: "model", + Name: "Model", + Description: &description, + CurrentValue: "model-a", + Options: acpsdk.SessionConfigSelectOptions{ + Grouped: &grouped, + }, + Type: "select", + }, + }, + { + Boolean: &acpsdk.SessionConfigOptionBoolean{ + Id: "verbose", + Name: "Verbose", + Description: &booleanDescription, + CurrentValue: true, + Type: "boolean", + }, + }, + { + Select: &acpsdk.SessionConfigOptionSelect{ + Id: " ", + Name: "Ignored", + Type: "select", + }, + }, + }) + + if len(options) != 2 { + t.Fatalf("sessionConfigOptionsFromSDK() len = %d, want 2: %#v", len(options), options) + } + model := options[0] + if model.ID != "model" || model.Label != "Model" || model.Description != description || + model.Kind != SessionConfigOptionKindSelect || model.Current != "model-a" { + t.Fatalf("model option = %#v", model) + } + if len(model.Values) != 1 || model.Values[0].Value != "model-a" || + model.Values[0].Description != description { + t.Fatalf("model values = %#v", model.Values) + } + boolean := options[1] + if boolean.ID != "verbose" || boolean.Kind != SessionConfigOptionKindBoolean || boolean.Current != "true" || + boolean.Description != booleanDescription { + t.Fatalf("boolean option = %#v", boolean) + } + }) + + t.Run("Should return nil for absent options", func(t *testing.T) { + t.Parallel() + + if got := sessionConfigOptionsFromSDK(nil); got != nil { + t.Fatalf("sessionConfigOptionsFromSDK(nil) = %#v, want nil", got) + } + }) +} + +func TestConfigOptionMatching(t *testing.T) { + t.Parallel() + + options := []SessionConfigOption{ + {ID: "verbose", Kind: SessionConfigOptionKindBoolean, Current: "true"}, + { + ID: "model", + Kind: 
SessionConfigOptionKindSelect, + Current: "model-a", + Values: []SessionConfigOptionValue{{Value: "model-a"}, {Value: "model-b"}}, + }, + { + ID: "effort", + Kind: SessionConfigOptionKindSelect, + Current: "low", + Values: []SessionConfigOptionValue{{Value: "low"}, {Value: "high"}}, + }, + } + + model, ok := findModelConfigOption(options) + if !ok || model.ID != "model" { + t.Fatalf("findModelConfigOption() = %#v, %v", model, ok) + } + reasoning, ok := findReasoningConfigOption(options) + if !ok || reasoning.ID != "effort" { + t.Fatalf("findReasoningConfigOption() = %#v, %v", reasoning, ok) + } + if !configOptionAllowsValue(model, "model-b") { + t.Fatal("configOptionAllowsValue() rejected advertised model") + } + if configOptionAllowsValue(model, "model-c") { + t.Fatal("configOptionAllowsValue() accepted unadvertised model") + } + if configOptionAllowsValue(options[0], "true") { + t.Fatal("configOptionAllowsValue() accepted boolean option as select") + } +} + +func TestLegacyModelStateAllows(t *testing.T) { + t.Parallel() + + caps := Caps{SupportedModels: []string{"model-a", "model-b"}} + if !legacyModelStateAllows(caps, "model-b") { + t.Fatal("legacyModelStateAllows() rejected advertised model") + } + if legacyModelStateAllows(caps, "model-c") { + t.Fatal("legacyModelStateAllows() accepted unadvertised model") + } + if legacyModelStateAllows(Caps{}, "model-a") { + t.Fatal("legacyModelStateAllows() accepted model without legacy state") + } + if !slices.Equal(CloneCaps(caps).SupportedModels, caps.SupportedModels) { + t.Fatalf("CloneCaps() did not preserve models") + } +} diff --git a/internal/acp/handlers.go b/internal/acp/handlers.go index 5308d20c1..b32f9ce0f 100644 --- a/internal/acp/handlers.go +++ b/internal/acp/handlers.go @@ -26,6 +26,7 @@ import ( const ( defaultTerminalOutputLimit = 64 * 1024 networkCommandName = "network" + sessionUpdateConfigOption = "config_option_update" ) type wireSessionNotification struct { @@ -381,6 +382,9 @@ func (p *AgentProcess) 
handleSessionUpdate(params json.RawMessage) error { if err := json.Unmarshal(params, &notification); err != nil { return fmt.Errorf("acp: decode session notification: %w", err) } + if notification.Update.ConfigOptionUpdate != nil { + p.setConfigOptions(sessionConfigOptionsFromSDK(notification.Update.ConfigOptionUpdate.ConfigOptions)) + } event := translateSessionUpdate(notification, raw.Update, p.activeTurnID()) event = p.markToolEventPrechecked(event) @@ -681,20 +685,20 @@ func (p *AgentProcess) takeExternalTerminalProcess(id string) *toolruntime.Handl } func (p *AgentProcess) handleKillTerminal( - request acpsdk.KillTerminalCommandRequest, -) (acpsdk.KillTerminalCommandResponse, error) { + request acpsdk.KillTerminalRequest, +) (acpsdk.KillTerminalResponse, error) { if err := p.ensureNetworkTurnTerminalAccess(request.TerminalId, false); err != nil { - return acpsdk.KillTerminalCommandResponse{}, err + return acpsdk.KillTerminalResponse{}, err } if err := p.toolHostOrDefault().KillTerminal(request.TerminalId); err != nil { - return acpsdk.KillTerminalCommandResponse{}, err + return acpsdk.KillTerminalResponse{}, err } p.completeExternalTerminalProcess( context.Background(), request.TerminalId, toolruntime.ProcessCompletion{Err: errors.New("terminal killed")}, ) - return acpsdk.KillTerminalCommandResponse{}, nil + return acpsdk.KillTerminalResponse{}, nil } func (p *AgentProcess) handleTerminalOutput( @@ -1185,6 +1189,9 @@ func translateSessionUpdate( case notification.Update.CurrentModeUpdate != nil: event.Type = EventTypeSystem event.Title = "current_mode_update" + case notification.Update.ConfigOptionUpdate != nil: + event.Type = EventTypeSystem + event.Title = sessionUpdateConfigOption default: event.Type = EventTypeSystem } diff --git a/internal/acp/handlers_test.go b/internal/acp/handlers_test.go index a79b0b0e3..fdc32e742 100644 --- a/internal/acp/handlers_test.go +++ b/internal/acp/handlers_test.go @@ -415,7 +415,7 @@ func TestHandleInboundPermissionRequest(t 
*testing.T) { {OptionId: "reject-once", Name: "reject once", Kind: acpsdk.PermissionOptionKindRejectOnce}, {OptionId: "reject-always", Name: "reject always", Kind: acpsdk.PermissionOptionKindRejectAlways}, }, - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: "tool-1", Title: &title, Kind: &kind, @@ -527,7 +527,7 @@ func TestHandleInboundPermissionRequestTimeout(t *testing.T) { {OptionId: "allow-once", Name: "allow once", Kind: acpsdk.PermissionOptionKindAllowOnce}, {OptionId: "reject-once", Name: "reject once", Kind: acpsdk.PermissionOptionKindRejectOnce}, }, - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: "tool-timeout", Title: &title, Kind: &kind, @@ -581,7 +581,7 @@ func TestHandleInboundPermissionRequestHonorsDenyAllWithToolGateway(t *testing.T {OptionId: "allow-once", Name: "allow once", Kind: acpsdk.PermissionOptionKindAllowOnce}, {OptionId: "reject-once", Name: "reject once", Kind: acpsdk.PermissionOptionKindRejectOnce}, }, - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: "tool-deny-all", Title: &title, Kind: &kind, @@ -637,7 +637,7 @@ func TestHandleInboundPermissionRequestHonorsApproveAllWithToolGateway(t *testin {OptionId: "allow-once", Name: "allow once", Kind: acpsdk.PermissionOptionKindAllowOnce}, {OptionId: "reject-once", Name: "reject once", Kind: acpsdk.PermissionOptionKindRejectOnce}, }, - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: "tool-approve-all", Title: &title, Kind: &kind, @@ -738,10 +738,10 @@ func TestResolvePermissionByTurnIDConflictsWhenMultipleRequestsPending(t *testin proc := newDirectProcess(t, aghconfig.PermissionModeDenyAll) turnID := "turn-conflict" _, first := proc.registerPendingPermission(turnID, acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ToolCallId: "tool-1"}, + ToolCall: acpsdk.ToolCallUpdate{ToolCallId: "tool-1"}, 
}) _, second := proc.registerPendingPermission(turnID, acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ToolCallId: "tool-2"}, + ToolCall: acpsdk.ToolCallUpdate{ToolCallId: "tool-2"}, }) t.Cleanup(func() { proc.clearPendingPermission(first.requestID) @@ -773,7 +773,7 @@ func TestResolvePermissionConcurrentSafety(t *testing.T) { requestID, pending := proc.registerPendingPermission( fmt.Sprintf("turn-%d", i), acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ToolCallId: acpsdk.ToolCallId(fmt.Sprintf("tool-%d", i))}, + ToolCall: acpsdk.ToolCallUpdate{ToolCallId: acpsdk.ToolCallId(fmt.Sprintf("tool-%d", i))}, }, ) registeredPending = append(registeredPending, registered{ @@ -831,7 +831,7 @@ func TestHandleInboundPermissionRequestAutoApprovesReadRequests(t *testing.T) { {OptionId: "allow-once", Name: "allow once", Kind: acpsdk.PermissionOptionKindAllowOnce}, {OptionId: "reject-once", Name: "reject once", Kind: acpsdk.PermissionOptionKindRejectOnce}, }, - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: "tool-read", Title: &title, Kind: &kind, @@ -921,7 +921,7 @@ func TestTerminalLifecycleHandlers(t *testing.T) { if _, reqErr := proc.handleInbound( context.Background(), acpsdk.ClientMethodTerminalKill, - mustMarshalJSON(acpsdk.KillTerminalCommandRequest{ + mustMarshalJSON(acpsdk.KillTerminalRequest{ SessionId: "sess-direct", TerminalId: createResponse.TerminalId, }), @@ -1055,7 +1055,7 @@ func TestNetworkTurnTerminalOwnershipGuards(t *testing.T) { t.Fatalf("handleWaitForTerminalExit(previous network turn) error = %v, want ErrToolBlockedForNetworkTurn", err) } - if _, err := proc.handleKillTerminal(acpsdk.KillTerminalCommandRequest{ + if _, err := proc.handleKillTerminal(acpsdk.KillTerminalRequest{ SessionId: "sess-direct", TerminalId: networkCreate.TerminalId, }); err != nil { @@ -1096,7 +1096,7 @@ func TestNetworkTurnTerminalOwnershipGuards(t *testing.T) { { name: "kill 
user terminal", run: func() error { - _, err := proc.handleKillTerminal(acpsdk.KillTerminalCommandRequest{ + _, err := proc.handleKillTerminal(acpsdk.KillTerminalRequest{ SessionId: "sess-direct", TerminalId: userCreate.TerminalId, }) @@ -1444,7 +1444,7 @@ func TestPermissionHelperBranches(t *testing.T) { readKind := acpsdk.ToolKindRead readDecision, interactive := policy.permissionDecision(acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ Kind: &readKind, Locations: []acpsdk.ToolCallLocation{{Path: filepath.Join(root, "inside.txt")}}, }, @@ -1459,7 +1459,7 @@ func TestPermissionHelperBranches(t *testing.T) { } editKind := acpsdk.ToolKindEdit editDecision, interactive := approveReadsPolicy.permissionDecision(acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{Kind: &editKind}, + ToolCall: acpsdk.ToolCallUpdate{Kind: &editKind}, }) if editDecision != decisionPending || !interactive { t.Fatalf("permissionDecision(edit) = %q, %v, want %q, true", editDecision, interactive, decisionPending) @@ -1470,7 +1470,7 @@ func TestPermissionHelperBranches(t *testing.T) { t.Fatalf("newPermissionPolicy(deny-all) error = %v", err) } denyDecision, interactive := denyAllPolicy.permissionDecision(acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{Kind: &editKind}, + ToolCall: acpsdk.ToolCallUpdate{Kind: &editKind}, }) if denyDecision != decisionRejectOnce || interactive { t.Fatalf("permissionDecision(deny-all) = %q, %v, want %q, false", denyDecision, interactive, decisionRejectOnce) @@ -1481,7 +1481,7 @@ func TestPermissionHelperBranches(t *testing.T) { } title := "Write file" if got := permissionRequestName("turn-1", acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ Title: &title, Kind: &editKind, }, diff --git a/internal/acp/permission.go b/internal/acp/permission.go index 2b07f69ea..96342a3c4 100644 --- 
a/internal/acp/permission.go +++ b/internal/acp/permission.go @@ -555,14 +555,14 @@ func permissionRequestName(turnID string, request acpsdk.RequestPermissionReques return strings.Join(parts, ":") } -func toolCallTitle(toolCall acpsdk.RequestPermissionToolCall) string { +func toolCallTitle(toolCall acpsdk.ToolCallUpdate) string { if toolCall.Title == nil { return "" } return strings.TrimSpace(*toolCall.Title) } -func toolCallKind(toolCall acpsdk.RequestPermissionToolCall) string { +func toolCallKind(toolCall acpsdk.ToolCallUpdate) string { if toolCall.Kind == nil { return "" } diff --git a/internal/acp/process_tree_test.go b/internal/acp/process_tree_test.go index a251e4769..ae615b58f 100644 --- a/internal/acp/process_tree_test.go +++ b/internal/acp/process_tree_test.go @@ -83,7 +83,7 @@ func TestTerminalKillTerminatesWrappedProcessTree(t *testing.T) { if _, reqErr := proc.handleInbound( context.Background(), acpsdk.ClientMethodTerminalKill, - mustMarshalJSON(acpsdk.KillTerminalCommandRequest{ + mustMarshalJSON(acpsdk.KillTerminalRequest{ SessionId: "sess-direct", TerminalId: createResponse.TerminalId, }), diff --git a/internal/acp/types.go b/internal/acp/types.go index 24ac05faf..940291e95 100644 --- a/internal/acp/types.go +++ b/internal/acp/types.go @@ -62,6 +62,7 @@ type StartOpts struct { Permissions aghconfig.PermissionMode SystemPrompt string PreferredModel string + ReasoningEffort string ResumeSessionID string Launcher sandbox.Launcher ToolHost sandbox.ToolHost @@ -218,6 +219,23 @@ func (m PromptMeta) IsZero() bool { return normalized.TurnSource == "" && normalized.Network == nil && normalized.Synthetic == nil } +// ToMap converts normalized prompt metadata to the ACP SDK extensibility map. 
+func (m PromptMeta) ToMap() (map[string]any, error) { + normalized := m.Normalize() + if normalized.IsZero() { + return nil, nil + } + encoded, err := json.Marshal(normalized) + if err != nil { + return nil, fmt.Errorf("acp: encode prompt metadata: %w", err) + } + var decoded map[string]any + if err := json.Unmarshal(encoded, &decoded); err != nil { + return nil, fmt.Errorf("acp: decode prompt metadata map: %w", err) + } + return decoded, nil +} + // Validate ensures the metadata shape is internally consistent. func (m PromptMeta) Validate() error { normalized := m.Normalize() @@ -307,6 +325,58 @@ type Caps struct { SupportsLoadSession bool SupportedModes []string SupportedModels []string + ConfigOptions []SessionConfigOption +} + +// SessionConfigOptionKind identifies the ACP config option shape AGH exposes. +type SessionConfigOptionKind string + +const ( + // SessionConfigOptionKindSelect is a single-value ACP config selector. + SessionConfigOptionKindSelect SessionConfigOptionKind = "select" + // SessionConfigOptionKindBoolean is an ACP boolean config toggle. + SessionConfigOptionKindBoolean SessionConfigOptionKind = "boolean" +) + +// SessionConfigOption captures one active ACP session config option. +type SessionConfigOption struct { + ID string + Label string + Description string + Kind SessionConfigOptionKind + Current string + Values []SessionConfigOptionValue +} + +// SessionConfigOptionValue captures one selectable value for an ACP config option. +type SessionConfigOptionValue struct { + Value string + Label string + Description string +} + +// CloneCaps returns a deep copy of ACP caps. 
+func CloneCaps(caps Caps) Caps { + return Caps{ + SupportsLoadSession: caps.SupportsLoadSession, + SupportedModes: append([]string(nil), caps.SupportedModes...), + SupportedModels: append([]string(nil), caps.SupportedModels...), + ConfigOptions: CloneSessionConfigOptions(caps.ConfigOptions), + } +} + +// CloneSessionConfigOptions returns a deep copy of session config options. +func CloneSessionConfigOptions(options []SessionConfigOption) []SessionConfigOption { + if len(options) == 0 { + return nil + } + cloned := make([]SessionConfigOption, 0, len(options)) + for _, option := range options { + copyOption := option + copyOption.Values = append([]SessionConfigOptionValue(nil), option.Values...) + cloned = append(cloned, copyOption) + } + return cloned } // TokenUsage captures per-turn usage reported by the agent. @@ -416,6 +486,7 @@ type AgentProcess struct { Caps Caps StartedAt time.Time + capsMu sync.RWMutex managed *subprocess.Process handle sandbox.Handle toolHostMu sync.Mutex @@ -524,6 +595,34 @@ func (p *AgentProcess) HealthState() subprocess.HealthState { return p.managed.HealthState() } +// CapsSnapshot returns the latest ACP capability/config snapshot. +func (p *AgentProcess) CapsSnapshot() Caps { + if p == nil { + return Caps{} + } + p.capsMu.RLock() + defer p.capsMu.RUnlock() + return CloneCaps(p.Caps) +} + +func (p *AgentProcess) setCaps(caps Caps) { + if p == nil { + return + } + p.capsMu.Lock() + defer p.capsMu.Unlock() + p.Caps = CloneCaps(caps) +} + +func (p *AgentProcess) setConfigOptions(options []SessionConfigOption) { + if p == nil { + return + } + p.capsMu.Lock() + defer p.capsMu.Unlock() + p.Caps.ConfigOptions = CloneSessionConfigOptions(options) +} + // ToolHost returns the sandbox-owned tool host used by this process. 
func (p *AgentProcess) ToolHost() sandbox.ToolHost { if p == nil { diff --git a/internal/acp/types_test.go b/internal/acp/types_test.go index 33d832e22..56db02dd9 100644 --- a/internal/acp/types_test.go +++ b/internal/acp/types_test.go @@ -7,6 +7,73 @@ import ( "time" ) +func TestPromptMetaToMap(t *testing.T) { + t.Parallel() + + t.Run("Should convert normalized prompt metadata", func(t *testing.T) { + t.Parallel() + + metaMap, err := (PromptMeta{ + TurnSource: PromptTurnSourceSynthetic, + Synthetic: &PromptSyntheticMeta{ + Reason: "wake", + }, + }).ToMap() + if err != nil { + t.Fatalf("ToMap() error = %v", err) + } + if metaMap["turn_source"] != PromptTurnSourceSynthetic { + t.Fatalf("ToMap() turn_source = %#v", metaMap["turn_source"]) + } + synthetic, ok := metaMap["synthetic"].(map[string]any) + if !ok || synthetic["reason"] != "wake" { + t.Fatalf("ToMap() synthetic = %#v", metaMap["synthetic"]) + } + }) + + t.Run("Should return nil for empty metadata", func(t *testing.T) { + t.Parallel() + + metaMap, err := (PromptMeta{}).ToMap() + if err != nil { + t.Fatalf("ToMap(empty) error = %v", err) + } + if metaMap != nil { + t.Fatalf("ToMap(empty) = %#v, want nil", metaMap) + } + }) +} + +func TestAgentProcessCapsSnapshotClonesConfigOptions(t *testing.T) { + t.Parallel() + + proc := &AgentProcess{} + proc.setCaps(Caps{ + ConfigOptions: []SessionConfigOption{ + { + ID: "model", + Kind: SessionConfigOptionKindSelect, + Current: "model-a", + Values: []SessionConfigOptionValue{{Value: "model-a"}}, + }, + }, + }) + + first := proc.CapsSnapshot() + first.ConfigOptions[0].Current = "mutated" + first.ConfigOptions[0].Values[0].Value = "mutated" + + second := proc.CapsSnapshot() + if second.ConfigOptions[0].Current != "model-a" || second.ConfigOptions[0].Values[0].Value != "model-a" { + t.Fatalf("CapsSnapshot() leaked mutable config options: %#v", second.ConfigOptions) + } + proc.setConfigOptions([]SessionConfigOption{{ID: "reasoning_effort", Kind: 
SessionConfigOptionKindSelect}}) + updated := proc.CapsSnapshot() + if len(updated.ConfigOptions) != 1 || updated.ConfigOptions[0].ID != "reasoning_effort" { + t.Fatalf("setConfigOptions() = %#v", updated.ConfigOptions) + } +} + func TestEndPromptClearsActivePromptWhileEmitterIsBackpressured(t *testing.T) { t.Parallel() diff --git a/internal/api/contract/contract.go b/internal/api/contract/contract.go index e9dd9d64d..b9ab7b5a4 100644 --- a/internal/api/contract/contract.go +++ b/internal/api/contract/contract.go @@ -14,12 +14,14 @@ import ( // CreateSessionRequest is the shared session creation request payload. type CreateSessionRequest struct { - AgentName string `json:"agent_name,omitempty"` - Provider string `json:"provider,omitempty"` - Name string `json:"name,omitempty"` - Workspace string `json:"workspace,omitempty"` - WorkspacePath string `json:"workspace_path,omitempty"` - Channel string `json:"channel,omitempty"` + AgentName string `json:"agent_name,omitempty"` + Provider string `json:"provider,omitempty"` + Model string `json:"model,omitempty"` + ReasoningEffort string `json:"reasoning_effort,omitempty"` + Name string `json:"name,omitempty"` + Workspace string `json:"workspace,omitempty"` + WorkspacePath string `json:"workspace_path,omitempty"` + Channel string `json:"channel,omitempty"` } // ApproveSessionRequest is the interactive permission approval payload. @@ -31,15 +33,17 @@ type ApproveSessionRequest struct { // SessionPayload is the shared session response payload. 
type SessionPayload struct { - ID string `json:"id"` - Name string `json:"name,omitempty"` - AgentName string `json:"agent_name"` - Provider string `json:"provider"` - WorkspaceID string `json:"workspace_id,omitempty"` - WorkspacePath string `json:"workspace_path,omitempty"` - Channel string `json:"channel,omitempty"` - Type session.Type `json:"type,omitempty"` - State session.State `json:"state"` + ID string `json:"id"` + Name string `json:"name,omitempty"` + AgentName string `json:"agent_name"` + Provider string `json:"provider"` + Model string `json:"model,omitempty"` + ReasoningEffort string `json:"reasoning_effort,omitempty"` + WorkspaceID string `json:"workspace_id,omitempty"` + WorkspacePath string `json:"workspace_path,omitempty"` + Channel string `json:"channel,omitempty"` + Type session.Type `json:"type,omitempty"` + State session.State `json:"state"` // StopReason is the session-level stop classification, distinct from AgentEventPayload.StopReason. StopReason store.StopReason `json:"stop_reason,omitempty"` // StopDetail is the session-level stop context paired with StopReason. @@ -95,9 +99,151 @@ type SessionSandboxPayload struct { // ACPCapsPayload is the JSON representation of ACP capabilities. type ACPCapsPayload struct { - SupportsLoadSession bool `json:"supports_load_session"` - SupportedModes []string `json:"supported_modes,omitempty"` - SupportedModels []string `json:"supported_models,omitempty"` + SupportsLoadSession bool `json:"supports_load_session"` + SupportedModes []string `json:"supported_modes,omitempty"` + SupportedModels []string `json:"supported_models,omitempty"` + ConfigOptions []SessionConfigOptionPayload `json:"config_options,omitempty"` +} + +// SessionConfigOptionPayload is one active ACP session config option. 
+type SessionConfigOptionPayload struct { + ID string `json:"id"` + Label string `json:"label,omitempty"` + Description string `json:"description,omitempty"` + Kind string `json:"kind"` + Current string `json:"current,omitempty"` + Values []SessionConfigOptionValuePayload `json:"values,omitempty"` +} + +// SessionConfigOptionValuePayload is one selectable value for an active ACP config option. +type SessionConfigOptionValuePayload struct { + Value string `json:"value"` + Label string `json:"label,omitempty"` + Description string `json:"description,omitempty"` +} + +// ProviderModelListResponse is the native provider model catalog list payload. +type ProviderModelListResponse struct { + Models []ProviderModelPayload `json:"models"` +} + +// ProviderModelRefreshRequest captures one provider model catalog refresh request. +type ProviderModelRefreshRequest struct { + SourceID string `json:"source_id,omitempty"` + Force bool `json:"force,omitempty"` + RequestID string `json:"request_id,omitempty"` +} + +// ProviderModelRefreshResponse reports provider model catalog refresh source status. +type ProviderModelRefreshResponse struct { + Sources []ModelCatalogSourceStatusPayload `json:"sources"` + Error string `json:"error,omitempty"` +} + +// ProviderModelStatusResponse reports provider model catalog source status. +type ProviderModelStatusResponse struct { + Sources []ModelCatalogSourceStatusPayload `json:"sources"` +} + +// ProviderModelPayload is one merged provider model catalog projection. 
+type ProviderModelPayload struct { + ProviderID string `json:"provider_id"` + ModelID string `json:"model_id"` + DisplayName string `json:"display_name,omitempty"` + Sources []ModelCatalogSourceRefPayload `json:"sources"` + Available *bool `json:"available"` + AvailabilityState string `json:"availability_state"` + Stale bool `json:"stale"` + RefreshedAt string `json:"refreshed_at,omitempty"` + ContextWindow *int64 `json:"context_window,omitempty"` + MaxInputTokens *int64 `json:"max_input_tokens,omitempty"` + MaxOutputTokens *int64 `json:"max_output_tokens,omitempty"` + SupportsTools *bool `json:"supports_tools,omitempty"` + SupportsReasoning *bool `json:"supports_reasoning,omitempty"` + ReasoningEfforts []string `json:"reasoning_efforts,omitempty"` + DefaultReasoningEffort *string `json:"default_reasoning_effort,omitempty"` + Cost *ModelCatalogCostPayload `json:"cost,omitempty"` + LastError string `json:"last_error,omitempty"` +} + +// ModelCatalogSourceRefPayload identifies one source used by a merged model. +type ModelCatalogSourceRefPayload struct { + SourceID string `json:"source_id"` + SourceKind string `json:"source_kind"` + Priority int `json:"priority"` + RefreshedAt string `json:"refreshed_at,omitempty"` + Stale bool `json:"stale"` + LastError string `json:"last_error,omitempty"` +} + +// ModelCatalogSourceStatusPayload reports provider-scoped catalog source health. +type ModelCatalogSourceStatusPayload struct { + SourceID string `json:"source_id"` + SourceKind string `json:"source_kind"` + ProviderID string `json:"provider_id"` + Priority int `json:"priority"` + LastRefresh string `json:"last_refresh,omitempty"` + NextRefresh string `json:"next_refresh,omitempty"` + LastSuccess string `json:"last_success,omitempty"` + LastError string `json:"last_error,omitempty"` + RefreshState string `json:"refresh_state"` + RowCount int `json:"row_count"` + Stale bool `json:"stale"` +} + +// ModelCatalogCostPayload reports normalized model price hints. 
+type ModelCatalogCostPayload struct { + InputPerMillion *float64 `json:"input_per_million,omitempty"` + OutputPerMillion *float64 `json:"output_per_million,omitempty"` +} + +// OpenAIModelListResponse is the OpenAI-compatible model list projection. +type OpenAIModelListResponse struct { + Object string `json:"object"` + Data []OpenAIModelPayload `json:"data"` +} + +// OpenAIModelPayload is one OpenAI-compatible model object with AGH metadata. +type OpenAIModelPayload struct { + ID string `json:"id"` + Object string `json:"object"` + Created int64 `json:"created"` + OwnedBy string `json:"owned_by"` + AGH OpenAIModelAGHPayload `json:"agh"` +} + +// OpenAIModelAGHPayload carries AGH-specific model metadata under the `agh` key. +type OpenAIModelAGHPayload struct { + ProviderID string `json:"provider_id"` + ModelID string `json:"model_id"` + DisplayName string `json:"display_name,omitempty"` + Sources []string `json:"sources"` + Available *bool `json:"available"` + AvailabilityState string `json:"availability_state"` + Stale bool `json:"stale"` + RefreshedAt string `json:"refreshed_at,omitempty"` + ContextWindow *int64 `json:"context_window,omitempty"` + MaxInputTokens *int64 `json:"max_input_tokens,omitempty"` + MaxOutputTokens *int64 `json:"max_output_tokens,omitempty"` + SupportsTools *bool `json:"supports_tools,omitempty"` + SupportsReasoning *bool `json:"supports_reasoning,omitempty"` + ReasoningEfforts []string `json:"reasoning_efforts,omitempty"` + DefaultReasoningEffort *string `json:"default_reasoning_effort,omitempty"` + Cost *ModelCatalogCostPayload `json:"cost,omitempty"` + LastError string `json:"last_error,omitempty"` +} + +// OpenAIErrorResponse is the OpenAI-compatible error envelope. +type OpenAIErrorResponse struct { + Error OpenAIErrorPayload `json:"error"` +} + +// OpenAIErrorPayload carries OpenAI-style error details. 
+type OpenAIErrorPayload struct { + Message string `json:"message"` + Type string `json:"type"` + Param *string `json:"param"` + Code string `json:"code"` } // SessionEventPayload is the shared session event response payload. @@ -914,7 +1060,6 @@ type SessionProviderOptionPayload struct { DisplayName string `json:"display_name,omitempty"` Harness string `json:"harness,omitempty"` RuntimeProvider string `json:"runtime_provider,omitempty"` - DefaultModel string `json:"default_model,omitempty"` AuthMode string `json:"auth_mode,omitempty"` EnvPolicy string `json:"env_policy,omitempty"` HomePolicy string `json:"home_policy,omitempty"` diff --git a/internal/api/contract/contract_test.go b/internal/api/contract/contract_test.go index 4ef461b57..b1bd8dcab 100644 --- a/internal/api/contract/contract_test.go +++ b/internal/api/contract/contract_test.go @@ -23,14 +23,16 @@ func TestSessionPayloadJSONShape(t *testing.T) { now := time.Date(2026, 4, 7, 10, 30, 0, 0, time.UTC) ttl := now.Add(time.Hour) payload := core.SessionPayloadFromInfo(&session.Info{ - ID: "sess-1", - Name: "demo", - AgentName: "coder", - Provider: "fake", - WorkspaceID: "ws_alpha", - Workspace: "/workspace", - State: session.StateActive, - ACPSessionID: "acp-123", + ID: "sess-1", + Name: "demo", + AgentName: "coder", + Provider: "fake", + Model: "gpt-test", + ReasoningEffort: "high", + WorkspaceID: "ws_alpha", + Workspace: "/workspace", + State: session.StateActive, + ACPSessionID: "acp-123", Lineage: &store.SessionLineage{ RootSessionID: "sess-1", SpawnDepth: 0, @@ -51,6 +53,17 @@ func TestSessionPayloadJSONShape(t *testing.T) { SupportsLoadSession: true, SupportedModes: []string{"chat"}, SupportedModels: []string{"gpt-test"}, + ConfigOptions: []acp.SessionConfigOption{ + { + ID: "model", + Label: "Model", + Kind: acp.SessionConfigOptionKindSelect, + Current: "gpt-test", + Values: []acp.SessionConfigOptionValue{ + {Value: "gpt-test", Label: "GPT Test"}, + }, + }, + }, }, }) @@ -59,6 +72,8 @@ func 
TestSessionPayloadJSONShape(t *testing.T) { if got["agent_name"] != "coder" || got["provider"] != "fake" || + got["model"] != "gpt-test" || + got["reasoning_effort"] != "high" || got["workspace_id"] != "ws_alpha" || got["workspace_path"] != "/workspace" { t.Fatalf("session JSON = %#v", got) @@ -89,6 +104,17 @@ func TestSessionPayloadJSONShape(t *testing.T) { if acpCaps["supports_load_session"] != true { t.Fatalf("acp_caps JSON = %#v", acpCaps) } + configOptions, ok := acpCaps["config_options"].([]any) + if !ok || len(configOptions) != 1 { + t.Fatalf("config_options JSON = %#v", acpCaps["config_options"]) + } + configOption, ok := configOptions[0].(map[string]any) + if !ok { + t.Fatalf("config option type = %T, want object", configOptions[0]) + } + if configOption["id"] != "model" || configOption["kind"] != "select" || configOption["current"] != "gpt-test" { + t.Fatalf("config option JSON = %#v", configOption) + } sandboxPayload, ok := got["sandbox"].(map[string]any) if !ok { t.Fatalf("sandbox type = %T, want object", got["sandbox"]) @@ -216,6 +242,49 @@ func TestCreateSessionRequestJSONShape(t *testing.T) { t.Fatalf("request = %#v", req) } }) + + t.Run("Should round-trip model and reasoning_effort overrides", func(t *testing.T) { + t.Parallel() + + req := contract.CreateSessionRequest{ + AgentName: "coder", + Provider: "codex", + Model: "gpt-5.4", + ReasoningEffort: "high", + Workspace: "alpha", + } + raw, err := json.Marshal(req) + if err != nil { + t.Fatalf("json.Marshal() error = %v", err) + } + var decoded contract.CreateSessionRequest + if err := json.Unmarshal(raw, &decoded); err != nil { + t.Fatalf("json.Unmarshal() error = %v", err) + } + if decoded.Model != "gpt-5.4" || decoded.ReasoningEffort != "high" { + t.Fatalf("decoded = %#v", decoded) + } + var shape map[string]any + if err := json.Unmarshal(raw, &shape); err != nil { + t.Fatalf("json.Unmarshal(map) error = %v", err) + } + if shape["model"] != "gpt-5.4" || shape["reasoning_effort"] != "high" { + 
t.Fatalf("shape = %#v", shape) + } + }) + + t.Run("Should omit model and reasoning_effort cleanly when absent", func(t *testing.T) { + t.Parallel() + + req := contract.CreateSessionRequest{AgentName: "coder", Workspace: "alpha"} + raw, err := json.Marshal(req) + if err != nil { + t.Fatalf("json.Marshal() error = %v", err) + } + if strings.Contains(string(raw), "model") || strings.Contains(string(raw), "reasoning_effort") { + t.Fatalf("raw = %s", string(raw)) + } + }) } func TestMemoryV2PublicContractJSONShape(t *testing.T) { diff --git a/internal/api/contract/settings.go b/internal/api/contract/settings.go index ff6aef4c6..eee014726 100644 --- a/internal/api/contract/settings.go +++ b/internal/api/contract/settings.go @@ -539,7 +539,7 @@ type SettingsSourceMetadataPayload struct { type SettingsProviderSettingsPayload struct { Command string `json:"command,omitempty"` DisplayName string `json:"display_name,omitempty"` - DefaultModel string `json:"default_model,omitempty"` + Models *SettingsProviderModelsPayload `json:"models,omitempty"` Harness string `json:"harness,omitempty"` RuntimeProvider string `json:"runtime_provider,omitempty"` Transport string `json:"transport,omitempty"` @@ -552,6 +552,33 @@ type SettingsProviderSettingsPayload struct { CredentialSlots []SettingsProviderCredentialSlotPayload `json:"credential_slots,omitempty"` } +type SettingsProviderModelsPayload struct { + Default string `json:"default,omitempty"` + Curated []SettingsProviderModelPayload `json:"curated,omitempty"` + Discovery *SettingsProviderModelsDiscoveryPayload `json:"discovery,omitempty"` +} + +type SettingsProviderModelsDiscoveryPayload struct { + Enabled *bool `json:"enabled,omitempty"` + Command string `json:"command,omitempty"` + Endpoint string `json:"endpoint,omitempty"` + Timeout string `json:"timeout,omitempty"` +} + +type SettingsProviderModelPayload struct { + ID string `json:"id"` + DisplayName string `json:"display_name,omitempty"` + ContextWindow *int64 
`json:"context_window,omitempty"` + MaxInputTokens *int64 `json:"max_input_tokens,omitempty"` + MaxOutputTokens *int64 `json:"max_output_tokens,omitempty"` + SupportsTools *bool `json:"supports_tools,omitempty"` + SupportsReasoning *bool `json:"supports_reasoning,omitempty"` + ReasoningEfforts []string `json:"reasoning_efforts,omitempty"` + DefaultReasoningEffort string `json:"default_reasoning_effort,omitempty"` + CostInputPerMillion *float64 `json:"cost_input_per_million,omitempty"` + CostOutputPerMillion *float64 `json:"cost_output_per_million,omitempty"` +} + type SettingsProviderCredentialSlotPayload struct { Name string `json:"name"` TargetEnv string `json:"target_env"` diff --git a/internal/api/core/conversions.go b/internal/api/core/conversions.go index c58523c55..0ac69894e 100644 --- a/internal/api/core/conversions.go +++ b/internal/api/core/conversions.go @@ -43,22 +43,24 @@ func SessionPayloadFromInfo(info *session.Info) contract.SessionPayload { ref := workref.NewPath(info.WorkspaceID, info.Workspace) payload = contract.SessionPayload{ - ID: info.ID, - Name: info.Name, - AgentName: info.AgentName, - Provider: info.Provider, - WorkspaceID: ref.WorkspaceID, - WorkspacePath: ref.WorkspacePath, - Channel: info.Channel, - Type: info.Type, - State: info.State, - StopReason: info.StopReason, - StopDetail: info.StopDetail, - Failure: SessionFailurePayloadFromStore(info.Failure), - ACPSessionID: info.ACPSessionID, - Lineage: contract.SessionLineagePayloadFromStore(info.Lineage), - CreatedAt: info.CreatedAt, - UpdatedAt: info.UpdatedAt, + ID: info.ID, + Name: info.Name, + AgentName: info.AgentName, + Provider: info.Provider, + Model: strings.TrimSpace(info.Model), + ReasoningEffort: strings.TrimSpace(info.ReasoningEffort), + WorkspaceID: ref.WorkspaceID, + WorkspacePath: ref.WorkspacePath, + Channel: info.Channel, + Type: info.Type, + State: info.State, + StopReason: info.StopReason, + StopDetail: info.StopDetail, + Failure: 
SessionFailurePayloadFromStore(info.Failure), + ACPSessionID: info.ACPSessionID, + Lineage: contract.SessionLineagePayloadFromStore(info.Lineage), + CreatedAt: info.CreatedAt, + UpdatedAt: info.UpdatedAt, } if caps := ACPCapsPayloadFromInfo(info.ACPCaps); caps != nil { payload.ACPCaps = caps @@ -175,7 +177,10 @@ func SessionFailurePayloadFromStore(failure *store.SessionFailure) *contract.Ses // ACPCapsPayloadFromInfo converts ACP capability info into the shared payload. func ACPCapsPayloadFromInfo(caps acp.Caps) *contract.ACPCapsPayload { - if !caps.SupportsLoadSession && len(caps.SupportedModes) == 0 && len(caps.SupportedModels) == 0 { + if !caps.SupportsLoadSession && + len(caps.SupportedModes) == 0 && + len(caps.SupportedModels) == 0 && + len(caps.ConfigOptions) == 0 { return nil } @@ -183,9 +188,46 @@ func ACPCapsPayloadFromInfo(caps acp.Caps) *contract.ACPCapsPayload { SupportsLoadSession: caps.SupportsLoadSession, SupportedModes: append([]string(nil), caps.SupportedModes...), SupportedModels: append([]string(nil), caps.SupportedModels...), + ConfigOptions: SessionConfigOptionPayloadsFromInfo(caps.ConfigOptions), } } +// SessionConfigOptionPayloadsFromInfo converts active ACP config options into the shared payload. 
+func SessionConfigOptionPayloadsFromInfo(options []acp.SessionConfigOption) []contract.SessionConfigOptionPayload { + if len(options) == 0 { + return nil + } + payloads := make([]contract.SessionConfigOptionPayload, 0, len(options)) + for _, option := range options { + payloads = append(payloads, contract.SessionConfigOptionPayload{ + ID: strings.TrimSpace(option.ID), + Label: strings.TrimSpace(option.Label), + Description: strings.TrimSpace(option.Description), + Kind: string(option.Kind), + Current: strings.TrimSpace(option.Current), + Values: sessionConfigOptionValuePayloads(option.Values), + }) + } + return payloads +} + +func sessionConfigOptionValuePayloads( + values []acp.SessionConfigOptionValue, +) []contract.SessionConfigOptionValuePayload { + if len(values) == 0 { + return nil + } + payloads := make([]contract.SessionConfigOptionValuePayload, 0, len(values)) + for _, value := range values { + payloads = append(payloads, contract.SessionConfigOptionValuePayload{ + Value: strings.TrimSpace(value.Value), + Label: strings.TrimSpace(value.Label), + Description: strings.TrimSpace(value.Description), + }) + } + return payloads +} + // SessionEventPayloadFromEvent converts a session event into the shared payload. func SessionEventPayloadFromEvent(event store.SessionEvent, info *session.Info) contract.SessionEventPayload { ref := workref.NewPath(sessionWorkspaceFromInfo(info)) @@ -1031,7 +1073,6 @@ func sessionProviderOptionPayloadFromConfig( DisplayName: strings.TrimSpace(resolved.DisplayName), Harness: string(resolved.EffectiveHarness()), RuntimeProvider: strings.TrimSpace(resolved.RuntimeProviderName(providerName)), - DefaultModel: strings.TrimSpace(resolved.DefaultModel), AuthMode: string(resolved.EffectiveAuthMode()), EnvPolicy: string(resolved.EffectiveEnvPolicy()), HomePolicy: string(resolved.EffectiveHomePolicy()), @@ -1922,13 +1963,16 @@ func settingsProviderItemPayloads(values []settingspkg.ProviderItem) []contract. 
return nil } payloads := make([]contract.SettingsProviderItemPayload, 0, len(values)) - for _, value := range values { - payloads = append(payloads, settingsProviderItemPayload(value)) + for idx := range values { + payloads = append(payloads, settingsProviderItemPayload(&values[idx])) } return payloads } -func settingsProviderItemPayload(value settingspkg.ProviderItem) contract.SettingsProviderItemPayload { +func settingsProviderItemPayload(value *settingspkg.ProviderItem) contract.SettingsProviderItemPayload { + if value == nil { + return contract.SettingsProviderItemPayload{} + } payload := contract.SettingsProviderItemPayload{ Name: strings.TrimSpace(value.Name), Settings: settingsProviderSettingsPayload(value.Settings), @@ -1951,7 +1995,7 @@ func settingsProviderSettingsPayload(value settingspkg.ProviderSettings) contrac return contract.SettingsProviderSettingsPayload{ Command: strings.TrimSpace(value.Command), DisplayName: strings.TrimSpace(value.DisplayName), - DefaultModel: strings.TrimSpace(value.DefaultModel), + Models: settingsProviderModelsPayload(value.Models), Harness: string(value.Harness), RuntimeProvider: strings.TrimSpace(value.RuntimeProvider), Transport: strings.TrimSpace(value.Transport), @@ -1965,6 +2009,70 @@ func settingsProviderSettingsPayload(value settingspkg.ProviderSettings) contrac } } +func settingsProviderModelsPayload( + value aghconfig.ProviderModelsConfig, +) *contract.SettingsProviderModelsPayload { + if providerModelsConfigIsEmpty(value) { + return nil + } + return &contract.SettingsProviderModelsPayload{ + Default: strings.TrimSpace(value.Default), + Curated: settingsProviderModelPayloads(value.Curated), + Discovery: settingsProviderModelsDiscoveryPayload(value.Discovery), + } +} + +func settingsProviderModelsDiscoveryPayload( + value aghconfig.ProviderModelsDiscoveryConfig, +) *contract.SettingsProviderModelsDiscoveryPayload { + if value.Enabled == nil && + strings.TrimSpace(value.Command) == "" && + 
strings.TrimSpace(value.Endpoint) == "" && + strings.TrimSpace(value.Timeout) == "" { + return nil + } + return &contract.SettingsProviderModelsDiscoveryPayload{ + Enabled: cloneBoolPtr(value.Enabled), + Command: strings.TrimSpace(value.Command), + Endpoint: strings.TrimSpace(value.Endpoint), + Timeout: strings.TrimSpace(value.Timeout), + } +} + +func settingsProviderModelPayloads( + values []aghconfig.ProviderModelConfig, +) []contract.SettingsProviderModelPayload { + if values == nil { + return nil + } + payloads := make([]contract.SettingsProviderModelPayload, 0, len(values)) + for _, value := range values { + payloads = append(payloads, contract.SettingsProviderModelPayload{ + ID: strings.TrimSpace(value.ID), + DisplayName: strings.TrimSpace(value.DisplayName), + ContextWindow: cloneInt64Ptr(value.ContextWindow), + MaxInputTokens: cloneInt64Ptr(value.MaxInputTokens), + MaxOutputTokens: cloneInt64Ptr(value.MaxOutputTokens), + SupportsTools: cloneBoolPtr(value.SupportsTools), + SupportsReasoning: cloneBoolPtr(value.SupportsReasoning), + ReasoningEfforts: cloneStrings(value.ReasoningEfforts), + DefaultReasoningEffort: strings.TrimSpace(value.DefaultReasoningEffort), + CostInputPerMillion: cloneFloat64Ptr(value.CostInputPerMillion), + CostOutputPerMillion: cloneFloat64Ptr(value.CostOutputPerMillion), + }) + } + return payloads +} + +func providerModelsConfigIsEmpty(value aghconfig.ProviderModelsConfig) bool { + return strings.TrimSpace(value.Default) == "" && + value.Curated == nil && + value.Discovery.Enabled == nil && + strings.TrimSpace(value.Discovery.Command) == "" && + strings.TrimSpace(value.Discovery.Endpoint) == "" && + strings.TrimSpace(value.Discovery.Timeout) == "" +} + func settingsProviderCredentialSlotPayloads( values []aghconfig.ProviderCredentialSlot, ) []contract.SettingsProviderCredentialSlotPayload { @@ -2248,6 +2356,14 @@ func cloneStrings(src []string) []string { return append([]string(nil), src...) 
} +func cloneBoolPtr(src *bool) *bool { + if src == nil { + return nil + } + value := *src + return &value +} + func resourceKindsToStrings(values []resources.ResourceKind) []string { if len(values) == 0 { return nil diff --git a/internal/api/core/conversions_parsers_test.go b/internal/api/core/conversions_parsers_test.go index a89dc02e6..2eb958b6d 100644 --- a/internal/api/core/conversions_parsers_test.go +++ b/internal/api/core/conversions_parsers_test.go @@ -27,14 +27,16 @@ func TestSessionPayloadFromInfo(t *testing.T) { now := time.Date(2026, 4, 3, 12, 0, 0, 0, time.UTC) ttl := now.Add(time.Hour) payload := core.SessionPayloadFromInfo(&session.Info{ - ID: "sess-1", - Name: "demo", - AgentName: "coder", - Provider: "fake", - WorkspaceID: "ws_alpha", - Workspace: "/workspace", - Channel: "builders", - Type: session.SessionTypeDream, + ID: "sess-1", + Name: "demo", + AgentName: "coder", + Provider: "fake", + Model: "gpt-test", + ReasoningEffort: "high", + WorkspaceID: "ws_alpha", + Workspace: "/workspace", + Channel: "builders", + Type: session.SessionTypeDream, Lineage: &store.SessionLineage{ ParentSessionID: "sess-root", RootSessionID: "sess-root", @@ -82,6 +84,18 @@ func TestSessionPayloadFromInfo(t *testing.T) { SupportsLoadSession: true, SupportedModes: []string{"chat"}, SupportedModels: []string{"gpt-test"}, + ConfigOptions: []acp.SessionConfigOption{ + { + ID: "reasoning_effort", + Label: "Reasoning effort", + Kind: acp.SessionConfigOptionKindSelect, + Current: "high", + Values: []acp.SessionConfigOptionValue{ + {Value: "low", Label: "Low"}, + {Value: "high", Label: "High"}, + }, + }, + }, }, }) @@ -92,6 +106,12 @@ func TestSessionPayloadFromInfo(t *testing.T) { if payload.Provider != "fake" { t.Fatalf("payload.Provider = %q, want %q", payload.Provider, "fake") } + if payload.Model != "gpt-test" { + t.Fatalf("payload.Model = %q, want %q", payload.Model, "gpt-test") + } + if payload.ReasoningEffort != "high" { + t.Fatalf("payload.ReasoningEffort = %q, want 
%q", payload.ReasoningEffort, "high") + } if payload.State != session.StateActive || payload.ACPSessionID != "acp-123" { t.Fatalf("payload session fields = %#v", payload) } @@ -122,6 +142,13 @@ func TestSessionPayloadFromInfo(t *testing.T) { if payload.ACPCaps == nil || !payload.ACPCaps.SupportsLoadSession || len(payload.ACPCaps.SupportedModels) != 1 { t.Fatalf("caps = %#v", payload.ACPCaps) } + if len(payload.ACPCaps.ConfigOptions) != 1 { + t.Fatalf("config options = %#v", payload.ACPCaps.ConfigOptions) + } + if got := payload.ACPCaps.ConfigOptions[0]; got.ID != "reasoning_effort" || got.Current != "high" || + got.Kind != "select" || len(got.Values) != 2 { + t.Fatalf("config option payload = %#v", got) + } if payload.Sandbox == nil || payload.Sandbox.SandboxID != "env-1" || payload.Sandbox.Backend != "local" || payload.Sandbox.Profile != "local" || diff --git a/internal/api/core/errors.go b/internal/api/core/errors.go index 7798e4442..efbde601b 100644 --- a/internal/api/core/errors.go +++ b/internal/api/core/errors.go @@ -15,6 +15,7 @@ import ( bridgepkg "github.com/pedronauck/agh/internal/bridges" aghconfig "github.com/pedronauck/agh/internal/config" "github.com/pedronauck/agh/internal/memory" + "github.com/pedronauck/agh/internal/modelcatalog" "github.com/pedronauck/agh/internal/network" "github.com/pedronauck/agh/internal/resources" "github.com/pedronauck/agh/internal/session" @@ -365,6 +366,12 @@ var ErrAutomationValidation = errors.New("automation validation error") // ErrNetworkValidation is the sentinel for malformed network control-plane requests. var ErrNetworkValidation = errors.New("network validation error") +// ErrModelCatalogValidation is the sentinel for malformed model catalog requests. +var ErrModelCatalogValidation = errors.New("model catalog validation error") + +// ErrModelCatalogUnavailable reports that the daemon model catalog surface is not configured. 
+var ErrModelCatalogUnavailable = errors.New("model catalog service unavailable")
+
 // StatusForSkillError maps skill-domain errors to transport statuses.
 func StatusForSkillError(err error) int {
 	switch {
@@ -465,3 +472,83 @@ func StatusForNetworkError(err error) int {
 		return http.StatusInternalServerError
 	}
 }
+
+// NewModelCatalogValidationError wraps a model catalog request validation failure.
+func NewModelCatalogValidationError(err error) error {
+	if err == nil {
+		return nil
+	}
+	return fmt.Errorf("%w: %w", ErrModelCatalogValidation, err)
+}
+
+// StatusForModelCatalogError maps model catalog failures to transport statuses.
+func StatusForModelCatalogError(err error) int {
+	var maxBytesErr *http.MaxBytesError
+	switch {
+	case err == nil:
+		return http.StatusOK
+	case errors.As(err, &maxBytesErr):
+		return http.StatusRequestEntityTooLarge
+	case errors.Is(err, ErrModelCatalogValidation),
+		errors.Is(err, modelcatalog.ErrSourceNotRegistered):
+		return http.StatusBadRequest
+	case errors.Is(err, ErrModelCatalogUnavailable),
+		errors.Is(err, modelcatalog.ErrAllSourcesFailed):
+		return http.StatusServiceUnavailable
+	default:
+		return http.StatusInternalServerError
+	}
+}
+
+// RespondOpenAIError writes an OpenAI-compatible error response envelope.
+func RespondOpenAIError(c *gin.Context, status int, err error, maskInternalErrors bool) {
+	message := http.StatusText(status)
+	switch {
+	case maskInternalErrors && status >= http.StatusInternalServerError:
+		if strings.TrimSpace(message) == "" {
+			message = "internal server error"
+		}
+	case err != nil && strings.TrimSpace(err.Error()) != "":
+		message = err.Error()
+	case strings.TrimSpace(message) == "":
+		message = "unknown error"
+	}
+	message = taskpkg.RedactClaimTokens(message)
+	c.JSON(status, contract.OpenAIErrorResponse{
+		Error: contract.OpenAIErrorPayload{
+			Message: message,
+			Type: openAIErrorTypeForStatus(status),
+			Param: nil,
+			Code: openAIErrorCodeForStatus(status),
+		},
+	})
+}
+
+func openAIErrorTypeForStatus(status int) string {
+	if status >= http.StatusInternalServerError {
+		return "server_error"
+	}
+	return "invalid_request_error"
+}
+
+func openAIErrorCodeForStatus(status int) string {
+	switch status {
+	case http.StatusBadRequest:
+		return "invalid_request"
+	case http.StatusUnauthorized:
+		return "unauthorized"
+	case http.StatusForbidden:
+		return "forbidden"
+	case http.StatusNotFound:
+		return "not_found"
+	case http.StatusRequestEntityTooLarge:
+		return "request_too_large"
+	case http.StatusServiceUnavailable:
+		return "service_unavailable"
+	default:
+		if status >= http.StatusInternalServerError {
+			return "internal_error"
+		}
+		return "api_error"
+	}
+}
diff --git a/internal/api/core/handlers.go b/internal/api/core/handlers.go
index e0133930d..448cc7723 100644
--- a/internal/api/core/handlers.go
+++ b/internal/api/core/handlers.go
@@ -51,6 +51,7 @@ type BaseHandlerConfig struct {
 	Vault VaultService
 	Workspaces WorkspaceService
 	AgentCatalog AgentCatalog
+	ModelCatalog ModelCatalogService
 	AgentContextService AgentContextService
 	SoulAuthoring SoulAuthoringService
 	SoulRefresher SoulRefresher
@@ -102,6 +103,7 @@ type BaseHandlers struct {
 	Vault VaultService
 	Workspaces WorkspaceService
 	AgentCatalog AgentCatalog
+	ModelCatalog 
ModelCatalogService
 	AgentContextService AgentContextService
 	SoulAuthoring SoulAuthoringService
 	SoulRefresher SoulRefresher
@@ -161,6 +163,7 @@ func NewBaseHandlers(cfg *BaseHandlerConfig) *BaseHandlers {
 		Vault: cfg.Vault,
 		Workspaces: cfg.Workspaces,
 		AgentCatalog: cfg.AgentCatalog,
+		ModelCatalog: cfg.ModelCatalog,
 		AgentContextService: cfg.AgentContextService,
 		CoordinatorConfig: cfg.CoordinatorConfig,
 		SkillsRegistry: cfg.SkillsRegistry,
@@ -326,13 +329,15 @@ func (h *BaseHandlers) CreateSession(c *gin.Context) {
 	}
 
 	sess, err := h.Sessions.Create(c.Request.Context(), session.CreateOpts{
-		AgentName: req.AgentName,
-		Provider: strings.TrimSpace(req.Provider),
-		Name: req.Name,
-		Workspace: strings.TrimSpace(req.Workspace),
-		WorkspacePath: strings.TrimSpace(req.WorkspacePath),
-		Channel: channel,
-		Type: session.SessionTypeUser,
+		AgentName: req.AgentName,
+		Provider: strings.TrimSpace(req.Provider),
+		Model: strings.TrimSpace(req.Model),
+		ReasoningEffort: strings.TrimSpace(req.ReasoningEffort),
+		Name: req.Name,
+		Workspace: strings.TrimSpace(req.Workspace),
+		WorkspacePath: strings.TrimSpace(req.WorkspacePath),
+		Channel: channel,
+		Type: session.SessionTypeUser,
 	})
 	if err != nil {
 		h.respondError(c, StatusForSessionError(err), err)
diff --git a/internal/api/core/interfaces.go b/internal/api/core/interfaces.go
index 05f16dab7..eaa2fa2cc 100644
--- a/internal/api/core/interfaces.go
+++ b/internal/api/core/interfaces.go
@@ -13,6 +13,7 @@ import (
 	aghconfig "github.com/pedronauck/agh/internal/config"
 	"github.com/pedronauck/agh/internal/heartbeat"
 	hookspkg "github.com/pedronauck/agh/internal/hooks"
+	"github.com/pedronauck/agh/internal/modelcatalog"
 	"github.com/pedronauck/agh/internal/network"
 	"github.com/pedronauck/agh/internal/observe"
 	"github.com/pedronauck/agh/internal/resources"
@@ -37,6 +38,11 @@ type AgentCatalog interface {
 	GetAgent(ctx context.Context, name string) (aghconfig.AgentDef, error)
 }
 
+// ModelCatalogService exposes daemon-owned provider model catalog 
reads and refreshes.
+type ModelCatalogService interface {
+	modelcatalog.Service
+}
+
 // SessionManager is the runtime session surface exposed by API transports.
 // List returns the current in-memory session snapshot without performing I/O.
 // ListAll may perform I/O to return the authoritative session set, so it accepts a context.
diff --git a/internal/api/core/memory_workspace_test.go b/internal/api/core/memory_workspace_test.go
index 559f2d795..c8677c542 100644
--- a/internal/api/core/memory_workspace_test.go
+++ b/internal/api/core/memory_workspace_test.go
@@ -8,6 +8,7 @@ import (
 	"net/url"
 	"os"
 	"path/filepath"
+	"reflect"
 	"strings"
 	"testing"
 	"time"
@@ -884,7 +885,7 @@ func TestWorkspaceHandlersDelegateToService(t *testing.T) {
 		t.Fatalf("len(providers) = %d, want %d (%#v)", got, want, getPayload.Providers)
 	}
 	for i, want := range expectedProviders {
-		if got := getPayload.Providers[i]; got != want {
+		if got := getPayload.Providers[i]; !reflect.DeepEqual(got, want) {
 			t.Fatalf("providers[%d] = %#v, want %#v", i, got, want)
 		}
 	}
diff --git a/internal/api/core/model_catalog.go b/internal/api/core/model_catalog.go
new file mode 100644
index 000000000..c6fa5143f
--- /dev/null
+++ b/internal/api/core/model_catalog.go
@@ -0,0 +1,288 @@
+package core
+
+import (
+	"errors"
+	"fmt"
+	"io"
+	"net/http"
+	"strings"
+
+	"github.com/gin-gonic/gin"
+	"github.com/pedronauck/agh/internal/api/contract"
+	"github.com/pedronauck/agh/internal/modelcatalog"
+)
+
+const (
+	modelCatalogModelsSegment = "models"
+	modelCatalogRefreshSegment = "refresh"
+	modelCatalogStatusSegment = "status"
+)
+
+var errModelCatalogRouteNotFound = errors.New("model catalog route not found")
+
+// ProviderModelCatalog dispatches the native provider model catalog route family.
+func (h *BaseHandlers) ProviderModelCatalog(c *gin.Context) {
+	if h == nil {
+		RespondError(c, http.StatusServiceUnavailable, ErrModelCatalogUnavailable, false)
+		return
+	}
+	parts := modelCatalogPathParts(c.Param("catalog_path"))
+	switch c.Request.Method {
+	case http.MethodGet:
+		h.dispatchProviderModelCatalogGET(c, parts)
+	case http.MethodPost:
+		h.dispatchProviderModelCatalogPOST(c, parts)
+	default:
+		RespondError(c, http.StatusNotFound, errModelCatalogRouteNotFound, h.MaskInternalErrors)
+	}
+}
+
+// OpenAIModels lists catalog models using the OpenAI-compatible shape.
+func (h *BaseHandlers) OpenAIModels(c *gin.Context) {
+	providerID, err := validateModelCatalogProviderID(c.Query("provider_id"))
+	if err != nil {
+		RespondOpenAIError(
+			c,
+			StatusForModelCatalogError(err),
+			err,
+			h != nil && h.MaskInternalErrors,
+		)
+		return
+	}
+	service, err := h.modelCatalogService()
+	if err != nil {
+		RespondOpenAIError(c, StatusForModelCatalogError(err), err, h != nil && h.MaskInternalErrors)
+		return
+	}
+	models, err := service.ListModels(c.Request.Context(), modelcatalog.ListOptions{
+		ProviderID: providerID,
+		IncludeStale: true,
+		Now: h.nowUTC(),
+	})
+	if err != nil {
+		RespondOpenAIError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	c.JSON(http.StatusOK, OpenAIModelListPayloadFromModels(models))
+}
+
+func (h *BaseHandlers) dispatchProviderModelCatalogGET(c *gin.Context, parts []string) {
+	switch {
+	case len(parts) == 1 && parts[0] == modelCatalogModelsSegment:
+		h.listProviderModels(c, "")
+	case len(parts) == 2 && parts[0] == modelCatalogModelsSegment && parts[1] == modelCatalogStatusSegment:
+		h.providerModelStatus(c, "")
+	case len(parts) == 2 && parts[1] == modelCatalogModelsSegment:
+		h.listProviderModels(c, parts[0])
+	case len(parts) == 3 && parts[1] == modelCatalogModelsSegment && parts[2] == modelCatalogStatusSegment:
+		h.providerModelStatus(c, parts[0])
+	default:
+		RespondError(c, http.StatusNotFound, 
errModelCatalogRouteNotFound, h.MaskInternalErrors)
+	}
+}
+
+func (h *BaseHandlers) dispatchProviderModelCatalogPOST(c *gin.Context, parts []string) {
+	switch {
+	case len(parts) == 2 && parts[0] == modelCatalogModelsSegment && parts[1] == modelCatalogRefreshSegment:
+		h.refreshProviderModels(c, "")
+	case len(parts) == 3 && parts[1] == modelCatalogModelsSegment && parts[2] == modelCatalogRefreshSegment:
+		h.refreshProviderModels(c, parts[0])
+	default:
+		RespondError(c, http.StatusNotFound, errModelCatalogRouteNotFound, h.MaskInternalErrors)
+	}
+}
+
+func (h *BaseHandlers) listProviderModels(c *gin.Context, providerParam string) {
+	opts, err := h.modelCatalogListOptions(c, providerParam)
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	service, err := h.modelCatalogService()
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	models, err := service.ListModels(c.Request.Context(), opts)
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	c.JSON(http.StatusOK, ProviderModelListPayloadFromModels(models))
+}
+
+func (h *BaseHandlers) refreshProviderModels(c *gin.Context, providerParam string) {
+	opts, err := h.modelCatalogRefreshOptions(c, providerParam)
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	service, err := h.modelCatalogService()
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	statuses, err := service.Refresh(c.Request.Context(), opts)
+	payload := contract.ProviderModelRefreshResponse{
+		Sources: SourceStatusPayloadsFromStatuses(statuses),
+	}
+	if err != nil {
+		status := StatusForModelCatalogError(err)
+		if len(payload.Sources) > 0 {
+			payload.Error = modelcatalog.RedactString(err.Error())
+			c.JSON(status, payload)
+			return
+		}
+		RespondError(c, 
status, err, h.MaskInternalErrors)
+		return
+	}
+	c.JSON(http.StatusOK, payload)
+}
+
+func (h *BaseHandlers) providerModelStatus(c *gin.Context, providerParam string) {
+	providerID, err := validateModelCatalogProviderID(providerParam)
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	service, err := h.modelCatalogService()
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	statuses, err := service.ListSourceStatus(c.Request.Context(), providerID)
+	if err != nil {
+		RespondError(c, StatusForModelCatalogError(err), err, h.MaskInternalErrors)
+		return
+	}
+	c.JSON(http.StatusOK, contract.ProviderModelStatusResponse{
+		Sources: SourceStatusPayloadsFromStatuses(statuses),
+	})
+}
+
+func (h *BaseHandlers) modelCatalogListOptions(
+	c *gin.Context,
+	providerParam string,
+) (modelcatalog.ListOptions, error) {
+	providerID := providerParam
+	if providerID == "" {
+		providerID = c.Query("provider_id")
+	}
+	trimmedProvider, err := validateModelCatalogProviderID(providerID)
+	if err != nil {
+		return modelcatalog.ListOptions{}, err
+	}
+	sourceID, err := validateOptionalModelCatalogSourceID(firstNonEmpty(c.Query("source_id"), c.Query("source")))
+	if err != nil {
+		return modelcatalog.ListOptions{}, err
+	}
+	refresh, err := parseBoolQuery(c, "refresh")
+	if err != nil {
+		return modelcatalog.ListOptions{}, NewModelCatalogValidationError(err)
+	}
+	includeStale, err := parseBoolQuery(c, "include_stale")
+	if err != nil {
+		return modelcatalog.ListOptions{}, NewModelCatalogValidationError(err)
+	}
+	return modelcatalog.ListOptions{
+		ProviderID: trimmedProvider,
+		SourceID: sourceID,
+		Refresh: refresh,
+		IncludeStale: includeStale,
+		Now: h.nowUTC(),
+	}, nil
+}
+
+func (h *BaseHandlers) modelCatalogRefreshOptions(
+	c *gin.Context,
+	providerParam string,
+) (modelcatalog.RefreshOptions, error) {
+	providerID, err := 
validateModelCatalogProviderID(providerParam)
+	if err != nil {
+		return modelcatalog.RefreshOptions{}, err
+	}
+	var request contract.ProviderModelRefreshRequest
+	if err := bindOptionalModelCatalogRefreshRequest(c, &request); err != nil {
+		return modelcatalog.RefreshOptions{}, err
+	}
+	sourceID, err := validateOptionalModelCatalogSourceID(
+		firstNonEmpty(request.SourceID, c.Query("source_id"), c.Query("source")),
+	)
+	if err != nil {
+		return modelcatalog.RefreshOptions{}, err
+	}
+	return modelcatalog.RefreshOptions{
+		ProviderID: providerID,
+		SourceID: sourceID,
+		Force: request.Force,
+		RequestID: strings.TrimSpace(request.RequestID),
+		Now: h.nowUTC(),
+	}, nil
+}
+
+func (h *BaseHandlers) modelCatalogService() (ModelCatalogService, error) {
+	if h == nil || h.ModelCatalog == nil {
+		return nil, ErrModelCatalogUnavailable
+	}
+	return h.ModelCatalog, nil
+}
+
+func bindOptionalModelCatalogRefreshRequest(
+	c *gin.Context,
+	request *contract.ProviderModelRefreshRequest,
+) error {
+	if c == nil || c.Request == nil || c.Request.Body == nil || c.Request.Body == http.NoBody {
+		return nil
+	}
+	if err := c.ShouldBindJSON(request); err != nil {
+		if errors.Is(err, io.EOF) {
+			return nil
+		}
+		return NewModelCatalogValidationError(fmt.Errorf("invalid refresh request body: %w", err))
+	}
+	return nil
+}
+
+func validateOptionalModelCatalogSourceID(sourceID string) (string, error) {
+	trimmed := strings.TrimSpace(sourceID)
+	if trimmed == "" {
+		return "", nil
+	}
+	if err := modelcatalog.ValidateSourceID(trimmed); err != nil {
+		return "", NewModelCatalogValidationError(err)
+	}
+	return trimmed, nil
+}
+
+func validateModelCatalogProviderID(providerID string) (string, error) {
+	trimmed := strings.TrimSpace(providerID)
+	if trimmed == "" {
+		return "", nil
+	}
+	for idx, ch := range trimmed {
+		valid := ch >= 'a' && ch <= 'z' ||
+			ch >= '0' && ch <= '9' ||
+			(idx > 0 && (ch == '-' || ch == '_'))
+		if !valid {
+			return "", NewModelCatalogValidationError(
+				
fmt.Errorf("provider_id %q must match ^[a-z0-9][a-z0-9_-]*$", providerID),
+			)
+		}
+	}
+	return trimmed, nil
+}
+
+func modelCatalogPathParts(path string) []string {
+	trimmed := strings.Trim(strings.TrimSpace(path), "/")
+	if trimmed == "" {
+		return nil
+	}
+	rawParts := strings.Split(trimmed, "/")
+	parts := make([]string, 0, len(rawParts))
+	for _, part := range rawParts {
+		if trimmedPart := strings.TrimSpace(part); trimmedPart != "" {
+			parts = append(parts, trimmedPart)
+		}
+	}
+	return parts
+}
diff --git a/internal/api/core/model_catalog_conversions.go b/internal/api/core/model_catalog_conversions.go
new file mode 100644
index 000000000..661566981
--- /dev/null
+++ b/internal/api/core/model_catalog_conversions.go
@@ -0,0 +1,163 @@
+package core
+
+import (
+	"time"
+
+	"github.com/pedronauck/agh/internal/api/contract"
+	"github.com/pedronauck/agh/internal/modelcatalog"
+)
+
+func ProviderModelListPayloadFromModels(models []modelcatalog.Model) contract.ProviderModelListResponse {
+	payload := contract.ProviderModelListResponse{
+		Models: make([]contract.ProviderModelPayload, 0, len(models)),
+	}
+	for _, model := range models {
+		payload.Models = append(payload.Models, ProviderModelPayloadFromModel(model))
+	}
+	return payload
+}
+
+func ProviderModelPayloadFromModel(model modelcatalog.Model) contract.ProviderModelPayload {
+	payload := contract.ProviderModelPayload{
+		ProviderID: model.ProviderID,
+		ModelID: model.ModelID,
+		DisplayName: model.DisplayName,
+		Sources: SourceRefPayloadsFromRefs(model.Sources),
+		Available: model.Available,
+		AvailabilityState: model.AvailabilityState,
+		Stale: model.Stale,
+		RefreshedAt: modelCatalogTimeString(model.RefreshedAt),
+		ContextWindow: model.ContextWindow,
+		MaxInputTokens: model.MaxInputTokens,
+		MaxOutputTokens: model.MaxOutputTokens,
+		SupportsTools: model.SupportsTools,
+		SupportsReasoning: model.SupportsReasoning,
+		ReasoningEfforts: reasoningEffortStrings(model.ReasoningEfforts),
+		DefaultReasoningEffort: 
reasoningEffortStringPtr(model.DefaultReasoningEffort),
+		LastError: modelcatalog.RedactString(model.LastError),
+	}
+	if model.CostInputPerMillion != nil || model.CostOutputPerMillion != nil {
+		payload.Cost = &contract.ModelCatalogCostPayload{
+			InputPerMillion: model.CostInputPerMillion,
+			OutputPerMillion: model.CostOutputPerMillion,
+		}
+	}
+	return payload
+}
+
+func SourceRefPayloadsFromRefs(refs []modelcatalog.SourceRef) []contract.ModelCatalogSourceRefPayload {
+	payloads := make([]contract.ModelCatalogSourceRefPayload, 0, len(refs))
+	for _, ref := range refs {
+		payloads = append(payloads, contract.ModelCatalogSourceRefPayload{
+			SourceID: ref.SourceID,
+			SourceKind: string(ref.SourceKind),
+			Priority: ref.Priority,
+			RefreshedAt: modelCatalogTimeString(ref.RefreshedAt),
+			Stale: ref.Stale,
+			LastError: modelcatalog.RedactString(ref.LastError),
+		})
+	}
+	return payloads
+}
+
+func SourceStatusPayloadsFromStatuses(
+	statuses []modelcatalog.SourceStatus,
+) []contract.ModelCatalogSourceStatusPayload {
+	payloads := make([]contract.ModelCatalogSourceStatusPayload, 0, len(statuses))
+	for _, status := range statuses {
+		payloads = append(payloads, contract.ModelCatalogSourceStatusPayload{
+			SourceID: status.SourceID,
+			SourceKind: string(status.SourceKind),
+			ProviderID: status.ProviderID,
+			Priority: status.Priority,
+			LastRefresh: modelCatalogTimeString(status.LastRefresh),
+			NextRefresh: modelCatalogTimeString(status.NextRefresh),
+			LastSuccess: modelCatalogTimeString(status.LastSuccess),
+			LastError: modelcatalog.RedactString(status.LastError),
+			RefreshState: status.RefreshState,
+			RowCount: status.RowCount,
+			Stale: status.Stale,
+		})
+	}
+	return payloads
+}
+
+func OpenAIModelListPayloadFromModels(models []modelcatalog.Model) contract.OpenAIModelListResponse {
+	payload := contract.OpenAIModelListResponse{
+		Object: "list",
+		Data: make([]contract.OpenAIModelPayload, 0, len(models)),
+	}
+	for _, model := range models {
+		payload.Data = 
append(payload.Data, OpenAIModelPayloadFromModel(model))
+	}
+	return payload
+}
+
+func OpenAIModelPayloadFromModel(model modelcatalog.Model) contract.OpenAIModelPayload {
+	return contract.OpenAIModelPayload{
+		ID: model.ModelID,
+		Object: "model",
+		Created: 0,
+		OwnedBy: model.ProviderID,
+		AGH: contract.OpenAIModelAGHPayload{
+			ProviderID: model.ProviderID,
+			ModelID: model.ModelID,
+			DisplayName: model.DisplayName,
+			Sources: sourceIDsFromRefs(model.Sources),
+			Available: model.Available,
+			AvailabilityState: model.AvailabilityState,
+			Stale: model.Stale,
+			RefreshedAt: modelCatalogTimeString(model.RefreshedAt),
+			ContextWindow: model.ContextWindow,
+			MaxInputTokens: model.MaxInputTokens,
+			MaxOutputTokens: model.MaxOutputTokens,
+			SupportsTools: model.SupportsTools,
+			SupportsReasoning: model.SupportsReasoning,
+			ReasoningEfforts: reasoningEffortStrings(model.ReasoningEfforts),
+			DefaultReasoningEffort: reasoningEffortStringPtr(model.DefaultReasoningEffort),
+			Cost: costPayloadFromModel(model),
+			LastError: modelcatalog.RedactString(model.LastError),
+		},
+	}
+}
+
+func costPayloadFromModel(model modelcatalog.Model) *contract.ModelCatalogCostPayload {
+	if model.CostInputPerMillion == nil && model.CostOutputPerMillion == nil {
+		return nil
+	}
+	return &contract.ModelCatalogCostPayload{
+		InputPerMillion: model.CostInputPerMillion,
+		OutputPerMillion: model.CostOutputPerMillion,
+	}
+}
+
+func sourceIDsFromRefs(refs []modelcatalog.SourceRef) []string {
+	ids := make([]string, 0, len(refs))
+	for _, ref := range refs {
+		ids = append(ids, ref.SourceID)
+	}
+	return ids
+}
+
+func reasoningEffortStrings(efforts []modelcatalog.ReasoningEffort) []string {
+	values := make([]string, 0, len(efforts))
+	for _, effort := range efforts {
+		values = append(values, string(effort))
+	}
+	return values
+}
+
+func reasoningEffortStringPtr(effort *modelcatalog.ReasoningEffort) *string {
+	if effort == nil {
+		return nil
+	}
+	value := string(*effort)
+	return &value
+}
+
+func modelCatalogTimeString(value time.Time) string {
+	if value.IsZero() {
+		return ""
+	}
+	return value.UTC().Format(time.RFC3339Nano)
+}
diff --git a/internal/api/core/model_catalog_test.go b/internal/api/core/model_catalog_test.go
new file mode 100644
index 000000000..9b2bd9dec
--- /dev/null
+++ b/internal/api/core/model_catalog_test.go
@@ -0,0 +1,388 @@
+package core
+
+import (
+	"context"
+	"encoding/json"
+	"errors"
+	"fmt"
+	"net/http"
+	"net/http/httptest"
+	"strings"
+	"testing"
+	"time"
+
+	"github.com/gin-gonic/gin"
+	"github.com/pedronauck/agh/internal/api/contract"
+	"github.com/pedronauck/agh/internal/modelcatalog"
+)
+
+func TestBaseHandlersModelCatalogDependency(t *testing.T) {
+	t.Parallel()
+
+	t.Run("Should carry model catalog service from config", func(t *testing.T) {
+		t.Parallel()
+
+		service := coreModelCatalogServiceStub{}
+		handlers := NewBaseHandlers(&BaseHandlerConfig{ModelCatalog: service})
+		if handlers.ModelCatalog == nil {
+			t.Fatal("NewBaseHandlers() ModelCatalog = nil, want injected service")
+		}
+		if handlers.ModelCatalog != service {
+			t.Fatalf("NewBaseHandlers() ModelCatalog = %#v, want %#v", handlers.ModelCatalog, service)
+		}
+	})
+}
+
+func TestProviderModelPayloadConversion(t *testing.T) {
+	t.Parallel()
+
+	t.Run("Should preserve nullable availability and source stale fields", func(t *testing.T) {
+		t.Parallel()
+
+		effort := modelcatalog.ReasoningEffortHigh
+		model := modelcatalog.Model{
+			ProviderID: "codex",
+			ModelID: "gpt-5.4",
+			DisplayName: "GPT-5.4",
+			Available: nil,
+			AvailabilityState: string(modelcatalog.AvailabilityStateUnknown),
+			Stale: true,
+			RefreshedAt: time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC),
+			SupportsReasoning: boolPtr(true),
+			ReasoningEfforts: []modelcatalog.ReasoningEffort{modelcatalog.ReasoningEffortHigh},
+			DefaultReasoningEffort: &effort,
+			Sources: []modelcatalog.SourceRef{
+				{
+					SourceID: modelcatalog.SourceIDConfig,
+					SourceKind: modelcatalog.SourceKindConfig,
+					Priority: 
modelcatalog.PriorityConfig,
+					RefreshedAt: time.Date(2026, 5, 7, 11, 0, 0, 0, time.UTC),
+					Stale: true,
+					LastError: "cached provider config",
+				},
+			},
+		}
+
+		payload := ProviderModelPayloadFromModel(model)
+		if payload.Available != nil {
+			t.Fatalf("Available = %#v, want nil", payload.Available)
+		}
+		if !payload.Stale || len(payload.Sources) != 1 || !payload.Sources[0].Stale {
+			t.Fatalf("Payload = %#v, want stale model and source", payload)
+		}
+		if payload.DefaultReasoningEffort == nil || *payload.DefaultReasoningEffort != "high" {
+			t.Fatalf("DefaultReasoningEffort = %#v, want high", payload.DefaultReasoningEffort)
+		}
+		encoded, err := json.Marshal(payload)
+		if err != nil {
+			t.Fatalf("json.Marshal(payload) error = %v", err)
+		}
+		if !strings.Contains(string(encoded), `"available":null`) {
+			t.Fatalf("payload JSON = %s, want nullable available field", encoded)
+		}
+	})
+
+	t.Run("Should redact source errors in native and OpenAI projections", func(t *testing.T) {
+		t.Parallel()
+
+		model := seedModelCatalogModel("codex", "gpt-5.4")
+		model.LastError = "provider failed with api_key=sk-native-secret-token"
+		model.Sources[0].LastError = "source failed with OAUTH_TOKEN=oauth-secret-token"
+
+		nativePayload := ProviderModelPayloadFromModel(model)
+		assertRedactedModelCatalogPayload(t, nativePayload.LastError, "sk-native-secret-token")
+		assertRedactedModelCatalogPayload(t, nativePayload.Sources[0].LastError, "oauth-secret-token")
+
+		openAIPayload := OpenAIModelPayloadFromModel(model)
+		assertRedactedModelCatalogPayload(t, openAIPayload.AGH.LastError, "sk-native-secret-token")
+
+		statusPayloads := SourceStatusPayloadsFromStatuses([]modelcatalog.SourceStatus{
+			{
+				SourceID: modelcatalog.SourceIDModelsDev,
+				SourceKind: modelcatalog.SourceKindModelsDev,
+				ProviderID: "codex",
+				RefreshState: string(modelcatalog.RefreshStateFailed),
+				LastError: "models.dev failed with Bearer ya29.api-secret-token",
+			},
+		})
+		if got, want := len(statusPayloads), 1; got != 
want {
+			t.Fatalf("len(statusPayloads) = %d, want %d", got, want)
+		}
+		assertRedactedModelCatalogPayload(t, statusPayloads[0].LastError, "ya29.api-secret-token")
+	})
+}
+
+func TestProviderModelCatalogHandlers(t *testing.T) {
+	t.Parallel()
+
+	t.Run("Should pass list filters and return native model payload", func(t *testing.T) {
+		t.Parallel()
+
+		service := &modelCatalogServiceSpy{
+			listModelsFn: func(_ context.Context, opts modelcatalog.ListOptions) ([]modelcatalog.Model, error) {
+				if got, want := opts.ProviderID, "codex"; got != want {
+					t.Fatalf("ProviderID = %q, want %q", got, want)
+				}
+				if got, want := opts.SourceID, modelcatalog.SourceIDConfig; got != want {
+					t.Fatalf("SourceID = %q, want %q", got, want)
+				}
+				if !opts.Refresh || !opts.IncludeStale {
+					t.Fatalf("ListOptions = %#v, want refresh and include_stale", opts)
+				}
+				return []modelcatalog.Model{seedModelCatalogModel("codex", "gpt-5.4")}, nil
+			},
+		}
+		engine := newModelCatalogCoreEngine(t, service)
+
+		recorder := performModelCatalogRequest(
+			t,
+			engine,
+			http.MethodGet,
+			"/providers/codex/models?source_id=config&refresh=true&include_stale=true",
+			nil,
+		)
+		if recorder.Code != http.StatusOK {
+			t.Fatalf("status = %d, want 200; body=%s", recorder.Code, recorder.Body.String())
+		}
+		var payload contract.ProviderModelListResponse
+		decodeModelCatalogResponse(t, recorder, &payload)
+		if len(payload.Models) != 1 || payload.Models[0].ProviderID != "codex" {
+			t.Fatalf("payload = %#v, want codex model", payload)
+		}
+	})
+
+	t.Run("Should return deterministic validation error for invalid provider id", func(t *testing.T) {
+		t.Parallel()
+
+		engine := newModelCatalogCoreEngine(t, &modelCatalogServiceSpy{})
+
+		recorder := performModelCatalogRequest(t, engine, http.MethodGet, "/providers/bad%20id/models", nil)
+		if recorder.Code != http.StatusBadRequest {
+			t.Fatalf("status = %d, want 400; body=%s", recorder.Code, recorder.Body.String())
+		}
+		var payload contract.ErrorPayload
+		
decodeModelCatalogResponse(t, recorder, &payload)
+		if !strings.Contains(payload.Error, "provider_id") {
+			t.Fatalf("Error = %q, want provider_id validation message", payload.Error)
+		}
+	})
+
+	t.Run("Should return source statuses when refresh fails", func(t *testing.T) {
+		t.Parallel()
+
+		secret := "sk-refresh-secret-token"
+		service := &modelCatalogServiceSpy{
+			refreshFn: func(_ context.Context, _ modelcatalog.RefreshOptions) ([]modelcatalog.SourceStatus, error) {
+				return []modelcatalog.SourceStatus{
+					{
+						SourceID: modelcatalog.SourceIDConfig,
+						SourceKind: modelcatalog.SourceKindConfig,
+						ProviderID: "codex",
+						RefreshState: string(modelcatalog.RefreshStateFailed),
+						LastError: "config source failed with api_key=" + secret,
+						Stale: true,
+					},
+				}, fmt.Errorf("%w: api_key=%s", modelcatalog.ErrAllSourcesFailed, secret)
+			},
+		}
+		engine := newModelCatalogCoreEngine(t, service)
+
+		recorder := performModelCatalogRequest(t, engine, http.MethodPost, "/providers/codex/models/refresh", nil)
+		if recorder.Code != http.StatusServiceUnavailable {
+			t.Fatalf("status = %d, want 503; body=%s", recorder.Code, recorder.Body.String())
+		}
+		var payload contract.ProviderModelRefreshResponse
+		decodeModelCatalogResponse(t, recorder, &payload)
+		if len(payload.Sources) != 1 || payload.Sources[0].RefreshState != string(modelcatalog.RefreshStateFailed) {
+			t.Fatalf("payload = %#v, want failed source status", payload)
+		}
+		if payload.Error == "" {
+			t.Fatalf("payload.Error = empty, want refresh error")
+		}
+		assertRedactedModelCatalogPayload(t, payload.Error, secret)
+		assertRedactedModelCatalogPayload(t, payload.Sources[0].LastError, secret)
+	})
+}
+
+func TestOpenAIModelCatalogHandler(t *testing.T) {
+	t.Parallel()
+
+	t.Run("Should use AGH metadata and provider filter", func(t *testing.T) {
+		t.Parallel()
+
+		service := &modelCatalogServiceSpy{
+			listModelsFn: func(_ context.Context, opts modelcatalog.ListOptions) ([]modelcatalog.Model, error) {
+				if got, want := 
opts.ProviderID, "codex"; got != want {
+					t.Fatalf("ProviderID = %q, want %q", got, want)
+				}
+				return []modelcatalog.Model{seedModelCatalogModel("codex", "gpt-5.4")}, nil
+			},
+		}
+		engine := newModelCatalogCoreEngine(t, service)
+
+		recorder := performModelCatalogRequest(t, engine, http.MethodGet, "/openai/v1/models?provider_id=codex", nil)
+		if recorder.Code != http.StatusOK {
+			t.Fatalf("status = %d, want 200; body=%s", recorder.Code, recorder.Body.String())
+		}
+		var payload contract.OpenAIModelListResponse
+		decodeModelCatalogResponse(t, recorder, &payload)
+		if payload.Object != "list" || len(payload.Data) != 1 {
+			t.Fatalf("payload = %#v, want one OpenAI model list item", payload)
+		}
+		model := payload.Data[0]
+		if model.Object != "model" || model.OwnedBy != "codex" || model.AGH.ProviderID != "codex" {
+			t.Fatalf("model = %#v, want OpenAI shape with agh metadata", model)
+		}
+		if len(model.AGH.Sources) != 1 || model.AGH.Sources[0] != modelcatalog.SourceIDConfig {
+			t.Fatalf("AGH.Sources = %#v, want config source", model.AGH.Sources)
+		}
+	})
+
+	t.Run("Should return OpenAI shaped validation errors", func(t *testing.T) {
+		t.Parallel()
+
+		engine := newModelCatalogCoreEngine(t, &modelCatalogServiceSpy{})
+
+		recorder := performModelCatalogRequest(t, engine, http.MethodGet, "/openai/v1/models?provider_id=bad%20id", nil)
+		if recorder.Code != http.StatusBadRequest {
+			t.Fatalf("status = %d, want 400; body=%s", recorder.Code, recorder.Body.String())
+		}
+		var payload contract.OpenAIErrorResponse
+		decodeModelCatalogResponse(t, recorder, &payload)
+		if payload.Error.Code != "invalid_request" || payload.Error.Type != "invalid_request_error" {
+			t.Fatalf("error = %#v, want OpenAI invalid_request error", payload.Error)
+		}
+	})
+}
+
+type coreModelCatalogServiceStub struct{}
+
+func (coreModelCatalogServiceStub) ListModels(
+	context.Context,
+	modelcatalog.ListOptions,
+) ([]modelcatalog.Model, error) {
+	return nil, nil
+}
+
+func (coreModelCatalogServiceStub) 
Refresh(
+	context.Context,
+	modelcatalog.RefreshOptions,
+) ([]modelcatalog.SourceStatus, error) {
+	return nil, nil
+}
+
+func (coreModelCatalogServiceStub) ListSourceStatus(
+	context.Context,
+	string,
+) ([]modelcatalog.SourceStatus, error) {
+	return nil, nil
+}
+
+type modelCatalogServiceSpy struct {
+	listModelsFn func(context.Context, modelcatalog.ListOptions) ([]modelcatalog.Model, error)
+	refreshFn func(context.Context, modelcatalog.RefreshOptions) ([]modelcatalog.SourceStatus, error)
+	listSourceStatusFn func(context.Context, string) ([]modelcatalog.SourceStatus, error)
+}
+
+func (s *modelCatalogServiceSpy) ListModels(
+	ctx context.Context,
+	opts modelcatalog.ListOptions,
+) ([]modelcatalog.Model, error) {
+	if s.listModelsFn != nil {
+		return s.listModelsFn(ctx, opts)
+	}
+	return nil, errors.New("unexpected ListModels call")
+}
+
+func (s *modelCatalogServiceSpy) Refresh(
+	ctx context.Context,
+	opts modelcatalog.RefreshOptions,
+) ([]modelcatalog.SourceStatus, error) {
+	if s.refreshFn != nil {
+		return s.refreshFn(ctx, opts)
+	}
+	return nil, errors.New("unexpected Refresh call")
+}
+
+func (s *modelCatalogServiceSpy) ListSourceStatus(
+	ctx context.Context,
+	providerID string,
+) ([]modelcatalog.SourceStatus, error) {
+	if s.listSourceStatusFn != nil {
+		return s.listSourceStatusFn(ctx, providerID)
+	}
+	return nil, errors.New("unexpected ListSourceStatus call")
+}
+
+func newModelCatalogCoreEngine(t *testing.T, service ModelCatalogService) *gin.Engine {
+	t.Helper()
+
+	gin.SetMode(gin.TestMode)
+	handlers := NewBaseHandlers(&BaseHandlerConfig{
+		ModelCatalog: service,
+		Now: func() time.Time {
+			return time.Date(2026, 5, 7, 12, 30, 0, 0, time.UTC)
+		},
+	})
+	engine := gin.New()
+	engine.GET("/providers/*catalog_path", handlers.ProviderModelCatalog)
+	engine.POST("/providers/*catalog_path", handlers.ProviderModelCatalog)
+	engine.GET("/openai/v1/models", handlers.OpenAIModels)
+	return engine
+}
+
+func performModelCatalogRequest(
+	t 
*testing.T, + engine http.Handler, + method string, + path string, + body []byte, +) *httptest.ResponseRecorder { + t.Helper() + + recorder := httptest.NewRecorder() + req := httptest.NewRequestWithContext(context.Background(), method, path, strings.NewReader(string(body))) + engine.ServeHTTP(recorder, req) + return recorder +} + +func decodeModelCatalogResponse(t *testing.T, recorder *httptest.ResponseRecorder, dest any) { + t.Helper() + + if err := json.Unmarshal(recorder.Body.Bytes(), dest); err != nil { + t.Fatalf("json.Unmarshal(response) error = %v; body=%s", err, recorder.Body.String()) + } +} + +func seedModelCatalogModel(providerID string, modelID string) modelcatalog.Model { + available := true + return modelcatalog.Model{ + ProviderID: providerID, + ModelID: modelID, + DisplayName: "GPT-5.4", + Available: &available, + AvailabilityState: string(modelcatalog.AvailabilityStateAvailableLive), + Sources: []modelcatalog.SourceRef{ + { + SourceID: modelcatalog.SourceIDConfig, + SourceKind: modelcatalog.SourceKindConfig, + Priority: modelcatalog.PriorityConfig, + }, + }, + } +} + +func assertRedactedModelCatalogPayload(t *testing.T, value string, secret string) { + t.Helper() + + if strings.Contains(value, secret) { + t.Fatalf("payload value = %q, want secret redacted", value) + } + if !strings.Contains(value, "[REDACTED]") { + t.Fatalf("payload value = %q, want redaction marker", value) + } +} + +func boolPtr(value bool) *bool { + return &value +} diff --git a/internal/api/core/session_workspace.go b/internal/api/core/session_workspace.go index 8ff8f08a1..be4b84df9 100644 --- a/internal/api/core/session_workspace.go +++ b/internal/api/core/session_workspace.go @@ -36,6 +36,27 @@ func validateCreateSessionRequest(prefix string, workspaceRef string, workspaceP } } +// validateCreateSessionRuntimeOverrides enforces the model + reasoning_effort +// invariants for create-session payloads. 
Provider must be set when either +// override is present, and reasoning_effort must match the supported enum. +func validateCreateSessionRuntimeOverrides(prefix string, provider string, model string, reasoningEffort string) error { + trimmedProvider := strings.TrimSpace(provider) + trimmedModel := strings.TrimSpace(model) + trimmedEffort := strings.TrimSpace(reasoningEffort) + if trimmedModel != "" && trimmedProvider == "" { + return prefixedRuntimeOverrideError(prefix, "provider is required when model is set") + } + if trimmedEffort != "" { + if trimmedProvider == "" { + return prefixedRuntimeOverrideError(prefix, "provider is required when reasoning_effort is set") + } + if err := session.ValidateReasoningEffort(trimmedEffort); err != nil { + return prefixedRuntimeOverrideErr(prefix, err) + } + } + return nil +} + // LookupWorkspaceID resolves a workspace reference into a stable workspace ID. func lookupWorkspaceID(ctx context.Context, prefix string, workspaces WorkspaceGetter, ref string) (string, error) { if workspaces == nil { @@ -132,6 +153,8 @@ func statusForSessionError(err error) int { return http.StatusBadRequest case errors.Is(err, aghconfig.ErrProviderUnavailable): return http.StatusBadRequest + case errors.Is(err, session.ErrInvalidRuntimeOverride): + return http.StatusBadRequest case errors.Is(err, session.ErrSessionNotActive): return http.StatusBadRequest case errors.Is(err, session.ErrMaxSessionsReached), @@ -153,3 +176,15 @@ func prefixedError(prefix string, message string) error { } return fmt.Errorf("%s: %s", label, message) } + +func prefixedRuntimeOverrideError(prefix string, message string) error { + return fmt.Errorf("%w: %w", session.ErrInvalidRuntimeOverride, prefixedError(prefix, message)) +} + +func prefixedRuntimeOverrideErr(prefix string, err error) error { + label := strings.TrimSpace(prefix) + if label == "" { + return err + } + return fmt.Errorf("%s: %w", label, err) +} diff --git 
a/internal/api/core/session_workspace_internal_test.go b/internal/api/core/session_workspace_internal_test.go index 5e7eef22f..34af53402 100644 --- a/internal/api/core/session_workspace_internal_test.go +++ b/internal/api/core/session_workspace_internal_test.go @@ -40,6 +40,40 @@ func TestSessionWorkspaceHelpers(t *testing.T) { } }) + t.Run("validate create session runtime overrides", func(t *testing.T) { + t.Parallel() + + if err := validateCreateSessionRuntimeOverrides("core-test", "", "gpt-5.4", ""); !errors.Is( + err, + session.ErrInvalidRuntimeOverride, + ) { + t.Fatalf("validateCreateSessionRuntimeOverrides(model) error = %v, want ErrInvalidRuntimeOverride", err) + } + if err := validateCreateSessionRuntimeOverrides("core-test", "", "", "high"); !errors.Is( + err, + session.ErrInvalidRuntimeOverride, + ) { + t.Fatalf( + "validateCreateSessionRuntimeOverrides(reasoning provider) error = %v, want ErrInvalidRuntimeOverride", + err, + ) + } + if err := validateCreateSessionRuntimeOverrides( + "core-test", + "codex", + "", + "unsupported", + ); !errors.Is(err, session.ErrInvalidRuntimeOverride) { + t.Fatalf( + "validateCreateSessionRuntimeOverrides(reasoning enum) error = %v, want ErrInvalidRuntimeOverride", + err, + ) + } + if err := validateCreateSessionRuntimeOverrides("core-test", "codex", "gpt-5.4", "high"); err != nil { + t.Fatalf("validateCreateSessionRuntimeOverrides(valid) error = %v", err) + } + }) + t.Run("lookup workspace id", func(t *testing.T) { t.Parallel() @@ -136,6 +170,9 @@ func TestSessionWorkspaceStatusMappings(t *testing.T) { if got := statusForSessionError(aghconfig.ErrProviderUnavailable); got != http.StatusBadRequest { t.Fatalf("statusForSessionError(provider unavailable) = %d, want %d", got, http.StatusBadRequest) } + if got := statusForSessionError(session.ErrInvalidRuntimeOverride); got != http.StatusBadRequest { + t.Fatalf("statusForSessionError(invalid runtime override) = %d, want %d", got, http.StatusBadRequest) + } if got := 
statusForSessionError(session.ErrSessionNotActive); got != http.StatusBadRequest { t.Fatalf("statusForSessionError(not active) = %d, want %d", got, http.StatusBadRequest) } diff --git a/internal/api/core/settings.go b/internal/api/core/settings.go index 9944268f4..193f2e811 100644 --- a/internal/api/core/settings.go +++ b/internal/api/core/settings.go @@ -469,7 +469,7 @@ func (h *BaseHandlers) getSettingsCollectionItem(c *gin.Context, collection sett h.respondError(c, StatusForSettingsError(notFound), notFound) return } - c.JSON(http.StatusOK, contract.SettingsProviderResponse{Provider: settingsProviderItemPayload(item)}) + c.JSON(http.StatusOK, contract.SettingsProviderResponse{Provider: settingsProviderItemPayload(&item)}) case settingspkg.CollectionSandboxes: item, found := findSettingsSandbox(envelope.Sandboxes, name) if !found { @@ -797,7 +797,8 @@ func parsePutSettingsProviderRequest(c *gin.Context) (settingspkg.CollectionItem settings := settingspkg.ProviderSettings{ Command: strings.TrimSpace(body.Settings.Command), DisplayName: strings.TrimSpace(body.Settings.DisplayName), - DefaultModel: strings.TrimSpace(body.Settings.DefaultModel), + Models: providerModelsFromPayload(body.Settings.Models), + ModelsSet: body.Settings.Models != nil, Harness: aghconfig.ProviderHarness(strings.TrimSpace(body.Settings.Harness)), RuntimeProvider: strings.TrimSpace(body.Settings.RuntimeProvider), Transport: strings.TrimSpace(body.Settings.Transport), @@ -820,7 +821,7 @@ func parsePutSettingsProviderRequest(c *gin.Context) (settingspkg.CollectionItem func providerSettingsPayloadEmpty(payload contract.SettingsProviderSettingsPayload) bool { return strings.TrimSpace(payload.Command) == "" && strings.TrimSpace(payload.DisplayName) == "" && - strings.TrimSpace(payload.DefaultModel) == "" && + payload.Models == nil && strings.TrimSpace(payload.Harness) == "" && strings.TrimSpace(payload.RuntimeProvider) == "" && strings.TrimSpace(payload.Transport) == "" && @@ -852,6 +853,64 @@ func 
providerCredentialSlotsFromPayload( return slots } +func providerModelsFromPayload(payload *contract.SettingsProviderModelsPayload) aghconfig.ProviderModelsConfig { + if payload == nil { + return aghconfig.ProviderModelsConfig{} + } + return aghconfig.ProviderModelsConfig{ + Default: strings.TrimSpace(payload.Default), + Curated: providerModelConfigsFromPayload(payload.Curated), + Discovery: providerModelsDiscoveryFromPayload(payload.Discovery), + } +} + +func providerModelsDiscoveryFromPayload( + payload *contract.SettingsProviderModelsDiscoveryPayload, +) aghconfig.ProviderModelsDiscoveryConfig { + if payload == nil { + return aghconfig.ProviderModelsDiscoveryConfig{} + } + return aghconfig.ProviderModelsDiscoveryConfig{ + Enabled: cloneBoolPtr(payload.Enabled), + Command: strings.TrimSpace(payload.Command), + Endpoint: strings.TrimSpace(payload.Endpoint), + Timeout: strings.TrimSpace(payload.Timeout), + } +} + +func providerModelConfigsFromPayload( + payloads []contract.SettingsProviderModelPayload, +) []aghconfig.ProviderModelConfig { + if payloads == nil { + return nil + } + models := make([]aghconfig.ProviderModelConfig, 0, len(payloads)) + for _, payload := range payloads { + models = append(models, aghconfig.ProviderModelConfig{ + ID: strings.TrimSpace(payload.ID), + DisplayName: strings.TrimSpace(payload.DisplayName), + ContextWindow: cloneInt64Ptr(payload.ContextWindow), + MaxInputTokens: cloneInt64Ptr(payload.MaxInputTokens), + MaxOutputTokens: cloneInt64Ptr(payload.MaxOutputTokens), + SupportsTools: cloneBoolPtr(payload.SupportsTools), + SupportsReasoning: cloneBoolPtr(payload.SupportsReasoning), + ReasoningEfforts: trimStringSliceInternal(payload.ReasoningEfforts), + DefaultReasoningEffort: strings.TrimSpace(payload.DefaultReasoningEffort), + CostInputPerMillion: cloneFloat64Ptr(payload.CostInputPerMillion), + CostOutputPerMillion: cloneFloat64Ptr(payload.CostOutputPerMillion), + }) + } + return models +} + +func cloneFloat64Ptr(value *float64) 
*float64 { + if value == nil { + return nil + } + cloned := *value + return &cloned +} + func providerSecretWritesFromPayload( payloads []contract.SettingsProviderSecretWritePayload, ) []settingspkg.ProviderSecretWrite { @@ -1658,9 +1717,10 @@ func (h *BaseHandlers) drainSettingsLogTail( } func findSettingsProvider(values []settingspkg.ProviderItem, name string) (settingspkg.ProviderItem, bool) { - for _, value := range values { + for idx := range values { + value := &values[idx] if strings.TrimSpace(value.Name) == name { - return value, true + return *value, true } } return settingspkg.ProviderItem{}, false diff --git a/internal/api/core/settings_internal_test.go b/internal/api/core/settings_internal_test.go index fc23aef88..bc93a238a 100644 --- a/internal/api/core/settings_internal_test.go +++ b/internal/api/core/settings_internal_test.go @@ -56,6 +56,12 @@ func TestSettingsHelperFunctionsAndNilErrorWrappers(t *testing.T) { t.Fatal("findSettingsSandbox() = false, want true") } + if providerSettingsPayloadEmpty(contract.SettingsProviderSettingsPayload{ + Models: &contract.SettingsProviderModelsPayload{}, + }) { + t.Fatal("providerSettingsPayloadEmpty(empty models payload) = true, want false") + } + fireLimit := automationFireLimitFromPayload(automationmodel.FireLimitConfig{Max: 5, Window: "1m"}) if fireLimit.Max != 5 || fireLimit.Window != "1m" { t.Fatalf("automationFireLimitFromPayload() = %#v", fireLimit) diff --git a/internal/api/core/settings_test.go b/internal/api/core/settings_test.go index 67be94ab6..bd3c46a1b 100644 --- a/internal/api/core/settings_test.go +++ b/internal/api/core/settings_test.go @@ -623,8 +623,10 @@ func TestSettingsSectionAndCollectionConversions(t *testing.T) { { Name: "openai", Settings: settingspkg.ProviderSettings{ - Command: "codex", - DefaultModel: "gpt-5.4", + Command: "codex", + Models: aghconfig.ProviderModelsConfig{ + Default: "gpt-5.4", + }, CredentialSlots: []aghconfig.ProviderCredentialSlot{ { Name: "api_key", @@ -663,8 
+665,10 @@ func TestSettingsSectionAndCollectionConversions(t *testing.T) { Scope: settingspkg.ScopeGlobal, }, Settings: settingspkg.ProviderSettings{ - Command: "codex", - DefaultModel: "gpt-5.4", + Command: "codex", + Models: aghconfig.ProviderModelsConfig{ + Default: "gpt-5.4", + }, }, }, }, @@ -1258,8 +1262,20 @@ func TestSettingsCollectionHandlersDelegateValidPayloads(t *testing.T) { path: "/api/settings/providers/openai", body: contract.PutSettingsProviderRequest{ Settings: contract.SettingsProviderSettingsPayload{ - Command: "codex", - DefaultModel: "gpt-5.4", + Command: "codex", + Models: &contract.SettingsProviderModelsPayload{ + Default: "gpt-5.4", + Curated: []contract.SettingsProviderModelPayload{ + { + ID: "gpt-5.4", + DisplayName: "GPT-5.4", + SupportsReasoning: boolPointer(true), + ReasoningEfforts: []string{"low", "high"}, + DefaultReasoningEffort: "high", + }, + {ID: "gpt-5.4-mini", DisplayName: "GPT-5.4 Mini"}, + }, + }, CredentialSlots: []contract.SettingsProviderCredentialSlotPayload{ { Name: "api_key", @@ -1274,9 +1290,24 @@ func TestSettingsCollectionHandlersDelegateValidPayloads(t *testing.T) { assert: func(t *testing.T, service *stubSettingsService) { t.Helper() if service.LastPutCollectionRequest.Provider == nil || - service.LastPutCollectionRequest.Provider.DefaultModel != "gpt-5.4" { + service.LastPutCollectionRequest.Provider.Models.Default != "gpt-5.4" { t.Fatalf("LastPutCollectionRequest.Provider = %#v", service.LastPutCollectionRequest.Provider) } + if got := service.LastPutCollectionRequest.Provider.Models.Curated; len(got) != 2 || + got[0].ID != "gpt-5.4" || + got[1].ID != "gpt-5.4-mini" { + t.Fatalf("Provider.Models.Curated = %#v", got) + } + model := service.LastPutCollectionRequest.Provider.Models.Curated[0] + if model.SupportsReasoning == nil || !*model.SupportsReasoning { + t.Fatalf( + "Provider.Models.Curated[0].SupportsReasoning = %#v, want true", + model.SupportsReasoning, + ) + } + if got, want := 
model.DefaultReasoningEffort, "high"; got != want { + t.Fatalf("Provider.Models.Curated[0].DefaultReasoningEffort = %q, want %q", got, want) + } }, assertResponse: assertAppliedSettingsMutation, }, @@ -1347,8 +1378,10 @@ func TestSettingsCollectionHandlersDelegateValidPayloads(t *testing.T) { { Name: "openai", Settings: settingspkg.ProviderSettings{ - Command: "codex", - DefaultModel: "gpt-5.4", + Command: "codex", + Models: aghconfig.ProviderModelsConfig{ + Default: "gpt-5.4", + }, CredentialSlots: []aghconfig.ProviderCredentialSlot{ { Name: "api_key", @@ -2179,6 +2212,10 @@ func decodeJSON(t *testing.T, body []byte, dest any) { } } +func boolPointer(value bool) *bool { + return &value +} + func appendLine(t *testing.T, path string, line string) { t.Helper() diff --git a/internal/api/core/workspaces.go b/internal/api/core/workspaces.go index bc7b67c20..909c53cfc 100644 --- a/internal/api/core/workspaces.go +++ b/internal/api/core/workspaces.go @@ -257,7 +257,10 @@ func (h *BaseHandlers) ResolveWorkspace(c *gin.Context) { } func (h *BaseHandlers) validateCreateSessionRequest(req contract.CreateSessionRequest) error { - return validateCreateSessionRequest(h.transportName(), req.Workspace, req.WorkspacePath) + if err := validateCreateSessionRequest(h.transportName(), req.Workspace, req.WorkspacePath); err != nil { + return err + } + return validateCreateSessionRuntimeOverrides(h.transportName(), req.Provider, req.Model, req.ReasoningEffort) } func (h *BaseHandlers) lookupWorkspaceID(ctx context.Context, ref string) (string, error) { diff --git a/internal/api/httpapi/handlers.go b/internal/api/httpapi/handlers.go index 5a6424879..aea2c139e 100644 --- a/internal/api/httpapi/handlers.go +++ b/internal/api/httpapi/handlers.go @@ -32,6 +32,7 @@ type handlerConfig struct { vault core.VaultService workspaces core.WorkspaceService agentCatalog core.AgentCatalog + modelCatalog core.ModelCatalogService agentContext core.AgentContextService soulAuthoring 
core.SoulAuthoringService soulRefresher core.SoulRefresher @@ -112,6 +113,7 @@ func newHandlers(cfg *handlerConfig) *Handlers { Vault: cfg.vault, Workspaces: cfg.workspaces, AgentCatalog: cfg.agentCatalog, + ModelCatalog: cfg.modelCatalog, AgentContextService: cfg.agentContext, SoulAuthoring: cfg.soulAuthoring, SoulRefresher: cfg.soulRefresher, diff --git a/internal/api/httpapi/handlers_test.go b/internal/api/httpapi/handlers_test.go index f5b89519f..37b2227fb 100644 --- a/internal/api/httpapi/handlers_test.go +++ b/internal/api/httpapi/handlers_test.go @@ -128,6 +128,8 @@ func TestRegisterRoutesCoversTechSpecEndpoints(t *testing.T) { "GET /api/observe/health", "GET /api/observe/tasks/dashboard", "GET /api/observe/tasks/inbox", + "GET /api/openai/v1/models", + "GET /api/providers/*catalog_path", "GET /api/sessions", "GET /api/sessions/:id", "GET /api/sessions/:id/events", @@ -219,6 +221,7 @@ func TestRegisterRoutesCoversTechSpecEndpoints(t *testing.T) { "POST /api/memory/sessions/prune", "POST /api/memory/sessions/repair", "POST /api/memory/sessions/:session_id/replay", + "POST /api/providers/*catalog_path", "POST /api/network/channels", "POST /api/network/channels/:channel/directs/resolve", "POST /api/bridges", diff --git a/internal/api/httpapi/middleware.go b/internal/api/httpapi/middleware.go index ad73dcb0c..9a352d68e 100644 --- a/internal/api/httpapi/middleware.go +++ b/internal/api/httpapi/middleware.go @@ -57,7 +57,12 @@ func corsMiddleware(boundHost string) gin.HandlerFunc { if origin != "" { allowedOrigin, ok := resolveAllowedOrigin(origin, requestScheme(c.Request), c.Request.Host, boundHost) if !ok { - c.AbortWithStatusJSON(http.StatusForbidden, contract.ErrorPayload{Error: "origin not allowed"}) + if isOpenAICompatiblePath(c) { + core.RespondOpenAIError(c, http.StatusForbidden, errors.New("origin not allowed"), false) + c.Abort() + } else { + c.AbortWithStatusJSON(http.StatusForbidden, contract.ErrorPayload{Error: "origin not allowed"}) + } return } 
headers.Set("Access-Control-Allow-Origin", allowedOrigin) @@ -267,7 +272,11 @@ func loopbackAPIGuard(boundHost string) gin.HandlerFunc { c.Next() return } - core.RespondError(c, http.StatusForbidden, errLoopbackAPIRequired, false) + if isOpenAICompatiblePath(c) { + core.RespondOpenAIError(c, http.StatusForbidden, errLoopbackAPIRequired, false) + } else { + core.RespondError(c, http.StatusForbidden, errLoopbackAPIRequired, false) + } c.Abort() } } @@ -283,3 +292,10 @@ func loopbackMutationGuard(boundHost string) gin.HandlerFunc { c.Abort() } } + +func isOpenAICompatiblePath(c *gin.Context) bool { + if c == nil || c.Request == nil || c.Request.URL == nil { + return false + } + return strings.HasPrefix(c.Request.URL.Path, "/api/openai/") +} diff --git a/internal/api/httpapi/model_catalog_test.go b/internal/api/httpapi/model_catalog_test.go new file mode 100644 index 000000000..6f97dbc3f --- /dev/null +++ b/internal/api/httpapi/model_catalog_test.go @@ -0,0 +1,206 @@ +package httpapi + +import ( + "context" + "net/http" + "strings" + "testing" + "time" + + "github.com/pedronauck/agh/internal/api/contract" + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/modelcatalog" +) + +func TestHTTPHandlersModelCatalogDependency(t *testing.T) { + t.Parallel() + + t.Run("ShouldPassModelCatalogServiceToBaseHandlers", func(t *testing.T) { + t.Parallel() + + service := httpModelCatalogServiceStub{} + handlers := newHandlers(&handlerConfig{modelCatalog: service}) + if handlers.BaseHandlers == nil { + t.Fatal("newHandlers() BaseHandlers = nil") + } + if handlers.ModelCatalog == nil { + t.Fatal("newHandlers() ModelCatalog = nil, want injected service") + } + if handlers.ModelCatalog != service { + t.Fatalf("newHandlers() ModelCatalog = %#v, want %#v", handlers.ModelCatalog, service) + } + }) +} + +func TestHTTPModelCatalogRoutes(t *testing.T) { + t.Parallel() + + t.Run("Should expose native provider model list route", func(t *testing.T) { + 
t.Parallel() + + service := &httpModelCatalogServiceSpy{ + listModelsFn: func(_ context.Context, opts modelcatalog.ListOptions) ([]modelcatalog.Model, error) { + if got, want := opts.ProviderID, "codex"; got != want { + t.Fatalf("ProviderID = %q, want %q", got, want) + } + return []modelcatalog.Model{httpSeedCatalogModel("codex", "gpt-5.4")}, nil + }, + } + engine := newHTTPModelCatalogRouter(t, service, "127.0.0.1") + + recorder := performRequest(t, engine, http.MethodGet, "/api/providers/codex/models", nil) + if recorder.Code != http.StatusOK { + t.Fatalf("status = %d, want 200; body=%s", recorder.Code, recorder.Body.String()) + } + var payload contract.ProviderModelListResponse + decodeJSONResponse(t, recorder, &payload) + if len(payload.Models) != 1 || payload.Models[0].ProviderID != "codex" { + t.Fatalf("payload = %#v, want codex model", payload) + } + }) + + t.Run("Should expose OpenAI model route with AGH metadata", func(t *testing.T) { + t.Parallel() + + service := &httpModelCatalogServiceSpy{ + listModelsFn: func(_ context.Context, opts modelcatalog.ListOptions) ([]modelcatalog.Model, error) { + if got, want := opts.ProviderID, "codex"; got != want { + t.Fatalf("ProviderID = %q, want %q", got, want) + } + return []modelcatalog.Model{httpSeedCatalogModel("codex", "gpt-5.4")}, nil + }, + } + engine := newHTTPModelCatalogRouter(t, service, "127.0.0.1") + + recorder := performRequest(t, engine, http.MethodGet, "/api/openai/v1/models?provider_id=codex", nil) + if recorder.Code != http.StatusOK { + t.Fatalf("status = %d, want 200; body=%s", recorder.Code, recorder.Body.String()) + } + var payload contract.OpenAIModelListResponse + decodeJSONResponse(t, recorder, &payload) + if payload.Object != "list" || len(payload.Data) != 1 || payload.Data[0].AGH.ProviderID != "codex" { + t.Fatalf("payload = %#v, want OpenAI list with agh metadata", payload) + } + }) + + t.Run("Should return OpenAI shaped forbidden error from API middleware", func(t *testing.T) { + 
t.Parallel() + + engine := newHTTPModelCatalogRouter(t, &httpModelCatalogServiceSpy{}, "0.0.0.0") + + recorder := performRequest(t, engine, http.MethodGet, "/api/openai/v1/models", nil) + if recorder.Code != http.StatusForbidden { + t.Fatalf("status = %d, want 403; body=%s", recorder.Code, recorder.Body.String()) + } + var payload contract.OpenAIErrorResponse + decodeJSONResponse(t, recorder, &payload) + if payload.Error.Code != "forbidden" || !strings.Contains(payload.Error.Message, "remote HTTP API access") { + t.Fatalf("error = %#v, want OpenAI-shaped forbidden API middleware error", payload.Error) + } + }) +} + +type httpModelCatalogServiceStub struct{} + +func (httpModelCatalogServiceStub) ListModels( + context.Context, + modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + return nil, nil +} + +func (httpModelCatalogServiceStub) Refresh( + context.Context, + modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +func (httpModelCatalogServiceStub) ListSourceStatus( + context.Context, + string, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +type httpModelCatalogServiceSpy struct { + listModelsFn func(context.Context, modelcatalog.ListOptions) ([]modelcatalog.Model, error) + refreshFn func(context.Context, modelcatalog.RefreshOptions) ([]modelcatalog.SourceStatus, error) + listSourceStatusFn func(context.Context, string) ([]modelcatalog.SourceStatus, error) +} + +func (s *httpModelCatalogServiceSpy) ListModels( + ctx context.Context, + opts modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + if s.listModelsFn != nil { + return s.listModelsFn(ctx, opts) + } + return nil, nil +} + +func (s *httpModelCatalogServiceSpy) Refresh( + ctx context.Context, + opts modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + if s.refreshFn != nil { + return s.refreshFn(ctx, opts) + } + return nil, nil +} + +func (s *httpModelCatalogServiceSpy) ListSourceStatus( + ctx 
context.Context, + providerID string, +) ([]modelcatalog.SourceStatus, error) { + if s.listSourceStatusFn != nil { + return s.listSourceStatusFn(ctx, providerID) + } + return nil, nil +} + +func newHTTPModelCatalogRouter( + t *testing.T, + service coreModelCatalogService, + boundHost string, +) http.Handler { + t.Helper() + + cfg := testConfigWithDisabledNetwork(newTestHomePaths(t)) + cfg.HTTP.Host = boundHost + cfg.HTTP.Port = 2123 + handlers := newHandlers(&handlerConfig{ + modelCatalog: service, + staticFS: mustStaticFS(t), + config: cfg, + boundHost: boundHost, + logger: discardLogger(), + startedAt: time.Date(2026, 4, 3, 12, 0, 0, 0, time.UTC), + now: func() time.Time { return time.Date(2026, 4, 3, 12, 0, 1, 0, time.UTC) }, + pollInterval: 5 * time.Millisecond, + agentLoader: aghconfig.LoadAgentDef, + httpPort: cfg.HTTP.Port, + workspaces: stubWorkspaceService{}, + tasks: stubTaskManager{}, + }) + return newTestRouter(t, handlers) +} + +type coreModelCatalogService interface { + ListModels(context.Context, modelcatalog.ListOptions) ([]modelcatalog.Model, error) + Refresh(context.Context, modelcatalog.RefreshOptions) ([]modelcatalog.SourceStatus, error) + ListSourceStatus(context.Context, string) ([]modelcatalog.SourceStatus, error) +} + +func httpSeedCatalogModel(providerID string, modelID string) modelcatalog.Model { + available := true + return modelcatalog.Model{ + ProviderID: providerID, + ModelID: modelID, + Available: &available, + AvailabilityState: string(modelcatalog.AvailabilityStateAvailableLive), + Sources: []modelcatalog.SourceRef{ + {SourceID: modelcatalog.SourceIDConfig, SourceKind: modelcatalog.SourceKindConfig}, + }, + } +} diff --git a/internal/api/httpapi/routes.go b/internal/api/httpapi/routes.go index 0bc3f787b..fd82f3d33 100644 --- a/internal/api/httpapi/routes.go +++ b/internal/api/httpapi/routes.go @@ -33,6 +33,8 @@ func RegisterRoutes(router gin.IRouter, handlers *Handlers) { registerExtensionRoutes(api, handlers) 
registerSettingsRoutes(api, handlers) registerVaultRoutes(api, handlers) + registerProviderModelRoutes(api, handlers) + registerOpenAIModelRoutes(api, handlers) if engine, ok := router.(*gin.Engine); ok { engine.NoRoute(handlers.serveStaticRoute) @@ -397,6 +399,15 @@ func registerVaultRoutes(api gin.IRouter, handlers *Handlers) { vaultGroup.DELETE("/secrets", privileged, handlers.DeleteVaultSecret) } +func registerProviderModelRoutes(api gin.IRouter, handlers *Handlers) { + api.GET("/providers/*catalog_path", handlers.ProviderModelCatalog) + api.POST("/providers/*catalog_path", handlers.ProviderModelCatalog) +} + +func registerOpenAIModelRoutes(api gin.IRouter, handlers *Handlers) { + api.GET("/openai/v1/models", handlers.OpenAIModels) +} + func registerWebhookRoutes(api gin.IRouter, handlers *Handlers) { webhooks := api.Group("/webhooks") webhooks.POST("/global/:endpoint", handlers.DeliverGlobalWebhook) diff --git a/internal/api/httpapi/server.go b/internal/api/httpapi/server.go index 4f5c79863..a90b6b585 100644 --- a/internal/api/httpapi/server.go +++ b/internal/api/httpapi/server.go @@ -59,6 +59,7 @@ type Server struct { vault core.VaultService workspaces core.WorkspaceService agentCatalog core.AgentCatalog + modelCatalog core.ModelCatalogService agentContext core.AgentContextService soulAuthoring core.SoulAuthoringService soulRefresher core.SoulRefresher @@ -284,6 +285,13 @@ func WithAgentCatalog(catalog core.AgentCatalog) Option { } } +// WithModelCatalogService injects the daemon-owned provider model catalog service. +func WithModelCatalogService(service core.ModelCatalogService) Option { + return func(server *Server) { + server.modelCatalog = service + } +} + // WithAgentContext injects the bounded agent situation context service. 
func WithAgentContext(service core.AgentContextService) Option { return func(server *Server) { @@ -542,6 +550,7 @@ func (s *Server) handlerConfig(staticFS fs.FS) *handlerConfig { vault: s.vault, workspaces: s.workspaces, agentCatalog: s.agentCatalog, + modelCatalog: s.modelCatalog, agentContext: s.agentContext, soulAuthoring: s.soulAuthoring, soulRefresher: s.soulRefresher, diff --git a/internal/api/httpapi/transport_parity_integration_test.go b/internal/api/httpapi/transport_parity_integration_test.go index a1825ec8f..3d34ef485 100644 --- a/internal/api/httpapi/transport_parity_integration_test.go +++ b/internal/api/httpapi/transport_parity_integration_test.go @@ -657,7 +657,7 @@ func writeTransportProviderOverrideConfig( } } tree.SetPath(append(providerPath, "command"), strings.TrimSpace(command)) - tree.SetPath(append(providerPath, "default_model"), "transport-override-model") + tree.SetPath(append(providerPath, "models", "default"), "transport-override-model") credentialSlot, err := tomltree.TreeFromMap(map[string]any{ "name": "api_key", "target_env": "TRANSPORT_OVERRIDE_API_KEY", diff --git a/internal/api/spec/model_catalog.go b/internal/api/spec/model_catalog.go new file mode 100644 index 000000000..2a9f03f16 --- /dev/null +++ b/internal/api/spec/model_catalog.go @@ -0,0 +1,143 @@ +package spec + +import "github.com/pedronauck/agh/internal/api/contract" + +func modelCatalogOperations() []OperationSpec { + operations := []OperationSpec{openAIModelCatalogOperation()} + return append(operations, nativeModelCatalogOperations()...) 
+} + +func openAIModelCatalogOperation() OperationSpec { + return OperationSpec{ + Method: "GET", + Path: "/api/openai/v1/models", + OperationID: "listOpenAIModels", + Summary: "List provider models using the OpenAI-compatible model shape", + Tags: []string{"openai"}, + Transports: []Transport{TransportHTTP}, + Parameters: []ParameterSpec{ + queryParam("provider_id", "Filter by AGH provider id", false), + }, + Responses: []ResponseSpec{ + {Status: 200, Description: "OK", Body: contract.OpenAIModelListResponse{}}, + {Status: 400, Description: "Invalid model catalog filter", Body: contract.OpenAIErrorResponse{}}, + {Status: 401, Description: "Unauthorized", Body: contract.OpenAIErrorResponse{}}, + {Status: 403, Description: "Forbidden", Body: contract.OpenAIErrorResponse{}}, + {Status: 503, Description: "Model catalog unavailable", Body: contract.OpenAIErrorResponse{}}, + {Status: 500, Description: "Internal server error", Body: contract.OpenAIErrorResponse{}}, + }, + } +} + +func nativeModelCatalogOperations() []OperationSpec { + return []OperationSpec{ + { + Method: "GET", + Path: "/api/providers/models", + OperationID: "listProviderModels", + Summary: "List provider model catalog entries across providers", + Tags: []string{"providers"}, + Transports: []Transport{TransportHTTP, TransportUDS}, + Parameters: modelCatalogListParameters(false), + Responses: modelCatalogListResponses(), + }, + { + Method: "GET", + Path: "/api/providers/{provider_id}/models", + OperationID: "listProviderModelsByProvider", + Summary: "List provider model catalog entries for one provider", + Tags: []string{"providers"}, + Transports: []Transport{TransportHTTP, TransportUDS}, + Parameters: modelCatalogListParameters(true), + Responses: modelCatalogListResponses(), + }, + { + Method: "POST", + Path: "/api/providers/models/refresh", + OperationID: "refreshProviderModels", + Summary: "Refresh provider model catalog sources across providers", + Tags: []string{"providers"}, + Transports: 
[]Transport{TransportHTTP, TransportUDS}, + RequestBody: contract.ProviderModelRefreshRequest{}, + RequestBodyOptional: true, + Responses: modelCatalogRefreshResponses(), + }, + { + Method: "POST", + Path: "/api/providers/{provider_id}/models/refresh", + OperationID: "refreshProviderModelsByProvider", + Summary: "Refresh provider model catalog sources for one provider", + Tags: []string{"providers"}, + Transports: []Transport{TransportHTTP, TransportUDS}, + Parameters: []ParameterSpec{pathParam("provider_id", "AGH provider id")}, + RequestBody: contract.ProviderModelRefreshRequest{}, + RequestBodyOptional: true, + Responses: modelCatalogRefreshResponses(), + }, + { + Method: "GET", + Path: "/api/providers/models/status", + OperationID: "getProviderModelStatus", + Summary: "List provider model catalog source status across providers", + Tags: []string{"providers"}, + Transports: []Transport{TransportHTTP, TransportUDS}, + Responses: modelCatalogStatusResponses(), + }, + { + Method: "GET", + Path: "/api/providers/{provider_id}/models/status", + OperationID: "getProviderModelStatusByProvider", + Summary: "List provider model catalog source status for one provider", + Tags: []string{"providers"}, + Transports: []Transport{TransportHTTP, TransportUDS}, + Parameters: []ParameterSpec{pathParam("provider_id", "AGH provider id")}, + Responses: modelCatalogStatusResponses(), + }, + } +} + +func modelCatalogListParameters(providerPath bool) []ParameterSpec { + parameters := make([]ParameterSpec, 0, 5) + if providerPath { + parameters = append(parameters, pathParam("provider_id", "AGH provider id")) + } else { + parameters = append(parameters, queryParam("provider_id", "Filter by AGH provider id", false)) + } + parameters = append( + parameters, + queryParam("source_id", "Filter by catalog source id", false), + boolQueryParam("refresh", "Refresh sources before listing models"), + boolQueryParam("include_stale", "Include stale source rows in the merged projection"), + ) + return 
parameters +} + +func modelCatalogListResponses() []ResponseSpec { + return []ResponseSpec{ + {Status: 200, Description: "OK", Body: contract.ProviderModelListResponse{}}, + {Status: 400, Description: "Invalid model catalog filter", Body: contract.ErrorPayload{}}, + {Status: 403, Description: "Forbidden", Body: contract.ErrorPayload{}}, + {Status: 503, Description: "Model catalog unavailable", Body: contract.ErrorPayload{}}, + {Status: 500, Description: "Internal server error", Body: contract.ErrorPayload{}}, + } +} + +func modelCatalogRefreshResponses() []ResponseSpec { + return []ResponseSpec{ + {Status: 200, Description: "OK", Body: contract.ProviderModelRefreshResponse{}}, + {Status: 400, Description: "Invalid model catalog refresh request", Body: contract.ErrorPayload{}}, + {Status: 403, Description: "Forbidden", Body: contract.ErrorPayload{}}, + {Status: 503, Description: "Model catalog refresh unavailable", Body: contract.ProviderModelRefreshResponse{}}, + {Status: 500, Description: "Internal server error", Body: contract.ErrorPayload{}}, + } +} + +func modelCatalogStatusResponses() []ResponseSpec { + return []ResponseSpec{ + {Status: 200, Description: "OK", Body: contract.ProviderModelStatusResponse{}}, + {Status: 400, Description: "Invalid model catalog filter", Body: contract.ErrorPayload{}}, + {Status: 403, Description: "Forbidden", Body: contract.ErrorPayload{}}, + {Status: 503, Description: "Model catalog unavailable", Body: contract.ErrorPayload{}}, + {Status: 500, Description: "Internal server error", Body: contract.ErrorPayload{}}, + } +} diff --git a/internal/api/spec/spec.go b/internal/api/spec/spec.go index ac80fc936..a1f9030d4 100644 --- a/internal/api/spec/spec.go +++ b/internal/api/spec/spec.go @@ -199,6 +199,8 @@ func Document() (*openapi3.T, error) { {Name: "hooks"}, {Name: "memory"}, {Name: "observe"}, + {Name: "openai"}, + {Name: "providers"}, {Name: "resources"}, {Name: "sessions"}, {Name: "settings"}, @@ -4446,6 +4448,7 @@ var 
operationRegistry = []OperationSpec{ func Operations() []OperationSpec { ops := append([]OperationSpec(nil), operationRegistry...) ops = append(ops, authoredContextOperations()...) + ops = append(ops, modelCatalogOperations()...) sort.SliceStable(ops, func(i, j int) bool { if ops[i].Path == ops[j].Path { return ops[i].Method < ops[j].Method diff --git a/internal/api/spec/spec_test.go b/internal/api/spec/spec_test.go index 12dc88d17..575807f5c 100644 --- a/internal/api/spec/spec_test.go +++ b/internal/api/spec/spec_test.go @@ -86,6 +86,40 @@ func TestDocumentTracksRequiredFieldsAndEnums(t *testing.T) { ) }, }, + { + name: "ShouldDescribeProviderModelCatalogAndOpenAIProjection", + check: func(t *testing.T, doc *openapi3.T) { + t.Helper() + + listModels := operationFor(t, doc, "/api/providers/{provider_id}/models", "GET") + assertTagsContain(t, listModels, "providers") + assertParameter(t, listModels, "provider_id", openapi3.ParameterInPath, true) + assertParameter(t, listModels, "source_id", openapi3.ParameterInQuery, false) + assertParameter(t, listModels, "refresh", openapi3.ParameterInQuery, false) + assertParameter(t, listModels, "include_stale", openapi3.ParameterInQuery, false) + listSchema := jsonResponseSchema(t, listModels, 200) + assertRequired(t, listSchema, "models") + + refresh := operationFor(t, doc, "/api/providers/{provider_id}/models/refresh", "POST") + if refresh.RequestBody == nil || refresh.RequestBody.Value == nil || + refresh.RequestBody.Value.Required { + t.Fatalf("refresh request body required = %#v, want optional body", refresh.RequestBody) + } + assertResponseStatus(t, refresh, 503) + + status := operationFor(t, doc, "/api/providers/models/status", "GET") + statusSchema := jsonResponseSchema(t, status, 200) + assertRequired(t, statusSchema, "sources") + + openAI := operationFor(t, doc, "/api/openai/v1/models", "GET") + assertTagsContain(t, openAI, "openai") + assertParameter(t, openAI, "provider_id", openapi3.ParameterInQuery, false) + 
openAISchema := jsonResponseSchema(t, openAI, 200) + assertRequired(t, openAISchema, "object", "data") + assertResponseStatus(t, openAI, 403) + assertResponseStatus(t, openAI, 503) + }, + }, { name: "ShouldDescribeApproveSessionRequiredFields", check: func(t *testing.T, doc *openapi3.T) { diff --git a/internal/api/testutil/model_catalog_parity_test.go b/internal/api/testutil/model_catalog_parity_test.go new file mode 100644 index 000000000..24d117e17 --- /dev/null +++ b/internal/api/testutil/model_catalog_parity_test.go @@ -0,0 +1,207 @@ +package testutil_test + +import ( + "context" + "encoding/json" + "io" + "log/slog" + "net/http" + "net/http/httptest" + "os" + "testing" + "time" + + "github.com/gin-gonic/gin" + "github.com/pedronauck/agh/internal/api/contract" + "github.com/pedronauck/agh/internal/api/httpapi" + "github.com/pedronauck/agh/internal/api/testutil" + "github.com/pedronauck/agh/internal/api/udsapi" + "github.com/pedronauck/agh/internal/cli" + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/modelcatalog" +) + +func TestModelCatalogTransportParity(t *testing.T) { + t.Parallel() + + t.Run("Should return canonical native list JSON bytes for the same catalog state", func(t *testing.T) { + t.Parallel() + + service := &parityModelCatalogService{ + models: []modelcatalog.Model{parityCatalogModel("codex", "gpt-5.4")}, + } + httpEngine := newParityHTTPRouter(t, service) + udsEngine := newParityUDSRouter(t, service) + + httpResp := performParityRequest(t, httpEngine, http.MethodGet, "/api/providers/codex/models") + udsResp := performParityRequest(t, udsEngine, http.MethodGet, "/api/providers/codex/models") + if httpResp.Code != http.StatusOK || udsResp.Code != http.StatusOK { + t.Fatalf( + "statuses = http:%d uds:%d, want 200; http=%s uds=%s", + httpResp.Code, + udsResp.Code, + httpResp.Body.String(), + udsResp.Body.String(), + ) + } + if got, want := httpResp.Body.String(), udsResp.Body.String(); got != want { + 
t.Fatalf("HTTP body = %s, want UDS body %s", got, want) + } + var cliRecord cli.ProviderModelListRecord + if err := json.Unmarshal(httpResp.Body.Bytes(), &cliRecord); err != nil { + t.Fatalf("json.Unmarshal(HTTP body as CLI record) error = %v", err) + } + cliJSON, err := json.Marshal(cliRecord) + if err != nil { + t.Fatalf("json.Marshal(CLI record) error = %v", err) + } + if got, want := string(cliJSON), httpResp.Body.String(); got != want { + t.Fatalf("CLI JSON = %s, want canonical native body %s", got, want) + } + + openAIResp := performParityRequest( + t, + httpEngine, + http.MethodGet, + "/api/openai/v1/models?provider_id=codex", + ) + if openAIResp.Code != http.StatusOK { + t.Fatalf("OpenAI status = %d, want 200; body=%s", openAIResp.Code, openAIResp.Body.String()) + } + var openAIPayload contract.OpenAIModelListResponse + if err := json.Unmarshal(openAIResp.Body.Bytes(), &openAIPayload); err != nil { + t.Fatalf("json.Unmarshal(OpenAI body) error = %v", err) + } + if len(openAIPayload.Data) != 1 { + t.Fatalf("OpenAI data = %#v, want one model", openAIPayload.Data) + } + openAIModel := openAIPayload.Data[0] + nativeModel := cliRecord.Models[0] + if openAIModel.ID != nativeModel.ModelID || + openAIModel.OwnedBy != nativeModel.ProviderID || + openAIModel.AGH.ProviderID != nativeModel.ProviderID || + openAIModel.AGH.ModelID != nativeModel.ModelID { + t.Fatalf("OpenAI model = %#v, want native catalog identity %#v", openAIModel, nativeModel) + } + }) +} + +func newParityHTTPRouter(t *testing.T, service *parityModelCatalogService) http.Handler { + t.Helper() + + gin.SetMode(gin.TestMode) + engine := gin.New() + homePaths := newShortParityHomePaths(t) + cfg := testutil.ConfigWithDisabledNetwork(homePaths) + cfg.HTTP.Host = "127.0.0.1" + cfg.HTTP.Port = 2123 + if _, err := httpapi.New( + httpapi.WithEngine(engine), + httpapi.WithHomePaths(homePaths), + httpapi.WithConfig(&cfg), + httpapi.WithHost(cfg.HTTP.Host), + httpapi.WithPort(cfg.HTTP.Port), + 
httpapi.WithLogger(slog.New(slog.NewTextHandler(io.Discard, nil))), + httpapi.WithStartedAt(time.Date(2026, 4, 3, 12, 0, 0, 0, time.UTC)), + httpapi.WithNow(func() time.Time { return time.Date(2026, 4, 3, 12, 0, 1, 0, time.UTC) }), + httpapi.WithSessionManager(testutil.StubSessionManager{}), + httpapi.WithTaskService(testutil.StubTaskManager{}), + httpapi.WithObserver(testutil.StubObserver{}), + httpapi.WithWorkspaceResolver(testutil.StubWorkspaceService{}), + httpapi.WithModelCatalogService(service), + ); err != nil { + t.Fatalf("httpapi.New() error = %v", err) + } + return engine +} + +func newParityUDSRouter(t *testing.T, service *parityModelCatalogService) http.Handler { + t.Helper() + + gin.SetMode(gin.TestMode) + engine := gin.New() + homePaths := newShortParityHomePaths(t) + cfg := testutil.ConfigWithDisabledNetwork(homePaths) + if _, err := udsapi.New( + udsapi.WithEngine(engine), + udsapi.WithHomePaths(homePaths), + udsapi.WithConfig(&cfg), + udsapi.WithLogger(slog.New(slog.NewTextHandler(io.Discard, nil))), + udsapi.WithStartedAt(time.Date(2026, 4, 3, 12, 0, 0, 0, time.UTC)), + udsapi.WithNow(func() time.Time { return time.Date(2026, 4, 3, 12, 0, 1, 0, time.UTC) }), + udsapi.WithSessionManager(testutil.StubSessionManager{}), + udsapi.WithTaskService(testutil.StubTaskManager{}), + udsapi.WithObserver(testutil.StubObserver{}), + udsapi.WithWorkspaceResolver(testutil.StubWorkspaceService{}), + udsapi.WithModelCatalogService(service), + ); err != nil { + t.Fatalf("udsapi.New() error = %v", err) + } + return engine +} + +func performParityRequest(t *testing.T, handler http.Handler, method string, path string) *httptest.ResponseRecorder { + t.Helper() + + request := httptest.NewRequestWithContext(context.Background(), method, path, http.NoBody) + recorder := httptest.NewRecorder() + handler.ServeHTTP(recorder, request) + return recorder +} + +func newShortParityHomePaths(t *testing.T) aghconfig.HomePaths { + t.Helper() + + root, err := os.MkdirTemp("/tmp", 
"agh-model-parity-*") + if err != nil { + t.Fatalf("MkdirTemp() error = %v", err) + } + t.Cleanup(func() { + if err := os.RemoveAll(root); err != nil { + t.Errorf("RemoveAll(%q) error = %v", root, err) + } + }) + homePaths, err := aghconfig.ResolveHomePathsFrom(root) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + return homePaths +} + +type parityModelCatalogService struct { + models []modelcatalog.Model +} + +func (s *parityModelCatalogService) ListModels( + context.Context, + modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + return append([]modelcatalog.Model(nil), s.models...), nil +} + +func (*parityModelCatalogService) Refresh( + context.Context, + modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +func (*parityModelCatalogService) ListSourceStatus( + context.Context, + string, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +func parityCatalogModel(providerID string, modelID string) modelcatalog.Model { + available := true + return modelcatalog.Model{ + ProviderID: providerID, + ModelID: modelID, + Available: &available, + AvailabilityState: string(modelcatalog.AvailabilityStateAvailableLive), + Sources: []modelcatalog.SourceRef{ + {SourceID: modelcatalog.SourceIDConfig, SourceKind: modelcatalog.SourceKindConfig}, + }, + } +} diff --git a/internal/api/udsapi/handlers_test.go b/internal/api/udsapi/handlers_test.go index b31110303..b5c29cc90 100644 --- a/internal/api/udsapi/handlers_test.go +++ b/internal/api/udsapi/handlers_test.go @@ -8,6 +8,7 @@ import ( "net/http/httptest" "os" "path/filepath" + "reflect" "slices" "sort" "strings" @@ -184,6 +185,7 @@ func TestRegisterRoutesCoversTechSpecEndpoints(t *testing.T) { "GET /api/observe/health", "GET /api/observe/tasks/dashboard", "GET /api/observe/tasks/inbox", + "GET /api/providers/*catalog_path", "GET /api/resources", "GET /api/resources/:kind", "GET /api/resources/:kind/:id", @@ -299,6 +301,7 @@ func 
TestRegisterRoutesCoversTechSpecEndpoints(t *testing.T) { "POST /api/memory/sessions/prune", "POST /api/memory/sessions/repair", "POST /api/memory/sessions/:session_id/replay", + "POST /api/providers/*catalog_path", "POST /api/network/channels", "POST /api/network/channels/:channel/directs/resolve", "POST /api/network/send", @@ -1096,7 +1099,7 @@ func TestGetWorkspaceHandlerReturnsDetail(t *testing.T) { t.Fatalf("len(providers) = %d, want %d (%#v)", len(response.Providers), len(expectedProviders), response) } for i, want := range expectedProviders { - if got := response.Providers[i]; got != want { + if got := response.Providers[i]; !reflect.DeepEqual(got, want) { t.Fatalf("providers[%d] = %#v, want %#v", i, got, want) } } diff --git a/internal/api/udsapi/model_catalog_test.go b/internal/api/udsapi/model_catalog_test.go new file mode 100644 index 000000000..96f08a1ef --- /dev/null +++ b/internal/api/udsapi/model_catalog_test.go @@ -0,0 +1,141 @@ +package udsapi + +import ( + "context" + "net/http" + "testing" + + "github.com/pedronauck/agh/internal/api/contract" + "github.com/pedronauck/agh/internal/modelcatalog" +) + +func TestUDSHandlersModelCatalogDependency(t *testing.T) { + t.Parallel() + + t.Run("ShouldPassModelCatalogServiceToBaseHandlers", func(t *testing.T) { + t.Parallel() + + service := udsModelCatalogServiceStub{} + handlers := newHandlers(&handlerConfig{modelCatalog: service}) + if handlers.BaseHandlers == nil { + t.Fatal("newHandlers() BaseHandlers = nil") + } + if handlers.ModelCatalog == nil { + t.Fatal("newHandlers() ModelCatalog = nil, want injected service") + } + if handlers.ModelCatalog != service { + t.Fatalf("newHandlers() ModelCatalog = %#v, want %#v", handlers.ModelCatalog, service) + } + }) +} + +func TestUDSModelCatalogRoutes(t *testing.T) { + t.Parallel() + + t.Run("Should expose native provider model list route", func(t *testing.T) { + t.Parallel() + + service := &udsModelCatalogServiceSpy{ + listModelsFn: func(_ context.Context, opts 
modelcatalog.ListOptions) ([]modelcatalog.Model, error) { + if got, want := opts.ProviderID, "codex"; got != want { + t.Fatalf("ProviderID = %q, want %q", got, want) + } + return []modelcatalog.Model{udsSeedCatalogModel("codex", "gpt-5.4")}, nil + }, + } + engine := newTestRouter(t, newHandlers(&handlerConfig{modelCatalog: service})) + + recorder := performRequest(t, engine, http.MethodGet, "/api/providers/codex/models", nil) + if recorder.Code != http.StatusOK { + t.Fatalf("status = %d, want 200; body=%s", recorder.Code, recorder.Body.String()) + } + var payload contract.ProviderModelListResponse + decodeJSONResponse(t, recorder, &payload) + if len(payload.Models) != 1 || payload.Models[0].ProviderID != "codex" { + t.Fatalf("payload = %#v, want codex model", payload) + } + }) + + t.Run("Should not register OpenAI model projection", func(t *testing.T) { + t.Parallel() + + engine := newTestRouter(t, newHandlers(&handlerConfig{modelCatalog: &udsModelCatalogServiceSpy{}})) + + recorder := performRequest(t, engine, http.MethodGet, "/api/openai/v1/models", nil) + if recorder.Code != http.StatusNotFound { + t.Fatalf("status = %d, want 404; body=%s", recorder.Code, recorder.Body.String()) + } + }) +} + +type udsModelCatalogServiceStub struct{} + +func (udsModelCatalogServiceStub) ListModels( + context.Context, + modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + return nil, nil +} + +func (udsModelCatalogServiceStub) Refresh( + context.Context, + modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +func (udsModelCatalogServiceStub) ListSourceStatus( + context.Context, + string, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +type udsModelCatalogServiceSpy struct { + listModelsFn func(context.Context, modelcatalog.ListOptions) ([]modelcatalog.Model, error) + refreshFn func(context.Context, modelcatalog.RefreshOptions) ([]modelcatalog.SourceStatus, error) + listSourceStatusFn func(context.Context, 
string) ([]modelcatalog.SourceStatus, error) +} + +func (s *udsModelCatalogServiceSpy) ListModels( + ctx context.Context, + opts modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + if s.listModelsFn != nil { + return s.listModelsFn(ctx, opts) + } + return nil, nil +} + +func (s *udsModelCatalogServiceSpy) Refresh( + ctx context.Context, + opts modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + if s.refreshFn != nil { + return s.refreshFn(ctx, opts) + } + return nil, nil +} + +func (s *udsModelCatalogServiceSpy) ListSourceStatus( + ctx context.Context, + providerID string, +) ([]modelcatalog.SourceStatus, error) { + if s.listSourceStatusFn != nil { + return s.listSourceStatusFn(ctx, providerID) + } + return nil, nil +} + +func udsSeedCatalogModel(providerID string, modelID string) modelcatalog.Model { + available := true + return modelcatalog.Model{ + ProviderID: providerID, + ModelID: modelID, + Available: &available, + AvailabilityState: string(modelcatalog.AvailabilityStateAvailableLive), + Sources: []modelcatalog.SourceRef{ + {SourceID: modelcatalog.SourceIDConfig, SourceKind: modelcatalog.SourceKindConfig}, + }, + } +} diff --git a/internal/api/udsapi/routes.go b/internal/api/udsapi/routes.go index f2c2a48a7..6d2c02f6e 100644 --- a/internal/api/udsapi/routes.go +++ b/internal/api/udsapi/routes.go @@ -26,6 +26,7 @@ func RegisterRoutes(router gin.IRouter, handlers *Handlers) { registerExtensionRoutes(api, handlers) registerSettingsRoutes(api, handlers) registerVaultRoutes(api, handlers) + registerProviderModelRoutes(api, handlers) registerHostedMCPRoutes(api, handlers) } @@ -437,3 +438,8 @@ func registerVaultRoutes(api gin.IRouter, handlers *Handlers) { vaultGroup.DELETE("/secrets", handlers.DeleteVaultSecret) } } + +func registerProviderModelRoutes(api gin.IRouter, handlers *Handlers) { + api.GET("/providers/*catalog_path", handlers.ProviderModelCatalog) + api.POST("/providers/*catalog_path", handlers.ProviderModelCatalog) +} 
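Reviewer note on the `routes.go` hunk above: the UDS transport registers a single catch-all `/providers/*catalog_path` for both GET and POST, so `handlers.ProviderModelCatalog` has to split the wildcard value itself. That handler is not part of this diff, so the following is only a minimal sketch of one way such a split could work. It assumes gin's behaviour of passing the catch-all value with a leading slash (for example `/codex/models/refresh`); `catalogRoute`, `parseCatalogPath`, and `isCatalogAction` are hypothetical names, not identifiers from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// catalogRoute is a hypothetical parsed form of the wildcard value.
// An empty ProviderID means "across all providers"; an empty Action means "list".
type catalogRoute struct {
	ProviderID string
	Action     string
}

// isCatalogAction reports whether s is one of the sub-actions named in the spec.
func isCatalogAction(s string) bool {
	return s == "refresh" || s == "status"
}

// parseCatalogPath maps a catch-all value such as "/codex/models/refresh"
// onto the six path shapes described by the spec hunk in this diff.
// It returns false for anything else so the caller can answer 404.
func parseCatalogPath(raw string) (catalogRoute, bool) {
	segments := strings.Split(strings.Trim(raw, "/"), "/")
	switch {
	case len(segments) == 1 && segments[0] == "models":
		return catalogRoute{}, true
	case len(segments) == 2 && segments[0] == "models" && isCatalogAction(segments[1]):
		return catalogRoute{Action: segments[1]}, true
	case len(segments) == 2 && segments[0] != "" && segments[1] == "models":
		return catalogRoute{ProviderID: segments[0]}, true
	case len(segments) == 3 && segments[0] != "" && segments[1] == "models" &&
		isCatalogAction(segments[2]):
		return catalogRoute{ProviderID: segments[0], Action: segments[2]}, true
	}
	return catalogRoute{}, false
}

func main() {
	for _, raw := range []string{
		"/models",
		"/models/status",
		"/codex/models",
		"/codex/models/refresh",
		"/codex/auth",
	} {
		route, ok := parseCatalogPath(raw)
		fmt.Printf("%q -> %+v ok=%v\n", raw, route, ok)
	}
}
```

One consequence of a catch-all is that unrelated paths under `/api/providers/` also reach the handler, so a parser like this has to reject them explicitly instead of relying on the router to return 404.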
diff --git a/internal/api/udsapi/server.go b/internal/api/udsapi/server.go index 349a669af..580c0b9af 100644 --- a/internal/api/udsapi/server.go +++ b/internal/api/udsapi/server.go @@ -79,6 +79,7 @@ type Server struct { vault core.VaultService workspaces core.WorkspaceService agentCatalog core.AgentCatalog + modelCatalog core.ModelCatalogService agentContext core.AgentContextService soulAuthoring core.SoulAuthoringService soulRefresher core.SoulRefresher @@ -127,6 +128,7 @@ type handlerConfig struct { vault core.VaultService workspaces core.WorkspaceService agentCatalog core.AgentCatalog + modelCatalog core.ModelCatalogService agentContext core.AgentContextService soulAuthoring core.SoulAuthoringService soulRefresher core.SoulRefresher @@ -356,6 +358,13 @@ func WithAgentCatalog(catalog core.AgentCatalog) Option { } } +// WithModelCatalogService injects the daemon-owned provider model catalog service. +func WithModelCatalogService(service core.ModelCatalogService) Option { + return func(server *Server) { + server.modelCatalog = service + } +} + // WithAgentContext injects the bounded agent situation context service. 
func WithAgentContext(service core.AgentContextService) Option { return func(server *Server) { @@ -610,6 +619,7 @@ func (s *Server) handlerConfig() *handlerConfig { vault: s.vault, workspaces: s.workspaces, agentCatalog: s.agentCatalog, + modelCatalog: s.modelCatalog, agentContext: s.agentContext, soulAuthoring: s.soulAuthoring, soulRefresher: s.soulRefresher, @@ -859,6 +869,7 @@ func newHandlers(cfg *handlerConfig) *Handlers { Vault: cfg.vault, Workspaces: cfg.workspaces, AgentCatalog: cfg.agentCatalog, + ModelCatalog: cfg.modelCatalog, AgentContextService: cfg.agentContext, SoulAuthoring: cfg.soulAuthoring, SoulRefresher: cfg.soulRefresher, diff --git a/internal/api/udsapi/transport_parity_integration_test.go b/internal/api/udsapi/transport_parity_integration_test.go index 6a1e23d5e..1b174cdfc 100644 --- a/internal/api/udsapi/transport_parity_integration_test.go +++ b/internal/api/udsapi/transport_parity_integration_test.go @@ -1068,7 +1068,8 @@ func writeTransportProviderOverrideConfig( builder.WriteString(`command = "`) builder.WriteString(escapeTransportConfigString(command)) builder.WriteString("\"\n") - builder.WriteString(`default_model = "transport-override-model"` + "\n") + builder.WriteString("[providers." + providerName + ".models]\n") + builder.WriteString(`default = "transport-override-model"` + "\n") builder.WriteString("[[providers." 
+ providerName + ".credential_slots]]\n") builder.WriteString(`name = "api_key"` + "\n") builder.WriteString(`target_env = "TRANSPORT_OVERRIDE_API_KEY"` + "\n") diff --git a/internal/cli/client.go b/internal/cli/client.go index 875102f23..f4deb651b 100644 --- a/internal/cli/client.go +++ b/internal/cli/client.go @@ -42,6 +42,12 @@ type DaemonClient interface { GetSettingsRestartStatus(ctx context.Context, operationID string) (SettingsRestartStatusRecord, error) GetSettingsUpdate(ctx context.Context) (SettingsUpdateRecord, error) UpdateSettingsSkills(ctx context.Context, request UpdateSettingsSkillsRequest) (SettingsMutationRecord, error) + ListProviderModels(ctx context.Context, query ProviderModelListQuery) (ProviderModelListRecord, error) + RefreshProviderModels(ctx context.Context, providerID string, request ProviderModelRefreshRequest) ( + ProviderModelRefreshRecord, + error, + ) + ProviderModelStatus(ctx context.Context, providerID string) (ProviderModelStatusRecord, error) ListVaultSecrets(ctx context.Context, query VaultListQuery) ([]VaultRecord, error) GetVaultSecret(ctx context.Context, ref string) (VaultRecord, error) PutVaultSecret(ctx context.Context, request PutVaultSecretRequest) (VaultRecord, error) diff --git a/internal/cli/client_provider_models.go b/internal/cli/client_provider_models.go new file mode 100644 index 000000000..f7e9409cc --- /dev/null +++ b/internal/cli/client_provider_models.go @@ -0,0 +1,111 @@ +package cli + +import ( + "context" + "net/http" + "net/url" + "strings" + + "github.com/pedronauck/agh/internal/api/contract" +) + +// ProviderModelListQuery captures provider model catalog list filters. +type ProviderModelListQuery struct { + ProviderID string + SourceID string + Refresh bool + IncludeStale bool +} + +// ProviderModelListRecord is the native provider model catalog list response. +type ProviderModelListRecord = contract.ProviderModelListResponse + +// ProviderModelRecord is one native provider model catalog projection. 
+type ProviderModelRecord = contract.ProviderModelPayload + +// ProviderModelRefreshRequest captures one provider model catalog refresh request. +type ProviderModelRefreshRequest = contract.ProviderModelRefreshRequest + +// ProviderModelRefreshRecord is the native provider model catalog refresh response. +type ProviderModelRefreshRecord = contract.ProviderModelRefreshResponse + +// ProviderModelStatusRecord is the native provider model catalog source status response. +type ProviderModelStatusRecord = contract.ProviderModelStatusResponse + +// ProviderModelSourceStatusRecord is one provider-scoped source status row. +type ProviderModelSourceStatusRecord = contract.ModelCatalogSourceStatusPayload + +func (c *unixSocketClient) ListProviderModels( + ctx context.Context, + query ProviderModelListQuery, +) (ProviderModelListRecord, error) { + path := providerModelsPath(query.ProviderID, "") + var response ProviderModelListRecord + values := providerModelListValues(ProviderModelListQuery{ + SourceID: query.SourceID, + Refresh: query.Refresh, + IncludeStale: query.IncludeStale, + }) + if err := c.doJSON(ctx, http.MethodGet, path, values, nil, &response); err != nil { + return ProviderModelListRecord{}, err + } + return response, nil +} + +func (c *unixSocketClient) RefreshProviderModels( + ctx context.Context, + providerID string, + request ProviderModelRefreshRequest, +) (ProviderModelRefreshRecord, error) { + path := providerModelsPath(providerID, "refresh") + var response ProviderModelRefreshRecord + if err := c.doJSON(ctx, http.MethodPost, path, nil, request, &response); err != nil { + return ProviderModelRefreshRecord{}, err + } + return response, nil +} + +func (c *unixSocketClient) ProviderModelStatus( + ctx context.Context, + providerID string, +) (ProviderModelStatusRecord, error) { + path := providerModelsPath(providerID, "status") + var response ProviderModelStatusRecord + if err := c.doJSON(ctx, http.MethodGet, path, nil, nil, &response); err != nil { + return 
ProviderModelStatusRecord{}, err + } + return response, nil +} + +func providerModelListValues(query ProviderModelListQuery) url.Values { + values := url.Values{} + if trimmed := strings.TrimSpace(query.ProviderID); trimmed != "" { + values.Set("provider_id", trimmed) + } + if trimmed := strings.TrimSpace(query.SourceID); trimmed != "" { + values.Set("source_id", trimmed) + } + if query.Refresh { + values.Set("refresh", "true") + } + if query.IncludeStale { + values.Set("include_stale", "true") + } + return values +} + +func providerModelsPath(providerID string, action string) string { + trimmedProvider := strings.TrimSpace(providerID) + trimmedAction := strings.Trim(strings.TrimSpace(action), "/") + if trimmedProvider == "" { + if trimmedAction == "" { + return "/api/providers/models" + } + return "/api/providers/models/" + url.PathEscape(trimmedAction) + } + path := "/api/providers/" + url.PathEscape(trimmedProvider) + "/models" + if trimmedAction != "" { + path += "/" + url.PathEscape(trimmedAction) + } + return path +} diff --git a/internal/cli/config.go b/internal/cli/config.go index 4aa0f8828..c987f85c6 100644 --- a/internal/cli/config.go +++ b/internal/cli/config.go @@ -25,6 +25,9 @@ const ( configEnvKey = "env" configSecretEnvKey = "secret_env" configProvidersKey = "providers" + configModelsKey = "models" + configDiscoveryKey = "discovery" + configDefaultKey = "default" configSessionMCPKey = "session_mcp" ) @@ -251,6 +254,10 @@ var ( "extensions.resources.operator_write_rate_limit.requests": configSetInt, "extensions.resources.operator_write_rate_limit.window": configSetDuration, "extensions.resources.operator_write_rate_limit.queue": configSetInt, + "model_catalog.sources.models_dev.enabled": configSetBool, + "model_catalog.sources.models_dev.endpoint": configSetString, + "model_catalog.sources.models_dev.ttl": configSetDuration, + "model_catalog.sources.models_dev.timeout": configSetDuration, "automation.enabled": configSetBool, "automation.timezone": 
configSetString, "automation.max_concurrent_jobs": configSetInt, @@ -1357,6 +1364,13 @@ func classifyConfigMutationPath(path []string) (configSetValueKind, bool, error) if len(path) == 3 && path[0] == configProvidersKey && path[2] == configSessionMCPKey { return configSetBool, false, nil } + if len(path) == 5 && + path[0] == configProvidersKey && + path[2] == configModelsKey && + path[3] == configDiscoveryKey && + path[4] == "enabled" { + return configSetBool, false, nil + } if isProviderMutationPath(path) { return configSetString, false, nil } @@ -1440,15 +1454,31 @@ func isProviderMutationPath(path []string) bool { if len(path) == 3 && path[0] == configProvidersKey { switch path[2] { case "command", - "default_model", "auth_mode", "env_policy", "home_policy", + "runtime_provider", + "transport", + "base_url", "auth_status_command", "auth_login_command": return true } } + if len(path) == 4 && path[0] == configProvidersKey && path[2] == configModelsKey { + if path[3] == configDefaultKey { + return true + } + } + if len(path) == 5 && + path[0] == configProvidersKey && + path[2] == configModelsKey && + path[3] == configDiscoveryKey { + switch path[4] { + case "command", "endpoint", "timeout": + return true + } + } return false } diff --git a/internal/cli/config_test.go b/internal/cli/config_test.go index 540dcc0ba..b42bc4be3 100644 --- a/internal/cli/config_test.go +++ b/internal/cli/config_test.go @@ -814,6 +814,18 @@ func TestConfigRenderingAndMutationHelpers(t *testing.T) { wantKind: configSetString, wantAllowed: true, }, + { + name: "Should allow provider model default", + path: "providers.codex.models.default", + wantKind: configSetString, + wantAllowed: true, + }, + { + name: "Should allow provider model discovery enabled", + path: "providers.codex.models.discovery.enabled", + wantKind: configSetBool, + wantAllowed: true, + }, { name: "Should redact sandbox env values", path: "sandboxes.dev.env.API_TOKEN", diff --git a/internal/cli/helpers_test.go 
b/internal/cli/helpers_test.go index 9983b5bdd..ced459669 100644 --- a/internal/cli/helpers_test.go +++ b/internal/cli/helpers_test.go @@ -21,6 +21,9 @@ type stubClient struct { getSettingsRestartStatusFn func(context.Context, string) (SettingsRestartStatusRecord, error) getSettingsUpdateFn func(context.Context) (SettingsUpdateRecord, error) updateSettingsSkillsFn func(context.Context, UpdateSettingsSkillsRequest) (SettingsMutationRecord, error) + listProviderModelsFn func(context.Context, ProviderModelListQuery) (ProviderModelListRecord, error) + refreshProviderModelsFn func(context.Context, string, ProviderModelRefreshRequest) (ProviderModelRefreshRecord, error) + providerModelStatusFn func(context.Context, string) (ProviderModelStatusRecord, error) listVaultSecretsFn func(context.Context, VaultListQuery) ([]VaultRecord, error) getVaultSecretFn func(context.Context, string) (VaultRecord, error) putVaultSecretFn func(context.Context, PutVaultSecretRequest) (VaultRecord, error) @@ -279,6 +282,37 @@ func (s *stubClient) UpdateSettingsSkills( return SettingsMutationRecord{}, errors.New("unexpected UpdateSettingsSkills call") } +func (s *stubClient) ListProviderModels( + ctx context.Context, + query ProviderModelListQuery, +) (ProviderModelListRecord, error) { + if s.listProviderModelsFn != nil { + return s.listProviderModelsFn(ctx, query) + } + return ProviderModelListRecord{}, errors.New("unexpected ListProviderModels call") +} + +func (s *stubClient) RefreshProviderModels( + ctx context.Context, + providerID string, + request ProviderModelRefreshRequest, +) (ProviderModelRefreshRecord, error) { + if s.refreshProviderModelsFn != nil { + return s.refreshProviderModelsFn(ctx, providerID, request) + } + return ProviderModelRefreshRecord{}, errors.New("unexpected RefreshProviderModels call") +} + +func (s *stubClient) ProviderModelStatus( + ctx context.Context, + providerID string, +) (ProviderModelStatusRecord, error) { + if s.providerModelStatusFn != nil { + return 
s.providerModelStatusFn(ctx, providerID) + } + return ProviderModelStatusRecord{}, errors.New("unexpected ProviderModelStatus call") +} + func (s *stubClient) ListVaultSecrets( ctx context.Context, query VaultListQuery, diff --git a/internal/cli/install.go b/internal/cli/install.go index a7f2dd3a9..65712ca8a 100644 --- a/internal/cli/install.go +++ b/internal/cli/install.go @@ -104,7 +104,7 @@ func newInstallCommand(deps commandDeps) *cobra.Command { record := installRecord{ AgentName: aghconfig.DefaultAgentName, Provider: cfg.Defaults.Provider, - Model: cfg.Providers[cfg.Defaults.Provider].DefaultModel, + Model: cfg.Providers[cfg.Defaults.Provider].Models.Default, Permissions: string(cfg.Permissions.Mode), ConfigFile: homePaths.ConfigFile, AgentFile: agentPath, @@ -206,12 +206,12 @@ func buildInstallWizardInput(cfg *aghconfig.Config) installWizardInput { for _, provider := range providers { resolved, err := cfg.ResolveProvider(provider) if err == nil { - suggestedModels[provider] = strings.TrimSpace(resolved.DefaultModel) + suggestedModels[provider] = strings.TrimSpace(resolved.Models.Default) modelRequired[provider] = installProviderRequiresModel(resolved) continue } configured := cfg.Providers[provider] - suggestedModels[provider] = strings.TrimSpace(configured.DefaultModel) + suggestedModels[provider] = strings.TrimSpace(configured.Models.Default) modelRequired[provider] = installProviderRequiresModel(configured) } diff --git a/internal/cli/install_test.go b/internal/cli/install_test.go index 1dc2bf528..f82e1dc05 100644 --- a/internal/cli/install_test.go +++ b/internal/cli/install_test.go @@ -175,8 +175,8 @@ func TestInstallCommandMachineOutput(t *testing.T) { if cfg.Defaults.Provider != "blackbox" { t.Fatalf("cfg.Defaults.Provider = %q, want blackbox", cfg.Defaults.Provider) } - if got := cfg.Providers["blackbox"].DefaultModel; got != "" { - t.Fatalf("cfg.Providers[blackbox].DefaultModel = %q, want empty", got) + if got := 
cfg.Providers["blackbox"].Models.Default; got != "" { + t.Fatalf("cfg.Providers[blackbox].Models.Default = %q, want empty", got) } }) @@ -291,7 +291,11 @@ func TestBuildInstallWizardInputAndBundleFormats(t *testing.T) { cfg := aghconfig.DefaultWithHome(aghconfig.HomePaths{}) cfg.Defaults.Provider = "codex" - cfg.Providers["custom"] = aghconfig.ProviderConfig{DefaultModel: "custom-model"} + cfg.Providers["custom"] = aghconfig.ProviderConfig{ + Models: aghconfig.ProviderModelsConfig{ + Default: "custom-model", + }, + } input := buildInstallWizardInput(&cfg) if len(input.Providers) == 0 { diff --git a/internal/cli/provider.go b/internal/cli/provider.go index b15aba044..125b48291 100644 --- a/internal/cli/provider.go +++ b/internal/cli/provider.go @@ -82,6 +82,7 @@ func newProviderCommand(deps commandDeps) *cobra.Command { Short: "Inspect and manage provider authentication", } cmd.AddCommand(newProviderAuthCommand(deps)) + cmd.AddCommand(newProviderModelsCommand(deps)) return cmd } diff --git a/internal/cli/provider_models.go b/internal/cli/provider_models.go new file mode 100644 index 000000000..32d6a9e84 --- /dev/null +++ b/internal/cli/provider_models.go @@ -0,0 +1,223 @@ +package cli + +import ( + "fmt" + "strings" + + "github.com/pedronauck/agh/internal/api/contract" + "github.com/spf13/cobra" +) + +const providerModelAvailabilityUnknown = "unknown" + +func newProviderModelsCommand(deps commandDeps) *cobra.Command { + cmd := &cobra.Command{ + Use: "models", + Short: "Inspect and refresh the provider model catalog", + } + cmd.AddCommand(newProviderModelsListCommand(deps)) + cmd.AddCommand(newProviderModelsRefreshCommand(deps)) + cmd.AddCommand(newProviderModelsStatusCommand(deps)) + return cmd +} + +func newProviderModelsListCommand(deps commandDeps) *cobra.Command { + var sourceID string + var refresh bool + var includeStale bool + cmd := &cobra.Command{ + Use: "list [provider]", + Short: "List provider model catalog entries", + Args: optionalProviderArg(), + 
RunE: func(cmd *cobra.Command, args []string) error { + providerID := providerArgValue(args) + client, err := clientFromDeps(deps) + if err != nil { + return err + } + record, err := client.ListProviderModels(cmd.Context(), ProviderModelListQuery{ + ProviderID: providerID, + SourceID: sourceID, + Refresh: refresh, + IncludeStale: includeStale, + }) + if err != nil { + return err + } + return writeCommandOutput(cmd, providerModelListBundle(record)) + }, + } + cmd.Flags().StringVar(&sourceID, "source", "", "Filter by catalog source id") + cmd.Flags().BoolVar(&refresh, "refresh", false, "Refresh sources before listing models") + cmd.Flags().BoolVar(&includeStale, "include-stale", false, "Include stale source rows") + return cmd +} + +func newProviderModelsRefreshCommand(deps commandDeps) *cobra.Command { + var sourceID string + var force bool + var requestID string + cmd := &cobra.Command{ + Use: "refresh [provider]", + Short: "Refresh provider model catalog sources", + Args: optionalProviderArg(), + RunE: func(cmd *cobra.Command, args []string) error { + client, err := clientFromDeps(deps) + if err != nil { + return err + } + record, err := client.RefreshProviderModels( + cmd.Context(), + providerArgValue(args), + ProviderModelRefreshRequest{ + SourceID: sourceID, + Force: force, + RequestID: requestID, + }, + ) + if err != nil { + return err + } + return writeCommandOutput(cmd, providerModelRefreshBundle(record)) + }, + } + cmd.Flags().StringVar(&sourceID, "source", "", "Refresh only one catalog source id") + cmd.Flags().BoolVar(&force, "force", false, "Force refresh even when cached status is fresh") + cmd.Flags().StringVar(&requestID, "request-id", "", "Refresh request id for daemon logs") + return cmd +} + +func newProviderModelsStatusCommand(deps commandDeps) *cobra.Command { + cmd := &cobra.Command{ + Use: "status [provider]", + Short: "Show provider model catalog source status", + Args: optionalProviderArg(), + RunE: func(cmd *cobra.Command, args []string) 
error { + client, err := clientFromDeps(deps) + if err != nil { + return err + } + record, err := client.ProviderModelStatus(cmd.Context(), providerArgValue(args)) + if err != nil { + return err + } + return writeCommandOutput(cmd, providerModelStatusBundle(record)) + }, + } + return cmd +} + +func providerModelListBundle(record ProviderModelListRecord) outputBundle { + return listBundle( + record, + record.Models, + "Provider Models", + []string{"Provider", "Model", "Available", "State", "Stale", "Sources", "Refreshed"}, + "provider_models", + []string{"provider_id", "model_id", "available", "availability_state", "stale", "sources", "refreshed_at"}, + func(model ProviderModelRecord) []string { + return []string{ + model.ProviderID, + model.ModelID, + providerModelNullableBoolString(model.Available), + model.AvailabilityState, + providerModelBoolString(model.Stale), + providerModelSourceSummary(model.Sources), + model.RefreshedAt, + } + }, + func(model ProviderModelRecord) []string { + return []string{ + model.ProviderID, + model.ModelID, + providerModelNullableBoolString(model.Available), + model.AvailabilityState, + providerModelBoolString(model.Stale), + providerModelSourceSummary(model.Sources), + model.RefreshedAt, + } + }, + ) +} + +func providerModelRefreshBundle(record ProviderModelRefreshRecord) outputBundle { + return providerModelSourceStatusBundle("Provider Model Refresh", "provider_model_refresh", record, record.Sources) +} + +func providerModelStatusBundle(record ProviderModelStatusRecord) outputBundle { + return providerModelSourceStatusBundle("Provider Model Status", "provider_model_status", record, record.Sources) +} + +func providerModelSourceStatusBundle( + humanTitle string, + toonName string, + jsonValue any, + sources []ProviderModelSourceStatusRecord, +) outputBundle { + return listBundle( + jsonValue, + sources, + humanTitle, + []string{"Provider", "Source", "Kind", "State", "Rows", "Stale", "Error"}, + toonName, + []string{"provider_id", 
"source_id", "source_kind", "refresh_state", "row_count", "stale", "last_error"}, + providerModelSourceStatusRow, + providerModelSourceStatusRow, + ) +} + +func providerModelSourceStatusRow(source ProviderModelSourceStatusRecord) []string { + return []string{ + source.ProviderID, + source.SourceID, + source.SourceKind, + source.RefreshState, + fmt.Sprintf("%d", source.RowCount), + providerModelBoolString(source.Stale), + source.LastError, + } +} + +func optionalProviderArg() cobra.PositionalArgs { + return func(_ *cobra.Command, args []string) error { + if len(args) > 1 { + return fmt.Errorf("accepts at most 1 arg(s), received %d", len(args)) + } + if len(args) == 1 && strings.TrimSpace(args[0]) == "" { + return fmt.Errorf("provider id cannot be blank") + } + return nil + } +} + +func providerArgValue(args []string) string { + if len(args) == 0 { + return "" + } + return strings.TrimSpace(args[0]) +} + +func providerModelSourceSummary(sources []contract.ModelCatalogSourceRefPayload) string { + if len(sources) == 0 { + return "" + } + values := make([]string, 0, len(sources)) + for _, source := range sources { + values = append(values, source.SourceID) + } + return strings.Join(values, ",") +} + +func providerModelNullableBoolString(value *bool) string { + if value == nil { + return providerModelAvailabilityUnknown + } + return providerModelBoolString(*value) +} + +func providerModelBoolString(value bool) string { + if value { + return toolBoolTrue + } + return toolBoolFalse +} diff --git a/internal/cli/provider_models_test.go b/internal/cli/provider_models_test.go new file mode 100644 index 000000000..289d2558f --- /dev/null +++ b/internal/cli/provider_models_test.go @@ -0,0 +1,188 @@ +package cli + +import ( + "context" + "encoding/json" + "errors" + "strings" + "testing" + + "github.com/pedronauck/agh/internal/api/contract" +) + +func TestProviderModelsCommands(t *testing.T) { + t.Parallel() + + t.Run("Should list structured JSON with model source status fields", 
func(t *testing.T) { + t.Parallel() + + available := true + client := &stubClient{ + listProviderModelsFn: func(_ context.Context, query ProviderModelListQuery) (ProviderModelListRecord, error) { + if got, want := query.ProviderID, "codex"; got != want { + t.Fatalf("ProviderID = %q, want %q", got, want) + } + if got, want := query.SourceID, "config"; got != want { + t.Fatalf("SourceID = %q, want %q", got, want) + } + if !query.Refresh || !query.IncludeStale { + t.Fatalf("query = %#v, want refresh and include-stale", query) + } + return ProviderModelListRecord{ + Models: []ProviderModelRecord{ + { + ProviderID: "codex", + ModelID: "gpt-5.4", + Available: &available, + AvailabilityState: "available_live", + Stale: true, + Sources: []contract.ModelCatalogSourceRefPayload{ + {SourceID: "config", SourceKind: "config", Stale: true}, + }, + }, + }, + }, nil + }, + } + + stdout, _, err := executeRootCommand( + t, + newTestDeps(t, client), + "provider", + "models", + "list", + "codex", + "--source", + "config", + "--refresh", + "--include-stale", + "-o", + "json", + ) + if err != nil { + t.Fatalf("provider models list error = %v", err) + } + var record ProviderModelListRecord + if err := json.Unmarshal([]byte(stdout), &record); err != nil { + t.Fatalf("json.Unmarshal(list) error = %v", err) + } + if len(record.Models) != 1 || record.Models[0].Sources[0].SourceID != "config" || !record.Models[0].Stale { + t.Fatalf("record = %#v, want model with source and stale fields", record) + } + }) + + t.Run("Should refresh and print source statuses", func(t *testing.T) { + t.Parallel() + + client := &stubClient{ + refreshProviderModelsFn: func( + _ context.Context, + providerID string, + request ProviderModelRefreshRequest, + ) (ProviderModelRefreshRecord, error) { + if got, want := providerID, "codex"; got != want { + t.Fatalf("providerID = %q, want %q", got, want) + } + if got, want := request.SourceID, "models_dev"; got != want { + t.Fatalf("SourceID = %q, want %q", got, want) + } + 
if !request.Force || request.RequestID != "req-1" { + t.Fatalf("request = %#v, want force request req-1", request) + } + return ProviderModelRefreshRecord{ + Sources: []ProviderModelSourceStatusRecord{ + { + ProviderID: "codex", + SourceID: "models_dev", + SourceKind: "models_dev", + RefreshState: "succeeded", + RowCount: 2, + }, + }, + }, nil + }, + } + + stdout, _, err := executeRootCommand( + t, + newTestDeps(t, client), + "provider", + "models", + "refresh", + "codex", + "--source", + "models_dev", + "--force", + "--request-id", + "req-1", + "-o", + "json", + ) + if err != nil { + t.Fatalf("provider models refresh error = %v", err) + } + var record ProviderModelRefreshRecord + if err := json.Unmarshal([]byte(stdout), &record); err != nil { + t.Fatalf("json.Unmarshal(refresh) error = %v", err) + } + if len(record.Sources) != 1 || record.Sources[0].RefreshState != "succeeded" { + t.Fatalf("record = %#v, want source status", record) + } + }) + + t.Run("Should show status for provider", func(t *testing.T) { + t.Parallel() + + client := &stubClient{ + providerModelStatusFn: func(_ context.Context, providerID string) (ProviderModelStatusRecord, error) { + if got, want := providerID, "codex"; got != want { + t.Fatalf("providerID = %q, want %q", got, want) + } + return ProviderModelStatusRecord{ + Sources: []ProviderModelSourceStatusRecord{ + {ProviderID: "codex", SourceID: "config", RefreshState: "succeeded", RowCount: 1}, + }, + }, nil + }, + } + + stdout, _, err := executeRootCommand( + t, + newTestDeps(t, client), + "provider", + "models", + "status", + "codex", + "-o", + "json", + ) + if err != nil { + t.Fatalf("provider models status error = %v", err) + } + var record ProviderModelStatusRecord + if err := json.Unmarshal([]byte(stdout), &record); err != nil { + t.Fatalf("json.Unmarshal(status) error = %v", err) + } + if len(record.Sources) != 1 || record.Sources[0].SourceID != "config" { + t.Fatalf("record = %#v, want config status", record) + } + }) + + 
t.Run("Should surface daemon service unavailable errors", func(t *testing.T) { + t.Parallel() + + client := &stubClient{ + listProviderModelsFn: func(context.Context, ProviderModelListQuery) (ProviderModelListRecord, error) { + return ProviderModelListRecord{}, errors.New("model catalog service unavailable") + }, + } + + _, _, err := executeRootCommand(t, newTestDeps(t, client), "provider", "models", "list", "-o", "json") + if err == nil { + t.Fatal("provider models list error = nil, want service unavailable") + } + if !strings.Contains(err.Error(), "model catalog service unavailable") { + t.Fatalf("provider models list error = %v, want service unavailable", err) + } + }) +} diff --git a/internal/cli/session.go b/internal/cli/session.go index 5806924e1..c1e468757 100644 --- a/internal/cli/session.go +++ b/internal/cli/session.go @@ -39,12 +39,14 @@ func newSessionCommand(deps commandDeps) *cobra.Command { func newSessionCreateCommand(deps commandDeps) *cobra.Command { var ( - agentName string - cwd string - name string - channel string - provider string - workspaceRef string + agentName string + cwd string + name string + channel string + provider string + model string + reasoningEffort string + workspaceRef string ) cmd := &cobra.Command{ @@ -56,6 +58,9 @@ func newSessionCreateCommand(deps commandDeps) *cobra.Command { # Start a named session for a specific registered workspace and agent agh session new --workspace checkout-api --agent reviewer --name review-api + # Override provider, model, and reasoning effort for this session only + agh session new --provider codex --model gpt-5.4 --reasoning-effort high + # Auto-register an absolute workspace path before creating the session agh session new --cwd "$PWD" --agent reviewer`, RunE: func(cmd *cobra.Command, _ []string) error { @@ -70,12 +75,14 @@ func newSessionCreateCommand(deps commandDeps) *cobra.Command { } created, err := client.CreateSession(cmd.Context(), CreateSessionRequest{ - AgentName: agentName, - 
Provider: strings.TrimSpace(provider), - Name: name, - Workspace: workspace, - WorkspacePath: workspacePath, - Channel: strings.TrimSpace(channel), + AgentName: agentName, + Provider: strings.TrimSpace(provider), + Model: strings.TrimSpace(model), + ReasoningEffort: strings.TrimSpace(reasoningEffort), + Name: name, + Workspace: workspace, + WorkspacePath: workspacePath, + Channel: strings.TrimSpace(channel), }) if err != nil { return err @@ -90,6 +97,13 @@ func newSessionCreateCommand(deps commandDeps) *cobra.Command { cmd.Flags().StringVar(&name, "name", "", "Optional session label") cmd.Flags().StringVar(&channel, "channel", "", "Optional network channel opt-in for the session") cmd.Flags().StringVar(&provider, "provider", "", "Optional provider override for this session") + cmd.Flags().StringVar(&model, "model", "", "Optional model override for this session") + cmd.Flags().StringVar( + &reasoningEffort, + "reasoning-effort", + "", + "Optional reasoning effort hint (minimal|low|medium|high|xhigh) for providers that support it", + ) return cmd } diff --git a/internal/config/autonomy.go b/internal/config/autonomy.go index 144ca8125..8ba8ec549 100644 --- a/internal/config/autonomy.go +++ b/internal/config/autonomy.go @@ -118,7 +118,7 @@ func (c CoordinatorConfig) Validate(path string, resolver providerResolver) erro return fmt.Errorf("%s.provider: %w", path, err) } if strings.TrimSpace(c.Model) == "" && - strings.TrimSpace(provider.DefaultModel) == "" && + strings.TrimSpace(provider.Models.Default) == "" && provider.RequiresRuntimeModel() { return fmt.Errorf("%s.model is required when provider %q has no default model", path, providerName) } @@ -151,7 +151,7 @@ func (c *Config) ResolveCoordinatorConfig(fallback AgentDef) (CoordinatorConfig, return CoordinatorConfig{}, fmt.Errorf("autonomy.coordinator.provider: %w", err) } if model == "" { - model = strings.TrimSpace(provider.DefaultModel) + model = strings.TrimSpace(provider.Models.Default) } if model == "" && 
provider.RequiresRuntimeModel() { return CoordinatorConfig{}, fmt.Errorf( diff --git a/internal/config/autonomy_test.go b/internal/config/autonomy_test.go index 645df938b..096ac4bab 100644 --- a/internal/config/autonomy_test.go +++ b/internal/config/autonomy_test.go @@ -192,7 +192,7 @@ max_active_per_workspace = 2 } } -func TestLoadAllowsDirectACPAutonomyProviderWithoutDefaultModel(t *testing.T) { +func TestLoadAllowsDirectACPAutonomyProviderWithoutModelDefault(t *testing.T) { t.Run("Should accept provider-managed model for direct ACP provider", func(t *testing.T) { workspaceRoot, homePaths := prepareAutonomyConfigTestEnv(t) writeFile(t, homePaths.ConfigFile, ` @@ -325,8 +325,9 @@ func TestLoadAutonomyOverlayPreservesOtherConfigSections(t *testing.T) { writeFile(t, homePaths.ConfigFile, ` [providers.claude] - default_model = "global-model" auth_mode = "bound_secret" + [providers.claude.models] + default = "global-model" [[providers.claude.credential_slots]] name = "api_key" target_env = "GLOBAL_KEY" @@ -423,7 +424,7 @@ max_children = 2 if err != nil { t.Fatalf("ResolveProvider(claude) error = %v", err) } - if claude.DefaultModel != "global-model" { + if claude.Models.Default != "global-model" { t.Fatalf("ResolveProvider(claude) = %#v, want merged provider fields", claude) } if slots := claude.EffectiveCredentialSlots(); len(slots) != 1 || diff --git a/internal/config/bootstrap.go b/internal/config/bootstrap.go index 97fdc2f3e..66fe7128f 100644 --- a/internal/config/bootstrap.go +++ b/internal/config/bootstrap.go @@ -80,7 +80,7 @@ func SaveBootstrapConfig(homePaths HomePaths, provider string, model string) (Co if selectedModel == "" { return nil } - return editor.SetValue([]string{"providers", selectedProvider, "default_model"}, selectedModel) + return editor.SetValue([]string{"providers", selectedProvider, "models", "default"}, selectedModel) }) } diff --git a/internal/config/bootstrap_test.go b/internal/config/bootstrap_test.go index ccfc848db..7200e1762 100644 
--- a/internal/config/bootstrap_test.go +++ b/internal/config/bootstrap_test.go @@ -85,10 +85,10 @@ func TestSaveBootstrapConfigWritesManagedDefaults(t *testing.T) { slots, ) } - if reloaded.Providers["claude"].DefaultModel != "claude-sonnet-4-6" { + if reloaded.Providers["claude"].Models.Default != "claude-sonnet-4-6" { t.Fatalf( - "LoadGlobalConfig() Providers[claude].DefaultModel = %q, want %q", - reloaded.Providers["claude"].DefaultModel, + "LoadGlobalConfig() Providers[claude].Models.Default = %q, want %q", + reloaded.Providers["claude"].Models.Default, "claude-sonnet-4-6", ) } @@ -107,7 +107,7 @@ func TestSaveBootstrapConfigWritesManagedDefaults(t *testing.T) { `agent = "general"`, `provider = "claude"`, `mode = "approve-all"`, - `default_model = "claude-sonnet-4-6"`, + `default = "claude-sonnet-4-6"`, `port = 3030`, `secret_ref = "env:ANTHROPIC_KEY"`, } { @@ -132,8 +132,8 @@ func TestSaveBootstrapConfigAllowsProviderManagedModel(t *testing.T) { if cfg.Defaults.Provider != "blackbox" { t.Fatalf("SaveBootstrapConfig() Defaults.Provider = %q, want blackbox", cfg.Defaults.Provider) } - if got := cfg.Providers["blackbox"].DefaultModel; got != "" { - t.Fatalf("SaveBootstrapConfig() Providers[blackbox].DefaultModel = %q, want empty", got) + if got := cfg.Providers["blackbox"].Models.Default; got != "" { + t.Fatalf("SaveBootstrapConfig() Providers[blackbox].Models.Default = %q, want empty", got) } contents, err := os.ReadFile(homePaths.ConfigFile) diff --git a/internal/config/config.go b/internal/config/config.go index c26d14b97..5d5491982 100644 --- a/internal/config/config.go +++ b/internal/config/config.go @@ -28,6 +28,7 @@ const ( ConfigName = "config.toml" // marketplaceSchemeHTTP is the accepted plaintext marketplace URL scheme. marketplaceSchemeHTTP = "http" + urlSchemeHTTPS = "https" // skillsMarketplaceRegistryClawhub is the currently supported skills marketplace registry. 
skillsMarketplaceRegistryClawhub = "clawhub" ) @@ -442,6 +443,7 @@ type Config struct { Permissions PermissionsConfig `toml:"permissions"` MCPServers []MCPServer `toml:"mcp_servers,omitempty"` Providers map[string]ProviderConfig `toml:"providers"` + ModelCatalog ModelCatalogConfig `toml:"model_catalog"` Sandboxes map[string]SandboxProfile `toml:"sandboxes"` Observability ObservabilityConfig `toml:"observability"` Log LogConfig `toml:"log"` @@ -630,8 +632,9 @@ func DefaultWithHome(homePaths HomePaths) Config { Permissions: PermissionsConfig{ Mode: PermissionModeApproveAll, }, - Providers: map[string]ProviderConfig{}, - Sandboxes: map[string]SandboxProfile{}, + Providers: map[string]ProviderConfig{}, + ModelCatalog: DefaultModelCatalogConfig(), + Sandboxes: map[string]SandboxProfile{}, Observability: ObservabilityConfig{ Enabled: true, RetentionDays: 7, @@ -886,6 +889,9 @@ func (c *Config) validateFeatures(lookup envLookup) error { if err := c.Tools.Validate(c.MCPServers, c.Providers); err != nil { return err } + if err := c.ModelCatalog.Validate(); err != nil { + return err + } if err := c.Automation.validateWithEnv(lookup); err != nil { return fmt.Errorf("validate automation config: %w", err) } @@ -1555,7 +1561,7 @@ func (c MarketplaceConfig) Validate() error { if err != nil { return fmt.Errorf("skills.marketplace.base_url is invalid: %w", err) } - if parsed.Scheme != marketplaceSchemeHTTP && parsed.Scheme != "https" { + if parsed.Scheme != marketplaceSchemeHTTP && parsed.Scheme != urlSchemeHTTPS { return fmt.Errorf("skills.marketplace.base_url must use http or https: %q", c.BaseURL) } if strings.TrimSpace(parsed.Host) == "" { @@ -1588,7 +1594,7 @@ func (c ExtensionsMarketplaceConfig) Validate() error { if err != nil { return fmt.Errorf("extensions.marketplace.base_url is invalid: %w", err) } - if parsed.Scheme != "http" && parsed.Scheme != "https" { + if parsed.Scheme != "http" && parsed.Scheme != urlSchemeHTTPS { return fmt.Errorf("extensions.marketplace.base_url 
must use http or https: %q", c.BaseURL) } if strings.TrimSpace(parsed.Host) == "" { diff --git a/internal/config/config_test.go b/internal/config/config_test.go index 9768ad9c3..f443c1498 100644 --- a/internal/config/config_test.go +++ b/internal/config/config_test.go @@ -88,8 +88,9 @@ approval_timeout_seconds = 90 trusted_sources = ["mcp:linear", "extension:linear"] [providers.claude] - default_model = "claude-opus" auth_mode = "bound_secret" + [providers.claude.models] + default = "claude-opus" [[providers.claude.credential_slots]] name = "api_key" target_env = "ANTHROPIC_KEY" @@ -344,7 +345,7 @@ max_queue_depth = 250 if err != nil { t.Fatalf("ResolveProvider() error = %v", err) } - if claude.Command == "" || claude.DefaultModel != "claude-opus" { + if claude.Command == "" || claude.Models.Default != "claude-opus" { t.Fatalf("ResolveProvider() = %#v", claude) } if slots := claude.EffectiveCredentialSlots(); len(slots) != 1 || @@ -831,8 +832,12 @@ host = "localhost" port = 2123 [providers.claude] - default_model = "global-model" auth_mode = "bound_secret" + [providers.claude.models] + default = "global-model" + [[providers.claude.models.curated]] + id = "global-model" + display_name = "Global Model" [[providers.claude.credential_slots]] name = "api_key" target_env = "GLOBAL_KEY" @@ -859,7 +864,13 @@ base_url = "https://global.example.test/api/v1" port = 4242 [providers.claude] -default_model = "workspace-model" +[providers.claude.models] +default = "workspace-model" +[[providers.claude.models.curated]] +id = "workspace-model" +display_name = "Workspace Model" +reasoning_efforts = ["low", "high"] +default_reasoning_effort = "high" [session.limits] timeout = "45m" @@ -892,8 +903,20 @@ base_url = "https://workspace.example.test/api/v1" if err != nil { t.Fatalf("ResolveProvider() error = %v", err) } - if claude.DefaultModel != "workspace-model" { - t.Fatalf("ResolveProvider() DefaultModel = %q, want %q", claude.DefaultModel, "workspace-model") + if 
claude.Models.Default != "workspace-model" { + t.Fatalf("ResolveProvider() Models.Default = %q, want %q", claude.Models.Default, "workspace-model") + } + if len(claude.Models.Curated) != 1 { + t.Fatalf("ResolveProvider() Models.Curated = %#v, want one workspace model", claude.Models.Curated) + } + if got, want := claude.Models.Curated[0].ID, "workspace-model"; got != want { + t.Fatalf("ResolveProvider() Models.Curated[0].ID = %q, want %q", got, want) + } + if got, want := claude.Models.Curated[0].DisplayName, "Workspace Model"; got != want { + t.Fatalf("ResolveProvider() Models.Curated[0].DisplayName = %q, want %q", got, want) + } + if got, want := claude.Models.Curated[0].DefaultReasoningEffort, "high"; got != want { + t.Fatalf("ResolveProvider() Models.Curated[0].DefaultReasoningEffort = %q, want %q", got, want) } if slots := claude.EffectiveCredentialSlots(); len(slots) != 1 || slots[0].TargetEnv != "GLOBAL_KEY" || @@ -1355,8 +1378,9 @@ segment_bytes = 128 max_bytes_per_session = 2048 [providers.claude] -default_model = "global-model" auth_mode = "bound_secret" +[providers.claude.models] +default = "global-model" `) writeFile(t, filepath.Join(workspaceRoot, DirName, ConfigName), ` [observability.transcripts] @@ -1389,7 +1413,7 @@ segment_bytes = 256 if err != nil { t.Fatalf("ResolveProvider() error = %v", err) } - if claude.DefaultModel != "global-model" { + if claude.Models.Default != "global-model" { t.Fatalf("ResolveProvider() = %#v", claude) } if slots := claude.EffectiveCredentialSlots(); len(slots) != 1 || diff --git a/internal/config/merge.go b/internal/config/merge.go index 211bca2a6..cb34545f5 100644 --- a/internal/config/merge.go +++ b/internal/config/merge.go @@ -13,6 +13,8 @@ import ( "github.com/pedronauck/agh/internal/resources" ) +const providersConfigKey = "providers" + type configOverlay struct { Daemon daemonOverlay `toml:"daemon"` HTTP httpOverlay `toml:"http"` @@ -23,6 +25,7 @@ type configOverlay struct { Permissions permissionsOverlay 
`toml:"permissions"` MCPServers []mcpServerOverlay `toml:"mcp_servers"` Providers map[string]providerOverlay `toml:"providers"` + ModelCatalog modelCatalogOverlay `toml:"model_catalog"` Sandboxes map[string]sandboxOverlay `toml:"sandboxes"` Observability observabilityOverlay `toml:"observability"` Log logOverlay `toml:"log"` @@ -123,7 +126,7 @@ type permissionsOverlay struct { type providerOverlay struct { Command *string `toml:"command"` DisplayName *string `toml:"display_name"` - DefaultModel *string `toml:"default_model"` + Models *providerModelsOverlay `toml:"models"` Harness *ProviderHarness `toml:"harness"` RuntimeProvider *string `toml:"runtime_provider"` Transport *string `toml:"transport"` @@ -139,6 +142,34 @@ type providerOverlay struct { MCPServers []mcpServerOverlay `toml:"mcp_servers"` } +type providerModelsOverlay struct { + Default *string `toml:"default"` + Curated []ProviderModelConfig `toml:"curated"` + Discovery providerModelsDiscoveryOverlay `toml:"discovery"` +} + +type providerModelsDiscoveryOverlay struct { + Enabled *bool `toml:"enabled"` + Command *string `toml:"command"` + Endpoint *string `toml:"endpoint"` + Timeout *string `toml:"timeout"` +} + +type modelCatalogOverlay struct { + Sources modelCatalogSourcesOverlay `toml:"sources"` +} + +type modelCatalogSourcesOverlay struct { + ModelsDev modelsDevSourceOverlay `toml:"models_dev"` +} + +type modelsDevSourceOverlay struct { + Enabled *bool `toml:"enabled"` + Endpoint *string `toml:"endpoint"` + TTL *string `toml:"ttl"` + Timeout *string `toml:"timeout"` +} + type providerCredentialOverlay struct { Name *string `toml:"name"` TargetEnv *string `toml:"target_env"` @@ -534,12 +565,41 @@ func loadConfigOverlayBytes(contents []byte, source string) (configOverlay, erro } if undecoded := meta.Undecoded(); len(undecoded) > 0 { + if err := rejectRemovedProviderModelKeys(source, undecoded); err != nil { + return overlay, err + } return overlay, fmt.Errorf("unknown config keys in %q: %s", source, 
joinTOMLKeys(undecoded)) } return overlay, nil } +func rejectRemovedProviderModelKeys(source string, keys []burnttoml.Key) error { + for _, key := range sortedTOMLKeys(keys) { + if len(key) != 3 || key[0] != providersConfigKey { + continue + } + replacement := "" + switch key[2] { + case "default_model": + replacement = fmt.Sprintf("providers.%s.models.default", key[1]) + case "supported_models": + replacement = fmt.Sprintf("providers.%s.models.curated", key[1]) + case "supports_reasoning_effort": + replacement = fmt.Sprintf("providers.%s.models.curated[].reasoning_efforts", key[1]) + } + if replacement != "" { + return fmt.Errorf( + "removed config key %q in %q: use %q", + key.String(), + source, + replacement, + ) + } + } + return nil +} + func (o *configOverlay) Apply(dst *Config) error { o.Daemon.Apply(&dst.Daemon) o.HTTP.Apply(&dst.HTTP) @@ -552,6 +612,7 @@ func (o *configOverlay) Apply(dst *Config) error { dst.MCPServers = applyMCPServerOverlays(dst.MCPServers, o.MCPServers) } applyProviderOverlays(dst, o.Providers) + o.ModelCatalog.Apply(&dst.ModelCatalog) applySandboxOverlays(dst, o.Sandboxes) o.Observability.Apply(&dst.Observability) o.Log.Apply(&dst.Log) @@ -705,8 +766,8 @@ func (o providerOverlay) Apply(dst *ProviderConfig) { if o.DisplayName != nil { dst.DisplayName = *o.DisplayName } - if o.DefaultModel != nil { - dst.DefaultModel = *o.DefaultModel + if o.Models != nil { + o.Models.Apply(&dst.Models) } if o.Harness != nil { dst.Harness = *o.Harness @@ -749,6 +810,54 @@ func (o providerOverlay) Apply(dst *ProviderConfig) { } } +func (o providerModelsOverlay) Apply(dst *ProviderModelsConfig) { + if o.Default != nil { + dst.Default = *o.Default + } + if o.Curated != nil { + dst.Curated = cloneProviderModelConfigs(o.Curated) + } + o.Discovery.Apply(&dst.Discovery) +} + +func (o providerModelsDiscoveryOverlay) Apply(dst *ProviderModelsDiscoveryConfig) { + if o.Enabled != nil { + dst.Enabled = boolRef(*o.Enabled) + } + if o.Command != nil { + dst.Command = 
*o.Command + } + if o.Endpoint != nil { + dst.Endpoint = *o.Endpoint + } + if o.Timeout != nil { + dst.Timeout = *o.Timeout + } +} + +func (o modelCatalogOverlay) Apply(dst *ModelCatalogConfig) { + o.Sources.Apply(&dst.Sources) +} + +func (o modelCatalogSourcesOverlay) Apply(dst *ModelCatalogSourcesConfig) { + o.ModelsDev.Apply(&dst.ModelsDev) +} + +func (o modelsDevSourceOverlay) Apply(dst *ModelsDevSourceConfig) { + if o.Enabled != nil { + dst.Enabled = boolRef(*o.Enabled) + } + if o.Endpoint != nil { + dst.Endpoint = *o.Endpoint + } + if o.TTL != nil { + dst.TTL = *o.TTL + } + if o.Timeout != nil { + dst.Timeout = *o.Timeout + } +} + func applyProviderCredentialOverlays(overlays []providerCredentialOverlay) []ProviderCredentialSlot { slots := make([]ProviderCredentialSlot, 0, len(overlays)) for _, overlay := range overlays { @@ -1594,11 +1703,19 @@ func joinTOMLKeys(keys []burnttoml.Key) string { return "" } - values := make([]string, 0, len(keys)) - for _, key := range keys { + sorted := sortedTOMLKeys(keys) + values := make([]string, 0, len(sorted)) + for _, key := range sorted { values = append(values, key.String()) } - sort.Strings(values) return strings.Join(values, ", ") } + +func sortedTOMLKeys(keys []burnttoml.Key) []burnttoml.Key { + sorted := append([]burnttoml.Key(nil), keys...) 
+ sort.Slice(sorted, func(i, j int) bool { + return sorted[i].String() < sorted[j].String() + }) + return sorted +} diff --git a/internal/config/perf_bench_test.go b/internal/config/perf_bench_test.go index bdad69c7a..7d8c23131 100644 --- a/internal/config/perf_bench_test.go +++ b/internal/config/perf_bench_test.go @@ -37,9 +37,9 @@ func BenchmarkResolveAgentMergedMCPServers(b *testing.B) { MCPServers: benchmarkMCPServers("global", 24, 0), Providers: map[string]ProviderConfig{ "claude": { - Command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - DefaultModel: "claude-sonnet-4-6", - MCPServers: benchmarkMCPServers("provider", 24, 8), + Command: "npx -y @agentclientprotocol/claude-agent-acp@latest", + Models: ProviderModelsConfig{Default: "claude-sonnet-4-6"}, + MCPServers: benchmarkMCPServers("provider", 24, 8), }, }, } diff --git a/internal/config/persistence_test.go b/internal/config/persistence_test.go index a97f4f424..5a3329fba 100644 --- a/internal/config/persistence_test.go +++ b/internal/config/persistence_test.go @@ -451,7 +451,7 @@ func TestOverlayEditorSetTableMutations(t *testing.T) { editor, err := newOverlayEditor(ConfigName, []byte(` # provider block [providers.openai] - default_model = "gpt-4o" + models = { default = "gpt-4o" } command = "openai" [defaults] @@ -462,8 +462,8 @@ agent = "general" } err = editor.SetTable([]string{"providers", "openai"}, map[string]any{ - "default_model": "gpt-5", - "command": "openai-next", + "models": map[string]any{"default": "gpt-5"}, + "command": "openai-next", }) if err != nil { t.Fatalf("editor.SetTable() error = %v", err) @@ -477,7 +477,7 @@ agent = "general" for _, want := range []string{ "[providers.openai]", - `default_model = "gpt-5"`, + `default = "gpt-5"`, `command = "openai-next"`, "[defaults]", `agent = "general"`, @@ -486,7 +486,7 @@ agent = "general" t.Fatalf("rendered config missing %q\n%s", want, text) } } - if strings.Contains(text, `default_model = "gpt-4o"`) { + if strings.Contains(text, 
`default = "gpt-4o"`) { t.Fatalf("rendered config still contains old model\n%s", text) } }) @@ -496,17 +496,17 @@ agent = "general" editor, err := newOverlayEditor(ConfigName, []byte(` [providers.openai] -default_model = "gpt-4o" +command = "openai" -[providers.openai.options] -temperature = 0.2 +[providers.openai.models] +default = "gpt-4o" `)) if err != nil { t.Fatalf("newOverlayEditor() error = %v", err) } err = editor.SetTable([]string{"providers", "openai"}, map[string]any{ - "default_model": "gpt-5", + "models": map[string]any{"default": "gpt-5"}, }) if err == nil { t.Fatal("editor.SetTable() error = nil, want nested-subtable rejection") @@ -676,7 +676,7 @@ agent = "general" provider = "openai" [providers.openai] - default_model = "gpt-4o" + models = { default = "gpt-4o" } command = "openai" `)) if err != nil { @@ -728,7 +728,7 @@ provider = "openai" for _, unwanted := range []string{ `provider = "openai"`, "[providers.openai]", - `default_model = "gpt-4o"`, + `default = "gpt-4o"`, `command = "openai"`, } { if strings.Contains(text, unwanted) { diff --git a/internal/config/provider.go b/internal/config/provider.go index 1049a8334..a87227c38 100644 --- a/internal/config/provider.go +++ b/internal/config/provider.go @@ -4,8 +4,10 @@ import ( "errors" "fmt" "maps" + "net/url" "regexp" "strings" + "time" "github.com/pedronauck/agh/internal/vault" ) @@ -63,11 +65,59 @@ type ProviderCredentialSlot struct { Required bool `toml:"required"` } +// ProviderModelsConfig describes provider-scoped model defaults and metadata. +type ProviderModelsConfig struct { + Default string `toml:"default,omitempty"` + Curated []ProviderModelConfig `toml:"curated,omitempty"` + Discovery ProviderModelsDiscoveryConfig `toml:"discovery,omitempty"` +} + +// ProviderModelsDiscoveryConfig describes optional side-effect-free model discovery. 
+type ProviderModelsDiscoveryConfig struct { + Enabled *bool `toml:"enabled,omitempty"` + Command string `toml:"command,omitempty"` + Endpoint string `toml:"endpoint,omitempty"` + Timeout string `toml:"timeout,omitempty"` +} + +// ProviderModelConfig describes one curated provider model entry. +type ProviderModelConfig struct { + ID string `toml:"id"` + DisplayName string `toml:"display_name,omitempty"` + ContextWindow *int64 `toml:"context_window,omitempty"` + MaxInputTokens *int64 `toml:"max_input_tokens,omitempty"` + MaxOutputTokens *int64 `toml:"max_output_tokens,omitempty"` + SupportsTools *bool `toml:"supports_tools,omitempty"` + SupportsReasoning *bool `toml:"supports_reasoning,omitempty"` + ReasoningEfforts []string `toml:"reasoning_efforts,omitempty"` + DefaultReasoningEffort string `toml:"default_reasoning_effort,omitempty"` + CostInputPerMillion *float64 `toml:"cost_input_per_million,omitempty"` + CostOutputPerMillion *float64 `toml:"cost_output_per_million,omitempty"` +} + +// ModelCatalogConfig controls daemon-owned model catalog sources. +type ModelCatalogConfig struct { + Sources ModelCatalogSourcesConfig `toml:"sources,omitempty"` +} + +// ModelCatalogSourcesConfig groups built-in model catalog sources. +type ModelCatalogSourcesConfig struct { + ModelsDev ModelsDevSourceConfig `toml:"models_dev,omitempty"` +} + +// ModelsDevSourceConfig controls the models.dev catalog source. +type ModelsDevSourceConfig struct { + Enabled *bool `toml:"enabled,omitempty"` + Endpoint string `toml:"endpoint,omitempty"` + TTL string `toml:"ttl,omitempty"` + Timeout string `toml:"timeout,omitempty"` +} + // ProviderConfig describes how to launch a provider in ACP mode. 
type ProviderConfig struct { Command string `toml:"command"` DisplayName string `toml:"display_name,omitempty"` - DefaultModel string `toml:"default_model"` + Models ProviderModelsConfig `toml:"models,omitempty"` Harness ProviderHarness `toml:"harness,omitempty"` RuntimeProvider string `toml:"runtime_provider,omitempty"` Transport string `toml:"transport,omitempty"` @@ -164,6 +214,9 @@ const ( piACPCommand = "npx -y pi-acp@latest" piACPAuthLoginCommand = piACPCommand + " --terminal-login" providerAPIKeyCredential = "api_key" + defaultModelsDevEndpoint = "https://models.dev/api.json" + defaultModelsDevTTL = "24h" + defaultModelsDevTimeout = "10s" ) var builtinProviderAliases = map[string]string{ @@ -208,22 +261,56 @@ var builtinProviderAliases = map[string]string{ var builtinProviders = map[string]ProviderConfig{ "claude": { - Command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - DisplayName: "Claude Code", - Harness: ProviderHarnessACP, - DefaultModel: "claude-sonnet-4-6", + Command: "npx -y @agentclientprotocol/claude-agent-acp@latest", + DisplayName: "Claude Code", + Harness: ProviderHarnessACP, + Models: ProviderModelsConfig{ + Default: "claude-sonnet-4-6", + Curated: []ProviderModelConfig{ + {ID: "claude-opus-4-7", DisplayName: "Claude Opus 4.7"}, + {ID: "claude-sonnet-4-6", DisplayName: "Claude Sonnet 4.6"}, + {ID: "claude-haiku-4-5", DisplayName: "Claude Haiku 4.5"}, + }, + }, }, "codex": { - Command: "npx -y @zed-industries/codex-acp@latest", - DisplayName: "Codex", - Harness: ProviderHarnessACP, - DefaultModel: "gpt-5.4", + Command: "npx -y @zed-industries/codex-acp@latest", + DisplayName: "Codex", + Harness: ProviderHarnessACP, + Models: ProviderModelsConfig{ + Default: "gpt-5.4", + Curated: []ProviderModelConfig{ + { + ID: "gpt-5.4", + DisplayName: "GPT-5.4", + SupportsTools: boolRef(true), + SupportsReasoning: boolRef(true), + ReasoningEfforts: []string{"minimal", "low", "medium", "high", "xhigh"}, + DefaultReasoningEffort: "medium", + }, + { 
+ ID: "gpt-5.4-mini", + DisplayName: "GPT-5.4 Mini", + SupportsTools: boolRef(true), + SupportsReasoning: boolRef(true), + ReasoningEfforts: []string{"minimal", "low", "medium", "high", "xhigh"}, + DefaultReasoningEffort: "medium", + }, + {ID: "gpt-5.3", DisplayName: "GPT-5.3"}, + {ID: "gpt-5.3-mini", DisplayName: "GPT-5.3 Mini"}, + }, + }, }, "gemini": { - Command: "gemini --acp", - DisplayName: "Gemini CLI", - Harness: ProviderHarnessACP, - DefaultModel: "gemini-3.1-pro-preview", + Command: "gemini --acp", + DisplayName: "Gemini CLI", + Harness: ProviderHarnessACP, + Models: ProviderModelsConfig{ + Default: "gemini-3.1-pro-preview", + Curated: []ProviderModelConfig{ + {ID: "gemini-3.1-pro-preview", DisplayName: "Gemini 3.1 Pro Preview"}, + }, + }, }, "opencode": { Command: "npx -y opencode-ai@latest acp", @@ -277,10 +364,15 @@ var builtinProviders = map[string]ProviderConfig{ Harness: ProviderHarnessACP, }, "qwen-code": { - Command: "npx -y @qwen-code/qwen-code@latest --acp --experimental-skills", - DisplayName: "Qwen Code", - Harness: ProviderHarnessACP, - DefaultModel: "qwen3.6-plus", + Command: "npx -y @qwen-code/qwen-code@latest --acp --experimental-skills", + DisplayName: "Qwen Code", + Harness: ProviderHarnessACP, + Models: ProviderModelsConfig{ + Default: "qwen3.6-plus", + Curated: []ProviderModelConfig{ + {ID: "qwen3.6-plus", DisplayName: "Qwen3.6 Plus"}, + }, + }, }, "copilot": { Command: "copilot --acp --stdio", @@ -302,72 +394,117 @@ var builtinProviders = map[string]ProviderConfig{ DisplayName: "Pi", Harness: ProviderHarnessPiACP, RuntimeProvider: "anthropic", - DefaultModel: "claude-opus-4-7", AuthLoginCmd: piACPAuthLoginCommand, + Models: ProviderModelsConfig{ + Default: "claude-opus-4-7", + Curated: []ProviderModelConfig{ + {ID: "claude-opus-4-7", DisplayName: "Claude Opus 4.7"}, + }, + }, }, "openrouter": { Command: piACPCommand, DisplayName: "OpenRouter", Harness: ProviderHarnessPiACP, RuntimeProvider: "openrouter", - DefaultModel: 
"openai/gpt-5.4", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("OPENROUTER_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "openai/gpt-5.4", + Curated: []ProviderModelConfig{ + {ID: "openai/gpt-5.4", DisplayName: "OpenAI GPT-5.4"}, + }, + }, }, "zai": { Command: piACPCommand, DisplayName: "z.ai", Harness: ProviderHarnessPiACP, RuntimeProvider: "zai", - DefaultModel: "glm-4.6", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("ZAI_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "glm-4.6", + Curated: []ProviderModelConfig{ + {ID: "glm-4.6", DisplayName: "GLM-4.6"}, + }, + }, }, "moonshot": { Command: piACPCommand, DisplayName: "Moonshot / Kimi", Harness: ProviderHarnessPiACP, RuntimeProvider: "kimi-coding", - DefaultModel: "kimi-k2-thinking", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("KIMI_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "kimi-k2-thinking", + Curated: []ProviderModelConfig{ + {ID: "kimi-k2-thinking", DisplayName: "Kimi K2 Thinking"}, + }, + }, }, "vercel-ai-gateway": { Command: piACPCommand, DisplayName: "Vercel AI Gateway", Harness: ProviderHarnessPiACP, RuntimeProvider: "vercel-ai-gateway", - DefaultModel: "anthropic/claude-opus-4-7", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("AI_GATEWAY_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "anthropic/claude-opus-4-7", + Curated: []ProviderModelConfig{ + {ID: "anthropic/claude-opus-4-7", DisplayName: "Anthropic Claude Opus 4.7"}, + }, + }, }, "xai": { Command: piACPCommand, DisplayName: "xAI", Harness: ProviderHarnessPiACP, RuntimeProvider: "xai", - DefaultModel: "grok-4-fast-non-reasoning", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("XAI_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "grok-4-fast-non-reasoning", + Curated: []ProviderModelConfig{ + {ID: "grok-4-fast-non-reasoning", DisplayName: "Grok 4 Fast Non-Reasoning"}, + }, + }, }, "minimax": { Command: piACPCommand, 
DisplayName: "MiniMax", Harness: ProviderHarnessPiACP, RuntimeProvider: "minimax", - DefaultModel: "MiniMax-M2.1", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("MINIMAX_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "MiniMax-M2.1", + Curated: []ProviderModelConfig{ + {ID: "MiniMax-M2.1", DisplayName: "MiniMax M2.1"}, + }, + }, }, "mistral": { Command: piACPCommand, DisplayName: "Mistral", Harness: ProviderHarnessPiACP, RuntimeProvider: "mistral", - DefaultModel: "devstral-medium-latest", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("MISTRAL_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "devstral-medium-latest", + Curated: []ProviderModelConfig{ + {ID: "devstral-medium-latest", DisplayName: "Devstral Medium Latest"}, + }, + }, }, "groq": { Command: piACPCommand, DisplayName: "Groq", Harness: ProviderHarnessPiACP, RuntimeProvider: "groq", - DefaultModel: "openai/gpt-oss-120b", CredentialSlots: []ProviderCredentialSlot{apiKeyCredentialSlot("GROQ_API_KEY")}, + Models: ProviderModelsConfig{ + Default: "openai/gpt-oss-120b", + Curated: []ProviderModelConfig{ + {ID: "openai/gpt-oss-120b", DisplayName: "OpenAI GPT-OSS 120B"}, + }, + }, }, } @@ -477,7 +614,7 @@ func (c *Config) ResolveAgent(agent AgentDef) (ResolvedAgent, error) { model := strings.TrimSpace(agent.Model) if model == "" { - model = strings.TrimSpace(provider.DefaultModel) + model = strings.TrimSpace(provider.Models.Default) } if model == "" && provider.RequiresRuntimeModel() { return ResolvedAgent{}, fmt.Errorf( @@ -599,8 +736,8 @@ func mergeProvider(base ProviderConfig, override ProviderConfig) ProviderConfig if strings.TrimSpace(override.DisplayName) != "" { merged.DisplayName = override.DisplayName } - if strings.TrimSpace(override.DefaultModel) != "" { - merged.DefaultModel = override.DefaultModel + if !providerModelsConfigIsZero(override.Models) { + merged.Models = mergeProviderModels(merged.Models, override.Models) } if override.Harness != "" { 
merged.Harness = override.Harness @@ -643,6 +780,53 @@ func mergeProvider(base ProviderConfig, override ProviderConfig) ProviderConfig return merged } +func mergeProviderModels(base ProviderModelsConfig, override ProviderModelsConfig) ProviderModelsConfig { + merged := cloneProviderModelsConfig(base) + if strings.TrimSpace(override.Default) != "" { + merged.Default = override.Default + } + if override.Curated != nil { + merged.Curated = cloneProviderModelConfigs(override.Curated) + } + if !providerModelsDiscoveryConfigIsZero(override.Discovery) { + merged.Discovery = mergeProviderModelsDiscovery(merged.Discovery, override.Discovery) + } + return merged +} + +func mergeProviderModelsDiscovery( + base ProviderModelsDiscoveryConfig, + override ProviderModelsDiscoveryConfig, +) ProviderModelsDiscoveryConfig { + merged := cloneProviderModelsDiscoveryConfig(base) + if override.Enabled != nil { + merged.Enabled = boolRef(*override.Enabled) + } + if strings.TrimSpace(override.Command) != "" { + merged.Command = override.Command + } + if strings.TrimSpace(override.Endpoint) != "" { + merged.Endpoint = override.Endpoint + } + if strings.TrimSpace(override.Timeout) != "" { + merged.Timeout = override.Timeout + } + return merged +} + +func providerModelsConfigIsZero(value ProviderModelsConfig) bool { + return strings.TrimSpace(value.Default) == "" && + value.Curated == nil && + providerModelsDiscoveryConfigIsZero(value.Discovery) +} + +func providerModelsDiscoveryConfigIsZero(value ProviderModelsDiscoveryConfig) bool { + return value.Enabled == nil && + strings.TrimSpace(value.Command) == "" && + strings.TrimSpace(value.Endpoint) == "" && + strings.TrimSpace(value.Timeout) == "" +} + func newUnknownProviderError(providerName string) error { return fmt.Errorf("%w: unknown provider %q", ErrProviderUnavailable, providerName) } @@ -840,6 +1024,159 @@ func (p ProviderConfig) SessionMCPEnabled() bool { return *p.SessionMCP } +// Validate reports whether the provider model block is 
usable. +func (m ProviderModelsConfig) Validate(path string) error { + if strings.TrimSpace(m.Default) == "" && m.Default != "" { + return fmt.Errorf("%s.default is required", path) + } + seen := make(map[string]struct{}, len(m.Curated)) + for idx, model := range m.Curated { + modelPath := fmt.Sprintf("%s.curated[%d]", path, idx) + id := strings.TrimSpace(model.ID) + if id == "" { + return fmt.Errorf("%s.id is required", modelPath) + } + if _, ok := seen[id]; ok { + return fmt.Errorf("%s.id duplicates %q", modelPath, id) + } + seen[id] = struct{}{} + efforts := make(map[string]struct{}, len(model.ReasoningEfforts)) + for effortIdx, effort := range model.ReasoningEfforts { + trimmed := strings.TrimSpace(effort) + if trimmed == "" { + return fmt.Errorf("%s.reasoning_efforts[%d] is required", modelPath, effortIdx) + } + if _, ok := efforts[trimmed]; ok { + return fmt.Errorf("%s.reasoning_efforts[%d] duplicates %q", modelPath, effortIdx, trimmed) + } + efforts[trimmed] = struct{}{} + } + defaultEffort := strings.TrimSpace(model.DefaultReasoningEffort) + if defaultEffort != "" && len(efforts) > 0 { + if _, ok := efforts[defaultEffort]; !ok { + return fmt.Errorf("%s.default_reasoning_effort must be listed in reasoning_efforts", modelPath) + } + } + } + return m.Discovery.Validate(path + ".discovery") +} + +// Validate reports whether the discovery source config is usable. 
+func (d ProviderModelsDiscoveryConfig) Validate(path string) error { + command := strings.TrimSpace(d.Command) + endpoint := strings.TrimSpace(d.Endpoint) + if command != "" && unsafeDiscoveryCommand(command) { + return fmt.Errorf("%s.command must be a single-line command", path) + } + if endpoint != "" { + if err := validateAbsoluteHTTPURL(path+".endpoint", endpoint); err != nil { + return err + } + } + if command != "" && endpoint != "" { + return fmt.Errorf("%s.command and %s.endpoint are mutually exclusive", path, path) + } + if strings.TrimSpace(d.Timeout) != "" { + if err := validatePositiveDuration(path+".timeout", d.Timeout); err != nil { + return err + } + } + if d.Enabled != nil && *d.Enabled && command == "" && endpoint == "" { + return fmt.Errorf("%s requires command or endpoint when enabled", path) + } + return nil +} + +// DefaultModelCatalogConfig returns the default model catalog source config. +func DefaultModelCatalogConfig() ModelCatalogConfig { + return ModelCatalogConfig{ + Sources: ModelCatalogSourcesConfig{ + ModelsDev: ModelsDevSourceConfig{ + Enabled: boolRef(true), + Endpoint: defaultModelsDevEndpoint, + TTL: defaultModelsDevTTL, + Timeout: defaultModelsDevTimeout, + }, + }, + } +} + +// Validate reports whether model catalog config is usable. +func (c ModelCatalogConfig) Validate() error { + return c.Sources.ModelsDev.Validate("model_catalog.sources.models_dev") +} + +// EffectiveEnabled reports whether the models.dev source should run. +func (c ModelsDevSourceConfig) EffectiveEnabled() bool { + if c.Enabled == nil { + return true + } + return *c.Enabled +} + +// EffectiveEndpoint returns the configured endpoint or the default models.dev endpoint. +func (c ModelsDevSourceConfig) EffectiveEndpoint() string { + if endpoint := strings.TrimSpace(c.Endpoint); endpoint != "" { + return endpoint + } + return defaultModelsDevEndpoint +} + +// EffectiveTTL returns the configured TTL or the default models.dev TTL. 
+func (c ModelsDevSourceConfig) EffectiveTTL() string { + if ttl := strings.TrimSpace(c.TTL); ttl != "" { + return ttl + } + return defaultModelsDevTTL +} + +// EffectiveTimeout returns the configured timeout or the default models.dev timeout. +func (c ModelsDevSourceConfig) EffectiveTimeout() string { + if timeout := strings.TrimSpace(c.Timeout); timeout != "" { + return timeout + } + return defaultModelsDevTimeout +} + +// Validate reports whether the models.dev source config is usable. +func (c ModelsDevSourceConfig) Validate(path string) error { + if err := validateAbsoluteHTTPURL(path+".endpoint", c.EffectiveEndpoint()); err != nil { + return err + } + if err := validatePositiveDuration(path+".ttl", c.EffectiveTTL()); err != nil { + return err + } + return validatePositiveDuration(path+".timeout", c.EffectiveTimeout()) +} + +func validatePositiveDuration(path string, raw string) error { + duration, err := time.ParseDuration(strings.TrimSpace(raw)) + if err != nil { + return fmt.Errorf("%s must be a positive duration", path) + } + if duration <= 0 { + return fmt.Errorf("%s must be a positive duration", path) + } + return nil +} + +func validateAbsoluteHTTPURL(path string, raw string) error { + parsed, err := url.Parse(strings.TrimSpace(raw)) + if err != nil || parsed.Scheme == "" || parsed.Host == "" { + return fmt.Errorf("%s must be an absolute HTTP(S) URL", path) + } + switch parsed.Scheme { + case string(MCPServerTransportHTTP), urlSchemeHTTPS: + return nil + default: + return fmt.Errorf("%s must be an absolute HTTP(S) URL", path) + } +} + +func unsafeDiscoveryCommand(command string) bool { + return strings.ContainsAny(command, "\x00\r\n") +} + // Validate reports whether the harness is supported. 
func (h ProviderHarness) Validate(path string) error { switch h { @@ -903,7 +1240,7 @@ func validProviderSecretRef(ref string) bool { if vault.IsEnvRef(normalized) { return vault.ValidateRef(normalized) == nil } - if err := vault.ValidateSecretRefNamespace(normalized, "providers"); err != nil { + if err := vault.ValidateSecretRefNamespace(normalized, providersConfigKey); err != nil { return false } path := strings.TrimPrefix(normalized, "vault:providers/") @@ -997,6 +1334,9 @@ func validateResolvedProvider(name string, provider ProviderConfig) error { if strings.TrimSpace(provider.Command) == "" { return fmt.Errorf("provider %q command is required", name) } + if err := provider.Models.Validate(fmt.Sprintf("providers.%s.models", name)); err != nil { + return err + } if err := provider.EffectiveHarness().Validate(fmt.Sprintf("providers.%s.harness", name)); err != nil { return err } @@ -1147,7 +1487,7 @@ func cloneProvider(src ProviderConfig) ProviderConfig { return ProviderConfig{ Command: src.Command, DisplayName: src.DisplayName, - DefaultModel: src.DefaultModel, + Models: cloneProviderModelsConfig(src.Models), Harness: src.Harness, RuntimeProvider: src.RuntimeProvider, Transport: src.Transport, @@ -1175,6 +1515,64 @@ func cloneBoolRef(src *bool) *bool { return boolRef(*src) } +func cloneInt64Ref(src *int64) *int64 { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneFloat64Ref(src *float64) *float64 { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneProviderModelsConfig(src ProviderModelsConfig) ProviderModelsConfig { + return ProviderModelsConfig{ + Default: src.Default, + Curated: cloneProviderModelConfigs(src.Curated), + Discovery: cloneProviderModelsDiscoveryConfig(src.Discovery), + } +} + +func cloneProviderModelsDiscoveryConfig( + src ProviderModelsDiscoveryConfig, +) ProviderModelsDiscoveryConfig { + return ProviderModelsDiscoveryConfig{ + Enabled: cloneBoolRef(src.Enabled), + Command: 
src.Command, + Endpoint: src.Endpoint, + Timeout: src.Timeout, + } +} + +func cloneProviderModelConfigs(src []ProviderModelConfig) []ProviderModelConfig { + if src == nil { + return nil + } + cloned := make([]ProviderModelConfig, len(src)) + for idx, model := range src { + cloned[idx] = ProviderModelConfig{ + ID: model.ID, + DisplayName: model.DisplayName, + ContextWindow: cloneInt64Ref(model.ContextWindow), + MaxInputTokens: cloneInt64Ref(model.MaxInputTokens), + MaxOutputTokens: cloneInt64Ref(model.MaxOutputTokens), + SupportsTools: cloneBoolRef(model.SupportsTools), + SupportsReasoning: cloneBoolRef(model.SupportsReasoning), + ReasoningEfforts: cloneStrings(model.ReasoningEfforts), + DefaultReasoningEffort: model.DefaultReasoningEffort, + CostInputPerMillion: cloneFloat64Ref(model.CostInputPerMillion), + CostOutputPerMillion: cloneFloat64Ref(model.CostOutputPerMillion), + } + } + return cloned +} + func cloneProviderCredentialSlots(src []ProviderCredentialSlot) []ProviderCredentialSlot { if len(src) == 0 { return nil diff --git a/internal/config/provider_test.go b/internal/config/provider_test.go index f77919136..09d774339 100644 --- a/internal/config/provider_test.go +++ b/internal/config/provider_test.go @@ -218,11 +218,19 @@ func TestBuiltinProvidersContainExpectedCommands(t *testing.T) { firstNonEmpty(tc.runtimeProvider, tc.name), ) } - if got.DefaultModel != tc.defaultModel { + if got.Models.Default != tc.defaultModel { t.Fatalf( - "BuiltinProviders()[%q].DefaultModel = %q, want %q", + "BuiltinProviders()[%q].Models.Default = %q, want %q", tc.name, - got.DefaultModel, + got.Models.Default, + tc.defaultModel, + ) + } + if tc.defaultModel != "" && !providerCuratedModelsContain(got.Models.Curated, tc.defaultModel) { + t.Fatalf( + "BuiltinProviders()[%q].Models.Curated = %#v, want default model %q", + tc.name, + got.Models.Curated, tc.defaultModel, ) } @@ -266,6 +274,15 @@ func TestBuiltinProvidersContainExpectedCommands(t *testing.T) { } } +func 
providerCuratedModelsContain(models []ProviderModelConfig, id string) bool { + for _, model := range models { + if model.ID == id { + return true + } + } + return false +} + func TestRepoRootConfigProviderDefaultsMatchBuiltinRegistry(t *testing.T) { t.Parallel() @@ -277,15 +294,15 @@ func TestRepoRootConfigProviderDefaultsMatchBuiltinRegistry(t *testing.T) { builtins := BuiltinProviders() for name, provider := range overlay.Providers { - if provider.DefaultModel == "" { + if provider.Models.Default == "" { continue } builtin, ok := builtins[name] if !ok { t.Fatalf("repo config provider %q is not in the builtin registry", name) } - if got, want := provider.DefaultModel, builtin.DefaultModel; got != want { - t.Fatalf("repo config provider %q default_model = %q, want builtin %q", name, got, want) + if got, want := provider.Models.Default, builtin.Models.Default; got != want { + t.Fatalf("repo config provider %q models.default = %q, want builtin %q", name, got, want) } } } @@ -387,7 +404,7 @@ func TestProviderConfigOverrideMergesWithBuiltins(t *testing.T) { cfg := Config{ Providers: map[string]ProviderConfig{ "claude": { - DefaultModel: "claude-opus-override", + Models: ProviderModelsConfig{Default: "claude-opus-override"}, }, }, } @@ -399,8 +416,8 @@ func TestProviderConfigOverrideMergesWithBuiltins(t *testing.T) { if provider.Command == "" { t.Fatal("ResolveProvider() Command = empty, want builtin command") } - if provider.DefaultModel != "claude-opus-override" { - t.Fatalf("ResolveProvider() DefaultModel = %q, want %q", provider.DefaultModel, "claude-opus-override") + if provider.Models.Default != "claude-opus-override" { + t.Fatalf("ResolveProvider() Models.Default = %q, want %q", provider.Models.Default, "claude-opus-override") } if provider.EffectiveAuthMode() != ProviderAuthModeNativeCLI { t.Fatalf("ResolveProvider() AuthMode = %q, want native_cli", provider.EffectiveAuthMode()) @@ -1046,6 +1063,378 @@ func TestResolveProviderRejectsUnknownProvider(t *testing.T) 
{ } } +func TestResolveProviderMergesRuntimeOverrideHints(t *testing.T) { + t.Parallel() + + homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + cfg := DefaultWithHome(homePaths) + cfg.Providers["codex"] = ProviderConfig{ + Models: ProviderModelsConfig{ + Default: "gpt-manual", + Curated: []ProviderModelConfig{ + {ID: "gpt-custom", DisplayName: "Custom GPT"}, + {ID: "gpt-mini", DisplayName: "Mini GPT"}, + }, + }, + } + + provider, err := cfg.ResolveProvider("codex") + if err != nil { + t.Fatalf("ResolveProvider(codex) error = %v", err) + } + if got, want := provider.Models.Default, "gpt-manual"; got != want { + t.Fatalf("ResolveProvider(codex) Models.Default = %q, want %q", got, want) + } + wantModels := []ProviderModelConfig{ + {ID: "gpt-custom", DisplayName: "Custom GPT"}, + {ID: "gpt-mini", DisplayName: "Mini GPT"}, + } + if !reflect.DeepEqual(provider.Models.Curated, wantModels) { + t.Fatalf("ResolveProvider(codex) Models.Curated = %#v, want %#v", provider.Models.Curated, wantModels) + } +} + +func TestResolveProviderPreservesExplicitEmptyCuratedModels(t *testing.T) { + t.Parallel() + + homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + cfg := DefaultWithHome(homePaths) + cfg.Providers["codex"] = ProviderConfig{ + Models: ProviderModelsConfig{ + Curated: []ProviderModelConfig{}, + }, + } + + provider, err := cfg.ResolveProvider("codex") + if err != nil { + t.Fatalf("ResolveProvider(codex) error = %v", err) + } + if got := len(provider.Models.Curated); got != 0 { + t.Fatalf("ResolveProvider(codex) Models.Curated len = %d, want 0", got) + } +} + +func TestLoadProviderRuntimeOverrideHintsFromTOML(t *testing.T) { + t.Parallel() + + homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = 
%v", err) + } + if err := EnsureHomeLayout(homePaths); err != nil { + t.Fatalf("EnsureHomeLayout() error = %v", err) + } + writeFile(t, homePaths.ConfigFile, ` +[providers.codex.models] +default = "gpt-manual" + +[[providers.codex.models.curated]] +id = "gpt-custom" +display_name = "Custom GPT" + +[[providers.codex.models.curated]] +id = "gpt-mini" +display_name = "Mini GPT" +`) + + cfg, err := LoadForHome(homePaths, withoutDotEnv()) + if err != nil { + t.Fatalf("LoadForHome() error = %v", err) + } + provider, err := cfg.ResolveProvider("codex") + if err != nil { + t.Fatalf("ResolveProvider(codex) error = %v", err) + } + if got, want := provider.Models.Default, "gpt-manual"; got != want { + t.Fatalf("ResolveProvider(codex) Models.Default = %q, want %q", got, want) + } + wantModels := []ProviderModelConfig{ + {ID: "gpt-custom", DisplayName: "Custom GPT"}, + {ID: "gpt-mini", DisplayName: "Mini GPT"}, + } + if !reflect.DeepEqual(provider.Models.Curated, wantModels) { + t.Fatalf("ResolveProvider(codex) Models.Curated = %#v, want %#v", provider.Models.Curated, wantModels) + } +} + +func TestLoadRejectsBlankProviderCuratedModelID(t *testing.T) { + t.Parallel() + + homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + if err := EnsureHomeLayout(homePaths); err != nil { + t.Fatalf("EnsureHomeLayout() error = %v", err) + } + writeFile(t, homePaths.ConfigFile, ` +[[providers.codex.models.curated]] +id = "gpt-custom" + +[[providers.codex.models.curated]] +id = " " +`) + + _, err = LoadForHome(homePaths, withoutDotEnv()) + if err == nil { + t.Fatal("LoadForHome() error = nil, want blank curated id validation") + } + if !strings.Contains(err.Error(), `providers.codex.models.curated[1].id is required`) { + t.Fatalf("LoadForHome() error = %v, want curated id index detail", err) + } +} + +func TestLoadRejectsInvalidProviderModelsConfig(t *testing.T) { + t.Parallel() + + tests := 
[]struct { + name string + config string + wantErr string + }{ + { + name: "Should reject duplicate curated model IDs", + config: ` +[[providers.codex.models.curated]] +id = "gpt-5.4" + +[[providers.codex.models.curated]] +id = "gpt-5.4" +`, + wantErr: `providers.codex.models.curated[1].id duplicates "gpt-5.4"`, + }, + { + name: "Should reject default reasoning effort outside allowed efforts", + config: ` +[[providers.codex.models.curated]] +id = "gpt-5.4" +reasoning_efforts = ["low", "medium"] +default_reasoning_effort = "high" +`, + wantErr: `providers.codex.models.curated[0].default_reasoning_effort must be listed in reasoning_efforts`, + }, + } + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + if err := EnsureHomeLayout(homePaths); err != nil { + t.Fatalf("EnsureHomeLayout() error = %v", err) + } + writeFile(t, homePaths.ConfigFile, tc.config) + + _, err = LoadForHome(homePaths, withoutDotEnv()) + if err == nil { + t.Fatal("LoadForHome() error = nil, want validation error") + } + if !strings.Contains(err.Error(), tc.wantErr) { + t.Fatalf("LoadForHome() error = %v, want %q", err, tc.wantErr) + } + }) + } +} + +func TestLoadRejectsRemovedProviderModelKeys(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + config string + removedPath string + replacement string + }{ + { + name: "Should reject old default_model key", + config: ` +[providers.codex] +default_model = "gpt-5.4" +`, + removedPath: `providers.codex.default_model`, + replacement: `providers.codex.models.default`, + }, + { + name: "Should reject old supported_models key", + config: ` +[providers.codex] +supported_models = ["gpt-5.4"] +`, + removedPath: `providers.codex.supported_models`, + replacement: `providers.codex.models.curated`, + }, + { + name: "Should reject old supports_reasoning_effort key", 
+ config: ` +[providers.codex] +supports_reasoning_effort = true +`, + removedPath: `providers.codex.supports_reasoning_effort`, + replacement: `providers.codex.models.curated[].reasoning_efforts`, + }, + } + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + if err := EnsureHomeLayout(homePaths); err != nil { + t.Fatalf("EnsureHomeLayout() error = %v", err) + } + writeFile(t, homePaths.ConfigFile, tc.config) + + _, err = LoadForHome(homePaths, withoutDotEnv()) + if err == nil { + t.Fatal("LoadForHome() error = nil, want removed key error") + } + message := err.Error() + if !strings.Contains(message, `removed config key "`+tc.removedPath+`"`) || + !strings.Contains(message, `use "`+tc.replacement+`"`) { + t.Fatalf( + "LoadForHome() error = %v, want removed path %q and replacement %q", + err, + tc.removedPath, + tc.replacement, + ) + } + }) + } +} + +func TestModelCatalogModelsDevConfigValidatesDefaultsAndOverrides(t *testing.T) { + t.Parallel() + + defaults := DefaultModelCatalogConfig().Sources.ModelsDev + if !defaults.EffectiveEnabled() { + t.Fatal("DefaultModelCatalogConfig().ModelsDev enabled = false, want true") + } + if got, want := defaults.EffectiveEndpoint(), defaultModelsDevEndpoint; got != want { + t.Fatalf("ModelsDev EffectiveEndpoint() = %q, want %q", got, want) + } + if got, want := defaults.EffectiveTTL(), defaultModelsDevTTL; got != want { + t.Fatalf("ModelsDev EffectiveTTL() = %q, want %q", got, want) + } + if got, want := defaults.EffectiveTimeout(), defaultModelsDevTimeout; got != want { + t.Fatalf("ModelsDev EffectiveTimeout() = %q, want %q", got, want) + } + + enabled := false + override := ModelsDevSourceConfig{ + Enabled: &enabled, + Endpoint: "https://models.example.test/api.json", + TTL: "2h", + Timeout: "5s", + } + if err := 
override.Validate("model_catalog.sources.models_dev"); err != nil { + t.Fatalf("ModelsDev Validate(valid override) error = %v", err) + } + if override.EffectiveEnabled() { + t.Fatal("ModelsDev EffectiveEnabled() = true, want explicit false") + } + if got, want := override.EffectiveEndpoint(), "https://models.example.test/api.json"; got != want { + t.Fatalf("ModelsDev EffectiveEndpoint() = %q, want %q", got, want) + } + + tests := []struct { + name string + value ModelsDevSourceConfig + wantErr string + }{ + { + name: "Should reject invalid endpoint", + value: ModelsDevSourceConfig{Endpoint: "file:///tmp/models.json"}, + wantErr: "model_catalog.sources.models_dev.endpoint must be an absolute HTTP(S) URL", + }, + { + name: "Should reject invalid TTL", + value: ModelsDevSourceConfig{TTL: "soon"}, + wantErr: "model_catalog.sources.models_dev.ttl must be a positive duration", + }, + { + name: "Should reject invalid timeout", + value: ModelsDevSourceConfig{Timeout: "0s"}, + wantErr: "model_catalog.sources.models_dev.timeout must be a positive duration", + }, + } + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + err := tc.value.Validate("model_catalog.sources.models_dev") + if err == nil { + t.Fatal("ModelsDev Validate() error = nil, want validation error") + } + if !strings.Contains(err.Error(), tc.wantErr) { + t.Fatalf("ModelsDev Validate() error = %v, want %q", err, tc.wantErr) + } + }) + } +} + +func TestProviderModelsDiscoveryConfigRejectsUnsafeConfiguration(t *testing.T) { + t.Parallel() + + enabled := true + tests := []struct { + name string + value ProviderModelsDiscoveryConfig + wantErr string + }{ + { + name: "Should reject multiline command", + value: ProviderModelsDiscoveryConfig{Command: "models\nlist"}, + wantErr: "providers.codex.models.discovery.command must be a single-line command", + }, + { + name: "Should reject ambiguous command and endpoint", + value: ProviderModelsDiscoveryConfig{ + Command: "models list", + 
Endpoint: "https://models.example.test", + }, + wantErr: "providers.codex.models.discovery.command and providers.codex.models.discovery.endpoint are mutually exclusive", + }, + { + name: "Should reject invalid endpoint", + value: ProviderModelsDiscoveryConfig{Endpoint: "ftp://models.example.test"}, + wantErr: "providers.codex.models.discovery.endpoint must be an absolute HTTP(S) URL", + }, + { + name: "Should reject enabled discovery without source", + value: ProviderModelsDiscoveryConfig{Enabled: &enabled}, + wantErr: "providers.codex.models.discovery requires command or endpoint when enabled", + }, + { + name: "Should reject invalid timeout", + value: ProviderModelsDiscoveryConfig{Command: "models list", Timeout: "-1s"}, + wantErr: "providers.codex.models.discovery.timeout must be a positive duration", + }, + } + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + err := tc.value.Validate("providers.codex.models.discovery") + if err == nil { + t.Fatal("Discovery Validate() error = nil, want validation error") + } + if !strings.Contains(err.Error(), tc.wantErr) { + t.Fatalf("Discovery Validate() error = %v, want %q", err, tc.wantErr) + } + }) + } +} + func TestResolveAgentDefaultsToolsAndPermissions(t *testing.T) { homePaths, err := ResolveHomePathsFrom(filepath.Join(t.TempDir(), "home")) if err != nil { @@ -1149,8 +1538,8 @@ func TestResolveSessionAgent(t *testing.T) { cfg := DefaultWithHome(homePaths) cfg.Providers["claude"] = ProviderConfig{ - Command: "provider-claude-command", - DefaultModel: "provider-claude-model", + Command: "provider-claude-command", + Models: ProviderModelsConfig{Default: "provider-claude-model"}, } agent := AgentDef{ @@ -1184,8 +1573,8 @@ func TestResolveSessionAgent(t *testing.T) { cfg := DefaultWithHome(homePaths) cfg.Defaults.Provider = "claude" cfg.Providers["claude"] = ProviderConfig{ - Command: "provider-claude-command", - DefaultModel: "provider-claude-model", + Command: "provider-claude-command", 
+ Models: ProviderModelsConfig{Default: "provider-claude-model"}, } agent := AgentDef{ @@ -1220,15 +1609,15 @@ func TestResolveSessionAgent(t *testing.T) { {Name: "global", Command: "global-command"}, } cfg.Providers["claude"] = ProviderConfig{ - Command: "workspace-claude-command", - DefaultModel: "workspace-claude-model", + Command: "workspace-claude-command", + Models: ProviderModelsConfig{Default: "workspace-claude-model"}, MCPServers: []MCPServer{ {Name: "provider-claude", Command: "provider-claude-command"}, }, } cfg.Providers["codex"] = ProviderConfig{ - Command: "workspace-codex-command", - DefaultModel: "workspace-codex-model", + Command: "workspace-codex-command", + Models: ProviderModelsConfig{Default: "workspace-codex-model"}, MCPServers: []MCPServer{ {Name: "provider-codex", Command: "provider-codex-command"}, {Name: "shared-provider", Command: "shared-provider-codex", Args: []string{"--codex"}}, @@ -1320,8 +1709,8 @@ func TestResolveSessionAgent(t *testing.T) { cfg := DefaultWithHome(homePaths) cfg.Providers["codex"] = ProviderConfig{ - Command: "workspace-codex-command", - DefaultModel: "workspace-codex-model", + Command: "workspace-codex-command", + Models: ProviderModelsConfig{Default: "workspace-codex-model"}, } agent := AgentDef{ diff --git a/internal/config/tool_surface.go b/internal/config/tool_surface.go index 94b8f5156..26e8001fb 100644 --- a/internal/config/tool_surface.go +++ b/internal/config/tool_surface.go @@ -321,10 +321,9 @@ func ClassifyToolConfigPath(path []string) (PathPolicy, error) { policy.Kind = kind return policy, nil } - if len(clean) == 3 && clean[0] == "providers" { + if len(clean) == 3 && clean[0] == providersConfigKey { switch clean[2] { case "command", - "default_model", "auth_mode", "env_policy", "home_policy", @@ -337,6 +336,12 @@ func ClassifyToolConfigPath(path []string) (PathPolicy, error) { return policy, nil } } + if len(clean) == 4 && clean[0] == providersConfigKey && clean[2] == "models" { + if clean[3] == 
"default" { + policy.Kind = ConfigValueString + return policy, nil + } + } policy.Denial = ConfigPathForbidden return policy, nil } @@ -646,7 +651,7 @@ func configPathIsTrustRoot(path []string) bool { return true case "hooks": return true - case "providers": + case providersConfigKey: return providerConfigPathIsTrustRoot(path) case "memory": return memoryConfigPathIsTrustRoot(path) diff --git a/internal/daemon/boot.go b/internal/daemon/boot.go index 25962443e..007b5bd2f 100644 --- a/internal/daemon/boot.go +++ b/internal/daemon/boot.go @@ -75,6 +75,7 @@ type bootState struct { sessions SessionManager hostedMCP *mcppkg.HostedService providerVault *vault.Service + modelCatalog *modelCatalogRuntime tasks *taskRuntime reviewRequests *runReviewRequestedForwarder spawnReaper *spawnReaper @@ -224,6 +225,7 @@ func (d *Daemon) beginBoot() error { d.lock != nil || d.registry != nil || d.sessions != nil || + d.modelCatalog != nil || d.network != nil || d.toolRegistry != nil || d.observer != nil || @@ -556,6 +558,9 @@ func (d *Daemon) bootRuntimeServices( return err } state.providerVault = providerVault + if err := d.bootModelCatalog(ctx, state, cleanup); err != nil { + return err + } state.bridges = d.composeBridgeRuntime(state, cleanup) hostedMCP, err := d.buildHostedMCPService(state) if err != nil { @@ -952,6 +957,7 @@ func (d *Daemon) runtimeDeps(ctx context.Context, state *bootState, sessions Ses MemorySessionLedger: newDaemonMemorySessionLedgerService(state, d.now), WorkspaceResolver: state.workspaceResolver, WorkspaceService: state.workspaceResolver, + ModelCatalog: state.modelCatalog, AgentCatalog: agentCatalogDependency(state.agentCatalog, agentSidecarCatalogs{ soul: state.soulCatalog, heartbeat: state.heartbeatCatalog, @@ -1586,6 +1592,7 @@ func (d *Daemon) extensionManagerDeps( Tasks: state.deps.Tasks, Network: state.deps.Network, NetworkStore: state.registry, + ModelCatalog: state.modelCatalog, MemoryStore: state.memoryStore, MemoryProviderRegistry: 
state.memoryProviderRegistry, Observer: state.observer, @@ -1875,6 +1882,7 @@ func (d *Daemon) publishBootState(state *bootState) { if state.localMemoryProvider != nil { d.localMemoryProvider = state.localMemoryProvider } + d.modelCatalog = state.modelCatalog d.situationContext = state.situationContext d.sessions = state.sessions d.tasks = state.tasks diff --git a/internal/daemon/daemon.go b/internal/daemon/daemon.go index dbf825077..d39f0838c 100644 --- a/internal/daemon/daemon.go +++ b/internal/daemon/daemon.go @@ -169,6 +169,7 @@ type RuntimeDeps struct { WorkspaceResolver workspacepkg.RuntimeResolver WorkspaceService core.WorkspaceService AgentCatalog core.AgentCatalog + ModelCatalog core.ModelCatalogService AgentContext *situation.Service SoulAuthoring core.SoulAuthoringService SoulRefresher core.SoulRefresher @@ -294,6 +295,7 @@ type extensionManagerDeps struct { Tasks taskpkg.Manager Network core.NetworkService NetworkStore store.NetworkConversationStore + ModelCatalog core.ModelCatalogService MemoryStore *memory.Store MemoryProviderRegistry *extensionpkg.MemoryProviderRegistry Observer Observer @@ -440,6 +442,7 @@ type Daemon struct { workspaceResolver workspacepkg.RuntimeResolver sandboxRegistry *sandbox.Registry skillsRegistry *skills.Registry + modelCatalog *modelCatalogRuntime skillsCancel context.CancelFunc skillsDone chan struct{} } @@ -465,6 +468,7 @@ type shutdownTargets struct { memoryExtractor *daemonMemoryExtractor memoryStore *memory.Store localMemoryProvider memoryProviderShutdowner + modelCatalog *modelCatalogRuntime skillsCancel context.CancelFunc skillsDone chan struct{} retention observerRetentionStopper @@ -693,18 +697,18 @@ func (d *Daemon) applyExtensionManagerFactoryDefault() { deps.MemoryStore, deps.Observer, deps.SkillsRegistry, - buildHostAPIOptions(deps, capChecker, deps.ResourceStore)..., + buildHostAPIOptions(&deps, capChecker, deps.ResourceStore)..., ) return extensionpkg.NewManager( deps.Registry, - 
buildExtensionManagerOptions(deps, capChecker, hostAPI, deps.SourceSessions)..., + buildExtensionManagerOptions(&deps, capChecker, hostAPI, deps.SourceSessions)..., ) } } func buildHostAPIOptions( - deps extensionManagerDeps, + deps *extensionManagerDeps, capChecker *extensionpkg.CapabilityChecker, resourceStore resources.RawStore, ) []extensionpkg.HostAPIOption { @@ -713,6 +717,7 @@ func buildHostAPIOptions( extensionpkg.WithHostAPITaskManager(deps.Tasks), extensionpkg.WithHostAPINetworkService(deps.Network), extensionpkg.WithHostAPINetworkStore(deps.NetworkStore), + extensionpkg.WithHostAPIModelCatalogService(deps.ModelCatalog), extensionpkg.WithHostAPICapabilityChecker(capChecker), extensionpkg.WithHostAPIWorkspaceResolver(deps.WorkspaceResolver), extensionpkg.WithHostAPIResourceStore(resourceStore), @@ -740,7 +745,7 @@ func buildHostAPIOptions( } func buildExtensionManagerOptions( - deps extensionManagerDeps, + deps *extensionManagerDeps, capChecker *extensionpkg.CapabilityChecker, hostAPI *extensionpkg.HostAPIHandler, sourceSessions resources.SourceSessionManager, @@ -1047,6 +1052,7 @@ func httpServerOptions(deps *RuntimeDeps) []httpapi.Option { httpapi.WithResourceService(deps.Resources), httpapi.WithWorkspaceResolver(deps.WorkspaceService), httpapi.WithAgentCatalog(deps.AgentCatalog), + httpapi.WithModelCatalogService(deps.ModelCatalog), httpapi.WithAgentContext(deps.AgentContext), httpapi.WithSoulAuthoring(deps.SoulAuthoring), httpapi.WithSoulRefresher(deps.SoulRefresher), @@ -1089,6 +1095,7 @@ func udsServerOptions(deps *RuntimeDeps) []udsapi.Option { udsapi.WithResourceService(deps.Resources), udsapi.WithWorkspaceResolver(deps.WorkspaceService), udsapi.WithAgentCatalog(deps.AgentCatalog), + udsapi.WithModelCatalogService(deps.ModelCatalog), udsapi.WithAgentContext(deps.AgentContext), udsapi.WithSoulAuthoring(deps.SoulAuthoring), udsapi.WithSoulRefresher(deps.SoulRefresher), @@ -1267,6 +1274,7 @@ func (d *Daemon) detachShutdownTargets() shutdownTargets { 
memoryExtractor: d.memoryExtractor, memoryStore: d.memoryStore, localMemoryProvider: d.localMemoryProvider, + modelCatalog: d.modelCatalog, skillsCancel: d.skillsCancel, skillsDone: d.skillsDone, } @@ -1296,6 +1304,7 @@ func (d *Daemon) resetRuntimeStateLocked() { d.memoryProviderRegistry = nil d.memoryExtractor = nil d.localMemoryProvider = nil + d.modelCatalog = nil d.skillsRegistry = nil d.lock = nil d.booting = false @@ -1334,6 +1343,9 @@ func (d *Daemon) shutdownRuntimeWorkers(ctx context.Context, targets shutdownTar targets.memoryStore.CloseRecallSignalRecorders(ctx), ) } + if targets.modelCatalog != nil { + appendWrappedError(errs, "daemon: shutdown model catalog", targets.modelCatalog.Shutdown(ctx)) + } stopSkillsWatcher(targets.skillsCancel, targets.skillsDone) if targets.resourceReconcile != nil { appendWrappedError(errs, "daemon: close resource reconcile driver", targets.resourceReconcile.Close(ctx)) diff --git a/internal/daemon/daemon_integration_test.go b/internal/daemon/daemon_integration_test.go index 873f05c5c..bd4bd3434 100644 --- a/internal/daemon/daemon_integration_test.go +++ b/internal/daemon/daemon_integration_test.go @@ -3975,10 +3975,38 @@ func (daemonSessionStopACPAgent) Cancel(context.Context, acpsdk.CancelNotificati return nil } +func (daemonSessionStopACPAgent) CloseSession( + context.Context, + acpsdk.CloseSessionRequest, +) (acpsdk.CloseSessionResponse, error) { + return acpsdk.CloseSessionResponse{}, nil +} + +func (daemonSessionStopACPAgent) ListSessions( + context.Context, + acpsdk.ListSessionsRequest, +) (acpsdk.ListSessionsResponse, error) { + return acpsdk.ListSessionsResponse{Sessions: []acpsdk.SessionInfo{}}, nil +} + func (daemonSessionStopACPAgent) NewSession(context.Context, acpsdk.NewSessionRequest) (acpsdk.NewSessionResponse, error) { return acpsdk.NewSessionResponse{SessionId: "daemon-stop-helper"}, nil } +func (daemonSessionStopACPAgent) ResumeSession( + context.Context, + acpsdk.ResumeSessionRequest, +) 
(acpsdk.ResumeSessionResponse, error) { + return acpsdk.ResumeSessionResponse{}, nil +} + +func (daemonSessionStopACPAgent) SetSessionConfigOption( + context.Context, + acpsdk.SetSessionConfigOptionRequest, +) (acpsdk.SetSessionConfigOptionResponse, error) { + return acpsdk.SetSessionConfigOptionResponse{ConfigOptions: []acpsdk.SessionConfigOption{}}, nil +} + func (daemonSessionStopACPAgent) LoadSession(context.Context, acpsdk.LoadSessionRequest) (acpsdk.LoadSessionResponse, error) { return acpsdk.LoadSessionResponse{}, nil } diff --git a/internal/daemon/daemon_nightly_combined_integration_test.go b/internal/daemon/daemon_nightly_combined_integration_test.go index 64549c7fc..964ef85bb 100644 --- a/internal/daemon/daemon_nightly_combined_integration_test.go +++ b/internal/daemon/daemon_nightly_combined_integration_test.go @@ -490,6 +490,20 @@ func (a *daemonNightlyCombinedACPAgent) Cancel(context.Context, acpsdk.CancelNot return nil } +func (a *daemonNightlyCombinedACPAgent) CloseSession( + context.Context, + acpsdk.CloseSessionRequest, +) (acpsdk.CloseSessionResponse, error) { + return acpsdk.CloseSessionResponse{}, nil +} + +func (a *daemonNightlyCombinedACPAgent) ListSessions( + context.Context, + acpsdk.ListSessionsRequest, +) (acpsdk.ListSessionsResponse, error) { + return acpsdk.ListSessionsResponse{Sessions: []acpsdk.SessionInfo{}}, nil +} + func (a *daemonNightlyCombinedACPAgent) NewSession( context.Context, acpsdk.NewSessionRequest, @@ -497,6 +511,20 @@ func (a *daemonNightlyCombinedACPAgent) NewSession( return acpsdk.NewSessionResponse{SessionId: "daemon-nightly-combined-helper"}, nil } +func (a *daemonNightlyCombinedACPAgent) ResumeSession( + context.Context, + acpsdk.ResumeSessionRequest, +) (acpsdk.ResumeSessionResponse, error) { + return acpsdk.ResumeSessionResponse{}, nil +} + +func (a *daemonNightlyCombinedACPAgent) SetSessionConfigOption( + context.Context, + acpsdk.SetSessionConfigOptionRequest, +) (acpsdk.SetSessionConfigOptionResponse, error) { 
+ return acpsdk.SetSessionConfigOptionResponse{ConfigOptions: []acpsdk.SessionConfigOption{}}, nil +} + func (a *daemonNightlyCombinedACPAgent) LoadSession( context.Context, acpsdk.LoadSessionRequest, diff --git a/internal/daemon/daemon_sandbox_integration_test.go b/internal/daemon/daemon_sandbox_integration_test.go index 41556225e..5cdb9becd 100644 --- a/internal/daemon/daemon_sandbox_integration_test.go +++ b/internal/daemon/daemon_sandbox_integration_test.go @@ -260,6 +260,20 @@ func (a *daemonSandboxACPAgent) Cancel(context.Context, acpsdk.CancelNotificatio return nil } +func (a *daemonSandboxACPAgent) CloseSession( + context.Context, + acpsdk.CloseSessionRequest, +) (acpsdk.CloseSessionResponse, error) { + return acpsdk.CloseSessionResponse{}, nil +} + +func (a *daemonSandboxACPAgent) ListSessions( + context.Context, + acpsdk.ListSessionsRequest, +) (acpsdk.ListSessionsResponse, error) { + return acpsdk.ListSessionsResponse{Sessions: []acpsdk.SessionInfo{}}, nil +} + func (a *daemonSandboxACPAgent) NewSession( context.Context, acpsdk.NewSessionRequest, @@ -267,6 +281,20 @@ func (a *daemonSandboxACPAgent) NewSession( return acpsdk.NewSessionResponse{SessionId: "daemon-sandbox-helper"}, nil } +func (a *daemonSandboxACPAgent) ResumeSession( + context.Context, + acpsdk.ResumeSessionRequest, +) (acpsdk.ResumeSessionResponse, error) { + return acpsdk.ResumeSessionResponse{}, nil +} + +func (a *daemonSandboxACPAgent) SetSessionConfigOption( + context.Context, + acpsdk.SetSessionConfigOptionRequest, +) (acpsdk.SetSessionConfigOptionResponse, error) { + return acpsdk.SetSessionConfigOptionResponse{ConfigOptions: []acpsdk.SessionConfigOption{}}, nil +} + func (a *daemonSandboxACPAgent) LoadSession( context.Context, acpsdk.LoadSessionRequest, diff --git a/internal/daemon/model_catalog.go b/internal/daemon/model_catalog.go new file mode 100644 index 000000000..f3d1c757f --- /dev/null +++ b/internal/daemon/model_catalog.go @@ -0,0 +1,329 @@ +package daemon + +import ( + 
"context" + "errors" + "fmt" + "log/slog" + "net/http" + "os" + "strings" + "sync" + "time" + + extensionpkg "github.com/pedronauck/agh/internal/extension" + "github.com/pedronauck/agh/internal/modelcatalog" +) + +const defaultModelCatalogRefreshTimeout = 10 * time.Second + +type modelCatalogRuntime struct { + service modelcatalog.Service + logger *slog.Logger + now func() time.Time + timeout time.Duration + + ctx context.Context + cancel context.CancelFunc + wg sync.WaitGroup +} + +var _ modelcatalog.Service = (*modelCatalogRuntime)(nil) + +type modelCatalogRefreshResult struct { + statuses []modelcatalog.SourceStatus + err error +} + +func newModelCatalogRuntime( + ctx context.Context, + service modelcatalog.Service, + logger *slog.Logger, + now func() time.Time, + timeout time.Duration, +) (*modelCatalogRuntime, error) { + if ctx == nil { + return nil, errors.New("daemon: model catalog lifecycle context is required") + } + if service == nil { + return nil, errors.New("daemon: model catalog service is required") + } + if now == nil { + now = func() time.Time { + return time.Now().UTC() + } + } + if timeout <= 0 { + timeout = defaultModelCatalogRefreshTimeout + } + // #nosec G118 -- cancel is owned by modelCatalogRuntime and invoked from Shutdown. 
+ runtimeCtx, cancel := context.WithCancel(ctx) + return &modelCatalogRuntime{ + service: service, + logger: logger, + now: now, + timeout: timeout, + ctx: runtimeCtx, + cancel: cancel, + }, nil +} + +func (r *modelCatalogRuntime) ListModels( + ctx context.Context, + opts modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + if r == nil || r.service == nil { + return nil, errors.New("daemon: model catalog service is unavailable") + } + if ctx == nil { + return nil, errors.New("daemon: model catalog list context is required") + } + now := r.now().UTC() + if !opts.Now.IsZero() { + now = opts.Now + } + if opts.Refresh { + refreshOpts := modelcatalog.RefreshOptions{ + ProviderID: opts.ProviderID, + SourceID: opts.SourceID, + Force: true, + Now: now, + } + _, refreshErr := r.Refresh(ctx, refreshOpts) + listOpts := opts + listOpts.Refresh = false + listOpts.Now = now + models, listErr := r.service.ListModels(ctx, listOpts) + if listErr != nil { + return nil, listErr + } + if len(models) == 0 && refreshErr != nil { + return nil, refreshErr + } + return models, nil + } + opts.Now = now + return r.service.ListModels(ctx, opts) +} + +func (r *modelCatalogRuntime) Refresh( + ctx context.Context, + opts modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + if r == nil || r.service == nil { + return nil, errors.New("daemon: model catalog service is unavailable") + } + if ctx == nil { + return nil, errors.New("daemon: model catalog refresh context is required") + } + if err := r.ctx.Err(); err != nil { + return nil, fmt.Errorf("daemon: model catalog refresh unavailable: %w", err) + } + runtimeNow := r.now().UTC() + now := runtimeNow + if !opts.Now.IsZero() { + now = opts.Now + } + refreshOpts := opts + refreshOpts.Now = now + if strings.TrimSpace(refreshOpts.RequestID) == "" { + refreshOpts.RequestID = fmt.Sprintf("model-catalog-refresh-%d", runtimeNow.UnixNano()) + } + + refreshCtx := context.WithoutCancel(ctx) + refreshCtx, cancel := 
context.WithTimeout(refreshCtx, r.timeout) + resultCh := make(chan modelCatalogRefreshResult, 1) + + r.wg.Go(func() { + stopRootCancel := context.AfterFunc(r.ctx, cancel) + defer func() { + stopRootCancel() + cancel() + }() + + statuses, err := r.service.Refresh(refreshCtx, refreshOpts) + if err != nil { + r.logRefreshFailure(refreshOpts, err) + } + resultCh <- modelCatalogRefreshResult{statuses: statuses, err: err} + }) + + select { + case result := <-resultCh: + return result.statuses, result.err + case <-ctx.Done(): + return nil, fmt.Errorf("daemon: model catalog refresh request canceled: %w", ctx.Err()) + case <-r.ctx.Done(): + return nil, fmt.Errorf("daemon: model catalog refresh stopped: %w", r.ctx.Err()) + } +} + +func (r *modelCatalogRuntime) ListSourceStatus( + ctx context.Context, + providerID string, +) ([]modelcatalog.SourceStatus, error) { + if r == nil || r.service == nil { + return nil, errors.New("daemon: model catalog service is unavailable") + } + return r.service.ListSourceStatus(ctx, providerID) +} + +func (r *modelCatalogRuntime) Shutdown(ctx context.Context) error { + if r == nil { + return nil + } + r.cancel() + done := make(chan struct{}) + go func() { + defer close(done) + r.wg.Wait() + }() + + if ctx == nil { + <-done + return nil + } + select { + case <-done: + return nil + case <-ctx.Done(): + return fmt.Errorf("daemon: wait for model catalog refresh workers: %w", ctx.Err()) + } +} + +func (r *modelCatalogRuntime) logRefreshFailure(opts modelcatalog.RefreshOptions, err error) { + if r == nil || r.logger == nil || err == nil { + return + } + if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) { + return + } + r.logger.Warn( + "daemon.model_catalog.refresh_failed", + "refresh_request_id", + opts.RequestID, + "provider_id", + strings.TrimSpace(opts.ProviderID), + "source_id", + strings.TrimSpace(opts.SourceID), + "error", + modelcatalog.RedactString(err.Error()), + ) +} + +func (d *Daemon) bootModelCatalog(ctx 
context.Context, state *bootState, cleanup *bootCleanup) error { + if state == nil { + return errors.New("daemon: model catalog state is required") + } + store, ok := state.registry.(modelcatalog.Store) + if !ok { + if state.logger != nil { + state.logger.Warn( + "daemon.model_catalog.disabled", + "reason", + "registry_missing_model_catalog_store", + "registry_type", + fmt.Sprintf("%T", state.registry), + ) + } + return nil + } + + sourceTimeout, err := time.ParseDuration(state.cfg.ModelCatalog.Sources.ModelsDev.EffectiveTimeout()) + if err != nil { + return fmt.Errorf("daemon: parse model catalog source timeout: %w", err) + } + httpClient := &http.Client{Timeout: sourceTimeout} + sources, err := d.modelCatalogSources(state, httpClient, sourceTimeout) + if err != nil { + return err + } + service, err := modelcatalog.NewService(store, sources) + if err != nil { + return fmt.Errorf("daemon: create model catalog service: %w", err) + } + runtime, err := newModelCatalogRuntime(ctx, service, state.logger, d.now, sourceTimeout) + if err != nil { + return err + } + state.modelCatalog = runtime + if cleanup != nil { + cleanup.add(runtime.Shutdown) + } + return nil +} + +func (d *Daemon) modelCatalogSources( + state *bootState, + httpClient *http.Client, + defaultTimeout time.Duration, +) ([]modelcatalog.Source, error) { + modelsDev, err := modelcatalog.NewModelsDevSource( + state.cfg.Providers, + state.cfg.ModelCatalog.Sources.ModelsDev, + modelcatalog.WithModelsDevHTTPClient(httpClient), + ) + if err != nil { + return nil, fmt.Errorf("daemon: create models.dev model catalog source: %w", err) + } + + sources := []modelcatalog.Source{ + modelcatalog.NewBuiltinSource(), + modelcatalog.NewConfigSource(state.cfg.Providers), + modelsDev, + } + liveSources, err := modelcatalog.NewLiveProviderSources(modelcatalog.LiveProviderSourcesConfig{ + Providers: state.cfg.Providers, + HomePaths: d.homePaths, + BaseEnv: os.Environ(), + SecretResolver: d.modelCatalogSecretResolver(state), + 
HTTPClient: httpClient, + DefaultTimeout: defaultTimeout, + }) + if err != nil { + return nil, fmt.Errorf("daemon: create live provider model catalog sources: %w", err) + } + sources = append(sources, liveSources...) + extensionSources, err := d.modelCatalogExtensionSources(state) + if err != nil { + return nil, err + } + sources = append(sources, extensionSources...) + return sources, nil +} + +func (d *Daemon) modelCatalogExtensionSources(state *bootState) ([]modelcatalog.Source, error) { + dbSource, ok := state.registry.(extensionDBSource) + if !ok || dbSource.DB() == nil { + return nil, nil + } + registry := extensionpkg.NewRegistry(dbSource.DB()) + sources, err := extensionpkg.NewExtensionModelSources(registry, func() extensionpkg.ModelSourceRuntime { + runtime, ok := state.currentExtensionRuntime().(extensionpkg.ModelSourceRuntime) + if !ok { + return nil + } + return runtime + }) + if err != nil { + return nil, fmt.Errorf("daemon: create extension model catalog sources: %w", err) + } + return sources, nil +} + +func (d *Daemon) modelCatalogSecretResolver(state *bootState) modelcatalog.ProviderSecretResolver { + if state != nil && state.providerVault != nil { + return state.providerVault + } + return modelcatalog.EnvSecretResolver{ + LookupEnv: func(key string) (string, bool) { + value := "" + if d.getenv != nil { + value = d.getenv(key) + } else { + value = os.Getenv(key) + } + return value, strings.TrimSpace(value) != "" + }, + } +} diff --git a/internal/daemon/model_catalog_test.go b/internal/daemon/model_catalog_test.go new file mode 100644 index 000000000..07232b566 --- /dev/null +++ b/internal/daemon/model_catalog_test.go @@ -0,0 +1,599 @@ +package daemon + +import ( + "bytes" + "context" + "errors" + "log/slog" + "strings" + "sync" + "testing" + "time" + + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/modelcatalog" + "github.com/pedronauck/agh/internal/testutil" +) + +func TestDaemonModelCatalogWiring(t 
*testing.T) { + t.Parallel() + + t.Run("Should compose catalog service when global DB and config are available", func(t *testing.T) { + t.Parallel() + + daemonInstance, httpDeps, udsDeps := bootModelCatalogTestDaemon(t, nil) + if daemonInstance.modelCatalog == nil { + t.Fatal("boot() modelCatalog = nil, want daemon-owned service") + } + if httpDeps.ModelCatalog == nil { + t.Fatal("HTTP RuntimeDeps ModelCatalog = nil, want injected service") + } + if udsDeps.ModelCatalog == nil { + t.Fatal("UDS RuntimeDeps ModelCatalog = nil, want injected service") + } + + ctx := testutil.Context(t) + if _, err := httpDeps.ModelCatalog.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: modelcatalog.SourceIDBuiltin, + Force: true, + }); err != nil { + t.Fatalf("ModelCatalog.Refresh(builtin) error = %v", err) + } + models, err := httpDeps.ModelCatalog.ListModels(ctx, modelcatalog.ListOptions{ProviderID: "codex"}) + if err != nil { + t.Fatalf("ModelCatalog.ListModels(codex) error = %v", err) + } + if !containsCatalogModel(models, "codex", "gpt-5.4") { + t.Fatalf("ModelCatalog.ListModels(codex) missing builtin gpt-5.4 row: %#v", models) + } + }) + + t.Run("Should record live source status when optional dependency is missing", func(t *testing.T) { + t.Parallel() + + daemonInstance, _, _ := bootModelCatalogTestDaemon(t, nil) + ctx := testutil.Context(t) + _, err := daemonInstance.modelCatalog.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "hermes", + SourceID: modelcatalog.SourceKindProviderLiveID("hermes"), + Force: true, + }) + if !errors.Is(err, modelcatalog.ErrAllSourcesFailed) { + t.Fatalf("ModelCatalog.Refresh(hermes live) error = %v, want ErrAllSourcesFailed", err) + } + + statuses, err := daemonInstance.modelCatalog.ListSourceStatus(ctx, "hermes") + if err != nil { + t.Fatalf("ModelCatalog.ListSourceStatus(hermes) error = %v", err) + } + status, ok := findSourceStatus(statuses, modelcatalog.SourceKindProviderLiveID("hermes")) + if !ok { + 
t.Fatalf("ListSourceStatus(hermes) missing provider_live status: %#v", statuses) + } + if got, want := status.RefreshState, string(modelcatalog.RefreshStateFailed); got != want { + t.Fatalf("provider_live refresh state = %q, want %q", got, want) + } + if status.LastError == "" { + t.Fatal("provider_live LastError = empty, want redacted failure detail") + } + }) + + t.Run("Should cancel and join refresh work on shutdown", func(t *testing.T) { + t.Parallel() + + service := newBlockingModelCatalogService() + runtime, err := newModelCatalogRuntime( + testutil.Context(t), + service, + discardLogger(), + func() time.Time { + return time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC) + }, + 5*time.Second, + ) + if err != nil { + t.Fatalf("newModelCatalogRuntime() error = %v", err) + } + + requestCtx, cancelRequest := context.WithCancel(testutil.Context(t)) + resultCh := make(chan error, 1) + go func() { + _, refreshErr := runtime.Refresh(requestCtx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: modelcatalog.SourceIDBuiltin, + Force: true, + }) + resultCh <- refreshErr + }() + + waitForCatalogTestSignal(t, service.started, "refresh start") + cancelRequest() + refreshErr := waitForCatalogTestError(t, resultCh, "refresh request cancellation") + if !errors.Is(refreshErr, context.Canceled) { + t.Fatalf("Refresh() error = %v, want context.Canceled", refreshErr) + } + select { + case <-service.released: + t.Fatal("refresh worker stopped on request cancellation; want daemon shutdown to own worker cancellation") + default: + } + + shutdownCtx, cancelShutdown := context.WithTimeout(testutil.Context(t), time.Second) + defer cancelShutdown() + if err := runtime.Shutdown(shutdownCtx); err != nil { + t.Fatalf("Shutdown() error = %v", err) + } + waitForCatalogTestSignal(t, service.released, "refresh release") + }) + + t.Run("Should return shutdown deadline when refresh worker does not stop in time", func(t *testing.T) { + t.Parallel() + + service := 
newManuallyReleasedModelCatalogService() + runtime, err := newModelCatalogRuntime( + testutil.Context(t), + service, + discardLogger(), + func() time.Time { + return time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC) + }, + 5*time.Second, + ) + if err != nil { + t.Fatalf("newModelCatalogRuntime() error = %v", err) + } + + refreshErrCh := make(chan error, 1) + go func() { + _, refreshErr := runtime.Refresh(testutil.Context(t), modelcatalog.RefreshOptions{Force: true}) + refreshErrCh <- refreshErr + }() + waitForCatalogTestSignal(t, service.started, "manual refresh start") + + shutdownCtx, cancelShutdown := context.WithTimeout(testutil.Context(t), time.Nanosecond) + defer cancelShutdown() + err = runtime.Shutdown(shutdownCtx) + if !errors.Is(err, context.DeadlineExceeded) { + t.Fatalf("Shutdown(deadline) error = %v, want context.DeadlineExceeded", err) + } + refreshErr := waitForCatalogTestError(t, refreshErrCh, "manual refresh shutdown cancellation") + if !errors.Is(refreshErr, context.Canceled) { + t.Fatalf("Refresh(shutdown) error = %v, want context.Canceled", refreshErr) + } + + close(service.release) + waitForCatalogTestSignal(t, service.released, "manual refresh release") + }) + + t.Run("Should apply runtime timeout to detached refresh work", func(t *testing.T) { + t.Parallel() + + service := newBlockingModelCatalogService() + runtime, err := newModelCatalogRuntime( + testutil.Context(t), + service, + discardLogger(), + func() time.Time { + return time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC) + }, + 20*time.Millisecond, + ) + if err != nil { + t.Fatalf("newModelCatalogRuntime() error = %v", err) + } + + _, err = runtime.Refresh(testutil.Context(t), modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: modelcatalog.SourceIDBuiltin, + Force: true, + }) + if !errors.Is(err, context.DeadlineExceeded) { + t.Fatalf("Refresh(timeout) error = %v, want context.DeadlineExceeded", err) + } + waitForCatalogTestSignal(t, service.released, "timed refresh release") + }) + 
+ t.Run("Should redact source errors in refresh logs", func(t *testing.T) { + t.Parallel() + + var logs bytes.Buffer + runtime := &modelCatalogRuntime{ + logger: slog.New(slog.NewTextHandler(&logs, nil)), + } + runtime.logRefreshFailure( + modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: modelcatalog.SourceIDModelsDev, + RequestID: "req-redaction", + }, + errors.New("source failed with api_key=sk-super-secret-token-123"), + ) + output := logs.String() + if strings.Contains(output, "sk-super-secret-token-123") { + t.Fatalf("log output = %q, want secret redacted", output) + } + if !strings.Contains(output, "[REDACTED]") { + t.Fatalf("log output = %q, want redaction marker", output) + } + }) + + t.Run("Should refresh before listing when list requests refresh", func(t *testing.T) { + t.Parallel() + + service := &recordingModelCatalogService{ + models: []modelcatalog.Model{{ProviderID: "codex", ModelID: "gpt-5.4"}}, + } + runtime, err := newModelCatalogRuntime( + testutil.Context(t), + service, + discardLogger(), + func() time.Time { + return time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC) + }, + 5*time.Second, + ) + if err != nil { + t.Fatalf("newModelCatalogRuntime() error = %v", err) + } + + models, err := runtime.ListModels(testutil.Context(t), modelcatalog.ListOptions{ + ProviderID: "codex", + Refresh: true, + }) + if err != nil { + t.Fatalf("ListModels(refresh) error = %v", err) + } + if !containsCatalogModel(models, "codex", "gpt-5.4") { + t.Fatalf("ListModels(refresh) = %#v, want gpt-5.4", models) + } + if service.refreshCalls != 1 { + t.Fatalf("Refresh calls = %d, want 1", service.refreshCalls) + } + if !service.lastRefresh.Force || service.lastRefresh.ProviderID != "codex" { + t.Fatalf("Refresh opts = %#v, want forced codex refresh", service.lastRefresh) + } + if service.lastList.Refresh { + t.Fatalf("List opts Refresh = true, want false after daemon refresh handoff") + } + if _, err := runtime.ListSourceStatus(testutil.Context(t), "codex"); err != 
nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } + }) + + t.Run("Should validate runtime dependencies", func(t *testing.T) { + t.Parallel() + + if _, err := newModelCatalogRuntime(testutil.Context(t), nil, nil, nil, 0); err == nil { + t.Fatal("newModelCatalogRuntime(nil service) error = nil, want validation error") + } + runtime, err := newModelCatalogRuntime( + testutil.Context(t), + &recordingModelCatalogService{}, + nil, + nil, + 0, + ) + if err != nil { + t.Fatalf("newModelCatalogRuntime(defaults) error = %v", err) + } + if runtime.timeout != defaultModelCatalogRefreshTimeout { + t.Fatalf("runtime timeout = %s, want %s", runtime.timeout, defaultModelCatalogRefreshTimeout) + } + if err := runtime.Shutdown(context.Background()); err != nil { + t.Fatalf("Shutdown(context.Background()) error = %v", err) + } + var nilRuntime *modelCatalogRuntime + if err := nilRuntime.Shutdown(context.Background()); err != nil { + t.Fatalf("Shutdown(nil runtime) error = %v", err) + } + unavailable := &modelCatalogRuntime{} + if _, err := unavailable.ListModels(testutil.Context(t), modelcatalog.ListOptions{}); err == nil { + t.Fatal("ListModels(unavailable) error = nil, want validation error") + } + if _, err := unavailable.Refresh(testutil.Context(t), modelcatalog.RefreshOptions{}); err == nil { + t.Fatal("Refresh(unavailable) error = nil, want validation error") + } + if _, err := unavailable.ListSourceStatus(testutil.Context(t), "codex"); err == nil { + t.Fatal("ListSourceStatus(unavailable) error = nil, want validation error") + } + }) + + t.Run("Should disable catalog when registry does not expose store", func(t *testing.T) { + t.Parallel() + + homePaths := testHomePaths(t) + cfg := testConfig(t, homePaths) + daemonInstance := newTestDaemon(t, homePaths, &cfg) + state := &bootState{ + cfg: cfg, + logger: discardLogger(), + registry: &recordingRegistry{path: homePaths.DatabaseFile}, + } + if err := daemonInstance.bootModelCatalog(testutil.Context(t), state, 
&bootCleanup{}); err != nil { + t.Fatalf("bootModelCatalog(non-store registry) error = %v", err) + } + if state.modelCatalog != nil { + t.Fatalf("bootModelCatalog(non-store registry) modelCatalog = %#v, want nil", state.modelCatalog) + } + }) + + t.Run("Should reject invalid timeouts during catalog boot", func(t *testing.T) { + t.Parallel() + + homePaths := testHomePaths(t) + cfg := testConfig(t, homePaths) + cfg.ModelCatalog.Sources.ModelsDev.Timeout = "not-a-duration" + daemonInstance := newTestDaemon(t, homePaths, &cfg) + state := &bootState{ + cfg: cfg, + registry: &modelCatalogStoreRegistry{ + recordingRegistry: &recordingRegistry{path: homePaths.DatabaseFile}, + }, + } + if err := daemonInstance.bootModelCatalog(testutil.Context(t), state, &bootCleanup{}); err == nil { + t.Fatal("bootModelCatalog(invalid timeout) error = nil, want validation error") + } + + cfg = testConfig(t, homePaths) + cfg.ModelCatalog.Sources.ModelsDev.TTL = "not-a-duration" + state = &bootState{cfg: cfg} + if _, err := daemonInstance.modelCatalogSources(state, nil, defaultModelCatalogRefreshTimeout); err == nil { + t.Fatal("modelCatalogSources(invalid ttl) error = nil, want validation error") + } + }) + + t.Run("Should use env secret resolver when vault is unavailable", func(t *testing.T) { + t.Parallel() + + homePaths := testHomePaths(t) + cfg := testConfig(t, homePaths) + daemonInstance := newTestDaemon(t, homePaths, &cfg) + daemonInstance.getenv = func(key string) string { + if key == "MODEL_CATALOG_TEST_KEY" { + return "secret-value" + } + return "" + } + value, err := daemonInstance.modelCatalogSecretResolver(&bootState{}). 
+ ResolveRef(testutil.Context(t), "env:MODEL_CATALOG_TEST_KEY") + if err != nil { + t.Fatalf("ResolveRef(env) error = %v", err) + } + if value != "secret-value" { + t.Fatalf("ResolveRef(env) = %q, want secret-value", value) + } + }) +} + +func bootModelCatalogTestDaemon( + t *testing.T, + mutate func(*aghconfig.Config), +) (*Daemon, RuntimeDeps, RuntimeDeps) { + t.Helper() + + homePaths := testHomePaths(t) + cfg := testConfig(t, homePaths) + cfg.Memory.Enabled = false + cfg.Network.Enabled = false + cfg.Skills.Enabled = false + modelsDevEnabled := false + cfg.ModelCatalog.Sources.ModelsDev.Enabled = &modelsDevEnabled + if mutate != nil { + mutate(&cfg) + } + + daemonInstance := newTestDaemon(t, homePaths, &cfg) + daemonInstance.newSessionManager = func(context.Context, SessionManagerDeps) (SessionManager, error) { + return &fakeSessionManager{}, nil + } + daemonInstance.newObserver = func(context.Context, RuntimeDeps) (Observer, error) { + return &fakeObserver{}, nil + } + + var httpDeps RuntimeDeps + var udsDeps RuntimeDeps + daemonInstance.httpFactory = func(_ context.Context, deps RuntimeDeps) (Server, error) { + httpDeps = deps + return &fakeServer{name: "http"}, nil + } + daemonInstance.udsFactory = func(_ context.Context, deps RuntimeDeps) (Server, error) { + udsDeps = deps + return &fakeServer{name: "uds"}, nil + } + + if err := daemonInstance.boot(testutil.Context(t)); err != nil { + t.Fatalf("boot() error = %v", err) + } + t.Cleanup(func() { + if err := daemonInstance.Shutdown(testutil.Context(t)); err != nil { + t.Fatalf("Shutdown() error = %v", err) + } + }) + return daemonInstance, httpDeps, udsDeps +} + +func containsCatalogModel(models []modelcatalog.Model, providerID string, modelID string) bool { + for _, model := range models { + if model.ProviderID == providerID && model.ModelID == modelID { + return true + } + } + return false +} + +func findSourceStatus( + statuses []modelcatalog.SourceStatus, + sourceID string, +) (modelcatalog.SourceStatus, 
bool) { + for _, status := range statuses { + if status.SourceID == sourceID { + return status, true + } + } + return modelcatalog.SourceStatus{}, false +} + +type blockingModelCatalogService struct { + started chan struct{} + released chan struct{} + startOnce sync.Once + releasedOnce sync.Once +} + +type recordingModelCatalogService struct { + models []modelcatalog.Model + refreshCalls int + lastRefresh modelcatalog.RefreshOptions + lastList modelcatalog.ListOptions +} + +func (s *recordingModelCatalogService) ListModels( + _ context.Context, + opts modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + s.lastList = opts + return append([]modelcatalog.Model(nil), s.models...), nil +} + +func (s *recordingModelCatalogService) Refresh( + _ context.Context, + opts modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + s.refreshCalls++ + s.lastRefresh = opts + return nil, nil +} + +func (s *recordingModelCatalogService) ListSourceStatus( + context.Context, + string, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +type manuallyReleasedModelCatalogService struct { + started chan struct{} + release chan struct{} + released chan struct{} + once sync.Once +} + +func newManuallyReleasedModelCatalogService() *manuallyReleasedModelCatalogService { + return &manuallyReleasedModelCatalogService{ + started: make(chan struct{}), + release: make(chan struct{}), + released: make(chan struct{}), + } +} + +func (s *manuallyReleasedModelCatalogService) ListModels( + context.Context, + modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + return nil, nil +} + +func (s *manuallyReleasedModelCatalogService) Refresh( + ctx context.Context, + _ modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + s.once.Do(func() { + close(s.started) + }) + <-s.release + close(s.released) + return nil, ctx.Err() +} + +func (s *manuallyReleasedModelCatalogService) ListSourceStatus( + context.Context, + string, +) 
([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +type modelCatalogStoreRegistry struct { + *recordingRegistry +} + +func (r *modelCatalogStoreRegistry) ReplaceSourceRows( + context.Context, + string, + string, + []modelcatalog.ModelRow, + modelcatalog.SourceStatus, +) error { + return nil +} + +func (r *modelCatalogStoreRegistry) ListRows( + context.Context, + modelcatalog.ListOptions, +) ([]modelcatalog.ModelRow, error) { + return nil, nil +} + +func (r *modelCatalogStoreRegistry) ListSourceStatus( + context.Context, + string, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +func newBlockingModelCatalogService() *blockingModelCatalogService { + return &blockingModelCatalogService{ + started: make(chan struct{}), + released: make(chan struct{}), + } +} + +func (s *blockingModelCatalogService) ListModels( + context.Context, + modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + return nil, nil +} + +func (s *blockingModelCatalogService) Refresh( + ctx context.Context, + _ modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + s.startOnce.Do(func() { + close(s.started) + }) + <-ctx.Done() + s.releasedOnce.Do(func() { + close(s.released) + }) + return nil, ctx.Err() +} + +func (s *blockingModelCatalogService) ListSourceStatus( + context.Context, + string, +) ([]modelcatalog.SourceStatus, error) { + return nil, nil +} + +func waitForCatalogTestSignal(t *testing.T, ch <-chan struct{}, label string) { + t.Helper() + + select { + case <-ch: + case <-time.After(time.Second): + t.Fatalf("timeout waiting for %s", label) + } +} + +func waitForCatalogTestError(t *testing.T, ch <-chan error, label string) error { + t.Helper() + + select { + case err := <-ch: + return err + case <-time.After(time.Second): + t.Fatalf("timeout waiting for %s", label) + } + return nil +} diff --git a/internal/daemon/tool_approval_bridge.go b/internal/daemon/tool_approval_bridge.go index 21cb4526a..1b9d90d5d 100644 --- 
a/internal/daemon/tool_approval_bridge.go +++ b/internal/daemon/tool_approval_bridge.go @@ -113,7 +113,7 @@ func (b *toolApprovalBridge) requestSessionToolApproval( sessionID, acp.RequestPermissionRequest{ SessionId: acpsdk.SessionId(sessionID), - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: acpsdk.ToolCallId(toolApprovalCallID(call, view)), Title: acpsdk.Ptr(toolApprovalTitle(descriptor)), Kind: acpsdk.Ptr(toolApprovalKind(descriptor)), diff --git a/internal/extension/capability.go b/internal/extension/capability.go index 959d3dc83..2681f7c93 100644 --- a/internal/extension/capability.go +++ b/internal/extension/capability.go @@ -77,6 +77,9 @@ var ( "memory/forget": "memory.write", "memory/recall": "memory.read", "memory/store": "memory.write", + "models/list": "model.read", + "models/refresh": "model.write", + "models/status": "model.read", "network/status": "network.read", "network/channels": "network.read", "network/peers": "network.read", diff --git a/internal/extension/capability_models_test.go b/internal/extension/capability_models_test.go new file mode 100644 index 000000000..857198a69 --- /dev/null +++ b/internal/extension/capability_models_test.go @@ -0,0 +1,130 @@ +package extensionpkg + +import ( + "errors" + "slices" + "testing" +) + +func TestCapabilityCheckerModelHostAPIMethods(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + actions []string + security []string + method string + wantError bool + wantNeeded []string + }{ + { + name: "Should allow models list with read grant", + actions: []string{"models/list"}, + security: []string{"model.read"}, + method: "models/list", + }, + { + name: "Should allow models status with read grant", + actions: []string{"models/status"}, + security: []string{"model.read"}, + method: "models/status", + }, + { + name: "Should allow models refresh with write grant", + actions: []string{"models/refresh"}, + security: []string{"model.write"}, + method: 
"models/refresh", + }, + { + name: "Should reject models refresh without write grant", + actions: []string{"models/refresh"}, + security: []string{"model.read"}, + method: "models/refresh", + wantError: true, + wantNeeded: []string{"model.write"}, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + checker := newTestCapabilityChecker("ext", SourceUser, tt.actions, tt.security) + err := checker.CheckHostAPI("ext", tt.method) + if !tt.wantError { + if err != nil { + t.Fatalf("CheckHostAPI(%q) error = %v, want nil", tt.method, err) + } + return + } + if err == nil { + t.Fatalf("CheckHostAPI(%q) error = nil, want capability denied", tt.method) + } + var denied *ErrCapabilityDenied + if !errors.As(err, &denied) { + t.Fatalf("CheckHostAPI(%q) error = %T, want *ErrCapabilityDenied", tt.method, err) + } + if !slices.Equal(denied.Data.Required, tt.wantNeeded) { + t.Fatalf("Data.Required = %v, want %v", denied.Data.Required, tt.wantNeeded) + } + }) + } +} + +func TestCapabilityCheckerMarketplaceModelCeilings(t *testing.T) { + t.Parallel() + + t.Run("Should deny marketplace model Host API methods", func(t *testing.T) { + t.Parallel() + + checker := newTestCapabilityChecker( + "ext", + SourceMarketplace, + []string{"models/list", "models/refresh", "models/status"}, + []string{"model.read", "model.write"}, + ) + for _, method := range []string{"models/list", "models/refresh", "models/status"} { + err := checker.CheckHostAPI("ext", method) + if err == nil { + t.Fatalf("CheckHostAPI(%q) error = nil, want capability denied", method) + } + var denied *ErrCapabilityDenied + if !errors.As(err, &denied) { + t.Fatalf("CheckHostAPI(%q) error = %T, want *ErrCapabilityDenied", method, err) + } + } + }) + + t.Run("Should remove marketplace model grants from effective grant", func(t *testing.T) { + t.Parallel() + + checker := &CapabilityChecker{} + checker.Register("ext", SourceMarketplace, &Manifest{ + Actions: ActionsConfig{ + Requires: 
[]string{"models/list", "models/refresh", "models/status", "sessions/list"}, + }, + Security: SecurityConfig{ + Capabilities: []string{"model.read", "model.write", "session.read"}, + }, + }) + + grant := checker.Grant("ext") + if slices.Contains(grant.Actions, "models/list") || + slices.Contains(grant.Actions, "models/refresh") || + slices.Contains(grant.Actions, "models/status") { + t.Fatalf("Grant.Actions = %v, want marketplace model actions denied by source tier ceiling", grant.Actions) + } + if slices.Contains(grant.Security, "model.read") || slices.Contains(grant.Security, "model.write") { + t.Fatalf( + "Grant.Security = %v, want marketplace model security denied by source tier ceiling", + grant.Security, + ) + } + if !slices.Equal(grant.Actions, []string{"sessions/list"}) { + t.Fatalf("Grant.Actions = %v, want [sessions/list]", grant.Actions) + } + if !slices.Equal(grant.Security, []string{"session.read"}) { + t.Fatalf("Grant.Security = %v, want [session.read]", grant.Security) + } + }) +} diff --git a/internal/extension/contract/host_api.go b/internal/extension/contract/host_api.go index ec0a8d869..5654104a9 100644 --- a/internal/extension/contract/host_api.go +++ b/internal/extension/contract/host_api.go @@ -37,6 +37,9 @@ const ( HostAPIMethodObserveHealth = extensionprotocol.HostAPIMethodObserveHealth HostAPIMethodObserveEvents = extensionprotocol.HostAPIMethodObserveEvents HostAPIMethodSkillsList = extensionprotocol.HostAPIMethodSkillsList + HostAPIMethodModelsList = extensionprotocol.HostAPIMethodModelsList + HostAPIMethodModelsRefresh = extensionprotocol.HostAPIMethodModelsRefresh + HostAPIMethodModelsStatus = extensionprotocol.HostAPIMethodModelsStatus HostAPIMethodAgentsSoulGet = extensionprotocol.HostAPIMethodAgentsSoulGet HostAPIMethodAgentsSoulValidate = extensionprotocol.HostAPIMethodAgentsSoulValidate HostAPIMethodAgentsSoulPut = extensionprotocol.HostAPIMethodAgentsSoulPut @@ -128,10 +131,12 @@ type SessionsListParams struct { // 
SessionsCreateParams starts a new session. type SessionsCreateParams struct { - Agent string `json:"agent"` - Prompt string `json:"prompt,omitempty"` - Provider string `json:"provider,omitempty"` - Workspace string `json:"workspace,omitempty"` + Agent string `json:"agent"` + Prompt string `json:"prompt,omitempty"` + Provider string `json:"provider,omitempty"` + Model string `json:"model,omitempty"` + ReasoningEffort string `json:"reasoning_effort,omitempty"` + Workspace string `json:"workspace,omitempty"` } // SessionsPromptParams submits one prompt to an existing session. @@ -224,6 +229,61 @@ type SkillsListParams struct { ForAgent string `json:"for_agent,omitempty"` } +// ModelsListParams filters daemon-owned model catalog projections. +type ModelsListParams struct { + ProviderID string `json:"provider_id,omitempty"` + SourceID string `json:"source_id,omitempty"` + Refresh bool `json:"refresh,omitempty"` + IncludeStale bool `json:"include_stale,omitempty"` +} + +// ModelsRefreshParams requests a daemon-owned model catalog refresh. +type ModelsRefreshParams struct { + ProviderID string `json:"provider_id,omitempty"` + SourceID string `json:"source_id,omitempty"` + Force bool `json:"force,omitempty"` + RequestID string `json:"request_id,omitempty"` +} + +// ModelsStatusParams filters daemon-owned model catalog source status rows. +type ModelsStatusParams struct { + ProviderID string `json:"provider_id,omitempty"` +} + +// ModelSourceListParams is sent by AGH to extension model sources. +type ModelSourceListParams struct { + ProviderID string `json:"provider_id,omitempty"` + Refresh bool `json:"refresh,omitempty"` + IncludeStale bool `json:"include_stale,omitempty"` +} + +// ModelSourceListResponse is returned by extension model sources. +type ModelSourceListResponse struct { + Rows []ModelSourceRow `json:"rows"` +} + +// ModelSourceRow is one extension-provided model catalog source row. 
+type ModelSourceRow struct { + SourceID string `json:"source_id"` + ProviderID string `json:"provider_id"` + ModelID string `json:"model_id"` + DisplayName string `json:"display_name,omitempty"` + Priority int `json:"priority,omitempty"` + Available *bool `json:"available,omitempty"` + Stale bool `json:"stale,omitempty"` + RefreshedAt time.Time `json:"refreshed_at"` + ExpiresAt time.Time `json:"expires_at"` + ContextWindow *int64 `json:"context_window,omitempty"` + MaxInputTokens *int64 `json:"max_input_tokens,omitempty"` + MaxOutputTokens *int64 `json:"max_output_tokens,omitempty"` + SupportsTools *bool `json:"supports_tools,omitempty"` + SupportsReasoning *bool `json:"supports_reasoning,omitempty"` + ReasoningEfforts []string `json:"reasoning_efforts,omitempty"` + DefaultReasoningEffort *string `json:"default_reasoning_effort,omitempty"` + Cost *apicontract.ModelCatalogCostPayload `json:"cost,omitempty"` + LastError string `json:"last_error,omitempty"` +} + // AgentSoulGetParams identifies one workspace-visible Soul read model. 
type AgentSoulGetParams struct { WorkspaceID string `json:"workspace_id,omitempty"` @@ -755,6 +815,30 @@ var hostAPIMethodSpecs = []HostAPIMethodSpec{ Result: NamedType{Name: "SkillSummary", Value: []SkillSummary{}}, OptionalParams: true, }, + { + Method: HostAPIMethodModelsList, + Params: NamedType{Name: "ModelsListParams", Value: ModelsListParams{}}, + Result: NamedType{Name: "ProviderModelListResponse", Value: apicontract.ProviderModelListResponse{}}, + OptionalParams: true, + }, + { + Method: HostAPIMethodModelsRefresh, + Params: NamedType{Name: "ModelsRefreshParams", Value: ModelsRefreshParams{}}, + Result: NamedType{ + Name: "ProviderModelRefreshResponse", + Value: apicontract.ProviderModelRefreshResponse{}, + }, + OptionalParams: true, + }, + { + Method: HostAPIMethodModelsStatus, + Params: NamedType{Name: "ModelsStatusParams", Value: ModelsStatusParams{}}, + Result: NamedType{ + Name: "ProviderModelStatusResponse", + Value: apicontract.ProviderModelStatusResponse{}, + }, + OptionalParams: true, + }, { Method: HostAPIMethodAgentsSoulGet, Params: NamedType{Name: "AgentSoulGetParams", Value: AgentSoulGetParams{}}, diff --git a/internal/extension/contract/sdk.go b/internal/extension/contract/sdk.go index c26bd3596..c70a3974a 100644 --- a/internal/extension/contract/sdk.go +++ b/internal/extension/contract/sdk.go @@ -111,6 +111,9 @@ var sdkRootTypes = []NamedType{ {Name: "ExtensionProvideToolsResponse", Value: tools.ExtensionProvideToolsResponse{}}, {Name: "ExtensionToolCallRequest", Value: tools.ExtensionToolCallRequest{}}, {Name: "ExtensionToolCallResponse", Value: tools.ExtensionToolCallResponse{}}, + {Name: "ModelSourceListParams", Value: ModelSourceListParams{}}, + {Name: "ModelSourceListResponse", Value: ModelSourceListResponse{}}, + {Name: "ModelSourceRow", Value: ModelSourceRow{}}, {Name: "MemoryScope", Value: memcontract.Scope("")}, {Name: "HookEventFamily", Value: hooks.HookEventFamily("")}, {Name: "HookRunOutcome", Value: hooks.HookRunOutcome("")}, 
diff --git a/internal/extension/host_api.go b/internal/extension/host_api.go index 6a7c28eb1..49967d9b0 100644 --- a/internal/extension/host_api.go +++ b/internal/extension/host_api.go @@ -77,6 +77,7 @@ type HostAPIHandler struct { memory *memory.Store observer hostAPIObserver skills hostAPISkillsRegistry + modelCatalog hostAPIModelCatalogService workspaces workspacepkg.RuntimeResolver bridges hostAPIBridgeRegistry dedupStore hostAPIBridgeDedupStore @@ -562,6 +563,9 @@ func hostAPIMethodHandlers(handler *HostAPIHandler) map[string]hostAPIMethodFunc "memory/store": handler.handleMemoryStore, "observe/events": handler.handleObserveEvents, "observe/health": handler.handleObserveHealth, + string(extensioncontract.HostAPIMethodModelsList): handler.handleModelsList, + string(extensioncontract.HostAPIMethodModelsRefresh): handler.handleModelsRefresh, + string(extensioncontract.HostAPIMethodModelsStatus): handler.handleModelsStatus, string(extensioncontract.HostAPIMethodAgentsSoulGet): handler.handleAgentsSoulGet, string(extensioncontract.HostAPIMethodAgentsSoulValidate): handler.handleAgentsSoulValidate, string(extensioncontract.HostAPIMethodAgentsSoulPut): handler.handleAgentsSoulPut, @@ -883,10 +887,12 @@ func (h *HostAPIHandler) handleSessionsCreate(ctx context.Context, raw json.RawM } sess, err := h.sessions.Create(ctx, session.CreateOpts{ - AgentName: strings.TrimSpace(params.Agent), - Provider: strings.TrimSpace(params.Provider), - Workspace: strings.TrimSpace(params.Workspace), - Type: session.SessionTypeSystem, + AgentName: strings.TrimSpace(params.Agent), + Provider: strings.TrimSpace(params.Provider), + Model: strings.TrimSpace(params.Model), + ReasoningEffort: strings.TrimSpace(params.ReasoningEffort), + Workspace: strings.TrimSpace(params.Workspace), + Type: session.SessionTypeSystem, }) if err != nil { return nil, err diff --git a/internal/extension/host_api_models.go b/internal/extension/host_api_models.go new file mode 100644 index 000000000..e9269fafc --- 
/dev/null +++ b/internal/extension/host_api_models.go @@ -0,0 +1,278 @@ +package extensionpkg + +import ( + "context" + "encoding/json" + "errors" + "fmt" + "strings" + "time" + + apicontract "github.com/pedronauck/agh/internal/api/contract" + extensioncontract "github.com/pedronauck/agh/internal/extension/contract" + "github.com/pedronauck/agh/internal/modelcatalog" +) + +type hostAPIModelCatalogService interface { + ListModels(ctx context.Context, opts modelcatalog.ListOptions) ([]modelcatalog.Model, error) + Refresh(ctx context.Context, opts modelcatalog.RefreshOptions) ([]modelcatalog.SourceStatus, error) + ListSourceStatus(ctx context.Context, providerID string) ([]modelcatalog.SourceStatus, error) +} + +// WithHostAPIModelCatalogService injects daemon-owned model catalog projections. +func WithHostAPIModelCatalogService(service modelcatalog.Service) HostAPIOption { + return func(handler *HostAPIHandler) { + handler.modelCatalog = service + } +} + +func (h *HostAPIHandler) handleModelsList( + ctx context.Context, + raw json.RawMessage, +) (any, error) { + var params extensioncontract.ModelsListParams + if err := decodeHostAPIParams(raw, &params); err != nil { + return nil, err + } + sourceID, err := validateHostAPIModelSourceID(params.SourceID) + if err != nil { + return nil, invalidParamsRPCError(err) + } + providerID, err := validateHostAPIModelProviderID(params.ProviderID) + if err != nil { + return nil, invalidParamsRPCError(err) + } + service, err := h.modelCatalogService() + if err != nil { + return nil, unavailableRPCError(err) + } + models, err := service.ListModels(ctx, modelcatalog.ListOptions{ + ProviderID: providerID, + SourceID: sourceID, + Refresh: params.Refresh, + IncludeStale: params.IncludeStale, + Now: h.hostAPINow(), + }) + if err != nil { + return nil, hostAPIModelCatalogRPCError(err) + } + return hostAPIProviderModelListPayloadFromModels(models), nil +} + +func (h *HostAPIHandler) handleModelsRefresh( + ctx context.Context, + raw 
json.RawMessage, +) (any, error) { + var params extensioncontract.ModelsRefreshParams + if err := decodeHostAPIParams(raw, &params); err != nil { + return nil, err + } + sourceID, err := validateHostAPIModelSourceID(params.SourceID) + if err != nil { + return nil, invalidParamsRPCError(err) + } + providerID, err := validateHostAPIModelProviderID(params.ProviderID) + if err != nil { + return nil, invalidParamsRPCError(err) + } + service, err := h.modelCatalogService() + if err != nil { + return nil, unavailableRPCError(err) + } + statuses, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: providerID, + SourceID: sourceID, + Force: params.Force, + RequestID: strings.TrimSpace(params.RequestID), + Now: h.hostAPINow(), + }) + payload := apicontract.ProviderModelRefreshResponse{ + Sources: hostAPISourceStatusPayloadsFromStatuses(statuses), + } + if err != nil { + if len(payload.Sources) > 0 { + payload.Error = modelcatalog.RedactString(err.Error()) + return payload, nil + } + return nil, hostAPIModelCatalogRPCError(err) + } + return payload, nil +} + +func (h *HostAPIHandler) handleModelsStatus( + ctx context.Context, + raw json.RawMessage, +) (any, error) { + var params extensioncontract.ModelsStatusParams + if err := decodeHostAPIParams(raw, &params); err != nil { + return nil, err + } + providerID, err := validateHostAPIModelProviderID(params.ProviderID) + if err != nil { + return nil, invalidParamsRPCError(err) + } + service, err := h.modelCatalogService() + if err != nil { + return nil, unavailableRPCError(err) + } + statuses, err := service.ListSourceStatus(ctx, providerID) + if err != nil { + return nil, unavailableRPCError(err) + } + return apicontract.ProviderModelStatusResponse{ + Sources: hostAPISourceStatusPayloadsFromStatuses(statuses), + }, nil +} + +func (h *HostAPIHandler) modelCatalogService() (hostAPIModelCatalogService, error) { + if h == nil || h.modelCatalog == nil { + return nil, errors.New("extension: model catalog service is 
unavailable") + } + return h.modelCatalog, nil +} + +func (h *HostAPIHandler) hostAPINow() time.Time { + if h == nil || h.now == nil { + return time.Now().UTC() + } + return h.now().UTC() +} + +func validateHostAPIModelSourceID(sourceID string) (string, error) { + trimmed := strings.TrimSpace(sourceID) + if trimmed == "" { + return "", nil + } + if err := modelcatalog.ValidateSourceID(trimmed); err != nil { + return "", err + } + return trimmed, nil +} + +func validateHostAPIModelProviderID(providerID string) (string, error) { + trimmed := strings.TrimSpace(providerID) + if trimmed == "" { + return "", nil + } + for idx, ch := range trimmed { + valid := ch >= 'a' && ch <= 'z' || + ch >= '0' && ch <= '9' || + (idx > 0 && (ch == '-' || ch == '_')) + if !valid { + return "", fmt.Errorf("provider_id %q must match ^[a-z0-9][a-z0-9_-]*$", providerID) + } + } + return trimmed, nil +} + +func hostAPIModelCatalogRPCError(err error) error { + if err == nil { + return nil + } + if errors.Is(err, modelcatalog.ErrSourceNotRegistered) { + return invalidParamsRPCError(err) + } + return unavailableRPCError(errors.New(modelcatalog.RedactString(err.Error()))) +} + +func hostAPIProviderModelListPayloadFromModels(models []modelcatalog.Model) apicontract.ProviderModelListResponse { + payload := apicontract.ProviderModelListResponse{ + Models: make([]apicontract.ProviderModelPayload, 0, len(models)), + } + for _, model := range models { + payload.Models = append(payload.Models, hostAPIProviderModelPayloadFromModel(model)) + } + return payload +} + +func hostAPIProviderModelPayloadFromModel(model modelcatalog.Model) apicontract.ProviderModelPayload { + return apicontract.ProviderModelPayload{ + ProviderID: model.ProviderID, + ModelID: model.ModelID, + DisplayName: model.DisplayName, + Sources: hostAPISourceRefPayloadsFromRefs(model.Sources), + Available: model.Available, + AvailabilityState: model.AvailabilityState, + Stale: model.Stale, + RefreshedAt: 
hostAPIModelCatalogTimeString(model.RefreshedAt), + ContextWindow: model.ContextWindow, + MaxInputTokens: model.MaxInputTokens, + MaxOutputTokens: model.MaxOutputTokens, + SupportsTools: model.SupportsTools, + SupportsReasoning: model.SupportsReasoning, + ReasoningEfforts: hostAPIReasoningEffortStrings(model.ReasoningEfforts), + DefaultReasoningEffort: hostAPIReasoningEffortStringPtr(model.DefaultReasoningEffort), + Cost: hostAPICostPayloadFromModel(model), + LastError: modelcatalog.RedactString(model.LastError), + } +} + +func hostAPISourceRefPayloadsFromRefs(refs []modelcatalog.SourceRef) []apicontract.ModelCatalogSourceRefPayload { + payloads := make([]apicontract.ModelCatalogSourceRefPayload, 0, len(refs)) + for _, ref := range refs { + payloads = append(payloads, apicontract.ModelCatalogSourceRefPayload{ + SourceID: ref.SourceID, + SourceKind: string(ref.SourceKind), + Priority: ref.Priority, + RefreshedAt: hostAPIModelCatalogTimeString(ref.RefreshedAt), + Stale: ref.Stale, + LastError: modelcatalog.RedactString(ref.LastError), + }) + } + return payloads +} + +func hostAPISourceStatusPayloadsFromStatuses( + statuses []modelcatalog.SourceStatus, +) []apicontract.ModelCatalogSourceStatusPayload { + payloads := make([]apicontract.ModelCatalogSourceStatusPayload, 0, len(statuses)) + for _, status := range statuses { + payloads = append(payloads, apicontract.ModelCatalogSourceStatusPayload{ + SourceID: status.SourceID, + SourceKind: string(status.SourceKind), + ProviderID: status.ProviderID, + Priority: status.Priority, + LastRefresh: hostAPIModelCatalogTimeString(status.LastRefresh), + NextRefresh: hostAPIModelCatalogTimeString(status.NextRefresh), + LastSuccess: hostAPIModelCatalogTimeString(status.LastSuccess), + LastError: modelcatalog.RedactString(status.LastError), + RefreshState: status.RefreshState, + RowCount: status.RowCount, + Stale: status.Stale, + }) + } + return payloads +} + +func hostAPICostPayloadFromModel(model modelcatalog.Model) 
*apicontract.ModelCatalogCostPayload { + if model.CostInputPerMillion == nil && model.CostOutputPerMillion == nil { + return nil + } + return &apicontract.ModelCatalogCostPayload{ + InputPerMillion: model.CostInputPerMillion, + OutputPerMillion: model.CostOutputPerMillion, + } +} + +func hostAPIReasoningEffortStrings(efforts []modelcatalog.ReasoningEffort) []string { + values := make([]string, 0, len(efforts)) + for _, effort := range efforts { + values = append(values, string(effort)) + } + return values +} + +func hostAPIReasoningEffortStringPtr(effort *modelcatalog.ReasoningEffort) *string { + if effort == nil { + return nil + } + value := string(*effort) + return &value +} + +func hostAPIModelCatalogTimeString(value time.Time) string { + if value.IsZero() { + return "" + } + return value.UTC().Format(time.RFC3339Nano) +} diff --git a/internal/extension/host_api_models_test.go b/internal/extension/host_api_models_test.go new file mode 100644 index 000000000..7f43b58aa --- /dev/null +++ b/internal/extension/host_api_models_test.go @@ -0,0 +1,505 @@ +package extensionpkg + +import ( + "context" + "encoding/json" + "errors" + "fmt" + "strings" + "testing" + "time" + + apicontract "github.com/pedronauck/agh/internal/api/contract" + "github.com/pedronauck/agh/internal/modelcatalog" + "github.com/pedronauck/agh/internal/subprocess" + "github.com/pedronauck/agh/internal/testutil" +) + +func TestHostAPIModelsListShouldReturnDaemonProjection(t *testing.T) { + t.Parallel() + + t.Run("Should return daemon projection", func(t *testing.T) { + t.Parallel() + + now := time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC) + available := true + cost := 2.5 + defaultEffort := modelcatalog.ReasoningEffortHigh + service := &fakeHostAPIModelCatalogService{ + models: []modelcatalog.Model{ + { + ProviderID: "codex", + ModelID: "daemon-model", + DisplayName: "Daemon Model", + Available: &available, + AvailabilityState: string(modelcatalog.AvailabilityStateAvailableLive), + RefreshedAt: now, + 
Sources: []modelcatalog.SourceRef{ + { + SourceID: "config", + SourceKind: modelcatalog.SourceKindConfig, + Priority: modelcatalog.PriorityConfig, + RefreshedAt: now, + LastError: "source failed with OAUTH_TOKEN=oauth-host-secret-token", + }, + }, + ReasoningEfforts: []modelcatalog.ReasoningEffort{modelcatalog.ReasoningEffortHigh}, + DefaultReasoningEffort: &defaultEffort, + CostInputPerMillion: &cost, + CostOutputPerMillion: &cost, + LastError: "model failed with api_key=sk-host-secret-token", + }, + }, + } + handler := NewHostAPIHandler( + nil, + nil, + nil, + nil, + WithHostAPIModelCatalogService(service), + WithHostAPICapabilityChecker(newTestCapabilityChecker( + "ext", + SourceUser, + []string{"models/list"}, + []string{"model.read"}, + )), + WithHostAPINow(func() time.Time { return now }), + ) + + result, err := handler.Handle( + testutil.Context(t), + "ext", + "models/list", + json.RawMessage(`{"provider_id":"codex","source_id":"extension:ext-models","include_stale":true}`), + ) + if err != nil { + t.Fatalf("Handle(models/list) error = %v, want nil", err) + } + payload, ok := result.(apicontract.ProviderModelListResponse) + if !ok { + t.Fatalf("Handle(models/list) result = %T, want ProviderModelListResponse", result) + } + if len(payload.Models) != 1 { + t.Fatalf("len(result.Models) = %d, want 1", len(payload.Models)) + } + model := payload.Models[0] + if model.ModelID != "daemon-model" || model.Sources[0].SourceID != "config" { + t.Fatalf("models/list payload = %#v, want daemon projection from model catalog service", model) + } + if model.DefaultReasoningEffort == nil || *model.DefaultReasoningEffort != "high" { + t.Fatalf("models/list default reasoning effort = %#v, want high", model.DefaultReasoningEffort) + } + assertRedactedHostAPIModelPayload(t, model.LastError, "sk-host-secret-token") + assertRedactedHostAPIModelPayload(t, model.Sources[0].LastError, "oauth-host-secret-token") + if len(service.listOpts) != 1 { + t.Fatalf("len(service.listOpts) = %d, 
want 1", len(service.listOpts)) + } + opts := service.listOpts[0] + if opts.ProviderID != "codex" || opts.SourceID != "extension:ext-models" || !opts.IncludeStale { + t.Fatalf("ListModels opts = %#v, want decoded Host API filters", opts) + } + }) +} + +func TestHostAPIModelsRefreshShouldReturnStatusPayloadOnSourceFailure(t *testing.T) { + t.Parallel() + + t.Run("Should return status payload on source failure", func(t *testing.T) { + t.Parallel() + + now := time.Date(2026, 5, 7, 12, 15, 0, 0, time.UTC) + secret := "sk-host-refresh-secret-token" + service := &fakeHostAPIModelCatalogService{ + statuses: []modelcatalog.SourceStatus{ + { + SourceID: "extension:ext-models", + SourceKind: modelcatalog.SourceKindExtension, + ProviderID: "codex", + Priority: modelcatalog.PriorityExtension, + LastRefresh: now, + LastError: "extension unavailable api_key=" + secret, + RefreshState: string(modelcatalog.RefreshStateFailed), + Stale: true, + }, + }, + refreshErr: fmt.Errorf("%w: api_key=%s", modelcatalog.ErrAllSourcesFailed, secret), + } + handler := NewHostAPIHandler( + nil, + nil, + nil, + nil, + WithHostAPIModelCatalogService(service), + WithHostAPICapabilityChecker(newTestCapabilityChecker( + "ext", + SourceUser, + []string{"models/refresh"}, + []string{"model.write"}, + )), + WithHostAPINow(func() time.Time { return now }), + ) + + result, err := handler.Handle( + testutil.Context(t), + "ext", + "models/refresh", + json.RawMessage(`{"provider_id":"codex","source_id":"extension:ext-models","force":true}`), + ) + if err != nil { + t.Fatalf("Handle(models/refresh) error = %v, want status payload with error field", err) + } + payload, ok := result.(apicontract.ProviderModelRefreshResponse) + if !ok { + t.Fatalf("Handle(models/refresh) result = %T, want ProviderModelRefreshResponse", result) + } + if payload.Error == "" || len(payload.Sources) != 1 || payload.Sources[0].RefreshState != "failed" { + t.Fatalf("models/refresh payload = %#v, want failed source status and error", 
payload) + } + assertRedactedHostAPIModelPayload(t, payload.Error, secret) + assertRedactedHostAPIModelPayload(t, payload.Sources[0].LastError, secret) + if len(service.refreshOpts) != 1 || !service.refreshOpts[0].Force { + t.Fatalf("Refresh opts = %#v, want force refresh recorded", service.refreshOpts) + } + }) +} + +func TestHostAPIModelsRefreshShouldReturnSuccessfulSourceStatus(t *testing.T) { + t.Parallel() + + t.Run("Should return successful source status", func(t *testing.T) { + t.Parallel() + + now := time.Date(2026, 5, 7, 12, 20, 0, 0, time.UTC) + service := &fakeHostAPIModelCatalogService{ + statuses: []modelcatalog.SourceStatus{ + { + SourceID: "extension:ext-models", + SourceKind: modelcatalog.SourceKindExtension, + ProviderID: "codex", + Priority: modelcatalog.PriorityExtension, + LastRefresh: now, + RefreshState: string(modelcatalog.RefreshStateSucceeded), + RowCount: 1, + }, + }, + } + handler := NewHostAPIHandler( + nil, + nil, + nil, + nil, + WithHostAPIModelCatalogService(service), + WithHostAPICapabilityChecker(newTestCapabilityChecker( + "ext", + SourceUser, + []string{"models/refresh"}, + []string{"model.write"}, + )), + ) + + result, err := handler.Handle( + testutil.Context(t), + "ext", + "models/refresh", + json.RawMessage(`{"provider_id":"codex"}`), + ) + if err != nil { + t.Fatalf("Handle(models/refresh) error = %v, want nil", err) + } + payload, ok := result.(apicontract.ProviderModelRefreshResponse) + if !ok { + t.Fatalf("Handle(models/refresh) result = %T, want ProviderModelRefreshResponse", result) + } + if payload.Error != "" || len(payload.Sources) != 1 || payload.Sources[0].RefreshState != "succeeded" { + t.Fatalf("models/refresh payload = %#v, want successful source status", payload) + } + }) +} + +func TestHostAPIModelsStatusShouldReturnDaemonSourceStatus(t *testing.T) { + t.Parallel() + + t.Run("Should return daemon source status", func(t *testing.T) { + t.Parallel() + + now := time.Date(2026, 5, 7, 12, 30, 0, 0, time.UTC) + 
service := &fakeHostAPIModelCatalogService{ + statuses: []modelcatalog.SourceStatus{ + { + SourceID: "extension:ext-models", + SourceKind: modelcatalog.SourceKindExtension, + ProviderID: "codex", + Priority: modelcatalog.PriorityExtension, + LastRefresh: now, + RefreshState: string(modelcatalog.RefreshStateSucceeded), + RowCount: 1, + }, + }, + } + handler := NewHostAPIHandler( + nil, + nil, + nil, + nil, + WithHostAPIModelCatalogService(service), + WithHostAPICapabilityChecker(newTestCapabilityChecker( + "ext", + SourceUser, + []string{"models/status"}, + []string{"model.read"}, + )), + ) + + result, err := handler.Handle( + testutil.Context(t), + "ext", + "models/status", + json.RawMessage(`{"provider_id":"codex"}`), + ) + if err != nil { + t.Fatalf("Handle(models/status) error = %v, want nil", err) + } + payload, ok := result.(apicontract.ProviderModelStatusResponse) + if !ok { + t.Fatalf("Handle(models/status) result = %T, want ProviderModelStatusResponse", result) + } + if len(payload.Sources) != 1 || payload.Sources[0].SourceID != "extension:ext-models" { + t.Fatalf("models/status payload = %#v, want extension status", payload) + } + if len(service.statusProviderIDs) != 1 || service.statusProviderIDs[0] != "codex" { + t.Fatalf("ListSourceStatus provider ids = %#v, want [codex]", service.statusProviderIDs) + } + }) +} + +func TestHostAPIModelsListShouldRequireModelReadGrant(t *testing.T) { + t.Parallel() + + t.Run("Should require model read grant", func(t *testing.T) { + t.Parallel() + + handler := NewHostAPIHandler( + nil, + nil, + nil, + nil, + WithHostAPIModelCatalogService(&fakeHostAPIModelCatalogService{}), + WithHostAPICapabilityChecker(newTestCapabilityChecker( + "ext", + SourceUser, + []string{"models/list"}, + []string{"session.read"}, + )), + ) + + _, err := handler.Handle(testutil.Context(t), "ext", "models/list", nil) + if err == nil { + t.Fatal("Handle(models/list) error = nil, want capability denied") + } + var rpcErr *subprocess.RPCError + if 
!errors.As(err, &rpcErr) { + t.Fatalf("Handle(models/list) error = %T, want *RPCError", err) + } + if rpcErr.Code != CapabilityDeniedCode { + t.Fatalf("RPCError.Code = %d, want %d", rpcErr.Code, CapabilityDeniedCode) + } + }) +} + +func TestHostAPIModelsShouldMapValidationAndAvailabilityErrors(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + method string + params json.RawMessage + actions []string + security []string + service modelcatalog.Service + wantCode int + }{ + { + name: "Should reject invalid source id", + method: "models/list", + params: json.RawMessage(`{"source_id":"bad source"}`), + actions: []string{"models/list"}, + security: []string{"model.read"}, + service: &fakeHostAPIModelCatalogService{}, + wantCode: HostAPIInvalidParamsCode, + }, + { + name: "Should reject invalid provider id", + method: "models/list", + params: json.RawMessage(`{"provider_id":"Bad"}`), + actions: []string{"models/list"}, + security: []string{"model.read"}, + service: &fakeHostAPIModelCatalogService{}, + wantCode: HostAPIInvalidParamsCode, + }, + { + name: "Should map unregistered source to invalid params", + method: "models/list", + params: json.RawMessage(`{"source_id":"extension:missing"}`), + actions: []string{"models/list"}, + security: []string{"model.read"}, + service: &fakeHostAPIModelCatalogService{listErr: modelcatalog.ErrSourceNotRegistered}, + wantCode: HostAPIInvalidParamsCode, + }, + { + name: "Should map refresh failure without statuses to unavailable", + method: "models/refresh", + params: json.RawMessage(`{"source_id":"extension:missing"}`), + actions: []string{"models/refresh"}, + security: []string{"model.write"}, + service: &fakeHostAPIModelCatalogService{refreshErr: modelcatalog.ErrAllSourcesFailed}, + wantCode: HostAPIUnavailableCode, + }, + { + name: "Should map missing status service to unavailable", + method: "models/status", + params: json.RawMessage(`{}`), + actions: []string{"models/status"}, + security: 
[]string{"model.read"}, + wantCode: HostAPIUnavailableCode, + }, + { + name: "Should map status service failure to unavailable", + method: "models/status", + params: json.RawMessage(`{}`), + actions: []string{"models/status"}, + security: []string{"model.read"}, + service: &fakeHostAPIModelCatalogService{statusErr: errors.New("status offline")}, + wantCode: HostAPIUnavailableCode, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + opts := []HostAPIOption{ + WithHostAPICapabilityChecker(newTestCapabilityChecker("ext", SourceUser, tt.actions, tt.security)), + } + if tt.service != nil { + opts = append(opts, WithHostAPIModelCatalogService(tt.service)) + } + handler := NewHostAPIHandler(nil, nil, nil, nil, opts...) + _, err := handler.Handle(testutil.Context(t), "ext", tt.method, tt.params) + if err == nil { + t.Fatal("Handle() error = nil, want RPC error") + } + var rpcErr *subprocess.RPCError + if !errors.As(err, &rpcErr) { + t.Fatalf("Handle() error = %T, want *RPCError", err) + } + if rpcErr.Code != tt.wantCode { + t.Fatalf("RPCError.Code = %d, want %d", rpcErr.Code, tt.wantCode) + } + }) + } +} + +func TestHostAPIModelsShouldRedactUnavailableRPCErrorData(t *testing.T) { + t.Parallel() + + t.Run("Should redact unavailable RPC error data", func(t *testing.T) { + t.Parallel() + + secret := "oauth-rpc-secret-token" + handler := NewHostAPIHandler( + nil, + nil, + nil, + nil, + WithHostAPIModelCatalogService(&fakeHostAPIModelCatalogService{ + listErr: errors.New("catalog unavailable OAUTH_TOKEN=" + secret), + }), + WithHostAPICapabilityChecker(newTestCapabilityChecker( + "ext", + SourceUser, + []string{"models/list"}, + []string{"model.read"}, + )), + ) + _, err := handler.Handle(testutil.Context(t), "ext", "models/list", json.RawMessage(`{}`)) + if err == nil { + t.Fatal("Handle(models/list) error = nil, want RPC error") + } + var rpcErr *subprocess.RPCError + if !errors.As(err, &rpcErr) { + t.Fatalf("Handle(models/list) 
error = %T, want *RPCError", err) + } + data := string(rpcErr.Data) + if strings.Contains(data, secret) { + t.Fatalf("RPC error data = %s, want secret redacted", data) + } + if !strings.Contains(data, "[REDACTED]") { + t.Fatalf("RPC error data = %s, want redaction marker", data) + } + }) +} + +func TestHostAPIModelHelpersShouldHandleEmptyValues(t *testing.T) { + t.Parallel() + + t.Run("Should handle empty values", func(t *testing.T) { + t.Parallel() + + var nilHandler *HostAPIHandler + if nilHandler.hostAPINow().IsZero() { + t.Fatal("hostAPINow(nil) returned zero time, want UTC fallback") + } + if got := hostAPICostPayloadFromModel(modelcatalog.Model{}); got != nil { + t.Fatalf("hostAPICostPayloadFromModel(empty) = %#v, want nil", got) + } + if got := hostAPIReasoningEffortStringPtr(nil); got != nil { + t.Fatalf("hostAPIReasoningEffortStringPtr(nil) = %#v, want nil", got) + } + }) +} + +type fakeHostAPIModelCatalogService struct { + models []modelcatalog.Model + statuses []modelcatalog.SourceStatus + listOpts []modelcatalog.ListOptions + refreshOpts []modelcatalog.RefreshOptions + statusProviderIDs []string + listErr error + refreshErr error + statusErr error +} + +func (s *fakeHostAPIModelCatalogService) ListModels( + _ context.Context, + opts modelcatalog.ListOptions, +) ([]modelcatalog.Model, error) { + s.listOpts = append(s.listOpts, opts) + return append([]modelcatalog.Model(nil), s.models...), s.listErr +} + +func (s *fakeHostAPIModelCatalogService) Refresh( + _ context.Context, + opts modelcatalog.RefreshOptions, +) ([]modelcatalog.SourceStatus, error) { + s.refreshOpts = append(s.refreshOpts, opts) + return append([]modelcatalog.SourceStatus(nil), s.statuses...), s.refreshErr +} + +func (s *fakeHostAPIModelCatalogService) ListSourceStatus( + _ context.Context, + providerID string, +) ([]modelcatalog.SourceStatus, error) { + s.statusProviderIDs = append(s.statusProviderIDs, providerID) + return append([]modelcatalog.SourceStatus(nil), s.statuses...), 
s.statusErr +} + +func assertRedactedHostAPIModelPayload(t *testing.T, value string, secret string) { + t.Helper() + + if strings.Contains(value, secret) { + t.Fatalf("Host API payload value = %q, want secret redacted", value) + } + if !strings.Contains(value, "[REDACTED]") { + t.Fatalf("Host API payload value = %q, want redaction marker", value) + } +} diff --git a/internal/extension/manager_model_source_test.go b/internal/extension/manager_model_source_test.go new file mode 100644 index 000000000..9f7853521 --- /dev/null +++ b/internal/extension/manager_model_source_test.go @@ -0,0 +1,76 @@ +package extensionpkg + +import ( + "testing" + + extensioncontract "github.com/pedronauck/agh/internal/extension/contract" + extensionprotocol "github.com/pedronauck/agh/internal/extension/protocol" + "github.com/pedronauck/agh/internal/testutil" +) + +func TestManagerListModelSourceRows(t *testing.T) { + t.Run("Should call subprocess models list", func(t *testing.T) { + withDaemonVersion(t, "0.5.0") + env := newRegistryTestEnv(t) + fixture := createManagerTestExtension(t, managerTestManifest("ext-models", managerManifestOptions{ + command: helperCommand(t), + args: helperArgs(), + withEnv: helperEnv("model_source_success", ""), + capabilities: []string{extensionprotocol.CapabilityProvideModelSource}, + }), nil) + installManagerFixture(t, env.registry, fixture, SourceUser, true) + + manager := NewManager(env.registry) + if err := manager.Start(testutil.Context(t)); err != nil { + t.Fatalf("Start() error = %v", err) + } + t.Cleanup(func() { + if err := manager.Stop(testutil.Context(t)); err != nil { + t.Fatalf("Stop() cleanup error = %v", err) + } + }) + + rows, err := manager.ListModelSourceRows( + testutil.Context(t), + "ext-models", + extensioncontract.ModelSourceListParams{ProviderID: "codex", Refresh: true}, + ) + if err != nil { + t.Fatalf("ListModelSourceRows() error = %v, want nil", err) + } + if len(rows) != 1 || rows[0].ModelID != "subprocess-model" || 
rows[0].SourceID != "extension:ext-models" { + t.Fatalf("ListModelSourceRows() = %#v, want subprocess model source row", rows) + } + }) + + t.Run("Should deny missing model source capability", func(t *testing.T) { + withDaemonVersion(t, "0.5.0") + env := newRegistryTestEnv(t) + fixture := createManagerTestExtension(t, managerTestManifest("ext-no-models", managerManifestOptions{ + command: helperCommand(t), + args: helperArgs(), + withEnv: helperEnv("model_source_success", ""), + capabilities: []string{"memory.backend"}, + }), nil) + installManagerFixture(t, env.registry, fixture, SourceUser, true) + + manager := NewManager(env.registry) + if err := manager.Start(testutil.Context(t)); err != nil { + t.Fatalf("Start() error = %v", err) + } + t.Cleanup(func() { + if err := manager.Stop(testutil.Context(t)); err != nil { + t.Fatalf("Stop() cleanup error = %v", err) + } + }) + + _, err := manager.ListModelSourceRows( + testutil.Context(t), + "ext-no-models", + extensioncontract.ModelSourceListParams{ProviderID: "codex"}, + ) + if err == nil { + t.Fatal("ListModelSourceRows() error = nil, want denied service method") + } + }) +} diff --git a/internal/extension/manager_test.go b/internal/extension/manager_test.go index 4b3862ebe..27437bb2a 100644 --- a/internal/extension/manager_test.go +++ b/internal/extension/manager_test.go @@ -16,8 +16,10 @@ import ( automationpkg "github.com/pedronauck/agh/internal/automation" bridgepkg "github.com/pedronauck/agh/internal/bridges" + extensioncontract "github.com/pedronauck/agh/internal/extension/contract" extensionprotocol "github.com/pedronauck/agh/internal/extension/protocol" hookspkg "github.com/pedronauck/agh/internal/hooks" + "github.com/pedronauck/agh/internal/modelcatalog" "github.com/pedronauck/agh/internal/resources" skillspkg "github.com/pedronauck/agh/internal/skills" "github.com/pedronauck/agh/internal/subprocess" @@ -1894,6 +1896,37 @@ func (h *extensionHelperServer) handleRequest(req helperRequest) error { 
ack.ReplaceRemoteMessageID = fmt.Sprintf("remote-%d", ack.Seq-1) } return h.sendResult(req.ID, ack) + case "models/list": + var params extensioncontract.ModelSourceListParams + if err := json.Unmarshal(req.Params, &params); err != nil { + return err + } + if h.scenario == "model_source_error" { + return h.sendError(req.ID, -32020, "Model source unavailable", map[string]string{ + "error": "model source unavailable", + }) + } + sourceID, err := modelcatalog.SourceKindExtensionID(h.extensionName()) + if err != nil { + return err + } + providerID := strings.TrimSpace(params.ProviderID) + if providerID == "" { + providerID = "codex" + } + row := extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: providerID, + ModelID: "subprocess-model", + DisplayName: "Subprocess Model", + Priority: modelcatalog.PriorityExtension, + } + if h.scenario == "model_source_malformed" { + row.ModelID = "" + } + return h.sendResult(req.ID, extensioncontract.ModelSourceListResponse{ + Rows: []extensioncontract.ModelSourceRow{row}, + }) case "provide_tools": return h.sendResult(req.ID, h.toolRuntimeDescriptors()) case "tools/call": diff --git a/internal/extension/manifest.go b/internal/extension/manifest.go index 39be65eb8..28b09981a 100644 --- a/internal/extension/manifest.go +++ b/internal/extension/manifest.go @@ -17,6 +17,7 @@ import ( bridgepkg "github.com/pedronauck/agh/internal/bridges" extensionprotocol "github.com/pedronauck/agh/internal/extension/protocol" "github.com/pedronauck/agh/internal/extension/surfaces" + "github.com/pedronauck/agh/internal/modelcatalog" "github.com/pedronauck/agh/internal/resources" toolspkg "github.com/pedronauck/agh/internal/tools" "github.com/pedronauck/agh/internal/vault" @@ -313,6 +314,9 @@ func (m *Manifest) Validate() error { if err := validateDottedIdentifiers("capabilities.provides", m.Capabilities.Provides, false); err != nil { return err } + if err := m.validateModelSourceCapability(); err != nil { + return err + } if err := 
validateSlashIdentifiers("actions.requires", m.Actions.Requires); err != nil { return err } @@ -334,6 +338,20 @@ func (m *Manifest) Validate() error { return nil } +func (m *Manifest) validateModelSourceCapability() error { + if !providesCapability(m.Capabilities.Provides, extensionprotocol.CapabilityProvideModelSource) { + return nil + } + if _, err := modelcatalog.SourceKindExtensionID(m.Name); err != nil { + return &ManifestValidationError{ + Field: "name", + Value: m.Name, + Message: err.Error(), + } + } + return nil +} + func (m *Manifest) validateBridgeAdapterCapability() error { if !providesCapability(m.Capabilities.Provides, extensionprotocol.CapabilityProvideBridgeAdapter) { return nil diff --git a/internal/extension/manifest_model_source_test.go b/internal/extension/manifest_model_source_test.go new file mode 100644 index 000000000..b35484467 --- /dev/null +++ b/internal/extension/manifest_model_source_test.go @@ -0,0 +1,45 @@ +package extensionpkg + +import ( + "errors" + "testing" + + extensionprotocol "github.com/pedronauck/agh/internal/extension/protocol" +) + +func TestManifestValidateModelSourceCapability(t *testing.T) { + t.Run("Should accept normalizable model source capability", func(t *testing.T) { + withDaemonVersion(t, "0.6.0") + + manifest := expectedManifest() + manifest.Name = "OpenAI Models" + manifest.Capabilities.Provides = []string{extensionprotocol.CapabilityProvideModelSource} + + if err := manifest.Validate(); err != nil { + t.Fatalf("Validate() error = %v, want nil", err) + } + }) + + t.Run("Should reject model source name without valid slug", func(t *testing.T) { + withDaemonVersion(t, "0.6.0") + + manifest := expectedManifest() + manifest.Name = "bad/source" + manifest.Capabilities.Provides = []string{extensionprotocol.CapabilityProvideModelSource} + + err := manifest.Validate() + if err == nil { + t.Fatal("Validate() error = nil, want ErrManifestInvalid") + } + if !errors.Is(err, ErrManifestInvalid) { + t.Fatalf("Validate() error 
= %v, want ErrManifestInvalid", err) + } + var validationErr *ManifestValidationError + if !errors.As(err, &validationErr) { + t.Fatalf("Validate() error = %T, want *ManifestValidationError", err) + } + if validationErr.Field != "name" { + t.Fatalf("ManifestValidationError.Field = %q, want name", validationErr.Field) + } + }) +} diff --git a/internal/extension/model_source.go b/internal/extension/model_source.go new file mode 100644 index 000000000..ce8e179b6 --- /dev/null +++ b/internal/extension/model_source.go @@ -0,0 +1,422 @@ +package extensionpkg + +import ( + "context" + "fmt" + "strings" + "time" + + apicontract "github.com/pedronauck/agh/internal/api/contract" + extensioncontract "github.com/pedronauck/agh/internal/extension/contract" + extensionprotocol "github.com/pedronauck/agh/internal/extension/protocol" + "github.com/pedronauck/agh/internal/modelcatalog" +) + +// ModelSourceRuntime calls AGH-to-extension model source services. +type ModelSourceRuntime interface { + ListModelSourceRows( + ctx context.Context, + extensionName string, + params extensioncontract.ModelSourceListParams, + ) ([]extensioncontract.ModelSourceRow, error) +} + +// ModelSourceRuntimeResolver returns the current extension runtime. +type ModelSourceRuntimeResolver func() ModelSourceRuntime + +// ModelSource adapts one extension into a daemon-owned model catalog source. +type ModelSource struct { + info ExtensionInfo + sourceID string + resolver ModelSourceRuntimeResolver +} + +var _ modelcatalog.Source = (*ModelSource)(nil) +var _ ModelSourceRuntime = (*Manager)(nil) + +// NewExtensionModelSources creates sources for installed extensions that provide model.source. 
+func NewExtensionModelSources(registry *Registry, resolver ModelSourceRuntimeResolver) ([]modelcatalog.Source, error) { + if registry == nil { + return nil, nil + } + infos, err := registry.List() + if err != nil { + return nil, fmt.Errorf("extension: list model source extensions: %w", err) + } + sources := make([]modelcatalog.Source, 0, len(infos)) + for _, info := range infos { + if !providesCapability(info.Capabilities.Provides, extensionprotocol.CapabilityProvideModelSource) { + continue + } + source, err := NewExtensionModelSource(info, resolver) + if err != nil { + return nil, err + } + sources = append(sources, source) + } + return sources, nil +} + +// NewExtensionModelSource creates a daemon model catalog source for one extension. +func NewExtensionModelSource(info ExtensionInfo, resolver ModelSourceRuntimeResolver) (*ModelSource, error) { + sourceID, err := modelcatalog.SourceKindExtensionID(info.Name) + if err != nil { + return nil, fmt.Errorf("extension: create model source for %q: %w", info.Name, err) + } + return &ModelSource{ + info: cloneExtensionInfo(info), + sourceID: sourceID, + resolver: resolver, + }, nil +} + +// ID returns the stable extension source id. +func (s *ModelSource) ID() string { + if s == nil { + return "" + } + return s.sourceID +} + +// Kind returns extension. +func (s *ModelSource) Kind() modelcatalog.SourceKind { + return modelcatalog.SourceKindExtension +} + +// Priority returns the extension merge priority. +func (s *ModelSource) Priority() int { + return modelcatalog.PriorityExtension +} + +// ListModels calls the extension models/list service and validates rows before persistence. 
+func (s *ModelSource) ListModels(ctx context.Context, opts modelcatalog.ListOptions) ([]modelcatalog.ModelRow, error) { + if ctx == nil { + return nil, fmt.Errorf("extension: model source context is required") + } + if s == nil { + return nil, fmt.Errorf("extension: model source is required") + } + if !s.info.Enabled { + return nil, modelcatalog.ErrSourceDisabled + } + if !providesCapability(s.info.Capabilities.Provides, extensionprotocol.CapabilityProvideModelSource) { + return nil, fmt.Errorf( + "extension: model source %q is missing %q capability", + s.info.Name, + extensionprotocol.CapabilityProvideModelSource, + ) + } + if s.resolver == nil { + return nil, fmt.Errorf("extension: model source runtime is unavailable") + } + runtime := s.resolver() + if runtime == nil { + return nil, fmt.Errorf("extension: model source runtime is unavailable") + } + rows, err := runtime.ListModelSourceRows(ctx, s.info.Name, extensioncontract.ModelSourceListParams{ + ProviderID: strings.TrimSpace(opts.ProviderID), + Refresh: opts.Refresh, + IncludeStale: opts.IncludeStale, + }) + if err != nil { + return nil, fmt.Errorf("extension: list model source %q: %w", s.info.Name, err) + } + return s.validateRows(rows, opts) +} + +// ListModelSourceRows calls one extension's negotiated models/list service. 
+func (m *Manager) ListModelSourceRows( + ctx context.Context, + extensionName string, + params extensioncontract.ModelSourceListParams, +) ([]extensioncontract.ModelSourceRow, error) { + process, name, err := m.extensionServiceProcess( + ctx, + extensionName, + extensionprotocol.ExtensionServiceMethodModelsList, + ) + if err != nil { + return nil, err + } + + var response extensioncontract.ModelSourceListResponse + if err := process.Call( + ctx, + string(extensionprotocol.ExtensionServiceMethodModelsList), + params, + &response, + ); err != nil { + return nil, fmt.Errorf("extension: list models via %q: %w", name, err) + } + return cloneModelSourceRows(response.Rows), nil +} + +func (s *ModelSource) validateRows( + rows []extensioncontract.ModelSourceRow, + opts modelcatalog.ListOptions, +) ([]modelcatalog.ModelRow, error) { + if len(rows) == 0 { + return nil, nil + } + now := opts.Now + if now.IsZero() { + now = time.Now().UTC() + } + providerFilter := strings.TrimSpace(opts.ProviderID) + validated := make([]modelcatalog.ModelRow, 0, len(rows)) + for index, row := range rows { + modelRow, include, err := s.validateRow(index, row, providerFilter, now) + if err != nil { + return nil, err + } + if include { + validated = append(validated, modelRow) + } + } + return validated, nil +} + +func (s *ModelSource) validateRow( + index int, + row extensioncontract.ModelSourceRow, + providerFilter string, + now time.Time, +) (modelcatalog.ModelRow, bool, error) { + sourceID, providerID, modelID, err := s.validateRowIdentity(index, row, providerFilter) + if err != nil { + return modelcatalog.ModelRow{}, false, err + } + priority := row.Priority + if priority == 0 { + priority = modelcatalog.PriorityExtension + } + if priority != modelcatalog.PriorityExtension { + return modelcatalog.ModelRow{}, false, fmt.Errorf( + "extension: model source row %d priority %d must equal %d", + index, + priority, + modelcatalog.PriorityExtension, + ) + } + if err := 
validateModelSourceMetadata(index, row); err != nil { + return modelcatalog.ModelRow{}, false, err + } + efforts, defaultEffort, err := modelSourceReasoning(index, row.ReasoningEfforts, row.DefaultReasoningEffort) + if err != nil { + return modelcatalog.ModelRow{}, false, err + } + refreshedAt := row.RefreshedAt + if refreshedAt.IsZero() { + refreshedAt = now + } + modelRow := modelcatalog.ModelRow{ + ProviderID: providerID, + ModelID: modelID, + DisplayName: strings.TrimSpace(row.DisplayName), + SourceID: sourceID, + SourceKind: modelcatalog.SourceKindExtension, + Priority: modelcatalog.PriorityExtension, + Available: row.Available, + Stale: row.Stale, + RefreshedAt: refreshedAt.UTC(), + ExpiresAt: row.ExpiresAt.UTC(), + ContextWindow: row.ContextWindow, + MaxInputTokens: row.MaxInputTokens, + MaxOutputTokens: row.MaxOutputTokens, + SupportsTools: row.SupportsTools, + SupportsReasoning: row.SupportsReasoning, + ReasoningEfforts: efforts, + DefaultReasoningEffort: defaultEffort, + LastError: strings.TrimSpace(row.LastError), + } + if row.Cost != nil { + modelRow.CostInputPerMillion = row.Cost.InputPerMillion + modelRow.CostOutputPerMillion = row.Cost.OutputPerMillion + } + return modelRow, true, nil +} + +func (s *ModelSource) validateRowIdentity( + index int, + row extensioncontract.ModelSourceRow, + providerFilter string, +) (string, string, string, error) { + sourceID := strings.TrimSpace(row.SourceID) + if sourceID == "" { + return "", "", "", fmt.Errorf("extension: model source row %d source_id is required", index) + } + if sourceID != s.sourceID { + return "", "", "", fmt.Errorf( + "extension: model source row %d source_id %q must equal %q", + index, + sourceID, + s.sourceID, + ) + } + if err := modelcatalog.ValidateSourceIdentity(sourceID, modelcatalog.SourceKindExtension); err != nil { + return "", "", "", fmt.Errorf("extension: model source row %d: %w", index, err) + } + providerID := strings.TrimSpace(row.ProviderID) + if providerID == "" { + return "", 
"", "", fmt.Errorf( + "extension: model source row %d provider_id is required", + index, + ) + } + if providerFilter != "" && providerID != providerFilter { + return "", "", "", fmt.Errorf( + "extension: model source row %d provider_id %q is outside requested provider %q", + index, + providerID, + providerFilter, + ) + } + modelID := strings.TrimSpace(row.ModelID) + if modelID == "" { + return "", "", "", fmt.Errorf("extension: model source row %d model_id is required", index) + } + return sourceID, providerID, modelID, nil +} + +func validateModelSourceMetadata(index int, row extensioncontract.ModelSourceRow) error { + for _, check := range []struct { + field string + value *int64 + }{ + {field: "context_window", value: row.ContextWindow}, + {field: "max_input_tokens", value: row.MaxInputTokens}, + {field: "max_output_tokens", value: row.MaxOutputTokens}, + } { + if check.value != nil && *check.value < 0 { + return fmt.Errorf("extension: model source row %d %s must be non-negative", index, check.field) + } + } + if row.Cost != nil { + if err := validateModelSourceCost(index, *row.Cost); err != nil { + return err + } + } + return nil +} + +func validateModelSourceCost(index int, cost apicontract.ModelCatalogCostPayload) error { + for _, check := range []struct { + field string + value *float64 + }{ + {field: "cost.input_per_million", value: cost.InputPerMillion}, + {field: "cost.output_per_million", value: cost.OutputPerMillion}, + } { + if check.value != nil && *check.value < 0 { + return fmt.Errorf("extension: model source row %d %s must be non-negative", index, check.field) + } + } + return nil +} + +func modelSourceReasoning( + index int, + values []string, + defaultValue *string, +) ([]modelcatalog.ReasoningEffort, *modelcatalog.ReasoningEffort, error) { + efforts := make([]modelcatalog.ReasoningEffort, 0, len(values)) + seen := make(map[modelcatalog.ReasoningEffort]struct{}, len(values)) + for _, value := range values { + effort, err := 
parseModelSourceReasoningEffort(value) + if err != nil { + return nil, nil, fmt.Errorf("extension: model source row %d: %w", index, err) + } + if _, exists := seen[effort]; exists { + return nil, nil, fmt.Errorf( + "extension: model source row %d reasoning_efforts contains duplicate %q", + index, + effort, + ) + } + seen[effort] = struct{}{} + efforts = append(efforts, effort) + } + if defaultValue == nil { + return efforts, nil, nil + } + defaultEffort, err := parseModelSourceReasoningEffort(*defaultValue) + if err != nil { + return nil, nil, fmt.Errorf("extension: model source row %d default_reasoning_effort: %w", index, err) + } + if len(seen) > 0 { + if _, ok := seen[defaultEffort]; !ok { + return nil, nil, fmt.Errorf( + "extension: model source row %d default_reasoning_effort %q is not in reasoning_efforts", + index, + defaultEffort, + ) + } + } + return efforts, &defaultEffort, nil +} + +func parseModelSourceReasoningEffort(value string) (modelcatalog.ReasoningEffort, error) { + trimmed := strings.ToLower(strings.TrimSpace(value)) + switch modelcatalog.ReasoningEffort(trimmed) { + case modelcatalog.ReasoningEffortMinimal, + modelcatalog.ReasoningEffortLow, + modelcatalog.ReasoningEffortMedium, + modelcatalog.ReasoningEffortHigh, + modelcatalog.ReasoningEffortXHigh: + return modelcatalog.ReasoningEffort(trimmed), nil + default: + return "", fmt.Errorf("reasoning effort %q is not supported", value) + } +} + +func cloneModelSourceRows(src []extensioncontract.ModelSourceRow) []extensioncontract.ModelSourceRow { + if len(src) == 0 { + return nil + } + cloned := make([]extensioncontract.ModelSourceRow, len(src)) + for index := range src { + cloned[index] = src[index] + cloned[index].Available = cloneModelSourceBoolPointer(src[index].Available) + cloned[index].ContextWindow = cloneModelSourceInt64Pointer(src[index].ContextWindow) + cloned[index].MaxInputTokens = cloneModelSourceInt64Pointer(src[index].MaxInputTokens) + cloned[index].MaxOutputTokens = 
cloneModelSourceInt64Pointer(src[index].MaxOutputTokens) + cloned[index].SupportsTools = cloneModelSourceBoolPointer(src[index].SupportsTools) + cloned[index].SupportsReasoning = cloneModelSourceBoolPointer(src[index].SupportsReasoning) + cloned[index].ReasoningEfforts = append([]string(nil), src[index].ReasoningEfforts...) + if src[index].DefaultReasoningEffort != nil { + value := *src[index].DefaultReasoningEffort + cloned[index].DefaultReasoningEffort = &value + } + if src[index].Cost != nil { + cloned[index].Cost = &apicontract.ModelCatalogCostPayload{ + InputPerMillion: cloneModelSourceFloat64Pointer(src[index].Cost.InputPerMillion), + OutputPerMillion: cloneModelSourceFloat64Pointer(src[index].Cost.OutputPerMillion), + } + } + } + return cloned +} + +func cloneModelSourceBoolPointer(src *bool) *bool { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneModelSourceInt64Pointer(src *int64) *int64 { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneModelSourceFloat64Pointer(src *float64) *float64 { + if src == nil { + return nil + } + value := *src + return &value +} diff --git a/internal/extension/model_source_test.go b/internal/extension/model_source_test.go new file mode 100644 index 000000000..d19f7232e --- /dev/null +++ b/internal/extension/model_source_test.go @@ -0,0 +1,671 @@ +package extensionpkg + +import ( + "context" + "errors" + "path/filepath" + "testing" + "time" + + apicontract "github.com/pedronauck/agh/internal/api/contract" + aghconfig "github.com/pedronauck/agh/internal/config" + extensioncontract "github.com/pedronauck/agh/internal/extension/contract" + extensionprotocol "github.com/pedronauck/agh/internal/extension/protocol" + "github.com/pedronauck/agh/internal/modelcatalog" + "github.com/pedronauck/agh/internal/store/globaldb" + "github.com/pedronauck/agh/internal/testutil" +) + +func TestModelSourceShouldPersistValidatedRowsThroughCatalogService(t *testing.T) { + 
t.Parallel() + + t.Run("Should persist validated rows through catalog service", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 10, 0, 0, 0, time.UTC) + store := openModelSourceTestStore(t) + runtime := &fakeModelSourceRuntime{} + source := newTestModelSource(t, "ext-models", runtime) + available := true + cost := 1.25 + runtime.rows = []extensioncontract.ModelSourceRow{ + { + SourceID: source.ID(), + ProviderID: "codex", + ModelID: "gpt-5.4-extension", + DisplayName: "GPT 5.4 Extension", + Available: &available, + ReasoningEfforts: []string{"high"}, + ContextWindow: int64Pointer(200000), + SupportsTools: boolPointer(true), + SupportsReasoning: boolPointer(true), + Cost: &apicontract.ModelCatalogCostPayload{ + InputPerMillion: &cost, + OutputPerMillion: &cost, + }, + }, + } + service := newTestModelCatalogService(t, store, []modelcatalog.Source{source}) + + statuses, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now, + }) + if err != nil { + t.Fatalf("Refresh() error = %v, want nil", err) + } + if len(statuses) != 1 || statuses[0].RefreshState != string(modelcatalog.RefreshStateSucceeded) { + t.Fatalf("Refresh() statuses = %#v, want succeeded extension status", statuses) + } + + models, err := service.ListModels(ctx, modelcatalog.ListOptions{ + ProviderID: "codex", + IncludeStale: true, + Now: now, + }) + if err != nil { + t.Fatalf("ListModels() error = %v, want nil", err) + } + if len(models) != 1 || models[0].ModelID != "gpt-5.4-extension" { + t.Fatalf("ListModels() = %#v, want persisted extension model", models) + } + if len(models[0].Sources) != 1 || models[0].Sources[0].SourceID != source.ID() { + t.Fatalf("ListModels()[0].Sources = %#v, want extension source ref", models[0].Sources) + } + }) +} + +func TestNewExtensionModelSourcesShouldFilterRegistryModelSourceCapabilities(t *testing.T) { + t.Run("Should filter registry model 
source capabilities", func(t *testing.T) { + withDaemonVersion(t, "0.5.0") + + store := openModelSourceTestStore(t) + registry := NewRegistry(store.DB()) + modelFixture := createManagerTestExtension(t, managerTestManifest("ext-registry-models", managerManifestOptions{ + capabilities: []string{extensionprotocol.CapabilityProvideModelSource}, + }), nil) + installManagerFixture(t, registry, modelFixture, SourceUser, true) + memoryFixture := createManagerTestExtension( + t, + managerTestManifest("ext-registry-memory", managerManifestOptions{ + capabilities: []string{"memory.backend"}, + }), + nil, + ) + installManagerFixture(t, registry, memoryFixture, SourceUser, true) + + sources, err := NewExtensionModelSources(registry, func() ModelSourceRuntime { return nil }) + if err != nil { + t.Fatalf("NewExtensionModelSources() error = %v, want nil", err) + } + if len(sources) != 1 || sources[0].ID() != "extension:ext-registry-models" { + t.Fatalf("NewExtensionModelSources() = %#v, want only model.source extension", sources) + } + if got, err := NewExtensionModelSources(nil, nil); err != nil || got != nil { + t.Fatalf("NewExtensionModelSources(nil) = (%#v, %v), want nil, nil", got, err) + } + }) +} + +func TestModelSourceIdentityHelpersShouldValidateInputs(t *testing.T) { + t.Parallel() + + t.Run("Should validate inputs", func(t *testing.T) { + t.Parallel() + + var nilSource *ModelSource + if got := nilSource.ID(); got != "" { + t.Fatalf("(*ModelSource)(nil).ID() = %q, want empty", got) + } + if _, err := NewExtensionModelSource(ExtensionInfo{Name: "bad/source"}, nil); err == nil { + t.Fatal("NewExtensionModelSource(invalid name) error = nil, want slug validation error") + } + }) +} + +func TestModelSourceListModelsShouldRejectInvalidRuntimeState(t *testing.T) { + t.Parallel() + + runtime := &fakeModelSourceRuntime{} + source := newTestModelSource(t, "ext-invalid-state", runtime) + tests := []struct { + name string + source *ModelSource + ctx context.Context + }{ + { + name: 
"Should reject nil context", + source: source, + }, + { + name: "Should reject disabled extension source", + source: mustTestModelSource(t, ExtensionInfo{ + Name: "ext-disabled-source", + Enabled: false, + Capabilities: CapabilitiesConfig{ + Provides: []string{extensionprotocol.CapabilityProvideModelSource}, + }, + }, func() ModelSourceRuntime { + return runtime + }), + ctx: testutil.Context(t), + }, + { + name: "Should reject missing model source capability", + source: mustTestModelSource(t, ExtensionInfo{ + Name: "ext-missing-capability", + Enabled: true, + }, func() ModelSourceRuntime { + return runtime + }), + ctx: testutil.Context(t), + }, + { + name: "Should reject nil runtime resolver", + source: mustTestModelSource(t, ExtensionInfo{ + Name: "ext-nil-resolver", + Enabled: true, + Capabilities: CapabilitiesConfig{ + Provides: []string{extensionprotocol.CapabilityProvideModelSource}, + }, + }, nil), + ctx: testutil.Context(t), + }, + { + name: "Should reject unavailable runtime", + source: mustTestModelSource(t, ExtensionInfo{ + Name: "ext-nil-runtime", + Enabled: true, + Capabilities: CapabilitiesConfig{ + Provides: []string{extensionprotocol.CapabilityProvideModelSource}, + }, + }, func() ModelSourceRuntime { + return nil + }), + ctx: testutil.Context(t), + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + _, err := tt.source.ListModels(tt.ctx, modelcatalog.ListOptions{}) + if err == nil { + t.Fatal("ListModels() error = nil, want invalid runtime state failure") + } + }) + } +} + +func TestModelSourceShouldRejectMalformedRowsAndRecordSourceStatus(t *testing.T) { + t.Parallel() + + t.Run("Should reject malformed rows and record source status", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 10, 30, 0, 0, time.UTC) + store := openModelSourceTestStore(t) + runtime := &fakeModelSourceRuntime{} + source := newTestModelSource(t, "ext-malformed", runtime) + runtime.rows = 
[]extensioncontract.ModelSourceRow{ + { + SourceID: source.ID(), + ProviderID: "codex", + }, + } + service := newTestModelCatalogService(t, store, []modelcatalog.Source{source}) + + statuses, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now, + }) + if err == nil { + t.Fatal("Refresh() error = nil, want malformed row failure") + } + if len(statuses) != 1 || statuses[0].RefreshState != string(modelcatalog.RefreshStateFailed) { + t.Fatalf("Refresh() statuses = %#v, want failed status", statuses) + } + if statuses[0].LastError == "" { + t.Fatalf("Refresh() status LastError = empty, want malformed row error") + } + }) +} + +func TestModelSourceShouldRejectInvalidRowMetadata(t *testing.T) { + t.Parallel() + + baseRuntime := &fakeModelSourceRuntime{} + baseSource := newTestModelSource(t, "ext-row-validation", baseRuntime) + sourceID := baseSource.ID() + negativeInt := int64(-1) + negativeCost := float64(-1) + defaultEffort := "medium" + tests := []struct { + name string + row extensioncontract.ModelSourceRow + opts modelcatalog.ListOptions + }{ + { + name: "Should reject provider outside requested filter", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "other", + ModelID: "model", + }, + opts: modelcatalog.ListOptions{ProviderID: "codex"}, + }, + { + name: "Should reject non-extension priority", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + Priority: modelcatalog.PriorityConfig, + }, + }, + { + name: "Should reject negative token metadata", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + ContextWindow: &negativeInt, + }, + }, + { + name: "Should reject negative cost metadata", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + Cost: &apicontract.ModelCatalogCostPayload{ + 
InputPerMillion: &negativeCost, + }, + }, + }, + { + name: "Should reject negative output cost metadata", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + Cost: &apicontract.ModelCatalogCostPayload{ + OutputPerMillion: &negativeCost, + }, + }, + }, + { + name: "Should reject unsupported reasoning effort", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + ReasoningEfforts: []string{"turbo"}, + }, + }, + { + name: "Should reject duplicate reasoning efforts", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + ReasoningEfforts: []string{"high", "high"}, + }, + }, + { + name: "Should reject default effort outside advertised list", + row: extensioncontract.ModelSourceRow{ + SourceID: sourceID, + ProviderID: "codex", + ModelID: "model", + ReasoningEfforts: []string{"high"}, + DefaultReasoningEffort: &defaultEffort, + }, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + t.Parallel() + + runtime := &fakeModelSourceRuntime{rows: []extensioncontract.ModelSourceRow{tt.row}} + source := newTestModelSource(t, "ext-row-validation", runtime) + _, err := source.ListModels(testutil.Context(t), tt.opts) + if err == nil { + t.Fatal("ListModels() error = nil, want row validation failure") + } + }) + } +} + +func TestModelSourceShouldRejectRowsWithInvalidSourceID(t *testing.T) { + t.Parallel() + + t.Run("Should reject rows with invalid source id", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 10, 45, 0, 0, time.UTC) + store := openModelSourceTestStore(t) + runtime := &fakeModelSourceRuntime{} + source := newTestModelSource(t, "ext-invalid-source", runtime) + runtime.rows = []extensioncontract.ModelSourceRow{ + { + SourceID: "extension:Bad", + ProviderID: "codex", + ModelID: "bad-source-model", + }, + } + service := 
newTestModelCatalogService(t, store, []modelcatalog.Source{source}) + + statuses, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now, + }) + if err == nil { + t.Fatal("Refresh() error = nil, want source_id validation failure") + } + if len(statuses) != 1 || statuses[0].LastError == "" { + t.Fatalf("Refresh() statuses = %#v, want recorded source_id validation error", statuses) + } + }) +} + +func TestModelSourceShouldRecordMalformedSubprocessRows(t *testing.T) { + t.Run("Should record malformed subprocess rows", func(t *testing.T) { + withDaemonVersion(t, "0.5.0") + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 10, 50, 0, 0, time.UTC) + store, _, source := startSubprocessModelSource(t, "ext-subprocess-malformed", "model_source_malformed") + service := newTestModelCatalogService(t, store, []modelcatalog.Source{source}) + + statuses, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now, + }) + if err == nil { + t.Fatal("Refresh() error = nil, want malformed subprocess row failure") + } + if len(statuses) != 1 || statuses[0].RefreshState != string(modelcatalog.RefreshStateFailed) { + t.Fatalf("Refresh() statuses = %#v, want failed subprocess source status", statuses) + } + }) +} + +func TestModelSourceShouldPreserveStaleRowsWhenRuntimeIsUnavailable(t *testing.T) { + t.Parallel() + + t.Run("Should preserve stale rows when runtime is unavailable", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 11, 0, 0, 0, time.UTC) + store := openModelSourceTestStore(t) + runtime := &fakeModelSourceRuntime{} + source := newTestModelSource(t, "ext-stale", runtime) + runtime.rows = []extensioncontract.ModelSourceRow{ + { + SourceID: source.ID(), + ProviderID: "codex", + ModelID: "stale-model", + }, + } + service := newTestModelCatalogService(t, store, 
[]modelcatalog.Source{source}) + if _, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now, + }); err != nil { + t.Fatalf("initial Refresh() error = %v, want nil", err) + } + + runtime.rows = nil + runtime.err = errors.New("extension offline") + statuses, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now.Add(time.Minute), + }) + if err != nil { + t.Fatalf("stale Refresh() error = %v, want stale fallback success", err) + } + if len(statuses) != 1 || !statuses[0].Stale || statuses[0].RowCount != 1 { + t.Fatalf("stale Refresh() statuses = %#v, want one stale preserved row", statuses) + } + models, err := service.ListModels(ctx, modelcatalog.ListOptions{ + ProviderID: "codex", + IncludeStale: true, + Now: now.Add(time.Minute), + }) + if err != nil { + t.Fatalf("ListModels(include stale) error = %v, want nil", err) + } + if len(models) != 1 || !models[0].Stale || models[0].LastError == "" { + t.Fatalf("ListModels(include stale) = %#v, want stale model with last error", models) + } + }) +} + +func TestModelSourceShouldPreserveStaleRowsWhenSubprocessExtensionStops(t *testing.T) { + t.Run("Should preserve stale rows when subprocess extension stops", func(t *testing.T) { + withDaemonVersion(t, "0.5.0") + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 11, 15, 0, 0, time.UTC) + store, manager, source := startSubprocessModelSource(t, "ext-subprocess-stale", "model_source_success") + service := newTestModelCatalogService(t, store, []modelcatalog.Source{source}) + if _, err := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now, + }); err != nil { + t.Fatalf("initial Refresh() error = %v, want nil", err) + } + if err := manager.Stop(ctx); err != nil { + t.Fatalf("Stop() error = %v, want nil", err) + } + + statuses, err := 
service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: now.Add(time.Minute), + }) + if err != nil { + t.Fatalf("stale Refresh() error = %v, want stale fallback success", err) + } + if len(statuses) != 1 || !statuses[0].Stale || statuses[0].RowCount != 1 { + t.Fatalf("stale Refresh() statuses = %#v, want stale preserved subprocess row", statuses) + } + }) +} + +func TestModelSourceShouldFailClosedWithoutBlockingCatalogList(t *testing.T) { + t.Parallel() + + t.Run("Should fail closed without blocking catalog list", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + now := time.Date(2026, 5, 7, 11, 30, 0, 0, time.UTC) + store := openModelSourceTestStore(t) + runtime := &fakeModelSourceRuntime{} + deniedSource, err := NewExtensionModelSource(ExtensionInfo{ + Name: "ext-denied", + Enabled: true, + }, func() ModelSourceRuntime { + return runtime + }) + if err != nil { + t.Fatalf("NewExtensionModelSource() error = %v", err) + } + configSource := modelcatalog.NewConfigSource(map[string]aghconfig.ProviderConfig{ + "codex": { + Models: aghconfig.ProviderModelsConfig{ + Curated: []aghconfig.ProviderModelConfig{{ID: "configured-model"}}, + }, + }, + }) + service := newTestModelCatalogService(t, store, []modelcatalog.Source{configSource, deniedSource}) + + models, err := service.ListModels(ctx, modelcatalog.ListOptions{ + ProviderID: "codex", + IncludeStale: true, + Now: now, + }) + if err != nil { + t.Fatalf("ListModels() error = %v, want config source to remain available", err) + } + if len(models) != 1 || models[0].ModelID != "configured-model" { + t.Fatalf("ListModels() = %#v, want config model despite denied extension source", models) + } + statuses, err := service.ListSourceStatus(ctx, "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v, want nil", err) + } + foundDenied := false + for _, status := range statuses { + if status.SourceID == deniedSource.ID() { + foundDenied = 
status.RefreshState == string(modelcatalog.RefreshStateFailed) && status.LastError != "" + } + } + if !foundDenied { + t.Fatalf("ListSourceStatus() = %#v, want failed denied extension source", statuses) + } + }) +} + +type fakeModelSourceRuntime struct { + rows []extensioncontract.ModelSourceRow + err error + calls []extensioncontract.ModelSourceListParams +} + +func (r *fakeModelSourceRuntime) ListModelSourceRows( + _ context.Context, + _ string, + params extensioncontract.ModelSourceListParams, +) ([]extensioncontract.ModelSourceRow, error) { + r.calls = append(r.calls, params) + return cloneModelSourceRows(r.rows), r.err +} + +func newTestModelSource(t *testing.T, name string, runtime *fakeModelSourceRuntime) *ModelSource { + t.Helper() + + return mustTestModelSource(t, ExtensionInfo{ + Name: name, + Enabled: true, + Capabilities: CapabilitiesConfig{ + Provides: []string{extensionprotocol.CapabilityProvideModelSource}, + }, + }, func() ModelSourceRuntime { + return runtime + }) +} + +func mustTestModelSource( + t *testing.T, + info ExtensionInfo, + resolver ModelSourceRuntimeResolver, +) *ModelSource { + t.Helper() + + source, err := NewExtensionModelSource(info, resolver) + if err != nil { + t.Fatalf("NewExtensionModelSource() error = %v", err) + } + return source +} + +func openModelSourceTestStore(t *testing.T) *globaldb.GlobalDB { + t.Helper() + + store, err := globaldb.OpenGlobalDB(testutil.Context(t), filepath.Join(t.TempDir(), "agh.db")) + if err != nil { + t.Fatalf("OpenGlobalDB() error = %v", err) + } + t.Cleanup(func() { + if err := store.Close(testutil.Context(t)); err != nil { + t.Fatalf("GlobalDB.Close() error = %v", err) + } + }) + return store +} + +func newTestModelCatalogService( + t *testing.T, + store modelcatalog.Store, + sources []modelcatalog.Source, +) modelcatalog.Service { + t.Helper() + + service, err := modelcatalog.NewService(store, sources) + if err != nil { + t.Fatalf("NewService() error = %v", err) + } + return service +} + +func 
startSubprocessModelSource( + t *testing.T, + name string, + scenario string, +) (*globaldb.GlobalDB, *Manager, *ModelSource) { + t.Helper() + + store := openModelSourceTestStore(t) + registry := NewRegistry(store.DB()) + fixture := createManagerTestExtension(t, managerTestManifest(name, managerManifestOptions{ + command: helperCommand(t), + args: helperArgs(), + withEnv: helperEnv(scenario, ""), + capabilities: []string{extensionprotocol.CapabilityProvideModelSource}, + }), nil) + installManagerFixture(t, registry, fixture, SourceUser, true) + + manager := NewManager(registry) + if err := manager.Start(testutil.Context(t)); err != nil { + t.Fatalf("Start() error = %v", err) + } + t.Cleanup(func() { + if err := manager.Stop(testutil.Context(t)); err != nil { + t.Fatalf("Stop() cleanup error = %v", err) + } + }) + + info, err := registry.Get(name) + if err != nil { + t.Fatalf("Registry.Get(%q) error = %v", name, err) + } + source, err := NewExtensionModelSource(*info, func() ModelSourceRuntime { + return manager + }) + if err != nil { + t.Fatalf("NewExtensionModelSource() error = %v", err) + } + return store, manager, source +} + +func boolPointer(value bool) *bool { + return &value +} + +func int64Pointer(value int64) *int64 { + return &value +} diff --git a/internal/extension/protocol/host_api.go b/internal/extension/protocol/host_api.go index 2c4c7fddf..611100426 100644 --- a/internal/extension/protocol/host_api.go +++ b/internal/extension/protocol/host_api.go @@ -15,6 +15,8 @@ const ( CapabilityProvideBridgeAdapter = "bridge.adapter" // CapabilityToolProvider is the provide surface for executable extension-host tools. CapabilityToolProvider = "tool.provider" + // CapabilityProvideModelSource is the provide surface for model catalog source rows. + CapabilityProvideModelSource = "model.source" ) // ExtensionServiceMethod identifies one AGH -> extension capability service request. 
@@ -27,6 +29,7 @@ const ( ExtensionServiceMethodBridgesDeliver ExtensionServiceMethod = "bridges/deliver" ExtensionServiceMethodProvideTools ExtensionServiceMethod = "provide_tools" ExtensionServiceMethodToolsCall ExtensionServiceMethod = "tools/call" + ExtensionServiceMethodModelsList ExtensionServiceMethod = "models/list" ) const ( @@ -48,6 +51,9 @@ const ( HostAPIMethodObserveHealth HostAPIMethod = "observe/health" HostAPIMethodObserveEvents HostAPIMethod = "observe/events" HostAPIMethodSkillsList HostAPIMethod = "skills/list" + HostAPIMethodModelsList HostAPIMethod = "models/list" + HostAPIMethodModelsRefresh HostAPIMethod = "models/refresh" + HostAPIMethodModelsStatus HostAPIMethod = "models/status" HostAPIMethodAgentsSoulGet HostAPIMethod = "agents/soul/get" HostAPIMethodAgentsSoulValidate HostAPIMethod = "agents/soul/validate" HostAPIMethodAgentsSoulPut HostAPIMethod = "agents/soul/put" @@ -117,7 +123,22 @@ const ( // AllHostAPIMethods returns the canonical Host API method registry in wire order. func AllHostAPIMethods() []HostAPIMethod { - methods := []HostAPIMethod{ + methods := preNetworkHostAPIMethods() + methods = append(methods, networkHostAPIMethods()...) 
+ methods = append(methods, + HostAPIMethodResourcesList, + HostAPIMethodResourcesGet, + HostAPIMethodResourcesSnapshot, + HostAPIMethodBridgesInstancesList, + HostAPIMethodBridgesMessagesIngest, + HostAPIMethodBridgesInstancesGet, + HostAPIMethodBridgesInstancesReportState, + ) + return methods +} + +func preNetworkHostAPIMethods() []HostAPIMethod { + return []HostAPIMethod{ HostAPIMethodSessionsList, HostAPIMethodSessionsCreate, HostAPIMethodSessionsPrompt, @@ -136,6 +157,9 @@ func AllHostAPIMethods() []HostAPIMethod { HostAPIMethodObserveHealth, HostAPIMethodObserveEvents, HostAPIMethodSkillsList, + HostAPIMethodModelsList, + HostAPIMethodModelsRefresh, + HostAPIMethodModelsStatus, HostAPIMethodAgentsSoulGet, HostAPIMethodAgentsSoulValidate, HostAPIMethodAgentsSoulPut, @@ -184,17 +208,6 @@ func AllHostAPIMethods() []HostAPIMethod { HostAPIMethodTasksRunsFail, HostAPIMethodTasksRunsCancel, } - methods = append(methods, networkHostAPIMethods()...) - methods = append(methods, - HostAPIMethodResourcesList, - HostAPIMethodResourcesGet, - HostAPIMethodResourcesSnapshot, - HostAPIMethodBridgesInstancesList, - HostAPIMethodBridgesMessagesIngest, - HostAPIMethodBridgesInstancesGet, - HostAPIMethodBridgesInstancesReportState, - ) - return methods } func networkHostAPIMethods() []HostAPIMethod { @@ -226,6 +239,9 @@ var capabilityServiceMethods = map[string][]ExtensionServiceMethod{ ExtensionServiceMethodProvideTools, ExtensionServiceMethodToolsCall, }, + CapabilityProvideModelSource: { + ExtensionServiceMethodModelsList, + }, } // CapabilityServiceMethods returns the negotiated AGH -> extension service methods diff --git a/internal/extension/protocol/host_api_test.go b/internal/extension/protocol/host_api_test.go index 2022d39d5..4deaf59b4 100644 --- a/internal/extension/protocol/host_api_test.go +++ b/internal/extension/protocol/host_api_test.go @@ -5,99 +5,150 @@ import "testing" func TestAllHostAPIMethodsReturnsCanonicalWireOrder(t *testing.T) { t.Parallel() - want := 
[]HostAPIMethod{ - HostAPIMethodSessionsList, - HostAPIMethodSessionsCreate, - HostAPIMethodSessionsPrompt, - HostAPIMethodSessionsStop, - HostAPIMethodSessionsStatus, - HostAPIMethodSessionsEvents, - HostAPIMethodSessionsSoulRefresh, - HostAPIMethodSessionsHealthGet, - HostAPIMethodSessionsStatusGet, - HostAPIMethodSandboxList, - HostAPIMethodSandboxInfo, - HostAPIMethodSandboxExec, - HostAPIMethodMemoryRecall, - HostAPIMethodMemoryStore, - HostAPIMethodMemoryForget, - HostAPIMethodObserveHealth, - HostAPIMethodObserveEvents, - HostAPIMethodSkillsList, - HostAPIMethodAgentsSoulGet, - HostAPIMethodAgentsSoulValidate, - HostAPIMethodAgentsSoulPut, - HostAPIMethodAgentsSoulDelete, - HostAPIMethodAgentsSoulHistory, - HostAPIMethodAgentsSoulRollback, - HostAPIMethodAgentsHeartbeatGet, - HostAPIMethodAgentsHeartbeatValidate, - HostAPIMethodAgentsHeartbeatPut, - HostAPIMethodAgentsHeartbeatDelete, - HostAPIMethodAgentsHeartbeatHistory, - HostAPIMethodAgentsHeartbeatRollback, - HostAPIMethodAgentsHeartbeatStatus, - HostAPIMethodAgentsHeartbeatWake, - HostAPIMethodAutomationJobs, - HostAPIMethodAutomationJobsGet, - HostAPIMethodAutomationJobsCreate, - HostAPIMethodAutomationJobsUpdate, - HostAPIMethodAutomationJobsDelete, - HostAPIMethodAutomationJobsTrigger, - HostAPIMethodAutomationJobsRuns, - HostAPIMethodAutomationTriggers, - HostAPIMethodAutomationTriggersGet, - HostAPIMethodAutomationTriggersCreate, - HostAPIMethodAutomationTriggersUpdate, - HostAPIMethodAutomationTriggersDelete, - HostAPIMethodAutomationTriggersRuns, - HostAPIMethodAutomationTriggersFire, - HostAPIMethodAutomationRuns, - HostAPIMethodTasks, - HostAPIMethodTasksGet, - HostAPIMethodTasksTimeline, - HostAPIMethodTasksTree, - HostAPIMethodTasksDashboard, - HostAPIMethodTasksInbox, - HostAPIMethodTasksCreate, - HostAPIMethodTasksUpdate, - HostAPIMethodTasksCancel, - HostAPIMethodTasksRuns, - HostAPIMethodTasksRunsGet, - HostAPIMethodTasksRunsEnqueue, - HostAPIMethodTasksRunsClaim, - 
HostAPIMethodTasksRunsStart, - HostAPIMethodTasksRunsAttachSession, - HostAPIMethodTasksRunsComplete, - HostAPIMethodTasksRunsFail, - HostAPIMethodTasksRunsCancel, - HostAPIMethodNetworkStatus, - HostAPIMethodNetworkChannels, - HostAPIMethodNetworkPeers, - HostAPIMethodNetworkThreads, - HostAPIMethodNetworkThreadGet, - HostAPIMethodNetworkThreadMessages, - HostAPIMethodNetworkDirects, - HostAPIMethodNetworkDirectResolve, - HostAPIMethodNetworkDirectMessages, - HostAPIMethodNetworkWorkGet, - HostAPIMethodNetworkSend, - HostAPIMethodResourcesList, - HostAPIMethodResourcesGet, - HostAPIMethodResourcesSnapshot, - HostAPIMethodBridgesInstancesList, - HostAPIMethodBridgesMessagesIngest, - HostAPIMethodBridgesInstancesGet, - HostAPIMethodBridgesInstancesReportState, - } + t.Run("Should return canonical wire order", func(t *testing.T) { + t.Parallel() - got := AllHostAPIMethods() - if len(got) != len(want) { - t.Fatalf("len(AllHostAPIMethods()) = %d, want %d", len(got), len(want)) - } - for idx := range want { - if got[idx] != want[idx] { - t.Fatalf("AllHostAPIMethods()[%d] = %q, want %q", idx, got[idx], want[idx]) + want := []HostAPIMethod{ + HostAPIMethodSessionsList, + HostAPIMethodSessionsCreate, + HostAPIMethodSessionsPrompt, + HostAPIMethodSessionsStop, + HostAPIMethodSessionsStatus, + HostAPIMethodSessionsEvents, + HostAPIMethodSessionsSoulRefresh, + HostAPIMethodSessionsHealthGet, + HostAPIMethodSessionsStatusGet, + HostAPIMethodSandboxList, + HostAPIMethodSandboxInfo, + HostAPIMethodSandboxExec, + HostAPIMethodMemoryRecall, + HostAPIMethodMemoryStore, + HostAPIMethodMemoryForget, + HostAPIMethodObserveHealth, + HostAPIMethodObserveEvents, + HostAPIMethodSkillsList, + HostAPIMethodModelsList, + HostAPIMethodModelsRefresh, + HostAPIMethodModelsStatus, + HostAPIMethodAgentsSoulGet, + HostAPIMethodAgentsSoulValidate, + HostAPIMethodAgentsSoulPut, + HostAPIMethodAgentsSoulDelete, + HostAPIMethodAgentsSoulHistory, + HostAPIMethodAgentsSoulRollback, + 
HostAPIMethodAgentsHeartbeatGet, + HostAPIMethodAgentsHeartbeatValidate, + HostAPIMethodAgentsHeartbeatPut, + HostAPIMethodAgentsHeartbeatDelete, + HostAPIMethodAgentsHeartbeatHistory, + HostAPIMethodAgentsHeartbeatRollback, + HostAPIMethodAgentsHeartbeatStatus, + HostAPIMethodAgentsHeartbeatWake, + HostAPIMethodAutomationJobs, + HostAPIMethodAutomationJobsGet, + HostAPIMethodAutomationJobsCreate, + HostAPIMethodAutomationJobsUpdate, + HostAPIMethodAutomationJobsDelete, + HostAPIMethodAutomationJobsTrigger, + HostAPIMethodAutomationJobsRuns, + HostAPIMethodAutomationTriggers, + HostAPIMethodAutomationTriggersGet, + HostAPIMethodAutomationTriggersCreate, + HostAPIMethodAutomationTriggersUpdate, + HostAPIMethodAutomationTriggersDelete, + HostAPIMethodAutomationTriggersRuns, + HostAPIMethodAutomationTriggersFire, + HostAPIMethodAutomationRuns, + HostAPIMethodTasks, + HostAPIMethodTasksGet, + HostAPIMethodTasksTimeline, + HostAPIMethodTasksTree, + HostAPIMethodTasksDashboard, + HostAPIMethodTasksInbox, + HostAPIMethodTasksCreate, + HostAPIMethodTasksUpdate, + HostAPIMethodTasksCancel, + HostAPIMethodTasksRuns, + HostAPIMethodTasksRunsGet, + HostAPIMethodTasksRunsEnqueue, + HostAPIMethodTasksRunsClaim, + HostAPIMethodTasksRunsStart, + HostAPIMethodTasksRunsAttachSession, + HostAPIMethodTasksRunsComplete, + HostAPIMethodTasksRunsFail, + HostAPIMethodTasksRunsCancel, + HostAPIMethodNetworkStatus, + HostAPIMethodNetworkChannels, + HostAPIMethodNetworkPeers, + HostAPIMethodNetworkThreads, + HostAPIMethodNetworkThreadGet, + HostAPIMethodNetworkThreadMessages, + HostAPIMethodNetworkDirects, + HostAPIMethodNetworkDirectResolve, + HostAPIMethodNetworkDirectMessages, + HostAPIMethodNetworkWorkGet, + HostAPIMethodNetworkSend, + HostAPIMethodResourcesList, + HostAPIMethodResourcesGet, + HostAPIMethodResourcesSnapshot, + HostAPIMethodBridgesInstancesList, + HostAPIMethodBridgesMessagesIngest, + HostAPIMethodBridgesInstancesGet, + HostAPIMethodBridgesInstancesReportState, } - } + + 
got := AllHostAPIMethods() + if len(got) != len(want) { + t.Fatalf("len(AllHostAPIMethods()) = %d, want %d", len(got), len(want)) + } + for idx := range want { + if got[idx] != want[idx] { + t.Fatalf("AllHostAPIMethods()[%d] = %q, want %q", idx, got[idx], want[idx]) + } + } + }) +} + +func TestCapabilityServiceMethodsShouldIncludeModelSourceMethod(t *testing.T) { + t.Parallel() + + t.Run("Should include model source method", func(t *testing.T) { + t.Parallel() + + got := CapabilityServiceMethods([]string{CapabilityProvideModelSource}) + want := []string{string(ExtensionServiceMethodModelsList)} + if len(got) != len(want) { + t.Fatalf("len(CapabilityServiceMethods(model.source)) = %d, want %d", len(got), len(want)) + } + for idx := range want { + if got[idx] != want[idx] { + t.Fatalf("CapabilityServiceMethods(model.source)[%d] = %q, want %q", idx, got[idx], want[idx]) + } + } + }) +} + +func TestCapabilityServiceMethodsShouldNormalizeModelSourceProvides(t *testing.T) { + t.Parallel() + + t.Run("Should normalize model source provides", func(t *testing.T) { + t.Parallel() + + got := CapabilityServiceMethods([]string{ + " ", + CapabilityProvideModelSource, + CapabilityProvideModelSource, + "unknown.provide", + }) + want := []string{string(ExtensionServiceMethodModelsList)} + if len(got) != len(want) { + t.Fatalf("len(CapabilityServiceMethods()) = %d, want %d", len(got), len(want)) + } + if got[0] != want[0] { + t.Fatalf("CapabilityServiceMethods()[0] = %q, want %q", got[0], want[0]) + } + if got := CapabilityServiceMethods(nil); got != nil { + t.Fatalf("CapabilityServiceMethods(nil) = %#v, want nil", got) + } + }) } diff --git a/internal/extension/tool_runtime.go b/internal/extension/tool_runtime.go index ef51107fb..ac3605d42 100644 --- a/internal/extension/tool_runtime.go +++ b/internal/extension/tool_runtime.go @@ -105,10 +105,22 @@ func (m *Manager) extensionServiceProcess( m.mu.RUnlock() return nil, name, fmt.Errorf("extension: extension %q is not active: %w", 
name, toolspkg.ErrToolUnavailable) } + provides := ext.info.Capabilities.Provides + if ext.manifest != nil { + provides = ext.manifest.Capabilities.Provides + } process := ext.process initialize := cloneInitializeResponse(ext.initialize) m.mu.RUnlock() + if !slices.Contains(extensionprotocol.CapabilityServiceMethods(provides), methodName) { + return nil, name, fmt.Errorf( + "extension: extension %q is not granted service method %q: %w", + name, + methodName, + toolspkg.ErrToolUnavailable, + ) + } if initialize == nil || !slices.Contains(initialize.ImplementedMethods, methodName) { return nil, name, fmt.Errorf( "extension: extension %q does not implement %q: %w", diff --git a/internal/modelcatalog/errors.go b/internal/modelcatalog/errors.go new file mode 100644 index 000000000..6816c77a3 --- /dev/null +++ b/internal/modelcatalog/errors.go @@ -0,0 +1,49 @@ +package modelcatalog + +import ( + "errors" + "fmt" +) + +var ( + // ErrAllSourcesFailed reports that refresh could not produce usable rows. + ErrAllSourcesFailed = errors.New("model catalog: all usable sources failed") + // ErrSourceDisabled reports that a source is intentionally disabled. + ErrSourceDisabled = errors.New("model catalog: source disabled") + // ErrSourceNotRegistered reports that a requested source id is not registered. + ErrSourceNotRegistered = errors.New("model catalog: source not registered") +) + +// StaleFallbackError reports a refresh failure that returned stale fallback rows. 
+type StaleFallbackError struct { + SourceID string + Err error +} + +func (e *StaleFallbackError) Error() string { + if e == nil { + return "model catalog: stale fallback" + } + if e.Err == nil { + return fmt.Sprintf("model catalog: source %q returned stale fallback", e.SourceID) + } + return fmt.Sprintf("model catalog: source %q returned stale fallback: %v", e.SourceID, e.Err) +} + +func (e *StaleFallbackError) Unwrap() error { + if e == nil { + return nil + } + return e.Err +} + +func sourceErrorText(err error) string { + if err == nil { + return "" + } + var fallback *StaleFallbackError + if errors.As(err, &fallback) && fallback.Err != nil { + return RedactString(fallback.Err.Error()) + } + return RedactString(err.Error()) +} diff --git a/internal/modelcatalog/hardcut_residue_test.go b/internal/modelcatalog/hardcut_residue_test.go new file mode 100644 index 000000000..0ae17a64a --- /dev/null +++ b/internal/modelcatalog/hardcut_residue_test.go @@ -0,0 +1,164 @@ +package modelcatalog + +import ( + "bufio" + "fmt" + "io/fs" + "os" + "path/filepath" + "runtime" + "strings" + "testing" +) + +func TestProviderModelHardCutResidueGuard(t *testing.T) { + t.Parallel() + + t.Run("Should find no old provider model config residue outside allowlisted surfaces", func(t *testing.T) { + t.Parallel() + + repoRoot := testRepoRoot(t) + fields := []string{ + "default_model", + "supported_models", + "supports_reasoning_effort", + } + var residues []string + for _, target := range []string{"cmd", "internal", "web", "packages/site", "openapi", "config.toml"} { + targetPath := filepath.Join(repoRoot, target) + info, err := os.Stat(targetPath) + if err != nil { + t.Fatalf("os.Stat(%q) error = %v", targetPath, err) + } + if !info.IsDir() { + residues = appendResiduesFromFile(t, residues, repoRoot, targetPath, fields) + continue + } + err = filepath.WalkDir(targetPath, func(path string, entry fs.DirEntry, walkErr error) error { + if walkErr != nil { + return walkErr + } + if entry.IsDir() 
{ + if skipResidueGuardDir(entry.Name()) { + return filepath.SkipDir + } + return nil + } + residues = appendResiduesFromFile(t, residues, repoRoot, path, fields) + return nil + }) + if err != nil { + t.Fatalf("WalkDir(%q) error = %v", targetPath, err) + } + } + + if len(residues) > 0 { + t.Fatalf( + "provider model hard-cut residue found in non-test surfaces:\n%s", + strings.Join(residues, "\n"), + ) + } + }) +} + +func testRepoRoot(t *testing.T) string { + t.Helper() + + _, file, _, ok := runtime.Caller(0) + if !ok { + t.Fatal("runtime.Caller() failed") + } + return filepath.Clean(filepath.Join(filepath.Dir(file), "..", "..")) +} + +func appendResiduesFromFile( + t *testing.T, + residues []string, + repoRoot string, + path string, + fields []string, +) []string { + t.Helper() + + rel, err := filepath.Rel(repoRoot, path) + if err != nil { + t.Fatalf("filepath.Rel(%q, %q) error = %v", repoRoot, path, err) + } + rel = filepath.ToSlash(rel) + if skipResidueGuardFile(rel) { + return residues + } + file, err := os.Open(path) + if err != nil { + t.Fatalf("os.Open(%q) error = %v", path, err) + } + defer func() { + if closeErr := file.Close(); closeErr != nil { + t.Errorf("Close(%q) error = %v", path, closeErr) + } + }() + + scanner := bufio.NewScanner(file) + scanner.Buffer(make([]byte, 1024), 1024*1024) + lineNo := 0 + for scanner.Scan() { + lineNo++ + line := scanner.Text() + for _, field := range fields { + if !strings.Contains(line, field) { + continue + } + if allowedProviderModelResidue(rel, line, field) { + continue + } + residues = append(residues, fmt.Sprintf("%s:%d contains %s", rel, lineNo, field)) + } + } + if err := scanner.Err(); err != nil { + t.Fatalf("Scan(%q) error = %v", path, err) + } + return residues +} + +func skipResidueGuardDir(name string) bool { + switch name { + case ".git", ".next", ".tmp", ".turbo", "coverage", "dist", "node_modules", "out", "storybook-static": + return true + default: + return false + } +} + +func skipResidueGuardFile(rel 
string) bool { + base := filepath.Base(rel) + if strings.HasSuffix(base, "_test.go") || + strings.Contains(base, ".test.") || + strings.Contains(base, ".spec.") || + strings.HasSuffix(base, ".snap") { + return true + } + return strings.Contains(rel, "/__tests__/") || strings.Contains(rel, "/testdata/") +} + +func allowedProviderModelResidue(rel string, line string, field string) bool { + if rel == "internal/config/merge.go" { + return strings.Contains(line, fmt.Sprintf("%q", field)) + } + if rel == "packages/site/content/runtime/core/agents/providers.mdx" || + rel == "packages/site/content/runtime/core/configuration/config-toml.mdx" { + return strings.Contains(line, "flat keys") || strings.Contains(line, "are no longer") + } + if field != "supported_models" { + return false + } + switch rel { + case "internal/api/contract/contract.go", + "web/src/generated/agh-openapi.d.ts", + "openapi/agh.json", + "web/src/systems/session/mocks/fixtures.ts", + "web/src/systems/network/mocks/fixtures.ts": + return true + default: + return false + } +} diff --git a/internal/modelcatalog/live_sources.go b/internal/modelcatalog/live_sources.go new file mode 100644 index 000000000..cde314761 --- /dev/null +++ b/internal/modelcatalog/live_sources.go @@ -0,0 +1,1034 @@ +package modelcatalog + +import ( + "bytes" + "context" + "encoding/json" + "errors" + "fmt" + "io" + "maps" + "net/http" + "net/url" + "os" + "os/exec" + "sort" + "strconv" + "strings" + "time" + + "github.com/kballard/go-shellquote" + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/procutil" + "github.com/pedronauck/agh/internal/providerenv" + "github.com/pedronauck/agh/internal/vault" +) + +const ( + defaultLiveDiscoveryTimeout = 10 * time.Second + maxLiveDiscoveryPayloadSize = 8 << 20 +) + +// ProviderSecretResolver resolves provider credential refs for live discovery. 
+type ProviderSecretResolver interface { + ResolveRef(ctx context.Context, ref string) (string, error) +} + +// EnvSecretResolver resolves env: secret refs from an environment lookup. +type EnvSecretResolver struct { + LookupEnv func(string) (string, bool) +} + +var _ ProviderSecretResolver = EnvSecretResolver{} + +// ResolveRef resolves one env-backed provider credential ref. +func (r EnvSecretResolver) ResolveRef(ctx context.Context, ref string) (string, error) { + if ctx == nil { + return "", fmt.Errorf("model catalog: provider secret context is required") + } + if err := ctx.Err(); err != nil { + return "", err + } + normalized := vault.NormalizeRef(ref) + if !vault.IsEnvRef(normalized) { + return "", fmt.Errorf("%w: %s", vault.ErrUnsupportedSecretRef, normalized) + } + envName, err := vault.EnvNameFromRef(normalized) + if err != nil { + return "", err + } + lookup := r.LookupEnv + if lookup == nil { + lookup = os.LookupEnv + } + value, ok := lookup(envName) + if !ok || strings.TrimSpace(value) == "" { + return "", fmt.Errorf("%w: env:%s", vault.ErrMissingSecret, envName) + } + return value, nil +} + +// DiscoveryCommandRequest describes one timeout-bound discovery subprocess. +type DiscoveryCommandRequest struct { + ProviderID string + Command string + Args []string + Env []string + Timeout time.Duration +} + +// DiscoveryCommandResult captures safe subprocess output for parsing. +type DiscoveryCommandResult struct { + Stdout string + Stderr string + ExitCode int +} + +// DiscoveryCommandExecutor runs a provider discovery command. +type DiscoveryCommandExecutor interface { + RunDiscoveryCommand(ctx context.Context, req DiscoveryCommandRequest) (DiscoveryCommandResult, error) +} + +// ExecDiscoveryCommandExecutor runs discovery commands as subprocesses. +type ExecDiscoveryCommandExecutor struct{} + +var _ DiscoveryCommandExecutor = ExecDiscoveryCommandExecutor{} + +// RunDiscoveryCommand runs one subprocess with the caller-supplied deadline. 
+func (ExecDiscoveryCommandExecutor) RunDiscoveryCommand( + ctx context.Context, + req DiscoveryCommandRequest, +) (DiscoveryCommandResult, error) { + if ctx == nil { + return DiscoveryCommandResult{}, fmt.Errorf("model catalog: discovery command context is required") + } + if strings.TrimSpace(req.Command) == "" { + return DiscoveryCommandResult{}, fmt.Errorf("model catalog: discovery command is required") + } + // #nosec G204 -- discovery commands come from validated provider model discovery config. + cmd := exec.CommandContext(ctx, req.Command, req.Args...) + cmd.Env = append([]string(nil), req.Env...) + var stdout bytes.Buffer + var stderr bytes.Buffer + cmd.Stdout = &stdout + cmd.Stderr = &stderr + err := cmd.Run() + result := DiscoveryCommandResult{ + Stdout: strings.TrimSpace(stdout.String()), + Stderr: strings.TrimSpace(stderr.String()), + } + if cmd.ProcessState != nil { + result.ExitCode = cmd.ProcessState.ExitCode() + } + if err != nil { + if errors.Is(ctx.Err(), context.DeadlineExceeded) { + return result, fmt.Errorf("model catalog: discovery command timed out after %s: %w", req.Timeout, ctx.Err()) + } + return result, fmt.Errorf("model catalog: discovery command failed: %w", err) + } + return result, nil +} + +// LiveProviderSourcesConfig configures built-in provider live discovery sources. +type LiveProviderSourcesConfig struct { + Providers map[string]aghconfig.ProviderConfig + HomePaths aghconfig.HomePaths + BaseEnv []string + SecretResolver ProviderSecretResolver + HTTPClient *http.Client + CommandExecutor DiscoveryCommandExecutor + DefaultTimeout time.Duration +} + +// NewLiveProviderSources creates provider_live sources for known provider adapters. 
+func NewLiveProviderSources(cfg LiveProviderSourcesConfig) ([]Source, error) { + providers := aghconfig.BuiltinProviders() + maps.Copy(providers, cfg.Providers) + providerIDs := make([]string, 0, len(providers)) + for providerID := range providers { + if _, ok := liveProviderAdapters[providerID]; ok { + providerIDs = append(providerIDs, providerID) + } + } + sort.Strings(providerIDs) + sources := make([]Source, 0, len(providerIDs)) + for _, providerID := range providerIDs { + source, err := NewLiveProviderSource(providerID, providers[providerID], cfg) + if err != nil { + return nil, err + } + sources = append(sources, source) + } + return sources, nil +} + +// NewLiveProviderSource creates one provider_live source. +func NewLiveProviderSource( + providerID string, + provider aghconfig.ProviderConfig, + cfg LiveProviderSourcesConfig, +) (*LiveProviderSource, error) { + trimmedProviderID := strings.TrimSpace(providerID) + adapter, ok := liveProviderAdapters[trimmedProviderID] + if !ok { + return nil, fmt.Errorf( + "model catalog: live discovery adapter for provider %q is not registered", + trimmedProviderID, + ) + } + sourceID := SourceKindProviderLiveID(trimmedProviderID) + if err := ValidateSourceIdentity(sourceID, SourceKindProviderLive); err != nil { + return nil, err + } + timeout := cfg.DefaultTimeout + if timeout <= 0 { + timeout = defaultLiveDiscoveryTimeout + } + executor := cfg.CommandExecutor + if executor == nil { + executor = ExecDiscoveryCommandExecutor{} + } + secretResolver := cfg.SecretResolver + if secretResolver == nil { + secretResolver = EnvSecretResolver{} + } + return &LiveProviderSource{ + providerID: trimmedProviderID, + provider: provider, + adapter: adapter, + sourceID: sourceID, + homePaths: cfg.HomePaths, + baseEnv: append([]string(nil), cfg.BaseEnv...), + secretResolver: secretResolver, + httpClient: cfg.HTTPClient, + commandExecutor: executor, + defaultTimeout: timeout, + }, nil +} + +// SourceKindProviderLiveID returns the stable 
source id for a live provider source. +func SourceKindProviderLiveID(providerID string) string { + return string(SourceKindProviderLive) + ":" + strings.TrimSpace(providerID) +} + +// LiveProviderSource performs side-effect-free model discovery for one provider. +type LiveProviderSource struct { + providerID string + provider aghconfig.ProviderConfig + adapter liveProviderAdapter + sourceID string + homePaths aghconfig.HomePaths + baseEnv []string + secretResolver ProviderSecretResolver + httpClient *http.Client + commandExecutor DiscoveryCommandExecutor + defaultTimeout time.Duration +} + +var _ Source = (*LiveProviderSource)(nil) + +// ID returns the provider_live source id. +func (s *LiveProviderSource) ID() string { + return s.sourceID +} + +// Kind returns provider_live. +func (s *LiveProviderSource) Kind() SourceKind { + return SourceKindProviderLive +} + +// Priority returns the provider_live merge priority. +func (s *LiveProviderSource) Priority() int { + return PriorityProviderLive +} + +// ProviderIDs returns the single AGH provider id this source owns. +func (s *LiveProviderSource) ProviderIDs() []string { + return []string{s.providerID} +} + +// ListModels discovers live provider models without touching ACP sessions. 
+func (s *LiveProviderSource) ListModels(ctx context.Context, opts ListOptions) ([]ModelRow, error) { + if ctx == nil { + return nil, fmt.Errorf("model catalog: live provider context is required") + } + if requested := strings.TrimSpace(opts.ProviderID); requested != "" && requested != s.providerID { + return nil, nil + } + target, err := s.discoveryTarget() + if err != nil { + return nil, err + } + env, err := s.discoveryEnv(ctx) + if err != nil { + return nil, err + } + timeout := target.timeout + if timeout <= 0 { + timeout = s.defaultTimeout + } + runCtx, cancel := context.WithTimeout(ctx, timeout) + defer cancel() + + now := defaultNow(opts.Now) + switch target.kind { + case liveDiscoveryHTTP: + rows, err := s.listHTTP(runCtx, target.endpoint, env, timeout, now) + if err != nil { + return nil, err + } + return rows, nil + case liveDiscoveryCommand: + rows, err := s.listCommand(runCtx, target.command, env, timeout, now) + if err != nil { + return nil, err + } + return rows, nil + default: + return nil, fmt.Errorf("model catalog: provider %q has no side-effect-free model discovery path", s.providerID) + } +} + +type liveDiscoveryKind string + +const ( + liveDiscoveryNone liveDiscoveryKind = "" + liveDiscoveryHTTP liveDiscoveryKind = "http" + liveDiscoveryCommand liveDiscoveryKind = "command" +) + +type liveAuthScheme string + +const ( + liveAuthNone liveAuthScheme = "" + liveAuthBearer liveAuthScheme = "bearer" + liveAuthAnthropic liveAuthScheme = "anthropic" + liveAuthGemini liveAuthScheme = "gemini" +) + +type liveProviderAdapter struct { + defaultKind liveDiscoveryKind + defaultEndpoint string + defaultCommand string + authScheme liveAuthScheme + authRequired bool + credentialEnvKeys []string + headers map[string]string +} + +type liveDiscoveryTarget struct { + kind liveDiscoveryKind + endpoint string + command string + timeout time.Duration +} + +var liveProviderAdapters = map[string]liveProviderAdapter{ + "codex": { + defaultKind: liveDiscoveryHTTP, + 
defaultEndpoint: "https://api.openai.com/v1/models", + authScheme: liveAuthBearer, + authRequired: true, + credentialEnvKeys: []string{"OPENAI_API_KEY"}, + }, + "openai": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "https://api.openai.com/v1/models", + authScheme: liveAuthBearer, + authRequired: true, + credentialEnvKeys: []string{"OPENAI_API_KEY"}, + }, + "claude": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "https://api.anthropic.com/v1/models", + authScheme: liveAuthAnthropic, + authRequired: true, + credentialEnvKeys: []string{"ANTHROPIC_API_KEY"}, + headers: map[string]string{"anthropic-version": "2023-06-01"}, + }, + "anthropic": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "https://api.anthropic.com/v1/models", + authScheme: liveAuthAnthropic, + authRequired: true, + credentialEnvKeys: []string{"ANTHROPIC_API_KEY"}, + headers: map[string]string{"anthropic-version": "2023-06-01"}, + }, + "gemini": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "https://generativelanguage.googleapis.com/v1beta/models", + authScheme: liveAuthGemini, + authRequired: true, + credentialEnvKeys: []string{"GEMINI_API_KEY", "GOOGLE_API_KEY"}, + }, + "openrouter": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "https://openrouter.ai/api/v1/models", + authScheme: liveAuthBearer, + authRequired: true, + credentialEnvKeys: []string{"OPENROUTER_API_KEY"}, + }, + "vercel-ai-gateway": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "https://ai-gateway.vercel.sh/v1/models", + authScheme: liveAuthBearer, + authRequired: false, + credentialEnvKeys: []string{"AI_GATEWAY_API_KEY", "VERCEL_AI_GATEWAY_API_KEY"}, + }, + "ollama": { + defaultKind: liveDiscoveryHTTP, + defaultEndpoint: "http://localhost:11434/api/tags", + }, + "opencode": { + defaultKind: liveDiscoveryCommand, + defaultCommand: "opencode models", + }, + "openclaw": { + defaultKind: liveDiscoveryNone, + }, + "hermes": { + defaultKind: liveDiscoveryNone, + }, + "pi": { + 
defaultKind: liveDiscoveryNone, + }, +} + +func (s *LiveProviderSource) discoveryTarget() (liveDiscoveryTarget, error) { + discovery := s.provider.Models.Discovery + configuredCommand := strings.TrimSpace(discovery.Command) + configuredEndpoint := strings.TrimSpace(discovery.Endpoint) + hasConfiguredPath := configuredCommand != "" || configuredEndpoint != "" + if discovery.Enabled != nil && !*discovery.Enabled { + return liveDiscoveryTarget{}, ErrSourceDisabled + } + if s.adapter.defaultKind == liveDiscoveryNone && discovery.Enabled == nil { + if hasConfiguredPath { + return liveDiscoveryTarget{}, ErrSourceDisabled + } + return liveDiscoveryTarget{}, fmt.Errorf( + "model catalog: provider %q has no configured side-effect-free model discovery command or endpoint", + s.providerID, + ) + } + timeout, err := s.discoveryTimeout(discovery.Timeout) + if err != nil { + return liveDiscoveryTarget{}, err + } + if configuredEndpoint != "" { + return liveDiscoveryTarget{kind: liveDiscoveryHTTP, endpoint: configuredEndpoint, timeout: timeout}, nil + } + if configuredCommand != "" { + return liveDiscoveryTarget{kind: liveDiscoveryCommand, command: configuredCommand, timeout: timeout}, nil + } + switch s.adapter.defaultKind { + case liveDiscoveryHTTP: + return liveDiscoveryTarget{ + kind: liveDiscoveryHTTP, + endpoint: s.defaultEndpoint(), + timeout: timeout, + }, nil + case liveDiscoveryCommand: + return liveDiscoveryTarget{ + kind: liveDiscoveryCommand, + command: s.adapter.defaultCommand, + timeout: timeout, + }, nil + default: + return liveDiscoveryTarget{}, fmt.Errorf( + "model catalog: provider %q has no configured side-effect-free model discovery command or endpoint", + s.providerID, + ) + } +} + +func (s *LiveProviderSource) discoveryTimeout(raw string) (time.Duration, error) { + trimmed := strings.TrimSpace(raw) + if trimmed == "" { + return s.defaultTimeout, nil + } + timeout, err := time.ParseDuration(trimmed) + if err != nil || timeout <= 0 { + return 0, 
fmt.Errorf("model catalog: provider %q discovery timeout must be a positive duration", s.providerID) + } + return timeout, nil +} + +func (s *LiveProviderSource) defaultEndpoint() string { + baseURL := strings.TrimSpace(s.provider.BaseURL) + if baseURL == "" { + return s.adapter.defaultEndpoint + } + return joinEndpoint(baseURL, defaultEndpointPath(s.adapter.defaultEndpoint)) +} + +func joinEndpoint(baseURL string, path string) string { + trimmedBase := strings.TrimRight(strings.TrimSpace(baseURL), "/") + if trimmedBase == "" { + return path + } + trimmedPath := strings.TrimSpace(path) + if trimmedPath == "" { + return trimmedBase + } + if parsed, err := url.Parse(trimmedBase); err == nil { + basePath := strings.TrimRight(parsed.Path, "/") + switch { + case strings.HasSuffix(basePath, "/v1") && strings.HasPrefix(trimmedPath, "/v1/"): + trimmedPath = strings.TrimPrefix(trimmedPath, "/v1") + case strings.HasSuffix(basePath, "/v1beta") && strings.HasPrefix(trimmedPath, "/v1beta/"): + trimmedPath = strings.TrimPrefix(trimmedPath, "/v1beta") + case strings.HasSuffix(basePath, "/api/v1") && strings.HasPrefix(trimmedPath, "/api/v1/"): + trimmedPath = strings.TrimPrefix(trimmedPath, "/api/v1") + case strings.HasSuffix(basePath, "/api") && strings.HasPrefix(trimmedPath, "/api/"): + trimmedPath = strings.TrimPrefix(trimmedPath, "/api") + } + } + if strings.HasPrefix(trimmedPath, "/") { + return trimmedBase + trimmedPath + } + return trimmedBase + "/" + trimmedPath +} + +func defaultEndpointPath(endpoint string) string { + parsed, err := url.Parse(endpoint) + if err != nil || parsed.Path == "" { + return "" + } + if parsed.RawQuery != "" { + return parsed.Path + "?" + parsed.RawQuery + } + return parsed.Path +} + +func (s *LiveProviderSource) discoveryEnv(ctx context.Context) ([]string, error) { + env := append([]string(nil), s.baseEnv...) 
+ switch s.provider.EffectiveEnvPolicy() { + case aghconfig.ProviderEnvPolicyIsolated: + env = procutil.IsolatedDaemonEnv(env) + default: + env = procutil.FilteredDaemonEnv(env) + } + env = providerenv.SetEnvValue(env, "AGH_PROVIDER", s.providerID) + env = providerenv.SetEnvValue(env, "AGH_PROVIDER_AUTH_MODE", string(s.provider.EffectiveAuthMode())) + env = providerenv.SetEnvValue(env, "AGH_PROVIDER_ENV_POLICY", string(s.provider.EffectiveEnvPolicy())) + env = providerenv.SetEnvValue(env, "AGH_PROVIDER_HOME_POLICY", string(s.provider.EffectiveHomePolicy())) + + var err error + env, err = providerenv.ApplyHomePolicy(s.homePaths, s.providerID, s.provider.EffectiveHomePolicy(), env) + if err != nil { + return nil, fmt.Errorf("model catalog: apply provider home policy for %q: %w", s.providerID, err) + } + if s.provider.EffectiveHarness() == aghconfig.ProviderHarnessPiACP && + s.provider.EffectiveAuthMode() == aghconfig.ProviderAuthModeNativeCLI { + env, err = providerenv.ApplyPiAgentDirPolicy(s.homePaths, s.providerID, s.provider.EffectiveHomePolicy(), env) + if err != nil { + return nil, fmt.Errorf("model catalog: apply pi discovery home policy for %q: %w", s.providerID, err) + } + } + if s.provider.EffectiveAuthMode() != aghconfig.ProviderAuthModeBoundSecret { + return env, nil + } + for _, slot := range s.provider.EffectiveCredentialSlots() { + next, err := s.injectProviderSecret(ctx, env, slot) + if err != nil { + return nil, err + } + env = next + } + return env, nil +} + +func (s *LiveProviderSource) injectProviderSecret( + ctx context.Context, + env []string, + slot aghconfig.ProviderCredentialSlot, +) ([]string, error) { + targetEnv := strings.TrimSpace(slot.TargetEnv) + secretRef := vault.NormalizeRef(slot.SecretRef) + if targetEnv == "" || secretRef == "" { + return env, nil + } + value, err := s.secretResolver.ResolveRef(ctx, secretRef) + if err != nil { + if !slot.Required && (errors.Is(err, vault.ErrMissingSecret) || errors.Is(err, 
vault.ErrSecretNotFound)) { + return env, nil + } + return nil, fmt.Errorf("model catalog: resolve provider credential %q for %q: %w", slot.Name, s.providerID, err) + } + return providerenv.SetEnvValue(env, targetEnv, value), nil +} + +func (s *LiveProviderSource) listHTTP( + ctx context.Context, + endpoint string, + env []string, + timeout time.Duration, + now time.Time, +) (rows []ModelRow, err error) { + req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, http.NoBody) + if err != nil { + return nil, fmt.Errorf("model catalog: create live discovery request for %q: %w", s.providerID, err) + } + for key, value := range s.adapter.headers { + req.Header.Set(key, value) + } + if err := s.applyRequestAuth(req, env); err != nil { + return nil, err + } + client := s.httpClient + if client == nil { + client = &http.Client{Timeout: timeout} + } + resp, err := client.Do(req) + if err != nil { + if errors.Is(ctx.Err(), context.DeadlineExceeded) { + return nil, fmt.Errorf( + "model catalog: live discovery for %q timed out after %s: %w", + s.providerID, + timeout, + ctx.Err(), + ) + } + return nil, fmt.Errorf("model catalog: fetch live models for %q: %w", s.providerID, err) + } + defer func() { + if _, copyErr := io.Copy(io.Discard, resp.Body); copyErr != nil && err == nil { + err = fmt.Errorf("model catalog: drain live discovery response for %q: %w", s.providerID, copyErr) + } + if closeErr := resp.Body.Close(); closeErr != nil && err == nil { + err = fmt.Errorf("model catalog: close live discovery response for %q: %w", s.providerID, closeErr) + } + }() + if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices { + return nil, fmt.Errorf("model catalog: live discovery for %q returned HTTP %d", s.providerID, resp.StatusCode) + } + payload, err := io.ReadAll(io.LimitReader(resp.Body, maxLiveDiscoveryPayloadSize)) + if err != nil { + return nil, fmt.Errorf("model catalog: read live discovery response for %q: %w", s.providerID, err) + 
} + rows, err = parseLiveModelPayload(s.providerID, payload, now) + if err != nil { + return nil, err + } + return rows, nil +} + +func (s *LiveProviderSource) applyRequestAuth(req *http.Request, env []string) error { + credential := firstEnvValue(env, s.adapter.credentialEnvKeys...) + if credential == "" && s.adapter.authRequired { + return fmt.Errorf( + "model catalog: provider %q live discovery requires a bound_secret credential", + s.providerID, + ) + } + if credential == "" { + return nil + } + switch s.adapter.authScheme { + case liveAuthBearer: + req.Header.Set("Authorization", "Bearer "+credential) + case liveAuthAnthropic: + req.Header.Set("x-api-key", credential) + case liveAuthGemini: + req.Header.Set("x-goog-api-key", credential) + case liveAuthNone: + return nil + default: + return fmt.Errorf("model catalog: unsupported live discovery auth scheme %q", s.adapter.authScheme) + } + return nil +} + +func (s *LiveProviderSource) listCommand( + ctx context.Context, + command string, + env []string, + timeout time.Duration, + now time.Time, +) ([]ModelRow, error) { + bin, args, err := parseDiscoveryCommand(command) + if err != nil { + return nil, err + } + result, err := s.commandExecutor.RunDiscoveryCommand(ctx, DiscoveryCommandRequest{ + ProviderID: s.providerID, + Command: bin, + Args: args, + Env: env, + Timeout: timeout, + }) + if err != nil { + detail := firstNonEmptyLine(result.Stderr) + if detail == "" { + detail = firstNonEmptyLine(result.Stdout) + } + if detail != "" { + return nil, fmt.Errorf("%w: %s", err, RedactString(detail)) + } + return nil, err + } + if result.ExitCode != 0 { + detail := firstNonEmptyLine(result.Stderr) + if detail == "" { + detail = firstNonEmptyLine(result.Stdout) + } + if detail == "" { + detail = "no diagnostic output" + } + return nil, fmt.Errorf( + "model catalog: discovery command for %q exited %d: %s", + s.providerID, + result.ExitCode, + RedactString(detail), + ) + } + rows, err := parseLiveModelPayload(s.providerID, 
[]byte(result.Stdout), now) + if err == nil { + return rows, nil + } + lineRows := parseLineModelRows(s.providerID, result.Stdout, now) + if len(lineRows) > 0 { + return lineRows, nil + } + return nil, fmt.Errorf("model catalog: parse discovery command output for %q: %w", s.providerID, err) +} + +func parseDiscoveryCommand(command string) (string, []string, error) { + parts, err := shellquote.Split(command) + if err != nil { + return "", nil, fmt.Errorf("model catalog: parse discovery command %q: %w", command, err) + } + if len(parts) == 0 { + return "", nil, fmt.Errorf("model catalog: discovery command is empty") + } + return parts[0], parts[1:], nil +} + +type livePayloadEnvelope struct { + Data []liveRawModel `json:"data"` + Models []liveRawModel `json:"models"` +} + +type liveRawModel struct { + ID string `json:"id"` + Name string `json:"name"` + Model string `json:"model"` + Value string `json:"value"` + Label string `json:"label"` + DisplayName string `json:"display_name"` + DisplayNameCamel string `json:"displayName"` + ContextWindow *int64 `json:"context_window"` + ContextLength *int64 `json:"context_length"` + MaxTokens *int64 `json:"max_tokens"` + MaxInputTokens *int64 `json:"max_input_tokens"` + MaxInputTokensCamel *int64 `json:"maxInputTokens"` + MaxOutputTokens *int64 `json:"max_output_tokens"` + MaxOutputTokensCamel *int64 `json:"maxOutputTokens"` + InputTokenLimit *int64 `json:"inputTokenLimit"` + OutputTokenLimit *int64 `json:"outputTokenLimit"` + SupportedGenerationMethods []string `json:"supportedGenerationMethods"` + SupportedParameters []string `json:"supported_parameters"` + SupportsTools *bool `json:"supports_tools"` + SupportsToolsCamel *bool `json:"supportsTools"` + ToolCall *bool `json:"tool_call"` + SupportsReasoning *bool `json:"supports_reasoning"` + SupportsReasoningCamel *bool `json:"supportsReasoning"` + SupportsEffort *bool `json:"supportsEffort"` + ReasoningEfforts []string `json:"reasoning_efforts"` + SupportedEffortLevels []string 
`json:"supportedEffortLevels"` + DefaultReasoningEffort string `json:"default_reasoning_effort"` + Pricing liveRawPricing `json:"pricing"` + Cost liveRawPricing `json:"cost"` + Raw json.RawMessage `json:"-"` +} + +type liveRawPricing struct { + Input json.RawMessage `json:"input"` + Output json.RawMessage `json:"output"` + Prompt json.RawMessage `json:"prompt"` + Completion json.RawMessage `json:"completion"` +} + +func parseLiveModelPayload(providerID string, payload []byte, now time.Time) ([]ModelRow, error) { + trimmed := bytes.TrimSpace(payload) + if len(trimmed) == 0 { + return nil, fmt.Errorf("model catalog: live discovery for %q returned empty payload", providerID) + } + rawModels, err := decodeLiveRawModels(trimmed) + if err != nil { + return nil, err + } + rows := make([]ModelRow, 0, len(rawModels)) + seen := make(map[string]struct{}, len(rawModels)) + for index := range rawModels { + row, ok := liveModelRow(providerID, &rawModels[index], now) + if !ok { + continue + } + if _, exists := seen[row.ModelID]; exists { + continue + } + seen[row.ModelID] = struct{}{} + rows = append(rows, row) + } + sortModelRowsByID(rows) + return rows, nil +} + +func decodeLiveRawModels(payload []byte) ([]liveRawModel, error) { + var array []liveRawModel + if err := json.Unmarshal(payload, &array); err == nil && len(array) > 0 { + return array, nil + } + var envelope livePayloadEnvelope + if err := json.Unmarshal(payload, &envelope); err == nil { + if len(envelope.Data) > 0 { + return envelope.Data, nil + } + if len(envelope.Models) > 0 { + return envelope.Models, nil + } + } + var objectMap map[string]liveRawModel + if err := json.Unmarshal(payload, &objectMap); err == nil && len(objectMap) > 0 { + keys := make([]string, 0, len(objectMap)) + for key := range objectMap { + keys = append(keys, key) + } + sort.Strings(keys) + models := make([]liveRawModel, 0, len(keys)) + for _, key := range keys { + model := objectMap[key] + if strings.TrimSpace(model.ID) == "" { + model.ID = 
key + } + models = append(models, model) + } + return models, nil + } + return nil, fmt.Errorf("model catalog: live discovery payload did not contain model rows") +} + +func liveModelRow(providerID string, raw *liveRawModel, now time.Time) (ModelRow, bool) { + if raw == nil { + return ModelRow{}, false + } + modelID := firstNonBlank(raw.ID, raw.Model, raw.Value, raw.Name) + modelID = strings.TrimPrefix(modelID, "models/") + if modelID == "" { + return ModelRow{}, false + } + available := true + row := ModelRow{ + ProviderID: providerID, + ModelID: modelID, + DisplayName: firstNonBlank(raw.DisplayName, raw.DisplayNameCamel, raw.Label, raw.Name), + SourceID: SourceKindProviderLiveID(providerID), + SourceKind: SourceKindProviderLive, + Priority: PriorityProviderLive, + Available: &available, + RefreshedAt: now, + ContextWindow: firstInt64(raw.ContextWindow, raw.ContextLength), + MaxInputTokens: firstInt64(raw.MaxInputTokens, raw.MaxInputTokensCamel, raw.InputTokenLimit), + MaxOutputTokens: firstInt64( + raw.MaxOutputTokens, + raw.MaxOutputTokensCamel, + raw.MaxTokens, + raw.OutputTokenLimit, + ), + SupportsTools: liveSupportsTools(raw), + SupportsReasoning: firstBool(raw.SupportsReasoning, raw.SupportsReasoningCamel, raw.SupportsEffort), + ReasoningEfforts: normalizedReasoningEfforts(raw.ReasoningEfforts, raw.SupportedEffortLevels), + CostInputPerMillion: livePricePerMillion(raw.Cost.Input, raw.Pricing.Input, raw.Pricing.Prompt), + CostOutputPerMillion: livePricePerMillion(raw.Cost.Output, raw.Pricing.Output, raw.Pricing.Completion), + DefaultReasoningEffort: normalizedDefaultReasoningEffort(raw.DefaultReasoningEffort), + } + if row.SupportsReasoning == nil && len(row.ReasoningEfforts) > 0 { + value := true + row.SupportsReasoning = &value + } + return row, true +} + +func parseLineModelRows(providerID string, stdout string, now time.Time) []ModelRow { + lines := strings.Split(stdout, "\n") + rows := make([]ModelRow, 0, len(lines)) + seen := make(map[string]struct{}, 
len(lines)) + for _, rawLine := range lines { + line := strings.TrimSpace(rawLine) + if line == "" { + continue + } + firstToken := strings.Fields(line)[0] + if strings.EqualFold(firstToken, "id") || strings.EqualFold(firstToken, "model") { + continue + } + if !strings.Contains(firstToken, "/") && providerID == "opencode" { + continue + } + modelID := strings.TrimSpace(firstToken) + if modelID == "" { + continue + } + if _, exists := seen[modelID]; exists { + continue + } + seen[modelID] = struct{}{} + available := true + rows = append(rows, ModelRow{ + ProviderID: providerID, + ModelID: modelID, + DisplayName: modelID, + SourceID: SourceKindProviderLiveID(providerID), + SourceKind: SourceKindProviderLive, + Priority: PriorityProviderLive, + Available: &available, + RefreshedAt: now, + }) + } + sortModelRowsByID(rows) + return rows +} + +func liveSupportsTools(raw *liveRawModel) *bool { + if raw == nil { + return nil + } + if value := firstBool(raw.SupportsTools, raw.SupportsToolsCamel, raw.ToolCall); value != nil { + return value + } + for _, parameter := range raw.SupportedParameters { + normalized := strings.ToLower(strings.TrimSpace(parameter)) + if normalized == "tools" || normalized == "tool_choice" { + value := true + return &value + } + } + for _, method := range raw.SupportedGenerationMethods { + if strings.EqualFold(strings.TrimSpace(method), "generateContent") { + value := true + return &value + } + } + return nil +} + +func normalizedReasoningEfforts(groups ...[]string) []ReasoningEffort { + efforts := make([]ReasoningEffort, 0) + seen := make(map[ReasoningEffort]struct{}) + for _, group := range groups { + for _, raw := range group { + effort, ok := normalizeReasoningEffort(raw) + if !ok { + continue + } + if _, exists := seen[effort]; exists { + continue + } + seen[effort] = struct{}{} + efforts = append(efforts, effort) + } + } + return efforts +} + +func normalizedDefaultReasoningEffort(raw string) *ReasoningEffort { + effort, ok := 
normalizeReasoningEffort(raw) + if !ok { + return nil + } + return &effort +} + +func normalizeReasoningEffort(raw string) (ReasoningEffort, bool) { + switch ReasoningEffort(strings.ToLower(strings.TrimSpace(raw))) { + case ReasoningEffortMinimal: + return ReasoningEffortMinimal, true + case ReasoningEffortLow: + return ReasoningEffortLow, true + case ReasoningEffortMedium: + return ReasoningEffortMedium, true + case ReasoningEffortHigh: + return ReasoningEffortHigh, true + case ReasoningEffortXHigh: + return ReasoningEffortXHigh, true + default: + return "", false + } +} + +func livePricePerMillion(values ...json.RawMessage) *float64 { + for _, raw := range values { + if len(bytes.TrimSpace(raw)) == 0 { + continue + } + value, ok := parseJSONFloat(raw) + if !ok { + continue + } + perMillion := value * 1_000_000 + return &perMillion + } + return nil +} + +func parseJSONFloat(raw json.RawMessage) (float64, bool) { + var number float64 + if err := json.Unmarshal(raw, &number); err == nil { + return number, true + } + var text string + if err := json.Unmarshal(raw, &text); err != nil { + return 0, false + } + parsed, err := strconv.ParseFloat(strings.TrimSpace(text), 64) + if err != nil { + return 0, false + } + return parsed, true +} + +func firstEnvValue(env []string, keys ...string) string { + keySet := make(map[string]struct{}, len(keys)) + for _, key := range keys { + if trimmed := strings.TrimSpace(key); trimmed != "" { + keySet[trimmed] = struct{}{} + } + } + for _, entry := range env { + key, value, ok := strings.Cut(entry, "=") + if !ok { + continue + } + if _, exists := keySet[key]; exists && strings.TrimSpace(value) != "" { + return value + } + } + return "" +} + +func firstNonBlank(values ...string) string { + for _, value := range values { + trimmed := strings.TrimSpace(value) + if trimmed != "" { + return trimmed + } + } + return "" +} + +func firstNonEmptyLine(text string) string { + for line := range strings.SplitSeq(text, "\n") { + if trimmed := 
strings.TrimSpace(line); trimmed != "" { + return trimmed + } + } + return "" +} + +func sortModelRowsByID(rows []ModelRow) { + sort.SliceStable(rows, func(i, j int) bool { + return rows[i].ModelID < rows[j].ModelID + }) +} diff --git a/internal/modelcatalog/live_sources_test.go b/internal/modelcatalog/live_sources_test.go new file mode 100644 index 000000000..798149773 --- /dev/null +++ b/internal/modelcatalog/live_sources_test.go @@ -0,0 +1,893 @@ +package modelcatalog + +import ( + "context" + "errors" + "fmt" + "net/http" + "net/http/httptest" + "os" + "slices" + "strings" + "sync" + "sync/atomic" + "testing" + "time" + + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/testutil" + "github.com/pedronauck/agh/internal/vault" +) + +func TestLiveProviderSources(t *testing.T) { + t.Parallel() + + t.Run("Should map Codex OpenAI list output into provider live rows", func(t *testing.T) { + t.Parallel() + + var sawAuth atomic.Bool + server := liveJSONServer(t, func(r *http.Request) string { + if got, want := r.Header.Get("Authorization"), "Bearer sk-codex-test"; got == want { + sawAuth.Store(true) + } + return `{"data":[{"id":"gpt-5.4","name":"GPT-5.4","supportsReasoning":true}]}` + }) + provider := boundSecretProvider("OPENAI_API_KEY", "env:OPENAI_API_KEY") + provider.Models.Discovery.Endpoint = server.URL + source := newLiveSourceForTest(t, "codex", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin", "OPENAI_API_KEY=ambient-secret"}, + SecretResolver: mapSecretResolver{"env:OPENAI_API_KEY": "sk-codex-test"}, + }) + + rows, err := source.ListModels(testutil.Context(t), ListOptions{ProviderID: "codex", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + row := requireSingleRow(t, rows) + if !sawAuth.Load() { + t.Fatal("Authorization header was not populated from bound_secret credential") + } + if row.ProviderID != "codex" || row.ModelID != "gpt-5.4" || row.SourceID != 
"provider_live:codex" { + t.Fatalf("row identity = %#v, want codex/gpt-5.4 provider_live:codex", row) + } + if row.Available == nil || !*row.Available { + t.Fatalf("Available = %v, want true", row.Available) + } + if row.SupportsReasoning == nil || !*row.SupportsReasoning { + t.Fatalf("SupportsReasoning = %v, want true", row.SupportsReasoning) + } + if len(row.ReasoningEfforts) != 0 { + t.Fatalf("ReasoningEfforts = %#v, want no invented levels", row.ReasoningEfforts) + } + }) + + t.Run("Should map Claude Anthropic supported models and drop unknown reasoning levels", func(t *testing.T) { + t.Parallel() + + server := liveJSONServer(t, func(r *http.Request) string { + if got, want := r.Header.Get("x-api-key"), "sk-ant-test"; got != want { + t.Errorf("x-api-key = %q, want %q", got, want) + } + if got := r.Header.Get("anthropic-version"); got == "" { + t.Error("anthropic-version header is empty") + } + return `{"data":[{` + + `"id":"claude-sonnet-4-6",` + + `"displayName":"Claude Sonnet 4.6",` + + `"supportsEffort":true,` + + `"supportedEffortLevels":["low","max","xhigh"]` + + `}]}` + }) + provider := boundSecretProvider("ANTHROPIC_API_KEY", "env:ANTHROPIC_API_KEY") + provider.Models.Discovery.Endpoint = server.URL + source := newLiveSourceForTest(t, "claude", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + SecretResolver: mapSecretResolver{"env:ANTHROPIC_API_KEY": "sk-ant-test"}, + }) + + rows, err := source.ListModels(testutil.Context(t), ListOptions{ProviderID: "claude", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + row := requireSingleRow(t, rows) + if row.DisplayName != "Claude Sonnet 4.6" { + t.Fatalf("DisplayName = %q, want Claude Sonnet 4.6", row.DisplayName) + } + if !slices.Equal(row.ReasoningEfforts, []ReasoningEffort{ReasoningEffortLow, ReasoningEffortXHigh}) { + t.Fatalf("ReasoningEfforts = %#v, want low/xhigh only", row.ReasoningEfforts) + } + }) + + t.Run( + "Should record unavailable Claude 
runtime when native CLI auth cannot satisfy HTTP discovery", + func(t *testing.T) { + t.Parallel() + + store := newMemoryStore() + source := newLiveSourceForTest(t, "claude", aghconfig.ProviderConfig{}, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin", "ANTHROPIC_API_KEY=ambient-secret"}, + }) + service := newTestService(t, store, []Source{source}) + + statuses, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "claude", Force: true, Now: testTime(0)}, + ) + if !errors.Is(err, ErrAllSourcesFailed) { + t.Fatalf("Refresh() error = %v, want ErrAllSourcesFailed", err) + } + status := requireStatus(t, statuses, "provider_live:claude") + if status.RefreshState != string(RefreshStateFailed) { + t.Fatalf("RefreshState = %q, want failed", status.RefreshState) + } + if strings.Contains(status.LastError, "ambient-secret") { + t.Fatalf("LastError = %q, want no ambient secret", status.LastError) + } + }, + ) + + t.Run("Should preserve OpenRouter and Vercel provider model ids", func(t *testing.T) { + t.Parallel() + + openRouterServer := liveJSONServer(t, func(r *http.Request) string { + if got, want := r.Header.Get("Authorization"), "Bearer sk-router"; got != want { + t.Errorf("OpenRouter Authorization = %q, want %q", got, want) + } + return `{"data":[{"id":"anthropic/claude-sonnet-4-6","name":"Claude via OpenRouter"}]}` + }) + openRouter := boundSecretProvider("OPENROUTER_API_KEY", "env:OPENROUTER_API_KEY") + openRouter.Models.Discovery.Endpoint = openRouterServer.URL + routerRows, err := newLiveSourceForTest(t, "openrouter", openRouter, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + SecretResolver: mapSecretResolver{"env:OPENROUTER_API_KEY": "sk-router"}, + }).ListModels(testutil.Context(t), ListOptions{ProviderID: "openrouter", Now: testTime(0)}) + if err != nil { + t.Fatalf("OpenRouter ListModels() error = %v", err) + } + if got, want := requireSingleRow(t, routerRows).ModelID, "anthropic/claude-sonnet-4-6"; got != want { + 
t.Fatalf("OpenRouter ModelID = %q, want %q", got, want) + } + + vercelServer := liveJSONServer(t, func(_ *http.Request) string { + return `{"data":[{` + + `"id":"openai/gpt-5.4",` + + `"name":"GPT-5.4",` + + `"context_window":1000000,` + + `"max_tokens":32000,` + + `"pricing":{"input":"0.000001","output":0.000002}` + + `}]}` + }) + vercel := aghconfig.ProviderConfig{} + vercel.Models.Discovery.Endpoint = vercelServer.URL + vercelRows, err := newLiveSourceForTest(t, "vercel-ai-gateway", vercel, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + }).ListModels(testutil.Context(t), ListOptions{ProviderID: "vercel-ai-gateway", Now: testTime(0)}) + if err != nil { + t.Fatalf("Vercel ListModels() error = %v", err) + } + vercelRow := requireSingleRow(t, vercelRows) + if vercelRow.ModelID != "openai/gpt-5.4" { + t.Fatalf("Vercel ModelID = %q, want openai/gpt-5.4", vercelRow.ModelID) + } + if vercelRow.ContextWindow == nil || *vercelRow.ContextWindow != 1000000 { + t.Fatalf("ContextWindow = %v, want 1000000", vercelRow.ContextWindow) + } + if vercelRow.CostInputPerMillion == nil || *vercelRow.CostInputPerMillion != 1 { + t.Fatalf("CostInputPerMillion = %v, want 1", vercelRow.CostInputPerMillion) + } + }) + + t.Run("Should parse Gemini model envelope fields", func(t *testing.T) { + t.Parallel() + + server := liveJSONServer(t, func(r *http.Request) string { + if got, want := r.Header.Get("x-goog-api-key"), "gemini-key"; got != want { + t.Errorf("x-goog-api-key = %q, want %q", got, want) + } + return `{"models":[{` + + `"name":"models/gemini-3.1-pro",` + + `"displayName":"Gemini 3.1 Pro",` + + `"inputTokenLimit":1000000,` + + `"outputTokenLimit":65536,` + + `"supportedGenerationMethods":["generateContent"]` + + `}]}` + }) + provider := boundSecretProvider("GEMINI_API_KEY", "env:GEMINI_API_KEY") + provider.Models.Discovery.Endpoint = server.URL + rows, err := newLiveSourceForTest(t, "gemini", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + 
SecretResolver: mapSecretResolver{"env:GEMINI_API_KEY": "gemini-key"}, + }).ListModels(testutil.Context(t), ListOptions{ProviderID: "gemini", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + row := requireSingleRow(t, rows) + if row.ModelID != "gemini-3.1-pro" { + t.Fatalf("ModelID = %q, want gemini-3.1-pro", row.ModelID) + } + if row.MaxInputTokens == nil || *row.MaxInputTokens != 1000000 { + t.Fatalf("MaxInputTokens = %v, want 1000000", row.MaxInputTokens) + } + if row.SupportsTools == nil || !*row.SupportsTools { + t.Fatalf("SupportsTools = %v, want true from generateContent capability", row.SupportsTools) + } + }) + + t.Run("Should parse Ollama HTTP tags", func(t *testing.T) { + t.Parallel() + + server := liveJSONServer(t, func(_ *http.Request) string { + return `{"models":[{"name":"llama3:latest","model":"llama3:latest"}]}` + }) + provider := aghconfig.ProviderConfig{} + provider.Models.Discovery.Endpoint = server.URL + rows, err := newLiveSourceForTest(t, "ollama", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + }).ListModels(testutil.Context(t), ListOptions{ProviderID: "ollama", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if got, want := requireSingleRow(t, rows).ModelID, "llama3:latest"; got != want { + t.Fatalf("ModelID = %q, want %q", got, want) + } + }) + + t.Run("Should parse OpenCode model command output and apply effective env home policy", func(t *testing.T) { + t.Parallel() + + executor := &fakeDiscoveryExecutor{ + result: DiscoveryCommandResult{Stdout: "anthropic/claude-sonnet-4-6\nopenai/gpt-5.4\n"}, + } + homePaths, err := aghconfig.ResolveHomePathsFrom(t.TempDir()) + if err != nil { + t.Fatalf("ResolveHomePathsFrom() error = %v", err) + } + provider := aghconfig.ProviderConfig{ + EnvPolicy: aghconfig.ProviderEnvPolicyIsolated, + HomePolicy: aghconfig.ProviderHomePolicyIsolated, + } + source := newLiveSourceForTest(t, "opencode", provider, 
LiveProviderSourcesConfig{ + HomePaths: homePaths, + BaseEnv: []string{"PATH=/bin", "HOME=/Users/operator", "OPENAI_API_KEY=ambient-secret"}, + CommandExecutor: executor, + }) + + rows, err := source.ListModels(testutil.Context(t), ListOptions{ProviderID: "opencode", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if got, want := rowModelIDs( + rows, + ), []string{ + "anthropic/claude-sonnet-4-6", + "openai/gpt-5.4", + }; !slices.Equal( + got, + want, + ) { + t.Fatalf("row ids = %#v, want %#v", got, want) + } + req := executor.singleRequest(t) + if envValue(req.Env, "OPENAI_API_KEY") != "" { + t.Fatalf("OPENAI_API_KEY = %q, want filtered", envValue(req.Env, "OPENAI_API_KEY")) + } + if got, want := envValue(req.Env, "HOME"), homePaths.HomeDir+"/providers/opencode"; got != want { + t.Fatalf("HOME = %q, want %q", got, want) + } + if got := envValue(req.Env, "OPENCODE_CONFIG_DIR"); got == "" { + t.Fatal("OPENCODE_CONFIG_DIR is empty, want isolated OpenCode config dir") + } + }) + + t.Run("Should record unavailable OpenCode command status", func(t *testing.T) { + t.Parallel() + + executor := &fakeDiscoveryExecutor{ + result: DiscoveryCommandResult{Stderr: "api_key=opencode-secret missing"}, + err: errors.New("exec: opencode not found"), + } + source := newLiveSourceForTest(t, "opencode", aghconfig.ProviderConfig{}, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + CommandExecutor: executor, + }) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + + statuses, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "opencode", Force: true, Now: testTime(0)}, + ) + if !errors.Is(err, ErrAllSourcesFailed) { + t.Fatalf("Refresh() error = %v, want ErrAllSourcesFailed", err) + } + status := requireStatus(t, statuses, "provider_live:opencode") + if status.RefreshState != string(RefreshStateFailed) { + t.Fatalf("RefreshState = %q, want failed", status.RefreshState) + } + if 
strings.Contains(status.LastError, "opencode-secret") || !strings.Contains(status.LastError, "[REDACTED]") { + t.Fatalf("LastError = %q, want redacted command detail", status.LastError) + } + }) + + t.Run("Should fail closed for OpenClaw without configured discovery path", func(t *testing.T) { + t.Parallel() + + source := newLiveSourceForTest(t, "openclaw", aghconfig.ProviderConfig{}, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + }) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + + statuses, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "openclaw", Force: true, Now: testTime(0)}, + ) + if !errors.Is(err, ErrAllSourcesFailed) { + t.Fatalf("Refresh() error = %v, want ErrAllSourcesFailed", err) + } + status := requireStatus(t, statuses, "provider_live:openclaw") + if status.RefreshState != string(RefreshStateFailed) { + t.Fatalf("RefreshState = %q, want failed", status.RefreshState) + } + if !strings.Contains(status.LastError, "no configured side-effect-free") { + t.Fatalf("LastError = %q, want no configured discovery path", status.LastError) + } + }) + + t.Run("Should use configured Hermes command only when enabled", func(t *testing.T) { + t.Parallel() + + executor := &fakeDiscoveryExecutor{ + result: DiscoveryCommandResult{Stdout: `[{"id":"hermes-model"}]`}, + } + provider := aghconfig.ProviderConfig{} + provider.Models.Discovery.Command = "hermes models --json" + disabledSource := newLiveSourceForTest(t, "hermes", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + CommandExecutor: executor, + }) + store := newMemoryStore() + service := newTestService(t, store, []Source{disabledSource}) + statuses, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "hermes", Force: true, Now: testTime(0)}, + ) + if err != nil { + t.Fatalf("Refresh(disabled by default) error = %v", err) + } + if got := executor.callCount(); got != 0 { + 
t.Fatalf("executor calls = %d, want 0", got) + } + if status := requireStatus( + t, + statuses, + "provider_live:hermes", + ); status.RefreshState != string( + RefreshStateDisabled, + ) { + t.Fatalf("RefreshState = %q, want disabled", status.RefreshState) + } + + enabled := true + provider.Models.Discovery.Enabled = &enabled + enabledSource := newLiveSourceForTest(t, "hermes", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + CommandExecutor: executor, + }) + rows, err := enabledSource.ListModels(testutil.Context(t), ListOptions{ProviderID: "hermes", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels(enabled command) error = %v", err) + } + if got := executor.callCount(); got != 1 { + t.Fatalf("executor calls = %d, want 1", got) + } + if got, want := requireSingleRow(t, rows).ModelID, "hermes-model"; got != want { + t.Fatalf("ModelID = %q, want %q", got, want) + } + }) + + t.Run("Should use configured Pi endpoint only when enabled", func(t *testing.T) { + t.Parallel() + + server := liveJSONServer(t, func(_ *http.Request) string { + return `{"data":[{"id":"anthropic/claude-sonnet-4-6"}]}` + }) + enabled := true + provider := aghconfig.ProviderConfig{} + provider.Models.Discovery.Enabled = &enabled + provider.Models.Discovery.Endpoint = server.URL + rows, err := newLiveSourceForTest(t, "pi", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + }).ListModels(testutil.Context(t), ListOptions{ProviderID: "pi", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if got, want := requireSingleRow(t, rows).ModelID, "anthropic/claude-sonnet-4-6"; got != want { + t.Fatalf("ModelID = %q, want %q", got, want) + } + }) + + t.Run("Should record live discovery timeout without blocking indefinitely", func(t *testing.T) { + t.Parallel() + + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + time.Sleep(150 * time.Millisecond) + w.WriteHeader(http.StatusOK) 
+ })) + t.Cleanup(server.Close) + provider := aghconfig.ProviderConfig{} + provider.Models.Discovery.Endpoint = server.URL + provider.Models.Discovery.Timeout = "20ms" + source := newLiveSourceForTest(t, "vercel-ai-gateway", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + }) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + + started := time.Now() + statuses, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "vercel-ai-gateway", Force: true, Now: testTime(0)}, + ) + elapsed := time.Since(started) + if !errors.Is(err, ErrAllSourcesFailed) { + t.Fatalf("Refresh() error = %v, want ErrAllSourcesFailed", err) + } + if elapsed >= 120*time.Millisecond { + t.Fatalf("elapsed = %s, want timeout before server sleep completes", elapsed) + } + status := requireStatus(t, statuses, "provider_live:vercel-ai-gateway") + if status.RefreshState != string(RefreshStateFailed) { + t.Fatalf("RefreshState = %q, want failed", status.RefreshState) + } + }) +} + +func TestLiveProviderRefreshCoalescing(t *testing.T) { + t.Parallel() + + t.Run("Should coalesce concurrent refreshes for the same provider", func(t *testing.T) { + t.Parallel() + + source := newBlockingProviderSource("provider_live:codex", "codex") + service := newTestService(t, newMemoryStore(), []Source{source}) + ctx := testutil.Context(t) + var wg sync.WaitGroup + results := make([][]SourceStatus, 2) + errs := make([]error, 2) + for index := range results { + wg.Add(1) + go func(i int) { + defer wg.Done() + statuses, err := service.Refresh(ctx, RefreshOptions{ + ProviderID: "codex", + Force: true, + Now: testTime(0), + }) + results[i] = statuses + errs[i] = err + }(index) + } + <-source.started + time.Sleep(50 * time.Millisecond) + source.release() + wg.Wait() + + for index, err := range errs { + if err != nil { + t.Fatalf("Refresh[%d]() error = %v", index, err) + } + } + if got := source.callCount(); got != 1 { + t.Fatalf("source calls = %d, 
want 1", got) + } + if len(results[0]) != 1 || len(results[1]) != 1 { + t.Fatalf("statuses = %#v / %#v, want one status each", results[0], results[1]) + } + if results[0][0].LastRefresh != results[1][0].LastRefresh { + t.Fatalf("LastRefresh differs: %s vs %s", results[0][0].LastRefresh, results[1][0].LastRefresh) + } + }) + + t.Run("Should serialize different source scopes without sharing statuses", func(t *testing.T) { + t.Parallel() + + firstSource := newBlockingProviderSource("provider_live:codex_a", "codex") + secondSource := newBlockingProviderSource("provider_live:codex_b", "codex") + service := newTestService(t, newMemoryStore(), []Source{firstSource, secondSource}) + ctx := testutil.Context(t) + + firstDone := make(chan []SourceStatus, 1) + firstErr := make(chan error, 1) + go func() { + statuses, err := service.Refresh(ctx, RefreshOptions{ + ProviderID: "codex", + SourceID: firstSource.ID(), + Force: true, + Now: testTime(0), + }) + firstDone <- statuses + firstErr <- err + }() + <-firstSource.started + + secondDone := make(chan []SourceStatus, 1) + secondErr := make(chan error, 1) + go func() { + statuses, err := service.Refresh(ctx, RefreshOptions{ + ProviderID: "codex", + SourceID: secondSource.ID(), + Force: true, + Now: testTime(1), + }) + secondDone <- statuses + secondErr <- err + }() + + select { + case <-secondSource.started: + t.Fatal("second source started before first provider refresh finished") + case <-time.After(30 * time.Millisecond): + } + + firstSource.release() + select { + case <-secondSource.started: + case <-time.After(300 * time.Millisecond): + t.Fatal("second source did not start after first provider refresh finished") + } + secondSource.release() + + firstStatuses := <-firstDone + if err := <-firstErr; err != nil { + t.Fatalf("first Refresh() error = %v", err) + } + secondStatuses := <-secondDone + if err := <-secondErr; err != nil { + t.Fatalf("second Refresh() error = %v", err) + } + if got, want := requireStatus(t, 
firstStatuses, firstSource.ID()).SourceID, firstSource.ID(); got != want { + t.Fatalf("first status source = %q, want %q", got, want) + } + if got, want := requireStatus(t, secondStatuses, secondSource.ID()).SourceID, secondSource.ID(); got != want { + t.Fatalf("second status source = %q, want %q", got, want) + } + if got := firstSource.callCount(); got != 1 { + t.Fatalf("first source calls = %d, want 1", got) + } + if got := secondSource.callCount(); got != 1 { + t.Fatalf("second source calls = %d, want 1", got) + } + }) +} + +func TestLiveProviderSourceRegistration(t *testing.T) { + t.Parallel() + + t.Run("Should register core live provider sources", func(t *testing.T) { + t.Parallel() + + sources, err := NewLiveProviderSources(LiveProviderSourcesConfig{ + Providers: map[string]aghconfig.ProviderConfig{ + "ollama": {Command: "ollama serve"}, + "openai": { + Command: "openai", + AuthMode: aghconfig.ProviderAuthModeBoundSecret, + CredentialSlots: []aghconfig.ProviderCredentialSlot{ + {Name: "api_key", TargetEnv: "OPENAI_API_KEY", SecretRef: "env:OPENAI_API_KEY", Required: true}, + }, + }, + }, + BaseEnv: []string{"PATH=/bin"}, + }) + if err != nil { + t.Fatalf("NewLiveProviderSources() error = %v", err) + } + sourceIDs := make([]string, 0, len(sources)) + for _, source := range sources { + sourceIDs = append(sourceIDs, source.ID()) + } + for _, want := range []string{ + "provider_live:codex", + "provider_live:claude", + "provider_live:gemini", + "provider_live:openrouter", + "provider_live:vercel-ai-gateway", + "provider_live:opencode", + "provider_live:openclaw", + "provider_live:hermes", + "provider_live:pi", + "provider_live:ollama", + "provider_live:openai", + } { + if !slices.Contains(sourceIDs, want) { + t.Fatalf("source ids = %#v, want %q registered", sourceIDs, want) + } + } + }) + + t.Run("Should derive default endpoint from versioned base URL", func(t *testing.T) { + t.Parallel() + + provider := boundSecretProvider("OPENAI_API_KEY", "env:OPENAI_API_KEY") 
+ provider.BaseURL = "https://api.openai.test/v1" + source := newLiveSourceForTest(t, "openai", provider, LiveProviderSourcesConfig{ + BaseEnv: []string{"PATH=/bin"}, + SecretResolver: mapSecretResolver{"env:OPENAI_API_KEY": "sk-test"}, + }) + target, err := source.discoveryTarget() + if err != nil { + t.Fatalf("discoveryTarget() error = %v", err) + } + if got, want := target.endpoint, "https://api.openai.test/v1/models"; got != want { + t.Fatalf("endpoint = %q, want %q", got, want) + } + if got, want := source.ProviderIDs(), []string{"openai"}; !slices.Equal(got, want) { + t.Fatalf("ProviderIDs() = %#v, want %#v", got, want) + } + }) +} + +func TestLiveProviderParsingHelpers(t *testing.T) { + t.Parallel() + + t.Run("Should parse object map model payload", func(t *testing.T) { + t.Parallel() + + rows, err := parseLiveModelPayload( + "custom", + []byte( + `{"model-a":{"display_name":"Model A","supports_tools":true},`+ + `"model-b":{`+ + `"name":"Model B",`+ + `"reasoning_efforts":["minimal","unknown","high"],`+ + `"default_reasoning_effort":"high"`+ + `}}`, + ), + testTime(0), + ) + if err != nil { + t.Fatalf("parseLiveModelPayload() error = %v", err) + } + if got, want := rowModelIDs(rows), []string{"model-a", "model-b"}; !slices.Equal(got, want) { + t.Fatalf("row ids = %#v, want %#v", got, want) + } + if rows[0].SupportsTools == nil || !*rows[0].SupportsTools { + t.Fatalf("SupportsTools = %v, want true", rows[0].SupportsTools) + } + if !slices.Equal(rows[1].ReasoningEfforts, []ReasoningEffort{ReasoningEffortMinimal, ReasoningEffortHigh}) { + t.Fatalf("ReasoningEfforts = %#v, want minimal/high", rows[1].ReasoningEfforts) + } + if rows[1].DefaultReasoningEffort == nil || *rows[1].DefaultReasoningEffort != ReasoningEffortHigh { + t.Fatalf("DefaultReasoningEffort = %v, want high", rows[1].DefaultReasoningEffort) + } + }) + + t.Run("Should reject empty live payload", func(t *testing.T) { + t.Parallel() + + _, err := parseLiveModelPayload("custom", []byte(" "), 
testTime(0)) + if err == nil { + t.Fatal("parseLiveModelPayload(empty) error = nil, want error") + } + }) +} + +func TestLiveDiscoverySupportTypes(t *testing.T) { + t.Parallel() + + t.Run("Should resolve env secret refs", func(t *testing.T) { + t.Parallel() + + resolver := EnvSecretResolver{LookupEnv: func(key string) (string, bool) { + return map[string]string{"OPENAI_API_KEY": "sk-env"}[key], key == "OPENAI_API_KEY" + }} + value, err := resolver.ResolveRef(testutil.Context(t), "env:OPENAI_API_KEY") + if err != nil { + t.Fatalf("ResolveRef() error = %v", err) + } + if value != "sk-env" { + t.Fatalf("value = %q, want sk-env", value) + } + }) + + t.Run("Should reject unsupported secret refs", func(t *testing.T) { + t.Parallel() + + _, err := (EnvSecretResolver{}).ResolveRef(testutil.Context(t), "vault:providers/openai/api_key") + if !errors.Is(err, vault.ErrUnsupportedSecretRef) { + t.Fatalf("ResolveRef(vault) error = %v, want ErrUnsupportedSecretRef", err) + } + }) + + t.Run("Should run subprocess discovery command", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + result, err := ExecDiscoveryCommandExecutor{}.RunDiscoveryCommand(ctx, DiscoveryCommandRequest{ + ProviderID: "helper", + Command: os.Args[0], + Args: []string{"-test.run=TestLiveDiscoveryHelperProcess", "--", "ok"}, + Env: append(os.Environ(), "AGH_LIVE_DISCOVERY_HELPER=1"), + Timeout: time.Second, + }) + if err != nil { + t.Fatalf("RunDiscoveryCommand() error = %v", err) + } + if result.ExitCode != 0 { + t.Fatalf("ExitCode = %d, want 0", result.ExitCode) + } + if strings.TrimSpace(result.Stdout) != `[{"id":"helper-model"}]` { + t.Fatalf("Stdout = %q, want helper model JSON", result.Stdout) + } + }) +} + +func TestLiveDiscoveryHelperProcess(_ *testing.T) { + if os.Getenv("AGH_LIVE_DISCOVERY_HELPER") != "1" { + return + } + fmt.Fprint(os.Stdout, `[{"id":"helper-model"}]`) + os.Exit(0) +} + +func liveJSONServer(t *testing.T, body func(*http.Request) string) *httptest.Server { + 
t.Helper() + + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { + w.Header().Set("Content-Type", "application/json") + _, err := fmt.Fprint(w, body(r)) + if err != nil { + t.Errorf("Fprint(response) error = %v", err) + } + })) + t.Cleanup(server.Close) + return server +} + +func boundSecretProvider(targetEnv string, secretRef string) aghconfig.ProviderConfig { + return aghconfig.ProviderConfig{ + AuthMode: aghconfig.ProviderAuthModeBoundSecret, + CredentialSlots: []aghconfig.ProviderCredentialSlot{ + {Name: "api_key", TargetEnv: targetEnv, SecretRef: secretRef, Required: true}, + }, + } +} + +func newLiveSourceForTest( + t *testing.T, + providerID string, + provider aghconfig.ProviderConfig, + cfg LiveProviderSourcesConfig, +) *LiveProviderSource { + t.Helper() + + source, err := NewLiveProviderSource(providerID, provider, cfg) + if err != nil { + t.Fatalf("NewLiveProviderSource() error = %v", err) + } + return source +} + +type mapSecretResolver map[string]string + +func (r mapSecretResolver) ResolveRef(ctx context.Context, ref string) (string, error) { + if ctx == nil { + return "", errors.New("context required") + } + if err := ctx.Err(); err != nil { + return "", err + } + normalized := vault.NormalizeRef(ref) + value, ok := r[normalized] + if !ok || strings.TrimSpace(value) == "" { + return "", fmt.Errorf("%w: %s", vault.ErrMissingSecret, normalized) + } + return value, nil +} + +type fakeDiscoveryExecutor struct { + mu sync.Mutex + result DiscoveryCommandResult + err error + requests []DiscoveryCommandRequest +} + +func (e *fakeDiscoveryExecutor) RunDiscoveryCommand( + _ context.Context, + req DiscoveryCommandRequest, +) (DiscoveryCommandResult, error) { + e.mu.Lock() + e.requests = append(e.requests, req) + e.mu.Unlock() + return e.result, e.err +} + +func (e *fakeDiscoveryExecutor) callCount() int { + e.mu.Lock() + defer e.mu.Unlock() + return len(e.requests) +} + +func (e *fakeDiscoveryExecutor) singleRequest(t 
*testing.T) DiscoveryCommandRequest { + t.Helper() + + e.mu.Lock() + defer e.mu.Unlock() + if len(e.requests) != 1 { + t.Fatalf("len(requests) = %d, want 1", len(e.requests)) + } + return e.requests[0] +} + +type blockingProviderSource struct { + id string + provider string + started chan struct{} + released chan struct{} + calls atomic.Int64 + once sync.Once +} + +func newBlockingProviderSource(id string, provider string) *blockingProviderSource { + return &blockingProviderSource{ + id: id, + provider: provider, + started: make(chan struct{}), + released: make(chan struct{}), + } +} + +func (s *blockingProviderSource) ID() string { + return s.id +} + +func (s *blockingProviderSource) Kind() SourceKind { + return SourceKindProviderLive +} + +func (s *blockingProviderSource) Priority() int { + return PriorityProviderLive +} + +func (s *blockingProviderSource) ProviderIDs() []string { + return []string{s.provider} +} + +func (s *blockingProviderSource) ListModels(_ context.Context, _ ListOptions) ([]ModelRow, error) { + s.calls.Add(1) + s.once.Do(func() { + close(s.started) + }) + <-s.released + available := true + return []ModelRow{ + { + ProviderID: s.provider, + ModelID: "gpt-5.4", + SourceID: s.id, + SourceKind: SourceKindProviderLive, + Priority: PriorityProviderLive, + Available: &available, + RefreshedAt: testTime(0), + }, + }, nil +} + +func (s *blockingProviderSource) release() { + close(s.released) +} + +func (s *blockingProviderSource) callCount() int64 { + return s.calls.Load() +} + +func envValue(env []string, key string) string { + for _, entry := range env { + currentKey, value, ok := strings.Cut(entry, "=") + if ok && currentKey == key { + return value + } + } + return "" +} diff --git a/internal/modelcatalog/merge.go b/internal/modelcatalog/merge.go new file mode 100644 index 000000000..9f1ad5b9c --- /dev/null +++ b/internal/modelcatalog/merge.go @@ -0,0 +1,141 @@ +package modelcatalog + +import ( + "sort" + "strings" + "time" +) + +// MergeRows 
computes deterministic model projections from source rows. +func MergeRows(rows []ModelRow) []Model { + if len(rows) == 0 { + return nil + } + grouped := make(map[string][]ModelRow) + for _, row := range rows { + if strings.TrimSpace(row.ProviderID) == "" || strings.TrimSpace(row.ModelID) == "" { + continue + } + key := row.ProviderID + "\x00" + row.ModelID + grouped[key] = append(grouped[key], row) + } + models := make([]Model, 0, len(grouped)) + for _, group := range grouped { + sortModelRows(group) + models = append(models, mergeModelGroup(group)) + } + sort.SliceStable(models, func(i, j int) bool { + if models[i].ProviderID != models[j].ProviderID { + return models[i].ProviderID < models[j].ProviderID + } + return models[i].ModelID < models[j].ModelID + }) + return models +} + +func mergeModelGroup(rows []ModelRow) Model { + first := rows[0] + model := Model{ + ProviderID: first.ProviderID, + ModelID: first.ModelID, + AvailabilityState: string(AvailabilityStateUnknown), + RefreshedAt: first.RefreshedAt, + Sources: make([]SourceRef, 0, len(rows)), + } + for _, row := range rows { + model.Sources = append(model.Sources, SourceRef{ + SourceID: row.SourceID, + SourceKind: row.SourceKind, + Priority: row.Priority, + RefreshedAt: row.RefreshedAt, + Stale: row.Stale, + LastError: RedactString(row.LastError), + }) + if model.DisplayName == "" { + model.DisplayName = row.DisplayName + } + if model.ContextWindow == nil { + model.ContextWindow = row.ContextWindow + } + if model.MaxInputTokens == nil { + model.MaxInputTokens = row.MaxInputTokens + } + if model.MaxOutputTokens == nil { + model.MaxOutputTokens = row.MaxOutputTokens + } + if model.SupportsTools == nil { + model.SupportsTools = row.SupportsTools + } + if model.SupportsReasoning == nil { + model.SupportsReasoning = row.SupportsReasoning + } + if len(model.ReasoningEfforts) == 0 && len(row.ReasoningEfforts) > 0 { + model.ReasoningEfforts = append([]ReasoningEffort(nil), row.ReasoningEfforts...) 
+ } + if model.DefaultReasoningEffort == nil { + model.DefaultReasoningEffort = row.DefaultReasoningEffort + } + if model.CostInputPerMillion == nil { + model.CostInputPerMillion = row.CostInputPerMillion + } + if model.CostOutputPerMillion == nil { + model.CostOutputPerMillion = row.CostOutputPerMillion + } + if model.LastError == "" { + model.LastError = RedactString(row.LastError) + } + if row.Stale { + model.Stale = true + } + } + applyAvailability(&model, rows) + return model +} + +func applyAvailability(model *Model, rows []ModelRow) { + for _, row := range rows { + if row.Available == nil || !availabilityAuthority(row.SourceKind) { + continue + } + model.Available = row.Available + model.Stale = row.Stale + switch { + case *row.Available && row.Stale: + model.AvailabilityState = string(AvailabilityStateAvailableStale) + case *row.Available: + model.AvailabilityState = string(AvailabilityStateAvailableLive) + case row.Stale: + model.AvailabilityState = string(AvailabilityStateUnavailableStale) + default: + model.AvailabilityState = string(AvailabilityStateUnavailableLive) + } + return + } + model.AvailabilityState = string(AvailabilityStateUnknown) + model.Available = nil +} + +func availabilityAuthority(kind SourceKind) bool { + return kind == SourceKindProviderLive || kind == SourceKindExtension +} + +func sortModelRows(rows []ModelRow) { + sort.SliceStable(rows, func(i, j int) bool { + left := rows[i] + right := rows[j] + if left.Priority != right.Priority { + return left.Priority > right.Priority + } + if !left.RefreshedAt.Equal(right.RefreshedAt) { + return left.RefreshedAt.After(right.RefreshedAt) + } + return left.SourceID < right.SourceID + }) +} + +func defaultNow(now time.Time) time.Time { + if now.IsZero() { + return time.Now().UTC() + } + return now.UTC() +} diff --git a/internal/modelcatalog/modelsdev.go b/internal/modelcatalog/modelsdev.go new file mode 100644 index 000000000..cf5a7bb3e --- /dev/null +++ b/internal/modelcatalog/modelsdev.go @@ 
-0,0 +1,381 @@ +package modelcatalog + +import ( + "context" + "encoding/json" + "fmt" + "io" + "maps" + "net/http" + "sort" + "strings" + "sync" + "time" + + aghconfig "github.com/pedronauck/agh/internal/config" +) + +const maxModelsDevPayloadBytes = 16 << 20 + +var defaultModelsDevProviderMapping = map[string]string{ + "anthropic": "claude", + "claude": "claude", + "google": "gemini", + "gemini": "gemini", + "openai": "codex", + "codex": "codex", + "openrouter": "openrouter", + "moonshot": "moonshot", + "kimi": "moonshot", + "xai": "xai", + "mistral": "mistral", + "groq": "groq", + "minimax": "minimax", +} + +type modelsDevCache struct { + expiresAt time.Time + rows []ModelRow +} + +// ModelsDevSource fetches catalog enrichment from models.dev. +type ModelsDevSource struct { + endpoint string + enabled bool + ttl time.Duration + timeout time.Duration + client *http.Client + providerIDs map[string]struct{} + providerMap map[string]string + mu sync.Mutex + cache *modelsDevCache +} + +var _ Source = (*ModelsDevSource)(nil) + +// ModelsDevSourceOption customizes models.dev source construction. +type ModelsDevSourceOption func(*ModelsDevSource) + +// WithModelsDevHTTPClient injects the explicit-timeout HTTP client used for models.dev fetches. +func WithModelsDevHTTPClient(client *http.Client) ModelsDevSourceOption { + return func(source *ModelsDevSource) { + if source != nil && client != nil { + source.client = client + } + } +} + +// NewModelsDevSource creates a configured models.dev source. 
+func NewModelsDevSource( + providers map[string]aghconfig.ProviderConfig, + cfg aghconfig.ModelsDevSourceConfig, + options ...ModelsDevSourceOption, +) (*ModelsDevSource, error) { + ttl, err := time.ParseDuration(cfg.EffectiveTTL()) + if err != nil { + return nil, fmt.Errorf("model catalog: parse models.dev ttl: %w", err) + } + timeout, err := time.ParseDuration(cfg.EffectiveTimeout()) + if err != nil { + return nil, fmt.Errorf("model catalog: parse models.dev timeout: %w", err) + } + source := &ModelsDevSource{ + endpoint: strings.TrimSpace(cfg.EffectiveEndpoint()), + enabled: cfg.EffectiveEnabled(), + ttl: ttl, + timeout: timeout, + client: &http.Client{Timeout: timeout}, + providerIDs: knownProviderIDs(providers), + providerMap: cloneProviderMapping(defaultModelsDevProviderMapping), + } + for _, option := range options { + if option != nil { + option(source) + } + } + return source, nil +} + +func (s *ModelsDevSource) ID() string { + return SourceIDModelsDev +} + +func (s *ModelsDevSource) Kind() SourceKind { + return SourceKindModelsDev +} + +func (s *ModelsDevSource) Priority() int { + return PriorityModelsDev +} + +func (s *ModelsDevSource) TTL() time.Duration { + return s.ttl +} + +// Timeout returns the explicit HTTP timeout used by the source. 
+func (s *ModelsDevSource) Timeout() time.Duration { + return s.timeout +} + +func (s *ModelsDevSource) ProviderIDs() []string { + providers := make([]string, 0, len(s.providerIDs)) + for providerID := range s.providerIDs { + providers = append(providers, providerID) + } + sort.Strings(providers) + return providers +} + +func (s *ModelsDevSource) ListModels(ctx context.Context, opts ListOptions) ([]ModelRow, error) { + if ctx == nil { + return nil, fmt.Errorf("model catalog: models.dev context is required") + } + if !s.enabled { + return nil, ErrSourceDisabled + } + now := defaultNow(opts.Now) + if !opts.Refresh { + if rows, ok := s.cachedRows(now, opts.ProviderID, false, ""); ok { + return rows, nil + } + } + rows, err := s.fetchRows(ctx, now) + if err != nil { + if cached, ok := s.cachedRows(now, opts.ProviderID, true, sourceErrorText(err)); ok { + return cached, &StaleFallbackError{SourceID: s.ID(), Err: err} + } + return nil, err + } + s.mu.Lock() + s.cache = &modelsDevCache{ + expiresAt: now.Add(s.ttl), + rows: cloneModelRows(rows), + } + s.mu.Unlock() + return filterRowsByProvider(rows, opts.ProviderID), nil +} + +func (s *ModelsDevSource) cachedRows( + now time.Time, + providerID string, + stale bool, + lastError string, +) ([]ModelRow, bool) { + s.mu.Lock() + defer s.mu.Unlock() + if s.cache == nil { + return nil, false + } + if !stale && !s.cache.expiresAt.IsZero() && !s.cache.expiresAt.After(now) { + return nil, false + } + rows := cloneModelRows(s.cache.rows) + for index := range rows { + rows[index].Stale = stale + rows[index].LastError = RedactString(lastError) + } + return filterRowsByProvider(rows, providerID), true +} + +func (s *ModelsDevSource) fetchRows(ctx context.Context, now time.Time) (rows []ModelRow, err error) { + req, err := http.NewRequestWithContext(ctx, http.MethodGet, s.endpoint, http.NoBody) + if err != nil { + return nil, fmt.Errorf("model catalog: create models.dev request: %w", err) + } + resp, err := s.client.Do(req) + if err != 
nil { + return nil, fmt.Errorf("model catalog: fetch models.dev catalog: %w", err) + } + defer func() { + if closeErr := resp.Body.Close(); closeErr != nil && err == nil { + err = fmt.Errorf("model catalog: close models.dev response: %w", closeErr) + } + }() + if resp.StatusCode < http.StatusOK || resp.StatusCode >= http.StatusMultipleChoices { + return nil, fmt.Errorf("model catalog: models.dev returned HTTP %d", resp.StatusCode) + } + decoder := json.NewDecoder(io.LimitReader(resp.Body, maxModelsDevPayloadBytes)) + var payload modelsDevPayload + if err := decoder.Decode(&payload); err != nil { + return nil, fmt.Errorf("model catalog: decode models.dev catalog: %w", err) + } + return s.parsePayload(payload, now), nil +} + +func (s *ModelsDevSource) parsePayload(payload modelsDevPayload, now time.Time) []ModelRow { + rows := make([]ModelRow, 0) + for providerKey, provider := range payload { + rawProviderID := provider.ID + if strings.TrimSpace(rawProviderID) == "" { + rawProviderID = providerKey + } + providerID := s.resolveProviderID(rawProviderID) + if providerID == "" || len(provider.Models) == 0 { + continue + } + modelKeys := make([]string, 0, len(provider.Models)) + for modelKey := range provider.Models { + modelKeys = append(modelKeys, modelKey) + } + sort.Strings(modelKeys) + for _, modelKey := range modelKeys { + row, ok := modelsDevRow(providerID, modelKey, provider.Models[modelKey], now) + if ok { + rows = append(rows, row) + } + } + } + sort.SliceStable(rows, func(i, j int) bool { + if rows[i].ProviderID != rows[j].ProviderID { + return rows[i].ProviderID < rows[j].ProviderID + } + return rows[i].ModelID < rows[j].ModelID + }) + return rows +} + +func (s *ModelsDevSource) resolveProviderID(raw string) string { + normalized := strings.ToLower(strings.TrimSpace(raw)) + if normalized == "" { + return "" + } + if mapped := s.providerMap[normalized]; mapped != "" { + if _, ok := s.providerIDs[mapped]; ok { + return mapped + } + return "" + } + if _, ok := 
s.providerIDs[normalized]; ok { + return normalized + } + return "" +} + +type modelsDevPayload map[string]modelsDevProvider + +type modelsDevProvider struct { + ID string `json:"id"` + Models map[string]modelsDevRawModel `json:"models"` +} + +type modelsDevRawModel struct { + ID string `json:"id"` + Name string `json:"name"` + Reasoning *bool `json:"reasoning"` + SupportsReasoning *bool `json:"supportsReasoning"` + SupportsReasoningLegacy *bool `json:"supports_reasoning"` + ToolCall *bool `json:"tool_call"` + SupportsTools *bool `json:"supportsTools"` + SupportsToolsLegacy *bool `json:"supports_tools"` + Limit modelsDevLimit `json:"limit"` + ContextWindow *int64 `json:"contextWindow"` + MaxInputTokens *int64 `json:"maxInputTokens"` + MaxOutputTokens *int64 `json:"maxOutputTokens"` + Cost modelsDevCost `json:"cost"` + Pricing modelsDevCost `json:"pricing"` +} + +type modelsDevLimit struct { + Context *int64 `json:"context"` + Input *int64 `json:"input"` + Output *int64 `json:"output"` +} + +type modelsDevCost struct { + Input *float64 `json:"input"` + Output *float64 `json:"output"` +} + +func modelsDevRow(providerID string, modelKey string, raw modelsDevRawModel, now time.Time) (ModelRow, bool) { + modelID := strings.TrimSpace(raw.ID) + if modelID == "" { + modelID = strings.TrimSpace(modelKey) + } + if modelID == "" { + return ModelRow{}, false + } + row := ModelRow{ + ProviderID: providerID, + ModelID: modelID, + DisplayName: strings.TrimSpace(raw.Name), + SourceID: SourceIDModelsDev, + SourceKind: SourceKindModelsDev, + Priority: PriorityModelsDev, + RefreshedAt: now, + ContextWindow: firstInt64(raw.Limit.Context, raw.ContextWindow), + MaxInputTokens: firstInt64(raw.Limit.Input, raw.MaxInputTokens), + MaxOutputTokens: firstInt64(raw.Limit.Output, raw.MaxOutputTokens), + SupportsTools: firstBool(raw.ToolCall, raw.SupportsTools, raw.SupportsToolsLegacy), + SupportsReasoning: firstBool(raw.Reasoning, raw.SupportsReasoning, raw.SupportsReasoningLegacy), + 
CostInputPerMillion: firstFloat64(raw.Cost.Input, raw.Pricing.Input), + CostOutputPerMillion: firstFloat64(raw.Cost.Output, raw.Pricing.Output), + } + return row, true +} + +func firstBool(values ...*bool) *bool { + for _, value := range values { + if value != nil { + return value + } + } + return nil +} + +func firstInt64(values ...*int64) *int64 { + for _, value := range values { + if value != nil { + return value + } + } + return nil +} + +func firstFloat64(values ...*float64) *float64 { + for _, value := range values { + if value != nil { + return value + } + } + return nil +} + +func knownProviderIDs(providers map[string]aghconfig.ProviderConfig) map[string]struct{} { + known := make(map[string]struct{}) + for providerID := range aghconfig.BuiltinProviders() { + known[providerID] = struct{}{} + } + for providerID := range providers { + known[providerID] = struct{}{} + } + return known +} + +func cloneProviderMapping(src map[string]string) map[string]string { + cloned := make(map[string]string, len(src)) + maps.Copy(cloned, src) + return cloned +} + +func filterRowsByProvider(rows []ModelRow, providerID string) []ModelRow { + trimmed := strings.TrimSpace(providerID) + if trimmed == "" { + return rows + } + filtered := make([]ModelRow, 0, len(rows)) + for _, row := range rows { + if row.ProviderID == trimmed { + filtered = append(filtered, row) + } + } + return filtered +} + +func cloneModelRows(rows []ModelRow) []ModelRow { + cloned := make([]ModelRow, len(rows)) + for index, row := range rows { + cloned[index] = row + cloned[index].ReasoningEfforts = append([]ReasoningEffort(nil), row.ReasoningEfforts...) 
+ } + return cloned +} diff --git a/internal/modelcatalog/modelsdev_test.go b/internal/modelcatalog/modelsdev_test.go new file mode 100644 index 000000000..1603ae09e --- /dev/null +++ b/internal/modelcatalog/modelsdev_test.go @@ -0,0 +1,344 @@ +package modelcatalog + +import ( + "errors" + "fmt" + "net/http" + "net/http/httptest" + "strings" + "sync/atomic" + "testing" + "time" + + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/testutil" +) + +func TestModelsDevSource(t *testing.T) { + t.Parallel() + + t.Run("Should parse current models dev fields", func(t *testing.T) { + t.Parallel() + + server := modelsDevServer(t, http.StatusOK, `{ + "openai": { + "id": "openai", + "models": { + "gpt-5.4": { + "id": "gpt-5.4", + "name": "GPT-5.4", + "reasoning": true, + "tool_call": true, + "limit": {"context": 256000, "input": 200000, "output": 32000}, + "cost": {"input": 1.25, "output": 10.5} + } + } + } + }`) + source := newModelsDevTestSource(t, server.URL, "1h", "1s", true) + + rows, err := source.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(0)}, + ) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + row := requireSingleRow(t, rows) + assertModelsDevCurrentRow(t, row) + }) + + t.Run("Should parse legacy models dev aliases", func(t *testing.T) { + t.Parallel() + + server := modelsDevServer(t, http.StatusOK, `{ + "anthropic": { + "models": { + "claude-legacy": { + "name": "Claude Legacy", + "supportsReasoning": true, + "supports_tools": true, + "contextWindow": 200000, + "maxInputTokens": 150000, + "maxOutputTokens": 24000, + "pricing": {"input": 3, "output": 15} + } + } + } + }`) + source := newModelsDevTestSource(t, server.URL, "1h", "1s", true) + + rows, err := source.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "claude", Refresh: true, Now: testTime(0)}, + ) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + row := 
requireSingleRow(t, rows) + if row.ProviderID != "claude" || row.ModelID != "claude-legacy" { + t.Fatalf("row identity = %s/%s, want claude/claude-legacy", row.ProviderID, row.ModelID) + } + if row.SupportsReasoning == nil || !*row.SupportsReasoning { + t.Fatalf("SupportsReasoning = %v, want true", row.SupportsReasoning) + } + if row.SupportsTools == nil || !*row.SupportsTools { + t.Fatalf("SupportsTools = %v, want true", row.SupportsTools) + } + if row.ContextWindow == nil || *row.ContextWindow != 200000 { + t.Fatalf("ContextWindow = %v, want 200000", row.ContextWindow) + } + if row.MaxInputTokens == nil || *row.MaxInputTokens != 150000 { + t.Fatalf("MaxInputTokens = %v, want 150000", row.MaxInputTokens) + } + if row.MaxOutputTokens == nil || *row.MaxOutputTokens != 24000 { + t.Fatalf("MaxOutputTokens = %v, want 24000", row.MaxOutputTokens) + } + if row.CostInputPerMillion == nil || *row.CostInputPerMillion != 3 { + t.Fatalf("CostInputPerMillion = %v, want 3", row.CostInputPerMillion) + } + if row.CostOutputPerMillion == nil || *row.CostOutputPerMillion != 15 { + t.Fatalf("CostOutputPerMillion = %v, want 15", row.CostOutputPerMillion) + } + }) + + t.Run("Should record disabled status without outbound request", func(t *testing.T) { + t.Parallel() + + var requests atomic.Int64 + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + requests.Add(1) + w.WriteHeader(http.StatusOK) + })) + t.Cleanup(server.Close) + source := newModelsDevTestSource(t, server.URL, "1h", "1s", false) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + + statuses, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "codex", Force: true, Now: testTime(0)}, + ) + if err != nil { + t.Fatalf("Refresh(disabled) error = %v", err) + } + status := requireStatus(t, statuses, SourceIDModelsDev) + if status.RefreshState != string(RefreshStateDisabled) { + t.Fatalf("RefreshState = %q, want disabled", 
status.RefreshState) + } + if got := requests.Load(); got != 0 { + t.Fatalf("requests = %d, want 0", got) + } + }) + + t.Run("Should apply overridden endpoint ttl and timeout", func(t *testing.T) { + t.Parallel() + + var requests atomic.Int64 + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + requests.Add(1) + _, err := fmt.Fprint(w, `{"openai":{"models":{"gpt-5.4":{"name":"GPT-5.4"}}}}`) + if err != nil { + t.Errorf("Fprint(response) error = %v", err) + } + })) + t.Cleanup(server.Close) + source := newModelsDevTestSource(t, server.URL, "1h", "250ms", true) + if source.Timeout() != 250*time.Millisecond { + t.Fatalf("Timeout() = %s, want 250ms", source.Timeout()) + } + if source.TTL() != time.Hour { + t.Fatalf("TTL() = %s, want 1h", source.TTL()) + } + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + + if _, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "codex", Now: testTime(0)}, + ); err != nil { + t.Fatalf("Refresh(first) error = %v", err) + } + if _, err := service.Refresh( + testutil.Context(t), + RefreshOptions{ProviderID: "codex", Now: testTime(1)}, + ); err != nil { + t.Fatalf("Refresh(second) error = %v", err) + } + if got := requests.Load(); got != 1 { + t.Fatalf("requests = %d, want 1 due to TTL skip", got) + } + }) + + t.Run("Should apply explicit HTTP timeout", func(t *testing.T) { + t.Parallel() + + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + time.Sleep(200 * time.Millisecond) + w.WriteHeader(http.StatusOK) + })) + t.Cleanup(server.Close) + source := newModelsDevTestSource(t, server.URL, "1h", "20ms", true) + + started := time.Now() + _, err := source.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(0)}, + ) + elapsed := time.Since(started) + if err == nil { + t.Fatal("ListModels(timeout) error = nil, want timeout error") + } + if elapsed >= 
150*time.Millisecond { + t.Fatalf("elapsed = %s, want timeout before server sleep completes", elapsed) + } + }) + + t.Run("Should return stale cached rows when refresh fails after prior success", func(t *testing.T) { + t.Parallel() + + var requests atomic.Int64 + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + count := requests.Add(1) + if count == 1 { + _, err := fmt.Fprint(w, `{"openai":{"models":{"gpt-5.4":{"name":"GPT-5.4"}}}}`) + if err != nil { + t.Errorf("Fprint(first response) error = %v", err) + } + return + } + http.Error(w, "raw upstream secret sk-should-not-leak", http.StatusInternalServerError) + })) + t.Cleanup(server.Close) + source := newModelsDevTestSource(t, server.URL, "1h", "1s", true) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + + if _, err := service.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(0)}, + ); err != nil { + t.Fatalf("ListModels(first) error = %v", err) + } + models, err := service.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(1)}, + ) + if err != nil { + t.Fatalf("ListModels(second stale fallback) error = %v", err) + } + model := requireSingleModel(t, models) + if !model.Stale { + t.Fatal("Model.Stale = false, want true") + } + statuses, err := service.ListSourceStatus(testutil.Context(t), "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } + status := requireStatus(t, statuses, SourceIDModelsDev) + if status.RefreshState != string(RefreshStateFailed) || !status.Stale { + t.Fatalf("status = %#v, want failed stale status", status) + } + if !status.LastSuccess.Equal(testTime(0)) { + t.Fatalf("LastSuccess = %s, want first refresh time %s", status.LastSuccess, testTime(0)) + } + if strings.Contains(status.LastError, "sk-should-not-leak") { + t.Fatalf("LastError = %q, want redacted/no raw upstream body", 
status.LastError) + } + }) + + t.Run("Should reject all source failure without cache", func(t *testing.T) { + t.Parallel() + + server := modelsDevServer(t, http.StatusInternalServerError, `secret sk-raw-value`) + source := newModelsDevTestSource(t, server.URL, "1h", "1s", true) + + _, err := source.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(0)}, + ) + if err == nil { + t.Fatal("ListModels(no cache) error = nil, want upstream error") + } + var fallback *StaleFallbackError + if errors.As(err, &fallback) { + t.Fatalf("ListModels(no cache) error = %v, want no stale fallback", err) + } + }) +} + +func modelsDevServer(t *testing.T, status int, body string) *httptest.Server { + t.Helper() + + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + w.WriteHeader(status) + _, err := fmt.Fprint(w, body) + if err != nil { + t.Errorf("Fprint(response) error = %v", err) + } + })) + t.Cleanup(server.Close) + return server +} + +func newModelsDevTestSource( + t *testing.T, + endpoint string, + ttl string, + timeout string, + enabled bool, +) *ModelsDevSource { + t.Helper() + + source, err := NewModelsDevSource(nil, aghconfig.ModelsDevSourceConfig{ + Enabled: &enabled, + Endpoint: endpoint, + TTL: ttl, + Timeout: timeout, + }) + if err != nil { + t.Fatalf("NewModelsDevSource() error = %v", err) + } + return source +} + +func requireSingleRow(t *testing.T, rows []ModelRow) ModelRow { + t.Helper() + + if len(rows) != 1 { + t.Fatalf("len(rows) = %d, want 1: %#v", len(rows), rows) + } + return rows[0] +} + +func assertModelsDevCurrentRow(t *testing.T, row ModelRow) { + t.Helper() + + if row.ProviderID != "codex" || row.ModelID != "gpt-5.4" { + t.Fatalf("row identity = %s/%s, want codex/gpt-5.4", row.ProviderID, row.ModelID) + } + if row.DisplayName != "GPT-5.4" { + t.Fatalf("DisplayName = %q, want GPT-5.4", row.DisplayName) + } + if row.SupportsReasoning == nil || !*row.SupportsReasoning { + 
t.Fatalf("SupportsReasoning = %v, want true", row.SupportsReasoning) + } + if row.SupportsTools == nil || !*row.SupportsTools { + t.Fatalf("SupportsTools = %v, want true", row.SupportsTools) + } + if row.ContextWindow == nil || *row.ContextWindow != 256000 { + t.Fatalf("ContextWindow = %v, want 256000", row.ContextWindow) + } + if row.MaxInputTokens == nil || *row.MaxInputTokens != 200000 { + t.Fatalf("MaxInputTokens = %v, want 200000", row.MaxInputTokens) + } + if row.MaxOutputTokens == nil || *row.MaxOutputTokens != 32000 { + t.Fatalf("MaxOutputTokens = %v, want 32000", row.MaxOutputTokens) + } + if row.CostInputPerMillion == nil || *row.CostInputPerMillion != 1.25 { + t.Fatalf("CostInputPerMillion = %v, want 1.25", row.CostInputPerMillion) + } + if row.CostOutputPerMillion == nil || *row.CostOutputPerMillion != 10.5 { + t.Fatalf("CostOutputPerMillion = %v, want 10.5", row.CostOutputPerMillion) + } +} diff --git a/internal/modelcatalog/redact.go b/internal/modelcatalog/redact.go new file mode 100644 index 000000000..f9f03f9cc --- /dev/null +++ b/internal/modelcatalog/redact.go @@ -0,0 +1,33 @@ +package modelcatalog + +import ( + "regexp" + "strings" +) + +var secretPatterns = []*regexp.Regexp{ + regexp.MustCompile(`agh_claim_[A-Za-z0-9_-]+`), + regexp.MustCompile(`sk-[A-Za-z0-9_-]{8,}`), + regexp.MustCompile(`gh[pousr]_[A-Za-z0-9_]{8,}`), + regexp.MustCompile(`xox[baprs]-[A-Za-z0-9-]{8,}`), + regexp.MustCompile(`(?i)\bBearer\s+[A-Za-z0-9._~+/=-]{8,}`), + regexp.MustCompile( + `(?i)\b([A-Z0-9_-]*(?:api[_-]?key|auth[_-]?token|oauth[_-]?token|access[_-]?token|refresh[_-]?token|id[_-]?token|secret|password|credential|private[_-]?key)[A-Z0-9_-]*)=([^&\s]+)`, + ), +} + +// RedactString removes secret-shaped values from catalog errors. 
+func RedactString(value string) string { + redacted := value + for _, pattern := range secretPatterns { + redacted = pattern.ReplaceAllStringFunc(redacted, redactMatch) + } + return redacted +} + +func redactMatch(value string) string { + if key, _, ok := strings.Cut(value, "="); ok && !strings.ContainsAny(key, " \t\n\f\r") { + return key + "=[REDACTED]" + } + return "[REDACTED]" +} diff --git a/internal/modelcatalog/redact_test.go b/internal/modelcatalog/redact_test.go new file mode 100644 index 000000000..464e5b462 --- /dev/null +++ b/internal/modelcatalog/redact_test.go @@ -0,0 +1,51 @@ +package modelcatalog + +import ( + "strings" + "testing" +) + +func TestRedactString(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + input string + secret string + }{ + { + name: "Should redact OpenAI style API keys", + input: "models.dev failed with api_key=sk-super-secret-token-123", + secret: "sk-super-secret-token-123", + }, + { + name: "Should redact OAuth bearer tokens", + input: "provider returned Authorization: Bearer ya29.secret-oauth-token", + secret: "ya29.secret-oauth-token", + }, + { + name: "Should redact secret shaped environment values", + input: "discovery failed with OPENAI_API_KEY=env-secret-value CLIENT_SECRET=client-secret-value", + secret: "env-secret-value", + }, + { + name: "Should redact OAuth token environment values", + input: "extension failed with OAUTH_TOKEN=oauth-secret-value", + secret: "oauth-secret-value", + }, + } + + for _, tc := range tests { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + redacted := RedactString(tc.input) + if strings.Contains(redacted, tc.secret) { + t.Fatalf("RedactString() = %q, want secret removed", redacted) + } + if !strings.Contains(redacted, "[REDACTED]") { + t.Fatalf("RedactString() = %q, want redaction marker", redacted) + } + }) + } +} diff --git a/internal/modelcatalog/service.go b/internal/modelcatalog/service.go new file mode 100644 index 000000000..8a7791b89 --- /dev/null +++ b/internal/modelcatalog/service.go @@ -0,0
+1,517 @@ +package modelcatalog + +import ( + "context" + "errors" + "fmt" + "sort" + "strings" + "sync" + "time" +) + +type sourceProviderLister interface { + ProviderIDs() []string +} + +type sourceTTLProvider interface { + TTL() time.Duration +} + +// CatalogService refreshes sources and projects stored model catalog rows. +type CatalogService struct { + store Store + sources []Source + sourceByID map[string]Source + lockMu sync.Mutex + refreshFlights map[string]*refreshFlight +} + +type refreshFlight struct { + scopeKey string + done chan struct{} + statuses []SourceStatus + err error +} + +var _ Service = (*CatalogService)(nil) + +// NewService creates a model catalog service from a store and source list. +func NewService(store Store, sources []Source) (*CatalogService, error) { + if store == nil { + return nil, fmt.Errorf("model catalog store is required") + } + normalizedSources := make([]Source, 0, len(sources)) + sourceByID := make(map[string]Source, len(sources)) + for _, source := range sources { + if source == nil { + return nil, fmt.Errorf("model catalog source is required") + } + if err := ValidateSourceIdentity(source.ID(), source.Kind()); err != nil { + return nil, err + } + if source.Priority() <= 0 { + return nil, fmt.Errorf("model catalog source %q priority must be positive", source.ID()) + } + if _, exists := sourceByID[source.ID()]; exists { + return nil, fmt.Errorf("model catalog source %q is registered more than once", source.ID()) + } + normalizedSources = append(normalizedSources, source) + sourceByID[source.ID()] = source + } + sort.SliceStable(normalizedSources, func(i, j int) bool { + if normalizedSources[i].Priority() != normalizedSources[j].Priority() { + return normalizedSources[i].Priority() > normalizedSources[j].Priority() + } + return normalizedSources[i].ID() < normalizedSources[j].ID() + }) + return &CatalogService{ + store: store, + sources: normalizedSources, + sourceByID: sourceByID, + refreshFlights: 
make(map[string]*refreshFlight), + }, nil +} + +// ListModels returns the merged catalog projection. +func (s *CatalogService) ListModels(ctx context.Context, opts ListOptions) ([]Model, error) { + if ctx == nil { + return nil, fmt.Errorf("model catalog list context is required") + } + now := defaultNow(opts.Now) + listOpts := opts + listOpts.Now = now + listOpts.IncludeAll = true + rows, err := s.store.ListRows(ctx, listOpts) + if err != nil { + return nil, fmt.Errorf("model catalog: list stored rows: %w", err) + } + + var refreshErr error + if opts.Refresh || (len(rows) == 0 && len(s.sources) > 0) { + _, refreshErr = s.Refresh(ctx, RefreshOptions{ + ProviderID: opts.ProviderID, + SourceID: opts.SourceID, + Force: opts.Refresh, + Now: now, + }) + rows, err = s.store.ListRows(ctx, listOpts) + if err != nil { + return nil, fmt.Errorf("model catalog: list stored rows after refresh: %w", err) + } + } + if len(rows) == 0 && refreshErr != nil { + return nil, refreshErr + } + return MergeRows(rows), nil +} + +// Refresh updates registered sources and returns their latest statuses. +func (s *CatalogService) Refresh(ctx context.Context, opts RefreshOptions) ([]SourceStatus, error) { + if ctx == nil { + return nil, fmt.Errorf("model catalog refresh context is required") + } + now := defaultNow(opts.Now) + sources, err := s.selectSources(opts.SourceID) + if err != nil { + return nil, err + } + providerKey := strings.TrimSpace(opts.ProviderID) + if providerKey == "" { + providerKey = "__all__" + } + scopeKey := refreshFlightScopeKey(providerKey, opts) + + return s.withRefreshFlight(providerKey, scopeKey, func() ([]SourceStatus, error) { + return s.refreshSources(ctx, sources, opts, now) + }) +} + +// ListSourceStatus returns provider-scoped source health rows. 
+func (s *CatalogService) ListSourceStatus(ctx context.Context, providerID string) ([]SourceStatus, error) { + if ctx == nil { + return nil, fmt.Errorf("model catalog status context is required") + } + statuses, err := s.store.ListSourceStatus(ctx, strings.TrimSpace(providerID)) + if err != nil { + return nil, fmt.Errorf("model catalog: list source status: %w", err) + } + for index := range statuses { + statuses[index].LastError = RedactString(statuses[index].LastError) + } + return statuses, nil +} + +func (s *CatalogService) refreshSources( + ctx context.Context, + sources []Source, + opts RefreshOptions, + now time.Time, +) ([]SourceStatus, error) { + statuses := make([]SourceStatus, 0, len(sources)) + successes := 0 + failures := 0 + staleFallbacks := 0 + for _, source := range sources { + sourceStatuses, outcome, err := s.refreshSource(ctx, source, opts, now) + statuses = append(statuses, sourceStatuses...) + if err != nil && !errors.Is(err, ErrSourceDisabled) { + failures++ + } + switch outcome { + case refreshOutcomeSuccess: + successes++ + case refreshOutcomeStale: + staleFallbacks++ + } + } + if successes == 0 && staleFallbacks == 0 && failures > 0 { + return statuses, fmt.Errorf("%w (%d failed)", ErrAllSourcesFailed, failures) + } + return statuses, nil +} + +type refreshOutcome int + +const ( + refreshOutcomeEmpty refreshOutcome = iota + refreshOutcomeSuccess + refreshOutcomeStale +) + +func (s *CatalogService) refreshSource( + ctx context.Context, + source Source, + opts RefreshOptions, + now time.Time, +) ([]SourceStatus, refreshOutcome, error) { + if !opts.Force && + strings.TrimSpace(opts.ProviderID) != "" && + sourceHasFreshStatus(ctx, s.store, source, opts.ProviderID, now) { + statuses, err := s.store.ListSourceStatus(ctx, opts.ProviderID) + if err != nil { + return nil, refreshOutcomeEmpty, fmt.Errorf("model catalog: load fresh source status: %w", err) + } + return filterStatusesBySource(statuses, source.ID()), refreshOutcomeSuccess, nil + } + + 
rows, err := source.ListModels(ctx, ListOptions{ + ProviderID: opts.ProviderID, + SourceID: source.ID(), + Refresh: true, + IncludeAll: true, + IncludeStale: true, + Now: now, + }) + if err != nil { + return s.recordSourceFailure(ctx, source, opts.ProviderID, rows, now, err) + } + statuses, err := s.persistSourceRows(ctx, source, opts.ProviderID, rows, now, false, "") + if err != nil { + return nil, refreshOutcomeEmpty, err + } + if len(rows) > 0 { + return statuses, refreshOutcomeSuccess, nil + } + return statuses, refreshOutcomeEmpty, nil +} + +func (s *CatalogService) recordSourceFailure( + ctx context.Context, + source Source, + providerID string, + rows []ModelRow, + now time.Time, + sourceErr error, +) ([]SourceStatus, refreshOutcome, error) { + if errors.Is(sourceErr, ErrSourceDisabled) { + statuses, err := s.persistDisabledSource(ctx, source, providerID, now) + return statuses, refreshOutcomeEmpty, err + } + redacted := sourceErrorText(sourceErr) + if len(rows) > 0 { + staleRows := markRowsStale(rows, redacted) + statuses, err := s.persistSourceRows(ctx, source, providerID, staleRows, now, true, redacted) + return statuses, refreshOutcomeStale, err + } + + providers := s.providersForSource(source, providerID) + statuses := make([]SourceStatus, 0, len(providers)) + staleCount := 0 + for _, provider := range providers { + previous, err := s.store.ListRows(ctx, ListOptions{ + ProviderID: provider, + SourceID: source.ID(), + IncludeAll: true, + IncludeStale: true, + Now: now, + }) + if err != nil { + return nil, refreshOutcomeEmpty, fmt.Errorf("model catalog: load stale rows for %q: %w", source.ID(), err) + } + staleRows := markRowsStale(previous, redacted) + status := sourceStatus(source, provider, now, len(staleRows), true, redacted, RefreshStateFailed) + s.preserveLastSuccess(ctx, provider, &status) + if err := s.store.ReplaceSourceRows(ctx, source.ID(), provider, staleRows, status); err != nil { + return nil, refreshOutcomeEmpty, fmt.Errorf("model catalog: 
persist failed source status: %w", err) + } + if len(staleRows) > 0 { + staleCount += len(staleRows) + } + statuses = append(statuses, status) + } + if staleCount > 0 { + return statuses, refreshOutcomeStale, sourceErr + } + return statuses, refreshOutcomeEmpty, sourceErr +} + +func (s *CatalogService) persistSourceRows( + ctx context.Context, + source Source, + providerID string, + rows []ModelRow, + now time.Time, + stale bool, + lastError string, +) ([]SourceStatus, error) { + grouped := groupRowsByProvider(source, rows) + providers := providerKeys(grouped) + if strings.TrimSpace(providerID) != "" && len(providers) == 0 { + providers = []string{strings.TrimSpace(providerID)} + } + if len(providers) == 0 { + providers = s.providersForSource(source, providerID) + } + statuses := make([]SourceStatus, 0, len(providers)) + state := RefreshStateSucceeded + if stale { + state = RefreshStateFailed + } + for _, provider := range providers { + providerRows := grouped[provider] + for index := range providerRows { + providerRows[index] = normalizeSourceRow(source, providerRows[index], now, stale, lastError) + } + status := sourceStatus(source, provider, now, len(providerRows), stale, lastError, state) + if stale { + s.preserveLastSuccess(ctx, provider, &status) + } + if err := s.store.ReplaceSourceRows(ctx, source.ID(), provider, providerRows, status); err != nil { + return nil, fmt.Errorf("model catalog: persist source rows for %q/%q: %w", source.ID(), provider, err) + } + statuses = append(statuses, status) + } + return statuses, nil +} + +func (s *CatalogService) persistDisabledSource( + ctx context.Context, + source Source, + providerID string, + now time.Time, +) ([]SourceStatus, error) { + providers := s.providersForSource(source, providerID) + statuses := make([]SourceStatus, 0, len(providers)) + for _, provider := range providers { + status := sourceStatus(source, provider, now, 0, false, "", RefreshStateDisabled) + if err := s.store.ReplaceSourceRows(ctx, 
source.ID(), provider, nil, status); err != nil { + return nil, fmt.Errorf("model catalog: persist disabled source status: %w", err) + } + statuses = append(statuses, status) + } + return statuses, nil +} + +func (s *CatalogService) providersForSource(source Source, providerID string) []string { + if trimmed := strings.TrimSpace(providerID); trimmed != "" { + return []string{trimmed} + } + if lister, ok := source.(sourceProviderLister); ok { + providers := lister.ProviderIDs() + sort.Strings(providers) + return providers + } + return nil +} + +func (s *CatalogService) selectSources(sourceID string) ([]Source, error) { + trimmed := strings.TrimSpace(sourceID) + if trimmed == "" { + return s.sources, nil + } + source, ok := s.sourceByID[trimmed] + if !ok { + return nil, fmt.Errorf("%w: %q", ErrSourceNotRegistered, trimmed) + } + return []Source{source}, nil +} + +func (s *CatalogService) withRefreshFlight( + providerID string, + scopeKey string, + fn func() ([]SourceStatus, error), +) ([]SourceStatus, error) { + for { + s.lockMu.Lock() + flight := s.refreshFlights[providerID] + if flight == nil { + flight = &refreshFlight{ + scopeKey: scopeKey, + done: make(chan struct{}), + } + s.refreshFlights[providerID] = flight + s.lockMu.Unlock() + + flight.statuses, flight.err = fn() + s.lockMu.Lock() + delete(s.refreshFlights, providerID) + s.lockMu.Unlock() + close(flight.done) + return cloneSourceStatuses(flight.statuses), flight.err + } + s.lockMu.Unlock() + <-flight.done + if flight.scopeKey == scopeKey { + return cloneSourceStatuses(flight.statuses), flight.err + } + } +} + +func refreshFlightScopeKey(providerKey string, opts RefreshOptions) string { + return fmt.Sprintf("%s\x00%s\x00%t", providerKey, strings.TrimSpace(opts.SourceID), opts.Force) +} + +func sourceHasFreshStatus(ctx context.Context, store Store, source Source, providerID string, now time.Time) bool { + if ttlProvider, ok := source.(sourceTTLProvider); !ok || ttlProvider.TTL() <= 0 { + return false + } + 
statuses, err := store.ListSourceStatus(ctx, providerID) + if err != nil { + return false + } + for _, status := range statuses { + if status.SourceID != source.ID() { + continue + } + return status.RefreshState == string(RefreshStateSucceeded) && + !status.NextRefresh.IsZero() && + status.NextRefresh.After(now) + } + return false +} + +func filterStatusesBySource(statuses []SourceStatus, sourceID string) []SourceStatus { + filtered := make([]SourceStatus, 0, len(statuses)) + for _, status := range statuses { + if status.SourceID == sourceID { + filtered = append(filtered, status) + } + } + return filtered +} + +func normalizeSourceRow(source Source, row ModelRow, now time.Time, stale bool, lastError string) ModelRow { + normalized := row + if strings.TrimSpace(normalized.SourceID) == "" { + normalized.SourceID = source.ID() + } + if normalized.SourceKind == "" { + normalized.SourceKind = source.Kind() + } + if normalized.Priority == 0 { + normalized.Priority = source.Priority() + } + if normalized.RefreshedAt.IsZero() { + normalized.RefreshedAt = now + } + normalized.Stale = stale || normalized.Stale + if lastError != "" { + normalized.LastError = RedactString(lastError) + } else { + normalized.LastError = RedactString(normalized.LastError) + } + return normalized +} + +func sourceStatus( + source Source, + providerID string, + now time.Time, + rowCount int, + stale bool, + lastError string, + state RefreshState, +) SourceStatus { + status := SourceStatus{ + SourceID: source.ID(), + SourceKind: source.Kind(), + ProviderID: providerID, + Priority: source.Priority(), + LastRefresh: now, + RefreshState: string(state), + RowCount: rowCount, + Stale: stale, + LastError: RedactString(lastError), + } + if state == RefreshStateSucceeded { + status.LastSuccess = now + } + if ttlProvider, ok := source.(sourceTTLProvider); ok && ttlProvider.TTL() > 0 { + status.NextRefresh = now.Add(ttlProvider.TTL()) + } + return status +} + +func (s *CatalogService) preserveLastSuccess(ctx 
context.Context, providerID string, status *SourceStatus) { + statuses, err := s.store.ListSourceStatus(ctx, providerID) + if err != nil { + return + } + for _, previous := range statuses { + if previous.SourceID == status.SourceID { + status.LastSuccess = previous.LastSuccess + return + } + } +} + +func groupRowsByProvider(source Source, rows []ModelRow) map[string][]ModelRow { + grouped := make(map[string][]ModelRow) + for _, row := range rows { + providerID := strings.TrimSpace(row.ProviderID) + if providerID == "" { + continue + } + normalized := row + normalized.SourceID = source.ID() + normalized.SourceKind = source.Kind() + normalized.Priority = source.Priority() + grouped[providerID] = append(grouped[providerID], normalized) + } + return grouped +} + +func providerKeys(grouped map[string][]ModelRow) []string { + providers := make([]string, 0, len(grouped)) + for providerID := range grouped { + providers = append(providers, providerID) + } + sort.Strings(providers) + return providers +} + +func markRowsStale(rows []ModelRow, lastError string) []ModelRow { + staleRows := make([]ModelRow, 0, len(rows)) + for _, row := range rows { + stale := row + stale.Stale = true + stale.LastError = RedactString(lastError) + staleRows = append(staleRows, stale) + } + return staleRows +} + +func cloneSourceStatuses(statuses []SourceStatus) []SourceStatus { + return append([]SourceStatus(nil), statuses...) 
+} diff --git a/internal/modelcatalog/service_integration_test.go b/internal/modelcatalog/service_integration_test.go new file mode 100644 index 000000000..db95cce43 --- /dev/null +++ b/internal/modelcatalog/service_integration_test.go @@ -0,0 +1,404 @@ +package modelcatalog_test + +import ( + "context" + "database/sql" + "fmt" + "net/http" + "net/http/httptest" + "path/filepath" + "slices" + "sync" + "testing" + "time" + + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/modelcatalog" + "github.com/pedronauck/agh/internal/store" + "github.com/pedronauck/agh/internal/store/globaldb" + "github.com/pedronauck/agh/internal/testutil" + _ "modernc.org/sqlite" +) + +func TestCatalogServiceGlobalDBIntegration(t *testing.T) { + t.Parallel() + + t.Run("Should refresh and list rows by provider with global DB store", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + store, _ := openCatalogGlobalDB(t) + source := modelcatalog.NewConfigSource(map[string]aghconfig.ProviderConfig{ + "codex": { + Models: aghconfig.ProviderModelsConfig{ + Default: "manual-model", + Curated: []aghconfig.ProviderModelConfig{ + {ID: "gpt-5.4", DisplayName: "GPT-5.4"}, + }, + }, + }, + "claude": { + Models: aghconfig.ProviderModelsConfig{ + Default: "claude-sonnet-4-6", + }, + }, + }) + service, err := modelcatalog.NewService(store, []modelcatalog.Source{source}) + if err != nil { + t.Fatalf("NewService() error = %v", err) + } + + if _, err := service.Refresh( + ctx, + modelcatalog.RefreshOptions{Force: true, Now: integrationTime(0)}, + ); err != nil { + t.Fatalf("Refresh() error = %v", err) + } + models, err := service.ListModels(ctx, modelcatalog.ListOptions{ProviderID: "codex", Now: integrationTime(1)}) + if err != nil { + t.Fatalf("ListModels(codex) error = %v", err) + } + if got, want := len(models), 2; got != want { + t.Fatalf("len(models) = %d, want %d: %#v", got, want, models) + } + for _, model := range models { + if 
model.ProviderID != "codex" { + t.Fatalf("model.ProviderID = %q, want codex", model.ProviderID) + } + } + }) + + t.Run("Should not persist raw models dev upstream payload", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + store, path := openCatalogGlobalDB(t) + server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) { + _, err := fmt.Fprint(w, `{ + "openai": { + "models": { + "gpt-5.4": { + "name": "GPT-5.4", + "unused_raw_marker": "raw_secret_marker_should_not_persist" + } + } + } + }`) + if err != nil { + t.Errorf("Fprint(response) error = %v", err) + } + })) + t.Cleanup(server.Close) + enabled := true + source, err := modelcatalog.NewModelsDevSource(nil, aghconfig.ModelsDevSourceConfig{ + Enabled: &enabled, + Endpoint: server.URL, + TTL: "1h", + Timeout: "1s", + }) + if err != nil { + t.Fatalf("NewModelsDevSource() error = %v", err) + } + service, err := modelcatalog.NewService(store, []modelcatalog.Source{source}) + if err != nil { + t.Fatalf("NewService() error = %v", err) + } + if _, err := service.Refresh( + ctx, + modelcatalog.RefreshOptions{ProviderID: "codex", Force: true, Now: integrationTime(0)}, + ); err != nil { + t.Fatalf("Refresh() error = %v", err) + } + + db, err := sql.Open("sqlite", path) + if err != nil { + t.Fatalf("sql.Open() error = %v", err) + } + t.Cleanup(func() { + if closeErr := db.Close(); closeErr != nil { + t.Errorf("db.Close() error = %v", closeErr) + } + }) + var matches int + if err := db.QueryRowContext( + ctx, + `SELECT + (SELECT COUNT(*) FROM model_catalog_rows + WHERE source_id LIKE ? OR provider_id LIKE ? OR model_id LIKE ? OR display_name LIKE ? OR last_error LIKE ?) + + + (SELECT COUNT(*) FROM model_catalog_sources + WHERE source_id LIKE ? OR provider_id LIKE ? 
OR last_error LIKE ?)`, + "%raw_secret_marker%", + "%raw_secret_marker%", + "%raw_secret_marker%", + "%raw_secret_marker%", + "%raw_secret_marker%", + "%raw_secret_marker%", + "%raw_secret_marker%", + "%raw_secret_marker%", + ).Scan(&matches); err != nil { + t.Fatalf("QueryRowContext(raw marker) error = %v", err) + } + if matches != 0 { + t.Fatalf("raw marker persisted in %d catalog fields, want 0", matches) + } + }) + + t.Run("Should coalesce same provider refreshes without SQLite busy failures", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + store, _ := openCatalogGlobalDB(t) + source := newIntegrationBlockingSource(map[string][]modelcatalog.ModelRow{ + "codex": { + integrationRow("codex", "gpt-5.4", integrationTime(20)), + }, + }) + service, err := modelcatalog.NewService(store, []modelcatalog.Source{source}) + if err != nil { + t.Fatalf("NewService() error = %v", err) + } + + results := make(chan error, 2) + for range 2 { + go func() { + _, refreshErr := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: integrationTime(20), + }) + results <- refreshErr + }() + } + source.waitForCalls(t, 1) + source.requireCallCountStable(t, 1, 25*time.Millisecond) + source.release() + + for range 2 { + if err := <-results; err != nil { + t.Fatalf("Refresh() error = %v", err) + } + } + models, err := service.ListModels(ctx, modelcatalog.ListOptions{ProviderID: "codex", Now: integrationTime(21)}) + if err != nil { + t.Fatalf("ListModels(codex) error = %v", err) + } + if got, want := integrationModelKeys(models), []string{"codex/gpt-5.4"}; !slices.Equal(got, want) { + t.Fatalf("model keys = %#v, want %#v", got, want) + } + }) + + t.Run("Should persist concurrent cross provider refreshes without SQLite busy failures", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + store, _ := openCatalogGlobalDB(t) + source := 
newIntegrationBlockingSource(map[string][]modelcatalog.ModelRow{ + "claude": { + integrationRow("claude", "claude-sonnet-4-6", integrationTime(30)), + }, + "codex": { + integrationRow("codex", "gpt-5.4", integrationTime(30)), + }, + }) + service, err := modelcatalog.NewService(store, []modelcatalog.Source{source}) + if err != nil { + t.Fatalf("NewService() error = %v", err) + } + + results := make(chan error, 2) + for _, providerID := range []string{"codex", "claude"} { + go func(providerID string) { + _, refreshErr := service.Refresh(ctx, modelcatalog.RefreshOptions{ + ProviderID: providerID, + SourceID: source.ID(), + Force: true, + Now: integrationTime(30), + }) + results <- refreshErr + }(providerID) + } + source.waitForCalls(t, 2) + source.release() + + for range 2 { + if err := <-results; err != nil { + t.Fatalf("Refresh() error = %v", err) + } + } + models, err := service.ListModels(ctx, modelcatalog.ListOptions{Now: integrationTime(31)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if got, want := integrationModelKeys( + models, + ), []string{ + "claude/claude-sonnet-4-6", + "codex/gpt-5.4", + }; !slices.Equal( + got, + want, + ) { + t.Fatalf("model keys = %#v, want %#v", got, want) + } + }) +} + +func openCatalogGlobalDB(t *testing.T) (*globaldb.GlobalDB, string) { + t.Helper() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), store.GlobalDatabaseName) + store, err := globaldb.OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB() error = %v", err) + } + t.Cleanup(func() { + if closeErr := store.Close(testutil.Context(t)); closeErr != nil { + t.Errorf("GlobalDB.Close() error = %v", closeErr) + } + }) + return store, path +} + +func integrationTime(offset int) time.Time { + return time.Date(2026, 5, 7, 13, offset, 0, 0, time.UTC) +} + +type integrationBlockingSource struct { + mu sync.Mutex + rowsByProvider map[string][]modelcatalog.ModelRow + calls int + callsCh chan int + releaseCh chan struct{} + 
releaseOnce sync.Once +} + +func newIntegrationBlockingSource(rowsByProvider map[string][]modelcatalog.ModelRow) *integrationBlockingSource { + return &integrationBlockingSource{ + rowsByProvider: rowsByProvider, + callsCh: make(chan int, 16), + releaseCh: make(chan struct{}), + } +} + +func (s *integrationBlockingSource) ID() string { + return "provider_live:integration" +} + +func (s *integrationBlockingSource) Kind() modelcatalog.SourceKind { + return modelcatalog.SourceKindProviderLive +} + +func (s *integrationBlockingSource) Priority() int { + return modelcatalog.PriorityProviderLive +} + +func (s *integrationBlockingSource) ProviderIDs() []string { + s.mu.Lock() + defer s.mu.Unlock() + providers := make([]string, 0, len(s.rowsByProvider)) + for providerID := range s.rowsByProvider { + providers = append(providers, providerID) + } + slices.Sort(providers) + return providers +} + +func (s *integrationBlockingSource) ListModels( + ctx context.Context, + opts modelcatalog.ListOptions, +) ([]modelcatalog.ModelRow, error) { + s.mu.Lock() + s.calls++ + calls := s.calls + s.mu.Unlock() + select { + case s.callsCh <- calls: + default: + } + + select { + case <-s.releaseCh: + case <-ctx.Done(): + return nil, ctx.Err() + } + + s.mu.Lock() + rows := cloneIntegrationRows(s.rowsByProvider[opts.ProviderID]) + s.mu.Unlock() + return rows, nil +} + +func (s *integrationBlockingSource) waitForCalls(t *testing.T, want int) { + t.Helper() + + deadline := time.After(time.Second) + for { + if s.callCount() >= want { + return + } + select { + case <-s.callsCh: + case <-deadline: + t.Fatalf("source calls = %d, want at least %d", s.callCount(), want) + } + } +} + +func (s *integrationBlockingSource) requireCallCountStable( + t *testing.T, + want int, + duration time.Duration, +) { + t.Helper() + + timer := time.NewTimer(duration) + defer timer.Stop() + for { + select { + case <-s.callsCh: + if got := s.callCount(); got > want { + t.Fatalf("source calls = %d while first refresh was 
blocked, want at most %d", got, want) + } + case <-timer.C: + return + } + } +} + +func (s *integrationBlockingSource) release() { + s.releaseOnce.Do(func() { + close(s.releaseCh) + }) +} + +func (s *integrationBlockingSource) callCount() int { + s.mu.Lock() + defer s.mu.Unlock() + return s.calls +} + +func integrationRow(providerID string, modelID string, refreshedAt time.Time) modelcatalog.ModelRow { + return modelcatalog.ModelRow{ + SourceID: "provider_live:integration", + SourceKind: modelcatalog.SourceKindProviderLive, + Priority: modelcatalog.PriorityProviderLive, + ProviderID: providerID, + ModelID: modelID, + RefreshedAt: refreshedAt, + } +} + +func cloneIntegrationRows(rows []modelcatalog.ModelRow) []modelcatalog.ModelRow { + return append([]modelcatalog.ModelRow(nil), rows...) +} + +func integrationModelKeys(models []modelcatalog.Model) []string { + keys := make([]string, 0, len(models)) + for _, model := range models { + keys = append(keys, model.ProviderID+"/"+model.ModelID) + } + return keys +} diff --git a/internal/modelcatalog/service_test.go b/internal/modelcatalog/service_test.go new file mode 100644 index 000000000..10bfc997a --- /dev/null +++ b/internal/modelcatalog/service_test.go @@ -0,0 +1,827 @@ +package modelcatalog + +import ( + "context" + "errors" + "fmt" + "slices" + "strings" + "sync" + "testing" + "time" + + "github.com/pedronauck/agh/internal/testutil" +) + +func TestMergeRows(t *testing.T) { + t.Parallel() + + t.Run("Should let higher priority source win conflicting fields", func(t *testing.T) { + t.Parallel() + + contextWindowConfig := int64(100) + contextWindowCatalog := int64(200) + models := MergeRows([]ModelRow{ + testRow( + "models_dev", + SourceKindModelsDev, + PriorityModelsDev, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.DisplayName = "Catalog GPT" + row.ContextWindow = &contextWindowCatalog + }, + ), + testRow("config", SourceKindConfig, PriorityConfig, "codex", "gpt-5.4", testTime(0), func(row 
*ModelRow) { + row.DisplayName = "Config GPT" + row.ContextWindow = &contextWindowConfig + }), + }) + + model := requireSingleModel(t, models) + if model.DisplayName != "Config GPT" { + t.Fatalf("DisplayName = %q, want Config GPT", model.DisplayName) + } + if model.ContextWindow == nil || *model.ContextWindow != contextWindowConfig { + t.Fatalf("ContextWindow = %v, want %d", model.ContextWindow, contextWindowConfig) + } + }) + + t.Run("Should let provider live priority win over extension priority", func(t *testing.T) { + t.Parallel() + + liveAvailable := true + extensionAvailable := false + models := MergeRows([]ModelRow{ + testRow( + "extension:alpha", + SourceKindExtension, + PriorityExtension, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.DisplayName = "Extension GPT" + row.Available = &extensionAvailable + }, + ), + testRow( + "provider_live:codex", + SourceKindProviderLive, + PriorityProviderLive, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.DisplayName = "Live GPT" + row.Available = &liveAvailable + }, + ), + }) + + model := requireSingleModel(t, models) + if model.DisplayName != "Live GPT" { + t.Fatalf("DisplayName = %q, want Live GPT", model.DisplayName) + } + if model.Available == nil || !*model.Available { + t.Fatalf("Available = %v, want true", model.Available) + } + if model.AvailabilityState != string(AvailabilityStateAvailableLive) { + t.Fatalf("AvailabilityState = %q, want available_live", model.AvailabilityState) + } + }) + + t.Run("Should resolve equal priority and freshness by ascending source id", func(t *testing.T) { + t.Parallel() + + models := MergeRows([]ModelRow{ + testRow( + "extension:b", + SourceKindExtension, + PriorityExtension, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.DisplayName = "B Source" + }, + ), + testRow( + "extension:a", + SourceKindExtension, + PriorityExtension, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.DisplayName = "A 
Source" + }, + ), + }) + + model := requireSingleModel(t, models) + if model.DisplayName != "A Source" { + t.Fatalf("DisplayName = %q, want A Source", model.DisplayName) + } + if got, want := model.Sources[0].SourceID, "extension:a"; got != want { + t.Fatalf("Sources[0].SourceID = %q, want %q", got, want) + } + }) + + t.Run("Should let lower priority source fill missing metadata", func(t *testing.T) { + t.Parallel() + + contextWindow := int64(256000) + costInput := 1.25 + models := MergeRows([]ModelRow{ + testRow("config", SourceKindConfig, PriorityConfig, "codex", "gpt-5.4", testTime(0), nil), + testRow( + "models_dev", + SourceKindModelsDev, + PriorityModelsDev, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.DisplayName = "Catalog GPT" + row.ContextWindow = &contextWindow + row.CostInputPerMillion = &costInput + }, + ), + }) + + model := requireSingleModel(t, models) + if model.DisplayName != "Catalog GPT" { + t.Fatalf("DisplayName = %q, want Catalog GPT", model.DisplayName) + } + if model.ContextWindow == nil || *model.ContextWindow != contextWindow { + t.Fatalf("ContextWindow = %v, want %d", model.ContextWindow, contextWindow) + } + if model.CostInputPerMillion == nil || *model.CostInputPerMillion != costInput { + t.Fatalf("CostInputPerMillion = %v, want %f", model.CostInputPerMillion, costInput) + } + }) + + t.Run("Should project merged availability states", func(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + available bool + stale bool + state AvailabilityState + }{ + {name: "Should project stale available live truth", available: true, stale: true, state: AvailabilityStateAvailableStale}, + {name: "Should project fresh available live truth", available: true, stale: false, state: AvailabilityStateAvailableLive}, + {name: "Should project stale unavailable live truth", available: false, stale: true, state: AvailabilityStateUnavailableStale}, + {name: "Should project fresh unavailable live truth", available: 
false, stale: false, state: AvailabilityStateUnavailableLive}, + } { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + models := MergeRows([]ModelRow{ + testRow( + "provider_live:codex", + SourceKindProviderLive, + PriorityProviderLive, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.Available = &tc.available + row.Stale = tc.stale + }, + ), + }) + model := requireSingleModel(t, models) + if model.Available == nil || *model.Available != tc.available { + t.Fatalf("Available = %v, want %t", model.Available, tc.available) + } + if model.AvailabilityState != string(tc.state) { + t.Fatalf("AvailabilityState = %q, want %q", model.AvailabilityState, tc.state) + } + }) + } + }) + + t.Run("Should keep catalog only models at unknown availability", func(t *testing.T) { + t.Parallel() + + models := MergeRows([]ModelRow{ + testRow("models_dev", SourceKindModelsDev, PriorityModelsDev, "codex", "gpt-5.4", testTime(0), nil), + }) + model := requireSingleModel(t, models) + if model.Available != nil { + t.Fatalf("Available = %v, want nil", model.Available) + } + if model.AvailabilityState != string(AvailabilityStateUnknown) { + t.Fatalf("AvailabilityState = %q, want unknown", model.AvailabilityState) + } + }) + + t.Run("Should sort merged projection and source refs deterministically", func(t *testing.T) { + t.Parallel() + + models := MergeRows([]ModelRow{ + testRow("extension:b", SourceKindExtension, PriorityExtension, "claude", "claude-4", testTime(1), nil), + testRow( + "provider_live:codex", + SourceKindProviderLive, + PriorityProviderLive, + "codex", + "gpt-5.4", + testTime(0), + nil, + ), + testRow("extension:a", SourceKindExtension, PriorityExtension, "codex", "gpt-5.4", testTime(2), nil), + }) + if got, want := modelKeys(models), []string{"claude/claude-4", "codex/gpt-5.4"}; !slices.Equal(got, want) { + t.Fatalf("model keys = %#v, want %#v", got, want) + } + if got, want := sourceIDs( + models[1].Sources, + ), []string{ + "provider_live:codex", + 
"extension:a", + }; !slices.Equal( + got, + want, + ) { + t.Fatalf("source ids = %#v, want %#v", got, want) + } + }) +} + +func TestCatalogServiceRefresh(t *testing.T) { + t.Parallel() + + t.Run("Should return partial success and record failed source status", func(t *testing.T) { + t.Parallel() + + store := newMemoryStore() + service := newTestService(t, store, []Source{ + &fakeSource{ + id: "config", + kind: SourceKindConfig, + priority: PriorityConfig, + providers: []string{"codex"}, + rows: []ModelRow{ + testRow("config", SourceKindConfig, PriorityConfig, "codex", "gpt-5.4", testTime(0), nil), + }, + }, + &fakeSource{ + id: "models_dev", + kind: SourceKindModelsDev, + priority: PriorityModelsDev, + providers: []string{"codex"}, + err: fmt.Errorf("upstream failed with api_key=super-secret"), + }, + }) + + models, err := service.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(10)}, + ) + if err != nil { + t.Fatalf("ListModels(refresh) error = %v", err) + } + if got, want := modelKeys(models), []string{"codex/gpt-5.4"}; !slices.Equal(got, want) { + t.Fatalf("model keys = %#v, want %#v", got, want) + } + statuses, err := service.ListSourceStatus(testutil.Context(t), "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } + failed := requireStatus(t, statuses, "models_dev") + if failed.RefreshState != string(RefreshStateFailed) { + t.Fatalf("RefreshState = %q, want failed", failed.RefreshState) + } + if strings.Contains(failed.LastError, "super-secret") || !strings.Contains(failed.LastError, "[REDACTED]") { + t.Fatalf("LastError = %q, want redacted secret", failed.LastError) + } + }) + + t.Run("Should fail all source failure when no stale rows exist", func(t *testing.T) { + t.Parallel() + + store := newMemoryStore() + service := newTestService(t, store, []Source{ + &fakeSource{ + id: "models_dev", + kind: SourceKindModelsDev, + priority: PriorityModelsDev, + providers: []string{"codex"}, + 
err: errors.New("models.dev down"), + }, + }) + + _, err := service.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(0)}, + ) + if !errors.Is(err, ErrAllSourcesFailed) { + t.Fatalf("ListModels() error = %v, want ErrAllSourcesFailed", err) + } + }) + + t.Run("Should return stale rows when refresh fails after prior success", func(t *testing.T) { + t.Parallel() + + available := true + source := &fakeSource{ + id: "provider_live:codex", + kind: SourceKindProviderLive, + priority: PriorityProviderLive, + providers: []string{"codex"}, + rows: []ModelRow{ + testRow( + "provider_live:codex", + SourceKindProviderLive, + PriorityProviderLive, + "codex", + "gpt-5.4", + testTime(0), + func(row *ModelRow) { + row.Available = &available + }, + ), + }, + } + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + if _, err := service.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(1)}, + ); err != nil { + t.Fatalf("ListModels(first refresh) error = %v", err) + } + + source.rows = nil + source.err = errors.New("live source unavailable sk-secret-token") + models, err := service.ListModels( + testutil.Context(t), + ListOptions{ProviderID: "codex", Refresh: true, Now: testTime(2)}, + ) + if err != nil { + t.Fatalf("ListModels(stale refresh) error = %v", err) + } + model := requireSingleModel(t, models) + if model.AvailabilityState != string(AvailabilityStateAvailableStale) { + t.Fatalf("AvailabilityState = %q, want available_stale", model.AvailabilityState) + } + if !model.Stale { + t.Fatal("Model.Stale = false, want true") + } + if strings.Contains(model.LastError, "sk-secret-token") || !strings.Contains(model.LastError, "[REDACTED]") { + t.Fatalf("LastError = %q, want redacted stale error", model.LastError) + } + statuses, err := service.ListSourceStatus(testutil.Context(t), "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } + 
status := requireStatus(t, statuses, "provider_live:codex") + if !status.LastSuccess.Equal(testTime(1)) { + t.Fatalf("LastSuccess = %s, want first refresh time %s", status.LastSuccess, testTime(1)) + } + }) + + t.Run("Should reject invalid extension source id before persistence", func(t *testing.T) { + t.Parallel() + + store := newMemoryStore() + _, err := NewService(store, []Source{ + &fakeSource{id: "extension:BadSlug", kind: SourceKindExtension, priority: PriorityExtension}, + }) + if err == nil { + t.Fatal("NewService(invalid extension source) error = nil, want validation error") + } + if store.replaceCount != 0 { + t.Fatalf("replaceCount = %d, want 0", store.replaceCount) + } + }) +} + +func TestCatalogServiceRefreshConcurrency(t *testing.T) { + t.Parallel() + + t.Run("Should coalesce concurrent refreshes for the same provider scope", func(t *testing.T) { + t.Parallel() + + source := newBlockingRefreshSource(map[string][]ModelRow{ + "codex": { + testRow( + "provider_live:codex", + SourceKindProviderLive, + PriorityProviderLive, + "codex", + "gpt-5.4", + testTime(30), + nil, + ), + }, + }) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + ctx := testutil.Context(t) + + results := make(chan refreshTestResult, 2) + for range 2 { + go func() { + statuses, err := service.Refresh(ctx, RefreshOptions{ + ProviderID: "codex", + SourceID: source.ID(), + Force: true, + Now: testTime(30), + }) + results <- refreshTestResult{statuses: statuses, err: err} + }() + } + source.waitForCalls(t, 1) + source.requireCallCountStable(t, 1, 25*time.Millisecond) + source.release() + + for range 2 { + result := <-results + if result.err != nil { + t.Fatalf("Refresh() error = %v", result.err) + } + if got, want := len(result.statuses), 1; got != want { + t.Fatalf("len(statuses) = %d, want %d: %#v", got, want, result.statuses) + } + } + }) + + t.Run("Should let concurrent refreshes across providers replace rows deterministically", func(t *testing.T) { + 
t.Parallel() + + source := newBlockingRefreshSource(map[string][]ModelRow{ + "claude": { + testRow( + "provider_live:shared", + SourceKindProviderLive, + PriorityProviderLive, + "claude", + "claude-sonnet-4-6", + testTime(31), + nil, + ), + }, + "codex": { + testRow( + "provider_live:shared", + SourceKindProviderLive, + PriorityProviderLive, + "codex", + "gpt-5.4", + testTime(31), + nil, + ), + }, + }) + store := newMemoryStore() + service := newTestService(t, store, []Source{source}) + ctx := testutil.Context(t) + + results := make(chan refreshTestResult, 2) + for _, providerID := range []string{"codex", "claude"} { + go func(providerID string) { + statuses, err := service.Refresh(ctx, RefreshOptions{ + ProviderID: providerID, + SourceID: source.ID(), + Force: true, + Now: testTime(31), + }) + results <- refreshTestResult{statuses: statuses, err: err} + }(providerID) + } + source.waitForCalls(t, 2) + source.release() + + for range 2 { + result := <-results + if result.err != nil { + t.Fatalf("Refresh() error = %v", result.err) + } + if got, want := len(result.statuses), 1; got != want { + t.Fatalf("len(statuses) = %d, want %d: %#v", got, want, result.statuses) + } + } + models, err := service.ListModels( + ctx, + ListOptions{IncludeStale: true, Now: testTime(32)}, + ) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if got, want := modelKeys(models), []string{"claude/claude-sonnet-4-6", "codex/gpt-5.4"}; !slices.Equal( + got, + want, + ) { + t.Fatalf("model keys = %#v, want %#v", got, want) + } + if got, want := source.callCount(), 2; got != want { + t.Fatalf("source calls = %d, want %d cross-provider calls", got, want) + } + }) +} + +type fakeSource struct { + id string + kind SourceKind + priority int + providers []string + rows []ModelRow + err error + ttl time.Duration + calls int +} + +func (s *fakeSource) ID() string { + return s.id +} + +func (s *fakeSource) Kind() SourceKind { + return s.kind +} + +func (s *fakeSource) Priority() int { + 
return s.priority +} + +func (s *fakeSource) ProviderIDs() []string { + return append([]string(nil), s.providers...) +} + +func (s *fakeSource) TTL() time.Duration { + return s.ttl +} + +func (s *fakeSource) ListModels(_ context.Context, opts ListOptions) ([]ModelRow, error) { + s.calls++ + rows := make([]ModelRow, 0, len(s.rows)) + for _, row := range s.rows { + if opts.ProviderID == "" || row.ProviderID == opts.ProviderID { + rows = append(rows, row) + } + } + return rows, s.err +} + +type refreshTestResult struct { + statuses []SourceStatus + err error +} + +type blockingRefreshSource struct { + mu sync.Mutex + rowsByProvider map[string][]ModelRow + calls int + callsCh chan int + releaseCh chan struct{} + releaseOnce sync.Once +} + +func newBlockingRefreshSource(rowsByProvider map[string][]ModelRow) *blockingRefreshSource { + return &blockingRefreshSource{ + rowsByProvider: rowsByProvider, + callsCh: make(chan int, 16), + releaseCh: make(chan struct{}), + } +} + +func (s *blockingRefreshSource) ID() string { + return "provider_live:shared" +} + +func (s *blockingRefreshSource) Kind() SourceKind { + return SourceKindProviderLive +} + +func (s *blockingRefreshSource) Priority() int { + return PriorityProviderLive +} + +func (s *blockingRefreshSource) ProviderIDs() []string { + s.mu.Lock() + defer s.mu.Unlock() + providers := make([]string, 0, len(s.rowsByProvider)) + for providerID := range s.rowsByProvider { + providers = append(providers, providerID) + } + slices.Sort(providers) + return providers +} + +func (s *blockingRefreshSource) TTL() time.Duration { + return 0 +} + +func (s *blockingRefreshSource) ListModels(ctx context.Context, opts ListOptions) ([]ModelRow, error) { + s.mu.Lock() + s.calls++ + calls := s.calls + s.mu.Unlock() + select { + case s.callsCh <- calls: + default: + } + + select { + case <-s.releaseCh: + case <-ctx.Done(): + return nil, ctx.Err() + } + + s.mu.Lock() + rows := cloneModelRows(s.rowsByProvider[opts.ProviderID]) + s.mu.Unlock() + 
return rows, nil +} + +func (s *blockingRefreshSource) waitForCalls(t *testing.T, want int) { + t.Helper() + + deadline := time.After(time.Second) + for { + if s.callCount() >= want { + return + } + select { + case <-s.callsCh: + case <-deadline: + t.Fatalf("source calls = %d, want at least %d", s.callCount(), want) + } + } +} + +func (s *blockingRefreshSource) requireCallCountStable(t *testing.T, want int, duration time.Duration) { + t.Helper() + + timer := time.NewTimer(duration) + defer timer.Stop() + for { + select { + case <-s.callsCh: + if got := s.callCount(); got > want { + t.Fatalf("source calls = %d while first refresh was blocked, want at most %d", got, want) + } + case <-timer.C: + return + } + } +} + +func (s *blockingRefreshSource) release() { + s.releaseOnce.Do(func() { + close(s.releaseCh) + }) +} + +func (s *blockingRefreshSource) callCount() int { + s.mu.Lock() + defer s.mu.Unlock() + return s.calls +} + +type memoryStore struct { + mu sync.Mutex + rows map[string][]ModelRow + statuses map[string]SourceStatus + replaceCount int +} + +func newMemoryStore() *memoryStore { + return &memoryStore{ + rows: make(map[string][]ModelRow), + statuses: make(map[string]SourceStatus), + } +} + +func (s *memoryStore) ReplaceSourceRows( + _ context.Context, + sourceID string, + providerID string, + rows []ModelRow, + status SourceStatus, +) error { + s.mu.Lock() + defer s.mu.Unlock() + s.replaceCount++ + key := sourceProviderKey(sourceID, providerID) + s.rows[key] = cloneModelRows(rows) + s.statuses[key] = status + return nil +} + +func (s *memoryStore) ListRows(_ context.Context, opts ListOptions) ([]ModelRow, error) { + s.mu.Lock() + defer s.mu.Unlock() + rows := make([]ModelRow, 0) + for _, group := range s.rows { + for _, row := range group { + if opts.ProviderID != "" && row.ProviderID != opts.ProviderID { + continue + } + if opts.SourceID != "" && row.SourceID != opts.SourceID { + continue + } + if row.Stale && !opts.IncludeAll && !opts.IncludeStale { + 
continue + } + rows = append(rows, row) + } + } + return rows, nil +} + +func (s *memoryStore) ListSourceStatus(_ context.Context, providerID string) ([]SourceStatus, error) { + s.mu.Lock() + defer s.mu.Unlock() + statuses := make([]SourceStatus, 0, len(s.statuses)) + for _, status := range s.statuses { + if providerID == "" || status.ProviderID == providerID { + statuses = append(statuses, status) + } + } + return statuses, nil +} + +func sourceProviderKey(sourceID string, providerID string) string { + return sourceID + "\x00" + providerID +} + +func newTestService(t *testing.T, store Store, sources []Source) *CatalogService { + t.Helper() + + service, err := NewService(store, sources) + if err != nil { + t.Fatalf("NewService() error = %v", err) + } + return service +} + +func testRow( + sourceID string, + kind SourceKind, + priority int, + providerID string, + modelID string, + refreshedAt time.Time, + mutate func(*ModelRow), +) ModelRow { + row := ModelRow{ + SourceID: sourceID, + SourceKind: kind, + Priority: priority, + ProviderID: providerID, + ModelID: modelID, + RefreshedAt: refreshedAt, + } + if mutate != nil { + mutate(&row) + } + return row +} + +func testTime(offset int) time.Time { + return time.Date(2026, 5, 7, 12, offset, 0, 0, time.UTC) +} + +func requireSingleModel(t *testing.T, models []Model) Model { + t.Helper() + + if len(models) != 1 { + t.Fatalf("len(models) = %d, want 1: %#v", len(models), models) + } + return models[0] +} + +func requireStatus(t *testing.T, statuses []SourceStatus, sourceID string) SourceStatus { + t.Helper() + + for _, status := range statuses { + if status.SourceID == sourceID { + return status + } + } + t.Fatalf("statuses = %#v, want source %q", statuses, sourceID) + return SourceStatus{} +} + +func modelKeys(models []Model) []string { + keys := make([]string, 0, len(models)) + for _, model := range models { + keys = append(keys, model.ProviderID+"/"+model.ModelID) + } + return keys +} + +func sourceIDs(sources 
[]SourceRef) []string { + ids := make([]string, 0, len(sources)) + for _, source := range sources { + ids = append(ids, source.SourceID) + } + return ids +} diff --git a/internal/modelcatalog/source_id.go b/internal/modelcatalog/source_id.go new file mode 100644 index 000000000..1f5be74c7 --- /dev/null +++ b/internal/modelcatalog/source_id.go @@ -0,0 +1,116 @@ +package modelcatalog + +import ( + "fmt" + "regexp" + "strings" +) + +var sourceSlugPattern = regexp.MustCompile(`^[a-z0-9][a-z0-9_-]*$`) + +// ValidateSourceID checks a stable catalog source identity. +func ValidateSourceID(sourceID string) error { + trimmed := strings.TrimSpace(sourceID) + if trimmed == "" { + return fmt.Errorf("model catalog source id is required") + } + if staticSourceKind(trimmed) != "" { + return nil + } + kind, slug, ok := strings.Cut(trimmed, ":") + if !ok || kind == "" || slug == "" { + return fmt.Errorf("model catalog source id %q must be static or <kind>:<slug>", sourceID) + } + switch SourceKind(kind) { + case SourceKindProviderLive, SourceKindExtension, SourceKindACPSession: + default: + return fmt.Errorf("model catalog source id %q uses unsupported dynamic kind %q", sourceID, kind) + } + if !sourceSlugPattern.MatchString(slug) { + return fmt.Errorf("model catalog source id %q slug must match ^[a-z0-9][a-z0-9_-]*$", sourceID) + } + return nil +} + +// ValidateSourceIdentity checks that a source id and kind describe the same source family. 
+func ValidateSourceIdentity(sourceID string, kind SourceKind) error { + trimmedID := strings.TrimSpace(sourceID) + if err := ValidateSourceID(trimmedID); err != nil { + return err + } + trimmedKind := SourceKind(strings.TrimSpace(string(kind))) + if trimmedKind == "" { + return fmt.Errorf("model catalog source kind is required") + } + if staticKind := staticSourceKind(trimmedID); staticKind != "" { + if staticKind != trimmedKind { + return fmt.Errorf("model catalog source id %q requires kind %q, got %q", trimmedID, staticKind, trimmedKind) + } + return nil + } + prefix, _, _ := strings.Cut(trimmedID, ":") + if SourceKind(prefix) != trimmedKind { + return fmt.Errorf("model catalog source id %q requires kind %q, got %q", trimmedID, prefix, trimmedKind) + } + return nil +} + +// SourceKindExtensionID returns the stable source id for an extension model source. +func SourceKindExtensionID(extensionName string) (string, error) { + slug, err := NormalizeExtensionSourceSlug(extensionName) + if err != nil { + return "", err + } + return string(SourceKindExtension) + ":" + slug, nil +} + +// NormalizeExtensionSourceSlug converts an extension name into the dynamic source-id slug. 
+func NormalizeExtensionSourceSlug(extensionName string) (string, error) { + trimmed := strings.TrimSpace(extensionName) + if trimmed == "" { + return "", fmt.Errorf("model catalog extension source name is required") + } + var builder strings.Builder + lastSeparator := false + for _, r := range trimmed { + switch { + case r >= 'A' && r <= 'Z': + builder.WriteRune(r + ('a' - 'A')) + lastSeparator = false + case r >= 'a' && r <= 'z': + builder.WriteRune(r) + lastSeparator = false + case r >= '0' && r <= '9': + builder.WriteRune(r) + lastSeparator = false + case r == '-' || r == '_': + builder.WriteRune(r) + lastSeparator = true + case r == ' ' || r == '\t' || r == '\n' || r == '\r': + if !lastSeparator { + builder.WriteRune('-') + lastSeparator = true + } + default: + return "", fmt.Errorf("model catalog extension source slug cannot include %q", string(r)) + } + } + slug := builder.String() + if !sourceSlugPattern.MatchString(slug) { + return "", fmt.Errorf("model catalog extension source slug %q must match ^[a-z0-9][a-z0-9_-]*$", slug) + } + return slug, nil +} + +func staticSourceKind(sourceID string) SourceKind { + switch sourceID { + case SourceIDBuiltin: + return SourceKindBuiltin + case SourceIDConfig: + return SourceKindConfig + case SourceIDModelsDev: + return SourceKindModelsDev + default: + return "" + } +} diff --git a/internal/modelcatalog/sources.go b/internal/modelcatalog/sources.go new file mode 100644 index 000000000..40a0b121e --- /dev/null +++ b/internal/modelcatalog/sources.go @@ -0,0 +1,167 @@ +package modelcatalog + +import ( + "context" + "maps" + "sort" + "strings" + "time" + + aghconfig "github.com/pedronauck/agh/internal/config" +) + +type providerConfigSource struct { + id string + kind SourceKind + priority int + providers map[string]aghconfig.ProviderConfig +} + +var _ Source = (*providerConfigSource)(nil) + +// NewBuiltinSource creates the offline bootstrap source from AGH built-ins. 
+func NewBuiltinSource() Source { + return newProviderConfigSource( + SourceIDBuiltin, + SourceKindBuiltin, + PriorityBuiltin, + aghconfig.BuiltinProviders(), + ) +} + +// NewConfigSource creates the operator config model source. +func NewConfigSource(providers map[string]aghconfig.ProviderConfig) Source { + return newProviderConfigSource(SourceIDConfig, SourceKindConfig, PriorityConfig, providers) +} + +func newProviderConfigSource( + id string, + kind SourceKind, + priority int, + providers map[string]aghconfig.ProviderConfig, +) Source { + return &providerConfigSource{ + id: id, + kind: kind, + priority: priority, + providers: cloneConfigProviders(providers), + } +} + +func (s *providerConfigSource) ID() string { + return s.id +} + +func (s *providerConfigSource) Kind() SourceKind { + return s.kind +} + +func (s *providerConfigSource) Priority() int { + return s.priority +} + +func (s *providerConfigSource) ProviderIDs() []string { + providers := make([]string, 0, len(s.providers)) + for providerID := range s.providers { + providers = append(providers, providerID) + } + sort.Strings(providers) + return providers +} + +func (s *providerConfigSource) ListModels( + _ context.Context, + opts ListOptions, +) ([]ModelRow, error) { + now := defaultNow(opts.Now) + providers := s.ProviderIDs() + rows := make([]ModelRow, 0) + for _, providerID := range providers { + if opts.ProviderID != "" && opts.ProviderID != providerID { + continue + } + provider := s.providers[providerID] + rows = append(rows, providerModelRows(providerID, provider.Models, s.id, s.kind, s.priority, now)...) 
+ } + return rows, nil +} + +func providerModelRows( + providerID string, + models aghconfig.ProviderModelsConfig, + sourceID string, + kind SourceKind, + priority int, + now time.Time, +) []ModelRow { + byID := make(map[string]ModelRow) + order := make([]string, 0, len(models.Curated)+1) + addModel := func(modelID string) ModelRow { + trimmed := strings.TrimSpace(modelID) + row, ok := byID[trimmed] + if ok { + return row + } + row = ModelRow{ + ProviderID: providerID, + ModelID: trimmed, + SourceID: sourceID, + SourceKind: kind, + Priority: priority, + RefreshedAt: now, + } + byID[trimmed] = row + order = append(order, trimmed) + return row + } + if defaultModel := strings.TrimSpace(models.Default); defaultModel != "" { + addModel(defaultModel) + } + for _, curated := range models.Curated { + modelID := strings.TrimSpace(curated.ID) + if modelID == "" { + continue + } + row := addModel(modelID) + enrichRowFromProviderModel(&row, curated) + byID[modelID] = row + } + rows := make([]ModelRow, 0, len(order)) + for _, modelID := range order { + rows = append(rows, byID[modelID]) + } + return rows +} + +func enrichRowFromProviderModel(row *ModelRow, model aghconfig.ProviderModelConfig) { + row.DisplayName = strings.TrimSpace(model.DisplayName) + row.ContextWindow = model.ContextWindow + row.MaxInputTokens = model.MaxInputTokens + row.MaxOutputTokens = model.MaxOutputTokens + row.SupportsTools = model.SupportsTools + row.SupportsReasoning = model.SupportsReasoning + row.CostInputPerMillion = model.CostInputPerMillion + row.CostOutputPerMillion = model.CostOutputPerMillion + if len(model.ReasoningEfforts) > 0 { + row.ReasoningEfforts = make([]ReasoningEffort, 0, len(model.ReasoningEfforts)) + for _, effort := range model.ReasoningEfforts { + trimmed := strings.TrimSpace(effort) + if trimmed != "" { + row.ReasoningEfforts = append(row.ReasoningEfforts, ReasoningEffort(trimmed)) + } + } + } + if effort := strings.TrimSpace(model.DefaultReasoningEffort); effort != "" { + 
defaultEffort := ReasoningEffort(effort) + row.DefaultReasoningEffort = &defaultEffort + } +} + +func cloneConfigProviders(src map[string]aghconfig.ProviderConfig) map[string]aghconfig.ProviderConfig { + if src == nil { + return map[string]aghconfig.ProviderConfig{} + } + cloned := make(map[string]aghconfig.ProviderConfig, len(src)) + maps.Copy(cloned, src) + return cloned +} diff --git a/internal/modelcatalog/sources_test.go b/internal/modelcatalog/sources_test.go new file mode 100644 index 000000000..74ed84350 --- /dev/null +++ b/internal/modelcatalog/sources_test.go @@ -0,0 +1,111 @@ +package modelcatalog + +import ( + "slices" + "testing" + + aghconfig "github.com/pedronauck/agh/internal/config" + "github.com/pedronauck/agh/internal/testutil" +) + +func TestProviderConfigSources(t *testing.T) { + t.Parallel() + + t.Run("Should expose manual default outside curated list", func(t *testing.T) { + t.Parallel() + + source := NewConfigSource(map[string]aghconfig.ProviderConfig{ + "codex": { + Models: aghconfig.ProviderModelsConfig{ + Default: "manual-model", + Curated: []aghconfig.ProviderModelConfig{ + {ID: "curated-model", DisplayName: "Curated Model"}, + }, + }, + }, + }) + rows, err := source.ListModels(testutil.Context(t), ListOptions{ProviderID: "codex", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if got, want := rowModelIDs(rows), []string{"manual-model", "curated-model"}; !slices.Equal(got, want) { + t.Fatalf("row ids = %#v, want %#v", got, want) + } + if rows[0].DisplayName != "" { + t.Fatalf("default DisplayName = %q, want empty metadata for manual default", rows[0].DisplayName) + } + }) + + t.Run("Should convert curated config metadata into rows", func(t *testing.T) { + t.Parallel() + + supportsTools := true + contextWindow := int64(128000) + defaultEffort := ReasoningEffortHigh + source := NewConfigSource(map[string]aghconfig.ProviderConfig{ + "codex": { + Models: aghconfig.ProviderModelsConfig{ + Default: 
"gpt-5.4", + Curated: []aghconfig.ProviderModelConfig{ + { + ID: "gpt-5.4", + DisplayName: "GPT-5.4", + ContextWindow: &contextWindow, + SupportsTools: &supportsTools, + ReasoningEfforts: []string{"low", "high"}, + DefaultReasoningEffort: string(defaultEffort), + }, + }, + }, + }, + }) + rows, err := source.ListModels(testutil.Context(t), ListOptions{ProviderID: "codex", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if len(rows) != 1 { + t.Fatalf("len(rows) = %d, want 1: %#v", len(rows), rows) + } + row := rows[0] + if row.DisplayName != "GPT-5.4" { + t.Fatalf("DisplayName = %q, want GPT-5.4", row.DisplayName) + } + if row.ContextWindow == nil || *row.ContextWindow != contextWindow { + t.Fatalf("ContextWindow = %v, want %d", row.ContextWindow, contextWindow) + } + if row.SupportsTools == nil || !*row.SupportsTools { + t.Fatalf("SupportsTools = %v, want true", row.SupportsTools) + } + if !slices.Equal(row.ReasoningEfforts, []ReasoningEffort{ReasoningEffortLow, ReasoningEffortHigh}) { + t.Fatalf("ReasoningEfforts = %#v, want low/high", row.ReasoningEfforts) + } + if row.DefaultReasoningEffort == nil || *row.DefaultReasoningEffort != defaultEffort { + t.Fatalf("DefaultReasoningEffort = %v, want high", row.DefaultReasoningEffort) + } + }) + + t.Run("Should expose builtin provider model defaults", func(t *testing.T) { + t.Parallel() + + source := NewBuiltinSource() + rows, err := source.ListModels(testutil.Context(t), ListOptions{ProviderID: "codex", Now: testTime(0)}) + if err != nil { + t.Fatalf("ListModels() error = %v", err) + } + if len(rows) == 0 { + t.Fatal("len(rows) = 0, want builtin codex models") + } + if rows[0].SourceID != SourceIDBuiltin || rows[0].SourceKind != SourceKindBuiltin || + rows[0].Priority != PriorityBuiltin { + t.Fatalf("row source = %#v, want builtin source metadata", rows[0]) + } + }) +} + +func rowModelIDs(rows []ModelRow) []string { + ids := make([]string, 0, len(rows)) + for _, row := range rows { + 
ids = append(ids, row.ModelID) + } + return ids +} diff --git a/internal/modelcatalog/types.go b/internal/modelcatalog/types.go new file mode 100644 index 000000000..3ea2b0792 --- /dev/null +++ b/internal/modelcatalog/types.go @@ -0,0 +1,212 @@ +package modelcatalog + +import ( + "context" + "time" +) + +// SourceKind identifies the provenance family for a catalog source row. +type SourceKind string + +const ( + // SourceKindBuiltin identifies AGH's offline bootstrap catalog. + SourceKindBuiltin SourceKind = "builtin" + // SourceKindConfig identifies operator-authored provider model config. + SourceKindConfig SourceKind = "config" + // SourceKindModelsDev identifies enrichment from models.dev. + SourceKindModelsDev SourceKind = "models_dev" + // SourceKindProviderLive identifies live provider discovery. + SourceKindProviderLive SourceKind = "provider_live" + // SourceKindExtension identifies extension-provided model source rows. + SourceKindExtension SourceKind = "extension" + // SourceKindACPSession identifies session-scoped ACP observations. + SourceKindACPSession SourceKind = "acp_session" +) + +const ( + // SourceIDBuiltin is AGH's offline bootstrap catalog source. + SourceIDBuiltin = "builtin" + // SourceIDConfig is the operator-authored provider config source. + SourceIDConfig = "config" + // SourceIDModelsDev is the models.dev catalog source. + SourceIDModelsDev = "models_dev" +) + +const ( + // PriorityConfig lets explicit operator config win source conflicts. + PriorityConfig = 120 + // PriorityProviderLive ranks live provider data above extension rows. + PriorityProviderLive = 110 + // PriorityExtension ranks extension-provided rows above catalog enrichment. + PriorityExtension = 100 + // PriorityModelsDev ranks models.dev as catalog enrichment. + PriorityModelsDev = 50 + // PriorityBuiltin ranks offline bootstrap rows last. + PriorityBuiltin = 10 +) + +// ReasoningEffort identifies one normalized model reasoning level. 
+type ReasoningEffort string + +const ( + // ReasoningEffortMinimal is the smallest supported reasoning level. + ReasoningEffortMinimal ReasoningEffort = "minimal" + // ReasoningEffortLow is the low reasoning level. + ReasoningEffortLow ReasoningEffort = "low" + // ReasoningEffortMedium is the medium reasoning level. + ReasoningEffortMedium ReasoningEffort = "medium" + // ReasoningEffortHigh is the high reasoning level. + ReasoningEffortHigh ReasoningEffort = "high" + // ReasoningEffortXHigh is the extra-high reasoning level. + ReasoningEffortXHigh ReasoningEffort = "xhigh" +) + +// RefreshState identifies one source refresh lifecycle state. +type RefreshState string + +const ( + // RefreshStateIdle indicates a source has no active refresh state. + RefreshStateIdle RefreshState = "idle" + // RefreshStateRefreshing indicates a source refresh is currently running. + RefreshStateRefreshing RefreshState = "refreshing" + // RefreshStateSucceeded indicates the last source refresh succeeded. + RefreshStateSucceeded RefreshState = "succeeded" + // RefreshStateFailed indicates the last source refresh failed. + RefreshStateFailed RefreshState = "failed" + // RefreshStateDisabled indicates a configured source is disabled. + RefreshStateDisabled RefreshState = "disabled" +) + +// AvailabilityState identifies how reliable the merged availability signal is. +type AvailabilityState string + +const ( + // AvailabilityStateAvailableLive means a fresh live source reports availability. + AvailabilityStateAvailableLive AvailabilityState = "available_live" + // AvailabilityStateAvailableStale means a stale live source reports availability. + AvailabilityStateAvailableStale AvailabilityState = "available_stale" + // AvailabilityStateUnavailableLive means a fresh live source reports unavailability. + AvailabilityStateUnavailableLive AvailabilityState = "unavailable_live" + // AvailabilityStateUnavailableStale means a stale live source reports unavailability. 
+ AvailabilityStateUnavailableStale AvailabilityState = "unavailable_stale" + // AvailabilityStateUnknown means no live or extension source reported availability. + AvailabilityStateUnknown AvailabilityState = "unknown" +) + +// ListOptions filters persisted catalog source rows. +type ListOptions struct { + ProviderID string + SourceID string + Refresh bool + IncludeAll bool + IncludeStale bool + Now time.Time +} + +// RefreshOptions controls a model catalog refresh request. +type RefreshOptions struct { + ProviderID string + SourceID string + Force bool + RequestID string + Now time.Time +} + +// ModelRow is one provider/model record contributed by one catalog source. +type ModelRow struct { + ProviderID string + ModelID string + DisplayName string + SourceID string + SourceKind SourceKind + Priority int + Available *bool + Stale bool + RefreshedAt time.Time + ExpiresAt time.Time + ContextWindow *int64 + MaxInputTokens *int64 + MaxOutputTokens *int64 + SupportsTools *bool + SupportsReasoning *bool + ReasoningEfforts []ReasoningEffort + DefaultReasoningEffort *ReasoningEffort + CostInputPerMillion *float64 + CostOutputPerMillion *float64 + LastError string +} + +// SourceRef identifies one source participating in a merged catalog projection. +type SourceRef struct { + SourceID string + SourceKind SourceKind + Priority int + RefreshedAt time.Time + Stale bool + LastError string +} + +// Model is the deterministic merged projection for one provider/model key. 
+type Model struct { + ProviderID string + ModelID string + DisplayName string + Sources []SourceRef + Available *bool + AvailabilityState string + Stale bool + RefreshedAt time.Time + ContextWindow *int64 + MaxInputTokens *int64 + MaxOutputTokens *int64 + SupportsTools *bool + SupportsReasoning *bool + ReasoningEfforts []ReasoningEffort + DefaultReasoningEffort *ReasoningEffort + CostInputPerMillion *float64 + CostOutputPerMillion *float64 + LastError string +} + +// SourceStatus reports provider-scoped source health and row counts. +type SourceStatus struct { + SourceID string + SourceKind SourceKind + ProviderID string + Priority int + LastRefresh time.Time + NextRefresh time.Time + LastSuccess time.Time + LastError string + RefreshState string + RowCount int + Stale bool +} + +// Source produces model rows for one catalog source. +type Source interface { + ID() string + Kind() SourceKind + Priority() int + ListModels(ctx context.Context, opts ListOptions) ([]ModelRow, error) +} + +// Store persists source rows and provider-scoped source status. +type Store interface { + ReplaceSourceRows( + ctx context.Context, + sourceID string, + providerID string, + rows []ModelRow, + status SourceStatus, + ) error + ListRows(ctx context.Context, opts ListOptions) ([]ModelRow, error) + ListSourceStatus(ctx context.Context, providerID string) ([]SourceStatus, error) +} + +// Service exposes merged model catalog projections. 
+type Service interface { + ListModels(ctx context.Context, opts ListOptions) ([]Model, error) + Refresh(ctx context.Context, opts RefreshOptions) ([]SourceStatus, error) + ListSourceStatus(ctx context.Context, providerID string) ([]SourceStatus, error) +} diff --git a/internal/sandbox/daytona/provider_test.go b/internal/sandbox/daytona/provider_test.go index e4472861a..70886a851 100644 --- a/internal/sandbox/daytona/provider_test.go +++ b/internal/sandbox/daytona/provider_test.go @@ -576,7 +576,7 @@ func TestDaytonaToolHostPermissionDecisionModes(t *testing.T) { t.Fatalf("newDaytonaToolHost() error = %v", err) } decision, interactive := host.PermissionDecision(acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ Kind: tc.kind, Locations: []acpsdk.ToolCallLocation{{Path: "file.txt"}}, }, @@ -800,7 +800,7 @@ func TestDaytonaToolHostConstructorAuthorizationAndPaths(t *testing.T) { t.Fatalf("Authorize(create terminal) error = %v", err) } decision, interactive := allowAll.PermissionDecision(acpsdk.RequestPermissionRequest{ - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ Locations: []acpsdk.ToolCallLocation{{Path: "/outside/file.txt"}}, }, }) diff --git a/internal/session/interfaces.go b/internal/session/interfaces.go index 773c0f630..1e9f74499 100644 --- a/internal/session/interfaces.go +++ b/internal/session/interfaces.go @@ -126,6 +126,7 @@ type AgentProcess struct { waitFn func() error stderrFn func() string healthStateFn func() subprocess.HealthState + capsSnapshotFn func() acp.Caps approvePermissionFn func(context.Context, acp.ApproveRequest) error requestPermissionFn func(context.Context, acp.RequestPermissionRequest) (acp.RequestPermissionResponse, error) configureRuntimeFn func(func() TurnSource) @@ -152,6 +153,7 @@ type AgentProcessOptions struct { HealthState func() subprocess.HealthState ApprovePermission func(context.Context, acp.ApproveRequest) error 
RequestPermission func(context.Context, acp.RequestPermissionRequest) (acp.RequestPermissionResponse, error) + CapsSnapshot func() acp.Caps ConfigureRuntime func(func() TurnSource) ToolHost sandbox.ToolHost } @@ -191,6 +193,7 @@ func NewAgentProcess(opts AgentProcessOptions) *AgentProcess { waitFn: waitFn, stderrFn: stderrFn, healthStateFn: opts.HealthState, + capsSnapshotFn: opts.CapsSnapshot, approvePermissionFn: opts.ApprovePermission, requestPermissionFn: opts.RequestPermission, configureRuntimeFn: opts.ConfigureRuntime, @@ -229,6 +232,17 @@ func (p *AgentProcess) HealthState() (subprocess.HealthState, bool) { return p.healthStateFn(), true } +// CapsSnapshot returns the latest ACP capability/config snapshot for this process. +func (p *AgentProcess) CapsSnapshot() acp.Caps { + if p == nil { + return acp.Caps{} + } + if p.capsSnapshotFn != nil { + return acp.CloneCaps(p.capsSnapshotFn()) + } + return acp.CloneCaps(p.Caps) +} + // ToolHost returns the sandbox-owned tool host when the process exposes one. 
func (p *AgentProcess) ToolHost() sandbox.ToolHost { if p == nil { @@ -281,18 +295,19 @@ func wrapACPProcess(proc *acp.AgentProcess) *AgentProcess { } return &AgentProcess{ - PID: proc.PID, - AgentName: proc.AgentName, - Command: proc.Command, - Args: append([]string(nil), proc.Args...), - Cwd: proc.Cwd, - SessionID: proc.SessionID, - Caps: proc.Caps, - StartedAt: proc.StartedAt, - done: proc.Done(), - waitFn: proc.Wait, - stderrFn: proc.Stderr, - healthStateFn: proc.HealthState, + PID: proc.PID, + AgentName: proc.AgentName, + Command: proc.Command, + Args: append([]string(nil), proc.Args...), + Cwd: proc.Cwd, + SessionID: proc.SessionID, + Caps: proc.CapsSnapshot(), + StartedAt: proc.StartedAt, + done: proc.Done(), + waitFn: proc.Wait, + stderrFn: proc.Stderr, + healthStateFn: proc.HealthState, + capsSnapshotFn: proc.CapsSnapshot, approvePermissionFn: func(ctx context.Context, req acp.ApproveRequest) error { if err := ctx.Err(); err != nil { return err diff --git a/internal/session/manager.go b/internal/session/manager.go index 407f274e3..5101f0568 100644 --- a/internal/session/manager.go +++ b/internal/session/manager.go @@ -35,6 +35,8 @@ var ( ErrPendingPermissionNotFound = errors.New("session: pending permission not found") // ErrPendingPermissionConflict reports that the approval request matched multiple pending permissions. ErrPendingPermissionConflict = errors.New("session: pending permission lookup is ambiguous") + // ErrInvalidRuntimeOverride reports that a session runtime override is invalid. + ErrInvalidRuntimeOverride = errors.New("session: invalid runtime override") ) // CreateOpts defines the inputs required to create a new session. 
@@ -42,6 +44,7 @@ type CreateOpts struct { AgentName string Provider string Model string + ReasoningEffort string SandboxRef string DisableSandbox bool Name string diff --git a/internal/session/manager_start.go b/internal/session/manager_start.go index 2d14b4568..64a718364 100644 --- a/internal/session/manager_start.go +++ b/internal/session/manager_start.go @@ -28,6 +28,7 @@ type sessionStartSpec struct { agentName string provider string model string + reasoningEffort string sandboxDisabled bool workspace workspacepkg.ResolvedWorkspace channel string @@ -103,6 +104,7 @@ func (m *Manager) prepareCreateStart(ctx context.Context, opts CreateOpts) (sess agentName: strings.TrimSpace(agentName), provider: strings.TrimSpace(opts.Provider), model: strings.TrimSpace(opts.Model), + reasoningEffort: strings.TrimSpace(opts.ReasoningEffort), sandboxDisabled: sandboxDisabled, workspace: resolvedWorkspace, channel: strings.TrimSpace(opts.Channel), @@ -135,6 +137,8 @@ func (m *Manager) prepareResumeStart(ctx context.Context, meta store.SessionMeta sessionName: meta.Name, agentName: meta.AgentName, provider: strings.TrimSpace(meta.Provider), + model: strings.TrimSpace(meta.Model), + reasoningEffort: strings.TrimSpace(meta.ReasoningEffort), workspace: resolvedWorkspace, channel: strings.TrimSpace(meta.Channel), sessionType: normalizeSessionType(Type(meta.SessionType)), @@ -364,6 +368,9 @@ func (m *Manager) prepareSessionStartRuntime( if err != nil { return sessionStartRuntime{}, fmt.Errorf("session: resolve session agent %q: %w", spec.agentName, err) } + if err := spec.validateRuntimeOverrides(resolved); err != nil { + return sessionStartRuntime{}, err + } startMCPServers, err := m.sessionMCPServers(ctx, spec, resolved) if err != nil { @@ -377,6 +384,25 @@ func (m *Manager) prepareSessionStartRuntime( }, nil } +func (s *sessionStartSpec) validateRuntimeOverrides(_ aghconfig.ResolvedAgent) error { + providerOverride := strings.TrimSpace(s.provider) + modelOverride := 
strings.TrimSpace(s.model) + reasoningEffort := strings.TrimSpace(s.reasoningEffort) + if modelOverride != "" && providerOverride == "" { + return fmt.Errorf("%w: provider is required when model is set", ErrInvalidRuntimeOverride) + } + if reasoningEffort == "" { + return nil + } + if providerOverride == "" { + return fmt.Errorf("%w: provider is required when reasoning_effort is set", ErrInvalidRuntimeOverride) + } + if err := ValidateReasoningEffort(reasoningEffort); err != nil { + return err + } + return nil +} + func (m *Manager) sessionMCPServers( ctx context.Context, spec *sessionStartSpec, @@ -440,6 +466,7 @@ func (s *sessionStartSpec) newStartingSession( AgentName: resolved.Name, Provider: strings.TrimSpace(resolved.Provider), Model: strings.TrimSpace(resolved.Model), + ReasoningEffort: strings.TrimSpace(s.reasoningEffort), WorkspaceID: s.workspace.ID, Workspace: s.workspace.RootDir, Channel: s.channel, @@ -556,6 +583,7 @@ func (m *Manager) sessionStartOpts( Permissions: m.startPermissions(session.Type, resolved.Permissions), SystemPrompt: resolved.Prompt, PreferredModel: preferredACPModel(resolved), + ReasoningEffort: strings.TrimSpace(session.ReasoningEffort), ResumeSessionID: s.acpSessionID, ToolGateway: newProviderNativeToolGateway(m, session), } @@ -602,6 +630,12 @@ func sessionStartEnvForProvider( env = setSessionStartEnvValue(env, "AGH_AGENT_NAME", strings.TrimSpace(session.AgentName)) env = unsetSessionStartEnvKeys(env, "AGH_SESSION_CHANNEL", "AGH_PEER_ID") + if effort := strings.TrimSpace(session.ReasoningEffort); effort != "" { + env = setSessionStartEnvValue(env, "AGH_REASONING_EFFORT", effort) + } else { + env = unsetSessionStartEnvKeys(env, "AGH_REASONING_EFFORT") + } + channel := strings.TrimSpace(session.Channel) if channel == "" { return env diff --git a/internal/session/manager_test.go b/internal/session/manager_test.go index 2c556754b..52042d11a 100644 --- a/internal/session/manager_test.go +++ b/internal/session/manager_test.go @@ -102,6 
+102,7 @@ func TestCreateAppliesRuntimeModelOverride(t *testing.T) { h := newHarness(t) session, err := h.manager.Create(testutil.Context(t), CreateOpts{ AgentName: "coder", + Provider: "codex", Model: "task-profile-model", Name: "profiled-worker", Workspace: h.workspaceID, @@ -122,6 +123,82 @@ func TestCreateAppliesRuntimeModelOverride(t *testing.T) { t.Fatalf("meta.Model = %q, want task-profile-model", meta.Model) } }) + + t.Run("Should reject model override without provider override", func(t *testing.T) { + t.Parallel() + + h := newHarness(t) + _, err := h.manager.Create(testutil.Context(t), CreateOpts{ + AgentName: "coder", + Model: "task-profile-model", + Workspace: h.workspaceID, + }) + if !errors.Is(err, ErrInvalidRuntimeOverride) { + t.Fatalf("Create() error = %v, want ErrInvalidRuntimeOverride", err) + } + }) + + t.Run("Should persist supported reasoning effort override", func(t *testing.T) { + t.Parallel() + + h := newHarness(t) + session, err := h.manager.Create(testutil.Context(t), CreateOpts{ + AgentName: "coder", + Provider: "codex", + ReasoningEffort: "high", + Name: "reasoned-worker", + Workspace: h.workspaceID, + }) + if err != nil { + t.Fatalf("Create() error = %v", err) + } + t.Cleanup(func() { + if err := h.manager.Stop(testutil.Context(t), session.ID); err != nil { + t.Fatalf("Stop() error = %v", err) + } + }) + + if got := session.Info().ReasoningEffort; got != "high" { + t.Fatalf("session.Info().ReasoningEffort = %q, want high", got) + } + if meta := readMeta(t, session.MetaPath()); meta.ReasoningEffort != "high" { + t.Fatalf("meta.ReasoningEffort = %q, want high", meta.ReasoningEffort) + } + }) + + t.Run("Should persist reasoning effort without provider-level support flag", func(t *testing.T) { + t.Parallel() + + h := newHarness(t) + h.driver.startHook = func(opts acp.StartOpts, _ int) (*fakeProcess, error) { + if got := opts.ReasoningEffort; got != "high" { + t.Fatalf("StartOpts.ReasoningEffort = %q, want high", got) + } + return 
newFakeProcess(opts.AgentName, opts.Command, opts.Cwd, "acp-reasoning"), nil + } + session, err := h.manager.Create(testutil.Context(t), CreateOpts{ + AgentName: "coder", + Provider: "claude", + ReasoningEffort: "high", + Name: "reasoned-claude-worker", + Workspace: h.workspaceID, + }) + if err != nil { + t.Fatalf("Create() error = %v", err) + } + t.Cleanup(func() { + if err := h.manager.Stop(testutil.Context(t), session.ID); err != nil { + t.Fatalf("Stop() error = %v", err) + } + }) + + if got := session.Info().ReasoningEffort; got != "high" { + t.Fatalf("session.Info().ReasoningEffort = %q, want high", got) + } + if meta := readMeta(t, session.MetaPath()); meta.ReasoningEffort != "high" { + t.Fatalf("meta.ReasoningEffort = %q, want high", meta.ReasoningEffort) + } + }) } func TestCreateNotifiesSessionCreationBeforeImmediateExit(t *testing.T) { diff --git a/internal/session/query.go b/internal/session/query.go index 61c324258..c84b17b65 100644 --- a/internal/session/query.go +++ b/internal/session/query.go @@ -309,6 +309,7 @@ func sessionInfoFromMeta(meta store.SessionMeta) *Info { AgentName: meta.AgentName, Provider: meta.Provider, Model: strings.TrimSpace(meta.Model), + ReasoningEffort: strings.TrimSpace(meta.ReasoningEffort), WorkspaceID: meta.WorkspaceID, Channel: meta.Channel, Type: normalizeSessionType(Type(meta.SessionType)), diff --git a/internal/session/query_test.go b/internal/session/query_test.go index 69d381a31..26c022bdc 100644 --- a/internal/session/query_test.go +++ b/internal/session/query_test.go @@ -762,18 +762,19 @@ func TestReadMetaAndQueryHelpers(t *testing.T) { createdAt := time.Date(2026, 4, 3, 12, 0, 0, 0, time.UTC) updatedAt := createdAt.Add(time.Minute) info := sessionInfoFromMeta(store.SessionMeta{ - ID: "sess-1", - Name: "stored", - AgentName: "coder", - Provider: "codex", - Model: " gpt-4o ", - WorkspaceID: "ws-1", - State: string(StateStopped), - StopReason: &stopReason, - StopDetail: "deadline exceeded", - ACPSessionID: &acpID, - 
CreatedAt: createdAt, - UpdatedAt: updatedAt, + ID: "sess-1", + Name: "stored", + AgentName: "coder", + Provider: "codex", + Model: " gpt-4o ", + ReasoningEffort: " high ", + WorkspaceID: "ws-1", + State: string(StateStopped), + StopReason: &stopReason, + StopDetail: "deadline exceeded", + ACPSessionID: &acpID, + CreatedAt: createdAt, + UpdatedAt: updatedAt, }) if got := info.ACPSessionID; got != "acp-123" { t.Fatalf("sessionInfoFromMeta().ACPSessionID = %q, want %q", got, "acp-123") @@ -784,6 +785,9 @@ func TestReadMetaAndQueryHelpers(t *testing.T) { if got := info.Model; got != "gpt-4o" { t.Fatalf("sessionInfoFromMeta().Model = %q, want %q", got, "gpt-4o") } + if got := info.ReasoningEffort; got != "high" { + t.Fatalf("sessionInfoFromMeta().ReasoningEffort = %q, want %q", got, "high") + } if got := info.State; got != StateStopped { t.Fatalf("sessionInfoFromMeta().State = %q, want %q", got, StateStopped) } diff --git a/internal/session/runtime_overrides.go b/internal/session/runtime_overrides.go new file mode 100644 index 000000000..a2972e5cf --- /dev/null +++ b/internal/session/runtime_overrides.go @@ -0,0 +1,27 @@ +package session + +import ( + "fmt" + "slices" + "strings" +) + +// SupportedReasoningEfforts is the canonical ordered enum accepted by session creation. +var SupportedReasoningEfforts = []string{"minimal", "low", "medium", "high", "xhigh"} + +// IsSupportedReasoningEffort reports whether value is an accepted reasoning effort. +func IsSupportedReasoningEffort(value string) bool { + return slices.Contains(SupportedReasoningEfforts, strings.TrimSpace(value)) +} + +// ValidateReasoningEffort validates one reasoning effort override. 
+func ValidateReasoningEffort(value string) error { + trimmed := strings.TrimSpace(value) + if trimmed == "" || IsSupportedReasoningEffort(trimmed) { + return nil + } + return fmt.Errorf( + "%w: reasoning_effort must be one of minimal, low, medium, high, xhigh", + ErrInvalidRuntimeOverride, + ) +} diff --git a/internal/session/session.go b/internal/session/session.go index fa66a2213..8e4dd8439 100644 --- a/internal/session/session.go +++ b/internal/session/session.go @@ -52,6 +52,7 @@ type Info struct { AgentName string Provider string Model string + ReasoningEffort string WorkspaceID string Workspace string Channel string @@ -81,6 +82,7 @@ type Session struct { AgentName string Provider string Model string + ReasoningEffort string WorkspaceID string Workspace string Channel string @@ -125,12 +127,18 @@ func (s *Session) Info() *Info { s.mu.RLock() defer s.mu.RUnlock() + acpCaps := cloneCaps(s.ACPCaps) + if s.process != nil { + acpCaps = cloneCaps(s.process.CapsSnapshot()) + } + return &Info{ ID: s.ID, Name: s.Name, AgentName: s.AgentName, Provider: s.Provider, Model: s.Model, + ReasoningEffort: s.ReasoningEffort, WorkspaceID: s.WorkspaceID, Workspace: s.Workspace, Channel: s.Channel, @@ -141,7 +149,7 @@ func (s *Session) Info() *Info { StopDetail: s.stopDetail, Failure: store.CloneSessionFailure(s.failure), ACPSessionID: s.ACPSessionID, - ACPCaps: cloneCaps(s.ACPCaps), + ACPCaps: acpCaps, Liveness: store.CloneSessionLivenessMeta(s.Liveness), Sandbox: cloneSessionSandboxMeta(s.Sandbox), SoulSnapshotID: s.SoulSnapshotID, @@ -390,7 +398,7 @@ func (s *Session) updateFromProcess(proc *AgentProcess, now time.Time) { s.process = proc if proc != nil { s.ACPSessionID = strings.TrimSpace(proc.SessionID) - s.ACPCaps = cloneCaps(proc.Caps) + s.ACPCaps = cloneCaps(proc.CapsSnapshot()) if s.Liveness == nil { s.Liveness = &store.SessionLivenessMeta{} } @@ -833,6 +841,7 @@ func (s *Session) Meta() store.SessionMeta { AgentName: s.AgentName, Provider: s.Provider, Model: s.Model, + 
ReasoningEffort: s.ReasoningEffort, WorkspaceID: s.WorkspaceID, Channel: s.Channel, SessionType: string(normalizeSessionType(s.Type)), @@ -885,11 +894,7 @@ func canTransition(current State, next State) bool { } func cloneCaps(caps acp.Caps) acp.Caps { - return acp.Caps{ - SupportsLoadSession: caps.SupportsLoadSession, - SupportedModes: append([]string(nil), caps.SupportedModes...), - SupportedModels: append([]string(nil), caps.SupportedModels...), - } + return acp.CloneCaps(caps) } func stringPointer(value string) *string { diff --git a/internal/settings/collections.go b/internal/settings/collections.go index 2a8c83a37..d6b4bcb6a 100644 --- a/internal/settings/collections.go +++ b/internal/settings/collections.go @@ -241,7 +241,7 @@ func (s *service) buildProviderItems(ctx context.Context, cfg *aghconfig.Config) } } - items = append(items, cloneProviderItem(item)) + items = append(items, cloneProviderItem(&item)) } return items, nil } @@ -250,7 +250,7 @@ func providerSettingsFromConfig(name string, provider aghconfig.ProviderConfig) return ProviderSettings{ Command: provider.Command, DisplayName: provider.DisplayName, - DefaultModel: provider.DefaultModel, + Models: cloneProviderModelsConfig(provider.Models), Harness: provider.EffectiveHarness(), RuntimeProvider: provider.RuntimeProviderName(name), Transport: strings.TrimSpace(provider.Transport), @@ -498,7 +498,11 @@ func (s *service) putProvider( } if _, err := aghconfig.EditConfigOverlay(s.homePaths, "", target, func(editor *aghconfig.OverlayEditor) error { - return editor.SetTable([]string{"providers", name}, values) + path := []string{"providers", name} + if err := editor.Delete(path); err != nil { + return err + } + return editor.SetTable(path, values) }); err != nil { return MutationResult{}, fmt.Errorf("settings: write provider %q: %w", name, err) } @@ -1098,8 +1102,8 @@ func providerSettingsMap(settings ProviderSettings) map[string]any { if strings.TrimSpace(settings.DisplayName) != "" { 
values["display_name"] = strings.TrimSpace(settings.DisplayName) } - if strings.TrimSpace(settings.DefaultModel) != "" { - values["default_model"] = strings.TrimSpace(settings.DefaultModel) + if models := providerModelsSettingsMap(settings.Models); len(models) > 0 { + values["models"] = models } if settings.Harness != "" { values["harness"] = string(settings.Harness) @@ -1134,6 +1138,83 @@ func providerSettingsMap(settings ProviderSettings) map[string]any { return values } +func providerModelsSettingsMap(models aghconfig.ProviderModelsConfig) map[string]any { + values := make(map[string]any) + if strings.TrimSpace(models.Default) != "" { + values["default"] = strings.TrimSpace(models.Default) + } + if len(models.Curated) > 0 { + values["curated"] = providerModelConfigMaps(models.Curated) + } + if discovery := providerModelsDiscoveryMap(models.Discovery); len(discovery) > 0 { + values["discovery"] = discovery + } + return values +} + +func providerModelConfigMaps(models []aghconfig.ProviderModelConfig) []map[string]any { + values := make([]map[string]any, 0, len(models)) + for _, model := range models { + entry := make(map[string]any) + if strings.TrimSpace(model.ID) != "" { + entry["id"] = strings.TrimSpace(model.ID) + } + if strings.TrimSpace(model.DisplayName) != "" { + entry["display_name"] = strings.TrimSpace(model.DisplayName) + } + if model.ContextWindow != nil { + entry["context_window"] = *model.ContextWindow + } + if model.MaxInputTokens != nil { + entry["max_input_tokens"] = *model.MaxInputTokens + } + if model.MaxOutputTokens != nil { + entry["max_output_tokens"] = *model.MaxOutputTokens + } + if model.SupportsTools != nil { + entry["supports_tools"] = *model.SupportsTools + } + if model.SupportsReasoning != nil { + entry["supports_reasoning"] = *model.SupportsReasoning + } + if len(model.ReasoningEfforts) > 0 { + entry["reasoning_efforts"] = cloneStringSlicePreserveNil(model.ReasoningEfforts) + } + if strings.TrimSpace(model.DefaultReasoningEffort) != 
"" { + entry["default_reasoning_effort"] = strings.TrimSpace(model.DefaultReasoningEffort) + } + if model.CostInputPerMillion != nil { + entry["cost_input_per_million"] = *model.CostInputPerMillion + } + if model.CostOutputPerMillion != nil { + entry["cost_output_per_million"] = *model.CostOutputPerMillion + } + values = append(values, entry) + } + return values +} + +func providerModelsDiscoveryMap(discovery aghconfig.ProviderModelsDiscoveryConfig) map[string]any { + values := make(map[string]any) + if discovery.Enabled != nil { + values["enabled"] = *discovery.Enabled + } + if strings.TrimSpace(discovery.Command) != "" { + values["command"] = strings.TrimSpace(discovery.Command) + } + if strings.TrimSpace(discovery.Endpoint) != "" { + values["endpoint"] = strings.TrimSpace(discovery.Endpoint) + } + if strings.TrimSpace(discovery.Timeout) != "" { + values["timeout"] = strings.TrimSpace(discovery.Timeout) + } + return values +} + +func boolPtr(value bool) *bool { + return &value +} + func providerCredentialSlotMaps(slots []aghconfig.ProviderCredentialSlot) []map[string]any { values := make([]map[string]any, 0, len(slots)) for _, slot := range slots { diff --git a/internal/settings/models.go b/internal/settings/models.go index c270482b7..49690a5cf 100644 --- a/internal/settings/models.go +++ b/internal/settings/models.go @@ -464,7 +464,8 @@ type SourceMetadata struct { type ProviderSettings struct { Command string DisplayName string - DefaultModel string + Models aghconfig.ProviderModelsConfig + ModelsSet bool Harness aghconfig.ProviderHarness RuntimeProvider string Transport string @@ -633,20 +634,100 @@ func cloneSourceMetadata(value SourceMetadata) SourceMetadata { } func cloneProviderSettings(value ProviderSettings) ProviderSettings { + value.Models = cloneProviderModelsConfig(value.Models) value.CredentialSlots = append([]aghconfig.ProviderCredentialSlot(nil), value.CredentialSlots...) 
return value } -func cloneProviderItem(value ProviderItem) ProviderItem { - value.Settings = cloneProviderSettings(value.Settings) - value.Credentials = append([]ProviderCredentialStatus(nil), value.Credentials...) - value.SourceMetadata = cloneSourceMetadata(value.SourceMetadata) +func cloneProviderModelsConfig(value aghconfig.ProviderModelsConfig) aghconfig.ProviderModelsConfig { + return aghconfig.ProviderModelsConfig{ + Default: value.Default, + Curated: cloneProviderModelConfigs(value.Curated), + Discovery: cloneProviderModelsDiscoveryConfig(value.Discovery), + } +} + +func cloneProviderModelsDiscoveryConfig( + value aghconfig.ProviderModelsDiscoveryConfig, +) aghconfig.ProviderModelsDiscoveryConfig { + return aghconfig.ProviderModelsDiscoveryConfig{ + Enabled: cloneBoolPtr(value.Enabled), + Command: value.Command, + Endpoint: value.Endpoint, + Timeout: value.Timeout, + } +} + +func cloneProviderModelConfigs(values []aghconfig.ProviderModelConfig) []aghconfig.ProviderModelConfig { + if values == nil { + return nil + } + cloned := make([]aghconfig.ProviderModelConfig, len(values)) + for idx, value := range values { + cloned[idx] = aghconfig.ProviderModelConfig{ + ID: value.ID, + DisplayName: value.DisplayName, + ContextWindow: cloneInt64Ptr(value.ContextWindow), + MaxInputTokens: cloneInt64Ptr(value.MaxInputTokens), + MaxOutputTokens: cloneInt64Ptr(value.MaxOutputTokens), + SupportsTools: cloneBoolPtr(value.SupportsTools), + SupportsReasoning: cloneBoolPtr(value.SupportsReasoning), + ReasoningEfforts: cloneStringSlicePreserveNil(value.ReasoningEfforts), + DefaultReasoningEffort: value.DefaultReasoningEffort, + CostInputPerMillion: cloneFloat64Ptr(value.CostInputPerMillion), + CostOutputPerMillion: cloneFloat64Ptr(value.CostOutputPerMillion), + } + } + return cloned +} + +func cloneInt64Ptr(value *int64) *int64 { + if value == nil { + return nil + } + cloned := *value + return &cloned +} + +func cloneFloat64Ptr(value *float64) *float64 { + if value == nil { + 
return nil + } + cloned := *value + return &cloned +} + +func cloneStringSlicePreserveNil(value []string) []string { + if value == nil { + return nil + } + cloned := make([]string, len(value)) + copy(cloned, value) + return cloned +} + +func cloneBoolPtr(value *bool) *bool { + if value == nil { + return nil + } + cloned := *value + return &cloned +} + +func cloneProviderItem(value *ProviderItem) ProviderItem { + if value == nil { + return ProviderItem{} + } + cloned := *value + cloned.Settings = cloneProviderSettings(value.Settings) + cloned.Credentials = append([]ProviderCredentialStatus(nil), value.Credentials...) + cloned.SourceMetadata = cloneSourceMetadata(value.SourceMetadata) if value.Fallback != nil { fallback := *value.Fallback fallback.Settings = cloneProviderSettings(fallback.Settings) - value.Fallback = &fallback + cloned.Fallback = &fallback } - return value + return cloned } func cloneMCPServerItem(value MCPServerItem) MCPServerItem { diff --git a/internal/settings/service_test.go b/internal/settings/service_test.go index 4c4816f3d..a06643ed9 100644 --- a/internal/settings/service_test.go +++ b/internal/settings/service_test.go @@ -666,11 +666,19 @@ func TestListCollectionBuildsProvidersSandboxesAndHooks(t *testing.T) { writeFile(t, homePaths.ConfigFile, baseSettingsConfig()+` [providers.codex] -default_model = "gpt-5" +[providers.codex.models] +default = "gpt-5" +[[providers.codex.models.curated]] +id = "gpt-5" +display_name = "GPT-5" +[[providers.codex.models.curated]] +id = "gpt-5-mini" +display_name = "GPT-5 Mini" [providers.custom] command = "custom-acp --stdio" - default_model = "custom-model" + [providers.custom.models] + default = "custom-model" [[providers.custom.credential_slots]] name = "api_key" target_env = "CUSTOM_API_KEY" @@ -715,9 +723,18 @@ command = "/bin/ship" t.Fatalf("ListCollection(providers) error = %v", err) } codex := mustFindProviderItem(t, providers.Providers, "codex") - if got, want := codex.Settings.DefaultModel, "gpt-5"; 
got != want { + if got, want := codex.Settings.Models.Default, "gpt-5"; got != want { t.Fatalf("codex default model = %q, want %q", got, want) } + if got, want := len(codex.Settings.Models.Curated), 2; got != want { + t.Fatalf("codex curated model count = %d, want %d", got, want) + } + if got, want := codex.Settings.Models.Curated[0].ID, "gpt-5"; got != want { + t.Fatalf("codex curated[0].ID = %q, want %q", got, want) + } + if got, want := codex.Settings.Models.Curated[1].ID, "gpt-5-mini"; got != want { + t.Fatalf("codex curated[1].ID = %q, want %q", got, want) + } if !codex.Default { t.Fatal("codex default = false, want true") } @@ -782,8 +799,21 @@ func TestCollectionMutationsProviderSandboxAndHook(t *testing.T) { CollectionRequest: CollectionRequest{Collection: CollectionProviders}, Name: "custom", Provider: &ProviderSettings{ - Command: "custom-acp --stdio", - DefaultModel: "custom-model", + Command: "custom-acp --stdio", + Models: aghconfig.ProviderModelsConfig{ + Default: "custom-model", + Curated: []aghconfig.ProviderModelConfig{ + { + ID: "custom-model", + DisplayName: "Custom Model", + SupportsReasoning: boolPtr(true), + ReasoningEfforts: []string{"low", "high"}, + DefaultReasoningEffort: "high", + SupportsTools: boolPtr(true), + }, + {ID: "custom-fast", DisplayName: "Custom Fast"}, + }, + }, CredentialSlots: []aghconfig.ProviderCredentialSlot{ { Name: "api_key", @@ -806,9 +836,32 @@ func TestCollectionMutationsProviderSandboxAndHook(t *testing.T) { } configPayload := readFile(t, homePaths.ConfigFile) if !strings.Contains(configPayload, "[providers.custom]") || - !strings.Contains(configPayload, `default_model = "custom-model"`) { + !strings.Contains(configPayload, "[providers.custom.models]") || + !strings.Contains(configPayload, `default = "custom-model"`) || + !strings.Contains(configPayload, `[[providers.custom.models.curated]]`) || + !strings.Contains(configPayload, `id = "custom-model"`) || + !strings.Contains(configPayload, `reasoning_efforts = 
["low", "high"]`) { t.Fatalf("config payload missing provider overlay:\n%s", configPayload) } + clearModelsResult, err := service.PutCollectionItem(ctx, CollectionItemPutRequest{ + CollectionRequest: CollectionRequest{Collection: CollectionProviders}, + Name: "custom", + Provider: &ProviderSettings{ + Command: "custom-acp --stdio", + ModelsSet: true, + }, + }) + if err != nil { + t.Fatalf("PutCollectionItem(clear provider models) error = %v", err) + } + if got, want := clearModelsResult.WriteTarget, WriteTargetGlobalConfig; got != want { + t.Fatalf("clear provider models write target = %q, want %q", got, want) + } + configPayload = readFile(t, homePaths.ConfigFile) + if strings.Contains(configPayload, `default = "custom-model"`) || + strings.Contains(configPayload, `[[providers.custom.models.curated]]`) { + t.Fatalf("config payload still contains provider model overlay after clear:\n%s", configPayload) + } if _, err := service.DeleteCollectionItem(ctx, CollectionItemDeleteRequest{ CollectionRequest: CollectionRequest{Collection: CollectionProviders}, Name: "custom", @@ -2014,9 +2067,10 @@ command = "/bin/echo" func mustFindProviderItem(t *testing.T, items []ProviderItem, name string) ProviderItem { t.Helper() - for _, item := range items { + for idx := range items { + item := &items[idx] if item.Name == name { - return item + return *item } } t.Fatalf("Provider item %q not found in %#v", name, items) diff --git a/internal/situation/service.go b/internal/situation/service.go index 4f7be71ec..3efb37d60 100644 --- a/internal/situation/service.go +++ b/internal/situation/service.go @@ -387,7 +387,7 @@ func (s *Service) resolveAgent( model := strings.TrimSpace(agent.Model) if provider != "" && model == "" { if providerConfig, err := workspaceSnapshot.Config.ResolveProvider(provider); err == nil { - model = strings.TrimSpace(providerConfig.DefaultModel) + model = strings.TrimSpace(providerConfig.Models.Default) } } return aghconfig.ResolvedAgent{ diff --git 
a/internal/store/globaldb/global_db.go b/internal/store/globaldb/global_db.go index 6ccf6478c..69d4a7de5 100644 --- a/internal/store/globaldb/global_db.go +++ b/internal/store/globaldb/global_db.go @@ -684,6 +684,7 @@ var globalSchemaStatements = appendSchemaStatements( }, bridgeTaskSubscriptionSchemaStatements(), resources.SchemaStatements(), + modelCatalogSchemaStatements(), ) func appendSchemaStatements(groups ...[]string) []string { @@ -841,40 +842,46 @@ var globalSchemaMigrations = []store.Migration{ }, { Version: 17, - Name: "rebuild_network_conversation_containers", - Up: migrateNetworkConversationContainers, - Checksum: "2026-05-05-rebuild-network-conversation-containers", - }, - { - Version: 18, Name: "add_task_orchestration_profile_schema", Up: migrateTaskOrchestrationProfileSchema, Checksum: "2026-05-05-add-task-orchestration-profile-schema", }, { - Version: 19, + Version: 18, Name: "add_task_review_gate_schema", Up: migrateTaskReviewGateSchema, Checksum: "2026-05-05-add-task-review-gate-schema", }, { - Version: 20, + Version: 19, Name: "add_notification_cursors", Up: migrateNotificationCursors, Checksum: "2026-05-05-add-notification-cursors", }, { - Version: 21, + Version: 20, Name: "add_bridge_task_subscriptions", Up: migrateBridgeTaskSubscriptions, Checksum: "2026-05-05-add-bridge-task-subscriptions", }, + { + Version: 21, + Name: "rebuild_network_conversation_containers", + Up: migrateNetworkConversationContainers, + Checksum: "2026-05-05-rebuild-network-conversation-containers", + }, { Version: 22, Name: "memv2_memory_events", Up: migrateMemoryV2Events, Checksum: "2026-05-05-memv2-memory-events", }, + { + Version: 23, + Name: "add_model_catalog_persistence", + Up: migrateModelCatalogPersistence, + Checksum: "2026-05-07-add-model-catalog-persistence", + }, } func migrateNetworkConversationContainers(ctx context.Context, tx *sql.Tx) error { diff --git a/internal/store/globaldb/global_db_heartbeat_test.go 
b/internal/store/globaldb/global_db_heartbeat_test.go index c858688c9..0c13a6c2b 100644 --- a/internal/store/globaldb/global_db_heartbeat_test.go +++ b/internal/store/globaldb/global_db_heartbeat_test.go @@ -134,36 +134,7 @@ func TestGlobalDBHeartbeatMigration(t *testing.T) { if got, want := len(records), len(globalSchemaMigrations); got != want { t.Fatalf("len(records) = %d, want %d", got, want) } - if records[11].Version != 12 || records[11].Name != "add_agent_soul_snapshots" { - t.Fatalf("records[11] = %#v, want Soul v12 before Heartbeat", records[11]) - } - if records[12].Version != 13 || records[12].Name != "add_agent_heartbeat_storage" { - t.Fatalf("records[12] = %#v, want Heartbeat storage v13", records[12]) - } - if records[13].Version != 14 || records[13].Name != "add_event_summary_lineage" { - t.Fatalf("records[13] = %#v, want event summary lineage v14", records[13]) - } - if records[14].Version != 15 || records[14].Name != "rebuild_event_summaries_for_global_payloads" { - t.Fatalf("records[14] = %#v, want event summary global payloads v15", records[14]) - } - if records[15].Version != 16 || records[15].Name != "rename_actor_ref_columns_to_actor_id" { - t.Fatalf("records[15] = %#v, want actor_id rename v16", records[15]) - } - if records[16].Version != 17 || records[16].Name != "rebuild_network_conversation_containers" { - t.Fatalf("records[16] = %#v, want network conversation containers v17", records[16]) - } - if records[17].Version != 18 || records[17].Name != "add_task_orchestration_profile_schema" { - t.Fatalf("records[17] = %#v, want task orchestration profile schema v18", records[17]) - } - if records[18].Version != 19 || records[18].Name != "add_task_review_gate_schema" { - t.Fatalf("records[18] = %#v, want task review gate schema v19", records[18]) - } - if records[19].Version != 20 || records[19].Name != "add_notification_cursors" { - t.Fatalf("records[19] = %#v, want notification cursors v20", records[19]) - } - if records[20].Version != 21 || 
records[20].Name != "add_bridge_task_subscriptions" { - t.Fatalf("records[20] = %#v, want bridge task subscriptions v21", records[20]) - } + assertAppliedGlobalMigrationOrder(t, records) }) t.Run("Should return wrapped errors and not mark failed Heartbeat migrations successful", func(t *testing.T) { diff --git a/internal/store/globaldb/global_db_model_catalog.go b/internal/store/globaldb/global_db_model_catalog.go new file mode 100644 index 000000000..272f90a5a --- /dev/null +++ b/internal/store/globaldb/global_db_model_catalog.go @@ -0,0 +1,781 @@ +package globaldb + +import ( + "context" + "database/sql" + "fmt" + "strings" + "time" + + "github.com/pedronauck/agh/internal/modelcatalog" + "github.com/pedronauck/agh/internal/store" +) + +var _ modelcatalog.Store = (*GlobalDB)(nil) + +type modelCatalogSQLExecutor interface { + ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error) + QueryContext(ctx context.Context, query string, args ...any) (*sql.Rows, error) + QueryRowContext(ctx context.Context, query string, args ...any) *sql.Row +} + +type modelCatalogRowKey struct { + sourceID string + providerID string + modelID string +} + +// ReplaceSourceRows atomically replaces all model rows and status for one provider-scoped source. 
+func (g *GlobalDB) ReplaceSourceRows( + ctx context.Context, + sourceID string, + providerID string, + rows []modelcatalog.ModelRow, + status modelcatalog.SourceStatus, +) error { + if err := g.checkReady(ctx, "replace model catalog source rows"); err != nil { + return err + } + normalizedRows, normalizedStatus, err := normalizeModelCatalogReplacement(sourceID, providerID, rows, status) + if err != nil { + return err + } + + return g.withModelCatalogImmediateTransaction( + ctx, + "model catalog source replacement", + func(exec modelCatalogSQLExecutor) error { + if err := upsertModelCatalogSourceStatus(ctx, exec, normalizedStatus); err != nil { + return err + } + if _, err := exec.ExecContext( + ctx, + `DELETE FROM model_catalog_reasoning_efforts WHERE source_id = ? AND provider_id = ?`, + normalizedStatus.SourceID, + normalizedStatus.ProviderID, + ); err != nil { + return fmt.Errorf("store: delete model catalog reasoning efforts: %w", err) + } + if _, err := exec.ExecContext( + ctx, + `DELETE FROM model_catalog_rows WHERE source_id = ? AND provider_id = ?`, + normalizedStatus.SourceID, + normalizedStatus.ProviderID, + ); err != nil { + return fmt.Errorf("store: delete model catalog source rows: %w", err) + } + for _, row := range normalizedRows { + if err := insertModelCatalogRow(ctx, exec, row); err != nil { + return err + } + if err := insertModelCatalogReasoningEfforts(ctx, exec, row); err != nil { + return err + } + } + return nil + }, + ) +} + +// ListRows returns deterministic catalog source rows matching the query. 
+func (g *GlobalDB) ListRows( + ctx context.Context, + opts modelcatalog.ListOptions, +) (catalogRows []modelcatalog.ModelRow, err error) { + if err := g.checkReady(ctx, "list model catalog rows"); err != nil { + return nil, err + } + sqlQuery := `SELECT + source_id, + provider_id, + model_id, + source_kind, + priority, + available, + stale, + refreshed_at, + expires_at, + display_name, + context_window, + max_input_tokens, + max_output_tokens, + supports_tools, + supports_reasoning, + default_reasoning_effort, + cost_input_per_million, + cost_output_per_million, + last_error + FROM model_catalog_rows` + where, args := modelCatalogRowFilterClauses(opts, "") + sqlQuery = store.AppendWhere(sqlQuery, where) + sqlQuery += ` ORDER BY provider_id ASC, model_id ASC, priority DESC, refreshed_at DESC, source_id ASC` + + rows, err := g.db.QueryContext(ctx, sqlQuery, args...) + if err != nil { + return nil, fmt.Errorf("store: query model catalog rows: %w", err) + } + defer func() { + if closeErr := joinRowsCloseError(rows, nil, "model catalog row query"); closeErr != nil && err == nil { + err = closeErr + } + }() + + catalogRows = make([]modelcatalog.ModelRow, 0) + for rows.Next() { + row, scanErr := scanModelCatalogRow(rows) + if scanErr != nil { + return nil, scanErr + } + catalogRows = append(catalogRows, row) + } + if err := rows.Err(); err != nil { + return nil, fmt.Errorf("store: iterate model catalog rows: %w", err) + } + + efforts, err := listModelCatalogReasoningEfforts(ctx, g.db, opts) + if err != nil { + return nil, err + } + for index := range catalogRows { + key := modelCatalogKey(catalogRows[index].SourceID, catalogRows[index].ProviderID, catalogRows[index].ModelID) + catalogRows[index].ReasoningEfforts = efforts[key] + } + return catalogRows, nil +} + +// ListSourceStatus returns provider-scoped source status rows. 
+func (g *GlobalDB) ListSourceStatus( + ctx context.Context, + providerID string, +) (statuses []modelcatalog.SourceStatus, err error) { + if err := g.checkReady(ctx, "list model catalog source status"); err != nil { + return nil, err + } + sqlQuery := `SELECT + source_id, + provider_id, + source_kind, + priority, + refresh_state, + last_refresh_at, + next_refresh_at, + last_success_at, + last_error, + row_count, + stale + FROM model_catalog_sources` + where, args := store.BuildClauses(store.StringClause("provider_id", providerID)) + sqlQuery = store.AppendWhere(sqlQuery, where) + sqlQuery += ` ORDER BY provider_id ASC, source_id ASC` + + rows, err := g.db.QueryContext(ctx, sqlQuery, args...) + if err != nil { + return nil, fmt.Errorf("store: query model catalog source status: %w", err) + } + defer func() { + if closeErr := joinRowsCloseError( + rows, + nil, + "model catalog source status query", + ); closeErr != nil && + err == nil { + err = closeErr + } + }() + + statuses = make([]modelcatalog.SourceStatus, 0) + for rows.Next() { + status, scanErr := scanModelCatalogSourceStatus(rows) + if scanErr != nil { + return nil, scanErr + } + statuses = append(statuses, status) + } + if err := rows.Err(); err != nil { + return nil, fmt.Errorf("store: iterate model catalog source status: %w", err) + } + return statuses, nil +} + +func normalizeModelCatalogReplacement( + sourceID string, + providerID string, + rows []modelcatalog.ModelRow, + status modelcatalog.SourceStatus, +) ([]modelcatalog.ModelRow, modelcatalog.SourceStatus, error) { + trimmedSourceID, err := requireModelCatalogValue(sourceID, "source id") + if err != nil { + return nil, modelcatalog.SourceStatus{}, err + } + trimmedProviderID, err := requireModelCatalogValue(providerID, "provider id") + if err != nil { + return nil, modelcatalog.SourceStatus{}, err + } + normalizedStatus, err := normalizeModelCatalogStatus(trimmedSourceID, trimmedProviderID, status) + if err != nil { + return nil, 
modelcatalog.SourceStatus{}, err + } + + normalizedRows := make([]modelcatalog.ModelRow, 0, len(rows)) + for index, row := range rows { + normalizedRow, err := normalizeModelCatalogRow(trimmedSourceID, trimmedProviderID, normalizedStatus, row) + if err != nil { + return nil, modelcatalog.SourceStatus{}, fmt.Errorf("store: normalize model catalog row %d: %w", index, err) + } + if normalizedStatus.Priority == 0 { + normalizedStatus.Priority = normalizedRow.Priority + } + if normalizedRow.Priority != normalizedStatus.Priority { + return nil, modelcatalog.SourceStatus{}, fmt.Errorf( + "store: model catalog row %q priority %d does not match source priority %d", + normalizedRow.ModelID, + normalizedRow.Priority, + normalizedStatus.Priority, + ) + } + normalizedRows = append(normalizedRows, normalizedRow) + } + normalizedStatus.RowCount = len(normalizedRows) + return normalizedRows, normalizedStatus, nil +} + +func normalizeModelCatalogStatus( + sourceID string, + providerID string, + status modelcatalog.SourceStatus, +) (modelcatalog.SourceStatus, error) { + normalized := status + if normalized.SourceID = strings.TrimSpace(normalized.SourceID); normalized.SourceID == "" { + normalized.SourceID = sourceID + } + if normalized.ProviderID = strings.TrimSpace(normalized.ProviderID); normalized.ProviderID == "" { + normalized.ProviderID = providerID + } + if normalized.SourceID != sourceID { + return modelcatalog.SourceStatus{}, fmt.Errorf( + "store: model catalog status source id %q does not match %q", + normalized.SourceID, + sourceID, + ) + } + if normalized.ProviderID != providerID { + return modelcatalog.SourceStatus{}, fmt.Errorf( + "store: model catalog status provider id %q does not match %q", + normalized.ProviderID, + providerID, + ) + } + normalized.SourceKind = modelcatalog.SourceKind(strings.TrimSpace(string(normalized.SourceKind))) + if normalized.SourceKind == "" { + return modelcatalog.SourceStatus{}, fmt.Errorf("store: model catalog status source kind is 
required") + } + if err := modelcatalog.ValidateSourceIdentity(normalized.SourceID, normalized.SourceKind); err != nil { + return modelcatalog.SourceStatus{}, fmt.Errorf("store: validate model catalog source identity: %w", err) + } + normalized.RefreshState = strings.TrimSpace(normalized.RefreshState) + if normalized.RefreshState == "" { + normalized.RefreshState = string(modelcatalog.RefreshStateIdle) + } + normalized.LastError = strings.TrimSpace(normalized.LastError) + if normalized.RowCount < 0 { + return modelcatalog.SourceStatus{}, fmt.Errorf( + "store: model catalog status row count %d is invalid", + normalized.RowCount, + ) + } + return normalized, nil +} + +func normalizeModelCatalogRow( + sourceID string, + providerID string, + status modelcatalog.SourceStatus, + row modelcatalog.ModelRow, +) (modelcatalog.ModelRow, error) { + normalized := row + if normalized.SourceID = strings.TrimSpace(normalized.SourceID); normalized.SourceID == "" { + normalized.SourceID = sourceID + } + if normalized.ProviderID = strings.TrimSpace(normalized.ProviderID); normalized.ProviderID == "" { + normalized.ProviderID = providerID + } + if normalized.SourceID != sourceID { + return modelcatalog.ModelRow{}, fmt.Errorf( + "source id %q does not match %q", + normalized.SourceID, + sourceID, + ) + } + if normalized.ProviderID != providerID { + return modelcatalog.ModelRow{}, fmt.Errorf( + "provider id %q does not match %q", + normalized.ProviderID, + providerID, + ) + } + modelID, err := requireModelCatalogValue(normalized.ModelID, "model id") + if err != nil { + return modelcatalog.ModelRow{}, err + } + normalized.ModelID = modelID + normalized.SourceKind = modelcatalog.SourceKind(strings.TrimSpace(string(normalized.SourceKind))) + if normalized.SourceKind == "" { + normalized.SourceKind = status.SourceKind + } + if normalized.SourceKind != status.SourceKind { + return modelcatalog.ModelRow{}, fmt.Errorf( + "source kind %q does not match status source kind %q", + 
normalized.SourceKind, + status.SourceKind, + ) + } + normalized.DisplayName = strings.TrimSpace(normalized.DisplayName) + normalized.LastError = strings.TrimSpace(normalized.LastError) + if normalized.DefaultReasoningEffort != nil { + effort := modelcatalog.ReasoningEffort(strings.TrimSpace(string(*normalized.DefaultReasoningEffort))) + if effort == "" { + normalized.DefaultReasoningEffort = nil + } else { + normalized.DefaultReasoningEffort = &effort + } + } + for index, effort := range normalized.ReasoningEfforts { + trimmed := modelcatalog.ReasoningEffort(strings.TrimSpace(string(effort))) + if trimmed == "" { + return modelcatalog.ModelRow{}, fmt.Errorf("reasoning effort %d is required", index) + } + normalized.ReasoningEfforts[index] = trimmed + } + return normalized, nil +} + +func requireModelCatalogValue(value string, field string) (string, error) { + trimmed := strings.TrimSpace(value) + if trimmed == "" { + return "", fmt.Errorf("store: model catalog %s is required", field) + } + return trimmed, nil +} + +func upsertModelCatalogSourceStatus( + ctx context.Context, + exec modelCatalogSQLExecutor, + status modelcatalog.SourceStatus, +) error { + if _, err := exec.ExecContext( + ctx, + `INSERT INTO model_catalog_sources ( + source_id, + provider_id, + source_kind, + priority, + refresh_state, + last_refresh_at, + next_refresh_at, + last_success_at, + last_error, + row_count, + stale + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) 
+ ON CONFLICT(source_id, provider_id) DO UPDATE SET + source_kind = excluded.source_kind, + priority = excluded.priority, + refresh_state = excluded.refresh_state, + last_refresh_at = excluded.last_refresh_at, + next_refresh_at = excluded.next_refresh_at, + last_success_at = excluded.last_success_at, + last_error = excluded.last_error, + row_count = excluded.row_count, + stale = excluded.stale`, + status.SourceID, + status.ProviderID, + string(status.SourceKind), + status.Priority, + status.RefreshState, + store.FormatNullableTimestamp(status.LastRefresh), + store.FormatNullableTimestamp(status.NextRefresh), + store.FormatNullableTimestamp(status.LastSuccess), + status.LastError, + status.RowCount, + boolToSQLiteInt(status.Stale), + ); err != nil { + return fmt.Errorf("store: upsert model catalog source status: %w", err) + } + return nil +} + +func insertModelCatalogRow(ctx context.Context, exec modelCatalogSQLExecutor, row modelcatalog.ModelRow) error { + if _, err := exec.ExecContext( + ctx, + `INSERT INTO model_catalog_rows ( + source_id, + provider_id, + model_id, + source_kind, + priority, + available, + stale, + refreshed_at, + expires_at, + display_name, + context_window, + max_input_tokens, + max_output_tokens, + supports_tools, + supports_reasoning, + default_reasoning_effort, + cost_input_per_million, + cost_output_per_million, + last_error + ) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`, + row.SourceID, + row.ProviderID, + row.ModelID, + string(row.SourceKind), + row.Priority, + nullableBoolToSQLiteInt(row.Available), + boolToSQLiteInt(row.Stale), + store.FormatNullableTimestamp(row.RefreshedAt), + store.FormatNullableTimestamp(row.ExpiresAt), + row.DisplayName, + store.NullableInt64(row.ContextWindow), + store.NullableInt64(row.MaxInputTokens), + store.NullableInt64(row.MaxOutputTokens), + nullableBoolToSQLiteInt(row.SupportsTools), + nullableBoolToSQLiteInt(row.SupportsReasoning), + 
nullableReasoningEffort(row.DefaultReasoningEffort), + store.NullableFloat64(row.CostInputPerMillion), + store.NullableFloat64(row.CostOutputPerMillion), + row.LastError, + ); err != nil { + return fmt.Errorf( + "store: insert model catalog row %q/%q/%q: %w", + row.SourceID, + row.ProviderID, + row.ModelID, + err, + ) + } + return nil +} + +func insertModelCatalogReasoningEfforts( + ctx context.Context, + exec modelCatalogSQLExecutor, + row modelcatalog.ModelRow, +) error { + for rank, effort := range row.ReasoningEfforts { + if _, err := exec.ExecContext( + ctx, + `INSERT INTO model_catalog_reasoning_efforts ( + source_id, + provider_id, + model_id, + effort, + rank + ) VALUES (?, ?, ?, ?, ?)`, + row.SourceID, + row.ProviderID, + row.ModelID, + string(effort), + rank, + ); err != nil { + return fmt.Errorf("store: insert model catalog reasoning effort %q: %w", effort, err) + } + } + return nil +} + +func listModelCatalogReasoningEfforts( + ctx context.Context, + exec modelCatalogSQLExecutor, + opts modelcatalog.ListOptions, +) (efforts map[modelCatalogRowKey][]modelcatalog.ReasoningEffort, err error) { + sqlQuery := `SELECT + e.source_id, + e.provider_id, + e.model_id, + e.effort + FROM model_catalog_reasoning_efforts e + JOIN model_catalog_rows r + ON r.source_id = e.source_id + AND r.provider_id = e.provider_id + AND r.model_id = e.model_id` + where, args := modelCatalogRowFilterClauses(opts, "r") + sqlQuery = store.AppendWhere(sqlQuery, where) + sqlQuery += ` ORDER BY e.source_id ASC, e.provider_id ASC, e.model_id ASC, e.rank ASC, e.effort ASC` + + rows, err := exec.QueryContext(ctx, sqlQuery, args...) 
+ if err != nil { + return nil, fmt.Errorf("store: query model catalog reasoning efforts: %w", err) + } + defer func() { + if closeErr := joinRowsCloseError( + rows, + nil, + "model catalog reasoning effort query", + ); closeErr != nil && + err == nil { + err = closeErr + } + }() + + efforts = make(map[modelCatalogRowKey][]modelcatalog.ReasoningEffort) + for rows.Next() { + var ( + sourceID string + providerID string + modelID string + effort string + ) + if err := rows.Scan(&sourceID, &providerID, &modelID, &effort); err != nil { + return nil, fmt.Errorf("store: scan model catalog reasoning effort: %w", err) + } + key := modelCatalogKey(sourceID, providerID, modelID) + efforts[key] = append(efforts[key], modelcatalog.ReasoningEffort(effort)) + } + if err := rows.Err(); err != nil { + return nil, fmt.Errorf("store: iterate model catalog reasoning efforts: %w", err) + } + return efforts, nil +} + +func scanModelCatalogRow(scanner interface{ Scan(dest ...any) error }) (modelcatalog.ModelRow, error) { + var ( + row modelcatalog.ModelRow + sourceKind string + availableRaw sql.NullInt64 + staleRaw int + refreshedAtRaw string + expiresAtRaw string + contextWindowRaw sql.NullInt64 + maxInputTokensRaw sql.NullInt64 + maxOutputTokensRaw sql.NullInt64 + supportsToolsRaw sql.NullInt64 + supportsReasoningRaw sql.NullInt64 + defaultReasoningEffortRaw sql.NullString + costInputPerMillionRaw sql.NullFloat64 + costOutputPerMillionRaw sql.NullFloat64 + ) + if err := scanner.Scan( + &row.SourceID, + &row.ProviderID, + &row.ModelID, + &sourceKind, + &row.Priority, + &availableRaw, + &staleRaw, + &refreshedAtRaw, + &expiresAtRaw, + &row.DisplayName, + &contextWindowRaw, + &maxInputTokensRaw, + &maxOutputTokensRaw, + &supportsToolsRaw, + &supportsReasoningRaw, + &defaultReasoningEffortRaw, + &costInputPerMillionRaw, + &costOutputPerMillionRaw, + &row.LastError, + ); err != nil { + return modelcatalog.ModelRow{}, fmt.Errorf("store: scan model catalog row: %w", err) + } + row.SourceKind 
= modelcatalog.SourceKind(sourceKind) + available, err := nullableSQLiteIntToBool(availableRaw, "available") + if err != nil { + return modelcatalog.ModelRow{}, err + } + row.Available = available + row.Stale = staleRaw != 0 + if row.RefreshedAt, err = parseOptionalModelCatalogTimestamp(refreshedAtRaw, "refreshed_at"); err != nil { + return modelcatalog.ModelRow{}, err + } + if row.ExpiresAt, err = parseOptionalModelCatalogTimestamp(expiresAtRaw, "expires_at"); err != nil { + return modelcatalog.ModelRow{}, err + } + row.ContextWindow = store.NullInt64(contextWindowRaw) + row.MaxInputTokens = store.NullInt64(maxInputTokensRaw) + row.MaxOutputTokens = store.NullInt64(maxOutputTokensRaw) + if row.SupportsTools, err = nullableSQLiteIntToBool(supportsToolsRaw, "supports_tools"); err != nil { + return modelcatalog.ModelRow{}, err + } + if row.SupportsReasoning, err = nullableSQLiteIntToBool(supportsReasoningRaw, "supports_reasoning"); err != nil { + return modelcatalog.ModelRow{}, err + } + row.DefaultReasoningEffort = nullReasoningEffort(defaultReasoningEffortRaw) + row.CostInputPerMillion = store.NullFloat64(costInputPerMillionRaw) + row.CostOutputPerMillion = store.NullFloat64(costOutputPerMillionRaw) + return row, nil +} + +func scanModelCatalogSourceStatus(scanner interface{ Scan(dest ...any) error }) (modelcatalog.SourceStatus, error) { + var ( + status modelcatalog.SourceStatus + sourceKind string + lastRefreshRaw string + nextRefreshRaw string + lastSuccessRaw string + staleRaw int + ) + if err := scanner.Scan( + &status.SourceID, + &status.ProviderID, + &sourceKind, + &status.Priority, + &status.RefreshState, + &lastRefreshRaw, + &nextRefreshRaw, + &lastSuccessRaw, + &status.LastError, + &status.RowCount, + &staleRaw, + ); err != nil { + return modelcatalog.SourceStatus{}, fmt.Errorf("store: scan model catalog source status: %w", err) + } + var err error + status.SourceKind = modelcatalog.SourceKind(sourceKind) + if status.LastRefresh, err = 
parseOptionalModelCatalogTimestamp(lastRefreshRaw, "last_refresh_at"); err != nil { + return modelcatalog.SourceStatus{}, err + } + if status.NextRefresh, err = parseOptionalModelCatalogTimestamp(nextRefreshRaw, "next_refresh_at"); err != nil { + return modelcatalog.SourceStatus{}, err + } + if status.LastSuccess, err = parseOptionalModelCatalogTimestamp(lastSuccessRaw, "last_success_at"); err != nil { + return modelcatalog.SourceStatus{}, err + } + status.Stale = staleRaw != 0 + return status, nil +} + +func modelCatalogRowFilterClauses(opts modelcatalog.ListOptions, alias string) ([]string, []any) { + column := func(name string) string { + if strings.TrimSpace(alias) == "" { + return name + } + return strings.TrimSpace(alias) + "." + name + } + + where := make([]string, 0, 3) + args := make([]any, 0, 2) + if providerID := strings.TrimSpace(opts.ProviderID); providerID != "" { + where = append(where, column("provider_id")+" = ?") + args = append(args, providerID) + } + if sourceID := strings.TrimSpace(opts.SourceID); sourceID != "" { + where = append(where, column("source_id")+" = ?") + args = append(args, sourceID) + } + if !opts.IncludeStale && !opts.IncludeAll { + where = append(where, column("stale")+" = 0") + } + return where, args +} + +func (g *GlobalDB) withModelCatalogImmediateTransaction( + ctx context.Context, + action string, + run func(exec modelCatalogSQLExecutor) error, +) (err error) { + conn, err := g.db.Conn(ctx) + if err != nil { + return fmt.Errorf("store: open connection for %s: %w", action, err) + } + defer func() { + if closeErr := conn.Close(); closeErr != nil && err == nil { + err = fmt.Errorf("store: close %s transaction connection: %w", action, closeErr) + } + }() + + rollbackCtx := context.WithoutCancel(ctx) + if _, err := conn.ExecContext(ctx, "BEGIN IMMEDIATE"); err != nil { + return fmt.Errorf("store: begin immediate %s transaction: %w", action, err) + } + + finished := false + defer func() { + if !finished { + joinCleanupError(&err, 
rollbackImmediate(rollbackCtx, conn, action)) + } + }() + + if err := run(conn); err != nil { + return err + } + if _, err = conn.ExecContext(ctx, "COMMIT"); err != nil { + return fmt.Errorf("store: commit %s transaction: %w", action, err) + } + finished = true + return nil +} + +func modelCatalogKey(sourceID string, providerID string, modelID string) modelCatalogRowKey { + return modelCatalogRowKey{ + sourceID: sourceID, + providerID: providerID, + modelID: modelID, + } +} + +func boolToSQLiteInt(value bool) int { + if value { + return 1 + } + return 0 +} + +func nullableBoolToSQLiteInt(value *bool) any { + if value == nil { + return nil + } + return boolToSQLiteInt(*value) +} + +func nullableSQLiteIntToBool(value sql.NullInt64, field string) (*bool, error) { + if !value.Valid { + return nil, nil + } + switch value.Int64 { + case 0: + converted := false + return &converted, nil + case 1: + converted := true + return &converted, nil + default: + return nil, fmt.Errorf("store: model catalog %s boolean value %d is invalid", field, value.Int64) + } +} + +func nullableReasoningEffort(value *modelcatalog.ReasoningEffort) any { + if value == nil { + return nil + } + trimmed := strings.TrimSpace(string(*value)) + if trimmed == "" { + return nil + } + return trimmed +} + +func nullReasoningEffort(value sql.NullString) *modelcatalog.ReasoningEffort { + if !value.Valid { + return nil + } + trimmed := strings.TrimSpace(value.String) + if trimmed == "" { + return nil + } + effort := modelcatalog.ReasoningEffort(trimmed) + return &effort +} + +func parseOptionalModelCatalogTimestamp(value string, field string) (time.Time, error) { + parsed, err := store.ParseNullableTimestamp(value) + if err != nil { + return time.Time{}, fmt.Errorf("store: parse model catalog %s: %w", field, err) + } + if parsed == nil { + return time.Time{}, nil + } + return *parsed, nil +} diff --git a/internal/store/globaldb/global_db_model_catalog_test.go 
b/internal/store/globaldb/global_db_model_catalog_test.go new file mode 100644 index 000000000..fdb0c2c62 --- /dev/null +++ b/internal/store/globaldb/global_db_model_catalog_test.go @@ -0,0 +1,938 @@ +package globaldb + +import ( + "context" + "database/sql" + "path/filepath" + "slices" + "strings" + "testing" + "time" + + "github.com/pedronauck/agh/internal/modelcatalog" + "github.com/pedronauck/agh/internal/store" + "github.com/pedronauck/agh/internal/testutil" +) + +const modelCatalogMigrationVersion = 23 + +func TestGlobalDBModelCatalogSchemaMigration(t *testing.T) { + t.Parallel() + + t.Run("Should create model catalog schema on fresh DB", func(t *testing.T) { + t.Parallel() + + globalDB := openTestGlobalDB(t) + + assertModelCatalogSchema(t, globalDB.db) + assertAppliedMigrationVersion(t, globalDB.db, modelCatalogMigrationVersion) + }) + + t.Run("Should upgrade previous global schema by appending model catalog migration", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), GlobalDatabaseName) + previousDB := openPreviousModelCatalogSchemaDB(t, path) + beforeRecords, err := store.AppliedMigrations(ctx, previousDB) + if err != nil { + t.Fatalf("AppliedMigrations(previous) error = %v", err) + } + if got, want := len(beforeRecords), modelCatalogMigrationVersion-1; got != want { + t.Fatalf("len(beforeRecords) = %d, want %d", got, want) + } + exists, err := tableExists(ctx, previousDB, "model_catalog_sources") + if err != nil { + t.Fatalf("tableExists(model_catalog_sources) error = %v", err) + } + if exists { + t.Fatal("model_catalog_sources exists before v23 migration, want absent") + } + if err := previousDB.Close(); err != nil { + t.Fatalf("previousDB.Close() error = %v", err) + } + + globalDB, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(upgrade) error = %v", err) + } + t.Cleanup(func() { + if closeErr := globalDB.Close(testutil.Context(t)); closeErr != nil { + 
t.Errorf("Close(upgrade) error = %v", closeErr) + } + }) + + assertModelCatalogSchema(t, globalDB.db) + records, err := store.AppliedMigrations(ctx, globalDB.db) + if err != nil { + t.Fatalf("AppliedMigrations(upgrade) error = %v", err) + } + if got, want := len(records), len(globalSchemaMigrations); got != want { + t.Fatalf("len(records) = %d, want %d", got, want) + } + if got := records[len(records)-1]; got.Version != modelCatalogMigrationVersion || + got.Name != "add_model_catalog_persistence" { + t.Fatalf("tail migration = %#v, want model catalog v23", got) + } + for index, before := range beforeRecords { + if !records[index].AppliedAt.Equal(before.AppliedAt) { + t.Fatalf( + "migration %d applied_at = %s, want unchanged %s", + before.Version, + records[index].AppliedAt, + before.AppliedAt, + ) + } + } + }) + + t.Run("Should keep model catalog migration record stable after reopen", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), GlobalDatabaseName) + first, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(first) error = %v", err) + } + firstRecords, err := store.AppliedMigrations(ctx, first.db) + if err != nil { + t.Fatalf("AppliedMigrations(first) error = %v", err) + } + if err := first.Close(ctx); err != nil { + t.Fatalf("Close(first) error = %v", err) + } + + second, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(second) error = %v", err) + } + t.Cleanup(func() { + if closeErr := second.Close(testutil.Context(t)); closeErr != nil { + t.Errorf("Close(second) error = %v", closeErr) + } + }) + secondRecords, err := store.AppliedMigrations(ctx, second.db) + if err != nil { + t.Fatalf("AppliedMigrations(second) error = %v", err) + } + if got, want := len(secondRecords), len(firstRecords); got != want { + t.Fatalf("len(secondRecords) = %d, want %d", got, want) + } + for index, firstRecord := range firstRecords { + if 
!secondRecords[index].AppliedAt.Equal(firstRecord.AppliedAt) { + t.Fatalf( + "migration %d applied_at = %s, want unchanged %s", + firstRecord.Version, + secondRecords[index].AppliedAt, + firstRecord.AppliedAt, + ) + } + } + }) + + t.Run("Should keep model catalog migration identity at global registry tail", func(t *testing.T) { + t.Parallel() + + if len(globalSchemaMigrations) < modelCatalogMigrationVersion { + t.Fatalf( + "len(globalSchemaMigrations) = %d, want at least %d", + len(globalSchemaMigrations), + modelCatalogMigrationVersion, + ) + } + tail := globalSchemaMigrations[len(globalSchemaMigrations)-1] + if tail.Version != modelCatalogMigrationVersion || + tail.Name != "add_model_catalog_persistence" || + tail.Checksum != "2026-05-07-add-model-catalog-persistence" { + t.Fatalf( + "tail migration = version %d name %q checksum %q, want model catalog v23", + tail.Version, + tail.Name, + tail.Checksum, + ) + } + if previous := globalSchemaMigrations[len(globalSchemaMigrations)-2]; previous.Version != modelCatalogMigrationVersion-1 { + t.Fatalf("previous migration version = %d, want %d", previous.Version, modelCatalogMigrationVersion-1) + } + }) +} + +func TestGlobalDBModelCatalogStore(t *testing.T) { + t.Parallel() + + t.Run("Should replace source rows and reasoning efforts atomically", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + first := modelCatalogRow("config", "codex", "gpt-5.4", modelcatalog.SourceKindConfig, 120) + first.ReasoningEfforts = []modelcatalog.ReasoningEffort{ + modelcatalog.ReasoningEffortLow, + modelcatalog.ReasoningEffortHigh, + } + replaceModelCatalogRows( + t, + globalDB, + "config", + "codex", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{first}, + ) + + second := modelCatalogRow("config", "codex", "gpt-5.5", modelcatalog.SourceKindConfig, 120) + second.ReasoningEfforts = []modelcatalog.ReasoningEffort{modelcatalog.ReasoningEffortMinimal} + 
replaceModelCatalogRows( + t, + globalDB, + "config", + "codex", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{second}, + ) + + rows, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ) + if err != nil { + t.Fatalf("ListRows() error = %v", err) + } + if got, want := len(rows), 1; got != want { + t.Fatalf("len(rows) = %d, want %d: %#v", got, want, rows) + } + if rows[0].ModelID != "gpt-5.5" || !slices.Equal(rows[0].ReasoningEfforts, second.ReasoningEfforts) { + t.Fatalf("rows[0] = %#v, want replacement row with minimal effort", rows[0]) + } + + var oldEffortCount int + if err := globalDB.db.QueryRowContext( + ctx, + `SELECT COUNT(*) FROM model_catalog_reasoning_efforts WHERE source_id = ? AND provider_id = ? AND model_id = ?`, + "config", + "codex", + "gpt-5.4", + ).Scan(&oldEffortCount); err != nil { + t.Fatalf("QueryRowContext(old efforts) error = %v", err) + } + if oldEffortCount != 0 { + t.Fatalf("old effort count = %d, want 0", oldEffortCount) + } + statuses, err := globalDB.ListSourceStatus(ctx, "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } + if len(statuses) != 1 || statuses[0].RowCount != 1 { + t.Fatalf("statuses = %#v, want one status with row_count 1", statuses) + } + }) + + t.Run("Should roll back source replacement when reasoning effort insert fails", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + original := modelCatalogRow("config", "codex", "gpt-5.4", modelcatalog.SourceKindConfig, 120) + original.ReasoningEfforts = []modelcatalog.ReasoningEffort{modelcatalog.ReasoningEffortLow} + replaceModelCatalogRows( + t, + globalDB, + "config", + "codex", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{original}, + ) + + invalid := modelCatalogRow("config", "codex", "gpt-5.5", modelcatalog.SourceKindConfig, 120) + invalid.ReasoningEfforts = 
[]modelcatalog.ReasoningEffort{ + modelcatalog.ReasoningEffortHigh, + modelcatalog.ReasoningEffortHigh, + } + err := globalDB.ReplaceSourceRows( + ctx, + "config", + "codex", + []modelcatalog.ModelRow{invalid}, + modelCatalogStatus("config", "codex", modelcatalog.SourceKindConfig, 120), + ) + if err == nil { + t.Fatal("ReplaceSourceRows(duplicate efforts) error = nil, want constraint error") + } + + rows, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ) + if err != nil { + t.Fatalf("ListRows(after failed replace) error = %v", err) + } + if len(rows) != 1 || rows[0].ModelID != "gpt-5.4" || + !slices.Equal(rows[0].ReasoningEfforts, original.ReasoningEfforts) { + t.Fatalf("rows after failed replace = %#v, want original row preserved", rows) + } + statuses, err := globalDB.ListSourceStatus(ctx, "codex") + if err != nil { + t.Fatalf("ListSourceStatus(after failed replace) error = %v", err) + } + if len(statuses) != 1 || statuses[0].RowCount != 1 { + t.Fatalf("statuses after failed replace = %#v, want original row_count 1", statuses) + } + }) + + t.Run("Should filter rows by provider source and stale flag", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + fresh := modelCatalogRow("config", "codex", "codex-fresh", modelcatalog.SourceKindConfig, 120) + stale := modelCatalogRow("config", "codex", "codex-stale", modelcatalog.SourceKindConfig, 120) + stale.Stale = true + replaceModelCatalogRows( + t, + globalDB, + "config", + "codex", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{fresh, stale}, + ) + dev := modelCatalogRow("models_dev", "codex", "codex-dev", modelcatalog.SourceKindModelsDev, 50) + replaceModelCatalogRows( + t, + globalDB, + "models_dev", + "codex", + modelcatalog.SourceKindModelsDev, + 50, + []modelcatalog.ModelRow{dev}, + ) + claude := modelCatalogRow("config", "claude", "claude-fresh", 
modelcatalog.SourceKindConfig, 120) + replaceModelCatalogRows( + t, + globalDB, + "config", + "claude", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{claude}, + ) + + rows, err := globalDB.ListRows(ctx, modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config"}) + if err != nil { + t.Fatalf("ListRows(fresh config) error = %v", err) + } + assertModelCatalogModelIDs(t, rows, []string{"codex-fresh"}) + + rows, err = globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ) + if err != nil { + t.Fatalf("ListRows(include stale) error = %v", err) + } + assertModelCatalogModelIDs(t, rows, []string{"codex-fresh", "codex-stale"}) + + rows, err = globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "models_dev", IncludeAll: true}, + ) + if err != nil { + t.Fatalf("ListRows(models_dev) error = %v", err) + } + assertModelCatalogModelIDs(t, rows, []string{"codex-dev"}) + + rows, err = globalDB.ListRows(ctx, modelcatalog.ListOptions{ProviderID: "claude", IncludeStale: true}) + if err != nil { + t.Fatalf("ListRows(claude) error = %v", err) + } + assertModelCatalogModelIDs(t, rows, []string{"claude-fresh"}) + }) + + t.Run("Should store models dev status per provider without empty provider sentinel", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + replaceModelCatalogRows(t, globalDB, "models_dev", "codex", modelcatalog.SourceKindModelsDev, 50, nil) + replaceModelCatalogRows(t, globalDB, "models_dev", "claude", modelcatalog.SourceKindModelsDev, 50, nil) + + statuses, err := globalDB.ListSourceStatus(ctx, "") + if err != nil { + t.Fatalf("ListSourceStatus(all) error = %v", err) + } + if got, want := len(statuses), 2; got != want { + t.Fatalf("len(statuses) = %d, want %d: %#v", got, want, statuses) + } + for _, status := range statuses { + if status.SourceID != "models_dev" { + t.Fatalf("status.SourceID = %q, want 
models_dev", status.SourceID) + } + if status.ProviderID == "" { + t.Fatalf("status has empty provider sentinel: %#v", status) + } + } + codexStatuses, err := globalDB.ListSourceStatus(ctx, "codex") + if err != nil { + t.Fatalf("ListSourceStatus(codex) error = %v", err) + } + if len(codexStatuses) != 1 || codexStatuses[0].ProviderID != "codex" { + t.Fatalf("codex statuses = %#v, want one codex status", codexStatuses) + } + }) + + t.Run("Should reject empty provider source status sentinel writes", func(t *testing.T) { + t.Parallel() + + globalDB := openTestGlobalDB(t) + err := globalDB.ReplaceSourceRows( + testutil.Context(t), + "models_dev", + "", + nil, + modelcatalog.SourceStatus{ + SourceID: "models_dev", + SourceKind: modelcatalog.SourceKindModelsDev, + ProviderID: "", + Priority: 50, + }, + ) + if err == nil || !strings.Contains(err.Error(), "provider id is required") { + t.Fatalf("ReplaceSourceRows(empty provider) error = %v, want provider id validation", err) + } + }) + + t.Run("Should reject mismatched source status and row identities", func(t *testing.T) { + t.Parallel() + + for _, tc := range []struct { + name string + rows []modelcatalog.ModelRow + status modelcatalog.SourceStatus + want string + }{ + { + name: "Should reject mismatched status source", + status: modelCatalogStatus("other", "codex", modelcatalog.SourceKindConfig, 120), + want: "status source id", + }, + { + name: "Should reject missing status source kind", + status: modelcatalog.SourceStatus{SourceID: "config", ProviderID: "codex", Priority: 120}, + want: "source kind is required", + }, + { + name: "Should reject mismatched row provider", + rows: []modelcatalog.ModelRow{ + modelCatalogRow("config", "claude", "gpt-5.4", modelcatalog.SourceKindConfig, 120), + }, + status: modelCatalogStatus("config", "codex", modelcatalog.SourceKindConfig, 120), + want: "provider id", + }, + { + name: "Should reject mismatched row source kind", + rows: []modelcatalog.ModelRow{ + modelCatalogRow("config", 
"codex", "gpt-5.4", modelcatalog.SourceKindModelsDev, 120), + }, + status: modelCatalogStatus("config", "codex", modelcatalog.SourceKindConfig, 120), + want: "source kind", + }, + { + name: "Should reject blank reasoning effort", + rows: []modelcatalog.ModelRow{ + func() modelcatalog.ModelRow { + row := modelCatalogRow("config", "codex", "gpt-5.4", modelcatalog.SourceKindConfig, 120) + row.ReasoningEfforts = []modelcatalog.ReasoningEffort{""} + return row + }(), + }, + status: modelCatalogStatus("config", "codex", modelcatalog.SourceKindConfig, 120), + want: "reasoning effort", + }, + } { + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + + globalDB := openTestGlobalDB(t) + err := globalDB.ReplaceSourceRows(testutil.Context(t), "config", "codex", tc.rows, tc.status) + if err == nil || !strings.Contains(err.Error(), tc.want) { + t.Fatalf("ReplaceSourceRows() error = %v, want containing %q", err, tc.want) + } + }) + } + }) + + t.Run("Should reject nil contexts for catalog store methods", func(t *testing.T) { + t.Parallel() + + globalDB := openTestGlobalDB(t) + nilCtx := nilModelCatalogTestContext() + if err := globalDB.ReplaceSourceRows( + nilCtx, + "config", + "codex", + nil, + modelCatalogStatus("config", "codex", modelcatalog.SourceKindConfig, 120), + ); err == nil { + t.Fatal("ReplaceSourceRows(nil context) error = nil, want validation error") + } + if _, err := globalDB.ListRows(nilCtx, modelcatalog.ListOptions{}); err == nil { + t.Fatal("ListRows(nil context) error = nil, want validation error") + } + if _, err := globalDB.ListSourceStatus(nilCtx, "codex"); err == nil { + t.Fatal("ListSourceStatus(nil context) error = nil, want validation error") + } + }) + + t.Run("Should preserve null default reasoning effort", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + row := modelCatalogRow("config", "codex", "gpt-5.4", modelcatalog.SourceKindConfig, 120) + row.ReasoningEfforts = 
[]modelcatalog.ReasoningEffort{modelcatalog.ReasoningEffortLow} + replaceModelCatalogRows( + t, + globalDB, + "config", + "codex", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{row}, + ) + + var raw sql.NullString + if err := globalDB.db.QueryRowContext( + ctx, + `SELECT default_reasoning_effort FROM model_catalog_rows WHERE source_id = ? AND provider_id = ? AND model_id = ?`, + "config", + "codex", + "gpt-5.4", + ).Scan(&raw); err != nil { + t.Fatalf("QueryRowContext(default_reasoning_effort) error = %v", err) + } + if raw.Valid { + t.Fatalf("default_reasoning_effort raw = %#v, want NULL", raw) + } + rows, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ) + if err != nil { + t.Fatalf("ListRows() error = %v", err) + } + if len(rows) != 1 || rows[0].DefaultReasoningEffort != nil { + t.Fatalf("rows = %#v, want nil DefaultReasoningEffort", rows) + } + }) + + t.Run("Should round trip nullable booleans and explicit default reasoning effort", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + available := false + supportsReasoning := false + defaultEffort := modelcatalog.ReasoningEffortHigh + row := modelCatalogRow("", "", "manual-model", "", 120) + row.Available = &available + row.SupportsTools = nil + row.SupportsReasoning = &supportsReasoning + row.DefaultReasoningEffort = &defaultEffort + row.ReasoningEfforts = []modelcatalog.ReasoningEffort{ + modelcatalog.ReasoningEffortLow, + modelcatalog.ReasoningEffortHigh, + } + status := modelCatalogStatus("config", "codex", modelcatalog.SourceKindConfig, 120) + status.RefreshState = "" + if err := globalDB.ReplaceSourceRows(ctx, "config", "codex", []modelcatalog.ModelRow{row}, status); err != nil { + t.Fatalf("ReplaceSourceRows() error = %v", err) + } + + statuses, err := globalDB.ListSourceStatus(ctx, "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } 
+ if len(statuses) != 1 || statuses[0].RefreshState != string(modelcatalog.RefreshStateIdle) { + t.Fatalf("statuses = %#v, want default idle refresh state", statuses) + } + rows, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ) + if err != nil { + t.Fatalf("ListRows() error = %v", err) + } + if len(rows) != 1 { + t.Fatalf("rows = %#v, want one row", rows) + } + got := rows[0] + if got.SourceID != "config" || got.ProviderID != "codex" || got.SourceKind != modelcatalog.SourceKindConfig { + t.Fatalf("row identity = %#v, want normalized source/provider/kind", got) + } + if got.Available == nil || *got.Available || got.SupportsTools != nil || + got.SupportsReasoning == nil || *got.SupportsReasoning || + got.DefaultReasoningEffort == nil || *got.DefaultReasoningEffort != modelcatalog.ReasoningEffortHigh { + t.Fatalf("row nullable values = %#v, want false/nil/false/high", got) + } + }) + + t.Run("Should update source status row count stale flag and last error", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + row := modelCatalogRow("provider_live:codex", "codex", "gpt-5.4", modelcatalog.SourceKindProviderLive, 110) + replaceModelCatalogRows( + t, + globalDB, + "provider_live:codex", + "codex", + modelcatalog.SourceKindProviderLive, + 110, + []modelcatalog.ModelRow{row}, + ) + + failed := modelCatalogStatus("provider_live:codex", "codex", modelcatalog.SourceKindProviderLive, 110) + failed.RefreshState = string(modelcatalog.RefreshStateFailed) + failed.LastError = "redacted refresh failed" + failed.Stale = true + if err := globalDB.ReplaceSourceRows(ctx, "provider_live:codex", "codex", nil, failed); err != nil { + t.Fatalf("ReplaceSourceRows(failed) error = %v", err) + } + + statuses, err := globalDB.ListSourceStatus(ctx, "codex") + if err != nil { + t.Fatalf("ListSourceStatus() error = %v", err) + } + if len(statuses) != 1 { + 
t.Fatalf("statuses = %#v, want one status", statuses) + } + status := statuses[0] + if status.RowCount != 0 || !status.Stale || status.LastError != "redacted refresh failed" || + status.RefreshState != string(modelcatalog.RefreshStateFailed) { + t.Fatalf("status = %#v, want failed stale status with row_count 0", status) + } + rows, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "provider_live:codex", IncludeStale: true}, + ) + if err != nil { + t.Fatalf("ListRows(after failed replace) error = %v", err) + } + if len(rows) != 0 { + t.Fatalf("rows after failed replace = %#v, want empty", rows) + } + }) + + t.Run("Should order equal freshness rows by source identity", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + refreshedAt := time.Date(2026, 5, 7, 12, 0, 0, 0, time.UTC) + zulu := modelCatalogRow("extension:z_source", "codex", "gpt-5.4", modelcatalog.SourceKindExtension, 100) + zulu.RefreshedAt = refreshedAt + alpha := modelCatalogRow("extension:a_source", "codex", "gpt-5.4", modelcatalog.SourceKindExtension, 100) + alpha.RefreshedAt = refreshedAt + replaceModelCatalogRows( + t, + globalDB, + "extension:z_source", + "codex", + modelcatalog.SourceKindExtension, + 100, + []modelcatalog.ModelRow{zulu}, + ) + replaceModelCatalogRows( + t, + globalDB, + "extension:a_source", + "codex", + modelcatalog.SourceKindExtension, + 100, + []modelcatalog.ModelRow{alpha}, + ) + + rows, err := globalDB.ListRows(ctx, modelcatalog.ListOptions{ProviderID: "codex", IncludeStale: true}) + if err != nil { + t.Fatalf("ListRows() error = %v", err) + } + if got, want := len(rows), 2; got != want { + t.Fatalf("len(rows) = %d, want %d: %#v", got, want, rows) + } + gotSources := []string{rows[0].SourceID, rows[1].SourceID} + if !slices.Equal(gotSources, []string{"extension:a_source", "extension:z_source"}) { + t.Fatalf("source order = %#v, want extension:a_source before extension:z_source", 
gotSources) + } + }) + + t.Run("Should surface corrupt persisted catalog timestamps and booleans", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + globalDB := openTestGlobalDB(t) + row := modelCatalogRow("config", "codex", "gpt-5.4", modelcatalog.SourceKindConfig, 120) + replaceModelCatalogRows( + t, + globalDB, + "config", + "codex", + modelcatalog.SourceKindConfig, + 120, + []modelcatalog.ModelRow{row}, + ) + + if _, err := globalDB.db.ExecContext( + ctx, + `UPDATE model_catalog_rows SET refreshed_at = ? WHERE source_id = ? AND provider_id = ? AND model_id = ?`, + "bad-timestamp", + "config", + "codex", + "gpt-5.4", + ); err != nil { + t.Fatalf("ExecContext(corrupt row timestamp) error = %v", err) + } + if _, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ); err == nil || + !strings.Contains(err.Error(), "refreshed_at") { + t.Fatalf("ListRows(corrupt timestamp) error = %v, want refreshed_at parse error", err) + } + + if _, err := globalDB.db.ExecContext( + ctx, + `UPDATE model_catalog_rows SET refreshed_at = ?, available = ? WHERE source_id = ? AND provider_id = ? AND model_id = ?`, + store.FormatTimestamp(row.RefreshedAt), + 2, + "config", + "codex", + "gpt-5.4", + ); err == nil { + t.Fatal("ExecContext(corrupt boolean with checks enabled) error = nil, want constraint error") + } + if _, err := globalDB.db.ExecContext(ctx, `PRAGMA ignore_check_constraints = ON`); err != nil { + t.Fatalf("enable ignore_check_constraints error = %v", err) + } + if _, err := globalDB.db.ExecContext( + ctx, + `UPDATE model_catalog_rows SET refreshed_at = ?, available = ? WHERE source_id = ? AND provider_id = ? 
AND model_id = ?`, + store.FormatTimestamp(row.RefreshedAt), + 2, + "config", + "codex", + "gpt-5.4", + ); err != nil { + t.Fatalf("ExecContext(corrupt boolean) error = %v", err) + } + if _, err := globalDB.db.ExecContext(ctx, `PRAGMA ignore_check_constraints = OFF`); err != nil { + t.Fatalf("disable ignore_check_constraints error = %v", err) + } + if _, err := globalDB.ListRows( + ctx, + modelcatalog.ListOptions{ProviderID: "codex", SourceID: "config", IncludeStale: true}, + ); err == nil || + !strings.Contains(err.Error(), "available boolean") { + t.Fatalf("ListRows(corrupt boolean) error = %v, want available boolean error", err) + } + + if _, err := globalDB.db.ExecContext( + ctx, + `UPDATE model_catalog_sources SET last_refresh_at = ? WHERE source_id = ? AND provider_id = ?`, + "bad-status-time", + "config", + "codex", + ); err != nil { + t.Fatalf("ExecContext(corrupt status timestamp) error = %v", err) + } + if _, err := globalDB.ListSourceStatus(ctx, "codex"); err == nil || + !strings.Contains(err.Error(), "last_refresh_at") { + t.Fatalf("ListSourceStatus(corrupt timestamp) error = %v, want last_refresh_at parse error", err) + } + }) +} + +func assertModelCatalogSchema(t *testing.T, db *sql.DB) { + t.Helper() + + assertTablesPresent(t, db, "model_catalog_sources", "model_catalog_rows", "model_catalog_reasoning_efforts") + assertTableColumns(t, db, "model_catalog_sources", []string{ + "source_id", + "provider_id", + "source_kind", + "priority", + "refresh_state", + "last_refresh_at", + "next_refresh_at", + "last_success_at", + "last_error", + "row_count", + "stale", + }) + assertTableColumns(t, db, "model_catalog_rows", []string{ + "source_id", + "provider_id", + "model_id", + "source_kind", + "priority", + "available", + "stale", + "refreshed_at", + "expires_at", + "display_name", + "context_window", + "max_input_tokens", + "max_output_tokens", + "supports_tools", + "supports_reasoning", + "default_reasoning_effort", + "cost_input_per_million", + 
"cost_output_per_million", + "last_error", + }) + assertTableColumns(t, db, "model_catalog_reasoning_efforts", []string{ + "source_id", + "provider_id", + "model_id", + "effort", + "rank", + }) + assertIndexesPresent( + t, + db, + "model_catalog_rows", + "idx_model_catalog_rows_provider_model", + "idx_model_catalog_rows_source_provider", + ) + assertIndexesPresent(t, db, "model_catalog_sources", "idx_model_catalog_sources_provider") +} + +func openPreviousModelCatalogSchemaDB(t *testing.T, dbPath string) *sql.DB { + t.Helper() + + ctx := testutil.Context(t) + db, err := store.OpenSQLiteDatabase(ctx, dbPath, nil) + if err != nil { + t.Fatalf("OpenSQLiteDatabase(previous) error = %v", err) + } + if err := store.RunMigrations(ctx, db, previousModelCatalogMigrations()); err != nil { + t.Fatalf("RunMigrations(previous) error = %v", err) + } + return db +} + +func previousModelCatalogMigrations() []store.Migration { + migrations := append([]store.Migration(nil), globalSchemaMigrations[:modelCatalogMigrationVersion-1]...) 
+ migrations[0].Statements = schemaStatementsWithoutModelCatalog() + return migrations +} + +func schemaStatementsWithoutModelCatalog() []string { + blocked := make(map[string]struct{}, len(modelCatalogSchemaStatements())) + for _, statement := range modelCatalogSchemaStatements() { + blocked[strings.TrimSpace(statement)] = struct{}{} + } + filtered := make([]string, 0, len(globalSchemaStatements)-len(blocked)) + for _, statement := range globalSchemaStatements { + if _, ok := blocked[strings.TrimSpace(statement)]; ok { + continue + } + filtered = append(filtered, statement) + } + return filtered +} + +func replaceModelCatalogRows( + t *testing.T, + globalDB *GlobalDB, + sourceID string, + providerID string, + sourceKind modelcatalog.SourceKind, + priority int, + rows []modelcatalog.ModelRow, +) { + t.Helper() + + if err := globalDB.ReplaceSourceRows( + testutil.Context(t), + sourceID, + providerID, + rows, + modelCatalogStatus(sourceID, providerID, sourceKind, priority), + ); err != nil { + t.Fatalf("ReplaceSourceRows(%s/%s) error = %v", sourceID, providerID, err) + } +} + +func modelCatalogRow( + sourceID string, + providerID string, + modelID string, + sourceKind modelcatalog.SourceKind, + priority int, +) modelcatalog.ModelRow { + available := true + supportsTools := true + supportsReasoning := true + contextWindow := int64(256000) + maxOutputTokens := int64(32000) + costInput := 1.25 + costOutput := 10.5 + return modelcatalog.ModelRow{ + SourceID: sourceID, + ProviderID: providerID, + ModelID: modelID, + DisplayName: strings.ToUpper(modelID), + SourceKind: sourceKind, + Priority: priority, + Available: &available, + RefreshedAt: time.Date(2026, 5, 7, 11, 0, 0, 0, time.UTC), + ExpiresAt: time.Date(2026, 5, 8, 11, 0, 0, 0, time.UTC), + ContextWindow: &contextWindow, + MaxOutputTokens: &maxOutputTokens, + SupportsTools: &supportsTools, + SupportsReasoning: &supportsReasoning, + CostInputPerMillion: &costInput, + CostOutputPerMillion: &costOutput, + } +} + +func 
modelCatalogStatus( + sourceID string, + providerID string, + sourceKind modelcatalog.SourceKind, + priority int, +) modelcatalog.SourceStatus { + now := time.Date(2026, 5, 7, 11, 0, 0, 0, time.UTC) + return modelcatalog.SourceStatus{ + SourceID: sourceID, + ProviderID: providerID, + SourceKind: sourceKind, + Priority: priority, + LastRefresh: now, + NextRefresh: now.Add(24 * time.Hour), + LastSuccess: now, + RefreshState: string(modelcatalog.RefreshStateSucceeded), + } +} + +func assertModelCatalogModelIDs(t *testing.T, rows []modelcatalog.ModelRow, want []string) { + t.Helper() + + got := make([]string, 0, len(rows)) + for _, row := range rows { + got = append(got, row.ModelID) + } + slices.Sort(got) + slices.Sort(want) + if !slices.Equal(got, want) { + t.Fatalf("model ids = %#v, want %#v", got, want) + } +} + +func nilModelCatalogTestContext() context.Context { + return nil +} diff --git a/internal/store/globaldb/global_db_network_conversations_test.go b/internal/store/globaldb/global_db_network_conversations_test.go index 2ff3dd2d2..864c96eab 100644 --- a/internal/store/globaldb/global_db_network_conversations_test.go +++ b/internal/store/globaldb/global_db_network_conversations_test.go @@ -11,7 +11,7 @@ import ( "github.com/pedronauck/agh/internal/testutil" ) -const networkConversationMigrationVersion = 17 +const networkConversationMigrationVersion = 21 func TestOpenGlobalDBCreatesNetworkConversationSchema(t *testing.T) { t.Parallel() @@ -153,6 +153,84 @@ func TestNetworkConversationMigrationRebuildsLegacyTimeline(t *testing.T) { func TestNetworkConversationMigrationReopenAfterRestart(t *testing.T) { t.Parallel() + t.Run( + "Should upgrade observed task and bridge migration history by appending network migration", + func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), GlobalDatabaseName) + seedLegacyNetworkConversationDatabase(t, path) + + beforeDB, err := store.OpenSQLiteDatabase(ctx, path, nil) + if err != 
nil { + t.Fatalf("OpenSQLiteDatabase(before) error = %v", err) + } + beforeRecords, err := store.AppliedMigrations(ctx, beforeDB) + if err != nil { + t.Fatalf("AppliedMigrations(before) error = %v", err) + } + if err := beforeDB.Close(); err != nil { + t.Fatalf("beforeDB.Close() error = %v", err) + } + assertAppliedGlobalMigrationPrefix(t, beforeRecords, networkConversationMigrationVersion-1) + + first, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(first) error = %v", err) + } + firstRecords, err := store.AppliedMigrations(ctx, first.db) + if err != nil { + t.Fatalf("AppliedMigrations(first) error = %v", err) + } + assertAppliedGlobalMigrationOrder(t, firstRecords) + for index, before := range beforeRecords { + if !firstRecords[index].AppliedAt.Equal(before.AppliedAt) { + t.Fatalf( + "migration %d applied_at = %s, want unchanged %s", + before.Version, + firstRecords[index].AppliedAt, + before.AppliedAt, + ) + } + } + assertTaskOrchestrationProfileSchema(t, first.db) + assertReviewGateSchema(t, first.db) + assertNotificationCursorSchema(t, first.db) + assertBridgeTaskSubscriptionSchema(t, first.db) + assertTableLacksColumns(t, first.db, "network_timeline_log", "interaction_id") + assertTablesPresent(t, first.db, "network_threads", "network_direct_rooms", "network_work", "memory_events") + if err := first.Close(ctx); err != nil { + t.Fatalf("Close(first) error = %v", err) + } + + second, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(second) error = %v", err) + } + t.Cleanup(func() { + if closeErr := second.Close(ctx); closeErr != nil { + t.Fatalf("Close(second) error = %v", closeErr) + } + }) + secondRecords, err := store.AppliedMigrations(ctx, second.db) + if err != nil { + t.Fatalf("AppliedMigrations(second) error = %v", err) + } + assertAppliedGlobalMigrationOrder(t, secondRecords) + for index, firstRecord := range firstRecords { + if !secondRecords[index].AppliedAt.Equal(firstRecord.AppliedAt) { + t.Fatalf( 
+ "second migration %d applied_at = %s, want unchanged %s", + firstRecord.Version, + secondRecords[index].AppliedAt, + firstRecord.AppliedAt, + ) + } + } + }, + ) + t.Run("Should record migration version and keep schema stable after reopen", func(t *testing.T) { t.Parallel() @@ -190,7 +268,7 @@ func TestNetworkConversationMigrationReopenAfterRestart(t *testing.T) { t.Fatalf("len(secondRecords) = %d, want %d", got, want) } if !secondRecords[len(secondRecords)-1].AppliedAt.Equal(firstRecords[len(firstRecords)-1].AppliedAt) { - t.Fatalf("migration v17 applied_at changed after reopen") + t.Fatalf("migration v%d applied_at changed after reopen", networkConversationMigrationVersion) } assertTableLacksColumns(t, second.db, "network_timeline_log", "interaction_id") assertTablesPresent(t, second.db, "network_threads", "network_direct_rooms", "network_work") diff --git a/internal/store/globaldb/global_db_soul_test.go b/internal/store/globaldb/global_db_soul_test.go index b5fb84249..2384c5662 100644 --- a/internal/store/globaldb/global_db_soul_test.go +++ b/internal/store/globaldb/global_db_soul_test.go @@ -72,30 +72,7 @@ func TestGlobalDBSoulMigration(t *testing.T) { if heartbeatRecord.Version != 13 || heartbeatRecord.Name != "add_agent_heartbeat_storage" { t.Fatalf("records[12] = %#v, want add_agent_heartbeat_storage v13", heartbeatRecord) } - if records[13].Version != 14 || records[13].Name != "add_event_summary_lineage" { - t.Fatalf("records[13] = %#v, want add_event_summary_lineage v14", records[13]) - } - if records[14].Version != 15 || records[14].Name != "rebuild_event_summaries_for_global_payloads" { - t.Fatalf("records[14] = %#v, want rebuild_event_summaries_for_global_payloads v15", records[14]) - } - if records[15].Version != 16 || records[15].Name != "rename_actor_ref_columns_to_actor_id" { - t.Fatalf("records[15] = %#v, want rename_actor_ref_columns_to_actor_id v16", records[15]) - } - if records[16].Version != 17 || records[16].Name != 
"rebuild_network_conversation_containers" { - t.Fatalf("records[16] = %#v, want rebuild_network_conversation_containers v17", records[16]) - } - if records[17].Version != 18 || records[17].Name != "add_task_orchestration_profile_schema" { - t.Fatalf("records[17] = %#v, want add_task_orchestration_profile_schema v18", records[17]) - } - if records[18].Version != 19 || records[18].Name != "add_task_review_gate_schema" { - t.Fatalf("records[18] = %#v, want add_task_review_gate_schema v19", records[18]) - } - if records[19].Version != 20 || records[19].Name != "add_notification_cursors" { - t.Fatalf("records[19] = %#v, want add_notification_cursors v20", records[19]) - } - if records[20].Version != 21 || records[20].Name != "add_bridge_task_subscriptions" { - t.Fatalf("records[20] = %#v, want add_bridge_task_subscriptions v21", records[20]) - } + assertAppliedGlobalMigrationOrder(t, records) for _, table := range []string{"soul_snapshots", "soul_revisions"} { exists, err := tableExists(ctx, globalDB.db, table) if err != nil { diff --git a/internal/store/globaldb/global_db_test.go b/internal/store/globaldb/global_db_test.go index 8330eaed4..3b854797d 100644 --- a/internal/store/globaldb/global_db_test.go +++ b/internal/store/globaldb/global_db_test.go @@ -176,75 +176,7 @@ func TestOpenGlobalDBRecordsSchemaMigrationAndRepeatedBootIsIdempotent(t *testin if got, want := len(firstRecords), len(globalSchemaMigrations); got != want { t.Fatalf("len(firstRecords) = %d, want %d", got, want) } - if firstRecords[0].Version != 1 || firstRecords[0].Name != "create_global_schema" { - t.Fatalf("firstRecords[0] = %#v, want create_global_schema v1", firstRecords[0]) - } - if firstRecords[1].Version != 2 || firstRecords[1].Name != "add_session_failure_diagnostics" { - t.Fatalf("firstRecords[1] = %#v, want add_session_failure_diagnostics v2", firstRecords[1]) - } - if firstRecords[2].Version != 3 || firstRecords[2].Name != "add_automation_scheduler_state" { - t.Fatalf("firstRecords[2] = 
%#v, want add_automation_scheduler_state v3", firstRecords[2]) - } - if firstRecords[3].Version != 4 || firstRecords[3].Name != "add_mcp_auth_tokens" { - t.Fatalf("firstRecords[3] = %#v, want add_mcp_auth_tokens v4", firstRecords[3]) - } - if firstRecords[4].Version != 5 || firstRecords[4].Name != "add_tool_process_records" { - t.Fatalf("firstRecords[4] = %#v, want add_tool_process_records v5", firstRecords[4]) - } - if firstRecords[5].Version != 6 || firstRecords[5].Name != "add_memory_operation_scope" { - t.Fatalf("firstRecords[5] = %#v, want add_memory_operation_scope v6", firstRecords[5]) - } - if firstRecords[6].Version != 7 || firstRecords[6].Name != "add_task_run_claim_lease_schema" { - t.Fatalf("firstRecords[6] = %#v, want add_task_run_claim_lease_schema v7", firstRecords[6]) - } - if firstRecords[7].Version != 8 || firstRecords[7].Name != "add_session_lineage_metadata" { - t.Fatalf("firstRecords[7] = %#v, want add_session_lineage_metadata v8", firstRecords[7]) - } - if firstRecords[8].Version != 9 || firstRecords[8].Name != "rename_environment_columns_to_sandbox" { - t.Fatalf("firstRecords[8] = %#v, want rename_environment_columns_to_sandbox v9", firstRecords[8]) - } - if firstRecords[9].Version != 10 || firstRecords[9].Name != "add_vault_secrets" { - t.Fatalf("firstRecords[9] = %#v, want add_vault_secrets v10", firstRecords[9]) - } - if firstRecords[10].Version != 11 || firstRecords[10].Name != "unify_secret_refs" { - t.Fatalf("firstRecords[10] = %#v, want unify_secret_refs v11", firstRecords[10]) - } - if firstRecords[11].Version != 12 || firstRecords[11].Name != "add_agent_soul_snapshots" { - t.Fatalf("firstRecords[11] = %#v, want add_agent_soul_snapshots v12", firstRecords[11]) - } - if firstRecords[12].Version != 13 || firstRecords[12].Name != "add_agent_heartbeat_storage" { - t.Fatalf("firstRecords[12] = %#v, want add_agent_heartbeat_storage v13", firstRecords[12]) - } - if firstRecords[13].Version != 14 || firstRecords[13].Name != 
"add_event_summary_lineage" { - t.Fatalf("firstRecords[13] = %#v, want add_event_summary_lineage v14", firstRecords[13]) - } - if firstRecords[14].Version != 15 || firstRecords[14].Name != "rebuild_event_summaries_for_global_payloads" { - t.Fatalf( - "firstRecords[14] = %#v, want rebuild_event_summaries_for_global_payloads v15", - firstRecords[14], - ) - } - if firstRecords[15].Version != 16 || firstRecords[15].Name != "rename_actor_ref_columns_to_actor_id" { - t.Fatalf("firstRecords[15] = %#v, want rename_actor_ref_columns_to_actor_id v16", firstRecords[15]) - } - if firstRecords[16].Version != 17 || firstRecords[16].Name != "rebuild_network_conversation_containers" { - t.Fatalf("firstRecords[16] = %#v, want rebuild_network_conversation_containers v17", firstRecords[16]) - } - if firstRecords[17].Version != 18 || firstRecords[17].Name != "add_task_orchestration_profile_schema" { - t.Fatalf("firstRecords[17] = %#v, want add_task_orchestration_profile_schema v18", firstRecords[17]) - } - if firstRecords[18].Version != 19 || firstRecords[18].Name != "add_task_review_gate_schema" { - t.Fatalf("firstRecords[18] = %#v, want add_task_review_gate_schema v19", firstRecords[18]) - } - if firstRecords[19].Version != 20 || firstRecords[19].Name != "add_notification_cursors" { - t.Fatalf("firstRecords[19] = %#v, want add_notification_cursors v20", firstRecords[19]) - } - if firstRecords[20].Version != 21 || firstRecords[20].Name != "add_bridge_task_subscriptions" { - t.Fatalf("firstRecords[20] = %#v, want add_bridge_task_subscriptions v21", firstRecords[20]) - } - if firstRecords[21].Version != 22 || firstRecords[21].Name != "memv2_memory_events" { - t.Fatalf("firstRecords[21] = %#v, want memv2_memory_events v22", firstRecords[21]) - } + assertAppliedGlobalMigrationOrder(t, firstRecords) if err := first.Close(ctx); err != nil { t.Fatalf("Close(first) error = %v", err) } @@ -310,6 +242,266 @@ func TestOpenGlobalDBFailsOnSchemaMigrationIntegrityMismatch(t *testing.T) { } } +func 
TestGlobalSchemaMigrationsAreAppendOnlyContract(t *testing.T) { + t.Parallel() + + t.Run("Should keep shipped migration prefix identities stable", func(t *testing.T) { + t.Parallel() + + assertGlobalSchemaMigrationDefinitions(t, globalSchemaMigrations) + }) + + t.Run("Should apply shipped migration prefix on fresh DB", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), GlobalDatabaseName) + globalDB, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB() error = %v", err) + } + t.Cleanup(func() { + if closeErr := globalDB.Close(testutil.Context(t)); closeErr != nil { + t.Errorf("Close() error = %v", closeErr) + } + }) + + records, err := store.AppliedMigrations(ctx, globalDB.db) + if err != nil { + t.Fatalf("AppliedMigrations() error = %v", err) + } + if got, want := len(records), len(globalSchemaMigrations); got != want { + t.Fatalf("len(records) = %d, want %d", got, want) + } + assertAppliedGlobalMigrationOrder(t, records) + }) + + t.Run("Should preserve shipped migration prefix across reopen", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), GlobalDatabaseName) + first, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(first) error = %v", err) + } + firstRecords, err := store.AppliedMigrations(ctx, first.db) + if err != nil { + t.Fatalf("AppliedMigrations(first) error = %v", err) + } + assertAppliedGlobalMigrationOrder(t, firstRecords) + if err := first.Close(ctx); err != nil { + t.Fatalf("Close(first) error = %v", err) + } + + second, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(second) error = %v", err) + } + t.Cleanup(func() { + if closeErr := second.Close(testutil.Context(t)); closeErr != nil { + t.Errorf("Close(second) error = %v", closeErr) + } + }) + secondRecords, err := store.AppliedMigrations(ctx, second.db) + if err != nil { + t.Fatalf("AppliedMigrations(second) error = 
%v", err) + } + if got, want := len(secondRecords), len(firstRecords); got != want { + t.Fatalf("len(secondRecords) = %d, want %d", got, want) + } + assertAppliedGlobalMigrationOrder(t, secondRecords) + for index := range expectedGlobalMigrationPrefix() { + if !secondRecords[index].AppliedAt.Equal(firstRecords[index].AppliedAt) { + t.Fatalf( + "migration %d applied_at = %s, want unchanged %s", + firstRecords[index].Version, + secondRecords[index].AppliedAt, + firstRecords[index].AppliedAt, + ) + } + } + }) + + t.Run("Should upgrade recorded shipped prefix by appending later migrations", func(t *testing.T) { + t.Parallel() + + ctx := testutil.Context(t) + path := filepath.Join(t.TempDir(), GlobalDatabaseName) + db, err := store.OpenSQLiteDatabase(ctx, path, nil) + if err != nil { + t.Fatalf("OpenSQLiteDatabase(prefix) error = %v", err) + } + prefix := expectedGlobalMigrationPrefix() + if err := store.RunMigrations(ctx, db, globalSchemaMigrations[:len(prefix)]); err != nil { + t.Fatalf("RunMigrations(prefix) error = %v", err) + } + prefixRecords, err := store.AppliedMigrations(ctx, db) + if err != nil { + t.Fatalf("AppliedMigrations(prefix) error = %v", err) + } + assertAppliedGlobalMigrationPrefix(t, prefixRecords, len(prefix)) + if err := db.Close(); err != nil { + t.Fatalf("prefix db.Close() error = %v", err) + } + + globalDB, err := OpenGlobalDB(ctx, path) + if err != nil { + t.Fatalf("OpenGlobalDB(upgrade) error = %v", err) + } + t.Cleanup(func() { + if closeErr := globalDB.Close(testutil.Context(t)); closeErr != nil { + t.Errorf("Close(upgrade) error = %v", closeErr) + } + }) + records, err := store.AppliedMigrations(ctx, globalDB.db) + if err != nil { + t.Fatalf("AppliedMigrations(upgrade) error = %v", err) + } + if got, want := len(records), len(globalSchemaMigrations); got != want { + t.Fatalf("len(records) = %d, want %d", got, want) + } + assertAppliedGlobalMigrationOrder(t, records) + for index, prefixRecord := range prefixRecords { + if 
!records[index].AppliedAt.Equal(prefixRecord.AppliedAt) { + t.Fatalf( + "migration %d applied_at = %s, want unchanged %s", + prefixRecord.Version, + records[index].AppliedAt, + prefixRecord.AppliedAt, + ) + } + } + }) +} + +type expectedGlobalMigrationIdentity struct { + version int + name string + checksum string +} + +func expectedGlobalMigrationPrefix() []expectedGlobalMigrationIdentity { + return []expectedGlobalMigrationIdentity{ + { + version: 1, + name: "create_global_schema", + checksum: "70e2c16c9d32e692891ab71d075ca823782626e7c9f6ffbbc88c5d662704e089", + }, + {version: 2, name: "add_session_failure_diagnostics", checksum: "2026-04-24-add-session-failure-diagnostics"}, + {version: 3, name: "add_automation_scheduler_state", checksum: "2026-04-24-add-automation-scheduler-state"}, + {version: 4, name: "add_mcp_auth_tokens", checksum: "2026-04-25-add-mcp-auth-tokens"}, + {version: 5, name: "add_tool_process_records", checksum: "2026-04-24-add-tool-process-records"}, + {version: 6, name: "add_memory_operation_scope", checksum: "2026-04-25-add-memory-operation-scope"}, + {version: 7, name: "add_task_run_claim_lease_schema", checksum: "2026-04-26-add-task-run-claim-lease-schema"}, + {version: 8, name: "add_session_lineage_metadata", checksum: "2026-04-26-add-session-lineage-metadata"}, + { + version: 9, + name: "rename_environment_columns_to_sandbox", + checksum: "2026-04-28-rename-environment-columns-to-sandbox", + }, + {version: 10, name: "add_vault_secrets", checksum: "2026-05-01-add-vault-secrets"}, + {version: 11, name: "unify_secret_refs", checksum: "2026-05-01-unify-secret-refs"}, + {version: 12, name: "add_agent_soul_snapshots", checksum: "2026-05-02-add-agent-soul-snapshots"}, + {version: 13, name: "add_agent_heartbeat_storage", checksum: "2026-05-02-add-agent-heartbeat-storage"}, + {version: 14, name: "add_event_summary_lineage", checksum: "2026-05-04-add-event-summary-lineage"}, + { + version: 15, + name: "rebuild_event_summaries_for_global_payloads", 
+ checksum: "2026-05-04-rebuild-event-summaries-for-global-payloads", + }, + { + version: 16, + name: "rename_actor_ref_columns_to_actor_id", + checksum: "2026-05-04-rename-actor-ref-columns-to-actor-id", + }, + { + version: 17, + name: "add_task_orchestration_profile_schema", + checksum: "2026-05-05-add-task-orchestration-profile-schema", + }, + {version: 18, name: "add_task_review_gate_schema", checksum: "2026-05-05-add-task-review-gate-schema"}, + {version: 19, name: "add_notification_cursors", checksum: "2026-05-05-add-notification-cursors"}, + {version: 20, name: "add_bridge_task_subscriptions", checksum: "2026-05-05-add-bridge-task-subscriptions"}, + } +} + +func assertGlobalSchemaMigrationDefinitions(t *testing.T, migrations []store.Migration) { + t.Helper() + + want := expectedGlobalMigrationPrefix() + if got := len(migrations); got < len(want) { + t.Fatalf("globalSchemaMigrations length = %d, want at least shipped prefix length %d", got, len(want)) + } + for index, expected := range want { + got := migrations[index] + if got.Version != expected.version || got.Name != expected.name || got.Checksum != expected.checksum { + t.Fatalf( + "globalSchemaMigrations[%d] = version %d name %q checksum %q, want version %d name %q checksum %q", + index, + got.Version, + got.Name, + got.Checksum, + expected.version, + expected.name, + expected.checksum, + ) + } + } +} + +func assertAppliedGlobalMigrationOrder(t *testing.T, records []store.MigrationRecord) { + t.Helper() + + want := expectedGlobalMigrationPrefix() + if got := len(records); got < len(want) { + t.Fatalf("schema_migrations length = %d, want at least shipped prefix length %d", got, len(want)) + } + for index, expected := range want { + got := records[index] + if got.Version != expected.version || got.Name != expected.name || got.Checksum != expected.checksum { + t.Fatalf( + "schema_migrations[%d] = version %d name %q checksum %q, want version %d name %q checksum %q", + index, + got.Version, + got.Name, + 
got.Checksum, + expected.version, + expected.name, + expected.checksum, + ) + } + } +} + +func assertAppliedGlobalMigrationPrefix(t *testing.T, records []store.MigrationRecord, length int) { + t.Helper() + + want := expectedGlobalMigrationPrefix() + if length < 0 || length > len(want) { + t.Fatalf("migration prefix length = %d, want 0..%d", length, len(want)) + } + if got := len(records); got < length { + t.Fatalf("schema_migrations length = %d, want at least prefix length %d", got, length) + } + for index := range records[:length] { + expected := want[index] + got := records[index] + if got.Version != expected.version || got.Name != expected.name || got.Checksum != expected.checksum { + t.Fatalf( + "schema_migrations[%d] = version %d name %q checksum %q, want version %d name %q checksum %q", + index, + got.Version, + got.Name, + got.Checksum, + expected.version, + expected.name, + expected.checksum, + ) + } + } +} + func TestOpenGlobalDBCreatesExtensionsTableWithExpectedColumns(t *testing.T) { t.Parallel() diff --git a/internal/store/globaldb/migrate_model_catalog.go b/internal/store/globaldb/migrate_model_catalog.go new file mode 100644 index 000000000..6796270d9 --- /dev/null +++ b/internal/store/globaldb/migrate_model_catalog.go @@ -0,0 +1,16 @@ +package globaldb + +import ( + "context" + "database/sql" + "fmt" +) + +func migrateModelCatalogPersistence(ctx context.Context, tx *sql.Tx) error { + for _, statement := range modelCatalogSchemaStatements() { + if _, err := tx.ExecContext(ctx, statement); err != nil { + return fmt.Errorf("store: apply model catalog schema: %w", err) + } + } + return nil +} diff --git a/internal/store/globaldb/schema_model_catalog.go b/internal/store/globaldb/schema_model_catalog.go new file mode 100644 index 000000000..2ff1ab442 --- /dev/null +++ b/internal/store/globaldb/schema_model_catalog.go @@ -0,0 +1,59 @@ +package globaldb + +func modelCatalogSchemaStatements() []string { + return []string{ + `CREATE TABLE IF NOT EXISTS 
model_catalog_sources ( + source_id TEXT NOT NULL CHECK (trim(source_id) <> ''), + provider_id TEXT NOT NULL CHECK (trim(provider_id) <> ''), + source_kind TEXT NOT NULL CHECK (trim(source_kind) <> ''), + priority INTEGER NOT NULL, + refresh_state TEXT NOT NULL CHECK (trim(refresh_state) <> ''), + last_refresh_at TEXT NOT NULL DEFAULT '', + next_refresh_at TEXT NOT NULL DEFAULT '', + last_success_at TEXT NOT NULL DEFAULT '', + last_error TEXT NOT NULL DEFAULT '', + row_count INTEGER NOT NULL DEFAULT 0 CHECK (row_count >= 0), + stale INTEGER NOT NULL DEFAULT 0 CHECK (stale IN (0, 1)), + PRIMARY KEY (source_id, provider_id) + );`, + `CREATE TABLE IF NOT EXISTS model_catalog_rows ( + source_id TEXT NOT NULL CHECK (trim(source_id) <> ''), + provider_id TEXT NOT NULL CHECK (trim(provider_id) <> ''), + model_id TEXT NOT NULL CHECK (trim(model_id) <> ''), + source_kind TEXT NOT NULL CHECK (trim(source_kind) <> ''), + priority INTEGER NOT NULL, + available INTEGER CHECK (available IN (0, 1) OR available IS NULL), + stale INTEGER NOT NULL DEFAULT 0 CHECK (stale IN (0, 1)), + refreshed_at TEXT NOT NULL DEFAULT '', + expires_at TEXT NOT NULL DEFAULT '', + display_name TEXT NOT NULL DEFAULT '', + context_window INTEGER, + max_input_tokens INTEGER, + max_output_tokens INTEGER, + supports_tools INTEGER CHECK (supports_tools IN (0, 1) OR supports_tools IS NULL), + supports_reasoning INTEGER CHECK (supports_reasoning IN (0, 1) OR supports_reasoning IS NULL), + default_reasoning_effort TEXT, + cost_input_per_million REAL, + cost_output_per_million REAL, + last_error TEXT NOT NULL DEFAULT '', + PRIMARY KEY (source_id, provider_id, model_id) + );`, + `CREATE TABLE IF NOT EXISTS model_catalog_reasoning_efforts ( + source_id TEXT NOT NULL, + provider_id TEXT NOT NULL, + model_id TEXT NOT NULL, + effort TEXT NOT NULL CHECK (trim(effort) <> ''), + rank INTEGER NOT NULL CHECK (rank >= 0), + PRIMARY KEY (source_id, provider_id, model_id, effort), + FOREIGN KEY (source_id, provider_id, 
model_id) + REFERENCES model_catalog_rows(source_id, provider_id, model_id) + ON DELETE CASCADE + );`, + `CREATE INDEX IF NOT EXISTS idx_model_catalog_rows_provider_model + ON model_catalog_rows(provider_id, model_id, priority DESC, refreshed_at DESC, source_id ASC);`, + `CREATE INDEX IF NOT EXISTS idx_model_catalog_rows_source_provider + ON model_catalog_rows(source_id, provider_id);`, + `CREATE INDEX IF NOT EXISTS idx_model_catalog_sources_provider + ON model_catalog_sources(provider_id, refresh_state, stale);`, + } +} diff --git a/internal/store/types.go b/internal/store/types.go index 7cec62a49..fbd5b7be4 100644 --- a/internal/store/types.go +++ b/internal/store/types.go @@ -1372,6 +1372,7 @@ type SessionMeta struct { AgentName string `json:"agent_name"` Provider string `json:"provider,omitempty"` Model string `json:"model,omitempty"` + ReasoningEffort string `json:"reasoning_effort,omitempty"` WorkspaceID string `json:"workspace_id,omitempty"` Channel string `json:"channel,omitempty"` SessionType string `json:"session_type,omitempty"` diff --git a/internal/testutil/acpmock/cmd/acpmock-driver/main.go b/internal/testutil/acpmock/cmd/acpmock-driver/main.go index ae0db255b..ee8db99fe 100644 --- a/internal/testutil/acpmock/cmd/acpmock-driver/main.go +++ b/internal/testutil/acpmock/cmd/acpmock-driver/main.go @@ -18,9 +18,8 @@ import ( ) var ( - _ acpsdk.Agent = (*mockAgent)(nil) - _ acpsdk.AgentLoader = (*mockAgent)(nil) - _ acpsdk.AgentExperimental = (*mockAgent)(nil) + _ acpsdk.Agent = (*mockAgent)(nil) + _ acpsdk.AgentLoader = (*mockAgent)(nil) ) type cliArgs struct { @@ -36,6 +35,7 @@ type sessionState struct { type mockAgent struct { conn *acpsdk.AgentSideConnection agent acpmock.AgentFixture + configOptions []acpsdk.SessionConfigOption diagnosticsPath string lifecycleCtx context.Context cancelLifecycle context.CancelFunc @@ -74,6 +74,7 @@ func main() { agent := &mockAgent{ agent: agentFixture, + configOptions: 
sessionConfigOptionsFromFixture(agentFixture.ConfigOptions), diagnosticsPath: strings.TrimSpace(args.DiagnosticsPath), lifecycleCtx: lifecycleCtx, cancelLifecycle: cancelLifecycle, @@ -129,6 +130,27 @@ func (a *mockAgent) Cancel(context.Context, acpsdk.CancelNotification) error { return nil } +func (a *mockAgent) CloseSession( + context.Context, + acpsdk.CloseSessionRequest, +) (acpsdk.CloseSessionResponse, error) { + return acpsdk.CloseSessionResponse{}, nil +} + +func (a *mockAgent) ListSessions( + context.Context, + acpsdk.ListSessionsRequest, +) (acpsdk.ListSessionsResponse, error) { + return acpsdk.ListSessionsResponse{}, nil +} + +func (a *mockAgent) ResumeSession( + context.Context, + acpsdk.ResumeSessionRequest, +) (acpsdk.ResumeSessionResponse, error) { + return acpsdk.ResumeSessionResponse{}, nil +} + func (a *mockAgent) NewSession(_ context.Context, params acpsdk.NewSessionRequest) (acpsdk.NewSessionResponse, error) { a.mu.Lock() a.nextSession++ @@ -138,7 +160,10 @@ func (a *mockAgent) NewSession(_ context.Context, params acpsdk.NewSessionReques if err := a.writeSessionDiagnostics("session_new", sessionID, params.McpServers); err != nil { return acpsdk.NewSessionResponse{}, err } - return acpsdk.NewSessionResponse{SessionId: acpsdk.SessionId(sessionID)}, nil + return acpsdk.NewSessionResponse{ + SessionId: acpsdk.SessionId(sessionID), + ConfigOptions: a.cloneConfigOptions(), + }, nil } func (a *mockAgent) LoadSession( @@ -154,7 +179,7 @@ func (a *mockAgent) LoadSession( if err := a.writeSessionDiagnostics("session_load", sessionID, params.McpServers); err != nil { return acpsdk.LoadSessionResponse{}, err } - return acpsdk.LoadSessionResponse{}, nil + return acpsdk.LoadSessionResponse{ConfigOptions: a.cloneConfigOptions()}, nil } func (a *mockAgent) writeSessionDiagnostics( @@ -180,11 +205,127 @@ func (a *mockAgent) SetSessionMode( return acpsdk.SetSessionModeResponse{}, nil } -func (a *mockAgent) SetSessionModel( +func (a *mockAgent) 
SetSessionConfigOption( + _ context.Context, + request acpsdk.SetSessionConfigOptionRequest, +) (acpsdk.SetSessionConfigOptionResponse, error) { + if request.ValueId == nil { + return acpsdk.SetSessionConfigOptionResponse{}, errors.New( + "acpmock-driver: only value-id session config options are supported", + ) + } + if err := a.setConfigOptionValue( + string(request.ValueId.ConfigId), + string(request.ValueId.Value), + ); err != nil { + return acpsdk.SetSessionConfigOptionResponse{}, err + } + return acpsdk.SetSessionConfigOptionResponse{ConfigOptions: a.cloneConfigOptions()}, nil +} + +func (a *mockAgent) UnstableSetSessionModel( context.Context, - acpsdk.SetSessionModelRequest, -) (acpsdk.SetSessionModelResponse, error) { - return acpsdk.SetSessionModelResponse{}, nil + acpsdk.UnstableSetSessionModelRequest, +) (acpsdk.UnstableSetSessionModelResponse, error) { + return acpsdk.UnstableSetSessionModelResponse{}, nil +} + +func sessionConfigOptionsFromFixture( + options []acpmock.SessionConfigOptionFixture, +) []acpsdk.SessionConfigOption { + if len(options) == 0 { + return nil + } + result := make([]acpsdk.SessionConfigOption, 0, len(options)) + for _, option := range options { + values := make(acpsdk.SessionConfigSelectOptionsUngrouped, 0, len(option.Values)) + for _, value := range option.Values { + label := strings.TrimSpace(value.Label) + if label == "" { + label = strings.TrimSpace(value.Value) + } + values = append(values, acpsdk.SessionConfigSelectOption{ + Name: label, + Value: acpsdk.SessionConfigValueId(strings.TrimSpace(value.Value)), + }) + } + result = append(result, acpsdk.SessionConfigOption{ + Select: &acpsdk.SessionConfigOptionSelect{ + Id: acpsdk.SessionConfigId(strings.TrimSpace(option.ID)), + Name: strings.TrimSpace(option.Name), + CurrentValue: acpsdk.SessionConfigValueId(strings.TrimSpace(option.Current)), + Options: acpsdk.SessionConfigSelectOptions{ + Ungrouped: &values, + }, + Type: "select", + }, + }) + } + return result +} + +func (a 
*mockAgent) cloneConfigOptions() []acpsdk.SessionConfigOption { + a.mu.Lock() + defer a.mu.Unlock() + return cloneSessionConfigOptions(a.configOptions) +} + +func (a *mockAgent) setConfigOptionValue(configID string, value string) error { + trimmedConfigID := strings.TrimSpace(configID) + trimmedValue := strings.TrimSpace(value) + if trimmedConfigID == "" { + return errors.New("acpmock-driver: session config option id is required") + } + if trimmedValue == "" { + return errors.New("acpmock-driver: session config option value is required") + } + + a.mu.Lock() + defer a.mu.Unlock() + for idx := range a.configOptions { + option := a.configOptions[idx].Select + if option == nil || string(option.Id) != trimmedConfigID { + continue + } + if option.Options.Ungrouped == nil { + return fmt.Errorf("acpmock-driver: config option %q has no selectable values", trimmedConfigID) + } + for _, candidate := range *option.Options.Ungrouped { + if string(candidate.Value) == trimmedValue { + option.CurrentValue = acpsdk.SessionConfigValueId(trimmedValue) + return nil + } + } + return fmt.Errorf( + "acpmock-driver: config option %q value %q is not available", + trimmedConfigID, + trimmedValue, + ) + } + return fmt.Errorf("acpmock-driver: config option %q is not available", trimmedConfigID) +} + +func cloneSessionConfigOptions(options []acpsdk.SessionConfigOption) []acpsdk.SessionConfigOption { + if len(options) == 0 { + return nil + } + cloned := make([]acpsdk.SessionConfigOption, 0, len(options)) + for _, option := range options { + if option.Select != nil { + selectCopy := *option.Select + if option.Select.Options.Ungrouped != nil { + values := append(acpsdk.SessionConfigSelectOptionsUngrouped(nil), (*option.Select.Options.Ungrouped)...) 
+ selectCopy.Options.Ungrouped = &values + } + cloned = append(cloned, acpsdk.SessionConfigOption{Select: &selectCopy}) + continue + } + if option.Boolean != nil { + booleanCopy := *option.Boolean + cloned = append(cloned, acpsdk.SessionConfigOption{Boolean: &booleanCopy}) + } + } + return cloned } func (a *mockAgent) Prompt(ctx context.Context, params acpsdk.PromptRequest) (acpsdk.PromptResponse, error) { @@ -388,7 +529,7 @@ func (a *mockAgent) requestPermission( response, err := a.conn.RequestPermission(ctx, acpsdk.RequestPermissionRequest{ SessionId: sessionID, Options: defaultPermissionOptions(), - ToolCall: acpsdk.RequestPermissionToolCall{ + ToolCall: acpsdk.ToolCallUpdate{ ToolCallId: acpsdk.ToolCallId(strings.TrimSpace(step.ToolCallID)), Title: acpsdk.Ptr(title), Kind: acpsdk.Ptr(toolKindValue), diff --git a/internal/testutil/acpmock/cmd/acpmock-driver/main_test.go b/internal/testutil/acpmock/cmd/acpmock-driver/main_test.go index 544440d7a..39732fd71 100644 --- a/internal/testutil/acpmock/cmd/acpmock-driver/main_test.go +++ b/internal/testutil/acpmock/cmd/acpmock-driver/main_test.go @@ -1,6 +1,7 @@ package main import ( + "context" "strings" "testing" @@ -12,89 +13,154 @@ import ( func TestExtractPromptTextPreservesAugmentedPromptDiagnostics(t *testing.T) { t.Parallel() - prompt := "Session instructions\n\n" + - "User request:\n\n" + - "{}\n\n" + - "Relevant durable memory for this turn:\n" + - "- Auth [workspace]\n\n" + - "User message:\n" + - "hello alpha" - blocks := []acpsdk.ContentBlock{ - acpsdk.TextBlock("ignored"), - acpsdk.TextBlock(prompt), - } - - if got, want := extractPromptText(blocks), prompt; got != want { - t.Fatalf("extractPromptText() = %q, want %q", got, want) - } + t.Run("Should preserve augmented prompt diagnostics", func(t *testing.T) { + t.Parallel() + + prompt := "Session instructions\n\n" + + "User request:\n\n" + + "{}\n\n" + + "Relevant durable memory for this turn:\n" + + "- Auth [workspace]\n\n" + + "User message:\n" + + "hello 
alpha" + blocks := []acpsdk.ContentBlock{ + acpsdk.TextBlock("ignored"), + acpsdk.TextBlock(prompt), + } + + if got, want := extractPromptText(blocks), prompt; got != want { + t.Fatalf("extractPromptText() = %q, want %q", got, want) + } + }) } func TestExtractPromptTextPreservesAugmentedPromptWithoutNestedMessageMarker(t *testing.T) { t.Parallel() - prompt := "Session instructions\n\n" + - "User request:\n\n" + - "{}\n\n" + - "hello alpha" - blocks := []acpsdk.ContentBlock{ - acpsdk.TextBlock(prompt), - } - - if got, want := extractPromptText(blocks), prompt; got != want { - t.Fatalf("extractPromptText() = %q, want %q", got, want) - } + t.Run("Should preserve augmented prompt without nested message marker", func(t *testing.T) { + t.Parallel() + + prompt := "Session instructions\n\n" + + "User request:\n\n" + + "{}\n\n" + + "hello alpha" + blocks := []acpsdk.ContentBlock{ + acpsdk.TextBlock(prompt), + } + + if got, want := extractPromptText(blocks), prompt; got != want { + t.Fatalf("extractPromptText() = %q, want %q", got, want) + } + }) } func TestMockAgentSelectTurnDoesNotCountUnmatchedPrompts(t *testing.T) { t.Parallel() - agent := &mockAgent{ - agent: acpmock.AgentFixture{ - Name: "alpha", - Turns: []acpmock.TurnFixture{ - { - Name: "first", - Match: acpmock.TurnMatch{ - TurnSource: acp.PromptTurnSourceUser, - UserText: "first prompt", - Occurrence: 1, + t.Run("Should not count unmatched prompts", func(t *testing.T) { + t.Parallel() + + agent := &mockAgent{ + agent: acpmock.AgentFixture{ + Name: "alpha", + Turns: []acpmock.TurnFixture{ + { + Name: "first", + Match: acpmock.TurnMatch{ + TurnSource: acp.PromptTurnSourceUser, + UserText: "first prompt", + Occurrence: 1, + }, + }, + { + Name: "second", + Match: acpmock.TurnMatch{ + TurnSource: acp.PromptTurnSourceUser, + UserText: "second prompt", + Occurrence: 2, + }, }, }, + }, + sessions: map[string]*sessionState{}, + } + meta := acp.PromptMeta{TurnSource: acp.PromptTurnSourceUser} + + first, occurrence, err := 
agent.selectTurn("acp-session-1", "first prompt", meta) + if err != nil { + t.Fatalf("selectTurn(first) error = %v", err) + } + if first.Name != "first" || occurrence != 1 { + t.Fatalf("selectTurn(first) = (%q, %d), want (first, 1)", first.Name, occurrence) + } + + _, occurrence, err = agent.selectTurn("acp-session-1", "extractor internal prompt", meta) + if err == nil || !strings.Contains(err.Error(), "no turn matched") { + t.Fatalf("selectTurn(unmatched) error = %v, want no-match error", err) + } + if occurrence != 2 { + t.Fatalf("selectTurn(unmatched) occurrence = %d, want next occurrence 2", occurrence) + } + + second, occurrence, err := agent.selectTurn("acp-session-1", "second prompt", meta) + if err != nil { + t.Fatalf("selectTurn(second) error = %v", err) + } + if second.Name != "second" || occurrence != 2 { + t.Fatalf("selectTurn(second) = (%q, %d), want (second, 2)", second.Name, occurrence) + } + }) +} + +func TestMockAgentSessionConfigOptions(t *testing.T) { + t.Parallel() + + t.Run("Should update current select values", func(t *testing.T) { + t.Parallel() + + agent := &mockAgent{ + configOptions: sessionConfigOptionsFromFixture([]acpmock.SessionConfigOptionFixture{ { - Name: "second", - Match: acpmock.TurnMatch{ - TurnSource: acp.PromptTurnSourceUser, - UserText: "second prompt", - Occurrence: 2, + ID: "model", + Name: "Model", + Current: "qa-browser-model", + Values: []acpmock.SessionConfigOptionValueFixture{ + {Value: "qa-browser-model", Label: "QA Browser Model"}, + {Value: "qa-browser-model-alt", Label: "QA Browser Model Alt"}, }, }, + }), + } + + response, err := agent.SetSessionConfigOption( + context.Background(), + acpsdk.SetSessionConfigOptionRequest{ + ValueId: &acpsdk.SetSessionConfigOptionValueId{ + ConfigId: acpsdk.SessionConfigId("model"), + Value: acpsdk.SessionConfigValueId("qa-browser-model-alt"), + }, + }, + ) + if err != nil { + t.Fatalf("SetSessionConfigOption() error = %v", err) + } + if got, want := 
response.ConfigOptions[0].Select.CurrentValue, acpsdk.SessionConfigValueId( + "qa-browser-model-alt", + ); got != want { + t.Fatalf("CurrentValue = %q, want %q", got, want) + } + + _, err = agent.SetSessionConfigOption( + context.Background(), + acpsdk.SetSessionConfigOptionRequest{ + ValueId: &acpsdk.SetSessionConfigOptionValueId{ + ConfigId: acpsdk.SessionConfigId("model"), + Value: acpsdk.SessionConfigValueId("missing-model"), + }, }, - }, - sessions: map[string]*sessionState{}, - } - meta := acp.PromptMeta{TurnSource: acp.PromptTurnSourceUser} - - first, occurrence, err := agent.selectTurn("acp-session-1", "first prompt", meta) - if err != nil { - t.Fatalf("selectTurn(first) error = %v", err) - } - if first.Name != "first" || occurrence != 1 { - t.Fatalf("selectTurn(first) = (%q, %d), want (first, 1)", first.Name, occurrence) - } - - _, occurrence, err = agent.selectTurn("acp-session-1", "extractor internal prompt", meta) - if err == nil || !strings.Contains(err.Error(), "no turn matched") { - t.Fatalf("selectTurn(unmatched) error = %v, want no-match error", err) - } - if occurrence != 2 { - t.Fatalf("selectTurn(unmatched) occurrence = %d, want next occurrence 2", occurrence) - } - - second, occurrence, err := agent.selectTurn("acp-session-1", "second prompt", meta) - if err != nil { - t.Fatalf("selectTurn(second) error = %v", err) - } - if second.Name != "second" || occurrence != 2 { - t.Fatalf("selectTurn(second) = (%q, %d), want (second, 2)", second.Name, occurrence) - } + ) + if err == nil || !strings.Contains(err.Error(), "is not available") { + t.Fatalf("SetSessionConfigOption(missing) error = %v, want unavailable value", err) + } + }) } diff --git a/internal/testutil/acpmock/fixture.go b/internal/testutil/acpmock/fixture.go index 9f420ad20..ed0c008b3 100644 --- a/internal/testutil/acpmock/fixture.go +++ b/internal/testutil/acpmock/fixture.go @@ -49,12 +49,27 @@ type Fixture struct { // AgentFixture describes one named ACP mock agent inside a fixture 
file. type AgentFixture struct { - Name string `json:"name"` - Provider string `json:"provider"` - Model string `json:"model,omitempty"` - Permissions string `json:"permissions,omitempty"` - Prompt string `json:"prompt,omitempty"` - Turns []TurnFixture `json:"turns"` + Name string `json:"name"` + Provider string `json:"provider"` + Model string `json:"model,omitempty"` + Permissions string `json:"permissions,omitempty"` + Prompt string `json:"prompt,omitempty"` + ConfigOptions []SessionConfigOptionFixture `json:"config_options,omitempty"` + Turns []TurnFixture `json:"turns"` +} + +// SessionConfigOptionFixture describes one deterministic ACP session config select option. +type SessionConfigOptionFixture struct { + ID string `json:"id"` + Name string `json:"name"` + Current string `json:"current"` + Values []SessionConfigOptionValueFixture `json:"values"` +} + +// SessionConfigOptionValueFixture describes one selectable ACP config option value. +type SessionConfigOptionValueFixture struct { + Value string `json:"value"` + Label string `json:"label,omitempty"` } // TurnFixture describes one deterministic prompt turn for an agent. @@ -252,6 +267,11 @@ func (a AgentFixture) Validate(path string) error { if len(a.Turns) == 0 { return fmt.Errorf("acpmock: %s.turns must contain at least one turn", path) } + for idx, option := range a.ConfigOptions { + if err := option.Validate(fmt.Sprintf("%s.config_options[%d]", path, idx)); err != nil { + return err + } + } for idx, turn := range a.Turns { if err := turn.Validate(fmt.Sprintf("%s.turns[%d]", path, idx)); err != nil { return err @@ -260,6 +280,44 @@ func (a AgentFixture) Validate(path string) error { return nil } +// Validate ensures one session config option is deterministic and selectable. 
+func (o SessionConfigOptionFixture) Validate(path string) error { + id := strings.TrimSpace(o.ID) + if id == "" { + return fmt.Errorf("acpmock: %s.id is required", path) + } + name := strings.TrimSpace(o.Name) + if name == "" { + return fmt.Errorf("acpmock: %s.name is required", path) + } + current := strings.TrimSpace(o.Current) + if current == "" { + return fmt.Errorf("acpmock: %s.current is required", path) + } + if len(o.Values) == 0 { + return fmt.Errorf("acpmock: %s.values must contain at least one value", path) + } + seen := make(map[string]struct{}, len(o.Values)) + currentFound := false + for idx, value := range o.Values { + trimmed := strings.TrimSpace(value.Value) + if trimmed == "" { + return fmt.Errorf("acpmock: %s.values[%d].value is required", path, idx) + } + if _, exists := seen[trimmed]; exists { + return fmt.Errorf("acpmock: %s.values[%d].value duplicates %q", path, idx, trimmed) + } + seen[trimmed] = struct{}{} + if trimmed == current { + currentFound = true + } + } + if !currentFound { + return fmt.Errorf("acpmock: %s.current %q must be listed in values", path, current) + } + return nil +} + // Validate ensures the turn fixture is usable. 
func (t TurnFixture) Validate(path string) error { if err := t.Match.Validate(path + ".match"); err != nil { diff --git a/internal/testutil/acpmock/testdata/browser_session_lifecycle_fixture.json b/internal/testutil/acpmock/testdata/browser_session_lifecycle_fixture.json index bc0efa505..fc93743ea 100644 --- a/internal/testutil/acpmock/testdata/browser_session_lifecycle_fixture.json +++ b/internal/testutil/acpmock/testdata/browser_session_lifecycle_fixture.json @@ -6,6 +6,42 @@ "provider": "claude", "permissions": "approve-reads", "prompt": "You are the browser lifecycle test agent.", + "config_options": [ + { + "id": "model", + "name": "Model", + "current": "qa-browser-model", + "values": [ + { + "value": "qa-browser-model", + "label": "QA Browser Model" + }, + { + "value": "qa-browser-model-alt", + "label": "QA Browser Model Alt" + } + ] + }, + { + "id": "reasoning_effort", + "name": "Reasoning effort", + "current": "medium", + "values": [ + { + "value": "low", + "label": "Low" + }, + { + "value": "medium", + "label": "Medium" + }, + { + "value": "high", + "label": "High" + } + ] + } + ], "turns": [ { "name": "browser-session-lifecycle", diff --git a/internal/testutil/e2e/config_seed_test.go b/internal/testutil/e2e/config_seed_test.go index dd4f1aeea..9cc4d23de 100644 --- a/internal/testutil/e2e/config_seed_test.go +++ b/internal/testutil/e2e/config_seed_test.go @@ -19,8 +19,10 @@ func TestSeedConfigPreservesLiveProviderAndAgentValidation(t *testing.T) { DefaultAgent: "coder", Providers: map[string]aghconfig.ProviderConfig{ "fake": { - Command: "fake-agent --stdio", - DefaultModel: "fake-model", + Command: "fake-agent --stdio", + Models: aghconfig.ProviderModelsConfig{ + Default: "fake-model", + }, CredentialSlots: []aghconfig.ProviderCredentialSlot{ { Name: "api_key", diff --git a/internal/testutil/e2e/runtime_harness_integration_test.go b/internal/testutil/e2e/runtime_harness_integration_test.go index da26b1bd9..44629393a 100644 --- 
a/internal/testutil/e2e/runtime_harness_integration_test.go +++ b/internal/testutil/e2e/runtime_harness_integration_test.go @@ -311,6 +311,20 @@ func (a *e2eACPAgent) Cancel(context.Context, acpsdk.CancelNotification) error { return nil } +func (a *e2eACPAgent) CloseSession( + context.Context, + acpsdk.CloseSessionRequest, +) (acpsdk.CloseSessionResponse, error) { + return acpsdk.CloseSessionResponse{}, nil +} + +func (a *e2eACPAgent) ListSessions( + context.Context, + acpsdk.ListSessionsRequest, +) (acpsdk.ListSessionsResponse, error) { + return acpsdk.ListSessionsResponse{Sessions: []acpsdk.SessionInfo{}}, nil +} + func (a *e2eACPAgent) NewSession( context.Context, acpsdk.NewSessionRequest, @@ -318,6 +332,20 @@ func (a *e2eACPAgent) NewSession( return acpsdk.NewSessionResponse{SessionId: "e2e-helper-session"}, nil } +func (a *e2eACPAgent) ResumeSession( + context.Context, + acpsdk.ResumeSessionRequest, +) (acpsdk.ResumeSessionResponse, error) { + return acpsdk.ResumeSessionResponse{}, nil +} + +func (a *e2eACPAgent) SetSessionConfigOption( + context.Context, + acpsdk.SetSessionConfigOptionRequest, +) (acpsdk.SetSessionConfigOptionResponse, error) { + return acpsdk.SetSessionConfigOptionResponse{ConfigOptions: []acpsdk.SessionConfigOption{}}, nil +} + func (a *e2eACPAgent) LoadSession( context.Context, acpsdk.LoadSessionRequest, diff --git a/internal/workspace/clone.go b/internal/workspace/clone.go index fad7120cb..68ecd6df2 100644 --- a/internal/workspace/clone.go +++ b/internal/workspace/clone.go @@ -65,6 +65,7 @@ func cloneConfig(src *aghconfig.Config) aghconfig.Config { Permissions: src.Permissions, MCPServers: cloneMCPServers(src.MCPServers), Providers: cloneProviders(src.Providers), + ModelCatalog: cloneModelCatalogConfig(src.ModelCatalog), Sandboxes: cloneSandboxProfiles(src.Sandboxes), Observability: src.Observability, Log: src.Log, @@ -157,17 +158,102 @@ func cloneProvider(src aghconfig.ProviderConfig) aghconfig.ProviderConfig { return 
aghconfig.ProviderConfig{ Command: src.Command, DisplayName: src.DisplayName, - DefaultModel: src.DefaultModel, + Models: cloneProviderModelsConfig(src.Models), Harness: src.Harness, RuntimeProvider: src.RuntimeProvider, Transport: src.Transport, BaseURL: src.BaseURL, + AuthMode: src.AuthMode, + EnvPolicy: src.EnvPolicy, + HomePolicy: src.HomePolicy, + AuthStatusCmd: src.AuthStatusCmd, + AuthLoginCmd: src.AuthLoginCmd, + SessionMCP: cloneBoolPtr(src.SessionMCP), Aliases: append([]string(nil), src.Aliases...), CredentialSlots: append([]aghconfig.ProviderCredentialSlot(nil), src.CredentialSlots...), MCPServers: cloneMCPServers(src.MCPServers), } } +func cloneBoolPtr(src *bool) *bool { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneInt64Ptr(src *int64) *int64 { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneFloat64Ptr(src *float64) *float64 { + if src == nil { + return nil + } + value := *src + return &value +} + +func cloneProviderModelsConfig(src aghconfig.ProviderModelsConfig) aghconfig.ProviderModelsConfig { + return aghconfig.ProviderModelsConfig{ + Default: src.Default, + Curated: cloneProviderModelConfigs(src.Curated), + Discovery: cloneProviderModelsDiscoveryConfig(src.Discovery), + } +} + +func cloneProviderModelsDiscoveryConfig( + src aghconfig.ProviderModelsDiscoveryConfig, +) aghconfig.ProviderModelsDiscoveryConfig { + return aghconfig.ProviderModelsDiscoveryConfig{ + Enabled: cloneBoolPtr(src.Enabled), + Command: src.Command, + Endpoint: src.Endpoint, + Timeout: src.Timeout, + } +} + +func cloneProviderModelConfigs(src []aghconfig.ProviderModelConfig) []aghconfig.ProviderModelConfig { + if src == nil { + return nil + } + cloned := make([]aghconfig.ProviderModelConfig, len(src)) + for idx, model := range src { + cloned[idx] = aghconfig.ProviderModelConfig{ + ID: model.ID, + DisplayName: model.DisplayName, + ContextWindow: cloneInt64Ptr(model.ContextWindow), + MaxInputTokens: 
cloneInt64Ptr(model.MaxInputTokens), + MaxOutputTokens: cloneInt64Ptr(model.MaxOutputTokens), + SupportsTools: cloneBoolPtr(model.SupportsTools), + SupportsReasoning: cloneBoolPtr(model.SupportsReasoning), + ReasoningEfforts: append([]string(nil), model.ReasoningEfforts...), + DefaultReasoningEffort: model.DefaultReasoningEffort, + CostInputPerMillion: cloneFloat64Ptr(model.CostInputPerMillion), + CostOutputPerMillion: cloneFloat64Ptr(model.CostOutputPerMillion), + } + } + return cloned +} + +func cloneModelCatalogConfig(src aghconfig.ModelCatalogConfig) aghconfig.ModelCatalogConfig { + return aghconfig.ModelCatalogConfig{ + Sources: aghconfig.ModelCatalogSourcesConfig{ + ModelsDev: aghconfig.ModelsDevSourceConfig{ + Enabled: cloneBoolPtr(src.Sources.ModelsDev.Enabled), + Endpoint: src.Sources.ModelsDev.Endpoint, + TTL: src.Sources.ModelsDev.TTL, + Timeout: src.Sources.ModelsDev.Timeout, + }, + }, + } +} + func cloneAgentDefs(src []aghconfig.AgentDef) []aghconfig.AgentDef { if len(src) == 0 { return nil diff --git a/internal/workspace/resolver_test.go b/internal/workspace/resolver_test.go index b9ca3730f..d03e4d788 100644 --- a/internal/workspace/resolver_test.go +++ b/internal/workspace/resolver_test.go @@ -1274,8 +1274,10 @@ func TestCloneConfigProducesDeepCopy(t *testing.T) { }, Providers: map[string]aghconfig.ProviderConfig{ "claude": { - Command: "claude", - DefaultModel: "sonnet", + Command: "claude", + Models: aghconfig.ProviderModelsConfig{ + Default: "sonnet", + }, CredentialSlots: []aghconfig.ProviderCredentialSlot{ { Name: "api_key", diff --git a/magefile.go b/magefile.go index 931b07de4..d6855dc4b 100644 --- a/magefile.go +++ b/magefile.go @@ -269,6 +269,12 @@ func Boundaries() error { {"internal/api/udsapi", "internal/daemon"}, {"internal/api/udsapi", "internal/api/httpapi"}, {"internal/api/udsapi", "internal/cli"}, + {"internal/modelcatalog", "internal/daemon"}, + {"internal/modelcatalog", "internal/api/contract"}, + {"internal/modelcatalog", 
"internal/api/core"}, + {"internal/modelcatalog", "internal/api/httpapi"}, + {"internal/modelcatalog", "internal/api/udsapi"}, + {"internal/modelcatalog", "internal/cli"}, {"internal/memory/contract", "internal/memory/controller"}, {"internal/memory/contract", "internal/memory/recall"}, {"internal/memory/contract", "internal/memory/extractor"}, diff --git a/openapi/agh.json b/openapi/agh.json index 1f8fab750..4084084fe 100644 --- a/openapi/agh.json +++ b/openapi/agh.json @@ -4497,6 +4497,50 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": [ + "value" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -4804,12 +4848,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -35081,6 +35131,53 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": [ + "value" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": [ + "id", + 
"kind" + ], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -35390,12 +35487,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -35757,6 +35860,53 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": [ + "value" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": [ + "id", + "kind" + ], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -36066,12 +36216,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -40769,75 +40925,17 @@ "x-agh-transports": ["http", "uds"] } }, - "/api/resources": { + "/api/openai/v1/models": { "get": { - "operationId": "listResources", + "operationId": "listOpenAIModels", "parameters": [ { - "description": "Filter by resource kind", - "in": "query", - "name": "kind", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by resource scope kind", - "in": "query", - "name": "scope_kind", - "schema": { - "enum": ["global", "workspace"], - "type": "string" - } - }, - { - "description": "Filter by workspace scope id", - "in": "query", - "name": "scope_id", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by stamped owner kind", - 
"in": "query", - "name": "owner_kind", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by stamped owner id", - "in": "query", - "name": "owner_id", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by stamped source kind", - "in": "query", - "name": "source_kind", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by stamped source id", + "description": "Filter by AGH provider id", "in": "query", - "name": "source_id", + "name": "provider_id", "schema": { "type": "string" } - }, - { - "description": "Maximum number of records to return", - "in": "query", - "name": "limit", - "schema": { - "format": "int32", - "type": "integer" - } } ], "responses": { @@ -40846,96 +40944,161 @@ "application/json": { "schema": { "properties": { - "records": { + "data": { "items": { "properties": { - "created_at": { - "format": "date-time", - "type": "string" - }, - "id": { - "type": "string" - }, - "kind": { - "type": "string" - }, - "owner": { + "agh": { "properties": { - "id": { + "availability_state": { "type": "string" }, - "kind": { + "available": { + "nullable": true, + "type": "boolean" + }, + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost": { + "nullable": true, + "properties": { + "input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + } + }, + "type": "object" + }, + "default_reasoning_effort": { + "nullable": true, "type": "string" - } - }, - "required": ["id", "kind"], - "type": "object" - }, - "scope": { - "properties": { - "id": { + }, + "display_name": { "type": "string" }, - "kind": { - "enum": ["global", "workspace"], + "last_error": { "type": "string" - } - }, - "required": ["kind"], - "type": "object" - }, - "source": { - "properties": { - "id": { + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" 
+ }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "model_id": { "type": "string" }, - "kind": { + "provider_id": { + "type": "string" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "refreshed_at": { "type": "string" + }, + "sources": { + "items": { + "type": "string" + }, + "type": "array" + }, + "stale": { + "type": "boolean" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" } }, - "required": ["id", "kind"], + "required": [ + "availability_state", + "available", + "model_id", + "provider_id", + "sources", + "stale" + ], "type": "object" }, - "spec": {}, - "updated_at": { - "format": "date-time", - "type": "string" - }, - "version": { + "created": { "format": "int64", "type": "integer" + }, + "id": { + "type": "string" + }, + "object": { + "type": "string" + }, + "owned_by": { + "type": "string" } }, "required": [ - "created_at", + "agh", + "created", "id", - "kind", - "owner", - "scope", - "source", - "spec", - "updated_at", - "version" + "object", + "owned_by" ], "type": "object" }, "type": "array" + }, + "object": { + "type": "string" } }, - "required": ["records"], + "required": ["data", "object"], "type": "object" } } }, "description": "OK" }, - "403": { + "400": { "content": { "application/json": { "schema": { "properties": { "error": { - "type": "string" + "properties": { + "code": { + "type": "string" + }, + "message": { + "type": "string" + }, + "param": { + "nullable": true, + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": ["code", "message", "param", "type"], + "type": "object" } }, "required": ["error"], @@ -40943,15 +41106,31 @@ } } }, - "description": "Forbidden" + "description": "Invalid model catalog filter" }, - "422": { + "401": { "content": { "application/json": { "schema": { "properties": { "error": { - "type": "string" + "properties": { 
+ "code": { + "type": "string" + }, + "message": { + "type": "string" + }, + "param": { + "nullable": true, + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": ["code", "message", "param", "type"], + "type": "object" } }, "required": ["error"], @@ -40959,7 +41138,39 @@ } } }, - "description": "Invalid resource filter" + "description": "Unauthorized" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "properties": { + "code": { + "type": "string" + }, + "message": { + "type": "string" + }, + "param": { + "nullable": true, + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": ["code", "message", "param", "type"], + "type": "object" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" }, "500": { "content": { @@ -40967,7 +41178,23 @@ "schema": { "properties": { "error": { - "type": "string" + "properties": { + "code": { + "type": "string" + }, + "message": { + "type": "string" + }, + "param": { + "nullable": true, + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": ["code", "message", "param", "type"], + "type": "object" } }, "required": ["error"], @@ -40977,84 +41204,81 @@ }, "description": "Internal server error" }, + "503": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "properties": { + "code": { + "type": "string" + }, + "message": { + "type": "string" + }, + "param": { + "nullable": true, + "type": "string" + }, + "type": { + "type": "string" + } + }, + "required": ["code", "message", "param", "type"], + "type": "object" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Model catalog unavailable" + }, "default": { "description": "" } }, - "summary": "List desired-state resources on the local operator control plane", - "tags": ["resources"], - "x-agh-transports": ["http", "uds"] + "summary": "List provider models using the 
OpenAI-compatible model shape", + "tags": ["openai"], + "x-agh-transports": ["http"] } }, - "/api/resources/{kind}": { + "/api/providers/models": { "get": { - "operationId": "listResourcesByKind", + "operationId": "listProviderModels", "parameters": [ { - "description": "Resource kind", - "in": "path", - "name": "kind", - "required": true, - "schema": { - "type": "string" - } - }, - { - "description": "Filter by resource scope kind", - "in": "query", - "name": "scope_kind", - "schema": { - "enum": ["global", "workspace"], - "type": "string" - } - }, - { - "description": "Filter by workspace scope id", - "in": "query", - "name": "scope_id", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by stamped owner kind", - "in": "query", - "name": "owner_kind", - "schema": { - "type": "string" - } - }, - { - "description": "Filter by stamped owner id", + "description": "Filter by AGH provider id", "in": "query", - "name": "owner_id", + "name": "provider_id", "schema": { "type": "string" } }, { - "description": "Filter by stamped source kind", + "description": "Filter by catalog source id", "in": "query", - "name": "source_kind", + "name": "source_id", "schema": { "type": "string" } }, { - "description": "Filter by stamped source id", + "description": "Refresh sources before listing models", "in": "query", - "name": "source_id", + "name": "refresh", "schema": { - "type": "string" + "type": "boolean" } }, { - "description": "Maximum number of records to return", + "description": "Include stale source rows in the merged projection", "in": "query", - "name": "limit", + "name": "include_stale", "schema": { - "format": "int32", - "type": "integer" + "type": "boolean" } } ], @@ -41064,49 +41288,1524 @@ "application/json": { "schema": { "properties": { - "records": { + "models": { "items": { "properties": { - "created_at": { - "format": "date-time", + "availability_state": { "type": "string" }, - "id": { - "type": "string" + "available": { + "nullable": true, + 
"type": "boolean" }, - "kind": { - "type": "string" + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" }, - "owner": { + "cost": { + "nullable": true, "properties": { - "id": { - "type": "string" + "input_per_million": { + "format": "double", + "nullable": true, + "type": "number" }, - "kind": { - "type": "string" + "output_per_million": { + "format": "double", + "nullable": true, + "type": "number" } }, - "required": ["id", "kind"], "type": "object" }, - "scope": { - "properties": { - "id": { - "type": "string" - }, - "kind": { - "enum": ["global", "workspace"], - "type": "string" - } - }, - "required": ["kind"], - "type": "object" + "default_reasoning_effort": { + "nullable": true, + "type": "string" }, - "source": { - "properties": { - "id": { - "type": "string" - }, + "display_name": { + "type": "string" + }, + "last_error": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "model_id": { + "type": "string" + }, + "provider_id": { + "type": "string" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "refreshed_at": { + "type": "string" + }, + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "refreshed_at": { + "type": "string" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + }, + "stale": { + "type": "boolean" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": [ + "availability_state", + "available", + "model_id", + "provider_id", + "sources", + 
"stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["models"], + "type": "object" + } + } + }, + "description": "OK" + }, + "400": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid model catalog filter" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "503": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Model catalog unavailable" + }, + "default": { + "description": "" + } + }, + "summary": "List provider model catalog entries across providers", + "tags": ["providers"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/providers/models/refresh": { + "post": { + "operationId": "refreshProviderModels", + "requestBody": { + "content": { + "application/json": { + "schema": { + "properties": { + "force": { + "type": "boolean" + }, + "request_id": { + "type": "string" + }, + "source_id": { + "type": "string" + } + }, + "type": "object" + } + } + }, + "description": "JSON request body" + }, + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + }, + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "last_refresh": { + "type": "string" + }, + "last_success": { + "type": "string" + }, + "next_refresh": { + "type": "string" + }, + "priority": { + "type": 
"integer" + }, + "provider_id": { + "type": "string" + }, + "refresh_state": { + "type": "string" + }, + "row_count": { + "type": "integer" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "provider_id", + "refresh_state", + "row_count", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["sources"], + "type": "object" + } + } + }, + "description": "OK" + }, + "400": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid model catalog refresh request" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "503": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + }, + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "last_refresh": { + "type": "string" + }, + "last_success": { + "type": "string" + }, + "next_refresh": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "provider_id": { + "type": "string" + }, + "refresh_state": { + "type": "string" + }, + "row_count": { + "type": "integer" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "provider_id", + "refresh_state", + "row_count", + "source_id", + "source_kind", + "stale" + ], + "type": 
"object" + }, + "type": "array" + } + }, + "required": ["sources"], + "type": "object" + } + } + }, + "description": "Model catalog refresh unavailable" + }, + "default": { + "description": "" + } + }, + "summary": "Refresh provider model catalog sources across providers", + "tags": ["providers"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/providers/models/status": { + "get": { + "operationId": "getProviderModelStatus", + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "last_refresh": { + "type": "string" + }, + "last_success": { + "type": "string" + }, + "next_refresh": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "provider_id": { + "type": "string" + }, + "refresh_state": { + "type": "string" + }, + "row_count": { + "type": "integer" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "provider_id", + "refresh_state", + "row_count", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["sources"], + "type": "object" + } + } + }, + "description": "OK" + }, + "400": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid model catalog filter" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "503": 
{ + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Model catalog unavailable" + }, + "default": { + "description": "" + } + }, + "summary": "List provider model catalog source status across providers", + "tags": ["providers"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/providers/{provider_id}/models": { + "get": { + "operationId": "listProviderModelsByProvider", + "parameters": [ + { + "description": "AGH provider id", + "in": "path", + "name": "provider_id", + "required": true, + "schema": { + "type": "string" + } + }, + { + "description": "Filter by catalog source id", + "in": "query", + "name": "source_id", + "schema": { + "type": "string" + } + }, + { + "description": "Refresh sources before listing models", + "in": "query", + "name": "refresh", + "schema": { + "type": "boolean" + } + }, + { + "description": "Include stale source rows in the merged projection", + "in": "query", + "name": "include_stale", + "schema": { + "type": "boolean" + } + } + ], + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "models": { + "items": { + "properties": { + "availability_state": { + "type": "string" + }, + "available": { + "nullable": true, + "type": "boolean" + }, + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost": { + "nullable": true, + "properties": { + "input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + } + }, + "type": "object" + }, + "default_reasoning_effort": { + "nullable": true, + "type": "string" + }, + "display_name": { + "type": "string" + }, + "last_error": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "max_output_tokens": 
{ + "format": "int64", + "nullable": true, + "type": "integer" + }, + "model_id": { + "type": "string" + }, + "provider_id": { + "type": "string" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "refreshed_at": { + "type": "string" + }, + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "refreshed_at": { + "type": "string" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + }, + "stale": { + "type": "boolean" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": [ + "availability_state", + "available", + "model_id", + "provider_id", + "sources", + "stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["models"], + "type": "object" + } + } + }, + "description": "OK" + }, + "400": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid model catalog filter" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "503": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + 
"description": "Model catalog unavailable" + }, + "default": { + "description": "" + } + }, + "summary": "List provider model catalog entries for one provider", + "tags": ["providers"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/providers/{provider_id}/models/refresh": { + "post": { + "operationId": "refreshProviderModelsByProvider", + "parameters": [ + { + "description": "AGH provider id", + "in": "path", + "name": "provider_id", + "required": true, + "schema": { + "type": "string" + } + } + ], + "requestBody": { + "content": { + "application/json": { + "schema": { + "properties": { + "force": { + "type": "boolean" + }, + "request_id": { + "type": "string" + }, + "source_id": { + "type": "string" + } + }, + "type": "object" + } + } + }, + "description": "JSON request body" + }, + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + }, + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "last_refresh": { + "type": "string" + }, + "last_success": { + "type": "string" + }, + "next_refresh": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "provider_id": { + "type": "string" + }, + "refresh_state": { + "type": "string" + }, + "row_count": { + "type": "integer" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "provider_id", + "refresh_state", + "row_count", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["sources"], + "type": "object" + } + } + }, + "description": "OK" + }, + "400": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid model catalog refresh request" + }, + "403": { + "content": { + 
"application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "503": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + }, + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + "last_refresh": { + "type": "string" + }, + "last_success": { + "type": "string" + }, + "next_refresh": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "provider_id": { + "type": "string" + }, + "refresh_state": { + "type": "string" + }, + "row_count": { + "type": "integer" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "provider_id", + "refresh_state", + "row_count", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["sources"], + "type": "object" + } + } + }, + "description": "Model catalog refresh unavailable" + }, + "default": { + "description": "" + } + }, + "summary": "Refresh provider model catalog sources for one provider", + "tags": ["providers"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/providers/{provider_id}/models/status": { + "get": { + "operationId": "getProviderModelStatusByProvider", + "parameters": [ + { + "description": "AGH provider id", + "in": "path", + "name": "provider_id", + "required": true, + "schema": { + "type": "string" + } + } + ], + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "sources": { + "items": { + "properties": { + "last_error": { + "type": "string" + }, + 
"last_refresh": { + "type": "string" + }, + "last_success": { + "type": "string" + }, + "next_refresh": { + "type": "string" + }, + "priority": { + "type": "integer" + }, + "provider_id": { + "type": "string" + }, + "refresh_state": { + "type": "string" + }, + "row_count": { + "type": "integer" + }, + "source_id": { + "type": "string" + }, + "source_kind": { + "type": "string" + }, + "stale": { + "type": "boolean" + } + }, + "required": [ + "priority", + "provider_id", + "refresh_state", + "row_count", + "source_id", + "source_kind", + "stale" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["sources"], + "type": "object" + } + } + }, + "description": "OK" + }, + "400": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid model catalog filter" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "503": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Model catalog unavailable" + }, + "default": { + "description": "" + } + }, + "summary": "List provider model catalog source status for one provider", + "tags": ["providers"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/resources": { + "get": { + "operationId": "listResources", + "parameters": [ + { + "description": "Filter by resource kind", + "in": "query", + "name": "kind", + "schema": { + "type": "string" + } + }, + { 
+ "description": "Filter by resource scope kind", + "in": "query", + "name": "scope_kind", + "schema": { + "enum": ["global", "workspace"], + "type": "string" + } + }, + { + "description": "Filter by workspace scope id", + "in": "query", + "name": "scope_id", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped owner kind", + "in": "query", + "name": "owner_kind", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped owner id", + "in": "query", + "name": "owner_id", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped source kind", + "in": "query", + "name": "source_kind", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped source id", + "in": "query", + "name": "source_id", + "schema": { + "type": "string" + } + }, + { + "description": "Maximum number of records to return", + "in": "query", + "name": "limit", + "schema": { + "format": "int32", + "type": "integer" + } + } + ], + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "records": { + "items": { + "properties": { + "created_at": { + "format": "date-time", + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "owner": { + "properties": { + "id": { + "type": "string" + }, + "kind": { + "type": "string" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "scope": { + "properties": { + "id": { + "type": "string" + }, + "kind": { + "enum": ["global", "workspace"], + "type": "string" + } + }, + "required": ["kind"], + "type": "object" + }, + "source": { + "properties": { + "id": { + "type": "string" + }, + "kind": { + "type": "string" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "spec": {}, + "updated_at": { + "format": "date-time", + "type": "string" + }, + "version": { + "format": "int64", + "type": "integer" + } + }, + "required": [ + "created_at", + "id", + "kind", + 
"owner", + "scope", + "source", + "spec", + "updated_at", + "version" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["records"], + "type": "object" + } + } + }, + "description": "OK" + }, + "403": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Forbidden" + }, + "422": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Invalid resource filter" + }, + "500": { + "content": { + "application/json": { + "schema": { + "properties": { + "error": { + "type": "string" + } + }, + "required": ["error"], + "type": "object" + } + } + }, + "description": "Internal server error" + }, + "default": { + "description": "" + } + }, + "summary": "List desired-state resources on the local operator control plane", + "tags": ["resources"], + "x-agh-transports": ["http", "uds"] + } + }, + "/api/resources/{kind}": { + "get": { + "operationId": "listResourcesByKind", + "parameters": [ + { + "description": "Resource kind", + "in": "path", + "name": "kind", + "required": true, + "schema": { + "type": "string" + } + }, + { + "description": "Filter by resource scope kind", + "in": "query", + "name": "scope_kind", + "schema": { + "enum": ["global", "workspace"], + "type": "string" + } + }, + { + "description": "Filter by workspace scope id", + "in": "query", + "name": "scope_id", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped owner kind", + "in": "query", + "name": "owner_kind", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped owner id", + "in": "query", + "name": "owner_id", + "schema": { + "type": "string" + } + }, + { + "description": "Filter by stamped source kind", + "in": "query", + "name": "source_kind", + "schema": { + 
"type": "string" + } + }, + { + "description": "Filter by stamped source id", + "in": "query", + "name": "source_id", + "schema": { + "type": "string" + } + }, + { + "description": "Maximum number of records to return", + "in": "query", + "name": "limit", + "schema": { + "format": "int32", + "type": "integer" + } + } + ], + "responses": { + "200": { + "content": { + "application/json": { + "schema": { + "properties": { + "records": { + "items": { + "properties": { + "created_at": { + "format": "date-time", + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "owner": { + "properties": { + "id": { + "type": "string" + }, + "kind": { + "type": "string" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "scope": { + "properties": { + "id": { + "type": "string" + }, + "kind": { + "enum": ["global", "workspace"], + "type": "string" + } + }, + "required": ["kind"], + "type": "object" + }, + "source": { + "properties": { + "id": { + "type": "string" + }, "kind": { "type": "string" } @@ -41896,6 +43595,50 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": [ + "value" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -42203,12 +43946,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -42348,12 
+44097,18 @@ "channel": { "type": "string" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "workspace": { "type": "string" }, @@ -42379,6 +44134,48 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": ["value"], + "type": "object" + }, + "type": "array" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -42686,12 +44483,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -42939,6 +44742,48 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": ["value"], + "type": "object" + }, + "type": "array" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -43246,12 +45091,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + 
"reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -44230,6 +46081,48 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": ["value"], + "type": "object" + }, + "type": "array" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -44537,12 +46430,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -53091,9 +54990,6 @@ }, "type": "array" }, - "default_model": { - "type": "string" - }, "display_name": { "type": "string" }, @@ -53106,6 +55002,93 @@ "home_policy": { "type": "string" }, + "models": { + "nullable": true, + "properties": { + "curated": { + "items": { + "properties": { + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost_input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "cost_output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "default_reasoning_effort": { + "type": "string" + }, + "display_name": { + "type": "string" + }, + "id": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + 
"supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": [ + "id" + ], + "type": "object" + }, + "type": "array" + }, + "default": { + "type": "string" + }, + "discovery": { + "nullable": true, + "properties": { + "command": { + "type": "string" + }, + "enabled": { + "nullable": true, + "type": "boolean" + }, + "endpoint": { + "type": "string" + }, + "timeout": { + "type": "string" + } + }, + "type": "object" + } + }, + "type": "object" + }, "runtime_provider": { "type": "string" }, @@ -53200,9 +55183,6 @@ }, "type": "array" }, - "default_model": { - "type": "string" - }, "display_name": { "type": "string" }, @@ -53215,6 +55195,91 @@ "home_policy": { "type": "string" }, + "models": { + "nullable": true, + "properties": { + "curated": { + "items": { + "properties": { + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost_input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "cost_output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "default_reasoning_effort": { + "type": "string" + }, + "display_name": { + "type": "string" + }, + "id": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": ["id"], + "type": "object" + }, + "type": "array" + }, + "default": { + "type": "string" + }, + "discovery": { + "nullable": true, + "properties": { + "command": { + "type": "string" + }, + "enabled": { + "nullable": true, + "type": "boolean" + }, + "endpoint": { + "type": "string" + }, 
+ "timeout": { + "type": "string" + } + }, + "type": "object" + } + }, + "type": "object" + }, "runtime_provider": { "type": "string" }, @@ -53650,9 +55715,6 @@ }, "type": "array" }, - "default_model": { - "type": "string" - }, "display_name": { "type": "string" }, @@ -53665,6 +55727,91 @@ "home_policy": { "type": "string" }, + "models": { + "nullable": true, + "properties": { + "curated": { + "items": { + "properties": { + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost_input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "cost_output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "default_reasoning_effort": { + "type": "string" + }, + "display_name": { + "type": "string" + }, + "id": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": ["id"], + "type": "object" + }, + "type": "array" + }, + "default": { + "type": "string" + }, + "discovery": { + "nullable": true, + "properties": { + "command": { + "type": "string" + }, + "enabled": { + "nullable": true, + "type": "boolean" + }, + "endpoint": { + "type": "string" + }, + "timeout": { + "type": "string" + } + }, + "type": "object" + } + }, + "type": "object" + }, "runtime_provider": { "type": "string" }, @@ -53759,9 +55906,6 @@ }, "type": "array" }, - "default_model": { - "type": "string" - }, "display_name": { "type": "string" }, @@ -53774,6 +55918,91 @@ "home_policy": { "type": "string" }, + "models": { + "nullable": true, + "properties": { + "curated": { + "items": { + "properties": { + "context_window": { 
+ "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost_input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "cost_output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "default_reasoning_effort": { + "type": "string" + }, + "display_name": { + "type": "string" + }, + "id": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": ["id"], + "type": "object" + }, + "type": "array" + }, + "default": { + "type": "string" + }, + "discovery": { + "nullable": true, + "properties": { + "command": { + "type": "string" + }, + "enabled": { + "nullable": true, + "type": "boolean" + }, + "endpoint": { + "type": "string" + }, + "timeout": { + "type": "string" + } + }, + "type": "object" + } + }, + "type": "object" + }, "runtime_provider": { "type": "string" }, @@ -54016,9 +56245,6 @@ }, "type": "array" }, - "default_model": { - "type": "string" - }, "display_name": { "type": "string" }, @@ -54031,6 +56257,91 @@ "home_policy": { "type": "string" }, + "models": { + "nullable": true, + "properties": { + "curated": { + "items": { + "properties": { + "context_window": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "cost_input_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "cost_output_per_million": { + "format": "double", + "nullable": true, + "type": "number" + }, + "default_reasoning_effort": { + "type": "string" + }, + "display_name": { + "type": "string" + }, + "id": { + "type": "string" + }, + "max_input_tokens": { + "format": "int64", + 
"nullable": true, + "type": "integer" + }, + "max_output_tokens": { + "format": "int64", + "nullable": true, + "type": "integer" + }, + "reasoning_efforts": { + "items": { + "type": "string" + }, + "type": "array" + }, + "supports_reasoning": { + "nullable": true, + "type": "boolean" + }, + "supports_tools": { + "nullable": true, + "type": "boolean" + } + }, + "required": ["id"], + "type": "object" + }, + "type": "array" + }, + "default": { + "type": "string" + }, + "discovery": { + "nullable": true, + "properties": { + "command": { + "type": "string" + }, + "enabled": { + "nullable": true, + "type": "boolean" + }, + "endpoint": { + "type": "string" + }, + "timeout": { + "type": "string" + } + }, + "type": "object" + } + }, + "type": "object" + }, "runtime_provider": { "type": "string" }, @@ -79294,9 +81605,6 @@ "auth_mode": { "type": "string" }, - "default_model": { - "type": "string" - }, "display_name": { "type": "string" }, @@ -79327,6 +81635,50 @@ "acp_caps": { "nullable": true, "properties": { + "config_options": { + "items": { + "properties": { + "current": { + "type": "string" + }, + "description": { + "type": "string" + }, + "id": { + "type": "string" + }, + "kind": { + "type": "string" + }, + "label": { + "type": "string" + }, + "values": { + "items": { + "properties": { + "description": { + "type": "string" + }, + "label": { + "type": "string" + }, + "value": { + "type": "string" + } + }, + "required": [ + "value" + ], + "type": "object" + }, + "type": "array" + } + }, + "required": ["id", "kind"], + "type": "object" + }, + "type": "array" + }, "supported_models": { "items": { "type": "string" @@ -79634,12 +81986,18 @@ ], "type": "object" }, + "model": { + "type": "string" + }, "name": { "type": "string" }, "provider": { "type": "string" }, + "reasoning_effort": { + "type": "string" + }, "sandbox": { "nullable": true, "properties": { @@ -80020,6 +82378,12 @@ { "name": "observe" }, + { + "name": "openai" + }, + { + "name": "providers" + }, { "name": 
"resources" }, diff --git a/packages/site/content/runtime/api-reference/meta.json b/packages/site/content/runtime/api-reference/meta.json index cbf6ba5f3..1215bf668 100644 --- a/packages/site/content/runtime/api-reference/meta.json +++ b/packages/site/content/runtime/api-reference/meta.json @@ -24,9 +24,11 @@ "---Operations---", "daemon", "settings", + "providers", "extensions", "vault", "agent", - "tasks" + "tasks", + "openai" ] } diff --git a/packages/site/content/runtime/cli-reference/memory/extractor/replay.mdx b/packages/site/content/runtime/cli-reference/memory/extractor/replay.mdx index 2e97cf5d4..0a0ba32c1 100644 --- a/packages/site/content/runtime/cli-reference/memory/extractor/replay.mdx +++ b/packages/site/content/runtime/cli-reference/memory/extractor/replay.mdx @@ -14,7 +14,6 @@ agh memory extractor replay --session [flags] ### Options ``` - --from-dlq Replay from dead-letter queue records -h, --help help for replay --session string Session whose extractor work should be replayed ``` diff --git a/packages/site/content/runtime/cli-reference/memory/reset.mdx b/packages/site/content/runtime/cli-reference/memory/reset.mdx index 9cb9ed3f0..6ea6cd486 100644 --- a/packages/site/content/runtime/cli-reference/memory/reset.mdx +++ b/packages/site/content/runtime/cli-reference/memory/reset.mdx @@ -19,7 +19,6 @@ agh memory reset [flags] --dry-run Show reset work without applying it -h, --help help for reset --include-daily Include daily memory artifacts - --include-system Include _system memory state --scope string Memory scope: global, workspace, or agent --workspace string Workspace ID or path for workspace-bound memory ``` diff --git a/packages/site/content/runtime/cli-reference/provider/index.mdx b/packages/site/content/runtime/cli-reference/provider/index.mdx index 4a1e8b05c..fe3632993 100644 --- a/packages/site/content/runtime/cli-reference/provider/index.mdx +++ b/packages/site/content/runtime/cli-reference/provider/index.mdx @@ -31,6 +31,7 @@ Every AGH 
command supports `-o, --output`: ## Subcommands -| Command | Description | -| --------------------------------------------------------- | ----------------------------------------------------------- | -| [agh provider auth](/runtime/cli-reference/provider/auth) | Inspect native CLI and bound-secret provider authentication | +| Command | Description | +| ------------------------------------------------------------- | ----------------------------------------------------------- | +| [agh provider auth](/runtime/cli-reference/provider/auth) | Inspect native CLI and bound-secret provider authentication | +| [agh provider models](/runtime/cli-reference/provider/models) | Inspect and refresh the provider model catalog | diff --git a/packages/site/content/runtime/cli-reference/provider/meta.json b/packages/site/content/runtime/cli-reference/provider/meta.json index 5edafedb0..9d1e47bb7 100644 --- a/packages/site/content/runtime/cli-reference/provider/meta.json +++ b/packages/site/content/runtime/cli-reference/provider/meta.json @@ -1,4 +1,4 @@ { "title": "Provider", - "pages": ["index", "auth"] + "pages": ["index", "auth", "models"] } diff --git a/packages/site/content/runtime/cli-reference/provider/models/index.mdx b/packages/site/content/runtime/cli-reference/provider/models/index.mdx new file mode 100644 index 000000000..bb1a45486 --- /dev/null +++ b/packages/site/content/runtime/cli-reference/provider/models/index.mdx @@ -0,0 +1,38 @@ +--- +title: "agh provider models" +description: "Inspect and refresh the provider model catalog" +--- + +## agh provider models + +Inspect and refresh the provider model catalog + +### Options + +``` + -h, --help help for models +``` + +### Options inherited from parent commands + +``` + --json Emit JSON output + -o, --output string Output format: human, json, jsonl, or toon (default "human") +``` + +## Output Formats + +Every AGH command supports `-o, --output`: + +- `human` for interactive terminal use +- `json` for scripts and other 
machine-readable consumers +- `jsonl` for wait or streaming commands that emit one JSON record per line +- `toon` for compact agent-readable summaries + +## Subcommands + +| Command | Description | +| ----------------------------------------------------------------------------- | ----------------------------------------- | +| [agh provider models list](/runtime/cli-reference/provider/models/list) | List provider model catalog entries | +| [agh provider models refresh](/runtime/cli-reference/provider/models/refresh) | Refresh provider model catalog sources | +| [agh provider models status](/runtime/cli-reference/provider/models/status) | Show provider model catalog source status | diff --git a/packages/site/content/runtime/cli-reference/provider/models/list.mdx b/packages/site/content/runtime/cli-reference/provider/models/list.mdx new file mode 100644 index 000000000..c780fd34a --- /dev/null +++ b/packages/site/content/runtime/cli-reference/provider/models/list.mdx @@ -0,0 +1,43 @@ +--- +title: "agh provider models list" +description: "List provider model catalog entries" +--- + +## agh provider models list + +List provider model catalog entries + +``` +agh provider models list [provider] [flags] +``` + +### Options + +``` + -h, --help help for list + --include-stale Include stale source rows + --refresh Refresh sources before listing models + --source string Filter by catalog source id +``` + +### Options inherited from parent commands + +``` + --json Emit JSON output + -o, --output string Output format: human, json, jsonl, or toon (default "human") +``` + +## Output Formats + +Every AGH command supports `-o, --output`: + +- `human` for interactive terminal use +- `json` for scripts and other machine-readable consumers +- `jsonl` for wait or streaming commands that emit one JSON record per line +- `toon` for compact agent-readable summaries + +Example: + +```bash +agh provider models list [provider] -o json +``` diff --git 
a/packages/site/content/runtime/cli-reference/provider/models/meta.json b/packages/site/content/runtime/cli-reference/provider/models/meta.json new file mode 100644 index 000000000..fbf31434a --- /dev/null +++ b/packages/site/content/runtime/cli-reference/provider/models/meta.json @@ -0,0 +1,4 @@ +{ + "title": "Models", + "pages": ["index", "list", "refresh", "status"] +} diff --git a/packages/site/content/runtime/cli-reference/provider/models/refresh.mdx b/packages/site/content/runtime/cli-reference/provider/models/refresh.mdx new file mode 100644 index 000000000..062495d34 --- /dev/null +++ b/packages/site/content/runtime/cli-reference/provider/models/refresh.mdx @@ -0,0 +1,43 @@ +--- +title: "agh provider models refresh" +description: "Refresh provider model catalog sources" +--- + +## agh provider models refresh + +Refresh provider model catalog sources + +``` +agh provider models refresh [provider] [flags] +``` + +### Options + +``` + --force Force refresh even when cached status is fresh + -h, --help help for refresh + --request-id string Refresh request id for daemon logs + --source string Refresh only one catalog source id +``` + +### Options inherited from parent commands + +``` + --json Emit JSON output + -o, --output string Output format: human, json, jsonl, or toon (default "human") +``` + +## Output Formats + +Every AGH command supports `-o, --output`: + +- `human` for interactive terminal use +- `json` for scripts and other machine-readable consumers +- `jsonl` for wait or streaming commands that emit one JSON record per line +- `toon` for compact agent-readable summaries + +Example: + +```bash +agh provider models refresh [provider] -o json +``` diff --git a/packages/site/content/runtime/cli-reference/provider/models/status.mdx b/packages/site/content/runtime/cli-reference/provider/models/status.mdx new file mode 100644 index 000000000..7449def9e --- /dev/null +++ b/packages/site/content/runtime/cli-reference/provider/models/status.mdx @@ -0,0 +1,40 
@@ +--- +title: "agh provider models status" +description: "Show provider model catalog source status" +--- + +## agh provider models status + +Show provider model catalog source status + +``` +agh provider models status [provider] [flags] +``` + +### Options + +``` + -h, --help help for status +``` + +### Options inherited from parent commands + +``` + --json Emit JSON output + -o, --output string Output format: human, json, jsonl, or toon (default "human") +``` + +## Output Formats + +Every AGH command supports `-o, --output`: + +- `human` for interactive terminal use +- `json` for scripts and other machine-readable consumers +- `jsonl` for wait or streaming commands that emit one JSON record per line +- `toon` for compact agent-readable summaries + +Example: + +```bash +agh provider models status [provider] -o json +``` diff --git a/packages/site/content/runtime/cli-reference/session/new.mdx b/packages/site/content/runtime/cli-reference/session/new.mdx index 741e23fdf..29eb5aff9 100644 --- a/packages/site/content/runtime/cli-reference/session/new.mdx +++ b/packages/site/content/runtime/cli-reference/session/new.mdx @@ -20,6 +20,9 @@ agh session new [flags] # Start a named session for a specific registered workspace and agent agh session new --workspace checkout-api --agent reviewer --name review-api + # Override provider, model, and reasoning effort for this session only + agh session new --provider codex --model gpt-5.4 --reasoning-effort high + # Auto-register an absolute workspace path before creating the session agh session new --cwd "$PWD" --agent reviewer ``` @@ -27,13 +30,15 @@ agh session new [flags] ### Options ``` - --agent string Agent definition name (defaults to config default) - --channel string Optional network channel opt-in for the session - --cwd string Absolute workspace directory to auto-register - -h, --help help for new - --name string Optional session label - --provider string Optional provider override for this session - --workspace 
string Registered workspace name or ID + --agent string Agent definition name (defaults to config default) + --channel string Optional network channel opt-in for the session + --cwd string Absolute workspace directory to auto-register + -h, --help help for new + --model string Optional model override for this session + --name string Optional session label + --provider string Optional provider override for this session + --reasoning-effort string Optional reasoning effort hint (minimal|low|medium|high|xhigh) for providers that support it + --workspace string Registered workspace name or ID ``` ### Options inherited from parent commands diff --git a/packages/site/content/runtime/core/agents/definitions.mdx b/packages/site/content/runtime/core/agents/definitions.mdx index edc2e08ff..086a1a0a1 100644 --- a/packages/site/content/runtime/core/agents/definitions.mdx +++ b/packages/site/content/runtime/core/agents/definitions.mdx @@ -73,7 +73,7 @@ Session creation first resolves the agent name, then resolves the provider and r | Agent name | explicit `--agent` or API `agent_name` -> `defaults.agent` | `agent name is required`; run `agh install` or set `defaults.agent` | | Provider | `agent.provider` -> `defaults.provider` | `agent provider is required`; run `agh install` or set `agent.provider`/`defaults.provider` | | Command | `agent.command` -> provider `command` | `provider "" command is required` | -| Model | `agent.model` -> provider `default_model` | Empty is allowed when the provider has no default. | +| Model | `agent.model` -> provider `models.default` | Empty is allowed when the provider has no default. | | Tools | `agent.tools` | Must be exact canonical ToolIDs or namespace-prefix wildcards. | | Toolsets | `agent.toolsets` | Must be canonical ToolsetIDs. | | Deny tools | `agent.deny_tools` | Same grammar as `tools`; denies narrow later policy evaluation. 
| @@ -100,7 +100,7 @@ These are the frontmatter fields accepted by the current `internal/config` parse | `name` | string | yes | none | Must be non-empty. `LoadAgentDef(name)` also requires the parsed name to match the requested directory/name. No lowercase or hyphen pattern is enforced today. | | `provider` | string | no | `defaults.provider` | Provider ID such as `claude`, `codex`, or a custom configured provider. Required at resolution time unless `defaults.provider` is set. | | `command` | string | no | provider `command` | Overrides the provider launch command for this agent. Parsed with shell-style quoting, but launched without a shell. | -| `model` | string | no | provider `default_model` | Stored as the resolved model metadata. The current ACP `session/new` and `session/load` payloads do not send this field. | +| `model` | string | no | provider `models.default` | Stored as the resolved model metadata. The current ACP `session/new` and `session/load` payloads do not send this field. | | `tools` | string array | no | empty | Additional exact canonical ToolIDs such as `agh__skill_view`, or namespace-prefix wildcards such as `agh__skill_*` and `mcp__github__*`. Default discovery is runtime-applied. | | `toolsets` | string array | no | empty | Additional canonical ToolsetIDs such as `agh__tasks` or `linear__read`. Runtime discovery adds `agh__bootstrap` and `agh__catalog` unless denied. | | `deny_tools` | string array | no | empty | Same grammar as `tools`. Denies can overlap allows and are interpreted as a narrowing layer by registry policy. 
| diff --git a/packages/site/content/runtime/core/agents/meta.json b/packages/site/content/runtime/core/agents/meta.json index 1179409c0..44b923458 100644 --- a/packages/site/content/runtime/core/agents/meta.json +++ b/packages/site/content/runtime/core/agents/meta.json @@ -1,5 +1,13 @@ { "title": "Agents", "icon": "FileText", - "pages": ["definitions", "capabilities", "soul", "heartbeat", "providers", "spawning"] + "pages": [ + "definitions", + "capabilities", + "soul", + "heartbeat", + "providers", + "model-catalog", + "spawning" + ] } diff --git a/packages/site/content/runtime/core/agents/model-catalog.mdx b/packages/site/content/runtime/core/agents/model-catalog.mdx new file mode 100644 index 000000000..da07a397a --- /dev/null +++ b/packages/site/content/runtime/core/agents/model-catalog.mdx @@ -0,0 +1,295 @@ +--- +title: Provider Model Catalog +description: Daemon-owned model catalog — sources, refresh lifecycle, native HTTP/UDS endpoints, OpenAI-compatible projection, and extension model.source contract. +--- + +The model catalog is the daemon-owned authority for **pre-session** provider model selection. The +new-session dialog, CLI, HTTP, UDS, Host API, web settings, and the OpenAI-compatible model list +all read the same projected rows. Active session controls keep flowing through ACP `configOptions` +once a session is running. + +The catalog never blocks session creation. Every refresh is detached from the request lifetime, +runs under an explicit deadline, falls back to stale rows when refresh fails, and exposes source +health through structured status. + +## Why a separate catalog + +ACP `models.availableModels` is observed only after `session/new` or `session/load`, which is too +late for the new-session dialog and for agents picking a model through CLI/HTTP/UDS. The catalog +splits two concepts that used to live together: + +- Pre-session catalog: which provider models AGH knows about before creating a session. 
+- Active session config: which controls the running ACP session exposes right now. + +The catalog is daemon-owned, persisted, refreshable, extensible, and agent-manageable. Active +session `configOptions` continue to govern the live session — catalog rows never override them. + +## Source priorities and merge + +The merge key is `(provider_id, model_id)`. Source rows are preserved separately and merged on read. + +| Source kind | Priority | Origin | +| --------------- | -------------- | ------------------------------------------------------------------ | +| `config` | 120 | `[providers..models]` operator config. | +| `provider_live` | 110 | Live discovery sources for the provider account/runtime. | +| `extension` | 100 | Extension model sources (capability `model.source`). | +| `models_dev` | 50 | Cross-provider enrichment from `models.dev` (with stale fallback). | +| `builtin` | 10 | Built-in defaults shipped with the daemon. | +| `acp_session` | session-scoped | Observed during an ACP session; never rewrites global authority. | + +Higher-priority non-empty fields win; lower-priority sources fill missing fields. Ties resolve by +fresher `refreshed_at`, then ascending `source_id`. `models.dev` and `builtin` rows can enrich +metadata but never prove account-level availability. + +## Merged availability + +The merged projection exposes both nullable `available` and string `availability_state` so stale +live truth is visible instead of collapsed: + +| `availability_state` | Meaning | API `available` | API `stale` | +| -------------------- | ------------------------------------------------------------------ | --------------- | ----------- | +| `available_live` | Live or extension row confirmed availability with fresh data. | `true` | `false` | +| `available_stale` | Live or extension row confirmed availability but the row is stale. | `true` | `true` | +| `unavailable_live` | Live or extension row denied availability with fresh data. 
| `false` | `false` | +| `unavailable_stale` | Live or extension row denied availability but the row is stale. | `false` | `true` | +| `unknown` | Only catalog/builtin/config metadata is known. | `null` | depends | + +Manual model entry remains valid even when no source advertises the model. `models.curated` is +metadata, never an allowlist. + +## Native HTTP and UDS endpoints + +| Method | Path | Transports | Description | +| ------ | --------------------------------------------- | ---------- | -------------------------------------------------------- | +| GET | `/api/providers/models` | HTTP, UDS | List merged provider model catalog entries. | +| GET | `/api/providers/{provider_id}/models` | HTTP, UDS | List merged catalog entries for one provider. | +| POST | `/api/providers/models/refresh` | HTTP, UDS | Refresh sources across providers; returns source status. | +| POST | `/api/providers/{provider_id}/models/refresh` | HTTP, UDS | Refresh sources for one provider. | +| GET | `/api/providers/models/status` | HTTP, UDS | Source status across providers. | +| GET | `/api/providers/{provider_id}/models/status` | HTTP, UDS | Source status for one provider. | + +List and status endpoints accept these query parameters: + +- `provider_id`: filter by AGH provider id (only on the cross-provider list). +- `source_id`: filter by catalog source id (`config`, `models_dev`, `provider_live:`, `extension:`). +- `refresh=true`: refresh sources before listing. +- `include_stale=true`: include stale source rows in the merged projection. + +Refresh requests accept an optional JSON body: + +```json +{ + "source_id": "provider_live:codex", + "force": true, + "request_id": "rfsh-2026-05-07-abc" +} +``` + +`request_id` (or a daemon-generated value) is the `refresh_request_id` correlation key surfaced in +logs and source status events. 
Refresh work runs under daemon-owned lifetime: the request's cancel +does not cancel refresh work, and the daemon joins outstanding refresh workers during shutdown. + +The native list response shape is: + +```json +{ + "models": [ + { + "provider_id": "codex", + "model_id": "gpt-5.4", + "display_name": "GPT-5.4", + "sources": [ + { + "source_id": "config", + "source_kind": "config", + "priority": 120, + "stale": false, + "refreshed_at": "2026-05-07T18:32:11Z" + }, + { + "source_id": "models_dev", + "source_kind": "models_dev", + "priority": 50, + "stale": false, + "refreshed_at": "2026-05-07T03:00:00Z" + } + ], + "available": true, + "availability_state": "available_live", + "stale": false, + "refreshed_at": "2026-05-07T18:32:11Z", + "context_window": 256000, + "max_output_tokens": 32000, + "supports_tools": true, + "supports_reasoning": true, + "reasoning_efforts": ["minimal", "low", "medium", "high", "xhigh"], + "default_reasoning_effort": "medium" + } + ] +} +``` + +Source status payloads carry `source_id`, `provider_id`, `source_kind`, `refresh_state` +(`idle | refreshing | succeeded | failed`), `last_refresh`, `next_refresh`, `last_success`, +`row_count`, `stale`, and a redacted `last_error`. Raw secrets, command lines, OAuth material, or +provider response bodies never appear in `last_error`. + +The HTTP and UDS transports return canonical, byte-equal JSON for the same projection so cross- +transport regression tests can compare daemon output directly. + +## OpenAI-compatible projection + +A list-only OpenAI-compatible endpoint is registered on HTTP only. UDS does not expose this route. + +```http +GET /api/openai/v1/models +GET /api/openai/v1/models?provider_id=codex +``` + +Authentication uses the same bearer-auth and middleware contract as the rest of `/api/*`. CORS and +rate-limit behavior follow HTTP defaults. Errors are wrapped in the OpenAI envelope shape +(`{"error": {...}}`) but reuse AGH's normal status-code semantics. 
Refresh work is **not** +available through this endpoint; clients use the native catalog endpoints, the CLI, the Host API, +or the web for refreshes. + +```json +{ + "object": "list", + "data": [ + { + "id": "gpt-5.4", + "object": "model", + "created": 0, + "owned_by": "codex", + "agh": { + "provider_id": "codex", + "model_id": "gpt-5.4", + "display_name": "GPT-5.4", + "sources": ["config", "models_dev"], + "available": true, + "availability_state": "available_live", + "stale": false, + "supports_tools": true, + "supports_reasoning": true, + "reasoning_efforts": ["minimal", "low", "medium", "high", "xhigh"], + "context_window": 256000, + "max_output_tokens": 32000 + } + } + ] +} +``` + +The `agh` extension key carries AGH-specific metadata. Generated OpenAPI/SDK contracts treat it as +a typed object (`OpenAIModelAGHPayload`), not a free-form blob. + +## Provider models CLI + +The CLI surface lives under the singular `provider` namespace because the catalog is provider- +scoped and already neighbors `agh provider auth`. A top-level `agh models …` alias is intentionally +out of scope for the MVP to avoid forking the command contract before the first one is stable. + +```bash +agh provider models list [provider] -o json +agh provider models list [provider] --source models_dev --refresh --include-stale +agh provider models refresh [provider] -o json +agh provider models refresh [provider] --source provider_live:codex --force --request-id rfsh-abc +agh provider models status [provider] -o json +``` + +`refresh` returns the same source-status payloads as the native HTTP/UDS refresh endpoints — not a +single success line — so CI scripts and agents can act on partial-source failures without parsing +stderr. JSON output is canonical: identical between `agh provider models …` and the daemon HTTP +response for the same projection. + +See [`agh provider models`](/runtime/cli-reference/provider/models) for full flag and output +documentation generated from the cobra source. 
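As a hedged illustration of acting on partial-source failures, the sketch below decodes refresh output and keeps only the failed sources. It assumes the output is a plain JSON array of source status objects carrying the fields named in the status payload description; the real envelope may wrap them differently. `SourceStatus` and `FailedSources` are hypothetical names, not part of AGH.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SourceStatus mirrors a subset of the documented source status fields.
// Assumption: the refresh output is a JSON array of these objects.
type SourceStatus struct {
	SourceID     string `json:"source_id"`
	ProviderID   string `json:"provider_id"`
	SourceKind   string `json:"source_kind"`
	RefreshState string `json:"refresh_state"`
	RowCount     int    `json:"row_count"`
	Stale        bool   `json:"stale"`
	LastError    string `json:"last_error"`
}

// FailedSources returns every status whose refresh_state is "failed".
func FailedSources(raw []byte) ([]SourceStatus, error) {
	var all []SourceStatus
	if err := json.Unmarshal(raw, &all); err != nil {
		return nil, fmt.Errorf("decode source statuses: %w", err)
	}
	var failed []SourceStatus
	for _, s := range all {
		if s.RefreshState == "failed" {
			failed = append(failed, s)
		}
	}
	return failed, nil
}

func main() {
	// Hypothetical sample payload; values are illustrative only.
	sample := []byte(`[
		{"source_id": "config", "provider_id": "codex", "source_kind": "config", "refresh_state": "succeeded", "row_count": 2},
		{"source_id": "models_dev", "provider_id": "codex", "source_kind": "models_dev", "refresh_state": "failed", "stale": true, "last_error": "timeout"}
	]`)
	failed, err := FailedSources(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, s := range failed {
		fmt.Printf("source %s for provider %s failed (stale=%t): %s\n", s.SourceID, s.ProviderID, s.Stale, s.LastError)
	}
}
```

A CI step could pipe `agh provider models refresh -o json` into a helper like this and fail the job only when a source it depends on reports `failed`.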
+ +## Extension model.source contract + +Extensions can provide model rows by declaring the manifest provide capability `model.source`. The +daemon validates rows, persists them, and applies the normal merge policy. Extensions cannot own +global catalog state; they only contribute source rows. + +```toml +[capabilities] +provides = ["model.source"] + +[actions] +requires = ["models/list", "models/refresh", "models/status"] + +[security] +capabilities = ["model.read", "model.write"] + +[subprocess] +command = "node" +args = ["dist/index.js"] +``` + +Extensions that provide `model.source` must implement the AGH-to-extension service method +`models/list`. The daemon calls it with a deadline-bound context; the extension returns rows scoped +to provider IDs the extension declares. + +| Method | Direction | Purpose | +| ---------------- | --------------- | -------------------------------------------------------------------------- | +| `models/list` | AGH → extension | Extension returns provider model rows; daemon validates and persists them. | +| `models/list` | Host API | Extension reads the daemon-owned merged projection. | +| `models/refresh` | Host API | Extension triggers a daemon-owned source refresh. | +| `models/status` | Host API | Extension reads daemon-owned source status. | + +Capability grants follow the same area-based scheme as other Host API methods: + +| Method | Area | Notes | +| ---------------- | ------------- | ----------------------------------------------------------- | +| `models/list` | `model.read` | Returns the daemon-owned merged projection, not raw rows. | +| `models/status` | `model.read` | Returns daemon-owned source status. | +| `models/refresh` | `model.write` | Triggers daemon-owned refresh; rate-limited and serialized. 
| + +Marketplace extensions are limited to read-oriented grants by policy, so a marketplace extension +can declare `model.read` and read the projection but must request `model.write` explicitly to +trigger refresh, and refresh grants stay subject to the marketplace policy review. + +Extension source rows are always validated through `internal/modelcatalog`. Invalid rows produce a +recorded source status (with redacted error) instead of corrupting the merged projection. + +## Refresh lifetime and serialization + +- Catalog list calls return cached rows immediately when present. +- Refresh detaches from request cancellation via `context.WithoutCancel(ctx)` and re-attaches an + explicit deadline through `context.WithDeadline`. +- Refresh work is **serialized** per `provider_id` before any subprocess or provider-home work. +- Concurrent refresh requests for the same provider **coalesce** behind the in-flight refresh and + return identical source statuses when it finishes. +- Refreshes for different providers can run concurrently. +- The daemon joins outstanding refresh workers during shutdown. + +Discovery never creates an ACP session. Live provider sources fail closed by recording source +status; session creation is never blocked on a successful network refresh. Stale rows remain +available as a fallback while the catalog labels them stale. + +## Observability + +Catalog operations emit structured events with the following correlation keys: + +- `refresh_request_id` +- `provider_id` +- `source_id` +- `source_kind` +- `model_id` for row-scoped events +- `extension_name` for extension sources +- `session_id` only for ACP session config observations + +Tracked events include refresh started/succeeded/failed, source row count changes, stale fallback +usage, all-source failure, extension source denied/unavailable, and ACP config option captured/ +updated transitions. 
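The detach-then-deadline rule from the refresh lifetime section can be sketched in Go as follows. This is a minimal illustration of the two standard-library calls named there, not the daemon's actual code; `detachedRefreshContext` and `requestCancelOutcome` are hypothetical helper names, and the timeout value is arbitrary.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// detachedRefreshContext keeps the values carried by the request context,
// drops its cancellation, and bounds the work with an explicit deadline.
func detachedRefreshContext(reqCtx context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
	detached := context.WithoutCancel(reqCtx) // requires Go 1.21+
	return context.WithDeadline(detached, time.Now().Add(timeout))
}

// requestCancelOutcome simulates a client cancelling its request and reports
// whether the refresh context still has a deadline and what error it carries.
func requestCancelOutcome(timeoutMillis int) (hasDeadline bool, refreshErr error) {
	reqCtx, cancelReq := context.WithCancel(context.Background())
	refreshCtx, cancelRefresh := detachedRefreshContext(reqCtx, time.Duration(timeoutMillis)*time.Millisecond)
	defer cancelRefresh()

	cancelReq() // the request goes away; the refresh context is unaffected
	_, hasDeadline = refreshCtx.Deadline()
	return hasDeadline, refreshCtx.Err()
}

func main() {
	hasDeadline, err := requestCancelOutcome(1000)
	fmt.Println("refresh has its own deadline:", hasDeadline)
	fmt.Println("refresh error after request cancel:", err)
}
```

The design point is that the refresh is bounded by the explicit deadline alone: cancelling the request leaves the refresh context alive until its own deadline expires or its cancel function is called.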
+ +## Related pages + +- [Providers](/runtime/core/agents/providers) covers `[providers..models]` and the + per-provider `models.discovery` shape. +- [config.toml](/runtime/core/configuration/config-toml#modelcatalogsourcesmodelsdev) documents + `[model_catalog.sources.models_dev]` defaults. +- [`agh provider models`](/runtime/cli-reference/provider/models) is the CLI generated from the + cobra source. +- [Develop Extensions](/runtime/core/extensions/develop#model-source-extensions) covers the + manifest provide capability `model.source` and Host API model methods. diff --git a/packages/site/content/runtime/core/agents/providers.mdx b/packages/site/content/runtime/core/agents/providers.mdx index ad26fa0ba..b40133e77 100644 --- a/packages/site/content/runtime/core/agents/providers.mdx +++ b/packages/site/content/runtime/core/agents/providers.mdx @@ -29,7 +29,7 @@ provider = "claude" The built-in registry lives in `internal/config/provider.go`. -| Provider ID | Harness | Runtime provider | Command | Default model | Auth mode | Credential target | +| Provider ID | Harness | Runtime provider | Command | `models.default` | Auth mode | Credential target | | ------------------- | -------- | ------------------- | ---------------------------------------------------------------- | --------------------------- | -------------- | -------------------- | | `claude` | `acp` | `claude` | `npx -y @agentclientprotocol/claude-agent-acp@latest` | `claude-sonnet-4-6` | `native_cli` | provider login | | `codex` | `acp` | `codex` | `npx -y @zed-industries/codex-acp@latest` | `gpt-5.4` | `native_cli` | provider login | @@ -94,7 +94,7 @@ Provider overrides and custom providers are configured in `config.toml`. | --------------------- | ------ | ------------------------ | ------------------------------------------------------------------------------------------------------------- | | `command` | string | yes for custom providers | Launch command for the ACP subprocess. 
Overrides a built-in command when set. | | `display_name` | string | no | Operator-facing label shown in settings and provider pickers. | -| `default_model` | string | no | Used when an `AGENT.md` omits `model`. Native `pi` receives it through ACP model selection. | +| `models` | table | no | Nested model config block (`models.default`, `models.curated`, `models.discovery`). See "Provider models". | | `harness` | string | no | `acp` for direct ACP launch or `pi_acp` for providers launched through the Pi ACP adapter. Defaults to `acp`. | | `runtime_provider` | string | no | Downstream provider id used by harnesses such as Pi. Defaults to the AGH provider id. | | `transport` | string | no | Optional Pi model-provider API family hint for custom providers. | @@ -110,6 +110,51 @@ Provider overrides and custom providers are configured in `config.toml`. AGH overlays provider config on top of a built-in provider when the name matches. Unknown provider names are accepted only when they have a `[providers.]` entry. +The flat keys `default_model`, `supported_models`, and `supports_reasoning_effort` are no longer +accepted. Config that still sets them is rejected at load time with a deterministic hard-cut error +that names the exact path. Move every value into the nested `[providers..models]` block below. + +## Provider models + +Each provider declares pre-session model defaults and curated metadata under `[providers..models]`. +Pre-session model selection is served by the daemon-owned model catalog. The catalog merges builtin +defaults, the operator config, the optional `models.dev` enrichment source, live provider discovery +sources, and extension model sources, then projects them through HTTP, UDS, CLI, the OpenAI-compatible +projection, the Host API, and the web. Active ACP `configOptions` continue to govern model and +reasoning controls inside a running session. 
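The merge described above can be sketched roughly as follows, using the rules stated on the catalog page: higher-priority non-empty fields win, lower-priority sources fill gaps, and ties fall back to fresher `refreshed_at`, then ascending `source_id`. This is an assumption-laden illustration only; `SourceRow`, `Merged`, and `mergeRows` are hypothetical names, and only two fields are modeled.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// SourceRow is a hypothetical, trimmed-down source row for one (provider_id, model_id).
type SourceRow struct {
	SourceID      string
	Priority      int
	RefreshedAt   time.Time
	DisplayName   string // empty means the source did not provide it
	ContextWindow int    // zero means the source did not provide it
}

// Merged holds the projected fields after applying the merge rules.
type Merged struct {
	DisplayName   string
	ContextWindow int
}

// mergeRows orders rows by priority (descending), then fresher RefreshedAt,
// then ascending SourceID, and lets the first non-empty value win per field.
func mergeRows(rows []SourceRow) Merged {
	ordered := append([]SourceRow(nil), rows...) // copy so the caller's slice is untouched
	sort.SliceStable(ordered, func(i, j int) bool {
		a, b := ordered[i], ordered[j]
		if a.Priority != b.Priority {
			return a.Priority > b.Priority
		}
		if !a.RefreshedAt.Equal(b.RefreshedAt) {
			return a.RefreshedAt.After(b.RefreshedAt)
		}
		return a.SourceID < b.SourceID
	})

	var out Merged
	for _, r := range ordered {
		if out.DisplayName == "" {
			out.DisplayName = r.DisplayName
		}
		if out.ContextWindow == 0 {
			out.ContextWindow = r.ContextWindow
		}
	}
	return out
}

func main() {
	merged := mergeRows([]SourceRow{
		{SourceID: "models_dev", Priority: 50, DisplayName: "GPT 5.4 (enriched)", ContextWindow: 256000},
		{SourceID: "config", Priority: 120, DisplayName: "GPT-5.4"},
	})
	fmt.Printf("%+v\n", merged)
}
```

In this example the `config` row supplies the display name because it has the higher priority, while the `models_dev` row fills in the context window that `config` left empty.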
+ +| Field | Type | Required | Runtime behavior | +| ------------------------------------------- | ------- | --------------------------- | --------------------------------------------------------------------------------------------------------------------------------- | +| `models.default` | string | no | Default model when an `AGENT.md` omits `model`. Free-form: it does not need to appear in `models.curated`. | +| `models.curated` | array | no | Curated entries shown in pickers and projected as `config` rows in the catalog. Not an allowlist. | +| `models.curated[].id` | string | yes per entry | Provider model identifier sent to the runtime. Must be unique inside the provider. | +| `models.curated[].display_name` | string | no | Optional human label. | +| `models.curated[].context_window` | integer | no | Context window in tokens. | +| `models.curated[].max_input_tokens` | integer | no | Maximum input tokens. | +| `models.curated[].max_output_tokens` | integer | no | Maximum output tokens. | +| `models.curated[].supports_tools` | bool | no | Whether the model supports tool calls. | +| `models.curated[].supports_reasoning` | bool | no | Whether the model supports reasoning effort. | +| `models.curated[].reasoning_efforts` | array | no | Allowed reasoning levels (`minimal`, `low`, `medium`, `high`, `xhigh`). Blank entries are rejected. | +| `models.curated[].default_reasoning_effort` | string | no | Per-model default reasoning level. Must appear in `reasoning_efforts` when both are set. | +| `models.curated[].cost_input_per_million` | number | no | Display-only input cost per million tokens. | +| `models.curated[].cost_output_per_million` | number | no | Display-only output cost per million tokens. | +| `models.discovery.enabled` | bool | no | Enables the side-effect-free discovery adapter for this provider. Defaults to `false` for providers without a built-in safe path. 
| +| `models.discovery.command` | string | required for some providers | Side-effect-free discovery command (mutually exclusive with `endpoint` unless the adapter documents both). | +| `models.discovery.endpoint` | string | required for some providers | Side-effect-free discovery endpoint URL. | +| `models.discovery.timeout` | string | no | Per-discovery timeout duration (defaults to the model catalog timeout). | + +Discovery uses the resolved provider auth, env, and home policy and never creates an ACP session. +When a discovery path is unavailable or fails, the model catalog records a source status and falls +back to stale or lower-priority rows; session creation never depends on a successful discovery. + +Session creation can override `provider`, `model`, and `reasoning_effort` for one launch. `model` is +free-form: `models.curated` is operator metadata, not an allowlist, so a manual model ID outside the +curated list is accepted. Reasoning effort is validated against `minimal`, `low`, `medium`, `high`, +and `xhigh`, and AGH only forwards it when the resolved curated entry advertises it through +`supports_reasoning` and `reasoning_efforts`. After the session starts, AGH switches to active ACP +`configOptions` (or legacy `session/set_model` when the agent does not advertise config options) and +the catalog metadata stops governing the live session. + Native provider auth state belongs to the provider. Run the provider's own login command, such as `claude auth login`, `codex login`, `opencode auth login`, or Pi's `/login`, outside AGH or through a configured `auth_login_command`. The built-in `pi` provider exposes @@ -124,11 +169,21 @@ agents that omit `model`. 
```toml [providers.claude] -default_model = "claude-sonnet-4-6" auth_mode = "native_cli" auth_status_command = "claude auth status" auth_login_command = "claude auth login" +[providers.claude.models] +default = "claude-sonnet-4-6" + +[[providers.claude.models.curated]] +id = "claude-sonnet-4-6" +display_name = "Claude Sonnet 4.6" + +[[providers.claude.models.curated]] +id = "claude-haiku-4-5" +display_name = "Claude Haiku 4.5" + [[providers.claude.mcp_servers]] name = "github" command = "npx" @@ -153,7 +208,13 @@ harness = "pi_acp" auth_mode = "bound_secret" runtime_provider = "openrouter" command = "npx -y pi-acp@latest" -default_model = "openai/gpt-5.4" + +[providers.openrouter.models] +default = "openai/gpt-5.4" + +[[providers.openrouter.models.curated]] +id = "openai/gpt-5.4" +display_name = "OpenAI GPT-5.4" [[providers.openrouter.credential_slots]] name = "api_key" @@ -188,11 +249,17 @@ built-ins. ```toml [providers.local-agent] command = "local-agent --acp --stdio" -default_model = "local-default" auth_mode = "native_cli" auth_status_command = "local-agent auth status" auth_login_command = "local-agent auth login" +[providers.local-agent.models] +default = "local-default" + +[[providers.local-agent.models.curated]] +id = "local-default" +display_name = "Local Default" + [[providers.local-agent.mcp_servers]] name = "filesystem-index" command = "local-index-mcp" @@ -254,7 +321,7 @@ Set environment variables before starting the daemon instead. ## Models and authentication -`model` and `default_model` are resolved and exposed as runtime metadata. Direct `acp` providers +`model` and `models.default` are resolved and exposed as runtime metadata. Direct `acp` providers receive the normal ACP startup flow. Native `pi` sessions receive the resolved `runtime_provider/model` through ACP model selection. 
Wrapped Pi-backed API-key providers receive session-local Pi `settings.json` and `models.json` so Pi can run with the AGH-selected provider, @@ -327,4 +394,6 @@ agent's role, permissions, MCP servers, and startup prompt in `AGENT.md`. - [Agent Definitions](/runtime/core/agents/definitions) explains the `AGENT.md` fields that reference providers. - [Spawning](/runtime/core/agents/spawning) shows exactly how the resolved provider command becomes a running ACP process. +- [Provider Model Catalog](/runtime/core/agents/model-catalog) covers the daemon-owned catalog, sources, refresh lifecycle, OpenAI-compatible projection, and extension `model.source`. - [CLI agent reference](/runtime/cli-reference/agent) lists the current `agh agent` inspection commands. +- [`agh provider models` CLI](/runtime/cli-reference/provider/models) inspects and refreshes the catalog without a UI. diff --git a/packages/site/content/runtime/core/configuration/agent-md.mdx b/packages/site/content/runtime/core/configuration/agent-md.mdx index bbb0345d6..497a9d6dd 100644 --- a/packages/site/content/runtime/core/configuration/agent-md.mdx +++ b/packages/site/content/runtime/core/configuration/agent-md.mdx @@ -27,7 +27,7 @@ The parser is strict. Unknown frontmatter fields fail loading. | `name` | string | required | Non-empty. Must match the requested agent name when loaded by name. | Agent identity and discovery key. | | `provider` | string | `[defaults].provider` | Built-in provider key or custom provider key. | Provider used to resolve command, model, auth mode, and runtime metadata. | | `command` | string | Provider `command` | Non-empty when overriding. | Agent-specific ACP launch command. | -| `model` | string | Provider `default_model` | Any string. | Agent-specific model metadata. | +| `model` | string | Provider `models.default` | Any string. | Agent-specific model metadata. | | `tools` | string array | empty | Exact canonical ToolIDs or namespace-prefix wildcards. 
| Additional agent tool allowlist grammar. | | `toolsets` | string array | empty | Canonical ToolsetIDs. | Additional named tool bundles allowed for the agent. | | `deny_tools` | string array | empty | Exact canonical ToolIDs or namespace-prefix wildcards. | Tool denies that always narrow the agent grants. | @@ -80,7 +80,7 @@ name: reviewer # Optional if [defaults].provider is set in config.toml. provider: claude -# Optional. Defaults to the provider default_model. +# Optional. Defaults to the provider models.default. model: claude-sonnet-4-6 # Optional. Add only extra ToolIDs beyond default discovery. @@ -163,7 +163,7 @@ Put blocking findings first and cite the relevant file or symbol. | Attribute | Value | | ------------ | ------------------------------------------------------------ | | Type | string | -| Default | Selected provider `default_model` | +| Default | Selected provider `models.default` | | Required | no | | Valid values | Any string. Empty is allowed if the provider has no default. | | Description | Agent-specific model metadata. | @@ -453,7 +453,7 @@ Sidecar behavior: | Agent name | explicit CLI/API agent name, then `[defaults].agent` | Fails if empty. | | Provider | `AGENT.md` `provider`, then `[defaults].provider` | Fails if still empty. | | Command | `AGENT.md` `command`, then provider `command` | Fails if empty after provider resolution. | -| Model | `AGENT.md` `model`, then provider `default_model` | Empty is allowed. | +| Model | `AGENT.md` `model`, then provider `models.default` | Empty is allowed. | | Tools | `AGENT.md` `tools` | Must be exact canonical ToolIDs or approved namespace-prefix wildcards. | | Toolsets | `AGENT.md` `toolsets` | Must be canonical ToolsetIDs. | | Deny tools | `AGENT.md` `deny_tools` | Same grammar as `tools`; denies narrow later policy evaluation. 
| diff --git a/packages/site/content/runtime/core/configuration/config-toml.mdx b/packages/site/content/runtime/core/configuration/config-toml.mdx index c3884e6e7..60ac2e156 100644 --- a/packages/site/content/runtime/core/configuration/config-toml.mdx +++ b/packages/site/content/runtime/core/configuration/config-toml.mdx @@ -24,51 +24,52 @@ Use only `[sandboxes.]` for session execution boundaries. ## Quick Reference -| Section | Purpose | Default | -| ------------------------------ | ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | -| `[daemon]` | Unix domain socket path for CLI and UDS API traffic. | `socket = "$AGH_HOME/daemon.sock"` | -| `[http]` | HTTP and SSE bind address. | `host = "localhost"`, `port = 2123` | -| `[defaults]` | Default agent, provider, and sandbox resolution. | `agent = "general"`, `provider = ""`, `sandbox = ""` | -| `[limits]` | Daemon-level session and agent caps. | `max_sessions = 10`, `max_concurrent_agents = 20` | -| `[session.limits]` | Session-scoped wall-clock timeout. | `timeout = "0s"` | -| `[session.supervision]` | Runtime activity heartbeat, progress, warning, and inactivity timeout controls. | heartbeat 30 seconds, progress 10 minutes, warning 15 minutes, timeout 30 minutes | -| `[agents.soul]` | Optional `SOUL.md` parsing, body limits, and compact projection budget. | enabled, 32 KiB body, 2 KiB compact projection | -| `[agents.heartbeat]` | Optional `HEARTBEAT.md` policy bounds, wake cadence/limits, and health timing. | enabled, 32 KiB body, 5 min/30 min intervals, 25 wakes per cycle, 168 h retention | -| `[permissions]` | Default permission mode. | `mode = "approve-all"` | -| `[tools]` | Tool registry lifecycle, hosted MCP enablement, and result budget defaults. | enabled, hosted MCP enabled, 256 KiB result default | -| `[tools.hosted_mcp]` | Hosted MCP session bind nonce lifecycle. 
| 30 seconds | -| `[tools.policy]` | External tool source defaults, approval timeout, and trusted sources. | external tools disabled, 120 second approval timeout, no trusted sources | -| `[[mcp_servers]]` | Top-level MCP servers passed to agents. | empty list | -| `[providers.]` | Built-in provider override or custom provider definition. | empty map plus built-ins | -| `[sandboxes.]` | Local or provider-backed execution sandbox profiles. | local backend when no profile is selected | -| `[observability]` | Event summary retention and global byte cap. | enabled, 7 days, 1 GiB | -| `[observability.transcripts]` | Transcript segment sizing and per-session cap. | enabled, 1 MiB segments, 256 MiB per session | -| `[log]` | Structured log level. | `level = "info"` | -| `[memory]` | Persistent memory runtime and global memory directory. | enabled, `$AGH_HOME/memory` | -| `[memory.controller]` | Hybrid write controller mode, latency, and fallback op. | hybrid, 300 ms, noop | -| `[memory.controller.llm]` | Controller LLM tiebreaker. | enabled, `anthropic/claude-haiku-4`, 250 ms, top_k 5 | -| `[memory.controller.policy]` | Content/rate caps and allowed write origins. | 4096 chars, 60 writes/min, all canonical origins | -| `[memory.recall]` | Deterministic recall: top-K, weights, freshness, signal queue. | top-K 5, raw 50, weighted fusion | -| `[memory.decisions]` | Decision WAL retention and per-row body cap. | 90 days, audit summary on, 64 KiB body cap | -| `[memory.extractor]` | Post-message extractor and bounded queue. | enabled, post_message mode, capacity 1, coalesce 16 | -| `[memory.dream]` | Dreaming runtime, gates, and scoring. | enabled, agent `dreaming-curator`, 24 h, 3 sessions, 30 min ticker | -| `[memory.session]` | Forensic session ledger materialization, archive, and unbound partition. | jsonl, `$AGH_HOME/sessions`, 24 h grace, 30-day cold archive, `_unbound` partition | -| `[memory.daily]` | Daily-log retention and rotation. 
| 1 MiB, 5000 lines, 7-day window, 30-day cold archive, sweep at 03:00 | -| `[memory.file]` | Curated memory file body limits. | 200 lines, 25 KiB | -| `[memory.provider]` | Active memory provider selection and circuit breaker. | bundled local, 2 s timeout, 5 failures, 30 s cooldown | -| `[memory.workspace]` | Workspace identity file location and auto-creation. | `/.agh/workspace.toml`, auto-create on first touch | -| `[skills]` | Skill discovery, polling, disable list, and marketplace trust gates. | enabled, poll every 3 seconds | -| `[skills.marketplace]` | Skill registry override. | unset | -| `[extensions.marketplace]` | Extension registry override. | unset | -| `[automation]` | Automation scheduler defaults. | enabled, UTC, 5 concurrent jobs | -| `[[automation.jobs]]` | Scheduled automation jobs. | empty list | -| `[[automation.triggers]]` | Event-driven automation triggers. | empty list | -| `[autonomy.coordinator]` | Coordinator session bootstrap for workspace-scoped task runs. | disabled, agent `coordinator`, TTL 2 hours, 5 children, 1 active per workspace | -| `[task.orchestration]` | Bounds for run summaries, context bundles, scheduler health, and max-runtime. | 4 KiB summaries, 8 KiB context, prior 5/recent 50 events, spawn fail limit 5 | -| `[task.orchestration.profile]` | Defaults and gates for task execution profiles. | inherit coordinator/worker/sandbox; provider override + sandbox `none` allowed | -| `[task.orchestration.review]` | Defaults and bounds for the post-terminal review gate. | policy `none`, max rounds 3, max attempts 2, timeout 20m, failure `block_task` | -| `[[hooks.declarations]]` | Config-defined runtime hooks. | empty list | -| `[network]` | Experimental AGH network runtime. 
| enabled, channel `default` | +| Section | Purpose | Default | +| ------------------------------------ | ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | +| `[daemon]` | Unix domain socket path for CLI and UDS API traffic. | `socket = "$AGH_HOME/daemon.sock"` | +| `[http]` | HTTP and SSE bind address. | `host = "localhost"`, `port = 2123` | +| `[defaults]` | Default agent, provider, and sandbox resolution. | `agent = "general"`, `provider = ""`, `sandbox = ""` | +| `[limits]` | Daemon-level session and agent caps. | `max_sessions = 10`, `max_concurrent_agents = 20` | +| `[session.limits]` | Session-scoped wall-clock timeout. | `timeout = "0s"` | +| `[session.supervision]` | Runtime activity heartbeat, progress, warning, and inactivity timeout controls. | heartbeat 30 seconds, progress 10 minutes, warning 15 minutes, timeout 30 minutes | +| `[agents.soul]` | Optional `SOUL.md` parsing, body limits, and compact projection budget. | enabled, 32 KiB body, 2 KiB compact projection | +| `[agents.heartbeat]` | Optional `HEARTBEAT.md` policy bounds, wake cadence/limits, and health timing. | enabled, 32 KiB body, 5 min/30 min intervals, 25 wakes per cycle, 168 h retention | +| `[permissions]` | Default permission mode. | `mode = "approve-all"` | +| `[tools]` | Tool registry lifecycle, hosted MCP enablement, and result budget defaults. | enabled, hosted MCP enabled, 256 KiB result default | +| `[tools.hosted_mcp]` | Hosted MCP session bind nonce lifecycle. | 30 seconds | +| `[tools.policy]` | External tool source defaults, approval timeout, and trusted sources. | external tools disabled, 120 second approval timeout, no trusted sources | +| `[[mcp_servers]]` | Top-level MCP servers passed to agents. | empty list | +| `[providers.]` | Built-in provider override or custom provider definition. 
| empty map plus built-ins | +| `[model_catalog.sources.models_dev]` | `models.dev` enrichment source (cross-provider). | enabled, `https://models.dev/api.json`, 24 h TTL, 10 s timeout | +| `[sandboxes.]` | Local or provider-backed execution sandbox profiles. | local backend when no profile is selected | +| `[observability]` | Event summary retention and global byte cap. | enabled, 7 days, 1 GiB | +| `[observability.transcripts]` | Transcript segment sizing and per-session cap. | enabled, 1 MiB segments, 256 MiB per session | +| `[log]` | Structured log level. | `level = "info"` | +| `[memory]` | Persistent memory runtime and global memory directory. | enabled, `$AGH_HOME/memory` | +| `[memory.controller]` | Hybrid write controller mode, latency, and fallback op. | hybrid, 300 ms, noop | +| `[memory.controller.llm]` | Controller LLM tiebreaker. | enabled, `anthropic/claude-haiku-4`, 250 ms, top_k 5 | +| `[memory.controller.policy]` | Content/rate caps and allowed write origins. | 4096 chars, 60 writes/min, all canonical origins | +| `[memory.recall]` | Deterministic recall: top-K, weights, freshness, signal queue. | top-K 5, raw 50, weighted fusion | +| `[memory.decisions]` | Decision WAL retention and per-row body cap. | 90 days, audit summary on, 64 KiB body cap | +| `[memory.extractor]` | Post-message extractor and bounded queue. | enabled, post_message mode, capacity 1, coalesce 16 | +| `[memory.dream]` | Dreaming runtime, gates, and scoring. | enabled, agent `dreaming-curator`, 24 h, 3 sessions, 30 min ticker | +| `[memory.session]` | Forensic session ledger materialization, archive, and unbound partition. | jsonl, `$AGH_HOME/sessions`, 24 h grace, 30-day cold archive, `_unbound` partition | +| `[memory.daily]` | Daily-log retention and rotation. | 1 MiB, 5000 lines, 7-day window, 30-day cold archive, sweep at 03:00 | +| `[memory.file]` | Curated memory file body limits. 
| 200 lines, 25 KiB | +| `[memory.provider]` | Active memory provider selection and circuit breaker. | bundled local, 2 s timeout, 5 failures, 30 s cooldown | +| `[memory.workspace]` | Workspace identity file location and auto-creation. | `/.agh/workspace.toml`, auto-create on first touch | +| `[skills]` | Skill discovery, polling, disable list, and marketplace trust gates. | enabled, poll every 3 seconds | +| `[skills.marketplace]` | Skill registry override. | unset | +| `[extensions.marketplace]` | Extension registry override. | unset | +| `[automation]` | Automation scheduler defaults. | enabled, UTC, 5 concurrent jobs | +| `[[automation.jobs]]` | Scheduled automation jobs. | empty list | +| `[[automation.triggers]]` | Event-driven automation triggers. | empty list | +| `[autonomy.coordinator]` | Coordinator session bootstrap for workspace-scoped task runs. | disabled, agent `coordinator`, TTL 2 hours, 5 children, 1 active per workspace | +| `[task.orchestration]` | Bounds for run summaries, context bundles, scheduler health, and max-runtime. | 4 KiB summaries, 8 KiB context, prior 5/recent 50 events, spawn fail limit 5 | +| `[task.orchestration.profile]` | Defaults and gates for task execution profiles. | inherit coordinator/worker/sandbox; provider override + sandbox `none` allowed | +| `[task.orchestration.review]` | Defaults and bounds for the post-terminal review gate. | policy `none`, max rounds 3, max attempts 2, timeout 20m, failure `block_task` | +| `[[hooks.declarations]]` | Config-defined runtime hooks. | empty list | +| `[network]` | Experimental AGH network runtime. | enabled, channel `default` | ## Load And Merge Order @@ -208,13 +209,31 @@ client_secret_ref = "env:REMOTE_DOCS_MCP_CLIENT_SECRET" scopes = ["mcp.read", "mcp.write"] [providers.claude] -# Overrides the built-in Claude provider command/model and records native auth diagnostics. +# Overrides the built-in Claude provider command and records native auth diagnostics. 
command = "npx -y @agentclientprotocol/claude-agent-acp@latest" -default_model = "claude-sonnet-4-6" auth_mode = "native_cli" auth_status_command = "claude auth status" auth_login_command = "claude auth login" +[providers.claude.models] +# Pre-session catalog defaults consumed by the daemon-owned model catalog. +default = "claude-sonnet-4-6" + +[[providers.claude.models.curated]] +id = "claude-sonnet-4-6" +display_name = "Claude Sonnet 4.6" + +[[providers.claude.models.curated]] +id = "claude-haiku-4-5" +display_name = "Claude Haiku 4.5" + +[model_catalog.sources.models_dev] +# Optional models.dev enrichment source for catalog metadata. +enabled = true +endpoint = "https://models.dev/api.json" +ttl = "24h" +timeout = "10s" + [[providers.claude.mcp_servers]] name = "github" command = "npx" @@ -698,23 +717,67 @@ Provider keys override built-ins with the same name or create a custom provider. `goose`, `hermes`, `junie`, `kimi-cli`, `openclaw`, `openhands`, `qoder`, `qwen-code`, `pi`, `openrouter`, `zai`, `moonshot`, `vercel-ai-gateway`, `xai`, `minimax`, `mistral`, and `groq`. -| Field | Type | Default | Valid values | Description | -| --------------------- | --------------------------- | --------------------------------------------------------------------------------- | ------------------------------------------------- | -------------------------------------------------------------------------------------------- | -| `command` | string | Built-in command or empty for custom providers. | Required after built-in plus override resolution. | ACP launch command for this provider. | -| `display_name` | string | Built-in label or empty. | Any string. | Operator-facing label shown in settings and provider pickers. | -| `default_model` | string | Built-in model or empty. | Any string. | Model used when an agent omits `model`; native `pi` receives it through ACP model selection. | -| `harness` | string | `acp` unless a built-in sets `pi_acp`. | `acp`, `pi_acp`. 
| Launch strategy. `pi_acp` routes the provider through Pi's ACP adapter. | -| `runtime_provider` | string | Provider key. | Harness-specific provider id. | Downstream provider id used by Pi and other harnesses. | -| `transport` | string | empty. | Harness-specific string. | Optional Pi models override transport/API family. | -| `base_url` | string | empty. | URL string. | Optional Pi models override base URL for custom gateways. | -| `auth_mode` | string | `bound_secret` only when credential slots are configured; otherwise `native_cli`. | `native_cli`, `bound_secret`, `none`. | Declares whether auth belongs to the provider CLI, AGH secret binding, or no auth. | -| `env_policy` | string | `filtered`. | `filtered`, `isolated`. | Controls which daemon environment variables the provider subprocess inherits. | -| `home_policy` | string | `operator`. | `operator`, `isolated`. | Controls whether native CLI state comes from the operator home or an AGH provider home. | -| `auth_status_command` | string | empty. | Shell-style command string. | Optional status probe run by `agh provider auth status `. | -| `auth_login_command` | string | empty. | Shell-style command string. | Optional login command run by `agh provider auth login `. | -| `session_mcp` | boolean | `true` unless a provider disables it. | `true`, `false`. | Enables AGH session MCP injection for providers that support it. | -| `credential_slots` | array | empty. | See below. | Bound secret refs injected into provider subprocess environment variables at launch. | -| `mcp_servers` | array of MCP server objects | empty. | Same shape as `[[mcp_servers]]`. | Provider-specific MCP servers merged after top-level config and before agent MCP servers. 
| +| Field | Type | Default | Valid values | Description | +| --------------------- | --------------------------- | --------------------------------------------------------------------------------- | ------------------------------------------------- | ----------------------------------------------------------------------------------------- | +| `command` | string | Built-in command or empty for custom providers. | Required after built-in plus override resolution. | ACP launch command for this provider. | +| `display_name` | string | Built-in label or empty. | Any string. | Operator-facing label shown in settings and provider pickers. | +| `models` | table | Built-in defaults or empty. | Nested model config block (see below). | Pre-session model defaults, curated metadata, and optional discovery wiring. | +| `harness` | string | `acp` unless a built-in sets `pi_acp`. | `acp`, `pi_acp`. | Launch strategy. `pi_acp` routes the provider through Pi's ACP adapter. | +| `runtime_provider` | string | Provider key. | Harness-specific provider id. | Downstream provider id used by Pi and other harnesses. | +| `transport` | string | empty. | Harness-specific string. | Optional Pi models override transport/API family. | +| `base_url` | string | empty. | URL string. | Optional Pi models override base URL for custom gateways. | +| `auth_mode` | string | `bound_secret` only when credential slots are configured; otherwise `native_cli`. | `native_cli`, `bound_secret`, `none`. | Declares whether auth belongs to the provider CLI, AGH secret binding, or no auth. | +| `env_policy` | string | `filtered`. | `filtered`, `isolated`. | Controls which daemon environment variables the provider subprocess inherits. | +| `home_policy` | string | `operator`. | `operator`, `isolated`. | Controls whether native CLI state comes from the operator home or an AGH provider home. | +| `auth_status_command` | string | empty. | Shell-style command string. 
| Optional status probe run by `agh provider auth status `. | +| `auth_login_command` | string | empty. | Shell-style command string. | Optional login command run by `agh provider auth login `. | +| `session_mcp` | boolean | `true` unless a provider disables it. | `true`, `false`. | Enables AGH session MCP injection for providers that support it. | +| `credential_slots` | array | empty. | See below. | Bound secret refs injected into provider subprocess environment variables at launch. | +| `mcp_servers` | array of MCP server objects | empty. | Same shape as `[[mcp_servers]]`. | Provider-specific MCP servers merged after top-level config and before agent MCP servers. | + +The flat keys `default_model`, `supported_models`, and `supports_reasoning_effort` are no longer +accepted. Config that still sets them is rejected with a deterministic hard-cut error citing the +exact path. Move every value into `[providers..models]` below. + +### `[providers..models]` + +Pre-session model defaults and curated metadata are owned by the daemon-owned model catalog. The +catalog merges builtin defaults, the operator config, the optional `models.dev` enrichment source, +live provider discovery sources, and extension model sources, and exposes the result through HTTP, +UDS, CLI, the OpenAI-compatible projection, the Host API, and the web. Active ACP `configOptions` +continue to govern model and reasoning controls inside a running session, so curated entries are +metadata, never an allowlist. + +| Field | Type | Default | Valid values | Description | +| ------------------------------------------- | ------- | ---------------------------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------- | +| `models.default` | string | Built-in default or empty. | Any non-blank model id. | Model used when an agent omits `model`. 
Free-form: it does not need to appear in `models.curated`. | +| `models.curated` | array | empty. | See entry fields below. | Curated model entries projected as `config` rows. Not an allowlist; manual model ids stay valid. | +| `models.curated[].id` | string | required per entry. | Unique provider model id. | Provider model identifier sent to the runtime. | +| `models.curated[].display_name` | string | empty. | Any string. | Optional human label. | +| `models.curated[].context_window` | integer | empty. | Positive integer. | Context window in tokens. | +| `models.curated[].max_input_tokens` | integer | empty. | Positive integer. | Maximum input tokens. | +| `models.curated[].max_output_tokens` | integer | empty. | Positive integer. | Maximum output tokens. | +| `models.curated[].supports_tools` | boolean | empty. | `true`, `false`. | Whether the model supports tool calls. | +| `models.curated[].supports_reasoning` | boolean | empty. | `true`, `false`. | Whether the model supports reasoning effort. | +| `models.curated[].reasoning_efforts` | array | empty. | Subset of `minimal`, `low`, `medium`, `high`, `xhigh`. Blanks rejected. | Allowed reasoning levels for this model. | +| `models.curated[].default_reasoning_effort` | string | empty. | Member of `reasoning_efforts` when both are set. | Per-model default reasoning level. Must appear in `reasoning_efforts` when both are set. | +| `models.curated[].cost_input_per_million` | number | empty. | Non-negative number. | Display-only input cost per million tokens. | +| `models.curated[].cost_output_per_million` | number | empty. | Non-negative number. | Display-only output cost per million tokens. | +| `models.discovery.enabled` | boolean | `false` unless a built-in opts in. | `true`, `false`. | Enables the side-effect-free model discovery adapter for this provider. | +| `models.discovery.command` | string | empty. | Shell-style command string. | Side-effect-free discovery command. 
Mutually exclusive with `endpoint` unless the adapter documents both. | +| `models.discovery.endpoint` | string | empty. | Absolute HTTP(S) URL. | Side-effect-free discovery endpoint URL. Mutually exclusive with `command` unless the adapter documents both. | +| `models.discovery.timeout` | string | model catalog source timeout. | Positive duration (`10s`, `45s`, `2m`). | Per-discovery timeout. | + +Discovery adapters use the resolved provider auth, env, and home policy. Discovery never creates an +ACP session; if discovery is unavailable or fails, the catalog records source status and falls back +to stale or lower-priority rows. Session creation never depends on a successful discovery refresh. + +`OpenClaw`, `Hermes`, and `Pi` only register a live provider source when `models.discovery.enabled += true` and either `command` or `endpoint` is set; with no discovery wiring, those providers stay +on builtin/config rows plus the optional `models.dev` enrichment. + +Top-level `[model_catalog.sources.models_dev]` controls the cross-provider `models.dev` enrichment +source documented in the next section. `env_policy = "filtered"` preserves ordinary operator context such as `PATH`, `HOME`, and locale while stripping secret-shaped daemon variables before launch. `env_policy = "isolated"` starts from @@ -750,7 +813,7 @@ direct `pi` provider use `native_cli` by default and do not preflight provider A API-key wrappers such as OpenRouter, z.ai, Moonshot/Kimi, Vercel AI Gateway, xAI, MiniMax, Mistral, and Groq default to AGH-managed `bound_secret` slots while AGH runs Pi under the hood. 
-| Built-in | Harness | Runtime provider | Default model | Auth mode | Credential target | +| Built-in | Harness | Runtime provider | `models.default` | Auth mode | Credential target | | ------------------- | -------- | ------------------- | --------------------------- | -------------- | -------------------- | | `claude` | `acp` | `claude` | `claude-sonnet-4-6` | `native_cli` | provider login | | `codex` | `acp` | `codex` | `gpt-5.4` | `native_cli` | provider login | @@ -779,6 +842,36 @@ and Groq default to AGH-managed `bound_secret` slots while AGH runs Pi under the | `mistral` | `pi_acp` | `mistral` | `devstral-medium-latest` | `bound_secret` | `MISTRAL_API_KEY` | | `groq` | `pi_acp` | `groq` | `openai/gpt-oss-120b` | `bound_secret` | `GROQ_API_KEY` | +## `[model_catalog.sources.models_dev]` + +The `models.dev` enrichment source feeds catalog metadata such as token windows, tool support, and +reasoning hints. It is cross-provider, daemon-owned, and never proves account-level availability. + +| Field | Type | Default | Valid values | Description | +| ---------- | -------- | ----------------------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------ | +| `enabled` | boolean | `true` | `true`, `false` | Toggles the `models.dev` source. When `false`, the source still appears in status but performs no outbound calls. | +| `endpoint` | string | `https://models.dev/api.json` | Absolute HTTP(S) URL. | Endpoint queried for the cross-provider model index. Must be HTTP(S) when set. | +| `ttl` | duration | `24h` | Positive Go duration. | Cache lifetime before catalog rows are flagged stale and a refresh is scheduled. | +| `timeout` | duration | `10s` | Positive Go duration. | Per-call HTTP timeout for the source. The HTTP client always uses an explicit deadline (no shared `http.DefaultClient`). 
| + +```toml +[model_catalog.sources.models_dev] +enabled = true +endpoint = "https://models.dev/api.json" +ttl = "24h" +timeout = "10s" +``` + +`models.dev` rows write provider-scoped status: a single `models.dev` refresh records one +`(source_id="models_dev", provider_id)` status row per AGH provider mapped, never a global +empty-provider sentinel. Disabled sources still expose status but skip outbound calls. Refresh +lifetime is daemon-owned: requests trigger refreshes but the daemon detaches the work, applies an +explicit deadline, and joins outstanding refresh workers during shutdown. + +Live provider discovery is configured per provider through `[providers..models.discovery]`. +Discovery refresh is serialized per `provider_id` before any subprocess or provider-home work, and +concurrent refresh requests for the same provider coalesce behind the in-flight refresh. + ## `[observability]` | Field | Type | Default | Valid values | Description | @@ -1265,4 +1358,6 @@ context.post_compact - [mcp.json](/runtime/core/configuration/mcp-json) documents JSON sidecars and whole-object replacement. - [AGENT.md](/runtime/core/configuration/agent-md) documents agent-local `mcp_servers` and hooks. +- [Providers](/runtime/core/agents/providers) covers `[providers..models]` shape and `models.discovery` defaults. +- [Provider Model Catalog](/runtime/core/agents/model-catalog) explains the catalog projection, OpenAI-compatible endpoint, and extension `model.source`. - [File Locations](/runtime/core/configuration/file-locations) lists the exact global and workspace paths. 
diff --git a/packages/site/content/runtime/core/extensions/develop.mdx b/packages/site/content/runtime/core/extensions/develop.mdx index 96a8f4883..0e9af0ab0 100644 --- a/packages/site/content/runtime/core/extensions/develop.mdx +++ b/packages/site/content/runtime/core/extensions/develop.mdx @@ -213,6 +213,7 @@ Important current provide surfaces: | ---------------- | ------------------------------------------------ | | `memory.backend` | `memory/store`, `memory/recall`, `memory/forget` | | `bridge.adapter` | `bridges/deliver` | +| `model.source` | `models/list` | `bridge.adapter` extensions must also declare bridge metadata: @@ -226,7 +227,59 @@ display_name = "Slack" ``` Marketplace extensions run under a stricter policy. They are constrained to read-oriented grants: -`memory.read`, `observe.read`, `session.read`, `skills.read`, and `tool.read`. +`memory.read`, `model.read`, `observe.read`, `session.read`, `skills.read`, and `tool.read`. + +### Model Source Extensions + +Extensions that declare the provide capability `model.source` enrich the daemon-owned provider +model catalog. The daemon owns persistence and merge, so the extension only contributes source +rows; it cannot rewrite global catalog state. + +```toml +[capabilities] +provides = ["model.source"] + +[actions] +requires = ["models/list", "models/refresh", "models/status"] + +[security] +capabilities = ["model.read", "model.write"] + +[subprocess] +command = "node" +args = ["dist/index.js"] +``` + +The extension service method `models/list` is dispatched by AGH whenever the catalog refreshes the +`extension:` source for a provider the extension declares. The slug is derived from the +extension name and must match `^[a-z0-9][a-z0-9_-]*$`; manifests that do not normalize cleanly are +rejected at install time. 
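As a rough illustration of the slug rule above, the following TypeScript sketch checks a candidate slug against the documented pattern. The helper name `isValidSourceSlug` is hypothetical and not part of the AGH SDK, and the sketch covers only the pattern check; how AGH derives the slug from the extension name is not shown in this document and is not modeled here.

```typescript
// Pattern quoted from the paragraph above; the surrounding names are illustrative assumptions.
const SOURCE_SLUG_PATTERN = /^[a-z0-9][a-z0-9_-]*$/;

// Hypothetical helper: returns true when the slug starts with a lowercase
// letter or digit and continues with lowercase letters, digits, "_" or "-".
function isValidSourceSlug(slug: string): boolean {
  return SOURCE_SLUG_PATTERN.test(slug);
}
```

A manifest author could run a check like this before packaging, assuming the derived slug is known, to catch names that install-time validation would likely reject.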
+ +| Direction | Method | Purpose | +| --------------- | ---------------- | ---------------------------------------------------------------------------- | +| AGH → extension | `models/list` | Returns provider model rows for the extension's declared providers. | +| Host API call | `models/list` | Reads the daemon-owned merged catalog projection, scoped by capability. | +| Host API call | `models/refresh` | Triggers a daemon-owned source refresh; serialized per provider. | +| Host API call | `models/status` | Reads daemon-owned source status, including `last_refresh` and `last_error`. | + +Capability areas align with the Host API authorization layer: + +| Method | Area | Default in marketplace policy | +| ---------------- | ------------- | ------------------------------------------ | +| `models/list` | `model.read` | Allowed (read-oriented grant). | +| `models/status` | `model.read` | Allowed (read-oriented grant). | +| `models/refresh` | `model.write` | Requires explicit grant; marketplace gate. | + +Extensions return validated rows. Invalid rows produce a recorded source status (with redacted +`last_error`) instead of corrupting the merged projection. Refresh runs under a daemon-enforced +deadline using the provider's auth/env/home policy, and refresh work for the same provider is +coalesced — concurrent refreshes return identical source statuses when the in-flight refresh +finishes. + +Generated TypeScript SDK and Go SDK helpers (`ProviderModel*`, `ModelCatalogSource*`, +`ModelSource*`) are published from the same OpenAPI/contract source as the daemon. Extensions +should depend on those helpers instead of hand-rolling JSON shapes so contract drift is caught at +typecheck time. 
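The coalescing behavior described above can be pictured with a small TypeScript sketch. This is an illustration of the general in-flight coalescing technique, not the daemon's implementation, which this document does not show. `RefreshCoalescer`, the `doRefresh` callback, and the simplified `SourceStatus` shape are all assumptions; only the field names `last_refresh` and `last_error` come from the text.

```typescript
// Simplified, assumed status shape; the real contract lives in the generated SDK types.
type SourceStatus = {
  provider_id: string;
  last_refresh: string;
  last_error?: string;
};

// Hypothetical helper: concurrent refresh calls for the same provider share
// one in-flight promise, so every caller observes the same resulting status.
class RefreshCoalescer {
  private inFlight = new Map<string, Promise<SourceStatus>>();
  private doRefresh: (providerId: string) => Promise<SourceStatus>;

  constructor(doRefresh: (providerId: string) => Promise<SourceStatus>) {
    this.doRefresh = doRefresh;
  }

  refresh(providerId: string): Promise<SourceStatus> {
    const existing = this.inFlight.get(providerId);
    if (existing !== undefined) {
      return existing; // join the refresh that is already running
    }
    // Start a new refresh and forget it once it settles, success or failure.
    const run = this.doRefresh(providerId).finally(() => {
      this.inFlight.delete(providerId);
    });
    this.inFlight.set(providerId, run);
    return run;
  }
}
```

Keying the map by provider id mirrors the per-provider serialization the text describes: refreshes for different providers proceed independently, while repeated requests for one provider wait on the same work.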
### Authored Context Host API diff --git a/packages/site/lib/__tests__/provider-model-catalog-docs.test.ts b/packages/site/lib/__tests__/provider-model-catalog-docs.test.ts new file mode 100644 index 000000000..c4e8e4722 --- /dev/null +++ b/packages/site/lib/__tests__/provider-model-catalog-docs.test.ts @@ -0,0 +1,134 @@ +import { readFileSync } from "node:fs"; +import { dirname, resolve } from "node:path"; +import { fileURLToPath } from "node:url"; +import { describe, expect, it } from "vitest"; + +const siteRoot = resolve(dirname(fileURLToPath(import.meta.url)), "..", ".."); +const runtimeRoot = resolve(siteRoot, "content/runtime"); + +const providersDoc = resolve(runtimeRoot, "core/agents/providers.mdx"); +const modelCatalogDoc = resolve(runtimeRoot, "core/agents/model-catalog.mdx"); +const configTomlDoc = resolve(runtimeRoot, "core/configuration/config-toml.mdx"); +const developExtensionsDoc = resolve(runtimeRoot, "core/extensions/develop.mdx"); +const cliProviderModelsIndex = resolve(runtimeRoot, "cli-reference/provider/models/index.mdx"); +const cliProviderModelsList = resolve(runtimeRoot, "cli-reference/provider/models/list.mdx"); +const cliProviderModelsRefresh = resolve(runtimeRoot, "cli-reference/provider/models/refresh.mdx"); +const cliProviderModelsStatus = resolve(runtimeRoot, "cli-reference/provider/models/status.mdx"); + +function read(path: string): string { + return readFileSync(path, "utf8"); +} + +function nonHardCutMatches(source: string, pattern: RegExp): string[] { + return source.split(/\r?\n/).flatMap(line => { + if ( + line.match(/no longer|hard-cut|rejected with|deterministic hard-cut|are rejected|reject the/) + ) { + return []; + } + return line.match(pattern) ? 
[line] : []; + }); +} + +describe("provider model catalog docs", () => { + it("removes old provider model field claims from the providers doc", () => { + const source = read(providersDoc); + const offending = nonHardCutMatches( + source, + /\b(default_model|supported_models|supports_reasoning_effort)\b/ + ); + expect(offending).toEqual([]); + }); + + it("removes old provider model field claims from config.toml docs", () => { + const source = read(configTomlDoc); + const offending = nonHardCutMatches( + source, + /\b(default_model|supported_models|supports_reasoning_effort)\b/ + ); + expect(offending).toEqual([]); + }); + + it("documents the nested provider models block in the providers doc", () => { + const source = read(providersDoc); + expect(source).toContain("[providers..models]"); + expect(source).toContain("models.default"); + expect(source).toContain("models.curated"); + expect(source).toContain("models.discovery"); + }); + + it("shows nested provider models examples only in the providers doc", () => { + const source = read(providersDoc); + expect(source).toContain("[providers.claude.models]"); + expect(source).toContain("[[providers.claude.models.curated]]"); + expect(source).toContain("[providers.openrouter.models]"); + }); + + it("documents [model_catalog.sources.models_dev] in config.toml", () => { + const source = read(configTomlDoc); + expect(source).toContain("[model_catalog.sources.models_dev]"); + expect(source).toContain("https://models.dev/api.json"); + expect(source).toContain("ttl"); + expect(source).toContain("timeout"); + }); + + it("documents provider models.discovery keys in config.toml", () => { + const source = read(configTomlDoc); + expect(source).toContain("models.discovery.enabled"); + expect(source).toContain("models.discovery.command"); + expect(source).toContain("models.discovery.endpoint"); + expect(source).toContain("models.discovery.timeout"); + }); + + it("documents native model catalog endpoints", () => { + const source = 
read(modelCatalogDoc); + expect(source).toContain("/api/providers/models"); + expect(source).toContain("/api/providers/{provider_id}/models"); + expect(source).toContain("/api/providers/models/refresh"); + expect(source).toContain("/api/providers/models/status"); + }); + + it("documents the OpenAI-compatible /api/openai/v1/models projection", () => { + const source = read(modelCatalogDoc); + expect(source).toContain("/api/openai/v1/models"); + expect(source).toContain("availability_state"); + expect(source).toContain("HTTP only"); + }); + + it("documents the daemon-owned refresh lifetime and serialization rules", () => { + const source = read(modelCatalogDoc); + expect(source).toContain("context.WithoutCancel"); + expect(source).toContain("serialized"); + expect(source).toContain("coalesce"); + expect(source).toContain("refresh_request_id"); + }); + + it("documents the model.source extension contract", () => { + const source = read(developExtensionsDoc); + expect(source).toContain("model.source"); + expect(source).toContain("models/list"); + expect(source).toContain("models/refresh"); + expect(source).toContain("models/status"); + expect(source).toContain("model.read"); + expect(source).toContain("model.write"); + }); + + it("includes the regenerated provider models CLI reference", () => { + const indexSource = read(cliProviderModelsIndex); + expect(indexSource).toContain("agh provider models"); + expect(indexSource).toContain("/runtime/cli-reference/provider/models/list"); + expect(indexSource).toContain("/runtime/cli-reference/provider/models/refresh"); + expect(indexSource).toContain("/runtime/cli-reference/provider/models/status"); + + expect(read(cliProviderModelsList)).toContain("agh provider models list"); + expect(read(cliProviderModelsRefresh)).toContain("agh provider models refresh"); + expect(read(cliProviderModelsStatus)).toContain("agh provider models status"); + }); + + it("explains the agh provider models namespace choice in the model catalog doc", 
() => { + const source = read(modelCatalogDoc); + expect(source).toContain("agh provider models"); + expect(source).toContain("agh models"); + expect(source).toContain("out of scope"); + }); +}); diff --git a/packages/site/lib/__tests__/runtime-manual-api-routes.test.ts b/packages/site/lib/__tests__/runtime-manual-api-routes.test.ts index 8b3d53ccf..474ceb8ea 100644 --- a/packages/site/lib/__tests__/runtime-manual-api-routes.test.ts +++ b/packages/site/lib/__tests__/runtime-manual-api-routes.test.ts @@ -99,6 +99,9 @@ function routePattern(route: string): RegExp { if (part.startsWith(":")) { return "[^/]+"; } + if (part.startsWith("*")) { + return ".*"; + } return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); }) .join("/"); diff --git a/packages/site/lib/runtime-navigation.ts b/packages/site/lib/runtime-navigation.ts index a70f77de7..d7e726f51 100644 --- a/packages/site/lib/runtime-navigation.ts +++ b/packages/site/lib/runtime-navigation.ts @@ -59,7 +59,10 @@ export const API_SECTIONS: CoreSection[] = [ ids: ["tools", "toolsets", "resources", "bundles", "automation", "bridges"], }, { label: "Network", ids: ["network", "observe", "hooks"] }, - { label: "Operations", ids: ["daemon", "settings", "extensions", "vault", "agent", "tasks"] }, + { + label: "Operations", + ids: ["daemon", "settings", "providers", "extensions", "vault", "agent", "tasks", "openai"], + }, ]; function indexCoreChildren(coreFolder: Folder): Map { diff --git a/packages/site/scripts/generate-openapi.ts b/packages/site/scripts/generate-openapi.ts index df81f1d8d..bff74ebb4 100644 --- a/packages/site/scripts/generate-openapi.ts +++ b/packages/site/scripts/generate-openapi.ts @@ -97,6 +97,9 @@ function routePattern(route: string): RegExp { if (part.startsWith(":")) { return "[^/]+"; } + if (part.startsWith("*")) { + return ".*"; + } return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); }) .join("/"); diff --git a/packages/ui/src/components/metric.tsx b/packages/ui/src/components/metric.tsx index 
795a7ed6d..95401a198 100644 --- a/packages/ui/src/components/metric.tsx +++ b/packages/ui/src/components/metric.tsx @@ -55,14 +55,14 @@ function Metric({ > {label} -
+
{value} @@ -70,7 +70,7 @@ function Metric({ {detail !== undefined ? ( {detail} @@ -79,7 +79,7 @@ function Metric({ {subtext !== undefined ? (

{subtext}

diff --git a/sdk/typescript/src/generated/contracts.ts b/sdk/typescript/src/generated/contracts.ts index 78c462c35..0c30a8072 100644 --- a/sdk/typescript/src/generated/contracts.ts +++ b/sdk/typescript/src/generated/contracts.ts @@ -38,6 +38,9 @@ export type HostAPIMethod = | "memory/forget" | "memory/recall" | "memory/store" + | "models/list" + | "models/refresh" + | "models/status" | "network/channels" | "network/direct/messages" | "network/direct/resolve" @@ -2146,6 +2149,60 @@ export interface MessageStartPayload { raw?: JSONValue; } +export interface ModelSourceListParams { + provider_id?: string; + refresh?: boolean; + include_stale?: boolean; +} + +export interface ModelCatalogCostPayload { + input_per_million?: number; + output_per_million?: number; +} + +export interface ModelSourceRow { + source_id: string; + provider_id: string; + model_id: string; + display_name?: string; + priority?: number; + available?: boolean; + stale?: boolean; + refreshed_at: ISODateTime; + expires_at: ISODateTime; + context_window?: number; + max_input_tokens?: number; + max_output_tokens?: number; + supports_tools?: boolean; + supports_reasoning?: boolean; + reasoning_efforts?: string[]; + default_reasoning_effort?: string; + cost?: ModelCatalogCostPayload; + last_error?: string; +} + +export interface ModelSourceListResponse { + rows: ModelSourceRow[]; +} + +export interface ModelsListParams { + provider_id?: string; + source_id?: string; + refresh?: boolean; + include_stale?: boolean; +} + +export interface ModelsRefreshParams { + provider_id?: string; + source_id?: string; + force?: boolean; + request_id?: string; +} + +export interface ModelsStatusParams { + provider_id?: string; +} + export interface NetworkChannelPayload { channel: string; workspace_id?: string; @@ -2897,6 +2954,62 @@ export interface PromptPayload { context_blocks?: ContextBlock[]; } +export interface ModelCatalogSourceRefPayload { + source_id: string; + source_kind: string; + priority: number; + 
refreshed_at?: string; + stale: boolean; + last_error?: string; +} + +export interface ProviderModelPayload { + provider_id: string; + model_id: string; + display_name?: string; + sources: ModelCatalogSourceRefPayload[]; + available?: boolean; + availability_state: string; + stale: boolean; + refreshed_at?: string; + context_window?: number; + max_input_tokens?: number; + max_output_tokens?: number; + supports_tools?: boolean; + supports_reasoning?: boolean; + reasoning_efforts?: string[]; + default_reasoning_effort?: string; + cost?: ModelCatalogCostPayload; + last_error?: string; +} + +export interface ProviderModelListResponse { + models: ProviderModelPayload[]; +} + +export interface ModelCatalogSourceStatusPayload { + source_id: string; + source_kind: string; + provider_id: string; + priority: number; + last_refresh?: string; + next_refresh?: string; + last_success?: string; + last_error?: string; + refresh_state: string; + row_count: number; + stale: boolean; +} + +export interface ProviderModelRefreshResponse { + sources: ModelCatalogSourceStatusPayload[]; + error?: string; +} + +export interface ProviderModelStatusResponse { + sources: ModelCatalogSourceStatusPayload[]; +} + export interface ResourceGetParams { kind: ResourceKind; id: string; @@ -3548,6 +3661,8 @@ export interface SessionsCreateParams { agent: string; prompt?: string; provider?: string; + model?: string; + reasoning_effort?: string; workspace?: string; } @@ -5131,6 +5246,18 @@ export interface HostAPIMethodMap { params: SkillsListParams | undefined; result: SkillSummary[]; }; + "models/list": { + params: ModelsListParams | undefined; + result: ProviderModelListResponse; + }; + "models/refresh": { + params: ModelsRefreshParams | undefined; + result: ProviderModelRefreshResponse; + }; + "models/status": { + params: ModelsStatusParams | undefined; + result: ProviderModelStatusResponse; + }; "agents/soul/get": { params: AgentSoulGetParams; result: AgentSoulPayload; diff --git 
a/web/e2e/__tests__/session-provider-override.spec.ts b/web/e2e/__tests__/session-provider-override.spec.ts index 0a0b30579..8e1bcc63e 100644 --- a/web/e2e/__tests__/session-provider-override.spec.ts +++ b/web/e2e/__tests__/session-provider-override.spec.ts @@ -61,7 +61,7 @@ test.use({ }, }); -test("operator can create a provider-override session and gets an inline resume failure when that provider disappears", async ({ +test("operator can create a provider/model override session and gets an inline resume failure when that provider disappears", async ({ appPage, browserArtifacts, runtime, @@ -99,21 +99,47 @@ test("operator can create a provider-override session and gets an inline resume await expect(appPage.getByTestId("session-create-agent-select")).toContainText( browserLifecycleAgent ); - await expect(appPage.getByTestId("session-create-provider-select")).toHaveValue("claude"); + const providerSelect = appPage.getByTestId("session-create-provider-select"); + await expect(providerSelect).toContainText("Claude Code"); + await expect(appPage.getByTestId("session-create-provider-runtime")).toContainText("claude"); + await providerSelect.click(); const dialogOptions = await appPage - .getByTestId("session-create-provider-select") - .locator("option") - .evaluateAll(options => options.map(option => (option as HTMLOptionElement).value)); - expect(dialogOptions).toEqual(workspaceDetail.providers.map(provider => provider.name)); + .locator('[data-testid^="provider-command-item-"]') + .evaluateAll(items => + items + .map(item => item.getAttribute("data-testid")?.replace("provider-command-item-", "")) + .filter((value): value is string => Boolean(value)) + .sort() + ); + expect(dialogOptions).toEqual(workspaceDetail.providers.map(provider => provider.name).sort()); await browserArtifacts.captureScreenshot("session-provider-dialog-desktop", appPage); await appPage.setViewportSize({ width: 375, height: 812 }); - await 
expect(appPage.getByTestId("session-create-provider-select")).toBeVisible(); + await expect(providerSelect).toBeVisible(); await browserArtifacts.captureScreenshot("session-provider-dialog-mobile", appPage); await appPage.setViewportSize({ width: 1280, height: 800 }); - await appPage.getByTestId("session-create-provider-select").selectOption(overrideProvider); + await appPage.getByTestId(`provider-command-item-${overrideProvider}`).click(); + const catalogRefreshResponse = appPage.waitForResponse( + response => + response.request().method() === "POST" && + response.url().endsWith(`/api/providers/${overrideProvider}/models/refresh`) + ); + const refreshCatalog = appPage.getByTestId("session-create-catalog-refresh"); + await expect(refreshCatalog).toBeEnabled(); + await refreshCatalog.click(); + expect((await catalogRefreshResponse).ok()).toBe(true); + await expect(appPage.getByTestId("session-create-catalog-empty")).toBeVisible(); + + await appPage.getByTestId("session-create-model-select").click(); + await appPage.getByTestId("model-command-input").fill("qa-browser-model"); + await expect(appPage.getByTestId("model-command-item-custom")).toBeVisible(); + await appPage.getByTestId("model-command-item-custom").click(); + await expect(appPage.getByTestId("session-create-model-select")).toContainText( + "qa-browser-model" + ); + await expect(appPage.getByTestId("session-create-reasoning-select")).toBeDisabled(); const createRequestPromise = appPage.waitForRequest( request => request.method() === "POST" && request.url().endsWith("/api/sessions") @@ -128,14 +154,18 @@ test("operator can create a provider-override session and gets an inline resume const createResponse = await createResponsePromise; const createRequestBody = createRequest.postDataJSON() as { agent_name?: string; + model?: string; provider?: string; + reasoning_effort?: string; workspace?: string; }; expect(createRequestBody).toMatchObject({ agent_name: browserLifecycleAgent, + model: "qa-browser-model", 
provider: overrideProvider, workspace: workspace.id, }); + expect(createRequestBody).not.toHaveProperty("reasoning_effort"); expect(createResponse.ok()).toBeTruthy(); const createdSession = (await createResponse.json()) as SessionEnvelope; @@ -283,7 +313,14 @@ async function writeWorkspaceConfig(input: { lines.push( `[providers.${overrideProvider}]`, `command = "${escapeTomlString(input.overrideCommand)}"`, - `default_model = "qa-browser-model"`, + `[providers.${overrideProvider}.models]`, + `default = "qa-browser-model"`, + `[[providers.${overrideProvider}.models.curated]]`, + `id = "qa-browser-model"`, + `display_name = "QA Browser Model"`, + `supports_reasoning = true`, + `reasoning_efforts = ["low", "medium", "high"]`, + `default_reasoning_effort = "medium"`, `[[providers.${overrideProvider}.credential_slots]]`, `name = "api_key"`, `target_env = "QA_BROWSER_API_KEY"`, diff --git a/web/e2e/fixtures/__tests__/runtime-seed.test.ts b/web/e2e/fixtures/__tests__/runtime-seed.test.ts index 654ed905c..ff1f54119 100644 --- a/web/e2e/fixtures/__tests__/runtime-seed.test.ts +++ b/web/e2e/fixtures/__tests__/runtime-seed.test.ts @@ -1118,7 +1118,17 @@ describe("browser runtime seed helpers", () => { name: "browser-provider", settings: { command: "browser-provider", - default_model: "gpt-5.4", + models: { + default: "gpt-5.4", + curated: [ + { + id: "gpt-5.4", + supports_reasoning: true, + reasoning_efforts: ["low", "medium", "high"], + default_reasoning_effort: "medium", + }, + ], + }, }, }, ], @@ -1167,6 +1177,28 @@ describe("browser runtime seed helpers", () => { "/api/settings/providers/browser-provider", expect.objectContaining({ method: "PUT" }) ); + const providerRequest = requestJSON.mock.calls.find( + ([pathname]) => pathname === "/api/settings/providers/browser-provider" + ); + if (!providerRequest) { + throw new Error("settings provider seed did not issue provider PUT request"); + } + const providerInit = providerRequest[1] as RequestInit; + const providerBody = 
JSON.parse(String(providerInit.body)); + expect(providerBody.settings.models).toMatchObject({ + default: "gpt-5.4", + curated: [ + { + id: "gpt-5.4", + supports_reasoning: true, + reasoning_efforts: ["low", "medium", "high"], + default_reasoning_effort: "medium", + }, + ], + }); + expect(JSON.stringify(providerBody)).not.toContain("default_model"); + expect(JSON.stringify(providerBody)).not.toContain("supported_models"); + expect(JSON.stringify(providerBody)).not.toContain("supports_reasoning_effort"); expect(requestJSON).toHaveBeenCalledWith( "/api/settings/hooks/browser-turn-end", expect.objectContaining({ method: "PUT" }) diff --git a/web/src/generated/agh-openapi.d.ts b/web/src/generated/agh-openapi.d.ts index 6bf9f8747..ef7b3e2fb 100644 --- a/web/src/generated/agh-openapi.d.ts +++ b/web/src/generated/agh-openapi.d.ts @@ -1946,6 +1946,125 @@ export interface paths { patch?: never; trace?: never; }; + "/api/openai/v1/models": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + /** List provider models using the OpenAI-compatible model shape */ + get: operations["listOpenAIModels"]; + put?: never; + post?: never; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; + "/api/providers/models": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + /** List provider model catalog entries across providers */ + get: operations["listProviderModels"]; + put?: never; + post?: never; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; + "/api/providers/models/refresh": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + get?: never; + put?: never; + /** Refresh provider model catalog sources across providers */ + post: operations["refreshProviderModels"]; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; + 
"/api/providers/models/status": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + /** List provider model catalog source status across providers */ + get: operations["getProviderModelStatus"]; + put?: never; + post?: never; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; + "/api/providers/{provider_id}/models": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + /** List provider model catalog entries for one provider */ + get: operations["listProviderModelsByProvider"]; + put?: never; + post?: never; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; + "/api/providers/{provider_id}/models/refresh": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + get?: never; + put?: never; + /** Refresh provider model catalog sources for one provider */ + post: operations["refreshProviderModelsByProvider"]; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; + "/api/providers/{provider_id}/models/status": { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + /** List provider model catalog source status for one provider */ + get: operations["getProviderModelStatusByProvider"]; + put?: never; + post?: never; + delete?: never; + options?: never; + head?: never; + patch?: never; + trace?: never; + }; "/api/resources": { parameters: { query?: never; @@ -5391,6 +5510,18 @@ export interface operations { }; session: { acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -5482,8 +5613,10 @@ export interface operations { /** Format: date-time */ 
ttl_expires_at?: string | null; } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; @@ -19355,256 +19488,18 @@ export interface operations { session_count?: number; sessions?: { acp_caps?: { - supported_models?: string[]; - supported_modes?: string[]; - supports_load_session: boolean; - } | null; - acp_session_id?: string; - activity?: { - current_tool?: string; - /** Format: date-time */ - deadline_at?: string | null; - /** Format: int64 */ - elapsed_ms: number; - /** Format: int64 */ - elapsed_seconds: number; - /** Format: int64 */ - idle_seconds: number; - iteration_current: number; - iteration_max: number; - /** Format: date-time */ - last_activity_at?: string | null; - last_activity_detail?: string; - last_activity_kind?: string; - /** Format: date-time */ - last_progress_at?: string | null; - tool_call_id?: string; - turn_id?: string; - turn_source?: string; - /** Format: date-time */ - turn_started_at?: string | null; - } | null; - agent_name: string; - channel?: string; - /** Format: date-time */ - created_at: string; - failure?: { - crash_bundle_path?: string; - kind: string; - summary?: string; - } | null; - health?: { - active_prompt: boolean; - agent_name: string; - attachable: boolean; - eligible_for_wake: boolean; - /** @enum {string} */ - health: "healthy" | "degraded" | "stale" | "dead" | "unknown"; - /** @enum {string} */ - ineligibility_reason?: - | "session_prompt_active" - | "session_not_attachable" - | "session_unhealthy" - | "session_health_stale" - | "session_health_hung" - | "session_health_dead" - | "session_health_unknown"; - /** Format: date-time */ - last_activity_at?: string | null; - last_error?: string; - /** Format: date-time */ - last_presence_at?: string | null; - session_id: string; - /** @enum {string} */ - state: "idle" | "prompting" | "stopped" | "detached"; - /** Format: date-time */ - updated_at: string; - workspace_id: string; - } | 
null; - id: string; - lineage?: { - auto_stop_on_parent: boolean; - parent_session_id?: string; - permission_policy: { - mcp_servers: string[]; - network_channels: string[]; - sandbox_profiles: string[]; - skills: string[]; - tools: string[]; - workspace_paths: string[]; - }; - root_session_id?: string; - spawn_budget: { - max_active_per_workspace?: number; - max_children: number; - max_depth: number; - /** Format: int64 */ - ttl_seconds: number; - }; - spawn_depth: number; - spawn_role?: string; - /** Format: date-time */ - ttl_expires_at?: string | null; - } | null; - name?: string; - provider: string; - sandbox?: { - backend?: string; - instance_id?: string; - last_sync_error?: string; - profile?: string; - provider_state_json?: unknown; - sandbox_id?: string; - state?: string; - } | null; - /** @enum {string} */ - state: "starting" | "active" | "stopping" | "stopped"; - stop_detail?: string; - /** @enum {string} */ - stop_reason?: - | "completed" - | "user_canceled" - | "max_iterations" - | "loop_detected" - | "timeout" - | "budget_exceeded" - | "error" - | "agent_crashed" - | "hook_stopped" - | "shutdown"; - type?: string; - /** Format: date-time */ - updated_at: string; - workspace_id?: string; - workspace_path?: string; - }[]; - workspace_id?: string; - }; - }; - }; - }; - /** @description Invalid network channel request */ - 400: { - headers: { - [name: string]: unknown; - }; - content: { - "application/json": { - error: string; - }; - }; - }; - /** @description Workspace not found */ - 404: { - headers: { - [name: string]: unknown; - }; - content: { - "application/json": { - error: string; - }; - }; - }; - /** @description Internal server error */ - 500: { - headers: { - [name: string]: unknown; - }; - content: { - "application/json": { - error: string; - }; - }; - }; - /** @description Network runtime is not configured */ - 503: { - headers: { - [name: string]: unknown; - }; - content: { - "application/json": { - error: string; - }; - }; - }; - default: { 
- headers: { - [name: string]: unknown; - }; - content?: never; - }; - }; - }; - getNetworkChannel: { - parameters: { - query?: never; - header?: never; - path: { - /** @description Network channel */ - channel: string; - }; - cookie?: never; - }; - requestBody?: never; - responses: { - /** @description OK */ - 200: { - headers: { - [name: string]: unknown; - }; - content: { - "application/json": { - channel: { - channel: string; - /** Format: date-time */ - created_at?: string | null; - created_by?: string; - historical_participant_count?: number; - kind_counts?: { - count: number; - kind: string; - }[]; - /** Format: date-time */ - last_activity_at?: string | null; - last_message_preview?: string; - /** Format: date-time */ - last_presence_at?: string | null; - local_peer_count?: number; - message_count?: number; - peer_count: number; - peers?: { - channel: string; - display_name?: string; - /** Format: date-time */ - expires_at?: string | null; - /** Format: date-time */ - joined_at?: string | null; - /** Format: date-time */ - last_seen?: string | null; - local: boolean; - peer_card: { - artifacts_supported: string[]; - capabilities: { + config_options?: { + current?: string; + description?: string; id: string; - summary: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; }[]; - display_name?: string | null; - ext?: { - [key: string]: unknown; - }; - peer_id: string; - profiles_supported: string[]; - trust_modes_supported: string[]; - }; - peer_id: string; - session_id?: string | null; - }[]; - presence_count?: number; - purpose?: string; - remote_peer_count?: number; - session_count?: number; - sessions?: { - acp_caps?: { supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -19696,8 +19591,274 @@ export interface operations { /** Format: date-time */ ttl_expires_at?: string | null; } | null; + model?: string; + name?: string; + provider: string; + 
reasoning_effort?: string; + sandbox?: { + backend?: string; + instance_id?: string; + last_sync_error?: string; + profile?: string; + provider_state_json?: unknown; + sandbox_id?: string; + state?: string; + } | null; + /** @enum {string} */ + state: "starting" | "active" | "stopping" | "stopped"; + stop_detail?: string; + /** @enum {string} */ + stop_reason?: + | "completed" + | "user_canceled" + | "max_iterations" + | "loop_detected" + | "timeout" + | "budget_exceeded" + | "error" + | "agent_crashed" + | "hook_stopped" + | "shutdown"; + type?: string; + /** Format: date-time */ + updated_at: string; + workspace_id?: string; + workspace_path?: string; + }[]; + workspace_id?: string; + }; + }; + }; + }; + /** @description Invalid network channel request */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Workspace not found */ + 404: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Network runtime is not configured */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + getNetworkChannel: { + parameters: { + query?: never; + header?: never; + path: { + /** @description Network channel */ + channel: string; + }; + cookie?: never; + }; + requestBody?: never; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + channel: { + channel: string; + /** Format: date-time */ + created_at?: string | null; + created_by?: string; + historical_participant_count?: number; + kind_counts?: { + count: 
number; + kind: string; + }[]; + /** Format: date-time */ + last_activity_at?: string | null; + last_message_preview?: string; + /** Format: date-time */ + last_presence_at?: string | null; + local_peer_count?: number; + message_count?: number; + peer_count: number; + peers?: { + channel: string; + display_name?: string; + /** Format: date-time */ + expires_at?: string | null; + /** Format: date-time */ + joined_at?: string | null; + /** Format: date-time */ + last_seen?: string | null; + local: boolean; + peer_card: { + artifacts_supported: string[]; + capabilities: { + id: string; + summary: string; + }[]; + display_name?: string | null; + ext?: { + [key: string]: unknown; + }; + peer_id: string; + profiles_supported: string[]; + trust_modes_supported: string[]; + }; + peer_id: string; + session_id?: string | null; + }[]; + presence_count?: number; + purpose?: string; + remote_peer_count?: number; + session_count?: number; + sessions?: { + acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; + supported_models?: string[]; + supported_modes?: string[]; + supports_load_session: boolean; + } | null; + acp_session_id?: string; + activity?: { + current_tool?: string; + /** Format: date-time */ + deadline_at?: string | null; + /** Format: int64 */ + elapsed_ms: number; + /** Format: int64 */ + elapsed_seconds: number; + /** Format: int64 */ + idle_seconds: number; + iteration_current: number; + iteration_max: number; + /** Format: date-time */ + last_activity_at?: string | null; + last_activity_detail?: string; + last_activity_kind?: string; + /** Format: date-time */ + last_progress_at?: string | null; + tool_call_id?: string; + turn_id?: string; + turn_source?: string; + /** Format: date-time */ + turn_started_at?: string | null; + } | null; + agent_name: string; + channel?: string; + /** Format: date-time */ + 
created_at: string; + failure?: { + crash_bundle_path?: string; + kind: string; + summary?: string; + } | null; + health?: { + active_prompt: boolean; + agent_name: string; + attachable: boolean; + eligible_for_wake: boolean; + /** @enum {string} */ + health: "healthy" | "degraded" | "stale" | "dead" | "unknown"; + /** @enum {string} */ + ineligibility_reason?: + | "session_prompt_active" + | "session_not_attachable" + | "session_unhealthy" + | "session_health_stale" + | "session_health_hung" + | "session_health_dead" + | "session_health_unknown"; + /** Format: date-time */ + last_activity_at?: string | null; + last_error?: string; + /** Format: date-time */ + last_presence_at?: string | null; + session_id: string; + /** @enum {string} */ + state: "idle" | "prompting" | "stopped" | "detached"; + /** Format: date-time */ + updated_at: string; + workspace_id: string; + } | null; + id: string; + lineage?: { + auto_stop_on_parent: boolean; + parent_session_id?: string; + permission_policy: { + mcp_servers: string[]; + network_channels: string[]; + sandbox_profiles: string[]; + skills: string[]; + tools: string[]; + workspace_paths: string[]; + }; + root_session_id?: string; + spawn_budget: { + max_active_per_workspace?: number; + max_children: number; + max_depth: number; + /** Format: int64 */ + ttl_seconds: number; + }; + spawn_depth: number; + spawn_role?: string; + /** Format: date-time */ + ttl_expires_at?: string | null; + } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; @@ -21882,6 +22043,768 @@ export interface operations { }; }; }; + listOpenAIModels: { + parameters: { + query?: { + /** @description Filter by AGH provider id */ + provider_id?: string; + }; + header?: never; + path?: never; + cookie?: never; + }; + requestBody?: never; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + 
data: { + agh: { + availability_state: string; + available: boolean | null; + /** Format: int64 */ + context_window?: number | null; + cost?: { + /** Format: double */ + input_per_million?: number | null; + /** Format: double */ + output_per_million?: number | null; + } | null; + default_reasoning_effort?: string | null; + display_name?: string; + last_error?: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + model_id: string; + provider_id: string; + reasoning_efforts?: string[]; + refreshed_at?: string; + sources: string[]; + stale: boolean; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }; + /** Format: int64 */ + created: number; + id: string; + object: string; + owned_by: string; + }[]; + object: string; + }; + }; + }; + /** @description Invalid model catalog filter */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: { + code: string; + message: string; + param: string | null; + type: string; + }; + }; + }; + }; + /** @description Unauthorized */ + 401: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: { + code: string; + message: string; + param: string | null; + type: string; + }; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: { + code: string; + message: string; + param: string | null; + type: string; + }; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: { + code: string; + message: string; + param: string | null; + type: string; + }; + }; + }; + }; + /** @description Model catalog unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: { + code: string; + message: string; + param: string | 
null; + type: string; + }; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + listProviderModels: { + parameters: { + query?: { + /** @description Filter by AGH provider id */ + provider_id?: string; + /** @description Filter by catalog source id */ + source_id?: string; + /** @description Refresh sources before listing models */ + refresh?: boolean; + /** @description Include stale source rows in the merged projection */ + include_stale?: boolean; + }; + header?: never; + path?: never; + cookie?: never; + }; + requestBody?: never; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + models: { + availability_state: string; + available: boolean | null; + /** Format: int64 */ + context_window?: number | null; + cost?: { + /** Format: double */ + input_per_million?: number | null; + /** Format: double */ + output_per_million?: number | null; + } | null; + default_reasoning_effort?: string | null; + display_name?: string; + last_error?: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + model_id: string; + provider_id: string; + reasoning_efforts?: string[]; + refreshed_at?: string; + sources: { + last_error?: string; + priority: number; + refreshed_at?: string; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + stale: boolean; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + }; + }; + }; + /** @description Invalid model catalog filter */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + 
}; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Model catalog unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + refreshProviderModels: { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + /** @description JSON request body */ + requestBody?: { + content: { + "application/json": { + force?: boolean; + request_id?: string; + source_id?: string; + }; + }; + }; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error?: string; + sources: { + last_error?: string; + last_refresh?: string; + last_success?: string; + next_refresh?: string; + priority: number; + provider_id: string; + refresh_state: string; + row_count: number; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + }; + }; + }; + /** @description Invalid model catalog refresh request */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Model catalog refresh unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error?: string; + sources: { + last_error?: string; + last_refresh?: string; + last_success?: string; + next_refresh?: string; + priority: number; + provider_id: string; + refresh_state: string; + row_count: number; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + 
}; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + getProviderModelStatus: { + parameters: { + query?: never; + header?: never; + path?: never; + cookie?: never; + }; + requestBody?: never; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + sources: { + last_error?: string; + last_refresh?: string; + last_success?: string; + next_refresh?: string; + priority: number; + provider_id: string; + refresh_state: string; + row_count: number; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + }; + }; + }; + /** @description Invalid model catalog filter */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Model catalog unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + listProviderModelsByProvider: { + parameters: { + query?: { + /** @description Filter by catalog source id */ + source_id?: string; + /** @description Refresh sources before listing models */ + refresh?: boolean; + /** @description Include stale source rows in the merged projection */ + include_stale?: boolean; + }; + header?: never; + path: { + /** @description AGH provider id */ + provider_id: string; + }; + cookie?: never; + }; + requestBody?: never; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + 
"application/json": { + models: { + availability_state: string; + available: boolean | null; + /** Format: int64 */ + context_window?: number | null; + cost?: { + /** Format: double */ + input_per_million?: number | null; + /** Format: double */ + output_per_million?: number | null; + } | null; + default_reasoning_effort?: string | null; + display_name?: string; + last_error?: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + model_id: string; + provider_id: string; + reasoning_efforts?: string[]; + refreshed_at?: string; + sources: { + last_error?: string; + priority: number; + refreshed_at?: string; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + stale: boolean; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + }; + }; + }; + /** @description Invalid model catalog filter */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Model catalog unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + refreshProviderModelsByProvider: { + parameters: { + query?: never; + header?: never; + path: { + /** @description AGH provider id */ + provider_id: string; + }; + cookie?: never; + }; + /** @description JSON request body */ + requestBody?: { + content: { + "application/json": { + force?: boolean; + request_id?: string; + source_id?: string; + }; + }; + }; + 
responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error?: string; + sources: { + last_error?: string; + last_refresh?: string; + last_success?: string; + next_refresh?: string; + priority: number; + provider_id: string; + refresh_state: string; + row_count: number; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + }; + }; + }; + /** @description Invalid model catalog refresh request */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Model catalog refresh unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error?: string; + sources: { + last_error?: string; + last_refresh?: string; + last_success?: string; + next_refresh?: string; + priority: number; + provider_id: string; + refresh_state: string; + row_count: number; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; + getProviderModelStatusByProvider: { + parameters: { + query?: never; + header?: never; + path: { + /** @description AGH provider id */ + provider_id: string; + }; + cookie?: never; + }; + requestBody?: never; + responses: { + /** @description OK */ + 200: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + sources: { + last_error?: string; + last_refresh?: string; + last_success?: string; + next_refresh?: string; + priority: number; + provider_id: string; + refresh_state: string; + 
row_count: number; + source_id: string; + source_kind: string; + stale: boolean; + }[]; + }; + }; + }; + /** @description Invalid model catalog filter */ + 400: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Forbidden */ + 403: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Internal server error */ + 500: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + /** @description Model catalog unavailable */ + 503: { + headers: { + [name: string]: unknown; + }; + content: { + "application/json": { + error: string; + }; + }; + }; + default: { + headers: { + [name: string]: unknown; + }; + content?: never; + }; + }; + }; listResources: { parameters: { query?: { @@ -22481,6 +23404,18 @@ export interface operations { "application/json": { sessions: { acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -22572,8 +23507,10 @@ export interface operations { /** Format: date-time */ ttl_expires_at?: string | null; } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; @@ -22650,8 +23587,10 @@ export interface operations { "application/json": { agent_name?: string; channel?: string; + model?: string; name?: string; provider?: string; + reasoning_effort?: string; workspace?: string; workspace_path?: string; }; @@ -22667,6 +23606,18 @@ export interface operations { "application/json": { session: { acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: 
string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -22758,8 +23709,10 @@ export interface operations { /** Format: date-time */ ttl_expires_at?: string | null; } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; @@ -22869,6 +23822,18 @@ export interface operations { "application/json": { session: { acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -22960,8 +23925,10 @@ export interface operations { /** Format: date-time */ ttl_expires_at?: string | null; } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; @@ -23506,6 +24473,18 @@ export interface operations { "application/json": { session: { acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -23597,8 +24576,10 @@ export interface operations { /** Format: date-time */ ttl_expires_at?: string | null; } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; @@ -27779,11 +28760,37 @@ export interface operations { secret_ref: string; target_env: string; }[]; - default_model?: string; display_name?: string; env_policy?: string; harness?: string; home_policy?: string; + models?: { + curated?: { + /** Format: int64 */ + 
context_window?: number | null; + /** Format: double */ + cost_input_per_million?: number | null; + /** Format: double */ + cost_output_per_million?: number | null; + default_reasoning_effort?: string; + display_name?: string; + id: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + reasoning_efforts?: string[]; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + default?: string; + discovery?: { + command?: string; + enabled?: boolean | null; + endpoint?: string; + timeout?: string; + } | null; + } | null; runtime_provider?: string; transport?: string; }; @@ -27817,11 +28824,37 @@ export interface operations { secret_ref: string; target_env: string; }[]; - default_model?: string; display_name?: string; env_policy?: string; harness?: string; home_policy?: string; + models?: { + curated?: { + /** Format: int64 */ + context_window?: number | null; + /** Format: double */ + cost_input_per_million?: number | null; + /** Format: double */ + cost_output_per_million?: number | null; + default_reasoning_effort?: string; + display_name?: string; + id: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + reasoning_efforts?: string[]; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + default?: string; + discovery?: { + command?: string; + enabled?: boolean | null; + endpoint?: string; + timeout?: string; + } | null; + } | null; runtime_provider?: string; transport?: string; }; @@ -27944,11 +28977,37 @@ export interface operations { secret_ref: string; target_env: string; }[]; - default_model?: string; display_name?: string; env_policy?: string; harness?: string; home_policy?: string; + models?: { + curated?: { + /** Format: int64 */ + context_window?: number | null; + /** Format: double */ + cost_input_per_million?: number | null; + /** Format: double */ + 
cost_output_per_million?: number | null; + default_reasoning_effort?: string; + display_name?: string; + id: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + reasoning_efforts?: string[]; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + default?: string; + discovery?: { + command?: string; + enabled?: boolean | null; + endpoint?: string; + timeout?: string; + } | null; + } | null; runtime_provider?: string; transport?: string; }; @@ -27982,11 +29041,37 @@ export interface operations { secret_ref: string; target_env: string; }[]; - default_model?: string; display_name?: string; env_policy?: string; harness?: string; home_policy?: string; + models?: { + curated?: { + /** Format: int64 */ + context_window?: number | null; + /** Format: double */ + cost_input_per_million?: number | null; + /** Format: double */ + cost_output_per_million?: number | null; + default_reasoning_effort?: string; + display_name?: string; + id: string; + /** Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + reasoning_efforts?: string[]; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + default?: string; + discovery?: { + command?: string; + enabled?: boolean | null; + endpoint?: string; + timeout?: string; + } | null; + } | null; runtime_provider?: string; transport?: string; }; @@ -28097,11 +29182,37 @@ export interface operations { secret_ref: string; target_env: string; }[]; - default_model?: string; display_name?: string; env_policy?: string; harness?: string; home_policy?: string; + models?: { + curated?: { + /** Format: int64 */ + context_window?: number | null; + /** Format: double */ + cost_input_per_million?: number | null; + /** Format: double */ + cost_output_per_million?: number | null; + default_reasoning_effort?: string; + display_name?: string; + id: string; + /** 
Format: int64 */ + max_input_tokens?: number | null; + /** Format: int64 */ + max_output_tokens?: number | null; + reasoning_efforts?: string[]; + supports_reasoning?: boolean | null; + supports_tools?: boolean | null; + }[]; + default?: string; + discovery?: { + command?: string; + enabled?: boolean | null; + endpoint?: string; + timeout?: string; + } | null; + } | null; runtime_provider?: string; transport?: string; }; @@ -41509,7 +42620,6 @@ export interface operations { }[]; providers?: { auth_mode?: string; - default_model?: string; display_name?: string; env_policy?: string; harness?: string; @@ -41519,6 +42629,18 @@ export interface operations { }[]; sessions?: { acp_caps?: { + config_options?: { + current?: string; + description?: string; + id: string; + kind: string; + label?: string; + values?: { + description?: string; + label?: string; + value: string; + }[]; + }[]; supported_models?: string[]; supported_modes?: string[]; supports_load_session: boolean; @@ -41610,8 +42732,10 @@ export interface operations { /** Format: date-time */ ttl_expires_at?: string | null; } | null; + model?: string; name?: string; provider: string; + reasoning_effort?: string; sandbox?: { backend?: string; instance_id?: string; diff --git a/web/src/hooks/routes/__tests__/use-app-layout.test.tsx b/web/src/hooks/routes/__tests__/use-app-layout.test.tsx index 74dba50fb..dab883487 100644 --- a/web/src/hooks/routes/__tests__/use-app-layout.test.tsx +++ b/web/src/hooks/routes/__tests__/use-app-layout.test.tsx @@ -61,6 +61,32 @@ vi.mock("@/systems/session/hooks/use-session-actions", () => ({ }), })); +vi.mock("@/systems/model-catalog", async () => { + const actual = + await vi.importActual("@/systems/model-catalog"); + return { + ...actual, + useProviderModels: () => ({ + data: undefined, + isLoading: false, + isFetching: false, + error: null, + }), + useProviderModelStatus: () => ({ + data: undefined, + isLoading: false, + isFetching: false, + error: null, + }), + 
useRefreshProviderModels: () => ({ + mutate: vi.fn(), + mutateAsync: vi.fn(), + isPending: false, + error: null, + }), + }; +}); + vi.mock("@/systems/session", async () => { const useSessionCreateDialogModule = await vi.importActual< typeof import("@/systems/session/hooks/use-session-create-dialog") diff --git a/web/src/hooks/routes/__tests__/use-settings-providers-page.test.tsx b/web/src/hooks/routes/__tests__/use-settings-providers-page.test.tsx index 4a97719be..11ab1c84c 100644 --- a/web/src/hooks/routes/__tests__/use-settings-providers-page.test.tsx +++ b/web/src/hooks/routes/__tests__/use-settings-providers-page.test.tsx @@ -39,7 +39,10 @@ const claudeEntry: SettingsProviderCollection["providers"][number] = { command_available: true, settings: { command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - default_model: "claude-sonnet-4-6", + models: { + default: "claude-sonnet-4-6", + curated: [{ id: "claude-sonnet-4-6" }, { id: "claude-haiku-4-5" }], + }, auth_mode: "native_cli", env_policy: "filtered", home_policy: "operator", @@ -54,7 +57,7 @@ const claudeEntry: SettingsProviderCollection["providers"][number] = { fallback: { settings: { command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - default_model: "claude-sonnet-4-6", + models: { default: "claude-sonnet-4-6" }, }, source: { kind: "builtin-provider", scope: "global" }, }, @@ -66,7 +69,17 @@ const codexEntry: SettingsProviderCollection["providers"][number] = { command_available: true, settings: { command: "npx -y @zed-industries/codex-acp@latest", - default_model: "gpt-5.4", + models: { + default: "gpt-5.4", + curated: [ + { + id: "gpt-5.4", + supports_reasoning: true, + reasoning_efforts: ["low", "medium", "high"], + }, + { id: "gpt-5.4-mini" }, + ], + }, auth_mode: "bound_secret", env_policy: "filtered", home_policy: "operator", @@ -160,7 +173,14 @@ describe("useSettingsProvidersPage", () => { expect(result.current.editor).toMatchObject({ mode: "create", - draft: { name: "", 
command: "", default_model: "", target_env: "", auth_mode: "native_cli" }, + draft: { + name: "", + command: "", + model_default: "", + curated_models: "", + target_env: "", + auth_mode: "native_cli", + }, }); }); @@ -193,7 +213,8 @@ describe("useSettingsProvidersPage", () => { name: "claude", draft: expect.objectContaining({ command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - default_model: "claude-sonnet-4-6", + model_default: "claude-sonnet-4-6", + curated_models: "claude-sonnet-4-6\nclaude-haiku-4-5", target_env: "", auth_mode: "native_cli", env_policy: "filtered", @@ -223,7 +244,11 @@ describe("useSettingsProvidersPage", () => { result.current.openEdit(claudeEntry); }); act(() => { - result.current.updateDraft(draft => ({ ...draft, default_model: "claude-haiku" })); + result.current.updateDraft(draft => ({ + ...draft, + model_default: "claude-haiku", + curated_models: "claude-haiku\nclaude-sonnet-4-6", + })); }); act(() => { result.current.saveEditor(); @@ -236,7 +261,10 @@ describe("useSettingsProvidersPage", () => { expect(putSettingsProvider).toHaveBeenCalledWith("claude", { settings: { command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - default_model: "claude-haiku", + models: { + default: "claude-haiku", + curated: [{ id: "claude-haiku" }, { id: "claude-sonnet-4-6" }], + }, harness: "acp", auth_mode: "native_cli", env_policy: "filtered", @@ -262,7 +290,10 @@ describe("useSettingsProvidersPage", () => { name: "openrouter", settings: { command: "npx -y pi-acp@latest", - default_model: "openai/gpt-5.4", + models: { + default: "openai/gpt-5.4", + curated: [{ id: "openai/gpt-5.4", supports_reasoning: true }], + }, harness: "pi_acp", runtime_provider: "openrouter", auth_mode: "bound_secret", @@ -318,7 +349,7 @@ describe("useSettingsProvidersPage", () => { act(() => { result.current.updateDraft(draft => ({ ...draft, - default_model: "anthropic/claude-sonnet", + model_default: "anthropic/claude-sonnet", credential_slots: 
draft.credential_slots.map((slot, index) => index === 1 ? { ...slot, secret_ref: "vault:providers/openrouter/organization" } : slot ), @@ -336,7 +367,10 @@ describe("useSettingsProvidersPage", () => { expect(putSettingsProvider).toHaveBeenCalledWith("openrouter", { settings: { command: "npx -y pi-acp@latest", - default_model: "anthropic/claude-sonnet", + models: { + default: "anthropic/claude-sonnet", + curated: [{ id: "openai/gpt-5.4", supports_reasoning: true }], + }, harness: "pi_acp", runtime_provider: "openrouter", auth_mode: "bound_secret", @@ -391,7 +425,7 @@ describe("useSettingsProvidersPage", () => { ...draft, name: "openrouter", command: "npx -y pi-acp@latest", - default_model: "openai/gpt-5.4", + model_default: "openai/gpt-5.4", target_env: "OPENROUTER_API_KEY", harness: "pi_acp", runtime_provider: "openrouter", @@ -411,7 +445,10 @@ describe("useSettingsProvidersPage", () => { expect(putSettingsProvider).toHaveBeenCalledWith("openrouter", { settings: { command: "npx -y pi-acp@latest", - default_model: "openai/gpt-5.4", + models: { + default: "openai/gpt-5.4", + curated: [], + }, harness: "pi_acp", runtime_provider: "openrouter", auth_mode: "bound_secret", @@ -438,6 +475,63 @@ describe("useSettingsProvidersPage", () => { }); }); + it("Should preserve curated metadata when re-saving with the same model ids", async () => { + vi.mocked(putSettingsProvider).mockResolvedValue({ + section: "general", + scope: "global", + behavior: "applied_now", + applied: true, + restart_required: false, + write_target: "global-config", + }); + + const { wrapper } = createWrapper(); + const { result } = renderHook(() => useSettingsProvidersPage(), { wrapper }); + + await waitFor(() => expect(result.current.providers).toHaveLength(2)); + + act(() => { + result.current.openEdit(codexEntry); + }); + act(() => { + result.current.saveEditor(); + }); + + await waitFor(() => { + expect(result.current.lastAction?.kind).toBe("saved"); + }); + + 
expect(putSettingsProvider).toHaveBeenCalledWith("codex", { + settings: { + command: "npx -y @zed-industries/codex-acp@latest", + models: { + default: "gpt-5.4", + curated: [ + { + id: "gpt-5.4", + supports_reasoning: true, + reasoning_efforts: ["low", "medium", "high"], + }, + { id: "gpt-5.4-mini" }, + ], + }, + harness: "acp", + auth_mode: "bound_secret", + env_policy: "filtered", + home_policy: "operator", + credential_slots: [ + { + name: "api_key", + target_env: "OPENAI_API_KEY", + secret_ref: "env:OPENAI_API_KEY", + kind: "api_key", + required: true, + }, + ], + }, + }); + }); + it("surfaces validation errors from the adapter without closing the editor", async () => { vi.mocked(putSettingsProvider).mockRejectedValue( new SettingsApiError("invalid credential_slots[0].secret_ref", 400) diff --git a/web/src/hooks/routes/use-settings-providers-page.ts b/web/src/hooks/routes/use-settings-providers-page.ts index 7e6422386..d8bec0bdd 100644 --- a/web/src/hooks/routes/use-settings-providers-page.ts +++ b/web/src/hooks/routes/use-settings-providers-page.ts @@ -14,12 +14,18 @@ import { type ProviderCredentialSlotDraft = NonNullable< NonNullable["credential_slots"] >[number]; +type ProviderModelsPayload = NonNullable< + NonNullable["models"] +>; +type ProviderModelPayload = NonNullable[number]; export type ProviderDraft = { name: string; command: string; display_name: string; - default_model: string; + model_default: string; + curated_models: string; + curated_snapshot: ProviderModelPayload[]; target_env: string; harness: string; runtime_provider: string; @@ -49,7 +55,9 @@ function emptyDraft(): ProviderDraft { name: "", command: "", display_name: "", - default_model: "", + model_default: "", + curated_models: "", + curated_snapshot: [], target_env: "", harness: "acp", runtime_provider: "", @@ -70,11 +78,14 @@ function emptyDraft(): ProviderDraft { function toDraft(entry: SettingsProviderEntry): ProviderDraft { const credentialSlots = 
credentialSlotsForDraft(entry.settings.credential_slots ?? []); const credentialSlot = credentialSlots[0]; + const curatedSnapshot = (entry.settings.models?.curated ?? []).map(model => ({ ...model })); return { name: entry.name, command: entry.settings.command ?? "", display_name: entry.settings.display_name ?? "", - default_model: entry.settings.default_model ?? "", + model_default: entry.settings.models?.default ?? "", + curated_models: joinCuratedModels(curatedSnapshot), + curated_snapshot: curatedSnapshot, target_env: credentialSlot?.target_env ?? "", harness: entry.settings.harness ?? "acp", runtime_provider: entry.settings.runtime_provider ?? "", @@ -96,7 +107,10 @@ function toRequest(draft: ProviderDraft): SettingsProviderRequest { const settings: SettingsProviderRequest["settings"] = {}; if (draft.command.trim()) settings.command = draft.command.trim(); if (draft.display_name.trim()) settings.display_name = draft.display_name.trim(); - if (draft.default_model.trim()) settings.default_model = draft.default_model.trim(); + settings.models = { + ...(draft.model_default.trim() ? { default: draft.model_default.trim() } : {}), + curated: parseCuratedModels(draft.curated_models, draft.curated_snapshot), + }; if (draft.harness.trim()) settings.harness = draft.harness.trim(); if (draft.runtime_provider.trim()) settings.runtime_provider = draft.runtime_provider.trim(); if (draft.transport.trim()) settings.transport = draft.transport.trim(); @@ -137,6 +151,33 @@ function envSecretRef(apiKeyEnv?: string): string { return envName ? 
`env:${envName}` : ""; } +function joinCuratedModels(models: ProviderModelPayload[]): string { + return models + .map(model => model.id.trim()) + .filter(Boolean) + .join("\n"); +} + +function parseCuratedModels(raw: string, snapshot: ProviderModelPayload[]): ProviderModelPayload[] { + const seen = new Set(); + const models: ProviderModelPayload[] = []; + const snapshotById = new Map( + snapshot.filter(entry => entry.id.trim().length > 0).map(entry => [entry.id.trim(), entry]) + ); + for (const part of raw.split(/[\n,]/u)) { + const id = part.trim(); + if (!id || seen.has(id)) continue; + seen.add(id); + const enrichment = snapshotById.get(id); + if (enrichment) { + models.push({ ...enrichment, id }); + } else { + models.push({ id }); + } + } + return models; +} + function credentialSlotsForDraft( slots: ProviderCredentialSlotDraft[] ): ProviderCredentialSlotDraft[] { diff --git a/web/src/routes/_app.tsx b/web/src/routes/_app.tsx index 9de93590b..5d86b0815 100644 --- a/web/src/routes/_app.tsx +++ b/web/src/routes/_app.tsx @@ -66,17 +66,32 @@ function AppLayout() { /> diff --git a/web/src/routes/_app/settings/__tests__/-providers.test.tsx b/web/src/routes/_app/settings/__tests__/-providers.test.tsx index de34d5f05..4b695a4ce 100644 --- a/web/src/routes/_app/settings/__tests__/-providers.test.tsx +++ b/web/src/routes/_app/settings/__tests__/-providers.test.tsx @@ -27,7 +27,10 @@ const claudeEntry: SettingsProviderEntry = { command_available: true, settings: { command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - default_model: "claude-sonnet-4-6", + models: { + default: "claude-sonnet-4-6", + curated: [{ id: "claude-sonnet-4-6" }, { id: "claude-haiku-4-5" }], + }, auth_mode: "native_cli", env_policy: "filtered", home_policy: "operator", @@ -60,7 +63,17 @@ const builtinEntry: SettingsProviderEntry = { command_available: true, settings: { command: "npx -y @zed-industries/codex-acp@latest", - default_model: "gpt-5.4", + models: { + default: "gpt-5.4", + 
curated: [ + { + id: "gpt-5.4", + supports_reasoning: true, + reasoning_efforts: ["low", "medium", "high"], + }, + { id: "gpt-5.4-mini" }, + ], + }, auth_mode: "bound_secret", env_policy: "filtered", home_policy: "operator", @@ -158,6 +171,32 @@ vi.mock("@/hooks/routes/use-settings-providers-page", () => ({ useSettingsProvidersPage: () => pageState, })); +vi.mock("@/systems/model-catalog", async () => { + const actual = + await vi.importActual("@/systems/model-catalog"); + return { + ...actual, + useProviderModels: () => ({ + data: undefined, + isLoading: false, + isFetching: false, + error: null, + }), + useProviderModelStatus: () => ({ + data: { sources: [] }, + isLoading: false, + isFetching: false, + error: null, + }), + useRefreshProviderModels: () => ({ + mutate: vi.fn(), + mutateAsync: vi.fn(), + isPending: false, + error: null, + }), + }; +}); + function defaultEditor() { return { mode: "closed" as const }; } @@ -244,6 +283,12 @@ describe("ProvidersSettingsPage", () => { expect(screen.getByTestId("settings-page-providers-card-claude-auth-mode")).toHaveTextContent( "native_cli" ); + expect( + screen.getByTestId("settings-page-providers-card-claude-curated-models") + ).toHaveTextContent("claude-sonnet-4-6"); + expect(screen.getByTestId("settings-page-providers-card-codex-reasoning")).toHaveTextContent( + "Per model" + ); expect(screen.getByTestId("settings-page-providers-card-claude-auth-status")).toHaveTextContent( "native_cli" ); @@ -326,7 +371,8 @@ describe("ProvidersSettingsPage", () => { name: "claude", command: "npx -y @agentclientprotocol/claude-agent-acp@latest", display_name: "Claude", - default_model: "claude-sonnet-4-6", + model_default: "claude-sonnet-4-6", + curated_models: "claude-sonnet-4-6\nclaude-haiku-4-5", target_env: "", harness: "acp", runtime_provider: "", @@ -354,6 +400,9 @@ describe("ProvidersSettingsPage", () => { expect(screen.getByTestId("settings-providers-editor-command-input")).toHaveValue( "npx -y 
@agentclientprotocol/claude-agent-acp@latest" ); + expect(screen.getByTestId("settings-providers-editor-curated-models-input")).toHaveValue( + "claude-sonnet-4-6\nclaude-haiku-4-5" + ); expect(screen.getByTestId("settings-providers-editor-source-effective")).toHaveTextContent( "CONFIG" ); @@ -368,7 +417,8 @@ describe("ProvidersSettingsPage", () => { name: "claude", command: "npx -y @agentclientprotocol/claude-agent-acp@latest", display_name: "", - default_model: "", + model_default: "", + curated_models: "", target_env: "ANTHROPIC_API_KEY", harness: "acp", runtime_provider: "", diff --git a/web/src/routes/_app/settings/providers.tsx b/web/src/routes/_app/settings/providers.tsx index 0acd2f1c6..05f7ac9a1 100644 --- a/web/src/routes/_app/settings/providers.tsx +++ b/web/src/routes/_app/settings/providers.tsx @@ -10,6 +10,7 @@ import { Input, NativeSelect, NativeSelectOption, + Textarea, } from "@agh/ui"; import { @@ -296,10 +297,27 @@ function ProviderEditor({ - onChange(current => ({ ...current, default_model: event.target.value })) + onChange(current => ({ ...current, model_default: event.target.value })) + } + /> + } + /> + + onChange(current => ({ ...current, curated_models: event.target.value })) } /> } diff --git a/web/src/systems/agent/components/agent-sessions-list.tsx b/web/src/systems/agent/components/agent-sessions-list.tsx index 3d45904ec..e92353d53 100644 --- a/web/src/systems/agent/components/agent-sessions-list.tsx +++ b/web/src/systems/agent/components/agent-sessions-list.tsx @@ -36,25 +36,29 @@ export function AgentSessionsList({ if (isError) { return ( - +
+ +
); } if (sessions.length === 0) { return ( - +
+ +
); } diff --git a/web/src/systems/automation/components/automation-run-history.tsx b/web/src/systems/automation/components/automation-run-history.tsx index 2d23d02d2..c3a8e6f62 100644 --- a/web/src/systems/automation/components/automation-run-history.tsx +++ b/web/src/systems/automation/components/automation-run-history.tsx @@ -58,7 +58,7 @@ export function AutomationRunHistory({ />
) : error ? ( -
+
) : runs.length === 0 ? ( -
+
) : ( diff --git a/web/src/systems/knowledge/components/knowledge-delete-dialog.tsx b/web/src/systems/knowledge/components/knowledge-delete-dialog.tsx index f9f6e26de..f8bed3559 100644 --- a/web/src/systems/knowledge/components/knowledge-delete-dialog.tsx +++ b/web/src/systems/knowledge/components/knowledge-delete-dialog.tsx @@ -55,7 +55,7 @@ function KnowledgeDeleteDialog({ {error}
) : null} - +
) : null} - + + ) : null} + + + + Reasoning effort + + Hint reasoning depth when the selected provider supports it. + + + {defaultReasoning ? ( +

+ Default reasoning: {defaultReasoning} +

+ ) : null} +
+
+ {submitError ? (

- + + + ); +} + +function formatRowCount(source: ProviderModelSourceStatus): string { + return `${source.row_count} rows`; +} + +function errorMessage(error: unknown): string | null { + if (error instanceof Error && error.message.trim().length > 0) { + return error.message; + } + return null; +} diff --git a/web/src/systems/settings/components/settings-page-shell.tsx b/web/src/systems/settings/components/settings-page-shell.tsx index 1c4577c4c..80143c1da 100644 --- a/web/src/systems/settings/components/settings-page-shell.tsx +++ b/web/src/systems/settings/components/settings-page-shell.tsx @@ -74,7 +74,7 @@ function SettingsPageShell({ )} data-testid={`settings-page-${slug}-body`} > -

{children}
+
{children}
{footer ? ( diff --git a/web/src/systems/settings/mocks/fixtures.ts b/web/src/systems/settings/mocks/fixtures.ts index d1355fe52..9bb7febe9 100644 --- a/web/src/systems/settings/mocks/fixtures.ts +++ b/web/src/systems/settings/mocks/fixtures.ts @@ -377,7 +377,10 @@ export const settingsProviderFixtures: SettingsProviderEntry[] = [ settings: { command: "npx -y @agentclientprotocol/claude-agent-acp@latest", display_name: "Claude Code", - default_model: "claude-sonnet-4-6", + models: { + default: "claude-sonnet-4-6", + curated: [{ id: "claude-sonnet-4-6" }, { id: "claude-haiku-4-5" }], + }, harness: "acp", auth_mode: "native_cli", env_policy: "filtered", @@ -402,7 +405,10 @@ export const settingsProviderFixtures: SettingsProviderEntry[] = [ fallback: { settings: { command: "npx -y @agentclientprotocol/claude-agent-acp@latest", - default_model: "claude-sonnet-4-6", + models: { + default: "claude-sonnet-4-6", + curated: [{ id: "claude-sonnet-4-6" }, { id: "claude-haiku-4-5" }], + }, auth_mode: "native_cli", env_policy: "filtered", home_policy: "operator", @@ -416,7 +422,18 @@ export const settingsProviderFixtures: SettingsProviderEntry[] = [ command_available: true, settings: { command: "npx -y @zed-industries/codex-acp@latest", - default_model: "gpt-5.4", + models: { + default: "gpt-5.4", + curated: [ + { + id: "gpt-5.4", + supports_reasoning: true, + reasoning_efforts: ["low", "medium", "high"], + default_reasoning_effort: "medium", + }, + { id: "gpt-5.4-mini" }, + ], + }, harness: "acp", auth_mode: "native_cli", env_policy: "filtered", @@ -445,7 +462,13 @@ export const settingsProviderFixtures: SettingsProviderEntry[] = [ settings: { command: "npx -y pi-acp@latest", display_name: "OpenRouter", - default_model: "openai/gpt-5.4", + models: { + default: "openai/gpt-5.4", + curated: [ + { id: "openai/gpt-5.4", supports_reasoning: true }, + { id: "anthropic/claude-sonnet-4-6" }, + ], + }, harness: "pi_acp", runtime_provider: "openrouter", auth_mode: "bound_secret", @@ 
-627,7 +650,7 @@ export const settingsProviderFixtures: SettingsProviderEntry[] = [ settings: { command: "npx -y @qwen-code/qwen-code@latest --acp --experimental-skills", display_name: "Qwen Code", - default_model: "qwen3.6-plus", + models: { default: "qwen3.6-plus", curated: [{ id: "qwen3.6-plus" }] }, harness: "acp", }, source_metadata: { diff --git a/web/src/systems/skill/components/skill-detail-panel.tsx b/web/src/systems/skill/components/skill-detail-panel.tsx index 9ee22192a..4548dd238 100644 --- a/web/src/systems/skill/components/skill-detail-panel.tsx +++ b/web/src/systems/skill/components/skill-detail-panel.tsx @@ -145,7 +145,7 @@ function SkillCapabilitiesSection({ skill }: { skill: SkillPayload }) { return (
{capabilities.length === 0 ? ( -
+
{calls.length === 0 ? ( -
+
(); + for (const option of options) { + const key = harnessGroupKey(option.harness); + let bucket = buckets.get(key); + if (!bucket) { + bucket = { key, heading: harnessGroupHeading(key), options: [] }; + buckets.set(key, bucket); + } + bucket.options.push(option); + } + const order = Array.from(buckets.values()); + order.sort((a, b) => { + if (a.key === FALLBACK_GROUP_KEY) return 1; + if (b.key === FALLBACK_GROUP_KEY) return -1; + return a.heading.localeCompare(b.heading); + }); + return order; +} + +function providerSearchKey(option: SessionProviderOption): string { + const segments = [option.name]; + if (option.display_name) segments.push(option.display_name); + if (option.harness) segments.push(option.harness); + if (option.runtime_provider) segments.push(option.runtime_provider); + return segments.join(" "); +} + +export interface ProviderCommandListProps { + options: SessionProviderOption[]; + isSelected: (option: SessionProviderOption) => boolean; + onSelect: (option: SessionProviderOption) => void; + searchPlaceholder?: string; + emptyState?: ReactNode; + itemTestId?: (option: SessionProviderOption) => string; +} + +export function ProviderCommandList({ + options, + isSelected, + onSelect, + searchPlaceholder = "Search providers...", + emptyState = "No providers match your search.", + itemTestId, +}: ProviderCommandListProps) { + const groups = useMemo(() => bucketByHarness(options), [options]); + + return ( + + + + {emptyState} + {groups.map(group => ( + + {group.options.map(option => { + const selected = isSelected(option); + return ( + onSelect(option)} + data-checked={selected ? "true" : "false"} + data-testid={ + itemTestId ? itemTestId(option) : `provider-command-item-${option.name}` + } + > +
+ + {option.display_name?.trim() || option.name} + + + {option.harness ?? "acp"} + +
+
+ ); + })} +
+ ))} +
+
+ ); +} diff --git a/web/src/systems/workspace/components/provider-command-select.tsx b/web/src/systems/workspace/components/provider-command-select.tsx new file mode 100644 index 000000000..b12df9d51 --- /dev/null +++ b/web/src/systems/workspace/components/provider-command-select.tsx @@ -0,0 +1,75 @@ +import { useMemo, useState } from "react"; +import { ChevronsUpDown, Cpu } from "lucide-react"; + +import { cn, Popover, PopoverContent, PopoverTrigger } from "@agh/ui"; + +import { ProviderCommandList } from "./provider-command-list"; +import type { SessionProviderOption } from "../types"; + +const TRIGGER_BASE = + "flex h-9 w-full items-center justify-between gap-2 rounded-md border border-input bg-background px-3 py-2 text-sm shadow-none outline-none transition-colors hover:bg-accent disabled:cursor-not-allowed disabled:opacity-50 focus-visible:ring-2 focus-visible:ring-ring/50"; + +export interface ProviderCommandSelectProps { + options: SessionProviderOption[]; + value: string | null; + onChange: (next: string | null) => void; + placeholder?: string; + disabled?: boolean; + triggerId?: string; + triggerTestId?: string; + className?: string; +} + +export function ProviderCommandSelect({ + options, + value, + onChange, + placeholder = "Select a provider", + disabled, + triggerId, + triggerTestId, + className, +}: ProviderCommandSelectProps) { + const [open, setOpen] = useState(false); + const selected = useMemo( + () => options.find(option => option.name === value) ?? null, + [options, value] + ); + const isSelected = (option: SessionProviderOption) => option.name === value; + const handleSelect = (option: SessionProviderOption) => { + onChange(option.name); + setOpen(false); + }; + + return ( + setOpen(next)}> + + {selected ? 
( + + + ) : ( + {placeholder} + )} + + + + + + ); +} diff --git a/web/src/systems/workspace/index.ts b/web/src/systems/workspace/index.ts index cc8c91238..bfde1e55f 100644 --- a/web/src/systems/workspace/index.ts +++ b/web/src/systems/workspace/index.ts @@ -20,4 +20,8 @@ export { useActiveWorkspace } from "./hooks/use-active-workspace"; export { useResolveWorkspace, useWorkspace, useWorkspaces } from "./hooks/use-workspaces"; // Components +export { ProviderCommandList } from "./components/provider-command-list"; +export type { ProviderCommandListProps } from "./components/provider-command-list"; +export { ProviderCommandSelect } from "./components/provider-command-select"; +export type { ProviderCommandSelectProps } from "./components/provider-command-select"; export { WorkspaceOnboarding, WorkspaceSetupDialog } from "./components/workspace-setup"; diff --git a/web/src/systems/workspace/mocks/fixtures.ts b/web/src/systems/workspace/mocks/fixtures.ts index d9c745001..ec48195e7 100644 --- a/web/src/systems/workspace/mocks/fixtures.ts +++ b/web/src/systems/workspace/mocks/fixtures.ts @@ -153,35 +153,30 @@ export const workspaceDetailFixture: WorkspaceDetailPayload = { display_name: "Claude Code", harness: "acp", runtime_provider: "claude", - default_model: "claude-sonnet-4-6", }, { name: "codex", display_name: "Codex", harness: "acp", runtime_provider: "codex", - default_model: "gpt-5.4", }, { name: "gemini", display_name: "Gemini CLI", harness: "acp", runtime_provider: "gemini", - default_model: "gemini-2.5-pro", }, { name: "qwen-code", display_name: "Qwen Code", harness: "acp", runtime_provider: "qwen-code", - default_model: "qwen3.6-plus", }, { name: "openrouter", display_name: "OpenRouter", harness: "pi_acp", runtime_provider: "openrouter", - default_model: "openai/gpt-5.4", }, { name: "cline", display_name: "Cline", harness: "acp", runtime_provider: "cline" }, { name: "hermes", display_name: "Hermes", harness: "acp", runtime_provider: "hermes" },