feat: add /force-model and /unforce-model session commands #167
Users can type /force-model <model> in any message to pin the router to a
specific model for the entire session. /unforce-model clears the pin and
resumes automatic routing.
- translate.ExtractForceModelCommand: pure parser that scans the last user
message for the command and strips it from env.body before the request is
forwarded upstream
- proxy.handleForceModelCommand: writes a user_forced session pin (or expires
it on /unforce-model) and returns a synthetic Anthropic acknowledgment
response without hitting any upstream
- turnloop: forced pins short-circuit scorer and planner entirely, making the
override immune to planner switching
- inferProviderForModel: infers provider from model name conventions
(claude-* → Anthropic, gpt-*/o-series → OpenAI, gemini-* → Google,
slash-namespaced → OpenRouter)
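The name-convention mapping described for inferProviderForModel can be sketched as follows. This is a minimal illustration, not the PR's implementation: the function name inferProvider, the provider strings, the "o followed by a digit" check for the o-series, and the choice to test the slash namespace first are all assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// inferProvider is a hypothetical stand-in for inferProviderForModel.
// It returns ok=false when no naming convention matches.
func inferProvider(model string) (provider string, ok bool) {
	m := strings.ToLower(strings.TrimSpace(model))
	switch {
	case m == "":
		return "", false
	case strings.Contains(m, "/"):
		// Slash-namespaced IDs such as "deepseek/deepseek-v4-pro".
		// Checked first (an assumption) so "vendor/gpt-x" is not treated as OpenAI.
		return "openrouter", true
	case strings.HasPrefix(m, "claude-"):
		return "anthropic", true
	case strings.HasPrefix(m, "gpt-"), isOSeries(m):
		return "openai", true
	case strings.HasPrefix(m, "gemini-"):
		return "google", true
	}
	return "", false
}

// isOSeries reports whether m starts with 'o' followed by a digit (o1, o3-mini, ...).
func isOSeries(m string) bool {
	return len(m) >= 2 && m[0] == 'o' && m[1] >= '0' && m[1] <= '9'
}

func main() {
	models := []string{"claude-sonnet-4", "gpt-5", "o3-mini", "gemini-2.5-pro", "deepseek/deepseek-v4-pro", "mystery"}
	for _, m := range models {
		p, ok := inferProvider(m)
		fmt.Printf("%-26s provider=%q ok=%v\n", m, p, ok)
	}
}
```

Returning an explicit ok flag lets the caller reject an unrecognized model name instead of writing a pin for an unknown provider.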
- Preserve ReasonUserForceModel after clampToCeiling: clamp was appending
'+tier_clamp', breaking the exact-match on the next turn
- Enforce per-request excluded-model policy on forced pins: fall through to
normal routing if the forced model is excluded
- Derive session key before ExtractForceModelCommand strips env.body: avoids
session-key mismatch on the prompt-prefix fallback path when
metadata.user_id is absent
- Format synthetic acknowledgment as routing marker prefix so
StripRoutingMarkerFromMessages strips it from subsequent inbound requests
instead of leaking router text to the upstream
… tier clamp
Two bugs flagged on PR #167:
- The user_forced pin branch did not check the request's EnabledProviders, so
a forced pin could dispatch to a provider without BYOK creds on the current
request. Now the pin is treated as missing on this turn when the pinned
provider isn't eligible, falling through to normal routing.
- clampToCeiling could downgrade the forced decision and refreshPin then
persisted the clamped model back into the pin, permanently losing the user's
/force-model choice. The clamped decision now applies only to this turn's
dispatch; the pin is refreshed with the original pin decision.
Adds regression tests for both paths.
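The two fixes above amount to keeping "what is dispatched this turn" separate from "what is written back to the pin". A rough sketch under stated assumptions: the decision struct, resolveForcedPin, the enabled-provider map, and the clamp callback are hypothetical shapes for illustration, not the router's actual types.

```go
package main

import "fmt"

// decision is a hypothetical, simplified routing decision.
type decision struct {
	Provider string
	Model    string
	Reason   string
}

const reasonUserForced = "user_forced"

// resolveForcedPin returns the decision to dispatch on this turn and the
// decision to persist when refreshing the pin. ok=false means the pin is
// treated as missing for this turn and normal routing should run.
func resolveForcedPin(pin decision, enabled map[string]bool, clamp func(decision) decision) (dispatch, persist decision, ok bool) {
	if pin.Reason != reasonUserForced || !enabled[pin.Provider] {
		return decision{}, decision{}, false
	}
	// The clamp affects only this turn; the original pin is what gets refreshed.
	return clamp(pin), pin, true
}

func main() {
	pin := decision{Provider: "openai", Model: "gpt-5", Reason: reasonUserForced}
	// Hypothetical tier clamp that downgrades the model for this request.
	clamp := func(d decision) decision { d.Model = "gpt-5-mini"; return d }

	dispatch, persist, ok := resolveForcedPin(pin, map[string]bool{"openai": true}, clamp)
	fmt.Println(ok, dispatch.Model, persist.Model)

	_, _, ok = resolveForcedPin(pin, map[string]bool{"anthropic": true}, clamp)
	fmt.Println(ok)
}
```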
…o Debug
Two security findings on PR #167:
- parseForceModelCommand matched /force-model on any line of the final user
message. Pasted content (snippets, transcripts) starting with "/" could
silently rewrite session routing without explicit user intent. Restrict to
the first non-empty line; update tests to assert the guard.
- /force-model and /unforce-model logged session_key_hex (a stable
per-session identifier) at Info level on every command use. Demote to Debug
per the router's logging rules: this isn't a major business event and the
identifier shouldn't be broadcast to shared log pipelines by default.
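The first-non-empty-line guard can be illustrated with a small parser. This is a sketch only: forceCommand and parseFirstLine are hypothetical names, and the exact argument rules (one model argument for /force-model, none for /unforce-model) are assumptions rather than the behavior of parseForceModelCommand.

```go
package main

import (
	"fmt"
	"strings"
)

// forceCommand is a hypothetical parse result.
type forceCommand struct {
	Unforce bool
	Model   string
}

// parseFirstLine inspects only the first non-empty line of msg. It returns the
// command, the message body with the command line removed, and whether a
// command was found. Commands on later lines are ignored, so pasted
// transcripts cannot change routing.
func parseFirstLine(msg string) (forceCommand, string, bool) {
	lines := strings.Split(msg, "\n")
	for i, line := range lines {
		trimmed := strings.TrimSpace(line)
		if trimmed == "" {
			continue // skip leading blank lines
		}
		fields := strings.Fields(trimmed)
		rest := strings.TrimSpace(strings.Join(lines[i+1:], "\n"))
		switch {
		case fields[0] == "/unforce-model" && len(fields) == 1:
			return forceCommand{Unforce: true}, rest, true
		case fields[0] == "/force-model" && len(fields) == 2:
			return forceCommand{Model: fields[1]}, rest, true
		}
		// First non-empty line is not a command: stop looking.
		return forceCommand{}, msg, false
	}
	return forceCommand{}, msg, false
}

func main() {
	cmd, rest, ok := parseFirstLine("/force-model gpt-5\nActual question here")
	fmt.Printf("%+v %q %v\n", cmd, rest, ok)

	_, _, ok = parseFirstLine("pasted transcript:\n/force-model gpt-5")
	fmt.Println(ok)
}
```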
…es the ack
Bugbot flagged: the prior async enqueuePinUpsert path may drop the write on
semaphore saturation, while always (synchronously) adding the new pin to the
in-proc LRU. On /unforce-model that meant Postgres could still hold the
active forced pin even after the client received "cleared". On the next
request the expired LRU entry would be evicted, loadPin would fall through to
the stale Postgres row, and the forced pin would be silently resurrected.
These are explicit user commands (one upsert each), not hot-path turns; pay
the synchronous DB round-trip to guarantee the pin matches the
acknowledgment. For /unforce-model the cache eviction also moves to after the
Postgres write to prevent a racing reader from repopulating the LRU from the
prior row.
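The ordering described above (durable write first, cache eviction second, acknowledgment last) might look like the following. Everything here is a stand-in: pinRow, fakeDB, and unforce are hypothetical, and a plain map replaces both Postgres and the in-proc LRU so the sketch stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// pinRow is a hypothetical durable record of a session pin.
type pinRow struct {
	Model   string
	Expired bool
}

// fakeDB stands in for the Postgres-backed store; failNext simulates one failed write.
type fakeDB struct {
	rows     map[string]pinRow
	failNext bool
}

func (d *fakeDB) expirePin(key string) error {
	if d.failNext {
		d.failNext = false
		return errors.New("write failed")
	}
	row := d.rows[key]
	row.Expired = true
	d.rows[key] = row
	return nil
}

// unforce expires the pin synchronously and only then evicts the cache entry.
// It acknowledges only when the durable state already matches the ack.
func unforce(db *fakeDB, lru map[string]pinRow, key string) (ack bool, err error) {
	if err := db.expirePin(key); err != nil {
		return false, err // no ack, cache untouched
	}
	delete(lru, key) // evict after the write so a racing reader cannot reload the old row
	return true, nil
}

func main() {
	db := &fakeDB{rows: map[string]pinRow{"s1": {Model: "gpt-5"}}, failNext: true}
	lru := map[string]pinRow{"s1": {Model: "gpt-5"}}

	ack, err := unforce(db, lru, "s1")
	fmt.Println(ack, err, len(lru))

	ack, err = unforce(db, lru, "s1")
	fmt.Println(ack, err, len(lru), db.rows["s1"].Expired)
}
```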
Branch was based on a pre-catalog-refactor commit; main has since deleted internal/router/capability/ in favor of the unified internal/router/catalog/ package. Rebased on main and remapped the two capability references (the TierFor call site in handleForceModelCommand and the catalog.Tier signature in the user_forced regression test) to their catalog equivalents.
Force-pushed from 8a95118 to c6f3cb7.
…h commands
Claude Code intercepts any prompt starting with "/" as a local slash command;
typing /force-model would otherwise resolve to "Unknown command" and never
reach the router. Ship two markdown wrapper files into <scope>/.claude/commands/
so the typed slash renders to a prompt of the same form, which the router's
existing first-line parser picks up.
- install/commands/{force-model,unforce-model}.md: the wrapper command files.
- install.sh: copy commands into $settings_dir/commands/ after writing
settings; resolved relative to the script so both ./install.sh and the
npm-bundled layout (next to bin.js) find them.
- uninstall.sh: remove only the two files this installer owns and rmdir the
commands/ directory if empty, leaving any user-authored commands alone.
- npm prepack: mirror install/commands/ into the package root and list
commands/ in package.json files so `npx @workweave/router` ships them.
Codex CLI supports custom prompts in <codex-home>/prompts/*.md, invoked as
/prompts:<name> with $ARGUMENTS expansion — same template-then-send model as
Claude Code, just a different on-disk location and a /prompts: prefix.
- Hoisted the slash-command writer into a shared install_slash_commands()
helper above the codex branch.
- Codex install path now drops the same two wrapper files into
<codex_dir>/prompts/ so /prompts:force-model expands to "/force-model <id>"
in the user message, which the router's first-line parser picks up.
- uninstall.sh removes only those two files from <codex_dir>/prompts/ and
rmdirs the directory if empty, mirroring the Claude path.
- Made the description text target-neutral ("this session" not "this
Claude Code session").
Co-authored-by: Steven <steven@workweave.ai>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit aa50a95.
    func (t *ResponsesWriter) assembleOutput() []any {
    -	out := make([]any, 0, 2+len(t.toolItems))
    +	out := make([]any, 0, len(t.toolItems))
Slice capacity under-allocates by omitting text item
Low Severity
The capacity hint for assembleOutput changed from 2+len(t.toolItems) to len(t.toolItems), dropping the +2 that accounted for the text item (and a margin). In the common text-only-response case (toolItems is empty, textItem is non-nil), the initial capacity is now 0 instead of 2, triggering an unnecessary heap allocation on the first append. The old hint was correct; the new one systematically under-counts by 1 whenever a text item is present.
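One way to avoid both the under-count and the fixed margin is to compute the capacity from the items actually present. The sketch below is an illustration, not the PR's code: the responsesWriter type is a simplified stand-in, and the field types and the text-before-tools ordering are assumptions.

```go
package main

import "fmt"

// responsesWriter is a simplified, hypothetical stand-in for ResponsesWriter.
type responsesWriter struct {
	textItem  any   // nil when the response has no text item
	toolItems []any // zero or more tool-call items
}

// assembleOutput sizes the slice exactly: one slot for the text item when it
// is present, plus one per tool item, so no append has to grow the slice.
func (t *responsesWriter) assembleOutput() []any {
	n := len(t.toolItems)
	if t.textItem != nil {
		n++
	}
	out := make([]any, 0, n)
	if t.textItem != nil {
		out = append(out, t.textItem)
	}
	return append(out, t.toolItems...)
}

func main() {
	textOnly := &responsesWriter{textItem: "hello"}
	out := textOnly.assembleOutput()
	fmt.Println(len(out), cap(out))

	mixed := &responsesWriter{textItem: "hello", toolItems: []any{"call_a", "call_b"}}
	out = mixed.assembleOutput()
	fmt.Println(len(out), cap(out))
}
```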


Summary
- /force-model <model> in any user message pins the router to a specific
model for the entire session. Scorer and planner are bypassed, so the choice
is immutable until cleared.
- /unforce-model expires the forced pin and resumes automatic routing.

How it works
- Detection (internal/translate/force_model.go): env.ExtractForceModelCommand()
scans the last user-role message for the directive (handles both string
content and [{type:text}] array blocks), strips the command line, and returns
the parsed result. Pure function, no I/O.
- Handling (internal/proxy/force_model.go): handleForceModelCommand infers the
provider from the model name, writes a user_forced session pin (or writes an
immediately-expired pin for /unforce-model), and returns a synthetic
Anthropic response (streaming SSE or non-streaming JSON).
- Turn loop (internal/proxy/turnloop.go): After loadPin, forced pins
(pin.Reason == "user_forced") short-circuit before the scorer and planner and
return the pinned model directly. The TTL is refreshed each turn.
- Hook (internal/proxy/service.go): Command detection runs in ProxyMessages
after session context is resolved, only when a pinStore is configured.

Closes https://linear.app/workweave/issue/W-1408/force-model

Test plan
- /force-model deepseek/deepseek-v4-pro in a Claude Code session → router
acknowledges and all subsequent turns use DeepSeek
- /unforce-model → router acknowledges and automatic selection resumes
- /force-model gpt-5\nActual question here → command stripped, question
forwarded normally
- go test ./internal/translate/... ./internal/proxy/...

🤖 Generated with Claude Code