feat: add /force-model and /unforce-model session commands#167

Merged
steventohme merged 11 commits into main from steven/feat/force-model-command on May 20, 2026
Conversation

@steventohme
Collaborator

@steventohme steventohme commented May 16, 2026

Summary

  • /force-model <model> in any user message pins the router to a specific model for the entire session — scorer and planner are bypassed so the choice is immutable until cleared
  • /unforce-model expires the forced pin and resumes automatic routing
  • Commands are stripped from the message body before forwarding, so the upstream never sees them; a synthetic Anthropic acknowledgment response is returned instead

How it works

Detection (internal/translate/force_model.go): env.ExtractForceModelCommand() scans the last user-role message for the directive (handles both string content and [{type:text}] array blocks), strips the command line, and returns the parsed result. Pure function, no I/O.

Handling (internal/proxy/force_model.go): handleForceModelCommand infers the provider from the model name, writes a user_forced session pin (or writes an immediately-expired pin for /unforce-model), and returns a synthetic Anthropic response (streaming SSE or non-streaming JSON).

Turn loop (internal/proxy/turnloop.go): After loadPin, forced pins (pin.Reason == "user_forced") short-circuit before the scorer and planner and return the pinned model directly. The TTL is refreshed each turn.

Hook (internal/proxy/service.go): Command detection runs in ProxyMessages after session context is resolved, only when a pinStore is configured.

Closes https://linear.app/workweave/issue/W-1408/force-model

Test plan

  • /force-model deepseek/deepseek-v4-pro in a Claude Code session → router acknowledges and all subsequent turns use DeepSeek
  • /unforce-model → router acknowledges and automatic selection resumes
  • Command followed by other text in the same message (e.g. /force-model gpt-5\nActual question here) → command stripped, question forwarded normally
  • Unit tests pass: go test ./internal/translate/... ./internal/proxy/...

🤖 Generated with Claude Code

Comment thread internal/proxy/turnloop.go Outdated
Comment thread internal/proxy/turnloop.go Outdated
Comment thread internal/proxy/service.go
Comment thread internal/proxy/force_model.go
Comment thread internal/proxy/turnloop.go
Comment thread internal/proxy/turnloop.go
Comment thread internal/proxy/turnloop.go Outdated
steventohme added a commit that referenced this pull request May 20, 2026
… tier clamp

Two bugs flagged on PR #167:

- The user_forced pin branch did not check the request's EnabledProviders, so
  a forced pin could dispatch to a provider without BYOK creds on the current
  request. Now the pin is treated as missing on this turn when the pinned
  provider isn't eligible, falling through to normal routing.

- clampToCeiling could downgrade the forced decision and refreshPin then
  persisted the clamped model back into the pin, permanently losing the user's
  /force-model choice. The clamped decision now applies only to this turn's
  dispatch; the pin is refreshed with the original pin decision.

Adds regression tests for both paths.
Comment thread internal/translate/force_model.go
Comment thread internal/proxy/force_model.go
steventohme added a commit that referenced this pull request May 20, 2026
…o Debug

Two security findings on PR #167:

- parseForceModelCommand matched /force-model on any line of the final user
  message. Pasted content (snippets, transcripts) starting with "/" could
  silently rewrite session routing without explicit user intent. Restrict to
  the first non-empty line; update tests to assert the guard.

- /force-model and /unforce-model logged session_key_hex (a stable per-session
  identifier) at Info level on every command use. Demote to Debug per the
  router's logging rules — this isn't a major business event and the
  identifier shouldn't be broadcast to shared log pipelines by default.
Comment thread internal/proxy/force_model.go
Users can type /force-model <model> in any message to pin the router to
a specific model for the entire session. /unforce-model clears the pin
and resumes automatic routing.

- translate.ExtractForceModelCommand: pure parser that scans the last
  user message for the command and strips it from env.body before the
  request is forwarded upstream
- proxy.handleForceModelCommand: writes a user_forced session pin (or
  expires it on /unforce-model) and returns a synthetic Anthropic
  acknowledgment response without hitting any upstream
- turnloop: forced pins short-circuit scorer and planner entirely,
  making the override immune to planner switching
- inferProviderForModel: infers provider from model name conventions
  (claude-* → Anthropic, gpt-*/o-series → OpenAI, gemini-* → Google,
  slash-namespaced → OpenRouter)
- Preserve ReasonUserForceModel after clampToCeiling: clamp was
  appending '+tier_clamp', breaking the exact-match on the next turn
- Enforce per-request excluded-model policy on forced pins: fall
  through to normal routing if the forced model is excluded
- Derive session key before ExtractForceModelCommand strips env.body:
  avoids session-key mismatch on the prompt-prefix fallback path when
  metadata.user_id is absent
- Format synthetic acknowledgment as routing marker prefix so
  StripRoutingMarkerFromMessages strips it from subsequent inbound
  requests instead of leaking router text to the upstream
…es the ack

Bugbot flagged: the prior async enqueuePinUpsert path could drop the write on
semaphore saturation while the new pin was always added synchronously to the
in-proc LRU. On /unforce-model that meant Postgres could still hold the active
forced pin even after the client received "cleared". On the next request the
expired LRU entry would be evicted, loadPin would fall through to the stale
Postgres row, and the forced pin would be silently resurrected.

These are explicit user commands (one upsert each), not hot-path turns; pay
the synchronous DB round-trip to guarantee the pin matches the acknowledgment.
For /unforce-model the cache eviction also moves to AFTER the Postgres write
to prevent a racing reader from repopulating the LRU from the prior row.
Branch was based on a pre-catalog-refactor commit; main has since deleted
internal/router/capability/ in favor of the unified internal/router/catalog/
package. Rebased on main and remapped the two capability references (the
TierFor call site in handleForceModelCommand and the catalog.Tier signature
in the user_forced regression test) to their catalog equivalents.
@steventohme steventohme force-pushed the steven/feat/force-model-command branch from 8a95118 to c6f3cb7 Compare May 20, 2026 18:58
Comment thread internal/translate/force_model.go
Comment thread internal/translate/force_model.go
…h commands

Claude Code intercepts any prompt starting with "/" as a local slash command;
typing /force-model would otherwise resolve to "Unknown command" and never
reach the router. Ship two markdown wrapper files into <scope>/.claude/commands/
so the typed slash renders to a prompt of the same form, which the router's
existing first-line parser picks up.

- install/commands/{force-model,unforce-model}.md: the wrapper command files.
- install.sh: copy commands into $settings_dir/commands/ after writing
  settings; resolved relative to the script so both ./install.sh and the
  npm-bundled layout (next to bin.js) find them.
- uninstall.sh: remove only the two files this installer owns and rmdir the
  commands/ directory if empty, leaving any user-authored commands alone.
- npm prepack: mirror install/commands/ into the package root and list
  commands/ in package.json files so `npx @workweave/router` ships them.
Comment thread install/uninstall.sh
Codex CLI supports custom prompts in <codex-home>/prompts/*.md, invoked as
/prompts:<name> with $ARGUMENTS expansion — same template-then-send model as
Claude Code, just a different on-disk location and a /prompts: prefix.

- Hoisted the slash-command writer into a shared install_slash_commands()
  helper above the codex branch.
- Codex install path now drops the same two wrapper files into
  <codex_dir>/prompts/ so /prompts:force-model expands to "/force-model <id>"
  in the user message, which the router's first-line parser picks up.
- uninstall.sh removes only those two files from <codex_dir>/prompts/ and
  rmdirs the directory if empty, mirroring the Claude path.
- Made the description text target-neutral ("this session" not "this
  Claude Code session").
Comment thread internal/proxy/service.go
cursoragent and others added 2 commits May 20, 2026 21:48
Co-authored-by: Steven <steven@workweave.ai>
Co-authored-by: Steven <steven@workweave.ai>
Comment thread install/npm/scripts/copy-installer.js Outdated
Comment thread internal/proxy/force_model.go Outdated
Co-authored-by: Steven <steven@workweave.ai>
@steventohme steventohme merged commit d481ce3 into main May 20, 2026
7 checks passed

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit aa50a95.


func (t *ResponsesWriter) assembleOutput() []any {
-	out := make([]any, 0, 2+len(t.toolItems))
+	out := make([]any, 0, len(t.toolItems))

Slice capacity under-allocates by omitting text item

Low Severity

The capacity hint for assembleOutput changed from 2+len(t.toolItems) to len(t.toolItems), dropping the +2 that accounted for the text item (and a margin). In the common text-only-response case (toolItems is empty, textItem is non-nil), the initial capacity is now 0 instead of 2, triggering an unnecessary heap allocation on the first append. The old hint was correct; the new one systematically under-counts by 1 whenever a text item is present.
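
The effect of the hint can be checked with a small stand-alone sketch. The assemble function is hypothetical and only mimics the shape of assembleOutput: an optional text item followed by tool items.

```go
package main

import "fmt"

// assemble builds the output slice with a caller-chosen capacity hint and
// reports whether append had to grow the backing array.
func assemble(text any, tools []any, hint int) (out []any, grew bool) {
	out = make([]any, 0, hint)
	initial := cap(out)
	if text != nil {
		out = append(out, text)
	}
	out = append(out, tools...)
	return out, cap(out) > initial
}

func main() {
	// Text-only response with a hint of len(tools): capacity starts at 0.
	_, grew := assemble("hello", nil, 0)
	fmt.Println("hint len(tools):", grew)

	// Same response with a hint that counts the text item.
	_, grew = assemble("hello", nil, 1)
	fmt.Println("hint 1+len(tools):", grew)
}
```

The first call reports growth and the second does not, which matches the reviewer's point that the hint under-counts by one whenever a text item is present.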


2 participants