feat(models): cap free workspaces to low-tier models #924
Merged
Free workspaces resolved to top-tier models for chat and most LLM work. Now, when billing is enabled, free plans are forced to the "low" tier; PRO/MAX, self-hosted (billing disabled), and explicit model overrides are unaffected.
- getModelForUseCase: cap effective complexity to "low" for non-paid plans (after the explicit modelConfig override check), fetching planType via the existing workspace query.
- resolveDefaultChatModelId: new plan-aware resolver for the main chat agent, which previously always used env.MODEL. Free -> DB low tier (honoring modelConfig.chat); paid / billing-off / no-workspace -> env.MODEL.
- Wire resolveDefaultChatModelId into the chat entry points (conversation, no-stream, voice), keeping body.modelId precedence.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Summary
Free workspaces were resolving to top-tier models for chat and most LLM work. This change forces free plans to the "low" model tier when billing is enabled. PRO/MAX, self-hosted (billing disabled), and explicit model overrides are unaffected.
Two paths sent free users to top-tier models; both are fixed at their chokepoints in apps/webapp/app/services/llm-provider.server.ts:
Part A: routed path (getModelForUseCase), covering memory, search, sub-agents, titles, summaries, etc. After the explicit-override check, if billing is enabled and the workspace isn't on a paid plan, the effective complexity is forced to "low". planType is read from the existing workspace query (added Subscription to the select, so no extra round-trip).
Part B: main chat agent. It previously used getDefaultChatModelId() = env.MODEL for every plan (the real "high model for everything"). New resolveDefaultChatModelId(workspaceId):
- Free -> "low" tier (honoring a modelConfig.chat override)
- Paid / billing-off / no-workspace -> env.MODEL (unchanged)
Wired into the 3 chat entry points (api.v1.conversation._index.tsx, no-stream-process.ts, api.v1.voice.turn.tsx), keeping body.modelId precedence so an explicit per-request model still wins.
Decisions encoded
- "low" for everything
- "low" complexity routing (no new env var)
Net effect
PRO/MAX and self-hosted are byte-for-byte unchanged; only free hosted workspaces drop to low-tier.
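
For reviewers who want the precedence rules in one place, here is a minimal sketch of the two chokepoints described above. It is not the code in llm-provider.server.ts: the real functions take a workspaceId and query the database, whereas this sketch is synchronous and takes the workspace and configuration as arguments. The types WorkspaceInfo and ResolverDeps, the helpers isCappedToLow and pickChatModel, and the modelsByTier lookup are all hypothetical names introduced for illustration. Treating a missing workspace as uncapped on the routed path is also an assumption; the summary only states that rule for the chat resolver.

```typescript
// Hypothetical types; the real schema and query shapes are not shown in this PR text.
type PlanType = "FREE" | "PRO" | "MAX";
type Complexity = "low" | "medium" | "high";

interface WorkspaceInfo {
  planType: PlanType;
  // Explicit per-use-case overrides, e.g. { chat: "some-model-id" }.
  modelConfig?: Record<string, string>;
}

interface ResolverDeps {
  billingEnabled: boolean; // false stands in for self-hosted
  envModel: string; // stands in for env.MODEL
  modelsByTier: Record<Complexity, string>; // stands in for the DB tier lookup
}

// The cap applies only when billing is on and the workspace is not on a paid plan.
// Assumption: no workspace means no cap.
function isCappedToLow(
  workspace: WorkspaceInfo | undefined,
  deps: ResolverDeps,
): boolean {
  if (!deps.billingEnabled || workspace === undefined) return false;
  return workspace.planType !== "PRO" && workspace.planType !== "MAX";
}

// Part A: the explicit override is checked first, then the complexity is capped.
function getModelForUseCase(
  useCase: string,
  complexity: Complexity,
  workspace: WorkspaceInfo | undefined,
  deps: ResolverDeps,
): string {
  const override = workspace?.modelConfig?.[useCase];
  if (override) return override;
  const effective: Complexity = isCappedToLow(workspace, deps) ? "low" : complexity;
  return deps.modelsByTier[effective];
}

// Part B: free workspaces get the low tier (or their chat override);
// paid, billing-off, and no-workspace callers keep env.MODEL.
function resolveDefaultChatModelId(
  workspace: WorkspaceInfo | undefined,
  deps: ResolverDeps,
): string {
  if (!isCappedToLow(workspace, deps)) return deps.envModel;
  return workspace?.modelConfig?.["chat"] ?? deps.modelsByTier.low;
}

// Chat entry points: an explicit per-request model still wins.
function pickChatModel(
  bodyModelId: string | undefined,
  workspace: WorkspaceInfo | undefined,
  deps: ResolverDeps,
): string {
  return bodyModelId ?? resolveDefaultChatModelId(workspace, deps);
}
```

The ordering is the point of the sketch: overrides and body.modelId are consulted before the plan check, which is why paid and self-hosted behavior stays unchanged.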
Test plan
llm-provider-plan-tier.test.ts: 12 tests, written test-first (confirmed RED → GREEN). Covers FREE cap, override honored, PRO/MAX untouched, billing-off, no-workspace, and the chat resolver.
🤖 Generated with Claude Code