Summary
Command Code uses background models (MiniMax-M2.5, Kimi-K2.5) for internal tasks such as title generation, tool call naming, and taste learning. These calls are not user-selectable, yet they appear in usage logs and consume credits. In my sessions, these background calls account for a significant share of total cost, sometimes roughly half the spend for a task, because tool descriptions appear to be generated or processed on roughly every other request.
Problems
MiniMax-M2.5 is not cost-effective for trivial tasks. Title generation and tool naming are lightweight operations that do not require a 400B+ parameter model. Much smaller and cheaper alternatives (e.g. Qwen 3 Flash, DeepSeek V4 Flash, Gemma 3 4B, Phi-4-mini) would handle these tasks at a fraction of the cost.
Users have no control over the background model. The /model selector only affects the primary conversation model. There is no way to choose, downgrade, or disable the background model — yet the resulting token usage is billed to the user's account.
Frequency of background calls inflates cost. Usage logs show tool-desc/summary calls being made repeatedly (appearing every other request in some sessions). Even at low per-call cost, this adds up significantly over a session.
Expected Behavior
Background helper tasks should use the cheapest viable model available, not a mid-tier general-purpose one.
Users should be able to see which model was used for each call in the usage dashboard (currently it just shows "MiniMax-M2.5" alongside the selected model without explanation).
Ideally, users should be able to opt out of, or at least select the model for, these background operations.
Actual Behavior
MiniMax-M2.5 (a 400B+ parameter model) is used for trivial text summarization tasks.
These calls are invisible during the session — they only show up in the usage dashboard after the fact.
The cumulative cost of these background calls can rival or exceed the cost of the primary model calls.
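To make the cost claim easier to check, here is a minimal sketch of how the background share of spend could be computed from exported usage records. The record layout (model, purpose, cost), the purpose labels, and all numbers are assumptions made up for illustration; they are not actual Command Code log output.

```python
from collections import defaultdict

# Hypothetical usage records: (model, purpose, cost in credits).
# Field names, purpose labels, and values are illustrative assumptions only.
usage = [
    ("primary-model", "conversation", 0.040),
    ("MiniMax-M2.5", "tool-desc", 0.018),
    ("primary-model", "conversation", 0.035),
    ("MiniMax-M2.5", "title", 0.012),
    ("MiniMax-M2.5", "tool-desc", 0.020),
]

# Purposes treated as background helper work (assumed labels).
BACKGROUND_PURPOSES = {"title", "tool-desc", "summary", "taste"}


def background_share(records):
    """Return the fraction of total cost spent on background calls."""
    totals = defaultdict(float)
    for _model, purpose, cost in records:
        bucket = "background" if purpose in BACKGROUND_PURPOSES else "primary"
        totals[bucket] += cost
    total = sum(totals.values())
    return totals["background"] / total if total else 0.0


share = background_share(usage)
print(f"background share of spend: {share:.0%}")
```

Running the same calculation over a real session export would show whether the "roughly half" figure holds for a given task.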
Suggestions
Switch background tasks to a small, fast, cheap model (Qwen 3 Flash, DeepSeek V4 Flash, or similar sub-cent-per-million-token models).
Surface background model usage separately in the UI so users understand what they're being billed for.
/taste (as suggested in Unexpected usage of multiple models during a single session #326).
Related
Environment