This repository was archived by the owner on May 15, 2026. It is now read-only.

Switch models and modes context length verification and compression #4022

@helLf1nGer

Description

What problem does this proposed feature solve?

Switching between models that have different context lengths, e.g. Sonnet 3.7 (200 k tokens) vs. Gemini 2.5 Pro (1 M tokens). When one mode uses a high-capacity model and hands off to a lower-capacity model, the API call can fail if the context is too large.

Describe the proposed solution in detail

When a mode switch occurs (Auto-approve Roo) and the user has assigned different models to different modes, Roo should check that the current context fits within the target model's context length. For example, if Architect mode uses Gemini 2.5 Pro and has accumulated 400k tokens of context, switching to Code mode with Sonnet 3.7 (maximum 200k tokens) will cause the API call to fail. The proposed solution in this scenario is to have Architect mode compress the context before switching to Code mode for implementation.
Conversely, if Code mode finishes and passes back a context of 150k tokens, no compression is needed, because Architect mode's model has the capacity to handle that API call.
This check should run as a layer before the switch: get the current context length, check the handoff message, and decide whether to compress.

When there is a mode switch (e.g., Auto-approve Roo), Roo should:

  1. Fetch the length of the outgoing handoff context.
  2. Look up the target mode’s model max-token capacity.
  3. Compare the two values.
  4. If context > capacity → compress the context to fit the limit (an experimental Roo feature for this already exists).
  5. Otherwise, hand off unchanged.

For example:

  • Architect mode (Gemini 2.5 Pro, 1 M limit) accumulates 400 k tokens.
  • Switch to Code mode (Sonnet 3.7, 200 k limit) → detect overflow → compress to ≤200 k before switching.
  • Switch back to Architect mode with 150 k tokens → no compression needed.
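
The decision steps and the example above could be sketched roughly as follows. This is only an illustration: `decideHandoff`, `HandoffDecision`, and the `MODEL_MAX_TOKENS` table are hypothetical names, not part of Roo's codebase, and the limits are simply the figures quoted in this issue.

```typescript
// Hypothetical sketch; none of these names come from Roo's codebase.
type HandoffDecision =
  | { action: "handoff" }
  | { action: "compress"; targetTokens: number };

// Assumed metadata table, filled with the limits quoted in this issue.
const MODEL_MAX_TOKENS = new Map<string, number>([
  ["gemini-2.5-pro", 1_000_000],
  ["sonnet-3.7", 200_000],
]);

function decideHandoff(contextTokens: number, targetModel: string): HandoffDecision {
  // Step 2: look up the target model's capacity.
  const capacity = MODEL_MAX_TOKENS.get(targetModel);
  if (capacity === undefined) {
    throw new Error(`Unknown model: ${targetModel}`);
  }
  // Steps 3-5: compare, then either compress to the limit or hand off unchanged.
  return contextTokens > capacity
    ? { action: "compress", targetTokens: capacity }
    : { action: "handoff" };
}

// Architect (400k tokens) -> Code on Sonnet 3.7: overflow, so compress.
const toCode = decideHandoff(400_000, "sonnet-3.7");
// Code (150k tokens) -> Architect on Gemini 2.5 Pro: fits, so hand off unchanged.
const toArchitect = decideHandoff(150_000, "gemini-2.5-pro");
console.log(toCode, toArchitect);
```

A real check would probably also need to leave headroom for the system prompt and the model's response rather than compress to the full limit; that is left out here to keep the sketch close to the steps as written.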

Technical considerations or implementation details (optional)

  • Add a hook in the mode-switcher to retrieve model metadata (max tokens).
  • Integrate experimental Roo Intelligent context condensing.
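
One possible shape for such a hook, as a sketch under stated assumptions: `beforeModeSwitch`, `ModelInfo`, `Context`, and `Condenser` are hypothetical names and do not mirror Roo's actual mode-switcher or condensing API. The condenser is passed in as a callback so the existing experimental feature could be plugged in; `dropOldest` is only a stand-in for demonstration.

```typescript
// All names here are hypothetical; this does not mirror Roo's real API.
interface ModelInfo {
  id: string;
  maxTokens: number;
}

interface Context {
  messages: string[];
  tokenCount: number;
}

// A condenser takes a context and a token budget and returns a smaller context.
type Condenser = (ctx: Context, budget: number) => Promise<Context>;

// Runs before the switch: returns the context unchanged if it fits,
// otherwise delegates to the condenser and verifies the result.
async function beforeModeSwitch(
  ctx: Context,
  target: ModelInfo,
  condense: Condenser,
): Promise<Context> {
  if (ctx.tokenCount <= target.maxTokens) {
    return ctx;
  }
  const condensed = await condense(ctx, target.maxTokens);
  if (condensed.tokenCount > target.maxTokens) {
    throw new Error(
      `Condensed context (${condensed.tokenCount} tokens) still exceeds the ${target.id} limit`,
    );
  }
  return condensed;
}

// Stand-in condenser for illustration only: keeps the most recent messages
// that fit, pretending every message costs a fixed number of tokens.
const TOKENS_PER_MESSAGE = 1_000; // assumption for the demo
const dropOldest: Condenser = async (ctx, budget) => {
  const keep = Math.floor(budget / TOKENS_PER_MESSAGE);
  const messages = keep > 0 ? ctx.messages.slice(-keep) : [];
  return { messages, tokenCount: messages.length * TOKENS_PER_MESSAGE };
};
```

Failing loudly when the condensed context still does not fit keeps the error on the Roo side of the handoff instead of surfacing later as a failed API call.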

Describe alternatives considered (if any)

If using Boomerang mode (Orchestrator), you could enforce only one mode active per task to guarantee handoffs, but this reduces concurrency and flexibility, so it is suboptimal compared to dynamic compression in the same chat/task window.

Additional Context & Mockups

No response

Proposal Checklist

  • I have searched existing Issues and Discussions to ensure this proposal is not a duplicate.
  • This proposal is for a specific, actionable change intended for implementation (not a general idea).
  • I understand that this proposal requires review and approval before any development work begins.

Are you interested in implementing this feature if approved?

  • Yes, I would like to contribute to implementing this feature.

Metadata

Labels

  • Enhancement: New feature or request
  • Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
