## TL;DR
Instead of managing what fits in the context window (compaction, sliding window), treat context as an external environment the model queries programmatically. This is the RLM paradigm from MIT's arXiv:2512.24601 paper - and it's production-ready in 2026.
- **Builds upon:** #4659 - Sliding window context management by @rickross
- **Research:** Recursive Language Models (arXiv:2512.24601)
- **Inspiration:** Signal Zero symbolic cognition engine
## The Problem
Current compaction approaches are lossy:

```
[Full Context] → OVERFLOW → [AI Summary] + [Recent Work]
```
The model loses:
- Nuanced reasoning chains
- Architectural decisions
- Failed approaches (negative knowledge)
- Cross-reference relationships
Even the excellent sliding window approach from #4659 still manages what fits in the window. It's smarter about what to keep, but the paradigm is the same.
## The RLM Paradigm Shift
From the MIT paper:

> "Long prompts should NOT be fed into the neural network directly but should instead be treated as part of the EXTERNAL ENVIRONMENT that the LLM can symbolically interact with."
Current paradigm:

```
Context → [Stuff into Window] → Model
```

RLM paradigm:

```
Context → [External Storage] ← Model queries via code → [Relevant Snippets] → Model
```
## Performance Evidence
| Task | Base LLM | Compaction | RLM |
|------|----------|------------|-----|
| BrowseComp+ (10M tokens) | 0% | ~20% | 91% |
| DeepDive (Long Context) | 45% | 52% | 78% |
| Oolong (Retrieval) | 38% | 48% | 71% |
RLM scales to 10M+ tokens - two orders of magnitude beyond context windows.
## How It Would Work in OpenCode
### 1. Context as External Environment
Instead of packing messages into the context window:
```typescript
// Current approach
const messages = session.messages.slice(-N);
await model.generate(messages);
```
RLM approach:
```typescript
// Context stored externally
const contextStore = new ContextEnvironment(session.messages);

// Model writes code to query what it needs
const queryCode = await model.generateCode(`
  # Find relevant context about authentication
  relevant = context.search("authentication", limit=10)
  recent = context.slice(-20)  # Last 20 messages
  return merge(relevant, recent)
`);

// Execute the query, inject the results
const snippets = await contextStore.execute(queryCode);
await model.generate(systemPrompt + snippets);
```
### 2. Available Query Operations
```python
# The model has access to:
context.search(query: str, limit: int) -> List[Message]
context.slice(start: int, end: int) -> List[Message]
context.filter(predicate: Callable) -> List[Message]
context.regex_search(pattern: str) -> List[Message]
context.get_by_id(id: str) -> Message

# Sub-LLM delegation for complex analysis
sub_llm(prompt: str) -> str
sub_llm_batch(prompts: List[str]) -> List[str]  # Parallel
```
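As a rough illustration of what could back these operations, here is a minimal in-memory sketch. The `Message` shape and the naive keyword search are assumptions for the example, not the paper's API; a real store would use embeddings and the session database.

```python
import re
from dataclasses import dataclass

@dataclass
class Message:
    id: str
    text: str

class ContextStore:
    """Minimal in-memory sketch of the query surface above (illustrative only)."""
    def __init__(self, messages):
        self.messages = messages

    def search(self, query, limit=10):
        # Naive keyword match; a real implementation would rank by embedding similarity.
        hits = [m for m in self.messages if query.lower() in m.text.lower()]
        return hits[:limit]

    def slice(self, start, end=None):
        return self.messages[start:end]

    def filter(self, predicate):
        return [m for m in self.messages if predicate(m)]

    def regex_search(self, pattern):
        rx = re.compile(pattern)
        return [m for m in self.messages if rx.search(m.text)]

    def get_by_id(self, id):
        return next(m for m in self.messages if m.id == id)

store = ContextStore([Message("m1", "Set up JWT authentication"),
                      Message("m2", "Refactor the build script")])
print([m.id for m in store.search("authentication")])  # → ['m1']
```

The point is that each operation returns message objects, not token strings, so the query layer can decide what actually gets serialized into the prompt.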
### 3. Sub-Agent Decomposition
For massive contexts, the model spawns sub-agents:
```python
# Analyze 100k tokens across 10 parallel sub-agents
chunks = context.chunk(size=10000)
analyses = sub_llm_batch([
    f"Summarize security concerns in:\n{chunk}"
    for chunk in chunks
])
synthesis = sub_llm(f"Combine findings:\n{analyses}")
```
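The fan-out/fan-in above can be sketched with ordinary thread-pool parallelism. The `sub_llm` stub below stands in for a real model call (an assumption for illustration); the structure, not the stub, is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def sub_llm(prompt: str) -> str:
    """Stub standing in for a real sub-LLM call (assumption for illustration)."""
    return f"analysis of {len(prompt)} chars"

def sub_llm_batch(prompts):
    # Fan the prompts out in parallel; map preserves input order.
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(sub_llm, prompts))

def chunk(text, size):
    return [text[i:i + size] for i in range(0, len(text), size)]

chunks = chunk("x" * 50_000, size=10_000)  # 5 chunks
analyses = sub_llm_batch(
    [f"Summarize security concerns in:\n{c}" for c in chunks]
)
synthesis = sub_llm("Combine findings:\n" + "\n".join(analyses))
print(len(analyses))  # → 5
```

In practice the pool would be bounded by a token/cost budget rather than a thread count.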
## Integration with #4659 Concepts
This complements rather than replaces @rickross's sliding window approach:
| #4659 Concept | RLM Integration |
|---------------|-----------------|
| Inception Messages | Always-loaded symbols in the context store |
| Chess-Clock Time | Weighting factor for search relevance |
| Priority Levels | Cache priority in the context store |
| Heuristic Pruning | Background optimization of the context store |
The ACM tools from #4659 (`acm_preserve`, `acm_prune`, etc.) become metadata for the context store, not filtering logic.
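Sketched concretely (the field names here are invented for illustration, not #4659's actual schema): an `acm_preserve` call would tag a stored message so that search can boost it, rather than decide what enters the window.

```python
# Hypothetical: ACM tool calls become metadata on stored messages,
# which the query layer uses for ranking, not hard filtering.
message = {"id": "m42", "text": "We chose event sourcing over CRUD", "acm": {}}

def acm_preserve(msg, priority="high"):
    msg["acm"]["preserved"] = True
    msg["acm"]["priority"] = priority
    return msg

acm_preserve(message)

# A search can then boost preserved messages instead of dropping others.
boost = 2.0 if message["acm"].get("preserved") else 1.0
print(boost)  # → 2.0
```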
## Symbolic Compression Layer (From Signal Zero)
For extreme efficiency, frequently-accessed context can be compressed to symbolic representations:
```typescript
// Instead of 500 tokens of architectural context
const fullContext = "This system uses event sourcing...";

// Compressed to ~10 tokens
const symbol = {
  id: "ARCH-EVENT-SOURCING",
  triad: ["events", "broadcast", "persist"],
  expand: () => loadFullContext("ARCH-EVENT-SOURCING")
};
```
The model references symbols in queries and expands them only when needed.
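A minimal sketch of the symbol table this implies (the lazy-expansion mechanics and the `FULL_CONTEXT` lookup are assumptions based on the snippet above):

```python
# Hypothetical backing store for expanded context (assumption for illustration).
FULL_CONTEXT = {
    "ARCH-EVENT-SOURCING": "This system uses event sourcing: state changes "
                           "are persisted as an append-only event log...",
}

class Symbol:
    def __init__(self, id, triad):
        self.id = id
        self.triad = triad   # ~10-token handle the model actually sees
        self._full = None    # full text loaded lazily

    def expand(self):
        if self._full is None:
            self._full = FULL_CONTEXT[self.id]  # loaded only on demand
        return self._full

sym = Symbol("ARCH-EVENT-SOURCING", ["events", "broadcast", "persist"])
# The model reasons over sym.id / sym.triad; the full text stays out of
# the prompt until a query explicitly expands it.
print(sym.expand().startswith("This system"))  # → True
```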
## Implementation Phases

### Phase 1: Context Store Foundation

- `ContextEnvironment` class for external message storage
- Basic search/filter operations
- Semantic search via embeddings
- Integration with existing session storage
### Phase 2: Code-Based Query System

- Sandboxed REPL for context queries
- Query result injection into model context
- Safety guards for code execution
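One way the Phase 2 query execution could be prototyped (illustrative only: `exec` with trimmed globals is *not* a security boundary, and a production version would need a genuinely isolated runtime):

```python
def run_query(code: str, context) -> object:
    # Expose only the context object and a result slot; strip builtins.
    env = {"context": context, "result": None, "__builtins__": {}}
    exec(code, env)  # NOTE: not a real sandbox on its own
    return env["result"]

class FakeContext:
    """Stand-in for the real context store (assumption for illustration)."""
    def search(self, q, limit=10):
        return [f"msg about {q}"] * min(2, limit)

snippets = run_query("result = context.search('authentication')", FakeContext())
print(snippets)  # → ['msg about authentication', 'msg about authentication']
```

The returned snippets, not the model-written query code, are what get injected back into the prompt.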
### Phase 3: Sub-Agent Delegation

- Sub-LLM spawning for context decomposition
- Parallel execution with `sub_llm_batch`
- Token budget management
### Phase 4: Symbolic Compression

- Triad generation for frequent context
- On-demand expansion
- Integration with inception messages
### Phase 5: Hybrid Mode

- Combine sliding window + RLM
- Graceful fallback for simpler models
- User controls for query depth
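The hybrid mode could be sketched as: always keep a recent window (the #4659 behavior), top it up with RLM-style search hits, and deduplicate. The function below is a sketch under those assumptions, with plain dicts standing in for messages.

```python
def hybrid_context(store, query, recent_n=20, search_limit=10):
    """Phase 5 sketch: recency window + relevance hits, deduped by id."""
    recent = store[-recent_n:]
    hits = [m for m in store if query in m["text"]][:search_limit]
    seen, merged = set(), []
    for m in hits + recent:  # relevance first, then recency
        if m["id"] not in seen:
            seen.add(m["id"])
            merged.append(m)
    return merged

store = [{"id": i, "text": f"msg {i}"} for i in range(100)]
store[3]["text"] = "auth token refresh"
ctx = hybrid_context(store, "auth", recent_n=5)
print(len(ctx))  # → 6 (1 search hit + 5 recent, no overlap)
```

A simpler model that cannot write query code would just get the `recent` slice, which is the graceful-fallback bullet above.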
## Questions for Maintainers

1. **Paradigm alignment:** Does treating context as an external environment align with OpenCode's direction?
2. **Hybrid approach:** Should RLM complement compaction, or eventually replace it?
3. **Plugin vs core:** Could this be a plugin, or does it require core changes?
4. **Sub-agents:** Is spawning sub-LLMs acceptable for cost/latency tradeoffs?
5. **Model requirements:** Should this be limited to models with strong code generation (Claude, GPT-4+)?
## Why Now?
RLM is being called "the most exciting Agentic Paradigm of 2026" (VentureBeat, Jan 2026). The MIT paper has implementations on GitHub (github.com/alexzhang13/rlm). This isn't future speculation - it's production-ready research.
Combined with @rickross's battle-tested ACM tools from #4659, OpenCode could have the most sophisticated context management in any coding agent.
## References

- Recursive Language Models (arXiv:2512.24601)
- RLM reference implementation: github.com/alexzhang13/rlm
- #4659: Sliding window context management by @rickross

This proposal is research-backed discussion, not a PR. Looking for maintainer input on whether this direction aligns with OpenCode's vision before implementation work begins.