[FEATURE] Recursive Language Model (RLM) Context Management - Context as External Environment #11829

@chindris-mihai-alexandru

Description

TL;DR

Instead of managing what fits in the context window (compaction, sliding window), treat context as an external environment the model queries programmatically. This is the RLM paradigm from MIT's arXiv:2512.24601 paper - and it's production-ready in 2026.

Builds upon: #4659 - Sliding window context management by @rickross
Research: Recursive Language Models (arXiv:2512.24601)
Inspiration: Signal Zero symbolic cognition engine


The Problem

Current compaction approaches are lossy:

[Full Context] → OVERFLOW → [AI Summary] + [Recent Work]

The model loses:

  • Nuanced reasoning chains
  • Architectural decisions
  • Failed approaches (negative knowledge)
  • Cross-reference relationships

Even the excellent sliding window approach from #4659 still manages what fits in the window. It's smarter about what to keep, but the paradigm is the same.


The RLM Paradigm Shift

From the MIT paper:

"Long prompts should NOT be fed into the neural network directly but should instead be treated as part of the EXTERNAL ENVIRONMENT that the LLM can symbolically interact with."

Current paradigm:

Context → [Stuff into Window] → Model

RLM paradigm:

Context → [External Storage] ← Model queries via code → [Relevant Snippets] → Model

Performance Evidence

| Task | Base LLM | Compaction | RLM |
|------|----------|------------|-----|
| BrowseComp+ (10M tokens) | 0% | ~20% | 91% |
| DeepDive (Long Context) | 45% | 52% | 78% |
| Oolong (Retrieval) | 38% | 48% | 71% |

RLM scales to 10M+ tokens - two orders of magnitude beyond context windows.


How It Would Work in OpenCode

1. Context as External Environment

Instead of packing messages into the context window:

// Current approach
const messages = session.messages.slice(-N);
await model.generate(messages);

RLM approach:

// Context stored externally
const contextStore = new ContextEnvironment(session.messages);

// Model writes code to query what it needs
const queryCode = await model.generateCode(`
  # Find relevant context about authentication
  relevant = context.search("authentication", limit=10)
  recent = context.slice(-20)  # Last 20 messages
  return merge(relevant, recent)
`);

// Execute query, inject results
const snippets = await contextStore.execute(queryCode);
await model.generate(systemPrompt + snippets);
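Concretely, the context store could be sketched as a small class. This is a hedged illustration: `ContextEnvironment` and `Message` follow the names in this proposal, but the keyword-count scoring in `search` is a placeholder for whatever embedding or BM25 backend OpenCode's session storage would actually provide.

```python
from dataclasses import dataclass
from typing import Callable, List
import re

@dataclass
class Message:
    id: str
    role: str
    text: str

class ContextEnvironment:
    """External store the model queries instead of receiving raw context."""

    def __init__(self, messages: List[Message]):
        self.messages = list(messages)

    def search(self, query: str, limit: int = 10) -> List[Message]:
        # Placeholder relevance: count query-term hits. A real store would
        # use embeddings or BM25 rather than substring counting.
        terms = query.lower().split()
        scored = [
            (sum(m.text.lower().count(t) for t in terms), m)
            for m in self.messages
        ]
        scored = [(s, m) for s, m in scored if s > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [m for _, m in scored[:limit]]

    def slice(self, start: int, end: int = None) -> List[Message]:
        return self.messages[start:end]

    def filter(self, predicate: Callable[[Message], bool]) -> List[Message]:
        return [m for m in self.messages if predicate(m)]

    def regex_search(self, pattern: str) -> List[Message]:
        rx = re.compile(pattern)
        return [m for m in self.messages if rx.search(m.text)]

    def get_by_id(self, id: str) -> Message:
        return next(m for m in self.messages if m.id == id)
```

The model-generated query code from the snippet above would then run against an instance of this class.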

2. Available Query Operations

# The model has access to:
context.search(query: str, limit: int) -> List[Message]
context.slice(start: int, end: int) -> List[Message]
context.filter(predicate: Callable) -> List[Message]
context.regex_search(pattern: str) -> List[Message]
context.get_by_id(id: str) -> Message
context.chunk(size: int) -> List[str]

# Sub-LLM delegation for complex analysis
sub_llm(prompt: str) -> str
sub_llm_batch(prompts: List[str]) -> List[str]  # Parallel
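The `merge(relevant, recent)` call in the earlier snippet is a hypothetical helper; one plausible behavior is order-preserving de-duplication by message id, so that a message found by both `search` and `slice` is injected once:

```python
def merge(*message_lists):
    """Hypothetical helper: concatenate query results, de-duplicating by
    message id while preserving first-seen order."""
    seen, out = set(), []
    for messages in message_lists:
        for m in messages:
            if m["id"] not in seen:
                seen.add(m["id"])
                out.append(m)
    return out
```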

3. Sub-Agent Decomposition

For massive contexts, the model spawns sub-agents:

# Analyze 100k tokens across 10 parallel sub-agents
chunks = context.chunk(size=10000)
analyses = sub_llm_batch([
    f"Summarize security concerns in:\n{chunk}" 
    for chunk in chunks
])
synthesis = sub_llm(f"Combine findings:\n{analyses}")
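A sketch of how `context.chunk` and `sub_llm_batch` could be implemented. The `call_model` parameter is an assumption standing in for whatever provider call OpenCode already makes, and thread-based fan-out is just one option for parallelism:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

def chunk_text(text: str, size: int) -> List[str]:
    """Split a long context into fixed-size pieces. A real implementation
    would chunk on message or token boundaries, not characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def sub_llm_batch(prompts: List[str], call_model: Callable[[str], str],
                  max_workers: int = 10) -> List[str]:
    """Fan prompts out to parallel sub-LLM calls; results keep prompt order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(call_model, prompts))
```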

Integration with #4659 Concepts

This complements rather than replaces @rickross's sliding window approach:

| #4659 Concept | RLM Integration |
|---------------|-----------------|
| Inception Messages | Always-loaded symbols in the context store |
| Chess-Clock Time | Weighting factor for search relevance |
| Priority Levels | Cache priority in context store |
| Heuristic Pruning | Background optimization of context store |

The ACM tools from #4659 (acm_preserve, acm_prune, etc.) become metadata for the context store, not filtering logic.
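As an illustration of folding the chess-clock idea into search relevance: a base relevance score could decay exponentially with message age. The exponential form and the one-hour half-life here are invented for the sketch, not taken from #4659:

```python
def weighted_score(base_score: float, age_seconds: float,
                   half_life_seconds: float = 3600.0) -> float:
    """Decay relevance by age: a message one half-life old counts half as much."""
    decay = 0.5 ** (age_seconds / half_life_seconds)
    return base_score * decay
```

Messages with high ACM priority could be given a longer half-life, which is how the priority metadata would shape retrieval without acting as hard filtering logic.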


Symbolic Compression Layer (From Signal Zero)

For extreme efficiency, frequently-accessed context can be compressed to symbolic representations:

// Instead of 500 tokens of architectural context
const fullContext = "This system uses event sourcing...";

// Compressed to ~10 tokens
const symbol = {
  id: "ARCH-EVENT-SOURCING",
  triad: ["events", "broadcast", "persist"],
  expand: () => loadFullContext("ARCH-EVENT-SOURCING")
};

The model references symbols in its queries and expands them only when needed.
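A minimal Python sketch of the symbol registry, mirroring the TypeScript example above (the `SymbolStore` name and the rendered reference format are illustrative):

```python
class SymbolStore:
    """Registry mapping short symbolic ids to full context, expanded lazily."""

    def __init__(self):
        self._full = {}    # id -> full context text
        self._triads = {}  # id -> three-word gist

    def register(self, id: str, triad, full_text: str):
        self._triads[id] = tuple(triad)
        self._full[id] = full_text

    def reference(self, id: str) -> str:
        # The cheap (~10 token) form the model sees by default.
        return f"[{id}: {'/'.join(self._triads[id])}]"

    def expand(self, id: str) -> str:
        # Full context, pulled in only when the model asks for it.
        return self._full[id]
```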


Implementation Phases

Phase 1: Context Store Foundation

  • ContextEnvironment class for external message storage
  • Basic search/filter operations
  • Semantic search via embeddings
  • Integration with existing session storage

Phase 2: Code-Based Query System

  • Sandboxed REPL for context queries
  • Query result injection into model context
  • Safety guards for code execution
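One hedged sketch of the Phase 2 safety guard: run model-generated query code with an allowlisted namespace and empty builtins. This is deliberately minimal; a production sandbox would also need timeouts, resource limits, and process isolation.

```python
def run_query(code: str, context_api: dict) -> object:
    """Execute model-generated query code with only the context API visible.

    Setting __builtins__ to {} blocks open/import/exec; the generated code
    is expected to assign its answer to `query_result` (an illustrative
    convention, not a settled interface).
    """
    allowed = {"__builtins__": {}}
    allowed.update(context_api)
    namespace = {}
    exec(code, allowed, namespace)
    return namespace["query_result"]
```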

Phase 3: Sub-Agent Delegation

  • Sub-LLM spawning for context decomposition
  • Parallel execution with sub_llm_batch
  • Token budget management

Phase 4: Symbolic Compression

  • Triad generation for frequent context
  • On-demand expansion
  • Integration with inception messages

Phase 5: Hybrid Mode

  • Combine sliding window + RLM
  • Graceful fallback for simpler models
  • User controls for query depth

Questions for Maintainers

  1. Paradigm alignment: Does treating context as external environment align with OpenCode's direction?

  2. Hybrid approach: Should RLM complement compaction, or eventually replace it?

  3. Plugin vs core: Could this be a plugin, or does it require core changes?

  4. Sub-agents: Is spawning sub-LLMs acceptable for cost/latency tradeoffs?

  5. Model requirements: Should this be limited to models with strong code generation (Claude, GPT-4+)?


Why Now?

RLM is being called "the most exciting Agentic Paradigm of 2026" (VentureBeat, Jan 2026). The MIT paper has implementations on GitHub (github.com/alexzhang13/rlm). This isn't future speculation - it's production-ready research.

Combined with @rickross's battle-tested ACM tools from #4659, OpenCode could have the most sophisticated context management in any coding agent.


References

  • Recursive Language Models (arXiv:2512.24601)
  • Reference implementation: github.com/alexzhang13/rlm
  • #4659 - Sliding window context management
This proposal is research-backed discussion, not a PR. Looking for maintainer input on whether this direction aligns with OpenCode's vision before implementation work begins.