A Claude Code skill that implements the Recursive Language Model (RLM) paradigm for processing arbitrarily large documents.
This skill is inspired by the paper:
*Recursive Language Models*. Alex L. Zhang, Tim Kraska, Omar Khattab. MIT CSAIL. arXiv:2512.24601v1 [cs.AI], 31 Dec 2025.
The paper introduces RLMs as a general inference strategy that treats long prompts as part of an external environment rather than feeding them directly into the LLM's context window. This allows LLMs to:
- Process inputs up to two orders of magnitude beyond their context window limits
- Dramatically outperform base LLMs on long-context tasks
- Maintain comparable or lower cost per query
From the paper:
"The key insight is that long prompts should not be fed into the neural network (e.g., Transformer) directly but should instead be treated as part of the environment that the LLM can symbolically interact with."
An RLM exposes the same interface as an LLM (string in, string out) but internally:
- Loads the prompt as a variable in a Python REPL environment
- Allows the LLM to write code that peeks into and decomposes the prompt
- Enables recursive self-calls on programmatic snippets of the variable
- Aggregates results into a final response
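The loop above can be sketched in a few lines. This is a conceptual illustration only, not the paper's or the skill's implementation: `call_llm` is a hypothetical placeholder for a real model call, and the character-count threshold and chunking scheme are assumptions chosen for brevity.

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM call (hypothetical; string in, string out).
    return f"answer({len(prompt)} chars)"

def rlm(prompt: str, max_context: int = 4000) -> str:
    """Same interface as an LLM, but long prompts are decomposed recursively."""
    if len(prompt) <= max_context:
        return call_llm(prompt)  # small enough: answer directly
    # Otherwise split the prompt into snippets and recurse on each one.
    chunks = [prompt[i:i + max_context] for i in range(0, len(prompt), max_context)]
    partials = [rlm(chunk, max_context) for chunk in chunks]
    # Aggregate the partial answers with one final (short) LLM call.
    return call_llm("Synthesize these partial answers:\n" + "\n".join(partials))
```

The key property is that the top-level call never sees more than `max_context` characters at once, regardless of the total prompt length.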
This skill adapts the RLM paradigm to Claude Code's architecture:
~/.claude/skills/rlm/
├── SKILL.md # Skill definition with YAML frontmatter
├── rlm_loader.py # Python utilities for document manipulation
├── .venv/ # Virtual environment with pdfplumber
└── README.md # This file
1. Document Loader (rlm_loader.py)
Provides programmatic access to large documents without loading them into context:
from rlm_loader import load_document
doc = load_document('/path/to/file.pdf')
doc.get_info() # Metadata: pages, lines, chars
doc.get_page(0) # Extract specific page
doc.get_lines(10, 50) # Extract line range
doc.search("keyword") # Find matches with context
doc.search(r"\d{4}", regex=True) # Regex search
doc.chunk_by_pages() # Split into page chunks
doc.chunk_by_chars(4000) # Split by character count
2. Sub-Agent Processing
Uses Claude Code's Task tool to spawn parallel sub-agents for chunk processing:
- Each sub-agent receives a chunk + specific question
- Multiple sub-agents run concurrently
- Results are aggregated by the root agent
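The fan-out/aggregate pattern above can be sketched as follows. Claude Code's Task tool is not scriptable from plain Python, so `ThreadPoolExecutor` stands in for the concurrent sub-agents, and `ask_subagent` is a hypothetical placeholder for one stateless sub-agent call.

```python
from concurrent.futures import ThreadPoolExecutor

def ask_subagent(chunk: str, question: str) -> str:
    # Placeholder: each sub-agent receives one chunk plus the full question,
    # because sub-agents are stateless and share no context with each other.
    return f"findings for chunk of {len(chunk)} chars"

def process_chunks(chunks: list[str], question: str) -> list[str]:
    # Launch all sub-agent calls concurrently, then collect results in order
    # for the root agent to aggregate.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda c: ask_subagent(c, question), chunks))
```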
3. Skill Definition (SKILL.md)
Instructs Claude Code on the RLM workflow:
- Load and examine document structure
- Select appropriate strategy (search, scan, parallel chunks, recursive decomposition)
- Execute using code-based filtering and sub-agents
- Synthesize final answer
- PDF (via pdfplumber)
- Plain text (.txt)
- Markdown (.md)
- Code files (.py, .js, .ts, etc.)
- JSON, XML, HTML, CSV
- Log files
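A loader supporting these formats might dispatch on file extension roughly as below. This is an illustrative sketch, not the skill's actual `load_document` internals; the function and backend names are assumptions.

```python
from pathlib import Path

def pick_extractor(path: str) -> str:
    """Return which extraction backend a file needs (names are illustrative)."""
    ext = Path(path).suffix.lower()
    if ext == ".pdf":
        return "pdfplumber"  # PDFs need page-by-page text extraction
    # .txt, .md, code files, JSON/XML/HTML/CSV, and logs are read as plain text
    return "plain-text"
```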
Invoke the skill:
/rlm /path/to/document.pdf "What are the key findings?"
Or reference a large file in conversation; the skill activates automatically for documents that would exceed reasonable context limits.
| Strategy | When to Use | Processing Cost |
|---|---|---|
| Targeted Search | Looking for specific information | Constant |
| Section Read | Questions about specific parts | Constant |
| Parallel Chunks | Aggregation, summarization | Linear |
| Recursive Decomposition | Multi-hop reasoning, pairwise analysis | Linear to Quadratic |
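The table above can be read as a simple dispatch on query type. The keyword heuristics below are assumptions for illustration, not the skill's actual selection logic.

```python
def pick_strategy(query: str) -> str:
    """Map a question to one of the table's strategies (heuristics assumed)."""
    q = query.lower()
    if any(k in q for k in ("find", "locate", "where")):
        return "Targeted Search"          # constant cost: regex/keyword lookup
    if any(k in q for k in ("summarize", "overall", "aggregate")):
        return "Parallel Chunks"          # linear cost: fan out over chunks
    if any(k in q for k in ("compare", "pairwise", "relationship")):
        return "Recursive Decomposition"  # up to quadratic: pairwise analysis
    return "Section Read"                 # default: read the relevant section
```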
Best practices drawn from the paper:
- Never load entire documents - Use programmatic access to minimize context usage
- Filter with code first - Use regex/keyword search before reading content
- Parallelize sub-agents - Launch multiple Task calls in single response
- Sub-agents are stateless - Provide all needed context in their prompt
- Verify before answering - Retrieve more context if uncertain
RLMs demonstrated:
- 91.33% accuracy on BrowseComp+ (vs 70.47% for summarization baseline)
- 58.00% F1 on OOLONG-Pairs (vs 0.04% for base GPT-5)
- Strong performance at 10M+ token scale
- Comparable or lower cost than base model calls at median
- Paper: https://arxiv.org/abs/2512.24601
- pdfplumber: https://github.com/jsvine/pdfplumber
- Claude Code Skills: https://docs.anthropic.com/en/docs/claude-code
This implementation is provided as-is.