fast-rlm

An MCP server that turns Claude Code into a Recursive Language Model (RLM) for processing long contexts.

Instead of calling external APIs, fast-rlm provides a Python REPL, context chunking, and session management, so Claude can drive recursive decomposition loops on its own inference quota with no additional API cost.

How it works

1. Load context: feed any long text or data into a session
2. Chunk: split it into manageable pieces (by size or separator)
3. Process: Claude reads each chunk, runs Python code to analyze it, and stores findings
4. Aggregate: combine results across chunks into a final answer

Claude acts as both the primary agent and sub-agent, recursively decomposing complex tasks.
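
The four steps above follow a map-then-combine shape. The sketch below illustrates that shape in plain Python; it is not taken from server.py, and `chunk_lines`, `process`, and `aggregate` are hypothetical names. In real use the "process" step is Claude reading a chunk, not a fixed function.

```python
# Illustrative only: these helpers are assumptions, not fast-rlm's API.

def chunk_lines(text: str, lines_per_chunk: int) -> list[str]:
    """Split text into chunks of at most `lines_per_chunk` lines each."""
    lines = text.splitlines()
    return [
        "\n".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]

def process(chunk: str) -> int:
    """Stand-in for per-chunk analysis: count lines that mention ERROR."""
    return sum(1 for line in chunk.splitlines() if "ERROR" in line)

def aggregate(findings: list[int]) -> int:
    """Combine the per-chunk findings into one answer."""
    return sum(findings)

# Load context: 12 lines, 6 of which contain ERROR.
context = "ok\nERROR a\nok\nERROR b\n" * 3

chunks = chunk_lines(context, 5)           # Chunk
findings = [process(c) for c in chunks]    # Process
total = aggregate(findings)                # Aggregate
print(total)
```

Chunking on line boundaries here keeps a keyword from being cut in half between two chunks, which a fixed character count could do.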

Tools

Tool	Description
`rlm_start`	Initialize a session with context
`rlm_chunk`	Split context into chunks (by char count or separator)
`rlm_read_chunk`	Read a specific chunk
`rlm_exec`	Execute Python in the persistent REPL
`rlm_store`	Store intermediate findings
`rlm_finish`	Finalize session and return all results
`rlm_system_prompt`	Get the RLM workflow instructions

Setup

Prerequisites

Python 3.10+
uv (recommended) or pip

Install in Claude Code

claude mcp add fast-rlm \
  -- uv run --with "mcp[cli]" --python 3.14 \
  /path/to/fast-rlm/server.py

Or add to ~/.claude.json manually:

{
  "mcpServers": {
    "fast-rlm": {
      "type": "stdio",
      "command": "uv",
      "args": [
        "run", "--with", "mcp[cli]", "--python", "3.14",
        "/path/to/fast-rlm/server.py"
      ]
    }
  }
}

Usage

Tell Claude Code to "use fast-rlm" and give it a long-context task. For example:

use fast-rlm. summarize this 200-page document: <paste>

Claude will automatically:

1. Call `rlm_start()` to load the context
2. Call `rlm_chunk()` to split it
3. Process each chunk with `rlm_read_chunk()` and `rlm_exec()`
4. Store findings with `rlm_store()`
5. Aggregate and return via `rlm_finish()`
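
To make the session state behind that call sequence concrete, the sketch below models it with an in-memory class. This is an assumption about what a session needs to hold (context, chunks, stored findings), not the server's actual code; the `Session` class and its method names are hypothetical and only loosely mirror the tool names.

```python
from dataclasses import dataclass, field

# Hypothetical model of a session; the real server may store state differently.

@dataclass
class Session:
    context: str
    chunks: list[str] = field(default_factory=list)
    findings: dict[str, int] = field(default_factory=dict)

    def chunk(self, size: int) -> int:
        """Split the context into fixed-size chunks and return how many there are."""
        self.chunks = [
            self.context[i:i + size] for i in range(0, len(self.context), size)
        ]
        return len(self.chunks)

    def read_chunk(self, index: int) -> str:
        """Return one chunk by index."""
        return self.chunks[index]

    def store(self, key: str, value: int) -> None:
        """Record an intermediate finding under a key."""
        self.findings[key] = value

    def finish(self) -> dict[str, int]:
        """Return a copy of everything stored so far."""
        return dict(self.findings)

session = Session(context="alpha beta gamma delta")    # like rlm_start
count = session.chunk(11)                              # like rlm_chunk
for i in range(count):
    words = len(session.read_chunk(i).split())         # like rlm_read_chunk + analysis
    session.store(f"chunk_{i}", words)                 # like rlm_store
results = session.finish()                             # like rlm_finish
print(results)
```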

Python REPL

The rlm_exec tool provides a persistent Python environment (like Jupyter). Variables persist across calls:

# First call
rlm_exec("items = [line for line in context.split('\\n') if 'ERROR' in line]")

# Second call — `items` is still available
rlm_exec("print(f'Found {len(items)} errors')")
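
One common way to get this kind of persistence is to run every snippet with `exec` against the same namespace dictionary. The sketch below shows that technique under the assumption that the loaded text is exposed as a variable named `context`, as the example above suggests; `PersistentRepl` is a hypothetical class, and whether server.py works this way is not confirmed by this README.

```python
import contextlib
import io

# Hypothetical sketch of a persistent REPL; not taken from server.py.

class PersistentRepl:
    def __init__(self, context: str) -> None:
        # One shared namespace: names defined by a snippet stay here for later snippets.
        self.namespace: dict[str, object] = {"context": context}

    def run(self, code: str) -> str:
        """Execute code in the shared namespace and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)
        return buffer.getvalue()

repl = PersistentRepl("ok\nERROR disk full\nok\nERROR timeout")

# First call defines `items` in the shared namespace.
repl.run("items = [line for line in context.split('\\n') if 'ERROR' in line]")

# Second call can still see `items`.
output = repl.run("print(f'Found {len(items)} errors')")
print(output, end="")
```

Note that `exec` runs arbitrary code with the privileges of the host process, so a setup like this is only appropriate for code you trust.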

Design principles

Zero external cost — all inference uses Claude's own quota
Recursive — chunks can be re-chunked if still too large
Parallel — process multiple chunks concurrently
Stateful — REPL persists variables across calls within a session
Lightweight — single file, no dependencies beyond mcp
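
The "Recursive" principle can be illustrated with a splitter that re-splits any piece still over a size limit, trying coarser separators before finer ones. This is a sketch of the general idea only; `split_recursively`, its parameters, and the separator order are assumptions for the example rather than fast-rlm behavior.

```python
# Hypothetical helper illustrating recursive re-chunking.

def split_recursively(
    text: str,
    max_chars: int,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Return pieces of at most max_chars characters, preferring natural boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if len(text) <= max_chars:
        return [text] if text else []
    for index, separator in enumerate(separators):
        parts = [part for part in text.split(separator) if part]
        if len(parts) > 1:
            pieces: list[str] = []
            for part in parts:
                # Each part is shorter than text, so the recursion terminates.
                pieces.extend(split_recursively(part, max_chars, separators[index:]))
            return pieces
    # No separator produced a split: fall back to fixed-size slices.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

pieces = split_recursively("aaaa bbbb\n\ncc", max_chars=5)
fallback = split_recursively("abcdefghij", max_chars=4)
print(pieces)
print(fallback)
```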

License

MIT
