RLM-REPL

Recursive Language Model with REPL Inference Strategy

A Python library that lets a language model work with documents far larger than its context window, using SQL-based retrieval with DuckDB.

Overview

RLM-REPL implements a human-like reading strategy for processing large documents:

1. Overview - Read the beginning to understand document structure
2. Search - Find relevant sections using keyword search
3. Deep Read - Extract detailed information from located sections
4. Synthesize - Combine findings into a comprehensive answer

This approach lets models with small context windows work effectively with documents far larger than the window itself.
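The control flow of this strategy can be illustrated with a minimal sketch over an in-memory list of lines. This is not the library's implementation: `ReadingState`, `read_in_passes`, and the `window` and `overview_lines` parameters are hypothetical names chosen for illustration, and a plain substring match stands in for the model-driven search. The point it shows is that each pass reads a bounded number of lines, so the amount of text handed to the model does not grow with document size.

```python
from dataclasses import dataclass, field


@dataclass
class ReadingState:
    """Accumulates excerpts gathered across reading passes (hypothetical)."""
    notes: list[str] = field(default_factory=list)
    lines_read: int = 0


def read_in_passes(lines: list[str], keyword: str,
                   window: int = 5, overview_lines: int = 3) -> ReadingState:
    """Illustrative overview -> search -> deep read pass over a document."""
    state = ReadingState()

    # 1. Overview: read only the first few lines of the document.
    head = lines[:overview_lines]
    state.notes.append("overview: " + " | ".join(head))
    state.lines_read += len(head)

    # 2. Search: find the positions of lines that mention the keyword.
    hits = [i for i, text in enumerate(lines) if keyword.lower() in text.lower()]

    # 3. Deep read: read a bounded window starting at each hit.
    #    Overlapping windows are not merged in this sketch.
    for i in hits:
        chunk = lines[i:i + window]
        state.notes.append(f"lines {i + 1}-{i + len(chunk)}: " + " ".join(chunk))
        state.lines_read += len(chunk)

    # 4. Synthesize: the collected notes would be handed to the model here.
    return state
```

In a real run the fourth step would send `state.notes` to the model along with the question; the sketch stops before that call because it depends on the chosen API.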

Features

Two-sided architecture: Separate Data and Inference layers
In-memory default: Fast DuckDB in-memory database (no setup required)
Persistent option: Optional persistent database for caching
CLI tool: Instant testing from command line
Python API: Full programmatic control
Streaming events: Real-time progress tracking
Configurable verbosity: Control output detail level
OpenAI-compatible: Works with any OpenAI-compatible API

Installation

pip install rlm-repl

Or install from source:

git clone https://github.com/labKnowledge/rlm-repl.git
cd rlm-repl
pip install -e .

Quick Start

CLI Usage

# Interactive mode
rlm-repl document.txt

# With custom model
rlm-repl document.txt --base-url http://localhost:11434/v1 --model qwen3-coder

# Single question mode
rlm-repl document.txt --question "What is the main topic?"

# Quiet mode
rlm-repl document.txt -q --question "Summarize the document"

Python API

from rlm_repl import RLMREPL, RLMConfig

# Configure for Ollama (local)
config = RLMConfig(
    base_url="http://localhost:11434/v1",
    api_key="ollama",
    model="qwen3-coder",
)

# Create REPL and load document
with RLMREPL(config) as repl:
    repl.load_document("large_book.txt")
    
    result = repl.ask("What are the main themes?")
    print(result.answer)
    print(f"Read {result.total_words} words in {result.elapsed_time:.1f}s")

Documentation

Comprehensive documentation is available in the docs/ directory:

Getting Started - Installation, setup, and first steps
API Reference - Complete API documentation
Configuration - All configuration options
Examples - Detailed usage examples
Architecture - How the system works
Troubleshooting - Common issues and solutions

Supported Models

Any OpenAI-compatible API:

Ollama (local): llama3, qwen3, mistral, etc.
OpenAI: gpt-4, gpt-3.5-turbo
vLLM: Any hosted model
LMStudio: Local models
Together AI, Groq, etc.
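
As a rough illustration of switching between backends, the sketch below keeps a small table of endpoint presets and turns one into the three keyword arguments used in the Quick Start example. `ENDPOINT_PRESETS` and `config_kwargs` are hypothetical helpers, not part of RLM-REPL. The Ollama values come from the Quick Start; the other base URLs are the commonly documented defaults for each server and the API key strings are placeholders, so check both against your own setup.

```python
# Hypothetical presets. Ports are the usual defaults and may differ locally;
# the api_key values for local servers are placeholders.
ENDPOINT_PRESETS: dict[str, dict[str, str]] = {
    "ollama": {"base_url": "http://localhost:11434/v1", "api_key": "ollama"},
    "openai": {"base_url": "https://api.openai.com/v1", "api_key": "YOUR_OPENAI_API_KEY"},
    "vllm": {"base_url": "http://localhost:8000/v1", "api_key": "EMPTY"},
    "lmstudio": {"base_url": "http://localhost:1234/v1", "api_key": "lm-studio"},
}


def config_kwargs(provider: str, model: str) -> dict[str, str]:
    """Return base_url, api_key, and model for the given provider preset."""
    if provider not in ENDPOINT_PRESETS:
        raise ValueError(f"unknown provider: {provider!r}")
    return {**ENDPOINT_PRESETS[provider], "model": model}
```

Assuming `RLMConfig` accepts these three arguments as it does in the Quick Start, the result could be passed as `RLMConfig(**config_kwargs("ollama", "qwen3-coder"))`.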

Examples

See the examples/ directory for complete examples:

basic_usage.py - Simple document Q&A
streaming_events.py - Real-time progress tracking
persistent_database.py - Caching documents
api_usage.py - Building applications with RLM-REPL

How It Works

1. Document Loading: Text is parsed into lines with metadata (headers, code blocks, list items)
2. SQL Storage: Lines are stored in DuckDB with indexes for efficient querying
3. Reading Strategy: LLM decides what to read using SQL queries
4. Iterative Reading: Multiple passes gather relevant information
5. Answer Synthesis: Final answer is generated from gathered context
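
The loading and storage steps can be sketched as a table with one row per line, queried by keyword or by line range. This is an assumption-laden illustration, not the library's schema: the table name `doc_lines`, its columns, and the helper functions are hypothetical, header detection by a leading `#` is a simplification, and Python's built-in `sqlite3` is used as a stand-in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3


def load_lines(conn: sqlite3.Connection, text: str) -> int:
    """Store one row per line with a simple header flag; return the row count."""
    conn.execute(
        "CREATE TABLE doc_lines ("
        "line_no INTEGER PRIMARY KEY, content TEXT, is_header INTEGER)"
    )
    rows = [
        (n, line, int(line.startswith("#")))  # assumed markdown-style headers
        for n, line in enumerate(text.splitlines(), start=1)
    ]
    conn.executemany("INSERT INTO doc_lines VALUES (?, ?, ?)", rows)
    return len(rows)


def search(conn: sqlite3.Connection, term: str, limit: int = 10) -> list[tuple]:
    """Keyword search: return (line_no, content) for lines containing the term."""
    cur = conn.execute(
        "SELECT line_no, content FROM doc_lines "
        "WHERE content LIKE ? ORDER BY line_no LIMIT ?",
        (f"%{term}%", limit),
    )
    return cur.fetchall()


def read_range(conn: sqlite3.Connection, start: int, end: int) -> list[tuple]:
    """Focused read: return the lines numbered start..end inclusive."""
    cur = conn.execute(
        "SELECT line_no, content FROM doc_lines "
        "WHERE line_no BETWEEN ? AND ? ORDER BY line_no",
        (start, end),
    )
    return cur.fetchall()
```

One difference to keep in mind when moving the same queries to DuckDB: `LIKE` is case-sensitive there (with `ILIKE` as the case-insensitive form), whereas SQLite's `LIKE` ignores case for ASCII letters by default.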

Reading Modes

overview: Read document beginning (lines 1-100)
search: Find keywords with LIKE '%term%'
read: Focused reading (20-50 lines)
deep_read: Detailed analysis (50-100 lines)
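
One way to picture the modes is as a small function that maps each mode to a parameterized SQL query. The sketch below is hypothetical: `build_query`, the `doc_lines` table, and its `line_no` and `content` columns are illustrative names, and each mode simply uses the upper end of the line budget listed above. In the library itself the model chooses what to read; this only shows what the resulting queries might look like.

```python
# Upper bound on lines per mode, taken from the ranges listed above.
MAX_LINES = {"overview": 100, "read": 50, "deep_read": 100}


def build_query(mode: str, start: int = 1, term: str = "",
                table: str = "doc_lines") -> tuple[str, tuple]:
    """Return (sql, params) for a reading mode; table and columns are assumed."""
    if mode == "search":
        sql = (f"SELECT line_no, content FROM {table} "
               "WHERE content LIKE ? ORDER BY line_no")
        return sql, (f"%{term}%",)

    if mode not in MAX_LINES:
        raise ValueError(f"unknown reading mode: {mode!r}")

    if mode == "overview":
        start = 1  # the overview always begins at the top of the document

    end = start + MAX_LINES[mode] - 1
    sql = (f"SELECT line_no, content FROM {table} "
           "WHERE line_no BETWEEN ? AND ? ORDER BY line_no")
    return sql, (start, end)
```

Passing the search term as a bound parameter rather than formatting it into the SQL string keeps quotes in the term from breaking the query.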

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
ruff format rlm_repl

# Type checking
mypy rlm_repl

License

MIT License - see LICENSE file for details.

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Background & History

RLM-REPL was created by Remy Gakwaya after reading the MIT paper on Recursive Language Models. The initial implementation attempted to use a REPL approach where the LLM would generate Python functions to process documents. However, this approach proved challenging, especially with smaller language models that struggled to create complex Python functions reliably.

After hundreds of iterations and experiments, Remy developed the RLM-REPL v8 concept: a human-like reading strategy designed specifically for smaller, local language models running on limited computational resources. The philosophy was simple: if the strategy works reliably with small, weak models under limited compute, it should perform exceptionally well when powered by leading LLMs.

The library evolved to use SQL-based retrieval instead of LLM-generated Python functions, leveraging DuckDB for efficient document storage and querying. This approach:

Works reliably with models of all sizes, from small local models to leading cloud-based LLMs
Provides a structured, predictable interface (SQL) that even smaller models can handle
Enables efficient querying with database indexes
Implements the human-like reading strategy (overview → search → deep read → synthesize) developed in v8

Acknowledgments

Author: Remy Gakwaya

Inspiration: Based on the MIT paper on Recursive Language Models

Innovation: The RLM-REPL v8 concept - human-like reading strategies for LLM document processing - was developed by Remy after extensive experimentation (hundreds of iterations) to create a solution that works reliably with local, smaller language models.

Evolution: This implementation uses SQL-based retrieval instead of LLM-generated Python functions, making it more reliable and accessible for smaller language models while maintaining the proven v8 reading strategy.
