# Recursive Language Models (RLM): A Hands-On Tutorial

Welcome! This notebook introduces **Recursive Language Models (RLM)** — an inference paradigm that lets language models programmatically examine, decompose, and recursively call themselves over their input using a REPL environment.

Instead of a single `llm.completion()` call, RLM gives the model a **code execution environment** where it can write and run code, inspect intermediate results, and spawn **recursive sub-calls** to itself for sub-problems. This lets it work over inputs much larger than a single context window.

> **Paper:** *Recursive Language Models* — Alex L. Zhang, Tim Kraska, Omar Khattab (MIT, 2025)  
> **Repository:** [github.com/alexzhang13/rlm](https://github.com/alexzhang13/rlm)

## Table of Contents

1. [Setup & Configuration](#1-setup--configuration)
2. [What is an RLM?](#2-what-is-an-rlm)
3. [Your First RLM Completion](#3-your-first-rlm-completion)
4. [How the REPL Loop Works](#4-how-the-repl-loop-works)
5. [Recursive Sub-Calls: Depth > 1](#5-recursive-sub-calls)
6. [Batched Recursive Queries](#6-batched-recursive-queries)
7. [Context Compaction](#7-context-compaction)
8. [Custom Tools](#8-custom-tools)
9. [Logging & Trajectory Inspection](#9-logging--trajectory-inspection)
10. [Exercises](#10-exercises)
11. [References](#11-references)

---
## 1. Setup & Configuration <a id='1-setup--configuration'></a>

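RLM requires an LLM backend. This tutorial is pre-configured to read the API key provided by your instructor from environment variables. If you're running it on your own, set the appropriate key in the first code cell below. That cell auto-detects only the OpenAI, Anthropic, and Portkey keys, in that order; for the other backends in the table, set `BACKEND` and `BACKEND_KWARGS` by hand.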

### Supported backends

| Backend | Environment Variable | Example Model |
|---------|---------------------|---------------|
| OpenAI | `OPENAI_API_KEY` | `gpt-4o`, `gpt-4o-mini` |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` |
| Portkey (router) | `PORTKEY_API_KEY` | `@openai/gpt-4o` |
| OpenRouter | `OPENROUTER_API_KEY` | `openai/gpt-4o` |
| vLLM (local) | `VLLM_BASE_URL` | your local model |

In [None]:
import os
import json

# ── Pick your backend ──────────────────────────────────────────────
# The environment variables are pre-set by the JupyterHub deployment.
# If they aren't set, uncomment ONE of the lines below and paste your key.

# os.environ["OPENAI_API_KEY"]    = "sk-..."
# os.environ["ANTHROPIC_API_KEY"] = "sk-ant-..."
# os.environ["PORTKEY_API_KEY"]   = "..."

# ── Detect which backend is available ──────────────────────────────
if os.environ.get("OPENAI_API_KEY"):
    BACKEND = "openai"
    BACKEND_KWARGS = {
        "model_name": os.environ.get("OPENAI_MODEL", "gpt-4o-mini"),
        "api_key": os.environ["OPENAI_API_KEY"],
    }
elif os.environ.get("ANTHROPIC_API_KEY"):
    BACKEND = "anthropic"
    BACKEND_KWARGS = {
        "model_name": os.environ.get("ANTHROPIC_MODEL", "claude-sonnet-4-20250514"),
        "api_key": os.environ["ANTHROPIC_API_KEY"],
    }
elif os.environ.get("PORTKEY_API_KEY"):
    BACKEND = "portkey"
    BACKEND_KWARGS = {
        "model_name": os.environ.get("PORTKEY_MODEL", "@openai/gpt-4o-mini"),
        "api_key": os.environ["PORTKEY_API_KEY"],
    }
else:
    raise EnvironmentError(
        "No API key found. Set one of: OPENAI_API_KEY, ANTHROPIC_API_KEY, "
        "or PORTKEY_API_KEY in this cell or in the JupyterHub environment."
    )

print(f"Backend:  {BACKEND}")
print(f"Model:    {BACKEND_KWARGS['model_name']}")

In [None]:
# Verify the rlm library is installed
from rlm import RLM
from rlm.logger import RLMLogger

print("rlm library imported successfully!")

---
## 2. What is an RLM? <a id='2-what-is-an-rlm'></a>

### The Problem with Standard LLM Calls

A normal LLM call looks like:

```python
response = llm.completion("Summarize this 500-page document: ...")
```

This has fundamental limitations:
- **Context window limits** — the document may not fit in the model's context
- **Single-shot reasoning** — the model must produce its answer in one pass
- **No tool use** — the model can't inspect, slice, or compute over the input

### The RLM Solution

An RLM call replaces this with an **iterative REPL loop**:

```
┌─────────────────────────────────────────────────┐
│                   RLM Loop                       │
│                                                  │
│  1. Model receives prompt + context variable     │
│  2. Model writes code in a ```python block       │
│  3. Code executes in REPL environment            │
│  4. stdout/stderr fed back to model              │
│  5. Repeat until model calls FINAL_VAR(answer)   │
│                                                  │
│  At any step, the model can:                     │
│  • Slice/index the context variable              │
│  • Spawn sub-RLM calls with rlm_query()          │
│  • Store intermediate results in REPL variables  │
│  • Use any Python library available              │
└─────────────────────────────────────────────────┘
```
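
The loop in the diagram can be sketched as a toy in plain Python. This is not the `rlm` library's implementation: `scripted_model`, `final_var`, `FinalAnswer`, and `toy_rlm_loop` are hypothetical names, and the "model" is a fixed script rather than an LLM. It only illustrates the extract, execute, and feed-back cycle.

```python
import contextlib
import io
import re

FENCE = "`" * 3  # built at runtime so this example can sit inside a fence
CODE_BLOCK = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)


class FinalAnswer(Exception):
    """Raised by the toy FINAL_VAR to stop the loop (an assumption of this sketch)."""

    def __init__(self, value):
        super().__init__("final answer reached")
        self.value = value


def final_var(value):
    raise FinalAnswer(value)


def scripted_model(turn, feedback):
    """Hypothetical stand-in for an LLM: one fixed code block per turn.

    A real model would read `feedback` (the previous stdout) before replying.
    """
    replies = [
        "total = sum(len(w) for w in context.split())\nprint(total)",
        "FINAL_VAR(total)",
    ]
    return f"{FENCE}python\n{replies[turn]}\n{FENCE}"


def toy_rlm_loop(context, max_iterations=5):
    # One namespace is shared by every turn, so variables persist.
    namespace = {"context": context, "FINAL_VAR": final_var}
    feedback = ""
    for turn in range(max_iterations):
        reply = scripted_model(turn, feedback)
        for code in CODE_BLOCK.findall(reply):
            buffer = io.StringIO()
            try:
                with contextlib.redirect_stdout(buffer):
                    exec(code, namespace)
            except FinalAnswer as done:
                return done.value
            feedback = buffer.getvalue()  # stdout goes back to the model
    return None  # iteration limit reached without a final answer


answer = toy_rlm_loop("recursive language models")
print(answer)  # 23
```

The variable `total` is created on the first turn and read on the second, which is the same state persistence the real REPL provides.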

### Key Built-in Functions

Inside the REPL, the model has access to:

| Name | Purpose |
|----------|--------|
| `context` | Variable holding the input payload |
| `FINAL_VAR(x)` | Declare the final answer and exit the loop |
| `llm_query(prompt)` | Query the underlying LLM directly (no REPL) |
| `rlm_query(prompt)` | Spawn a recursive child RLM (with its own REPL) |
| `rlm_query_batched(prompts)` | Spawn multiple child RLMs in parallel |

---
## 3. Your First RLM Completion <a id='3-your-first-rlm-completion'></a>

Let's start with the simplest possible RLM call. The model will use the REPL to compute something and return it.

In [None]:
# Create an RLM instance
rlm = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",       # Execute code in the local Python process
    environment_kwargs={},
    max_depth=1,               # No recursive sub-calls (yet)
    verbose=True,              # Print the REPL interaction to console
)

print("RLM instance created.")
print(f"  Backend: {BACKEND}")
print("  Environment: local")
print("  Max depth: 1")

In [None]:
# ── First completion: a simple computation ──
result = rlm.completion(
    "Compute the sum of the first 10 Fibonacci numbers. "
    "Show your work step by step in the REPL, then return the answer with FINAL_VAR."
)

print("\n" + "=" * 60)
print("RESULT")
print("=" * 60)
print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

### What just happened?

1. The RLM sent your prompt to the LLM backend
2. The model responded with a Python code block (in \`\`\`python fences)
3. RLM extracted the code and executed it in the REPL (with `environment="local"`, that is the local Python process, not an isolated sandbox)
4. The stdout was fed back to the model as the next message
5. The model eventually called `FINAL_VAR(answer)` to return its result

The `verbose=True` flag let you see the full back-and-forth. This iterative loop is the core of RLM.
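
To check the model's answer independently, the computation can be run directly. The prompt does not say whether the sequence starts at 0 or 1, so this sketch (with a hypothetical helper `first_fibonacci`) computes both conventions; either result may be a reasonable answer from the model.

```python
def first_fibonacci(n, start=(1, 1)):
    """Return the first n Fibonacci numbers for the given two seed values."""
    a, b = start
    values = []
    for _ in range(n):
        values.append(a)
        a, b = b, a + b
    return values


one_based = first_fibonacci(10)           # 1, 1, 2, ..., 55
zero_based = first_fibonacci(10, (0, 1))  # 0, 1, 1, ..., 34

print(sum(one_based))   # 143
print(sum(zero_based))  # 88
```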

---
## 4. How the REPL Loop Works <a id='4-how-the-repl-loop-works'></a>

Let's look more carefully at the iteration mechanics. We'll use the **RLMLogger** to capture the full trajectory.

In [None]:
# Create an RLM with logging
logger = RLMLogger()  # In-memory logger

rlm_logged = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=1,
    max_iterations=5,  # Safety limit on REPL turns
    logger=logger,
    verbose=True,
)

result = rlm_logged.completion(
    "Generate a list of 20 random integers between 1 and 100 using random.seed(42). "
    "Compute the mean, median, and standard deviation. "
    "Print each result clearly, then return a dict with all three values via FINAL_VAR."
)

print("\n" + "=" * 60)
print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

In [None]:
# ── Inspect the trajectory ──
if result.metadata:
    traj = result.metadata
    iterations = traj.get("iterations", [])
    print(f"Total iterations: {len(iterations)}")
    print()
    
    for i, iteration in enumerate(iterations):
        print(f"── Iteration {i + 1} ──")
        code_blocks = iteration.get("code_blocks", [])
        for j, cb in enumerate(code_blocks):
            code = cb.get("code", "")
            repl_result = cb.get("result", {})
            stdout = repl_result.get("stdout", "")
            stderr = repl_result.get("stderr", "")
            
            print(f"  Code block {j + 1}:")
            # Show at most the first 10 lines of code
            code_lines = code.split("\n")
            for line in code_lines[:10]:
                print(f"    >>> {line}")
            if len(code_lines) > 10:
                print(f"    ... ({len(code_lines)} lines total)")
            if stdout:
                print(f"  stdout: {stdout[:200]}")
            if stderr:
                print(f"  stderr: {stderr[:200]}")
        print()
else:
    print("No trajectory metadata captured.")

### Key Observations

Notice how the model:
1. **Wrote Python code** to generate the data and compute statistics
2. **Used `print()`** to inspect intermediate results
3. **Called `FINAL_VAR()`** when it had the final answer
4. **Persisted state** across REPL iterations (variables survive between turns)

This differs from plain chain-of-thought or one-off tool calls: the model has **full programmatic control** over how it inspects data and structures its reasoning.
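
A reference computation is handy for judging the model's output. The sketch below assumes the model draws its numbers with `random.randint(1, 100)` right after seeding; if it uses another function such as `random.sample`, its numbers will differ. The prompt also does not say whether the standard deviation is the sample or the population version, so both are shown.

```python
import random
import statistics

random.seed(42)
numbers = [random.randint(1, 100) for _ in range(20)]

reference = {
    "mean": statistics.mean(numbers),
    "median": statistics.median(numbers),
    "stdev_sample": statistics.stdev(numbers),       # divides by n - 1
    "stdev_population": statistics.pstdev(numbers),  # divides by n
}

for name, value in reference.items():
    print(f"{name}: {value:.3f}")
```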

---
## 5. Recursive Sub-Calls: Depth > 1 <a id='5-recursive-sub-calls'></a>

The **recursive** part of RLM comes from `rlm_query()`. When the model calls `rlm_query(prompt)` inside its REPL, a **child RLM** is spawned with its own REPL environment. The child processes the sub-problem independently and returns its result to the parent.

```
Parent RLM (depth=0)
  │
  ├── Iteration 1: writes code, calls rlm_query("sub-task")
  │     │
  │     └── Child RLM (depth=1)
  │           ├── Iteration 1: writes code, prints results
  │           └── Iteration 2: calls FINAL_VAR(answer)
  │                             └── answer flows back to parent
  │
  └── Iteration 2: uses child's answer, calls FINAL_VAR(final)
```

This enables **divide-and-conquer** over arbitrarily long inputs.
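
The depth limit and the divide-and-conquer shape can be illustrated with a toy that uses no LLM at all. The names `leaf_answer` and `toy_recursive_count` are hypothetical, and the rule "a call may spawn children only while `depth + 1 < max_depth`" is an assumption chosen to match this tutorial's comments (`max_depth=1` means no sub-calls, `max_depth=2` allows one level).

```python
def leaf_answer(chunk):
    """Hypothetical stand-in for a model call: counts the words in one chunk."""
    return len(chunk.split())


def toy_recursive_count(words, depth=0, max_depth=2, leaf_size=4, trace=None):
    """Count words by halving the input until it is small or depth runs out."""
    if trace is not None:
        trace.append((depth, len(words)))  # record (depth, input size) per call
    can_recurse = depth + 1 < max_depth
    if not can_recurse or len(words) <= leaf_size:
        return leaf_answer(" ".join(words))
    mid = len(words) // 2
    left = toy_recursive_count(words[:mid], depth + 1, max_depth, leaf_size, trace)
    right = toy_recursive_count(words[mid:], depth + 1, max_depth, leaf_size, trace)
    return left + right  # the parent combines the children's answers


words = "the quick brown fox jumps over the lazy dog again and again".split()
trace = []
total = toy_recursive_count(words, trace=trace)
print(total)  # 12
print(trace)  # [(0, 12), (1, 6), (1, 6)]
```

The children at depth 1 are larger than `leaf_size` but still answer directly, because the depth limit stops them from spawning further calls.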

In [None]:
# ── Recursive RLM: depth=2 ──
logger_deep = RLMLogger()

rlm_deep = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=2,           # Allow one level of recursion
    max_iterations=5,
    logger=logger_deep,
    verbose=True,
)

prompt = (
    "Use rlm_query() to ask a sub-model: "
    "'What are the first 8 prime numbers? Reply with just the numbers separated by commas.' "
    "Store the response in a variable called 'primes'. "
    "Then parse the primes into a list of integers, compute their sum, "
    "and return the sum with FINAL_VAR."
)

result = rlm_deep.completion(prompt)

print("\n" + "=" * 60)
print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

In [None]:
# ── Inspect sub-call metadata ──
def print_metadata_tree(result, depth=0):
    """Print timing and iteration metadata for a completion, plus its direct sub-calls."""
    indent = "  " * depth
    prefix = "└─ " if depth > 0 else ""
    
    print(f"{indent}{prefix}[Depth {depth}] "
          f"time={result.execution_time:.2f}s  "
          f"response_len={len(str(result.response))}")
    
    if result.metadata:
        traj = result.metadata
        n_iters = len(traj.get("iterations", []))
        print(f"{indent}   {n_iters} iteration(s)")
        
        for i, iteration in enumerate(traj.get("iterations", [])):
            for cb in iteration.get("code_blocks", []):
                repl_result = cb.get("result", {})
                for j, sub_call in enumerate(repl_result.get("rlm_calls", [])):
                    sub_resp = str(sub_call.get("response", ""))[:80]
                    print(f"{indent}   iter {i+1} sub-call {j+1}: "
                          f"response={sub_resp!r}")
    print()

print("METADATA TREE")
print("=" * 40)
print_metadata_tree(result)

---
## 6. Batched Recursive Queries <a id='6-batched-recursive-queries'></a>

When the model needs to process multiple sub-tasks, it can use `rlm_query_batched()` to spawn several child RLMs **in parallel**. This is crucial for tasks like summarizing multiple document sections simultaneously.

In [None]:
# ── Batched parallel sub-calls ──
rlm_batched = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=2,
    max_iterations=5,
    verbose=True,
)

prompt = (
    "Use rlm_query_batched() to ask THREE different questions in parallel:\n"
    "  1. 'What are the first 5 prime numbers? Reply with just the numbers.'\n"
    "  2. 'What are the first 5 perfect squares? Reply with just the numbers.'\n"
    "  3. 'What are the first 5 triangular numbers? Reply with just the numbers.'\n"
    "Store the list of responses in a variable called 'answers', "
    "then combine them into a summary string and return it with FINAL_VAR."
)

result = rlm_batched.completion(prompt)

print("\n" + "=" * 60)
print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

### Why batched queries matter

Consider summarizing a 500-page document:

1. The parent RLM **chunks** the document (stored in `context`) into sections
2. It calls `rlm_query_batched()` with a summarization prompt for each chunk
3. Each child RLM summarizes its chunk independently **in parallel**
4. The parent collects all summaries and produces the final result

This recursive divide-and-conquer pattern is what gives RLM its power over arbitrarily long inputs.
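
The four steps above follow a map-reduce shape, sketched here with a thread pool and a stub in place of real child RLMs. `chunk_text`, `fake_summarize`, and `summarize_batched` are hypothetical helpers; how `rlm_query_batched()` schedules its children internally is not shown in this tutorial, so the thread pool is only an illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_text(text, n_chunks):
    """Split text into at most n_chunks word-aligned pieces of similar size."""
    words = text.split()
    size = max(1, -(-len(words) // n_chunks))  # ceiling division, at least 1
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def fake_summarize(chunk):
    """Hypothetical stand-in for a child RLM: keeps the first three words."""
    return " ".join(chunk.split()[:3]) + " ..."


def summarize_batched(chunks, max_workers=4):
    """Run the stub over all chunks concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_summarize, chunks))


document = "one two three four five six seven eight nine ten eleven twelve"
chunks = chunk_text(document, 3)        # map input: three 4-word chunks
partials = summarize_batched(chunks)    # map step, run in parallel
final = " | ".join(partials)            # reduce step in the parent
print(final)  # one two three ... | five six seven ... | nine ten eleven ...
```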

---
## 7. Context Compaction <a id='7-context-compaction'></a>

When an RLM runs for many iterations, the conversation history can fill up the context window. **Compaction** automatically summarizes the conversation when usage exceeds a threshold, preserving key information while freeing space.

The model receives a `history` variable containing the summarized prior context, so it can still reference earlier results.

In [None]:
# ── Compaction example ──
# We use a very low threshold to force compaction to trigger

rlm_compact = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=1,
    max_iterations=12,
    compaction=True,
    compaction_threshold_pct=0.02,  # Very low — triggers quickly for demo
    verbose=True,
)

prompt = (
    "Complete the following steps, each in its own REPL block (one block per message). "
    "Do NOT combine steps.\n\n"
    "Step 1: Use `import random; random.seed(42)` then generate a list of 20 random "
    "integers between 1 and 1000. Print ALL of them. Store as `data_a`.\n\n"
    "Step 2: Generate another 20 random integers (continuing the same RNG stream) "
    "between 1 and 1000. Print ALL of them. Store as `data_b`.\n\n"
    "Step 3: Compute and print the mean of `data_a`. Store as `mean_a`.\n\n"
    "Step 4: Compute and print the mean of `data_b`. Store as `mean_b`.\n\n"
    "Step 5: Compute final_answer = round(mean_a + mean_b, 2) and call FINAL_VAR(final_answer)."
)

result = rlm_compact.completion(prompt, root_prompt=prompt)

print("\n" + "=" * 60)
print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

### How compaction works

1. After each iteration, RLM checks what percentage of the context window is used
2. When usage exceeds `compaction_threshold_pct`, the conversation history is **summarized** by the LLM
3. The summary replaces the full history, freeing context space
4. The model receives a `history` variable with the summary so it can still reference past results
5. REPL variables (like `data_a`, `mean_a`) persist in the environment regardless of compaction

This matters for tasks that need many iterations: without compaction, the conversation history would eventually exceed the context window and later turns would fail.
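
The threshold check can be sketched as a toy. Everything here is an assumption made for illustration: the four-characters-per-token estimate, the helper names, and treating `threshold_pct` as a fraction of the window (which matches the `0.02` used above). The real library may count tokens and build its summary differently.

```python
def estimate_tokens(messages):
    """Rough token estimate: about four characters per token (a heuristic)."""
    return sum(len(m) for m in messages) // 4


def fake_summary(messages):
    """Hypothetical stand-in for the LLM-written summary."""
    return f"[summary of {len(messages)} earlier messages]"


def maybe_compact(messages, context_window, threshold_pct):
    """Replace the history with one summary once usage passes the threshold."""
    used = estimate_tokens(messages)
    if used <= context_window * threshold_pct:
        return messages
    return [fake_summary(messages)]


history = []
for turn in range(1, 7):
    history.append(f"turn {turn}: " + "x" * 200)
    history = maybe_compact(history, context_window=1000, threshold_pct=0.2)
    print(turn, len(history), estimate_tokens(history))
```

In this run the history grows for three turns, collapses to a single summary message on the fourth, and then starts growing again.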

---
## 8. Custom Tools <a id='8-custom-tools'></a>

You can inject **custom functions and data** into the RLM's REPL environment. The model can then call these functions as part of its reasoning process.

In [None]:
# Define some custom tools that will be injected into the REPL
def word_count(text: str) -> int:
    """Count the number of words in a text."""
    return len(text.split())

def extract_numbers(text: str) -> list[float]:
    """Extract all numbers from a text string."""
    import re
    return [float(x) for x in re.findall(r'-?\d+\.?\d*', text)]

def celsius_to_fahrenheit(c: float) -> float:
    """Convert Celsius to Fahrenheit."""
    return c * 9/5 + 32

# Data that will be injected as a variable
CITY_DATA = {
    "New York": {"population": 8_336_817, "area_sq_mi": 302.6},
    "Los Angeles": {"population": 3_979_576, "area_sq_mi": 468.7},
    "Chicago": {"population": 2_693_976, "area_sq_mi": 227.3},
    "Houston": {"population": 2_304_580, "area_sq_mi": 670.6},
}


rlm_tools = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=1,
    custom_tools={
        "word_count": {
            "tool": word_count,
            "description": "Count words in a string",
        },
        "extract_numbers": {
            "tool": extract_numbers,
            "description": "Extract all numbers from a text string",
        },
        "celsius_to_fahrenheit": {
            "tool": celsius_to_fahrenheit,
            "description": "Convert a temperature from Celsius to Fahrenheit",
        },
        "CITY_DATA": {
            "tool": CITY_DATA,
            "description": "Dict of US cities with population and area data",
        },
    },
    verbose=True,
)

result = rlm_tools.completion(
    "Using the CITY_DATA, compute the population density (people per square mile) "
    "for each city. Which city has the highest density? "
    "Return a dict mapping city name to density, sorted by density descending, via FINAL_VAR."
)

print("\n" + "=" * 60)
print(f"Response: {result.response}")

### Custom tools enable domain-specific RLMs

By injecting functions and data, you can build RLMs specialized for:
- **Data analysis** — pass in database query functions
- **Scientific computing** — pass in simulation functions
- **Web scraping** — pass in HTTP fetch functions
- **Business logic** — pass in pricing calculators, inventory lookups, etc.

The model discovers these tools through descriptions in its system prompt and uses them naturally in its REPL code.
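
One plausible way such injection could work is sketched below, using the same `{"tool": ..., "description": ...}` shape as the `custom_tools` argument above. `build_namespace` and `describe_tools` are hypothetical helpers, not part of the `rlm` API, and the `model_code` string stands in for code the model might write.

```python
def build_namespace(custom_tools):
    """Map each tool name to its object so REPL code can use it directly."""
    return {name: spec["tool"] for name, spec in custom_tools.items()}


def describe_tools(custom_tools):
    """Render one line per tool, as could be appended to a system prompt."""
    return "\n".join(
        f"- {name}: {spec['description']}" for name, spec in custom_tools.items()
    )


def density(population, area_sq_mi):
    return population / area_sq_mi


toy_tools = {
    "density": {"tool": density, "description": "People per square mile"},
    "CITIES": {
        "tool": {"A": (1000, 10.0), "B": (900, 3.0)},  # made-up demo data
        "description": "Dict of name -> (population, area_sq_mi)",
    },
}

namespace = build_namespace(toy_tools)

# Code a model might write after reading the tool descriptions.
model_code = (
    "ranked = sorted(CITIES, key=lambda c: density(*CITIES[c]), reverse=True)\n"
    "top = ranked[0]\n"
)
exec(model_code, namespace)

print(describe_tools(toy_tools))
print(namespace["top"])  # B
```

Functions and plain data are treated the same way here: both become names in the namespace, which is why `CITY_DATA` can be passed alongside callables.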

---
## 9. Logging & Trajectory Inspection <a id='9-logging--trajectory-inspection'></a>

The `RLMLogger` captures complete execution trajectories — every iteration, every code block, every sub-call. This is invaluable for debugging and understanding how the model reasons.

In [None]:
# ── Full logging example ──
full_logger = RLMLogger()  # log_dir="./logs" to also save to disk

rlm_full = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=1,
    max_iterations=5,
    logger=full_logger,
    verbose=False,  # Quiet mode — we'll inspect via the logger instead
)

result = rlm_full.completion(
    "Write a function that checks if a number is prime. "
    "Test it on the numbers 1 through 20. "
    "Return the list of primes via FINAL_VAR."
)

print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

In [None]:
# ── Detailed trajectory analysis ──
if result.metadata:
    traj = result.metadata
    
    # Run-level metadata
    run_meta = traj.get("run_metadata", {})
    print("Run Metadata:")
    for key, value in run_meta.items():
        print(f"  {key}: {value}")
    print()
    
    # Iteration details
    iterations = traj.get("iterations", [])
    print(f"Iterations: {len(iterations)}")
    print()
    
    total_code_blocks = 0
    total_lines_of_code = 0
    
    for i, iteration in enumerate(iterations):
        code_blocks = iteration.get("code_blocks", [])
        total_code_blocks += len(code_blocks)
        print(f"  Iteration {i + 1}: {len(code_blocks)} code block(s)")
        for j, cb in enumerate(code_blocks):
            code = cb.get("code", "")
            lines = code.count("\n") + 1
            total_lines_of_code += lines
            repl_result = cb.get("result", {})
            success = not repl_result.get("stderr", "")
            status = "OK" if success else "ERROR"
            print(f"    Block {j+1}: {lines} lines [{status}]")
            if not success:
                print(f"      stderr: {repl_result['stderr'][:200]}")
    
    print(f"\nTotal: {total_code_blocks} code blocks, {total_lines_of_code} lines of code")
else:
    print("No metadata captured (logger may not have been attached).")

### Saving trajectories to disk

To save trajectories for later analysis or for use with the RLM [web visualizer](https://github.com/alexzhang13/rlm/tree/main/visualizer):

```python
logger = RLMLogger(log_dir="./logs")
```

This writes JSONL files that can be loaded into the visualizer for interactive exploration of the full execution tree.
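
JSONL files can also be read back with the standard library alone. This sketch assumes the log files end in `.jsonl` and hold one JSON object per line; the file name and record fields in the demo are made up, so inspect a real log file before relying on any particular key.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Parse a JSON Lines file into a list of objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize_logs(log_dir):
    """Return {file name: number of records} for each .jsonl file in log_dir."""
    return {
        path.name: len(read_jsonl(path))
        for path in sorted(Path(log_dir).glob("*.jsonl"))
    }


# Demo on a throwaway directory with a made-up two-record file.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "demo.jsonl"
    sample.write_text('{"iteration": 1}\n\n{"iteration": 2}\n', encoding="utf-8")
    print(summarize_logs(tmp))  # {'demo.jsonl': 2}
```

For the logger configured above, the call would be `summarize_logs("./logs")`.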

---
## 10. Exercises <a id='10-exercises'></a>

Now it's your turn! Try these exercises to deepen your understanding of RLM.

### Exercise 1: Context Processing

Pass a long text as the `context` and have the RLM process it programmatically.

In [None]:
# ── Exercise 1: Using the context variable ──

long_text = """
The history of computing is a fascinating journey. Charles Babbage conceived the
Analytical Engine in 1837, which contained many features of modern computers.
Ada Lovelace wrote what is considered the first computer program for this machine.
In 1936, Alan Turing published his paper on computable numbers, laying the
theoretical foundation for computer science. The first electronic general-purpose
computer, ENIAC, was completed in 1945. It weighed 30 tons and occupied 1,800
square feet. The transistor was invented at Bell Labs in 1947, revolutionizing
electronics. The integrated circuit followed in 1958, leading to the
miniaturization of computing. The first microprocessor, the Intel 4004, was
released in 1971. The personal computer revolution began in the late 1970s with
machines like the Apple II and the IBM PC. Tim Berners-Lee invented the World
Wide Web in 1989. The smartphone era began with the iPhone in 2007. Today,
artificial intelligence and machine learning are transforming computing once again.
"""

rlm_ctx = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=1,
    verbose=True,
)

# The context variable is available as `context` inside the REPL
result = rlm_ctx.completion(
    "The variable `context` contains a passage about computing history. "
    "Extract every year mentioned and the event associated with it. "
    "Return a dict mapping year (int) to event (str) via FINAL_VAR.",
    context=long_text,
)

print("\n" + "=" * 60)
print(f"Response: {result.response}")

### Exercise 2: Recursive Document Summarization

Use `rlm_query_batched()` to implement a map-reduce style summarizer. The parent should chunk text and delegate summarization to child RLMs.

In [None]:
# ── Exercise 2: Recursive summarization ──
# TODO: Create an RLM with max_depth=2 and write a prompt that instructs
# the parent to:
#   1. Split the context into 3 equal parts
#   2. Use rlm_query_batched() to summarize each part
#   3. Combine the summaries into one final summary
#   4. Return via FINAL_VAR

# Your code here:
# rlm_summarizer = RLM(...)
# result = rlm_summarizer.completion("...", context=long_text)

print("Complete this exercise by implementing recursive summarization!")

### Exercise 3: Custom Tool Integration

Create a custom tool that simulates a database, and have the RLM query it.

In [None]:
# ── Exercise 3: Build a database tool ──
# TODO: Create a function that simulates a SQL query on in-memory data.
# Inject it as a custom tool and have the RLM answer questions about the data.

# Hint:
# def query_db(table: str, column: str, condition: str = None) -> list:
#     ...
#
# rlm_db = RLM(..., custom_tools={"query_db": {"tool": query_db, "description": "..."}})

print("Complete this exercise by building a database query tool!")

### Exercise 4: Error Recovery

What happens when the model writes buggy code? RLM feeds the error back, and the model can fix it. Test this by giving a tricky prompt.
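
The feedback mechanism can be sketched as follows. This is a minimal illustration, not the library's code: `run_block` is a hypothetical helper that returns `stdout` and `stderr` keys like the trajectory entries inspected earlier, and a failing block yields a traceback string instead of stopping the session.

```python
import contextlib
import io
import traceback


def run_block(code, namespace):
    """Execute code; return captured stdout and any traceback as stderr."""
    out = io.StringIO()
    err = ""
    try:
        with contextlib.redirect_stdout(out):
            exec(code, namespace)
    except Exception:
        err = traceback.format_exc()  # this text would be shown to the model
    return {"stdout": out.getvalue(), "stderr": err}


namespace = {}

# First attempt fails; the error text is what the model would see next.
first = run_block("import nonexistent_lib", namespace)

# Second attempt is the kind of fix a model could write after reading it.
second = run_block(
    "import math\nresult = math.factorial(10)\nprint(result)", namespace
)

print(first["stderr"].splitlines()[-1])
print(second["stdout"].strip())  # 3628800
```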

In [None]:
# ── Exercise 4: Observe error recovery ──

rlm_err = RLM(
    backend=BACKEND,
    backend_kwargs=BACKEND_KWARGS,
    environment="local",
    max_depth=1,
    max_iterations=5,
    logger=RLMLogger(),
    verbose=True,
)

# This prompt intentionally asks for something that might cause errors
result = rlm_err.completion(
    "Try to import a library called 'nonexistent_lib'. "
    "When that fails, catch the error and instead compute factorial(10) "
    "using the math library. Return the result via FINAL_VAR."
)

print("\n" + "=" * 60)
print(f"Response: {result.response}")
print(f"Execution time: {result.execution_time:.2f}s")

# Check how many iterations were needed
if result.metadata:
    iters = len(result.metadata.get("iterations", []))
    print(f"Iterations needed: {iters}")

---
## 11. References <a id='11-references'></a>

1. **Zhang, A.L., Kraska, T., & Khattab, O.** (2025). *Recursive Language Models.* arXiv:2512.24601. MIT OASYS Lab.

2. **RLM GitHub Repository:** [github.com/alexzhang13/rlm](https://github.com/alexzhang13/rlm)

3. **RLM Visualizer:** Interactive web tool for exploring RLM execution trajectories — see the `visualizer/` directory in the repo.

4. **RLM on PyPI:** `pip install rlms`

---

### Key Takeaways

- **RLM is an inference paradigm**, not a model architecture — it works with any LLM backend
- The REPL loop gives models **programmatic control** over their reasoning
- **Recursive sub-calls** enable divide-and-conquer over arbitrarily long inputs
- **Compaction** manages context windows automatically for long-running tasks
- **Custom tools** let you build domain-specific RLM applications
- **Trajectory logging** provides full transparency into the model's reasoning process

Happy hacking!