# Understanding REPL State Persistence

This notebook demonstrates how `exec()` works and why state persistence matters for RLM.

## Step 1: How `exec()` works

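Python's `exec()` function executes a string as Python code.
It accepts an optional `globals` dict; names assigned at the top level of the executed code are stored there.
Python also inserts a `__builtins__` entry into that dict, which is why the cells below filter out keys starting with `__`.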

In [None]:
# Basic exec - variables go into a globals dict
env = {}

code = "x = 10"
exec(code, env)

print("After exec, env contains:")
print({k: v for k, v in env.items() if not k.startswith('__')})
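The `__` filter in the cell above is there because `exec()` adds a `__builtins__` entry to any globals dict that does not already have one. A minimal sketch that makes this visible (the names `demo_env` and `user_vars` are only for illustration):

```python
demo_env = {}
exec("x = 10", demo_env)

# exec() inserted '__builtins__' next to our own variable.
print(sorted(demo_env))  # ['__builtins__', 'x']

# Dropping dunder keys leaves only what the executed code defined.
user_vars = {k: v for k, v in demo_env.items() if not k.startswith("__")}
print(user_vars)  # {'x': 10}
```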

## Step 2: Variables accumulate in the same dict

In [None]:
# Run another exec with the SAME env dict
code2 = "y = x + 5"  # Notice: we use 'x' which was defined earlier!
exec(code2, env)

print("After second exec, env contains:")
print({k: v for k, v in env.items() if not k.startswith('__')})

In [None]:
# Run a third exec - variables keep accumulating
code3 = "z = x + y"
exec(code3, env)

print("After third exec, env contains:")
print({k: v for k, v in env.items() if not k.startswith('__')})
print(f"\nz = {env['z']}")
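Plain variables are not the only thing that accumulates. Imports and function definitions are also just names stored in the dict, so later calls can use them. A small sketch (the `area` helper and `demo_env` are hypothetical, chosen only for illustration):

```python
demo_env = {}

# Call 1: an import becomes a name in the dict.
exec("import math", demo_env)
# Call 2: a function definition that relies on that import.
exec("def area(r):\n    return math.pi * r ** 2", demo_env)
# Call 3: uses both names from the earlier calls.
exec("a = area(2)", demo_env)

print(sorted(k for k in demo_env if not k.startswith("__")))  # ['a', 'area', 'math']
print(demo_env["a"])
```

The function finds `math` at call time because its global namespace is the same dict that was passed to `exec()`.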

## Step 3: The PROBLEM - Creating a new dict each time

If we create a NEW dict for each `exec()` call, variables from earlier calls are LOST.

In [None]:
# Simulating the BROKEN behavior (current repl.py)

def execute_broken(code):
    """This is how current repl.py works - creates new dict each time"""
    env = {}  # <-- NEW dict every call!
    exec(code, env)
    return {k: v for k, v in env.items() if not k.startswith('__')}

# First call
result1 = execute_broken("x = 10")
print(f"After call 1: {result1}")

# Second call - x is GONE!
try:
    result2 = execute_broken("y = x + 5")  # This will FAIL!
    print(f"After call 2: {result2}")
except NameError as e:
    print(f"ERROR: {e}")
    print("x was lost because we created a new dict!")

## Step 4: The FIX - Persistent dict

Keep the same dict across all calls.

In [None]:
# Simulating the FIXED behavior

class REPLFixed:
    def __init__(self):
        # Create env ONCE in __init__
        self.env = {}
    
    def execute(self, code):
        """Uses the SAME self.env every time"""
        exec(code, self.env)  # <-- Same dict!
        return {k: v for k, v in self.env.items() if not k.startswith('__')}

# Create REPL instance
repl = REPLFixed()

# First call
result1 = repl.execute("x = 10")
print(f"After call 1: {result1}")

# Second call - x is STILL THERE!
result2 = repl.execute("y = x + 5")
print(f"After call 2: {result2}")

# Third call
result3 = repl.execute("z = x + y")
print(f"After call 3: {result3}")
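One consequence of sharing a dict is worth knowing: if a snippet raises partway through, assignments made before the error stay in `env`. The sketch below repeats `REPLFixed` so it runs on its own; the failing snippet and the names `repl_demo`, `undefined_name` and `leftover` are made up for the demonstration.

```python
class REPLFixed:
    def __init__(self):
        self.env = {}  # created once, shared by every execute() call

    def execute(self, code):
        exec(code, self.env)
        return {k: v for k, v in self.env.items() if not k.startswith('__')}

repl_demo = REPLFixed()

snippet = "a = 1\nb = undefined_name\nc = 3"  # second line raises NameError
try:
    repl_demo.execute(snippet)
except NameError as e:
    print(f"ERROR: {e}")

# 'a' was assigned before the error, so it persists; 'b' and 'c' were never assigned.
leftover = repl_demo.execute("")
print(leftover)  # {'a': 1}
```

Whether to keep or roll back such partial state is a design choice for the REPL; `REPLFixed` as written keeps it.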

## Step 5: Adding functions to the environment

We can pre-load functions (like `llm_query`) into the env dict.

In [None]:
# Simulating RLM REPL with functions

class REPLWithFunctions:
    def __init__(self, context):
        # Pre-load context and functions into env
        self.env = {
            'context': context,
            'llm_query': self._llm_query,
        }
    
    def _llm_query(self, prompt):
        """Fake sub-LLM for demo"""
        return f"[LLM Response to: {prompt[:50]}...]"
    
    def execute(self, code):
        exec(code, self.env)
        return {k: v for k, v in self.env.items() 
                if not k.startswith('__') and not callable(v) and k != 'context'}

# Create REPL with a document
doc = """Chapter 1: Introduction
This paper discusses RLM.

Chapter 2: Methods
We use a REPL environment.

Chapter 3: Results
RLM achieves 91% accuracy."""

repl = REPLWithFunctions(doc)
print("Initial env has: context, llm_query")
print()

In [None]:
# Iteration 1: LLM explores the context
code1 = """
lines = context.split('\\n')
print(f"Document has {len(lines)} lines")
"""
variables = repl.execute(code1)  # execute() already returns the user variables
print(f"Variables after iteration 1: {variables}")

In [None]:
# Iteration 2: LLM uses 'lines' from previous iteration
code2 = """
chapters = [l for l in lines if l.startswith('Chapter')]
print(f"Found chapters: {chapters}")
"""
variables = repl.execute(code2)  # execute() already returns the user variables
print(f"Variables after iteration 2: {variables}")

In [None]:
# Iteration 3: LLM calls sub-LLM and stores result
code3 = """
answer = llm_query("What is the accuracy mentioned in: " + context)
print(f"Sub-LLM returned: {answer}")
"""
variables = repl.execute(code3)  # execute() already returns the user variables
print(f"Variables after iteration 3: {variables}")
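In the cells above, `print()` output from the executed code goes straight to the notebook, while `execute()` returns only the variables. If the printed text should be handed back to the LLM as well, one option is to capture stdout around the `exec()` call. This is a sketch under that assumption, not a description of how `repl.py` works; `REPLWithOutput`, `repl_out` and the `(output, variables)` return shape are hypothetical.

```python
import io
from contextlib import redirect_stdout

class REPLWithOutput:
    def __init__(self):
        self.env = {}  # persistent dict, as in REPLFixed

    def execute(self, code):
        buffer = io.StringIO()
        # Anything the executed code prints is collected in the buffer.
        with redirect_stdout(buffer):
            exec(code, self.env)
        variables = {k: v for k, v in self.env.items() if not k.startswith('__')}
        return buffer.getvalue(), variables

repl_out = REPLWithOutput()
repl_out.execute("x = 10")
output, captured_vars = repl_out.execute("print(f'x is {x}')")

print(repr(output))     # 'x is 10\n'
print(captured_vars)    # {'x': 10}
```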

## Summary

| Approach | Dict Creation | Variables Persist? |
|----------|---------------|--------------------|
| Broken (current) | `env = {}` inside method | NO |
| Fixed | `self.env = {}` in `__init__` | YES |

The LLM just writes normal code like `x = 10`. It doesn't know about `env`.
Persistence comes from the REPL reusing the same dict for every `exec()` call.
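
Because the dict lives on the instance, each REPL object keeps its own state. A minimal sketch, assuming a class shaped like `REPLFixed` from Step 4 (the names `MiniREPL`, `session_a` and `session_b` are hypothetical):

```python
class MiniREPL:
    def __init__(self):
        self.env = {}  # one dict per instance

    def execute(self, code):
        exec(code, self.env)
        return {k: v for k, v in self.env.items() if not k.startswith('__')}

session_a = MiniREPL()
session_b = MiniREPL()

session_a.execute("x = 10")
vars_a = session_a.execute("y = x + 5")     # sees x from the earlier call
vars_b = session_b.execute("x = 'other'")   # separate dict, no clash with session_a

print(vars_a)  # {'x': 10, 'y': 15}
print(vars_b)  # {'x': 'other'}
```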