# Tutorial: DSPy ReAct Agent with fleet-rlm

Audience:
- Developers exploring the `fleet-rlm` interactive ReAct agent APIs.

Prerequisites:
- `uv sync --extra dev --extra interactive`
- `.env` configured with planner LM settings
- Modal auth and `LITELLM` secret configured

Learning goals:
- Build and inspect `RLMReActChatAgent`
- Run one sync turn and one streaming turn
- Inspect history/doc state and register a custom tool safely


## Outline

1. Setup and environment diagnostics
2. Instantiate agent in a context manager
3. Run synchronous chat turn
4. Run streaming chat turn (`iter_chat_turn_stream`)
5. Inspect state and documents
6. Add a custom tool (exercise scaffold)
7. Troubleshooting


In [7]:
from __future__ import annotations

from pathlib import Path
import os
import json

from fleet_rlm.runners import build_react_chat_agent
from fleet_rlm.react_agent import list_react_tool_names

DOC_PATH = Path('rlm_content/dspy-knowledge/dspy-doc.txt')
SECRET_NAME = os.getenv('FLEET_RLM_SECRET', 'LITELLM')
TIMEOUT = 900
RUN_LIVE_CALLS = True  # Live calls enabled; set to False until Modal auth and the secret are configured.

info = {
    'cwd': str(Path.cwd()),
    'doc_exists': DOC_PATH.exists(),
    'doc_path': str(DOC_PATH),
    'secret_name': SECRET_NAME,
    'run_live_calls': RUN_LIVE_CALLS,
    'has_dspy_llm_api_key_env': bool(os.getenv('DSPY_LLM_API_KEY')),
    'has_dspy_lm_model_env': bool(os.getenv('DSPY_LM_MODEL')),
}
print(json.dumps(info, indent=2))


{
  "cwd": "/Users/zocho/.codex/worktrees/396e/fleet-rlm-dspy/notebooks",
  "doc_exists": false,
  "doc_path": "rlm_content/dspy-knowledge/dspy-doc.txt",
  "secret_name": "LITELLM",
  "run_live_calls": true,
  "has_dspy_llm_api_key_env": true,
  "has_dspy_lm_model_env": true
}
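The diagnostics above report `doc_exists: false` because `DOC_PATH` is relative and the notebook was started from the `notebooks/` directory. One way to make the lookup less sensitive to the working directory is to search parent directories for the relative path. The sketch below is a plain-Python helper, not part of `fleet-rlm`. The name `find_upwards` is hypothetical, and it assumes the document lives under some ancestor of the current directory.

```python
from pathlib import Path
from typing import Optional


def find_upwards(relative: Path, start: Path) -> Optional[Path]:
    """Return the first existing `base / relative`, trying `start` and then each parent."""
    for base in (start, *start.parents):
        candidate = base / relative
        if candidate.exists():
            return candidate
    return None


# Same relative path as the setup cell; the lookup starts at the working directory.
DOC_PATH = Path('rlm_content/dspy-knowledge/dspy-doc.txt')
resolved = find_upwards(DOC_PATH, Path.cwd())
print(resolved if resolved is not None else f'not found: {DOC_PATH}')
```

If the helper returns a path, it can be passed as `docs_path` in the later cells instead of the relative `DOC_PATH`.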


## Step 1 - Build an Agent (Safe by Default)

This cell shows the exact `build_react_chat_agent(...)` usage.

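With `RUN_LIVE_CALLS=False`, the cell only prints a skip message and does not start Modal sandboxes. The outputs recorded in this tutorial come from a run with `RUN_LIVE_CALLS=True`.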
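If you prefer the toggle to stay off unless you opt in explicitly, one option is to read it from the environment. This is a minimal sketch: the helper `env_flag` and the variable name `FLEET_RLM_RUN_LIVE` are hypothetical and are not defined by `fleet-rlm`.

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment.

    An unset variable yields `default`; a set variable is True only for
    1/true/yes/on (case-insensitive) and False for anything else.
    """
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in {'1', 'true', 'yes', 'on'}


# Hypothetical variable name; live calls stay off unless it is set.
RUN_LIVE_CALLS = env_flag('FLEET_RLM_RUN_LIVE')
print('run_live_calls:', RUN_LIVE_CALLS)
```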



In [8]:
if RUN_LIVE_CALLS:
    with build_react_chat_agent(
        docs_path=DOC_PATH if DOC_PATH.exists() else None,
        react_max_iters=10,
        rlm_max_iterations=30,
        rlm_max_llm_calls=50,
        timeout=TIMEOUT,
        secret_name=SECRET_NAME,
    ) as agent:
        tool_names = list_react_tool_names(agent.react_tools)
        print(f'Loaded {len(tool_names)} tools')
        print(tool_names[:12])
else:
    print('Skipping live agent build. Set RUN_LIVE_CALLS=True to run this step.')


Loaded 13 tools
['load_document', 'set_active_document', 'list_documents', 'chunk_host', 'chunk_sandbox', 'parallel_semantic_map', 'analyze_long_document', 'summarize_long_document', 'extract_from_logs', 'read_buffer', 'clear_buffer', 'save_buffer_to_volume']


## Step 2 - Run a Synchronous Turn

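`chat_turn(...)` returns a dictionary with `assistant_response`, `trajectory`, and `history_turns`.

In the recorded run, `DOC_PATH` did not exist, so `docs_path` was `None` and no document was loaded. That is the likely reason the agent replies that it cannot find any project documentation.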



In [9]:
if RUN_LIVE_CALLS:
    with build_react_chat_agent(
        docs_path=DOC_PATH if DOC_PATH.exists() else None,
        timeout=TIMEOUT,
        secret_name=SECRET_NAME,
    ) as agent:
        result = agent.chat_turn('Summarize what this project does in 3 bullet points.')
        print('assistant_response:')
        print()
        print(result['assistant_response'])
        print()
        print('meta:')
        print({
            'history_turns': result.get('history_turns'),
            'trajectory_keys': list((result.get('trajectory') or {}).keys())[:10],
        })
else:
    print('Skipping sync turn. Toggle RUN_LIVE_CALLS=True once your environment is configured.')


assistant_response:

I am unable to summarize the project at this time because I cannot find a documentation file (such as `README.md` or `project.txt`) in the current environment. If you can provide the text or specify the correct file path, I would be happy to summarize it for you in 3 bullet points.

meta:
{'history_turns': 1, 'trajectory_keys': ['thought_0', 'tool_name_0', 'tool_args_0', 'observation_0', 'thought_1', 'tool_name_1', 'tool_args_1', 'observation_1', 'thought_2', 'tool_name_2']}
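The `trajectory` keys in the output above are flat and carry a numeric suffix (`thought_0`, `tool_name_0`, `tool_args_0`, `observation_0`, ...). Grouping them by step makes a run easier to read. The helper below is a sketch that relies only on that naming pattern. `group_trajectory` is a hypothetical name, and the `sample` values are invented for illustration, not taken from a real run.

```python
import re
from collections import defaultdict

# Matches keys of the form `<field>_<n>`, for example `tool_name_0`.
_STEP_KEY = re.compile(r'^(?P<field>.+)_(?P<index>\d+)$')


def group_trajectory(trajectory: dict) -> list:
    """Group flat `<field>_<n>` keys into one dict per step, ordered by n."""
    steps = defaultdict(dict)
    for key, value in trajectory.items():
        match = _STEP_KEY.match(key)
        if match is None:
            continue  # Ignore keys that do not follow the `<field>_<n>` pattern.
        steps[int(match['index'])][match['field']] = value
    return [steps[i] for i in sorted(steps)]


# Invented sample data that mimics the key layout shown above.
sample = {
    'thought_0': 'Check which documents are loaded.',
    'tool_name_0': 'list_documents',
    'tool_args_0': {},
    'observation_0': {'documents': []},
    'thought_1': 'Nothing is loaded, so report that.',
}
for index, step in enumerate(group_trajectory(sample)):
    print(index, step.get('tool_name'), '-', step.get('thought'))
```

With a live result you would call `group_trajectory(result.get('trajectory') or {})`.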


## Step 3 - Run a Streaming Turn

This demonstrates `iter_chat_turn_stream(...)` and collects:
- assistant tokens
- status lines
- reasoning steps (when `trace=True`)
- tool timeline + final payload
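The bucketing described in the list above can also be written as a reusable function. This sketch runs without Modal: `FakeEvent` is a stand-in that models only the `kind`, `text`, and `payload` attributes used in this tutorial, and `collect_stream` is a hypothetical helper, not a `fleet-rlm` API.

```python
from dataclasses import dataclass, field
from typing import Iterable


@dataclass
class FakeEvent:
    """Stand-in for a stream event; not the real fleet-rlm event class."""
    kind: str
    text: str = ''
    payload: dict = field(default_factory=dict)


TERMINAL_KINDS = {'final', 'cancelled', 'error'}
TEXT_KINDS = ('assistant_token', 'status', 'reasoning_step')


def collect_stream(events: Iterable) -> dict:
    """Bucket events by kind and remember the last terminal event seen."""
    buckets = {kind: [] for kind in TEXT_KINDS}
    tools = []
    terminal = None
    for event in events:
        if event.kind in TEXT_KINDS:
            buckets[event.kind].append(event.text)
        elif event.kind in {'tool_call', 'tool_result'}:
            tools.append(event.text)
        elif event.kind in TERMINAL_KINDS:
            terminal = event
    return {
        'text': ''.join(buckets['assistant_token']).strip(),
        'status': buckets['status'],
        'reasoning': buckets['reasoning_step'],
        'tools': tools,
        'terminal_kind': terminal.kind if terminal is not None else None,
    }


# Invented events, for illustration only.
demo = [
    FakeEvent('status', 'starting'),
    FakeEvent('assistant_token', 'Hel'),
    FakeEvent('assistant_token', 'lo '),
    FakeEvent('tool_call', 'list_documents'),
    FakeEvent('final', payload={'history_turns': 1}),
]
print(collect_stream(demo))
```

Because the function only reads `kind` and `text`, the iterator returned by `agent.iter_chat_turn_stream(...)` could be passed in directly, assuming its events expose those attributes as they do in this tutorial.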



In [10]:
if RUN_LIVE_CALLS:
    with build_react_chat_agent(
        docs_path=DOC_PATH if DOC_PATH.exists() else None,
        timeout=TIMEOUT,
        secret_name=SECRET_NAME,
    ) as agent:
        assembled_tokens = []
        statuses = []
        reasoning = []
        tools = []
        terminal_event = None

        for event in agent.iter_chat_turn_stream(
            'List the key interactive commands and what each does.',
            trace=True,
        ):
            if event.kind == 'assistant_token':
                assembled_tokens.append(event.text)
            elif event.kind == 'status':
                statuses.append(event.text)
            elif event.kind == 'reasoning_step':
                reasoning.append(event.text)
            elif event.kind in {'tool_call', 'tool_result'}:
                tools.append(event.text)
            elif event.kind in {'final', 'cancelled', 'error'}:
                terminal_event = event

        final_text = ''.join(assembled_tokens).strip()
        print('assistant_response:')
        print()
        print(final_text)
        print()
        print('summary:')
        print({
            'status_count': len(statuses),
            'reasoning_count': len(reasoning),
            'tool_events': tools[:10],
            'terminal_kind': terminal_event.kind if terminal_event else None,
            'history_turns': (terminal_event.payload.get('history_turns') if terminal_event else None),
        })
else:
    print('Skipping streaming turn. Toggle RUN_LIVE_CALLS=True to stream real events.')


assistant_response:

Here are the key interactive commands available for managing documents and data processing:

### **Document Management**
*   **`load_document`**: Loads a text document from the host filesystem into the agent's document memory.
*   **`set_active_document`**: Sets a specific loaded document as the "active" one for subsequent operations.
*   **`list_documents`**: Lists all currently loaded document aliases and identifies the active document.
*   **`load_text_from_volume`**: Loads text from a persistent Modal Volume into the host-side document memory.

### **Text Processing & Chunking**
*   **`chunk_host`**: Splits a document on the host using strategies like size, headers, timestamps, or JSON keys.
*   **`chunk_sandbox`**: Chunks the active document inside the sandbox environment and stores the results in a buffer.
*   **`parallel_semantic_map`**: Runs parallel semantic analysis over text chunks using batched LLM queries.

### **Analysis & Extraction**
*   **`analyze_

## Step 4 - State and Memory Inspection

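You can inspect in-memory chat state and loaded documents directly from the agent API. Each cell in this tutorial builds a fresh agent, so the snapshot below starts with an empty history, and no document is loaded because `DOC_PATH` did not exist in the recorded run.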



In [11]:
if RUN_LIVE_CALLS:
    with build_react_chat_agent(
        docs_path=DOC_PATH if DOC_PATH.exists() else None,
        timeout=TIMEOUT,
        secret_name=SECRET_NAME,
    ) as agent:
        if DOC_PATH.exists():
            _ = agent.load_document(str(DOC_PATH), alias='active')

        snapshot = {
            'history_messages': len(agent.history.messages),
            'documents': agent.list_documents(),
        }
        print(json.dumps(snapshot, indent=2, default=str))
else:
    print('No live state to inspect yet. Toggle RUN_LIVE_CALLS=True when ready.')


{
  "history_messages": 0,
  "documents": {
    "documents": [],
    "active_alias": null,
    "cache_size": 0,
    "cache_limit": 100
  }
}


## Exercise

1. Register a tiny custom tool that echoes a topic string.
2. Ask the agent to call that tool in a short request.
3. Inspect `history.messages` after the turn.



In [12]:
def echo_topic(topic: str) -> dict[str, str]:
    return {'topic': topic, 'note': 'custom tool demo'}

if RUN_LIVE_CALLS:
    with build_react_chat_agent(
        docs_path=DOC_PATH if DOC_PATH.exists() else None,
        timeout=TIMEOUT,
        secret_name=SECRET_NAME,
    ) as agent:
        print(agent.register_extra_tool(echo_topic))
        out = agent.chat_turn('Use the echo_topic tool with topic=agent-observability.')
        print(out['assistant_response'])
else:
    print('Exercise scaffold ready. Set RUN_LIVE_CALLS=True to execute.')


{'status': 'ok', 'tool_name': 'echo_topic'}
I have successfully executed the `echo_topic` tool with the topic "agent-observability". The tool returned a confirmation note indicating this was part of a custom tool demo.
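For the "safely" part of the learning goals, one reasonable pattern is to validate input inside the tool and report problems in the return value, so the agent sees a failure as an ordinary observation rather than an exception. The sketch below is a variant of `echo_topic`. The name `echo_topic_checked`, the `status`/`error` keys, and the length limit are illustrative assumptions, not `fleet-rlm` requirements. A docstring and type hints are included because DSPy generally uses them to describe a tool to the model.

```python
MAX_TOPIC_CHARS = 200  # Arbitrary illustrative limit, not a fleet-rlm setting.


def echo_topic_checked(topic: str) -> dict:
    """Echo a short topic string back to the caller.

    Invalid input is reported in the returned dict instead of raising.
    """
    if not isinstance(topic, str):
        return {'status': 'error', 'error': 'topic must be a string'}
    cleaned = topic.strip()
    if not cleaned:
        return {'status': 'error', 'error': 'topic must not be empty'}
    if len(cleaned) > MAX_TOPIC_CHARS:
        return {'status': 'error', 'error': f'topic exceeds {MAX_TOPIC_CHARS} characters'}
    return {'status': 'ok', 'topic': cleaned, 'note': 'custom tool demo'}


print(echo_topic_checked('  agent-observability  '))
print(echo_topic_checked(''))
```

It can be registered the same way as in the exercise cell, with `agent.register_extra_tool(echo_topic_checked)`.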


## Pitfalls / Troubleshooting

- If Modal calls fail, run `uv run modal setup` and verify auth.
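- If LLM access fails, run `uv run fleet-rlm check-secret` and confirm that the `LITELLM` secret contains the expected keys.
- If document loading fails, confirm that `DOC_PATH` exists relative to the current working directory. In the recorded run the notebook started in `notebooks/`, and the relative path did not resolve (`doc_exists: false`).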
- If streaming is unavailable for your backend, `chat_turn_stream` falls back to non-streaming behavior.
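
Before working through the list above, a quick local check can rule out missing planner settings. This sketch tests the two environment variables that the setup cell already reports on (`DSPY_LM_MODEL` and `DSPY_LLM_API_KEY`). Treating both as required is an assumption, and `missing_env` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional, Sequence

# The two variables inspected by the setup cell; assumed here to be required.
REQUIRED_ENV = ('DSPY_LM_MODEL', 'DSPY_LLM_API_KEY')


def missing_env(
    names: Sequence[str] = REQUIRED_ENV,
    env: Optional[Mapping[str, str]] = None,
) -> list:
    """Return the names that are unset or empty in `env` (default: os.environ)."""
    source = os.environ if env is None else env
    return [name for name in names if not source.get(name)]


missing = missing_env()
print('missing planner settings:', missing or 'none')
```

Passing an explicit `env` mapping makes the check easy to exercise without touching the real environment.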

