# Research Agent Example

This notebook demonstrates a research agent that uses multiple capabilities to perform
complex, multi-step research tasks requiring many agentic cycles.

## Capabilities Used

1. **stateless_todo_list** - Task management for organizing research steps
2. **web_fetch** - Fetching and analyzing web content
3. **session_file_system** - Storing research notes and findings

## Research Task

The agent will research **WebAssembly (WASM) runtimes**, comparing different implementations,
their features, and use cases. This task requires:

- Planning the research approach (todo list)
- Fetching information from multiple sources
- Taking organized notes (file system)
- Synthesizing findings into a final report

## Prerequisites

- Everruns API server running at `http://localhost:9000`
- Python packages: `requests`

In [None]:
# Install required packages
!pip install requests

In [None]:
import requests
import json
import time

# Configuration
BASE_URL = "http://localhost:9000"
API_V1 = f"{BASE_URL}/v1"

# Model to use for the agent (will be resolved to UUID)
MODEL_NAME = "gpt-5"

# Timeout for waiting on agent responses (research tasks may take longer)
RESPONSE_TIMEOUT = 300  # 5 minutes

def print_json(data):
    """Pretty print JSON data."""
    print(json.dumps(data, indent=2, default=str))

def find_model_id(model_name: str) -> str:
    """Find the UUID of a model by its name from the llm-models API."""
    response = requests.get(f"{API_V1}/llm-models")
    response.raise_for_status()
    models = response.json()
    
    for model in models:
        if model.get("model_id") == model_name:
            return model["id"]
    
    raise ValueError(f"Model '{model_name}' not found. Available models: {[m.get('model_id') for m in models]}")

# Resolve model name to UUID
model_id = find_model_id(MODEL_NAME)
print(f"Resolved model '{MODEL_NAME}' to UUID: {model_id}")

## 1. Create Research Agent with Capabilities

Create an agent with a system prompt tailored for research tasks and enable
the capabilities needed for research:
- `stateless_todo_list` - For tracking research tasks
- `web_fetch` - For gathering information from the web
- `session_file_system` - For storing notes and reports
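
Before sending the request, it can help to validate the capability list locally so that a typo fails fast rather than at the API. The sketch below assumes the three capability names used in this notebook are the only valid ones (the server may support others); `build_agent_payload` and `KNOWN_CAPABILITIES` are hypothetical names introduced for illustration, not part of the Everruns API.

```python
# Capability names taken from this notebook; the server may support others.
KNOWN_CAPABILITIES = {"stateless_todo_list", "web_fetch", "session_file_system"}

def build_agent_payload(name: str, system_prompt: str, capabilities: list) -> dict:
    """Hypothetical helper: build the JSON body for POST /v1/agents."""
    unknown = sorted(set(capabilities) - KNOWN_CAPABILITIES)
    if unknown:
        raise ValueError(f"Unknown capabilities: {unknown}")
    return {
        "name": name,
        "system_prompt": system_prompt,
        "capabilities": list(capabilities),
    }

payload = build_agent_payload(
    "Research Agent",
    "You are an expert research analyst.",
    ["stateless_todo_list", "web_fetch", "session_file_system"],
)
print(payload["capabilities"])
```

The returned dictionary has the same shape as the `agent_data` body used in the next cell, minus the optional `description` and `tags` fields.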

In [None]:
RESEARCH_SYSTEM_PROMPT = """
You are an expert research analyst. Your role is to conduct thorough research on
technical topics, gathering information from multiple sources and synthesizing
findings into clear, well-organized reports.

## Research Methodology

1. **Plan First**: Break down the research topic into specific questions and create
   a task list to track your progress.

2. **Gather Information**: Fetch content from authoritative sources. Look for:
   - Official documentation and project pages
   - Technical blog posts and articles
   - Comparison guides and benchmarks

3. **Take Notes**: Save key findings to files as you research. Organize notes by
   subtopic for easy reference later.

4. **Synthesize**: Combine findings into a coherent analysis. Compare and contrast
   different sources. Identify patterns and draw conclusions.

5. **Report**: Create a final report with:
   - Executive summary
   - Detailed findings for each research question
   - Recommendations based on analysis
   - References to sources used

## Quality Standards

- Always cite your sources
- Distinguish between facts and opinions
- Note any limitations or gaps in available information
- Update your task list as you progress

## File Organization

Use this structure for organizing research:
```
/research/
  notes/        - Raw notes from each source
  analysis/     - Your analysis and comparisons
  report.md     - Final synthesized report
```
"""

# Create agent with capabilities in a single request
agent_data = {
    "name": "Research Agent",
    "description": "An agent specialized in conducting thorough technical research with organized note-taking",
    "system_prompt": RESEARCH_SYSTEM_PROMPT,
    "tags": ["research", "example", "multi-capability"],
    "capabilities": [
        "stateless_todo_list",
        "web_fetch",
        "session_file_system"
    ]
}

response = requests.post(f"{API_V1}/agents", json=agent_data)
response.raise_for_status()
agent = response.json()
agent_id = agent["id"]
print(f"Created agent: {agent_id}")
print(f"Name: {agent['name']}")
print(f"Capabilities: {agent_data['capabilities']}")

## 2. Create Research Session

Start a new session for the research task.

In [None]:
session_data = {
    "title": "WebAssembly Runtimes Research"
}

response = requests.post(f"{API_V1}/agents/{agent_id}/sessions", json=session_data)
response.raise_for_status()
session = response.json()

session_id = session["id"]
print(f"Created session: {session_id}")
print(f"Title: {session['title']}")

## 3. Event Streaming Helper

Define a helper that polls the session's events endpoint and prints agent activity as new events arrive.
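
The core of the polling approach is deduplicating events by their `sequence` number and stopping on a terminal event. The sketch below isolates that logic from the HTTP call so it can be run without a server. `poll_events` and its `fetch` parameter are hypothetical names introduced here; the event fields and event type names are the ones this notebook relies on.

```python
import time
from typing import Callable, Optional

# Terminal event names as used in this notebook.
TERMINAL_EVENTS = {"session.completed", "session.failed"}

def poll_events(fetch: Callable[[], list], timeout: float = 300.0,
                interval: float = 0.5) -> tuple:
    """Collect unseen events until a terminal event arrives or the deadline passes.

    `fetch` is a hypothetical callable returning the full event list on each
    call (for example, a wrapper around the GET .../events request).
    Returns (new_events_in_arrival_order, terminal_event_type_or_None).
    """
    seen = set()
    collected = []
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        for event in fetch():
            seq = event["sequence"]
            if seq in seen:
                continue  # already handled in an earlier poll
            seen.add(seq)
            collected.append(event)
            if event["event_type"] in TERMINAL_EVENTS:
                return collected, event["event_type"]
        time.sleep(interval)
    return collected, None

# Simulated server: each poll returns the full event history so far.
history = [
    {"sequence": 1, "event_type": "message.tool_call"},
    {"sequence": 2, "event_type": "message.agent"},
    {"sequence": 3, "event_type": "session.completed"},
]
calls = {"n": 0}

def fake_fetch():
    calls["n"] += 1
    return history[: calls["n"]]

events, outcome = poll_events(fake_fetch, timeout=5, interval=0)
print([e["sequence"] for e in events], outcome)
```

The sketch uses `time.monotonic()` for the deadline because it is unaffected by system clock adjustments during a long-running research task.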

In [None]:
def read_until_complete(agent_id: str, session_id: str, timeout: int = RESPONSE_TIMEOUT):
    """Poll events until the session completes, fails, or the timeout elapses.
    
    Prints each new event as it arrives. Returns the text of the last agent
    message on completion, or None if the session fails or times out.
    """
    url = f"{API_V1}/agents/{agent_id}/sessions/{session_id}/events"
    seen_sequences = set()
    start_time = time.time()
    last_agent_response = None
    
    while time.time() - start_time < timeout:
        response = requests.get(url, timeout=30)  # avoid hanging past RESPONSE_TIMEOUT on a stalled request
        response.raise_for_status()
        events = response.json().get("data", [])
        
        for event in events:
            seq = event["sequence"]
            if seq in seen_sequences:
                continue
            seen_sequences.add(seq)
            
            event_type = event["event_type"]
            data = event["data"]
            
            print(f"[{seq}] {event_type}")
            
            # Print full event data for every event
            print(json.dumps(data, indent=2))
            print()
            
            # For message.agent, capture response text
            if event_type == "message.agent":
                content = data.get("content", [])
                text_parts = [p["text"] for p in content if p.get("type") == "text"]
                last_agent_response = "\n".join(text_parts)
            elif event_type == "session.completed":
                return last_agent_response
            elif event_type == "session.failed":
                return None
        
        time.sleep(0.5)
    
    print("Timeout waiting for response")
    return None

## 4. Send Research Task

Give the agent a comprehensive research task that will exercise all capabilities.
This task should require multiple agentic cycles:

1. **Planning** - Agent creates a todo list
2. **Web Research** - Multiple web fetches to gather information
3. **Note Taking** - Writing findings to files
4. **Synthesis** - Combining notes into a final report
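
The notebook builds the same message body twice (once for the research task and once for the follow-up). A small helper can keep the two in sync. This is a sketch only: `build_user_message` is a hypothetical name, the body shape is copied from the cells in this notebook, and the model UUID in the usage line is a placeholder.

```python
def build_user_message(text: str, model_id: str) -> dict:
    """Hypothetical helper mirroring the message body used in this notebook."""
    if not text.strip():
        raise ValueError("message text must not be empty")
    return {
        "message": {
            "role": "user",
            "content": [{"type": "text", "text": text}],
        },
        "controls": {"model_id": model_id},
    }

# "example-model-uuid" is a placeholder, not a real model ID.
body = build_user_message("Research WebAssembly runtimes.", "example-model-uuid")
print(body["message"]["content"][0]["text"])
```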

In [None]:
RESEARCH_TASK = """
Research WebAssembly (WASM) runtimes and create a comparison report.

## Specific Questions to Answer

1. **What are the major standalone WASM runtimes?**
   - List at least 3 major runtimes (e.g., Wasmtime, Wasmer, WasmEdge)
   - Note their primary use cases

2. **How do they compare on key features?**
   - Language support (what languages can compile to WASM)
   - WASI support level
   - Performance characteristics
   - Embedding capabilities (using WASM in other applications)

3. **What are the trade-offs?**
   - When would you choose one over another?
   - What are the limitations of each?

## Deliverables

1. Create a task list to track your research progress
2. Take notes on each runtime in separate files
3. Create a final report at /research/report.md with:
   - Executive summary
   - Comparison table
   - Recommendations by use case
   - Sources referenced

Please be thorough but concise. Focus on practical information that would help
a developer choose the right runtime for their project.
"""

message_data = {
    "message": {
        "role": "user",
        "content": [
            {"type": "text", "text": RESEARCH_TASK}
        ]
    },
    "controls": {
        "model_id": model_id
    }
}

print(f"Sending research task to agent (model: {MODEL_NAME}, id: {model_id})...")
print("="*50)

response = requests.post(
    f"{API_V1}/agents/{agent_id}/sessions/{session_id}/messages",
    json=message_data
)
response.raise_for_status()
print(f"Message sent: {response.json()['id']}")
print("\nWaiting for agent to complete research...")
print("(This may take several minutes as the agent performs multiple cycles)")
print("="*50)

In [None]:
# Poll events and display agent activity
print("Polling for events...\n")
agent_response = read_until_complete(agent_id, session_id)

if agent_response:
    print("=" * 40)
    print("FINAL AGENT RESPONSE:")
    print(agent_response)
else:
    print("No response received")

## 5. Review Research Artifacts

Ask the agent to list the files it created during research and to show the contents of the final report.

In [None]:
# List all files created by the agent
# We'll make a follow-up request to ask the agent to show the file structure

list_files_message = {
    "message": {
        "role": "user",
        "content": [
            {"type": "text", "text": "Please list all the files you created and show me the contents of the final report."}
        ]
    },
    "controls": {
        "model_id": model_id
    }
}

print("Asking agent to show created files...")
print("="*50)

response = requests.post(
    f"{API_V1}/agents/{agent_id}/sessions/{session_id}/messages",
    json=list_files_message
)
response.raise_for_status()

print("Polling for events...\n")
agent_response = read_until_complete(agent_id, session_id)

if agent_response:
    print("=" * 40)
    print("AGENT RESPONSE:")
    print(agent_response)

## 6. Analyze Tool Usage

Review how the agent used each capability.
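
Tool calls can also be tallied per capability rather than per tool, which gives a quicker view of how the three capabilities were exercised. The sketch below uses `collections.Counter` and assumes the `message.tool_call` event shape that this notebook reads; `count_by_capability` and `tool_call_event` are hypothetical helpers, and the sample events are invented for illustration.

```python
from collections import Counter

# Tool-to-capability mapping as listed in this notebook (subset).
TOOL_CAPABILITY = {
    "write_todos": "stateless_todo_list",
    "web_fetch": "web_fetch",
    "read_file": "session_file_system",
    "write_file": "session_file_system",
    "list_directory": "session_file_system",
}

def count_by_capability(events: list) -> Counter:
    """Tally tool calls per capability, assuming this notebook's event shape."""
    counts = Counter()
    for event in events:
        if event.get("event_type") != "message.tool_call":
            continue
        parts = event.get("data", {}).get("message", {}).get("content", [])
        for part in parts:
            if part.get("type") == "tool_call":
                counts[TOOL_CAPABILITY.get(part.get("name"), "unknown")] += 1
    return counts

def tool_call_event(*names):
    """Build a sample event for illustration only."""
    content = [{"type": "tool_call", "name": n} for n in names]
    return {"event_type": "message.tool_call", "data": {"message": {"content": content}}}

sample = [
    tool_call_event("write_todos"),
    tool_call_event("web_fetch", "web_fetch"),
    tool_call_event("write_file"),
    {"event_type": "message.agent", "data": {"content": []}},
]
print(count_by_capability(sample).most_common())
```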

In [None]:
# Get all events for analysis
response = requests.get(f"{API_V1}/agents/{agent_id}/sessions/{session_id}/events")
response.raise_for_status()
all_events = response.json().get("data", [])

# Analyze tool usage
tool_usage = {}
for event in all_events:
    if event["event_type"] == "message.tool_call":
        content = event["data"].get("message", {}).get("content", [])
        for part in content:
            if part.get("type") == "tool_call":
                tool_name = part.get("name")
                tool_usage[tool_name] = tool_usage.get(tool_name, 0) + 1

print("Tool Usage Summary:")
print("-" * 30)
for tool, count in sorted(tool_usage.items(), key=lambda x: -x[1]):
    capability = {
        "write_todos": "stateless_todo_list",
        "web_fetch": "web_fetch",
        "read_file": "session_file_system",
        "write_file": "session_file_system",
        "list_directory": "session_file_system",
        "grep_files": "session_file_system",
        "delete_file": "session_file_system",
        "stat_file": "session_file_system",
    }.get(tool, "unknown")
    print(f"  {tool}: {count} calls (capability: {capability})")

print(f"\nTotal tool calls: {sum(tool_usage.values())}")

# Count session.completed events
session_completed = [e for e in all_events if e["event_type"] == "session.completed"]
print(f"Session completions: {len(session_completed)}")

## 7. Cleanup

Delete the session and archive the agent.
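
If an earlier cell raises, the cleanup cell may never run and the session and agent are left behind. One way to guard against that in a script is to register the cleanup steps up front with `contextlib.ExitStack`. The sketch below is an assumption about how this could be structured, not how the notebook does it: `run_with_cleanup`, `work`, and `delete` are hypothetical, with `delete` standing in for a wrapper around `requests.delete`, and the URLs are placeholders.

```python
from contextlib import ExitStack

def run_with_cleanup(work, delete, agent_url: str, session_url: str):
    """Run `work()` and always call `delete` on both resources afterwards.

    `work` and `delete` are hypothetical callables; in the notebook, `delete`
    would wrap `requests.delete` and `work` would send the task and poll events.
    """
    with ExitStack() as stack:
        # Callbacks run in reverse order of registration: session first, then agent.
        stack.callback(delete, agent_url)
        stack.callback(delete, session_url)
        return work()

deleted = []

def failing_work():
    raise RuntimeError("research failed")

try:
    run_with_cleanup(failing_work, deleted.append,
                     "/v1/agents/a1", "/v1/agents/a1/sessions/s1")
except RuntimeError as exc:
    print(f"work failed: {exc}")

print(deleted)
```

Both cleanup callbacks run even though `failing_work` raises, and the original exception still propagates to the caller.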

In [None]:
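# Delete session
response = requests.delete(f"{API_V1}/agents/{agent_id}/sessions/{session_id}")
response.raise_for_status()  # report success only if the server accepted the request
print(f"Deleted session: {session_id}")

# Archive agent
response = requests.delete(f"{API_V1}/agents/{agent_id}")
response.raise_for_status()
print(f"Archived agent: {agent_id}")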

## Summary

This example demonstrated a research agent using three capabilities:

### Capabilities Used

| Capability | Tools | Purpose |
|------------|-------|--------|
| `stateless_todo_list` | `write_todos` | Track research progress and tasks |
| `web_fetch` | `web_fetch` | Gather information from web sources |
| `session_file_system` | `read_file`, `write_file`, `list_directory`, etc. | Store notes and create reports |

### Research Workflow

1. **Planning Phase** - Agent creates a structured task list
2. **Information Gathering** - Multiple web fetches to official docs and resources
3. **Note Taking** - Findings saved to organized files
4. **Synthesis** - Notes combined into a comprehensive report
5. **Delivery** - Final report with executive summary and recommendations

### Key Observations

- The agent autonomously chose which sources to research
- Task management helped maintain focus and track progress
- File system allowed persistent note-taking across iterations
- Multiple agentic cycles were needed to complete the comprehensive task

### API Workflow

```
1. POST /v1/agents                             → Create agent with capabilities
2. POST /v1/agents/{id}/sessions               → Create session
3. POST /v1/agents/{id}/sessions/{id}/messages → Send research task (with model in controls)
4. GET /v1/agents/{id}/sessions/{id}/events    → Poll for progress/results
```
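
The four endpoints above can be assembled in one place so the URL templates are not repeated across cells. This is a minimal sketch: `session_urls` is a hypothetical helper, the paths are the ones listed in this notebook, and the IDs in the usage line are placeholders.

```python
def session_urls(base_url: str, agent_id: str, session_id: str) -> dict:
    """Hypothetical helper: build the endpoint URLs used in this notebook."""
    api = f"{base_url.rstrip('/')}/v1"
    session = f"{api}/agents/{agent_id}/sessions/{session_id}"
    return {
        "agents": f"{api}/agents",                         # POST: create agent
        "sessions": f"{api}/agents/{agent_id}/sessions",   # POST: create session
        "messages": f"{session}/messages",                 # POST: send message
        "events": f"{session}/events",                     # GET: poll events
    }

# "a1" and "s1" are placeholder IDs.
urls = session_urls("http://localhost:9000/", "a1", "s1")
print(urls["events"])
```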