## LangGraph Open Deep Research - Supervisor-Researcher Architecture

In this notebook, we'll explore the **supervisor-researcher delegation architecture** for conducting deep research with LangGraph.

You can visit this repository to see the original application: [Open Deep Research](https://github.com/langchain-ai/open_deep_research)

Let's jump in!

## What We're Building

This implementation uses a **hierarchical delegation pattern** where:

1. **User Clarification** - Optionally asks clarifying questions to understand the research scope
2. **Research Brief Generation** - Transforms user messages into a structured research brief
3. **Supervisor** - A lead researcher that analyzes the brief and delegates research tasks
4. **Parallel Researchers** - Multiple sub-agents that conduct focused research simultaneously
5. **Research Compression** - Each researcher synthesizes their findings
6. **Final Report** - All findings are combined into a comprehensive report

![Architecture Diagram](https://i.imgur.com/Q8HEZn0.png)

This differs from a section-based approach by allowing dynamic task decomposition based on the research question, rather than predefined sections.

---

# ü§ù Breakout Room #1
## Deep Research Foundations

In this breakout room, we'll understand the architecture and components of the Open Deep Research system.

## Task 1: Dependencies

You'll need API keys for Anthropic (for the LLM) and Tavily (for web search). We'll configure the system to use Anthropic's Claude Sonnet 4 exclusively.

In [1]:
import os
import getpass

os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter your Anthropic API key: ")
os.environ["TAVILY_API_KEY"] = getpass.getpass("Enter your Tavily API key: ")

## Task 2: State Definitions

The state structure is hierarchical with three levels:

### Agent State (Top Level)
Contains the overall conversation messages, research brief, accumulated notes, and final report.

### Supervisor State (Middle Level)
Manages the research supervisor's messages, research iterations, and coordinating parallel researchers.

### Researcher State (Bottom Level)
Each individual researcher has their own message history, tool call iterations, and research findings.

We also have structured outputs for tool calling:
- **ConductResearch** - Tool for supervisor to delegate research to a sub-agent
- **ResearchComplete** - Tool to signal research phase is done
- **ClarifyWithUser** - Structured output for asking clarifying questions
- **ResearchQuestion** - Structured output for the research brief

Let's import these from our library: [`open_deep_library/state.py`](open_deep_library/state.py)

In [2]:
# Import state definitions from the library
from open_deep_library.state import (
    # Main workflow states
    AgentState,           # Lines 65-72: Top-level agent state with messages, research_brief, notes, final_report
    AgentInputState,      # Lines 62-63: Input state is just messages
    
    # Supervisor states
    SupervisorState,      # Lines 74-81: Supervisor manages research delegation and iterations
    
    # Researcher states
    ResearcherState,      # Lines 83-90: Individual researcher with messages and tool iterations
    ResearcherOutputState, # Lines 92-96: Output from researcher (compressed research + raw notes)
    
    # Structured outputs for tool calling
    ConductResearch,      # Lines 15-19: Tool for delegating research to sub-agents
    ResearchComplete,     # Lines 21-22: Tool to signal research completion
    ClarifyWithUser,      # Lines 30-41: Structured output for user clarification
    ResearchQuestion,     # Lines 43-48: Structured output for research brief
)

## Task 3: Utility Functions and Tools

The system uses several key utilities:

### Search Tools
- **tavily_search** - Async web search with automatic summarization to stay within token limits
- Supports Anthropic native web search and Tavily API

### Reflection Tools
- **think_tool** - Allows researchers to reflect on their progress and plan next steps (ReAct pattern)

### Helper Utilities
- **get_all_tools** - Assembles the complete toolkit (search + MCP + reflection)
- **get_today_str** - Provides current date context for research
- Token limit handling utilities for graceful degradation

These are defined in [`open_deep_library/utils.py`](open_deep_library/utils.py)

In [16]:
!pip install langchain-mcp-adapters
!pip install open-deep-research

Collecting open-deep-research
  Downloading open_deep_research-0.0.16-py3-none-any.whl.metadata (9.8 kB)
Collecting langchain-tavily (from open-deep-research)
  Downloading langchain_tavily-0.2.17-py3-none-any.whl.metadata (20 kB)
Collecting langchain-groq>=0.2.4 (from open-deep-research)
  Downloading langchain_groq-1.1.2-py3-none-any.whl.metadata (2.4 kB)
Collecting arxiv>=2.1.3 (from open-deep-research)
  Downloading arxiv-2.4.0-py3-none-any.whl.metadata (6.3 kB)
Collecting pymupdf>=1.25.3 (from open-deep-research)
  Downloading pymupdf-1.26.7-cp310-abi3-win_amd64.whl.metadata (3.4 kB)
Collecting xmltodict>=0.14.2 (from open-deep-research)
  Downloading xmltodict-1.0.2-py3-none-any.whl.metadata (15 kB)
Collecting linkup-sdk>=0.2.3 (from open-deep-research)
  Downloading linkup_sdk-0.10.0-py3-none-any.whl.metadata (8.0 kB)
Collecting exa-py>=1.8.8 (from open-deep-research)
  Downloading exa_py-2.4.0-py3-none-any.whl.metadata (3.4 kB)
Collecting beautifulsoup4==4.13.3 (from open-deep-

  DEPRECATION: Building 'forbiddenfruit' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'forbiddenfruit'. Discussion can be found at https://github.com/pypa/pip/issues/6334
  DEPRECATION: Building 'sgmllib3k' using the legacy setup.py bdist_wheel mechanism, which will be removed in a future version. pip 25.3 will enforce this behaviour change. A possible replacement is to use the standardized build interface by setting the `--use-pep517` option, (possibly combined with `--no-build-isolation`), or adding a `pyproject.toml` file to the source tree of 'sgmllib3k'. Discussion can be found at https://github.com/pypa/pip/issues/6334
  You can safely remove it manually.
  You can safely remove it ma

In [17]:
from pathlib import Path

start = Path(r"c:\MyWorkspace\Assignments\AIE9") # adjust if needed
matches = []

for p in start.rglob("open_deep_research"):
    if p.is_dir():
        matches.append(p)

print("Found:", len(matches))
for m in matches[:20]:
    print(m)


Found: 0


In [18]:
import sys
from pathlib import Path

ROOT = Path.cwd()
SRC = ROOT / "src"
if SRC.exists():
    sys.path.insert(0, str(SRC))
else:
    sys.path.insert(0, str(ROOT))

print("sys.path[0] =", sys.path[0])

from open_deep_library.utils import tavily_search, think_tool
print("‚úÖ imports ok")

sys.path[0] = c:\MyWorkspace\Assignments\AIE9\08_Open_DeepResearch
‚úÖ imports ok


In [19]:
# ============================================================
# OPEN DEEP RESEARCH ‚Äî SAFE IMPORTS (works even if open_deep_research is missing)
# Paste + run this cell.
# ============================================================

import sys
from pathlib import Path

# Ensure current folder is importable (contains open_deep_library/)
ROOT = Path.cwd()
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

# Import ONLY symbols that do NOT require open_deep_research to exist.
# (Your env confirms tavily_search / think_tool import successfully.)
from open_deep_library.utils import (
    tavily_search,
    think_tool,
    get_today_str,
    get_api_key_for_model,
    is_token_limit_exceeded,
    get_model_token_limit,
    remove_up_to_last_ai_message,
    anthropic_websearch_called,
    openai_websearch_called,
    get_notes_from_tool_calls,
)

print("‚úÖ Safe imports loaded (open_deep_research not required).")

# Optional: try importing get_all_tools (will fail if it depends on open_deep_research)
try:
    from open_deep_library.utils import get_all_tools
    print("‚úÖ get_all_tools imported successfully.")
except Exception as e:
    print("‚ö†Ô∏è get_all_tools NOT available in this environment.")
    print(" Reason:", repr(e))
    print(" Fix: restore/install the missing 'open_deep_research' package OR patch utils.py to make it optional.")


‚úÖ Safe imports loaded (open_deep_research not required).
‚úÖ get_all_tools imported successfully.


## Task 4: Configuration System

The configuration system controls:

### Research Behavior
- **allow_clarification** - Whether to ask clarifying questions before research
- **max_concurrent_research_units** - How many parallel researchers can run (default: 5)
- **max_researcher_iterations** - How many times supervisor can delegate research (default: 6)
- **max_react_tool_calls** - Tool call limit per researcher (default: 10)

### Model Configuration
- **research_model** - Model for research and supervision (we'll use Anthropic)
- **compression_model** - Model for synthesizing findings
- **final_report_model** - Model for writing the final report
- **summarization_model** - Model for summarizing web search results

### Search Configuration
- **search_api** - Which search API to use (ANTHROPIC, TAVILY, or NONE)
- **max_content_length** - Character limit before summarization

Defined in [`open_deep_library/configuration.py`](open_deep_library/configuration.py)

In [20]:
# Import configuration from the library
from open_deep_library.configuration import (
    Configuration,    # Lines 38-247: Main configuration class with all settings
    SearchAPI,        # Lines 11-17: Enum for search API options (ANTHROPIC, TAVILY, NONE)
)

## Task 5: Prompt Templates

The system uses carefully engineered prompts for each phase:

### Phase 1: Clarification
**clarify_with_user_instructions** - Analyzes if the research scope is clear or needs clarification

### Phase 2: Research Brief
**transform_messages_into_research_topic_prompt** - Converts user messages into a detailed research brief

### Phase 3: Supervisor
**lead_researcher_prompt** - System prompt for the supervisor that manages delegation strategy

### Phase 4: Researcher
**research_system_prompt** - System prompt for individual researchers conducting focused research

### Phase 5: Compression
**compress_research_system_prompt** - Prompt for synthesizing research findings without losing information

### Phase 6: Final Report
**final_report_generation_prompt** - Comprehensive prompt for writing the final report

All prompts are defined in [`open_deep_library/prompts.py`](open_deep_library/prompts.py)

In [21]:
# Import prompt templates from the library
from open_deep_library.prompts import (
    clarify_with_user_instructions,                    # Lines 3-41: Ask clarifying questions
    transform_messages_into_research_topic_prompt,     # Lines 44-77: Generate research brief
    lead_researcher_prompt,                            # Lines 79-136: Supervisor system prompt
    research_system_prompt,                            # Lines 138-183: Researcher system prompt
    compress_research_system_prompt,                   # Lines 186-222: Research compression prompt
    final_report_generation_prompt,                    # Lines 228-308: Final report generation
)

## ‚ùì Question #1:

Explain the interrelationships between the three states (Agent, Supervisor, Researcher). Why don't we just make a single huge state?

##### Answer:
How the states relate

Think of the system as one main agent that delegates to a supervisor, which in turn fans out work to many researchers.

AgentState (top-level / ‚Äúproduct state‚Äù)

Holds the user-facing conversation (messages) and final outputs (final_report)

Also holds the shared research artifacts that flow across phases: research_brief, notes, raw_notes, plus supervisor_messages

Runs the main phases: clarify ‚Üí write brief ‚Üí run supervisor ‚Üí write final report


1. SupervisorState (coordination state)

A subset focused on managing delegation and aggregation:

a. supervisor_messages (the supervisor‚Äôs working thread)

b. research_brief, notes, raw_notes

c. research_iterations


It decides what research units to run (via ConductResearch) and turns researcher outputs into notes for the final report.


2. ResearcherState (worker state per research unit)

One instance per delegated research topic.

Tracks the researcher‚Äôs own tool loop state:

a. researcher_messages

b. tool_call_iterations

c. research_topic

d. compressed_research

e. raw_notes


Outputs a compact payload (compressed_research + notes) back to the supervisor.



Why not a single huge state?

1. Because each level has different responsibilities, different lifetimes, and different reducers, and mixing them creates real problems:

2. Separation of concerns: Agent/Supervisor/Researcher need different fields. One giant state becomes a ‚Äújunk drawer‚Äù of fields most nodes don‚Äôt need.

3. Prevents state collisions: You already have overlapping concepts (messages, notes). Splitting avoids confusing updates (e.g., researcher_messages vs supervisor_messages).

4. Smaller context = better performance: A huge state increases what gets carried around and summarized. That worsens ‚Äúcontext rot‚Äù and can slow the system down.

5. Cleaner reducers + safer updates: Your code uses custom reducers (like override_reducer) for some fields. Different graphs can apply the right reducer logic without side effects.

6. Better scalability + parallelism: Supervisor can spawn many researchers concurrently. Separate ResearcherState instances keep each worker isolated.

7. Easier debugging/testing: You can unit test supervisor behavior and researcher behavior independently.


### In short: one huge state is harder to maintain, easier to break, and more expensive to run.



## ‚ùì Question #2:

What are the advantages and disadvantages of importing these components instead of including them in the notebook?

##### Answer:
Advantages (why imports are good)

1. Modularity & reuse: You can reuse state.py, prompts, and graph construction across notebooks/apps.

2. Maintainability: Changes are localized (fix a prompt or state once, everywhere benefits).

3. Testability: You can write unit tests for state reducers, router logic, researcher compression, etc.

4. Cleaner notebook: The notebook focuses on how to use the system, not drowning in implementation detail.

5. Production-ready structure: Matches real-world packaging (pyproject, library folder, versioning).

Disadvantages (what you lose)

1. Less transparent for learning: Readers can‚Äôt see everything in one place; they must jump between files.

2. More setup friction: Imports/path issues (editable installs, working directory problems) can break notebooks.

3. Harder to quick-edit: Prototyping is slightly slower if you constantly hop between modules.



## üèóÔ∏è Activity #1: Explore the Prompts

Open `open_deep_library/prompts.py` and examine one of the prompt templates in detail.

**Requirements:**
1. Choose one prompt template (clarify, brief, supervisor, researcher, compression, or final report)
2. Explain what the prompt is designed to accomplish
3. Identify 2-3 key techniques used in the prompt (e.g., structured output, role definition, examples)
4. Suggest one improvement you might make to the prompt

**YOUR CODE HERE** - Write your analysis in a markdown cell below

Prompt chosen: lead_researcher_prompt (Supervisor prompt)

1) What the prompt is designed to accomplish

This prompt turns the model into a research supervisor whose main job is to:

plan the research approach

delegate research to sub-agents via the ConductResearch tool

review progress after each research round

stop at the right time by calling ResearchComplete


In other words, it‚Äôs not meant to ‚Äúwrite the report.‚Äù It‚Äôs meant to manage the research process efficiently and decide when the research is ‚Äúgood enough‚Äù to move to final writing.

2) 2‚Äì3 key techniques used in the prompt

A) Strong role definition + constraints

It clearly defines the role: ‚ÄúYou are a research supervisor.‚Äù

It narrows the allowed actions to a small set of tools (ConductResearch, ResearchComplete, think_tool) so the agent behaves like a coordinator, not a free-form writer.


B) Tool-first workflow with explicit control rules

It mandates using think_tool before and after each ConductResearch call.

It explicitly forbids parallel think_tool calls (‚ÄúDo not call think_tool with any other tools in parallel‚Äù), which reduces messy tool traces and keeps the loop predictable.


C) Structured sections and step-by-step procedure

The prompt uses labeled sections like <Task>, <Available Tools>, <Instructions>.

It provides a numbered procedure that reduces ambiguity and improves consistency across runs.


3) One improvement I would make

Add a clear completion rubric so the supervisor knows exactly when to stop research, for example:

‚ÄúStop when you have: (1) 5‚Äì8 high-quality sources, (2) coverage of all sub-questions, (3) at least one counterpoint or limitation, (4) a short evidence-backed outline.‚Äù


---

# ü§ù Breakout Room #2
## Building & Running the Researcher

In this breakout room, we'll explore the node functions, build the graph, and run wellness research.

## Task 6: Node Functions - The Building Blocks

Now let's look at the node functions that make up our graph. We'll import them from the library and understand what each does.

### The Complete Research Workflow

The workflow consists of 8 key nodes organized into 3 subgraphs:

1. **Main Graph Nodes:**
   - `clarify_with_user` - Entry point that checks if clarification is needed
   - `write_research_brief` - Transforms user input into structured research brief
   - `final_report_generation` - Synthesizes all research into final report

2. **Supervisor Subgraph Nodes:**
   - `supervisor` - Lead researcher that plans and delegates
   - `supervisor_tools` - Executes supervisor's tool calls (delegation, reflection)

3. **Researcher Subgraph Nodes:**
   - `researcher` - Individual researcher conducting focused research
   - `researcher_tools` - Executes researcher's tool calls (search, reflection)
   - `compress_research` - Synthesizes researcher's findings

All nodes are defined in [`open_deep_library/deep_researcher.py`](open_deep_library/deep_researcher.py)

### Node 1: clarify_with_user

**Purpose:** Analyzes user messages and asks clarifying questions if the research scope is unclear.

**Key Steps:**
1. Check if clarification is enabled in configuration
2. Use structured output to analyze if clarification is needed
3. If needed, end with a clarifying question for the user
4. If not needed, proceed to research brief with verification message

**Implementation:** [`open_deep_library/deep_researcher.py` lines 60-115](open_deep_library/deep_researcher.py#L60-L115)

In [23]:
# Import the clarify_with_user node
from open_deep_library.deep_researcher import clarify_with_user

### Node 2: write_research_brief

**Purpose:** Transforms user messages into a structured research brief for the supervisor.

**Key Steps:**
1. Use structured output to generate detailed research brief from messages
2. Initialize supervisor with system prompt and research brief
3. Set up supervisor messages with proper context

**Why this matters:** A well-structured research brief helps the supervisor make better delegation decisions.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 118-175](open_deep_library/deep_researcher.py#L118-L175)

In [7]:
# Import the write_research_brief node
from open_deep_library.deep_researcher import write_research_brief

### Node 3: supervisor

**Purpose:** Lead research supervisor that plans research strategy and delegates to sub-researchers.

**Key Steps:**
1. Configure model with three tools:
   - `ConductResearch` - Delegate research to a sub-agent
   - `ResearchComplete` - Signal that research is done
   - `think_tool` - Strategic reflection before decisions
2. Generate response based on current context
3. Increment research iteration count
4. Proceed to tool execution

**Decision Making:** The supervisor uses `think_tool` to reflect before delegating research, ensuring thoughtful decomposition of the research question.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 178-223](open_deep_library/deep_researcher.py#L178-L223)

In [24]:
# Import the supervisor node (from supervisor subgraph)
from open_deep_library.deep_researcher import supervisor

### Node 4: supervisor_tools

**Purpose:** Executes the supervisor's tool calls, including strategic thinking and research delegation.

**Key Steps:**
1. Check exit conditions:
   - Exceeded maximum iterations
   - No tool calls made
   - `ResearchComplete` called
2. Process `think_tool` calls for strategic reflection
3. Execute `ConductResearch` calls in parallel:
   - Spawn researcher subgraphs for each delegation
   - Limit to `max_concurrent_research_units` (default: 5)
   - Gather all results asynchronously
4. Aggregate findings and return to supervisor

**Parallel Execution:** This is where the magic happens - multiple researchers work simultaneously on different aspects of the research question.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 225-349](open_deep_library/deep_researcher.py#L225-L349)

In [25]:
# Import the supervisor_tools node
from open_deep_library.deep_researcher import supervisor_tools

### Node 5: researcher

**Purpose:** Individual researcher that conducts focused research on a specific topic.

**Key Steps:**
1. Load all available tools (search, MCP, reflection)
2. Configure model with tools and researcher system prompt
3. Generate response with tool calls
4. Increment tool call iteration count

**ReAct Pattern:** Researchers use `think_tool` to reflect after each search, deciding whether to continue or provide their answer.

**Available Tools:**
- Search tools (Tavily or Anthropic native search)
- `think_tool` for strategic reflection
- `ResearchComplete` to signal completion
- MCP tools (if configured)

**Implementation:** [`open_deep_library/deep_researcher.py` lines 365-424](open_deep_library/deep_researcher.py#L365-L424)

In [26]:
# Import the researcher node (from researcher subgraph)
from open_deep_library.deep_researcher import researcher

### Node 6: researcher_tools

**Purpose:** Executes the researcher's tool calls, including searches and strategic reflection.

**Key Steps:**
1. Check early exit conditions (no tool calls, native search used)
2. Execute all tool calls in parallel:
   - Search tools fetch and summarize web content
   - `think_tool` records strategic reflections
   - MCP tools execute external integrations
3. Check late exit conditions:
   - Exceeded `max_react_tool_calls` (default: 10)
   - `ResearchComplete` called
4. Continue research loop or proceed to compression

**Error Handling:** Safely handles tool execution errors and continues with available results.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 435-509](open_deep_library/deep_researcher.py#L435-L509)

In [27]:
# Import the researcher_tools node
from open_deep_library.deep_researcher import researcher_tools

### Node 7: compress_research

**Purpose:** Compresses and synthesizes research findings into a concise, structured summary.

**Key Steps:**
1. Configure compression model
2. Add compression instruction to messages
3. Attempt compression with retry logic:
   - If token limit exceeded, remove older messages
   - Retry up to 3 times
4. Extract raw notes from tool and AI messages
5. Return compressed research and raw notes

**Why Compression?** Researchers may accumulate lots of tool outputs and reflections. Compression ensures:
- All important information is preserved
- Redundant information is deduplicated
- Content stays within token limits for the final report

**Token Limit Handling:** Gracefully handles token limit errors by progressively truncating messages.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 511-585](open_deep_library/deep_researcher.py#L511-L585)

In [28]:
# Import the compress_research node
from open_deep_library.deep_researcher import compress_research

### Node 8: final_report_generation

**Purpose:** Generates the final comprehensive research report from all collected findings.

**Key Steps:**
1. Extract all notes from completed research
2. Configure final report model
3. Attempt report generation with retry logic:
   - If token limit exceeded, truncate findings by 10%
   - Retry up to 3 times
4. Return final report or error message

**Token Limit Strategy:**
- First retry: Use model's token limit √ó 4 as character limit
- Subsequent retries: Reduce by 10% each time
- Graceful degradation with helpful error messages

**Report Quality:** The prompt guides the model to create well-structured reports with:
- Proper headings and sections
- Inline citations
- Comprehensive coverage of all findings
- Sources section at the end

**Implementation:** [`open_deep_library/deep_researcher.py` lines 607-697](open_deep_library/deep_researcher.py#L607-L697)

In [29]:
# Import the final_report_generation node
from open_deep_library.deep_researcher import final_report_generation

## Task 7: Graph Construction - Putting It All Together

The system is organized into three interconnected graphs:

### 1. Researcher Subgraph (Bottom Level)
Handles individual focused research on a specific topic:
```
START ‚Üí researcher ‚Üí researcher_tools ‚Üí compress_research ‚Üí END
               ‚Üë            ‚Üì
               ‚îî‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îò (loops until max iterations or ResearchComplete)
```

### 2. Supervisor Subgraph (Middle Level)
Manages research delegation and coordination:
```
START ‚Üí supervisor ‚Üí supervisor_tools ‚Üí END
            ‚Üë              ‚Üì
            ‚îî‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îÄ‚îò (loops until max iterations or ResearchComplete)
            
supervisor_tools spawns multiple researcher_subgraphs in parallel
```

### 3. Main Deep Researcher Graph (Top Level)
Orchestrates the complete research workflow:
```
START ‚Üí clarify_with_user ‚Üí write_research_brief ‚Üí research_supervisor ‚Üí final_report_generation ‚Üí END
                 ‚Üì                                       (supervisor_subgraph)
               (may end early if clarification needed)
```

Let's import the compiled graphs from the library.

In [31]:
# Import the pre-compiled graphs from the library
from open_deep_library.deep_researcher import (
    # Bottom level: Individual researcher workflow
    researcher_subgraph,    # Lines 588-605: researcher ‚Üí researcher_tools ‚Üí compress_research
    
    # Middle level: Supervisor coordination
    supervisor_subgraph,    # Lines 351-363: supervisor ‚Üí supervisor_tools (spawns researchers)
    
    # Top level: Complete research workflow
    deep_researcher,        # Lines 699-719: Main graph with all phases
)

## Why This Architecture?

### Advantages of Supervisor-Researcher Delegation

1. **Dynamic Task Decomposition**
   - Unlike section-based approaches with predefined structure, the supervisor can break down research based on the actual question
   - Adapts to different types of research (comparisons, lists, deep dives, etc.)

2. **Parallel Execution**
   - Multiple researchers work simultaneously on different aspects
   - Much faster than sequential section processing
   - Configurable parallelism (1-20 concurrent researchers)

3. **ReAct Pattern for Quality**
   - Researchers use `think_tool` to reflect after each search
   - Prevents excessive searching and improves search quality
   - Natural stopping conditions based on information sufficiency

4. **Flexible Tool Integration**
   - Easy to add MCP tools for specialized research
   - Supports multiple search APIs (Anthropic, Tavily)
   - Each researcher can use different tool combinations

5. **Graceful Token Limit Handling**
   - Compression prevents token overflow
   - Progressive truncation in final report generation
   - Research can scale to arbitrary depths

### Trade-offs

- **Complexity:** More moving parts than section-based approach
- **Cost:** Parallel researchers use more tokens (but faster)
- **Unpredictability:** Research structure emerges dynamically

## Task 8: Running the Deep Researcher

Now let's see the system in action! We'll use it to research wellness strategies for improving sleep quality.

### Setup

We need to:
1. Set up the wellness research request
2. Configure the execution with Anthropic settings
3. Run the research workflow

In [32]:
# Set up the graph with Anthropic configuration
from IPython.display import Markdown, display
import uuid

# Note: deep_researcher is already compiled from the library
# For this demo, we'll use it directly without additional checkpointing
graph = deep_researcher

print("‚úì Graph ready for execution")
print("  (Note: The graph is pre-compiled from the library)")

‚úì Graph ready for execution
  (Note: The graph is pre-compiled from the library)


### Configuration for Anthropic

We'll configure the system to use:
- **Claude Sonnet 4** for all research, supervision, and report generation
- **Tavily** for web search (you can also use Anthropic's native search)
- **Moderate parallelism** (1 concurrent researcher for cost control)
- **Clarification enabled** (will ask if research scope is unclear)

In [33]:
# Configure for Anthropic with moderate settings
config = {
    "configurable": {
        # Model configuration - using Claude Sonnet 4 for everything
        "research_model": "anthropic:claude-sonnet-4-20250514",
        "research_model_max_tokens": 10000,
        
        "compression_model": "anthropic:claude-sonnet-4-20250514",
        "compression_model_max_tokens": 8192,
        
        "final_report_model": "anthropic:claude-sonnet-4-20250514",
        "final_report_model_max_tokens": 10000,
        
        "summarization_model": "anthropic:claude-sonnet-4-20250514",
        "summarization_model_max_tokens": 8192,
        
        # Research behavior
        "allow_clarification": True,
        "max_concurrent_research_units": 1,  # 1 parallel researcher
        "max_researcher_iterations": 2,      # Supervisor can delegate up to 2 times
        "max_react_tool_calls": 3,           # Each researcher can make up to 3 tool calls
        
        # Search configuration
        "search_api": "tavily",  # Using Tavily for web search
        "max_content_length": 50000,
        
        # Thread ID for this conversation
        "thread_id": str(uuid.uuid4())
    }
}

print("‚úì Configuration ready")
print(f"  - Research Model: Claude Sonnet 4")
print(f"  - Max Concurrent Researchers: 1")
print(f"  - Max Iterations: 2")
print(f"  - Search API: Tavily")

‚úì Configuration ready
  - Research Model: Claude Sonnet 4
  - Max Concurrent Researchers: 1
  - Max Iterations: 2
  - Search API: Tavily


### Execute the Wellness Research

Now let's run the research! We'll ask the system to research evidence-based strategies for improving sleep quality.

The workflow will:
1. **Clarify** - Check if the request is clear (may skip if obvious)
2. **Research Brief** - Transform our request into a structured brief
3. **Supervisor** - Plan research strategy and delegate to researchers
4. **Parallel Research** - Researchers gather information simultaneously
5. **Compression** - Each researcher synthesizes their findings
6. **Final Report** - All findings combined into comprehensive report

In [34]:
# Create our wellness research request
research_request = """
I want to improve my sleep quality. I currently:
- Go to bed at inconsistent times (10pm-1am)
- Use my phone in bed
- Often feel tired in the morning

Please research the best evidence-based strategies for improving sleep quality and create a comprehensive sleep improvement plan for me.
"""

# Execute the graph
async def run_research():
    """Run the research workflow and display results."""
    print("Starting research workflow...\n")
    
    async for event in graph.astream(
        {"messages": [{"role": "user", "content": research_request}]},
        config,
        stream_mode="updates"
    ):
        # Display each step
        for node_name, node_output in event.items():
            print(f"\n{'='*60}")
            print(f"Node: {node_name}")
            print(f"{'='*60}")
            
            if node_name == "clarify_with_user":
                if "messages" in node_output:
                    last_msg = node_output["messages"][-1]
                    print(f"\n{last_msg.content}")
            
            elif node_name == "write_research_brief":
                if "research_brief" in node_output:
                    print(f"\nResearch Brief Generated:")
                    print(f"{node_output['research_brief'][:500]}...")
            
            elif node_name == "supervisor":
                print(f"\nSupervisor planning research strategy...")
                if "supervisor_messages" in node_output:
                    last_msg = node_output["supervisor_messages"][-1]
                    if hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
                        print(f"Tool calls: {len(last_msg.tool_calls)}")
                        for tc in last_msg.tool_calls:
                            print(f"  - {tc['name']}")
            
            elif node_name == "supervisor_tools":
                print(f"\nExecuting supervisor's tool calls...")
                if "notes" in node_output:
                    print(f"Research notes collected: {len(node_output['notes'])}")
            
            elif node_name == "final_report_generation":
                if "final_report" in node_output:
                    print(f"\n" + "="*60)
                    print("FINAL REPORT GENERATED")
                    print("="*60 + "\n")
                    display(Markdown(node_output["final_report"]))
    
    print("\n" + "="*60)
    print("Research workflow completed!")
    print("="*60)

# Run the research
await run_research()

Starting research workflow...


Node: clarify_with_user

I have sufficient information to proceed with your sleep improvement research request. I understand you're looking for evidence-based strategies to address your current sleep challenges, which include inconsistent bedtimes (10pm-1am), phone use in bed, and morning fatigue. I'll research the best scientific approaches to sleep hygiene and create a comprehensive, personalized sleep improvement plan that addresses these specific issues. I'll begin the research process now.

Node: write_research_brief

Research Brief Generated:
I want to improve my sleep quality and need a comprehensive, evidence-based sleep improvement plan. My current sleep challenges include: going to bed at inconsistent times (ranging from 10pm to 1am), using my phone in bed, and often feeling tired in the morning despite getting sleep. Please research the most effective, scientifically-backed strategies for improving sleep quality that specifically address incon

# Evidence-Based Sleep Improvement Plan: Addressing Inconsistent Bedtimes, Screen Use, and Morning Fatigue

Unfortunately, the research conducted for this comprehensive sleep improvement plan encountered technical limitations that prevented access to peer-reviewed studies and official clinical guidelines from major sleep organizations. However, based on established sleep science principles and clinical practice standards, the following evidence-based recommendations address the specific challenges of inconsistent bedtime schedules, screen time before bed, and morning fatigue.

## Sleep Hygiene Foundation

Optimal sleep hygiene forms the cornerstone of sleep quality improvement. Core practices include maintaining a cool bedroom temperature between 60-67¬∞F (15-19¬∞C), ensuring complete darkness through blackout curtains or eye masks, and minimizing noise disruption. The sleep environment should be reserved exclusively for sleep and intimate activities to strengthen the mental association between the bedroom and sleep.

Regular physical activity, preferably completed at least 4-6 hours before bedtime, significantly improves sleep quality and reduces the time needed to fall asleep. However, vigorous exercise close to bedtime can be stimulating and should be avoided. Similarly, caffeine consumption should be limited after 2 PM, as caffeine has a half-life of 5-6 hours and can interfere with sleep initiation and depth even when consumed earlier in the day.

## Establishing Consistent Sleep Timing

The current 3-hour variation in bedtime (10 PM to 1 AM) represents a significant circadian rhythm disruption that requires systematic correction. Circadian rhythm entrainment relies on consistent light-dark cycles and regular sleep-wake timing to optimize melatonin production and core body temperature fluctuations.

To establish consistency, gradually shift bedtime by 15-30 minutes earlier each night until reaching the target bedtime. This process should take 1-2 weeks to avoid shocking the circadian system. Once the desired bedtime is achieved, maintain it within a 30-minute window every night, including weekends, to prevent "social jet lag."

Morning light exposure within the first hour of waking serves as the most powerful circadian anchor. Spend 15-30 minutes outdoors or near a bright window immediately upon waking, even on cloudy days. This light exposure suppresses residual melatonin and signals the circadian clock to maintain proper sleep-wake timing.

## Screen Time and Blue Light Management

Electronic device usage before bedtime disrupts sleep through multiple mechanisms: blue light exposure suppresses melatonin production, cognitive stimulation increases arousal, and the content consumed can trigger emotional responses that interfere with the relaxation necessary for sleep onset.

Implement a digital sunset by ceasing all screen use 1-2 hours before the target bedtime. This includes smartphones, tablets, computers, and television. If complete avoidance is impossible, utilize blue light filtering glasses or device settings that reduce blue light emission after sunset. However, these measures are less effective than complete screen avoidance.

Create a charging station outside the bedroom to eliminate the temptation to use devices in bed. Replace bedtime phone use with relaxing activities such as reading physical books, gentle stretching, meditation, or journaling. These alternative activities promote the relaxation response needed for quality sleep.

## Addressing Morning Fatigue

Persistent morning fatigue despite adequate sleep duration often indicates poor sleep quality rather than insufficient sleep quantity. Sleep architecture consists of multiple stages, including deep sleep (slow-wave sleep) and REM sleep, both crucial for feeling refreshed upon waking.

Sleep inertia, the groggy feeling upon waking, can be minimized by avoiding sleep fragmentation and ensuring complete sleep cycles. A typical sleep cycle lasts 90-110 minutes, so timing total sleep to align with complete cycles (7.5 or 9 hours rather than 8 hours) may reduce morning grogginess.

Consistent wake times, even more important than consistent bedtimes, help regulate circadian rhythms and reduce morning fatigue. Set an alarm for the same time every day, including weekends, and resist the urge to hit the snooze button, which fragments sleep and increases grogginess.

## Pre-Sleep Routine Development

Establish a consistent 30-60 minute wind-down routine that signals to the body that sleep is approaching. This routine should be relaxing and the same each night to create a conditioned response. Effective activities include taking a warm bath or shower (which facilitates the drop in core body temperature needed for sleep), practicing progressive muscle relaxation, or engaging in gentle yoga or stretching.

Temperature regulation plays a crucial role in sleep onset. The body needs to drop its core temperature by 1-2 degrees Fahrenheit to initiate sleep. A warm bath 90 minutes before bedtime facilitates this temperature drop through vasodilation when exiting the warm water.

## Advanced Sleep Optimization Strategies

Sleep restriction therapy can improve sleep efficiency by limiting time in bed to match actual sleep time, then gradually increasing as sleep quality improves. This technique should be implemented carefully and may benefit from professional guidance.

Stimulus control involves strengthening the association between the bedroom and sleep by following specific rules: only go to bed when sleepy, leave the bed if unable to fall asleep within 15-20 minutes, and return only when sleepy again. This prevents the bed from becoming associated with wakefulness and frustration.

Relaxation techniques such as progressive muscle relaxation, diaphragmatic breathing, and mindfulness meditation can reduce pre-sleep arousal and racing thoughts that interfere with sleep onset. These techniques require practice to become effective but can significantly improve sleep quality over time.

## Implementation Strategy

Begin implementing changes gradually to avoid overwhelming existing routines. Start with establishing a consistent bedtime and eliminating screen use in the bedroom, as these address the most significant current sleep disruptors. Once these habits are established (typically 2-3 weeks), add additional sleep hygiene practices and optimization strategies.

Track sleep patterns and quality using a sleep diary to identify which interventions provide the greatest benefit. Note bedtime, wake time, estimated sleep onset time, number of awakenings, and morning energy levels to objectively assess progress.

Consider professional evaluation if morning fatigue persists despite implementing these strategies, as underlying sleep disorders such as sleep apnea, restless leg syndrome, or other medical conditions may require specific treatment beyond behavioral interventions.

### Sources

No sources were successfully retrieved during this research session due to technical limitations. This comprehensive plan is based on established sleep science principles and clinical practice standards, but peer-reviewed sources from sleep medicine journals and official sleep organization guidelines could not be accessed as initially intended.


Research workflow completed!


## Task 9: Understanding the Output

Let's break down what happened:

### Phase 1: Clarification
The system checked if your request was clear. Since you provided specific details about your sleep issues, it likely proceeded without asking clarifying questions.

### Phase 2: Research Brief
Your request was transformed into a detailed research brief that guides the supervisor's delegation strategy.

### Phase 3: Supervisor Delegation
The supervisor analyzed the brief and decided how to break down the research:
- Used `think_tool` to plan strategy
- Called `ConductResearch` to delegate to researchers
- Each delegation specified a focused research topic (e.g., sleep hygiene, circadian rhythm, blue light effects)

### Phase 4: Parallel Research
Researchers worked on their assigned topics:
- Each researcher used web search tools to gather information
- Used `think_tool` to reflect after each search
- Decided when they had enough information
- Compressed their findings into clean summaries

### Phase 5: Final Report
All research findings were synthesized into a comprehensive sleep improvement plan with:
- Well-structured sections
- Evidence-based recommendations
- Practical action items
- Sources for further reading

## Task 10: Key Takeaways & Next Steps

### Architecture Benefits
1. **Dynamic Decomposition** - Research structure emerges from the question, not predefined
2. **Parallel Efficiency** - Multiple researchers work simultaneously
3. **ReAct Quality** - Strategic reflection improves search decisions
4. **Scalability** - Handles token limits gracefully through compression
5. **Flexibility** - Easy to add new tools and capabilities

### When to Use This Pattern
- **Complex research questions** that need multi-angle investigation
- **Comparison tasks** where parallel research on different topics is beneficial
- **Open-ended exploration** where structure should emerge dynamically
- **Time-sensitive research** where parallel execution speeds up results

### When to Use Section-Based Instead
- **Highly structured reports** with predefined format requirements
- **Template-based content** where sections are always the same
- **Sequential dependencies** where later sections depend on earlier ones
- **Budget constraints** where token efficiency is critical

### Extend the System
1. **Add MCP Tools** - Integrate specialized tools for your domain
2. **Custom Prompts** - Modify prompts for specific research types
3. **Different Models** - Try different Claude versions or mix models
4. **Persistence** - Use a real database for checkpointing instead of memory

### Learn More
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Open Deep Research Repo](https://github.com/langchain-ai/open_deep_research)
- [Anthropic Claude Documentation](https://docs.anthropic.com/)
- [Tavily Search API](https://tavily.com/)

## ‚ùì Question #3:

What are the trade-offs of using parallel researchers vs. sequential research? When might you choose one approach over the other?

##### Answer:
Parallel research (many researchers at once)

## Pros

1. Much faster wall-clock time (great when you need breadth quickly)

2. Better coverage: different angles/sources in one round

3. Works well when tasks are independent (e.g., ‚Äúbenefits‚Äù, ‚Äúrisks‚Äù, ‚Äúbest practices‚Äù, ‚Äúlatest research‚Äù)


## Cons

1. Higher cost (multiple model calls + tools)

2. More noisy/duplicative outputs; harder to merge

3. More risk of inconsistent assumptions or contradictions

4. Harder to manage context: more notes to summarize (‚Äúcontext rot‚Äù)



Sequential research (one researcher/round at a time)

## Pros

1. Cheaper and easier to control

2. Each step can build on the last (refine questions, fill gaps)

3. Cleaner, less duplication, simpler aggregation


## Cons

1. Slower end-to-end

2. Can miss breadth unless you do many rounds



When to choose which

## Choose parallel when:

1. you have a fixed deadline / need speed

2. you want broad coverage across subtopics

3. sources are independent (no need for iterative refinement)

## Choose sequential when:

1. budget is tight

2. the question is ambiguous and needs iterative narrowing

3. you need high reliability and careful synthesis (fewer moving parts)


## ‚ùì Question #4:

How would you adapt this deep research architecture for a production wellness application? What additional components would you need?

##### Answer:
To productionize it for wellness, you‚Äôd keep the same roles (Agent ‚Üí Supervisor ‚Üí Researchers) but add safety, persistence, and operations.

1) Safety guardrails (critical)

Medical safety policy: no diagnosis, no medication changes, escalation rules for red flags

Contraindication checks using user profile (allergies, meds, conditions)

Tool gating: restrict web search, filter sources, block untrusted sites

Safety evaluations + red teaming, plus automated ‚Äúunsafe advice‚Äù detection


2) Persistent storage + user isolation

Store user profile, preferences, and conditions in a real DB (Postgres)

Store daily check-ins, plans, and summaries in durable storage (S3/object store)

Namespace everything by tenant_id/user_id to prevent leakage

Keep an audit trail of what was recommended and why (for trust + debugging)


3) Context management at scale

Retrieval over curated health content (your KB) + optionally vetted web

Summarization layers:

user snapshot (goals, constraints, current plan)

rolling weekly summary


Don‚Äôt pass full history; pass structured state + retrieved snippets


4) Observability + monitoring

Tracing per run: which researchers fired, tool calls, timings, failures

Metrics: latency, token usage, cost per request, cache hit rate, safety event counts

Quality monitoring: user feedback, outcome tracking (did they follow plan? improvements?)


5) Cost controls

Budget limits per request (max tool calls, max researchers, max tokens)

Adaptive parallelism (only fan out when needed)

Caching (semantic cache for repeated queries, cached retrieval results)

Model tiering: cheaper models for extraction/formatting, stronger for supervision/safety synthesis


6) Product features

Personalization engine using stored preferences + constraints

Scheduling/reminders (optional) and progress tracking

Human-in-the-loop escalation for high-risk or ambiguous cases


Net: the research architecture is a great backbone, but production requires guardrails, persistence, isolation, observability, and cost governance on top.




## üèóÔ∏è Activity #2: Custom Wellness Research

Using what you've learned, run a custom wellness research task.

**Requirements:**
1. Create a wellness-related research question (exercise, nutrition, stress, etc.)
2. Modify the configuration for your use case
3. Run the research and analyze the output
4. Document what worked well and what could be improved

**Experiment ideas:**
- Research exercise routines for specific conditions (bad knee, lower back pain)
- Compare different stress management techniques
- Investigate nutrition strategies for specific goals
- Explore meditation and mindfulness research

**YOUR CODE HERE**

In [3]:
# ============================================================
# Final Activity #2 ‚Äî Custom Wellness Research (Jupyter-safe: top-level await)
# ============================================================

from pathlib import Path
from datetime import datetime

RESEARCH_QUESTION = (
    "Compare evidence-based stress management techniques for working professionals. "
    "Focus on 5 strategies: (1) mindfulness meditation, (2) paced breathing, "
    "(3) exercise, (4) CBT-based journaling/reframing, (5) sleep hygiene. "
    "For each: mechanism, how to do it, time required, who it works best for, limitations."
)

CFG = {
    "search_rounds": 3,
    "results_per_round": 5,
    "save_dir": Path("workspace") / "custom_research",
    "report_filename": "stress_management_comparison.md",
}

CFG["save_dir"].mkdir(parents=True, exist_ok=True)
today = get_today_str() if "get_today_str" in globals() else datetime.now().strftime("%Y-%m-%d")

def save_md(path: Path, content: str) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(content, encoding="utf-8")

# ============================================================
# One-shot setup: import tools + run think_tool plan
# ============================================================

import sys
from pathlib import Path

# Ensure current folder is importable (contains open_deep_library/)
ROOT = Path.cwd()
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

# Safe imports (does NOT require open_deep_research to exist)
from open_deep_library.utils import (
    think_tool,
    get_today_str,
)

# ---- Define your research question ----
RESEARCH_QUESTION = (
    "Compare evidence-based stress management techniques for working professionals. "
    "Focus on 5 strategies: mindfulness meditation, paced breathing, exercise, "
    "CBT-based journaling/reframing, and sleep hygiene."
)

# ---- Run plan (think_tool expects 'reflection') ----
plan = think_tool.invoke({
    "reflection": (
        f"Task: {RESEARCH_QUESTION}\n\n"
        "Create a short research plan:\n"
        "- Sub-questions to answer\n"
        "- Keywords to search\n"
        "- Evidence to prioritize (guidelines, systematic reviews, RCTs)\n"
        "- Final report outline\n"
        "Keep it concise."
    )
})

print("‚úÖ Research plan created.")
print(plan)


# 2) Async Tavily search rounds (tavily_search is async-only)
async def run_search_rounds():
    all_findings = []
    for i in range(1, CFG["search_rounds"] + 1):
        q = RESEARCH_QUESTION
        if i == 2:
            q += " systematic review meta-analysis"
        if i == 3:
            q += " randomized trial workplace stress"

        res = await tavily_search.ainvoke({
            "queries": [q],
            "max_results": CFG["results_per_round"],
        })
        all_findings.append(res)
        print(f"‚úÖ Search round {i}/{CFG['search_rounds']} complete.")
    return all_findings

all_findings = await run_search_rounds()

# 3) Synthesis
synthesis = think_tool.invoke({
    "reflection": (
        "Using the research notes below, write a structured markdown report with:\n"
        "1) Executive summary\n"
        "2) Comparison table (strategy, time, difficulty, evidence strength, best-for)\n"
        "3) 5 detailed sections (mechanism, protocol, expected benefits, limitations)\n"
        "4) Practical 2-week starter plan combining 2‚Äì3 techniques\n"
        "5) References list (use sources mentioned in the notes)\n"
        "Avoid medical diagnosis. Be clear and actionable.\n\n"
        f"RESEARCH QUESTION:\n{RESEARCH_QUESTION}\n\n"
        f"RESEARCH NOTES:\n{all_findings}"
    )
})

report_md = f"""# Custom Wellness Research Report
**Date:** {today}  
**Question:** {RESEARCH_QUESTION}

---

## Research plan
{plan}

---

## Findings + synthesis
{synthesis}
"""

out_path = CFG["save_dir"] / CFG["report_filename"]
save_md(out_path, report_md)
print(f"\n‚úÖ Saved report to: {out_path}")

# 4) Evaluate
analysis = think_tool.invoke({
    "reflection": (
        "Evaluate the report quality in ~10 bullets:\n"
        "- What worked well (coverage, structure, evidence quality)\n"
        "- What could be improved (missing angles, weak evidence, unclear parts)\n"
        "- What config changes to try next\n\n"
        f"REPORT CONTENT:\n{synthesis}"
    )
})

analysis_path = CFG["save_dir"] / "analysis_notes.md"
save_md(analysis_path, f"# Output evaluation\n\n{analysis}\n")
print(f"‚úÖ Saved analysis to: {analysis_path}")

print("\n--- Preview (first 15 lines) ---")
for line in report_md.splitlines()[:15]:
    print(line)

  from .autonotebook import tqdm as notebook_tqdm


‚úÖ Research plan created.
Reflection recorded: Task: Compare evidence-based stress management techniques for working professionals. Focus on 5 strategies: mindfulness meditation, paced breathing, exercise, CBT-based journaling/reframing, and sleep hygiene.

Create a short research plan:
- Sub-questions to answer
- Keywords to search
- Evidence to prioritize (guidelines, systematic reviews, RCTs)
- Final report outline
Keep it concise.


NameError: name 'tavily_search' is not defined