## Activity #1 — Experiment Results

- Increase Parallelism (max_concurrent_research_units: 1 → 3)
  - Reduced wall-clock time by running researchers concurrently, but increased token throughput and 429 rate-limit risk.
  - Higher cost variability due to multiple simultaneous tool/model calls.

- Deeper Research (max_researcher_iterations: 1 → 3; max_react_tool_calls: 2 → 5)
  - Produced richer notes and a more comprehensive final report.
  - Increased latency and token usage, with a higher chance of hitting context/token limits.

- Use Anthropic Native Search (search_api: "tavily" → "anthropic")
  - Fewer explicit tool messages and simpler loops, but more LLM-side token usage subject to Anthropic rate limits.
  - Comparable quality if queries are clear, with variability tied to model-side retrieval quality.

- Disable Clarification (allow_clarification: True → False)
  - Avoided one initial model call and sped up start-up.
  - Risked missing key constraints, occasionally reducing accuracy when the request is ambiguous.


## LangGraph Open Deep Research - Supervisor-Researcher Architecture

In this notebook, we'll explore the **supervisor-researcher delegation architecture** for conducting deep research with LangGraph.

You can visit this repository to see the original application: [Open Deep Research](https://github.com/langchain-ai/open_deep_research)

Let's jump in!

## What We're Building

This implementation uses a **hierarchical delegation pattern** where:

1. **User Clarification** - Optionally asks clarifying questions to understand the research scope
2. **Research Brief Generation** - Transforms user messages into a structured research brief
3. **Supervisor** - A lead researcher that analyzes the brief and delegates research tasks
4. **Parallel Researchers** - Multiple sub-agents that conduct focused research simultaneously
5. **Research Compression** - Each researcher synthesizes their findings
6. **Final Report** - All findings are combined into a comprehensive report

![Architecture Diagram](https://private-user-images.githubusercontent.com/181020547/465824799-12a2371b-8be2-4219-9b48-90503eb43c69.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NjAwNDgyMzcsIm5iZiI6MTc2MDA0NzkzNywicGF0aCI6Ii8xODEwMjA1NDcvNDY1ODI0Nzk5LTEyYTIzNzFiLThiZTItNDIxOS05YjQ4LTkwNTAzZWI0M2M2OS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjUxMDA5JTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI1MTAwOVQyMjEyMTdaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1iYTRmYTAzYjkzYjA2MGE4ZTZlYjQ4ODU1OWIwY2VlZWU0Mzk0YzdmMjQ1YTlhMDMyNmI3NWNlZTQxNDdlZGViJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCJ9.a8477QD1J4Lrmys7jB8gt_H5pdiKBsKsu3npEqZjEpo)

This differs from a section-based approach by allowing dynamic task decomposition based on the research question, rather than predefined sections.

## Dependencies

You'll need API keys for Anthropic (for the LLM) and Tavily (for web search). We'll configure the system to use Anthropic's Claude Sonnet 4 exclusively.

In [2]:
import os
import getpass

os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter your Anthropic API key: ")
os.environ["TAVILY_API_KEY"] = getpass.getpass("Enter your Tavily API key: ")

## Task 1: State Definitions

The state structure is hierarchical with three levels:

### Agent State (Top Level)
Contains the overall conversation messages, research brief, accumulated notes, and final report.

### Supervisor State (Middle Level)
Manages the research supervisor's messages, research iterations, and coordinating parallel researchers.

### Researcher State (Bottom Level)
Each individual researcher has their own message history, tool call iterations, and research findings.

We also have structured outputs for tool calling:
- **ConductResearch** - Tool for supervisor to delegate research to a sub-agent
- **ResearchComplete** - Tool to signal research phase is done
- **ClarifyWithUser** - Structured output for asking clarifying questions
- **ResearchQuestion** - Structured output for the research brief

Let's import these from our library: [`open_deep_library/state.py`](open_deep_library/state.py)

In [3]:
# Import state definitions from the library
from open_deep_library.state import (
    # Main workflow states
    AgentState,           # Lines 65-72: Top-level agent state with messages, research_brief, notes, final_report
    AgentInputState,      # Lines 62-63: Input state is just messages
    
    # Supervisor states
    SupervisorState,      # Lines 74-81: Supervisor manages research delegation and iterations
    
    # Researcher states
    ResearcherState,      # Lines 83-90: Individual researcher with messages and tool iterations
    ResearcherOutputState, # Lines 92-96: Output from researcher (compressed research + raw notes)
    
    # Structured outputs for tool calling
    ConductResearch,      # Lines 15-19: Tool for delegating research to sub-agents
    ResearchComplete,     # Lines 21-22: Tool to signal research completion
    ClarifyWithUser,      # Lines 30-41: Structured output for user clarification
    ResearchQuestion,     # Lines 43-48: Structured output for research brief
)
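
To make the three levels concrete, here is a minimal sketch that models them with plain `TypedDict`s. It is an illustration, not the library's definitions: the `*Sketch` class names and the `hand_back_to_supervisor` helper are hypothetical, and the fields marked "assumed" are not named anywhere in this notebook. The point it shows is that a researcher's private history stays in its own state, and only the compressed findings move up a level.

```python
from typing import List, TypedDict


class AgentStateSketch(TypedDict):
    """Top level: what the user-facing graph carries end to end."""
    messages: List[str]
    research_brief: str
    notes: List[str]
    final_report: str


class SupervisorStateSketch(TypedDict):
    """Middle level: the supervisor's own scratchpad and loop counter."""
    supervisor_messages: List[str]
    research_iterations: int        # assumed field name
    notes: List[str]


class ResearcherStateSketch(TypedDict):
    """Bottom level: one researcher's private history."""
    researcher_messages: List[str]  # assumed field name
    tool_call_iterations: int       # assumed field name
    compressed_research: str        # assumed field name


def hand_back_to_supervisor(
    supervisor: SupervisorStateSketch, researcher: ResearcherStateSketch
) -> SupervisorStateSketch:
    """Only the compressed findings cross the boundary, not the raw history."""
    return {
        "supervisor_messages": supervisor["supervisor_messages"],
        "research_iterations": supervisor["research_iterations"],
        "notes": supervisor["notes"] + [researcher["compressed_research"]],
    }
```

Keeping the schemas separate is what lets several researchers run at once without writing into each other's message lists.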

### ❓ Question 1:

Explain the interrelationships between the three states. Why don't we just make a single huge state?

### ✅ ANSWER:
The agent state orchestrates everything. Control flows Agent → Supervisor → Researcher(s) → Supervisor → Agent.
Each level works on its own specialized subtask while being orchestrated by the agent state.

We avoid one large state because separate states promote modularity and parallelism. They simplify the graph by letting each subgraph specialize in one thing, which produces better-quality outputs than a single large state.


## Task 2: Utility Functions and Tools

The system uses several key utilities:

### Search Tools
- **tavily_search** - Async web search with automatic summarization to stay within token limits
- Supports Anthropic native web search and Tavily API

### Reflection Tools
- **think_tool** - Allows researchers to reflect on their progress and plan next steps (ReAct pattern)

### Helper Utilities
- **get_all_tools** - Assembles the complete toolkit (search + MCP + reflection)
- **get_today_str** - Provides current date context for research
- Token limit handling utilities for graceful degradation

These are defined in [`open_deep_library/utils.py`](open_deep_library/utils.py)

In [4]:
# Import utility functions and tools from the library
from open_deep_library.utils import (
    # Search tool - Lines 43-136: Tavily search with automatic summarization
    tavily_search,
    
    # Reflection tool - Lines 219-244: Strategic thinking tool for ReAct pattern
    think_tool,
    
    # Tool assembly - Lines 569-597: Get all configured tools
    get_all_tools,
    
    # Date utility - Lines 872-879: Get formatted current date
    get_today_str,
    
    # Supporting utilities for error handling
    get_api_key_for_model,          # Lines 892-914: Get API keys from config or env
    is_token_limit_exceeded,         # Lines 665-701: Detect token limit errors
    get_model_token_limit,           # Lines 831-846: Look up model's token limit
    remove_up_to_last_ai_message,    # Lines 848-866: Truncate messages for retry
    anthropic_websearch_called,      # Lines 607-637: Detect Anthropic native search usage
    openai_websearch_called,         # Lines 639-658: Detect OpenAI native search usage
    get_notes_from_tool_calls,       # Lines 599-601: Extract notes from tool messages
)

### ❓ Question 2:  

What are the advantages and disadvantages of importing these components instead of including them in the notebook?

### ✅ ANSWER:
Advantages:
- Keeps the notebook short and concise.
- Ensures a single source of truth, since all these functions live in one module.
- Everything is easily reusable and can be imported elsewhere (another notebook, for instance).
- Because the functions live in a module rather than in the notebook, they can also be tested.

Disadvantages:
- The notebook itself becomes harder to tweak. To change something in one of the functions, I need to edit the module and re-import it, versus just editing the code inline if it lived in the notebook.
- In my opinion, importing everything as a package adds friction compared with having all the code live in the notebook.
    - That said, I do prefer the module approach.



## Task 3: Configuration System

The configuration system controls:

### Research Behavior
- **allow_clarification** - Whether to ask clarifying questions before research
- **max_concurrent_research_units** - How many parallel researchers can run (default: 5)
- **max_researcher_iterations** - How many times supervisor can delegate research (default: 6)
- **max_react_tool_calls** - Tool call limit per researcher (default: 10)

### Model Configuration
- **research_model** - Model for research and supervision (we'll use Anthropic)
- **compression_model** - Model for synthesizing findings
- **final_report_model** - Model for writing the final report
- **summarization_model** - Model for summarizing web search results

### Search Configuration
- **search_api** - Which search API to use (ANTHROPIC, TAVILY, or NONE)
- **max_content_length** - Character limit before summarization

Defined in [`open_deep_library/configuration.py`](open_deep_library/configuration.py)

In [5]:
# Import configuration from the library
from open_deep_library.configuration import (
    Configuration,    # Lines 38-247: Main configuration class with all settings
    SearchAPI,        # Lines 11-17: Enum for search API options (ANTHROPIC, TAVILY, NONE)
)
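
As a rough picture of how such settings might be resolved, the sketch below builds a settings object from a `{"configurable": {...}}` dict and falls back to defaults for anything not supplied. `ConfigurationSketch` and `from_config` are hypothetical names, not the library's `Configuration` API. The three numeric defaults come from the list above; the defaults for `allow_clarification` and `search_api` are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class ConfigurationSketch:
    """Hypothetical stand-in for the library's Configuration class."""
    allow_clarification: bool = True       # assumed default
    max_concurrent_research_units: int = 5
    max_researcher_iterations: int = 6
    max_react_tool_calls: int = 10
    search_api: str = "tavily"             # assumed default

    @classmethod
    def from_config(cls, config: Dict[str, Any]) -> "ConfigurationSketch":
        """Build settings from a {"configurable": {...}} dict, ignoring unknown keys."""
        configurable = config.get("configurable", {})
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in configurable.items() if k in known})
```

Ignoring unknown keys lets the same dict also carry run-level values such as `thread_id` without breaking construction.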

## Task 4: Prompt Templates

The system uses carefully engineered prompts for each phase:

### Phase 1: Clarification
**clarify_with_user_instructions** - Analyzes if the research scope is clear or needs clarification

### Phase 2: Research Brief
**transform_messages_into_research_topic_prompt** - Converts user messages into a detailed research brief

### Phase 3: Supervisor
**lead_researcher_prompt** - System prompt for the supervisor that manages delegation strategy

### Phase 4: Researcher
**research_system_prompt** - System prompt for individual researchers conducting focused research

### Phase 5: Compression
**compress_research_system_prompt** - Prompt for synthesizing research findings without losing information

### Phase 6: Final Report
**final_report_generation_prompt** - Comprehensive prompt for writing the final report

All prompts are defined in [`open_deep_library/prompts.py`](open_deep_library/prompts.py)

In [6]:
# Import prompt templates from the library
from open_deep_library.prompts import (
    clarify_with_user_instructions,                    # Lines 3-41: Ask clarifying questions
    transform_messages_into_research_topic_prompt,     # Lines 44-77: Generate research brief
    lead_researcher_prompt,                            # Lines 79-136: Supervisor system prompt
    research_system_prompt,                            # Lines 138-183: Researcher system prompt
    compress_research_system_prompt,                   # Lines 186-222: Research compression prompt
    final_report_generation_prompt,                    # Lines 228-308: Final report generation
)

## Task 5: Node Functions - The Building Blocks

Now let's look at the node functions that make up our graph. We'll import them from the library and understand what each does.

### The Complete Research Workflow

The workflow consists of 8 key nodes organized into 3 graphs (the main graph plus two subgraphs):

1. **Main Graph Nodes:**
   - `clarify_with_user` - Entry point that checks if clarification is needed
   - `write_research_brief` - Transforms user input into structured research brief
   - `final_report_generation` - Synthesizes all research into final report

2. **Supervisor Subgraph Nodes:**
   - `supervisor` - Lead researcher that plans and delegates
   - `supervisor_tools` - Executes supervisor's tool calls (delegation, reflection)

3. **Researcher Subgraph Nodes:**
   - `researcher` - Individual researcher conducting focused research
   - `researcher_tools` - Executes researcher's tool calls (search, reflection)
   - `compress_research` - Synthesizes researcher's findings

All nodes are defined in [`open_deep_library/deep_researcher.py`](open_deep_library/deep_researcher.py)

### Node 1: clarify_with_user

**Purpose:** Analyzes user messages and asks clarifying questions if the research scope is unclear.

**Key Steps:**
1. Check if clarification is enabled in configuration
2. Use structured output to analyze if clarification is needed
3. If needed, end with a clarifying question for the user
4. If not needed, proceed to research brief with verification message

**Implementation:** [`open_deep_library/deep_researcher.py` lines 60-115](open_deep_library/deep_researcher.py#L60-L115)
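
The four steps above amount to a small routing decision. The sketch below captures only that decision. `route_clarification` and its parameters are hypothetical, and in the real node the `need_clarification` flag, the question, and the verification text would come from the model's structured output rather than from arguments.

```python
from typing import Tuple


def route_clarification(
    allow_clarification: bool,
    need_clarification: bool,
    question: str,
    verification: str,
) -> Tuple[str, str]:
    """Return (next_node, message_for_user) for the clarification step."""
    if not allow_clarification:
        # Clarification disabled: go straight to the research brief.
        return "write_research_brief", ""
    if need_clarification:
        # End the run so the user can answer the clarifying question.
        return "END", question
    # Scope is clear: confirm understanding and continue.
    return "write_research_brief", verification
```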

In [7]:
# Import the clarify_with_user node
from open_deep_library.deep_researcher import clarify_with_user

### Node 2: write_research_brief

**Purpose:** Transforms user messages into a structured research brief for the supervisor.

**Key Steps:**
1. Use structured output to generate detailed research brief from messages
2. Initialize supervisor with system prompt and research brief
3. Set up supervisor messages with proper context

**Why this matters:** A well-structured research brief helps the supervisor make better delegation decisions.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 118-175](open_deep_library/deep_researcher.py#L118-L175)

In [8]:
# Import the write_research_brief node
from open_deep_library.deep_researcher import write_research_brief

### Node 3: supervisor

**Purpose:** Lead research supervisor that plans research strategy and delegates to sub-researchers.

**Key Steps:**
1. Configure model with three tools:
   - `ConductResearch` - Delegate research to a sub-agent
   - `ResearchComplete` - Signal that research is done
   - `think_tool` - Strategic reflection before decisions
2. Generate response based on current context
3. Increment research iteration count
4. Proceed to tool execution

**Decision Making:** The supervisor uses `think_tool` to reflect before delegating research, ensuring thoughtful decomposition of the research question.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 178-223](open_deep_library/deep_researcher.py#L178-L223)

In [9]:
# Import the supervisor node (from supervisor subgraph)
from open_deep_library.deep_researcher import supervisor

### Node 4: supervisor_tools

**Purpose:** Executes the supervisor's tool calls, including strategic thinking and research delegation.

**Key Steps:**
1. Check exit conditions:
   - Exceeded maximum iterations
   - No tool calls made
   - `ResearchComplete` called
2. Process `think_tool` calls for strategic reflection
3. Execute `ConductResearch` calls in parallel:
   - Spawn researcher subgraphs for each delegation
   - Limit to `max_concurrent_research_units` (default: 5)
   - Gather all results asynchronously
4. Aggregate findings and return to supervisor

**Parallel Execution:** This is where the magic happens - multiple researchers work simultaneously on different aspects of the research question.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 225-349](open_deep_library/deep_researcher.py#L225-L349)
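
The parallel step can be sketched with `asyncio.gather`. This is a simplified illustration, not the library's code: `run_researcher` is a hypothetical stand-in for invoking a researcher subgraph, and returning the topics beyond the limit as a separate list is an assumption about how overflow could be handled.

```python
import asyncio
from typing import List, Tuple


async def run_researcher(topic: str) -> str:
    """Hypothetical stand-in for invoking one researcher subgraph."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"findings on {topic}"


async def delegate(
    topics: List[str], max_concurrent_research_units: int = 5
) -> Tuple[List[str], List[str]]:
    """Run up to the configured number of researchers concurrently.

    Returns (results, skipped_topics). Results keep the order of the topics.
    """
    allowed = topics[:max_concurrent_research_units]
    skipped = topics[max_concurrent_research_units:]
    results = await asyncio.gather(*(run_researcher(t) for t in allowed))
    return list(results), skipped
```

Because all researchers in a batch start together, wall-clock time is roughly that of the slowest one, while token usage adds up across all of them.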

In [10]:
# Import the supervisor_tools node
from open_deep_library.deep_researcher import supervisor_tools

### Node 5: researcher

**Purpose:** Individual researcher that conducts focused research on a specific topic.

**Key Steps:**
1. Load all available tools (search, MCP, reflection)
2. Configure model with tools and researcher system prompt
3. Generate response with tool calls
4. Increment tool call iteration count

**ReAct Pattern:** Researchers use `think_tool` to reflect after each search, deciding whether to continue or provide their answer.

**Available Tools:**
- Search tools (Tavily or Anthropic native search)
- `think_tool` for strategic reflection
- `ResearchComplete` to signal completion
- MCP tools (if configured)

**Implementation:** [`open_deep_library/deep_researcher.py` lines 365-424](open_deep_library/deep_researcher.py#L365-L424)

In [11]:
# Import the researcher node (from researcher subgraph)
from open_deep_library.deep_researcher import researcher

### Node 6: researcher_tools

**Purpose:** Executes the researcher's tool calls, including searches and strategic reflection.

**Key Steps:**
1. Check early exit conditions (no tool calls, native search used)
2. Execute all tool calls in parallel:
   - Search tools fetch and summarize web content
   - `think_tool` records strategic reflections
   - MCP tools execute external integrations
3. Check late exit conditions:
   - Exceeded `max_react_tool_calls` (default: 10)
   - `ResearchComplete` called
4. Continue research loop or proceed to compression

**Error Handling:** Safely handles tool execution errors and continues with available results.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 435-509](open_deep_library/deep_researcher.py#L435-L509)
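
One way to get "execute in parallel, but keep going if a tool fails" is to wrap every call so that an exception becomes an error string. The sketch below shows that pattern under stated assumptions: `safe_call` and `run_tool_calls` are hypothetical helpers, tools are modeled as async callables in a dict, and tool calls as dicts with `name` and `args` keys.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

AsyncTool = Callable[..., Awaitable[Any]]


async def safe_call(tool: AsyncTool, args: Dict[str, Any]) -> str:
    """Run one tool and turn any exception into an error string."""
    try:
        return str(await tool(**args))
    except Exception as exc:  # broad on purpose: one bad tool must not abort the batch
        return f"Error executing tool: {exc}"


async def run_tool_calls(
    tools: Dict[str, AsyncTool], tool_calls: List[Dict[str, Any]]
) -> List[str]:
    """Execute all tool calls concurrently and return one observation per call."""
    tasks = [safe_call(tools[call["name"]], call["args"]) for call in tool_calls]
    return list(await asyncio.gather(*tasks))
```

Returning the error as an observation lets the researcher see what failed and decide whether to retry or move on.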

In [12]:
# Import the researcher_tools node
from open_deep_library.deep_researcher import researcher_tools

### Node 7: compress_research

**Purpose:** Compresses and synthesizes research findings into a concise, structured summary.

**Key Steps:**
1. Configure compression model
2. Add compression instruction to messages
3. Attempt compression with retry logic:
   - If token limit exceeded, remove older messages
   - Retry up to 3 times
4. Extract raw notes from tool and AI messages
5. Return compressed research and raw notes

**Why Compression?** Researchers may accumulate lots of tool outputs and reflections. Compression ensures:
- All important information is preserved
- Redundant information is deduplicated
- Content stays within token limits for the final report

**Token Limit Handling:** Gracefully handles token limit errors by progressively truncating messages.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 511-585](open_deep_library/deep_researcher.py#L511-L585)
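
The retry logic in step 3 can be sketched as a loop that shrinks the input after each token-limit failure. This is a simplification: `TokenLimitError` and `compress_with_retry` are hypothetical, and dropping the single oldest message per attempt is an assumption made for brevity (the library has its own truncation helper, `remove_up_to_last_ai_message`).

```python
from typing import Callable, List, Optional


class TokenLimitError(Exception):
    """Hypothetical error standing in for a provider's context-length error."""


def compress_with_retry(
    messages: List[str],
    compress: Callable[[List[str]], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Try to compress; on a token-limit error drop the oldest message and retry."""
    attempt_messages = list(messages)  # copy so the caller's list is untouched
    for _ in range(max_attempts):
        try:
            return compress(attempt_messages)
        except TokenLimitError:
            attempt_messages = attempt_messages[1:]  # remove the oldest message
    return None  # all attempts failed; the caller decides how to report this
```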

In [13]:
# Import the compress_research node
from open_deep_library.deep_researcher import compress_research

### Node 8: final_report_generation

**Purpose:** Generates the final comprehensive research report from all collected findings.

**Key Steps:**
1. Extract all notes from completed research
2. Configure final report model
3. Attempt report generation with retry logic:
   - If token limit exceeded, truncate findings by 10%
   - Retry up to 3 times
4. Return final report or error message

**Token Limit Strategy:**
- First retry: Use model's token limit × 4 as character limit
- Subsequent retries: Reduce by 10% each time
- Graceful degradation with helpful error messages

**Report Quality:** The prompt guides the model to create well-structured reports with:
- Proper headings and sections
- Inline citations
- Comprehensive coverage of all findings
- Sources section at the end

**Implementation:** [`open_deep_library/deep_researcher.py` lines 607-697](open_deep_library/deep_researcher.py#L607-L697)
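
The token limit strategy above is simple arithmetic, sketched here as a schedule of character limits. `truncation_schedule` is a hypothetical helper, and the factor of 4 is used as the rough characters-per-token ratio that the strategy implies.

```python
from typing import List


def truncation_schedule(model_token_limit: int, max_retries: int = 3) -> List[int]:
    """Character limits to try on successive retries.

    The first retry uses token limit × 4 characters; each later retry
    cuts the previous limit by 10% (integer arithmetic avoids float drift).
    """
    limits = []
    limit = model_token_limit * 4
    for _ in range(max_retries):
        limits.append(limit)
        limit = limit * 9 // 10
    return limits
```

For a model with a 200,000-token limit (an illustrative figure), the schedule is 800,000, then 720,000, then 648,000 characters.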

In [14]:
# Import the final_report_generation node
from open_deep_library.deep_researcher import final_report_generation

## Task 6: Graph Construction - Putting It All Together

The system is organized into three interconnected graphs:

### 1. Researcher Subgraph (Bottom Level)
Handles individual focused research on a specific topic:
```
START → researcher → researcher_tools → compress_research → END
               ↑            ↓
               └────────────┘ (loops until max iterations or ResearchComplete)
```
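
The loop in this diagram can be simulated without LangGraph. The sketch below is a toy walk-through, not the compiled subgraph: `run_researcher_loop` is hypothetical, each researcher turn is represented by the list of tool names it requests, and treating the limit as reached when the iteration count equals `max_react_tool_calls` is an assumption.

```python
from typing import Dict, List, Union


def run_researcher_loop(
    planned_tool_calls: List[List[str]], max_react_tool_calls: int = 10
) -> Dict[str, Union[List[str], int]]:
    """Walk researcher → researcher_tools until an exit condition, then compress."""
    iterations = 0
    trace: List[str] = []
    for tool_calls in planned_tool_calls:
        trace.append("researcher")
        iterations += 1
        trace.append("researcher_tools")
        if not tool_calls:
            break  # early exit: the researcher requested no tools
        if iterations >= max_react_tool_calls or "ResearchComplete" in tool_calls:
            break  # late exit: limit reached or completion signalled
    trace.append("compress_research")
    return {"trace": trace, "tool_call_iterations": iterations}
```

Whichever exit condition fires, the last node visited is always `compress_research`, matching the single path to `END` in the diagram.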

### 2. Supervisor Subgraph (Middle Level)
Manages research delegation and coordination:
```
START → supervisor → supervisor_tools → END
            ↑              ↓
            └──────────────┘ (loops until max iterations or ResearchComplete)
            
supervisor_tools spawns multiple researcher_subgraphs in parallel
```

### 3. Main Deep Researcher Graph (Top Level)
Orchestrates the complete research workflow:
```
START → clarify_with_user → write_research_brief → research_supervisor → final_report_generation → END
                 ↓                                       (supervisor_subgraph)
               (may end early if clarification needed)
```

Let's import the compiled graphs from the library.

In [15]:
# Import the pre-compiled graphs from the library
from open_deep_library.deep_researcher import (
    # Bottom level: Individual researcher workflow
    researcher_subgraph,    # Lines 588-605: researcher → researcher_tools → compress_research
    
    # Middle level: Supervisor coordination
    supervisor_subgraph,    # Lines 351-363: supervisor → supervisor_tools (spawns researchers)
    
    # Top level: Complete research workflow
    deep_researcher,        # Lines 699-719: Main graph with all phases
)

## Why This Architecture?

### Advantages of Supervisor-Researcher Delegation

1. **Dynamic Task Decomposition**
   - Unlike section-based approaches with predefined structure, the supervisor can break down research based on the actual question
   - Adapts to different types of research (comparisons, lists, deep dives, etc.)

2. **Parallel Execution**
   - Multiple researchers work simultaneously on different aspects
   - Much faster than sequential section processing
   - Configurable parallelism (1-20 concurrent researchers)

3. **ReAct Pattern for Quality**
   - Researchers use `think_tool` to reflect after each search
   - Prevents excessive searching and improves search quality
   - Natural stopping conditions based on information sufficiency

4. **Flexible Tool Integration**
   - Easy to add MCP tools for specialized research
   - Supports multiple search APIs (Anthropic, Tavily)
   - Each researcher can use different tool combinations

5. **Graceful Token Limit Handling**
   - Compression prevents token overflow
   - Progressive truncation in final report generation
   - Research depth is bounded by the configured iteration limits rather than by a single context window

### Trade-offs

- **Complexity:** More moving parts than section-based approach
- **Cost:** Parallel researchers use more tokens in exchange for lower wall-clock time
- **Unpredictability:** Research structure emerges dynamically

## Task 7: Running the Deep Researcher

Now let's see the system in action! We'll use it to analyze a PDF document about how people use AI.

### Setup

We need to:
1. Load the PDF document
2. Configure the execution with Anthropic settings
3. Run the research workflow

In [16]:
# Load the PDF document
from pathlib import Path
import PyPDF2

def load_pdf(pdf_path: str) -> str:
    """Load and extract text from PDF."""
    pdf_text = ""
    with open(pdf_path, 'rb') as file:
        pdf_reader = PyPDF2.PdfReader(file)
        for page in pdf_reader.pages:
            pdf_text += page.extract_text() + "\n\n"
    return pdf_text

# Load the PDF about how people use AI
pdf_path = "data/howpeopleuseai.pdf"
pdf_content = load_pdf(pdf_path)

print(f"Loaded PDF with {len(pdf_content)} characters")
print(f"First 500 characters:\n{pdf_content[:500]}...")

Loaded PDF with 112460 characters
First 500 characters:
NBER WORKING PAPER SERIES
HOW PEOPLE USE CHATGPT
Aaron Chatterji
Thomas Cunningham
David J. Deming
Zoe Hitzig
Christopher Ong
Carl Yan Shan
Kevin Wadman
Working Paper 34255
http://www.nber.org/papers/w34255
NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
September 2025
We acknowledge help and comments from Joshua Achiam, Hemanth Asirvatham, Ryan 
Beiermeister,  Rachel Brown, Cassandra Duchan Solis, Jason Kwon, Elliott Mokski, Kevin Rao, 
Harrison Satcher,  Gawe...


In [17]:
# Set up the graph with Anthropic configuration
from IPython.display import Markdown, display
import uuid

# Note: deep_researcher is already compiled from the library
# For this demo, we'll use it directly without additional checkpointing
graph = deep_researcher

print("✓ Graph ready for execution")
print("  (Note: The graph is pre-compiled from the library)")

✓ Graph ready for execution
  (Note: The graph is pre-compiled from the library)


### Configuration for Anthropic

We'll configure the system to use:
- **Claude Sonnet 4** for all research, supervision, and report generation
- **Tavily** for web search (you can also use Anthropic's native search)
- **No parallelism for the baseline run** (1 researcher at a time; the Activity #1 experiments raise this to 3)
- **Clarification enabled** (will ask if research scope is unclear)

In [37]:
# Configure for Anthropic with conservative baseline settings
config = {
    "configurable": {
        # Model configuration - using Claude Sonnet 4 for everything
        "research_model": "anthropic:claude-sonnet-4-20250514",
        "research_model_max_tokens": 4100,
        
        "compression_model": "anthropic:claude-sonnet-4-20250514",
        "compression_model_max_tokens": 4100,
        
        "final_report_model": "anthropic:claude-sonnet-4-20250514",
        "final_report_model_max_tokens": 6000,
        
        "summarization_model": "anthropic:claude-sonnet-4-20250514",
        "summarization_model_max_tokens": 4100,
        
        # Research behavior
        "allow_clarification": True,
        "max_concurrent_research_units": 1,  # 1 researcher at a time (no parallelism)
        "max_researcher_iterations": 2,      # Supervisor can delegate up to 2 times
        "max_react_tool_calls": 3,           # Each researcher can make up to 3 tool calls
        
        # Search configuration
        "search_api": "tavily",  # Using Tavily for web search
        "max_content_length": 50000,
        
        # Thread ID for this conversation
        "thread_id": str(uuid.uuid4())
    }
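}

# Report the values actually set above instead of hard-coded numbers
cfg = config["configurable"]
print("✓ Configuration ready")
print("  - Research Model: Claude Sonnet 4")
print(f"  - Max Concurrent Researchers: {cfg['max_concurrent_research_units']}")
print(f"  - Max Iterations: {cfg['max_researcher_iterations']}")
print(f"  - Search API: {cfg['search_api']}")

✓ Configuration ready
  - Research Model: Claude Sonnet 4
  - Max Concurrent Researchers: 1
  - Max Iterations: 2
  - Search API: tavily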


### Execute the Research

Now let's run the research! We'll ask the system to analyze the PDF and provide insights about how people use AI.

The workflow will:
1. **Clarify** - Check if the request is clear (may skip if obvious)
2. **Research Brief** - Transform our request into a structured brief
3. **Supervisor** - Plan research strategy and delegate to researchers
4. **Parallel Research** - Multiple researchers gather information simultaneously
5. **Compression** - Each researcher synthesizes their findings
6. **Final Report** - All findings combined into comprehensive report

In [38]:
# Create our research request with PDF context
research_request = f"""
I have a PDF document about how people use AI. Please analyze this document and provide insights about:

1. What are the main findings about how people are using AI?
2. What are the most common use cases?
3. What trends or patterns emerge from the data?

Here's a short excerpt of the PDF content (truncated to reduce tokens):

{pdf_content[:3000]}

...[content truncated for context window]
"""

# Execute the graph
async def run_research():
    """Run the research workflow and display results."""
    print("Starting research workflow...\n")
    
    async for event in graph.astream(
        {"messages": [{"role": "user", "content": research_request}]},
        config,
        stream_mode="updates"
    ):
        # Display each step
        for node_name, node_output in event.items():
            print(f"\n{'='*60}")
            print(f"Node: {node_name}")
            print(f"{'='*60}")
            
            if node_name == "clarify_with_user":
                if "messages" in node_output:
                    last_msg = node_output["messages"][-1]
                    print(f"\n{last_msg.content}")
            
            elif node_name == "write_research_brief":
                if "research_brief" in node_output:
                    print(f"\nResearch Brief Generated:")
                    print(f"{node_output['research_brief'][:500]}...")
            
            elif node_name == "supervisor":
                print(f"\nSupervisor planning research strategy...")
                if "supervisor_messages" in node_output:
                    last_msg = node_output["supervisor_messages"][-1]
                    if hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
                        print(f"Tool calls: {len(last_msg.tool_calls)}")
                        for tc in last_msg.tool_calls:
                            print(f"  - {tc['name']}")
            
            elif node_name == "supervisor_tools":
                print(f"\nExecuting supervisor's tool calls...")
                if "notes" in node_output:
                    print(f"Research notes collected: {len(node_output['notes'])}")
            
            elif node_name == "final_report_generation":
                if "final_report" in node_output:
                    print(f"\n" + "="*60)
                    print("FINAL REPORT GENERATED")
                    print("="*60 + "\n")
                    display(Markdown(node_output["final_report"]))
    
    print("\n" + "="*60)
    print("Research workflow completed!")
    print("="*60)

# Run the research
await run_research()

Starting research workflow...


Node: clarify_with_user

I have sufficient information to proceed with your analysis request. You've provided an NBER working paper titled "How People Use ChatGPT" and are asking for insights about: 1) main findings about AI usage patterns, 2) most common use cases, and 3) emerging trends from the data. Based on the excerpt provided, I can see this is a comprehensive study covering ChatGPT adoption from November 2022 through July 2025, including demographic patterns and usage classification. I will now begin analyzing the document content and preparing a detailed report addressing your three key questions.

Node: write_research_brief

Research Brief Generated:
I need a comprehensive analysis of the NBER working paper "How People Use ChatGPT" by Aaron Chatterji, Thomas Cunningham, David J. Deming, Zoe Hitzig, Christopher Ong, Carl Yan Shan, and Kevin Wadman (Working Paper No. 34255, September 2025). Specifically, I want detailed insights on three key area




Node: research_supervisor

Node: final_report_generation

FINAL REPORT GENERATED



# Comprehensive Analysis of "How People Use ChatGPT" - NBER Working Paper 34255

## Overview of the Study

NBER Working Paper No. 34255, titled "How People Use ChatGPT," represents one of the most comprehensive empirical studies of AI chatbot usage patterns to date. Published in September 2025 by Aaron Chatterji, Thomas Cunningham, David J. Deming, Zoe Hitzig, Christopher Ong, Carl Yan Shan, and Kevin Wadman, the study documents ChatGPT's remarkable consumer adoption journey from its November 2022 launch through July 2025 [1][2].

The research employed a sophisticated privacy-preserving automated pipeline to analyze representative samples of ChatGPT conversations while maintaining strict user anonymity. Using a Data Clean Room (DCR), the researchers automatically scrubbed personally identifiable information and never allowed direct viewing of actual message content, instead relying on automated classifiers to categorize messages by purpose, topic, and intent [3]. This methodological approach, approved by Harvard IRB, ensures both research validity and user privacy protection.

By July 2025, the study documented that ChatGPT had reached approximately 10% of the world's adult population, representing over 700 million weekly active users who collectively send more than 2.6 billion messages per day—equivalent to over 30,000 messages per second [2][4]. This scale of adoption represents an unprecedented rate of technology diffusion, with ChatGPT achieving in less than two years what took smartphones approximately ten years to accomplish.

## Main Findings About AI/ChatGPT Usage Patterns

### Explosive Growth and Adoption Metrics

The study reveals extraordinary growth trajectories that redefine our understanding of technology adoption curves. ChatGPT reached one million users by December 5, 2022, within five days of its launch on November 30, 2022, and achieved 100 million weekly active users by November 2023, less than one year after release [3]. The platform's weekly active user base has been doubling every 7-8 months since launch, with message volume increasing 5.8 times in the final year studied while user volume grew 3.2 times [3].

This growth pattern demonstrates not merely viral adoption but sustained engagement, with all signup cohorts showing similar patterns of increased usage beginning in late 2024/early 2025. This suggests that improvements in ChatGPT's quality and user-friendliness, rather than mere novelty effects, drive the continued expansion [3].

### Demographic Transformation and Patterns

**Gender Gap Evolution**: One of the most significant findings concerns the dramatic closure of the initial gender gap. Early adopters were disproportionately male, reflecting typical patterns seen with new technologies. However, by July 2025, this gap had not only closed but slightly reversed, with 52% of active users having typically female first names compared to just 37% in January 2024 [2][6]. This represents one of the fastest gender parity achievements documented for a major technology platform.

**Age Demographics**: The platform demonstrates strong appeal among younger users, with nearly half of all messages originating from users under 26 years old [4]. Users aged 18-24 form the primary demographic, with the 25-34 age group comprising the second largest segment. Combined, users between 18 and 34 represent 54.85% of the total user base [5], indicating ChatGPT's particular resonance with digital natives and early-career professionals.

**Global Distribution and Economic Patterns**: The study documents fascinating patterns in global adoption that challenge conventional assumptions about technology diffusion. While the United States accounts for the largest single user base at 19.01%, this is followed by India (7.86%), Brazil (5.05%), Canada (3.57%), and the United Kingdom (3.48%) [5]. More significantly, growth rates in lower-income countries exceeded those in high-income nations by more than four times [6], suggesting that ChatGPT's value proposition may be particularly compelling in contexts where access to information and decision-support tools has traditionally been more limited.

**Educational and Professional Characteristics**: Work usage correlates strongly with educational attainment and professional status, being more common among educated users in highly-paid professional occupations [2][4]. This pattern suggests that while ChatGPT has achieved broad demographic penetration, its workplace applications remain concentrated among knowledge workers who can most readily integrate AI assistance into their professional workflows.

### User Engagement and Satisfaction Metrics

User satisfaction remains consistently high throughout the study period, with positive interactions outnumbering negative ones by approximately 4:1 [4]. Users demonstrate substantial engagement, spending an average of 13 minutes and 58 seconds per session [7]. This engagement depth, combined with the frequency of use, indicates that ChatGPT has successfully transitioned from a novelty tool to an integrated component of users' digital workflows and daily routines.

## Most Common Use Cases and Application Categories

### The Three-Pillar Framework

The study's most striking finding regarding use cases is the dominance of three primary categories: "Practical Guidance," "Seeking Information," and "Writing," which collectively account for nearly 80% of all ChatGPT conversations [1][2][4][6]. This concentration suggests that despite the technology's broad capabilities, user behavior has converged around specific high-value applications.

The researchers employed multiple classification frameworks to understand usage patterns. One approach categorizes interactions into three types: "Asking" (approximately 49% of messages), "Doing" (approximately 40%), and "Expressing" (approximately 11%) [4][6]. This framework reveals that ChatGPT primarily serves as an information and task-completion tool rather than a creative or expressive medium.

### Writing as the Dominant Work Application

Writing emerges as the single most important work-related application, dominating professional use cases and accounting for 40% of all work-related messages [6]. This finding highlights ChatGPT's unique value proposition compared to traditional search engines: its ability to generate tailored, contextual digital outputs rather than simply retrieving existing information. The prominence of writing applications spans various professional contexts, from email composition and document drafting to creative content generation and editing assistance.

This writing dominance reflects a fundamental shift in how knowledge workers approach content creation, with ChatGPT serving as both a writing partner and a cognitive amplifier that can help users overcome writer's block, improve clarity, and generate ideas more efficiently.

### Surprising Findings About Programming Usage

Contrary to widespread perceptions about AI coding assistance, the study reveals that computer programming represents only 4.2% of all messages [4][6]. This finding challenges popular narratives about ChatGPT's primary value being in software development and suggests that mainstream adoption has moved far beyond technical use cases. The relatively small share of programming-related queries indicates that ChatGPT's broader utility lies in its natural language processing capabilities rather than specialized technical functions.

### Information Seeking and Decision Support

The study emphasizes ChatGPT's role as a decision-support tool, which emerges as its strongest economic value proposition [4][6]. Users frequently engage ChatGPT not merely to retrieve information but to help process, interpret, and apply that information to specific decisions or problems. This application proves especially valuable in knowledge-intensive jobs where workers must synthesize complex information and make nuanced judgments.

The decision-support function distinguishes ChatGPT from traditional search engines by providing personalized, contextual advice rather than generic information retrieval. Users can engage in iterative conversations that help them think through problems, weigh options, and arrive at more informed conclusions.

## Trends and Patterns from the Data

### The Great Shift: From Work to Personal Use

Perhaps the most significant trend documented in the study is the dramatic evolution from work-focused to personal usage. The data shows that non-work-related messages grew from 53% of all usage in mid-2024 to over 70% by mid-2025 [1][2][6]. This shift represents more than a gradual change; researchers describe it as a "pirouetting" rather than a flattening curve, indicating accelerating adoption for personal applications.

This trend suggests that ChatGPT's initial adoption among professionals and early adopters has given way to mainstream consumer integration. As users become more comfortable with the technology and discover its utility for personal tasks—from planning activities and seeking advice to learning new topics and solving everyday problems—the platform has evolved into a general-purpose digital assistant rather than primarily a workplace tool.

### Temporal Patterns and Engagement Evolution

The study documents remarkable growth in both user numbers and message volume. Daily messages increased six-fold from 451 million to 2.6 billion within a single year, with 700 million people now sending approximately 18 billion messages weekly [6]. This growth pattern indicates not just user acquisition but deepening engagement as existing users find more applications for the technology.

All signup cohorts show similar patterns of increased usage beginning in late 2024/early 2025, suggesting that platform improvements rather than cohort effects drive continued growth [3]. This pattern indicates that ChatGPT has substantially improved in quality and user-friendliness, creating value that transcends the initial novelty factor.

### Geographic and Economic Expansion

The study reveals fascinating patterns of global adoption that challenge traditional technology diffusion models. Countries with vastly different GDP per capita levels—including Brazil, South Korea, and the United States—now show similar adoption rates [3]. This convergence suggests that ChatGPT's value proposition transcends economic barriers in ways that previous technologies have not.

The finding that growth rates in lower-income countries exceed those in high-income nations by more than four times [6] indicates that ChatGPT may be particularly valuable in contexts where access to information, education, and professional services has traditionally been limited. This pattern could have significant implications for global knowledge equity and economic development.

### Demographic Maturation and Mainstream Integration

The closure of the gender gap represents more than statistical parity; it indicates ChatGPT's successful transition from a male-dominated early adopter technology to a mainstream tool with broad demographic appeal. The shift from 37% female users in January 2024 to 52% by July 2025 [6] demonstrates rapid mainstream adoption among user groups who were initially underrepresented.

This demographic maturation coincides with the platform's evolution from a work-focused tool to a comprehensive personal assistant. As use cases expanded beyond professional applications to include personal guidance, learning, entertainment, and daily problem-solving, the user base naturally diversified to reflect broader population demographics.

### Value Creation and Economic Impact

The researchers conclude that ChatGPT's primary economic value lies in its function as a decision-support tool that helps users make choices, think through problems, and produce better writing [4][6]. This value proposition distinguishes ChatGPT from traditional search engines through its ability to generate tailored, actionable outputs rather than simply retrieving information.

The study suggests that ChatGPT generates significant consumer surplus from non-work applications, indicating substantial value creation beyond measurable productivity gains [6]. This consumer surplus reflects the platform's ability to enhance daily life through improved decision-making, learning opportunities, creative assistance, and problem-solving support.

### The Mainstream Transformation

The research documents ChatGPT's complete transformation from novelty to necessity. As one analysis notes, "ChatGPT has gone mainstream. It's not just a toy or a novelty anymore—it's becoming part of daily routines, both at work and at home" [8]. This integration into daily routines represents a fundamental shift in human-AI interaction, with ChatGPT serving as both a productivity tool and a cognitive companion.

The platform's strongest appeal appears to be its function as a writing partner and sounding board, contributing to both productivity and creativity at an extraordinary scale [8]. This dual role—enhancing both efficiency and creative capability—positions ChatGPT as a transformative technology that augments human capabilities rather than simply automating existing tasks.

The study's findings collectively suggest that we are witnessing the early stages of a fundamental shift in how humans interact with information, make decisions, and approach complex tasks. ChatGPT's rapid mainstream adoption and evolving use patterns indicate that AI assistants are becoming integral components of modern cognitive workflows, with implications that extend far beyond the current use cases documented in this groundbreaking research.

### Sources

[1] How People Use ChatGPT: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5487080
[2] How People Use ChatGPT | NBER: https://www.nber.org/papers/w34255
[3] How People Use ChatGPT - by David Deming: https://forklightning.substack.com/p/how-people-use-chatgpt
[4] How People Really Use ChatGPT: Findings from NBER Research: https://techmaniacs.com/2025/09/15/how-people-really-use-chatgpt-findings-from-nber-research/
[5] 33 Essential ChatGPT Statistics You Need To Know In 2025: https://thesocialshepherd.com/blog/chatgpt-statistics
[6] How people use ChatGPT: Study reveals surprising trends: https://www.linkedin.com/posts/andrewbirmingham1_how-do-people-really-use-chatgpt-activity-7377472211877687297--wSm
[7] Latest ChatGPT Statistics: 800M+ Users, Revenue (Oct 2025): https://nerdynav.com/chatgpt-statistics/
[8] How People Are Really Using ChatGPT - Mike Jeffs: https://mikejeffs.com/blog/how-people-are-really-using-chatgpt/


Research workflow completed!


## Understanding the Output

Let's break down what happened:

### Phase 1: Clarification
The system checked whether your request was clear. Since you provided a PDF and specific questions, it likely proceeded without clarification.

### Phase 2: Research Brief
Your request was transformed into a detailed research brief that guides the supervisor's delegation strategy.

### Phase 3: Supervisor Delegation
The supervisor analyzed the brief and decided how to break down the research:
- Used `think_tool` to plan strategy
- Called `ConductResearch` multiple times to delegate to parallel researchers
- Each delegation specified a focused research topic

### Phase 4: Parallel Research
Multiple researchers worked simultaneously:
- Each researcher used web search tools to gather information
- Used `think_tool` to reflect after each search
- Decided when they had enough information
- Compressed their findings into clean summaries
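
The parallel phase described above can be sketched with a semaphore that caps how many researchers run at once. This is a minimal illustration, not the repository's implementation: `fake_researcher` and `run_researchers` are hypothetical names, and the sleep stands in for real search and model calls. Only the `max_concurrent_research_units` name comes from the notebook's configuration.

```python
import asyncio


async def fake_researcher(topic: str) -> str:
    """Hypothetical stand-in for a researcher sub-agent (search, reflect, compress)."""
    await asyncio.sleep(0.01)  # simulates tool and model latency
    return f"compressed findings for: {topic}"


async def run_researchers(topics, max_concurrent_research_units=3, researcher=fake_researcher):
    """Run one researcher per topic with at most `max_concurrent_research_units` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_research_units)
    in_flight = 0
    peak = 0

    async def bounded(topic):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await researcher(topic)
            finally:
                in_flight -= 1

    # gather preserves input order, so results line up with topics
    results = await asyncio.gather(*(bounded(t) for t in topics))
    return list(results), peak


if __name__ == "__main__":
    topics = ["usage patterns", "use cases", "trends"]
    findings, peak = asyncio.run(run_researchers(topics, max_concurrent_research_units=2))
    print(findings)
    print("peak concurrency:", peak)
```

Raising the cap shortens wall-clock time but puts more requests in flight at once, which is why higher values tend to run into provider rate limits sooner.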

### Phase 5: Final Report
All research findings were synthesized into a comprehensive report with:
- Well-structured sections
- Inline citations
- Sources listed at the end
- Balanced coverage of all findings
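
One quick way to sanity-check the citation structure listed above is to compare the inline `[n]` markers against the numbered entries under the sources heading. The helper below is a hypothetical sketch that assumes the report uses a `### Sources` heading followed by lines starting with `[n]`, as in the run shown earlier. It checks numbering only, not whether a source actually supports the claim.

```python
import re


def check_citations(report: str) -> dict:
    """Compare inline [n] markers in the body with numbered entries under '### Sources'."""
    body, _, sources = report.partition("### Sources")
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", body)}
    listed = {int(n) for n in re.findall(r"^\[(\d+)\]", sources, flags=re.MULTILINE)}
    return {
        "missing_sources": sorted(cited - listed),  # cited in the text but not listed
        "uncited_sources": sorted(listed - cited),  # listed but never cited
    }


if __name__ == "__main__":
    sample = (
        "Writing dominates work usage [1][2].\n\n"
        "### Sources\n\n"
        "[1] Paper: https://example.com/paper\n"
        "[3] Blog: https://example.com/blog\n"
    )
    print(check_citations(sample))
```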

#### 🏗️ Activity #1: Try Different Configurations

You can experiment with different settings to see how they affect the research. Select three or more of the following settings (or invent your own experiments) and describe the results.

### Increase Parallelism
```python
"max_concurrent_research_units": 10  # More researchers working simultaneously
```

### Deeper Research
```python
"max_researcher_iterations": 8,  # Supervisor can delegate more times
"max_react_tool_calls": 15,      # Each researcher can search more
```

### Use Anthropic Native Search
```python
"search_api": "anthropic"  # Use Claude's built-in web search
```

### Disable Clarification
```python
"allow_clarification": False  # Skip clarification phase
```
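
To keep experiments comparable, it can help to start every run from the same baseline and override only the settings under test. The sketch below is one way to do that. `BASELINE`, `EXPERIMENTS`, and `make_config` are hypothetical names, the baseline values are assumed from the "before" values in the experiment summary at the top of this notebook, and it assumes the graph reads these settings from the `configurable` key of the run config (the usual LangGraph convention). Check all of this against your own setup cell.

```python
# Assumed baseline, copied from the "before" values in the experiment summary.
BASELINE = {
    "max_concurrent_research_units": 1,
    "max_researcher_iterations": 1,
    "max_react_tool_calls": 2,
    "search_api": "tavily",
    "allow_clarification": True,
}

# One override set per experiment, using the values suggested above.
EXPERIMENTS = {
    "increase_parallelism": {"max_concurrent_research_units": 10},
    "deeper_research": {"max_researcher_iterations": 8, "max_react_tool_calls": 15},
    "anthropic_search": {"search_api": "anthropic"},
    "no_clarification": {"allow_clarification": False},
}


def make_config(**overrides) -> dict:
    """Return a run config with the baseline settings plus the given overrides."""
    unknown = sorted(set(overrides) - set(BASELINE))
    if unknown:
        # Catch typos early instead of silently running the baseline
        raise KeyError(f"unknown setting(s): {unknown}")
    return {"configurable": {**BASELINE, **overrides}}


if __name__ == "__main__":
    for name, overrides in EXPERIMENTS.items():
        print(name, make_config(**overrides)["configurable"])
```

Changing one group of settings per run makes it easier to attribute differences in latency, token usage, or report quality to a specific setting.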

### ✅ Answer Activity #1 — Experiment Results
I toggled several settings and based the results below on those runs. I hit rate limits frequently, so I adjusted some values along the way. Here is what I found:

- Increase Parallelism with `max_concurrent_research_units`
  - Reduced wall-clock time by running researchers concurrently, but increased token usage.
  - Higher cost due to multiple simultaneous tool/model calls.

- Deeper Research with `max_researcher_iterations` and `max_react_tool_calls`
  - Produced better notes and a more comprehensive final report, in my opinion.
  - Increased latency and token usage.

- Use Anthropic for `search_api`
  - Fewer tool calls, with comparable quality in my opinion.



## Key Takeaways

### Architecture Benefits
1. **Dynamic Decomposition** - Research structure emerges from the question rather than being predefined
2. **Parallel Efficiency** - Multiple researchers work simultaneously
3. **ReAct Quality** - Strategic reflection improves search decisions
4. **Scalability** - Handles token limits gracefully through compression
5. **Flexibility** - Easy to add new tools and capabilities

### When to Use This Pattern
- **Complex research questions** that need multi-angle investigation
- **Comparison tasks** where parallel research on different topics is beneficial
- **Open-ended exploration** where structure should emerge dynamically
- **Time-sensitive research** where parallel execution speeds up results

### When to Use Section-Based Instead
- **Highly structured reports** with predefined format requirements
- **Template-based content** where sections are always the same
- **Sequential dependencies** where later sections depend on earlier ones
- **Budget constraints** where token efficiency is critical

## Next Steps

### Extend the System
1. **Add MCP Tools** - Integrate specialized tools for your domain
2. **Custom Prompts** - Modify prompts for specific research types
3. **Different Models** - Try different Claude versions or mix models
4. **Persistence** - Use a real database for checkpointing instead of memory

### Learn More
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Open Deep Research Repo](https://github.com/langchain-ai/open_deep_research)
- [Anthropic Claude Documentation](https://docs.anthropic.com/)
- [Tavily Search API](https://tavily.com/)

### Deploy
- Use LangGraph Cloud for production deployment
- Add proper error handling and logging
- Implement rate limiting and cost controls
- Monitor research quality and costs
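
For the error-handling and rate-limiting items above, a common starting point is retrying rate-limited calls with exponential backoff and jitter. The sketch below is illustrative only. `RateLimitError` is a hypothetical stand-in for whatever exception your model or search client raises on an HTTP 429, and the delay values are arbitrary defaults rather than provider recommendations.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call `fn`, retrying on RateLimitError with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from concurrent researchers
            sleep(delay + random.uniform(0, delay / 2))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky():
        calls["n"] += 1
        if calls["n"] < 3:
            raise RateLimitError("429 Too Many Requests")
        return "ok"

    # Pass a no-op sleep so the demo finishes instantly
    print(call_with_backoff(flaky, sleep=lambda seconds: None))
```

The `sleep` parameter is injectable so the retry logic can be tested without real waiting. In production you would also want to cap total retry time and log each retry.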