## LangGraph Open Deep Research - Supervisor-Researcher Architecture

In this notebook, we'll explore the **supervisor-researcher delegation architecture** for conducting deep research with LangGraph.

You can visit this repository to see the original application: [Open Deep Research](https://github.com/langchain-ai/open_deep_research)

Let's jump in!

## What We're Building

This implementation uses a **hierarchical delegation pattern** where:

1. **User Clarification** - Optionally asks clarifying questions to understand the research scope
2. **Research Brief Generation** - Transforms user messages into a structured research brief
3. **Supervisor** - A lead researcher that analyzes the brief and delegates research tasks
4. **Parallel Researchers** - Multiple sub-agents that conduct focused research simultaneously
5. **Research Compression** - Each researcher synthesizes their findings
6. **Final Report** - All findings are combined into a comprehensive report

![Architecture Diagram](https://i.imgur.com/Q8HEZn0.png)

This differs from a section-based approach by allowing dynamic task decomposition based on the research question, rather than predefined sections.

---

# 🤝 Breakout Room #1
## Deep Research Foundations

In this breakout room, we'll understand the architecture and components of the Open Deep Research system.

## Task 1: Dependencies

You'll need API keys for Anthropic (for the LLM) and Tavily (for web search). We'll configure the system to use Anthropic's Claude Sonnet 4 exclusively.

In [None]:
import os
import getpass

os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter your Anthropic API key: ")
os.environ["TAVILY_API_KEY"] = getpass.getpass("Enter your Tavily API key: ")

## Task 2: State Definitions

The state structure is hierarchical with three levels:

### Agent State (Top Level)
Contains the overall conversation messages, research brief, accumulated notes, and final report.

### Supervisor State (Middle Level)
Manages the research supervisor's messages, research iterations, and coordinating parallel researchers.

### Researcher State (Bottom Level)
Each individual researcher has their own message history, tool call iterations, and research findings.

We also have structured outputs for tool calling:
- **ConductResearch** - Tool for supervisor to delegate research to a sub-agent
- **ResearchComplete** - Tool to signal research phase is done
- **ClarifyWithUser** - Structured output for asking clarifying questions
- **ResearchQuestion** - Structured output for the research brief

Let's import these from our library: [`open_deep_library/state.py`](open_deep_library/state.py)

In [2]:
# Import state definitions from the library
from open_deep_library.state import (
    # Main workflow states
    AgentState,           # Lines 65-72: Top-level agent state with messages, research_brief, notes, final_report
    AgentInputState,      # Lines 62-63: Input state is just messages
    
    # Supervisor states
    SupervisorState,      # Lines 74-81: Supervisor manages research delegation and iterations
    
    # Researcher states
    ResearcherState,      # Lines 83-90: Individual researcher with messages and tool iterations
    ResearcherOutputState, # Lines 92-96: Output from researcher (compressed research + raw notes)
    
    # Structured outputs for tool calling
    ConductResearch,      # Lines 15-19: Tool for delegating research to sub-agents
    ResearchComplete,     # Lines 21-22: Tool to signal research completion
    ClarifyWithUser,      # Lines 30-41: Structured output for user clarification
    ResearchQuestion,     # Lines 43-48: Structured output for research brief
)
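
To make the three-level hierarchy concrete, here is a minimal stdlib-only sketch. The class names, their fields, and the `merge_notes` helper are assumptions made for illustration; the actual definitions live in `open_deep_library/state.py` and may differ.

```python
from typing import TypedDict


# Hypothetical, simplified stand-ins for the library's state classes.
class ResearcherSketch(TypedDict):
    research_topic: str
    researcher_messages: list[str]   # private to one researcher
    compressed_research: str


class SupervisorSketch(TypedDict):
    research_brief: str
    supervisor_messages: list[str]
    notes: list[str]                 # compressed findings from researchers


class AgentSketch(TypedDict):
    messages: list[str]
    research_brief: str
    notes: list[str]
    final_report: str


def merge_notes(supervisor: SupervisorSketch, result: ResearcherSketch) -> SupervisorSketch:
    """Copy only the compressed findings upward; raw messages stay behind."""
    return {**supervisor, "notes": supervisor["notes"] + [result["compressed_research"]]}


supervisor_state: SupervisorSketch = {
    "research_brief": "Improve sleep quality",
    "supervisor_messages": [],
    "notes": [],
}
researcher_result: ResearcherSketch = {
    "research_topic": "Blue light and melatonin",
    "researcher_messages": ["search call", "search result", "reflection"],
    "compressed_research": "Evening screen use delays sleep onset.",
}
supervisor_state = merge_notes(supervisor_state, researcher_result)
print(supervisor_state["notes"])
```

The point of the sketch is that the researcher's message history never reaches the supervisor; only the compressed summary does.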

## Task 3: Utility Functions and Tools

The system uses several key utilities:

### Search Tools
- **tavily_search** - Async web search with automatic summarization to stay within token limits
- Supports Anthropic native web search and Tavily API

### Reflection Tools
- **think_tool** - Allows researchers to reflect on their progress and plan next steps (ReAct pattern)

### Helper Utilities
- **get_all_tools** - Assembles the complete toolkit (search + MCP + reflection)
- **get_today_str** - Provides current date context for research
- Token limit handling utilities for graceful degradation

These are defined in [`open_deep_library/utils.py`](open_deep_library/utils.py)

In [3]:
# Import utility functions and tools from the library
from open_deep_library.utils import (
    # Search tool - Lines 43-136: Tavily search with automatic summarization
    tavily_search,
    
    # Reflection tool - Lines 219-244: Strategic thinking tool for ReAct pattern
    think_tool,
    
    # Tool assembly - Lines 569-597: Get all configured tools
    get_all_tools,
    
    # Date utility - Lines 872-879: Get formatted current date
    get_today_str,
    
    # Supporting utilities for error handling
    get_api_key_for_model,          # Lines 892-914: Get API keys from config or env
    is_token_limit_exceeded,         # Lines 665-701: Detect token limit errors
    get_model_token_limit,           # Lines 831-846: Look up model's token limit
    remove_up_to_last_ai_message,    # Lines 848-866: Truncate messages for retry
    anthropic_websearch_called,      # Lines 607-637: Detect Anthropic native search usage
    openai_websearch_called,         # Lines 639-658: Detect OpenAI native search usage
    get_notes_from_tool_calls,       # Lines 599-601: Extract notes from tool messages
)
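
As a rough illustration of the "truncate messages for retry" idea, the sketch below works on plain dictionaries. `drop_from_last_ai_message` is a hypothetical re-implementation, not the library's `remove_up_to_last_ai_message`, and it assumes that truncation removes the most recent AI message and everything after it.

```python
def drop_from_last_ai_message(messages: list[dict]) -> list[dict]:
    """Return the history up to (not including) the most recent AI message."""
    for index in range(len(messages) - 1, -1, -1):
        if messages[index]["role"] == "ai":
            return messages[:index]
    return messages  # no AI message found: nothing to trim


history = [
    {"role": "human", "content": "Research sleep hygiene"},
    {"role": "ai", "content": "Searching..."},
    {"role": "tool", "content": "Result A"},
    {"role": "ai", "content": "Searching again..."},
    {"role": "tool", "content": "Result B (very long)"},
]
shorter = drop_from_last_ai_message(history)
print(len(history), "->", len(shorter))
```

Calling the helper repeatedly shrinks the history one AI turn at a time, which is one way a retry loop can back off after a token limit error.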

## Task 4: Configuration System

The configuration system controls:

### Research Behavior
- **allow_clarification** - Whether to ask clarifying questions before research
- **max_concurrent_research_units** - How many parallel researchers can run (default: 5)
- **max_researcher_iterations** - How many times supervisor can delegate research (default: 6)
- **max_react_tool_calls** - Tool call limit per researcher (default: 10)

### Model Configuration
- **research_model** - Model for research and supervision (we'll use Anthropic)
- **compression_model** - Model for synthesizing findings
- **final_report_model** - Model for writing the final report
- **summarization_model** - Model for summarizing web search results

### Search Configuration
- **search_api** - Which search API to use (ANTHROPIC, TAVILY, or NONE)
- **max_content_length** - Character limit before summarization

Defined in [`open_deep_library/configuration.py`](open_deep_library/configuration.py)

In [4]:
# Import configuration from the library
from open_deep_library.configuration import (
    Configuration,    # Lines 38-247: Main configuration class with all settings
    SearchAPI,        # Lines 11-17: Enum for search API options (ANTHROPIC, TAVILY, NONE)
)
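
The sketch below shows one way a settings object could be built from a `{"configurable": {...}}` dictionary. `ConfigSketch` and `from_config` are hypothetical names, not the library's `Configuration` API. The three numeric defaults are the ones listed above; the defaults for `allow_clarification` and `search_api` are assumptions.

```python
from dataclasses import dataclass, fields


@dataclass
class ConfigSketch:
    """Hypothetical subset of the settings described above."""
    allow_clarification: bool = True          # assumed default
    max_concurrent_research_units: int = 5
    max_researcher_iterations: int = 6
    max_react_tool_calls: int = 10
    search_api: str = "tavily"                # assumed default

    @classmethod
    def from_config(cls, config: dict) -> "ConfigSketch":
        """Pick known keys out of config["configurable"]; ignore the rest."""
        configurable = config.get("configurable", {})
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in configurable.items() if k in known})


settings = ConfigSketch.from_config(
    {"configurable": {"max_concurrent_research_units": 1, "thread_id": "abc"}}
)
print(settings)
```

Keys the sketch does not recognize, such as `thread_id`, are ignored, so the same dictionary can carry both runtime and research settings.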

## Task 5: Prompt Templates

The system uses carefully engineered prompts for each phase:

### Phase 1: Clarification
**clarify_with_user_instructions** - Analyzes if the research scope is clear or needs clarification

### Phase 2: Research Brief
**transform_messages_into_research_topic_prompt** - Converts user messages into a detailed research brief

### Phase 3: Supervisor
**lead_researcher_prompt** - System prompt for the supervisor that manages delegation strategy

### Phase 4: Researcher
**research_system_prompt** - System prompt for individual researchers conducting focused research

### Phase 5: Compression
**compress_research_system_prompt** - Prompt for synthesizing research findings without losing information

### Phase 6: Final Report
**final_report_generation_prompt** - Comprehensive prompt for writing the final report

All prompts are defined in [`open_deep_library/prompts.py`](open_deep_library/prompts.py)

In [5]:
# Import prompt templates from the library
from open_deep_library.prompts import (
    clarify_with_user_instructions,                    # Lines 3-41: Ask clarifying questions
    transform_messages_into_research_topic_prompt,     # Lines 44-77: Generate research brief
    lead_researcher_prompt,                            # Lines 79-136: Supervisor system prompt
    research_system_prompt,                            # Lines 138-183: Researcher system prompt
    compress_research_system_prompt,                   # Lines 186-222: Research compression prompt
    final_report_generation_prompt,                    # Lines 228-308: Final report generation
)

## ❓ Question #1:

Explain the interrelationships between the three states (Agent, Supervisor, Researcher). Why don't we just make a single huge state?

##### Answer:

Interrelationships between AgentState, SupervisorState and ResearcherState:

- They form a hierarchy. `AgentState` is the top-level state that orchestrates the overall workflow and holds the full context. `SupervisorState` is used by the supervisor subgraph and shares keys with `AgentState` where the definitions match (e.g., `research_brief`). `ResearcherState` is fully isolated: each researcher gets its own instance for independent research.
- Information flows down from `AgentState` to `SupervisorState` (e.g., the user request becomes a research brief), and then from `SupervisorState` to each `ResearcherState` (e.g., the brief is split into individual research topics). It then flows back up: each researcher returns its compressed findings to `SupervisorState`, and the accumulated notes reach `AgentState`, where the final report is generated.


Why not one big state?

- The hierarchical state pattern isolates context among the main components of the deep research system and avoids the context bloat that would result from stacking everything into one big state, especially since each research sub-agent can add a lot of state (e.g., if they all wrote to the same `messages` list).

This pattern also helps with parallelization, since researchers can execute in parallel, and it makes each individual agent in the system easier to debug.


## ❓ Question #2:

What are the advantages and disadvantages of importing these components instead of including them in the notebook?

##### Answer:

Advantages of importing components instead of including directly in notebook:

- Code reusability across projects or sections of the notebook
- Better readability: the lower-level details live in the library folder, so the notebook can stay clean and focus on higher-level components and system design
- Easier to convert to production since the library is already in standalone modules

Disadvantages of importing components instead of including directly in notebook:

- Harder to prototype and iterate quickly than when the code sits directly in the notebook
- Navigating the code base can be more challenging, since we need to switch between multiple folders while tracing a module from the high-level system down to the low-level component implementation



## 🏗️ Activity #1: Explore the Prompts

Open `open_deep_library/prompts.py` and examine one of the prompt templates in detail.

**Requirements:**
1. Choose one prompt template (clarify, brief, supervisor, researcher, compression, or final report)
2. Explain what the prompt is designed to accomplish
3. Identify 2-3 key techniques used in the prompt (e.g., structured output, role definition, examples)
4. Suggest one improvement you might make to the prompt

**YOUR CODE HERE** - Write your analysis in a markdown cell below

1. Prompt template chosen: `lead_researcher_prompt`

2. This prompt is designed for the research supervisor agent and directs it to act as a research manager. Its main tasks are to use `think_tool` to plan and break the user's question into topics, then delegate each topic to a research sub-agent via the `ConductResearch` tool with specific delegation details. After each `ConductResearch` call, it analyzes the result to decide whether the answer is sufficient or more research should be delegated. Once it is satisfied with the research results, it signals completion via the `ResearchComplete` tool. Overall, this prompt directs the supervisor agent's workflow.

3. Key techniques:

- XML-style sections: the prompt is split into sections wrapped in XML tags (`<Task>`, `<Instructions>`, `<Hard Limits>`, etc.), which organizes it into clear categories and helps steer the agent's workflow.
- Few-shot examples: specific examples show when to use parallel sub-agents (e.g., comparing 3 AI companies) vs. a single sub-agent (e.g., for a straightforward question).
- Role definition: the LLM is given a persona as a "Research supervisor".

4. What to improve: Overall, this prompt is pretty comprehensive. If I were to improve it, I would extend the `Show Your Thinking` section with few-shot examples for its considerations. For example, I would elaborate on the situations in which the supervisor should delegate more research versus call `ResearchComplete`.

```
<Show Your Thinking>
Before you call ConductResearch tool call, use think_tool to plan your approach:
- Can the task be broken down into smaller sub-tasks?

After each ConductResearch tool call, use think_tool to analyze the results:
- What key information did I find?
- What's missing?
- Do I have enough to answer the question comprehensively?
- Should I delegate more research or call ResearchComplete?
</Show Your Thinking>
```



---

# 🤝 Breakout Room #2
## Building & Running the Researcher

In this breakout room, we'll explore the node functions, build the graph, and run wellness research.

## Task 6: Node Functions - The Building Blocks

Now let's look at the node functions that make up our graph. We'll import them from the library and understand what each does.

### The Complete Research Workflow

The workflow consists of 8 key nodes organized into 3 graphs (the main graph and two subgraphs):

1. **Main Graph Nodes:**
   - `clarify_with_user` - Entry point that checks if clarification is needed
   - `write_research_brief` - Transforms user input into structured research brief
   - `final_report_generation` - Synthesizes all research into final report

2. **Supervisor Subgraph Nodes:**
   - `supervisor` - Lead researcher that plans and delegates
   - `supervisor_tools` - Executes supervisor's tool calls (delegation, reflection)

3. **Researcher Subgraph Nodes:**
   - `researcher` - Individual researcher conducting focused research
   - `researcher_tools` - Executes researcher's tool calls (search, reflection)
   - `compress_research` - Synthesizes researcher's findings

All nodes are defined in [`open_deep_library/deep_researcher.py`](open_deep_library/deep_researcher.py)

### Node 1: clarify_with_user

**Purpose:** Analyzes user messages and asks clarifying questions if the research scope is unclear.

**Key Steps:**
1. Check if clarification is enabled in configuration
2. Use structured output to analyze if clarification is needed
3. If needed, end with a clarifying question for the user
4. If not needed, proceed to research brief with verification message

**Implementation:** [`open_deep_library/deep_researcher.py` lines 60-115](open_deep_library/deep_researcher.py#L60-L115)

In [6]:
# Import the clarify_with_user node
from open_deep_library.deep_researcher import clarify_with_user

### Node 2: write_research_brief

**Purpose:** Transforms user messages into a structured research brief for the supervisor.

**Key Steps:**
1. Use structured output to generate detailed research brief from messages
2. Initialize supervisor with system prompt and research brief
3. Set up supervisor messages with proper context

**Why this matters:** A well-structured research brief helps the supervisor make better delegation decisions.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 118-175](open_deep_library/deep_researcher.py#L118-L175)

In [7]:
# Import the write_research_brief node
from open_deep_library.deep_researcher import write_research_brief

### Node 3: supervisor

**Purpose:** Lead research supervisor that plans research strategy and delegates to sub-researchers.

**Key Steps:**
1. Configure model with three tools:
   - `ConductResearch` - Delegate research to a sub-agent
   - `ResearchComplete` - Signal that research is done
   - `think_tool` - Strategic reflection before decisions
2. Generate response based on current context
3. Increment research iteration count
4. Proceed to tool execution

**Decision Making:** The supervisor uses `think_tool` to reflect before delegating research, ensuring thoughtful decomposition of the research question.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 178-223](open_deep_library/deep_researcher.py#L178-L223)

In [8]:
# Import the supervisor node (from supervisor subgraph)
from open_deep_library.deep_researcher import supervisor

### Node 4: supervisor_tools

**Purpose:** Executes the supervisor's tool calls, including strategic thinking and research delegation.

**Key Steps:**
1. Check exit conditions:
   - Exceeded maximum iterations
   - No tool calls made
   - `ResearchComplete` called
2. Process `think_tool` calls for strategic reflection
3. Execute `ConductResearch` calls in parallel:
   - Spawn researcher subgraphs for each delegation
   - Limit to `max_concurrent_research_units` (default: 5)
   - Gather all results asynchronously
4. Aggregate findings and return to supervisor

**Parallel Execution:** This is where the magic happens - multiple researchers work simultaneously on different aspects of the research question.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 225-349](open_deep_library/deep_researcher.py#L225-L349)

In [9]:
# Import the supervisor_tools node
from open_deep_library.deep_researcher import supervisor_tools
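
The parallel delegation step can be sketched with plain `asyncio`. Everything here is illustrative: `fake_researcher` stands in for invoking the researcher subgraph, `delegate` is a hypothetical helper, and returning the overflow topics to the caller is an assumption about how requests beyond `max_concurrent_research_units` might be handled.

```python
import asyncio


async def fake_researcher(topic: str) -> str:
    """Stand-in for invoking the researcher subgraph on one topic."""
    await asyncio.sleep(0.01)  # simulate search latency
    return f"findings on {topic}"


async def delegate(topics: list[str], max_concurrent: int = 5) -> tuple[list[str], list[str]]:
    """Run up to max_concurrent researchers in parallel; report the overflow."""
    allowed, overflow = topics[:max_concurrent], topics[max_concurrent:]
    # gather() runs the coroutines concurrently and preserves input order.
    results = await asyncio.gather(*(fake_researcher(t) for t in allowed))
    return list(results), overflow


findings, skipped = asyncio.run(
    delegate(["sleep schedule", "blue light", "caffeine"], max_concurrent=2)
)
print(findings, skipped)
```

Inside a notebook, where an event loop is already running, replace `asyncio.run(...)` with `await delegate(...)`.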

### Node 5: researcher

**Purpose:** Individual researcher that conducts focused research on a specific topic.

**Key Steps:**
1. Load all available tools (search, MCP, reflection)
2. Configure model with tools and researcher system prompt
3. Generate response with tool calls
4. Increment tool call iteration count

**ReAct Pattern:** Researchers use `think_tool` to reflect after each search, deciding whether to continue or provide their answer.

**Available Tools:**
- Search tools (Tavily or Anthropic native search)
- `think_tool` for strategic reflection
- `ResearchComplete` to signal completion
- MCP tools (if configured)

**Implementation:** [`open_deep_library/deep_researcher.py` lines 365-424](open_deep_library/deep_researcher.py#L365-L424)

In [10]:
# Import the researcher node (from researcher subgraph)
from open_deep_library.deep_researcher import researcher

### Node 6: researcher_tools

**Purpose:** Executes the researcher's tool calls, including searches and strategic reflection.

**Key Steps:**
1. Check early exit conditions (no tool calls, native search used)
2. Execute all tool calls in parallel:
   - Search tools fetch and summarize web content
   - `think_tool` records strategic reflections
   - MCP tools execute external integrations
3. Check late exit conditions:
   - Exceeded `max_react_tool_calls` (default: 10)
   - `ResearchComplete` called
4. Continue research loop or proceed to compression

**Error Handling:** Safely handles tool execution errors and continues with available results.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 435-509](open_deep_library/deep_researcher.py#L435-L509)

In [11]:
# Import the researcher_tools node
from open_deep_library.deep_researcher import researcher_tools
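
The exit conditions listed above amount to a small routing decision. The function below is a hypothetical sketch of that decision, not the library's code; in particular, treating the budget as exhausted once `iterations >= max_react_tool_calls` is an assumption about the exact boundary.

```python
def next_step(tool_names: list[str], iterations: int, max_react_tool_calls: int = 10) -> str:
    """Decide where the researcher loop goes after a tool-execution step."""
    if not tool_names:
        return "compress_research"   # early exit: the model made no tool calls
    if iterations >= max_react_tool_calls:
        return "compress_research"   # late exit: tool-call budget used up
    if "ResearchComplete" in tool_names:
        return "compress_research"   # late exit: researcher signalled it is done
    return "researcher"              # otherwise keep researching


print(next_step(["tavily_search"], iterations=1))
print(next_step(["think_tool", "ResearchComplete"], iterations=2))
```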

### Node 7: compress_research

**Purpose:** Compresses and synthesizes research findings into a concise, structured summary.

**Key Steps:**
1. Configure compression model
2. Add compression instruction to messages
3. Attempt compression with retry logic:
   - If token limit exceeded, remove older messages
   - Retry up to 3 times
4. Extract raw notes from tool and AI messages
5. Return compressed research and raw notes

**Why Compression?** Researchers may accumulate lots of tool outputs and reflections. Compression ensures:
- All important information is preserved
- Redundant information is deduplicated
- Content stays within token limits for the final report

**Token Limit Handling:** Gracefully handles token limit errors by progressively truncating messages.

**Implementation:** [`open_deep_library/deep_researcher.py` lines 511-585](open_deep_library/deep_researcher.py#L511-L585)

In [12]:
# Import the compress_research node
from open_deep_library.deep_researcher import compress_research
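
The retry logic can be sketched as a bounded loop that shortens the history after each overflow. `TokenLimitError`, `fake_compress`, and `compress_with_retry` are hypothetical names, and dropping exactly one message per retry is a simplification of "remove older messages".

```python
class TokenLimitError(Exception):
    """Hypothetical error raised when the prompt is too long for the model."""


def fake_compress(messages: list[str], limit: int = 3) -> str:
    """Stand-in for the compression model call; fails on long histories."""
    if len(messages) > limit:
        raise TokenLimitError(f"{len(messages)} messages exceed limit {limit}")
    return " | ".join(messages)


def compress_with_retry(messages: list[str], max_attempts: int = 3) -> str:
    """Retry compression, dropping the oldest message after each overflow."""
    for _ in range(max_attempts):
        try:
            return fake_compress(messages)
        except TokenLimitError:
            messages = messages[1:]  # simplistic truncation: drop the oldest entry
    return "Error: could not compress research within the token limit"


print(compress_with_retry(["a", "b", "c", "d", "e"]))
```

If every attempt overflows, the sketch returns an error string rather than raising, so the surrounding workflow can continue with whatever raw notes it has.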

### Node 8: final_report_generation

**Purpose:** Generates the final comprehensive research report from all collected findings.

**Key Steps:**
1. Extract all notes from completed research
2. Configure final report model
3. Attempt report generation with retry logic:
   - If token limit exceeded, truncate findings by 10%
   - Retry up to 3 times
4. Return final report or error message

**Token Limit Strategy:**
- First retry: Use model's token limit × 4 as character limit
- Subsequent retries: Reduce by 10% each time
- Graceful degradation with helpful error messages

**Report Quality:** The prompt guides the model to create well-structured reports with:
- Proper headings and sections
- Inline citations
- Comprehensive coverage of all findings
- Sources section at the end

**Implementation:** [`open_deep_library/deep_researcher.py` lines 607-697](open_deep_library/deep_researcher.py#L607-L697)

In [13]:
# Import the final_report_generation node
from open_deep_library.deep_researcher import final_report_generation
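
The truncation strategy above is simple arithmetic, sketched here with a hypothetical `character_limits` helper. The 200,000-token limit is only an example value, and integer arithmetic is used so the 10% reductions stay exact.

```python
def character_limits(model_token_limit: int, retries: int = 3) -> list[int]:
    """Character budget for the findings on each retry."""
    limit = model_token_limit * 4        # first retry: roughly 4 characters per token
    limits = []
    for _ in range(retries):
        limits.append(limit)
        limit = limit * 9 // 10          # later retries: shrink by 10%
    return limits


limits = character_limits(200_000)
findings = "x" * 1_000_000               # stand-in for the accumulated findings
truncated = findings[: limits[0]]
print(limits, len(truncated))
```

Each retry would slice the findings to the next budget in the list before calling the model again.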

## Task 7: Graph Construction - Putting It All Together

The system is organized into three interconnected graphs:

### 1. Researcher Subgraph (Bottom Level)
Handles individual focused research on a specific topic:
```
START → researcher → researcher_tools → compress_research → END
               ↑            ↓
               └────────────┘ (loops until max iterations or ResearchComplete)
```

### 2. Supervisor Subgraph (Middle Level)
Manages research delegation and coordination:
```
START → supervisor → supervisor_tools → END
            ↑              ↓
            └──────────────┘ (loops until max iterations or ResearchComplete)
            
supervisor_tools spawns multiple researcher_subgraphs in parallel
```

### 3. Main Deep Researcher Graph (Top Level)
Orchestrates the complete research workflow:
```
START → clarify_with_user → write_research_brief → research_supervisor → final_report_generation → END
                 ↓                                       (supervisor_subgraph)
               (may end early if clarification needed)
```

Let's import the compiled graphs from the library.

In [14]:
# Import the pre-compiled graphs from the library
from open_deep_library.deep_researcher import (
    # Bottom level: Individual researcher workflow
    researcher_subgraph,    # Lines 588-605: researcher → researcher_tools → compress_research
    
    # Middle level: Supervisor coordination
    supervisor_subgraph,    # Lines 351-363: supervisor → supervisor_tools (spawns researchers)
    
    # Top level: Complete research workflow
    deep_researcher,        # Lines 699-719: Main graph with all phases
)

## Why This Architecture?

### Advantages of Supervisor-Researcher Delegation

1. **Dynamic Task Decomposition**
   - Unlike section-based approaches with predefined structure, the supervisor can break down research based on the actual question
   - Adapts to different types of research (comparisons, lists, deep dives, etc.)

2. **Parallel Execution**
   - Multiple researchers work simultaneously on different aspects
   - Much faster than sequential section processing
   - Configurable parallelism (1-20 concurrent researchers)

3. **ReAct Pattern for Quality**
   - Researchers use `think_tool` to reflect after each search
   - Prevents excessive searching and improves search quality
   - Natural stopping conditions based on information sufficiency

4. **Flexible Tool Integration**
   - Easy to add MCP tools for specialized research
   - Supports multiple search APIs (Anthropic, Tavily)
   - Each researcher can use different tool combinations

5. **Graceful Token Limit Handling**
   - Compression prevents token overflow
   - Progressive truncation in final report generation
   - Research can scale to arbitrary depths

### Trade-offs

- **Complexity:** More moving parts than section-based approach
- **Cost:** Parallel researchers use more tokens (but faster)
- **Unpredictability:** Research structure emerges dynamically

## Task 8: Running the Deep Researcher

Now let's see the system in action! We'll use it to research wellness strategies for improving sleep quality.

### Setup

We need to:
1. Set up the wellness research request
2. Configure the execution with Anthropic settings
3. Run the research workflow

In [15]:
# Set up the graph with Anthropic configuration
from IPython.display import Markdown, display
import uuid

# Note: deep_researcher is already compiled from the library
# For this demo, we'll use it directly without additional checkpointing
graph = deep_researcher

print("✓ Graph ready for execution")
print("  (Note: The graph is pre-compiled from the library)")

✓ Graph ready for execution
  (Note: The graph is pre-compiled from the library)


### Configuration for Anthropic

We'll configure the system to use:
- **Claude Sonnet 4** for all research, supervision, and report generation
- **Tavily** for web search (you can also use Anthropic's native search)
- **Minimal parallelism** (1 concurrent researcher for cost control)
- **Clarification enabled** (will ask if research scope is unclear)

In [16]:
# Configure for Anthropic with moderate settings
config = {
    "configurable": {
        # Model configuration - using Claude Sonnet 4 for everything
        "research_model": "anthropic:claude-sonnet-4-20250514",
        "research_model_max_tokens": 10000,
        
        "compression_model": "anthropic:claude-sonnet-4-20250514",
        "compression_model_max_tokens": 8192,
        
        "final_report_model": "anthropic:claude-sonnet-4-20250514",
        "final_report_model_max_tokens": 10000,
        
        "summarization_model": "anthropic:claude-sonnet-4-20250514",
        "summarization_model_max_tokens": 8192,
        
        # Research behavior
        "allow_clarification": True,
        "max_concurrent_research_units": 1,  # 1 parallel researcher
        "max_researcher_iterations": 2,      # Supervisor can delegate up to 2 times
        "max_react_tool_calls": 3,           # Each researcher can make up to 3 tool calls
        
        # Search configuration
        "search_api": "tavily",  # Using Tavily for web search
        "max_content_length": 50000,
        
        # Thread ID for this conversation
        "thread_id": str(uuid.uuid4())
    }
}

print("✓ Configuration ready")
print(f"  - Research Model: Claude Sonnet 4")
print(f"  - Max Concurrent Researchers: 1")
print(f"  - Max Iterations: 2")
print(f"  - Search API: Tavily")

✓ Configuration ready
  - Research Model: Claude Sonnet 4
  - Max Concurrent Researchers: 1
  - Max Iterations: 2
  - Search API: Tavily


### Execute the Wellness Research

Now let's run the research! We'll ask the system to research evidence-based strategies for improving sleep quality.

The workflow will:
1. **Clarify** - Check if the request is clear (may skip if obvious)
2. **Research Brief** - Transform our request into a structured brief
3. **Supervisor** - Plan research strategy and delegate to researchers
4. **Parallel Research** - Researchers gather information simultaneously
5. **Compression** - Each researcher synthesizes their findings
6. **Final Report** - All findings combined into comprehensive report

In [18]:
# Create our wellness research request
research_request = """
I want to improve my sleep quality. I currently:
- Go to bed at inconsistent times (10pm-1am)
- Use my phone in bed
- Often feel tired in the morning

Please research the best evidence-based strategies for improving sleep quality and create a comprehensive sleep improvement plan for me.
"""

# Execute the graph
async def run_research():
    """Run the research workflow and display results."""
    print("Starting research workflow...\n")
    
    async for event in graph.astream(
        {"messages": [{"role": "user", "content": research_request}]},
        config,
        stream_mode="updates"
    ):
        # Display each step
        for node_name, node_output in event.items():
            print(f"\n{'='*60}")
            print(f"Node: {node_name}")
            print(f"{'='*60}")
            
            if node_name == "clarify_with_user":
                if "messages" in node_output:
                    last_msg = node_output["messages"][-1]
                    print(f"\n{last_msg.content}")
            
            elif node_name == "write_research_brief":
                if "research_brief" in node_output:
                    print(f"\nResearch Brief Generated:")
                    print(f"{node_output['research_brief'][:500]}...")
            
            elif node_name == "supervisor":
                print(f"\nSupervisor planning research strategy...")
                if "supervisor_messages" in node_output:
                    last_msg = node_output["supervisor_messages"][-1]
                    if hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
                        print(f"Tool calls: {len(last_msg.tool_calls)}")
                        for tc in last_msg.tool_calls:
                            print(f"  - {tc['name']}")
            
            elif node_name == "supervisor_tools":
                print(f"\nExecuting supervisor's tool calls...")
                if "notes" in node_output:
                    print(f"Research notes collected: {len(node_output['notes'])}")
            
            elif node_name == "final_report_generation":
                if "final_report" in node_output:
                    print(f"\n" + "="*60)
                    print("FINAL REPORT GENERATED")
                    print("="*60 + "\n")
                    display(Markdown(node_output["final_report"]))
    
    print("\n" + "="*60)
    print("Research workflow completed!")
    print("="*60)

# Run the research
await run_research()

Starting research workflow...


Node: clarify_with_user

I have sufficient information to proceed with your sleep improvement research. I understand you want evidence-based strategies to address your current sleep challenges, which include inconsistent bedtimes (10pm-1am range), phone use in bed, and morning fatigue. I will research comprehensive, scientifically-backed sleep improvement strategies and create a personalized plan based on your specific situation. Beginning research now.

Node: write_research_brief

Research Brief Generated:
I want to improve my sleep quality and need a comprehensive, evidence-based sleep improvement plan. My current sleep challenges include: going to bed at inconsistent times (ranging from 10pm to 1am), using my phone in bed, and often feeling tired in the morning despite getting sleep. Please research the most effective, scientifically-backed strategies for improving sleep quality that specifically address irregular bedtimes, electronic device use befor




Node: research_supervisor

Node: final_report_generation

FINAL REPORT GENERATED



# Comprehensive Evidence-Based Sleep Improvement Plan

## Executive Summary

Based on extensive research from peer-reviewed studies and sleep medicine organizations, your sleep challenges stem from well-documented issues that significantly impact sleep quality and morning alertness. The scientific evidence shows that irregular bedtimes, electronic device use before sleep, and morning fatigue are interconnected problems that require a systematic, evidence-based approach. This comprehensive plan addresses each issue with specific, actionable strategies backed by sleep research and clinical guidelines.

## The Science Behind Your Sleep Challenges

### Circadian Rhythm Disruption from Irregular Bedtimes

Your inconsistent bedtime pattern (10pm-1am range) is causing significant circadian misalignment. A comprehensive systematic review analyzing 59 studies found that irregular sleep timing leads to multiple adverse outcomes including higher depressive and anxiety symptoms, elevated body mass index, insulin resistance, and hypertension [1]. The research revealed that the least regular sleepers had 20-88% higher all-cause mortality risk, independent of sleep duration and quality.

The National Sleep Foundation's 2023 consensus statement emphasizes that consistent sleep and wake times are crucial for both mental and physical health. Adults with regular sleep schedules have a 39% lower mortality risk compared to those with irregular schedules and insufficient sleep duration [2]. The mechanisms behind this include circadian misalignment, autonomic imbalance, and systemic inflammation.

### Electronic Device Impact on Sleep Physiology

Your phone use in bed is directly disrupting your sleep through multiple pathways. Blue light exposure from electronic devices (wavelengths 400-500nm) halts melatonin secretion, which is essential for regulating circadian rhythm [3]. The most harmful blue light wavelengths (415-455nm) stimulate the brain, diminish melatonin production, and improve adrenocortical hormone production, creating hormonal imbalance that directly affects sleep quality.

A large-scale study of 122,058 participants found that daily screen use prior to bed was associated with a 33% higher prevalence of poor sleep quality and 7.64 fewer minutes of sleep on workdays [4]. The effects were more pronounced for evening chronotypes, with daily screen users going to bed 15.62 minutes later. Research on university students showed that 2 hours of evening light exposure caused an average 1.1-hour circadian phase delay and a 55% decrease in melatonin production [5].

## Immediate Changes You Can Implement Tonight

### Digital Sunset Protocol

Implement an immediate "digital sunset" 90 minutes before your target bedtime. The American Academy of Sleep Medicine recommends eliminating screen use at least 60-90 minutes before bedtime [6]. This allows your natural melatonin production to begin, as blue light exposure suppresses melatonin by over 50% within two hours according to NIH research.

**Specific actions:**
- Set a daily phone alarm for 90 minutes before bedtime
- Use your phone's "Do Not Disturb" or "Sleep Mode" features
- Charge your phone outside the bedroom
- Replace evening screen time with reading a physical book, gentle stretching, or meditation

### Emergency Sleep Schedule Stabilization

Choose a consistent bedtime between 10-11pm and wake time that allows for 7-9 hours of sleep (the American Academy of Sleep Medicine recommendation for adults). Start with your most natural preference within this range, then stick to it every single night, including weekends. Research shows that even small deviations can disrupt circadian rhythm recovery.

**Implementation strategy:**
- Calculate backwards from your required wake time to determine bedtime
- Set multiple alarms: 2 hours before bed (wind-down), 90 minutes before (digital sunset), 30 minutes before (final preparations)
- Use blackout curtains or eye masks to ensure darkness
- Set room temperature between 65-68°F (18-20°C) for optimal sleep

## Comprehensive Sleep Hygiene Foundation

### Optimal Sleep Environment Design

Create a sleep sanctuary that supports natural sleep architecture. Research shows that sleep consists of Non-REM stages (light to deep restorative sleep) and REM sleep (90-minute cycles for emotional processing and memory consolidation) [7]. Your environment should facilitate these natural cycles.

**Environmental modifications:**
- Remove all electronic devices from the bedroom
- Use blackout curtains or heavy drapes to block external light
- Install a white noise machine or use earplugs to minimize sound disruption
- Ensure your mattress and pillows provide proper spinal alignment
- Keep the room cool (65-68°F) as body temperature naturally drops during sleep

### Pre-Sleep Routine Protocol

Develop a 60-90 minute wind-down routine that signals your body to prepare for sleep. Research on children showed that consistent bedtime routines led to sleeping an average of more than one hour longer per night, with a dose-dependent relationship between routine frequency and sleep quality [8].

**Evidence-based routine components:**
- Light stretching or gentle yoga (promotes muscle relaxation)
- Reading fiction (engages imagination without stimulating problem-solving)
- Progressive muscle relaxation (systematically tense and release muscle groups)
- Warm bath or shower (the subsequent cooling mimics natural body temperature drop)
- Meditation or deep breathing exercises (activates parasympathetic nervous system)
- Write in a gratitude journal (reduces anxiety and racing thoughts)

## Advanced Strategies for Sleep Onset and Quality

### Light Exposure Management

Properly manage light exposure throughout the day to strengthen your circadian rhythm. Get bright light exposure (preferably sunlight) within the first hour of waking for 15-30 minutes. This helps establish your circadian anchor point and improves evening melatonin production.

**Light management protocol:**
- Morning: Seek bright light exposure immediately upon waking
- Afternoon: Maintain normal indoor lighting, take breaks outdoors when possible
- Evening: Dim lights 2-3 hours before bedtime, use warm-toned bulbs (below 3000K)
- Night: Use minimal red-toned lighting if you must navigate in darkness

### Cognitive and Physical Preparation

Address the mental and physical factors that may be contributing to your morning fatigue. Poor sleep quality often results from insufficient deep sleep phases or frequent micro-awakenings that fragment sleep architecture.

**Optimization strategies:**
- Avoid caffeine after 2pm (caffeine has a 6-8 hour half-life)
- Stop eating large meals 3 hours before bedtime
- Limit alcohol consumption, especially within 4 hours of sleep
- Practice "brain dumping" - write down tomorrow's tasks or concerns 2 hours before bed
- Use relaxation techniques like 4-7-8 breathing (inhale 4, hold 7, exhale 8)

## Long-Term Habit Development Plan

### Week 1-2: Foundation Building

Focus exclusively on establishing consistent sleep and wake times. Research shows this is the single most important factor for sleep quality improvement. Track your progress using a sleep diary or wearable device to monitor sleep regularity index.

### Week 3-4: Digital Hygiene Mastery

Expand your digital sunset protocol and create lasting boundaries with technology. Studies show that using screens in bed increases insomnia risk by 59% and cuts sleep time by 24 minutes [9]. Develop alternative evening activities that promote relaxation.

### Week 5-8: Advanced Optimization

Fine-tune your sleep environment, nutrition timing, and stress management practices. Begin incorporating regular exercise (but not within 4 hours of bedtime) as physical activity improves sleep quality and reduces sleep onset time.

### Week 9-12: Sustainability and Troubleshooting

Develop strategies for maintaining your improved sleep habits during travel, stress, or schedule disruptions. Create backup plans for when your primary routine is compromised.

## Addressing Morning Fatigue Specifically

### Sleep Architecture Optimization

Morning fatigue despite adequate sleep duration often indicates poor sleep quality or inappropriate wake timing within your sleep cycle. Focus on waking during lighter sleep phases rather than deep sleep or REM phases.

**Strategies for refreshing mornings:**
- Use a smart alarm that wakes you during light sleep phases
- Maintain consistent wake times even if you go to bed late occasionally
- Get immediate bright light exposure upon waking
- Engage in light physical activity within 30 minutes of waking
- Avoid hitting snooze, which fragments sleep and increases grogginess

### Addressing Underlying Issues

If morning fatigue persists after 4-6 weeks of consistent sleep hygiene, consider evaluation for sleep disorders such as sleep apnea, restless leg syndrome, or circadian rhythm disorders. The American Academy of Sleep Medicine recommends professional evaluation for persistent sleep issues.

## Monitoring and Adjustment Protocol

### Tracking Metrics

Monitor these key indicators of sleep improvement:
- Sleep onset time (should decrease to under 20 minutes)
- Number of nighttime awakenings (should minimize)
- Morning energy levels (1-10 scale daily rating)
- Daytime alertness and cognitive performance
- Mood and emotional regulation

### Progressive Refinement

Adjust your plan based on objective feedback. If certain strategies aren't working after 2 weeks of consistent implementation, modify rather than abandon them. Sleep improvement is highly individual, and fine-tuning is essential for long-term success.

## Expected Timeline and Outcomes

Based on sleep research, you can expect:
- **Week 1-2:** Improved sleep onset time and reduced nighttime phone use
- **Week 3-4:** More consistent energy levels and easier morning awakening
- **Week 5-8:** Significant improvements in deep sleep quality and daytime alertness
- **Week 9-12:** Fully established habits with sustained improvements in overall sleep quality

The scientific evidence strongly supports that addressing sleep regularity, eliminating pre-sleep screen exposure, and implementing comprehensive sleep hygiene will significantly improve your sleep quality and morning energy levels. Consistency and patience are key, as circadian rhythm adjustments typically require 2-4 weeks to stabilize.

### Sources

[1] Sleep regularity as an important component of sleep hygiene - ScienceDirect: https://www.sciencedirect.com/science/article/abs/pii/S108707922500156X

[2] One hour's screen use after going to bed increases your risk of insomnia by 59% - Frontiers: https://www.frontiersin.org/news/2025/03/31/hours-screen-use-after-bed-increases-insomnia-risk-frontiers-psychiatry

[3] Blue Light and the Effect on Sleep - Physio-pedia: https://www.physio-pedia.com/Blue_Light_and_the_Effect_on_Sleep

[4] Electronic Screen Use and Sleep Duration and Timing in Adults - JAMA Network Open: https://jamanetwork.com/journals/jamanetworkopen/fullarticle/2831993

[5] Impacts of Blue Light Exposure From Electronic Devices - Chronobiology in Medicine: https://www.chronobiologyinmedicine.org/m/journal/view.php?number=167

[6] How Technology Impacts Sleep Quality - AAST: https://aastweb.org/how-technology-impactssleep-quality/

[7] Exploring the Role of Circadian Rhythms in Sleep and Recovery - PMC: https://pmc.ncbi.nlm.nih.gov/articles/PMC11221196/

[8] Screen Use Disrupts Precious Sleep Time - National Sleep Foundation: https://www.thensf.org/screen-use-disrupts-precious-sleep-time/

[9] Technology Impacts on Sleep Quality - Sleep Health Solutions: https://www.sleephealthsolutionsohio.com/blog/how-technology-use-decreases-sleep-quality/


Research workflow completed!


## Task 9: Understanding the Output

Let's break down what happened:

### Phase 1: Clarification
The system checked if your request was clear. Since you provided specific details about your sleep issues, it likely proceeded without asking clarifying questions.

### Phase 2: Research Brief
Your request was transformed into a detailed research brief that guides the supervisor's delegation strategy.

### Phase 3: Supervisor Delegation
The supervisor analyzed the brief and decided how to break down the research:
- Used `think_tool` to plan strategy
- Called `ConductResearch` to delegate to researchers
- Each delegation specified a focused research topic (e.g., sleep hygiene, circadian rhythm, blue light effects)

### Phase 4: Parallel Research
Researchers worked on their assigned topics:
- Each researcher used web search tools to gather information
- Used `think_tool` to reflect after each search
- Decided when they had enough information
- Compressed their findings into clean summaries
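
The delegation and parallel-research phases above can be sketched with plain `asyncio`. This is a minimal illustration under stated assumptions, not the Open Deep Research implementation: `fake_researcher` and `delegate` are hypothetical names, and the "research" is a stub that returns canned notes instead of calling a search tool or an LLM.

```python
import asyncio


async def fake_researcher(topic: str) -> str:
    """Hypothetical stand-in for one researcher sub-agent."""
    await asyncio.sleep(0.01)  # simulates search + reflection latency
    raw_notes = [f"finding about {topic} #{i}" for i in range(3)]
    # "Compression": collapse the raw notes into one short summary string.
    return f"{topic}: " + "; ".join(raw_notes)


async def delegate(topics: list[str]) -> list[str]:
    """Hypothetical supervisor step: fan out one researcher per topic."""
    # gather() runs the researchers concurrently and preserves input order.
    return await asyncio.gather(*(fake_researcher(t) for t in topics))


topics = ["sleep hygiene", "circadian rhythm", "blue light effects"]
# asyncio.run() is for scripts; in a notebook cell use `await delegate(topics)`.
summaries = asyncio.run(delegate(topics))
```

Each entry in `summaries` plays the role of one researcher's compressed findings, which the final report step would then combine.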

### Phase 5: Final Report
All research findings were synthesized into a comprehensive sleep improvement plan with:
- Well-structured sections
- Evidence-based recommendations
- Practical action items
- Sources for further reading

## Task 10: Key Takeaways & Next Steps

### Architecture Benefits
1. **Dynamic Decomposition** - Research structure emerges from the question, not predefined
2. **Parallel Efficiency** - Multiple researchers work simultaneously
3. **ReAct Quality** - Strategic reflection improves search decisions
4. **Scalability** - Handles token limits gracefully through compression
5. **Flexibility** - Easy to add new tools and capabilities

### When to Use This Pattern
- **Complex research questions** that need multi-angle investigation
- **Comparison tasks** where parallel research on different topics is beneficial
- **Open-ended exploration** where structure should emerge dynamically
- **Time-sensitive research** where parallel execution speeds up results

### When to Use Section-Based Instead
- **Highly structured reports** with predefined format requirements
- **Template-based content** where sections are always the same
- **Sequential dependencies** where later sections depend on earlier ones
- **Budget constraints** where token efficiency is critical

### Extend the System
1. **Add MCP Tools** - Integrate specialized tools for your domain
2. **Custom Prompts** - Modify prompts for specific research types
3. **Different Models** - Try different Claude versions or mix models
4. **Persistence** - Use a real database for checkpointing instead of memory

### Learn More
- [LangGraph Documentation](https://langchain-ai.github.io/langgraph/)
- [Open Deep Research Repo](https://github.com/langchain-ai/open_deep_research)
- [Anthropic Claude Documentation](https://docs.anthropic.com/)
- [Tavily Search API](https://tavily.com/)

## ❓ Question #3:

What are the trade-offs of using parallel researchers vs. sequential research? When might you choose one approach over the other?

##### Answer:

Parallel researchers are generally faster because research on different topics runs concurrently; sequential research is slower because each section builds on earlier ones. In addition, the parallel-researcher structure used in this implementation can decompose the user's question into research topics more organically, instead of following predefined sections or formats.

However, parallel researchers risk hitting LLM provider rate limits at high concurrency, and costs are less predictable and can be higher than with a section-based approach.

Choose parallel researchers when:
- sub-topics are largely independent and can be researched separately (e.g., comparison tasks)
- the research report needs to be generated quickly

Choose sequential research when:
- sub-topics depend on each other, with each one building on the results of earlier ones
- the report needs to follow a highly structured or templated format
- cost or rate limiting is a concern
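
The rate-limit trade-off can be made concrete with a bounded-concurrency sketch. It is only an analogy for a setting like `max_concurrent_research_units` in this notebook's config, not a description of how the library enforces it: `run_bounded` and its stub task are hypothetical, and `asyncio.sleep` stands in for real LLM and search calls.

```python
import asyncio


async def run_bounded(topics, max_concurrent):
    """Run one stub task per topic, never more than max_concurrent at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    active = 0  # tasks currently inside the semaphore
    peak = 0    # highest concurrency observed

    async def research(topic):
        nonlocal active, peak
        async with semaphore:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for LLM/search calls
            active -= 1
            return f"notes on {topic}"

    results = await asyncio.gather(*(research(t) for t in topics))
    return results, peak


topics = ["ergonomics", "strength training", "stretching", "aerobic exercise"]
# asyncio.run() is for scripts; in a notebook cell, await run_bounded(...) directly.
parallel_results, parallel_peak = asyncio.run(run_bounded(topics, max_concurrent=3))
sequential_results, sequential_peak = asyncio.run(run_bounded(topics, max_concurrent=1))
```

Both runs return the same notes; only the peak number of in-flight calls differs, which is the quantity a provider's rate limit cares about. Setting the limit to 1 degenerates into sequential research.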

## ❓ Question #4:

How would you adapt this deep research architecture for a production wellness application? What additional components would you need?

##### Answer:
- Add guardrails that check the draft report before it is returned to the user. Examples include detecting PII, checking regulatory compliance, fact-checking, filtering harmful or hateful content, and adding disclaimers to medical claims.
- After the report is generated, let the user ask follow-up questions and edit specific sections. For example, combine a checkpointer backed by a real database (to preserve short-term memory) with filesystem-backed context management (e.g., to store versioned reports).
- Add long-term memory that stores user context in a database so follow-up sessions also preserve context.
- Integrate more tools to improve research results. For example, let users upload their own sources such as text or PDF files; support URLs pointing to YouTube videos, blog posts, or Google Drive links; and add MCP tools to pull information from other providers.
- Improve research quality by instructing the research agents to prioritize credible sources (e.g., peer-reviewed papers and articles from authorities such as Harvard, Stanford, WHO, CDC, or highly cited researchers) over less credible ones such as random blogs or videos.
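
As a rough illustration of the first point, a guardrail step could run between report generation and delivery. Everything in this sketch is an assumption made for illustration: `apply_guardrails`, the email pattern, the keyword list, and the disclaimer text are hypothetical, and a production system would need far more thorough checks than a regex and a few keywords.

```python
import re

# Hypothetical patterns, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
MEDICAL_TERMS = ("diagnos", "treatment", "medication", "dosage")
DISCLAIMER = "This report is informational and is not medical advice."


def apply_guardrails(draft: str) -> dict:
    """Return the checked report plus a list of issues that were handled."""
    issues = []
    report = draft
    # Redact anything that looks like an email address.
    if EMAIL_RE.search(report):
        issues.append("pii:email")
        report = EMAIL_RE.sub("[redacted email]", report)
    # Append a disclaimer when medical-sounding terms appear.
    lowered = report.lower()
    if any(term in lowered for term in MEDICAL_TERMS) and DISCLAIMER not in report:
        issues.append("medical:disclaimer_added")
        report = f"{report}\n\n{DISCLAIMER}"
    return {"report": report, "issues": issues}


result = apply_guardrails("Contact jane@example.com about medication timing.")
```

Returning the list of issues alongside the report lets the application log what was changed or route flagged reports to human review.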


## 🏗️ Activity #2: Custom Wellness Research

Using what you've learned, run a custom wellness research task.

**Requirements:**
1. Create a wellness-related research question (exercise, nutrition, stress, etc.)
2. Modify the configuration for your use case
3. Run the research and analyze the output
4. Document what worked well and what could be improved

**Experiment ideas:**
- Research exercise routines for specific conditions (bad knee, lower back pain)
- Compare different stress management techniques
- Investigate nutrition strategies for specific goals
- Explore meditation and mindfulness research

**YOUR CODE HERE**

In [20]:
os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")

In [23]:
# Optional: LangSmith for tracing
from uuid import uuid4

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE9 - Deep Research Agent - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key (press Enter to skip): ") or ""

if not os.environ["LANGCHAIN_API_KEY"]:
    os.environ["LANGCHAIN_TRACING_V2"] = "false"
    print("LangSmith tracing disabled")
else:
    print(f"LangSmith tracing enabled. Project: {os.environ['LANGCHAIN_PROJECT']}")

LangSmith tracing enabled. Project: AIE9 - Deep Research Agent - 0aa3bc57


In [27]:
# YOUR CODE HERE
# Create your own wellness research request and run it

my_wellness_request = """
I'm a 30-year-old software engineer who sits 8+ hours daily and has developed 
lower back pain and neck stiffness over the past year.

Current situation:
- Work schedule is 9am-6pm, 5 days a week, very intense during the working hours.
- Tried a standing desk briefly but stopped due to leg fatigue

Goal: Find evidence-based ways to reduce my lower back pain and neck stiffness 
that realistically fit into a demanding work schedule. I need solutions I can do 
during work hours AND outside work hours.

Compare three approaches:
1. Ergonomic solutions (e.g.,chairs, monitor positioning, desk setup)
2. Strength training targeting back pain
3. Stretching routines and aerobic exercises

For each approach, I want to know: how strong is the evidence, how quickly 
can I expect results, and what's the time commitment? 

Finally, combine the research results into a practical weekly plan.
"""

# Optionally modify the config
my_config = {
    "configurable": {
        "research_model": "anthropic:claude-sonnet-4-5-20250929", # changed to sonnet 4.5 for better performance
        "research_model_max_tokens": 10000,
        "compression_model": "anthropic:claude-sonnet-4-5-20250929", # changed to sonnet 4.5 for better performance
        "compression_model_max_tokens": 8192,
        "final_report_model": "anthropic:claude-sonnet-4-5-20250929", # changed to sonnet 4.5 for better performance
        "final_report_model_max_tokens": 10000,
        "summarization_model": "openai:gpt-4o-mini", # changed to gpt-4o-mini/haiku for faster and cheaper response with separate rate limit from sonnet 4.5
        "summarization_model_max_tokens": 8192,
        "allow_clarification": True,
        "max_concurrent_research_units": 3, # increased to 3 for parallel research
        "max_researcher_iterations": 4, # increased to 4 for more iterations of supervisor (e.g., think_tool, and ConductResearch calls)
        "max_react_tool_calls": 3, # increased to 3 for more tool calling for each researcher
        "search_api": "tavily",
        "max_content_length": 25000, # reduced from 50000 to 25000 for avoiding summarization timeout by tavily
        "thread_id": str(uuid.uuid4())
    }
}
async def run_custom_research(wellness_request, config):
    """Run the research workflow and display results."""
    print("Starting research workflow...\n")
    
    async for namespace, event in graph.astream(
        {"messages": [{"role": "user", "content": wellness_request}]},
        config,
        stream_mode="updates",
        subgraphs=True
    ):
        for node_name, node_output in event.items():
            level = "main" if not namespace else "supervisor"
            
            print(f"\n{'='*60}")
            print(f"[{level}] Node: {node_name}")
            print(f"{'='*60}")

            if node_output is None:
                continue
            
            if node_name == "clarify_with_user":
                if "messages" in node_output:
                    last_msg = node_output["messages"][-1]
                    print(f"\n{last_msg.content}")
            
            elif node_name == "write_research_brief":
                if "research_brief" in node_output:
                    print(f"\nResearch Brief Generated:")
                    print(f"{node_output['research_brief'][:500]}...")
            
            elif node_name == "supervisor":
                print(f"\nSupervisor planning research strategy...")
                if "supervisor_messages" in node_output:
                    last_msg = node_output["supervisor_messages"][-1]
                    if hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
                        print(f"Tool calls: {len(last_msg.tool_calls)}")
                        for tc in last_msg.tool_calls:
                            print(f"  - {tc['name']}: {tc['args'].get('research_topic', tc['args'].get('reflection', ''))[:300]}")
                    elif hasattr(last_msg, 'content') and last_msg.content:
                        print(f"Response: {str(last_msg.content)[:400]}")
            
            elif node_name == "supervisor_tools":
                print(f"\nExecuting supervisor's tool calls...")
                if "supervisor_messages" in node_output:
                    for msg in node_output["supervisor_messages"]:
                        if hasattr(msg, 'content'):
                            print(f"  [{msg.name}]: {str(msg.content)[:300]}")
                if "notes" in node_output:
                    print(f"Research notes collected: {len(node_output['notes'])}")
                if "raw_notes" in node_output:
                    print(f"Raw notes collected: {len(node_output['raw_notes'])}")
            
            elif node_name == "researcher":
                if "researcher_messages" in node_output:
                    last_msg = node_output["researcher_messages"][-1]
                    if hasattr(last_msg, 'tool_calls') and last_msg.tool_calls:
                        print(f"Researcher tool calls: {len(last_msg.tool_calls)}")
                        for tc in last_msg.tool_calls:
                            print(f"  - {tc['name']}: {tc['args'].get('query', tc['args'].get('reflection', ''))[:200]}")
            
            elif node_name == "researcher_tools":
                if "researcher_messages" in node_output:
                    for msg in node_output["researcher_messages"]:
                        if hasattr(msg, 'name') and hasattr(msg, 'content'):
                            print(f"  [{msg.name}]: {str(msg.content)[:200]}...")
            
            elif node_name == "compress_research":
                if "compressed_research" in node_output:
                    print(f"\nCompressed research ({len(node_output['compressed_research'])} chars):")
                    print(f"{node_output['compressed_research'][:400]}...")
            
            elif node_name == "final_report_generation":
                if "final_report" in node_output:
                    print(f"\n" + "="*60)
                    print("FINAL REPORT GENERATED")
                    print("="*60 + "\n")
                    display(Markdown(node_output["final_report"]))
    
    print("\n" + "="*60)
    print("Research workflow completed!")
    print("="*60)

await run_custom_research(my_wellness_request, my_config)


Starting research workflow...


[main] Node: clarify_with_user

Thank you for providing comprehensive details about your situation. I have all the information needed to proceed with your research request.

**What I understand:**
- You're a 30-year-old software engineer with lower back pain and neck stiffness from prolonged sitting (8+ hours daily)
- You work 9am-6pm, 5 days/week with intense hours
- Previous attempt with standing desk didn't work due to leg fatigue
- You need evidence-based solutions that fit both during and outside work hours

**Research scope:**
I will compare three approaches (ergonomic solutions, strength training, and stretching/aerobic exercises) and evaluate each based on:
1. Strength of evidence
2. Expected timeline for results
3. Time commitment required

I will then synthesize the findings into a practical weekly plan tailored to your demanding work schedule.

**Starting research now.**

[main] Node: write_research_brief

Research Brief Generated:
I'm a 30-ye

# Comprehensive Evidence-Based Solutions for Lower Back Pain and Neck Stiffness in Sedentary Software Engineers

## Executive Summary

Lower back pain and neck stiffness from prolonged sitting affect a substantial portion of office workers, with back pain being the leading cause of job-related disability in the United States. This comprehensive analysis examines three evidence-based approaches (ergonomic solutions, strength training, and stretching/aerobic exercise) to address these issues for a 30-year-old software engineer working 8+ hours daily. The research reveals that **no single approach is superior**; rather, a multimodal strategy combining all three interventions provides optimal results. The evidence shows that ergonomic interventions reduce pain by approximately 28% (Visual Analog Scale), strength training can decrease neck pain by 35-45%, and targeted exercise programs significantly improve both pain and functional outcomes within 4-12 weeks.

---

## Approach 1: Ergonomic Solutions

### Strength of Scientific Evidence

The evidence supporting ergonomic interventions for musculoskeletal pain is substantial but shows mixed results depending on the specific intervention and measurement outcomes.

A 2025 systematic review and meta-analysis published in the Journal of Clinical Medicine evaluated 24 randomized controlled trials with 4,086 participants across diverse workplace environments. The study found **significant reductions in pain intensity** with a mean difference in Visual Analog Scale (VAS) scores of -0.28 (95% CI: -0.43, -0.14, p = 0.0001). Most importantly for your situation, ergonomic interventions significantly reduced musculoskeletal disorders-related pain in the lower back with an odds ratio of 0.53 (95% CI: 0.40–0.70, p < 0.00001), representing approximately a **47% reduction in lower back pain odds** [1].

However, a critical finding emerged from another systematic review: when comparing interventions for office workers already experiencing symptoms, **exercise showed stronger evidence than ergonomic interventions alone**. A review of 27 randomized controlled trials found moderate-quality evidence that neck/shoulder strengthening exercises were effective for symptomatic office workers, while ergonomic interventions showed only low-quality evidence of effectiveness [2]. This suggests that for someone like you who already has developed pain, ergonomics should be viewed as a foundational element rather than a standalone solution.

The effectiveness varies by intervention type. While physical activity combined with ergonomics may lessen back pain intensity, one network meta-analysis found these effects were often minor and statistically insignificant for preventing back pain, suggesting **limited effectiveness for prevention alone** but greater value for symptom management [3].

### Timeline for Results

Ergonomic improvements offer both immediate and progressive benefits:

**Immediate Relief (Same Day to 1 Week):**
- Proper monitor positioning can provide immediate reduction in neck strain
- Correct chair adjustments offer instant lumbar support improvements
- Keyboard and mouse repositioning reduces wrist and elbow strain within hours

**Short-Term Improvements (2-6 Weeks):**
A 2025 study examining office chair designs found that heavy-duty ergonomic chairs with swivel mechanisms led to **longer comfort times before discomfort set in**, particularly in legs and neck, with users reporting delayed body response to discomfort [4]. The transition period typically requires 2-4 weeks for adaptation.

**Long-Term Benefits (3-6 Months):**
Research on sit-stand desks showed **significant reductions in musculoskeletal discomfort and post-work fatigue over six months**, along with improvements in overall well-being [5]. Another six-month intervention with sit-stand workstations resulted in notable benefits including a reduction in waist circumference (-1.81 cm, p = 0.04) and improvements in body composition, fatigue, and quality of life [6].

### Time Commitment Required

One of the most attractive aspects of ergonomic solutions is the minimal ongoing time commitment:

**Initial Setup: 2-4 Hours (One-Time)**
- Chair adjustment and setup: 30-45 minutes
- Monitor positioning and desk organization: 30-60 minutes
- Keyboard and mouse positioning: 15-30 minutes
- Sit-stand desk protocol establishment: 30-60 minutes
- Professional ergonomic assessment (recommended): 60-90 minutes

**Daily Ongoing Time: Minimal Active Effort**
- Sit-stand transitions: 2-5 minutes spread throughout the day
- Posture checks and micro-adjustments: 30 seconds every hour
- No dedicated exercise time required for basic ergonomics

**Weekly Maintenance: 5-10 Minutes**
- Reviewing and adjusting workstation setup as needed
- Checking that ergonomic equipment remains properly positioned

### Specific Implementation Recommendations

#### Office Chair Selection and Setup

Research consistently shows that chairs meeting ergonomic requirements reduce the severity, intensity, and frequency of musculoskeletal symptoms. Essential ergonomic features include [7]:

**Critical Chair Features:**
- **Dynamic lumbar support**: Adjustable lower back support that follows your spine's natural curve
- **Adjustable seat depth and height**: Seat height allowing feet flat on floor with thighs parallel to ground; seat depth leaving 2-4 inches between seat edge and back of knees
- **High backrest with recline function**: Backrest reaching at least to shoulder blades, with 110-130 degree recline capability
- **Multi-directional armrests**: Adjustable in height, width, and angle to support forearms without shoulder elevation
- **Pressure-relieving seat design**: Waterfall edge and adequate cushioning to prevent pressure on thighs
- **Breathable materials**: Mesh or fabric that prevents heat buildup during long sitting periods

A 2025 study analyzing 426 office workers found that **heavy-duty chairs with swivel mechanisms (group 1) and lighter ergonomic chairs (group 2) led to significantly longer comfort times before discomfort**, while regular wooden and plastic chairs resulted in significant discomfort across all body parts [4].

#### Monitor Positioning Guidelines

Improper monitor placement forces operators into awkward positions, directly contributing to neck pain and stiffness. Evidence-based positioning includes [8, 9]:

**Optimal Monitor Setup:**
- **Height**: Top of screen at or below eye level, with the monitor positioned **15-20 degrees below horizontal eye level** for optimal viewing
- **Distance**: 20-40 inches (arm's length) from your eyes to the screen
- **Placement**: Directly in front of you to avoid neck rotation
- **Angle**: Slight backward tilt (10-20 degrees) to reduce glare and maintain neutral neck posture

For multiple monitors, position the primary monitor directly in front and the secondary at an angle to minimize neck strain [8]. The Ergonomics Guidelines for Computer Workstations emphasize setting the screen at or below eye level and maintaining approximately 18 inches distance from the eyes [10]. CDC guidance specifically recommends having the top of the screen at or below eye level, positioned **20 degrees below eye level** [11].

#### Keyboard and Mouse Positioning

Research demonstrates that keyboard distance significantly impacts wrist and elbow joint posture. A study using motion capture technology found that **when the keyboard was 8 cm away from the desk edge, wrist flexion decreased 83% and radial deviation decreased 89%**, with wrist flexion decreasing by up to 94% with optimal positioning [12].

**Evidence-Based Positioning:**
- **Keyboard height**: Maximum 2.5 inches above the workstation surface
- **Keyboard distance**: 8-15 cm from the desk edge to maintain neutral wrist position
- **Elbow angle**: Maintain 90-degree angle or slightly greater (90-110 degrees)
- **Mouse placement**: Same height and close to keyboard to minimize reaching
- **Wrist position**: Neutral (straight), avoiding flexion, extension, or radial/ulnar deviation
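
The monitor and keyboard ranges above lend themselves to a quick self-check. The following is a minimal sketch, assuming you have measured your own setup. The function name `check_workstation` and the dictionary keys are hypothetical, and the thresholds are simply the values listed in this report.

```python
# Ranges taken from the positioning guidelines in this report.
# Dictionary keys and the function name are illustrative assumptions.
RECOMMENDED_RANGES = {
    "monitor_distance_in": (20, 40),   # inches from eyes to screen
    "keyboard_distance_cm": (8, 15),   # cm from desk edge
    "elbow_angle_deg": (90, 110),      # degrees
}


def check_workstation(measurements):
    """Return a dict mapping each measured item to True if it is in range."""
    results = {}
    for name, value in measurements.items():
        low, high = RECOMMENDED_RANGES[name]
        results[name] = low <= value <= high
    return results


# Hypothetical measurements, for illustration only.
example = {
    "monitor_distance_in": 18,
    "keyboard_distance_cm": 10,
    "elbow_angle_deg": 95,
}
for item, ok in check_workstation(example).items():
    print(f"{item}: {'within range' if ok else 'outside range'}")
```

Items flagged as outside range are the ones to adjust first; the check says nothing about features that need visual judgment, such as wrist deviation.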

CDC research identified that **keyboard height above elbow height was the primary factor for neck and shoulder symptoms**, while radial deviation was the only postural factor for hand and arm disorders [13]. Using **compact keyboards** allows less stretching between keyboard and mouse, and **vertical mice** encourage a neutral hand position, reducing strain [14].

#### Sit-Stand Desk Protocols

Your previous experience with standing desk leg fatigue is common and can be addressed through proper protocols. The key is **gradual progression and alternating positions** rather than prolonged standing.

**Evidence-Based Sit-Stand Schedule:**

Research examining four different sit-stand schedules (90-min sitting/30-min standing, 80/40, 105/15, and 60/60) found that **office workers prefer sit/stand duration ratios between 1:1 and 3:1**, with longer standing periods showing a tendency to reduce muscle fatigue [15, 16]. However, the study noted that more active break-time activities seemed more effective than simply standing.

**Recommended Progressive Protocol:**

*Week 1-2 (Adaptation Phase):*
- 45 minutes sitting / 15 minutes standing
- Repeat 4-5 times during workday
- Total standing: 60-75 minutes daily

*Week 3-4 (Building Tolerance):*
- 40 minutes sitting / 20 minutes standing
- Repeat 4-5 times during workday
- Total standing: 80-100 minutes daily

*Week 5+ (Maintenance):*
- 30-35 minutes sitting / 25-30 minutes standing
- Repeat 4-5 times during workday
- Total standing: 100-150 minutes daily
- Longer-term target: build toward 2-4 hours of standing time in an 8-hour workday
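
The daily standing totals in each phase are the standing minutes per cycle multiplied by the number of cycles. The following is a small sketch of that arithmetic, using only the figures from the protocol above; the names `PHASES` and `daily_standing_range` are illustrative.

```python
# Phases from the progressive protocol above:
# (label, standing minutes per cycle (min, max), cycles per workday (min, max)).
PHASES = [
    ("Week 1-2", (15, 15), (4, 5)),
    ("Week 3-4", (20, 20), (4, 5)),
    ("Week 5+", (25, 30), (4, 5)),
]


def daily_standing_range(stand_minutes, cycles):
    """Return (min, max) total standing minutes per day."""
    return stand_minutes[0] * cycles[0], stand_minutes[1] * cycles[1]


for label, stand, cycles in PHASES:
    low, high = daily_standing_range(stand, cycles)
    print(f"{label}: {low}-{high} minutes standing per day")
```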

**To Prevent Leg Fatigue:**
- Use an **anti-fatigue mat** (essential for reducing leg discomfort)
- **Shift weight** between feet every 10-15 minutes while standing
- Perform **calf raises and ankle circles** during standing periods
- Wear **supportive footwear** (avoid hard-soled dress shoes during standing)
- **Gradually increase** standing time over 4-6 weeks
- Include **walking breaks** rather than just standing (see sit-stand-walk intervention below)

A study on sit-stand-walk interventions found that a **sit-stand-walk group reported significantly lower levels of musculoskeletal discomfort and perceived physical fatigue** compared to those in stationary sitting or standing positions [17]. This suggests incorporating brief walks (even just around your desk area) during transition periods.

#### Lighting and Additional Ergonomic Factors

Proper lighting reduces eye strain and associated tension in neck and shoulder muscles. Key recommendations include [10]:

- Position monitor perpendicular to windows to minimize glare
- Use indirect or adjustable task lighting rather than harsh overhead lights
- Adjust screen brightness to match ambient lighting
- Use blue-light filtering if working during evening hours
- Take visual breaks using the 20-20-20 rule (every 20 minutes, look at something 20 feet away for 20 seconds)

---

## Approach 2: Strength Training for Back Pain and Neck Stiffness

### Strength of Scientific Evidence

Strength training specifically targeting back and neck musculature has **moderate to strong evidence** for reducing pain in symptomatic office workers, with clear superiority over ergonomic interventions alone for those already experiencing symptoms.

#### Evidence for Lower Back Pain

A systematic review and meta-analysis of 60 studies examining resistance exercise training in sedentary office workers found **modest reductions in neck discomfort (SMD = -1.76) and shoulder discomfort (SMD = -13.29)**, along with improvements in shoulder muscle strength (SMD = 4.13) and neck extensor strength (SMD = 9.07). However, the review noted high uncertainty due to heterogeneity and bias [18].

More compelling evidence comes from targeted interventions. A randomized controlled trial investigating **core stabilization exercises versus conventional physiotherapy for chronic non-specific low back pain** in 36 participants showed that the core exercise group demonstrated:
- **4.4-point decrease in activity-related pain** (versus 1.8 points in conventional therapy)
- **31-point improvement in functional disability** (versus 10 points in conventional therapy)
- **Greater increase in lumbar multifidus muscle cross-sectional area** at L3-L5 levels
- **Decreased fat infiltration** in lumbar muscles
- **Superior improvements lasting 12 weeks** post-program (versus 4 weeks for strengthening alone) [19]

A comprehensive systematic review published in Frontiers in Physiology (October 2025) compared three core training modalities (Pilates, core stability, and core resistance training) across 57 randomized controlled trials with over 7,700 participants. The findings revealed:
- **Pilates training showed optimal pain relief effects** (SMD = 0.75; 95% CI: 0.58-0.92)
- **Core resistance training demonstrated the most significant effects for functional status improvement** (SMD = 0.76; 95% CI: 0.55-0.97)
- **Core stability training also effective** (SMD = 0.53) [20]
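
To put these standardized mean differences in context, one common (and rough) convention from Cohen treats about 0.2 as small, 0.5 as medium, and 0.8 as large. The sketch below applies that convention to the values reported above. The convention is brought in here for illustration and is not something the cited review is stated to use; the interval check only tests whether a confidence interval excludes zero.

```python
def cohen_label(smd):
    """Label an effect size using Cohen's conventional cut-offs."""
    size = abs(smd)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


def excludes_zero(ci_low, ci_high):
    """True if a confidence interval lies entirely on one side of zero."""
    return ci_low > 0 or ci_high < 0


# Values as reported in the review summary above.
reported = {
    "Pilates (pain relief)": (0.75, (0.58, 0.92)),
    "Core resistance (function)": (0.76, (0.55, 0.97)),
    "Core stability": (0.53, None),  # no interval given in the text
}

for name, (smd, ci) in reported.items():
    note = ""
    if ci is not None:
        note = f", CI excludes zero: {excludes_zero(*ci)}"
    print(f"{name}: SMD {smd} -> {cohen_label(smd)}{note}")
```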

Another randomized controlled trial comparing core stabilization exercises (CSE) versus strengthening exercises (STE) in 36 patients with subacute nonspecific low back pain found **significantly greater improvements in the CSE group** regarding proprioception, balance, trunk muscle thickness, functional disability reduction, and fear of movement reduction [21].

A meta-analysis by Steffens et al. found that **exercise prevented lower back pain by 35-45% and reduced sick leave due to lower back pain by 25-75%** [22].

#### Evidence for Neck Pain and Stiffness

For neck pain specifically, the evidence strongly favors strengthening exercises over other interventions. A systematic review and meta-analysis examining exercise therapy for office workers with non-specific neck pain included eight randomized controlled trials with an average quality score of 6.63/10 on the PEDro scale. The findings showed:
- **Strengthening exercises led to clinically significant pain reduction** compared to no exercise
- **Five studies showed significant improvements** in both neck pain and quality of life
- **Endurance and stretching exercises showed limited effects** on pain and quality of life [23]

The KTDRR systematic review evaluating 27 randomized controlled trials found **moderate-quality evidence that neck/shoulder strengthening exercises and general fitness training were effective** in reducing neck pain in symptomatic office workers, with larger effect sizes for strengthening exercises and **greater engagement correlating with better outcomes** [2].

A one-year study published in JAMA demonstrated that **participation in endurance and strength training programs led to considerable reduction in average neck pain and disability** [24].

A dose-response study published in BMC Sports Science, Medicine and Rehabilitation found that daily bouts of **specific high-intensity resistance training of the shoulder and neck region reduced pain levels by 25% (mean pain) and 43% (worst pain)**, with a **10.6% increase in health-related quality of life**. Notably, both 10-minute and 2×10-minute daily training durations were equally effective [25].

### Timeline for Results

The timeline for strength training results varies based on pain chronicity and exercise adherence:

#### Lower Back Pain Timeline

**Early Improvements (2-4 Weeks):**
Most patients with **acute or mild lower back conditions notice improvements within 2-4 weeks** of consistent exercise [26]. A 10-week dynamic back extension training program showed a **21% increase in back muscle strength** after training [27].

**Moderate Improvements (4-8 Weeks):**
For subacute and chronic conditions, **most patients see improvements within 6-8 weeks** [26]. Research indicates that **muscle strength effectiveness increased and lasted for 8 weeks** after exercise programs [28].

**Substantial Improvements (8-12 Weeks):**
Chronic issues typically require **8-12 weeks or more** for significant improvements [26]. However, **core stabilization exercises showed superior results lasting 12 weeks post-program** compared to 4 weeks for conventional strengthening [19].

**Long-Term Maintenance:**
The SV Proactive Physical Therapy guidance notes that **chronic pain may require several months of therapy with gradual progress**, emphasizing that healing is a journey requiring patience and realistic expectations [29].

#### Neck Pain Timeline

Research on neck pain shows similar timelines:

**Short-Term (3-4 Weeks):**
A cervical and scapula-focused resistance exercise program conducted over **four weeks showed significant improvements** in pain levels (VAS), cervical range of motion, upper trapezius tone, neck disability index, and quality of life compared to massage therapy [30].

**Medium-Term (12 Weeks):**
A secondary analysis of the NEXpro trial evaluating neck exercises with health promotion interventions showed **significant improvements in headache occurrence and frequency over a 12-week intervention**, with statistically significant benefits particularly in earlier intervention periods [31].

### Time Commitment Required

Strength training requires consistent investment but is achievable within a demanding work schedule:

#### Per Session Time Commitment

**Lower Back Core Training:**
- Warm-up: 5 minutes
- Core stabilization exercises: 15-20 minutes
- Cool-down and stretching: 5 minutes
- **Total per session: 25-30 minutes**

**Neck and Shoulder Strengthening:**
- Warm-up: 3-5 minutes
- Resistance exercises: 10-20 minutes
- Cool-down: 2-5 minutes
- **Total per session: 15-30 minutes**

**Workplace-Based High-Intensity Training:**
- Quick sessions: **10 minutes** (can be split into 2×5-minute bouts)
- Can be performed at desk or in office break room
- **Total: 10-20 minutes daily**

#### Weekly Time Commitment

Based on systematic review findings and optimal dosage research:

**Lower Back Training:**
- **Frequency: 3 times per week minimum**
- **Duration: 25-30 minutes per session**
- **Total weekly: 75-90 minutes**

**Neck and Shoulder Training:**
- **Frequency: 3-5 days per week**
- **Duration: 15-20 minutes per session**
- **Total weekly: 45-100 minutes**

**Combined Integrated Program:**
- **Frequency: 3-4 days per week**
- **Duration: 30-40 minutes per session**
- **Total weekly: 90-160 minutes (1.5-2.7 hours)**

The dose-response study noted that **no significant difference existed between 10-minute and 2×10-minute daily training durations** for effectiveness, providing flexibility for fitting training into your schedule [25].
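
The weekly totals listed above are the product of session frequency and session duration. The following is a minimal sketch of that calculation, using only the figures from this section; the names `weekly_minutes` and `PROGRAMS` are illustrative.

```python
def weekly_minutes(sessions_per_week, minutes_per_session):
    """Multiply (min, max) frequency by (min, max) duration."""
    return (
        sessions_per_week[0] * minutes_per_session[0],
        sessions_per_week[1] * minutes_per_session[1],
    )


# (sessions per week, minutes per session) as listed above.
PROGRAMS = {
    "Lower back": ((3, 3), (25, 30)),
    "Neck and shoulder": ((3, 5), (15, 20)),
    "Combined": ((3, 4), (30, 40)),
}

for name, (freq, duration) in PROGRAMS.items():
    low, high = weekly_minutes(freq, duration)
    print(f"{name}: {low}-{high} min/week ({low / 60:.1f}-{high / 60:.1f} h)")
```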

### Specific Implementation Recommendations

#### Core Stabilization Exercises for Lower Back Pain

Based on the evidence from multiple randomized controlled trials, core stabilization exercises should form the foundation of your lower back pain management:

**Deep Core Activation Exercises (Perform Daily):**

1. **Transverse Abdominis Activation (Abdominal Drawing-In)**
   - Lie on back with knees bent, feet flat
   - Draw belly button toward spine without moving pelvis or spine
   - Hold 10 seconds, breathe normally
   - Sets: 3 sets of 10 repetitions
   - Frequency: Daily, can be done during work breaks lying on office floor or at home

2. **Quadruped Arm/Leg Raises (Bird Dogs)**
   - Start on hands and knees, spine neutral
   - Extend right arm and left leg simultaneously, maintaining level pelvis
   - Hold 10 seconds, return to start
   - Alternate sides
   - Sets: 3 sets of 10 repetitions per side
   - Frequency: 3-4 times weekly

3. **Plank Variations**
   - Begin with forearm plank (20-30 seconds)
   - Progress to 60-second holds
   - Add side planks for oblique engagement
   - Sets: 3 sets, holding to fatigue
   - Frequency: 3-4 times weekly

4. **Bridge Exercise**
   - Lie on back, knees bent, feet flat
   - Lift hips while engaging glutes and core
   - Hold 10-15 seconds, lower slowly
   - Sets: 3 sets of 12-15 repetitions
   - Frequency: 3-4 times weekly

5. **Dead Bug Exercise**
   - Lie on back, arms extended toward ceiling, knees bent at 90 degrees
   - Lower right arm and left leg simultaneously while maintaining neutral spine
   - Return to start, alternate sides
   - Sets: 3 sets of 10 repetitions per side
   - Frequency: 3-4 times weekly

**Progressive Resistance Core Training (3x/week):**

The systematic review comparing core training modalities found that **core resistance training demonstrated the most significant and stable effects for functional status improvement** [20]. Implement these progressively:

- **Weeks 1-2**: Focus on form and activation with bodyweight exercises
- **Weeks 3-4**: Add light resistance (resistance bands)
- **Weeks 5-8**: Progress to moderate resistance with dumbbells or cable machines
- **Weeks 9-12**: Increase resistance and complexity with compound movements

**Training Parameters:**
- **Frequency**: 3 times per week minimum (optimal: 3-4 times)
- **Duration**: 45-60 minutes per session (including warm-up/cool-down)
- **Intensity**: Moderate to high (able to complete full set with good form while feeling challenged)

#### Neck and Shoulder Strengthening Exercises

Research strongly supports targeted neck and shoulder strengthening for office workers with neck pain. Clinical trials have identified specific effective exercises:

**Neck Strengthening Protocol (Based on Clinical Evidence):**

1. **Deep Neck Flexor Strengthening**
   - Lie on back with small towel roll under neck
   - Perform chin tuck (bring chin toward chest without lifting head)
   - Hold 10 seconds, maintaining light pressure
   - Sets: 3 sets of 10 repetitions
   - Frequency: Daily (can be done during lunch break lying on office floor)

2. **Scapular Retraction with Resistance Band**
   - Attach resistance band at chest height
   - Hold band with both hands, arms extended
   - Pull band back by squeezing shoulder blades together
   - Hold 2-3 seconds, return slowly
   - Sets: 3 sets of 12-15 repetitions
   - Frequency: 3-5 times weekly
   - **Evidence**: Identified in multiple RCTs as effective for neck pain [32]

3. **Cervical Extension Strengthening**
   - Sit upright, place hand on back of head
   - Gently press head backward against hand resistance
   - Hold 5-7 seconds without moving neck
   - Sets: 3 sets of 10 repetitions
   - Frequency: 3-5 times weekly

4. **Shoulder Shrugs with Resistance**
   - Hold dumbbells at sides (start light: 5-10 lbs)
   - Elevate shoulders toward ears, hold 2 seconds
   - Lower slowly over 3-4 seconds
   - Sets: 3 sets of 12-15 repetitions
   - Frequency: 3 times weekly

5. **Prone Y-T-W Exercises**
   - Lie face down on bench or bed
   - Perform "Y" position: arms overhead forming Y shape
   - Perform "T" position: arms out to sides at shoulder height
   - Perform "W" position: elbows bent, hands by shoulders
   - Hold each position 3-5 seconds
   - Sets: 3 sets of 10 repetitions each position
   - Frequency: 3 times weekly

**Workplace-Based High-Intensity Protocol:**

The dose-response study found that **daily 10-minute high-intensity resistance training** sessions were effective for reducing neck and shoulder pain [25]:

**Daily Office-Based Routine (10 minutes):**
- Resistance band shoulder exercises: 3 minutes
- Neck isometric holds: 2 minutes
- Scapular strengthening: 3 minutes
- Dynamic stretching: 2 minutes

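This routine can be split into **two 5-minute sessions** (mid-morning and mid-afternoon) to fit the workday, although the comparison reported above was between 10-minute and 2×10-minute daily durations [25].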

#### Important Training Principles

Based on clinical practice guidelines and systematic reviews:

1. **Progressive Overload**: Gradually increase resistance, repetitions, or hold times every 1-2 weeks
2. **Form Over Intensity**: Maintain proper form throughout; reduce resistance if form breaks down
3. **Consistency**: Regular training 3-5 times weekly is more effective than sporadic intense sessions
4. **Adequate Recovery**: Allow 24-48 hours between intense sessions for the same muscle groups
5. **Address Fear-Avoidance**: The CSE study found reduced fear of movement; don't avoid exercises due to mild discomfort [21]
6. **Individualization**: Clinical guidelines emphasize that **individually tailored approaches are preferable** [33]

---

## Approach 3: Stretching Routines and Aerobic Exercises

### Strength of Scientific Evidence

The evidence for stretching and aerobic exercise shows more nuanced results, with effectiveness varying significantly by intervention type, duration, and whether used for prevention versus treatment.

#### Evidence for Stretching

A systematic review investigating the impact of physical exercise on low back pain in office workers analyzed 11 articles with diverse exercise protocols lasting from 6 weeks to 12 months. Results indicated that **exercise markedly improves lower back pain symptoms, flexibility, and quality of life**, especially with supervised or video-supported interventions. The review found that **low back pain affects approximately 34% of office workers** and that **physical exercise is strongly recommended for lower back pain management**. Recommended interventions included **short-duration exercises (10-15 minutes) performed 3-5 days a week during work hours** [34].

However, for neck pain specifically, the evidence for stretching is less robust. A systematic review and meta-analysis examining exercise effectiveness for office workers with non-specific neck pain found that **strengthening exercises had clinically significant effects on pain reduction, whereas endurance and stretching exercises showed limited effects** on both pain and quality of life [23]. Another systematic review of nine randomized controlled trials found **strong evidence for muscle strengthening and endurance exercises in treating neck pain**, but noted that **no exercise type was identified as being effective in the prevention** of nonspecific neck pain [35].

#### Evidence for Aerobic Exercise

The evidence for aerobic exercise in managing chronic low back pain is substantial. A Cochrane Review protocol emphasizes that **clinical guidelines recommend exercise as the first line of care for chronic low back pain**, with aerobic activities being among the diverse modalities studied [36].

A systematic review protocol investigating aerobic exercise effects on chronic non-specific low back pain noted that **aerobic exercises are among the best treatment options for chronic low back pain, reducing pain and disability in patients**. The review highlighted that chronic low back pain affects approximately 39% of the population during their lifetime and **costs the U.S. healthcare system over 100 billion dollars annually** [37].

A study comparing stabilization and muscle strengthening exercise programs in 70 sedentary women with chronic low back pain found that **both types of exercises significantly reduced pain and functional disability** over a 20-week program. Notably, **stabilization exercises showed greater effectiveness with longer-lasting results** (effects lasting 12 weeks compared to 4 weeks for muscle strengthening), and **both programs positively affected muscle strength for 8 weeks post-completion** [28].

The International Association for the Study of Pain (IASP) emphasizes that **exercise therapy is a key component of effective chronic low back pain management**, with evidence-based treatment recommendations advocating non-pharmacological approaches to improve physical function and overall quality of life. Importantly, **all chronic low back pain guidelines acknowledge that no one particular exercise modality is superior to others**, suggesting an **individualized approach may be preferable** [33].

#### Evidence Quality Concerns

A systematic review examining clinical practice guidelines for exercise therapy and physical activity in low back pain found that **only 27% of guidelines provided satisfactory quality evidence, while 72% were rated as critical**. Significantly, **none of the guidelines discussed physical activity recommendations for primary prevention of low back pain**, though **100% of guidelines recommended at least one type of supervised exercise in management** and **88% provided recommendations for people to stay active** [38].

A systematic review with meta-analysis on workplace physical exercise as treatment for low back pain yielded concerning findings: after analyzing 15 studies meeting inclusion criteria, the meta-analysis showed that **workplace physical exercise did not significantly reduce the incidence of low back pain** (difference of means = 0.62, 95% CI: -0.8 to 2.04, p < 0.4) [39]. This suggests that while exercise is beneficial for management, its preventive effects in workplace settings require more robust research.
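
The "not significant" reading follows directly from the reported interval: a 95% confidence interval that spans zero cannot rule out no effect. The sketch below checks this and back-calculates an approximate two-sided p-value. It assumes a normal sampling distribution and a symmetric interval, which is an assumption made here and not something the cited review states. Under that assumption the result is roughly 0.39, consistent with the reported p < 0.4.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_value_from_ci(estimate, ci_low, ci_high, z_crit=1.96):
    """Approximate two-sided p-value from a symmetric 95% CI.

    Assumes a normal sampling distribution, so the standard error is the
    CI half-width divided by z_crit.
    """
    std_error = (ci_high - ci_low) / (2.0 * z_crit)
    z = abs(estimate) / std_error
    return 2.0 * (1.0 - normal_cdf(z))


estimate, ci_low, ci_high = 0.62, -0.8, 2.04  # values reported in [39]
includes_zero = ci_low <= 0.0 <= ci_high
p = p_value_from_ci(estimate, ci_low, ci_high)
print(f"CI includes zero: {includes_zero}; approximate p = {p:.2f}")
```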

Despite mixed findings for prevention, a meta-analysis by Steffens et al. found that **exercise prevented lower back pain by 35-45% and sick leave due to lower back pain by 25-75%** [22], indicating that properly designed exercise programs can have preventive benefits.

### Timeline for Results

#### Lower Back Pain Timeline

The timeline for seeing results from stretching and aerobic exercise varies based on pain chronicity and exercise adherence:

**Acute Conditions (2-4 Weeks):**
The VA/DoD Clinical Practice Guidelines note that **most low back pain goes away within a few days** and can be managed at home with self-treatment including remaining active, applying heat, and engaging in simple exercises [40]. Kaiser Permanente information states that **normally with good self-care, low back pain symptoms improve within 4 to 6 weeks** [41].

**Subacute to Chronic Conditions (4-12 Weeks):**
Physical therapy sources consistently report that **most people with lower back pain complete treatment within four to twelve weeks**, with **most patients noticing improvement within two to four weeks** [42]. The systematic review by Gobbo et al. noted that exercise protocols lasting from **6 weeks to 12 months** showed marked improvements in lower back pain symptoms [34].

**Progressive Improvement Pattern:**
Recovery occurs in stages as emphasized by SV Proactive Physical Therapy:
- **Early sessions (Weeks 1-3)**: Focus on pain management and initial mobility improvements
- **Mid-program (Weeks 4-8)**: Progress to strengthening and functional recovery
- **Later stages (Weeks 9-12+)**: Maintenance and prevention focus

For chronic pain specifically, **several months of therapy with gradual progress** may be required, with the emphasis on setting realistic expectations and understanding that **healing is a journey** [29].

#### Neck Pain Timeline

For neck pain, the NEXpro trial showed that **neck exercises combined with health promotion interventions produced significant improvements in headache occurrence and frequency over a 12-week intervention**, with statistically significant benefits particularly in earlier intervention periods. The study noted that **refresher sessions at later periods are necessary to sustain the benefits** [31].

### Time Commitment Required

#### Daily Stretching Routines

**Micro-breaks During Work (No Dedicated Time Block Required):**
- Frequency: Every 60-90 minutes during workday
- Duration: 2-5 minutes per break
- Total daily during work: 20-30 minutes spread throughout 8-hour workday
- **Key advantage**: Can be performed at desk without leaving workspace

**Dedicated Stretching Sessions:**
- Morning routine: 10-15 minutes
- Evening routine: 10-15 minutes
- Total daily: 20-30 minutes dedicated time

#### Aerobic Exercise

**Minimum Effective Dose:**
- Frequency: 3-5 days per week
- Duration: 20-30 minutes per session
- Total weekly: 60-150 minutes (1-2.5 hours)

**Optimal Dose for Pain Management:**
- Frequency: 5 days per week
- Duration: 30-45 minutes per session
- Total weekly: 150-225 minutes (2.5-3.75 hours)

#### Combined Weekly Time Commitment

**Minimal Effective Program:**
- Micro-breaks during work: 20-30 minutes daily (spread throughout day, no dedicated block)
- Dedicated stretching: 20-30 minutes daily
- Aerobic exercise: 60-90 minutes weekly (3 sessions of 20-30 minutes)
- **Total structured time: about 3.3-5 hours weekly** (140-210 minutes of stretching plus 60-90 minutes of aerobic exercise; micro-breaks are embedded in the workday)

**Optimal Program:**
- Micro-breaks during work: 30-40 minutes daily (spread throughout day)
- Dedicated stretching: 30 minutes daily
- Aerobic exercise: 150-180 minutes weekly (5 sessions of 30-36 minutes)
- **Total structured time: about 6-6.5 hours weekly** (210 minutes of stretching plus 150-180 minutes of aerobic exercise; micro-breaks are embedded in the workday)
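
The structured weekly total follows from the daily stretching time and the weekly aerobic time, and it depends on how many days "daily" stretching covers. The following is a minimal sketch, assuming either 5 workdays or all 7 days; the `days` parameter and the function name `weekly_hours` are illustrative assumptions, not part of the cited guidance.

```python
STRETCH_DAYS = 7  # assumption: "daily" stretching means every day


def weekly_hours(stretch_min_per_day, aerobic_min_per_week, days=STRETCH_DAYS):
    """Return (min, max) structured hours per week, excluding micro-breaks."""
    low = stretch_min_per_day[0] * days + aerobic_min_per_week[0]
    high = stretch_min_per_day[1] * days + aerobic_min_per_week[1]
    return low / 60, high / 60


# (stretching minutes per day, aerobic minutes per week) as listed above.
programs = {
    "Minimal": ((20, 30), (60, 90)),
    "Optimal": ((30, 30), (150, 180)),
}

for name, (stretch, aerobic) in programs.items():
    for days in (5, 7):
        low, high = weekly_hours(stretch, aerobic, days)
        print(f"{name}, stretching {days} days/week: {low:.1f}-{high:.1f} h")
```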

### Specific Implementation Recommendations

#### Evidence-Based Stretching Protocol for Lower Back

Based on the systematic review findings emphasizing **short-duration exercises (10-15 minutes) performed 3-5 days a week during work hours** [34]:

**Morning Routine (10-15 minutes, before work):**

1. **Cat-Cow Stretch**
   - Start on hands and knees
   - Arch back (cow), then round spine (cat)
   - 10 repetitions, hold each position 3-5 seconds
   - Improves spinal mobility and reduces morning stiffness

2. **Child's Pose**
   - Kneel, sit back on heels, extend arms forward
   - Hold 30-60 seconds, breathe deeply
   - Stretches lower back and hips

3. **Knee-to-Chest Stretch**
   - Lie on back, bring one knee to chest
   - Hold 20-30 seconds per leg
   - 2-3 repetitions per side
   - Releases lower back tension

4. **Piriformis Stretch**
   - Lie on back, cross one ankle over opposite knee
   - Pull thigh toward chest
   - Hold 30 seconds per side, 2-3 repetitions
   - Addresses sciatic nerve tension

5. **Spinal Twist**
   - Lie on back, bring knees to chest
   - Lower knees to one side while keeping shoulders flat
   - Hold 30 seconds per side
   - Improves spinal rotation and releases back tension

**Evening Routine (10-15 minutes, after work):**

1. **Standing Forward Fold**
   - Stand with feet hip-width apart
   - Hinge at hips, let arms and head hang
   - Hold 30-60 seconds
   - Releases accumulated tension from sitting

2. **Reclined Figure-4 Stretch**
   - Lie on back, cross ankle over opposite knee
   - Pull supporting leg toward chest
   - Hold 30-45 seconds per side
   - Releases hip and lower back tension

3. **Supine Hamstring Stretch**
   - Lie on back, extend one leg toward ceiling
   - Use strap or towel around foot
   - Hold 30-45 seconds per leg, 2 repetitions
   - Addresses tight hamstrings contributing to lower back pain

4. **Lower Back Extension (Cobra or Sphinx Pose)**
   - Lie face down, prop upper body on forearms
   - Hold 20-30 seconds, repeat 3-5 times
   - Counters prolonged flexion from sitting

5. **Hip Flexor Stretch (Lunge Position)**
   - Kneel on one knee, opposite foot forward
   - Press hips forward gently
   - Hold 30 seconds per side, 2 repetitions
   - Addresses tight hip flexors from prolonged sitting

#### Evidence-Based Stretching Protocol for Neck and Shoulders

Clinical trials have identified specific effective stretches for neck pain in office workers [32, 43]:

**Desk-Based Stretches (Can be performed at workstation):**

1. **Upper Trapezius Stretch**
   - Sit upright, bring right ear toward right shoulder
   - Left arm hangs relaxed or gently pulls head
   - Hold 20-30 seconds per side
   - Repeat 3 times per side
   - **Evidence**: Identified as effective in multiple RCTs [43]

2. **Levator Scapulae Stretch**
   - Sit upright, turn head 45 degrees to right
   - Bring nose toward right armpit
   - Right hand can gently assist stretch
   - Hold 20-30 seconds per side, 3 repetitions
   - **Evidence**: Specific protocol used in clinical trials [43]

3. **Chin Tucks**
   - Sit with neutral spine
   - Draw chin straight back (like making double chin)
   - Hold 5-10 seconds
   - Repeat 10 times
   - Addresses forward head posture

4. **Neck Rotation**
   - Slowly turn head to look over right shoulder
   - Hold 10-15 seconds
   - Return to center, repeat left
   - 5 repetitions per side
   - Improves cervical rotation mobility

5. **Shoulder Rolls**
   - Roll shoulders backward in large circles
   - 10 repetitions backward, 10 forward
   - Releases shoulder tension

#### Micro-breaks and Movement Snacks

Research on micro-breaks shows promising evidence for workplace health interventions. A systematic review and meta-analysis found **statistically significant but modest improvements in vigor (d = .36, p < .001) and fatigue reduction (d = .35, p < .001)** due to micro-breaks. Importantly, **the longer the micro-break, the greater the boost in performance** [44].

**Scientific Support for Micro-breaks:**
- **Reduced musculoskeletal discomfort**: Sedentary workers who took up to two short micro-breaks an hour had reduced musculoskeletal discomfort and improved focus [45]
- **Mental health benefits**: Preliminary findings suggest that interrupting sedentary work with movement micro-breaks may have beneficial effects on employee mental health [46]
- **No productivity loss**: Studies found no significant decrease in performance with regular micro-breaks [44]
- **Optimal frequency**: Taking **2-3 minute micro-breaks involving light physical activity** leads to improved physical and mental health without affecting productivity [45]

**Evidence-Based Micro-break Protocol:**

**Frequency**: Every 60-90 minutes during workday (approximately 5-6 breaks in 8-hour shift)

**Duration**: 2-5 minutes per break

**Activities to Include** (rotate through different activities each break):

1. **Standing and Walking Break** (2-3 minutes)
   - Stand up from desk
   - Walk around office, to water fountain, or brief outdoor walk
   - Research shows this is more effective than just standing [17]

2. **Dynamic Stretching** (3-5 minutes)
   - Neck rolls and stretches
   - Shoulder shrugs and rolls
   - Standing spinal twists
   - Standing quadriceps and calf stretches
   - Hip circles

3. **Brief Exercise Snack** (3-5 minutes)
   - Bodyweight squats: 10-15 repetitions
   - Wall push-ups: 10-15 repetitions
   - Calf raises: 15-20 repetitions
   - Standing marches: 30 seconds
   - Modern data shows that **micro-breaking can prevent about 10% of long-term work absence** [47]

4. **Postural Reset** (2-3 minutes)
   - Stand against wall with proper posture
   - Perform scapular squeezes
   - Practice diaphragmatic breathing
   - Reset workstation ergonomics
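
The rotation described above can be sketched as a simple schedule generator. This is an illustration under stated assumptions: the 09:00-18:00 working hours are example values, the 75-minute interval is an arbitrary midpoint of the 60-90 minute range, and the function name `micro_break_schedule` is hypothetical.

```python
from datetime import datetime, timedelta
from itertools import cycle

# The four break types from the protocol above.
ACTIVITIES = [
    "Standing and walking break",
    "Dynamic stretching",
    "Brief exercise snack",
    "Postural reset",
]


def micro_break_schedule(start, end, interval_minutes=75):
    """Return (time, activity) pairs, rotating through ACTIVITIES.

    interval_minutes should sit in the 60-90 minute range from the protocol;
    75 is an arbitrary midpoint chosen for this sketch.
    """
    fmt = "%H:%M"
    step = timedelta(minutes=interval_minutes)
    current = datetime.strptime(start, fmt) + step
    stop = datetime.strptime(end, fmt)
    schedule = []
    activities = cycle(ACTIVITIES)
    while current < stop:
        schedule.append((current.strftime(fmt), next(activities)))
        current += step
    return schedule


for time_str, activity in micro_break_schedule("09:00", "18:00"):
    print(time_str, activity)
```

Cycling through the list means no single break type repeats until all four have been used, which matches the advice to vary activities across the day.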

**Key Principle**: PainScience.com emphasizes that **effective micro-breaks involve at least one minute of exercise that significantly increases heart rate** [47]. The guideline from research: **"The only truly ergonomic workstation is one that you regularly push away from."**

#### Aerobic Exercise Recommendations

Based on clinical practice guidelines and systematic reviews emphasizing that exercise is the first line of care for chronic low back pain [33, 36]:

**Recommended Activities** (choose based on preference and accessibility):

1. **Low-Impact Options** (best for those with current pain):
   - Brisk walking
   - Swimming or water aerobics
   - Cycling (stationary or outdoor)
   - Elliptical machine
   - Rowing machine

2. **Moderate-Impact Options** (as pain decreases):
   - Light jogging
   - Group fitness classes (low-impact aerobics)
   - Dancing
   - Recreational sports (doubles tennis, golf with walking)

**Weekly Structure for 30-Year-Old Software Engineer:**

**Beginner Protocol (Weeks 1-4):**
- Frequency: 3 days per week (e.g., Monday, Wednesday, Friday)
- Duration: 20-30 minutes per session
- Intensity: Moderate (able to talk but not sing)
- Suggested: Walking or cycling
- Total: 60-90 minutes weekly

**Intermediate Protocol (Weeks 5-12):**
- Frequency: 4-5 days per week
- Duration: 30-40 minutes per session
- Intensity: Moderate to moderate-vigorous
- Variety: Mix 2-3 different activities
- Total: 120-200 minutes weekly

**Maintenance Protocol (Week 13+):**
- Frequency: 5 days per week
- Duration: 30-45 minutes per session
- Intensity: Moderate-vigorous
- Variety: Mix activities for enjoyment and adherence
- Total: 150-225 minutes weekly

**Integration with Work Schedule:**
Given your 9am-6pm schedule, optimal timing:
- **Before work**: 6:30-7:15am (45-minute session)
- **After work**: 6:30-7:15pm (45-minute session)
- **Lunch break**: 12:00-12:30pm (30-minute brisk walk)
- **Weekend**: Longer sessions (45-60 minutes) on Saturday and Sunday

**Key Implementation Principles:**

1. **Gradual Progression**: Start conservatively and increase duration/intensity by no more than 10% per week to avoid overuse injuries

2. **Consistency Over Intensity**: The IASP guidelines emphasize that **regular moderate exercise is more beneficial than spor


Research workflow completed!


### Observations and changes made to the research agent

### Changes made to the configurations
- Switched the summarization model to gpt-4o-mini and reduced the max content length to 20000 tokens. I initially used Anthropic models for everything but kept getting rate-limit (429) errors. I also tested a Haiku model for summarization, but it still hit 429 errors.

- Switched the other nodes (research, compression, and final report) to Sonnet 4.5 for better performance.

- Increased `max_concurrent_research_units` from 1 to 3 so that the researcher nodes can run in parallel.

- Increased `max_researcher_iterations` to 4 to give the supervisor more iterations (that is, the maximum number of `think_tool` and parallel `ConductResearch` calls is now 4). The default value of 2 caused the supervisor to skip researching certain topics.

- Increased `max_react_tool_calls` to 3 so that each researcher can make more tool calls. Together, these two changes should improve research quality for both the individual researchers and the supervisor.

- Reduced `max_content_length` from 50K to 25K. With the original value, the summarization call often timed out after 60 seconds, which suggests the web page content was too large to summarize within that limit. The 60-second timeout is defined in the following code for search result summarization:

```python
# Execute summarization with timeout to prevent hanging
summary = await asyncio.wait_for(
    model.ainvoke([HumanMessage(content=prompt_content)]),
    timeout=60.0  # 60 second timeout for summarization
)
```
- Overall, these parameters were chosen to produce a high-quality research report while staying within the LLM providers' rate limits.

- For the `run_custom_research` call, added the `subgraphs=True` option so that events from inside the `research_supervisor` subgraph are also streamed. Otherwise, the logs from `supervisor` and `supervisor_tools` are not streamed.
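
To make the trade-off between these limits concrete, here is a minimal sketch that collects the overrides named above and estimates a rough upper bound on search tool calls per run. The bound assumes every supervisor iteration launches the maximum number of researchers and every researcher uses all of its tool calls, which real runs will usually not do. The helper `worst_case_tool_calls` is hypothetical, and wrapping the overrides under a `"configurable"` key is an assumption about how the notebook passes them to the graph.

```python
# Overrides taken from the bullets above (values as stated in this section).
overrides = {
    "summarization_model": "openai:gpt-4o-mini",
    "max_concurrent_research_units": 3,
    "max_researcher_iterations": 4,
    "max_react_tool_calls": 3,
}


def worst_case_tool_calls(cfg: dict) -> int:
    """Hypothetical helper: rough upper bound on researcher tool calls per run.

    Assumes each supervisor iteration spawns the maximum number of
    researchers and each researcher exhausts its tool-call budget.
    """
    return (
        cfg["max_researcher_iterations"]
        * cfg["max_concurrent_research_units"]
        * cfg["max_react_tool_calls"]
    )


# Assumed shape of the run config handed to the graph.
run_config = {"configurable": overrides}

print(worst_case_tool_calls(overrides))  # 4 * 3 * 3 = 36
```

Each search result can also trigger a summarization call, so raising any one of these three limits multiplies the number of LLM requests. That is consistent with the 429 errors seen when the limits were pushed higher.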

### Analysis of the output

The quality of the report is good: it covers research insights for all three approaches the user asked to investigate. Each approach has evidence-based details, actionable insights, and recommendations with rich citations.

This report is the result of several rounds of testing. In earlier runs, the deep research agent either generated only shallow research results (mostly when `max_researcher_iterations` was small, which limits how much thinking and research the supervisor can do), or it parallelized too much or generated too many tokens, so that we were rate limited by Anthropic (429 errors) and/or hit the 60-second summarization timeout.

Finally, we settled on a configuration that balances research quality against the constraints imposed by the service providers, and that configuration produced this report.

### What worked well and what could be improved

What worked well:
- The deep research agent works as expected: the research brief is created, the supervisor makes 3 parallel `ConductResearch` calls and uses `think_tool` to plan and summarize the results, and each researcher calls multiple tools to find rich sources of information on these topics.
- Using `openai:gpt-4o-mini` for summarization offloaded the most frequent API calls to a different provider, which avoided Anthropic's rate limits. I was originally using Anthropic for summarization and kept getting 429 errors.
- The final report is comprehensive and useful, with a good number of citations.

What could be improved:
- The deep research agent is often limited by the service providers, for example by 429 errors from Anthropic due to rate limiting. This forced us to restrict some parts of the configuration, such as `max_researcher_iterations` and the output token limits at various nodes.
- For example, the final report is truncated before the full citation list because `final_report_model_max_tokens` had to be set low enough to avoid being rate limited.
- For a given budget, we have to tune these configuration values to balance cost and research quality. Alternatively, we could add code that waits and retries after a rate-limit error. Otherwise, we would need to increase our budget to get a higher rate limit.
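
As a sketch of the wait/retry idea in the last bullet, the helper below retries an async call with exponential backoff when it hits a rate limit. This is an illustration under stated assumptions, not code from Open Deep Research: `RateLimitError` is a hypothetical stand-in for whatever exception the provider SDK raises on a 429, and `call_with_backoff` is a hypothetical name.

```python
import asyncio
import random


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's HTTP 429 error."""


async def call_with_backoff(make_call, max_attempts=5, base_delay=1.0,
                            max_delay=30.0, sleep=asyncio.sleep):
    """Await make_call(), retrying on RateLimitError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (1s, 2s, 4s, ...), capped at max_delay,
            # plus up to 10% jitter so parallel callers do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await sleep(delay + random.uniform(0, delay * 0.1))
```

In the notebook, `make_call` could be a zero-argument lambda wrapping a model call such as the `model.ainvoke(...)` used for summarization. The `sleep` parameter exists only so the wait can be replaced in tests.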