# Building a Deep Research Agent

This notebook walks you through extending our simple tool-calling agent into a full Deep Research Agent.
We will cover:
1. Prompting strategies
2. Multi-tool agent design
3. Compacting conversations

Docs: 
Weights & Biases Inference [docs](https://docs.wandb.ai/guides/inference/)

## Imports + API keys

Our Deep Research Agent still uses only two services:
1. W&B for inference and tracking 
2. Exa for web search 

In [None]:
#if you are running this on colab, uncomment the following line and run it
#!uv pip install exa-py weave openai

In [1]:
# auto reload and reload ext
%load_ext autoreload
%autoreload 2

In [2]:
# Global Configuration & Setup
import inspect
import json
import os
import requests
import weave
import openai
from enum import Enum
from pydantic import BaseModel, Field
from rich.pretty import pprint
from typing import Any, Callable, Dict, List, get_type_hints
from exa_py import Exa
from datetime import datetime


In [3]:
#if you are running this on colab, uncomment the following lines and run it
#from google.colab import userdata
#EXA_API_KEY=userdata.get('EXA_API_KEY')
#OPENAI_API_KEY=userdata.get('OPENAI_API_KEY')
#WANDB_API_KEY=userdata.get('WANDB_API_KEY')

# if you use .env file, uncomment the following lines and run it
from dotenv import load_dotenv
load_dotenv()
EXA_API_KEY=os.getenv('EXA_API_KEY')
OPENAI_API_KEY=os.getenv('OPENAI_API_KEY')
WANDB_API_KEY=os.getenv('WANDB_API_KEY')
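
Before creating the clients, it can help to fail early when a key is missing. The following is a minimal sketch, assuming the two keys the clients below actually read (`WANDB_API_KEY` and `EXA_API_KEY`); `missing_keys` is a hypothetical helper and not part of the workshop code.

```python
import os
from typing import Iterable, List, Mapping


def missing_keys(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Hypothetical usage: warn early instead of failing at the first API call.
missing = missing_keys(["WANDB_API_KEY", "EXA_API_KEY"])
if missing:
    print(f"Missing API keys: {', '.join(missing)}")
```

Passing `env` explicitly keeps the helper easy to check without touching the real environment.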


In [4]:
MODEL_SMALL = "Qwen/Qwen3-235B-A22B-Instruct-2507"
MODEL_MEDIUM = "zai-org/GLM-4.5"
MODEL_LARGE = "moonshotai/Kimi-K2-Instruct"

WANDB_ENTITY = "wandb-applied-ai-team"
WANDB_PROJECT = "london-workshop-2025"

oai_client = openai.OpenAI(
    base_url='https://api.inference.wandb.ai/v1',
    api_key=os.getenv("WANDB_API_KEY"),
    project=f"{WANDB_ENTITY}/{WANDB_PROJECT}")

exa_client = Exa(api_key=os.getenv("EXA_API_KEY"))

weave.init("wandb-applied-ai-team/london-workshop-2025")

weave: weave version 0.52.10 is available!  To upgrade, please run:
weave:  $ pip install weave --upgrade
weave: Logged in as Weights & Biases user: agatamlyn.
weave: View Weave data at https://wandb.ai/wandb-applied-ai-team/london-workshop-2025/weave


<weave.trace.weave_client.WeaveClient at 0x154d76750>

## Helper functions

In [7]:
# These are the same functions we covered in the notebook 01_simple_tool_calling_agent.ipynb, so let's just import them here
from utils import function_tool, perform_tool_calls
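
The implementation of `function_tool` lives in `utils` and is not shown in this notebook. For readers who skipped the previous notebook, here is a minimal sketch of what such a decorator might do, assuming it attaches an OpenAI-style schema as a `tool_schema` attribute (the attribute the agent reads later). `sketch_function_tool` and the type mapping are hypothetical; the real helper may differ.

```python
import inspect
from typing import Any, Callable, Dict, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_function_tool(func: Callable) -> Callable:
    """Attach an OpenAI-style tool schema to `func` as `func.tool_schema`."""
    hints = get_type_hints(func)
    properties: Dict[str, Any] = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unannotated or unknown parameter types fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    func.tool_schema = {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }
    return func


@sketch_function_tool
def add(a: int, b: int = 0) -> int:
    """Add two integers."""
    return a + b


print(add.tool_schema["function"]["name"])
```

Because the docstring becomes the tool description, the long docstrings on the tools below double as instructions to the model.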

In [8]:
@weave.op
def call_model(model_name: str, messages: List[Dict[str, Any]], **kwargs) -> Any:
    """Call a model with the given messages and kwargs and return the first choice's message object."""
    response = oai_client.chat.completions.create(
        model=model_name,
        messages=messages,
        **kwargs
    )

    return response.choices[0].message

In [9]:
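def get_today_str() -> str:
    """Get current date in a human-readable format, e.g. 'Mon Oct 6, 2025'."""
    now = datetime.now()
    # Build the day manually: the "%-d" strftime directive is not supported on Windows.
    return f"{now:%a %b} {now.day}, {now.year}"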

## Prompts

In [None]:
DEEP_RESEARCH_AGENT_PROMPT = """
  You are a research assistant conducting research on the user's input topic. For context, today's date is {date}.

  <Task>
  Your job is to use tools to gather information about the user's input topic.
  You can use any of the tools provided to you to find resources that can help answer the research question.
  You can call these tools in series or in parallel, your research is conducted in a tool-calling loop.
  Your response should be a thorough answer to the user's question, citing sources and reasoning, providing an overview of the facts or any gaps in the subject.
  </Task>

  <Available Tools>
  You have access to the following tools:
  1. **clarification_tool**: For asking the user clarifying questions if needed. If you have clarifying questions, start with this.
  2. **planning_tool**: For planning the research.
  3. **exa_search**: For conducting web searches to gather information
  4. **think_tool**: For reflection and strategic planning during research

  **CRITICAL: Use think_tool after each search to reflect on results and plan next steps**
  </Available Tools>

  <Instructions>
  Think like a human researcher with limited time. Follow these steps:

  1. **Read the question carefully** - What specific information does the user need?
  2. **Start with broader searches** - Use broad, comprehensive queries first
  3. **After each search, pause and assess** - Do I have enough to answer? What's still missing?
  4. **Execute narrower searches as you gather information** - Fill in the gaps
  5. **Stop when you can answer confidently** - Don't keep searching for perfection
  6. **Provide an answer** - At the end, always provide the answer from your research.
  </Instructions>

  <Hard Limits>
  **Tool Call Budgets** (Prevent excessive searching):
  - **Simple queries**: Use 2-3 search tool calls maximum
  - **Complex queries**: Use up to 5 search tool calls maximum
  - **Always stop**: After 5 search tool calls if you cannot find the right sources

  **Stop Immediately When**:
  - You can answer the user's question comprehensively
  - You have 3+ relevant examples/sources for the question
  - Your last 2 searches returned similar information
  </Hard Limits>

  <Show Your Thinking>
  After each search tool call, use think_tool to analyze the results:
  - What key information did I find?
  - What's missing?
  - Do I have enough to answer the question comprehensively?
  - Should I search more or provide my answer?
  </Show Your Thinking>
"""

## Tools

Thomas already introduced our first tool, `exa_search`, so we will import it from our tools.py instead of redefining it.

Next we will add 3 new tools to upgrade this agent from a simple search agent to a deep research one.

In [21]:
# import the exa_search tool Thomas introduced in the previous notebook 01_simple_tool_calling_agent.ipynb
from tools import exa_search_and_refine
exa_search = exa_search_and_refine

### clarification tool
If you have used another deep research service, like ChatGPT Deep Research, you will be familiar with the first step: clarifying questions. Users often submit a one-sentence request that lacks the information needed to produce a deep answer to what they were actually looking for.
In the case of ChatGPT, these questions are mandatory and appear every time you create a new Deep Research request. In our case, we give the agent the choice to call the tool if it thinks it needs more information to get started.

In [None]:
@weave.op
@function_tool
def clarification_tool(clarifying_questions: str) -> str:
  """
  Use this tool to ask clarifying questions to the user.
  IMPORTANT: If you can see in the messages history that you have already asked a clarifying question, you almost always do not need to ask another one. Only ask another question if ABSOLUTELY NECESSARY.

  If there are acronyms, abbreviations, or unknown terms, ask the user to clarify.
  If you need to ask a question, follow these guidelines:
  - Be concise while gathering all necessary information.
  - Only ask max 3 questions.
  - Make sure to gather all the information needed to carry out the research task in a concise, well-structured manner.
  - Use bullet points or numbered lists if appropriate for clarity. Make sure that this uses markdown formatting and will be rendered correctly if the string output is passed to a markdown renderer.
  - Don't ask for unnecessary information, or information that the user has already provided. If you can see that the user has already provided the information, do not ask for it again.

  This tool will return the user clarifications.
  """
  output = input(clarifying_questions)
  return output


### planning tool
Another tool available to the agent is the planning tool. The agent should use this tool to analyze the user's query and break it down into sub-queries.

In [None]:
@weave.op
@function_tool
def planning_tool(plan: str) -> str:
  """Tool for planning the research.

  If there are no clarifying questions, use this tool as the first step of the research.

  Your plan should include:
  1. Short analysis of user request.
  2. Sub-queries broken down from the user's request. For example, if the query is 'what are the 3 heaviest Pokémon and their combined weight', the sub-queries should be 'what are the 3 heaviest Pokémon', 'pokemon1 weight', 'pokemon2 weight', 'pokemon3 weight'.

  Args:
    plan: plan for the research.
  """
  # Return an acknowledgement so the declared `-> str` holds and the tool result is not None.
  return f"Plan recorded: {plan}"

### think tool 
The agent should call this tool after each search. It lets the agent think about the current findings, identify gaps in the research, and decide whether further research is necessary.

In [14]:
@weave.op
@function_tool
def think_tool(reflection: str) -> str:
    """Tool for strategic reflection on research progress and decision-making.

    Use this tool after each search to analyze results and plan next steps systematically.
    This creates a deliberate pause in the research workflow for quality decision-making.

    When to use:
    - After receiving search results: What key information did I find?
    - Before deciding next steps: Do I have enough to answer comprehensively?
    - When assessing research gaps: What specific information am I still missing?
    - Before concluding research: Can I provide a complete answer now?

    Reflection should address:
    1. Analysis of current findings - What concrete information have I gathered?
    2. Gap assessment - What crucial information is still missing?
    3. Quality evaluation - Do I have sufficient evidence/examples for a good answer?
    4. Strategic decision - Should I continue searching or provide my answer?

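    Args:
        reflection: Your detailed reflection on research progress, findings, gaps, and next steps
    """
    # Return an acknowledgement so the declared `-> str` holds and the tool result is not None.
    return f"Reflection recorded: {reflection}"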

In [None]:
ToolCall = [clarification_tool, planning_tool, exa_search, think_tool]
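
When the model requests one of these tools, `perform_tool_calls` (imported from `utils`, not shown here) has to map the requested name back to a callable and wrap the result as a `tool` message. A minimal sketch of that dispatch, assuming tool calls are represented as plain OpenAI-style dicts rather than the SDK's objects; `sketch_perform_tool_calls` and `echo` are hypothetical names and the real helper may work differently.

```python
import json
from typing import Any, Callable, Dict, List


def sketch_perform_tool_calls(
    tools: List[Callable], tool_calls: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Run each requested tool and return one `tool` message per call."""
    registry = {tool.__name__: tool for tool in tools}
    outputs = []
    for call in tool_calls:
        name = call["function"]["name"]
        try:
            # Arguments arrive as a JSON-encoded string.
            kwargs = json.loads(call["function"]["arguments"] or "{}")
            result = registry[name](**kwargs)
        except Exception as exc:
            # Report failures back to the model instead of crashing the loop.
            result = f"Tool error: {exc}"
        outputs.append(
            {"role": "tool", "tool_call_id": call["id"], "content": str(result)}
        )
    return outputs


def echo(text: str) -> str:
    """Toy tool used only for this example."""
    return text.upper()


calls = [
    {"id": "c1", "type": "function",
     "function": {"name": "echo", "arguments": '{"text": "hi"}'}},
]
print(sketch_perform_tool_calls([echo], calls))
```

Each `tool` message carries the `tool_call_id` of the request it answers, which is how the chat API pairs results with calls.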

## Agent

In [16]:
class AgentState(BaseModel):
    """Manages the state of the agent."""
    messages: List[Dict[str, Any]] = Field(default_factory=list)
    step: int = Field(default=0)
    final_assistant_content: str | None = None # Populated at the end of a run

In [None]:
class DeepResearchAgent:
    """A deep research agent class with tracing, state, and tool processing."""
    def __init__(self, model_name: str, system_message: str, tools: List[Callable]):
        self.model_name = model_name
        self.system_message = system_message
        self.tools = [function_tool(t) for t in tools] # add schemas to the tools

    @weave.op(name="DeepResearchAgent.step") # Trace each step
    def step(self, state: AgentState) -> AgentState:
        step = state.step + 1
        messages = state.messages
        final_assistant_content = None
        try:
            # call model with tools
            response = call_model(
                model_name=self.model_name,
                messages=messages,
                tools=[t.tool_schema for t in self.tools])

            # add the response to the messages
            messages.append(response.model_dump())

            # if the LLM requested tool calls, perform them
            if response.tool_calls:
                print("LLM requested tool calls:")
                # perform the tool calls with the tools this agent was configured with
                tool_outputs = perform_tool_calls(tools=self.tools, tool_calls=response.tool_calls)
                messages.extend(tool_outputs)

            # LLM gave a content response; it was already appended to the history above
            else:
                final_assistant_content = response.content
        except Exception as e:
            print(f"ERROR in Agent Step: {e}")
            # Add an error message to history to indicate failure
            messages.append({"role": "assistant", "content": f"Agent error in step: {str(e)}"})
            final_assistant_content = f"Agent error in step {step}: {str(e)}"
        return AgentState(messages=messages, step=step, final_assistant_content=final_assistant_content)

    @weave.op(name="DeepResearchAgent.run")
    def run(self, user_prompt: str, max_turns: int = 10) -> AgentState:
        state = AgentState(messages=[
            {"role": "system", "content": self.system_message},
            {"role": "user", "content": user_prompt}])
        for _ in range(max_turns):
            print(f"--- Agent Loop Turn {state.step}/{max_turns} ---")
            state = self.step(state)
            if state.final_assistant_content:
                return state
        return state
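
The loop above stops on a final answer or after `max_turns`; the search budget from the prompt's `<Hard Limits>` is enforced only by the prompt. If a hard guarantee is wanted, the search requests can also be counted in code. A minimal sketch, assuming the history holds OpenAI-style dicts (as produced by `model_dump()` in `step`) and that the search tool is registered under the name `exa_search`; `count_tool_calls`, `over_budget`, and `MAX_SEARCH_CALLS` are hypothetical names.

```python
from typing import Any, Dict, List

MAX_SEARCH_CALLS = 5  # assumed to mirror the "Always stop" limit in the prompt


def count_tool_calls(messages: List[Dict[str, Any]], tool_name: str) -> int:
    """Count how often the assistant has requested `tool_name` so far."""
    count = 0
    for message in messages:
        if message.get("role") != "assistant":
            continue
        # `tool_calls` may be missing or None on plain content messages.
        for call in message.get("tool_calls") or []:
            if call["function"]["name"] == tool_name:
                count += 1
    return count


def over_budget(messages: List[Dict[str, Any]], tool_name: str,
                limit: int = MAX_SEARCH_CALLS) -> bool:
    """Return True once `tool_name` has been requested `limit` times or more."""
    return count_tool_calls(messages, tool_name) >= limit


# Hypothetical history with two search requests and one reflection.
history = [
    {"role": "user", "content": "question"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "1", "type": "function",
         "function": {"name": "exa_search", "arguments": "{}"}},
        {"id": "2", "type": "function",
         "function": {"name": "think_tool", "arguments": "{}"}},
    ]},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "3", "type": "function",
         "function": {"name": "exa_search", "arguments": "{}"}},
    ]},
    {"role": "assistant", "content": "done", "tool_calls": None},
]
print(count_tool_calls(history, "exa_search"), over_budget(history, "exa_search"))
```

One possible use is to check `over_budget(state.messages, ...)` inside `run` and, once the budget is spent, append a message asking the model to answer with what it has.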

## Run

In [None]:
if __name__ == "__main__":

    agent = DeepResearchAgent(
        model_name=MODEL_LARGE,
        system_message=DEEP_RESEARCH_AGENT_PROMPT.format(date=get_today_str()),
        tools=[clarification_tool, planning_tool, think_tool, exa_search]
    )
    state = agent.run(user_prompt="What type of vegan milk alternative is the healthiest?")
    print(f"Final response: {state.final_assistant_content}")

weave: 🍩 https://wandb.ai/wandb-applied-ai-team/london-workshop-2025/r/call/0199f24a-286b-777f-b2e1-6caad2c339d2


--- Agent Loop Turn 0/10 ---
LLM requested tool calls:


--- Agent Loop Turn 1/10 ---
LLM requested tool calls:


--- Agent Loop Turn 2/10 ---
LLM requested tool calls:


--- Agent Loop Turn 3/10 ---
Final response: Based on comprehensive nutritional analysis from dietitians and consumer research organizations, **soy milk** emerges as the healthiest vegan milk alternative overall.

## Why Soy Milk Ranks #1

**Complete Protein**: With 7g of complete protein per cup, soy milk is the only plant milk that provides all essential amino acids in amounts comparable to dairy milk - significantly higher than almond (1g), oat (3g), or coconut (0g).

**Balanced Nutrition Profile**:
- **Calories**: 80 per cup (moderate, not too high or low)
- **Fat**: 4g, mostly heart-healthy unsaturated fats
- **Carbohydrates**: 4g (low, suitable for blood sugar management)
- **Sugars**: 1g (minimal)
- **Saturated Fat**: Very low vs. coconut milk's 4.5g saturated fat

## How Other Milks Compare

| Milk Alternative | Protein (g) | Calories | Key Limitations |
|------------------|-------------|----------|-----------------|
| **Almond Milk**  | 1g | 30-60 | Extremely low protein, most