## Deep Research

One of the classic cross-business Agentic use cases! This is huge.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">A Deep Research agent is broadly applicable to any business area, and to your own day-to-day activities. You can make use of this yourself!
            </span>
        </td>
    </tr>
</table>

I hope you find this lab useful! I would love to connect on LinkedIn: https://www.linkedin.com/in/kyranank/

### Upgrades to Original System (Rough Notes)

- When given a query, come up with 3 questions that will help answer the query
    - Then come up with a search query to answer each question
- Incorporate handoffs and using agents as tools
    - Agents as Tools: 1 Google Search agent and 1 OpenAI Web Search agent. The research manager decides whether more queries are needed (in which case it can call planner_agent again) or whether a search needs to be run again.
    - Agents as Tools: 1 question formulator agent and 1 search planner agent which get used by the research manager in sequence.
    - Handoff the research to the writer agent
    - Handoff the report to the editor agent

PLAN
- Create & test question_formulator_agent
    - Given a query, formulate X research questions to help answer the query
    - Test with a query
    - Convert to .as_tool()
- Adjust previous planner_agent & test
    - Use the question_formulator_agent as a tool to generate questions
    - Create 1 search query to answer each 1 of the questions
    - Test with simulated questions
    - Convert to .as_tool()
- Create & test gemini_search_agent
    - Someone has a version in the community contributions folder
    - Take the list of queries and perform a search for each one
    - Test with simulated query list
    - Convert to .as_tool()
- Adjust previous openai_search_agent & test
    - Take the list of queries and perform a search for each one
    - Test with simulated query list
    - Convert to .as_tool()
- Create an editor_agent
    - Receives the report and edits it to a professional level
    - Returns the same as writer_agent
- Adjust the writer_agent
    - Receives the research, writes a report, also returns a markdown report
    - Hands off to editor_agent
- Create a research_manager_agent
    - Given a query use the planner_agent to generate search queries to research the query
    - Next use the gemini_search_agent and openai_search_agent to conduct the search queries. They should each be used once for every search query. That way you receive two different results to consider for each query for the final research report. If you believe more information is needed to address the query, you can use the planner_agent only one additional time followed by the gemini_search_agent and the openai_search_agent to gather additional research. You will then hand off this research to the writer_agent to write a report.
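
The plan above can be sketched in plain Python before wiring up any real agents. This is only an illustration of the control flow under stated assumptions: every function below (`plan_searches`, `gemini_search`, `openai_search`, `needs_more_research`, `write_report`, `edit_report`) is a hypothetical stub that returns canned data, not part of the Agents SDK, and the "more research needed" test is a placeholder for the judgment the research manager would make.

```python
import asyncio

# Hypothetical stand-ins for the real agents; each returns canned data.
async def plan_searches(query: str, round_no: int) -> list[str]:
    return [f"{query} (round {round_no}, search {i})" for i in range(1, 4)]

async def gemini_search(term: str) -> str:
    return f"[gemini] {term}"

async def openai_search(term: str) -> str:
    return f"[openai] {term}"

async def search_both(term: str) -> list[str]:
    # Each search agent is used exactly once per search term.
    return list(await asyncio.gather(gemini_search(term), openai_search(term)))

def needs_more_research(results: list[str]) -> bool:
    # Placeholder rule; in the plan this decision belongs to the research manager.
    return len(results) < 10

def write_report(query: str, results: list[str]) -> str:
    return f"# {query}\n\n" + "\n".join(f"- {r}" for r in results)

def edit_report(report: str) -> str:
    return report.strip() + "\n"

async def run_research(query: str, max_planner_calls: int = 2) -> list[str]:
    results: list[str] = []
    # One initial planning round plus at most one supplemental round.
    for round_no in range(1, max_planner_calls + 1):
        terms = await plan_searches(query, round_no)
        for pair in await asyncio.gather(*(search_both(t) for t in terms)):
            results.extend(pair)
        if not needs_more_research(results):
            break
    return results

async def main(query: str) -> tuple[list[str], str]:
    results = await run_research(query)
    # Writer first, then editor, mirroring the two handoffs in the plan.
    return results, edit_report(write_report(query, results))

results, report = asyncio.run(main("Latest AI Agent frameworks in 2025"))
print(len(results))
```

Inside a notebook, where an event loop is already running, `await main(...)` would replace the `asyncio.run(...)` call.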

In [1]:
from agents import Agent, WebSearchTool, trace, Runner, function_tool, handoff
from agents.model_settings import ModelSettings
from pydantic import BaseModel, Field
from dotenv import load_dotenv
import asyncio
import os
from sendgrid.helpers.mail import Mail, Email, To, Content
from typing import Dict
from IPython.display import display, Markdown
from agents import OpenAIChatCompletionsModel
from openai import AsyncOpenAI
import requests
from bs4 import BeautifulSoup

In [2]:
load_dotenv(override=True)

True

In [None]:
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai"
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
GOOGLE_CUSTOM_SEARCH_API_KEY = os.getenv("GOOGLE_CUSTOM_SEARCH_API_KEY")
GOOGLE_CSE_ID = os.getenv("GOOGLE_CSE_ID")

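gemini_client = AsyncOpenAI(base_url=GEMINI_BASE_URL, api_key=GOOGLE_API_KEY)
gemini_model = OpenAIChatCompletionsModel(model="gemini-2.0-flash", openai_client=gemini_client)

# Confirm the search credentials loaded without printing the secrets themselves
print("GOOGLE_CUSTOM_SEARCH_API_KEY set:", bool(GOOGLE_CUSTOM_SEARCH_API_KEY))
print("GOOGLE_CSE_ID set:", bool(GOOGLE_CSE_ID))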

### Question Formulator Agent
- Create & test question_formulator_agent
    - Given a query, formulate X research questions to help answer the query
    - Test with a query
    - Convert to .as_tool()

In [12]:
INSTRUCTIONS = "You are a helpful research assistant. Given a query, formulate a set of research questions \
that when answered will help answer or address the query. Output 3 questions to research."

class ResearchQuestionItem(BaseModel):
    reason: str = Field(description="Your reasoning for why this question is important to address the query.")
    question: str = Field(description="The research question to use to conduct further research regarding the query.")

class ResearchQuestionPlan(BaseModel):
    questions: list[ResearchQuestionItem] = Field(description="A list of research questions to research to best answer the query.")

question_formulator_agent = Agent(
    name="Question Formulator Agent",
    instructions=INSTRUCTIONS,
    model=gemini_model,
    output_type=ResearchQuestionPlan,
)

In [13]:
message = "Latest AI Agent frameworks in 2025"

with trace("Test - Question Formulator Agent"):
    result = await Runner.run(question_formulator_agent, message)

print(result.final_output)

questions=[ResearchQuestionItem(reason='This question directly addresses the core of the query by identifying the specific frameworks that are relevant in 2025.', question='What AI agent frameworks have gained prominence or emerged as new solutions in the year 2025?'), ResearchQuestionItem(reason='Understanding the functionalities and drawbacks of each framework is crucial for evaluating their suitability for different applications and use cases.', question='What are the key features, capabilities, and limitations of the leading AI agent frameworks in 2025?'), ResearchQuestionItem(reason='This will provide context on how these frameworks are being used in practice, which indicates their relevance and adoption rates.', question='What are the typical applications and use cases of AI Agent Frameworks in 2025?')]


In [14]:
description = "Generate research questions"

tool_question_formulator = question_formulator_agent.as_tool(tool_name="question_formulator_agent", tool_description=description)

tools_for_planner_agent = [tool_question_formulator]

tools_for_planner_agent

[FunctionTool(name='question_formulator_agent', description='Generate research questions', params_json_schema={'properties': {'input': {'title': 'Input', 'type': 'string'}}, 'required': ['input'], 'title': 'question_formulator_agent_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x10cd251c0>, strict_json_schema=True, is_enabled=True)]

### Planner Agent
- Adjust previous planner_agent & test
    - Use the question_formulator_agent as a tool to generate questions
    - Create 1 search query to answer each 1 of the questions
    - Test with simulated questions
    - Convert to .as_tool()

In [15]:
INSTRUCTIONS = "You are a helpful research assistant. Given a query you will use the question_formulator_agent as a tool to \
come up with a set of questions. Your role is to create 1 web search for each question to perform to best answer the query. \
Output the web search terms to query for, there should be exactly as many web search queries as questions you received."

class WebSearchItem(BaseModel):
    reason: str = Field(description="Your reasoning for why this search is important to the query and will answer the question.")
    query: str = Field(description="The search term to use for the web search.")

class WebSearchPlan(BaseModel):
    searches: list[WebSearchItem] = Field(description="A list of web searches to perform to best answer the query.")

planner_agent = Agent(
    name="Planner Agent",
    instructions=INSTRUCTIONS,
    model=gemini_model,
    output_type=WebSearchPlan,
    tools=tools_for_planner_agent
)

In [16]:
message = "Latest AI Agent frameworks in 2025"

with trace("Test - Planner Agent"):
    result = await Runner.run(planner_agent, message)
    print(result.final_output)

searches=[WebSearchItem(reason='This search directly addresses the query about the latest AI agent frameworks in 2025 and seeks information about trends in that timeframe.', query='AI agent frameworks 2025 trends'), WebSearchItem(reason='This search focuses on platforms used to develop AI agents, aiming to identify new and popular platforms in 2025.', query='emerging AI agent development platforms 2025'), WebSearchItem(reason='This search explores the architectural designs of AI agents that are expected to be prominent in 2025.', query='future of AI agent architecture 2025')]


In [17]:
description = "Generate web search terms"

tool_planner_agent = planner_agent.as_tool(tool_name="planner_agent", tool_description=description)
tools_for_research_manager_agent = [tool_planner_agent]
tools_for_research_manager_agent

[FunctionTool(name='planner_agent', description='Generate web search terms', params_json_schema={'properties': {'input': {'title': 'Input', 'type': 'string'}}, 'required': ['input'], 'title': 'planner_agent_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x10cd25ee0>, strict_json_schema=True, is_enabled=True)]

### Gemini Search Agent
- Create & test gemini_search_agent
    - Someone has a version in the community contributions folder
    - Take the list of queries and perform a search for each one
    - Test with simulated query list
    - Convert to .as_tool()

In [4]:
# Tool for Web Search - Google doesn't offer a remote WebSearchTool like OpenAI
# Need to set up Google Programmable Search / Custom Search API and get a key + search engine ID

@function_tool
async def google_web_search(query: str) -> str:
    """
    Performs a Google Search and returns a string of concatenated result snippets.
    """
    endpoint = "https://www.googleapis.com/customsearch/v1"
    params = {
        "key": GOOGLE_CUSTOM_SEARCH_API_KEY,
        "cx": GOOGLE_CSE_ID,
        "q": query,
        "num": 5,  # Number of results to fetch
    }

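    # Run the blocking HTTP request in a worker thread so the event loop is not stalled,
    # and set a timeout so a hung connection cannot block the agent indefinitely
    response = await asyncio.to_thread(requests.get, endpoint, params=params, timeout=30)
    if response.status_code != 200:
        raise RuntimeError(f"Google Search API error {response.status_code}: {response.text}")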

    data = response.json()
    items = data.get("items", [])

    if not items:
        return "No results found."

    # Combine title, snippet, and link for each result
    result_texts = []
    for item in items:
        title = item.get("title", "").strip()
        snippet = item.get("snippet", "").strip()
        link = item.get("link", "").strip()
        result_texts.append(f"{title}\n{snippet}\n{link}")

    summary = "\n\n".join(result_texts)
    return summary

In [6]:
query = 'Python programming'

url = f'https://www.googleapis.com/customsearch/v1?q={query}&key={GOOGLE_CUSTOM_SEARCH_API_KEY}&cx={GOOGLE_CSE_ID}'

response = requests.get(url)

if response.status_code == 200:
    print("Google Web Search API is active!")
    print(response.json())  # Display the response data
else:
    print("API request failed. Status code:", response.status_code)

Google Web Search API is active!
{'kind': 'customsearch#search', 'url': {'type': 'application/json', 'template': 'https://www.googleapis.com/customsearch/v1?q={searchTerms}&num={count?}&start={startIndex?}&lr={language?}&safe={safe?}&cx={cx?}&sort={sort?}&filter={filter?}&gl={gl?}&cr={cr?}&googlehost={googleHost?}&c2coff={disableCnTwTranslation?}&hq={hq?}&hl={hl?}&siteSearch={siteSearch?}&siteSearchFilter={siteSearchFilter?}&exactTerms={exactTerms?}&excludeTerms={excludeTerms?}&linkSite={linkSite?}&orTerms={orTerms?}&dateRestrict={dateRestrict?}&lowRange={lowRange?}&highRange={highRange?}&searchType={searchType}&fileType={fileType?}&rights={rights?}&imgSize={imgSize?}&imgType={imgType?}&imgColorType={imgColorType?}&imgDominantColor={imgDominantColor?}&alt=json'}, 'queries': {'request': [{'title': 'Google Custom Search - Python programming', 'totalResults': '381000000', 'searchTerms': 'Python programming', 'count': 10, 'startIndex': 1, 'inputEncoding': 'utf8', 'outputEncoding': 'utf8', 
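
The check above builds the request URL with an f-string, which works for a simple query like this one but would misbehave if the query contained characters such as `&` or `#`. A minimal sketch of building the same URL with the standard library's `urlencode` follows; `build_search_url`, `DEMO_KEY`, and `DEMO_CX` are illustrative names and placeholder values, not part of the original notebook.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

ENDPOINT = "https://www.googleapis.com/customsearch/v1"

def build_search_url(query: str, api_key: str, cse_id: str, num: int = 5) -> str:
    """Return a Custom Search URL with every parameter percent-encoded."""
    params = {"key": api_key, "cx": cse_id, "q": query, "num": num}
    return f"{ENDPOINT}?{urlencode(params)}"

# Placeholder credentials for illustration only.
url = build_search_url("AI agents & tools: 2025?", "DEMO_KEY", "DEMO_CX")
print(url)

# Round-trip check: the query survives encoding intact.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```

Passing a `params` dict to `requests.get`, as the `google_web_search` tool does, gives the same encoding without building the URL by hand.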

In [7]:
INSTRUCTIONS = "You are a research assistant. Given a web search term, you search the web using the google_web_search tool \
for that term and produce a concise summary of the results. The summary must be 5-8 paragraphs and between 500 and 1000 \
words. Capture the main points. Write succinctly, add evidence/facts, no need to have complete sentences or good \
grammar. This will be consumed by someone synthesizing a report, so it's vital you capture the \
essence and ignore any fluff. Do not include any additional commentary other than the summary itself."

gemini_search_agent = Agent(
    name="Gemini Search Agent",
    instructions=INSTRUCTIONS,
    tools=[google_web_search],
    model=gemini_model,
    model_settings=ModelSettings(tool_choice="required"),
)

In [8]:
message = "notable AI agent frameworks introduced or updated in 2025"

with trace("Test - Gemini Search Agent"):
    result = await Runner.run(gemini_search_agent, message)

display(Markdown(result.final_output))

Based on the search results, here's a summary of notable AI agent frameworks introduced or updated in 2025:

**General Trends and Overviews:**

*   **Agentic AI Frameworks:** 2025 saw a rise in frameworks designed to build AI agents for web applications.
*   **Microsoft Build 2025:** Microsoft emphasized AI agents and open agentic web development, introducing new models and coding agents and focusing on enterprise-grade agents within platforms like Azure AI.
*   **Agent Interoperability:** Google introduced the Agent2Agent Protocol (A2A) to foster interoperability between AI agents, enabling them to work together and enhance productivity by autonomously handling daily tasks.

**Specific Frameworks Mentioned:**

*   **Transformers Agents (Hugging Face):** This framework leverages transformer models.
*   **Mastra Core:** A framework enabling developers to add AI agents to web apps.  Example provided shows importing Agent from '@mastra/core'.
*   **Pydantic AI:**  Praised for its clean design.

**Key Themes and Concepts:**

*   **Productivity:** AI agents aim to boost productivity by automating tasks.
*   **Interoperability:** A focus on agents working together.
*   **Enterprise-Grade Agents:** Development of AI agents suitable for business applications.
*   **Open Agentic Web:** Building AI agents in an open and accessible manner.


OPENAI_API_KEY is not set, skipping trace export


In [9]:
description = "Search the web using Google Search"

tool_google_search = gemini_search_agent.as_tool(tool_name="gemini_search_agent", tool_description=description)

In [18]:
tools_for_research_manager_agent.append(tool_google_search)
tools_for_research_manager_agent

[FunctionTool(name='planner_agent', description='Generate web search terms', params_json_schema={'properties': {'input': {'title': 'Input', 'type': 'string'}}, 'required': ['input'], 'title': 'planner_agent_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x10cd25ee0>, strict_json_schema=True, is_enabled=True),
 FunctionTool(name='gemini_search_agent', description='Search the web using Google Search', params_json_schema={'properties': {'input': {'title': 'Input', 'type': 'string'}}, 'required': ['input'], 'title': 'gemini_search_agent_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x10c215e40>, strict_json_schema=True, is_enabled=True)]

### OpenAI Search Agent
- Adjust previous openai_search_agent & test
    - Take the list of queries and perform a search for each one
    - Test with simulated query list
    - Convert to .as_tool()
    

In [None]:
INSTRUCTIONS = "You are a research assistant. Given a web search term, you search the web \
for that term and produce a concise summary of the results. The summary must be 5-8 paragraphs and between 500 and 1000 \
words. Capture the main points. Write succinctly, add evidence/facts, no need to have complete sentences or good \
grammar. This will be consumed by someone synthesizing a report, so it's vital you capture the \
essence and ignore any fluff. Do not include any additional commentary other than the summary itself."

openai_search_agent = Agent(
    name="Search Agent",
    instructions=INSTRUCTIONS,
    tools=[WebSearchTool(search_context_size="low")],
    model="gpt-4o-mini",
    model_settings=ModelSettings(tool_choice="required"),
)

In [None]:
message = "notable AI agent frameworks introduced or updated in 2025"

with trace("Search"):
    result = await Runner.run(openai_search_agent, message)

display(Markdown(result.final_output))

In [None]:
description = "Search the web using OpenAI's WebSearchTool"

tool_openai_search = openai_search_agent.as_tool(tool_name="openai_search_agent", tool_description=description)

In [None]:
tools_for_research_manager_agent.append(tool_openai_search)
tools_for_research_manager_agent

### Editor Agent
- Create an editor_agent
    - Receives the report and edits it to a professional level
    - Returns the same as writer_agent

In [19]:
class ReportData(BaseModel):
    short_summary: str = Field(description="A short 2-3 sentence summary of the findings.")
    markdown_report: str = Field(description="The final report")
    follow_up_questions: list[str] = Field(description="Suggested topics to research further")

In [20]:
INSTRUCTIONS = (
    "You are a senior editor tasked with editing and revising a cohesive report for a research query. "
    "You will be provided with the report data including a short summary, full markdown_report, and some follow_up_questions.\n"
    "You should first review the report, analyze the structure, and correct grammar, spelling, sentence structure, etc. for professionalism. "
    "Then, create a revised report using the original report as a basis and return that as your final output.\n"
    "The final output should be in markdown format, and it should be lengthy and detailed with FULL SENTENCES. "
    "This is a research report, aim for 10-15 pages of content, AT LEAST 2000 words, maximum of 5000 words."
)

editor_agent = Agent(
    name="Editor Agent",
    instructions=INSTRUCTIONS,
    model=gemini_model,
    output_type=ReportData
)

### Writer Agent
- Adjust the writer_agent
    - Receives the research, writes a report, also returns a markdown report
    - Hands off to editor_agent

In [21]:
INSTRUCTIONS = (
    "You are a senior researcher tasked with writing a cohesive report for a research query. "
    "You will be provided with the original query, and some initial research done by a research assistant.\n"
    "You should first come up with an outline for the report that describes the structure and "
    "flow of the report. Then, generate the report using the research as a basis and return that as your final output.\n"
    "The final output should be in markdown format, and it should be lengthy and detailed. Aim "
    "for 10-15 pages of content, at least 2000 words.\n"
    "Once you have written the report, hand it off to the `editor_agent`.\n"
    "The `editor_agent` will be responsible for revising the report to a professional standard."
)

writer_agent = Agent(
    name="Writer Agent",
    instructions=INSTRUCTIONS,
    model=gemini_model,
    output_type=ReportData,
    handoffs=[handoff(editor_agent)]
)

### Research Manager Agent
- Create a research_manager_agent
    - Given a query use the planner_agent to generate search queries to research the query
    - Next use the gemini_search_agent and openai_search_agent to conduct the search queries. They should each be used once for every search query. That way you receive two different results to consider for each query for the final research report. If you believe more information is needed to address the query, you can use the planner_agent only one additional time followed by the gemini_search_agent and the openai_search_agent to gather additional research. You will then hand off this research to the writer_agent to write a report.
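
The "only one additional time" limit on the planner is stated in the prompt, so it depends on the model following instructions. One way to back it up in code is to count calls at the tool boundary. The sketch below is a plain-Python illustration of that idea, not an Agents SDK feature: `CallBudget` and `fake_planner` are hypothetical names, and wiring such a guard into a real tool is left as an assumption.

```python
class CallBudget:
    """Counts calls per tool name and refuses calls past a fixed limit."""

    def __init__(self, limits: dict[str, int]):
        self.limits = dict(limits)
        self.counts = {name: 0 for name in limits}

    def wrap(self, name: str, func):
        def guarded(*args, **kwargs):
            if self.counts[name] >= self.limits[name]:
                # Return a message instead of raising, so a calling model
                # could read the refusal and move on to the next step.
                return f"Refused: {name} already called {self.limits[name]} times."
            self.counts[name] += 1
            return func(*args, **kwargs)
        return guarded

def fake_planner(query: str) -> list[str]:
    # Hypothetical stand-in for the planner tool.
    return [f"{query} search {i}" for i in range(1, 4)]

budget = CallBudget({"planner_agent": 2})
planner = budget.wrap("planner_agent", fake_planner)

outputs = [planner("AI agent frameworks") for _ in range(3)]
for out in outputs:
    print(out)
```

The first two calls go through and the third is refused, which matches the limit of one initial call plus one additional call.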

In [22]:
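# `openai_search_agent` is left out of the agent lists below, matching the tools registered for this run.
# To use it as well, add this bullet under each `gemini_search_agent` line:
#      - `openai_search_agent`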
INSTRUCTIONS = """
You are a Research Manager Agent responsible for coordinating a multi-step research process to produce high-quality research reports.

Follow this workflow precisely:

1. **Planning Phase**
   - When given a research query, always start by using the `planner_agent`.
   - Use the `planner_agent` to generate a set of structured search queries. The planner will return these queries as a Pydantic model.

2. **Initial Research Phase**
   - For each search query provided by the planner, conduct research using both of the following agents:
     - `gemini_search_agent`

   - You must use each of these search agents exactly once per search query. This ensures you gather two different perspectives for every query.

3. **Supplemental Research (if necessary)**
   - After reviewing the initial results, if you determine that additional information is required to adequately address the original research query, you may proceed to collect more research.
   - To do this, you may call the `planner_agent` **only one additional time** to generate more search queries.
   - For each additional search query, again use both:
     - `gemini_search_agent`

   - Do not exceed this single additional round of planning and research.

4. **Report Handoff**
   - Once you have gathered all research results, hand off the complete set of research materials to the `writer_agent`.
   - The `writer_agent` will be responsible for synthesizing the information into a final research report.

Important Guidelines:
- Always perform the research workflow in the sequence described: planner → search agents → (optional, only allowed to call 1 additional time) planner → search agents → writer.
- Do not skip any steps or call the search agents before the planner.
- DO NOT call the planner more than two (2) times TOTAL per query. You should only call it a maximum of 1 initial time and 1 additional time, for a total of 2 times.
- Ensure all gathered results are organized clearly before passing them to the writer agent.
"""

In [23]:
tools_for_research_manager_agent

[FunctionTool(name='planner_agent', description='Generate web search terms', params_json_schema={'properties': {'input': {'title': 'Input', 'type': 'string'}}, 'required': ['input'], 'title': 'planner_agent_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x10cd25ee0>, strict_json_schema=True, is_enabled=True),
 FunctionTool(name='gemini_search_agent', description='Search the web using Google Search', params_json_schema={'properties': {'input': {'title': 'Input', 'type': 'string'}}, 'required': ['input'], 'title': 'gemini_search_agent_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x10c215e40>, strict_json_schema=True, is_enabled=True)]

In [25]:
research_manager_agent = Agent(
    name="Research Manager Agent",
    instructions=INSTRUCTIONS,
    tools=tools_for_research_manager_agent,
    model=gemini_model,
    model_settings=ModelSettings(tool_choice="required"),
    handoffs=[handoff(writer_agent)]
)

In [26]:
# RUN AGENTIC SYSTEM HERE!

message = "Latest AI Agent frameworks in 2025"

with trace("Search"):
    final_result = await Runner.run(research_manager_agent, message)
    print("Report Complete!")

RateLimitError: Error code: 429 - [{'error': {'code': 429, 'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.', 'status': 'RESOURCE_EXHAUSTED', 'details': [{'@type': 'type.googleapis.com/google.rpc.QuotaFailure', 'violations': [{'quotaMetric': 'generativelanguage.googleapis.com/generate_content_free_tier_requests', 'quotaId': 'GenerateRequestsPerMinutePerProjectPerModel-FreeTier', 'quotaDimensions': {'location': 'global', 'model': 'gemini-2.0-flash'}, 'quotaValue': '15'}]}, {'@type': 'type.googleapis.com/google.rpc.Help', 'links': [{'description': 'Learn more about Gemini API quotas', 'url': 'https://ai.google.dev/gemini-api/docs/rate-limits'}]}, {'@type': 'type.googleapis.com/google.rpc.RetryInfo', 'retryDelay': '27s'}]}}]
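
The run above stopped on a 429 from the Gemini free tier; the error reports a quota value of 15 requests per minute and a suggested retry delay of 27s. A common way to cope is to retry with exponential backoff. The sketch below shows the pattern with standard-library code only: `QuotaExceeded`, `with_retries`, and `flaky_call` are hypothetical names, and in the notebook the exception to catch would presumably be the `RateLimitError` shown in the traceback rather than this stand-in.

```python
import asyncio
import random

class QuotaExceeded(Exception):
    """Hypothetical stand-in for a 429 / RESOURCE_EXHAUSTED error."""

async def with_retries(make_call, max_attempts=4, base_delay=1.0, sleep=asyncio.sleep):
    """Await make_call(), retrying on QuotaExceeded with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await make_call()
        except QuotaExceeded:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            delay = base_delay * 2 ** (attempt - 1)
            # A little jitter keeps parallel callers from retrying in lockstep.
            await sleep(delay + random.uniform(0, delay * 0.1))

# Demo: a call that fails twice before succeeding.
attempts = {"n": 0}
waits = []

async def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise QuotaExceeded("429")
    return "Report Complete!"

async def record_sleep(seconds):
    # Records the requested wait instead of sleeping, to keep the demo fast.
    waits.append(seconds)

outcome = asyncio.run(with_retries(flaky_call, sleep=record_sleep))
print(outcome, attempts["n"], len(waits))
```

Note that retrying a whole agent run repeats every model call made so far, so on a tight per-minute quota a `base_delay` closer to the suggested retry delay may be more practical than the one-second default used here.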

In [None]:
display(Markdown(final_result.final_output.short_summary))

In [None]:
display(Markdown(final_result.final_output.markdown_report))

### Trace

https://platform.openai.com/traces

In [None]:
from agents.extensions.visualization import draw_graph
draw_graph(research_manager_agent)

In [None]:
from agents.extensions.visualization import draw_graph
draw_graph(writer_agent)

In [None]:
from agents.extensions.visualization import draw_graph
draw_graph(editor_agent)