## Deep Research

One of the classic cross-business Agentic use cases! This is huge.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">A Deep Research agent is broadly applicable to any business area, and to your own day-to-day activities. You can make use of this yourself!
            </span>
        </td>
    </tr>
</table>

In [8]:
from agents import Agent, WebSearchTool, trace, Runner, gen_trace_id, function_tool
from agents.model_settings import ModelSettings
from pydantic import BaseModel, Field
from dotenv import load_dotenv
import asyncio
import sendgrid
import os
from sendgrid.helpers.mail import Mail, Email, To, Content
from typing import Dict
from IPython.display import display, Markdown

#libraries to fix certificate error
import ssl
import certifi

In [9]:
load_dotenv(override=True)

#this fixes certificate error 
os.environ['SSL_CERT_FILE'] = certifi.where()
ssl_context = ssl.create_default_context(cafile=certifi.where())

## OpenAI Hosted Tools

The OpenAI Agents SDK includes the following hosted tools:

The `WebSearchTool` lets an agent search the web.  
The `FileSearchTool` allows retrieving information from your OpenAI Vector Stores.  
The `ComputerTool` allows automating computer use tasks like taking screenshots and clicking.

### Important note - API charge of WebSearchTool

The OpenAI WebSearchTool is costing me 2.5 cents per call, which can add up to $2-$3 over the next 2 labs. We'll use free and low-cost search tools on other platforms, so feel free to skip running this if cost is a concern. Student Christian W. also pointed out that OpenAI can sometimes charge for multiple searches within a single call, so a call can cost more than 2.5 cents.

Costs are here: https://platform.openai.com/docs/pricing#web-search
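To get a feel for how the charge adds up, here is a minimal back-of-the-envelope sketch. It assumes the 2.5 cents per search quoted above (check the pricing page for the current rate); `estimate_search_cost` and its `searches_per_call` parameter are hypothetical names used for illustration, not part of the SDK.

```python
from decimal import Decimal

# Assumed rate in USD per web search, taken from the note above (may change)
COST_PER_SEARCH = Decimal("0.025")


def estimate_search_cost(agent_calls: int, searches_per_call: int = 1) -> Decimal:
    """Rough cost estimate: calls x searches billed per call x unit price."""
    return COST_PER_SEARCH * agent_calls * searches_per_call


# Three search-agent calls, one billed search each
print(estimate_search_cost(3))
# Same three calls if each one were billed as two searches
print(estimate_search_cost(3, searches_per_call=2))
# A hundred calls lands in the $2-$3 range mentioned above
print(estimate_search_cost(100))
```

Decimal is used instead of float so that small amounts multiply without rounding noise.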

In [4]:
INSTRUCTIONS = "You are a research assistant. Given a search term, you search the web for that term and \
produce a concise summary of the results. The summary must be 2-3 paragraphs and less than 300 \
words. Capture the main points. Write succinctly, no need to have complete sentences or good \
grammar. This will be consumed by someone synthesizing a report, so it's vital you capture the \
essence and ignore any fluff. Do not include any additional commentary other than the summary itself."

search_agent = Agent(
    name="Search agent",
    instructions=INSTRUCTIONS,
    tools=[WebSearchTool(search_context_size="low")],
    model="gpt-4o-mini",
    model_settings=ModelSettings(tool_choice="required"), #mandatory -- LLM is REQUIRED to use this tool
)

In [None]:
message = "Market Openings in the Environmental Sector 2026"

with trace("Search"):

    # Runner.run(starting_agent, input) runs the agent loop and returns a result object;
    # result.final_output holds the agent's final response (plain text for this agent)
    result = await Runner.run(search_agent, message)

display(Markdown(result.final_output))

The environmental sector is experiencing significant growth, leading to increased market openings across various industries. The global environmental consulting services market is projected to reach $60.55 billion by 2035, with a compound annual growth rate (CAGR) of 4.6% from 2026 to 2035. This growth is driven by stricter environmental regulations and a heightened focus on sustainability. ([businessresearchinsights.com](https://www.businessresearchinsights.com/market-reports/environmental-consulting-services-market-118587?utm_source=openai))

In the United States, the environmental industry is expected to generate approximately $540 billion in 2025, with a projected CAGR of 3.8% from 2025 through 2026. This growth is attributed to infrastructure funding, energy security, energy transition, Environmental, Social, and Governance (ESG) initiatives, and climate resilience efforts. ([sec.gov](https://www.sec.gov/Archives/edgar/data/0001643615/000095017025043826/2025_ars_final.pdf?utm_source=openai))

The environmental testing market is also expanding, with an estimated size of $13.61 billion in 2025 and a projected CAGR of 7.3% from 2026 to 2033. This growth is driven by stricter environmental standards and increased public awareness of pollution and sustainability. ([grandviewresearch.com](https://www.grandviewresearch.com/industry-analysis/environmental-testing-market-report?utm_source=openai))

Additionally, the environmental technology market is projected to reach $957.77 billion by 2034, growing at a CAGR of 4.42% from 2025 to 2034. ([precedenceresearch.com](https://www.precedenceresearch.com/environmental-technology-market?utm_source=openai))

These trends indicate a robust expansion in the environmental sector, leading to numerous market openings and opportunities for professionals and businesses alike. 

### As always, take a look at the trace

https://platform.openai.com/traces

### We will now use Structured Outputs, and include a description of the fields

In [6]:
# See note above about cost of WebSearchTool

HOW_MANY_SEARCHES = 3

INSTRUCTIONS = f"You are a helpful research assistant. Given a query, come up with a set of web searches \
to perform to best answer the query. Output {HOW_MANY_SEARCHES} terms to query for."

# Use Pydantic to define the Schema of our response - this is known as "Structured Outputs"

class WebSearchItem(BaseModel):
    #the info in description is essential instruction for the model to ensure a better structured output
    reason: str = Field(description="Your reasoning for why this search is important to the query.") 

    query: str = Field(description="The search term to use for the web search.")


class WebSearchPlan(BaseModel):
    searches: list[WebSearchItem] = Field(description="A list of web searches to perform to best answer the query.")


planner_agent = Agent(
    name="PlannerAgent",
    instructions=INSTRUCTIONS,
    model="gpt-4o-mini",
    output_type=WebSearchPlan, #the output will specifically be a list of web searches as described in object WebSearchPlan
)

In [7]:

message = "Market Openings in the Environmental Sector 2026"

with trace("Search"):
    result = await Runner.run(planner_agent, message)
    print(result.final_output)

searches=[WebSearchItem(reason='To find projected trends and opportunities in the environmental sector for upcoming years.', query='Environmental sector market trends 2026'), WebSearchItem(reason='To gather information about new businesses and innovations expected to emerge in the environmental field by 2026.', query='New business opportunities in environmental sector 2026'), WebSearchItem(reason='To identify specific markets and niches within the environmental sector that are predicted to grow in 2026.', query='Growing markets in environmental industry 2026')]
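Structured Outputs means the model's reply arrives as JSON matching a schema and is then parsed into typed objects, which is why the output above prints as `WebSearchItem` instances rather than free text. The sketch below illustrates that parse-and-validate step using only the standard library. It is a simplified stand-in, not what Pydantic or the SDK does internally; `SearchItem` and `parse_plan` are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class SearchItem:
    reason: str
    query: str


def parse_plan(raw_json: str) -> list[SearchItem]:
    """Parse a JSON search plan, rejecting anything that does not fit the expected shape."""
    data = json.loads(raw_json)
    if not isinstance(data, dict) or not isinstance(data.get("searches"), list):
        raise ValueError("expected an object with a 'searches' list")
    items = []
    for entry in data["searches"]:
        if not isinstance(entry, dict):
            raise ValueError("each search must be an object")
        # Both fields are mandatory and must be strings
        for key in ("reason", "query"):
            if not isinstance(entry.get(key), str):
                raise ValueError(f"each search needs a string {key!r}")
        items.append(SearchItem(reason=entry["reason"], query=entry["query"]))
    return items


raw = json.dumps({
    "searches": [
        {"reason": "Find projected trends", "query": "Environmental sector market trends 2026"},
    ]
})
plan = parse_plan(raw)
for item in plan:
    print(f"{item.query} <- {item.reason}")
```

A reply with a missing field raises `ValueError` here instead of silently passing through, which is the main benefit of a typed schema over parsing free text.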


In [10]:
@function_tool
def send_email(subject: str, html_body: str) -> Dict[str, str]:
    """ Send out an email with the given subject and HTML body """
    sg = sendgrid.SendGridAPIClient(api_key=os.environ.get('SENDGRID_API_KEY'))
    from_email = Email("gabrielle.carpenter.25@ucl.ac.uk") # Change this to your verified email
    to_email = To("gabrielle.carpenter.25@ucl.ac.uk") # Change this to your email
    content = Content("text/html", html_body)
    mail = Mail(from_email, to_email, subject, content).get()

    sg.client.mail.send.post(request_body=mail)
    
    return {"status": "success"}

In [11]:
send_email

FunctionTool(name='send_email', description='Send out an email with the given subject and HTML body', params_json_schema={'properties': {'subject': {'title': 'Subject', 'type': 'string'}, 'html_body': {'title': 'Html Body', 'type': 'string'}}, 'required': ['subject', 'html_body'], 'title': 'send_email_args', 'type': 'object', 'additionalProperties': False}, on_invoke_tool=<function function_tool.<locals>._create_function_tool.<locals>._on_invoke_tool at 0x0000022DA217AF20>, strict_json_schema=True, is_enabled=True, tool_input_guardrails=None, tool_output_guardrails=None)
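The `FunctionTool` shown above carries a name, a description taken from the docstring, and a JSON schema for the parameters. The sketch below is a stdlib-only illustration of how such a description can be derived from a function's signature. It is not the SDK's implementation, and `describe_function`, `JSON_TYPES`, and `send_email_stub` are hypothetical names.

```python
import inspect
from typing import get_type_hints

# Simplified mapping from Python types to JSON schema types (assumption: primitives only)
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def describe_function(func):
    """Build a tool-style description from a function's signature and docstring."""
    hints = get_type_hints(func)
    hints.pop("return", None)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        # Parameters without a default value are mandatory
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "params_json_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
            "additionalProperties": False,
        },
    }


def send_email_stub(subject: str, html_body: str) -> str:
    """Send out an email with the given subject and HTML body"""
    return "success"  # stub: no email is sent


schema = describe_function(send_email_stub)
print(schema)
```

This is why clear parameter names, type hints, and a meaningful docstring matter: they are what the model sees when deciding how to call the tool.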

In [12]:
INSTRUCTIONS = """You are able to send a nicely formatted HTML email based on a detailed report.
You will be provided with a detailed report. You should use your tool to send one email, providing the 
report converted into clean, well presented HTML with an appropriate subject line."""

email_agent = Agent(
    name="Email agent",
    instructions=INSTRUCTIONS,
    tools=[send_email],
    model="gpt-4o-mini",
)



In [13]:
INSTRUCTIONS = (
    "You are a senior researcher tasked with writing a cohesive report for a research query. "
    "You will be provided with the original query, and some initial research done by a research assistant.\n"
    "You should first come up with an outline for the report that describes the structure and "
    "flow of the report. Then, generate the report and return that as your final output.\n"
    "The final output should be in markdown format, and it should be lengthy and detailed. Aim "
    "for 6-8 pages of content, at least 700 words."
)

#creating a structured output for the report
class ReportData(BaseModel):
    short_summary: str = Field(description="A short 2-3 sentence summary of the findings.")

    markdown_report: str = Field(description="The final report")

    follow_up_questions: list[str] = Field(description="Suggested topics to research further")


writer_agent = Agent(
    name="WriterAgent",
    instructions=INSTRUCTIONS,
    model="gpt-4o-mini",
    output_type=ReportData,
)

### The next 3 functions will plan and execute the search, using planner_agent and search_agent

In [14]:
async def plan_searches(query: str):
    """ Use the planner_agent to plan which searches to run for the query """
    print("Planning searches...")
    # Run planner_agent on the query passed in by the caller
    result = await Runner.run(planner_agent, f"Query: {query}")

    #prints number of searches
    print(f"Will perform {len(result.final_output.searches)} searches")
    return result.final_output

# Run search() concurrently for every WebSearchItem in the plan produced by plan_searches()
async def perform_searches(search_plan: WebSearchPlan):
    """ Call search() for each item in the search plan """
    print("Searching...")

    # search(item) creates a coroutine for one WebSearchItem;
    # asyncio.create_task() wraps it in a Task and schedules it on the event loop
    tasks = [asyncio.create_task(search(item)) for item in search_plan.searches]

    # The * operator unpacks the list so each task is passed to gather() as a separate argument;
    # gather() returns the results as a list, in the same order as the tasks
    results = await asyncio.gather(*tasks)
    print("Finished searching")
    return results

async def search(item: WebSearchItem):
    """ Use the search agent to run a web search for one item in the search plan """
    search_input = f"Search term: {item.query}\nReason for searching: {item.reason}"
    result = await Runner.run(search_agent, search_input)
    return result.final_output
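The concurrency pattern in `perform_searches` can be tried without any API calls. The sketch below swaps the search agent for a hypothetical `fake_search` coroutine that just sleeps, and adds one thing the notebook code does not do: `return_exceptions=True`, so a single failed search is dropped rather than failing the whole batch. `fake_search` and `run_searches` are illustrative names only.

```python
import asyncio


async def fake_search(query: str, delay: float) -> str:
    """Hypothetical stand-in for search(): sleeps instead of calling an agent."""
    await asyncio.sleep(delay)
    if "fail" in query:
        raise RuntimeError(f"search failed for {query!r}")
    return f"summary of {query}"


async def run_searches(queries: list[str]) -> list[str]:
    # Earlier queries get longer delays, so they finish last
    tasks = [
        asyncio.create_task(fake_search(q, delay=0.05 * (len(queries) - i)))
        for i, q in enumerate(queries)
    ]
    # With return_exceptions=True, exceptions come back as values instead of propagating
    outcomes = await asyncio.gather(*tasks, return_exceptions=True)
    # Keep only the successful results; gather() preserves the input order
    return [o for o in outcomes if not isinstance(o, BaseException)]


results = asyncio.run(run_searches(["trends", "fail this one", "niches"]))
print(results)
```

Even though the first query finishes last, its result still comes first, because `gather()` orders results by task position rather than completion time. Inside a notebook, where an event loop is already running, call `await run_searches([...])` instead of `asyncio.run(...)`.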

### The next 2 functions write a report and email it

In [15]:
async def write_report(query: str, search_results: list[str]):
    """ Use the writer agent to write a report based on the search results"""
    print("Thinking about report...")
    writer_input = f"Original query: {query}\nSummarized search results: {search_results}"
    result = await Runner.run(writer_agent, writer_input)
    print("Finished writing report")
    return result.final_output

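# Note: this coroutine reuses the name send_email, shadowing the function tool defined earlier.
# email_agent already holds a reference to that tool, so this works when the cells run in order.
async def send_email(report: ReportData):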
    """ Use the email agent to send an email with the report """
    print("Writing email...")
    result = await Runner.run(email_agent, report.markdown_report)
    print("Email sent")
    return report

### Showtime!

In [16]:
query = "Market Openings in the Environmental Sector 2026"

with trace("Research trace"):
    print("Starting research...")
    search_plan = await plan_searches(query)
    search_results = await perform_searches(search_plan)
    report = await write_report(query, search_results)
    await send_email(report)  
    print("Hooray!")




Starting research...
Planning searches...
Will perform 3 searches
Searching...
Finished searching
Thinking about report...
Finished writing report
Writing email...
Email sent
Hooray!


### As always, take a look at the trace

https://platform.openai.com/traces

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/thanks.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00cc00;">Congratulations on your progress, and a request</h2>
            <span style="color:#00cc00;">You've reached an important moment with the course; you've created a valuable Agent using one of the latest Agent frameworks. You've upskilled, and unlocked new commercial possibilities. Take a moment to celebrate your success!<br/><br/>Something I should ask you -- my editor would smack me if I didn't mention this. If you're able to rate the course on Udemy, I'd be seriously grateful: it's the most important way that Udemy decides whether to show the course to others and it makes a massive difference.<br/><br/>And another reminder to <a href="https://www.linkedin.com/in/eddonner/">connect with me on LinkedIn</a> if you wish! If you wanted to post about your progress on the course, please tag me and I'll weigh in to increase your exposure.
            </span>
        </td>
    </tr>