# Building Your Own AI Research Agent


### About Me
- will brown from ai twitter (@willccbb)
- research lead @ prime intellect
- spent 2yrs @ morgan stanley doing LLM applications
- did phd @ columbia on multi-agent learning theory
- i work on agentic RL stuff (see [willccbb/verifiers](https://github.com/willccbb/verifiers) on github)


### About the Full Course (Production-Ready Agent Engineering: From MCP to RL)

https://maven.com/will-brown-kyle-corbitt/agents-mcp-rl

- runs june 16 - july 4
- co-teaching with kyle corbitt (@corbtt), ceo of openpipe
- agent stuff AND rl stuff
    - two sides of the same coin
    - course starts with practical agent stuff
    - builds towards RL finetuning for OSS models
    - most patterns have analogues which can be applied to closed/API models

### Today's Lightning Lesson
- Basic definitions
- Function-calling and writing your own tools
- MCP crash course
- Connecting MCP servers to clients
- Scaffolding a "deep research" agent

### What's an "agent"?

No universally agreed-upon definition, but:
- A system that can take actions in an environment towards a goal
- Ability to dynamically adapt logic on the fly in response to observations
- Core contrast: agents vs. workflows/pipelines

Agents:
- Deep Research
- Claude Code
- Manus 

Non-agents:
- pre-fetch RAG
- fixed decision trees of LLMs
- classifier-based routers (e.g. easy vs. hard questions to different LLMs)

Reductive definition:
- "LLM with tool calls in a while loop"
- Must-read: [Building Effective Agents](https://www.anthropic.com/engineering/building-effective-agents) from Anthropic
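
To make the reductive definition concrete, here is a minimal sketch of "LLM with tool calls in a while loop". The model is a scripted stand-in (`make_scripted_model` is a hypothetical helper, not part of any library) so the control flow runs without an API key. The point is the contrast with a workflow: the model's reply, not the surrounding code, decides the next step and when to stop.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: replays scripted replies in order.
def make_scripted_model(replies: List[dict]) -> Callable[[List[dict]], dict]:
    queue = list(replies)

    def model(messages: List[dict]) -> dict:
        return queue.pop(0)

    return model

def run_agent(model, tools: Dict[str, Callable[..., str]], goal: str, max_turns: int = 10) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):
        reply = model(messages)
        if "answer" in reply:  # the model decides when to stop
            return reply["answer"]
        result = tools[reply["tool"]](**reply["args"])  # the model decides what to do next
        messages.append({"role": "assistant", "content": str(reply)})
        messages.append({"role": "user", "content": result})
    return "No answer within the turn limit"

toy_tools = {"add": lambda a, b: str(a + b)}
toy_model = make_scripted_model([
    {"tool": "add", "args": {"a": 2, "b": 3}},
    {"answer": "2 + 3 = 5"},
])
print(run_agent(toy_model, toy_tools, "What is 2 + 3?"))
```

Swapping the scripted model for a real chat-completion call and the toy tool for search/fetch gives the agent built later in this lesson.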


![AGENTS](images/agent.webp)

### What's an "environment"?

Traditionally:
- States, Actions, Environments, Transitions, Rewards

Most basic example:
- system prompt
- goal 
- set of tools
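
One way to picture the "most basic example" is as a small object that bundles the three pieces and exposes a `step` method. This is only a sketch: the class name `ToolEnvironment` and its methods are made up for illustration, the message history plays the role of "state", and rewards are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class ToolEnvironment:
    """Bundles a system prompt, a goal, and a set of tools."""
    system_prompt: str
    goal: str
    tools: Dict[str, Callable[..., str]]
    history: List[dict] = field(default_factory=list)  # plays the role of "state"

    def reset(self) -> List[dict]:
        self.history = [
            {"role": "system", "content": self.system_prompt},
            {"role": "user", "content": self.goal},
        ]
        return self.history

    def step(self, tool_name: str, args: dict) -> str:
        """Apply one action (a tool call) and return the observation."""
        if tool_name not in self.tools:
            observation = f"Error: Tool {tool_name} not found"
        else:
            observation = self.tools[tool_name](**args)
        self.history.append({"role": "user", "content": observation})  # the "transition"
        return observation

env = ToolEnvironment(
    system_prompt="You are a helpful assistant.",
    goal="What is the latest news in the NBA?",
    tools={"echo": lambda text: text},  # placeholder tool for illustration
)
env.reset()
print(env.step("echo", {"text": "hello"}))
print(len(env.history))
```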

Simple version with no fancy libraries:
- OpenAI client (can use any model/provider, or locally-hosted endpoint)
- search tool + fetch tool

In [57]:
import os
from openai import OpenAI

model_name = "gpt-4.1"
base_url = "https://api.openai.com/v1"
client = OpenAI(base_url=base_url, api_key=os.getenv("OPENAI_API_KEY"))


system_prompt = """
You are a helpful assistant that can answer questions and help with tasks, such as drafting short research reports. Cite your sources when relevant. 

You have access to the following tools:

- search(query: str) -> str: Searches the web and returns summaries of top results.
- fetch(url: str) -> str: Fetches the content of a given URL and returns it as a markdown page.

You may call one tool per turn, for up to 10 turns, before giving your final answer.
In each turn, you should respond in the following format:

<think>
[your thoughts here]
</think>
<tool>
JSON with the following fields:
- name: The name of the tool to call
- args: A dictionary of arguments to pass to the tool (must be valid JSON)
</tool>

When you are done, give your final answer in the following format:

<answer>
[your final answer here]
</answer>
"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the latest news in the NBA?"},
]

response = client.chat.completions.create(
    model=model_name,
    messages=messages, # type: ignore
)

response = response.choices[0].message.content # type: ignore
print(response)


<think>
I will search for the latest news in the NBA to provide up-to-date information.
</think>
<tool>
{"name": "search", "args": {"query": "latest NBA news"}}
</tool>


In [58]:
# Basic search tool

def search(query: str) -> str:
    """Searches the web and returns summaries of top results.
    
    Args:
        query: The search query string

    Returns:
        Formatted string with bullet points of top 10 results, each with title, source, url, and brief summary

    Examples:
        {"query": "who invented the lightbulb"} -> ["Thomas Edison (1847-1931) - Inventor of the lightbulb", ...]
        {"query": "what is the capital of France"} -> ["Paris is the capital of France", ...]
        {"query": "when was the Declaration of Independence signed"} -> ["The Declaration of Independence was signed on July 4, 1776", ...]
    """

    try:
        from brave import Brave
        # set BRAVE_API_KEY in your environment
        brave = Brave()
        results = brave.search(q=query, count=10, raw=True) # type: ignore
        web_results = results.get('web', {}).get('results', []) # type: ignore
        
        if not web_results:
            return "No results found"

        summaries = []
        for r in web_results:
            if 'profile' not in r:
                continue
            header = f"{r['profile']['name']} ({r['profile']['long_name']})"
            title = r['title']
            snippet = r['description']
            url = r['url'] 
            summaries.append(f"•  {header}\n   {title}\n   {snippet}\n   {url}")

        return "\n\n".join(summaries)
    except Exception as e:
        return f"Error: {str(e)}"

# test
results = search("latest basketball scores")
print(results)

•  Flashscoreusa (flashscoreusa.com)
   Basketball Livescore, Basketball Results | Flashscore - NBA, Euroleague, NCAA
   <strong>Basketball</strong> livescore on Flashscore offers all the <strong>latest</strong> <strong>basketball</strong> results from more than 500+ <strong>basketball</strong> leagues all around the world including NBA, Euroleague, NCAA and more. Find all today&#x27;s/tonight&#x27;s <strong>basketball</strong> <strong>scores</strong> on Flashscore.
   https://www.flashscoreusa.com/basketball/

•  ESPN (Entertainment and Sports Programming Network)
   NBA Scores, 2024-25 Season - ESPN
   Live <strong>scores</strong> for every 2024-25 NBA season game on ESPN. Includes box <strong>scores</strong>, video highlights, play breakdowns and updated odds.
   https://www.espn.com/nba/scoreboard

•  Ncaa (ncaa.com)
   NCAA college basketball scores | NCAA.com
   Live college <strong>basketball</strong> <strong>scores</strong>, schedules and rankings from NCAA Division I men&#x27;
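
The snippets in the output above still contain `<strong>` tags and HTML entities such as `&#x27;`, which cost tokens and add noise for the model. A small cleanup helper, using only the standard library, might look like this (`clean_snippet` is a hypothetical name; it could be applied to `r['description']` inside `search`):

```python
import html
import re

def clean_snippet(snippet: str) -> str:
    """Strip HTML tags, then decode entities such as &#x27; in a search snippet."""
    without_tags = re.sub(r"<[^>]+>", "", snippet)
    return html.unescape(without_tags).strip()

raw = "Find all today&#x27;s <strong>basketball</strong> <strong>scores</strong> on Flashscore."
print(clean_snippet(raw))
```

Tags are stripped before entities are decoded so that escaped text like `&lt;b&gt;` is not mistaken for markup.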

In [59]:
# fetch tool for URL contents

def fetch(url: str) -> str:
    """Fetches the content of a given URL and returns it as a markdown page.
    
    Args:
        url: The URL to fetch the content from

    Returns:
        A markdown page with the content of the URL.
    """
    import requests
    from markdownify import markdownify
    
    response = requests.get(url, timeout=30)  # don't let a slow site hang the agent
    response.raise_for_status()
    return markdownify(response.text)

# test
content = fetch("https://www.flashscoreusa.com/basketball/")
print(content)

Basketball Livescore, Basketball Results | Flashscore - NBA, Euroleague, NCAA




Basketball Livescore, Basketball Results, NBA, Euroleague, NCAA



![](https://static.flashscore.com/res/_fs/image/2_others/bg.png)

[Scores](/)
[News](/news/)




[Favorites](/favorites/)

[Soccer](/)
[Tennis](/tennis/)
[Golf](/golf/)
[Football](/football/)
[Basketball](/basketball/)
[Baseball](/baseball/)
[Hockey](/hockey/)

[Aussie rules](/aussie-rules/)
[Badminton](/badminton/)
[Baseball](/baseball/)
[Basketball](/basketball/)
[Beach volleyball](/beach-volleyball/)
[Boxing](/boxing/)
[Cricket](/cricket/)
[Cycling](/cycling/)
[Darts](/darts/)
[eSports](/esports/)
[Field hockey](/field-hockey/)
[Football](/football/)
[Futsal](/futsal/)
[Golf](/golf/)
[Hockey](/hockey/)
[Horse racing](/horse-racing/)
[MMA](/mma/)
[Motorsport](/motorsport/)
[Rugby League](/rugby-league/)
[Rugby Union](/rugby-union/)
[Snooker](/snooker/)
[Soccer](/soccer/)
[Table tennis](/table-tennis/)
[Team handball](/team-handball/)
[Te

In [60]:
# basic tool parsing -- universal but less robust; use instructor/outlines/API-provider parsing for anything production-grade
import re
import json

def parse_thinking_from_response(response: str) -> str | None:
    """Return the text inside <think>...</think>, or None if absent."""
    thinking = re.search(r'<think>(.*?)</think>', response, re.DOTALL)
    if thinking:
        return thinking.group(1)
    return None

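def parse_tool_from_response(response: str) -> dict | None:
    """Return the JSON tool call inside <tool>...</tool>, or None if absent.

    Raises json.JSONDecodeError if the block is not valid JSON.
    """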
    tool_call = re.search(r'<tool>(.*?)</tool>', response, re.DOTALL)
    if tool_call:
        return json.loads(tool_call.group(1))
    return None

def parse_answer_from_response(response: str) -> str | None:
    """Parse an answer from a response."""
    answer = re.search(r'<answer>(.*?)</answer>', response, re.DOTALL)
    if answer:
        return answer.group(1)
    return None

def call_tool(tool_call: dict) -> str:
    """Call a tool with the given tool call."""
    if tool_call['name'] == 'search':
        return search(tool_call['args']['query'])
    elif tool_call['name'] == 'fetch':
        return fetch(tool_call['args']['url'])
    else:
        return f"Error: Tool {tool_call['name']} not found"
    
# test
tool_call = {"name": "search", "args": {"query": "latest basketball scores"}}


example_response = """
<think>
I'll do a web search to retrieve the most recent news headlines and updates related to the NBA.
</think>
<tool>
{"name": "search", "args": {"query": "latest NBA news"}}
</tool>
"""

tool_call = parse_tool_from_response(example_response) # type: ignore
print(tool_call)

{'name': 'search', 'args': {'query': 'latest NBA news'}}
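
Since regex parsing is the less robust option, it can help to turn malformed tool calls into an error string instead of an exception, so the message can be sent back to the model as an observation. A minimal sketch, assuming the same `<tool>` format as above (`parse_tool_call_safely` is a hypothetical helper, not part of the notebook's loop):

```python
import json
import re
from typing import Optional, Tuple

def parse_tool_call_safely(response: str) -> Tuple[Optional[dict], Optional[str]]:
    """Return (tool_call, error); both are None when no <tool> block is present."""
    match = re.search(r"<tool>(.*?)</tool>", response, re.DOTALL)
    if not match:
        return None, None  # no tool call attempted
    try:
        call = json.loads(match.group(1))
    except json.JSONDecodeError as e:
        return None, f"Error: tool call was not valid JSON ({e.msg})"
    if not isinstance(call, dict) or "name" not in call or not isinstance(call.get("args"), dict):
        return None, "Error: tool call must be a JSON object with 'name' and 'args' fields"
    return call, None

good = '<tool>{"name": "search", "args": {"query": "latest NBA news"}}</tool>'
bad = "<tool>{name: search}</tool>"
print(parse_tool_call_safely(good))
print(parse_tool_call_safely(bad))
```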


In [61]:
tool_result = call_tool(tool_call) # type: ignore
print(tool_result)

•  NBA (nba.com)
   NBA News - Latest team, player and league news | NBA.com
   <strong>NBA</strong> <strong>News</strong>: Your source of the most updated official <strong>NBA</strong> <strong>news</strong>. Stay current on the league, team and player <strong>news</strong>, scores, stats, standings from <strong>NBA</strong>.
   https://www.nba.com/news

•  ESPN (Entertainment and Sports Programming Network)
   NBA on ESPN - Scores, Stats and Highlights
   Visit ESPN for <strong>NBA</strong> live scores, video highlights and <strong>latest</strong> <strong>news</strong>. Stream games on ESPN and play Fantasy Basketball.
   https://www.espn.com/nba/

•  NBA (nba.com)
   The official site of the NBA for the latest NBA Scores, Stats & News. | NBA.com
   Follow the action on <strong>NBA</strong> scores, schedules, stats, <strong>news</strong>, teams, and players. Buy tickets or watch the games anywhere with <strong>NBA</strong> League Pass.
   https://www.nba.com/

•  Cbssports (cbssports.
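
The tools above are described to the model by hand in the system prompt. With API-provider function calling, each tool is described by a JSON schema instead. The sketch below builds such an entry from a Python function's signature; the layout follows the `tools` format used by OpenAI-style chat completion APIs as I understand it, and it assumes every parameter is a string. `tool_schema` and `search_stub` are hypothetical names.

```python
import inspect
import json

def tool_schema(fn) -> dict:
    """Build a function-calling schema entry for a function whose parameters are all strings."""
    params = inspect.signature(fn).parameters
    doc = inspect.getdoc(fn) or ""
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.split("\n")[0],  # first docstring line
            "parameters": {
                "type": "object",
                "properties": {name: {"type": "string"} for name in params},
                "required": list(params),
            },
        },
    }

def search_stub(query: str) -> str:  # stub with the same signature as the search tool
    """Searches the web and returns summaries of top results."""
    return ""

print(json.dumps(tool_schema(search_stub), indent=2))
```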

In [62]:
# agent loop
final_answer = ""

question = "What is the latest news in the NBA? Give me a 3 paragraph report."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": question},
]

max_turns = 10  # the system prompt promises at most 10 turns
turns = 0
while turns < max_turns:
    turns += 1
    retries = 0 
    while retries < 5:
        maybe_thinking = maybe_tool_call = maybe_answer = None
        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=messages, # type: ignore
            )
            response = response.choices[0].message.content # type: ignore
            # parse for thinking, tool call, and/or answer 
            maybe_thinking = parse_thinking_from_response(response) # type: ignore
            maybe_tool_call = parse_tool_from_response(response) # type: ignore
            maybe_answer = parse_answer_from_response(response) # type: ignore
            if maybe_tool_call or maybe_answer:
                break
        except Exception as e:
            print(f"Error: {e}")
        retries += 1  # also count unparseable replies so the retry loop always ends
    print("=== Turn", turns, "===")
    if maybe_thinking: # type: ignore
        thinking = maybe_thinking.strip()
        print(f"Thinking: {thinking}")
    if maybe_tool_call: # type: ignore
        tool_call = maybe_tool_call
        tool_result = call_tool(tool_call)
        print(f"Tool call: {tool_call}")
        print(f"Tool result: {tool_result[:100]}")
        # keep the model's own turn in the history so it can see what it already tried
        messages.append({"role": "assistant", "content": response})
        messages.append({"role": "user", "content": tool_result})
    elif maybe_answer: # type: ignore
        final_answer = maybe_answer
        break
    else:
        print("Error: No tool call or answer found")
        break

=== Turn 1 ===
Thinking: I should search for the latest NBA news to provide an up-to-date and accurate 3-paragraph report summarizing key happenings in the league.
Tool call: {'name': 'search', 'args': {'query': 'latest NBA news'}}
Tool result: •  NBA (nba.com)
   NBA News - Latest team, player and league news | NBA.com
   <strong>NBA</strong>
=== Turn 2 ===
Thinking: To provide an up-to-date and comprehensive 3-paragraph report on the latest NBA news, I will begin by checking NBA.com's official news page for recent headlines and top stories.
Tool call: {'name': 'fetch', 'args': {'url': 'https://www.nba.com/news'}}
Tool result: NBA News - Latest team, player and league news | NBA.com

Navigation Toggle[![NBA Logo](https://cdn.
=== Turn 3 ===
Thinking: The latest NBA news centers around the ongoing playoffs, with the Oklahoma City Thunder advancing to the NBA Finals after defeating the Minnesota Timberwolves in five games. Tyrese Haliburton's impressive performance and historic triple-d

In [63]:
print(f"Final answer: {final_answer}")

Final answer: 
The NBA Playoffs are reaching a dramatic conclusion as the Oklahoma City Thunder have secured their place in the NBA Finals by defeating the Minnesota Timberwolves 4-1 in the Western Conference Finals. The Thunder's success was built on their depth, defense, and the continued star performances of Shai Gilgeous-Alexander, Jalen Williams, and rookie standout Chet Holmgren. Oklahoma City closed out the series with a dominant performance, showcasing their readiness to contend for their first championship since relocating from Seattle.

Meanwhile, the Eastern Conference Finals are turning into a showcase for Indiana Pacers guard Tyrese Haliburton, who delivered a historic Game 4 triple-double—posting at least 30 points, 15 assists, and 10 rebounds with no turnovers, a feat never before accomplished in NBA playoff history. His efforts, alongside key moments from teammate Aaron Nesmith, have pushed the Pacers to a commanding 3-1 series lead over the New York Knicks, putting Ind

In [64]:
### Models as tools

def ask_model(question: str, url: str) -> str:
    """Ask a model a question about a URL and return the answer."""

    fetch_result = fetch(url)
    system_prompt = "You are a helpful assistant that can answer questions about a given URL."
    prompt = f"""
    Here is the content of the URL {url}:
    {fetch_result}

    Here is the question:
    {question}
    """
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]
    # ask model to answer question about url
    answer = client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=messages, # type: ignore
    )
    return answer.choices[0].message.content # type: ignore

# test
question = "What is the latest news in the NBA?"
url = "https://www.nba.com/news"
answer = ask_model(question, url)
print(answer)

The latest news in the NBA includes:

1. "5 takeaways: Thunder finish Wolves in 5" – Oklahoma City Thunder advance to the NBA Finals by defeating the Minnesota Timberwolves in 5 games. The Thunder showed star power, defense, and depth in an emphatic close-out win. (8 hours ago)

2. "4 stats to know for Knicks-Pacers Game 5" – Indiana's offense has been the driving force behind its 3-1 series lead over New York Knicks as the two teams prepare for Game 5.

3. "5 key stats from OKC's run to 2025 NBA Finals" – The Thunder have amassed impressive stats on their way to the NBA Finals.

4. "Thunder showing blueprint for success" – Isiah Thomas praises the Thunder's approach and suggests the Timberwolves could learn from their success.

5. Denver Nuggets' new coach David Adelman reveals offseason plans, focusing on team conditioning. (21 hours ago)

6. Tyrese Haliburton posted a historic triple-double in Game 4, helping put the Pacers on the brink with their win over the Knicks. (18 hours ago)
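
`ask_model` passes the whole fetched page into the helper model's prompt, and pages like the ones above are mostly navigation links and blank lines. A long page may not fit in the helper's context window, so one option is to shrink the content first. This is a sketch only: `shrink_page` is a hypothetical helper and the 20,000-character default is an arbitrary assumption, not a recommended value.

```python
import re

def shrink_page(markdown: str, max_chars: int = 20000) -> str:
    """Collapse runs of blank lines, then cap the length of fetched page content."""
    collapsed = re.sub(r"\n\s*\n+", "\n\n", markdown).strip()
    if len(collapsed) <= max_chars:
        return collapsed
    return collapsed[:max_chars] + "\n\n[content truncated]"

page = "Title\n\n\n\n\nBody text\n\n\n[Scores](/)\n"
print(shrink_page(page))
```

Inside `ask_model`, this would wrap the fetch call, e.g. `fetch_result = shrink_page(fetch(url))`.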

In [68]:
# same setup as before, but the system prompt now offers ask_model in place of fetch

import os
from openai import OpenAI

model_name = "gpt-4.1"
base_url = "https://api.openai.com/v1"
client = OpenAI(base_url=base_url, api_key=os.getenv("OPENAI_API_KEY"))

system_prompt = """
You are a helpful assistant that can answer questions and help with tasks, such as drafting short research reports. Cite your sources when relevant. 

You have access to the following tools:

- search(query: str) -> str: Searches the web and returns summaries of top results.
- ask_model(question: str, url: str) -> str: Ask an AI helper model a question about a given URL and return the answer.

You may call one tool per turn, for up to 10 turns, before giving your final answer.


In each turn, you should respond in the following format:

<think>
[your thoughts here]
</think>
<tool>
JSON with the following fields:
- name: The name of the tool to call
- args: A dictionary of arguments to pass to the tool (must be valid JSON)
</tool>

When you are done, give your final answer in the following format:

<answer>
[your final answer here]
</answer>
"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "What is the latest news in the NBA?"},
]

response = client.chat.completions.create(
    model=model_name,
    messages=messages, # type: ignore
)

response = response.choices[0].message.content # type: ignore
print(response)


<think>
I will perform a web search to find the latest news and developments in the NBA, focusing on today's headlines and updates.
</think>
<tool>
{"name": "search", "args": {"query": "latest NBA news"}}
</tool>


In [69]:
# updated call_tool
def call_tool(tool_call: dict) -> str:
    """Call a tool with the given tool call."""
    if tool_call['name'] == 'search':
        return search(tool_call['args']['query'])
    elif tool_call['name'] == 'fetch':
        return fetch(tool_call['args']['url'])
    elif tool_call['name'] == 'ask_model':
        return ask_model(tool_call['args']['question'], tool_call['args']['url'])
    else:
        return f"Error: Tool {tool_call['name']} not found"
    
# agent loop
final_answer = ""

question = "What is the latest news in the NBA? Give me a 3 paragraph report."
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": question},
]

max_turns = 10  # the system prompt promises at most 10 turns
turns = 0
while turns < max_turns:
    turns += 1
    retries = 0 
    while retries < 5:
        maybe_thinking = maybe_tool_call = maybe_answer = None
        try:
            response = client.chat.completions.create(
                model=model_name,
                messages=messages, # type: ignore
            )
            response = response.choices[0].message.content # type: ignore
            # parse for thinking, tool call, and/or answer 
            maybe_thinking = parse_thinking_from_response(response) # type: ignore
            maybe_tool_call = parse_tool_from_response(response) # type: ignore
            maybe_answer = parse_answer_from_response(response) # type: ignore
            # a thinking-only reply cannot make progress, so retry it
            if maybe_tool_call or maybe_answer:
                break
        except Exception as e:
            print(f"Error: {e}")
        retries += 1  # also count unparseable replies so the retry loop always ends
    print("=== Turn", turns, "===")
    if maybe_thinking: # type: ignore
        thinking = maybe_thinking.strip()
        print(f"Thinking: {thinking}")
    if maybe_tool_call: # type: ignore
        tool_call = maybe_tool_call
        tool_result = call_tool(tool_call)
        print(f"Tool call: {tool_call}")
        print(f"Tool result: {tool_result[:100]}")
        # keep the model's own turn in the history so it can see what it already tried
        messages.append({"role": "assistant", "content": response})
        messages.append({"role": "user", "content": tool_result})
    elif maybe_answer: # type: ignore
        final_answer = maybe_answer
        break
    else:
        print("Error: No tool call or answer found")
        break

=== Turn 1 ===
Thinking: To provide a current, accurate, and succinct three-paragraph report on the latest NBA news, I will search for up-to-date summaries from reputable sports news sources.
Tool call: {'name': 'search', 'args': {'query': 'latest NBA news'}}
Tool result: •  NBA (nba.com)
   NBA News - Latest team, player and league news | NBA.com
   <strong>NBA</strong>
=== Turn 2 ===
Thinking: To write a 3-paragraph report on the latest NBA news, I need to gather updates on current games, player performances, and any major events from today. The most authoritative and updated sources would be NBA.com, ESPN, and possibly Bleacher Report. I will start by checking the latest top stories from NBA.com.
Tool call: {'name': 'ask_model', 'args': {'question': 'What are the latest NBA news stories and updates as of today? Summarize the top headlines and key events.', 'url': 'https://www.nba.com/news/category/top-stories'}}
Tool result: As of today, the latest NBA news stories and updates inclu

In [70]:
print(f"Final answer: {final_answer}")

Final answer: 
The NBA Conference Finals are reaching an exciting peak, with the Oklahoma City Thunder clinching a berth in the NBA Finals after a commanding 4-1 series win over the Minnesota Timberwolves. Oklahoma City’s return to the Finals for the first time in over a decade has been built on star power, elite defense, and enviable team chemistry. Shai Gilgeous-Alexander, recently named the 2024-25 Kia NBA MVP, has anchored their impressive run, while Jalen Williams’ breakout performances have provided crucial support throughout the playoffs.

In the Eastern Conference, the Indiana Pacers hold a 3-1 series lead over the New York Knicks, thanks in large part to Tyrese Haliburton’s historic Game 4. Haliburton posted at least 30 points, 15 assists, and 10 rebounds with zero turnovers—a first in NBA playoff history—powering Indiana to the brink of an NBA Finals appearance. As the Pacers look to close out the series in a pivotal Game 5 at Madison Square Garden, the Knicks seek answers bo
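
`call_tool` gains another `elif` branch each time a tool is added. A registry keyed by function name keeps dispatch in one place and can also turn tool exceptions (for example, a failed fetch) into error strings the model can read. This is a sketch of one possible layout; `register_tool`, `call_registered_tool`, and the `word_count` placeholder are hypothetical names.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., str]] = {}

def register_tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Decorator that adds a function to the registry under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

def call_registered_tool(tool_call: dict) -> str:
    """Dispatch by name; return errors as strings so the model can read them."""
    fn = TOOLS.get(tool_call.get("name", ""))
    if fn is None:
        return f"Error: Tool {tool_call.get('name')} not found"
    try:
        return fn(**tool_call.get("args", {}))
    except Exception as e:  # bad arguments, network failures, etc.
        return f"Error: {e}"

@register_tool
def word_count(text: str) -> str:  # placeholder tool for illustration
    return str(len(text.split()))

print(call_registered_tool({"name": "word_count", "args": {"text": "latest NBA news"}}))
print(call_registered_tool({"name": "fetch", "args": {"url": "https://example.com"}}))
```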

### MCP Crash Course

TLDR: MCP is basically just function calling

Why:
- multiple function methods
- client/server architecture, clients + servers can manage their own state
- standardized interface, portability
- Python and TypeScript servers both supported (typically launched via uvx and npx, respectively)
    - also Java/Kotlin/C#/Swift SDKs, but less popular
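
To see why "MCP is basically just function calling", it helps to look at the wire format. MCP messages are JSON-RPC 2.0, and a tool invocation is a `tools/call` request carrying a tool name and arguments. The toy handler below follows my reading of the spec and covers only that one method; a real server also handles initialization, `tools/list`, and transport details, so use an official SDK in practice. `toy_search` and `handle_request` are hypothetical names.

```python
import json

# Placeholder tool implementation for illustration.
def toy_search(query: str) -> str:
    return f"results for {query!r}"

MCP_TOOLS = {"search": toy_search}

def handle_request(raw: str) -> str:
    """Toy server-side handler for a JSON-RPC 2.0 `tools/call` request."""
    request = json.loads(raw)
    if request.get("method") != "tools/call":
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    params = request["params"]
    text = MCP_TOOLS[params["name"]](**params.get("arguments", {}))
    result = {"content": [{"type": "text", "text": text}], "isError": False}
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "search", "arguments": {"query": "latest NBA news"}},
}
response = json.loads(handle_request(json.dumps(request)))
print(response["result"]["content"][0]["text"])
```

Compare the request's `params` with the `{"name": ..., "args": ...}` JSON the hand-rolled agent emitted earlier: the shape is nearly the same, with a standard envelope around it.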

Getting Started:
- MCP Server [docs](https://modelcontextprotocol.io/quickstart/server)
- MCP Client [docs](https://modelcontextprotocol.io/quickstart/client)
- example servers: [modelcontextprotocol/servers](https://github.com/modelcontextprotocol/servers)
- Other repositories:
    - [Smithery](https://smithery.ai/)
    - [mcp.so](https://mcp.so/)
    - [awesome-mcp-servers](https://github.com/punkpeye/awesome-mcp-servers)


MCP Clients:
- Claude Desktop
- Claude Code
- Cursor, Windsurf, other IDEs
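
Connecting a server to one of these clients usually comes down to a small JSON config telling the client which command launches the server. The sketch below generates a config in the `mcpServers` layout that Claude Desktop reads from `claude_desktop_config.json`, as far as I know. The server entry (the reference fetch server launched with `uvx mcp-server-fetch`) is an assumed example; check each client's docs for the exact file name and location.

```python
import json

# Assumed example entry: the reference fetch server, launched with uvx.
config = {
    "mcpServers": {
        "fetch": {
            "command": "uvx",
            "args": ["mcp-server-fetch"],
        }
    }
}
print(json.dumps(config, indent=2))
```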

### Deep Research via Claude Code

Claude Code is set up to be a code agent by default, but it can also be used as a sandbox for other agentic workflows.

We just need:
- tools
- instructions
- examples?
- ability to refine via feedback
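
The pieces in the list above can be assembled into a single prompt that grows as feedback comes in. A minimal sketch, where `build_research_prompt` and the example instructions and feedback are all hypothetical:

```python
from typing import List, Optional

def build_research_prompt(
    task: str,
    instructions: List[str],
    examples: Optional[List[str]] = None,
    feedback: Optional[List[str]] = None,
) -> str:
    """Assemble a prompt from a task plus optional instructions, examples, and feedback."""
    sections = [task.strip()]
    if instructions:
        sections.append("Instructions:\n" + "\n".join(f"- {i}" for i in instructions))
    if examples:
        sections.append("Examples of good reports:\n" + "\n".join(f"- {e}" for e in examples))
    if feedback:
        sections.append("Feedback on earlier drafts:\n" + "\n".join(f"- {f}" for f in feedback))
    return "\n\n".join(sections)

prompt = build_research_prompt(
    task="Write a report about upcoming tech events in NYC.",
    instructions=["Search the web before writing", "Cite a source URL for every event"],
    feedback=["Include dates for every event"],
)
print(prompt)
```

Each round of review adds a line to `feedback`, which is one simple way to get the "ability to refine" without changing the tools.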

In [53]:
basic_prompt = """
Write a report about upcoming tech events in NYC. Save the report as a markdown file in the reports folder.
"""

## COURSE DISCOUNT

### Use promo code 'LIGHTNING' for 20% off 

https://maven.com/will-brown-kyle-corbitt/agents-mcp-rl?promoCode=LIGHTNING

- Course runs June 16 - July 4
- Lectures on Tuesdays/Thursdays @ 5PM ET
- Office hours throughout, more will be added
- Lectures are recorded, can be watched async any time (including after the course)
- Weekly take-home projects
- Agents, MCP, Evals, Tool Calling, Reinforcement Learning, GRPO, and more


![PROMOCODE](images/lightning_qr.png)
