# Building an AI Agent

In this workshop, we will do the following:
- Define Agentic AI
- Build an AI agent using tools
- Understand why agents are distributed systems
- Identify distributed system challenges for AI agents
- Recognize when agents become workflows

## Hands-on Moments

This is a hands-on workshop!

All of the instructor's slides and code samples are executable in the workshop notebooks.
We encourage you to follow along and play with the samples!

At the end of every chapter (notebook) there will be a hands-on lab.
This is a self-guided experience where the instructor (not an LLM, haha) gives a prompt with a notebook and some starter code, and the attendees solve the puzzle.

We are going to create a Research Agent that makes a call to the OpenAI API, conducts research on a topic of your choice, and generates a PDF report from that research. Let's go ahead and first set up your notebook.

In [None]:
# We'll first install the necessary packages for this workshop.

%pip install --quiet litellm reportlab python-dotenv


In [None]:
# Mermaid renderer, run at the beginning to setup rendering of diagrams
import base64
from IPython.display import Image, display

def render_mermaid(graph_definition):
    """
    Renders a Mermaid diagram in Google Colab using mermaid.ink.

    Args:
        graph_definition (str): The Mermaid diagram code (e.g., "graph LR; A-->B;").
    """
    # Encode as UTF-8 so diagrams with non-ASCII labels don't raise UnicodeEncodeError
    graph_bytes = graph_definition.encode("utf-8")
    base64_bytes = base64.b64encode(graph_bytes)
    base64_string = base64_bytes.decode("ascii")
    display(Image(url="https://mermaid.ink/img/" + base64_string))

## Create an `.env` File

Next you'll create a `.env` file to store your API keys.
In the file browser on the left, create a new file and name it `.env`.

**Note**: It may disappear as soon as you create it. This is because Google Colab hides hidden files (files that start with a `.`) by default.
To make this file appear, click the crossed-out eye icon and hidden files will appear.

Then double-click the `.env` file and add the following lines with your API key.

```
LLM_API_KEY = YOUR_API_KEY
LLM_MODEL = "openai/gpt-4o"
```

By default this notebook uses OpenAI's GPT-4o.
If you want to use a different LLM provider, look up the appropriate model name [in their documentation](https://docs.litellm.ai/docs/providers) and change the `LLM_MODEL` field and provide your API key.

In [None]:
# Create .env file
with open(".env", "w") as fh:
  fh.write("LLM_API_KEY = YOUR_API_KEY\nLLM_MODEL = openai/gpt-4o")

# Now open the file and replace YOUR_API_KEY with your API key.

In [None]:
# Load environment variables and configure LLM settings

import os
from dotenv import load_dotenv

load_dotenv(override=True)

# Read the settings and confirm that your .env file loaded properly.
# Avoid printing the key itself: notebook outputs are saved and often shared.
LLM_MODEL = os.getenv("LLM_MODEL", "openai/gpt-4o")
LLM_API_KEY = os.getenv("LLM_API_KEY", None)
print("LLM API Key loaded:", bool(LLM_API_KEY))

LLM API Key loaded: True


## Setting the Scene

A human agent can handle 2.6 tickets per hour, yet the average business receives 578 tickets per day.

Now imagine you have 10,000 customer support tickets, three time zones, and customers who want answers now. How do you solve this without hiring 100 people? That's where AI agents come in.

## What is an AI Agent?

An autonomous system that pursues goals through continuous decision-making and action

Think of an AI agent as an autonomous system that doesn't just respond once, but continuously works toward achieving a specific goal. Unlike traditional software that follows predetermined steps, an AI agent makes decisions dynamically based on the current situation and available information.

An example is a customer support AI agent. The agent doesn't just execute a single function. It pursues the goal of "resolve the dispute and retain the customer" through ongoing decision-making until the situation is fully handled.

## Prompting the LLM

Our agent will use LLM calls to process information and decide what actions to take.

We use `litellm` here, which is a unified interface for 100+ LLM providers. This means that the same code works with different models: you only need to change the model string and provide the matching API key.

In [None]:
from litellm import completion, ModelResponse

def llm_call(prompt: str, llm_api_key: str, llm_model: str) -> ModelResponse:
    response = completion(
        model=llm_model,
        api_key=llm_api_key,
        messages=[{"role": "user", "content": prompt}],
    )
    return response

# Change this to a fun prompt of your choice!
prompt = "Give me 5 fun Tardigrade facts in the form of a sea shanty."

result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)["choices"][0]["message"]["content"]

print(result)

(Verse 1)  
Oh, the mighty tardigrade sails through streams,  
A tiny sailor in our wildest dreams.  
Eight-legged and tougher than the brine,  
In ocean's depths or lands divine.  

(Chorus)  
Hey, ho, the tardigrades go,  
Resilient and hearty, through waters they flow.  
Sing of the critter who'll never fade,  
For none can outlast the brave tardigrade!  

(Verse 2)  
In freezing ice or scalding heat they thrive,  
Mighty survivors, they'll stay alive.  
Without a drop of water for years on end,  
The tardigrade laughs, “I’ve got time to spend!”  

(Chorus)  
Hey, ho, the tardigrades go,  
Boundless and strong, through worlds unknown.  
Sing of the creature that time cannot raid,  
The ever-resilient, bold tardigrade!  

(Verse 3)  
To the far reaches of outer space,  
A tardigrade’s heart keeps up the pace.  
Radiation burns and cosmic storms,  
Yet in the cosmos, they bravely perform.  

(Chorus)  
Hey, ho, the tardigrades go,  
To places where no others dare roam.  
Sing of the m

## Prompting the User

Now that we have our LLM call, we can write the code to prompt the user for their research topic.

1. We prompt the user for their research topic.
2. Their input becomes the prompt sent directly to the LLM.
3. The LLM processes the request and returns a research response.
4. We display the results back to the user.

In [None]:
# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)

# Extract the response content
response_content = result["choices"][0]["message"]["content"]

print("Research complete!")
print("-"*80)
print(response_content)


Welcome to the Research Report Generator!
Enter your research topic or question: Pikachus
Research complete!
--------------------------------------------------------------------------------
Pikachu is one of the most iconic characters from the Pokémon franchise, created by Nintendo, Game Freak, and Creatures. It is an Electric-type Pokémon and is known for its bright yellow fur and elongated ears tipped with black. Pikachu's signature ability is generating electricity, which it releases through its red cheek pouches. This character is often associated with the Pokémon mascot, thanks to its central role in the Pokémon TV series, games, and merchandise.

Pikachu evolves from Pichu when leveled up with high friendship and can further evolve into Raichu when exposed to a Thunder Stone. Pikachu is well-loved worldwide and has become a symbol of the broader Pokémon phenomenon. If you have specific questions about Pikachu or want more details about its role in different Pokémon media or games

## What is a Tool?

A tool is an external function that an AI system can call to perform a specific task beyond just generating text. Tools take real actions.

Examples:
- Information retrieval tools (web search, database queries, file reading)
- Communication tools (sending emails, posting to Slack, sending text messages or notifications)
- Data analysis tools (running calculations, generating charts and graphs)
- Creative tools (image generation, document creation)
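
To make the idea concrete, here is a minimal sketch of a tool: a plain Python function paired with a machine-readable description that tells the LLM the tool exists and how to call it. The names `get_word_count`, `TOOLS`, and `call_tool` are hypothetical and exist only for illustration. The description follows the JSON-schema layout used by OpenAI-style function calling; other providers may expect a different shape, so treat the layout as an assumption.

```python
import json

def get_word_count(text: str) -> int:
    """Hypothetical tool: count the words in a piece of text."""
    return len(text.split())

# Description the LLM would see. The layout follows OpenAI-style function
# calling, which is an assumption for any other provider.
get_word_count_spec = {
    "type": "function",
    "function": {
        "name": "get_word_count",
        "description": "Count the words in a piece of text.",
        "parameters": {
            "type": "object",
            "properties": {
                "text": {"type": "string", "description": "Text to count"},
            },
            "required": ["text"],
        },
    },
}

# Hypothetical registry mapping tool names to the functions that implement them.
TOOLS = {"get_word_count": get_word_count}

def call_tool(name: str, arguments_json: str):
    """Look up a tool by name and call it with JSON-encoded arguments."""
    arguments = json.loads(arguments_json)
    return TOOLS[name](**arguments)

print(call_tool("get_word_count", '{"text": "tardigrades are tough"}'))  # prints 3
```

The LLM never runs the function itself. It only names a tool and supplies arguments, and your code performs the lookup and the call.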


In [None]:
diagram = """
graph LR
    Goal[the goal] --> LLM[LLM<br/>makes decisions on<br/>what to do next]
    LLM --> Tool[Tool<br/>does what the LLM<br/>decided]
    Tool --> LLM
"""
render_mermaid(diagram)

Putting this together, an agent is generally implemented as an event loop that is kicked off with a goal.
In the loop, the agent:
- Asks the LLM to determine the next steps in the flow.
- Invokes one or more tools to perform those actions.
- Keeps looping until the LLM reports that the goal is met or the user stops it.
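
The loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the workshop's implementation: `fake_llm` is a scripted stand-in for a real LLM call, and the `research` and `summarize` tools are hypothetical placeholders.

```python
from typing import Callable

def fake_llm(goal: str, history: list[str]) -> dict:
    """Stand-in for a real LLM call: scripted decisions based on past steps."""
    if not any(h.startswith("research:") for h in history):
        return {"action": "research", "input": goal}
    if not any(h.startswith("summarize:") for h in history):
        return {"action": "summarize", "input": history[-1]}
    return {"action": "done", "input": ""}

# Hypothetical tools; real ones would call search APIs, write files, and so on.
TOOLS: dict[str, Callable[[str], str]] = {
    "research": lambda topic: f"notes about {topic}",
    "summarize": lambda text: f"summary of ({text})",
}

def run_agent(goal: str, max_turns: int = 10) -> list[str]:
    history: list[str] = []
    for _ in range(max_turns):              # cap the loop so it cannot run forever
        decision = fake_llm(goal, history)  # ask the LLM what to do next
        if decision["action"] == "done":    # stop once the LLM reports the goal is met
            break
        result = TOOLS[decision["action"]](decision["input"])  # invoke the chosen tool
        history.append(f"{decision['action']}: {result}")      # record it for the next turn
    return history

for step in run_agent("tardigrades"):
    print(step)
```

The `max_turns` cap is a design choice worth keeping in real agents too, since an LLM that never reports completion would otherwise loop (and bill you) indefinitely.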

In [None]:
diagram = """
graph LR
    Bootstrap[bootstrap the agentic loop] --> LLM[LLM<br/>makes decisions on<br/>what to do next]
    LLM --> |prepare for tool execution| Tool[Tool<br/>does what the LLM<br/>decided]
    Tool --> |update the input to the LLM<br/>for the next turn| LLM

    UX[UX i.e. tool<br/>confirmation] -.-> Tool
    Library[Tool Library] -.-> Tool

    style UX fill:#f9f9f9
    style Library fill:#f9f9f9
"""
render_mermaid(diagram)

- An agent has a fixed set of tools available to it.
- Each time the LLM responds with some instruction, it is the job of the agent to take that response and make preparations for tool execution. The agent may ask the user for confirmation before executing the tool. And after a tool is executed, it is the job of the agent to update the content that is fed to the LLM for the next turn.
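
Those responsibilities (prepare, confirm, execute, update) can be sketched as a single function. Everything here is a simplified assumption: `TOOL_LIBRARY`, `handle_tool_request`, and the `add` tool are hypothetical, and the message dictionaries only loosely resemble chat-style message lists (real provider APIs carry extra fields such as tool call identifiers).

```python
import json
from typing import Callable

def add(a: float, b: float) -> float:
    """Hypothetical tool."""
    return a + b

TOOL_LIBRARY: dict[str, Callable] = {"add": add}  # the agent's fixed set of tools

def handle_tool_request(messages: list[dict], request: dict,
                        confirm: Callable[[str], bool]) -> list[dict]:
    """Prepare, optionally confirm, execute, and record one tool call."""
    name, args = request["name"], request["arguments"]
    if name not in TOOL_LIBRARY:
        outcome = f"error: unknown tool {name!r}"
    elif not confirm(f"Run {name} with {args}?"):
        outcome = "error: user declined the tool call"
    else:
        outcome = json.dumps(TOOL_LIBRARY[name](**args))
    # Feed the outcome back so the LLM sees it on the next turn.
    return messages + [{"role": "tool", "name": name, "content": outcome}]

messages = [{"role": "user", "content": "What is 2 + 3?"}]
messages = handle_tool_request(
    messages,
    {"name": "add", "arguments": {"a": 2, "b": 3}},
    confirm=lambda question: True,  # auto-approve for the demo
)
print(messages[-1])
```

Note that failures and refusals are reported back to the LLM as content rather than raised, so the LLM can decide how to proceed on the next turn. In an interactive notebook, `confirm` could wrap `input()` instead of auto-approving.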

## Generating a PDF

Agents become powerful when they can perform actions beyond just generating text, and that is what a tool does.

Once you have your research data, you’ll write it out to a PDF. We’ll use this later as a tool for our Agent to call.

Remember, tools are functions the agent can call to interact with the outside world:
- File operations (like this PDF generator)
- API calls to external services
- Database queries
- Email sending
- Web scraping


In [None]:
from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch

def create_pdf(content: str, filename: str = "research_report.pdf") -> str:
    doc = SimpleDocTemplate(filename, pagesize=letter)

    styles = getSampleStyleSheet()
    title_style = ParagraphStyle(
        'CustomTitle',
        parent=styles['Heading1'],
        fontSize=24,
        spaceAfter=30,
        alignment=1
    )

    story = []
    title = Paragraph("Research Report", title_style)
    story.append(title)
    story.append(Spacer(1, 20))

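    # Escape &, < and > first: Paragraph parses its text as XML-like markup,
    # so unescaped characters in LLM output could break PDF generation.
    from xml.sax.saxutils import escape

    paragraphs = content.split('\n\n')
    for para in paragraphs:
        if para.strip():
            p = Paragraph(escape(para.strip()), styles['Normal'])
            story.append(p)
            story.append(Spacer(1, 12))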

    doc.build(story)
    return filename

create_pdf("Hello PDF!", filename="test.pdf")

'test.pdf'

## Open the PDF

Download the `test.pdf` PDF and open it. You should see a title **Research Report** and the words **Hello PDF!** in the document.

We'll next combine this with what we've seen already - we'll use our agent to respond to your prompt, then call the `create_pdf` tool to create a PDF with the response to your prompt!

See how neat this is? Instead of just printing text to the console, your agent now creates a tangible deliverable. You ask a question, get a response, and walk away with a professional PDF report.

## Bringing it all Together

You now have multiple functions that you can execute to achieve a task.
Next, write the code to bring this all together.

In [None]:
# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)

# Extract the response content
response_content: str = result["choices"][0]["message"]["content"]

pdf_filename = create_pdf(response_content, "research_report.pdf")
print(f"SUCCESS! PDF created: {pdf_filename}")


Welcome to the Research Report Generator!
Enter your research topic or question: Give me 2 facts about pikachus
SUCCESS! PDF created: research_report.pdf


## The Foundations of an Agentic Application

You now have the foundations of an Agentic AI application.

What makes this agentic?
1. It's goal-oriented: it has a clear objective (generate a research report)
2. Tool usage: it combines multiple capabilities (LLM reasoning + PDF generation)
3. Autonomous decision making: the LLM decides how to structure and present the research

The functions you created are your "tools": `llm_call` and `create_pdf`. Some may think that an Agentic AI must have a loop, but that's not the case.

## Agentic Challenges

However, there are a few significant challenges to running AI agents in production at scale.

Your simple agent works great in demos, but production environments are messy and unpredictable.

- Tool resources (APIs and databases) go down
- Rate limiting on the LLM might cause it to fail, even if only intermittently
- LLMs are inherently non-deterministic
- Networks can go down

Imagine the user asks for a report → the LLM times out → a half-generated PDF → a frustrated user. Let's explore a bit more.
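
One common first defense against intermittent failures such as rate limits and timeouts is retrying with exponential backoff. The sketch below is an illustration, not part of the workshop code: `TransientError`, `call_with_retries`, and `flaky_llm_call` are hypothetical names, and a real client would need to map its own exception types onto "worth retrying" versus "give up".

```python
import random
import time

class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, rate limits)."""

def call_with_retries(func, *args, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(*args), retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args)
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))

# Simulated flaky LLM call: fails twice, then succeeds.
attempts = {"count": 0}

def flaky_llm_call(prompt):
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("rate limited")
    return f"response to: {prompt}"

# sleep is swapped out here so the demo does not actually wait.
print(call_with_retries(flaky_llm_call, "pikachu facts", sleep=lambda seconds: None))
```

Retries only help while the process itself stays alive. If the notebook or server crashes mid-run, the retry state is lost along with everything else.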


## The Event Loop

At a high level, AI agents are pretty simple. We have an event loop, and in that event loop, we make calls out to the LLM to ask it for directions. The LLM is what drives the flow of the application.

At the LLM's direction, we might invoke some downstream tools (e.g., make some microservice requests).

Then we consult with the user, and then go through the loop again.

In [None]:
diagram = """
graph LR
    P[Plan/Next Step] --> E[Execute]
    E --> O[Observe]
    O --> P
"""
render_mermaid(diagram)

## Agents Don't Work in Isolation

* When we invoke tools, we are making downstream service requests, such as calls to a microservice. Increasingly, however, those downstream requests go to additional agents. Each of these nodes has its own event loop: plan/next step -> execute -> observe (see diagram below).
* Agents can call other agents, which call other agents, creating complex networks. Each agent has its own logic and potential failure points.
  * Example: Research Agent → Web Search Agent + Data Analysis Agent → Report Generator Agent

## Agents Call Other Agents

Increasingly, we are seeing agents calling agents, which are calling other agents. Your research agent might be asked: "Write a report on renewable energy trends." To fulfill that request, it might:
  * Call a "Web Scraper" agent to gather sources (scrapes 5 energy websites)
  * Call a "Fact Checker" agent to verify claims (validates statistics about government data)
  * Call a "Citation" agent to format references (formats 12 sources in APA style)
  * Call a "PDF Generator" agent (what you built! Creates a final 8-page report)

This means your "simple" research request triggers a complex orchestration. If any agent fails at step 2 or 3, you lose all the work from previous steps and have to start over. This can get expensive!

Now overlay what can go wrong: network partitions, timeouts, and service failures can break the chain at any step!

## Your Agent in Production

  What you built:
  - 1 LLM call
  - 1 PDF generation

  In production, this becomes:
  - Multiple LLM calls across different agents
  - External API calls (web scraping, databases)
  - File system operations
  - Network failures at any step
  - Need to coordinate responses from multiple agents

The bottom line: What looks like one AI task is actually a distributed system challenge!

## Agents are Distributed Systems

<figure>
<center>
<img src='https://drive.google.com/uc?id=1yynE1_HDDVuFQjaesFcxyds045MzTkfo' />
<figcaption>The Truth About AI Agents</figcaption></center>
</figure>

## This is Why Agents == Workflows

  Your research agent needs to:
  1. Accept user input
    - Possible problems: input validation service, rate limiting
  2. Call the LLM for research
    - Possible problems: Internet connection, API down, rate limiting, timeout
  3. Generate PDF
    - Possible problems: Memory limits
  4. Return success/failure
    - Possible problem: Connection dropped

  Each step can fail.
  Each step might need different agents.
  This is a **workflow** - and workflows need orchestration.

## Putting It Together

To create a reliable agent, we need:

1. Durable Event Loop - A durable implementation of the event loop, capable of recovering from crashes and resuming exactly where it left off after a failure.

2. Durable Invocation System - A system for durable invocation of LLMs and tools, ensuring that requests persist through transient issues such as network disruptions, service outages, or rate limits, and complete successfully once conditions allow.
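
To illustrate the idea of resuming where you left off, here is a deliberately simplified sketch that checkpoints each step's result to a JSON file. It is an assumption-laden toy, not how a production durable-execution system works: `run_durably`, the step functions, and the state file are all hypothetical, and the sketch ignores concerns such as atomic writes and concurrent runs.

```python
import json
import os
import tempfile

def run_durably(steps, state_path):
    """Run named steps in order, saving each result so a rerun skips finished work."""
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            state = json.load(fh)   # resume from the last checkpoint
    for name, func in steps:
        if name in state:
            continue                # already completed in an earlier run
        state[name] = func(state)   # may raise; earlier results stay on disk
        with open(state_path, "w") as fh:
            json.dump(state, fh)    # checkpoint after every step
    return state

calls = []

def research(state):
    calls.append("research")
    return "notes on tardigrades"

def write_report(state):
    calls.append("write_report")
    if calls.count("write_report") == 1:
        raise RuntimeError("simulated crash")  # the first attempt fails
    return f"report based on {state['research']}"

steps = [("research", research), ("write_report", write_report)]
state_path = os.path.join(tempfile.mkdtemp(), "agent_state.json")

try:
    run_durably(steps, state_path)
except RuntimeError:
    pass                                    # the process "crashed" mid-workflow
final = run_durably(steps, state_path)      # the rerun resumes at write_report
print(calls)
print(final["write_report"])
```

The expensive `research` step runs only once even though the workflow was started twice, which is the property a durable event loop is meant to provide.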

---
# Exercise 1 - Adding More Tools

* In this exercise, you'll:
  * Call tools with your agent
  * Extract structured information from LLM responses to coordinate between different tools.
* Go to the **Exercise** directory in the Google Drive and open the **Practice** directory.
* Open _01-An-AI-Agent-Practice.ipynb_ and follow the instructions, filling in the `TODO` statements.
* If you get stuck, raise your hand and someone will come by and help. You can also check the `Solution` directory for the answers.
* **You have 5 mins**