# Building an AI Agent

In this workshop, we will do the following:
- Define Agentic AI
- Build an AI agent using tools
- Understand why agents are distributed systems
- Identify distributed system challenges for AI agents
- Recognize when agents become workflows

## Hands-on Moments

This is a hands-on workshop!

All of the instructor's slides and code samples are executable in the workshop notebooks.
We encourage you to follow along and play with the samples!

At the end of every chapter (notebook) there will be a hands-on lab.
This is a self-guided experience: the instructor gives a prompt (not an LLM prompt, haha) along with a notebook and some starter code, and the attendees solve the puzzle.

We are going to create a Research Agent that makes a call to the OpenAI API, conducts research on a topic of your choice, and generates a PDF report from that research. Let's go ahead and first set up your notebook.

In [None]:
# We'll first install the necessary packages for this workshop.

%pip install --quiet litellm reportlab python-dotenv

In [None]:
# Mermaid renderer; run this at the beginning to set up diagram rendering
import base64
from IPython.display import Image, display


def render_mermaid(graph_definition):
    """
    Renders a Mermaid diagram in Google Colab using mermaid.ink.

    Args:
        graph_definition (str): The Mermaid diagram code (e.g., "graph LR; A-->B;").
    """
    graph_bytes = graph_definition.encode("ascii")
    base64_bytes = base64.b64encode(graph_bytes)
    base64_string = base64_bytes.decode("ascii")
    display(Image(url="https://mermaid.ink/img/" + base64_string))

## Create a `.env` File

Next you'll create a `.env` file to store your API keys.
In the file browser on the left, create a new file and name it `.env`.

**Note**: It may disappear as soon as you create it. This is because Google Colab hides hidden files (files that start with a `.`) by default.
To make this file appear, click the crossed-out eye icon and hidden files will appear.

Then double-click the `.env` file and add the following lines, replacing `YOUR_API_KEY` with your API key.

```
LLM_API_KEY = YOUR_API_KEY
LLM_MODEL = "openai/gpt-4o"
```

By default, this notebook uses OpenAI's GPT-4o.
To use a different LLM provider, look up the appropriate model name [in the LiteLLM provider documentation](https://docs.litellm.ai/docs/providers), set `LLM_MODEL` to that name, and set `LLM_API_KEY` to your key for that provider.

In [None]:
# Create .env file
with open(".env", "w") as fh:
    fh.write("LLM_API_KEY = YOUR_API_KEY\nLLM_MODEL = openai/gpt-4o")

# Now open the file and replace YOUR_API_KEY with your API key.

In [None]:
# Load environment variables and configure LLM settings

import os
from dotenv import load_dotenv

load_dotenv(override=True)

# Get LLM_API_KEY environment variable and print it to make sure that your .env file is properly loaded.
LLM_MODEL = os.getenv("LLM_MODEL", "openai/gpt-4o")
LLM_API_KEY = os.getenv("LLM_API_KEY", None)
print("LLM API Key", LLM_API_KEY)

## What is an AI Agent?

An AI agent is an autonomous system that pursues goals through continuous decision-making and action.

Think of an AI agent as an autonomous system that doesn't just respond once, but continuously works toward achieving a specific goal. Unlike traditional software that follows predetermined steps, an AI agent makes decisions dynamically based on the current situation and available information.

An example is a customer support AI agent. The agent doesn't just execute a single function - it pursues the goal of "resolve the dispute and retain the customer" through ongoing decision-making until the situation is fully handled.

## Prompting the LLM

Our agent will use LLM calls to process information and decide what actions to take.

We use `litellm` here, which is a unified interface for more than 100 LLM providers. This means that the same code works with different models - you only need to change the model string and provide the matching API key.

In [None]:
from litellm import completion, ModelResponse


def llm_call(prompt: str, llm_api_key: str, llm_model: str) -> ModelResponse:
    response = completion(model=llm_model, api_key=llm_api_key, messages=[{"content": prompt, "role": "user"}])
    return response


# Change this to a fun prompt of your choice!
prompt = "Give me 5 fun Tardigrade facts in the form of a sea shanty."

result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)["choices"][0]["message"]["content"]

print(result)

## Prompting the User

Now that we have our LLM call, we can connect it to user input:

1. We prompt the user for their research topic.
2. Their input becomes the prompt sent directly to the LLM.
3. The LLM processes the request and returns a research response.
4. We display the results back to the user.

In [None]:
# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)

# Extract the response content
response_content = result["choices"][0]["message"]["content"]

print("Research complete!")
print("-" * 80)
print(response_content)

## Generating a PDF

Agents become powerful when they can perform actions beyond generating text, and tools are what make those actions possible.

Once you have your research data, you'll write it out to a PDF. We'll use this later as a tool for our Agent to call.

Tools are functions the agent can call to interact with the outside world:
- File operations (like this PDF generator)
- API calls to external services
- Database queries
- Email sending
- Web scraping

This transforms our agent from a chatbot into something that produces deliverable outputs.
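
One way to picture "tools are functions the agent can call" is a small registry that maps tool names to plain Python functions. This is only a sketch: `TOOLS`, `register_tool`, `call_tool`, and the toy `word_count` tool are hypothetical names invented for illustration, not part of this workshop's code or of any library.

```python
from typing import Callable, Dict

# Hypothetical registry: tool name -> plain Python function.
TOOLS: Dict[str, Callable[..., object]] = {}


def register_tool(func: Callable[..., object]) -> Callable[..., object]:
    """Decorator that makes a function available to the agent by name."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def word_count(text: str) -> int:
    """Toy tool: count whitespace-separated words."""
    return len(text.split())


def call_tool(name: str, **kwargs: object) -> object:
    """Look up a tool by name and call it with keyword arguments."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](**kwargs)


print(call_tool("word_count", text="agents call tools"))
```

A function such as the PDF generator could be registered the same way, so that the agent selects a tool by name instead of hard-coding the call.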

In [None]:
from xml.sax.saxutils import escape

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch


def create_pdf(content: str, filename: str = "research_report.pdf") -> str:
    """Write content to a PDF, one paragraph per blank-line-separated block, and return the filename."""
    doc = SimpleDocTemplate(filename, pagesize=letter)

    styles = getSampleStyleSheet()
    # alignment=1 centers the title.
    title_style = ParagraphStyle("CustomTitle", parent=styles["Heading1"], fontSize=24, spaceAfter=30, alignment=1)

    story = []
    title = Paragraph("Research Report", title_style)
    story.append(title)
    story.append(Spacer(1, 20))

    # Split on blank lines so each block becomes its own PDF paragraph.
    paragraphs = content.split("\n\n")
    for para in paragraphs:
        if para.strip():
            # Paragraph parses XML-like markup, so escape markup characters in the LLM output.
            p = Paragraph(escape(para.strip()), styles["Normal"])
            story.append(p)
            story.append(Spacer(1, 12))

    doc.build(story)
    return filename


create_pdf("Hello PDF!", filename="test.pdf")

## Open the PDF

Download `test.pdf` and open it. You should see the title **Research Report** and the words **Hello PDF!** in the document.

We'll next combine this with what we've seen already - we'll use our agent to respond to your prompt, then call the `create_pdf` tool to create a PDF with the response to your prompt!

See how neat this is? Instead of just printing text to the console, your agent now creates a tangible deliverable. You ask a question, get a response, and walk away with a professional PDF report.

## Bringing it all together

You now have multiple functions that you can execute to achieve a task.
Next, write the code to bring this all together.

In [None]:
# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)

# Extract the response content
response_content: str = result["choices"][0]["message"]["content"]

pdf_filename = create_pdf(response_content, "research_report.pdf")
print(f"SUCCESS! PDF created: {pdf_filename}")

## The Foundations of an Agentic Application

You now have the foundations of an Agentic AI application.

What makes this agentic?
1. Goal orientation: it has a clear objective (generate a research report)
2. Tool usage: it combines multiple capabilities (LLM reasoning + PDF generation)
3. Autonomous decision-making: the LLM decides how to structure and present the research

The functions you created are your "tools": `llm_call` and `create_pdf`. Some may think that an Agentic AI must have a loop, but that's not the case.

## Agentic Challenges

However, there are a few significant challenges to running AI agents in production at scale.

Your simple agent works great in demos, but production environments are messy and unpredictable.

- Tool resources (APIs and databases) go down
- Rate limiting on the LLM might cause calls to fail, even if only intermittently
- LLMs are inherently non-deterministic
- Networks can go down

Imagine: the user asks for a report → the LLM times out → a half-generated PDF → a frustrated user.
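
A common first defense against intermittent failures such as timeouts and rate limits is to retry the call with exponential backoff. The following is a minimal sketch under stated assumptions: `call_with_retries` and `flaky_llm_call` are hypothetical names, and the flaky function simulates a timeout instead of calling a real LLM.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(func: Callable[[], T], max_attempts: int = 3, base_delay: float = 0.5) -> T:
    """Call func, retrying on any exception with exponential backoff plus jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error to the caller.
            # Wait base_delay, then 2x, 4x, ... plus a little random jitter.
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay))
    raise RuntimeError("unreachable")


calls = {"count": 0}


def flaky_llm_call() -> str:
    """Stand-in for an LLM call (assumption): fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated timeout")
    return "research text"


retry_result = call_with_retries(flaky_llm_call, max_attempts=3, base_delay=0.01)
print(retry_result, "after", calls["count"], "attempts")
```

In this notebook, `func` could be `lambda: llm_call(prompt, LLM_API_KEY, LLM_MODEL)`. Retrying on every exception is a simplification; a production agent would likely retry only errors known to be transient.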


## Agents Don't Work in Isolation

* They can call other agents which call other agents, creating complex networks. Each agent has its own logic and potential failure points.
  * Example: Research Agent → Web Search Agent + Data Analysis Agent → Report Generator Agent
* What starts as one simple request can cascade through multiple autonomous systems
* Each of these nodes has its own event loop: plan/next step → execute → observe (see diagram below)
* Network partitions, timeouts, and service failures at any step can break the chain anywhere

In [None]:
diagram = """
graph LR
    P[Plan/Next Step] --> E[Execute]
    E --> O[Observe]
    O --> P
"""
render_mermaid(diagram)
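
The plan → execute → observe loop in the diagram can be sketched in a few lines of Python. This is an illustration only: `run_agent_loop`, `toy_planner`, and `toy_tools` are hypothetical names, and the planner here is a fixed script, whereas a real agent would typically ask an LLM what to do next.

```python
from typing import Callable, Dict, List, Optional


def run_agent_loop(
    goal: str,
    plan_next_step: Callable[[str, List[str]], Optional[str]],
    tools: Dict[str, Callable[[], str]],
    max_steps: int = 10,
) -> List[str]:
    """Repeat plan -> execute -> observe until the planner returns None."""
    observations: List[str] = []
    for _ in range(max_steps):  # Hard cap so a confused planner cannot loop forever.
        step = plan_next_step(goal, observations)  # Plan
        if step is None:
            break
        result = tools[step]()  # Execute
        observations.append(f"{step}: {result}")  # Observe
    return observations


def toy_planner(goal: str, observations: List[str]) -> Optional[str]:
    """Hypothetical planner: search first, then summarize, then stop."""
    order = ["search", "summarize"]
    return order[len(observations)] if len(observations) < len(order) else None


toy_tools = {"search": lambda: "3 sources found", "summarize": lambda: "summary written"}
loop_history = run_agent_loop("report on tardigrades", toy_planner, toy_tools)
print(loop_history)
```

The `max_steps` cap is a design choice worth keeping even in a sketch: every pass through the loop is a point where a tool or network call can fail or stall.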

## Agents Call Other Agents

 Your research agent might:
  * Call a "Web Scraper" agent to gather sources
  * Call a "Fact Checker" agent to verify claims
  * Call a "Citation" agent to format references
  * Call a "PDF Generator" agent (what you built!)

This means your "simple" research request triggers a complex orchestration.

## Example of Complex Orchestration

User asks: "Write a report on renewable energy trends"

1. Research Agent → Web Scraper Agent (scrapes 5 energy websites)
2. Research Agent → Fact Checker Agent (validates statistics against government data)
3. Research Agent → Citation Agent (formats 12 sources in APA style)
4. Research Agent → PDF Generator Agent (creates final 8-page report)

The problem: If any agent fails at step 2 or 3, you lose all the work from previous steps and have to start over. This can get expensive!
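
One way to avoid redoing finished work is to save each step's result as soon as it completes, so that a rerun skips the steps that already succeeded. The sketch below assumes a local JSON file as the checkpoint store; `run_with_checkpoints` and the two fake steps are hypothetical names, and the outage is simulated.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List, Tuple


def run_with_checkpoints(steps: List[Tuple[str, Callable[[], str]]], checkpoint_path: str) -> Dict[str, str]:
    """Run steps in order, saving each result so a rerun skips finished steps."""
    results: Dict[str, str] = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            results = json.load(fh)
    for name, func in steps:
        if name in results:
            continue  # Already completed in an earlier run.
        results[name] = func()
        with open(checkpoint_path, "w") as fh:
            json.dump(results, fh)  # Persist after every successful step.
    return results


run_log: List[str] = []
fact_checker_up = {"value": False}


def scrape() -> str:
    run_log.append("scrape")
    return "5 pages"


def fact_check() -> str:
    run_log.append("fact_check")
    if not fact_checker_up["value"]:
        raise ConnectionError("simulated outage")
    return "stats verified"


pipeline = [("scrape", scrape), ("fact_check", fact_check)]
checkpoint_file = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_with_checkpoints(pipeline, checkpoint_file)
except ConnectionError:
    pass  # First run fails at fact_check; the scrape result is already saved.

fact_checker_up["value"] = True
checkpoint_results = run_with_checkpoints(pipeline, checkpoint_file)
print(checkpoint_results, run_log)
```

On the second run the scrape step is not repeated, because its result was loaded from the checkpoint file. A single local file is enough for a demo but would not survive a lost machine, which is part of why this becomes a distributed systems problem.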

## Your Simple Agent in Production

  What you built:
  - 1 LLM call
  - 1 PDF generation

  In production, this becomes:
  - Multiple LLM calls across different agents
  - External API calls (web scraping, databases)
  - File system operations
  - Network failures at any step
  - Need to coordinate responses from multiple agents

The bottom line: What looks like one AI task is actually a distributed system challenge!
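
Coordinating several agents usually means calling them concurrently and deciding what to do when one is slow. The sketch below uses threads from the standard library and simulated agents; `fake_agent` and `gather_agents` are hypothetical names, and the delays stand in for real network calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Dict


def fake_agent(name: str, delay: float) -> str:
    """Stand-in for a remote agent call that takes `delay` seconds."""
    time.sleep(delay)
    return f"{name} done"


def gather_agents(delays: Dict[str, float], timeout: float) -> Dict[str, str]:
    """Call all agents concurrently; record a result or 'timed out' for each."""
    outcomes: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        futures = {name: pool.submit(fake_agent, name, delay) for name, delay in delays.items()}
        for name, future in futures.items():
            try:
                # The timeout applies to each wait in turn, not to the whole batch.
                outcomes[name] = future.result(timeout=timeout)
            except FutureTimeout:
                outcomes[name] = "timed out"
    # Leaving the with-block still waits for slow threads to finish;
    # a running thread cannot be cancelled.
    return outcomes


agent_outcomes = gather_agents({"web_scraper": 0.01, "fact_checker": 0.6}, timeout=0.2)
print(agent_outcomes)
```

Note that the slow agent keeps running after the caller has given up on it. Handling that cleanly (cancellation, partial results, retries) is the kind of coordination work that grows quickly as more agents are involved.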

## Agents are Distributed Systems

<figure>
<center>
<img src='https://drive.google.com/uc?id=1yynE1_HDDVuFQjaesFcxyds045MzTkfo' />
<figcaption>The Truth About AI Agents</figcaption></center>
</figure>

## This is Why Agents == Workflows

  Your research agent is actually:
  1. Accept user input
    - Possible problems: input validation service, rate limiting
  2. Call the LLM for research
    - Possible problems: Internet connection, API down, rate limiting, timeout
  3. Generate PDF
    - Possible problems: Memory limits
  4. Return success/failure
    - Possible problem: Connection dropped

  Each step can fail.
  Each step might need different agents.
  This is a **workflow** - and workflows need orchestration.
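
To make the workflow framing concrete, the steps above can be written as an explicit list, with a runner that reports exactly which step failed. This is a toy sketch, not an orchestration engine: `WorkflowResult`, `run_workflow`, and `fake_research` are hypothetical names, and the LLM step simulates a timeout.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class WorkflowResult:
    completed: List[str]
    failed_step: Optional[str] = None
    error: Optional[str] = None


def run_workflow(steps: List[Tuple[str, Callable[[object], object]]], initial: object) -> WorkflowResult:
    """Run each step on the previous step's output; stop at the first failure."""
    completed: List[str] = []
    value = initial
    for name, func in steps:
        try:
            value = func(value)
        except Exception as exc:
            return WorkflowResult(completed, failed_step=name, error=str(exc))
        completed.append(name)
    return WorkflowResult(completed)


def fake_research(topic: object) -> object:
    """Stand-in for the LLM call (assumption): always times out."""
    raise TimeoutError("simulated LLM timeout")


workflow_result = run_workflow(
    [
        ("accept_input", lambda text: str(text).strip()),
        ("call_llm", fake_research),
        ("generate_pdf", lambda content: "research_report.pdf"),
    ],
    "  renewable energy trends  ",
)
print(workflow_result)
```

Knowing which step failed and which steps completed is the minimum an orchestrator needs before it can retry or resume.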

---
# Exercise 1 - Adding More Tools

* In this exercise, you'll:
  * Call tools with your agent
  * Extract structured information from LLM responses to coordinate between different tools
* Go to the **Exercise** directory in the Google Drive and open the **Practice** directory
* Open _01-An-AI-Agent-Practice.ipynb_ and follow the instructions, filling in the `TODO` statements
* If you get stuck, raise your hand and someone will come by and help. You can also check the `Solution` directory for the answers.
* **You have 5 mins**