# Foundations of Durable AI with Temporal

## Hands-on Moments

This is a hands-on workshop!

All of the instructor's slides and code samples are executable in the workshop notebooks.
We encourage you to follow along and play with the samples!

At the end of every chapter (notebook) there is a hands-on lab.
This is a self-guided experience: the instructor provides a prompt (not an LLM prompt!), a notebook, and some starter code, and the attendees solve the puzzle.

We are going to create a Research Agent that makes a call to the OpenAI API, conducts research on a topic of your choice, and generates a PDF report from that research. Let's go ahead and first set up your notebook.

In [5]:
# We'll first install the necessary packages for this workshop.

%pip install --quiet litellm reportlab python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [6]:
# Mermaid renderer: run this cell first to set up diagram rendering
import base64
from IPython.display import Image, display

def render_mermaid(graph_definition):
    """
    Renders a Mermaid diagram in Google Colab using mermaid.ink.

    Args:
        graph_definition (str): The Mermaid diagram code (e.g., "graph LR; A-->B;").
    """
    graph_bytes = graph_definition.encode("ascii")
    base64_bytes = base64.b64encode(graph_bytes)
    base64_string = base64_bytes.decode("ascii")
    display(Image(url="https://mermaid.ink/img/" + base64_string))

## Create an `.env` File

Next you'll create a `.env` file to store your API keys.
In the file browser on the left, create a new file and name it `.env`.

**Note**: The file may disappear as soon as you create it. This is because Google Colab hides hidden files (files whose names start with a `.`) by default.
To make the file appear, click the crossed-out eye icon and hidden files will be shown.

Then double-click the `.env` file and add the following lines, replacing `YOUR_API_KEY` with your API key.

```
LLM_API_KEY = YOUR_API_KEY
LLM_MODEL = "openai/gpt-4o"
```

By default, this notebook uses OpenAI's GPT-4o.
To use a different LLM provider, look up the appropriate model name in the [LiteLLM provider documentation](https://docs.litellm.ai/docs/providers), change the `LLM_MODEL` field to that name, and provide that provider's API key.

In [None]:
# Create .env file
with open(".env", "w") as fh:
  fh.write("LLM_API_KEY = YOUR_API_KEY\nLLM_MODEL = openai/gpt-4o")

# Now open the file and replace YOUR_API_KEY with your API key.

In [None]:
# Load environment variables and configure LLM settings

import os
from dotenv import load_dotenv

load_dotenv(override=True)

# Read the settings and print a masked key to confirm that your .env file loaded.
# Never print or share the full key: anyone who sees it can use your account.
LLM_MODEL = os.getenv("LLM_MODEL", "openai/gpt-4o")
LLM_API_KEY = os.getenv("LLM_API_KEY", None)
masked_key = f"{LLM_API_KEY[:7]}..." if LLM_API_KEY else "not set"
print("LLM API Key:", masked_key)

LLM API Key: sk-proj...


# Building an AI Agent Introduction

By now, you've experienced generative AI firsthand. You've used ChatGPT and seen what LLMs can do. They excel at tasks like research, but their real power emerges when we connect them with users and other actions to build more advanced applications that go beyond simple chat interfaces.

In this workshop, we'll build toward creating AI agents, but let's start with a simple chain:

Use an LLM to generate research -> then produce a PDF from that research.

## Key Elements of a Gen-AI Application

At their core, Gen-AI applications use an LLM as one component among many. The LLM isn't the application itself. Take ChatGPT as an example—it's an application that wraps an LLM, not an LLM itself. Even this seemingly simple chat interface does much more than just call an LLM:

- Displays responses to the user
- Captures user input
- Maintains conversation history
- Orchestrates each subsequent LLM call

Applications can take many different forms. We will start with a chain workflow, where a series of LLM calls, actions, and user interactions are strung together:

<img src="https://images.ctfassets.net/0uuz8ydxyd9p/70SBemKQHnqfLxoHgPovQX/33f3a0b6cfc96eae2d17d1a463079560/Screenshot_2025-07-08_at_10.26.26%C3%A2__AM.png" />

Applications can also look like a loop, where the path through the business logic isn't predetermined. Instead, the system determines it at runtime, with the LLM driving the flow decisions.

<img src="https://images.ctfassets.net/0uuz8ydxyd9p/O0udQjFtCS4RKr3JLcoNQ/589423afb721f595896a978e3d9ca3c2/Screenshot_2025-07-08_at_10.26.59%C3%A2__AM.png" width="500"/>

We'll first look at applications that look like a chain workflow.

## Gen-AI Based Applications are Distributed Systems

When you combine LLMs with other actions (calling external APIs, querying databases, processing files), you're coordinating multiple services across network boundaries.

But things can go wrong:

- Resources (APIs and databases) go down.
- Rate limiting on the LLM provider causes calls to fail.
- Networks go down.
- An LLM call times out partway through a user's request, and the whole request has to start again from the beginning.

**Gen-AI based applications are distributed systems**. And we are going to show you how to make them resilient.
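
As a rough illustration of how transient failures like the ones listed above are often handled by hand, here is a minimal sketch of retrying a flaky call with exponential backoff. The names `TransientError`, `make_flaky_llm_call`, and `retry_with_backoff` are hypothetical and invented for this example; the failure is simulated, and none of this is part of `litellm` or Temporal.

```python
import time


class TransientError(Exception):
    """Stands in for a timeout, rate limit, or dropped connection."""


def make_flaky_llm_call(failures_before_success: int):
    """Build a fake LLM call that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def flaky_llm_call(prompt: str) -> str:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError(f"simulated timeout on call {state['calls']}")
        return f"research notes for: {prompt}"

    return flaky_llm_call, state


def retry_with_backoff(func, *args, max_attempts=4, base_delay=0.01):
    """Call func, doubling the wait after each transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args)
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            time.sleep(base_delay * 2 ** (attempt - 1))


flaky_llm_call, call_state = make_flaky_llm_call(failures_before_success=2)
answer = retry_with_backoff(flaky_llm_call, "elephants")
print(answer, "| attempts:", call_state["calls"])
```

Hand-written retry logic like this has to be repeated around every call that can fail, which is part of why resilience is treated as its own concern.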

## What Does Durability Mean?

- If an LLM call fails halfway through processing, you **don't lose the work already completed**.
- If a database query times out, you can **retry just that step** without restarting everything.
- If your application crashes, it can **resume from the last successful operation**.
- **Long-running processes** can span hours or days without losing context.

Without durability, every failure means starting over.
With durability, failures become recoverable interruptions instead of catastrophic losses. This is especially critical for Gen-AI applications where LLM calls are expensive, slow, and unpredictable.
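
To make "resume from the last successful operation" concrete, here is a toy sketch that saves each step's result to a JSON file and skips already-completed steps on a re-run. It is an assumption-laden illustration only: `run_with_checkpoints`, the step functions, and the checkpoint file are all hypothetical, and this is not how a real durable execution engine is implemented.

```python
import json
import tempfile
from pathlib import Path


def run_with_checkpoints(steps, checkpoint_path: Path):
    """Run (name, func) steps in order, saving each result to disk.

    On a re-run, steps whose results are already saved are skipped.
    Each func receives the previous step's result (None for the first step).
    """
    done = json.loads(checkpoint_path.read_text()) if checkpoint_path.exists() else {}
    previous = None
    for name, func in steps:
        if name not in done:
            done[name] = func(previous)  # may raise; earlier results stay on disk
            checkpoint_path.write_text(json.dumps(done))
        previous = done[name]
    return previous


calls = {"research": 0, "report": 0}


def research(_):
    calls["research"] += 1
    return "elephant facts"


def report(text):
    calls["report"] += 1
    if calls["report"] == 1:
        raise RuntimeError("simulated crash while writing the report")
    return f"REPORT: {text}"


steps = [("research", research), ("report", report)]
checkpoint = Path(tempfile.mkdtemp()) / "checkpoint.json"

try:
    run_with_checkpoints(steps, checkpoint)  # first run crashes in step 2
except RuntimeError:
    pass

final = run_with_checkpoints(steps, checkpoint)  # second run resumes at step 2
print(final, "| research calls:", calls["research"])
```

The expensive `research` step runs only once even though the overall run was attempted twice.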

_Before we dive into building durable workflows, let's start with the first step: making an LLM call_.

## Prompting the LLM

Our agent will use LLM calls to process information and decide what actions to take.

We use `litellm` here, which provides a unified interface to more than 100 LLM providers. This means that the same code works with different models: you only need to change the model string and provide the matching API key.
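
The request and response shapes involved can be explored without making a network call. The sketch below assumes the OpenAI-style chat format (a list of `role`/`content` messages going in, and a `choices` list coming back); `build_messages`, `extract_content`, and `fake_response` are hypothetical names used only for illustration.

```python
from typing import Optional


def build_messages(prompt: str, system: Optional[str] = None) -> list:
    """Build an OpenAI-style chat message list."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    return messages


def extract_content(response: dict) -> str:
    """Pull the text of the first choice out of an OpenAI-style response."""
    return response["choices"][0]["message"]["content"]


# Hand-written stand-in for a real response; no network call is made.
fake_response = {
    "choices": [{"message": {"role": "assistant", "content": "Elephants are large."}}]
}

messages = build_messages("Tell me about elephants", system="Be brief.")
print(messages)
print(extract_content(fake_response))
```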

In [None]:
from litellm import completion, ModelResponse

def llm_call(prompt: str, llm_api_key: str, llm_model: str) -> ModelResponse:
    response = completion(
      model=llm_model,
      api_key=llm_api_key,
      messages=[{ "content": prompt,"role": "user"}]
    )
    return response

prompt = "Tell me about elephants."  # TODO: Replace this example with your own prompt
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)["choices"][0]["message"]["content"]

print(result)

Certainly! Elephants are fascinating creatures with many interesting characteristics:

1. **Species**: There are three extant species of elephants: the African bush elephant (Loxodonta africana), the African forest elephant (Loxodonta cyclotis), and the Asian elephant (Elephas maximus).

2. **Size**: Elephants are the largest land animals on Earth. African elephants are typically larger than their Asian counterparts. African bush elephants can weigh up to 12,000 pounds (around 5,443 kg), whereas Asian elephants can weigh up to  11,000 pounds (around 4,990 kg).

3. **Lifespan**: Elephants can live up to 60 to 70 years in the wild. Their longevity in captivity can be influenced by various factors such as care, diet, and living conditions.

4. **Social Structure**: Elephants are highly social animals. They live in matriarchal herds, usually led by the oldest female. Male elephants often live in bachelor groups or alone after leaving their maternal herd.

5. **Communication**: Elephants ha

## Calling the LLM with Prompts

Now that we have our LLM call, we can write the code to get input from the user and send it to the LLM.

1. We ask the user for their research topic.
2. Their input becomes the prompt sent directly to the LLM.
3. The LLM processes the request and returns a research response.
4. We display the results back to the user:

In [None]:
# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)

# Extract the response content
response_content = result["choices"][0]["message"]["content"]

print("Research complete!")
print("-"*80)
print(response_content)


Welcome to the Research Report Generator!
Enter your research topic or question: Give me 2 facts about elephants
Research complete!
--------------------------------------------------------------------------------
Certainly! Here are two interesting facts about elephants:

1. **Intelligent and Social Creatures**: Elephants are known for their high intelligence, demonstrated through complex social structures, problem-solving skills, and strong memories. They live in herds, typically led by a matriarch, and display behaviors such as empathy, mourning for their dead, and cooperative group dynamics.

2. **Eco-Engineers**: Elephants play a crucial role in their ecosystems. By feeding on a wide variety of plants and knocking down trees, they help maintain the habitat for other species and assist in forest and savannah regeneration. Their dung is also vital as it helps in spreading seeds, fostering plant growth, and providing a nutrient-rich habitat for other organisms.


## From Prompts to Actions

So far, our LLM can only generate text responses - it thinks and responds, but it can't *do* anything in the real world.

What if we want our LLM to:
- Search the web for the latest information?
- Save the research report to a file?
- Send the results via email?
- Query a database or call an API?

This is where **actions** come in.

## What is an Action?

An action is an external function that performs specific tasks beyond just generating text.

Examples:
- Information retrieval (web search, database query, file reading)
- Communication tools (sending emails, posting to Slack, sending text messages or notifications)
- Data analysis tools (running calculations, generating charts and graphs)
- Creative tools (image generation, document creation)
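
In code, an action can be as simple as a plain function that the application calls with the LLM's output. The following sketch is one possible way to organize a few actions behind a name-based lookup; `save_text`, `word_count`, `ACTIONS`, and `run_action` are hypothetical names for this example and not part of any library.

```python
import tempfile
from pathlib import Path


def save_text(content: str, filename: str) -> str:
    """Action: write content to a file and return the path."""
    Path(filename).write_text(content, encoding="utf-8")
    return filename


def word_count(content: str) -> int:
    """Action: a tiny data-analysis step."""
    return len(content.split())


# Hypothetical registry mapping action names to functions.
ACTIONS = {"save_text": save_text, "word_count": word_count}


def run_action(name: str, **kwargs):
    """Look up an action by name and call it with keyword arguments."""
    if name not in ACTIONS:
        raise ValueError(f"Unknown action: {name}")
    return ACTIONS[name](**kwargs)


llm_text = "Elephants live in matriarchal herds."
out_path = str(Path(tempfile.mkdtemp()) / "notes.txt")
saved = run_action("save_text", content=llm_text, filename=out_path)
count = run_action("word_count", content=llm_text)
print(saved, count)
```

Looking actions up by name becomes useful later, when the LLM itself chooses which action to run.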


## Building Your First Action

Let's enhance our research application by adding an action that saves the results to a PDF. This will demonstrate how to move from just generating text to performing concrete actions with that text.

In [None]:
diagram = """
graph LR
    A[User calls LLM] --> B[Generating a PDF from LLM response]
"""
render_mermaid(diagram)

## Generating a PDF

Once you have your research data, you'll call an action that takes the results and writes them out to a PDF.

Remember, actions can interact with the outside world:
- File operations (like this PDF generator)
- API calls to external services
- Database queries
- Email sending
- Web scraping


In [None]:
## Let's create our PDF generation action.
## This function takes text content and formats it into a professional-looking PDF document:

from xml.sax.saxutils import escape

from reportlab.lib.pagesizes import letter
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.units import inch

def create_pdf(content: str, filename: str = "research_report.pdf") -> str:
    """Write content to a PDF titled "Research Report" and return the filename."""
    doc = SimpleDocTemplate(filename, pagesize=letter)

    styles = getSampleStyleSheet()
    title_style = ParagraphStyle(
        'CustomTitle',
        parent=styles['Heading1'],
        fontSize=24,
        spaceAfter=30,
        alignment=1  # 1 = centered
    )

    story = []
    title = Paragraph("Research Report", title_style)
    story.append(title)
    story.append(Spacer(1, 20))

    # Treat each blank-line-separated chunk of text as its own paragraph.
    paragraphs = content.split('\n\n')
    for para in paragraphs:
        if para.strip():
            # Paragraph parses XML-like markup, so escape &, <, and > in raw LLM text.
            p = Paragraph(escape(para.strip()), styles['Normal'])
            story.append(p)
            story.append(Spacer(1, 12))

    doc.build(story)
    return filename

create_pdf("Hello PDF!", filename="test.pdf")

'test.pdf'

## Open the PDF

Download the `test.pdf` PDF and open it. You should see a title **Research Report** and the words **Hello PDF!** in the document.

We'll next combine this with what we've seen already:
1. We'll use the LLM to respond to a prompt
2. Then, we'll call the `create_pdf` action to create a PDF with the response to your prompt!

See how neat this is? Instead of just printing text to the console, your application now creates a tangible deliverable. You ask a question, get a response, and walk away with a professional PDF report.

## Bringing it all Together

You now have multiple functions that you can execute to achieve a task.
Next, write the code to bring this all together:

1. Use the LLM to respond to a prompt
2. Call the `create_pdf` action to create a PDF with the response to your prompt.

In [None]:
# Step 1: Call the `llm_call` function with `prompt`, `LLM_API_KEY`, and `LLM_MODEL` as arguments.
# Step 2: Call the `create_pdf` function with `response_content` as the first argument.
# Step 3: Run the code block to execute the program.

# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call() # TODO: Call the `llm_call` function with `prompt`, `LLM_API_KEY`, and `LLM_MODEL` as arguments.

# Extract the response content
response_content: str = result["choices"][0]["message"]["content"]

pdf_filename = create_pdf('', "research_report.pdf") # TODO: Call the `create_pdf` function with `response_content` as the first argument.
print(f"SUCCESS! PDF created: {pdf_filename}")

Welcome to the Research Report Generator!
Enter your research topic or question: Give me 2 facts about elephants
SUCCESS! PDF created: research_report.pdf


In [7]:
# Run this cell to load and execute the solution
from pathlib import Path
from IPython.display import display, Markdown
import os

notebook_dir = Path(os.getcwd())
solution_file = notebook_dir / "Solutions_01_Foundations_of_Durable_AI_with_Temporal" / "pdf_generation_solution.py"

if not solution_file.exists():
    raise FileNotFoundError(f"Solution file not found at {solution_file}")

code = solution_file.read_text()

print("Solution loaded:")
display(Markdown(f"```python\n{code}\n```"))

exec(code)
print("Solution executed successfully")

Solution loaded:


```python
# Expand for the solution

# Make the API call
print("Welcome to the Research Report Generator!")
prompt = input("Enter your research topic or question: ")
result = llm_call(prompt, LLM_API_KEY, LLM_MODEL)

# Extract the response content
response_content: str = result["choices"][0]["message"]["content"]

pdf_filename = create_pdf(response_content, "research_report.pdf")
print(f"SUCCESS! PDF created: {pdf_filename}")
```

Welcome to the Research Report Generator!



## The Foundations of a Chain Workflow

You now have the foundations of a chain workflow application. We chained an LLM call (`llm_call`) and an action (`create_pdf`) together in a defined order (user input → LLM call → PDF generation).
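
The chaining idea generalizes: each step takes the previous step's output as its input. Here is a minimal sketch of that pattern, using stand-in functions instead of the real `llm_call` and `create_pdf` so that it runs without an API key. `run_chain`, `fake_llm_call`, and `fake_create_pdf` are hypothetical names for this example.

```python
from typing import Callable, Iterable


def run_chain(steps: Iterable[Callable], initial):
    """Feed the output of each step into the next, in a fixed order."""
    value = initial
    for step in steps:
        value = step(value)
    return value


def fake_llm_call(prompt: str) -> str:
    """Stand-in for the LLM call: returns canned text."""
    return f"Findings about {prompt}."


def fake_create_pdf(content: str) -> str:
    """Stand-in for PDF generation: reports what it would have written."""
    return f"report.pdf ({len(content)} chars)"


chain_result = run_chain([fake_llm_call, fake_create_pdf], "elephants")
print(chain_result)
```

Note that if any step raises an exception, everything computed by the earlier steps is lost.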

## Why Chain Workflows Need Durability

Our simple chain workflow works great... until one of these happens:

- The LLM call times out halfway through generating the research.
- The PDF generation fails due to a file permission error.
- Your network connection drops during processing.

Right now, you'd have to start completely over. All the work is lost.

As workflows grow more complex, this problem compounds:
- **Longer chains**: User input → Web search → LLM analysis → Database query → PDF generation → Email delivery
- **More failure points**: Each step can fail independently
- **Expensive operations**: Re-running a 30-second LLM call because the last step failed wastes time and money
- **External dependencies**: When your workflow calls external services or APIs, those can be unreliable

## Your Chain Workflow in Production

What you built:
- User input → LLM call → PDF generation
- A simple 3-step chain

In production, this becomes:
- **Longer chains**: User input → Web search → LLM analysis → Database query → LLM refinement → PDF generation → Email delivery
- **External dependencies**: APIs, databases, file systems
- **Network failures**: Any step can fail
- **Expensive retries**: Re-running a 30-second LLM call because step 5 failed

When workflows get even more sophisticated, they evolve into **agentic systems** - where the LLM makes decisions about which actions to take next, potentially calling other specialized agents that themselves have complex workflows.

For example: Research Agent → Executes web search → Observes results → Decides if it needs more data → Executes another search or moves to analysis → Observes → Decides next step → etc.
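
The decide-act-observe cycle in this example can be sketched as a loop in which the next action is chosen at runtime. In the sketch below, `plan` is a hard-coded stand-in for the decision an LLM would make, and `execute` fakes a web search; all names are hypothetical and chosen for illustration.

```python
def plan(observations: list) -> str:
    """Stand-in for an LLM decision: search until two sources are found."""
    return "search" if len(observations) < 2 else "finish"


def execute(action: str, step: int) -> str:
    """Stand-in for running an action such as a web search."""
    return f"source {step}"


def run_agent(max_steps: int = 5) -> list:
    """Loop until the planner decides to finish or the step budget runs out."""
    observations = []
    for step in range(1, max_steps + 1):
        action = plan(observations)      # Plan: choose the next action
        if action == "finish":
            break
        result = execute(action, step)   # Execute: run the chosen action
        observations.append(result)      # Observe: record what came back
    return observations


agent_observations = run_agent()
print(agent_observations)
```

The `max_steps` budget is a common safeguard so that a loop driven by model decisions cannot run forever.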

Each agent has its own event loop (Plan → Execute → Observe):

<img src="https://i.postimg.cc/1zhMS6YL/Loop.png" width="300"/>

## Distributed System Issues

Increasingly, we are seeing agents calling agents, which are calling other agents.

For example, your research application might:
1. Call a "Web Scraper" agent to gather sources (scrapes 5 research websites)
2. Call a "Fact Checker" agent to verify claims (validates statistics against government data)
3. Call a "PDF Generator" agent (what you built!)

This means your "simple" research request triggers a complex orchestration.

Now overlay what can go wrong: network partitions, timeouts, and service failures at any step can break the chain anywhere!

<img src="https://i.postimg.cc/KzH2vx3y/agents-calling-agents.png" width="500"/>

The bottom line: **What looks like one AI task is actually a distributed system challenge!**

## Agents are Distributed Systems!

**This is why durability matters.** Without it, complex workflows become fragile and expensive. With it, failures become manageable interruptions instead of catastrophic losses.

## AI Needs Durability

  Your research application needs to:
  1. Accept user input
    - Possible problems: input validation service, rate limiting
  2. Call the LLM for research
    - Possible problems: Internet connection, API down, rate limiting, timeout
  3. Generate PDF
    - Possible problems: Memory limits
  4. Return success/failure
    - Possible problem: Connection dropped

  Each step can fail.
  Each step might need different agents.
  This is a **workflow** - and workflows need orchestration.
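
As a small illustration of what orchestration adds, the sketch below runs named steps in order and records which one failed, instead of letting an exception discard all context. It is a simplified, hypothetical example (`StepResult`, `orchestrate`, and the step functions are invented here, and the LLM timeout is simulated), not a description of how Temporal works.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    name: str
    ok: bool
    error: Optional[str] = None


def orchestrate(steps: List[Tuple[str, Callable[[], None]]]) -> List[StepResult]:
    """Run steps in order, stopping at and recording the first failure."""
    results = []
    for name, func in steps:
        try:
            func()
            results.append(StepResult(name, ok=True))
        except Exception as exc:
            results.append(StepResult(name, ok=False, error=str(exc)))
            break
    return results


def accept_input() -> None:
    pass  # pretend the input was valid


def call_llm() -> None:
    raise TimeoutError("simulated LLM timeout")


def generate_pdf() -> None:
    pass  # never reached in this run


run_report = orchestrate(
    [("accept_input", accept_input), ("call_llm", call_llm), ("generate_pdf", generate_pdf)]
)
for step_result in run_report:
    print(step_result)
```

Knowing exactly which step failed, and why, is the starting point for retrying only that step.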

---
# Exercise 1 - Adding More Tools

* In this exercise, you'll:
  * Call an action with your agent
  * Extract structured information from LLM responses to coordinate between different tools.
* Go to the **Exercise** directory in the Google Drive and open the **Practice** directory.
* Open _01_Durable_AI_Foundations_Practice.ipynb.ipynb_ and follow the instructions, filling in the `TODO` statements.
* If you get stuck, raise your hand and someone will come by and help. You can also check the `Solution` directory for the answers.
* **You have 5 minutes.**