# Part 1: Making MCP Tools Durable with Temporal Workflows

Let's say that you've built an AI application that can chat. But what if you want it to *do things*—check weather, query databases, or interact with your company's systems?

The problem: every tool requires custom integration code. You write it once for your app, and can't reuse it elsewhere. Adding a new tool means code changes and redeployment.

**MCP (Model Context Protocol) solves this.**

And of course, you need reliability. When your MCP tools make external calls or run long operations, Temporal handles failures, retries, and state management automatically.

In this workshop, we'll combine MCP and Temporal to build AI agents with powerful, durable tool integrations.

### Review of Instructions and Tools 

From the last workshop, we saw:
1. Instructions: Agentic systems solve problems on your behalf based on what you tell them to do in natural language. For agents, instructions define _how_ the agent should behave and make decisions.

#### Sample instructions:

```python
instructions="You are a helpful weather assistant. Provide clear, concise weather information."
```

2. Tools: How things actually get done. Tools can be local processes (“read this local file and show me the result”) or remote calls (“query this database”).

#### Sample tool definition:

```python
tools=[
    {
        "type": "function",
        "name": "get_weather_alerts",
        "description": "Get current weather alerts for a US state",
        "parameters": {...}
    }
],
```

You define how to solve your problems in simple, human-readable terms, and the agentic AI works with you using the available tools.

<img src="https://images.ctfassets.net/0uuz8ydxyd9p/oi0jwiOM9uG6tgLGlBrf3/8e49117667416feb4081435a7bb3ef6f/Fig2.png" width="500" />

### Limitations

However, there are some limitations. 

1. **Pre-definition Constraint**: The system can only use the tools that were defined in the application ahead of time.
    1. Suppose a user asks the agent to check the weather.
    2. The agent responds, "Sorry, I don't have weather capabilities built in yet," even though weather APIs exist and are accessible.
2. **Integration Overhead**: Every tool is wired in by hand.
    1. Each integration has its own description and format.
    2. You need to maintain different versions of different integrations.
    3. Adding a new tool means code changes, testing, and redeployment.

What we need instead is a system where:
1. Tools can be discovered dynamically
2. Integration is standardized
3. Tools can be added without code changes

Let's explore.

### What is Model Context Protocol?

- A protocol that allows LLMs to direct an AI application to invoke external functions
- Three main benefits:
  - a.) **Custom integrations** - Connect your applications to external services like Slack, Google Calendar, databases, and other systems
  - b.) **Portable toolset** - Build your toolset **once** using the MCP standard and use it **everywhere**. For example, create custom coding tools (boilerplate generators, prompt templates, documentation automation) that work across any MCP-compatible IDE or application (e.g., VSCode, Windsurf).
  - c.) **Open-source MCP servers** - Leverage existing open-source MCP servers. If your application includes an MCP Client, it can connect to MCP servers developed by third parties.

With MCP, tools advertise their capabilities to an agentic system, so the agent can choose at runtime which tools to use for a given goal.

<img src="https://i.postimg.cc/N0kXLyQg/what-is-mcp.png" width="300"/>

_Image credits to [this blogpost](https://yukitaylor00.medium.com/mcp-explained-how-modular-control-protocol-is-supercharging-ai-integrations-c0ce5ec15967)_

### MCP Has 2 Parts

1. **Primitives**: The things you interact with through MCP (prompts, resources like files, tools like agent-ready APIs)
2. **Client-Server Architecture**: How applications communicate with downstream systems (MCP Clients, MCP Servers, a transport protocol between them).

Let's go through the primitives first.

### How Primitives Work Together

Think of MCP like giving an AI assistant a **complete workspace** instead of just a chat window.

**Prompts** = *"Here's what I want"*
- Your instructions and requests

**Resources** = *"Here's what you need to know"*
- Background data: your codebase, database records, documentation
- Like handing someone all the project files before asking them to help

**Tools** = *"Here's what you can do"*  
- Actions the LLM can take: API calls, function execution, file operations

User prompt + injected resources + available tools = LLM decision-making

For example, a coding agent gets context not just from your prompt, but also from your codebase files (resources accessed through MCP), enabling it to understand your specific project before suggesting changes or using development tools.

<img src="https://images.ctfassets.net/0uuz8ydxyd9p/3NEgDFJRaW8MCfHfaT6Lfj/a416c7f08f62d76eff95d76b1deacce3/Fig5.png" width="500" />

### Tools = Agent-Ready APIs

A tool's description needs to be written in the same kind of language the LLM works in: natural language.

**Why this matters:**
- LLMs are trained on natural language, not API documentation
- The tool description is what the LLM uses to decide **when** and **how** to use the tool

**Comparison:**
- **Developer-oriented**: `GET /api/weather/forecast?location=12345&days=7`
- **Agent-oriented**: "Get 7-day weather forecast for San Francisco, California"
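To make this concrete, here is a minimal sketch of how a function's name, docstring, and type hints can be turned into the description an LLM sees. It uses only the standard library. `describe_tool`, `get_weather_forecast`, and the shape of the output are illustrative assumptions, not the MCP SDK's API.

```python
import inspect
from typing import Any, get_type_hints


def get_weather_forecast(city: str, days: int = 7) -> str:
    """Get a multi-day weather forecast for a named city, e.g. San Francisco, California."""
    return f"{days}-day forecast for {city}"


def describe_tool(func) -> dict[str, Any]:
    """Build an LLM-facing description of a function (hypothetical helper)."""
    hints = get_type_hints(func)
    params = {}
    for name, param in inspect.signature(func).parameters.items():
        params[name] = {
            "type": hints[name].__name__,
            # A parameter without a default value must be supplied by the caller
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),  # the docstring is what the LLM reads
        "parameters": params,
    }


tool = describe_tool(get_weather_forecast)
print(tool["description"])
```

The docstring is doing the real work here: a vague or developer-only docstring gives the LLM little to decide with.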

# MCP Overview

### MCP Architecture

- MCP establishes a client-server communication model where the client and server exchange messages.
- The protocol defines how clients communicate with the server.

### MCP Server

- A system that data owners create to make their systems accessible to AI applications
- Operates independently from the AI application, listening for requests from MCP Clients and responding accordingly.
- Provides tools, resources, and capabilities, while communicating to the Client what capabilities are available

3 Key Services:

- Prompt templates - Provides pre-built prompts for common tasks. 
    - For example, a resume rewriting template where you can swap in different resumes and target job descriptions.
- Resources - Static data access including files, databases, and external APIs. These are essentially GET requests for data lookup from various sources.
- Tools - Functions and APIs that allow MCP clients to perform actions.

Real-world Example:

A software development team builds an MCP server that connects to their:

- GitHub repositories (for code analysis and pull request management)
- Jira ticketing system (for project tracking and issue creation)
- CI/CD pipeline (for deployment status and build triggers)
- Documentation (for searching and updating technical docs)

Now any MCP-compatible application—whether it's Claude, VSCode, or a custom internal tool—can instantly access all these systems through a single, standardized interface.

### MCP Client

AI Applications that can connect to these MCP Servers to access external data and tools.

When you use Claude Desktop, you'll see various tools and integrations available to Claude in the user interface—this is because Claude Desktop has a built-in MCP Client.

MCP Clients:
- Discover server capabilities: The client asks MCP servers what tools and resources they have available.
- Handle data exchange: The client receives data from servers and passes it to the AI application in the proper format.
- Manage tool execution: Coordinates when and how the AI uses different tools from connected servers.

For example, a coding agent can connect to your GitHub repositories, documentation, and development tools to understand your specific codebase and workflow.

### MCP Clients are Embedded in the Agent

<img src="https://i.postimg.cc/ncv93dPM/mcp-clients-in-agent.png" width="400"/>

**Key Concept:** The MCP Client is a **component inside** your AI application, not a separate service.

**How It Works:**

1. **User Interaction** → User sends a prompt to the agent
2. **Agent Processing** → The LLM processes the request and determines what tools are needed
3. **MCP Client Role** → The embedded MCP Client:
   - Discovers available tools from connected MCP Servers
   - Sends requests to the appropriate MCP Server(s)
   - Receives responses and passes them back to the LLM
4. **Agent Response** → LLM uses the tool results to generate a final response to the user

**Example Flow:**

```
User: "What's the weather in Seattle and create a calendar event for tomorrow"
     ↓
Agent + MCP Client
     ↓                           ↓
Weather MCP Server          Calendar MCP Server
(get_forecast tool)         (create_event tool)
     ↓                           ↓
Agent combines results → Response to user
```
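The discover-then-route behavior in the flow above can be sketched with a toy, in-process model. `ToyServer` and `ToyClient` are hypothetical classes written for this illustration. They do not speak the real MCP protocol and the tool results are canned.

```python
from typing import Any, Callable


class ToyServer:
    """In-process stand-in for an MCP server (not the real protocol)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, func: Callable[..., Any]) -> None:
        self._tools[func.__name__] = func

    def list_tools(self) -> list[str]:
        return sorted(self._tools)

    def call_tool(self, tool_name: str, arguments: dict[str, Any]) -> Any:
        return self._tools[tool_name](**arguments)


class ToyClient:
    """Discovers tools on connected servers and routes each call to its owner."""

    def __init__(self, servers: list[ToyServer]) -> None:
        # Discovery: ask every server what it offers and remember who owns what
        self._routes = {
            tool: server for server in servers for tool in server.list_tools()
        }

    def available_tools(self) -> list[str]:
        return sorted(self._routes)

    def call(self, tool_name: str, **arguments: Any) -> Any:
        return self._routes[tool_name].call_tool(tool_name, arguments)


def get_forecast(city: str) -> str:
    return f"Forecast for {city}: sunny"  # canned data for the sketch


def create_event(title: str, day: str) -> str:
    return f"Created '{title}' on {day}"


weather = ToyServer("weather")
weather.register(get_forecast)
calendar = ToyServer("calendar")
calendar.register(create_event)

client = ToyClient([weather, calendar])
print(client.available_tools())
print(client.call("get_forecast", city="Seattle"))
print(client.call("create_event", title="Hike", day="tomorrow"))
```

Note that the client never hard-codes a tool name: adding a third server with new tools would require no change to `ToyClient`.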

### Many Applications Embed MCP Clients

<img src="https://i.postimg.cc/xTY1196d/apps-mcp-clients-1.png" width="400"/>

<img src="https://i.postimg.cc/vHzH1CjM/apps-mcp-clients-2.png" width="400"/>

**The Growing MCP Ecosystem:**

Major AI applications and tools are embedding MCP Clients, including:
- **Claude Desktop** - Anthropic's desktop AI assistant
- **IDEs** - Cursor, Windsurf, Zed, and other AI-powered code editors
- **Custom Applications** - Any app can integrate MCP

When you build **one MCP Server**, it instantly works with **all** of these applications. You don't need to:
- Write custom integrations for each platform
- Learn different APIs for Claude vs. Cursor vs. Windsurf
- Maintain separate codebases for different tools
- Redeploy when new MCP-compatible applications launch

### MCP Standardizes Tool Integration by Providing a Unified Protocol

Rather than each tool requiring its own unique communication method, applications can interface with MCP once and access multiple backend services through different MCP servers that all speak the same protocol.

<img src="https://i.postimg.cc/vHr58PFQ/mcp-diagram.png" width="400"/>

MCP supports multiple transport protocols, allowing you to choose the best communication method for your use case.

**Transport: stdio**

Standard input/output (stdio) runs the MCP server as a local subprocess. It is ideal for local development and for desktop applications like Claude Desktop.

<img src="https://i.postimg.cc/bJjK9wDD/stdio.png" width="400"/>

**Transport: streamable-http**

Streamable HTTP runs the MCP server as a remote web service that clients connect to over the network. Clients send requests with HTTP POST, and the server can optionally stream responses back using Server-Sent Events (SSE). It is ideal for cloud deployments, microservices architectures, and scenarios where multiple clients need to access the same MCP server from different machines.

<img src="https://i.postimg.cc/HnYRFvtz/streamable-http.png" width="400"/>
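Whichever transport you choose, the messages themselves are JSON-RPC 2.0. As a rough sketch, this is what a tool call request looks like before it is written to the transport. `build_tool_call` is a hypothetical helper for illustration. In practice an MCP client library builds these messages for you.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as a single line of text."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no raw newlines by default, so the result fits a
    # line-delimited stream such as stdio
    return json.dumps(message)


line = build_tool_call(1, "get_forecast", {"latitude": 35.7796, "longitude": -78.6382})
print(line)
```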

### We Need Coarser Grained APIs For LLMs

Traditional APIs are designed for developers and are typically fine-grained (many small, specific operations):
- `GET /points/{lat},{lon}` - Get forecast grid endpoint
- `GET /forecast/{gridId}/{gridX},{gridY}` - Get forecast data

An LLM would need to:
1. Understand the multi-step sequence
2. Make multiple API calls in the correct order

**MCP Servers Orchestrate Lower-Level APIs**

MCP servers act as orchestration layers that handle the complexity:

<img src="https://github.com/temporal-community/durable-mcp/raw/main/diagrams/WeatherMCPServer-GetForecastTool.png" width="500" />

1. **LLM Request** → Agent calls `get_forecast(35.7796, -78.6382)` via MCP Client
2. **MCP Server Orchestrates** → The server handles multiple steps internally:
   - Makes first API call to NWS points endpoint
   - Extracts forecast grid URL from response
   - Makes second API call to get forecast data
   - Parses and formats the forecast periods
3. **Return Formatted Result** → Returns clean, formatted weather forecast to LLM
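The three steps above can be sketched as a single coarse-grained function. To keep the example runnable without network access, the HTTP call is passed in as a `fetch` parameter and replaced with `fake_fetch`, which returns made-up data. Both names, and the grid path inside the fake response, are assumptions for this illustration.

```python
from typing import Any, Callable

NWS_API_BASE = "https://api.weather.gov"

Fetch = Callable[[str], dict[str, Any]]


def get_forecast(latitude: float, longitude: float, fetch: Fetch) -> str:
    """Coarse-grained tool: one call for the LLM, two API calls underneath."""
    # Step 1: look up the forecast grid for these coordinates
    points = fetch(f"{NWS_API_BASE}/points/{latitude},{longitude}")
    # Step 2: follow the forecast URL returned by the first call
    forecast = fetch(points["properties"]["forecast"])
    # Step 3: format the first few periods for the LLM
    periods = forecast["properties"]["periods"][:2]
    return "\n".join(
        f"{p['name']}: {p['temperature']}°{p['temperatureUnit']}" for p in periods
    )


def fake_fetch(url: str) -> dict[str, Any]:
    """Canned responses standing in for the NWS API (made-up data)."""
    if "/points/" in url:
        return {"properties": {"forecast": f"{NWS_API_BASE}/gridpoints/RAH/73,57/forecast"}}
    return {"properties": {"periods": [
        {"name": "Tonight", "temperature": 55, "temperatureUnit": "F"},
        {"name": "Friday", "temperature": 72, "temperatureUnit": "F"},
    ]}}


result = get_forecast(35.7796, -78.6382, fake_fetch)
print(result)
```

The LLM only ever sees `get_forecast(latitude, longitude)`. The ordering of the two underlying calls is the server's concern.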

### Demo (Expand for instructor notes or to run on your own)
<!--
Prep:
1. Clone this repository: `https://github.com/temporal-community/durable-mcp/`.
2. Ahead of time, edit `claude_desktop_config.json` and add the full path to the directory containing weather.py.
3. Make sure you have Claude Desktop installed to your desktop as we will use this as our MCP Client.
4. Run `cp claude_desktop_config.json ~/Library/Application\ Support/Claude/` to connect the tools in this repository to the MCP client in Claude Desktop. Restart Claude Desktop if you have not already.

Demo: 
1. Run your temporal server: `temporal server start-dev`. 
2. Start the Temporal Worker: `uv run run_weather_worker.py` 
3. Open Claude Desktop. 
4. Ask the tool to get the weather for a city, for example: "What is the weather in Cary, NC?"
5. Emphasize how Claude Desktop discovers that tool dynamically. Allow it to call the `get_weather` tool.
6. Ask the audience: but what about reliability?
-->

### Hands-on Moments

This is a hands-on workshop!

All of the instructor's slides and code samples are executable in the workshop notebooks.
We encourage you to follow along and play with the samples!

At the end of every chapter (notebook) will be a hands-on lab.

In [None]:
%pip install --quiet temporalio fastmcp

In [None]:
# Running this will download the Temporal CLI, which we need for this demo.

!curl -sSf https://temporal.download/cli.sh | sh

In [None]:
# This allows us to run the Temporal Asyncio event loop within the event loop of Jupyter Notebooks
import nest_asyncio
nest_asyncio.apply()

### Make Sure Your Temporal Web UI is Running

1. Run `temporal server start-dev` in your terminal.
2. Then in your `Ports` tab on the bottom of this screen, find `8233` and click on the Globe icon to open the Temporal Web UI.

### The Limitations of MCP tools

MCP enables powerful tool integrations, but the protocol itself doesn't provide durability. This matters when your AI agent calls an MCP tool that:
- Makes external API calls
- Processes long-running operations
- Coordinates multiple services

What happens if the Weather API is down? What if the network fails halfway through a multi-step operation? MCP solves the standardization problem, but a traditional MCP tool has no built-in way to recover from these failures, as the example below shows.

### Example of a Non-Durable MCP Tool:

```python
from fastmcp import FastMCP
import httpx
import asyncio

mcp = FastMCP("fragile-weather")

@mcp.tool()
async def make_nws_request(city: str) -> str:
    async with httpx.AsyncClient() as client:
        # Network call can fail
        response = await client.get(f"https://api.weather.com/{city}")
        
    # Processing can crash
    data = response.json()
    
    # Long operation might timeout
    await asyncio.sleep(30)
    
    # No state persistence
    result = f"Weather for {city}: {data['temp']}°F"
    
    return result
```

### Temporal is the Ideal Technology for Building MCP Servers!

<img src="https://i.postimg.cc/NMm2ZpjW/mcp-temporal.png" width="500" />

MCP servers need to orchestrate complex, multi-step operations that interact with external systems.
Temporal is a great choice for this use case.

1. **Durable Execution**
    - Your MCP tool can run for hours, days, or even months
    - The tool keeps running even if the MCP server process crashes or restarts
    - State is preserved across failures automatically

2. **Automatic Retries**
    - When an external API is temporarily down, Temporal retries automatically
    - Configurable retry policies (exponential backoff, maximum attempts, etc.)
    - No manual retry logic cluttering your code

3. **Built-in Observability**
    - See every tool execution in the Temporal Web UI
    - View inputs, outputs, and errors for each step
    - Understand exactly what happened when debugging agent behavior

4. **Natural Orchestration Model**
    - Workflows look like regular code but are durable
    - Activities encapsulate external API calls with retry logic
    - Easy to express complex multi-step tool logic
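The retry settings mentioned in the second item boil down to simple arithmetic. As a rough model (assuming the usual exponential-backoff rule of multiplying the wait by a coefficient after each attempt and capping it at a maximum), the waits can be computed like this. `retry_intervals` is a hypothetical helper for illustration, not part of the Temporal SDK.

```python
from datetime import timedelta


def retry_intervals(
    initial: timedelta,
    backoff_coefficient: float,
    maximum: timedelta,
    attempts: int,
) -> list[timedelta]:
    """Wait before each retry: initial * coefficient**n, capped at maximum."""
    waits = []
    for n in range(attempts):
        wait = initial * (backoff_coefficient ** n)
        waits.append(min(wait, maximum))
    return waits


# Doubling backoff: the wait grows until it hits the one-minute cap
exponential = retry_intervals(timedelta(seconds=2), 2.0, timedelta(minutes=1), 6)
print([w.total_seconds() for w in exponential])

# A coefficient of 1.0 keeps the wait constant between retries
constant = retry_intervals(timedelta(seconds=2), 1.0, timedelta(minutes=1), 3)
print([w.total_seconds() for w in constant])
```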

### Optional Demo (Expand for instructor notes or to run on your own).
<!--
Note that this demo is optional, because if students did the first workshop (Agentic Loop and Temporal), students should understand how Temporal maintains durability. However, feel free to run this demo if you have extra time or you want to emphasize the point.
Prep:
1. Clone this repository: `https://github.com/temporal-community/durable-mcp/`.
2. Ahead of time, edit `claude_desktop_config.json` and add the full path to the directory containing server.py.
3. Make sure you have Claude Desktop installed to your desktop as we will use this as our MCP Client.
4. Run `cp claude_desktop_config.json ~/Library/Application\ Support/Claude/` to connect the tools in this repository to the MCP client in Claude Desktop. Restart Claude Desktop if you have not already.

Demo: 
1. Run your temporal server: `temporal server start-dev`. 
2. You'll notice this repository includes a `pf.rules` file that has URLs for the National Weather Service (NWS) API. We will block those to imitate a network outage for that API.
3. Set the rules with `sudo pfctl -f pf.rules`.
4. Enable the firewall with `sudo pfctl -e`.
5. Start the Temporal Worker: `uv run run_weather_worker.py` 
6. Restart Claude Desktop. 
7. Ask the tool to get weather for city, for example: "What is the weather in Cary, NC?"
8. Go on the Web UI and point out that the `make_nws_request` Activity is retrying.
9. Disable the firewall with `sudo pfctl -d`.
10. Watch the Workflow Execution complete successfully.
-->

### Demo #2 (Expand for instructor notes or to run on your own)
<!--
Prep:
1. Clone this repository: `https://github.com/temporal-community/durable-mcp/`.
2. Ahead of time, edit `claude_desktop_config.json` and add the full path to the directory containing server.py.
3. Make sure you have Claude Desktop installed to your desktop as we will use this as our MCP Client.
4. Run `cp claude_desktop_config.json ~/Library/Application\ Support/Claude/` to connect the tools in this repository to the MCP client in Claude Desktop. Restart Claude Desktop if you have not already.

Demo:
1. Run your temporal server: `temporal server start-dev`. 
2. Start the Temporal Worker: `uv run run_weather_worker.py` 
3. Restart Claude Code, and ask the tool to get weather for city, for example: "What is the weather in Cary, NC?"
4. Allow the tool to be used, then exit out Claude Code.
5. Go on the Web UI, and demonstrate that the Workflow is still running. 
6. Emphasize that with Temporal, the tool keeps going. The Workflow runs independently of the MCP server process.
-->

## Let's transform a simple MCP tool into a durable one that uses a Temporal Workflow.

#### Traditional MCP Tool (non-durable)

Remember: without durability, if the code below fails, everything is lost. There is no retry, no recovery, and no memory of what happened.

```python
async def make_nws_request(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API with proper error handling."""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=headers, timeout=5.0)
        response.raise_for_status()
        return response.json()
```

### Adding Durability to MCP Tools with Temporal

To add durability to our tools, we can implement MCP tools as Temporal Workflows.

| Challenge | Without Temporal | With Temporal |
|-----------|-----------------|---------------|
| **Failures** | Tool execution lost on crash | Automatic recovery and retry |
| **Long Operations** | Timeout limitations | Unlimited execution time |
| **State Management** | Complex manual handling | Built-in durable state |
| **Network Issues** | Manual retry logic needed | Automatic retry with backoff |
| **Multi-step Processes** | Difficult coordination | Natural workflow orchestration |
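To build some intuition for the "built-in durable state" row, here is a toy illustration of the general idea: record each step's result as soon as it completes, and on restart skip the steps that are already recorded. This is only an analogy. `run_steps` and the in-memory `history` dict are hypothetical, and this is not how the Temporal SDK is implemented.

```python
from typing import Any, Callable


def run_steps(
    steps: list[tuple[str, Callable[[], Any]]],
    history: dict[str, Any],
) -> dict[str, Any]:
    """Run named steps in order, skipping any whose result is already recorded."""
    for name, step in steps:
        if name in history:
            continue  # completed before the crash, so reuse the recorded result
        history[name] = step()  # record the result as soon as the step finishes
    return history


calls: list[str] = []
fail_once = {"pending": True}


def fetch_points() -> str:
    calls.append("points")
    return "grid-url"


def fetch_forecast() -> str:
    calls.append("forecast")
    if fail_once["pending"]:
        fail_once["pending"] = False
        raise ConnectionError("simulated outage")
    return "sunny"


steps = [("points", fetch_points), ("forecast", fetch_forecast)]
history: dict[str, Any] = {}

try:
    run_steps(steps, history)
except ConnectionError:
    pass  # the "process" died here, but the recorded history survives

run_steps(steps, history)  # restart: "points" is not executed a second time
print(calls)
print(history)
```

In this toy the history lives in a Python dict, so it would not survive a real process crash. The point of a durable execution system is that this record is persisted outside your process.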

### Let's build some durable MCP tools now.

In [None]:
# An Activity is for code that can fail or is non-deterministic, such as external calls.
# Step 1: Make the code an Activity. Look at the cell below for the solution.
# Step 2: Now run the code to load it into the program

from typing import Any
from temporalio import activity
import httpx

USER_AGENT = "weather-app/1.0"

async def make_nws_request(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API with proper error handling."""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    async with httpx.AsyncClient() as client:

        response = await client.get(url, headers=headers, timeout=5.0)
        response.raise_for_status()
        return response.json()

In [None]:
# Optional: Run this cell to load and display the solution
from pathlib import Path
from IPython.display import display, Markdown
import os

notebook_dir = Path(os.getcwd())
solution_file = notebook_dir / "notebooks" / "01_MCP_Temporal_Intro_Solution" / "activity_solution.py"

code = solution_file.read_text()

print("Solution loaded:")
display(Markdown(f"```python\n{code}\n```"))

In [None]:
# The Workflow orchestrates Activities and maintains state durably.
# Step 1: Make the Workflow call the `make_nws_request` Activity for both `points_data` and `forecast_data`. 
# Step 2: Set the Schedule-to-Close to be 40 seconds in each Activity call.
# Step 3: Now run the code to load it into the program

from temporalio import workflow
from datetime import timedelta
from temporalio.common import RetryPolicy

retry_policy = RetryPolicy(
    maximum_attempts=0,  # Infinite retries
    initial_interval=timedelta(seconds=2),
    maximum_interval=timedelta(minutes=1),
    backoff_coefficient=1.0,
)

# Constants
NWS_API_BASE = "https://api.weather.gov"

# sandboxed=False is a Notebook only requirement. You normally don't do this
@workflow.defn(sandboxed=False)
class GetForecast:
    @workflow.run
    async def get_forecast(self, latitude: float, longitude: float) -> str:
        """Get weather forecast for a location.

        Args:
            latitude: Latitude of the location
            longitude: Longitude of the location
        """
        # First get the forecast grid endpoint
        points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
        points_data = await workflow.execute_activity(
            # TODO: Execute `make_nws_request` Activity,
            points_url,
            # TODO: Set Schedule-to-Close Timeout to be 40 seconds
            retry_policy=retry_policy,
        )

        if not points_data:
            return "Unable to fetch forecast data for this location."

        await workflow.sleep(10)

        # Get the forecast URL from the points response
        forecast_url = points_data["properties"]["forecast"]
        forecast_data = await workflow.execute_activity(
            # TODO: Execute `make_nws_request` Activity,
            forecast_url,
            # TODO: Set Schedule-to-Close Timeout to be 40 seconds
            retry_policy=retry_policy,
        )
        if not forecast_data:
            return "Unable to fetch detailed forecast."

        # Format the periods into a readable forecast
        periods = forecast_data["properties"]["periods"]
        forecasts = []
        for period in periods[:5]:  # Only show next 5 periods
            forecast = f"""
    {period['name']}:
    Temperature: {period['temperature']}°{period['temperatureUnit']}
    Wind: {period['windSpeed']} {period['windDirection']}
    Forecast: {period['detailedForecast']}
    """
            forecasts.append(forecast)

        return "\n---\n".join(forecasts)

In [None]:
# Optional: Run this cell to load and display the solution
from pathlib import Path
from IPython.display import display, Markdown
import os

notebook_dir = Path(os.getcwd())
solution_file = notebook_dir / "notebooks" / "01_MCP_Temporal_Intro_Solution" / "workflow_solution.py"

code = solution_file.read_text()

print("Solution loaded:")
display(Markdown(f"```python\n{code}\n```"))


In [None]:
# Create MCP tool which calls the Workflow. This MCP tool is now durable!
# Step 1: Call the `GetForecast` Workflow.
# Step 2: Now run the code to load it into the program

# Note: MCP servers cannot be run directly in Jupyter notebooks because
# MCP servers need to run as separate processes that communicate with a protocol
# Therefore, we also have this code in a separate Python file that can be run
# as a standalone MCP server (mcp_servers/weather.py).

from temporalio.client import Client
from fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP("weather")

# Temporal client setup (do this once, then reuse)
temporal_client = None

async def get_temporal_client():
    global temporal_client
    if not temporal_client:
        temporal_client = await Client.connect("localhost:7233")
    return temporal_client

@mcp.tool
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # The business logic has been moved into the temporal workflow, the mcp tool kicks off the workflow
    client = await get_temporal_client()
    handle = await client.start_workflow(
        workflow= # TODO: Call the `GetForecast` Workflow
        args=[latitude, longitude],
        id=f"forecast-{latitude}-{longitude}",
        task_queue="weather-task-queue",
    )
    return await handle.result()

if __name__ == "__main__":
    # Initialize and run the server
    mcp.run(transport="sse", host="0.0.0.0", port=5125)

In [None]:
# Optional: Run this cell to load and display the solution
from pathlib import Path
from IPython.display import display, Markdown
import os

notebook_dir = Path(os.getcwd())
solution_file = notebook_dir / "notebooks" / "01_MCP_Temporal_Intro_Solution" / "mcp_server_solution.py"

code = solution_file.read_text()

print("Solution loaded:")
display(Markdown(f"```python\n{code}\n```"))


In [None]:
# Run your Worker

import asyncio
from temporalio.client import Client
from temporalio.worker import Worker

async def run_worker():
    # Connect to Temporal server (change address if using Temporal Cloud)
    client = await Client.connect("localhost:7233")

    worker = Worker(
        client,
        task_queue="weather-task-queue",
        workflows=[GetForecast],
        activities=[make_nws_request],
    )
    print("Worker started. Listening for workflows...")
    await worker.run()


In [None]:
# Due to a limitation of Jupyter Notebooks and Google Colab, this is how
# you must start the Worker in a notebook environment
worker = asyncio.create_task(run_worker())

# If you are running this code in a typical Python environment, you can start
# the Worker by just calling `asyncio.run`
# if __name__ == "__main__":
#    asyncio.run(run_worker())

## Configure Claude Desktop 
### Take 20 minutes to do the following steps.

1. Follow the `codespace-configure-mcp.md` file to see how to connect your MCP Server running in Codespace to Claude Desktop running on your local machine.
2. In Claude Desktop, try asking: "What's the weather forecast for San Francisco, CA?"
3. Allow it to make a call to the `get_forecast` tool.
4. Open your Web UI to see your Workflows executing in real-time!
    - To open your Temporal Web UI, in your `Ports` tab on the bottom of this screen, find `8233` and click on the Globe icon.
    - What were the inputs and outputs of both `make_nws_request` Activities?
    - What was the final output of the Workflow Execution?

In [None]:
# Kill any worker to prepare for the next demo.
x = worker.cancel()

if x:
  print("Worker killed")
else:
  print("Worker was not running. Nothing to kill")

### Simulate a Bug in Error Handling!

In this next section, we'll introduce a bug that prevents retries from happening. You'll trigger a failure, observe that the Activity is not retried, and then identify and fix the error.

In [None]:
# Buggy Activity Code

# Step 1: This Activity has a bug that prevents retries from working!
# Can you spot it before running the code? Run the cell below to see the answer.
# Step 2: Run this codeblock.

from typing import Any
from temporalio import activity
import httpx

USER_AGENT = "weather-app/1.0"

@activity.defn
async def make_nws_request_buggy(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API - but with a bug!"""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    
    try:
        async with httpx.AsyncClient() as client:
            response = await client.get(url, headers=headers, timeout=5.0)
            response.raise_for_status()
            return response.json()
    except Exception as e:
        print(f"Error occurred: {e}")
        return None

In [None]:
# Optional: Run this cell to load and display what the bug is
from pathlib import Path
from IPython.display import display, Markdown
import os

notebook_dir = Path(os.getcwd())
solution_file = notebook_dir / "notebooks" / "01_MCP_Temporal_Intro_Solution" / "buggy_activity_solution.md"

response = solution_file.read_text()

print("Solution loaded:")
display(Markdown(response))  # render the .md file as Markdown instead of a raw string

In [None]:
# Run this Workflow that calls the buggy Activity
from temporalio import workflow
from datetime import timedelta
from temporalio.common import RetryPolicy

NWS_API_BASE = "https://api.weather.gov"

retry_policy = RetryPolicy(
    maximum_attempts=0,  # Infinite retries
    initial_interval=timedelta(seconds=2),
    maximum_interval=timedelta(minutes=1),
    backoff_coefficient=1.0,
)

@workflow.defn(sandboxed=False)
class GetForecast:
    @workflow.run
    async def get_forecast(self, latitude: float, longitude: float) -> str:
        """Get weather forecast - buggy version."""
        points_url = f"{NWS_API_BASE}/points/{latitude},{longitude}"
        
        points_data = await workflow.execute_activity(
            make_nws_request_buggy,
            points_url,
            start_to_close_timeout=timedelta(seconds=40),
            retry_policy=retry_policy
        )

        if not points_data:
            return "Unable to fetch forecast data - but no retries happened!"

        forecast_url = points_data["properties"]["forecast"]
        forecast_data = await workflow.execute_activity(
            make_nws_request_buggy,
            forecast_url,
            start_to_close_timeout=timedelta(seconds=40),
            retry_policy=retry_policy
        )
        
        if not forecast_data:
            return "Unable to fetch detailed forecast - but no retries happened!"

        periods = forecast_data["properties"]["periods"]
        forecasts = []
        for period in periods[:5]:
            forecast = f"""
    {period['name']}:
    Temperature: {period['temperature']}°{period['temperatureUnit']}
    Wind: {period['windSpeed']} {period['windDirection']}
    Forecast: {period['detailedForecast']}
    """
            forecasts.append(forecast)

        return "\n---\n".join(forecasts)

In [None]:
# Define a Worker that registers the buggy Activity and Workflow
from temporalio.client import Client
from temporalio.worker import Worker

async def run_worker():
    client = await Client.connect("localhost:7233")
    
    worker = Worker(
        client,
        task_queue="weather-task-queue",
        workflows=[GetForecast],
        activities=[make_nws_request_buggy],
    )
    print("Worker started. Listening for workflows...")
    await worker.run()

In [None]:
# Start the Worker
import asyncio

worker = asyncio.create_task(run_worker())

In [None]:
# Test the workflow with invalid coordinates. 
# Because the coordinates are invalid, we expect that our Activity will not be able to 
# successfully fetch the weather.

# This simulates a network failure or API issue
from temporalio.client import Client
import uuid

async def test_buggy_workflow():
    client = await Client.connect("localhost:7233")
    
    # Using invalid coordinates to simulate failure
    handle = await client.start_workflow(
        GetForecast,
        args=[999.0, 999.0],  # Invalid coordinates that will cause the API to fail
        id=f"buggy-forecast-test-{uuid.uuid4()}",
        task_queue="weather-task-queue",
    )
    
    return handle

await test_buggy_workflow()

### Observing Your Buggy Workflow

Observe your Web UI. You might expect the Activity to retry on failure, but what we actually see is that the Workflow has completed successfully!

However, if you click on the `ActivityTaskCompleted` Event, you can see that we didn't get the forecast data. The Activity ran into an error, but no retries happened, and the Workflow still completed successfully! What happened here?

### What Went Wrong?

**The Bug:** By catching the exception and returning `None`, the Activity completes successfully from Temporal's perspective. Temporal only retries Activities when they raise an exception, not when they return a value (even if that value is `None`).

```python
try:
    async with httpx.AsyncClient() as client:
        response = await client.get(url, headers=headers, timeout=5.0)
        response.raise_for_status()
        return response.json()
except Exception as e:
    print(f"Error occurred: {e}")
    return None  # BUG! Temporal doesn't see this as a failure!
```

In other words, the `except` block hides the failure: the Activity reports a successful result of `None`, so the Retry Policy is never applied.

**The Fix:**

Let the exception propagate! Temporal needs to see the failure to trigger retries.

```python
# Don't catch the exception at all
async with httpx.AsyncClient() as client:
    response = await client.get(url, headers=headers, timeout=5.0)
    response.raise_for_status()
    return response.json()
```
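
The difference can be reproduced without a Temporal server. The sketch below is a toy retry loop in plain Python, not Temporal's implementation; `run_with_retries`, `make_flaky_call`, and `FakeNetworkError` are hypothetical names invented for this illustration. It only shows why a caller that retries on exceptions never retries a function that swallows them.

```python
class FakeNetworkError(Exception):
    """Stand-in for an HTTP or network failure (hypothetical)."""


def run_with_retries(fn, max_attempts=3):
    """Toy retry loop: it retries only when fn raises an exception."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fn(), attempts
        except Exception:
            if attempts >= max_attempts:
                raise


def make_flaky_call(swallow_errors):
    """Build a call that fails twice, then succeeds on the third attempt."""
    state = {"calls": 0}

    def call():
        state["calls"] += 1
        try:
            if state["calls"] < 3:
                raise FakeNetworkError("simulated outage")
            return {"ok": True}
        except FakeNetworkError:
            if swallow_errors:
                return None  # Hides the failure from the retry loop
            raise  # Lets the retry loop see the failure

    return call


swallowed = run_with_retries(make_flaky_call(swallow_errors=True))
propagated = run_with_retries(make_flaky_call(swallow_errors=False))

print(swallowed)   # (None, 1)
print(propagated)  # ({'ok': True}, 3)
```

The swallowing version "succeeds" on the first attempt with `None`, while the propagating version is retried until the third attempt returns real data.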

In [None]:
# Stop the Worker before we fix it

x = worker.cancel()

In [None]:
# Now let's fix the bug by allowing exceptions to propagate to Temporal
# **Step 1**: Remove the `try` statement and everything in the `except` statement.
# By doing this, we let exceptions propagate to Temporal
# So if an exception occurs, Temporal will see it and trigger retries!
# **Step 2**: Fix your indentations, and run this codeblock

from typing import Any
from temporalio import activity
import httpx

USER_AGENT = "weather-app/1.0"

@activity.defn
async def make_nws_request_buggy(url: str) -> dict[str, Any] | None:
    """Make a request to the NWS API"""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept": "application/geo+json"
    }
    
    try: # TODO: Remove the try statement
        async with httpx.AsyncClient() as client:
            response = await client.get(url, headers=headers, timeout=5.0)
            response.raise_for_status()
            return response.json()
    except Exception as e: # TODO: Remove everything in the except statement
        print(f"Error occurred: {e}")
        return None

In [None]:
# Optional: Run this cell to load and display the solution
from pathlib import Path
from IPython.display import display, Markdown
import os

notebook_dir = Path(os.getcwd())
solution_file = notebook_dir / "notebooks" / "01_MCP_Temporal_Intro_Solution" / "fixed_activity_solution.py"

code = solution_file.read_text()

print("Solution loaded:")
display(Markdown(f"```python\n{code}\n```"))

In [None]:
# Register a Worker with the fixed code

from temporalio.client import Client
from temporalio.worker import Worker

async def run_fixed_worker():
    client = await Client.connect("localhost:7233")
    
    worker = Worker(
        client,
        task_queue="weather-task-queue",
        workflows=[GetForecast],
        activities=[make_nws_request_buggy],
    )
    print("Fixed worker started. Listening for workflows...")
    await worker.run()

In [None]:
# Start a Worker with the fixed code
import asyncio 

worker = asyncio.create_task(run_fixed_worker())

In [None]:
# Let's see if we see retries now with invalid coordinates again. Run this codeblock.

# This simulates a network failure or API issue
from temporalio.client import Client
import uuid

async def test_buggy_workflow():
    client = await Client.connect("localhost:7233")
    
    handle = await client.start_workflow(
        GetForecast,
        args=[999.0, 999.0],  # Invalid coordinates that will cause the API to fail
        id=f"buggy-forecast-test-{uuid.uuid4()}",
        task_queue="weather-task-queue",
    )
    
    return handle

# Run the test
await test_buggy_workflow()

### Success! Retries Are Working!

If you check the Temporal Web UI, you should now see the Activity is **retrying**. Open the pending Activity: what failure does it report for the most recent attempt?

Since we're using invalid coordinates (999.0, 999.0), the Activity will keep failing and this Workflow will retry forever. Let's terminate the Workflow and test with valid coordinates instead.
1. Click the "Terminate" button on the top right side of the window. 
2. Enter "Invalid Coordinates" as the termination reason.

<img src="https://i.postimg.cc/RZBG1VTH/workflow-termination.png" width="300"/>
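
Retries do not fire back-to-back: the wait between attempts grows until it reaches a cap. The sketch below estimates that schedule in plain Python, assuming Temporal's default Retry Policy values (1 second initial interval, backoff coefficient of 2.0, maximum interval of 100 times the initial interval). The `retry_policy` defined earlier in this notebook may use different values, and `retry_intervals` is a hypothetical helper, not part of the SDK.

```python
from datetime import timedelta


def retry_intervals(
    retries,
    initial_interval=timedelta(seconds=1),
    backoff_coefficient=2.0,
    maximum_interval=None,
):
    """Return the approximate wait before each of the first `retries` retries."""
    if maximum_interval is None:
        # Assumed default: the cap is 100x the initial interval
        maximum_interval = initial_interval * 100
    intervals = []
    for n in range(retries):
        wait = initial_interval * (backoff_coefficient ** n)
        intervals.append(min(wait, maximum_interval))
    return intervals


waits = retry_intervals(8)
print([w.total_seconds() for w in waits])
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 100.0]
```

Under these assumptions, the eighth retry already waits the capped 100 seconds, which is why a Workflow with no attempt limit can sit in the retrying state indefinitely until you terminate it or the underlying problem is fixed.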

In [None]:
# Test the fixed code with VALID coordinates!

from temporalio.client import Client
import uuid

async def test_fixed_workflow():
    client = await Client.connect("localhost:7233")
    
    # Using valid coordinates for San Francisco
    latitude, longitude = 37.7749, -122.4194
    workflow_id = f"fixed-forecast-test-{uuid.uuid4()}"
    
    handle = await client.start_workflow(
        GetForecast.get_forecast,
        args=[latitude, longitude],
        id=workflow_id,
        task_queue="weather-task-queue",
    )
    
# Run the test
await test_fixed_workflow()

### Observe Your Web UI

Your Workflow Execution should now complete successfully.

### Bug Fixed! What Did We Learn?

One of Temporal's core benefits is that you can write code as if everything will succeed (the "happy path"). You don't need complex error handling, retry logic, or state management - Temporal handles all of that automatically.

In fact, the simpler version of the code (with no `try/except` block) gave us more durability! By removing the error handling logic and letting the exception reach Temporal, we let Temporal handle the retries for us.

We see that we can trust Temporal's retry mechanism. Configure your Retry Policy, let exceptions propagate, and Temporal will handle the rest.
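
One refinement to keep in mind: some failures will never succeed no matter how often they are retried. A request for out-of-range coordinates, for example, is likely to be rejected with a client error every time. The sketch below shows one possible heuristic for telling the two cases apart by HTTP status code; `is_retryable_status` is a hypothetical helper and the choice of codes is an assumption, not an NWS or Temporal rule.

```python
def is_retryable_status(status_code: int) -> bool:
    """Heuristic: retry timeouts, throttling, and server errors only."""
    if status_code in (408, 429):
        # Request timeout and rate limiting are usually transient
        return True
    # 5xx means the server failed; other 4xx means the request itself is wrong
    return status_code >= 500


for code in (404, 429, 503):
    print(code, is_retryable_status(code))
# 404 False
# 429 True
# 503 True
```

In the Temporal Python SDK, an Activity can signal the "do not retry" case by raising `ApplicationError` from `temporalio.exceptions` with `non_retryable=True`; every other exception can still propagate untouched, so the happy-path style described above is preserved.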

_Learn more about how Temporal handles errors in [this chapter](https://temporal.talentlms.com/unit/view/id:5877) of our free Error Handling course!_

In [None]:
# Kill any worker to prepare for the exercise.
x = worker.cancel()

if x:
  print("Worker killed")
else:
  print("Worker was not running. Nothing to kill")

### Summary: Why Temporal and MCP?

- MCP is the de facto standard for connecting AI applications to tools
- What we are seeing in our customer base → you are building MCP servers.
    - Platform Engineering orchestration from Slack
    - Extensions to your coding agents
- Still early days - an abundance of trickiness (security, scaling, observability, ...)
- But some things are clear…
    - MCP servers are distributed systems
    - They need durability
    - The durability needs compound because… Agents!
- Temporal is committed!

---
## Exercise 1 - Making Tools Durable

* In this exercise, you will:
  * Add durability and persistence to your MCP tools with Temporal Workflows
  * Test the integration between Claude Desktop, MCP servers, and Temporal Workflows
* Go to the **exercises** directory in the Google Drive and open the `01_Making_MCP_Tools_Durable` folder. Then, open the **practice** directory.
* Open the `README.md` and follow the instructions
* If you get stuck, raise your hand and someone will come by and help. You can also check the `Solution` directory for the answers