# MCP Historical Weather Comparison

<a target="_blank" href="https://colab.research.google.com/github/tylere/claude-mcp-historical-weather/blob/main/demo.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Overview

This notebook demonstrates how to use **Model Context Protocol (MCP)** to extend Claude's capabilities with real-time data access. We'll build a system that allows Claude to access historical weather data and compare annual weather statistics between locations.

### What is MCP?

Model Context Protocol (MCP) is a standard that enables AI models to securely connect with external data sources and tools. Instead of being limited to training data, models can access live information, APIs, and services.

### What You'll Learn

1. **Limitations of LLMs without external tools** - See how Claude responds to weather queries without access to real data
2. **MCP Integration** - Connect Claude to a historical weather API
3. **Data Visualization** - Create aesthetically pleasing charts comparing weather patterns
4. **Interactive Analysis** - Ask Claude to analyze and compare weather data between locations

### Goals

By the end of this notebook, you'll have a working system that can:
- Fetch historical weather data for locations
- Compare annual weather statistics between cities
- Generate visualizations and insights about weather patterns
- Demonstrate the power of extending LLMs with external data sources

## Setup

### Environment Setup

This notebook is designed to work in both Google Colab and local Jupyter environments. The following cell detects whether it is running in Colab and, if so, installs the necessary dependencies.

In [1]:
import sys

# Check if we're running in Google Colab
IN_COLAB = 'google.colab' in sys.modules
print(f"Running in {'Google Colab' if IN_COLAB else 'local Jupyter environment'}")

# Install required packages
if IN_COLAB:
    # Define packages that match pyproject.toml dependencies
    packages = [
        "altair>=5.5.0",
        "anthropic>=0.66.0",
        "ipykernel>=6.30.1",
        "mcp>=1.13.1",
        "numpy>=2.2.6",
        "openmeteo-requests>=1.7.2",
        "pandas>=2.3.2",
        "requests>=2.32.5",
        "requests-cache>=1.2.1",
        "retry-requests>=2.0.0",
    ]
    
    print("Installing required packages...")
    for package in packages:
        print(f"Installing {package}...")
        !pip install {package}
    
    print("✓ All packages installed successfully!")

Running in local Jupyter environment


### Python Package Imports

Now let's import all the necessary packages for our weather comparison system.

In [2]:
from datetime import datetime
import os
import json
from pprint import pprint

import getpass

# Data handling and validation
from pydantic import BaseModel

# Visualization
import altair as alt

# For displaying rich notebook content
from IPython.display import Markdown, display

# Claude API
import anthropic

import requests_cache  # for caching API responses
from retry_requests import retry  # for retrying API requests
import openmeteo_requests  # for making API requests to Open-Meteo

import pandas as pd  # for data manipulation

print("All packages imported successfully!")

All packages imported successfully!


### Anthropic API Key Setup

We need to securely obtain your Anthropic API key to interact with Claude. The helper function below retrieves a user's Anthropic API key. If the API key is not stored as an environment variable or a Colab secret, it prompts the user to enter their key in a safe manner that doesn't store the key in the notebook.

In [3]:
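def get_api_key():
    api_key = None
    try:
        if IN_COLAB:
            # Import package for accessing user data
            from google.colab import userdata
            api_key = userdata.get('ANTHROPIC_API_KEY')
        else:
            # os.environ.get() returns None (it does not raise) if the variable is unset
            api_key = os.environ.get("ANTHROPIC_API_KEY")
    except Exception:
        # Treat a missing or inaccessible Colab secret the same as an unset variable
        api_key = None
    if not api_key:
        # Prompt user for their API key without echoing it or storing it in the notebook
        api_key = getpass.getpass("Enter your Anthropic API key: ")
    return api_key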

## Logging Flag

Let's also define a flag variable for controlling how verbose the output is.

In [4]:
verbose_output = False

## Motivation: Do we actually need to extend LLMs?

Before diving into MCP integration, let's first see what happens when we ask Claude to compare weather data between two locations **without giving it access to any external tools or data sources**.

This will demonstrate the fundamental limitation of LLMs: they can only work with information from their training data, which has a knowledge cutoff and may not include specific, current, or detailed data.

In [5]:
def ask_claude_without_tools(question: str) -> str:
    """
    Send a question to Claude without any external tools or data access.
    
    Args:
        question: The question to ask Claude
    
    Returns:
        Claude's response as a string
    """
    try:
        client = anthropic.Anthropic(api_key=get_api_key())
        message = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1000,
            messages=[
                {
                    "role": "user", 
                    "content": question
                }
            ]
        )
        return message.content[0].text
    except Exception as e:
        return f"Error: {e}"

# Test Claude's response without external data
weather_question = """
Please compare the annual weather statistics for the years 2000-2023
for San Francisco, Redwood City, Seattle, and Austin. 

I'd like to see:
1. Days where the maximum temperature is between 18 and 24 degrees Celsius
2. Days where the precipitation exceeds 10 mm
3. Days where the sun duration exceeded 6*60*60 seconds
4. Days where the humidity is between 30 and 60%

Please provide specific data and create a comparison showing which city had more favorable weather conditions.
"""

print("🤖 Asking Claude about weather data WITHOUT external tools...")
print("="*70)
print(f"Question: {weather_question}")
print("="*70)
print("Claude's Response:")
print()

response = ask_claude_without_tools(weather_question)
display(Markdown(response))

🤖 Asking Claude about weather data WITHOUT external tools...
Question: 
Please compare the annual weather statistics for the years 2000-2023
for San Francisco, Redwood City, Seattle, and Austin. 

I'd like to see:
1. Days where the maximum temperature is between 18 and 24 degrees Celsius
2. Days where the precipitation exceeds 10 mm
3. Days where the sun duration exceeded 6*60*60 seconds
4. Days where the humidity is between 30 and 60%

Please provide specific data and create a comparison showing which city had more favorable weather conditions.

Claude's Response:



I don't have access to real-time weather databases or the ability to query historical weather data for the specific time period and cities you've requested. To get accurate annual weather statistics for San Francisco, Redwood City, Seattle, and Austin from 2000-2023, you would need to access weather data from sources like:

**Recommended Data Sources:**
- National Weather Service (weather.gov)
- NOAA Climate Data Online
- Weather Underground historical data
- OpenWeatherMap API
- Visual Crossing Weather API
- Meteostat

**However, I can provide some general insights about these cities' climates:**

**San Francisco:**
- Mediterranean climate with mild temperatures year-round
- Limited temperature variation (typically 10-20°C)
- Low precipitation outside winter months
- Often foggy, affecting sun duration
- Moderate to high humidity due to ocean influence

**Redwood City:**
- Similar to San Francisco but slightly warmer and less foggy
- More sunny days than San Francisco
- Mediterranean climate with dry summers

**Seattle:**
- Oceanic climate with mild, wet winters and warm, dry summers
- Frequent precipitation, especially October-May
- Limited sunny days during winter months
- Moderate humidity levels

**Austin:**
- Humid subtropical climate
- Hot summers (often exceeding 24°C)
- More extreme temperature variations
- Thunderstorms common, especially in spring
- Higher humidity levels

**To get the specific data you need:**

1. **Temperature 18-24°C:** San Francisco and Redwood City would likely have the most days in this range
2. **Precipitation >10mm:** Seattle would likely lead this category
3. **Sun duration >6 hours:** Austin and Redwood City would likely have the most
4. **Humidity 30-60%:** This would vary seasonally across all cities

Would you like me to help you structure a data collection approach or suggest specific APIs/tools to gather this historical weather data?

<div class="alert alert-block alert-info">
<b>Key Observations:</b> Without tools, the LLM returns only general climate context, but we can do better than that!
</div>

As you can see from Claude's response above, without access to external data sources, the model has several limitations:

1. **No real-time data**: Claude can't access current or specific historical weather data
2. **General knowledge only**: Responses are based on general patterns from training data
3. **No specific metrics**: Can't provide exact precipitation amounts, temperatures, or day counts
4. **No visualizations**: Can't create charts or graphs from actual data

Custom tools and the **Model Context Protocol (MCP)** address these gaps by connecting the model's reasoning capabilities to real-world data.

## Analysis

In this section we will create some Python functions that:
- retrieve daily data from the Open-Meteo Weather API for a list of locations
- aggregate daily data to annual summaries
- create a time-series chart, plotting the annual data and a trend

We start by defining a simple Location class that contains the latitude and longitude coordinates.

In [6]:
class Location(BaseModel):
    name: str
    longitude: float
    latitude: float

## Get weather data

Next we configure access to the [Open-Meteo API](https://open-meteo.com/), including a cache and retry mechanism to avoid using the API unnecessarily.

In [7]:
api_url = "https://archive-api.open-meteo.com/v1/archive"

weather_variables = [
    "temperature_2m_max",
    "temperature_2m_mean",
    "temperature_2m_min",
    "relative_humidity_2m_max",
    "relative_humidity_2m_mean",
    "relative_humidity_2m_min",
    "rain_sum",
    "precipitation_hours",
    "sunshine_duration",
  ]

# Setup an Open-Meteo API client with a cache and retry mechanism
cache_session = requests_cache.CachedSession('.cache', expire_after=3600)
retry_session = retry(cache_session, retries=5, backoff_factor=0.2)
openmeteo = openmeteo_requests.Client(session=retry_session)

<div class="alert alert-block alert-info">
<b>Note:</b> The Open-Meteo API offers free access for non-commercial use. If you want to use the API commercially (or just want to support the project), consider <a href="https://open-meteo.com/en/pricing">purchasing a commercial-use license</a>.
</div>
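
To give a feel for what `retries=5` and `backoff_factor=0.2` imply, the sketch below computes wait times under an assumed doubling schedule (`backoff_factor * 2 ** attempt`). The helper `backoff_delays` is hypothetical and not part of `retry_requests`. The real schedule is decided by the underlying retry library and may differ between versions, so treat these numbers as illustrative only.

```python
def backoff_delays(retries: int, backoff_factor: float) -> list[float]:
    """Illustrative exponential backoff: the wait doubles after every failed attempt.

    Assumption: delay = backoff_factor * 2 ** attempt, with attempt starting at 0.
    """
    return [backoff_factor * 2 ** attempt for attempt in range(retries)]


delays = backoff_delays(retries=5, backoff_factor=0.2)
total_wait = sum(delays)
print(delays, total_wait)
```

Under this assumption, the total wait across all five retries stays within a few seconds, which is short enough for interactive notebook use.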

Next we define a function to retrieve weather data from one or more locations, returning the data as a pandas dataframe.

In [8]:
def get_weather_data(
      locations: list[Location],
      start_date: str,
      end_date: str,
      variables: list[str]
    ):
    """
    Get weather data for one or more locations.
    
    Args:
        locations: List of Location objects
        start_date: Start date in YYYY-MM-DD format
        end_date: End date in YYYY-MM-DD format
        variables: List of weather variables to retrieve
    
    Returns:
        Pandas DataFrame with weather data
    """
    
    def parse_response(variables, response, location_name):
        daily = response.Daily()
        daily_data_dict = {
            "date": pd.date_range(
            start = pd.to_datetime(daily.Time(), unit = "s", utc = True),
            end = pd.to_datetime(daily.TimeEnd(), unit = "s", utc = True),
            freq = pd.Timedelta(seconds = daily.Interval()),
            inclusive = "left"
            )
        }

        # Add the variable data.
        for i, variable in enumerate(variables):
            daily_data_dict[variable] = daily.Variables(i).ValuesAsNumpy()

        # Add a column for the location name.
        daily_data_dict["location_name"] = location_name

        return pd.DataFrame(daily_data_dict)

  
    params = {
        "latitude": [x.latitude for x in locations],
        "longitude": [x.longitude for x in locations],
        "start_date": start_date,
        "end_date": end_date,
        "daily": variables,
    }

    # Query for weather data and get one response per location.
    responses = openmeteo.weather_api(api_url, params=params)

    # Concatenate all of the responses into a single dataframe
    daily_df_list = [parse_response(variables, response, locations[i].name) for i, response in enumerate(responses)]
    daily_df = pd.concat(daily_df_list, axis=0)

    return daily_df
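
The date column in `parse_response` is rebuilt from three numbers in the response: a start time, an end time, and a step in seconds. The sketch below repeats that `pd.date_range` call on made-up values (three daily steps starting at the Unix epoch) to show why `inclusive="left"` is used: the end timestamp marks the end of the last interval, so it is dropped.

```python
import pandas as pd

# Made-up stand-ins for daily.Time(), daily.TimeEnd(), and daily.Interval()
start_s = 0              # first timestamp, in seconds since the Unix epoch
end_s = 3 * 86_400       # end of the last interval (exclusive)
interval_s = 86_400      # one day per step

dates = pd.date_range(
    start=pd.to_datetime(start_s, unit="s", utc=True),
    end=pd.to_datetime(end_s, unit="s", utc=True),
    freq=pd.Timedelta(seconds=interval_s),
    inclusive="left",    # keep the start, drop the end timestamp
)
print(dates)
```

With these inputs the index holds one timestamp per day, which is the same length as the per-variable value arrays for a three-day request.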

### Test it out

In [9]:
test_locations = [
  Location(
    name='San Francisco',
    latitude=37.7749,
    longitude=-122.4194,
  ),
  Location(
    name='Redwood City',
    latitude=37.4848,
    longitude=-122.2281,
  )
]

# Fetch 20 years of daily data for the two test locations
daily_data = get_weather_data(
  locations=test_locations,
  start_date="2000-01-01",
  end_date="2019-12-31",
  variables=['temperature_2m_mean', 'temperature_2m_max', 'rain_sum', 'sunshine_duration']
)
daily_data

| | date | temperature_2m_mean | temperature_2m_max | rain_sum | sunshine_duration | location_name |
|---|---|---|---|---|---|---|
| 0 | 2000-01-01 00:00:00+00:00 | 7.517750 | 11.574000 | 0.0 | 29323.564453 | San Francisco |
| 1 | 2000-01-02 00:00:00+00:00 | 8.274000 | 12.074000 | 0.0 | 30698.628906 | San Francisco |
| 2 | 2000-01-03 00:00:00+00:00 | 7.753168 | 12.524000 | 0.0 | 30748.666016 | San Francisco |
| 3 | 2000-01-04 00:00:00+00:00 | 8.513582 | 11.974000 | 0.6 | 7515.774902 | San Francisco |
| 4 | 2000-01-05 00:00:00+00:00 | 10.132333 | 15.374000 | 0.0 | 25200.000000 | San Francisco |
| ... | ... | ... | ... | ... | ... | ... |
| 7300 | 2019-12-27 00:00:00+00:00 | 9.049335 | 12.978499 | 0.0 | 30588.695312 | Redwood City |
| 7301 | 2019-12-28 00:00:00+00:00 | 8.543084 | 13.328500 | 0.0 | 30485.902344 | Redwood City |
| 7302 | 2019-12-29 00:00:00+00:00 | 10.351417 | 13.728499 | 3.6 | 12621.796875 | Redwood City |
| 7303 | 2019-12-30 00:00:00+00:00 | 10.066000 | 13.928500 | 5.9 | 25244.691406 | Redwood City |


<div class="alert alert-block alert-info">
<b>Note:</b> Sometimes the free Open-Meteo API gets overloaded and times out.
If this happens, take it as a suggestion to stand up and stretch for a few minutes. 
</div>

## Calculate Annual Statistics

To enable better long-term comparisons, we can write a function that calculates annual statistics of how many times a variable exceeds a minimum or maximum threshold.

Examples:
- Days that the maximum temperature exceeds 30 degrees C
- Days that the mean temperature is between 20 and 25 degrees C
- Days that rain exceeds 2 mm 
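
Before looking at the full function, the core idea can be sketched on a toy table: mark each day that falls inside the inclusive range, then sum the marks per calendar year. The values below are made up for illustration, and `Series.between` is used here as a compact alternative to building the mask by hand.

```python
import pandas as pd

# Toy daily data: three days in 2000 and two days in 2001 (made-up values)
toy = pd.DataFrame({
    "date": pd.to_datetime(
        ["2000-01-01", "2000-06-01", "2000-07-01", "2001-06-01", "2001-07-01"]
    ),
    "temperature_2m_max": [12.0, 21.5, 31.0, 20.0, 25.0],
})

# Inclusive range check: 20 <= value <= 25
in_range = toy["temperature_2m_max"].between(20, 25)

# Sum the boolean marks per calendar year to get a day count
days_per_year = in_range.groupby(toy["date"].dt.year).sum()
print(days_per_year)
```

Because both bounds are inclusive, the 20.0 and 25.0 readings in 2001 both count, while 12.0 and 31.0 in 2000 do not.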

In [10]:
def calculate_annual_stats(
    daily_data: pd.DataFrame,
    variable: str,
    threshold_min: float = None,
    threshold_max: float = None
    ) -> pd.DataFrame:
  """
  Calculate annual statistics for the given daily data.
  
  Args:
    daily_data: DataFrame containing weather data
    variable: Name of the variable column to analyze
    threshold_min: Optional minimum threshold (inclusive)
    threshold_max: Optional maximum threshold (inclusive)
  
  Returns:
    DataFrame with columns: year, count, location_name
  """
  # Validate threshold parameters
  if threshold_min is None and threshold_max is None:
    raise ValueError("At least one of threshold_min or threshold_max must be provided")
  
  # Get unique location names from the daily data
  locations = daily_data['location_name'].unique()
  
  # Initialize list to store results
  results = []
  
  # Process each location
  for location in locations:
    # Filter data for this location and make an explicit copy to avoid SettingWithCopyWarning
    location_data = daily_data[daily_data['location_name'] == location].copy()
    
    # Extract year from date column
    location_data['year'] = pd.to_datetime(location_data['date']).dt.year
    
    # Apply threshold filters
    def count_days_in_range(x):
      mask = pd.Series(True, index=x.index)  # Start with all True
      
      if threshold_min is not None:
        mask = mask & (x >= threshold_min)
      
      if threshold_max is not None:
        mask = mask & (x <= threshold_max)
      
      return mask.sum()
    
    yearly_counts = location_data.groupby('year')[variable].apply(count_days_in_range)
    
    # Convert to dataframe
    yearly_df = pd.DataFrame({
      'year': yearly_counts.index,
      'count': yearly_counts.values,
      'location_name': location
    })
    
    results.append(yearly_df)
    
  # Combine all results
  return pd.concat(results, axis=0).reset_index(drop=True)

### Test it out

In [11]:
annual_stats = calculate_annual_stats(
    daily_data,
    variable='temperature_2m_max',
    threshold_min=20,
    threshold_max=25
)
annual_stats.head()

| | year | count | location_name |
|---|---|---|---|
| 0 | 2000 | 88 | San Francisco |
| 1 | 2001 | 101 | San Francisco |
| 2 | 2002 | 86 | San Francisco |
| 3 | 2003 | 106 | San Francisco |
| 4 | 2004 | 112 | San Francisco |


## Create a Timeseries Chart

Interpreting a long table of numbers can be pretty challenging for humans, so let's create a function for charting the data.

In [12]:
def create_annual_stats_chart(
    annual_stats: pd.DataFrame,
    title: str
  ) -> alt.Chart:
  """
  Create a chart showing the annual statistics.
  """
  # Base chart with data points
  base = alt.Chart(annual_stats).encode(
    x='year:O',
    y='count:Q',
    color='location_name:N'
  )

  # Create line chart
  lines = base.mark_line()

  # Add trend lines
  trend_lines = base.transform_regression(
    'year', 'count', 
    groupby=['location_name']
  ).mark_line(
    strokeDash=[5,5]
  ).encode(
    color='location_name:N'
  )

  # Combine the line and trend lines
  return (lines + trend_lines).properties(
    title=title
  )
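
The dashed trend lines come from Altair's `transform_regression`, which by default fits a straight line per group. As a rough illustration of what that fit computes, the sketch below applies ordinary least squares in plain Python to the five San Francisco rows shown in the earlier test output. The helper `linear_trend` is hypothetical and only mirrors the idea; Altair performs its own fit when the chart renders.

```python
def linear_trend(years, counts):
    """Ordinary least-squares fit of counts = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# First five San Francisco rows from the annual statistics test output
years = [2000, 2001, 2002, 2003, 2004]
counts = [88, 101, 86, 106, 112]

slope, intercept = linear_trend(years, counts)
print(f"Trend: {slope:+.1f} days per year")
```

A positive slope means the number of qualifying days is rising over the period. Five points are far too few to draw conclusions, which is why the notebook charts the full date range.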

### Test it out

In [13]:
create_annual_stats_chart(
    annual_stats,
    title='Days with Maximum Temperature between 20 and 25 degrees C'
)

Now that we have functions to retrieve, analyze, and chart weather statistics, the following section exposes that functionality to Claude.

# MCP

We will use the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) approach to connect Claude to the tools we built.

We start off by defining a Python class that acts as an MCP client. This client has the ability to:
- Register tools
- Execute tools
- Process tool results
- Chat with Claude (using the tools)
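
The register and execute steps boil down to a name-to-function lookup. The sketch below shows that pattern in isolation, with no API calls. The `add_numbers` tool is hypothetical and exists only for this illustration.

```python
# Registry mapping tool names to the Python functions that implement them
tool_functions = {}


def register_tool(name, function):
    """Store a function under the name the model will use to call it."""
    tool_functions[name] = function


def execute_tool(name, tool_input):
    """Look up a tool by name and call it with the model-supplied arguments."""
    if name not in tool_functions:
        return f"Tool {name} not found"
    return tool_functions[name](**tool_input)


def add_numbers(a: float, b: float) -> str:
    """Hypothetical tool, used only for this illustration."""
    return str(a + b)


register_tool("add_numbers", add_numbers)
result = execute_tool("add_numbers", {"a": 2, "b": 3})
missing = execute_tool("unknown_tool", {})
print(result, "|", missing)
```

Returning a message for unknown tools, rather than raising, lets the conversation continue so the model can react to the failure.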

In [14]:
class MCPClient:
    """A client for interacting with Claude using the Model Context Protocol (MCP).
    
    This class handles registering tools, executing them, and managing conversations with 
    Claude using the MCP format. It processes both text and image outputs from tools
    and formats them appropriately for Claude's consumption.

    Attributes:
        client: The Anthropic client instance
        tools: List of registered MCP tools
        tool_functions: Dictionary mapping tool names to their implementation functions
    """
    def __init__(self, api_key):
        self.client = anthropic.Anthropic(api_key=api_key)
        self.tools = []
        self.tool_functions = {}
    
    def register_tool(self, name, description, input_schema, function):
        """Register an MCP tool"""
        tool = {
            "name": name,
            "description": description,
            "input_schema": input_schema
        }
        self.tools.append(tool)
        self.tool_functions[name] = function
    
    def execute_tool(self, tool_name, tool_input):
        """Execute a registered tool"""
        if tool_name in self.tool_functions:
            return self.tool_functions[tool_name](**tool_input)
        else:
            return f"Tool {tool_name} not found"
    
    def _process_mcp_tool_result(self, tool_result_json: str):
        """Process MCP tool result and return content for Claude"""
        try:
            result = json.loads(tool_result_json)
            if "error" in result:
                return [{"type": "text", "text": f"Error: {result['error']}"}]
            
            content = []
            
            # Add text content
            if "text" in result:
                content.append({
                    "type": "text",
                    "text": result["text"]
                })
            
            # Add location as a text block ("location" is not an accepted content type)
            if "location" in result:
                content.append({
                    "type": "text",
                    "text": f"Location: {result['location']}"
                })

            # Add image content 
            if "chart_json" in result:
                content.append({
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "application/vnd.vegalite.v5+json",
                        "data": result["chart_json"]
                    }
                })
            return content
        
        except json.JSONDecodeError:
            # If not JSON, treat as plain text
            return [{"type": "text", "text": tool_result_json}]

    def chat_with_tools(self,
                        user_message,
                        model="claude-sonnet-4-20250514",
                        max_iterations=10):
        """Send messages and handle tool calls automatically"""
        
        # Initialize the messages list
        messages = [{"role": "user", "content": user_message}]

        # Track images generated by tools throughout the conversation
        tool_images = []

        for iteration in range(max_iterations):
            if verbose_output:
                display(Markdown(f"`chat_with_tools()`: Iteration {iteration + 1} --------------------------"))
            try:
                if verbose_output:
                    display(Markdown(f"`chat_with_tools()`: messages ="))
                    pprint(messages)
                # Make request to Claude with tools
                response = self.client.messages.create(
                    model=model,
                    max_tokens=4096,
                    tools=self.tools,
                    messages=messages
                )
                if verbose_output:
                    display(Markdown(f"`chat_with_tools()`: response.content ="))
                    for block in response.content:
                        pprint(block)
                
                # Add assistant's response to conversation
                messages.append({"role": "assistant", "content": response.content})

                if verbose_output:
                    display(Markdown(f'`chat_with_tools()`: {response.stop_reason = }'))
                if response.stop_reason == "tool_use":
                    # Process all tool uses via MCP
                    tool_results = []
                    
                    for block in response.content:
                        if block.type == "tool_use":

                            mcp_result = self.execute_tool(block.name, block.input)
                            content = self._process_mcp_tool_result(mcp_result)

                            # Extract and store any images from tool results
                            image_free_content = []
                            for content_item in content:
                                if content_item['type'] == 'image':
                                    tool_images.append(content_item)
                                else:
                                    image_free_content.append(content_item)

                            tool_results.append({
                                "type": "tool_result",
                                "tool_use_id": block.id,
                                "content": image_free_content
                            })
                    
                    # Add tool results to conversation
                    messages.append({"role": "user", "content": tool_results})
                    
                    # Continue conversation loop
                    continue
                    
                else:
                    # Final response - display both text and any tool-generated images
                    
                    # Create a custom response object that includes tool images
                    class ResponseWithImages:
                        def __init__(self, original_response, tool_images):
                            self.content = []
                            
                            # Add original text content
                            for block in original_response.content:
                                self.content.append(block)
                            
                            # Add tool-generated images as additional content
                            for image in tool_images:
                                # Convert tool image format to Claude response format
                                image_block = type('ImageBlock', (), {
                                    'type': 'image',
                                    'source': type('Source', (), {
                                        'data': image['source']['data'],
                                        'media_type': image['source']['media_type']
                                    })()
                                })()
                                self.content.append(image_block)
                    
                    # Return enhanced response with images
                    return ResponseWithImages(response, tool_images)
            
            except Exception as e:
                print(f"Error in iteration {iteration + 1}: {e}")
                return None
        
        print(f"\nReached maximum iterations ({max_iterations})")
        return None
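
The key step in the loop is how a tool's output is handed back to Claude: as a `user` message containing a `tool_result` block whose `tool_use_id` matches the `id` of the `tool_use` block it answers. The sketch below builds that message as plain dictionaries, without calling the API. The helper name and the `toolu_example` id are hypothetical placeholders.

```python
def build_tool_result_message(tool_use_id: str, text: str) -> dict:
    """Wrap a tool's text output in the message shape used for tool results."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,  # must match the id of the tool_use block
                "content": [{"type": "text", "text": text}],
            }
        ],
    }


# Conversation so far: the user's question, then one tool result.
# (The assistant turn carrying the tool_use block is omitted for brevity.)
messages = [{"role": "user", "content": "Compare rainfall in two cities."}]
messages.append(build_tool_result_message("toolu_example", "City A: 20 days/year"))
print(messages[-1])
```

In the full client, the assistant message carrying the `tool_use` block is appended before this one, so the two always appear as a pair.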

Next we define an MCP tool that compares annual weather statistics at one or more locations.

In [15]:
def compare_locations_mcp(
    locations: list,
    variable: str,
    threshold_min: float = None,
    threshold_max: float = None,
    start_year: int = None,
    end_year: int = None
) -> str:
    """
    Compare the annual weather statistics for one or more locations.
    
    Args:
        locations: List of locations to analyze and compare
        variable: Weather variable to compare
        threshold_min: Optional minimum threshold (inclusive)
        threshold_max: Optional maximum threshold (inclusive)
        start_year: Start year (default: 50 years before end_year)
        end_year: End year (default: previous year)
        
    Returns:
        JSON string containing weather comparison text and chart data
    """
    try:
        # Validate input
        if not locations or len(locations) == 0:
            return json.dumps({"error": "At least one location must be provided"})
        
        # Convert dictionaries to Location objects
        location_objects = []
        for loc in locations:
            if isinstance(loc, dict):
                location_objects.append(Location(
                    name=loc["name"],
                    latitude=loc["latitude"],
                    longitude=loc["longitude"]
                ))
            else:
                location_objects.append(loc)

        # Set default year range if not provided
        if end_year is None:
            end_year = datetime.now().year - 1  # Default to previous year
        if start_year is None:
            start_year = end_year - 50  # Default to 50 years before end_year
        
        # Fetch weather data for all locations
        weather_data = get_weather_data(
            locations=location_objects,
            start_date=f"{start_year}-01-01",
            end_date=f"{end_year}-12-31",
            variables=[variable]
        )
                
        # Calculate annual statistics if thresholds were provided
        if threshold_min is not None or threshold_max is not None:
            annual_stats = calculate_annual_stats(
                daily_data=weather_data,
                variable=variable,
                threshold_min=threshold_min,
                threshold_max=threshold_max
            )
            
            # Create chart
            threshold_desc = ""
            if threshold_min is not None and threshold_max is not None:
                threshold_desc = f"Days of {variable} between {threshold_min} and {threshold_max}"
            elif threshold_min is not None:
                threshold_desc = f"Days of {variable} above {threshold_min}"
            elif threshold_max is not None:
                threshold_desc = f"Days of {variable} below {threshold_max}"
            
            chart = create_annual_stats_chart(annual_stats, threshold_desc)
        else:
            # Just show raw data comparison without thresholds
            annual_stats = None
            chart = None
        
        # Generate comparison text for all locations
        comparison_text = f"🌍 Location Analysis: {', '.join([loc.name for loc in location_objects])}\n\n"
        comparison_text += f"📊 {variable.replace('_', ' ').title()} Analysis:\n\n"
        
        # Statistics for each location
        location_stats = []
        for loc in location_objects:
            loc_data = weather_data[weather_data['location_name'] == loc.name]
            if not loc_data.empty:
                avg_val = loc_data[variable].mean()
                min_val = loc_data[variable].min()
                max_val = loc_data[variable].max()
                
                location_stats.append({
                    'name': loc.name,
                    'avg': avg_val,
                    'min': min_val,
                    'max': max_val,
                    'coordinates': f"{loc.latitude}, {loc.longitude}"
                })
                
                comparison_text += f"🏙️ **{loc.name}**\n"
                comparison_text += f"• Average: {avg_val:.1f}\n"
                comparison_text += f"• Range: {min_val:.1f} to {max_val:.1f}\n" 
                comparison_text += f"• Coordinates: {loc.latitude}, {loc.longitude}\n\n"

        # Add threshold-based analysis if applicable
        if annual_stats is not None:
            comparison_text += "🎯 **Threshold Analysis:**\n"
            
            # Calculate average days per location
            for loc in location_objects:
                loc_stats = annual_stats[annual_stats['location_name'] == loc.name]
                if not loc_stats.empty:
                    avg_days = loc_stats['count'].mean()
                    comparison_text += f"• {loc.name}: {avg_days:.0f} days per year on average\n"
            
            comparison_text += f"• Criteria: {threshold_desc}\n\n"
        
        # Add summary comparison if more than one location
        if len(location_objects) > 1:
            comparison_text += "🏆 **Summary:**\n"
            
            # Find best and worst for average values
            if location_stats:
                best_avg = max(location_stats, key=lambda x: x['avg'])
                worst_avg = min(location_stats, key=lambda x: x['avg'])
                
                comparison_text += f"• Highest average {variable.replace('_', ' ')}: {best_avg['name']} ({best_avg['avg']:.1f})\n"
                comparison_text += f"• Lowest average {variable.replace('_', ' ')}: {worst_avg['name']} ({worst_avg['avg']:.1f})\n"
                
                # Add threshold analysis summary if available
                if annual_stats is not None:
                    threshold_stats = []
                    for loc in location_objects:
                        loc_stats = annual_stats[annual_stats['location_name'] == loc.name]
                        if not loc_stats.empty:
                            avg_days = loc_stats['count'].mean()
                            threshold_stats.append({'name': loc.name, 'avg_days': avg_days})
                    
                    if threshold_stats:
                        best_threshold = max(threshold_stats, key=lambda x: x['avg_days'])
                        comparison_text += f"• Most days meeting criteria: {best_threshold['name']} ({best_threshold['avg_days']:.0f} days/year)\n"

        result = {"text": comparison_text}
        if chart is not None:
            result["chart_json"] = chart.to_dict()
            
        return json.dumps(result)
        
    except Exception as e:
        return json.dumps({"error": f"Location analysis failed: {str(e)}"})
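
The threshold analysis above reports the mean of a per-year day count. As a rough, dependency-free sketch of that idea (not the notebook's pandas implementation), the hypothetical helper `average_days_in_range` below counts the days in each year whose value falls inside the inclusive thresholds and then averages across years. Counting years with zero matching days toward the average is an assumption made here; the notebook's own aggregation may treat them differently.

```python
from collections import defaultdict
from statistics import mean


def average_days_in_range(daily_values, threshold_min=None, threshold_max=None):
    """Average number of days per year whose value lies within the thresholds.

    daily_values: iterable of (year, value) pairs, one pair per day.
    Both thresholds are inclusive; None means no bound on that side.
    """
    days_per_year = defaultdict(int)
    for year, value in daily_values:
        days_per_year[year] += 0  # register the year even if no day matches
        if threshold_min is not None and value < threshold_min:
            continue
        if threshold_max is not None and value > threshold_max:
            continue
        days_per_year[year] += 1
    return mean(days_per_year.values()) if days_per_year else 0.0


# Hypothetical daily maximum temperatures in degrees Celsius (illustration only)
sample = [(2000, 17.9), (2000, 18.0), (2000, 24.0), (2001, 25.1), (2001, 20.5)]
print(average_days_in_range(sample, threshold_min=18, threshold_max=24))  # 1.5
```

In the sample, 2000 has two matching days and 2001 has one, so the average is 1.5 days per year.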

## Test it out

Start by creating an instance of the MCP client.

In [16]:
mcp_client = MCPClient(get_api_key())

Next, register our tool with the MCP client.

In [17]:
mcp_client.register_tool(
    name="compare_locations_mcp",
    description="Analyze and compare annual weather statistics for one or more cities with optional threshold analysis and visualization.",
    input_schema={
            "type": "object",
            "properties": {
                "locations": {
                    "type": "array",
                    "description": "List of locations to analyze and compare",
                    "items": {
                        "type": "object",
                        "properties": {
                            "name": {"type": "string"},
                            "latitude": {"type": "number"},
                            "longitude": {"type": "number"}
                        },
                        "required": ["name", "latitude", "longitude"]
                    },
                    "minItems": 1
                },
                "variable": {
                    "type": "string",
                    "description": "Weather variable to compare",
                    "enum": weather_variables,
                },
                "threshold_min": {
                    "type": "number",
                    "description": "Optional minimum threshold for counting days (inclusive)"
                },
                "threshold_max": {
                    "type": "number", 
                    "description": "Optional maximum threshold for counting days (inclusive)"
                },
                "start_year": {
                    "type": "integer",
                    "description": "Start year (default: 50 years before end_year)"
                },
                "end_year": {
                    "type": "integer",
                    "description": "End year (default: previous year)"
                }
            },
            "required": ["locations"]
        },
    function=compare_locations_mcp
)
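
The schema describes defaults for `start_year` and `end_year` only in prose. A minimal sketch of how those defaults could be resolved is shown below; `resolve_year_range` is a hypothetical helper, and the check that `start_year` does not exceed `end_year` is an added assumption rather than something the schema states.

```python
from datetime import date


def resolve_year_range(start_year=None, end_year=None, today=None):
    """Fill in the year defaults described in the tool schema (hypothetical helper)."""
    today = today or date.today()
    if end_year is None:
        end_year = today.year - 1   # schema default: previous year
    if start_year is None:
        start_year = end_year - 50  # schema default: 50 years before end_year
    if start_year > end_year:
        raise ValueError(f"start_year ({start_year}) is after end_year ({end_year})")
    return start_year, end_year


print(resolve_year_range(today=date(2025, 6, 1)))          # (1974, 2024)
print(resolve_year_range(start_year=2000, end_year=2019))  # (2000, 2019)
```

Passing `today` explicitly keeps the function deterministic, which makes it easier to test.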

We can verify that the MCP tool is configured properly by executing it directly.

In [18]:
tool_results_str = mcp_client.execute_tool(
    tool_name="compare_locations_mcp",
    tool_input={
        "locations": [
            {
                "name": "San Francisco",
                "latitude": 37.7749,
                "longitude": -122.4194
            },
            {
                "name": "Seattle",
                "latitude": 47.6062,
                "longitude": -122.3321
            }
        ],
        "variable": "temperature_2m_max",
        "threshold_min": 18,
        "threshold_max": 24,
        "start_year": 2000,
        "end_year": 2019
    }
)

tool_results = json.loads(tool_results_str)

# Display the tool results
display(Markdown(tool_results['text']))
if 'chart_json' in tool_results:
    display(alt.Chart.from_dict(tool_results['chart_json']))

🌍 Location Analysis: San Francisco, Seattle

📊 Temperature 2M Max Analysis:

🏙️ **San Francisco**
• Average: 18.3
• Range: 6.9 to 38.4
• Coordinates: 37.7749, -122.4194

🏙️ **Seattle**
• Average: 15.2
• Range: -4.6 to 37.5
• Coordinates: 47.6062, -122.3321

🎯 **Threshold Analysis:**
• San Francisco: 145 days per year on average
• Seattle: 80 days per year on average
• Criteria: Days of temperature_2m_max between 18 and 24

🏆 **Summary:**
• Highest average temperature 2m max: San Francisco (18.3)
• Lowest average temperature 2m max: Seattle (15.2)
• Most days meeting criteria: San Francisco (145 days/year)
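
Note that `compare_locations_mcp` returns a JSON object with an `error` key when it fails, in which case indexing `tool_results['text']` would raise a `KeyError`. A small defensive wrapper, sketched here under the assumption that the payload only ever contains the `text`, `chart_json`, and `error` keys seen in the function above, might look like this (`parse_tool_result` is a hypothetical name):

```python
import json


def parse_tool_result(result_str):
    """Split a tool's JSON string into (text, chart_json); raise on an error payload."""
    result = json.loads(result_str)
    if "error" in result:
        raise RuntimeError(result["error"])
    return result.get("text", ""), result.get("chart_json")


# Stand-in payloads shaped like the tool's two return paths (illustration only)
ok_payload = json.dumps({"text": "example analysis text"})
failed_payload = json.dumps({"error": "Location analysis failed: timeout"})

text, chart_json = parse_tool_result(ok_payload)
print(text, chart_json)

try:
    parse_tool_result(failed_payload)
except RuntimeError as exc:
    print(f"Tool error: {exc}")
```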


## Helper function

We need one more helper function. It converts the content blocks in the response returned by the MCP client (produced by Claude) into representations that the Jupyter notebook can display.

In [19]:
def display_claude_response(response):
    """Display a Claude response: text as Markdown, Vega-Lite image blocks as Altair charts."""
    if response:
        for content in response.content:
            if content.type == "text":
                display(Markdown(content.text))
            elif content.type == "image":
                if content.source.media_type == "application/vnd.vegalite.v5+json":
                    # Recreate the chart from JSON dictionary
                    altair_chart = alt.Chart.from_dict(content.source.data)
                    display(altair_chart)
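
When debugging, it can help to see which content blocks a response contains without rendering them. The sketch below mirrors the dispatch logic of `display_claude_response` but returns labels instead of calling `display`. The function `describe_response_blocks` and the `SimpleNamespace` stand-in response are hypothetical; a real response would come from `mcp_client.chat_with_tools`.

```python
from types import SimpleNamespace

VEGA_LITE_MEDIA_TYPE = "application/vnd.vegalite.v5+json"


def describe_response_blocks(response):
    """Return one short label per content block, without rendering anything."""
    labels = []
    for content in getattr(response, "content", None) or []:
        if content.type == "text":
            labels.append(f"text ({len(content.text)} chars)")
        elif content.type == "image" and content.source.media_type == VEGA_LITE_MEDIA_TYPE:
            labels.append("vega-lite chart")
        else:
            labels.append(f"unhandled: {content.type}")
    return labels


# Stand-in response object built only for illustration
fake_response = SimpleNamespace(content=[
    SimpleNamespace(type="text", text="Hello"),
    SimpleNamespace(
        type="image",
        source=SimpleNamespace(media_type=VEGA_LITE_MEDIA_TYPE, data={}),
    ),
    SimpleNamespace(type="tool_use"),
])
print(describe_response_blocks(fake_response))
# ['text (5 chars)', 'vega-lite chart', 'unhandled: tool_use']
```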


# Querying with the MCP Client

Let's review the weather question we tried asking earlier (without using tools).

In [20]:
print(weather_question)


Please compare the annual weather statistics for the years 2000-2023
for San Francisco, Redwood City, Seattle, and Austin. 

I'd like to see:
1. Days where the maximum temperature is between 18 and 24 degrees Celsius
2. Days where the precipitation exceeds 10 mm
3. Days where the sun duration exceeded 6*60*60 seconds
4. Days where the humidity is between 30 and 60%

Please provide specific data and create a comparison showing which city had more favorable weather conditions.



<div class="alert alert-block alert-success">
<b>Tip:</b> Update this question to ask for locations that are relevant to you!<br/>
</div>
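
Each of the four criteria in the question corresponds to one call of the tool with a different `variable` and threshold pair. The sketch below shows one way the inputs could be assembled. Only `temperature_2m_max` appears in this notebook's example call; the other variable names are assumptions and would need to match entries in `weather_variables`. Because the schema defines the thresholds as inclusive, "exceeds 10 mm" is only approximated by `threshold_min=10`. `build_tool_inputs` is a hypothetical helper.

```python
SECONDS_PER_HOUR = 60 * 60

# Variable names other than temperature_2m_max are assumptions (see lead-in).
CRITERIA = [
    {"variable": "temperature_2m_max", "threshold_min": 18, "threshold_max": 24},
    {"variable": "precipitation_sum", "threshold_min": 10},
    {"variable": "sunshine_duration", "threshold_min": 6 * SECONDS_PER_HOUR},
    {"variable": "relative_humidity_2m_mean", "threshold_min": 30, "threshold_max": 60},
]


def build_tool_inputs(locations, start_year, end_year, criteria=CRITERIA):
    """Return one tool_input dict per criterion, sharing locations and years."""
    return [
        {"locations": locations, "start_year": start_year, "end_year": end_year, **criterion}
        for criterion in criteria
    ]


locations = [
    {"name": "San Francisco", "latitude": 37.7749, "longitude": -122.4194},
    {"name": "Seattle", "latitude": 47.6062, "longitude": -122.3321},
]
tool_inputs = build_tool_inputs(locations, start_year=2000, end_year=2023)
for tool_input in tool_inputs:
    print(tool_input["variable"], tool_input.get("threshold_min"), tool_input.get("threshold_max"))
```

The sunshine criterion of 6 hours becomes 21600 seconds, matching the `6*60*60` expression in the question.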

Let's ask Claude the question again, this time using the tool we created.

In [21]:
response = mcp_client.chat_with_tools(weather_question)

This time the result is far more detailed and includes time series charts!

In [22]:
display_claude_response(response)

Based on the data I was able to retrieve, here's what I can tell you about the weather comparison for 2000-2023:

## Weather Statistics Summary (2000-2023)

### 1. Days with Maximum Temperature 18-24°C (Pleasant Temperature Range)
- **San Francisco: 138 days/year** (Best) ⭐
- **Redwood City: 130 days/year** (2nd)
- **Seattle: 80 days/year** (3rd)
- **Austin: 73 days/year** (4th)

### 2. Days with Rain > 10mm (Heavy Rain Days)
- **Seattle: 38 days/year** (Most rainy)
- **Austin: 30 days/year**
- **San Francisco: 21 days/year**
- **Redwood City: 17 days/year** (Least rainy) ⭐

### Temperature Context:
- **Austin** has the highest average max temp (26.4°C) but fewer pleasant days due to excessive heat
- **Seattle** has the lowest average max temp (15.2°C), limiting pleasant temperature days
- **San Francisco** (18.0°C avg) and **Redwood City** (19.9°C avg) have more moderate climates

## Preliminary Weather Favorability Ranking:

Based on the available data:

1. **San Francisco** - Most days with pleasant temperatures, moderate rainfall
2. **Redwood City** - Second most pleasant temperature days, least rainfall
3. **Seattle** - Cooler with more frequent heavy rain
4. **Austin** - Too hot most of the year, moderate rainfall

Unfortunately, I wasn't able to retrieve the sunshine duration and humidity data due to API rate limits. For a complete analysis including those metrics (days with >6 hours sunshine and humidity 30-60%), we would need to wait and retry those queries. The sunshine and humidity data would provide important additional context for determining the most favorable weather conditions.

Would you like me to try retrieving the remaining data (sunshine duration and humidity) after a brief wait?
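
The response above reports that two of the four queries failed because of API rate limits. One common mitigation is to retry failed requests with exponential backoff. The following is a minimal sketch, not part of the notebook's `MCPClient`: `call_with_backoff` and `flaky_fetch` are hypothetical names, and catching every `Exception` is a simplification; real code would retry only on rate-limit errors. Since `compare_locations_mcp` converts exceptions into an error payload, a wrapper like this would need to sit around the underlying data request rather than around the tool itself.

```python
import time


def call_with_backoff(func, *args, max_attempts=4, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call func, retrying with exponentially growing delays when it raises."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...


# Hypothetical fetch that fails twice before succeeding
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("rate limit exceeded")
    return "weather data"


delays = []  # record the delays instead of actually sleeping
print(call_with_backoff(flaky_fetch, sleep=delays.append))  # weather data
print(delays)  # [1.0, 2.0]
```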

# Where to next?

There are many directions you could go from here. Here are a few ideas:
 
- Implement additional weather statistics / climate indices. For inspiration, see the [ETCCDI Climate Change Indices](https://etccdi.pacificclimate.org/list_27_indices.shtml) compiled by [CLIVAR](https://www.clivar.org/).
- Add functionality to analyze weather forecast data. Unlike the Historical Weather API used in this notebook, the [Open-Meteo Weather Forecast API](https://open-meteo.com/en/docs) provides forecasts up to 16 days into the future.
- Use this same approach to provide Claude access to other large datasets, such as genomic sequence data or satellite imagery!

Happy data exploring!