<a href="https://colab.research.google.com/github/mvfolino68/llm-example/blob/main/event_extraction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📅 LLM Workshop - Event Extraction & Tool Use with AI Systems

This notebook demonstrates two powerful capabilities of Large Language Models (specifically OpenAI's gpt-4o-mini):
1. Extracting calendar event details from everyday conversational text
2. Using function calling to access external tools (like weather data)

These practical examples show how LLMs can be integrated into email assistants, chat applications, or productivity tools.

## 👋 Goals
- Understand prompt chaining for multi-step LLM workflows
- Learn how to use Pydantic for structured LLM outputs
- Build an event extraction system
- Explore how to make LLMs work as reliable components in larger applications
- Implement function calling to access external data sources


# Part 1: Event Extraction with Prompt Chaining

## 🔍 What is Prompt Chaining?
Prompt chaining decomposes a complex task into a sequence of simpler steps, where each LLM call processes the output of the previous one. This approach offers several advantages:

- Improved accuracy - Each step has a clearer, more focused goal
- Better control - You can add validation between steps
- Easier debugging - When issues occur, you can identify exactly which step failed

In our workflow today:
1. First, we determine IF text contains a calendar event
2. Then, we extract the detailed event information
3. Finally, we generate a natural-language confirmation
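
The three steps above can be sketched without any API calls. In this minimal sketch, `detect`, `extract`, and `confirm` are hypothetical stand-ins for the LLM calls defined later in the notebook (the keyword rule is a toy); only the control flow is the point.

```python
from typing import Optional

# Hypothetical stand-ins for LLM calls; each returns plain Python data.
def detect(text: str) -> bool:
    """Step 1: decide whether the text mentions an event (toy keyword rule)."""
    return any(word in text.lower() for word in ("meeting", "schedule", "call"))

def extract(text: str) -> dict:
    """Step 2: pull out details (here just a fixed name plus the raw text)."""
    return {"name": "Untitled event", "source_text": text}

def confirm(details: dict) -> str:
    """Step 3: turn structured details back into a sentence."""
    return f"Got it: '{details['name']}' has been noted."

def run_chain(text: str) -> Optional[str]:
    # Each step consumes the previous step's output; stop early if step 1 says no.
    if not detect(text):
        return None
    return confirm(extract(text))

print(run_chain("Let's schedule a sync tomorrow"))
print(run_chain("The report looks good"))
```

Because each step is an ordinary function, you can test or replace one link without touching the others.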

Let's get started by setting up our environment!

![Workflow diagram](https://www.mermaidchart.com/raw/adf23c0b-dba5-4961-b27b-5ea65592e815?theme=light&version=v0.1&format=svg)

## 🔧 Setup and Installation
Add your OpenAI API key to Colab Secrets (the 🔑 icon in the left sidebar).

Name the secret `OPENAI_API_KEY` and make it available to this notebook.

We'll share a 1Password link containing the OpenAI API key.
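
If you want to run the notebook outside Colab, one option is to fall back to an environment variable. This is a sketch, not part of the workshop setup: `load_api_key` is a hypothetical helper, and it assumes that `google.colab` is importable only inside Colab.

```python
import os

def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Read the key from Colab Secrets if available, else from the environment."""
    try:
        # Only importable inside Colab; elsewhere this raises ImportError.
        from google.colab import userdata
        key = userdata.get(name)  # may raise its own error if the secret is missing
    except ImportError:
        key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found under the name {name!r}")
    return key
```

You could then write `api_key = load_api_key()` in place of the direct `userdata.get(...)` call.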

In [None]:
!pip install openai==1.66.3

In [None]:
# Setup: Import libraries and initialize client
from typing import Optional
from datetime import datetime
from pydantic import BaseModel, Field
from openai import OpenAI
from google.colab import userdata



# You'll need an OpenAI API key for this.
# Add it to the notebook secrets on the left and name the secret `OPENAI_API_KEY`.
api_key = userdata.get('OPENAI_API_KEY')

# Initialize the OpenAI client
client = OpenAI(api_key=api_key)
model = "gpt-4o-mini"
print("✅ Client initialized successfully!")


## 📊 Step 1: Data Models with Pydantic
A key concept in building reliable AI applications is structured outputs. Instead of parsing free-form text from the LLM, we can have it generate data in precise formats.

Pydantic helps us define data models with type validation. When combined with OpenAI's structured output feature, it ensures the LLM generates responses that exactly match our expected schema.
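
To see what type validation means in practice, here is a small local sketch with no API call. `ScoredLabel` is a hypothetical model used only for illustration. The `Field` bounds are checked by Pydantic on your side; whether the API accepts such constraints in a schema is not assumed here.

```python
from pydantic import BaseModel, Field, ValidationError

class ScoredLabel(BaseModel):  # hypothetical model, not part of the notebook
    label: str
    confidence: float = Field(ge=0.0, le=1.0)  # reject scores outside [0, 1]

# Well-formed JSON that matches the model parses into a typed object.
good = ScoredLabel.model_validate_json('{"label": "event", "confidence": 0.92}')
print(good.confidence)

# An out-of-range score is rejected instead of silently passing through.
try:
    ScoredLabel.model_validate_json('{"label": "event", "confidence": 1.7}')
    rejected = False
except ValidationError as err:
    rejected = True
    print(f"Rejected with {err.error_count()} validation error(s)")
```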

Let's define three models for our event extraction workflow:


In [None]:
# Step 1: Define data models
class EventExtraction(BaseModel):
    description: str            # Cleaned version of the input text
    is_calendar_event: bool     # Binary classification: is this an event?
    confidence_score: float     # How confident is the model (0.0-1.0)

class EventDetails(BaseModel):
    name: str                   # Event title/name
    date: str                   # ISO 8601 formatted date with time
    duration_minutes: int       # How long the event lasts
    participants: list[str]     # Who is attending

class EventConfirmation(BaseModel):
    confirmation_message: str             # Human-friendly confirmation
    calendar_link: Optional[str] = None   # Optional calendar link

print("✅ Data models defined - these will ensure our LLM outputs follow a consistent format.")


## 🕵️ Step 2: Event Detection
The first step in our prompt chain is to determine whether a given text contains a calendar event. This acts as a "filter" to avoid wasting compute time on non-event texts.

Note how we use the `parse` method with our `EventExtraction` model to get structured output rather than free text. The SDK derives a JSON schema from the Pydantic model and parses the response back into an `EventExtraction` instance, so downstream code works with typed fields instead of raw strings.
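
To get a feel for the schema behind a Pydantic model, you can inspect it locally. The sketch below repeats the `EventExtraction` fields so it runs on its own; treat the printed schema as an approximation, since the SDK may adjust it before sending it to the API.

```python
from pydantic import BaseModel

class EventExtraction(BaseModel):
    description: str
    is_calendar_event: bool
    confidence_score: float

# Pydantic can emit a JSON Schema dict describing the model's fields and types.
schema = EventExtraction.model_json_schema()
print(sorted(schema["properties"]))
print(schema["properties"]["is_calendar_event"]["type"])
```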

In [None]:
# Step 2: Extract event info - The first link in our prompt chain
def extract_event_info(user_input: str) -> EventExtraction:
    # Include current date for context (helps with relative dates like "next Tuesday")
    today = datetime.now().strftime("%A, %B %d, %Y")

    # Call the OpenAI API using our structured format
    response = client.responses.parse(
        model=model,
        input=[
            {"role": "system", "content": (
                f"Today is {today}. Analyze the user message and determine if it "
                "contains a calendar event request. Extract relevant details and "
                "provide a confidence score between 0 and 1."
            )},
            {"role": "user", "content": user_input},
        ],
        text_format=EventExtraction,  # This tells the API to format output as our model
    )
    return response.output[0].content[0].parsed

# Test with a sample input
input_text = "Let's schedule a 1h team meeting next Tuesday at 2pm with Alice and Bob."
result = extract_event_info(input_text)
result


## 📝 Step 3: Detail Extraction
Now that we've confirmed the text contains a calendar event, we'll extract specific details like the event name, date, duration, and participants.

This is the second link in our prompt chain - it takes the description from the previous step and extracts structured information. By breaking this out as a separate step, we give the model a more focused task.
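
Because `date` arrives as a string, it is worth checking that it really parses before using it. A minimal sketch, assuming the model returns a value such as `2025-03-18T14:00:00`; `event_window` is a hypothetical helper, and the sample date is made up.

```python
from datetime import datetime, timedelta

def event_window(date_iso: str, duration_minutes: int) -> tuple[datetime, datetime]:
    """Return (start, end) for an event; raises ValueError on a malformed date."""
    start = datetime.fromisoformat(date_iso)
    end = start + timedelta(minutes=duration_minutes)
    return start, end

start, end = event_window("2025-03-18T14:00:00", 60)
print(start.isoformat(), "->", end.isoformat())
```

A `ValueError` here is a useful signal that the extraction step produced something other than an ISO 8601 date, which you can catch and handle between steps.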

In [None]:
# Step 3: Parse event details
def parse_event_details(description: str) -> EventDetails:
    today = datetime.now().strftime("%A, %B %d, %Y")
    response = client.responses.parse(
        model=model,
        input=[
            {"role": "system", "content": f"Today is {today}. Extract event details."},
            {"role": "user", "content": description},
        ],
        text_format=EventDetails,
    )
    return response.output[0].content[0].parsed

# Use previous output
details = parse_event_details(result.description)
details

## 📨 Step 4: Confirmation Generation
The final step in our chain is to generate a natural-language confirmation message. This demonstrates how we can convert structured data back into human-friendly text.

This approach is powerful because:
1. We maintain structured data throughout our workflow (for database storage, API responses, etc.)
2. We can still provide a conversational, personalized experience to users
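
For comparison, the structured-to-text direction can also be done with a plain template and no LLM call. This is a swapped-in baseline, not what the notebook does; `template_confirmation` is a hypothetical name and the sample values are made up.

```python
def template_confirmation(name: str, date: str, duration_minutes: int,
                          participants: list[str]) -> str:
    """Build a fixed-format confirmation from structured event fields."""
    who = ", ".join(participants) if participants else "no other attendees"
    return f"'{name}' is set for {date} ({duration_minutes} min) with {who}."

msg = template_confirmation("Team meeting", "2025-03-18T14:00:00", 60, ["Alice", "Bob"])
print(msg)
```

The template is predictable but stiff; the LLM version trades some predictability for a more natural tone.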


In [None]:
# Step 4: Generate confirmation - The third link in our prompt chain
def generate_confirmation(event_details: EventDetails) -> EventConfirmation:
    response = client.responses.parse(
        model=model,
        input=[
            {"role": "system", "content": (
                "You are a helpful personal assistant named Ro. Generate a friendly, "
                "concise confirmation message based on the event details provided. "
                "Include all important information in a natural way. Sign off with 'Ro'."
            )},
            {"role": "user", "content": str(event_details.model_dump())},
        ],
        text_format=EventConfirmation,
    )
    return response.output[0].content[0].parsed

# Use previous output as input to this function
confirmation = generate_confirmation(details)
print(confirmation.confirmation_message)


## 🔄 Complete Workflow
Now let's connect all three steps into a single workflow. This demonstrates the complete prompt chain:

1. Extract - Determine if text contains an event
2. Parse - Extract structured details from the text
3. Generate - Create a human-friendly confirmation

Notice how we include a validation step after the first function call. This is a "gate" that prevents low-confidence or non-event inputs from proceeding, saving compute and improving reliability.
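
The gate itself is ordinary Python, so it can be tested without calling the model. In this sketch, `Detection` is a hypothetical stand-in for `EventExtraction`, and the default threshold of 0.7 matches the value used in the workflow.

```python
from dataclasses import dataclass

@dataclass
class Detection:  # hypothetical stand-in for the EventExtraction model
    is_calendar_event: bool
    confidence_score: float

def passes_gate(d: Detection, threshold: float = 0.7) -> bool:
    """Proceed only if the input is flagged as an event with enough confidence."""
    return d.is_calendar_event and d.confidence_score >= threshold

print(passes_gate(Detection(True, 0.9)))   # confident event
print(passes_gate(Detection(True, 0.5)))   # event, but low confidence
print(passes_gate(Detection(False, 0.99))) # confidently not an event
```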


In [None]:
# Full workflow - The complete prompt chain
def process_calendar_request(user_input: str) -> Optional[EventConfirmation]:
    # Step 1: Check if input contains a calendar event
    extraction = extract_event_info(user_input)

    # Validation gate: Only proceed if we're confident this is a calendar event
    if not extraction.is_calendar_event or extraction.confidence_score < 0.7:
        print("Not a calendar event or low confidence. Stopping workflow.")
        return None

    # Step 2: Extract detailed information
    details = parse_event_details(extraction.description)

    # Step 3: Generate user-friendly confirmation
    return generate_confirmation(details)

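# Test the full workflow with our example
input_text = "Let's schedule a 1h team meeting next Friday at 3 with Mike and Rudo."
result = process_calendar_request(input_text)
# The workflow returns None when the gate rejects the input, so check before printing
if result is not None:
    print(result.confirmation_message)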


# Part 2: Function Calling - Weather API Example

## 🔌 Function Calling with External Tools
Another powerful capability of modern LLMs is their ability to interact with external tools through function calling. This allows the model to:

1. Recognize when external data is needed
2. Format the appropriate function call
3. Incorporate the returned data into its response

In this example, we'll demonstrate how to use the OpenAI API to query a weather service.
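
On your side of the exchange, the model only gives you a function name and a JSON string of arguments; your code decides what to run. A minimal sketch of that dispatch step, assuming a hypothetical `TOOL_REGISTRY` and a stub `fake_weather` that returns a fixed, made-up value instead of calling an API:

```python
import json
from typing import Any, Callable

def fake_weather(latitude: float, longitude: float) -> float:
    """Hypothetical stub that returns a fixed value instead of calling an API."""
    return 21.5

# Map the tool names the model may request to real Python callables.
TOOL_REGISTRY: dict[str, Callable[..., Any]] = {"get_weather": fake_weather}

def dispatch(name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a string result."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"Unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return str(TOOL_REGISTRY[name](**kwargs))

print(dispatch("get_weather", '{"latitude": 42.89, "longitude": -78.88}'))
```

Looking names up in a registry means the model can only trigger functions you have explicitly listed.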

In [None]:
# Import necessary libraries
import requests
import json

# Define a function to get weather data
def get_weather(latitude, longitude):
    """
    Fetches current weather data for the given coordinates.

    Args:
        latitude (float): The latitude coordinate
        longitude (float): The longitude coordinate

    Returns:
        float: Current temperature in Celsius
    """
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&current=temperature_2m,wind_speed_10m&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m"
    response = requests.get(url, timeout=10)  # avoid hanging on a stalled request
    response.raise_for_status()  # fail loudly on HTTP errors
    data = response.json()
    return data['current']['temperature_2m']

# Define the tool/function that the model can use
tools = [{
    "type": "function",
    "name": "get_weather",
    "description": "Get current temperature for provided coordinates in celsius.",
    "parameters": {
        "type": "object",
        "properties": {
            "latitude": {"type": "number"},
            "longitude": {"type": "number"}
        },
        "required": ["latitude", "longitude"],
        "additionalProperties": False
    },
    "strict": True
}]

# Step 1: User asks about the weather
input_messages = [{"role": "user", "content": "What's the weather like in Buffalo today converted to fahrenheit?"}]

# Step 2: Model recognizes need for external data and calls the function
response = client.responses.create(
    model=model,
    input=input_messages,
    tools=tools,
)

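# Pick out the function call (not guaranteed to be the first output item), then print it
tool_call = next(item for item in response.output if item.type == "function_call")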
print(f"Function called: {tool_call.name}")
print(f"With arguments: {tool_call.arguments}")

# Step 3: We execute the function with the provided arguments
args = json.loads(tool_call.arguments)
result = get_weather(args["latitude"], args["longitude"])
print(f"Function returned: {result}°C")

# Step 4: We send the function result back to the model
input_messages.append(tool_call)  # append model's function call message
input_messages.append({
    "type": "function_call_output",
    "call_id": tool_call.call_id,
    "output": str(result)
})

# Step 5: Model incorporates the external data into its response
response_2 = client.responses.create(
    model=model,
    input=input_messages,
    tools=tools,
)
print("\nFinal response:")
print(response_2.output_text)

# Try with other cities!
# You can modify the user query to ask about weather in different locations:
# "What's the weather in Tokyo right now?"
# "How warm is it in Sydney today?"
# "Tell me the temperature in Cairo, Egypt"

# 🧪 Try Your Own Examples!
Now it's your turn to experiment with both demos:

### For Event Extraction:
- Basic: Try different ways of phrasing calendar events
- Intermediate: Test with ambiguous dates or unusual time formats
- Advanced: Try inputs that mix event details with other content
- Expert: Modify the models to include additional fields (location, priority, etc.)
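
As a starting point for the Expert exercise, here is one way the details model might be extended. `EventDetailsExtended`, `location`, and `priority` are hypothetical additions, and the sketch only exercises the model locally; it has not been checked against the API's structured output rules.

```python
from typing import Literal, Optional
from pydantic import BaseModel

class EventDetailsExtended(BaseModel):  # hypothetical extension of EventDetails
    name: str
    date: str
    duration_minutes: int
    participants: list[str]
    location: Optional[str] = None            # may be absent from the text
    priority: Literal["low", "normal", "high"]  # restrict to a fixed set of values

event = EventDetailsExtended(
    name="Team meeting",
    date="2025-03-18T14:00:00",
    duration_minutes=60,
    participants=["Alice", "Bob"],
    priority="high",
)
print(event.model_dump())
```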

### For Function Calling:
- Try asking about weather in different cities
- Modify the function to return additional weather data (humidity, wind speed)
- Create a new tool function that does something else entirely!
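
For the second exercise, the notebook's request URL already asks for `wind_speed_10m` under `current`, so the response should carry it next to the temperature. The sketch below works on a sample payload with made-up values rather than a live response; `summarize_current` is a hypothetical helper. Humidity is left out because the URL only requests it under `hourly`.

```python
def summarize_current(payload: dict) -> dict:
    """Pull temperature and wind speed out of a forecast payload's 'current' block."""
    current = payload["current"]
    celsius = current["temperature_2m"]
    return {
        "temperature_c": celsius,
        "temperature_f": round(celsius * 9 / 5 + 32, 1),  # Celsius to Fahrenheit
        "wind_speed": current.get("wind_speed_10m"),      # None if not requested
    }

sample = {"current": {"temperature_2m": 10.0, "wind_speed_10m": 12.3}}  # made-up values
print(summarize_current(sample))
```

If you return a dict like this from the tool, remember to update the tool's `description` so the model knows what it gets back.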

Remember that the quality of the input prompt greatly affects the output. This is a great opportunity to practice prompt engineering.

## Helpful Resources
- [OpenAI Python library (GitHub)](https://github.com/openai/openai-python)
- [OpenAI Responses API Docs](https://platform.openai.com/docs/api-reference/responses)
- [OpenAI Function Calling Documentation](https://platform.openai.com/docs/guides/function-calling)

### Potential Ideas

- Patient Data Analytics
- Healthcare Chatbots
- Device and Drug Comparative Effectiveness
- Medical Imaging Insights
- Assisted or Automated Diagnosis & Prescription
- Early Diagnosis
- Process automation (this could apply in many places)
- Patient Safety and quality
- Auditing of patient interactions