# Function Calling with OpenAI's API: From Execution to Data Extraction

This notebook explores function calling with OpenAI's API, covering both explicit function execution and structured data extraction. It demonstrates how to use function calling for:
- ✅ Executing real-world tools (e.g., fetching weather data)
- ✅ Extracting structured data from unstructured text (e.g., parsing LinkedIn profiles)

We cover three approaches to structured extraction:
1. Traditional function schemas
2. JSON mode with `response_format`
3. Beta parsing with `pydantic`

The notebook includes examples of tool definition, execution, and message flow in the OpenAI Chat Completions API.

## Part 1: Explicit Function Execution
The notebook begins with a weather API example that demonstrates a crucial aspect of function calling: explicit execution. The model never runs code itself; it only returns a function name and JSON arguments. When you use function calling to perform real actions (like getting weather data), you need to:
1. Define the function structure for the model
2. Let the model decide when to call it
3. **Explicitly execute** the function yourself with the model's parameters
4. Feed the results back to the model

This weather example shows how to:
- Define a real function (`get_weather`)
- Let the model choose to call it with coordinates
- Actually execute the weather API call
- Return the results to the model for final response formatting

This pattern is essential when using function calling for real actions rather than just data structuring.



In [4]:
!pip install -q openai requests pydantic

In [None]:
import os
os.environ['OPENAI_API_KEY'] = 'XXX'

In [6]:
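import requests

def get_weather(latitude, longitude):
    """Return the current temperature in °C for the given coordinates (Open-Meteo API)."""
    response = requests.get(
        f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}"
        "&current=temperature_2m,wind_speed_10m",  # hourly fields dropped: they were never used
        timeout=10,
    )
    response.raise_for_status()  # raise on HTTP errors instead of failing later on a missing key
    data = response.json()
    return data['current']['temperature_2m']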

In [8]:
from openai import OpenAI
import json

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for provided coordinates in celsius.",
        "parameters": {
            "type": "object",
            "properties": {
                "latitude": {"type": "number"},
                "longitude": {"type": "number"}
            },
            "required": ["latitude", "longitude"],
            "additionalProperties": False
        },
        "strict": True
    }
}]

messages = [{"role": "user", "content": "What's the weather like in Sydney today?"}]

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
)

In [9]:
completion.choices[0].message.tool_calls

[ChatCompletionMessageFunctionToolCall(id='call_O65R7EQQxpDqyBj9vuzOXnSR', function=Function(arguments='{"latitude":-33.865143,"longitude":151.2099}', name='get_weather'), type='function')]

In [10]:
# The model only proposes the call; parse its JSON arguments and run the function ourselves
tool_call = completion.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)

result = get_weather(args["latitude"], args["longitude"])

In [11]:
messages.append(completion.choices[0].message)  # append model's function call message
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": str(result)  # Convert to string
})

completion_2 = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
)

In [12]:
completion_2.choices[0].message.content

'The current temperature in Sydney is 21°C.'
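
The cells above handle a single tool call by hand. A response can contain several tool calls, or none at all, so a small dispatcher is often convenient. The following is a minimal sketch and not part of the original notebook: `TOOL_REGISTRY`, `run_tool_calls`, and `fake_get_weather` are hypothetical names, and the stub replaces the real HTTP request so the block runs offline. It only assumes each tool call exposes `.id`, `.function.name`, and `.function.arguments`, as in the output printed earlier.

```python
import json
from types import SimpleNamespace


def fake_get_weather(latitude, longitude):
    # Hypothetical stand-in for get_weather so the sketch runs without network access.
    return 21.0


# Hypothetical registry mapping tool names (as declared in `tools`) to Python callables.
TOOL_REGISTRY = {"get_weather": fake_get_weather}


def run_tool_calls(tool_calls, registry):
    """Execute each requested tool and build the matching "tool" role messages."""
    tool_messages = []
    for call in tool_calls or []:  # tool_calls is None when the model answers directly
        func = registry.get(call.function.name)
        if func is None:
            content = f"Error: unknown tool {call.function.name!r}"
        else:
            try:
                args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
                content = str(func(**args))
            except (json.JSONDecodeError, TypeError) as exc:
                content = f"Error: {exc}"
        tool_messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": content}
        )
    return tool_messages


# Stub shaped like the tool call printed earlier, used here instead of a live API response.
stub_call = SimpleNamespace(
    id="call_123",
    function=SimpleNamespace(
        name="get_weather",
        arguments='{"latitude": -33.865143, "longitude": 151.2099}',
    ),
)

tool_messages = run_tool_calls([stub_call], TOOL_REGISTRY)
print(tool_messages)
```

In the notebook's flow, you would append the model's message to `messages`, extend `messages` with the list returned by `run_tool_calls`, and then call the API again for the final answer.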

## Part 2: Structured Data Extraction

While the weather example showed function calling for actual execution, these next examples demonstrate three ways to get structured data out of the model without executing anything. The first uses function calling directly; the other two use the `response_format` parameter to reach a similar result with different trade-offs.

### 1. Traditional Function Calling
The original and most flexible approach, where we:
- Define a function schema that describes our desired structure
- Have the model provide arguments matching that schema
- Get structured data without actually executing any function

Best for:
- Complex data structures
- When you need fine-grained control
- Multi-step extraction processes
- Production environments needing robust validation

### 2. JSON Mode Response Format
A lighter-weight alternative to function calling that:
- Removes the function definition overhead
- Guarantees the response is valid JSON, but does not enforce a particular schema
- Uses the simpler `response_format={"type": "json_object"}` parameter
- Requires the prompt itself to mention JSON and name the expected keys

Perfect for:
- Simple data extraction tasks
- When you want structured output with less setup
- Rapid prototyping
- Straightforward schema requirements
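
Because JSON mode guarantees syntactically valid JSON but not a particular shape, it can help to check the parsed result before using it. The following is a minimal sketch using only the standard library; `EXPECTED_FIELDS` and `check_profile` are hypothetical helpers, and `sample_content` is a made-up string standing in for `response.choices[0].message.content`.

```python
import json

# Hypothetical description of the expected keys and their Python types.
EXPECTED_FIELDS = {
    "name": str,
    "current_position": str,
    "location": str,
    "previous_positions": list,
    "skills": list,
}


def check_profile(raw_content):
    """Parse a JSON-mode response and return (profile, list_of_problems)."""
    profile = json.loads(raw_content)
    problems = []
    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in profile:
            problems.append(f"missing key: {key}")
        elif not isinstance(profile[key], expected_type):
            problems.append(f"wrong type for {key}: expected {expected_type.__name__}")
    return profile, problems


# Made-up response in which "skills" came back as a string rather than a list.
sample_content = json.dumps({
    "name": "John Doe",
    "current_position": "Software Engineer at TechCorp Inc.",
    "location": "San Francisco, California",
    "previous_positions": ["Backend Developer at CodeWorks LLC"],
    "skills": "Python, Java",
})

profile, problems = check_profile(sample_content)
print(problems)
```

An empty `problems` list means the response matched the expected keys and types; otherwise you might retry the request or fall back to one of the schema-enforcing approaches.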

### 3. Beta Parsing with Pydantic
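A beta SDK helper (`client.beta.chat.completions.parse`) that:
- Derives a JSON schema from a Pydantic model and passes it as `response_format`
- Provides automatic type checking of the response
- Offers a more Pythonic interface
- Returns a validated model instance, simplifying the extraction-to-object pipeline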

Ideal for:
- Python-native development
- When you want automatic type validation
- Projects already using Pydantic
- Quick prototypes needing type safety
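
The type checking mentioned above can be tried without an API call. The following is a minimal sketch, assuming Pydantic v2 (it relies on `model_validate_json`); the two JSON strings are made-up stand-ins for model output, and `LinkedInProfile` mirrors the fields used elsewhere in this notebook.

```python
from pydantic import BaseModel, ValidationError


class LinkedInProfile(BaseModel):
    name: str
    current_position: str
    location: str
    previous_positions: list[str]
    skills: list[str]


# Made-up payloads: the second one has "skills" as a string instead of a list.
good_json = (
    '{"name": "John Doe", "current_position": "Software Engineer at TechCorp Inc.", '
    '"location": "San Francisco, California", '
    '"previous_positions": ["Backend Developer at CodeWorks LLC"], '
    '"skills": ["Python", "Java"]}'
)
bad_json = (
    '{"name": "John Doe", "current_position": "Software Engineer at TechCorp Inc.", '
    '"location": "San Francisco, California", '
    '"previous_positions": [], "skills": "Python"}'
)

# Valid input is parsed straight into a typed object.
profile = LinkedInProfile.model_validate_json(good_json)
print(profile.skills)

# Invalid input raises ValidationError; collect the names of the offending fields.
try:
    LinkedInProfile.model_validate_json(bad_json)
    failed_fields = []
except ValidationError as exc:
    failed_fields = [error["loc"][0] for error in exc.errors()]
print(failed_fields)
```

The same model class can also be used to validate the arguments returned by a traditional function call, which keeps the validation logic in one place.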

## Key Considerations

Remember:
- Traditional function calling offers the most control but requires more setup
- JSON mode simplifies things but does not enforce a schema, so the keys and types must be requested in the prompt and checked by your own code
- Beta features like Pydantic parsing might change in future versions
- Choose based on your needs for validation, type safety, and development speed

In [13]:
from openai import OpenAI
import json

client = OpenAI()

# Unstructured text input
unstructured_text = """
John Doe is a Software Engineer at TechCorp Inc., based in San Francisco, California. 
Previously, he worked as a Backend Developer at CodeWorks LLC and as a Junior Developer at Webify Solutions. 
He has skills in Python, Java, Docker, Kubernetes, and Cloud Architecture.
"""

# Define tools for function calling
tools = [
    {
        "type": "function",
        "function": {
            "name": "extract_linkedin_profile",
            "description": "Extract structured information from a LinkedIn profile",
            "parameters": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "current_position": {"type": "string"},
                    "location": {"type": "string"},
                    "previous_positions": {"type": "array", "items": {"type": "string"}},
                    "skills": {"type": "array", "items": {"type": "string"}}
                },
                "required": ["name", "current_position", "location", "previous_positions", "skills"]
            }
        }
    }
]

# Extract structured data using function calling
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract structured information from the LinkedIn profile text"},
        {"role": "user", "content": unstructured_text}
    ],
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "extract_linkedin_profile"}}
)

# Extract the function call arguments
function_call = response.choices[0].message.tool_calls[0]
linkedin_profile = json.loads(function_call.function.arguments)

# Print the extracted profile
print(json.dumps(linkedin_profile, indent=2))

{
  "name": "John Doe",
  "current_position": "Software Engineer at TechCorp Inc.",
  "location": "San Francisco, California",
  "previous_positions": [
    "Backend Developer at CodeWorks LLC",
    "Junior Developer at Webify Solutions"
  ],
  "skills": [
    "Python",
    "Java",
    "Docker",
    "Kubernetes",
    "Cloud Architecture"
  ]
}


In [14]:
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

class LinkedInProfile(BaseModel):
    name: str
    current_position: str
    location: str
    previous_positions: list[str]
    skills: list[str]

# Unstructured text input
unstructured_text = """
John Doe is a Software Engineer at TechCorp Inc., based in San Francisco, California. 
Previously, he worked as a Backend Developer at CodeWorks LLC and as a Junior Developer at Webify Solutions. 
He has skills in Python, Java, Docker, Kubernetes, and Cloud Architecture.
"""

# Extract structured data using Pydantic parsing
completion = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are an expert at extracting structured data from a LinkedIn profile."},
        {"role": "user", "content": unstructured_text}
    ],
    response_format=LinkedInProfile,
)

# Access the parsed profile
linkedin_profile = completion.choices[0].message.parsed

# Print the extracted profile
print(linkedin_profile.model_dump_json(indent=2))

{
  "name": "John Doe",
  "current_position": "Software Engineer at TechCorp Inc.",
  "location": "San Francisco, California",
  "previous_positions": [
    "Backend Developer at CodeWorks LLC",
    "Junior Developer at Webify Solutions"
  ],
  "skills": [
    "Python",
    "Java",
    "Docker",
    "Kubernetes",
    "Cloud Architecture"
  ]
}


In [15]:
from openai import OpenAI
import json

client = OpenAI()

# Unstructured text input
unstructured_text = """
John Doe is a Software Engineer at TechCorp Inc., based in San Francisco, California. 
Previously, he worked as a Backend Developer at CodeWorks LLC and as a Junior Developer at Webify Solutions. 
He has skills in Python, Java, Docker, Kubernetes, and Cloud Architecture.
"""

# Extract structured data using JSON mode
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract the LinkedIn profile information in a structured JSON format with these keys: name, current_position, location, previous_positions, skills"},
        {"role": "user", "content": unstructured_text}
    ],
    response_format={"type": "json_object"}
)

# Parse the JSON response
linkedin_profile = json.loads(response.choices[0].message.content)

# Print the extracted profile
print(json.dumps(linkedin_profile, indent=2))

{
  "name": "John Doe",
  "current_position": "Software Engineer at TechCorp Inc.",
  "location": "San Francisco, California",
  "previous_positions": [
    "Backend Developer at CodeWorks LLC",
    "Junior Developer at Webify Solutions"
  ],
  "skills": [
    "Python",
    "Java",
    "Docker",
    "Kubernetes",
    "Cloud Architecture"
  ]
}
