<a href="https://colab.research.google.com/github/GiX007/agent-labs/blob/main/03_langchain/08_function_calling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# OpenAI Function Calling In LangChain

## Setup

In [None]:
import os
import openai

from dotenv import load_dotenv, find_dotenv
dotenv_path = find_dotenv() or '/content/OPENAI_API_KEY.env' # read local .env file
load_dotenv(dotenv_path)

openai_api_key = os.getenv('OPENAI_API_KEY')

import warnings
warnings.filterwarnings("ignore")

In [None]:
from typing import List
from pydantic import BaseModel, Field

## Pydantic Syntax


Pydantic models combine the concise, annotation-based declaration style of Python's data classes with runtime validation.

They offer a concise way to define data structures while ensuring that the data adheres to specified types and constraints.

This makes them ideal for reliably handling inputs and outputs in applications, especially when working with LLMs or APIs, as they catch errors early and simplify data serialization. They also integrate seamlessly with frameworks like LangChain, allowing structured data to flow safely through pipelines and tools.

In standard Python you would create a class like this:

In [None]:
class User:
    def __init__(self, name: str, age: int, email: str):
        self.name = name
        self.age = age
        self.email = email

In [None]:
foo = User(name="Joe", age=32, email="joe@gmail.com")

In [None]:
foo.name

'Joe'

In [None]:
# No error is raised: type hints are not enforced at runtime
foo = User(name="Joe", age="bar", email="joe@gmail.com")

In [None]:
foo.age

'bar'
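
Getting a plain class to reject `age="bar"` means writing the checks by hand. A minimal sketch, where `CheckedUser` is a hypothetical name introduced only for illustration:

```python
class CheckedUser:
    """Hypothetical plain-Python class that validates its own fields."""

    def __init__(self, name: str, age: int, email: str):
        # bool is a subclass of int, so it is rejected explicitly
        if not isinstance(age, int) or isinstance(age, bool):
            raise TypeError(f"age must be an int, got {type(age).__name__}")
        if not isinstance(name, str) or not isinstance(email, str):
            raise TypeError("name and email must be strings")
        self.name = name
        self.age = age
        self.email = email


try:
    CheckedUser(name="Joe", age="bar", email="joe@gmail.com")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Every new field needs another hand-written check, which is the boilerplate a validation library removes.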

With Pydantic's `BaseModel`, the same fields are validated when the object is created:

In [None]:
class pUser(BaseModel):
    name: str
    age: int
    email: str

In [None]:
foo_p = pUser(name="Jane", age=32, email="jane@gmail.com")

In [None]:
foo_p.name

'Jane'

<p style="background-color:#F5C780; padding:15px"><b>Note:</b> The next line is expected to fail.</p>

In [None]:
foo_p = pUser(name="Jane", age="bar", email="jane@gmail.com")

ValidationError: 1 validation error for pUser
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='bar', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/int_parsing
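
The error can also be caught and inspected in code. The sketch below assumes Pydantic v2 (the version the error URL above points to) and redefines `pUser` so that it runs on its own. It also shows that a numeric string such as `"32"` is coerced to an integer in Pydantic's default mode, so only values that cannot be converted are rejected.

```python
from pydantic import BaseModel, ValidationError


class pUser(BaseModel):
    name: str
    age: int
    email: str


# A numeric string is coerced to an int in the default (lax) mode.
coerced = pUser(name="Jane", age="32", email="jane@gmail.com")
print(coerced.age, type(coerced.age).__name__)

# A non-numeric string cannot be coerced, so validation fails.
try:
    pUser(name="Jane", age="bar", email="jane@gmail.com")
except ValidationError as exc:
    errors = exc.errors()  # one dict per failing field
    print(errors[0]["loc"], errors[0]["type"])
```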

In [None]:
class Class(BaseModel):
    students: List[pUser]

In [None]:
obj = Class(
    students=[pUser(name="Jane", age=32, email="jane@gmail.com")]
)

In [None]:
obj

Class(students=[pUser(name='Jane', age=32, email='jane@gmail.com')])
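
Nested models can also be built from plain dictionaries, which is the shape data usually has when it arrives from an API or an LLM. A minimal sketch, assuming Pydantic v2 (`model_validate` and `model_dump` are the v2 method names); the classes are repeated so the cell is self-contained:

```python
from typing import List

from pydantic import BaseModel


class pUser(BaseModel):
    name: str
    age: int
    email: str


class Class(BaseModel):
    students: List[pUser]


raw = {"students": [{"name": "Jane", "age": 32, "email": "jane@gmail.com"}]}

# Each inner dict is validated and converted into a pUser instance.
obj = Class.model_validate(raw)
print(type(obj.students[0]).__name__, obj.students[0].name)

# model_dump() converts the whole tree back into plain Python data.
round_trip = obj.model_dump()
print(round_trip == raw)
```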

## Pydantic to OpenAI function definition


In [None]:
# Pydantic class
class WeatherSearch(BaseModel):
    """Call this with an airport code to get the weather at that airport"""
    airport_code: str = Field(description="airport code to get weather for")

In [None]:
from langchain_core.utils.function_calling import convert_to_openai_function

In [None]:
# Convert the Pydantic model into an OpenAI-compatible function schema
weather_function = convert_to_openai_function(WeatherSearch)  # a dict with name, description, and JSON Schema parameters

In [None]:
weather_function

{'name': 'WeatherSearch',
 'description': 'Call this with an airport code to get the weather at that airport',
 'parameters': {'properties': {'airport_code': {'description': 'airport code to get weather for',
    'type': 'string'}},
  'required': ['airport_code'],
  'type': 'object'}}
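
The pieces of this schema line up with the JSON Schema that Pydantic itself generates for the model. The sketch below assumes Pydantic v2 (`model_json_schema` is the v2 method name) and only inspects Pydantic's own output; whatever post-processing `convert_to_openai_function` applies on top of it is not shown.

```python
from pydantic import BaseModel, Field


class WeatherSearch(BaseModel):
    """Call this with an airport code to get the weather at that airport"""
    airport_code: str = Field(description="airport code to get weather for")


schema = WeatherSearch.model_json_schema()

print(schema["description"])  # taken from the class docstring
print(schema["properties"]["airport_code"]["description"])  # taken from Field(description=...)
print(schema["required"])  # fields without a default value
```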

In [None]:
class WeatherSearch1(BaseModel):
    airport_code: str = Field(description="airport code to get weather for")

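The docstring and the `Field` description are both optional, but each one feeds the schema. Without a docstring, the function's `description` is empty; without a `Field` description, the parameter carries only its type. The model relies on these descriptions to decide when and how to call the function, so it is worth providing both.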

In [None]:
convert_to_openai_function(WeatherSearch1)

{'name': 'WeatherSearch1',
 'description': '',
 'parameters': {'properties': {'airport_code': {'description': 'airport code to get weather for',
    'type': 'string'}},
  'required': ['airport_code'],
  'type': 'object'}}

In [None]:
class WeatherSearch2(BaseModel):
    """Call this with an airport code to get the weather at that airport"""
    airport_code: str

In [None]:
convert_to_openai_function(WeatherSearch2)

{'name': 'WeatherSearch2',
 'description': 'Call this with an airport code to get the weather at that airport',
 'parameters': {'properties': {'airport_code': {'type': 'string'}},
  'required': ['airport_code'],
  'type': 'object'}}

In [None]:
!pip install langchain_openai



In [None]:
from langchain_openai import ChatOpenAI

In [None]:
model = ChatOpenAI()

In [None]:
model.invoke("what is the weather in SF today?", functions=[weather_function])

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"airport_code":"SFO"}', 'name': 'WeatherSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 70, 'total_tokens': 87, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPWsXy8eMYqYX0eEG3qj2EonECVV', 'service_tier': 'default', 'finish_reason': 'function_call', 'logprobs': None}, id='lc_run--2b38bb40-ddb3-4c0e-b2de-34fa57791d39-0', usage_metadata={'input_tokens': 70, 'output_tokens': 17, 'total_tokens': 87, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [None]:
model_with_function = model.bind(functions=[weather_function])

In [None]:
model_with_function.invoke("what is the weather in sf?")

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"airport_code":"SFO"}', 'name': 'WeatherSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 69, 'total_tokens': 86, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPXdmkqY7iSaAPipNwAtwlCW8xd8', 'service_tier': 'default', 'finish_reason': 'function_call', 'logprobs': None}, id='lc_run--9b02f5ea-8b88-4bff-a2e6-e0ea8ebfe91b-0', usage_metadata={'input_tokens': 69, 'output_tokens': 17, 'total_tokens': 86, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

The response metadata reports `gpt-3.5-turbo-0125` because `ChatOpenAI()` was created without an explicit `model` argument and fell back to its default, `gpt-3.5-turbo`. The `content` field is empty because the model chose to return a `function_call` instead of text (`finish_reason` is `function_call`): it recognized that the query should be handled by `WeatherSearch` and filled in the argument `airport_code` with `"SFO"`. The model does not run the function itself; it only returns the function name and a JSON string of arguments for our code to act on. The `'output_tokens': 17` entry shows that the function call used 17 output tokens (we will see later how this affects the cost).

Passing `functions=` to `invoke` and using `bind` produce the same kind of response. The difference is that `bind` attaches the functions to the model once, so they do not need to be supplied on every call.
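
Because `arguments` comes back as a JSON string, one way to check it before use is to feed it back into the Pydantic model that produced the schema. This is a sketch under two assumptions: Pydantic v2 (for `model_validate_json`), and a `function_call` dict copied by hand from the output above rather than taken from a live response.

```python
from pydantic import BaseModel, Field


class WeatherSearch(BaseModel):
    """Call this with an airport code to get the weather at that airport"""
    airport_code: str = Field(description="airport code to get weather for")


# Copied from additional_kwargs['function_call'] in the output above.
function_call = {"arguments": '{"airport_code":"SFO"}', "name": "WeatherSearch"}

# Parses the JSON string and validates it against the model in one step.
search = WeatherSearch.model_validate_json(function_call["arguments"])
print(search.airport_code)
```

If the model ever returned malformed or incomplete arguments, this step would raise a `ValidationError` instead of passing bad data on.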

## Forcing it to use a function

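We can force the model to call a specific function by also binding `function_call={"name": "WeatherSearch"}`. The model then returns a call to that function for every input, whether or not the input is relevant.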

In [None]:
model_with_forced_function = model.bind(functions=[weather_function], function_call={"name":"WeatherSearch"})

In [None]:
model_with_forced_function.invoke("what is the weather in sf?")

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"airport_code":"SFO"}', 'name': 'WeatherSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 79, 'total_tokens': 86, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPfsMjjWh7tGA3wKSWjnHtZNZ6zH', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--f3386b10-5a45-4552-89da-258fa9939824-0', usage_metadata={'input_tokens': 79, 'output_tokens': 7, 'total_tokens': 86, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [None]:
# the function is called for every prompt, even an unrelated one; here the model invents an airport code
model_with_forced_function.invoke("hi!")

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"airport_code":"JFK"}', 'name': 'WeatherSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 74, 'total_tokens': 81, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPgFfyf1bGch8HWwGSOZ1r46EWqK', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--e5999670-0373-4a50-88a1-9c5e31f6bf58-0', usage_metadata={'input_tokens': 74, 'output_tokens': 7, 'total_tokens': 81, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

## Using in a chain

A model with bound functions can be used in a chain like any other model.

In [None]:
from langchain_core.prompts import ChatPromptTemplate

In [None]:
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    ("user", "{input}")
])

In [None]:
chain = prompt | model_with_function

In [None]:
chain.invoke({"input": "what is the weather in sf?"})

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"airport_code":"SFO"}', 'name': 'WeatherSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 75, 'total_tokens': 92, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPiKA8GxpF6dr8AUkzO8pFdQ1Sio', 'service_tier': 'default', 'finish_reason': 'function_call', 'logprobs': None}, id='lc_run--79e6097a-0953-457b-82e2-7b30c1d5ceee-0', usage_metadata={'input_tokens': 75, 'output_tokens': 17, 'total_tokens': 92, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

## Using multiple functions

Even better, we can pass a set of functions and let the LLM decide which to use based on the question context.

In [None]:
class ArtistSearch(BaseModel):
    """Call this to get the names of songs by a particular artist"""
    artist_name: str = Field(description="name of artist to look up")
    n: int = Field(description="number of results")

In [None]:
functions = [
    convert_to_openai_function(WeatherSearch),
    convert_to_openai_function(ArtistSearch),
]

In [None]:
model_with_functions = model.bind(functions=functions)

In [None]:
# notice how the model understands which function to use
model_with_functions.invoke("what is the weather in sf?")

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"airport_code":"SFO"}', 'name': 'WeatherSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 17, 'prompt_tokens': 116, 'total_tokens': 133, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPji9ePRzAOHZVsvPLm8CsTZagvV', 'service_tier': 'default', 'finish_reason': 'function_call', 'logprobs': None}, id='lc_run--176c1a13-c2c6-49cb-ae90-e8dad2bb2bde-0', usage_metadata={'input_tokens': 116, 'output_tokens': 17, 'total_tokens': 133, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [None]:
# a question about songs is routed to ArtistSearch instead
model_with_functions.invoke("what are three songs by taylor swift?")

AIMessage(content='', additional_kwargs={'function_call': {'arguments': '{"artist_name":"Taylor Swift","n":3}', 'name': 'ArtistSearch'}, 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 21, 'prompt_tokens': 118, 'total_tokens': 139, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPjsVlqzeDAxi2Q6mgWGRCOzSaxE', 'service_tier': 'default', 'finish_reason': 'function_call', 'logprobs': None}, id='lc_run--bda971be-2a08-42d8-afbd-7d86595b12fa-0', usage_metadata={'input_tokens': 118, 'output_tokens': 21, 'total_tokens': 139, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [None]:
# unrelated input: no function is called and the model replies with plain text
model_with_functions.invoke("hi!")

AIMessage(content='Hello! How can I assist you today?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 111, 'total_tokens': 121, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'id': 'chatcmpl-CbPjytXcqi4jRPalIf6bzlIXizBDL', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--bd3340ec-2883-45fb-bef4-90eebf54879d-0', usage_metadata={'input_tokens': 111, 'output_tokens': 10, 'total_tokens': 121, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})
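
The three responses above cover the cases calling code has to handle: one of several functions is requested, or none is and the reply is plain text. A minimal dispatch sketch using only the standard library; `weather_search`, `artist_search`, `REGISTRY`, and `dispatch` are hypothetical names, and the inputs are copied by hand from the outputs above.

```python
import json


def weather_search(airport_code: str) -> str:
    """Hypothetical stand-in for a real weather lookup."""
    return f"weather for {airport_code}"


def artist_search(artist_name: str, n: int) -> str:
    """Hypothetical stand-in for a real song lookup."""
    return f"{n} songs by {artist_name}"


# Map the function names the model returns to local callables.
REGISTRY = {"WeatherSearch": weather_search, "ArtistSearch": artist_search}


def dispatch(content: str, additional_kwargs: dict) -> str:
    """Run the requested function, or fall back to the model's text reply."""
    call = additional_kwargs.get("function_call")
    if call is None:
        return content
    handler = REGISTRY[call["name"]]
    return handler(**json.loads(call["arguments"]))


song_result = dispatch(
    "",
    {"function_call": {"arguments": '{"artist_name":"Taylor Swift","n":3}', "name": "ArtistSearch"}},
)
text_result = dispatch("Hello! How can I assist you today?", {"refusal": None})
print(song_result)
print(text_result)
```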

In this notebook, we explored how Pydantic can structure a function-calling workflow. We defined Pydantic classes, saw why their validation makes them more reliable than plain Python classes, and converted them into OpenAI-compatible function definitions. We then looked at several ways to use those functions with an LLM: passing them at call time, attaching them with `bind`, forcing a specific function call, using the bound model in a chain, and providing multiple functions so the LLM can choose the most appropriate one.