# LLM Function Calling - OpenAI

This notebook demonstrates intermediate function calling with LLMs. It is geared toward `OpenAI` but works identically on Groq, which has an OpenAI-compatible API. Other LLM providers should be similar.

The hello-world of tool calling, _get_weather_, is described in the official [OpenAI docs](https://platform.openai.com/docs/guides/function-calling). This version builds on it by using Pydantic classes to
 - Automate generation of the JSON schema required for tools
 - Automate deserialization of the JSON args supplied by OpenAI
 - Simplify implementation of tooling for ReAct and any other LLM use case

Similar infra is also used for obtaining structured outputs from OpenAI and others.

Two scenarios are demonstrated here:
 - `What is the weather in XX`, which validates basic tool calling
 - `Increase Temperature by YY`, which tests whether the LLM can first call `get_thermostat_temperature`, do some math, and then call `set_thermostat_temperature`.


## Setup Basic Environment

> Normally, these would go into a Python module that I would include in my module search path. However, when the notebook is executed in Colab directly from GitHub, that would require putting my lib code in a repo, cloning that repo into the Colab environment, and then adding it to my path. Something for later!

 - Logging
 - Colab environment
 - OpenAI functions 

In [None]:
import os

# To log OpenAI's Python library itself, set its log level here as well.
# Normally, limit this to warning/error and keep your own logging at debug level.
# If this doesn't take effect right away, restart the kernel after changing the log level.
os.environ["OPENAI_LOG"]="error"

# Set up logging
# Note that the module needs to be reloaded for our config to take effect: Jupyter has
# already configured logging, which makes all later basicConfig calls no-ops.
from importlib import reload
import logging
reload(logging)
logging.basicConfig(format='%(asctime)s %(levelname)s:%(message)s', 
                    level=logging.ERROR, 
                    datefmt='%I:%M:%S')

#-------------------------------------------------------------
# Utility methods to visually separate output from logs
# displaying HTML and Markdown responses
from IPython.display import display, HTML

# Enhance with more Html (fg-color, font, etc) as needed but title is usually a good starting point.
def colorBox(txt, title=None):
    if title is not None:
        txt = f"<b>{title}</b><br><hr><br>{txt}"

    display(HTML(f"<div style='border-radius:15px;padding:15px;background-color:pink;color:black;'>{txt}</div>"))

#------------------------------------------------------------
# Configure for Colab
# You could do one of the two.
# Either paste your OpenAI Key here or put it in secrets
#-------------------------------------------------------------
if 'google.colab' in str(get_ipython()):
  from google.colab import userdata
  logging.debug("Trying to fetch OPENAI_API_KEY from your secrets. Remember to make it available to this notebook")
  os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")

# Finally ensure you have the OpenAI key.
logging.debug("Checking if OPENAI_API_KEY is available")
assert(os.environ.get("OPENAI_API_KEY"))

#-------------------------------------------------------------
# Set up OpenAI completion function
#-------------------------------------------------------------
import openai
openai.api_key = os.environ.get("OPENAI_API_KEY")

def get_completion(prompt, model="gpt-4o-mini", tools = None, temperature=0) -> str:
    chat_history = [{"role":"user", "content":prompt}]
    response = get_response(chat_history=chat_history, 
                            model=model,
                            tools = tools, 
                            temperature=temperature)    
    return response.choices[0].message.content

# Returns the full chat-completion response object (not just the message text).
def get_response(chat_history, model="gpt-4o-mini", tools = None, temperature=0):
    messages = chat_history
    return openai.chat.completions.create(
        model=model,
        messages=messages,
        tools=tools,
        temperature=temperature)    

07:33:55 DEBUG:Checking if OPENAI_API_KEY is available
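
As an alternative to reloading the `logging` module, `logging.basicConfig` accepts `force=True` (Python 3.8 and later), which removes any handlers already attached to the root logger before applying the new configuration. A minimal sketch, assuming a Python 3.8+ kernel; the first `basicConfig` call only simulates the pre-configured state a notebook usually starts in:

```python
import logging

# Simulate a root logger that something else (e.g. Jupyter) already configured.
logging.basicConfig(level=logging.WARNING)

# force=True removes the existing root handlers, so this call takes effect
# even though the root logger was configured before.
logging.basicConfig(
    format="%(asctime)s %(levelname)s:%(message)s",
    level=logging.ERROR,
    datefmt="%I:%M:%S",
    force=True,
)

root_level = logging.getLevelName(logging.getLogger().level)
print(root_level)
```

Whether this is preferable to `reload(logging)` is a matter of taste; it avoids re-importing a module that other libraries may already hold references to.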


## Use Pydantic to simplify schema creation

What OpenAI _(and by extension, every other vendor that wants to be compatible)_ needs:
 - A subset of JSON schema. _They don't spell it out, but from experience, I found that_
   - `$ref` fields are not allowed. This means schema references have to be resolved in place. Thankfully, a simple lib exists.
 - As much natural language description as possible
   - each function parameter should have a useful name and description
   - well-thought-out field names and descriptions

In Pydantic terms:
 - To use the Pydantic model more easily, optionally make it a `dataclass`
 - Use `Field` to supply a per-field description
 - Use `jsonref.replace_refs` to resolve any schema refs.
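
To make "resolved in place" concrete, here is a minimal standard-library sketch of inlining references. It is not how `jsonref` is implemented; `inline_refs` is a hypothetical helper that handles only local, non-recursive `#/$defs/...` references and drops any keys that sit next to a `$ref`. The input is a hand-written schema in the shape Pydantic v2 emits for a nested model.

```python
import copy
import json

def inline_refs(schema: dict) -> dict:
    """Return a copy of `schema` with local '#/$defs/...' references inlined.

    Hypothetical helper for illustration; assumes non-recursive, local refs only.
    """
    defs = schema.get("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                # Replace the reference with a resolved copy of its target.
                return resolve(copy.deepcopy(defs[ref.split("/")[-1]]))
            # Drop the "$defs" table itself; it is no longer needed.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)

# Hand-written example schema with one nested definition.
nested_schema = {
    "$defs": {
        "Location": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        }
    },
    "type": "object",
    "properties": {"where": {"$ref": "#/$defs/Location"}},
    "required": ["where"],
}

flat_schema = inline_refs(nested_schema)
print(json.dumps(flat_schema, indent=2))
```

The resulting schema contains no `$ref` or `$defs` keys, which is the property the tool schema needs.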

In [120]:
import json
import inspect
import jsonref
from typing import TypeVar

F = TypeVar('F')

def getToolJsonSchema(f:F) -> dict:
    """
    f: A function of the form `func(arg: BaseModel)` | `func()`
      Max of 1 argument, and it should be a class deriving from Pydantic's BaseModel

    Eg.

    @dataclass
    class GetWeather(BaseModel):        
        location : str = Field(description="City and country e.g. San Jose, USA")

    def get_weather(args: GetWeather) -> str:
        '''
        Get current temperature for a given location.
        '''
        return "10"

    And use this thus:
    
        tool_schema = getToolJsonSchema(get_weather)
    """    
    if inspect.isfunction(f):
        sig = inspect.signature(f)
        params = list(sig.parameters.items())

        if len(params) > 1:
            # more than 1 arg is an error for us!
            raise TypeError(f"{str(f)} should have at most 1 argument")
        elif len(params):            
            # 1 arg case
            p_ann   = params[0][1].annotation            
            p_class = p_ann.__class__

            if "pydantic._internal._model_construction.ModelMetaclass" not in str(p_class):
                raise TypeError(f"The argument of {str(f)} must derive from Pydantic's BaseModel")

            # call the class method to generate the json schema, then resolve any $ref entries
            arg_json_schema = p_ann.model_json_schema()
            arg_json_schema = jsonref.replace_refs(arg_json_schema)

            # 👉 With "strict" : True, OpenAI enforces that "additionalProperties" : False is set!
            arg_json_schema["additionalProperties"] = False
                        
            # Now generate the schema for the full function
            # See https://platform.openai.com/docs/guides/function-calling
            func_schema = {
                "type"       : "function",
                "function"   : {
                    "name"       : f.__name__,
                    "description": f.__doc__.strip(),
                    "parameters" : arg_json_schema,
                    "strict"     : True
                }
            }            
        else:
            # 0 args case            
            # See https://platform.openai.com/docs/guides/function-calling
            func_schema = {
                "type"       : "function",
                "function"   : {
                    "name"       : f.__name__,
                    "description": f.__doc__.strip()                    
                }
            }        
        
        return func_schema
    else:
        raise TypeError(f"{str(f)} should be a python function")    


#------------------------------------------------------------------------
# Tool execution infrastructure
#------------------------------------------------------------------------
# Name -> { deserializer, executor} map
import json
from pydantic import ValidationError

class ToolExecutor:
    # tool_arg_deserializer(str) -> Obj. Expect pydantic's ValidationError
    # tool_func(obj)             -> str
    def __init__(self, name:str,                  
                 tool_arg_deserializer, 
                 tool_func):
        self.name = name
        self.tool_arg_deserializer = tool_arg_deserializer
        self.tool_func = tool_func
    
    def exec(self, json_arg: str) -> str:
        if self.tool_arg_deserializer:
            try:
                logging.debug(f"Attempting to deserialize {json_arg} for tool: {self.name}")

                arg_obj = self.tool_arg_deserializer(json_arg)
                logging.debug(f"✔️ deserialized to {arg_obj}. Calling function")

                func_result = self.tool_func(arg_obj)
                logging.debug(f"✔️ function returned {func_result}")

                return func_result

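            except ValidationError as e:
                # Return the failure as a string so the caller still gets a str result
                # (the method is declared to return str, and the chat loop asserts on it).
                error_msg = f"JSON string {json_arg} is not valid for {self.name}:{e}"
                logging.error(error_msg)
                return error_msg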
        else:            
            logging.debug(f"Calling no-arg function: {self.name}")            
            assert(json_arg == '{}' or json_arg is None)
            
            func_result = self.tool_func()
            logging.debug(f"✔️ function returned {func_result}")
            return func_result

    
# {name : ToolExecutor}*
tool_dict = {}
        
def exec_tool(name: str, args: str) -> str :
    if name not in tool_dict:
        raise KeyError(f"Tool: {name} is not registered! Cannot call!")
    else:
        return tool_dict[name].exec(args)

## The chat loop with tools involved

Unlike a one-shot prompt, when tools are used:

 - The LLM can respond with one or more `tool call`s instead of an `assistant response`.
 - We need to evaluate all the tool calls and send the results back.
 - This continues until the LLM responds with an assistant response.
 - Then we are done.
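
The control flow in the list above can be tried without an API key. The sketch below replaces the model with a hypothetical `fake_llm` stub that requests one tool call and then answers; the message dictionaries are simplified and do not match OpenAI's exact response objects. The `max_turns` guard is an addition of this sketch, not part of the notebook's loop.

```python
import json

def fake_llm(chat_history):
    """Hypothetical stand-in for the model: asks for one tool call, then answers."""
    last = chat_history[-1]
    if last["role"] == "tool":
        return {"role": "assistant", "content": f"It is {last['content']} degrees."}
    return {
        "role": "assistant",
        "content": None,
        "tool_calls": [{"id": "call_1", "name": "get_weather",
                        "arguments": json.dumps({"location": "Paris, France"})}],
    }

def get_weather_stub(arguments: str) -> str:
    """Hypothetical tool: parses the JSON arguments and returns a fixed value."""
    json.loads(arguments)  # arguments arrive as a JSON string
    return "10"

def run_stub_loop(prompt: str, max_turns: int = 5) -> list:
    history = [{"role": "user", "content": prompt}]
    for _ in range(max_turns):           # guard against endless tool calling
        reply = fake_llm(history)
        history.append(reply)            # the tool-call message goes back into the history
        if not reply.get("tool_calls"):
            return history               # plain assistant response: done
        for call in reply["tool_calls"]:
            history.append({"role": "tool", "tool_call_id": call["id"],
                            "content": get_weather_stub(call["arguments"])})
    raise RuntimeError("model kept requesting tools")

history = run_stub_loop("What is the weather in Paris?")
print([m["role"] for m in history])
```

A stub like this is handy for testing the loop's bookkeeping separately from the model's behaviour.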

In [121]:
def run_chat_loop(prompt:str, tools):
    """
    Runs a chat loop with an initial prompt and supplied tools
    Resolves all tool_calls made till a final assistant response is provided
    """
    # Initialize
    chat_history = [
        {
            "role" : "system",
            "content" : "You are a helpful assistant that uses the supplied tools to response to the user's questions."
        }]

    # Run the loop
    msgs = [{
        "role":"user", 
        "content": prompt}]
    
    while len(msgs):
        chat_history.extend(msgs)
        msgs = []

        response = get_response(
            chat_history=chat_history,
            tools = tools)

        # tool-call
        # Note: The OpenAI example is outdated
        # tool_calls is no longer a JSON object but an array of 
        # `ChatCompletionMessageToolCall` objects
        if response.choices[0].message.tool_calls:

            # The tool-call set needs to be added back to the chat_history
            msgs.append(response.choices[0].message)

            # Process all the tool calls
            for tool_call in response.choices[0].message.tool_calls:
                logging.debug(f"Executing tool_call: {tool_call}")            

                colorBox(f"{tool_call.function.name}({tool_call.function.arguments})", title="LLM Tool Call")
                tool_result = exec_tool(
                    tool_call.function.name,
                    tool_call.function.arguments)
                assert(isinstance(tool_result, str))

                # Add the tool's response as well. The response is linked to the
                # tool call via the ID
                msgs.append({
                    "role" : "tool",
                    "tool_call_id" : tool_call.id,
                    "content"      : tool_result
                })
        else:
            # Assistant response
            chat_response = response.choices[0].message.content
            colorBox(chat_response, title="Final LLM Response")
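
Because each tool result is linked to its request by ID, a missing result is easy to detect before the history is sent back. A small sketch, assuming the history is held as plain dictionaries (the notebook's loop stores the assistant message as a library object instead); `unanswered_tool_calls` and the IDs are hypothetical:

```python
def unanswered_tool_calls(chat_history: list) -> set:
    """Return the ids of tool calls that have no matching 'tool' message."""
    requested, answered = set(), set()
    for msg in chat_history:
        if msg.get("role") == "assistant":
            for call in msg.get("tool_calls") or []:
                requested.add(call["id"])
        elif msg.get("role") == "tool":
            answered.add(msg["tool_call_id"])
    return requested - answered

# Two tool calls were requested, but only one result was appended.
example_history = [
    {"role": "user", "content": "Increase the temperature by 10 degrees"},
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "call_a"}, {"id": "call_b"}]},
    {"role": "tool", "tool_call_id": "call_a", "content": "60"},
]

missing = unanswered_tool_calls(example_history)
print(missing)
```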

## Implement OpenAI's get_weather example

See https://platform.openai.com/docs/guides/function-calling

The example is listed below

```python
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and country e.g. Bogotá, Colombia"
                }
            },
            "required": [
                "location"
            ],
            "additionalProperties": False
        },
        "strict": True
    }
}]

completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the weather like in Paris today?"}],
    tools=tools
)
```

My goal is to automate the creation of this JSON from a function. The previously created `getToolJsonSchema` handles that; the limitation is that the function can take at most one argument, and that argument must be a Pydantic class.

> 👉 Note that in a production scenario, one would enforce that all the fields and the function itself have reasonable descriptions. We need good natural language descriptions to allow the LLM to decide which tool to call.
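
One way such enforcement could look is a check over the generated tool schema. This is a sketch only; `missing_descriptions` is a hypothetical helper that inspects the top-level properties of an OpenAI-style tool dictionary and ignores nested objects.

```python
def missing_descriptions(tool_schema: dict) -> list:
    """List the places in an OpenAI-style tool schema that lack a description."""
    problems = []
    func = tool_schema.get("function", {})
    if not (func.get("description") or "").strip():
        problems.append(f"function '{func.get('name')}'")
    props = func.get("parameters", {}).get("properties", {})
    for field_name, field_schema in props.items():
        if not (field_schema.get("description") or "").strip():
            problems.append(f"parameter '{field_name}'")
    return problems

good_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current temperature for a given location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string",
                             "description": "City and country e.g. Bogotá, Colombia"}
            },
        },
    },
}

# Same shape, but with the descriptions left out.
bad_tool = {
    "type": "function",
    "function": {
        "name": "set_temp",
        "parameters": {"type": "object", "properties": {"temp": {"type": "number"}}},
    },
}

print(missing_descriptions(good_tool))
print(missing_descriptions(bad_tool))
```

A check like this could run when tools are registered, so an undocumented tool fails early rather than confusing the model at run time.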

In [122]:
from pydantic import BaseModel, Field
from dataclasses import dataclass

@dataclass
class GetWeather(BaseModel):        
    location : str = Field(description="City and country e.g. San Jose, USA")

def get_weather(args: GetWeather) -> str:
    """
    Get current temperature for a given location.
    """
    retval = "10"
    logging.debug(f"get_weather called with {args}. Returning hardcoded value {retval}")
    return retval

# Register in the tool dictionary
# Note that one could evolve this further to make the tool_dict the single 
# source and use it to generate the tool descriptions as well.
tool_dict["get_weather"] = ToolExecutor(
    name = "get_weather",
    tool_arg_deserializer = lambda json_str: GetWeather.model_validate_json(json_str),
    tool_func = lambda gw: get_weather(gw)
    )
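
The "single source" idea from the comment above could be sketched with a decorator that records the function, its deserializer, and its description in one place. Everything here is hypothetical (`registry`, `tool`, `call_tool`, `tool_descriptions`, and the `_demo` functions), `json.loads` stands in for a Pydantic `model_validate_json`, and the parameter schema is left out for brevity.

```python
import json

registry = {}  # name -> {"func", "deserializer", "description"}

def tool(deserializer=None):
    """Hypothetical decorator: register a function under its own name."""
    def decorate(func):
        registry[func.__name__] = {
            "func": func,
            "deserializer": deserializer,
            "description": (func.__doc__ or "").strip(),
        }
        return func
    return decorate

@tool(deserializer=json.loads)  # stand-in for Model.model_validate_json
def get_weather_demo(args: dict) -> str:
    """Get current temperature for a given location."""
    return "10"

@tool()
def get_thermostat_demo() -> str:
    """Returns the current temperature setting of the thermostat."""
    return "60"

def call_tool(name: str, json_args: str) -> str:
    """Dispatch by name, deserializing the JSON arguments when the tool takes any."""
    entry = registry[name]
    if entry["deserializer"] is None:
        return entry["func"]()
    return entry["func"](entry["deserializer"](json_args))

def tool_descriptions() -> list:
    """Build the tool list from the registry (parameter schemas omitted here)."""
    return [{"type": "function",
             "function": {"name": name, "description": entry["description"]}}
            for name, entry in registry.items()]

print(call_tool("get_weather_demo", '{"location": "Paris, France"}'))
print([t["function"]["name"] for t in tool_descriptions()])
```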

In [123]:
# Test schema generation and execution!
# Change the condition to True to run these checks.
if False:
    # Verify that the JSON we get from our `get_weather` function matches the 
    # raw JSON used in the OpenAI example
    print(json.dumps(
        getToolJsonSchema(get_weather),
        indent=4
    ))

    # Test using the OpenAI example's serialized JSON
    exec_tool("get_weather", "{\"location\":\"Paris, France\"}")

In [115]:
run_chat_loop(
    prompt="What is the weather in San Jose, USA?",
    tools = [getToolJsonSchema(get_weather)]
)

09:25:37 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': "You are a helpful assistant that uses the supplied tools to response to the user's questions."}, {'role': 'user', 'content': 'What is the weather in San Jose, USA?'}], 'model': 'gpt-4o-mini', 'temperature': 0, 'tools': [{'type': 'function', 'function': {'name': 'get_weather', 'description': 'Get current temperature for a given location.', 'parameters': {'properties': {'location': {'description': 'City and country e.g. San Jose, USA', 'title': 'Location', 'type': 'string'}}, 'required': ['location'], 'title': 'GetWeather', 'type': 'object', 'additionalProperties': False}, 'strict': True}}]}}
09:25:37 DEBUG:Sending HTTP Request: POST https://api.openai.com/v1/chat/completions
09:25:37 DEBUG:close.started
09:25:37 DEBUG:close.complete
09:25:37 DEBUG:connect_tcp.started host='api.openai.com' port=443 local_address=None timeout=5.0 socket_op

09:25:38 DEBUG:Attempting to deserialize {"location":"San Jose, USA"} for tool: get_weather
09:25:38 DEBUG:✔️ deserialized to location='San Jose, USA'. Calling function
09:25:38 DEBUG:get_weather called with location='San Jose, USA'. Returning hardcoded value 10
09:25:38 DEBUG:✔️ function returned 10
09:25:38 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': "You are a helpful assistant that uses the supplied tools to response to the user's questions."}, {'role': 'user', 'content': 'What is the weather in San Jose, USA?'}, {'content': None, 'refusal': None, 'role': 'assistant', 'tool_calls': [{'id': 'call_tVbBvPZNGHyRLU8qJzbC2iRH', 'function': {'arguments': '{"location":"San Jose, USA"}', 'name': 'get_weather'}, 'type': 'function'}]}, {'role': 'tool', 'tool_call_id': 'call_tVbBvPZNGHyRLU8qJzbC2iRH', 'content': '10'}], 'model': 'gpt-4o-mini', 'temperature': 0, 'tools': [{'type': 'function', 'func

## Implement an IoT controller - Setting temperature

> Demonstrates the ability of the LLM to sequence multiple tool calls.

With the infra developed, this becomes quite simple. 

In [124]:
@dataclass
class SetTemperature(BaseModel):        
    temp : float = Field(description="The temperature value in Fahrenheit to set the thermostat to")

def set_thermostat_temperature(args: SetTemperature) -> str:
    """
    Sets the current temperature for a given location."
    """
    logging.debug(f"set_thermostat_temperature called with {args}")
    return ""

def get_thermostat_temperature() -> str:
    """
    Returns the current temperature setting of the thermostat."
    """
    retval = "60"
    logging.debug(f"get_thermostat_temperature called. Returning hardcoded {retval}")
    return retval

# Register in the tool dictionary
tool_dict.clear()
tool_dict["set_thermostat_temperature"] = ToolExecutor(
    name = "set_thermostat_temperature",
    tool_arg_deserializer = lambda json_str: SetTemperature.model_validate_json(json_str),
    tool_func = lambda gw: set_thermostat_temperature(gw)
    )

tool_dict["get_thermostat_temperature"] = ToolExecutor(
    name = "get_thermostat_temperature",
    tool_arg_deserializer = None,
    tool_func = lambda: get_thermostat_temperature()
    )
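
Since the thermostat functions above return hardcoded values, there is no way to confirm afterwards that the set call had the intended effect. A possible refinement, sketched here with a hypothetical in-memory `FakeThermostat`, keeps state and an ordered call log so the get-then-set sequence can be checked. The last lines replay by hand the sequence we hope the LLM produces.

```python
class FakeThermostat:
    """Hypothetical in-memory device so that a set call can be verified afterwards."""

    def __init__(self, initial: float = 60.0):
        self.setting = initial
        self.calls = []  # ordered log of which tool was invoked

    def get_temperature(self) -> str:
        self.calls.append("get")
        return str(self.setting)

    def set_temperature(self, temp: float) -> str:
        self.calls.append("set")
        self.setting = temp
        return f"Thermostat set to {temp}"

device = FakeThermostat()

# Hand-replayed sequence for "Increase the temperature by 10 degrees":
current = float(device.get_temperature())
confirmation = device.set_temperature(current + 10)

print(device.calls, device.setting, confirmation)
```

Returning a short confirmation string instead of an empty one is a design choice of this sketch; it gives the model something concrete to report back to the user.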

In [125]:
run_chat_loop(
    prompt="Increase the temperature by 10 degrees",
    tools = [
        getToolJsonSchema(get_thermostat_temperature),
        getToolJsonSchema(set_thermostat_temperature)
        ]
)

09:32:24 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': "You are a helpful assistant that uses the supplied tools to response to the user's questions."}, {'role': 'user', 'content': 'Increase the temperature by 10 degrees'}], 'model': 'gpt-4o-mini', 'temperature': 0, 'tools': [{'type': 'function', 'function': {'name': 'get_thermostat_temperature', 'description': 'Returns the current temperature setting of the thermostat."'}}, {'type': 'function', 'function': {'name': 'set_thermostat_temperature', 'description': 'Sets the current temperature for a given location."', 'parameters': {'properties': {'temp': {'description': 'The temperature value in Fahrenheit to set the thermostat to', 'title': 'Temp', 'type': 'number'}}, 'required': ['temp'], 'title': 'SetTemperature', 'type': 'object', 'additionalProperties': False}, 'strict': True}}]}}
09:32:24 DEBUG:Sending HTTP Request: POST https://api.opena

09:32:25 DEBUG:Calling no-arg function: get_thermostat_temperature
09:32:25 DEBUG:get_thermostat_temperature called. Returning hardcoded 60
09:32:25 DEBUG:✔️ function returned 60
09:32:25 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': "You are a helpful assistant that uses the supplied tools to response to the user's questions."}, {'role': 'user', 'content': 'Increase the temperature by 10 degrees'}, {'content': None, 'refusal': None, 'role': 'assistant', 'tool_calls': [{'id': 'call_NRZFjLO5krfnRJ5l5sOGZh6c', 'function': {'arguments': '{}', 'name': 'get_thermostat_temperature'}, 'type': 'function'}]}, {'role': 'tool', 'tool_call_id': 'call_NRZFjLO5krfnRJ5l5sOGZh6c', 'content': '60'}], 'model': 'gpt-4o-mini', 'temperature': 0, 'tools': [{'type': 'function', 'function': {'name': 'get_thermostat_temperature', 'description': 'Returns the current temperature setting of the thermostat."'}}, {'type'

09:32:25 DEBUG:Attempting to deserialize {"temp":70} for tool: set_thermostat_temperature
09:32:25 DEBUG:✔️ deserialized to temp=70.0. Calling function
09:32:25 DEBUG:set_thermostat_temperature called with temp=70.0
09:32:25 DEBUG:✔️ function returned 
09:32:25 DEBUG:Request options: {'method': 'post', 'url': '/chat/completions', 'files': None, 'json_data': {'messages': [{'role': 'system', 'content': "You are a helpful assistant that uses the supplied tools to response to the user's questions."}, {'role': 'user', 'content': 'Increase the temperature by 10 degrees'}, {'content': None, 'refusal': None, 'role': 'assistant', 'tool_calls': [{'id': 'call_NRZFjLO5krfnRJ5l5sOGZh6c', 'function': {'arguments': '{}', 'name': 'get_thermostat_temperature'}, 'type': 'function'}]}, {'role': 'tool', 'tool_call_id': 'call_NRZFjLO5krfnRJ5l5sOGZh6c', 'content': '60'}, {'content': None, 'refusal': None, 'role': 'assistant', 'tool_calls': [{'id': 'call_NB3AoGlxZNryvE3EXL10GOoZ', 'function': {'arguments': '