# Function Calling -- Format JSON

Function calling is my favourite open secret about modern LLMs: they can be used to communicate with other systems.

How do we do that? By having the model turn your information into JSON, which can then be used to interact with your APIs. There are more straightforward use cases for JSON output as well, e.g. parsing text into structured data.

The remainder of this notebook is taken directly from the [OpenAI Cookbook](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_with_chat_models.ipynb).

## How do we structure language? 

1. Parse it into a structured format of our own choice and design.
2. Common choices are JSON, XML, YAML, and TOML.
3. JSON is the most common format for APIs, and it is the one OpenAI uses.


## Example Problem Statement

1. We want the model to decide whether a function (a "tool", for instance a Python function) applies, and then call it.
2. For each tool: what inputs or parameters are required? What are the outputs?

These two questions are the essence of the JSON Schema that OpenAI expects for a tool.
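As a rough sketch of how those questions map onto a tool specification, the helper below assembles the dictionary shape used in the `tools` list later in this notebook. `make_tool_spec` and the `get_stock_price` tool are hypothetical names introduced for illustration only. Note that the schema describes the inputs; what the function returns is left to your own code.

```python
import json


def make_tool_spec(name, description, properties, required):
    """Build one entry for the `tools` list (hypothetical helper)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


# Hypothetical tool: each input is described as a JSON Schema property.
spec = make_tool_spec(
    name="get_stock_price",
    description="Get the latest price for a ticker symbol",
    properties={"ticker": {"type": "string", "description": "e.g. AAPL"}},
    required=["ticker"],
)
print(json.dumps(spec, indent=2))
```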


This notebook covers how to use the Chat Completions API in combination with external functions to extend the capabilities of GPT models.

`tools` is an optional parameter in the Chat Completion API which can be used to provide function specifications. The purpose of this is to enable models to generate function arguments which adhere to the provided specifications. Note that the API will not actually execute any function calls. It is up to developers to execute function calls using model outputs.

If the `tools` parameter is provided, then by default the model decides when it is appropriate to use one of the functions. The API can be forced to use a specific function by setting the `tool_choice` parameter to `{"type": "function", "function": {"name": "<insert-function-name>"}}`. The API can also be forced not to use any function by setting `tool_choice` to `"none"`. If the model chooses to use a function, the response will typically contain `"finish_reason": "tool_calls"`, as well as a `tool_calls` object that holds the name of each function and the generated function arguments.
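The three modes (let the model decide, force one function, forbid functions) can be sketched as plain values. `tool_choice_for` is a hypothetical helper written for this illustration; the dictionary shape mirrors the one used in the forced-function cell later in this notebook.

```python
def tool_choice_for(function_name=None, allow_tools=True):
    """Return a value for the `tool_choice` parameter (illustrative helper)."""
    if not allow_tools:
        return "none"  # the model must answer in plain text
    if function_name is None:
        return "auto"  # the model decides whether to call a tool
    # Force the model to call one specific function.
    return {"type": "function", "function": {"name": function_name}}


print(tool_choice_for())
print(tool_choice_for("get_current_weather"))
print(tool_choice_for(allow_tools=False))
```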

### Overview

This notebook contains the following 2 sections:

- **How to generate function arguments:** Specify a set of functions and use the API to generate function arguments.
- **How to call functions with model generated arguments:** Close the loop by actually executing functions with model generated arguments.

## How to generate function arguments

In [1]:
import instructor
import marvin
from dotenv import load_dotenv
from openai import OpenAI
from tenacity import retry, stop_after_attempt, wait_random_exponential
from termcolor import colored

load_dotenv()
GPT_MODEL = "gpt-3.5-turbo-0613"
client = OpenAI()

### Utilities

First let's define a few utilities for making calls to the Chat Completions API and for maintaining and keeping track of the conversation state.

In [2]:
@retry(wait=wait_random_exponential(multiplier=1, max=40), stop=stop_after_attempt(3))
def chat_completion_request(messages, tools=None, tool_choice=None, model=GPT_MODEL):
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=tools,
            tool_choice=tool_choice,
        )
        return response
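    except Exception as e:
        # The exception is returned rather than re-raised, so @retry never sees a
        # failure and will not retry; callers must check for an Exception result.
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e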

In [3]:
def pretty_print_conversation(messages):
    role_to_color = {
        "system": "red",
        "user": "green",
        "assistant": "blue",
        "function": "magenta",
    }

    for message in messages:
        if message["role"] == "system":
            print(
                colored(
                    f"system: {message['content']}\n", role_to_color[message["role"]]
                )
            )
        elif message["role"] == "user":
            print(
                colored(f"user: {message['content']}\n", role_to_color[message["role"]])
            )
        elif message["role"] == "assistant" and message.get("function_call"):
            print(
                colored(
                    f"assistant: {message['function_call']}\n",
                    role_to_color[message["role"]],
                )
            )
        elif message["role"] == "assistant" and not message.get("function_call"):
            print(
                colored(
                    f"assistant: {message['content']}\n", role_to_color[message["role"]]
                )
            )
        elif message["role"] == "function":
            print(
                colored(
                    f"function ({message['name']}): {message['content']}\n",
                    role_to_color[message["role"]],
                )
            )

### Basic concepts

Let's create some function specifications to interface with a hypothetical weather API. We'll pass these function specifications to the Chat Completions API in order to generate function arguments that adhere to them.

In [14]:
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_weather",
            "description": "Get the current weather",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                },
                "required": ["location", "format"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_n_day_weather_forecast",
            "description": "Get an N-day weather forecast",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                    "format": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "The temperature unit to use. Infer this from the users location.",
                    },
                    "num_days": {
                        "type": "integer",
                        "description": "The number of days to forecast",
                    },
                },
                "required": ["location", "format", "num_days"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "get_local_language",
            "description": "Get the most common languages spoken based on user's location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA",
                    },
                },
                "required": ["location"],
            },
        },
    },
]

In [17]:
len(tools)
tools[-1]

{'type': 'function',
 'function': {'name': 'get_local_language',
  'description': "Get the most common languages spoken based on user's location",
  'parameters': {'type': 'object',
   'properties': {'location': {'type': 'string',
     'description': 'The city and state, e.g. San Francisco, CA'}},
   'required': ['location']}}}

If we prompt the model about the current weather, it will respond with some clarifying questions.

In [5]:
messages = []
messages.append(
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    }
)
messages.append({"role": "user", "content": "What's the weather like today"})
chat_response = chat_completion_request(messages, tools=tools)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
assistant_message

ChatCompletionMessage(content='Sure, I can help you with that. Could you please provide me with the location?', role='assistant', function_call=None, tool_calls=None)

Once we provide the missing information, it will generate the appropriate function arguments for us.

In [6]:
messages.append({"role": "user", "content": "I'm in Glasgow, Scotland."})
chat_response = chat_completion_request(messages, tools=tools)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
assistant_message

ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_SI8ipOfdnzh43oNn96SiDuGb', function=Function(arguments='{\n  "location": "Glasgow, Scotland",\n  "format": "celsius"\n}', name='get_current_weather'), type='function')])
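Because the API only hands back the arguments as a JSON string, it can be worth checking them before calling anything. This is a minimal sketch that assumes only the `required` and `enum` keywords matter; `check_arguments` is a hypothetical helper, not part of the OpenAI SDK, and a full JSON Schema validator would cover far more cases.

```python
import json

# Parameter schema copied from the get_current_weather spec above (descriptions omitted).
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location", "format"],
}


def check_arguments(raw_arguments, parameters):
    """Parse the arguments string and return (args, list_of_problems)."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc}"]
    if not isinstance(args, dict):
        return None, ["arguments are not a JSON object"]
    problems = []
    for key in parameters.get("required", []):
        if key not in args:
            problems.append(f"missing required argument: {key}")
    for key, value in args.items():
        allowed = parameters["properties"].get(key, {}).get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"{key}={value!r} is not one of {allowed}")
    return args, problems


raw = '{\n  "location": "Glasgow, Scotland",\n  "format": "celsius"\n}'
args, problems = check_arguments(raw, weather_parameters)
print(args, problems)
```

If `problems` is non-empty, one option is to send the list back to the model and ask it to try again rather than calling the function with bad input.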

By prompting it differently, we can get it to target the other function we've told it about.

In [7]:
messages = []
messages.append(
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    }
)
messages.append(
    {
        "role": "user",
        "content": "what is the weather going to be like in Glasgow, Scotland over the next 5 days",
    }
)
chat_response = chat_completion_request(messages, tools=tools)
assistant_message = chat_response.choices[0].message
messages.append(assistant_message)
assistant_message

ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_rYzwZHqkrF1NV4q1ZNNCBC2k', function=Function(arguments='{\n"location": "Glasgow, Scotland",\n"format": "celsius",\n"num_days": 5\n}', name='get_n_day_weather_forecast'), type='function')])

Here the request already named both the location and the number of days, so the model produced the function call directly instead of asking for clarification. Note what happens if we now send another user message without first answering that tool call: the API rejects the request.

In [9]:
messages.append({"role": "user", "content": "5 days"})
chat_response = chat_completion_request(messages, tools=tools)
chat_response

Unable to generate ChatCompletion response
Exception: Error code: 400 - {'error': {'message': "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_rYzwZHqkrF1NV4q1ZNNCBC2k", 'type': 'invalid_request_error', 'param': 'messages.[3].role', 'code': None}}


openai.BadRequestError('Error code: 400 - {\'error\': {\'message\': "An assistant message with \'tool_calls\' must be followed by tool messages responding to each \'tool_call_id\'. The following tool_call_ids did not have response messages: call_rYzwZHqkrF1NV4q1ZNNCBC2k", \'type\': \'invalid_request_error\', \'param\': \'messages.[3].role\', \'code\': None}}')
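The 400 error says that every `tool_call_id` needs a matching tool message before the conversation can continue. Below is a minimal sketch of that message sequence using plain dictionaries. `tool_reply` is a hypothetical helper, and the forecast payload is a made-up placeholder rather than real weather data.

```python
import json


def tool_reply(tool_call_id, result):
    """Build the message that answers one tool call (hypothetical helper)."""
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": result if isinstance(result, str) else json.dumps(result),
    }


# The assistant turn from the failed request, written out as a dictionary.
assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_rYzwZHqkrF1NV4q1ZNNCBC2k",
            "type": "function",
            "function": {
                "name": "get_n_day_weather_forecast",
                "arguments": '{"location": "Glasgow, Scotland", "format": "celsius", "num_days": 5}',
            },
        }
    ],
}

conversation = [assistant_turn]
for call in assistant_turn["tool_calls"]:
    # Placeholder result; a real application would call its weather API here.
    conversation.append(tool_reply(call["id"], {"forecast": "placeholder"}))

print(json.dumps(conversation[-1], indent=2))
```

Only once these tool messages follow the assistant turn in `messages` can a further user message be sent.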

#### Forcing the use of specific functions or no function

We can force the model to use a specific function, for example `get_n_day_weather_forecast`, by using the `tool_choice` argument. By doing so, we force the model to make assumptions about how to use it.

In [10]:
# in this cell we force the model to use get_n_day_weather_forecast
messages = []
messages.append(
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    }
)
messages.append(
    {"role": "user", "content": "Give me a weather report for Toronto, Canada."}
)
chat_response = chat_completion_request(
    messages,
    tools=tools,
    tool_choice={
        "type": "function",
        "function": {"name": "get_n_day_weather_forecast"},
    },
)
chat_response.choices[0].message

ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_AP5wn56f9eu7bcUThM80wcbJ', function=Function(arguments='{\n  "location": "Toronto, Canada",\n  "format": "celsius",\n  "num_days": 1\n}', name='get_n_day_weather_forecast'), type='function')])

In [11]:
# if we don't force the model to use get_n_day_weather_forecast it may not
messages = []
messages.append(
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    }
)
messages.append(
    {"role": "user", "content": "Give me a weather report for Toronto, Canada."}
)
chat_response = chat_completion_request(messages, tools=tools)
chat_response.choices[0].message

ChatCompletionMessage(content=None, role='assistant', function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='call_5T05g46ssbkcLBLWmvEEEyku', function=Function(arguments='{\n  "location": "Toronto, Canada",\n  "format": "celsius"\n}', name='get_current_weather'), type='function')])

We can also force the model to not use a function at all. By doing so we prevent it from producing a proper function call.

In [12]:
messages = []
messages.append(
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    }
)
messages.append(
    {
        "role": "user",
        "content": "Give me the current weather (use Celsius) for Toronto, Canada.",
    }
)
chat_response = chat_completion_request(messages, tools=tools, tool_choice="none")
chat_response.choices[0].message

ChatCompletionMessage(content='{\n  "location": "Toronto, Canada",\n  "format": "celsius"\n}', role='assistant', function_call=None, tool_calls=None)

### Parallel Function Calling

Newer models such as `gpt-4-1106-preview` or `gpt-3.5-turbo-1106` can call multiple functions in one turn. The prompt below only needs one function, so a single tool call comes back.

In [18]:
messages = []
messages.append(
    {
        "role": "system",
        "content": "Don't make assumptions about what values to plug into functions. Ask for clarification if a user request is ambiguous.",
    }
)
messages.append(
    {
        "role": "user",
        "content": "What languages are spoken in Toronto, Canada?",
    }
)
chat_response = chat_completion_request(
    messages, tools=tools, model="gpt-3.5-turbo-1106"
)

assistant_message = chat_response.choices[0].message.tool_calls
assistant_message

[ChatCompletionMessageToolCall(id='call_sUCvibDgovA0JtI4YE6TmGmN', function=Function(arguments='{"location":"Toronto, Canada"}', name='get_local_language'), type='function')]

In [26]:
import json

params = json.loads(assistant_message[0].function.arguments)

In [30]:
# for message in assistant_message:
#     params = json.loads(message.function.arguments)
#     print(f"function: {message.function.name}")

In [27]:
def get_local_language(location: str):
    ...
    # Wikipedia search for languages spoken in location


get_local_language(**params)

'English, French'
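When a model returns several tool calls in one turn, a small registry keyed by function name keeps the dispatch loop short. This is a sketch under stated assumptions: the `SimpleNamespace` objects stand in for the SDK's tool call objects so the example runs offline, and both function bodies are stubs with hard-coded return values, not real lookups.

```python
import json
from types import SimpleNamespace


def get_local_language(location: str) -> str:
    return "English, French"  # stub value copied from the output above


def get_current_weather(location: str, format: str) -> str:
    return f"(stub) weather for {location} in {format}"


# Map the names the model can emit to the Python callables that implement them.
REGISTRY = {
    "get_local_language": get_local_language,
    "get_current_weather": get_current_weather,
}


def fake_tool_call(call_id, name, arguments):
    """Offline stand-in for a tool call object (same attribute names)."""
    return SimpleNamespace(
        id=call_id,
        function=SimpleNamespace(name=name, arguments=json.dumps(arguments)),
    )


def run_tool_calls(tool_calls):
    """Execute each call and return (id, name, result) tuples."""
    results = []
    for call in tool_calls:
        func = REGISTRY.get(call.function.name)
        if func is None:
            results.append((call.id, call.function.name, "unknown function"))
            continue
        params = json.loads(call.function.arguments)
        results.append((call.id, call.function.name, func(**params)))
    return results


calls = [
    fake_tool_call("call_1", "get_local_language", {"location": "Toronto, Canada"}),
    fake_tool_call(
        "call_2",
        "get_current_weather",
        {"location": "Toronto, Canada", "format": "celsius"},
    ),
]
for call_id, name, result in run_tool_calls(calls):
    print(call_id, name, result)
```

Looking functions up in a dictionary, rather than evaluating the name the model returned, means an unexpected name simply yields an "unknown function" result.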

## Next Steps

Check out the other [notebook](https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_for_knowledge_retrieval.ipynb), which demonstrates how to use the Chat Completions API and functions for knowledge retrieval to interact conversationally with a knowledge base.

# Using Function Calling with Pydantic

Pydantic is a library for data validation and settings management based on Python type annotations. It works well with FastAPI, but it can be used in any Python project. We use [instructor](https://jxnl.github.io/instructor) to demonstrate how to use Pydantic with function calling.

As an illustration, we take a complex task, graph construction, but apply it to a simple example so that you can check the model outputs and spot errors yourself.

## Graph Construction

In this guide, we demonstrate how to extract and resolve entities from a sample legal contract. Then, we visualize these entities and their dependencies as an entity graph.

In [36]:
from typing import List

from pydantic import BaseModel, Field


class Property(BaseModel):
    key: str
    value: str
    resolved_absolute_value: str


class Entity(BaseModel):
    id: int = Field(
        ...,
        description="Unique identifier for the entity, used for deduplication, design a scheme that allows multiple entities",
    )
    subquote_string: List[str] = Field(
        ...,
        description="Correctly resolved value of the entity, if the entity is a reference to another entity, this should be the id of the referenced entity, include a few more words before and after the value to allow for some context to be used in the resolution",
    )
    entity_title: str
    properties: List[Property] = Field(
        ..., description="List of properties of the entity"
    )
    dependencies: List[int] = Field(
        ...,
        description="List of entity ids that this entity depends or relies on to resolve it",
    )


class DocumentExtraction(BaseModel):
    entities: List[Entity] = Field(
        ...,
        description="Body of the answer, each fact should be a separate object with a body and a list of sources",
    )

In [35]:
d = DocumentExtraction(
    entities=[
        Entity(
            id=1,
            subquote_string=["Toronto is a city in Canada"],
            entity_title="Toronto",
            properties=[
                Property(
                    key="location",
                    value="Toronto",
                    resolved_absolute_value="Toronto, Canada",
                )
            ],
            dependencies=[2],
        )
    ]
)
print(d.json())

{"entities":[{"id":1,"subquote_string":["Toronto is a city in Canada"],"entity_title":"Toronto","properties":[{"key":"location","value":"Toronto","resolved_absolute_value":"Toronto, Canada"}],"dependencies":[2]}]}
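The hand-built example lists `dependencies=[2]` although no entity with id 2 exists. A quick consistency check over the serialized output can catch such dangling references before a graph is drawn. `dangling_dependencies` is a hypothetical helper written for this sketch, and it works on the plain JSON rather than on the Pydantic objects.

```python
import json

# The JSON printed by the cell above.
raw = '{"entities":[{"id":1,"subquote_string":["Toronto is a city in Canada"],"entity_title":"Toronto","properties":[{"key":"location","value":"Toronto","resolved_absolute_value":"Toronto, Canada"}],"dependencies":[2]}]}'


def dangling_dependencies(document):
    """Return {entity_id: [missing ids]} for dependencies that point nowhere."""
    known = {entity["id"] for entity in document["entities"]}
    missing = {}
    for entity in document["entities"]:
        unknown = [dep for dep in entity["dependencies"] if dep not in known]
        if unknown:
            missing[entity["id"]] = unknown
    return missing


print(dangling_dependencies(json.loads(raw)))  # {1: [2]}
```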


In [32]:
import instructor
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

True

In [37]:
# Apply the patch to the OpenAI client
# enables response_model keyword
client = instructor.patch(OpenAI())


def ask_ai(content: str) -> DocumentExtraction:
    return client.chat.completions.create(
        model="gpt-4",
        response_model=DocumentExtraction,
        messages=[
            {
                "role": "system",
                "content": "Extract and resolve a list of entities from the following document:",
            },
            {
                "role": "user",
                "content": content,
            },
        ],
    )  # type: ignore

In [38]:
from graphviz import Digraph


def generate_html_label(entity: Entity) -> str:
    rows = [
        f"<tr><td>{prop.key}</td><td>{prop.resolved_absolute_value}</td></tr>"
        for prop in entity.properties
    ]
    table_rows = "".join(rows)
    return f"<<table border='0' cellborder='1' cellspacing='0'><tr><td colspan='2'><b>{entity.entity_title}</b></td></tr>{table_rows}</table>>"


def generate_graph(data: DocumentExtraction):
    dot = Digraph(comment="Entity Graph", node_attr={"shape": "plaintext"})

    for entity in data.entities:
        label = generate_html_label(entity)
        dot.node(str(entity.id), label)

    for entity in data.entities:
        for dep_id in entity.dependencies:
            dot.edge(str(entity.id), str(dep_id))

    dot.render("entity.gv", view=True)

In [39]:
content = """
My name is Nirant Kasliwal. I am an AI Engineer and I'm teaching some amazing, perseverant engineers learning about AI
"""

In [41]:
d = ask_ai(content)

'{"entities":[{"id":1,"subquote_string":["My name is Nirant Kasliwal."],"entity_title":"Nirant Kasliwal","properties":[{"key":"name","value":"Nirant Kasliwal","resolved_absolute_value":"Nirant Kasliwal"},{"key":"profession","value":"AI Engineer","resolved_absolute_value":"AI Engineer"}],"dependencies":[]},{"id":2,"subquote_string":["I am an AI Engineer and I\'m teaching some amazing, perseverant engineers learning about AI"],"entity_title":"Engineers learning about AI","properties":[{"key":"profession","value":"Engineer in AI","resolved_absolute_value":"Engineer in AI"},{"key":"description","value":"amazing, perseverant","resolved_absolute_value":"amazing, perseverant"}],"dependencies":[1]}]}'

In [50]:
generate_graph(d)

In [51]:
content = """
Sample Legal Contract
Agreement Contract

This Agreement is made and entered into on 2020-01-01 by and between Company A ("the Client") and Company B ("the Service Provider").

Article 1: Scope of Work

The Service Provider will deliver the software product to the Client 30 days after the agreement date.

Article 2: Payment Terms

The total payment for the service is $50,000.
An initial payment of $10,000 will be made within 7 days of the signed date.
The final payment will be due 45 days after [SignDate].

Article 3: Confidentiality

The parties agree not to disclose any confidential information received from the other party for 3 months after the final payment date.

Article 4: Termination

The contract can be terminated with a 30-day notice, unless there are outstanding obligations that must be fulfilled after the [DeliveryDate].
"""  # Your legal contract here
model = ask_ai(content)

In [52]:
generate_graph(model)

In [12]:
example_content = """
# Technical Document: User Profile Service with Horizontal Scaling

## Overview

The User Profile Service is a microservice designed to manage user profiles in a large-scale web application. It provides functionalities such as creating, updating, retrieving, and deleting user profile information. Given the dynamic nature of user interactions and the potential for high traffic, this service is designed with horizontal scaling capabilities to ensure high availability, resilience, and consistent performance under varying loads.

## Service Description

### Functional Requirements

1. **Create Profile**: Allows creating a new user profile with details like name, email, and preferences.
2. **Update Profile**: Permits modifications to existing profiles.
3. **Retrieve Profile**: Enables fetching profile details for a given user ID.
4. **Delete Profile**: Supports removal of a user profile from the system.

### Non-Functional Requirements

1. **Scalability**: The service must scale horizontally to handle spikes in traffic.
2. **Performance**: Response times should remain consistent under load.
3. **Availability**: Designed for 99.99% uptime.
4. **Security**: Ensure data protection and secure access.

## System Architecture

### Components

1. **API Gateway**: Serves as the entry point for all client requests, routing them to the appropriate service instance.
2. **Service Instances**: Multiple instances of the User Profile Service, each capable of handling requests independently.
3. **Load Balancer**: Distributes incoming requests evenly across service instances to ensure load distribution and high availability.
4. **Database Cluster**: A horizontally scalable database that stores user profile information. It ensures data consistency and high availability through replication and sharding techniques.
5. **Cache Layer**: An in-memory cache (e.g., Redis) to reduce database load and improve response times for frequently accessed data.

### Data Flow

1. A client request hits the API Gateway.
2. The request is forwarded to the Load Balancer.
3. The Load Balancer selects a Service Instance based on load distribution algorithms.
4. The Service Instance processes the request. It may interact with the Cache Layer or the Database Cluster as needed.
5. The response is sent back to the client following the reverse path.

## Horizontal Scaling Strategy

### Service Instances

- **Auto-Scaling**: Service instances are scaled horizontally based on CPU and memory usage metrics. When thresholds are exceeded, new instances are automatically spawned across multiple servers or cloud infrastructure.

### Database Cluster

- **Sharding**: The database is partitioned into shards, each holding a subset of the data. This distributes the load and enables the database to grow horizontally.
- **Replication**: Each shard is replicated across multiple nodes to ensure data availability and fault tolerance.

### Cache Layer

- The cache layer is designed to scale out by adding more nodes, which allows for distributing cache data and handling more concurrent connections.

## Technologies Used

- **API Gateway**: NGINX, Amazon API Gateway
- **Service Implementation**: Node.js, Express.js
- **Load Balancer**: HAProxy, AWS Elastic Load Balancing
- **Database**: MongoDB with sharding, Cassandra
- **Cache**: Redis Cluster
- **Containerization**: Docker
- **Orchestration**: Kubernetes for managing containerized service instances
- **Monitoring and Logging**: Prometheus, Grafana, ELK Stack

## Deployment and Operations

- The service is deployed on a Kubernetes cluster, leveraging its auto-scaling capabilities for service instances.
- Continuous Integration and Continuous Deployment (CI/CD) pipelines are used for automated testing and deployment.
- Comprehensive monitoring and logging mechanisms are in place for real-time performance tracking and anomaly detection.

## Security Considerations

- All data in transit and at rest are encrypted using industry-standard encryption algorithms.
- Access to the service is secured using OAuth 2.0 for authentication and authorization.
- Regular security audits and vulnerability assessments are conducted.

## Conclusion

The User Profile Service is designed to be a robust, scalable, and secure microservice that can handle high loads and ensure consistent performance. By leveraging horizontal scaling strategies, modern technologies, and best practices in system design, this service can adapt to the growing demands of a large-scale web application.
"""

In [13]:
technical_model = ask_ai(example_content)

In [14]:
generate_graph(technical_model)

## Marvin's AI Functions

In [70]:
load_dotenv()


class ReviewAspects(BaseModel):
    product_name: str = Field(
        ...,
        description="Name or type of the product",
    )
    product_sentiment: float = Field(
        ...,
        description="Sentiment for the product",
    )
    service: float = Field(
        ...,
        description="Sentiment for the service",
    )

In [56]:
import os
marvin.settings.openai.api_key = os.getenv("OPENAI_API_KEY")

In [71]:
@marvin.fn(model_kwargs={"model": "gpt-3.5-turbo", "temperature": 0})
def find_review_aspects(text: str) -> ReviewAspects:
    ...
    

find_review_aspects("I love this new microwave, but the service quality is quite poor!")

ReviewAspects(product_name='microwave', product_sentiment=0.8, service=0.2)

## Next steps

The entity model for the technical diagram can be made richer, with a focus on tools, function calls/data flow, storage and other components. This can be used to generate a technical diagram from a textual description. We encourage you to attempt this task as a challenge.

## Domain Specific Language (DSL)

This is a higher-level approach that operates at the prompt and LLM level and allows the user to specify the desired output format. Some popular examples of this approach are:

- Microsoft's Guidance (https://github.com/guidance-ai/guidance)
- Outlines (https://github.com/outlines-dev/outlines)

## Model Finetuning

1. Task-Specific Adaptation: Fine-tuning allows the model to adapt to specific tasks or domains, enhancing its performance on targeted applications.

2. Efficiency in Learning: Since the model is already pre-trained on a large dataset, fine-tuning requires less data and time to specialize the model for a new task.

3. Improved Accuracy: Fine-tuned models often show improved accuracy and understanding in generating responses tailored to specific schemas or functions.

### Limitations

1. Resource Intensive: Fine-tuning requires additional computational resources and data, which can be a limitation for smaller organizations or individual developers.

2. Limited Flexibility: Adapting the model to new input or output schemas can necessitate retraining, making it less flexible to rapid changes or diverse requirements.

3. Provider Specific: Fine-tuned models are often specific to the provider's ecosystem, reducing portability and limiting their application across different platforms or LLMs.