# Function Calling from First Principles
In this notebook, I'll work through an example that shows how to do function calling with LLMs. Function calling is a design pattern that takes advantage of the LLM's built-in knowledge of JSON schema to give it the ability to interact with its environment.

These interactions allow the model to pursue goals with a greater scope of influence than just writing words. Goal-seeking LLMs have potential across a wide range of use cases because the problem-solving approach is open-ended.

For this example, I have limited the LLM's scope to one tool in order to keep the focus on understanding what happens when an LLM actually uses external tools.



In [1]:
import json
import os
import sys
import boto3
from utils import bedrock, print_ww
from utils.utils import extract_model_info

## Setting up Bedrock API.

We'll use Amazon Bedrock models, specifically Claude v2, to test this design pattern.

In [2]:
boto3_bedrock = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", "us-east-1"),
    runtime=False
)

Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock(https://bedrock.us-east-1.amazonaws.com)


Let's take a look at which models we have access to.

In [3]:
all_models = boto3_bedrock.list_foundation_models()
print(f"extract_model_info(all_models): {extract_model_info(all_models)}")

extract_model_info(all_models): [('Titan Text Large', 'amazon.titan-tg1-large'), ('Titan Text Embeddings', 'amazon.titan-e1t-medium'), ('Titan Text Embeddings v2', 'amazon.titan-embed-g1-text-02'), ('Titan Text G1 - Express', 'amazon.titan-text-express-v1'), ('Titan Embeddings G1 - Text', 'amazon.titan-embed-text-v1'), ('Stable Diffusion XL', 'stability.stable-diffusion-xl'), ('Stable Diffusion XL', 'stability.stable-diffusion-xl-v0'), ('J2 Grande Instruct', 'ai21.j2-grande-instruct'), ('J2 Jumbo Instruct', 'ai21.j2-jumbo-instruct'), ('Jurassic-2 Mid', 'ai21.j2-mid'), ('Jurassic-2 Mid', 'ai21.j2-mid-v1'), ('Jurassic-2 Ultra', 'ai21.j2-ultra'), ('Jurassic-2 Ultra', 'ai21.j2-ultra-v1'), ('Claude Instant', 'anthropic.claude-instant-v1'), ('Claude', 'anthropic.claude-v1'), ('Claude', 'anthropic.claude-v2'), ('Command', 'cohere.command-text-v14')]


In [4]:
bedrock_runtime = bedrock.get_bedrock_client(
    assumed_role=os.environ.get("BEDROCK_ASSUME_ROLE", None),
    region=os.environ.get("AWS_DEFAULT_REGION", "us-east-1")
)

Create new client
  Using region: us-east-1
boto3 Bedrock client successfully created!
bedrock-runtime(https://bedrock-runtime.us-east-1.amazonaws.com)


Let's see what the model knows about JSON schema.  If you get a coherent and relatively accurate answer here, it's a good sign that this technique will likely work with your model of choice.

In [5]:
prompt_data = """
Human: Tell me everything you know about JSON schema.

Assistant:"""

In [6]:
body = json.dumps({"prompt": prompt_data, "max_tokens_to_sample": 500})
modelId = "anthropic.claude-v2" 
accept = "application/json"
contentType = "application/json"

In [7]:
response = bedrock_runtime.invoke_model(
    body=body, modelId=modelId, accept=accept, contentType=contentType
)
response_body = json.loads(response.get("body").read())

print(response_body.get("completion"))

 Here is a summary of what I know about JSON schema:

- JSON schema is a specification for defining the structure and validation criteria for JSON data. It provides a declarative way to specify requirements and validate JSON documents against those requirements.

- The main components of a JSON schema are:

    - Definitions - Used to define commonly reused JSON objects
    - Properties - Used to define the structure of a JSON object
    - Required properties - Properties that must exist in an object
    - Data types - Used to define the data type of a property (string, number, boolean, etc)
    - Format validation - Used for further validation of data types (email, date, etc) 
    - Enumerations - Restricts a value to a fixed set of values
    - Conditionals - Specify requirements depending on another property value
    - Array validations - Validates contents of arrays

- JSON schema allows for validation keywords like "minimum", "maximum" for numbers, "minLength", "maxLength" for st
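
The request and response handling above is repeated for every call in this notebook. A small helper could wrap it. This is only a sketch: `complete` and `FakeRuntime` are hypothetical names that are not part of the notebook's `utils` package, and the fake client exists only so the snippet runs without AWS credentials.

```python
import io
import json


def complete(client, prompt, model_id="anthropic.claude-v2", max_tokens=500):
    """Send a prompt through invoke_model and return the completion text."""
    body = json.dumps({"prompt": prompt, "max_tokens_to_sample": max_tokens})
    response = client.invoke_model(
        body=body,
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())["completion"]


class FakeRuntime:
    """Hypothetical stand-in for the Bedrock runtime client (offline use only)."""

    def invoke_model(self, body, modelId, accept, contentType):
        # Echo the prompt back in the same response shape the notebook reads.
        prompt = json.loads(body)["prompt"]
        payload = json.dumps({"completion": f" echo: {prompt.strip()}"})
        return {"body": io.BytesIO(payload.encode("utf-8"))}


print(complete(FakeRuntime(), "\n\nHuman: Hello\n\nAssistant:"))
```

With a real client you would pass `bedrock_runtime` instead of `FakeRuntime()`.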

## An Example - How's the weather?
In this example, I show how to use JSON-schema-based function definitions to generate function arguments. You'll see a JSON schema specification for a Python function called `get_current_weather`. This is a hypothetical function that retrieves formatted weather information based on a location.

The inspiration for this example is here:
https://github.com/openai/openai-cookbook/blob/main/examples/How_to_call_functions_with_chat_models.ipynb

In [8]:
prompt_data = """
Human: Your role is to form function arguments based on the specification provided in 'FUNCTIONS'.  Users will ask you a question about the weather, and you'll attempt to form the function call that we can pass along to a tool to determine the response. Do not try to answer questions about the weather directly; use the provided tools.  The tools are specified using a JSON schema format, which defines how the arguments to the function are to be formed.

FUNCTIONS = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use. Infer this from the users location.",
                },
            },
            "required": ["location", "format"],
        },
    }
]

Examples
###
Human: 
What's the weather like in Glasgow Scotland?
Assistant:
{
    "function_call": {
        "name": "get_current_weather",
        "arguments": {
            "location": "Glasgow, Scotland",
            "format": "celsius"
        }
    }
}

H: 
What's the weather like in Raleigh, NC?
Assistant:
{
    "function_call": {
        "name": "get_current_weather",
        "arguments": {
            "location": "Raleigh, North Carolina",
            "format": "fahrenheit"
        }
    }
}
###

Human:
What's the weather like in Washington, DC?

Assistant:
"""

In [9]:
body = json.dumps({"prompt": prompt_data, "max_tokens_to_sample": 500})
modelId = "anthropic.claude-v2" 
accept = "application/json"
contentType = "application/json"

In [10]:
response = bedrock_runtime.invoke_model(
    body=body, modelId=modelId, accept=accept, contentType=contentType
)
response_body = json.loads(response.get("body").read())

print(response_body.get("completion"))

 {
  "function_call": {
    "name": "get_current_weather", 
    "arguments": {
      "location": "Washington, DC",
      "format": "fahrenheit"
    }
  }
}
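
The completion above is only text until something parses it and calls a real function. The following sketch shows one way to do that for the weather example. It assumes the completion is pure JSON, uses a hand-rolled check that covers only the `required`, `type: string`, and `enum` keywords from the schema above, and calls a hypothetical stub in place of a real weather service. A full validator such as the `jsonschema` package would be more thorough.

```python
import json

# Schema fragment copied from the FUNCTIONS definition in the prompt above.
PARAMETERS = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location", "format"],
}


def check_arguments(arguments, schema):
    """Return a list of problems; an empty list means the arguments look usable."""
    problems = []
    for name in schema["required"]:
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif spec["type"] == "string" and not isinstance(value, str):
            problems.append(f"{name} should be a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
    return problems


def get_current_weather(location, format):
    """Hypothetical stub; a real tool would query a weather service here."""
    return f"(stub) weather for {location} in {format}"


completion = """ {
  "function_call": {
    "name": "get_current_weather",
    "arguments": {"location": "Washington, DC", "format": "fahrenheit"}
  }
}"""

call = json.loads(completion)["function_call"]
problems = check_arguments(call["arguments"], PARAMETERS)
if problems:
    print(problems)
else:
    print(get_current_weather(**call["arguments"]))
```

Checking the arguments before dispatch matters because the model's output is a best guess at the schema and is not guaranteed to conform.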


# Main Event - City Dogs and Function Calling
We'll consider two examples. The first uses a linear regression tool to show how a function call works in detail.

For the second example, I'll show a goal-oriented LLM in action.

## 1 - The Linear Regression Tool:
To demonstrate function calling, we'll consider a simple example in which the language model has been given one tool, a linear regression model. The tool uses the Python SciPy library to fit linear models.

The LLM will pursue one goal: we're telling it to look for a positive linear trend between pairs of data columns. The LLM does this by guessing and checking column pairs using the tool. It passes the data into the linear regression tool, and the results are sent back to the LLM as part of the user query. It then analyzes the tool response, and if it fails to find a positive trend, it makes another attempt with new columns. It continues this approach until it finds a positive trend or the maximum number of iterations has been reached.
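
The guess-and-check loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the notebook's framework: `goal_loop`, `make_scripted_model`, and `slope_only` are hypothetical names, the "model" just replays canned completions, completions are assumed to be pure JSON, and the follow-up prompt wording is invented.

```python
import json


def goal_loop(ask_model, tools, prompt, max_iterations=5):
    """Request a function call, run it, and retry until the slope is positive."""
    for attempt in range(1, max_iterations + 1):
        call = json.loads(ask_model(prompt))["function_call"]
        result = tools[call["name"]](**call["parameters"])
        if result["slope"] > 0:  # goal reached: a positive linear trend
            return attempt, result
        # Feed the tool result back as the next user turn and ask again.
        prompt += (
            f" {json.dumps(call)}\n\nHuman: The tool returned {result}. "
            "Try a different pair of columns.\n\nAssistant:"
        )
    return max_iterations, None


def make_scripted_model(responses):
    """Hypothetical fake LLM that replays canned completions in order."""
    replies = iter(responses)
    return lambda prompt: next(replies)


def slope_only(x, y):
    """Hypothetical stand-in tool: least-squares slope only."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return {"slope": sxy / sxx}


def as_call(x, y):
    """Format a canned completion in the function-call shape used in this notebook."""
    return json.dumps(
        {"function_call": {"name": "run_linear_trend_model", "parameters": {"x": x, "y": y}}}
    )


# First guess: distance vs. revenue (falls). Second guess: distance vs. price (rises).
model = make_scripted_model([
    as_call([0.5, 1.5, 3], [15000, 12500, 10000]),
    as_call([0.5, 1.5, 3], [5, 5.5, 6]),
])
attempts, result = goal_loop(model, {"run_linear_trend_model": slope_only}, "Human: ...\n\nAssistant:")
print(attempts, result)
```

The termination test and the retry message are the two places where the goal is encoded, so a different goal would mean changing only those lines.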

In [11]:
from prompt_template import PromptTemplate, load_text_file
from function_parser import FunctionParser, extract_json_from_text
from linear_trend_model import run_linear_trend_model, FUNCTIONS

In [12]:
#: Here's our prompt that we'll use to drive the language model
linear_trend_prompt = load_text_file('linear_trend_request.prompt')

#: Here's a mock article excerpt that I generated to showcase this technology.
city_dogs = load_text_file('city_dogs.txt')


In [13]:
#: Establish the Prompt template from the text file.
linear_trend_prompt_template = PromptTemplate(linear_trend_prompt)

#: Fill the prompt using our function definitions and the article text.
linear_trend_filled_prompt = linear_trend_prompt_template.fill(
    FUNCTIONS=str(FUNCTIONS), 
    user_data=city_dogs
)
print(f"linear_trend_filled_prompt: {linear_trend_filled_prompt}")

linear_trend_filled_prompt: Human:

You will play the role of an expert Data Scientist assistant.  You love linear models, and when you get linear data you make a function call to fit a model to the data. We'll use triple hash ### to open and close subsections of prompt examples.

We're going to define your character and behaviour in terms of the red, yellow, blue, and green color categories that relate to personality types.

The color system I'm referring to - it's a way of categorizing personality traits into four colors:

Red personalities are usually seen as bold, expressive, and competitive. They are motivated by power, stimulation, and winning.
Yellow personalities are sociable, outgoing, and enthusiastic. They tend to be persuasive, verbal, and eager to collaborate.
Green personalities are calm, accommodating, and caring. They value stability, compassion, and building community.
Blue personalities are logical, detail-oriented, and focused. They are quantitative, deliberate, and 

In [14]:
body = json.dumps({"prompt": linear_trend_filled_prompt, "max_tokens_to_sample": 500})
modelId = "anthropic.claude-v2" 
accept = "application/json"
contentType = "application/json"

In [15]:
response = bedrock_runtime.invoke_model(
    body=body, modelId=modelId, accept=accept, contentType=contentType
)
response_body = json.loads(response.get("body").read())

completion = response_body.get("completion")

print(f"completion:\n {completion}")

completion:
  Here is the function call with the data from the prompt formatted as the linear regression parameters:

```json
{
  "function_call": {
    "name": "run_linear_trend_model",
    "parameters": {
      "x": [0.5, 1.5, 3, 4.2, 6],
      "y": [15000, 12500, 10000, 7500, 6000]
    }
  }
}
```


Here we can clearly see that the LLM wants to call a function.

The function parser will now check the completion for function calls, and run them if any are found.
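
To make the parsing step concrete, here is a minimal sketch of what such a parser might do. It is not the implementation in `function_parser`: `extract_call`, `parse_and_run`, and `describe` are hypothetical names. The approach is to scan the completion for the first JSON object that carries a `function_call` key, look the name up in a registry, and call it with the supplied parameters.

```python
import json
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block


def extract_call(text):
    """Return the first embedded JSON object with a 'function_call' key, else None."""
    decoder = json.JSONDecoder()
    for match in re.finditer(r"\{", text):
        try:
            obj, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue  # this brace did not start a valid JSON object
        if isinstance(obj, dict) and "function_call" in obj:
            return obj["function_call"]
    return None


def parse_and_run(text, registry):
    """Run the requested function if the text contains a call to a registered name."""
    call = extract_call(text)
    if call is None:
        return None
    func = registry.get(call["name"])
    if func is None:
        raise KeyError(f"model requested an unregistered function: {call['name']}")
    return func(**call.get("parameters", {}))


def describe(x, y):
    """Hypothetical stand-in tool that just reports what it received."""
    return {"n_points": len(x)}


completion = (
    " Here is the function call:\n\n" + FENCE + "json\n"
    '{"function_call": {"name": "run_linear_trend_model", '
    '"parameters": {"x": [0.5, 1.5, 3, 4.2, 6], "y": [15000, 12500, 10000, 7500, 6000]}}}\n'
    + FENCE
)
print(parse_and_run(completion, {"run_linear_trend_model": describe}))  # {'n_points': 5}
```

Keeping an explicit registry means the model can only trigger functions you have chosen to expose, whatever name it writes in the completion.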

In [16]:
parser = FunctionParser({'run_linear_trend_model': run_linear_trend_model})


This completion should have a function call inside. Let's see what happens when we parse it.

In [17]:
print(completion)

 Here is the function call with the data from the prompt formatted as the linear regression parameters:

```json
{
  "function_call": {
    "name": "run_linear_trend_model",
    "parameters": {
      "x": [0.5, 1.5, 3, 4.2, 6],
      "y": [15000, 12500, 10000, 7500, 6000]
    }
  }
}
```


In [18]:
#: This method will parse and execute any valid function calls contained within the completion.
parser.parse_and_execute(completion)

Calling run_linear_trend_model
x :[0.5, 1.5, 3, 4.2, 6]
y :[15000, 12500, 10000, 7500, 6000]
fit linear model in 0.0007519721984863281 seconds.


"{'slope': -1652.7572364251002, 'intercept': 15224.381998732304, 'r_value': -0.9850169656708015, 'p_value': 0.0021966205939785346, 'std_err': 167.06548200469157}"

You may get slightly different results every time you run this code, but it should always produce a function call to the linear model.
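
To see where the numbers in the tool output above come from, here is a dependency-free least-squares sketch. The keys in that output match the fields returned by `scipy.stats.linregress`, so the tool presumably wraps that call, but that is an assumption. The function below is a plain-Python substitute named `fit_linear_trend` (a hypothetical name), and it omits the p-value.

```python
import math


def fit_linear_trend(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3 points)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no linear trend to fit")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_value = sxy / math.sqrt(sxx * syy)
    # Standard error of the slope, from the residual sum of squares.
    sse = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    return {"slope": slope, "intercept": intercept, "r_value": r_value, "std_err": std_err}


result = fit_linear_trend([0.5, 1.5, 3, 4.2, 6], [15000, 12500, 10000, 7500, 6000])
print(result)
```

Run on the same `x` and `y` the model chose, this should agree with the slope, intercept, r-value, and standard error shown in the tool output, up to floating-point rounding.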

## How to think about this.

The LLM is asking you to do something and return the value to it.  The LLM can then use the results of the function call to continue problem solving, or return these results directly.  

In this simple example, we're looking for simple linear trends using one tool.  However, it's possible to orchestrate a system of tools, or even a system of LLMs to be able to automate even more novel types of workflows.

The available literature and code is full of more sophisticated examples.  Here, we are peeling back the layers of the onion, so we can start to see what's really going on in agentic function calls.


## 2 - Goal-Seeking LLM System
In this example, we'll use the elements of function calling that have been presented so far to show how an LLM can pursue a goal using tools. 

In [19]:
from llm_framework import LLMFramework

Let's start with the data.  Here's some mock sales KPI data generated for our hypothetical hot dog stand.

It uses a pipe-delimited format, but this is not a hard requirement.

In [29]:
city_dogs_stats = load_text_file('city_dogs_stats.txt')

In [21]:
print(city_dogs_stats)


| Location                       | Distance from Center (miles)| Street               | Monthly Revenue ($)| Avg Price per Hot Dog ($)    | Monthly Customers Count | Top Selling Side Item | Employee Count | Customer Satisfaction (%) | Average Wait Time (minutes) |
| Broadway Avenue, Manhattan     | 0.5                         | Broadway Avenue      | 15,000             | 5                            | 3,000                   | Fries                 | 10             | 80                        | 15                          |
| West Village, Manhattan        | 1.5                         | Bleecker Street      | 12,500             | 5.5                          | 2,273                   | Soda                  | 8              | 85                        | 10                          |
| Williamsburg, Brooklyn         | 3                           | Bedford Avenue       | 10,000             | 6                            | 1,667                   | Cheese Sticks         | 7             
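
In the notebook the model reads this table straight from the prompt, so no parsing code is needed. If you want to pull the columns out yourself, for example to double-check the `x` and `y` values the model passes to the tool, a sketch like the following would do. `parse_pipe_table` and `to_numbers` are hypothetical helpers, and the sample uses only three columns from the first rows shown above.

```python
def parse_pipe_table(text):
    """Parse a pipe-delimited table into a dict of column name -> list of cell strings."""
    rows = [
        [cell.strip() for cell in line.strip().strip("|").split("|")]
        for line in text.strip().splitlines()
        if line.strip()
    ]
    header, body = rows[0], rows[1:]
    return {name: [row[i] for row in body] for i, name in enumerate(header)}


def to_numbers(cells):
    """Convert strings such as '15,000' or '5.5' to floats."""
    return [float(cell.replace(",", "")) for cell in cells]


sample = """
| Location | Distance from Center (miles) | Monthly Revenue ($) |
| Broadway Avenue, Manhattan | 0.5 | 15,000 |
| West Village, Manhattan | 1.5 | 12,500 |
| Williamsburg, Brooklyn | 3 | 10,000 |
"""

columns = parse_pipe_table(sample)
print(to_numbers(columns["Distance from Center (miles)"]))
print(to_numbers(columns["Monthly Revenue ($)"]))
```

Note the thousands separators: values such as `15,000` have to be normalized before they can be treated as numbers, which the model handled on its own in the earlier function call.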

We'll use a variation of the earlier prompt that directs the LLM to use tools in pursuit of its end goal.

In [22]:
linear_trend_agent_prompt = load_text_file('linear_trend_model_agent.prompt')

Next, we'll create a prompt template using the `linear_trend_model_agent.prompt` text file.

In [23]:
linear_trend_agent_prompt_template = PromptTemplate(linear_trend_agent_prompt)

We fill in the user_data and FUNCTIONS placeholders in our prompt. The user data contains the relevant contextual data that will be analyzed, and FUNCTIONS contains a description of the linear regression tool.

In [24]:
linear_trend_agent_filled_prompt = linear_trend_agent_prompt_template.fill(
    FUNCTIONS=str(FUNCTIONS), #: JSON schema definition of the functions
    user_data=city_dogs_stats
)
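
For readers without the `prompt_template` module, placeholder filling can be approximated as below. This is a guess at the behaviour, not the module's code: `SimplePromptTemplate` is a hypothetical class, and the `{NAME}` marker syntax is an assumption, since the prompt file's actual placeholder format is not shown here.

```python
class SimplePromptTemplate:
    """Hypothetical minimal template: replaces {NAME} markers with supplied values."""

    def __init__(self, text):
        self.text = text

    def fill(self, **values):
        filled = self.text
        for name, value in values.items():
            marker = "{" + name + "}"
            if marker not in filled:
                raise KeyError(f"placeholder {marker} not found in template")
            # Plain replacement avoids str.format, which would trip over JSON braces.
            filled = filled.replace(marker, str(value))
        return filled


template = SimplePromptTemplate("FUNCTIONS = {FUNCTIONS}\n\nData:\n{user_data}")
print(template.fill(FUNCTIONS='[{"name": "run_linear_trend_model"}]', user_data="| a | b |"))
```

One detail worth knowing: calling `str()` on a Python list of dicts produces a single-quoted repr, not strict JSON. `json.dumps` is an alternative if your model turns out to be sensitive to the difference.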

In [25]:
print(linear_trend_agent_filled_prompt)

Human:

You will play the role of an expert Data Scientist assistant.  You love linear models, and when you get linear data you make a function call to fit a model to the data. We'll use triple hash ### to open and close subsections of prompt examples.

We're going to define your character and behaviour in terms of the red, yellow, blue, and green color categories that relate to personality types.

The system I'm referring to - it's a way of categorizing personality traits into four colors:

Red personalities are usually seen as bold, expressive, and competitive. They are motivated by power, stimulation, and winning.
Yellow personalities are sociable, outgoing, and enthusiastic. They tend to be persuasive, verbal, and eager to collaborate.
Green personalities are calm, accommodating, and caring. They value stability, compassion, and building community.
Blue personalities are logical, detail-oriented, and focused. They are quantitative, deliberate, and good at reasoning analytically.

Y

Now, we'll set up our goal-pursuing framework.  For this part, we'll pass the actual Python functions into the framework.

In [26]:
#: The framework needs the actual Python function in addition to the JSON spec given to the LLM.
functions = {"run_linear_trend_model": run_linear_trend_model}

#: Initialize our framework
llm_framework = LLMFramework(bedrock_runtime, linear_trend_agent_filled_prompt, functions)


In [27]:
#: Now we're all set to run the framework.
result = llm_framework.run()

iterations: 0
body: {"prompt": "Human:\n\nYou will play the role of an expert Data Scientist assistant.  You love linear models, and when you get linear data you make a function call to fit a model to the data. We'll use triple hash ### to open and close subsections of prompt examples.\n\nWe're going to define your character and behaviour in terms of the red, yellow, blue, and green color categories that relate to personality types.\n\nThe system I'm referring to - it's a way of categorizing personality traits into four colors:\n\nRed personalities are usually seen as bold, expressive, and competitive. They are motivated by power, stimulation, and winning.\nYellow personalities are sociable, outgoing, and enthusiastic. They tend to be persuasive, verbal, and eager to collaborate.\nGreen personalities are calm, accommodating, and caring. They value stability, compassion, and building community.\nBlue personalities are logical, detail-oriented, and focused. They are quantitative, deliber

In [28]:
print(f"RESULT: {result}")


RESULT: {'slope': 4.847823012570029, 'intercept': 985.2580177068903, 'r_value': 0.9805780978795232, 'p_value': 0.003239680784467666, 'std_err': 0.5598166441865517}


## Results
If you run the LLM framework multiple times, you will get different trajectories.  There are a few different positive trends in the data, which means there are also different terminating cases.  In my experience it takes about 3-5 iterations for the LLM to find the desired trend, although it will occasionally find one on the first try.

This simple example is ripe for extension.  With the tool provided, it would be simple to extend the prompt and goal loop to qualify the data trends more holistically.  Perhaps we could add new types of models to this framework.
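
As one illustration of qualifying a trend more holistically, the goal check could look at more than the sign of the slope. The sketch below is hypothetical: `qualifies_as_positive_trend` is an invented name, and the thresholds for `r_value` and `p_value` are arbitrary assumptions that you would tune for your own data.

```python
def qualifies_as_positive_trend(result, min_r=0.9, max_p=0.05):
    """Return True when the fit is positive, strong, and statistically significant."""
    return (
        result["slope"] > 0
        and result["r_value"] >= min_r
        and result["p_value"] <= max_p
    )


# The two tool results printed earlier in this notebook (values rounded).
first_attempt = {"slope": -1652.757, "r_value": -0.985, "p_value": 0.0022}
final_result = {"slope": 4.848, "r_value": 0.981, "p_value": 0.0032}

print(qualifies_as_positive_trend(first_attempt))
print(qualifies_as_positive_trend(final_result))
```

A stricter check like this would make the loop reject weak or noisy positive slopes that a sign-only test would accept.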

Additionally, adding more tools would allow for an even greater scope of possible goals for the LLM to pursue.

Please comment and let me know if this inspired you to try any new ideas!

