# Native Integration of Function Calling with Open-Source Language Models





In this demo, we explore how to give a language model the ability to call external functions, such as fetching the current weather or making a reservation, directly within its generated output. This capability is provided by the `ToolCallSampler` class.

We will begin by setting up our environment and installing necessary packages.


In [None]:
%%capture
!git clone https://github.com/unaidedelf8777/function-sampler.git && cd function-sampler && pip install .
!pip install bitsandbytes accelerate

## Defining External Functions

Before we can enhance our language model with external function calls, we need to define the functions that can be called. These functions are specified as a list of dictionaries, each containing the function name, a description, the parameters, and any required arguments. They follow the legacy OpenAI function calling format, with additional support for the `format`, `maxLength`, and `minLength` fields of the [json-schema spec](<https://json-schema.org/learn/getting-started-step-by-step>).


In [None]:
# Function definitions in the legacy OpenAI function calling format.
s = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The temperature unit to use. Infer this from the user's location.",
                },
            },
            "required": ["location", "format"],
        },
    },
    {
        "name": "get_reservation",
        "description": "Retrieve a reservation at a restaurant",
        "parameters": {
            "type": "object",
            "properties": {
                "restaurant_name": {
                    "type": "string",
                    "description": "The name of the restaurant for which the reservation is made",
                },
                "reservation_date": {
                    "type": "string",
                    "format": "date",
                    "description": "The date of the reservation in YYYY-MM-DD format",
                },
                "reservation_time": {
                    "type": "string",
                    "format": "time",
                    "description": "The time of the reservation in HH:MM format",
                },
                "party_size": {
                    "type": "integer",
                    "description": "The number of people included in the reservation",
                },
                "contact_number": {
                    "type": "integer",
                    "description": "The contact phone number for the reservation confirmation",
                },
            },
            "required": ["restaurant_name", "reservation_time"],
        },
    },
]
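
Because a typo in a schema (for example, a `required` entry that is not declared under `properties`) is easy to miss, it can help to sanity-check each definition before handing it to the sampler. The helper below is a minimal sketch and is not part of `function-sampler`; the name `check_function_schema` and the specific checks are assumptions made for illustration.

```python
def check_function_schema(fn):
    """Return a list of human-readable problems found in one function definition.

    Hypothetical helper for illustration; it only checks the basic shape
    described above, not full JSON Schema validity.
    """
    problems = []
    for key in ("name", "description", "parameters"):
        if key not in fn:
            problems.append(f"missing top-level key: {key!r}")

    params = fn.get("parameters", {})
    if params.get("type") != "object":
        problems.append("parameters.type should be 'object'")

    properties = params.get("properties", {})
    # Every required argument must be declared under properties.
    for required_name in params.get("required", []):
        if required_name not in properties:
            problems.append(
                f"required argument {required_name!r} is not declared in properties"
            )
    # Every declared property should state its type.
    for prop_name, prop in properties.items():
        if "type" not in prop:
            problems.append(f"property {prop_name!r} has no type")
    return problems


# A trimmed copy of the weather function, used only to exercise the helper.
weather_fn = {
    "name": "get_current_weather",
    "description": "Get the current weather",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location", "format"],
    },
}

print(check_function_schema(weather_fn))
```

In the notebook, the same check could be run over the full list with `[check_function_schema(fn) for fn in s]`.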


## Setting Up the Language Model

To demonstrate function calls within text generation, we'll use a pre-trained causal language model from Hugging Face's `transformers` library. Along with the model, we also load its associated tokenizer, which will be used for encoding inputs and decoding outputs.

We also pass the tokenizer to the `ToolCallSampler` class, which uses the model's tokenizer to construct a finite-state machine (**FSM**) for each function schema we defined.
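
To give some intuition for FSM-guided decoding, here is a toy sketch of the general idea: at each step, only tokens that have a transition out of the current state keep their scores, and everything else is masked to negative infinity. This is not how `ToolCallSampler` is implemented internally; the tiny vocabulary, the transition table, and the stand-in `biased_model` function are all assumptions made for illustration.

```python
import math

# Toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"format"', ":", '"celsius"', '"fahrenheit"', "hello"]

# state -> {allowed token: next state}; accepts {"format":"celsius"} or {"format":"fahrenheit"}
TRANSITIONS = {
    0: {"{": 1},
    1: {'"format"': 2},
    2: {":": 3},
    3: {'"celsius"': 4, '"fahrenheit"': 4},
    4: {"}": 5},
}
FINAL_STATE = 5


def mask_logits(logits, state):
    """Keep scores for tokens the FSM allows in `state`; set the rest to -inf."""
    allowed = TRANSITIONS[state]
    return [
        score if token in allowed else -math.inf
        for token, score in zip(VOCAB, logits)
    ]


def constrained_greedy_decode(logits_fn):
    """Greedy decoding where every step is filtered through the FSM."""
    state, output = 0, []
    while state != FINAL_STATE:
        masked = mask_logits(logits_fn(output), state)
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        token = VOCAB[best]
        output.append(token)
        state = TRANSITIONS[state][token]
    return "".join(output)


def biased_model(prefix):
    """Stand-in for a model: it always scores the off-schema token "hello" highest."""
    return [0.1, 0.1, 0.2, 0.1, 0.5, 0.9, 5.0]


result = constrained_greedy_decode(biased_model)
print(result)
```

Even though the stand-in model always prefers `hello`, the masked decode can only follow the transitions in the table, so the output is forced to match the schema-shaped pattern.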


In [None]:
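from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model with 4-bit quantization (requires bitsandbytes and a CUDA GPU).
# Newer transformers releases prefer quantization_config=BitsAndBytesConfig(load_in_4bit=True).
model = AutoModelForCausalLM.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B", load_in_4bit=True)

# The tokenizer must match the model; it is also passed to the sampler later.
tokenizer = AutoTokenizer.from_pretrained("teknium/OpenHermes-2.5-Mistral-7B")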

In [None]:
m = "<|im_start|>user\nmake me a reservation at magianos for 6 pm. call functions by responding with the word:'<function>' \n<|im_end|>\n<|im_start|>assistant\n "
tokens = tokenizer.encode(m, return_tensors='pt')
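
The prompt above is written by hand in the ChatML layout (`<|im_start|>role ... <|im_end|>`). As a minimal sketch, the same layout can be assembled from a list of messages; `build_chatml_prompt` is a hypothetical helper, not part of `transformers` or `function-sampler`, and its whitespace differs slightly from the hand-written string above.

```python
def build_chatml_prompt(messages, add_generation_prompt=True):
    """Assemble a ChatML-style prompt from a list of {"role", "content"} dicts."""
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_chatml_prompt(
    [{"role": "user", "content": "make me a reservation for 6 pm."}]
)
print(prompt)
```

If the tokenizer ships a chat template, `tokenizer.apply_chat_template` from `transformers` may be a more robust way to produce this string.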

## Configuring the ToolCallSampler

The `ToolCallSampler` class is at the core of our function calling mechanism. It requires a configuration specifying the vocabulary size of the tokenizer, among other settings. Once configured, it will intercept and process function call patterns during text generation.


In [None]:
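import time

from transformers import LogitsProcessorList, TextStreamer
from function_sampler import ToolCallSamplerConfig, ToolCallSampler

# len(tokenizer) counts the full vocabulary, including any added tokens.
config = ToolCallSamplerConfig(vocab_size=len(tokenizer))
sampler = ToolCallSampler(tokenizer=tokenizer, functions=s, config=config)

# Print decoded text to stdout as tokens are generated.
streamer = TextStreamer(tokenizer)

start = time.time()
x = model.generate(
    tokens.to(model.device),  # move the prompt to the same device as the model
    max_new_tokens=800,
    # Registering the sampler as a logits processor lets it adjust scores at every step.
    logits_processor=LogitsProcessorList([sampler]),
    do_sample=True,
    streamer=streamer,
)
taken = time.time() - start
taken  # wall-clock generation time in seconds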
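
The elapsed time is easier to compare across runs when expressed as throughput. The sketch below assumes, as is the case for decoder-only models in `transformers`, that the tensor returned by `generate` contains the prompt tokens followed by the new tokens; `tokens_per_second` is a hypothetical helper and the numbers in the example call are made up.

```python
def tokens_per_second(prompt_length, total_length, seconds):
    """Throughput over newly generated tokens only (the prompt is excluded)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    new_tokens = total_length - prompt_length
    return new_tokens / seconds


# Illustrative numbers only: 40 prompt tokens, 140 tokens in total, 5 seconds.
rate = tokens_per_second(prompt_length=40, total_length=140, seconds=5.0)
print(f"{rate:.1f} tokens/s")
```

In the notebook, the call would be `tokens_per_second(tokens.shape[-1], x.shape[-1], taken)`. The measured time also includes streaming output, so it is only a rough figure.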