# Function Calling

Function calling allows Mistral models to connect to external tools. By integrating Mistral models with external tools such as user-defined functions or APIs, users can easily build applications for specific use cases and practical problems. In this guide, for instance, we write two functions for tracking payment status and payment date. We can use these two tools to answer payment-related queries.

At a glance, there are four steps with function calling:

- User: specify tools and query
- Model: Generate function arguments if applicable
- User: Execute function to obtain tool results
- Model: Generate final answer

In this guide, we will walk through a simple example to demonstrate how function calling works with Mistral models in these four steps.

Before we get started, let’s assume we have a dataframe of payment transactions. When users ask questions about this data, the model can rely on certain tools to answer them. The dataframe simply emulates an external database that the LLM cannot access directly.

In [1]:
import pandas as pd
import json

# Assuming we have the following data
data = {
    'transaction_id': ['T1001', 'T1002', 'T1003', 'T1004', 'T1005'],
    'customer_id': ['C001', 'C002', 'C003', 'C002', 'C001'],
    'payment_amount': [125.50, 89.99, 120.00, 54.30, 210.20],
    'payment_date': ['2021-10-05', '2021-10-06', '2021-10-07', '2021-10-05', '2021-10-08'],
    'payment_status': ['Paid', 'Unpaid', 'Paid', 'Paid', 'Pending']
}

# Create DataFrame
df = pd.DataFrame(data)

## Step 1. User: specify tools and query

### Tools

Users can define all the necessary tools for their use cases.

- In many cases, we might have multiple tools at our disposal. For example, suppose we have two functions as our two tools: `retrieve_payment_status` and `retrieve_payment_date`, which retrieve the payment status and the payment date for a given transaction ID.

In [2]:
import json
import pandas as pd

# `df` is the pandas DataFrame created above
def retrieve_payment_status(df: pd.DataFrame, transaction_id: str) -> str:
    """
    Retrieves the payment status for a given transaction ID.

    Args:
        df: The DataFrame containing payment data. It must have columns:
                           'transaction_id', 'payment_status', and other payment-related information.
        transaction_id: The unique identifier for the transaction.

    Returns:
        str: A JSON-formatted string containing the payment status if the transaction ID is found.
             If not found, returns an error message.
    """
    if transaction_id in df.transaction_id.values:
        return json.dumps({'status': df[df.transaction_id == transaction_id].payment_status.item()})
    return json.dumps({'error': 'transaction id not found.'})

def retrieve_payment_date(df: pd.DataFrame, transaction_id: str) -> str:
    """
    Retrieves the payment date for a given transaction ID.

    Args:
        df: The DataFrame containing payment data. It must have columns:
                           'transaction_id', 'payment_date', and other payment-related information.
        transaction_id: The unique identifier for the transaction.

    Returns:
        str: A JSON-formatted string containing the payment date if the transaction ID is found.
             If not found, returns an error message.
    """
    if transaction_id in df.transaction_id.values:
        return json.dumps({'date': df[df.transaction_id == transaction_id].payment_date.item()})
    return json.dumps({'error': 'transaction id not found.'})


- For Mistral models to understand the functions, we need to describe each one with a JSON schema. Specifically, we need to give the type, function name, function description, function parameters, and the required parameters. Since we have two functions here, let’s put two function specifications in a list.

In [3]:
tools = [
    {
        "type": "function",
        "function": {
            "name": "retrieve_payment_status",
            "description": "Get payment status of a transaction",
            "parameters": {
                "type": "object",
                "properties": {
                    "transaction_id": {
                        "type": "string",
                        "description": "The transaction id.",
                    }
                },
                "required": ["transaction_id"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "retrieve_payment_date",
            "description": "Get payment date of a transaction",
            "parameters": {
                "type": "object",
                "properties": {
                    "transaction_id": {
                        "type": "string",
                        "description": "The transaction id.",
                    }
                },
                "required": ["transaction_id"],
            },
        },
    }
]
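The two specifications above are written by hand. As a rough sketch, the same structure could be derived from a function signature. `build_tool_spec` is a hypothetical helper, not part of any Mistral library, the type map is a deliberate simplification, and `df` is skipped on the assumption that it is supplied by our own code rather than by the model.

```python
import inspect

# Simplified mapping from Python annotations to JSON schema types (assumption:
# anything not listed is treated as a string).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def build_tool_spec(func, description, param_descriptions, skip=()):
    """Hypothetical helper: build a tool specification dict from a function signature."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        if name in skip:
            continue  # parameters we bind ourselves, e.g. the dataframe
        properties[name] = {
            "type": _JSON_TYPES.get(param.annotation, "string"),
            "description": param_descriptions.get(name, ""),
        }
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def retrieve_payment_status(df, transaction_id: str) -> str:
    """Stand-in with the same parameters as the guide's function."""
    return "{}"


spec = build_tool_spec(
    retrieve_payment_status,
    "Get payment status of a transaction",
    {"transaction_id": "The transaction id."},
    skip=("df",),
)
```

Here `spec` has the same shape as the first entry of `tools`. Writing the schema by hand, as the guide does, keeps full control over descriptions and types.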


- Then we organize the two functions into a dictionary where the keys are the function names and the values are the functions with `df` already bound through `functools.partial`. This allows us to call each function by name.

In [4]:
import functools

names_to_functions = {
  'retrieve_payment_status': functools.partial(retrieve_payment_status, df=df),
  'retrieve_payment_date': functools.partial(retrieve_payment_date, df=df)
}

### User query

Suppose a user asks the following question: “What’s the status of my transaction?” A standalone LLM would not be able to answer this question, as it needs to query the business logic backend to access the necessary data. But what if we have an exact tool we can use to answer this question? We could potentially provide an answer!

In [5]:
messages = [{"role": "user", "content": "What's the status of my transaction T1001?"}]

## Step 2. Model: Generate function arguments

How do Mistral models know about these functions and know which function to use? We provide both the user query and the tool specifications to Mistral models. The goal in this step is not for the Mistral model to run the function directly. It’s to 1) determine the appropriate function to use, 2) identify whether any essential information is missing for a function, and 3) generate the necessary arguments for the chosen function.
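The model handles the second point itself. On the user side, a similar check can be run before anything is executed. This is a minimal sketch, assuming the tool specification format used in this guide; `missing_required_arguments` is a hypothetical helper.

```python
def missing_required_arguments(tool_spec, arguments):
    """Return the required parameter names that are absent from `arguments`."""
    required = tool_spec["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


# Same shape as the `retrieve_payment_status` entry in `tools`.
status_tool = {
    "type": "function",
    "function": {
        "name": "retrieve_payment_status",
        "description": "Get payment status of a transaction",
        "parameters": {
            "type": "object",
            "properties": {
                "transaction_id": {
                    "type": "string",
                    "description": "The transaction id.",
                }
            },
            "required": ["transaction_id"],
        },
    },
}

complete = missing_required_arguments(status_tool, {"transaction_id": "T1001"})
incomplete = missing_required_arguments(status_tool, {})
```

An empty list means every required argument was supplied. A non-empty list is a signal to ask the user for the missing information instead of calling the function.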

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)



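# Reuse the JSON schema `tools` list from In [3]. The Python functions themselves are
# not passed here: their `df: pd.DataFrame` parameter has no JSON schema type, and the
# model should only be asked to fill in `transaction_id`.

# Format and tokenize the tool-use prompt
inputs = tokenizer.apply_chat_template(
    messages,
    tools=tools,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

inputs = inputs.to(model.device)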
outputs = model.generate(**inputs, max_new_tokens=1000)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
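
The cell above prints decoded text rather than a structured tool call. The following sketch pulls the call out of such text, assuming the `[TOOL_CALLS] [...]` layout that the Ollama cell later in this notebook also relies on. `parse_tool_calls` is a hypothetical helper, and it expects only the newly generated text, not the echoed prompt (which itself begins with a JSON list of tool definitions).

```python
import json


def parse_tool_calls(generated_text):
    """Return a list of {"name", "arguments"} dicts, or [] if none can be parsed."""
    # The marker may or may not survive decoding, so it is stripped if present.
    text = generated_text.replace("[TOOL_CALLS]", "").strip()
    start = text.find("[")
    if start == -1:
        return []
    try:
        # raw_decode tolerates trailing text after the JSON value.
        calls, _ = json.JSONDecoder().raw_decode(text[start:])
    except json.JSONDecodeError:
        return []
    if not isinstance(calls, list):
        return []
    return [
        call for call in calls
        if isinstance(call, dict) and "name" in call and "arguments" in call
    ]


example = '[TOOL_CALLS] [{"name": "retrieve_payment_status", "arguments": {"transaction_id": "T1001"}}]'
calls = parse_tool_calls(example)
no_calls = parse_tool_calls("The payment has been made.")
```

With `transformers`, one way to obtain only the generated part is to slice off the prompt tokens before decoding, for example `outputs[0][inputs["input_ids"].shape[1]:]`.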


In [None]:
''' from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = torch.bfloat16 # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

# 4bit pre quantized models we support for 4x faster downloading + no OOMs.
fourbit_models = [
    "unsloth/mistral-7b-bnb-4bit",
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    "unsloth/llama-2-7b-bnb-4bit",
    "unsloth/llama-2-13b-bnb-4bit",
    "unsloth/codellama-34b-bnb-4bit",
    "unsloth/tinyllama-bnb-4bit",
    "unsloth/gemma-7b-bnb-4bit", # New Google 6 trillion tokens model 2.5x faster!
    "unsloth/gemma-2b-bnb-4bit",
] # More models at https://huggingface.co/unsloth

model, tokenizer2 = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-v0.3", # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

FastLanguageModel.for_inference(model)

def get_current_weather(location: str, format: str):
    """
    Get the current weather

    Args:
        location: The city and state, e.g. San Francisco, CA
        format: The temperature unit to use. Infer this from the users location. (choices: ["celsius", "fahrenheit"])
    """
    pass

conversation = [{"role": "user", "content": "What's the weather like in Paris?"}]
tools = [get_current_weather]

tokenizer2.chat_template = tokenizer.chat_template 



# format and tokenize the tool use prompt 
inputs = tokenizer2.apply_chat_template(
            conversation,
            tools=tools,
            add_generation_prompt=True,
            return_dict=True,
            return_tensors="pt",
)




inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1000)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ''' 


In [None]:
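import os
from mistralai import Mistral

# Assumptions: the API key is stored in MISTRAL_API_KEY, "mistral-large-latest" is the
# model to query, and `tools` is the JSON schema list from In [3].
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
response = client.chat.complete(model="mistral-large-latest", messages=messages, tools=tools, tool_choice="any")
messages.append(response.choices[0].message)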

In [None]:
messages

[{'role': 'user', 'content': "What's the status of my transaction T1001?"},
 AssistantMessage(content='', tool_calls=[ToolCall(function=FunctionCall(name='retrieve_payment_status', arguments='{"transaction_id": "T1001"}'), id='D681PevKs', type='function')], prefix=False, role='assistant')]

## Step 3. User: Execute function to obtain tool results

How do we execute the function? Currently, executing these functions is the user’s responsibility: the model only proposes the call, and the code runs on the user side. In the future, we may introduce some helpful functions that can be executed server-side.

Let’s extract the useful information from the model response: `function_name` and `function_params`. Here the Mistral model has chosen the function `retrieve_payment_status` with the parameter `transaction_id` set to `T1001`.

In [None]:
import json
tool_call = response.choices[0].message.tool_calls[0]
function_name = tool_call.function.name
function_params = json.loads(tool_call.function.arguments)
print("\nfunction_name: ", function_name, "\nfunction_params: ", function_params)


function_name:  retrieve_payment_status 
function_params:  {'transaction_id': 'T1001'}


In [None]:
function_result = names_to_functions[function_name](**function_params)
function_result


'{"status": "Paid"}'
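
The lookup above assumes the model names a known function and produces valid arguments. A more defensive version might look like the sketch below. `execute_tool_call` is a hypothetical helper, and a plain dict stands in for the dataframe so the example runs on its own.

```python
import functools
import json

# Stand-in for the payments dataframe (assumption for this sketch only).
PAYMENTS = {"T1001": "Paid", "T1002": "Unpaid"}


def retrieve_payment_status(payments, transaction_id):
    if transaction_id in payments:
        return json.dumps({"status": payments[transaction_id]})
    return json.dumps({"error": "transaction id not found."})


names_to_functions = {
    "retrieve_payment_status": functools.partial(retrieve_payment_status, payments=PAYMENTS),
}


def execute_tool_call(name, raw_arguments, registry):
    """Run one tool call and always return a JSON string the model can read."""
    if name not in registry:
        return json.dumps({"error": f"unknown function: {name}"})
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return json.dumps({"error": "arguments are not valid JSON"})
    if not isinstance(arguments, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    try:
        return registry[name](**arguments)
    except TypeError as exc:
        # Raised when the argument names do not match the function signature.
        return json.dumps({"error": f"bad arguments: {exc}"})


ok = execute_tool_call("retrieve_payment_status", '{"transaction_id": "T1001"}', names_to_functions)
unknown = execute_tool_call("delete_everything", "{}", names_to_functions)
```

Returning the error as JSON instead of raising lets the failure be passed back to the model as a tool result, so it can explain the problem or try again.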

In [None]:
messages.append({"role":"tool", "name":function_name, "content":function_result, "tool_call_id":tool_call.id})

In [None]:
messages

[{'role': 'user', 'content': "What's the status of my transaction T1001?"},
 AssistantMessage(content='', tool_calls=[ToolCall(function=FunctionCall(name='retrieve_payment_status', arguments='{"transaction_id": "T1001"}'), id='D681PevKs', type='function')], prefix=False, role='assistant'),
 {'role': 'tool',
  'name': 'retrieve_payment_status',
  'content': '{"status": "Paid"}',
  'tool_call_id': 'D681PevKs'}]

## Step 4. Model: Generate final answer

We can now provide the output from the tools to Mistral models, and in return, the Mistral model can produce a customised final response for the specific user.

In [None]:
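import os
from mistralai import Mistral

# `model` now refers to the local transformers model, so the API model name is passed
# explicitly. Assumptions: the API key is stored in MISTRAL_API_KEY and
# "mistral-large-latest" is the model to query.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])
response = client.chat.complete(
    model="mistral-large-latest",
    messages=messages,
)
response.choices[0].message.content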

'The payment for transaction T1001 has been successfully paid.'
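
To recap, the four steps can be written as one loop. The sketch below runs without any API key because `fake_model` is a hard-coded stand-in for a chat model, and the message dictionaries only approximate the shapes shown above. Every name in it is an assumption made for illustration.

```python
import json


def fake_model(messages):
    """Stand-in for a chat model: requests a tool first, then answers from its result."""
    last = messages[-1]
    if last["role"] == "tool":
        status = json.loads(last["content"]).get("status", "unknown")
        return {"role": "assistant", "content": f"Transaction status: {status}.", "tool_calls": []}
    call = {
        "id": "call_0",
        "function": {
            "name": "retrieve_payment_status",
            "arguments": json.dumps({"transaction_id": "T1001"}),
        },
    }
    return {"role": "assistant", "content": "", "tool_calls": [call]}


def retrieve_payment_status(transaction_id):
    statuses = {"T1001": "Paid"}  # stand-in for the dataframe lookup
    if transaction_id in statuses:
        return json.dumps({"status": statuses[transaction_id]})
    return json.dumps({"error": "transaction id not found."})


names_to_functions = {"retrieve_payment_status": retrieve_payment_status}


def run_conversation(question, model=fake_model, max_rounds=3):
    messages = [{"role": "user", "content": question}]  # Step 1: query
    for _ in range(max_rounds):
        reply = model(messages)  # Step 2 (tool call) or Step 4 (final answer)
        messages.append(reply)
        if not reply["tool_calls"]:
            return reply["content"], messages
        for call in reply["tool_calls"]:  # Step 3: execute on the user side
            function_name = call["function"]["name"]
            function_params = json.loads(call["function"]["arguments"])
            result = names_to_functions[function_name](**function_params)
            messages.append({
                "role": "tool",
                "name": function_name,
                "content": result,
                "tool_call_id": call["id"],
            })
    raise RuntimeError("the model kept requesting tools")


answer, history = run_conversation("What's the status of my transaction T1001?")
```

The `max_rounds` cap is a design choice for the sketch: it stops the loop if a model keeps asking for tools instead of answering.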

In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "mistralai/Mistral-7B-Instruct-v0.3"
tokenizer = AutoTokenizer.from_pretrained(model_id)

from datetime import datetime
import pytz

def get_current_weather(location: str, format: str):
    """
    Get the current weather.

    Args:
        location: The city and country, e.g. "Paris, France".
        format: The temperature unit, choices: ["celsius", "fahrenheit"].
    """
    fake_data = {
        "Paris": {"celsius": "18°C, Cloudy", "fahrenheit": "64°F, Cloudy"},
        "Ankara": {"celsius": "22°C, Sunny", "fahrenheit": "72°F, Sunny"},
        "New York": {"celsius": "15°C, Rainy", "fahrenheit": "59°F, Rainy"}
    }

    city = location.split(",")[0].strip()
    data = fake_data.get(city, {"celsius": "Unknown", "fahrenheit": "Unknown"})
    return f"Weather for {location}: {data.get(format.lower(), 'Unknown format')}."


def get_current_time(location: str):
    """
    Get the local time for a given location.

    Args:
        location: The city or country, e.g. "Tokyo" or "France".
    """
    timezone_map = {
        "Istanbul": "Europe/Istanbul",
        "Paris": "Europe/Paris",
        "New York": "America/New_York",
        "Tokyo": "Asia/Tokyo"
    }

    tz_name = timezone_map.get(location, "UTC")
    tz = pytz.timezone(tz_name)
    now = datetime.now(tz)
    return f"Local time in {location}: {now.strftime('%H:%M:%S')} ({tz_name})"


def convert_currency(amount: float, from_currency: str, to_currency: str):
    """
    Convert currency between two currencies.

    Args:
        amount: The amount to convert.
        from_currency: The source currency code, e.g. "USD".
        to_currency: The target currency code, e.g. "TRY".
    """
    exchange_rates = {
        ("USD", "TRY"): 32.5,
        ("EUR", "TRY"): 34.2,
        ("TRY", "USD"): 0.0307,
        ("USD", "EUR"): 0.91,
        ("EUR", "USD"): 1.1
    }

    rate = exchange_rates.get((from_currency.upper(), to_currency.upper()))
    if rate is None:
        return f"Exchange rate not available: {from_currency} → {to_currency}"

    converted = round(amount * rate, 2)
    return f"{amount} {from_currency.upper()} ≈ {converted} {to_currency.upper()}"


conversation = [{"role": "user", "content": "How many Turkish lira is 1 US dollar?"}]
tools = [get_current_weather, get_current_time, convert_currency]



# format and tokenize the tool use prompt 
inputs = tokenizer.apply_chat_template(
            conversation,
            tools=tools,
            add_generation_prompt=True,
            return_dict=True,
            return_tensors="pt",
)

model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

inputs.to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1000)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))




[{"type": "function", "function": {"name": "get_current_weather", "description": "Get the current weather.", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and country, e.g. \"Paris, France\"."}, "format": {"type": "string", "description": "The temperature unit, choices: [\"celsius\", \"fahrenheit\"]."}}, "required": ["location", "format"]}}}, {"type": "function", "function": {"name": "get_current_time", "description": "Get the local time for a given location.", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city or country, e.g. \"Tokyo\" or \"France\"."}}, "required": ["location"]}}}, {"type": "function", "function": {"name": "convert_currency", "description": "Convert currency between two currencies.", "parameters": {"type": "object", "properties": {"amount": {"type": "number", "description": "The amount to convert."}, "from_currency": {"type": "string", "description": "The s

In [None]:
import ollama
import json
import time
import functions  # local module (not shown here) that provides get_current_weather and definitions
from rich.console import Console

c = Console()

available_functions = {
  functions.get_current_weather.__name__: functions.get_current_weather,
}


def extract_function_calls(model, question, fn_definitions):
  prompt = f"""
  [AVAILABLE_TOOLS]{json.dumps(fn_definitions)}[/AVAILABLE_TOOLS]
  [INST] {question} [/INST]
  """
  start = time.time()
  response = ollama.generate(model=model, prompt=prompt, raw=True)
  end = time.time()

  # Read the raw text first so it is always defined when an error is returned
  raw_response = response['response']
  try:
    fn_calls = json.loads(raw_response.replace("[TOOL_CALLS] ", ""))
    return raw_response, fn_calls, None, (end-start)
  except json.JSONDecodeError as e:
    return raw_response, None, e, (end-start)


def call_function(function_calls, available_functions):
  fn_responses = []
  for call in function_calls:
    fn_to_call = available_functions[call["name"]]
    fn_response = fn_to_call(**call["arguments"])
    fn_responses.append(fn_response)
  return fn_responses


def answer_question(model, question, fn_responses):
  stream = ollama.generate(model=model, stream=True,
    prompt=f"""
    Using the following function responses: {json.dumps(fn_responses)}
    Answer this question: {question}
  """)
  for chunk in stream:
    c.print(chunk['response'], end='')


model = "mistral:7b-instruct-v0.3-fp16"

question = "What is the weather like today in New York?"
raw, fn_calls, error, time_taken = extract_function_calls(
  model, question, functions.definitions
)

if error:
  c.print(f"Error: {error}\n{raw}", style="red")
  c.print(f"Fn Call Duration: {time_taken:.2f} seconds", style="yellow")
else:
  c.print(f"Function calls: {fn_calls}")
  fn_responses = call_function(fn_calls, available_functions)
  c.print(f"Function responses: {fn_responses}")
  start = time.time()
  answer_question(model, question, fn_responses)
  end = time.time()
  c.print(f"\nFn Call Duration: {time_taken:.2f} seconds", style="yellow")
  c.print(f"Answer Duration: {end-start:.2f} seconds", style="yellow")

In [None]:
import emoji

# All Unicode emoji, filtered below to those at least two code points long
len([ x for x in list(emoji.EMOJI_DATA.keys()) if len(x)>=2])
[ x for x in list(emoji.EMOJI_DATA.keys()) if len(x)>=2]


['🅰️',
 '🇦🇫',
 '🇦🇱',
 '🇩🇿',
 '🇦🇸',
 '🇦🇩',
 '🇦🇴',
 '🇦🇮',
 '🇦🇶',
 '🇦🇬',
 '🇦🇷',
 '🇦🇲',
 '🇦🇼',
 '🇦🇨',
 '🇦🇺',
 '🇦🇹',
 '🇦🇿',
 '🅱️',
 '🇧🇸',
 '🇧🇭',
 '🇧🇩',
 '🇧🇧',
 '🇧🇾',
 '🇧🇪',
 '🇧🇿',
 '🇧🇯',
 '🇧🇲',
 '🇧🇹',
 '🇧🇴',
 '🇧🇦',
 '🇧🇼',
 '🇧🇻',
 '🇧🇷',
 '🇮🇴',
 '🇻🇬',
 '🇧🇳',
 '🇧🇬',
 '🇧🇫',
 '🇧🇮',
 '🇰🇭',
 '🇨🇲',
 '🇨🇦',
 '🇮🇨',
 '🇨🇻',
 '🇧🇶',
 '🇰🇾',
 '🇨🇫',
 '🇪🇦',
 '🇹🇩',
 '🇨🇱',
 '🇨🇳',
 '🇨🇽',
 '🇨🇵',
 '🇨🇨',
 '🇨🇴',
 '🇰🇲',
 '🇨🇬',
 '🇨🇩',
 '🇨🇰',
 '🇨🇷',
 '🇭🇷',
 '🇨🇺',
 '🇨🇼',
 '🇨🇾',
 '🇨🇿',
 '🇨🇮',
 '🇩🇰',
 '🇩🇬',
 '🇩🇯',
 '🇩🇲',
 '🇩🇴',
 '🇪🇨',
 '🇪🇬',
 '🇸🇻',
 '🏴\U000e0067\U000e0062\U000e0065\U000e006e\U000e0067\U000e007f',
 '🇬🇶',
 '🇪🇷',
 '🇪🇪',
 '🇸🇿',
 '🇪🇹',
 '🇪🇺',
 '🇫🇰',
 '🇫🇴',
 '🇫🇯',
 '🇫🇮',
 '🇫🇷',
 '🇬🇫',
 '🇵🇫',
 '🇹🇫',
 '🇬🇦',
 '🇬🇲',
 '🇬🇪',
 '🇩🇪',
 '🇬🇭',
 '🇬🇮',
 '🇬🇷',
 '🇬🇱',
 '🇬🇩',
 '🇬🇵',
 '🇬🇺',
 '🇬🇹',
 '🇬🇬',
 '🇬🇳',
 '🇬🇼',
 '🇬🇾',
 '🇭🇹',
 '🇭🇲',
 '🇭🇳',
 '🇭🇰',
 '🇭🇺',
 '🇮🇸',
 '🇮🇳',
 '🇮🇩',
 '🇮🇷',
 '🇮🇶',
 '🇮🇪',
 '🇮🇲',
 '🇮🇱',
 '🇮🇹',
 '🇯🇲',
 '🇯🇵',
 '㊗️',
 '🈷️',
 '㊙️',
 '🈂️',
 '🇯🇪',
 '🇯🇴',
 '🇰🇿',
 '🇰🇪',
 '🇰🇮',
 '🇽🇰',
 '🇰🇼',
 '🇰🇬',
 '🇱🇦',
 '🇱