<a href="https://colab.research.google.com/github/dogukartal/ML-RoadMap/blob/main/NLP/Hugging%20Face/PyTorch/Tool_Calling.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Tool Calling
---

In [1]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def get_current_temperature(location: str, unit: str) -> float:
    """
    Get the current temperature at a location.

    Args:
        location: The location to get the temperature for, in the format "City, Country"
        unit: The unit to return the temperature in. (choices: ["celsius", "fahrenheit"])
    Returns:
        The current temperature at the specified location in the specified units, as a float.
    """
    return 22.  # A real function should probably actually get the temperature!

def get_current_wind_speed(location: str) -> float:
    """
    Get the current wind speed in km/h at a given location.

    Args:
        location: The location to get the wind speed for, in the format "City, Country"
    Returns:
        The current wind speed at the given location in km/h, as a float.
    """
    return 6.  # A real function should probably actually get the wind speed!

checkpoint = "NousResearch/Hermes-2-Pro-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, device_map="auto")

tools = [get_current_temperature, get_current_wind_speed]

messages = [
  {"role": "system", "content": "You are a bot that responds to weather queries. You should reply with the unit used in the queried location."},
  {"role": "user", "content": "Hey, what's the temperature in Paris in Celcius right now?"}
]

inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}

out = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))

Setting `pad_token_id` to `eos_token_id`:128003 for open-end generation.


<tool_call>
{"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
</tool_call><|im_end|>


The LLM cannot run functions itself; it only identifies when a function should be called and which one. A generated tool call (such as `get_current_temperature`) signals which function to execute and with what arguments. It is then up to us to execute the function with the arguments the LLM supplied.
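As a minimal sketch of that step, the snippet below pulls the JSON payload out of the decoded text and turns it into a Python dict. It assumes the Hermes-style `<tool_call>...</tool_call>` wrapping seen in the output above; `parse_tool_calls` is a hypothetical helper, not part of `transformers`, and other models may use a different format.

```python
import json
import re

# Assumption: the model wraps each call in <tool_call>...</tool_call> tags,
# as in the Hermes-2-Pro output shown above. Other models may differ.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)

def parse_tool_calls(generated_text: str) -> list[dict]:
    """Hypothetical helper: return every tool call found in the generated text."""
    calls = []
    for payload in TOOL_CALL_PATTERN.findall(generated_text):
        try:
            calls.append(json.loads(payload))
        except json.JSONDecodeError:
            # Skip payloads that are not valid JSON instead of crashing.
            continue
    return calls

generated = (
    '<tool_call>\n'
    '{"name": "get_current_temperature", '
    '"arguments": {"location": "Paris, France", "unit": "celsius"}}\n'
    '</tool_call><|im_end|>'
)
parsed_calls = parse_tool_calls(generated)
print(parsed_calls[0]["name"], parsed_calls[0]["arguments"])
```

Text without any tool call yields an empty list, which is one simple way to tell a plain chat reply from a tool request.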

In [3]:
tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France", "unit": "celsius"}}
messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})

Now that we've added the tool call to the conversation, we can call the function and append its result to the conversation.

In [4]:
messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
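The result in the cell above is typed in by hand. A minimal sketch of producing it by actually calling the function might look like the following; the `available_tools` registry and the `run_tool_call` helper are hypothetical names introduced for illustration, and the stub function mirrors the one defined earlier.

```python
def get_current_temperature(location: str, unit: str) -> float:
    """Stub mirroring the function defined earlier; always returns 22.0."""
    return 22.0

# Hypothetical registry mapping tool names to the Python callables behind them.
available_tools = {"get_current_temperature": get_current_temperature}

def run_tool_call(call: dict) -> dict:
    """Execute one parsed tool call and build the matching "tool" message."""
    function = available_tools[call["name"]]  # raises KeyError for an unknown tool
    result = function(**call["arguments"])
    # Chat message content is text, so the result is converted to a string.
    return {"role": "tool", "name": call["name"], "content": str(result)}

example_call = {
    "name": "get_current_temperature",
    "arguments": {"location": "Paris, France", "unit": "celsius"},
}
tool_message = run_tool_call(example_call)
print(tool_message)
```

In this notebook, `messages.append(run_tool_call(tool_call))` would then stand in for the hardcoded message.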

In [5]:
messages

[{'role': 'system',
  'content': 'You are a bot that responds to weather queries. You should reply with the unit used in the queried location.'},
 {'role': 'user',
  'content': "Hey, what's the temperature in Paris in Celcius right now?"},
 {'role': 'assistant',
  'tool_calls': [{'type': 'function',
    'function': {'name': 'get_current_temperature',
     'arguments': {'location': 'Paris, France', 'unit': 'celsius'}}}]},
 {'role': 'tool', 'name': 'get_current_temperature', 'content': '22.0'}]

Now let the assistant read the function output and continue chatting with the user:

In [6]:
inputs = tokenizer.apply_chat_template(messages, tools=tools, add_generation_prompt=True, return_dict=True, return_tensors="pt")
inputs = {k: v.to(model.device) for k, v in inputs.items()}
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0][len(inputs["input_ids"][0]):]))

Setting `pad_token_id` to `eos_token_id`:128003 for open-end generation.


The current temperature in Paris, France is 22.0°C.<|im_end|>


Tool calling can be a powerful way to extend conversational agents with real-time information, computational tools such as calculators, or access to large databases.

## Tool Schemas
---
Each function you pass to the `tools` argument of `apply_chat_template` is converted into a JSON schema, and these schemas are then passed to the model's chat template. In other words, tool-use models do not see your functions directly, and they never see the code inside them. They only need the function definitions and the arguments to pass: what the tools do and how to use them, not how they work. It is up to you to read the model's output, detect whether it has requested a tool, pass its arguments to the tool function, and return the response in the chat.

In [8]:
# Manual Schema Conversion

from transformers.utils import get_json_schema

def multiply(a: float, b: float):
    """
    A function that multiplies two numbers

    Args:
        a: The first number to multiply
        b: The second number to multiply
    """
    return a * b

schema = get_json_schema(multiply)
print(schema)

{'type': 'function', 'function': {'name': 'multiply', 'description': 'A function that multiplies two numbers', 'parameters': {'type': 'object', 'properties': {'a': {'type': 'number', 'description': 'The first number to multiply'}, 'b': {'type': 'number', 'description': 'The second number to multiply'}}, 'required': ['a', 'b']}}}



```
{
  "type": "function",
  "function": {
    "name": "multiply",
    "description": "A function that multiplies two numbers",
    "parameters": {
      "type": "object",
      "properties": {
        "a": {
          "type": "number",
          "description": "The first number to multiply"
        },
        "b": {
          "type": "number",
          "description": "The second number to multiply"
        }
      },
      "required": ["a", "b"]
    }
  }
}
```
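Because the schema lists each parameter's type and which parameters are required, it can also be used to sanity-check the arguments a model returns before calling the function. The sketch below is a deliberately small check, not a full JSON Schema validator; `check_arguments` and the `JSON_TO_PYTHON` mapping are assumptions made for illustration.

```python
# Schema copied from the output above.
multiply_schema = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "The first number to multiply"},
                "b": {"type": "number", "description": "The second number to multiply"},
            },
            "required": ["a", "b"],
        },
    },
}

# Assumption: only these JSON Schema primitive types need to be handled here.
JSON_TO_PYTHON = {
    "number": (int, float),
    "integer": (int,),
    "string": (str,),
    "boolean": (bool,),
}

def check_arguments(schema: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look fine."""
    params = schema["function"]["parameters"]
    properties = params.get("properties", {})
    problems = []
    for name in params.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        expected = JSON_TO_PYTHON.get(properties[name].get("type"))
        if expected is None:
            continue  # unknown or missing type: skip the type check
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool = isinstance(value, bool)
        type_ok = isinstance(value, expected) and (not is_bool or expected == (bool,))
        if not type_ok:
            problems.append(f"wrong type for argument: {name}")
    return problems

print(check_arguments(multiply_schema, {"a": 2, "b": 3.5}))
print(check_arguments(multiply_schema, {"a": "2"}))
```

Returning a list of problems rather than raising makes it easy to feed the message back to the model as a tool error, if that fits the application.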



JSON schemas can also be passed directly to the `tools` argument of `apply_chat_template`. Writing them by hand lets you define precise schemas for more complex functions.

In [None]:
# A simple function that takes no arguments
current_time = {
  "type": "function",
  "function": {
    "name": "current_time",
    "description": "Get the current local time as a string.",
    "parameters": {
      "type": "object",
      "properties": {}
    }
  }
}

# A more complete function that takes two numerical arguments
multiply = {
  "type": "function",
  "function": {
    "name": "multiply",
    "description": "A function that multiplies two numbers",
    "parameters": {
      "type": "object",
      "properties": {
        "a": {
          "type": "number",
          "description": "The first number to multiply"
        },
        "b": {
          "type": "number",
          "description": "The second number to multiply"
        }
      },
      "required": ["a", "b"]
    }
  }
}

model_input = tokenizer.apply_chat_template(
    messages,
    tools=[current_time, multiply]
)
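
As one illustration of a more precise schema, the sketch below hand-writes a schema for the `get_current_temperature` function from earlier and restricts `unit` with the JSON Schema `enum` keyword. The variable name `temperature_schema` is hypothetical, and whether a given model respects the restriction depends on the model and its chat template.

```python
import json

temperature_schema = {
    "type": "function",
    "function": {
        "name": "get_current_temperature",
        "description": "Get the current temperature at a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": 'The location to get the temperature for, in the format "City, Country"',
                },
                "unit": {
                    "type": "string",
                    # Only these two values are declared valid for this argument.
                    "enum": ["celsius", "fahrenheit"],
                    "description": "The unit to return the temperature in.",
                },
            },
            "required": ["location", "unit"],
        },
    },
}

# The schema is plain data, so it serializes to JSON without custom handling.
print(json.dumps(temperature_schema, indent=2))
```

A schema like this can be placed in the `tools` list in the same way as `current_time` and `multiply` above.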