# How to Use FuriosaAI SDK’s Tool Calling to build AI Agents

The latest Llama models, including versions 3.1, 3.2, and 4, include native support for tool calling, making it easy to build agentic systems that request information from a wide range of APIs and then use that information to inform the system’s response to user queries. In this notebook, we'll explore what tool calling means in the context of large language models (LLMs), and demonstrate how to build AI agents running on RNGD (pronounced “Renegade”), Furiosa’s flagship AI accelerator for LLMs, RAG applications, and agentic AI. In this demo, we’ll use the Llama 3.1 8B model in the FuriosaAI SDK, which provides an OpenAI-compatible API that makes tool calling seamless.

## This notebook covers
- Introduction to Tool Calling
- Using the Llama 3.1 8B Instruct model with the FuriosaAI SDK for tool calling
- Building a simple search agent using LangChain tools and agents



## Tool calling
Tool calling extends an LLM's capabilities by allowing it to interact with external tools or functions. The available tools and their required input parameters are defined in the system prompt and provided to the model along with the user's input. When the model receives a user query and determines that a tool is needed, it generates a tool call request in JSON format. For example, if a user asks about the current weather in San Francisco, the model can issue a structured request to a weather API tool.

When an LLM calls a tool in response to a user query, it must extract the necessary parameters from the input and generate structured output that conforms to the tool's expected input format. This is a non-trivial task: the model must identify the right parameters, format them according to the schema defined for the tool, and emit the whole request as valid structured data.

To enable this capability, some approaches use few-shot prompting, where examples of tool usage and the required output format are provided as demonstrations. More recently, advanced models like Llama 3.1 8B have been trained with built-in tool-calling capabilities, allowing them to perform this process more reliably without extensive prompt engineering.
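To make the schema-conformance requirement concrete, the sketch below checks a model-generated argument string against a tool's parameter schema. The helper `validate_arguments` and the `weather_parameters` schema are hypothetical illustrations, not part of the FuriosaAI SDK, and the check covers only required keys, string types, and `enum` constraints; a full JSON Schema validator would cover more.

```python
import json

# Parameter schema for a hypothetical weather tool (illustrative only).
weather_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location", "unit"],
}

def validate_arguments(raw_arguments: str, parameters: dict) -> list:
    """Return a list of problems found in a model-generated argument string."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return [f"arguments are not valid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]

    problems = []
    properties = parameters.get("properties", {})
    # Every required parameter must be present.
    for name in parameters.get("required", []):
        if name not in args:
            problems.append(f"missing required parameter: {name}")
    # Every supplied parameter must be known and match its declared constraints.
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            problems.append(f"unexpected parameter: {name}")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"parameter {name} must be a string")
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"parameter {name} must be one of {spec['enum']}")
    return problems

good = validate_arguments('{"location": "San Francisco, CA", "unit": "fahrenheit"}', weather_parameters)
bad = validate_arguments('{"location": "San Francisco, CA", "unit": "kelvin"}', weather_parameters)
print(good)
print(bad)
```

A check like this can run before the tool is executed, so that a malformed request is rejected instead of raising inside the tool.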

You can follow the example steps below to test tool calling.

#### 1. Define the tool
```
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state, e.g., 'San Francisco, CA'"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location", "unit"]
        }
    }
}]
```

#### 2. Receive the user’s input
- "What's the weather like in San Francisco?"


#### 3. Get the tool call request
- Function called: `get_weather`
- Arguments: {"location": "San Francisco, CA", "unit": "fahrenheit"}


#### 4. Execute the tool call in the Python backend to get its output
- Tool call outputs (by tool execution): "Getting the weather for San Francisco, CA in fahrenheit..."


#### 5. Append the tool call outputs and generate the final LLM response
- Appending the tool call outputs to the chat history and calling the LLM again lets the model generate the final response.
- The final LLM response: "The weather for San Francisco, CA is ..." 
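Steps 3 to 5 can be sketched as plain data manipulation in the OpenAI chat message format. The body of `get_weather`, the call id `call_0`, and the helper name `append_tool_exchange` are illustrative assumptions; in a real run the tool call and its id come from the model's response.

```python
import json

def get_weather(location: str, unit: str) -> str:
    # Placeholder backend: a real tool would query a weather API here.
    return f"Getting the weather for {location} in {unit}..."

def append_tool_exchange(messages: list, call_id: str, name: str, arguments: str, result: str) -> list:
    """Append the assistant's tool call and the tool's output to the chat history."""
    messages.append({
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": call_id,
            "type": "function",
            "function": {"name": name, "arguments": arguments},
        }],
    })
    # The tool result is linked back to the request through tool_call_id.
    messages.append({"role": "tool", "tool_call_id": call_id, "content": result})
    return messages

messages = [{"role": "user", "content": "What's the weather like in San Francisco?"}]
arguments = '{"location": "San Francisco, CA", "unit": "fahrenheit"}'  # as generated by the model
result = get_weather(**json.loads(arguments))
append_tool_exchange(messages, "call_0", "get_weather", arguments, result)
print(json.dumps(messages, indent=2))
```

Sending the extended `messages` list back to the chat endpoint gives the model the tool output it needs to write the final answer.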


## How the FuriosaAI SDK supports tool calling
The FuriosaAI SDK dramatically simplifies accelerated inference. It includes a powerful server that can run NPU-optimized models and expose them through an OpenAI-compatible REST API.
For more extensive AI/ML inference workflows, the FuriosaAI SDK offers support for tool calling. This means that AI agents developed with FuriosaAI SDK are able to invoke external or custom tools while reasoning and generating responses. Let’s dive into how you can leverage tool calling with the FuriosaAI SDK to build an AI agent.

## Prerequisites
- Access to an RNGD server
- FuriosaAI SDK 2025.03
- Install dependencies
- Choose an AI application framework (we'll use LangChain)



#### 1. Launch the RNGD server

- Launch the server on RNGD with the `furiosa-llm serve` command.
- To enable tool calling, start the server with the `--enable-auto-tool-choice` and `--tool-call-parser` options.


```
furiosa-llm serve furiosa-ai/Llama-3.1-8B-Instruct-FP8 \
    --enable-auto-tool-choice \
    --tool-call-parser llama3_json \
    --port 8000 \
    --devices "npu:0"
```

In [4]:
from openai import OpenAI
import json

# Port of the server started above; the API key is a placeholder value.
port = 8000
api_key = "EMPTY"
client = OpenAI(base_url=f"http://localhost:{port}/v1", api_key=api_key)

## Chat with Tool calling 
An LLM chat with tool calling follows four steps:
1. The model generates inputs for the tools.
2. The specified tools are executed using these inputs, returning the corresponding outputs.
3. The outputs from the tool calls are incorporated into the ongoing chat history.
4. The model uses the tool outputs along with the prior context to generate the final response.
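The four steps map onto a single function. This is a minimal sketch, assuming a client object that follows the OpenAI Python client's `chat.completions.create` interface; `run_tool_round` and its parameters are hypothetical names, and the function handles only one round of tool calls with no error handling for unknown tools or malformed arguments.

```python
import json

def run_tool_round(client, model: str, messages: list, tools: list, tool_functions: dict) -> str:
    """Run one tool-calling round and return the model's final text."""
    # Step 1: the model generates inputs for the tools.
    first = client.chat.completions.create(
        model=model, messages=messages, tools=tools, tool_choice="auto"
    )
    message = first.choices[0].message
    if not message.tool_calls:
        return message.content  # the model answered directly, no tool needed

    # Record the assistant turn that requested the tools.
    messages.append({
        "role": "assistant",
        "content": message.content,
        "tool_calls": [
            {"id": tc.id, "type": "function",
             "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
            for tc in message.tool_calls
        ],
    })
    for tc in message.tool_calls:
        # Step 2: execute the requested tool with the generated inputs.
        output = tool_functions[tc.function.name](**json.loads(tc.function.arguments))
        # Step 3: add the tool output to the chat history.
        messages.append({"role": "tool", "tool_call_id": tc.id, "content": str(output)})

    # Step 4: the model answers using the tool outputs and the prior context.
    final = client.chat.completions.create(model=model, messages=messages, tools=tools)
    return final.choices[0].message.content
```

With the `client` created above, a call might look like `run_tool_round(client, "furiosa-ai/Llama-3.1-8B-Instruct-FP8", messages, tools, tool_functions)`, where `tools` is a list of tool definitions and `tool_functions` maps tool names to Python callables.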

#### 1. Tool Definition
First, specify the following for each tool:
- The name of the tool
- A description of the tool
- A JSON schema describing the inputs to the tool
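As an illustration, these three ingredients can be assembled into the OpenAI-style tool definition format with a small helper. `make_tool` is a hypothetical convenience function, not an SDK API; writing the dictionary by hand works equally well.

```python
def make_tool(name: str, description: str, properties: dict, required: list) -> dict:
    """Build an OpenAI-style tool definition from a name, a description, and an input schema."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

weather_tool = make_tool(
    name="get_weather",
    description="Get the current weather in a given location",
    properties={
        "location": {"type": "string", "description": "City and state, e.g., 'San Francisco, CA'"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    required=["location", "unit"],
)
print(weather_tool["function"]["name"])
```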

#### 2. Tool binding
Once a tool is defined, the related information, such as `tools` and `tool_choice`, is passed to the chat API. Binding the tools to the language model this way gives the model both the tool definitions and the surrounding context. Based on the tool definitions and the selected `tool_choice` strategy, the model decides which tool to use and which input parameters to provide, and generates the corresponding function call.
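The sketch below shows the shapes a `tool_choice` value can take in the OpenAI chat format. The helper `build_tool_choice` and the strategy label `"named"` are hypothetical. `"auto"` is the value used in this notebook; `"none"`, `"required"`, and named choices are part of the OpenAI API, and whether a given FuriosaAI SDK release accepts them is an assumption to verify against its documentation.

```python
from typing import Optional, Union

def build_tool_choice(strategy: str, function_name: Optional[str] = None) -> Union[str, dict]:
    """Return a tool_choice value in the OpenAI chat format."""
    if strategy in ("auto", "none", "required"):
        # String strategies are passed through unchanged.
        return strategy
    if strategy == "named":
        # Force a specific tool by naming it.
        if not function_name:
            raise ValueError("a named tool choice needs function_name")
        return {"type": "function", "function": {"name": function_name}}
    raise ValueError(f"unknown strategy: {strategy}")

print(build_tool_choice("auto"))
print(build_tool_choice("named", "get_weather"))
```

The returned value is passed as the `tool_choice` argument of `client.chat.completions.create`, next to the `tools` list.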



## Tool Usage -- Custom Tool Use

In [1]:
import json
import random

# Mock tools for this demo: they return random values instead of querying a real data source.
def get_stock_price(ticker: str, currency: str) -> str:
    price = round(random.uniform(100, 1500), 2)
    return f"The current price of {ticker.upper()} is {price} {currency.upper()}."

def get_exchange_rate(base_currency: str, target_currency: str) -> str:
    rate = round(random.uniform(0.5, 1.5), 4)
    return f"Current exchange rate from {base_currency.upper()} to {target_currency.upper()} is {rate}."

def get_financial_news(company: str) -> str:
    sample_news = [
        f"{company} reported better-than-expected quarterly earnings.",
        f"{company} announced a strategic partnership to expand into new markets.",
        f"Analysts are optimistic about {company}'s growth prospects for the upcoming year."
    ]
    return random.choice(sample_news)

tool_functions = {
    "get_stock_price": get_stock_price,
    "get_exchange_rate": get_exchange_rate,
    "get_financial_news": get_financial_news
}

custom_tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Retrieve the latest stock price for a given ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "Stock ticker symbol, e.g., 'AAPL', 'TSLA'"},
                    "currency": {"type": "string", "enum": ["USD", "EUR", "KRW"], "description": "Currency code for the price."}
                },
                "required": ["ticker", "currency"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_exchange_rate",
            "description": "Get the current exchange rate between two currencies.",
            "parameters": {
                "type": "object",
                "properties": {
                    "base_currency": {"type": "string", "description": "Base currency code, e.g., 'USD'"},
                    "target_currency": {"type": "string", "description": "Target currency code, e.g., 'EUR'"}
                },
                "required": ["base_currency", "target_currency"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_financial_news",
            "description": "Get recent financial news headlines related to a company.",
            "parameters": {
                "type": "object",
                "properties": {
                    "company": {"type": "string", "description": "Company name, e.g., 'Apple', 'Tesla'"}
                },
                "required": ["company"]
            }
        }
    }
]


user_inputs = [
    "What's Tesla's stock price in USD?",
    "What's the current USD to EUR exchange rate?",
    "Also, any recent news about Tesla?",
]



In [None]:
for user_input in user_inputs:
    messages = [{"role": "user", "content": user_input}]
 
    response = client.chat.completions.create(
        model="furiosa-ai/Llama-3.1-8B-Instruct-FP8",
        messages=messages,
        tools=custom_tools,
        tool_choice="auto",
        max_completion_tokens=100,
    )

    print("===================")
    print(response.choices[0])
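    tool_calls = response.choices[0].message.tool_calls
    if not tool_calls:
        # With tool_choice="auto", the model may answer directly without calling a tool.
        print("No tool call was generated.")
        continue
    tool_call = tool_calls[0].function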
    print(f"Function called: {tool_call.name}")
    print(f"Arguments: {tool_call.arguments}")
    print(f"Result: {tool_functions[tool_call.name](**json.loads(tool_call.arguments))}")

Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-1ce4b5b3ef324c8399b4151778f73021', function=Function(arguments='{"ticker": "TSLA", "currency": "USD"}', name='get_stock_price'), type='function')], reasoning_content=None))
Function called: get_stock_price
Arguments: {"ticker": "TSLA", "currency": "USD"}
Result: The current price of TSLA is 1363.53 USD.
Choice(finish_reason='tool_calls', index=0, logprobs=None, message=ChatCompletionMessage(content=None, refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=[ChatCompletionMessageToolCall(id='chatcmpl-tool-fcf9999bf2cd4e8bb020b5f8bd0c0927', function=Function(arguments='{"base_currency": "USD", "target_currency": "EUR"}', name='get_exchange_rate'), type='function')], reasoning_content=None))
Function called: 