# Simulating Tool Calling

In this cookbook, we demonstrate how to evaluate tool calling capabilities in LLM applications using objective metrics. As always, we'll focus on data-driven approaches to measure and improve tool selection performance.

When building AI assistants, we often need them to use external tools: searching databases, calling APIs, or processing data. But how do we know if our model is selecting the right tools at the right time? Evaluation methods that only score the generated text don't capture this well.

Imagine you're building a customer service bot. A user asks "What's my account balance?" Your assistant needs to decide: should it query the account database, ask for authentication, or simply respond with general information? Selecting the wrong tool leads to either frustrated users (if important tools are missed) or wasted resources (if unnecessary tools are called).

The key insight is that tool selection quality is distinct from text generation quality. You can have a model that writes beautiful responses but consistently fails to take appropriate actions. By measuring precision and recall of tool selection decisions, we can systematically improve how our models interact with the world around them.
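To make the two metrics concrete, here is a minimal sketch of how precision and recall could be computed for a single query. The function name `tool_selection_scores` and the tool names are hypothetical, chosen only for illustration. Scoring an empty denominator as 1.0 is an assumed convention, not something this cookbook prescribes.

```python
from typing import Dict, Iterable


def tool_selection_scores(expected: Iterable[str], selected: Iterable[str]) -> Dict[str, float]:
    """Compute precision and recall for one query's tool selection.

    expected: tools that should have been called (ground truth).
    selected: tools the model actually called.
    """
    expected_set, selected_set = set(expected), set(selected)
    true_positives = len(expected_set & selected_set)

    # Precision: of the tools the model called, how many were needed?
    # Assumed convention: calling nothing means no wrong calls, so 1.0.
    precision = true_positives / len(selected_set) if selected_set else 1.0

    # Recall: of the tools that were needed, how many did the model call?
    # Assumed convention: needing nothing means nothing was missed, so 1.0.
    recall = true_positives / len(expected_set) if expected_set else 1.0

    return {"precision": precision, "recall": recall}


# Hypothetical tool names for the account-balance question above.
scores = tool_selection_scores(
    expected=["authenticate_user", "get_account_balance"],
    selected=["get_account_balance", "search_faq"],
)
print(scores)
```

In this example the model called one needed tool and one unnecessary tool, and it missed one needed tool. Low precision corresponds to wasted resources from unnecessary calls, while low recall corresponds to missed actions that frustrate users.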

## Requirements

Before starting, ensure you have the following packages installed:

In [1]:
%pip install langwatch pydantic openai pandas


## Setup

Start by setting up LangWatch to monitor your tool-calling application:

In [None]:
import langwatch
import openai
import getpass
import pandas as pd

# Display settings for better visualization
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 1000)
pd.set_option('display.max_rows', 100)
pd.set_option('display.max_colwidth', 100)

# Initialize OpenAI and LangWatch
openai.api_key = getpass.getpass('Enter your OpenAI API key: ')
langwatch.api_key = getpass.getpass('Enter your LangWatch API key: ')