# Command-R Function Calling Evaluation Demo

**🎯 Goal**:
- Use Okareo to evaluate Cohere's Command-R function calling model.

**📋 Steps**:
- Upload a function calling scenario
- Select predefined checks to assess the agent's function calling outputs
- Define a custom model to invoke Command-R for function calling
- Run the evaluation using the scenario, checks, and model

In [None]:
# get Okareo client
import os
from okareo import Okareo

OKAREO_API_KEY = os.environ["OKAREO_API_KEY"]
okareo = Okareo(OKAREO_API_KEY)

Upload a scenario based on data used in the Berkeley Function Calling Leaderboard.

See here for more details: https://gorilla.cs.berkeley.edu/leaderboard.html#leaderboard

In [None]:
# upload the scenario set from a local JSON Lines file
file_path = "scenarios/google_api_scenario.jsonl"
tool_scenario = okareo.upload_scenario_set(
    scenario_name="Google API Tool Scenario",
    file_path=file_path,
)
print(tool_scenario.app_link)
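
Before uploading, it can help to sanity-check the scenario file locally. The sketch below is a minimal example, assuming each line of the JSON Lines file is an object with an `input` and a `result` field. That layout is an assumption about the file, and the helper name `check_scenario_rows` and the sample rows are hypothetical.

```python
import json


def check_scenario_rows(lines, required_keys=("input", "result")):
    """Parse JSON Lines text and return (valid_rows, problems).

    The required keys are an assumption about the scenario layout.
    """
    rows, problems = [], []
    for line_no, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {line_no}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(row, dict):
            problems.append(f"line {line_no}: expected a JSON object")
            continue
        missing = [key for key in required_keys if key not in row]
        if missing:
            problems.append(f"line {line_no}: missing {missing}")
        else:
            rows.append(row)
    return rows, problems


# Hypothetical sample rows, not taken from the real scenario file
sample = [
    '{"input": "Translate hello to French", "result": "translate(q=\\"hello\\", target=\\"fr\\")"}',
    '{"input": "This row has no expected result"}',
]
rows, problems = check_scenario_rows(sample)
print(len(rows), problems)
```

To check the real file, pass an open file handle instead of the sample list, for example `check_scenario_rows(open(file_path))`.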

Load the tool definitions for use with Command-R.

In [None]:
# load the tools to use with command-r
import json

def int_to_bool(obj):
    # object_hook for json.load: called on every decoded JSON object,
    # converts all integer values in that object to booleans
    return {k: (bool(v) if isinstance(v, int) else v) for k, v in obj.items()}

search_filename = "apis/cohere_format/google_search.json"
with open(search_filename, "r") as f:
    search_contents = json.load(f, object_hook=int_to_bool)

translate_filename = "apis/cohere_format/google_translate.json"
with open(translate_filename, "r") as f:
    translate_contents = json.load(f, object_hook=int_to_bool)
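
To see what the `object_hook` does, here is a small self-contained sketch run on an inline JSON string instead of the real files. The tool definition is hypothetical. It assumes the files store flags such as `required` as `0`/`1`, which is an assumption about their contents.

```python
import json


def int_to_bool(obj):
    # Same hook as above: convert every integer value to a boolean
    return {k: (bool(v) if isinstance(v, int) else v) for k, v in obj.items()}


# Hypothetical tool definition with integer flags
sample_tool_json = """
[
  {
    "name": "example_search",
    "description": "Hypothetical search tool.",
    "parameter_definitions": {
      "q": {"description": "Query string", "type": "str", "required": 1},
      "num": {"description": "Result count", "type": "int", "required": 0}
    }
  }
]
"""

# The hook runs on each nested object; the top-level list is left as a list
tools = json.loads(sample_tool_json, object_hook=int_to_bool)
params = tools[0]["parameter_definitions"]
print(params["q"]["required"], params["num"]["required"])

# Caveat: the hook converts *any* integer value, not only 0/1 flags
caveat = json.loads('{"limit": 10}', object_hook=int_to_bool)
print(caveat)
```

Because the hook is indiscriminate, it is only safe when the tool files contain no genuine integer values such as numeric defaults.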

Define the [CustomModel](https://docs.okareo.ai/docs/sdk/okareo_python#custommodel--modelinvocation) to call Command-R and parse its outputs.

In [None]:
# custom model that calls command-r with tools

from okareo.model_under_test import CustomModel, ModelInvocation
import cohere

COHERE_API_KEY = os.environ["COHERE_API_KEY"]

class CommandRToolModel(CustomModel):
    def __init__(self, name):
        super().__init__(name)
        self.client = cohere.Client(api_key=COHERE_API_KEY)
        # both files hold a list of tool definitions; combine them into one list
        self.tools = search_contents + translate_contents
        self.preamble = (
            "You are a Google API assistant helping a user translate their requests into code. "
            "The user will provide a description of the task they want to accomplish, "
            "and you will generate the corresponding Python code.\n\n"
            "Only output the code snippet that corresponds to the API call that answers the user's question. "
            "For example, 'my_api_call(\"my argument #1\", \"my argument #2\")'"
        )

    def invoke(self, input_value):
        response = self.client.chat(
            message=input_value,
            tools=self.tools,
            preamble=self.preamble,
            model="command-r",
        )
        # extract the tool completion
        message_out = response.chat_history[-1]
        print(message_out)
        return ModelInvocation(
            model_prediction=message_out,
            model_input=input_value,
            raw_model_output=response.meta,
        )

# Register the model to use in the test run
mut_name = "Command-R Tool Model"
model_under_test = okareo.register_model(
    name=mut_name,
    model=[CommandRToolModel(name=CommandRToolModel.__name__)],
    update=True,
)
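
If the raw chat message object turns out to be awkward to store or compare, one option is to reduce the tool calls to plain dictionaries first. The sketch below is illustrative only: it assumes each tool call exposes `name` and `parameters` attributes, the helper `tool_calls_to_dicts` is hypothetical, and `SimpleNamespace` stands in for the real response objects so the example runs without an API key.

```python
from types import SimpleNamespace


def tool_calls_to_dicts(tool_calls):
    """Convert tool-call objects into JSON-serializable dicts.

    Assumes each call has `.name` and `.parameters` (an assumption
    about the response shape). Returns [] when there are no calls.
    """
    return [
        {"name": call.name, "parameters": dict(call.parameters)}
        for call in (tool_calls or [])
    ]


# Stand-in for a tool call returned by the model (hypothetical values)
fake_calls = [
    SimpleNamespace(name="google_translate", parameters={"q": "hello", "target": "fr"})
]
prediction = tool_calls_to_dicts(fake_calls)
print(prediction)
```

A list of plain dictionaries like this could then be passed as `model_prediction` in place of the message object.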

Run a [Generation evaluation](https://docs.okareo.ai/docs/guides/generation_overview) on the custom Command-R model. 

We use predefined [checks in Okareo](https://docs.okareo.ai/docs/getting-started/concepts/checks). The selected checks reflect the [evaluation metrics](https://gorilla.cs.berkeley.edu/blogs/8_berkeley_function_calling_leaderboard.html#metrics) used in the Berkeley Function Calling Leaderboard and include:

- "Is Function Correct"
- "Are Required Parameters Present"
- "Are All Parameters Expected"
- "Do Parameter Values Match"

In [None]:
# evaluation that uses the scenario, checks, and model
from okareo_api_client.models.test_run_type import TestRunType

eval_name = "Command-R Tool Call evaluation"
evaluation = model_under_test.run_test(
    name=eval_name,
    scenario=tool_scenario.scenario_id,
    test_run_type=TestRunType.NL_GENERATION,
    checks=[
        "is_function_correct",
        "are_required_params_present",
        "are_all_params_expected",
        "do_param_values_match",
    ],
)
print(f"See results in Okareo: {evaluation.app_link}")