# Multi-turn Evaluation with Tool Call Demo

**🎯 Goal**:
- Run a multi-turn evaluation in Okareo.
- Provide a simple introduction to Okareo evaluations.

**📋 Steps**:
1. Upload a multi-turn scenario.
2. Define a model to act as the Target in a multi-turn conversation, and add the tools it can call.
3. Run the evaluation using the scenario, model, and a check for task completion.
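
Before going through the steps, it can help to picture the loop a multi-turn evaluation runs. The sketch below is a conceptual illustration only, not Okareo's implementation: `run_conversation` and the `toy_*` functions are hypothetical stand-ins, and treating one turn as one driver message plus one target reply is an assumption of this sketch.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, text)


def run_conversation(
    driver: Callable[[List[Message]], str],
    target: Callable[[List[Message]], str],
    is_complete: Callable[[str], bool],
    max_turns: int = 6,
) -> List[Message]:
    """Alternate driver and target messages until the check passes or turns run out."""
    history: List[Message] = []
    for _ in range(max_turns):
        history.append(("driver", driver(history)))
        reply = target(history)
        history.append(("target", reply))
        # Stop as soon as the completion check passes on the target's reply
        if is_complete(reply):
            break
    return history


# Toy stand-ins (hypothetical) for the Driver, the Target, and the check
def toy_driver(history: List[Message]) -> str:
    return "Please delete my account." if not history else "Yes, I am sure."


def toy_target(history: List[Message]) -> str:
    driver_messages = [text for role, text in history if role == "driver"]
    if len(driver_messages) == 1:
        return "Are you sure you want to delete your account?"
    return "Your account has been deleted successfully."


def toy_check(reply: str) -> bool:
    return "successfully" in reply


transcript = run_conversation(toy_driver, toy_target, toy_check)
for role, message in transcript:
    print(f"{role}: {message}")
```

In the real evaluation, the Driver and the check are both model-based, so the stopping point depends on the generated conversation rather than on fixed strings.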

In [None]:
!pip install okareo

In [2]:
# get Okareo client
import os
from okareo import Okareo

OKAREO_API_KEY = os.environ.get("OKAREO_API_KEY", "<YOUR_OKAREO_API_KEY>")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "<YOUR_OPENAI_API_KEY>")
okareo = Okareo(OKAREO_API_KEY)

Upload a simple scenario. Each row of the `seed_data` should contain:

- `input_`: a prompt used to direct the Driver.

    - The driver prompt needs to be aware of possible tool calls. This awareness allows the Driver to return a mocked JSON response when appropriate.

- `result`: the expected outcome for the row (here, the requested action), which the task-completion check reads as `{scenario_result}`.
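
To make "mocked JSON response" concrete, here is a minimal sketch of the kind of payload the Driver might return after the Target issues a tool call. In the evaluation, the Driver is a language model that writes this JSON itself. The `mock_tool_response` helper and the fields in its payload are hypothetical and only illustrate the shape.

```python
import json


def mock_tool_response(tool_call: dict) -> str:
    """Build a mocked 'success' result for an OpenAI-style tool call (hypothetical payload shape)."""
    function = tool_call["function"]
    arguments = function["arguments"]
    # OpenAI-style tool calls carry arguments as a JSON string; accept a dict too
    if isinstance(arguments, str):
        arguments = json.loads(arguments)
    return json.dumps(
        {
            "tool_call_id": tool_call.get("id"),
            "name": function["name"],
            "status": "success",
            "user_id": arguments.get("user_id"),
        }
    )


example_call = {
    "id": "call_123",  # made-up identifier for illustration
    "type": "function",
    "function": {
        "name": "delete_account",
        "arguments": '{"user_id": "john_doe_123"}',
    },
}
print(mock_tool_response(example_call))
```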

In [3]:
import random
import string

from okareo_api_client.models.scenario_set_create import ScenarioSetCreate
from okareo_api_client.models.seed_data import SeedData

def random_string(length: int) -> str:
    return "".join(random.choices(string.ascii_letters, k=length))

JSON_SCENARIO = [
    {
        "input": {
            "user_name": "john_doe_123",
            "action": "delete account",
        },
        "result": "delete account", 
        #"result": [{"function":{"arguments":{"user_id":"john_doe_123"}, "name":"delete_account"},"id":".*","type":"function"}],
    },
    {
        "input": {
            "user_name": "susan_smith_456",
            "action": "open account",
        },
        "result": "open account",
        #"result": [{"function":{"arguments":{"user_id":"susan_smith_456"}, "name":"open_account"},"id":".*","type":"function"}],
    },
    {
        "input": {
            "user_name": "alex_jones_789",
            "action": "account balance",
        },
        "result": "account balance",
        #"result": [{"function":{"arguments":{"user_id":"alex_jones_789"}, "name":"account_balance"},"id":".*","type":"function"}],
    },
]

JSON_SEED = Okareo.seed_data_from_list(JSON_SCENARIO)  # type: ignore
scenario_set_create = ScenarioSetCreate(
    name=f"Multi-turn Demo Scenario - {random_string(5)}",
    seed_data=JSON_SEED,
)
okareo.create_scenario_set(scenario_set_create)

# Create seeds from JSON_SCENARIO
seeds = [
    SeedData(
        input_="""You are interacting with a customer service agent. First, ask a single question about WebBizz. 
        
        After getting information about WebBizz, you need to perform this action: {action}. Your user ID is {user_name}.
        
        If you receive any function calls, output the result in JSON format and provide a JSON response indicating that the action was successful. """.format(
            action=scenario["input"]["action"],
            user_name=scenario["input"]["user_name"]
        ),
        result=scenario["result"]
    )
    for scenario in JSON_SCENARIO
] 

# Create scenario set with the generated seeds
scenario_set_create = ScenarioSetCreate(
    name=f"Driver Prompts - {random_string(5)}",
    seed_data=seeds
)
scenario = okareo.create_scenario_set(scenario_set_create)

Create a `ModelBasedCheck` using a description of what task completion would look like for the customer service agent.
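
To preview what the judge model will read, you can fill the template placeholders locally. This sketch assumes Okareo substitutes `{scenario_result}` and `{model_output}` at evaluation time. Python's `str.format` is only a stand-in for that substitution, and the template and sample values below are abbreviated, made-up examples.

```python
# Abbreviated stand-in for the check's prompt template
template = (
    "The task is complete if the output confirms that the {scenario_result} action "
    "was successful. Here is the output to check: {model_output}"
)

# Sample values; during an evaluation these come from the scenario row and the Target
preview = template.format(
    scenario_result="delete account",
    model_output="Your account has been deleted.",
)
print(preview)
```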

In [None]:
from okareo.checks import ModelBasedCheck, CheckOutputType

# Define a ModelBasedCheck to evaluate task completion
# This check determines whether the customer service agent confirms that the requested account action succeeded
prompt = (
    "The task is complete if the output confirms that the {scenario_result} action was successful. "
    "Return True if the task is completed, False otherwise. "
    "Here is the output to check: {model_output}"
)
okareo.create_or_update_check(
    name="task_completion_account_management",
    description="Check if the agent confirms account action",
    check=ModelBasedCheck(prompt_template=prompt, check_type=CheckOutputType.PASS_FAIL),
)

Register a `MultiTurnDriver` that uses your OpenAI model as the Target, and add the tools that the model can call.
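
The three tools in this demo share the same single-parameter schema, so one optional way to cut repetition is to build them with a small helper. `user_id_tool` is a hypothetical convenience function, not part of the Okareo or OpenAI SDKs. It returns plain dictionaries in the same function-tool format used in the cell that follows.

```python
from typing import Any, Dict, List


def user_id_tool(name: str, description: str) -> Dict[str, Any]:
    """Return a function-tool definition that takes a single required `user_id` string."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "user_id": {
                        "type": "string",
                        "description": "The unique identifier for the user's account",
                    },
                },
                "required": ["user_id"],
            },
        },
    }


tools: List[Dict[str, Any]] = [
    user_id_tool("delete_account", "Deletes the user's account"),
    user_id_tool("open_account", "Opens a new account for the user"),
    user_id_tool("account_balance", "Retrieves the current balance for a user's account"),
]
print([tool["function"]["name"] for tool in tools])
```

A list built this way could be passed as the `tools` argument in place of the literal definitions.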

In [5]:
from okareo.model_under_test import MultiTurnDriver, StopConfig, OpenAIModel

# Register a multi-turn model with Okareo
multiturn_model = okareo.register_model(
    name="Demo MultiTurnDriver - OpenAI",
    model=MultiTurnDriver(
        driver_temperature=0,
        max_turns=6,
        # Configure the OpenAI model as the target
        target=OpenAIModel(
                model_id="gpt-4o",
                temperature=0,
                # Define the system prompt for the customer service agent
                system_prompt_template="""You are a customer service agent that can talk about WebBizz, a web-based business that works in ecommerce and can help users manage their accounts by opening, checking balances, and deleting them.
                    Be polite and ensure the user understands the implications of account opening and deletion before you call the function.
                    
                    For account_balance, open_account, and delete_account confirm when the action has been completed successfully.

                    When you call delete_account, the user's account will be deleted, so always ask to make sure they want to delete. Your responses shouldn't be longer than 2 sentences.
                    """,
                # Define the tools available to the model
                tools=[
                    {
                        "type": "function",
                        "function": {
                            "name": "delete_account",
                            "description": "Deletes the user's account",
                            "parameters": {
                                "type": "object",
                                "properties": {
                                    "user_id": {
                                        "type": "string",
                                        "description": "The unique identifier for the user's account",
                                    },
                                },
                                "required": ["user_id"],
                            },
                        },
                    },
                    {
                        "type": "function",
                        "function": {
                            "name": "open_account",
                            "description": "Opens a new account for the user",
                            "parameters": {
                                "type": "object",
                                "properties": {
                                    "user_id": {
                                        "type": "string",
                                        "description": "The unique identifier for the user's account",
                                    },
                                },
                                "required": ["user_id"],
                            },
                        },
                    },
                    {
                        "type": "function",
                        "function": {
                            "name": "account_balance",
                            "description": "Retrieves the current balance for a user's account",
                            "parameters": {
                                "type": "object",
                                "properties": {
                                    "user_id": {
                                        "type": "string",
                                        "description": "The unique identifier for the user's account",
                                    },
                                },
                                "required": ["user_id"],
                            },
                        },
                    }
                ],
            ),
        # Configure when to stop the conversation
        stop_check=StopConfig(check_name="task_completion_account_management", stop_on=True)
        #stop_check=StopConfig(check_name="function_call_consistency", stop_on=True)
    ),
    update=True,
)

Run a [Generation evaluation](https://docs.okareo.ai/docs/guides/generation_overview) in multi-turn mode on the registered model.

In [None]:
from okareo_api_client.models.test_run_type import TestRunType

evaluation = multiturn_model.run_test(
    api_key=OPENAI_API_KEY,
    scenario=scenario,
    name="Multi-turn Demo Evaluation w/ Tool Call",
    test_run_type=TestRunType.MULTI_TURN,
    # checks=[
    #     "function_call_validator"
    # ]
)

print(f"See results in Okareo: {evaluation.app_link}")