<a target="_blank" href="https://colab.research.google.com/github/okareo-ai/okareo-python-sdk/blob/main/examples/test_runs.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## Welcome to Okareo!

Get your API token from [https://app.okareo.com/](https://app.okareo.com/) and set it in the cell below. 👇
(Note: you will need to register first.)



In [None]:
OKAREO_API_KEY = "<YOUR-OKAREO-API-TOKEN>"

%pip install okareo transformers torch

We're going to set up a simple classification task that will score a model on how accurately it can classify different scenarios. The setup will have three parts:

1. A pretrained model that classifies scenarios as pertaining to either "pricing," "returns," or "complaints"
2. A set of scenarios to test the model on
3. An evaluation of the model

## The model

The model is a fine-tuned version of [DistilBERT](https://huggingface.co/docs/transformers/en/model_doc/distilbert#distilbert), a smaller, faster variant of BERT.

In [None]:
# Load libraries
from transformers import AutoTokenizer, DistilBertForSequenceClassification

# Load a tokenizer for the model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Load the pretrained model from the Hugging Face Hub
model = DistilBertForSequenceClassification.from_pretrained("okareo-ai/webbizz_classification_model")

In [9]:
# Load all of the necessary libraries from Okareo
from okareo import Okareo
from okareo_api_client.models import ScenarioSetCreate, SeedData
from okareo.model_under_test import CustomModel, ModelInvocation

# Load the torch library
import torch

# Create an instance of the Okareo client
okareo = Okareo(OKAREO_API_KEY)

# Define a model class that will be used for classification
# The model takes in a scenario and returns a predicted class
class ClassificationModel(CustomModel):
    # Constructor for the model
    def __init__(self, name, tokenizer, model):
        self.name = name
        # The pretrained tokenizer
        self.tokenizer = tokenizer
        # The pretrained model
        self.model = model
        # The possible labels for the model
        self.label_lookup = ["pricing", "returns", "complaints"]

    # Callable to be applied to each scenario in the scenario set
    def invoke(self, input: str):
        # Tokenize the input
        encoding = self.tokenizer(input, return_tensors="pt", padding="max_length", truncation=True, max_length=512)
        # Get the logits from the model (gradients are not needed for inference)
        with torch.no_grad():
            logits = self.model(**encoding).logits
        # Get the index of the highest value (the predicted class)
        idx = torch.argmax(logits, dim=1).item()
        # Get the label for the predicted class
        prediction = self.label_lookup[idx]
        
        # Return the prediction in a ModelInvocation object
        return ModelInvocation(
                actual=prediction,
                model_input=input,
                model_result={ "prediction": prediction, "confidence": logits.softmax(dim=1).max().item() },
            )

# Register the model with Okareo
# This will return a model if it already exists or create a new one if it doesn't
model_under_test = okareo.register_model(
    name="intent_classifier_model",
    model=ClassificationModel(name="Classification model", tokenizer=tokenizer, model=model),
    update=True,
)
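
To make the `argmax` and `softmax` steps in `invoke` concrete, here is a minimal pure-Python sketch of how one row of logits becomes a label and a confidence score. It is an illustration only, not the notebook's implementation: the helper names `softmax` and `predict` and the sample logits are hypothetical, and the real model performs these steps on PyTorch tensors.

```python
import math

# Label order copied from ClassificationModel.label_lookup above
LABELS = ["pricing", "returns", "complaints"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the maximum before exponentiating for numerical stability
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits, labels=LABELS):
    """Return (label, confidence) for a single row of logits."""
    probs = softmax(logits)
    # Index of the largest probability, equivalent to argmax over the logits
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx], probs[idx]


# Hypothetical logits, invented for illustration only
label, confidence = predict([0.2, 2.5, -1.0])
print(label, round(confidence, 3))
```

Because softmax preserves the ordering of the logits, the predicted label is the same whether the argmax is taken before or after normalization; the softmax is only needed to report a confidence value.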

## The scenarios

In [11]:
# Define a scenario set
# This is a collection of scenarios that will be used to test the model
scenario_set_create = ScenarioSetCreate(name="My Test Scenario Set", # Name of the scenario set
                                        # The data that will be used to test the model
                                        # Each SeedData object has an input (the scenario) and a result (the expected output)
                                        seed_data=[
                                            SeedData(input_="I want to send this product back", result="returns"),
                                            SeedData(input_="my product is not working", result="complaints"),
                                            SeedData(input_="how much is the warranty on the product", result="pricing"),
                                            SeedData(input_="this product is having issues", result="complaints"),
                                            SeedData(input_="I want to send this product back for a return", result="returns"),
                                            SeedData(input_="how much is this product", result="pricing"),
                                            SeedData(input_="I just received my order, and it's not what I expected. What should I do?", result="returns"),
                                            SeedData(input_="I ordered a book, but I received a DVD. What's the next step?", result="returns"),
                                            SeedData(input_="I'm having trouble with the product I purchased. Who should I contact?", result="complaints"),
                                            SeedData(input_="The software I purchased isn't compatible with my computer. Who can help me with this?", result="complaints"),
                                            SeedData(input_="The product I bought last week is now on sale. Can I get a refund for the difference?", result="pricing"),
                                            SeedData(input_="I saw an ad for a discount on your products, but I can't find any information on your site. Can you help?", result="pricing")])

# Create the scenario set
scenario = okareo.create_scenario_set(scenario_set_create)
scenario_id = scenario.scenario_id
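
The twelve seed entries above are split evenly, four per label. Before uploading a scenario set, a quick local check along these lines can catch misspelled labels or a lopsided class mix. This is a sketch using plain tuples rather than `SeedData` objects; `check_seed_pairs` and `ALLOWED_LABELS` are hypothetical names, and only the first six entries are copied here for brevity.

```python
from collections import Counter

# The first six (input, expected label) pairs copied from the seed data above
seed_pairs = [
    ("I want to send this product back", "returns"),
    ("my product is not working", "complaints"),
    ("how much is the warranty on the product", "pricing"),
    ("this product is having issues", "complaints"),
    ("I want to send this product back for a return", "returns"),
    ("how much is this product", "pricing"),
]

# Labels the classifier can emit (same values as label_lookup in the model class)
ALLOWED_LABELS = {"pricing", "returns", "complaints"}


def check_seed_pairs(pairs, allowed=ALLOWED_LABELS):
    """Return per-label counts, raising if a label is not one the model can emit."""
    unknown = {label for _, label in pairs if label not in allowed}
    if unknown:
        raise ValueError(f"Unexpected labels: {sorted(unknown)}")
    return Counter(label for _, label in pairs)


label_counts = check_seed_pairs(seed_pairs)
print(dict(label_counts))
```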

## Evaluation

In [None]:
# Run the test
# This will run the model on the scenarios in the scenario set
test_run_item = model_under_test.run_test(scenario=scenario_id, name="Intent Classifier Run", calculate_metrics=True)

# Convert the computed metrics to a dictionary for local inspection
model_results = test_run_item.model_metrics.to_dict()

# Get a link back to Okareo for evaluation visualization
app_link = test_run_item.app_link
print(f"See results in Okareo: {app_link}")
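
For intuition about what a classification evaluation measures, the sketch below computes accuracy and confusion counts from two lists of labels. It is not Okareo's metric implementation, and the exact contents of `model_results` are not shown in this notebook; the function names and the sample predictions are hypothetical.

```python
from collections import Counter


def accuracy(expected, predicted):
    """Fraction of positions where the predicted label equals the expected one."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("cannot score an empty run")
    hits = sum(e == p for e, p in zip(expected, predicted))
    return hits / len(expected)


def confusion_counts(expected, predicted):
    """Count how often each (expected, predicted) label pair occurs."""
    return Counter(zip(expected, predicted))


# Hypothetical labels, invented for illustration only
expected = ["returns", "complaints", "pricing", "complaints"]
predicted = ["returns", "complaints", "pricing", "returns"]

score = accuracy(expected, predicted)
pairs = confusion_counts(expected, predicted)

print(f"accuracy: {score:.2f}")
for (exp_label, pred_label), count in sorted(pairs.items()):
    print(f"expected={exp_label} predicted={pred_label} count={count}")
```

Pairs where the two labels differ show which classes the model confuses, which is often more useful than the single accuracy number when deciding which scenarios to add next.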