# Checks in Okareo: An Introduction

<a target="_blank" href="https://colab.research.google.com/github/okareo-ai/okareo-python-sdk/blob/main/examples/checks.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

## 🎯 Goals

After using this notebook, you will be able to:
- Access the list of available `checks` in Okareo
- Generate and upload a custom `check` to Okareo
- Use `checks` to assess the behaviors of registered models in Okareo

First, import the Okareo library and use your [API key](https://docs.okareo.com/docs/guides/environment#setting-up-your-okareo-environment) to authenticate. You will also need an [OpenAI API Key](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key).

In [1]:
from okareo import Okareo

OKAREO_API_KEY = "<YOUR_OKAREO_API_KEY>"
OPENAI_API_KEY = "<YOUR_OPENAI_API_KEY>"

okareo = Okareo(OKAREO_API_KEY)

## Uploading a Scenario

Here we use an existing `.jsonl` file to create a seed scenario with the `upload_scenario_set` method. The data here includes short questions about a fictitious company called "WebBizz."

In [None]:
import os
import requests

file_path_articles = "webbizz_retrieval_questions.jsonl"
scenario_name_articles = "WebBizz Retrieval Questions"

def load_or_download_file(file_path, scenario_name):
    try:
        # Upload the local file to Okareo
        source_scenario = okareo.upload_scenario_set(file_path=file_path, scenario_name=scenario_name)
    except Exception:
        print(f"- Uploading {file_path} to Okareo failed. Downloading a temporary copy from GitHub...")

        # The local file is probably missing, so fetch it from the SDK repository
        file_url = f"https://raw.githubusercontent.com/okareo-ai/okareo-python-sdk/main/examples/{file_path}"
        response = requests.get(file_url, timeout=30)
        response.raise_for_status()  # stop here if the download itself failed
        with open(file_path, "wb") as f:
            f.write(response.content)

        # Retry the upload with the downloaded copy
        source_scenario = okareo.upload_scenario_set(file_path=file_path, scenario_name=scenario_name)

        # Delete the temporary copy
        os.remove(file_path)
    return source_scenario

source_scenario = load_or_download_file(file_path_articles, scenario_name_articles)
print(f"{scenario_name_articles}: {source_scenario.app_link}")
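
For orientation, a scenario file like the one uploaded above is plain JSON Lines: one JSON object per line. The sketch below writes and re-reads a tiny file of that shape. The `input` and `result` keys, the file name, and the sample rows are assumptions made for illustration; inspect `webbizz_retrieval_questions.jsonl` for the fields it actually uses.

```python
import json
import os
import tempfile

# Hypothetical rows; the real WebBizz file may use different fields and values.
sample_rows = [
    {"input": "What is the WebBizz return policy?", "result": "returns-article"},
    {"input": "How do I reset my WebBizz password?", "result": "account-article"},
]

def write_jsonl(rows, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")

def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

sample_path = os.path.join(tempfile.gettempdir(), "sample_scenario.jsonl")
write_jsonl(sample_rows, sample_path)
loaded_rows = read_jsonl(sample_path)
print(f"{len(loaded_rows)} rows, keys: {sorted(loaded_rows[0])}")
os.remove(sample_path)  # clean up the illustrative file
```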

## Register a Model

For this notebook, we will register a simple model that makes the scenario `input` more concise.

In [None]:
import random
import string

from okareo.model_under_test import OpenAIModel

random_string = ''.join(random.choices(string.ascii_letters, k=5))

mut_name = "OpenAI Concise"
eval_name = "OpenAI Concise Test Run"

USER_PROMPT_TEMPLATE = "{input}"
BREVITY_CONTEXT_TEMPLATE = """
Rewrite the following text in a more concise manner:
"""

print("Registering model...")
# Register the model to use in the test run
model_under_test = okareo.register_model(
    name=mut_name,
    model=OpenAIModel(
        model_id="gpt-3.5-turbo",
        temperature=0,
        system_prompt_template=BREVITY_CONTEXT_TEMPLATE,
        user_prompt_template=USER_PROMPT_TEMPLATE,
    ),
)

print(f"Model registered: {model_under_test}")

## Pre-defined Checks

To bootstrap your LLM evaluation workflow, Okareo offers pre-defined checks. Let's list the available checks with `okareo.get_all_checks()`.

In [4]:
all_checks = okareo.get_all_checks()
all_checks_names = [check.name for check in all_checks]
all_checks_names

['coherence_summary',
 'consistency_summary',
 'fluency_summary',
 'relevance_summary',
 'consistency',
 'coherence',
 'conciseness',
 'fluency',
 'uniqueness',
 'levenshtein_distance',
 'levenshtein_distance_input',
 'compression_ratio',
 'does_code_compile',
 'contains_all_imports']

To see if our model is making the input more concise, let's use the `conciseness` and `levenshtein_distance_input` checks.

We can get some more details on these by running the following snippet:

In [5]:
checks = ['conciseness', 'levenshtein_distance_input']
for check in all_checks:
    if check.name in checks:
        print(f"--- {check.name} ---")
        print(check)

--- conciseness ---
EvaluatorBriefResponse(id='14a5cf92-102a-47a0-9bfe-d26fd123d4f7', name='conciseness', description='A measure of economy of words in the model_output. Lower scores indicate unnecessary verbosity in the model_output. Ranges from 1 to 5.', output_data_type='float', time_created=datetime.datetime(2024, 4, 11, 11, 0), additional_properties={})
--- levenshtein_distance_input ---
EvaluatorBriefResponse(id='cfcff942-c11e-44ee-9660-7716f2b997e1', name='levenshtein_distance_input', description='Calculate the Levenshtein Distance between the model output and the scenario input.', output_data_type='int', time_created=datetime.datetime(2024, 4, 9, 12, 0), additional_properties={})
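
To build intuition for `levenshtein_distance_input`, here is a minimal pure-Python sketch of Levenshtein distance: the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. The function name and the sample strings are illustrative. Okareo's own implementation is not shown in this notebook and may differ in its details.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]

# Illustrative strings, not taken from the WebBizz scenario
scenario_input = "Can you please tell me what the return policy is?"
model_output = "What is the return policy?"
print(levenshtein(model_output, scenario_input))
```

A larger distance means the output differs more from the input. On its own it does not say whether the output is shorter, which is one reason to pair it with `conciseness`.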


The checks above can be used when calling `run_test` on a model under test using the `checks` parameter.

In [7]:
from okareo_api_client.models.test_run_type import TestRunType

# Run the evaluation
evaluation = model_under_test.run_test(
    name=eval_name,
    scenario=source_scenario,
    api_key=OPENAI_API_KEY,
    test_run_type=TestRunType.NL_GENERATION,
    calculate_metrics=True,
    checks=checks,
)
print(f"See results in Okareo: {evaluation.app_link}")
metrics = evaluation.model_metrics

See results in Okareo: https://app.okareo.com/project/d38b3714-8c8f-4d69-8c07-cc7285bbe1b5/eval/69418399-888b-4bd9-aa04-558655657aeb


## Generating a Custom Check

In addition to Okareo's pre-defined checks, you can use Okareo to generate custom checks from a natural-language description.

In [6]:
from okareo_api_client.models import EvaluatorSpecRequest

description = (
    "Calculate the number of tokens in the model_output divided by the number of tokens in the scenario_input. "
    "Get the number of tokens by calling `len(text.split(' '))` on model_output and scenario_input."
)
output_data_type = "float"

generate_request = EvaluatorSpecRequest(
    description=description,
    requires_scenario_input=True,
    requires_scenario_result=False,
    output_data_type=output_data_type
)
generated_test = okareo.generate_check(generate_request).generated_code
print(generated_test)

from abc import ABC, abstractmethod
import re
import nltk
import spacy
import sklearn

class BaseCheck(ABC):
    @staticmethod
    @abstractmethod
    def evaluate():
        pass

class Check(BaseCheck):
    @staticmethod
    def evaluate(model_output: str, scenario_input: str) -> float:
        # Calculate the number of tokens in model_output and scenario_input
        model_output_tokens = len(model_output.split(' '))
        scenario_input_tokens = len(scenario_input.split(' '))
        
        # Calculate the ratio of tokens in model_output to tokens in scenario_input
        token_ratio = model_output_tokens / scenario_input_tokens
        
        return token_ratio
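
Before uploading, the core logic of the generated check can be tried locally. The sketch below copies the ratio computation into a standalone function so that it runs without `nltk`, `spacy`, or `sklearn` installed. The name `token_ratio` and the sample strings are illustrative and not part of the generated code.

```python
def token_ratio(model_output: str, scenario_input: str) -> float:
    """Space-delimited token count of the output divided by that of the input."""
    model_output_tokens = len(model_output.split(' '))
    scenario_input_tokens = len(scenario_input.split(' '))
    return model_output_tokens / scenario_input_tokens

sample_input = "Can you please tell me what the return policy is?"  # 10 tokens
sample_output = "What is the return policy?"                         # 5 tokens
print(token_ratio(sample_output, sample_input))  # 0.5
```

Values below 1 mean the output uses fewer tokens than the input. Because `split(' ')` splits on single spaces, consecutive spaces produce empty tokens that are still counted.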



## Uploading the Check to Okareo

First, we save the generated code to a temp file locally.

In [7]:
import os
import tempfile

check_name = 'token_compression_ratio'
temp_dir = tempfile.gettempdir()
file_path = os.path.join(temp_dir, f"{check_name}.py")
with open(file_path, "w+") as file:
    file.write(generated_test)
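
As an optional safeguard, the source can be inspected before it is uploaded. The sketch below uses Python's `ast` module to confirm that a source string parses and defines a top-level `Check` class with an `evaluate` method, matching the shape of the generated code shown earlier. The helper name `defines_check_evaluate` and the sample source are hypothetical, and this is not a validation that Okareo itself is known to perform.

```python
import ast

def defines_check_evaluate(source: str) -> bool:
    """Return True if source parses and has a top-level class Check with an evaluate method."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False
    for node in tree.body:
        if isinstance(node, ast.ClassDef) and node.name == "Check":
            return any(
                isinstance(item, ast.FunctionDef) and item.name == "evaluate"
                for item in node.body
            )
    return False

# Hypothetical stand-in for the generated code
sample_source = '''
class Check:
    @staticmethod
    def evaluate(model_output: str, scenario_input: str) -> float:
        return len(model_output.split(' ')) / len(scenario_input.split(' '))
'''
print(defines_check_evaluate(sample_source))           # True
print(defines_check_evaluate("def evaluate(): pass"))  # False
```

In this notebook, the same helper could be called as `defines_check_evaluate(generated_test)`.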

Then pass the `file_path` to the `upload_check` method.

In [8]:
token_cr_check = okareo.upload_check(
    name=check_name,
    file_path=file_path,
    requires_scenario_input=True,
    requires_scenario_result=False,
    description=description,
    output_data_type=output_data_type,
)

## Using the Uploaded Check

Finally, we can use the uploaded check by adding it to the `checks` parameter of `run_test`.

In [10]:
# Run the evaluation with the custom check
evaluation = model_under_test.run_test(
    name=eval_name,
    scenario=source_scenario,
    api_key=OPENAI_API_KEY,
    test_run_type=TestRunType.NL_GENERATION,
    calculate_metrics=True,
    checks=['conciseness', 'levenshtein_distance_input', check_name],
)
print(f"See results in Okareo: {evaluation.app_link}")
metrics = evaluation.model_metrics

See results in Okareo: https://app.okareo.com/project/d38b3714-8c8f-4d69-8c07-cc7285bbe1b5/eval/903db520-4fdc-4760-adc8-b9366c0ff52a
