# Creating custom metrics for your Large Language Model (LLM)

This notebook shows how to define a custom metric in code and submit its values to a DataRobot deployment. Custom metrics let you track quantities that are specific to your business requirements, such as the per-call cost of an LLM computed in this tutorial.

## Installation
Install the required libraries by running the following command in your Jupyter notebook:

In [None]:
!pip install -q datarobot datarobotx[llm] tiktoken 

## Import modules

In [None]:
import os
import requests
import tiktoken
import pandas as pd
from datetime import datetime

## Load secrets

In [None]:
API_URL = os.environ['API_URL']
PRED_API_URL = os.environ['PRED_API_URL']
API_KEY = os.environ['API_KEY']
DEPLOYMENT_ID = os.environ['DEPLOYMENT_ID']  # Deployment ID
CUSTOM_METRIC_ID = os.environ['CUSTOM_METRIC_ID']  # Custom metric ID
DATAROBOT_KEY = os.environ['DATAROBOT_KEY']  # Key sent in the "DataRobot-Key" request header
MODEL_PACKAGE_ID = None
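
Reading each variable with `os.environ[...]` stops at the first missing name. The following is a minimal sketch of a check that reports every missing variable at once. The names `REQUIRED_VARS`, `find_missing`, and `load_settings` are hypothetical helpers, not part of the original notebook.

```python
import os

# Hypothetical helper names; the variable names match the cell above.
REQUIRED_VARS = [
    "API_URL",
    "PRED_API_URL",
    "API_KEY",
    "DEPLOYMENT_ID",
    "CUSTOM_METRIC_ID",
    "DATAROBOT_KEY",
]


def find_missing(env, required=REQUIRED_VARS):
    """Return the required names that are absent or empty in env."""
    return [name for name in required if not env.get(name)]


def load_settings(env=None):
    """Return the required settings as a dict, or raise listing all missing names."""
    env = os.environ if env is None else env
    missing = find_missing(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Usage inside the notebook (not executed here):
# settings = load_settings()
```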

## Creating custom metric function
To understand the total cost of the solution, the organization builds a metric around the pricing of the GPT-3.5 API provided by Azure. The following function calculates the price per prediction call from token counts computed with tiktoken, multiplying each count by the per-token price listed on the Azure OpenAI pricing page. If the LLM is self-hosted, metrics based on compute cost are more relevant for the organization.

In [2]:
# Load the cl100k_base encoding, the tokenizer used by GPT-3.5 Turbo
encoding = tiktoken.get_encoding("cl100k_base")

# Define custom metrics functions
def get_gpt_token_count(text):
    """
    Counts the number of tokens in the given text.

    Args:
        text (str): Input text to count tokens.

    Returns:
        int: Number of tokens in the text.
    """    
    return len(encoding.encode(text))

def get_gpt_3_5_cost(
    prompt, response, prompt_token_cost=0.0015 / 1000, response_token_cost=0.002 / 1000
):
    """
    Calculates the cost of generating a response using GPT, based on token counts and token costs.

    Args:
        prompt (str): Prompt text provided to the model.
        response (str): Generated response text.
        prompt_token_cost (float): Cost per token for the prompt. Default is 0.0015 / 1000.
        response_token_cost (float): Cost per token for the response. Default is 0.002 / 1000.

    Returns:
        float: Cost of generating the response.
    """    
    return (
        get_gpt_token_count(prompt) * prompt_token_cost
        + get_gpt_token_count(response) * response_token_cost
    )

# Example usage
prompt = "How can we improve renewable energy?"
response = "By investing in solar and wind technologies."

# Calculate custom metric
cost = get_gpt_3_5_cost(prompt, response)
print("Cost of generating response:", cost)


Cost of generating response: 2.65e-05
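
The printed value can be checked by hand. The sketch below assumes the prompt and response encode to 7 and 8 tokens respectively. These counts are chosen because they reproduce the value above; the notebook does not show them, so confirm them with get_gpt_token_count. The helper name `cost_from_counts` is hypothetical.

```python
# Per-token prices, taken from the defaults of get_gpt_3_5_cost
PROMPT_TOKEN_COST = 0.0015 / 1000
RESPONSE_TOKEN_COST = 0.002 / 1000


def cost_from_counts(prompt_tokens, response_tokens):
    """Cost of one call given token counts (hypothetical helper)."""
    return prompt_tokens * PROMPT_TOKEN_COST + response_tokens * RESPONSE_TOKEN_COST


# Assumed counts: 7 prompt tokens, 8 response tokens
# 7 * 0.0000015 + 8 * 0.000002 = 0.0000105 + 0.000016 = 0.0000265
cost = cost_from_counts(7, 8)
print(f"{cost:.3g}")  # prints 2.65e-05
```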


## Test predictions on deployed model

In [4]:
def make_predictions(inputText):
    """Make predictions using the specified input text and DataRobot deployment."""

    # Request body: a "promptText" header row followed by the prompt itself
    data = "promptText\n" + inputText

    headers = {
        "Content-Type": "text/plain; charset=UTF-8",
        "Authorization": "Bearer " + API_KEY,
        "DataRobot-Key": DATAROBOT_KEY,
    }

    # PRED_API_URL must contain a {deployment_id} placeholder
    predictions_response = requests.post(
        PRED_API_URL.format(deployment_id=DEPLOYMENT_ID),
        data=data,
        headers=headers,
    )
    return predictions_response

inputText = "What is romance?"
predictions_response = make_predictions(inputText)
predictions_response.json()

{'data': [{'rowId': 0,
   'prediction': 'Romance is a genre of literature, film, and television that focuses on passionate love stories that often have a happy ending. The element of drama, emotion, and intimacy characterizes romantic stories typically unfold between two people who are attracted to each other but face some obstacle, whether internal or external, that keeps them from being together right away. The genre first emerged in the late eighteenth century and has been consistently popular ever since.',
   'predictionValues': [{'label': 'resultText',
     'value': 'Romance is a genre of literature, film, and television that focuses on passionate love stories that often have a happy ending. The element of drama, emotion, and intimacy characterizes romantic stories typically unfold between two people who are attracted to each other but face some obstacle, whether internal or external, that keeps them from being together right away. The genre first emerged in the late eighteenth ce
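
The generated text sits under `data[0].prediction` in the response body. The following is a minimal sketch of pulling it out with a clearer error when the row is absent. The helper `extract_prediction` is hypothetical, and `sample` is a trimmed stand-in for the response shape shown above, not real output.

```python
def extract_prediction(payload, row=0):
    """Return the 'prediction' text for one row of a prediction response body."""
    rows = payload.get("data") or []
    if row >= len(rows):
        raise ValueError(f"response contains no row {row}")
    return rows[row]["prediction"]


# Trimmed, made-up payload with the same keys as the response above
sample = {"data": [{"rowId": 0, "prediction": "example text"}]}
print(extract_prediction(sample))  # prints example text
```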

## Upload custom metrics
The submit_custom_metric function records a value for an existing custom metric on a deployment. It takes the API URL, API key, deployment ID, custom metric ID, model package ID, and the metric value as inputs, builds the endpoint from the API URL, and posts the value to it with the current timestamp.

In [None]:
def submit_custom_metric(API_URL, API_KEY, DEPLOYMENT_ID, CUSTOM_METRIC_ID, MODEL_PACKAGE_ID, metric):
    """Record values for an existing custom metric on a deployment"""

    HEADERS = {
        'Authorization': 'Bearer {}'.format(API_KEY),
        'User-Agent': 'IntegrationSnippet-Requests',
    }
    # Current local time; the format string truncates it to the minute
    time_ = datetime.today().strftime("%m/%d/%Y %I:%M %p")
    # One bucket per (timestamp, value) pair; a single pair here
    rows = [
        {"timestamp": ts.isoformat(), "value": value}
        for ts, value in zip([pd.to_datetime(time_)], [metric])
    ]
    response = requests.post(
        # API_URL must contain two positional placeholders: deployment ID, then metric ID
        API_URL.format(DEPLOYMENT_ID, CUSTOM_METRIC_ID),
        json={'modelPackageId': MODEL_PACKAGE_ID, 'buckets': rows},
        headers=HEADERS,
    )
    # Raise an exception if the server returned a 4xx or 5xx status
    response.raise_for_status()
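
The request body can be built and inspected separately from the network call. The sketch below reproduces the JSON shape used by submit_custom_metric (`modelPackageId` and a `buckets` list of timestamp and value pairs). The helper `build_metric_payload` is hypothetical, and the use of a timezone-aware UTC timestamp is an assumption, not something the original notebook does.

```python
from datetime import datetime, timezone


def build_metric_payload(value, model_package_id=None, timestamp=None):
    """Build the JSON body for one metric value (hypothetical helper)."""
    # Assumption: use UTC when no timestamp is supplied
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "modelPackageId": model_package_id,
        "buckets": [{"timestamp": ts.isoformat(), "value": value}],
    }


# Fixed timestamp so the result is reproducible
payload = build_metric_payload(
    2.65e-05, timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc)
)
print(payload["buckets"][0]["timestamp"])  # prints 2024-01-01T00:00:00+00:00
```

Keeping the payload construction in its own function makes it possible to check the body before anything is sent to the deployment.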

In [None]:
submit_custom_metric(
    API_URL,
    API_KEY,
    DEPLOYMENT_ID,
    CUSTOM_METRIC_ID,
    MODEL_PACKAGE_ID,
    get_gpt_3_5_cost(inputText, predictions_response.json()["data"][0]["prediction"]),
)