[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openlayer-ai/examples-gallery/blob/main/monitoring/llms/monitoring-llms.ipynb)


# <a id="top">Monitoring LLMs</a>

This notebook illustrates a typical monitoring flow for LLMs using Openlayer. For more details, refer to the [How to set up monitoring](https://docs.openlayer.com/docs/how-to-guides/set-up-monitoring) guide in the documentation.


## <a id="toc">Table of contents</a>

1. [**Creating a project and an inference pipeline**](#inference-pipeline) 

2. [**Publishing production data**](#publish-batches)

3. [(Optional) **Uploading a reference dataset**](#reference-dataset)

4. [(Optional) **Publishing ground truths**](#ground-truths)

Before we start, let's download the sample data and import pandas.

In [None]:
%%bash

if [ ! -e "fine_tuning_dataset.csv" ]; then
    curl "https://openlayer-static-assets.s3.us-west-2.amazonaws.com/examples-datasets/monitoring/llms/fine_tuning_dataset.csv" --output "fine_tuning_dataset.csv"
fi

if [ ! -e "prod_data_no_ground_truths.csv" ]; then
    curl "https://openlayer-static-assets.s3.us-west-2.amazonaws.com/examples-datasets/monitoring/llms/prod_data_no_ground_truths.csv" --output "prod_data_no_ground_truths.csv"
fi

if [ ! -e "prod_ground_truths.csv" ]; then
    curl "https://openlayer-static-assets.s3.us-west-2.amazonaws.com/examples-datasets/monitoring/llms/prod_ground_truths.csv" --output "prod_ground_truths.csv"
fi

In [None]:
import pandas as pd

## <a id="inference-pipeline"> 1. Creating a project and an inference pipeline </a>

[Back to top](#top)

In [None]:
!pip install openlayer

In [None]:
import openlayer

client = openlayer.OpenlayerClient("YOUR_API_KEY_HERE")

In [None]:
from openlayer.tasks import TaskType

project = client.create_project(
    name="Python QA",
    task_type=TaskType.LLM,
)

Now that you are authenticated and have a project on the platform, it's time to create an inference pipeline. Creating an inference pipeline is what enables the monitoring capabilities in a project.

In [None]:
inference_pipeline = project.create_inference_pipeline()

## <a id="publish-batches"> 2. Publishing production data </a>

[Back to top](#top)

In production, as the model makes predictions, the data can be published to Openlayer. This is done with the `publish_batch_data` method. 

The data published to Openlayer can include a column of **inference ids** and a column of **timestamps** (UNIX time, in seconds). Both are optional and, if omitted, receive default values. The inference id matters most if you plan to publish ground truths later, because it is the key used to match them to the rows already published.
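
If your own production data lacks these columns, they can be added before publishing. The following is a minimal sketch, not part of the Openlayer API: the helper `add_tracking_columns` and the column name `timestamp` are hypothetical, while `inference_id` matches the column name used in this notebook's batch config.

```python
import time
import uuid

import pandas as pd


def add_tracking_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with an inference id and a UNIX timestamp per row."""
    out = df.copy()
    # One unique id per row, so ground truths can be matched to it later.
    out["inference_id"] = [str(uuid.uuid4()) for _ in range(len(out))]
    # Whole seconds since the UNIX epoch; every row gets the same value here.
    out["timestamp"] = int(time.time())
    return out


# Toy data with the same question/answer columns as the sample dataset.
example = pd.DataFrame(
    {
        "question": ["What is a list?", "What is a dict?"],
        "answer": ["An ordered sequence.", "A key-value mapping."],
    }
)
tracked = add_tracking_columns(example)
```

Keep the generated ids somewhere durable (for example, alongside your request logs), since you will need them again when the ground truths arrive.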

In [None]:
production_data = pd.read_csv("prod_data_no_ground_truths.csv")

In [None]:
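# Split by position with .iloc, whose end bound is exclusive, so no row
# appears in two batches (label-based .loc slices include both endpoints).
batch_1 = production_data.iloc[:9]
batch_2 = production_data.iloc[9:18]
batch_3 = production_data.iloc[18:]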

In [None]:
batch_1.head()

### <a id="publish-to-openlayer"> Publish to Openlayer </a>

Here, we're simulating three calls to `publish_batch_data`. In practice, this call lives in your inference pipeline and runs after the model produces its predictions.
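
One way to structure that call is a small batching helper. This is a sketch under assumptions: `iter_batches` and `publish_all` are hypothetical helpers, and the `publish` argument is any callable that accepts a DataFrame. The example passes a stand-in that only records what it receives, so it runs without an Openlayer account.

```python
from typing import Callable, Iterator

import pandas as pd


def iter_batches(df: pd.DataFrame, batch_size: int) -> Iterator[pd.DataFrame]:
    """Yield consecutive, non-overlapping slices of df by row position."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(df), batch_size):
        # .iloc excludes the end position, so no row lands in two batches.
        yield df.iloc[start:start + batch_size]


def publish_all(
    df: pd.DataFrame,
    batch_size: int,
    publish: Callable[[pd.DataFrame], None],
) -> int:
    """Call publish once per batch and return the number of batches sent."""
    count = 0
    for batch in iter_batches(df, batch_size):
        publish(batch)
        count += 1
    return count


# Stand-in for the real publishing call: it only records each batch.
sent = []
rows = pd.DataFrame({"question": [f"q{i}" for i in range(25)]})
n_batches = publish_all(rows, 9, sent.append)
```

With the objects from this notebook, the stand-in could be replaced by `lambda b: inference_pipeline.publish_batch_data(batch_df=b, batch_config=batch_config)`.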

In [None]:
batch_config = {
    "inputVariableNames": ["question"],
    "outputColumnName": "answer",
    "inferenceIdColumnName": "inference_id",
}


In [None]:
inference_pipeline.publish_batch_data(
    batch_df=batch_1,
    batch_config=batch_config
)

In [None]:
inference_pipeline.publish_batch_data(
    batch_df=batch_2,
    batch_config=batch_config
)

In [None]:
inference_pipeline.publish_batch_data(
    batch_df=batch_3,
    batch_config=batch_config
)

**That's it!** You're now able to set up tests and alerts for your production data. The next sections are optional and enable some features on the platform.

## <a id="reference-dataset"> 3. Uploading a reference dataset </a>

[Back to top](#top)

A reference dataset is optional, but it enables drift monitoring. Ideally, it is a representative sample of the training or fine-tuning set behind the deployed model. In this section, we load the dataset and then upload it to Openlayer with the `upload_reference_dataframe` method.
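
If the full training set is too large to upload, a random subset is one simple way to approximate a representative sample. The sketch below assumes uniform random sampling is adequate for your data, which may not hold for heavily imbalanced sets. The helper `make_reference_sample` is hypothetical, and the data is a toy stand-in for the fine-tuning set.

```python
import pandas as pd


def make_reference_sample(
    train_df: pd.DataFrame,
    required_columns: list,
    n_rows: int,
    seed: int = 0,
) -> pd.DataFrame:
    """Check the expected columns exist, then draw a reproducible sample."""
    missing = [c for c in required_columns if c not in train_df.columns]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    # Never ask for more rows than the dataset contains.
    n = min(n_rows, len(train_df))
    return train_df.sample(n=n, random_state=seed).reset_index(drop=True)


# Toy stand-in with the column names used in this notebook's dataset config.
toy_train = pd.DataFrame(
    {
        "question": [f"question {i}" for i in range(20)],
        "ground_truth": [f"answer {i}" for i in range(20)],
    }
)
reference = make_reference_sample(
    toy_train, ["question", "ground_truth"], n_rows=5
)
```

Checking the column names up front catches a mismatch with the dataset config before anything is uploaded.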

In [None]:
fine_tuning_data = pd.read_csv("./fine_tuning_dataset.csv")

### <a id="upload-reference"> Uploading the dataset to Openlayer </a>

In [None]:
dataset_config = {
    "inputVariableNames": ["question"],
    "groundTruthColumnName": "ground_truth",
    "label": "reference"
}

In [None]:
inference_pipeline.upload_reference_dataframe(
    dataset_df=fine_tuning_data,
    dataset_config=dataset_config
)

## <a id="ground-truths"> 4. Publishing ground truths for past batches </a>

[Back to top](#top)

Ground truths are needed to create Performance tests. The `publish_ground_truths` method updates the ground truths for batches of data already published to the Openlayer platform. Rows are matched on the inference id, so each ground truth must carry the same id as the row it belongs to.
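
To see what matching on the inference id means, the join can be reproduced locally with pandas. This is only an illustration with made-up data, not a description of how the platform performs the merge internally. It also shows a check worth running before publishing: finding ground truths whose ids match no published row.

```python
import pandas as pd

# Rows as they were published, each with its inference id.
published = pd.DataFrame(
    {
        "inference_id": ["a1", "b2", "c3"],
        "question": ["What is a list?", "What is a dict?", "What is a set?"],
        "answer": ["A sequence.", "A mapping.", "A collection."],
    }
)

# Ground truths that arrive later, in a different order and incomplete.
truths = pd.DataFrame(
    {
        "inference_id": ["c3", "a1", "zz"],
        "ground_truth": ["An unordered collection.", "An ordered sequence.", "?"],
    }
)

# A left join keeps every published row; rows without a match get NaN.
# validate= raises if an inference id is duplicated on either side.
merged = published.merge(
    truths, on="inference_id", how="left", validate="one_to_one"
)

# Ground-truth ids that match no published row would be silently unused.
unmatched_ids = sorted(set(truths["inference_id"]) - set(published["inference_id"]))
```

Here row `b2` has no ground truth yet, and id `zz` matches nothing, which usually points to a logging or id-generation mismatch.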

In [None]:
ground_truths = pd.read_csv("prod_ground_truths.csv")

### <a id="publish-truth">Publish ground truths </a>

In [None]:
inference_pipeline.publish_ground_truths(
    df=ground_truths,
    ground_truth_column_name="ground_truth",
    inference_id_column_name="inference_id",
)