# Fiddler LLM Application Quick Start Guide

Fiddler is the pioneer in enterprise AI Observability, offering a unified platform that enables all model stakeholders to monitor model performance and to investigate the true source of model degradation.  Fiddler's AI Observability platform supports both traditional ML models and Generative AI applications.  This guide walks you through how to onboard an LLM chatbot application that is built using a RAG architecture.

---

You can start using Fiddler ***in minutes*** by following these 8 quick steps:

1. Imports
2. Connect to Fiddler
3. Create a Fiddler project
4. Upload a baseline dataset
5. Opt in to specific Fiddler LLM Enrichments
6. Add information about the LLM application
7. Publish production events
8. Get insights

**Don't have a Fiddler account? [Sign-up for a 14-day free trial](https://www.fiddler.ai/trial?utm_source=fiddler_docs&utm_medium=referral).**

## 1. Imports

In [None]:
!pip install fiddler-client

In [None]:
import numpy as np
import pandas as pd
import time
import fiddler as fdl
import datetime

print(f"Running client version {fdl.__version__}")
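The calls used in this guide (`fdl.FiddlerApi`, `client.add_project`, `client.upload_dataset`) belong to one particular client interface, and other client releases may expose a different one. Below is a minimal sketch of a version guard. It assumes the notebook targets a 2.x client; that major version is an assumption to verify against your environment. `EXPECTED_MAJOR`, `major_version`, and `check_client_version` are hypothetical names, not part of the Fiddler client.

```python
EXPECTED_MAJOR = 2  # assumption: the release line this notebook was written for


def major_version(version: str) -> int:
    """Return the leading integer of a dotted version string such as '2.5.1'."""
    return int(version.split(".")[0])


def check_client_version(version: str, expected_major: int = EXPECTED_MAJOR) -> bool:
    """Return True when the version string matches the expected major version."""
    return major_version(version) == expected_major
```

In the notebook this could be called as `check_client_version(fdl.__version__)`. A `False` result suggests the installed client may not match the calls shown in the following steps.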

## 2. Connect to Fiddler

Before you can add information about your LLM application to Fiddler, you'll need to connect using our API client.


---


**We need a few pieces of information to get started.**
1. The URL you're using to connect to Fiddler
2. Your organization ID
3. Your authorization token

In [None]:
URL = '' 
ORG_NAME = ''
AUTH_TOKEN = ''

These parameters can be found on the **Settings** page of your Fiddler environment.

Now just run the following code block to connect to the Fiddler API!

In [None]:
client = fdl.FiddlerApi(url=URL, org_id=ORG_NAME, auth_token=AUTH_TOKEN)
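If you would rather not paste credentials into the notebook, one option is to read them from environment variables. The sketch below shows that pattern using only the standard library. The variable names `FIDDLER_URL`, `FIDDLER_ORG_ID`, and `FIDDLER_AUTH_TOKEN` and the helper `load_fiddler_settings` are hypothetical, chosen for illustration.

```python
import os


def load_fiddler_settings(env=os.environ):
    """Read connection settings from environment variables (names are hypothetical)."""
    settings = {
        "url": env.get("FIDDLER_URL", ""),
        "org_id": env.get("FIDDLER_ORG_ID", ""),
        "auth_token": env.get("FIDDLER_AUTH_TOKEN", ""),
    }
    # Fail early with a readable message instead of a connection error later.
    missing = [key for key, value in settings.items() if not value]
    if missing:
        raise ValueError(f"Missing Fiddler settings: {', '.join(missing)}")
    return settings
```

The returned keys match the keyword arguments used in the connection call above, so the result can be passed along as `fdl.FiddlerApi(**load_fiddler_settings())`.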

## 3. Create a Fiddler Project

Once you connect, you can create a new project by specifying a unique project ID in the client's `add_project` function.

In [None]:
PROJECT_ID = 'fiddler_chatbot'
DATASET_ID = 'fiddler_chatbot_history'
MODEL_ID = 'rag_chatbot'

In [None]:
client.add_project(PROJECT_ID)

## 4. Upload a baseline dataset

In this example, we'll be onboarding data in order to observe our **Fiddler chatbot application**.  Any application observed by Fiddler first needs a baseline dataset that defines what the data should look like.
  
In order to get insights into the model's performance, **Fiddler needs a small sample of data that can serve as a baseline** for making comparisons with data in production.  Let's use a file with some historical prompts, source docs, and responses from our Fiddler chatbot for our baseline.

---

*For more information on how to design a baseline dataset, [click here](https://docs.fiddler.ai/docs/designing-a-baseline-dataset).*

In [None]:
PATH_TO_BASELINE_CSV = 'https://media.githubusercontent.com/media/fiddler-labs/fiddler-examples/main/quickstart/data/chatbot_llm_baseline.csv'

baseline_df = pd.read_csv(PATH_TO_BASELINE_CSV)
baseline_df
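Before uploading, it can help to confirm that the baseline contains the columns used in later steps. The sketch below assumes those columns are `question`, `response`, and `source_docs` (the names passed as features further down). `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names.

```python
# Assumption: these are the columns the later onboarding steps rely on.
REQUIRED_COLUMNS = ["question", "response", "source_docs"]


def missing_columns(available, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent, preserving their order."""
    available = set(available)
    return [name for name in required if name not in available]
```

With the DataFrame loaded above, `missing_columns(baseline_df.columns)` should return an empty list if the file has the expected shape.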

Fiddler uses this baseline dataset to keep track of important information about your data.  This includes **data types**, **data ranges**, and **unique values** for categorical variables.

---

You can construct a `DatasetInfo` object to be used as **a schema for keeping track of this information** by running the following code block.

In [None]:
dataset_info = fdl.DatasetInfo.from_dataframe(baseline_df, max_inferred_cardinality=5)
dataset_info
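To make the idea of a schema more concrete, here is a rough, library-free sketch of the kind of per-column summary described above: a type, a range for numeric columns, and the unique values for low-cardinality columns. This is not how `fdl.DatasetInfo` is implemented. The function `summarize_columns` and its threshold rule (at most `max_cardinality` unique values means categorical) are assumptions made for illustration.

```python
def summarize_columns(rows, max_cardinality=5):
    """Illustrative only: summarize each column of a list-of-dicts table."""
    summary = {}
    columns = rows[0].keys() if rows else []
    for col in columns:
        values = [row[col] for row in rows if row[col] is not None]
        is_numeric = bool(values) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            # Numeric columns keep their observed range.
            summary[col] = {"type": "numeric", "min": min(values), "max": max(values)}
            continue
        unique = sorted({str(v) for v in values})
        if len(unique) <= max_cardinality:
            # Few distinct values: treat as categorical and record them.
            summary[col] = {"type": "category", "values": unique}
        else:
            summary[col] = {"type": "string"}
    return summary
```

Under this rule, a free-text column such as `question` ends up as a plain string, while a column with only a handful of distinct values is recorded as categorical together with those values.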

Then use the client's [upload_dataset](https://docs.fiddler.ai/reference/clientupload_dataset) function to send this information to Fiddler.
  
*Just include:*
1. A unique dataset ID
2. The baseline dataset as a pandas DataFrame
3. The [DatasetInfo](https://docs.fiddler.ai/reference/fdldatasetinfo) object you just created

In [None]:
client.upload_dataset(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    dataset={
        'baseline': baseline_df
    },
    info=dataset_info
)

## 5. Opt in to specific Fiddler LLM Enrichments

After publishing our chatbot's prompts and responses to our Fiddler environment, we can request that Fiddler execute a series of enrichment services that "score" our prompts and responses for a variety of insights.  These enrichment services can detect AI safety issues like PII leakage, hallucinations, toxicity, and more.  We can also opt in to enrichment services like embedding generation, which allow us to track prompt and response outliers and drift.  A full description of these enrichments can be found in [our docs](https://docs.fiddler.ai/).

---

Let's define the enrichment services we'd like to use.  Here we will opt in to embedding generation for our prompts, responses, and source docs.  Additionally, let's opt in to outlier detection through centroid distance metrics, along with text statistics and sentiment scores.

In [None]:
fiddler_backend_enrichments = [
    fdl.Enrichment(
        name='Enrichment Prompt Embedding',
        enrichment='embedding',
        columns=['question'],
    ),
    fdl.TextEmbedding(
        name='Prompt TextEmbedding',
        source_column='question',
        column='Enrichment Prompt Embedding',
        n_tags=10
    ),
    fdl.Enrichment(
        name='Enrichment Prompt Centroid Distance',
        enrichment='centroid_distance',
        columns=['Prompt TextEmbedding'],
    ),
    fdl.Enrichment(
        name='Enrichment Source Docs Embedding',
        enrichment='embedding',
        columns=['source_docs'],
    ),
    fdl.TextEmbedding(
        name='Source Docs TextEmbedding',
        source_column='source_docs',
        column='Enrichment Source Docs Embedding',
        n_tags=10
    ),
    fdl.Enrichment(
        name='Enrichment Source Docs Centroid Distance',
        enrichment='centroid_distance',
        columns=['Source Docs TextEmbedding'],
    ),
    fdl.Enrichment(
        name='Enrichment Response Embedding',
        enrichment='embedding',
        columns=['response'],
    ),
    fdl.TextEmbedding(
        name='Response TextEmbedding',
        source_column='response',
        column='Enrichment Response Embedding',
        n_tags=10
    ),
    fdl.Enrichment(
        name='Enrichment Response Centroid Distance',
        enrichment='centroid_distance',
        columns=['Response TextEmbedding'],
    ),
    fdl.Enrichment(
        name='Enrichment QA TextStat',
        enrichment='textstat',
        columns=['question', 'response'],
        config={'statistics': [
                'char_count',
                'flesch_reading_ease',
                'flesch_kincaid_grade',
            ]
        }
    ),
    fdl.Enrichment(
        name='Enrichment QA Sentiment',
        enrichment='sentiment',
        columns=['question', 'response'],
    )
]
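For intuition about the `centroid_distance` enrichments, the sketch below shows the basic idea of scoring a new embedding by how far it lies from the center of the baseline embeddings. It is a conceptual illustration only. It assumes a single centroid and Euclidean distance, which may differ from what Fiddler computes on its backend. `centroid` and `centroid_distances` are hypothetical helper names.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def centroid_distances(baseline_vectors, new_vectors):
    """Euclidean distance from each new vector to the baseline centroid."""
    center = centroid(baseline_vectors)
    return [math.dist(center, vector) for vector in new_vectors]
```

A prompt whose embedding sits far from the baseline center receives a large distance, which is the kind of signal that can flag it as a potential outlier.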

## 6.  Add information about the LLM application

Now it's time to onboard information about our LLM application to Fiddler.  We do this by defining a [ModelInfo](https://docs.fiddler.ai/reference/fdlmodelinfo) object.


---


The [ModelInfo](https://docs.fiddler.ai/reference/fdlmodelinfo) object will contain some **information about how your LLM application operates**.
  
*Just include:*
1. The **dataset_info** object which defines our data types and columns
2. The **dataset_id** of our baseline dataset.  The baseline dataset will also be enriched during this step based on the enrichments we've configured.
3. The **task** your model is performing (LLM, regression, binary classification, etc.)
4. The **features** columns.  For an LLM application, these are just the raw inputs and outputs of our LLM application.
5. The **custom_features** which contain the configuration of the enrichments we opted for.


In [None]:
model_info = fdl.ModelInfo.from_dataset_info(
    dataset_info=dataset_info,
    dataset_id=DATASET_ID,
    model_task=fdl.ModelTask.LLM,
    features=['question', 'response', 'source_docs'],
    custom_features=fiddler_backend_enrichments
)
model_info

Almost done! Now just specify a unique model ID and use the client's [add_model](https://docs.fiddler.ai/reference/clientadd_model) function to send this information to Fiddler.  This step can take a little longer as the baseline dataset will also be enriched during this call.

In [None]:
client.add_model(
    project_id=PROJECT_ID,
    dataset_id=DATASET_ID,
    model_id=MODEL_ID,
    model_info=model_info,
)

## 7. Publish production events

Information about our LLM application is onboarded to Fiddler and now it's time to start publishing some production data!  
Fiddler will **monitor this data and compare it to your baseline to generate powerful insights into how your model is behaving**.


---


Each record sent to Fiddler is called **an event**.  Events simply contain the inputs and outputs of a predictive model or LLM application.
  
Let's load in some sample events (prompts and responses) from a CSV file.

In [None]:
PATH_TO_EVENTS_CSV = 'https://media.githubusercontent.com/media/fiddler-labs/fiddler-examples/main/quickstart/data/chatbot_llm_events.csv'

llm_events_df = pd.read_csv(PATH_TO_EVENTS_CSV)
 
# Timeshifting the timestamp column in the events file so the events are as recent as today
llm_events_df['ts'] = pd.to_datetime(llm_events_df['ts'])
time_diff = pd.Timestamp.now().normalize() - llm_events_df['ts'].max()
llm_events_df['ts'] += time_diff

llm_events_df

| Unnamed: 0 | session_id | source_docs | response | question | comment | feedback | ts |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 6d9917af-c889-40a5-bd83-a7860318a26f | Document: slug: "cv-monitoring" _ID, ''moni... | Yes, Fiddler supports LLM (Lifecycle Model Mon... | Does Fiddler support LLM monitoring? |  |  | 2023-10-29 02:29:18.094 |
| 1 | 6d9917af-c889-40a5-bd83-a7860318a26f | Document: ---\ntitle: "Designing a Baseline... | A baseline dataset is a representative sample ... | what is baseline dataset? |  | Like | 2023-10-29 02:30:48.656 |
| 2 | 6d9917af-c889-40a5-bd83-a7860318a26f | Document: ---\ntitle: "Baselines"\nslug: "f... | No, the default baseline for all monitoring me... | Does Fiddler have the ability to change the mo... |  | Like | 2023-10-29 02:33:35.860 |
| 3 | 6d9917af-c889-40a5-bd83-a7860318a26f | Document: ---\ntitle: "Data Drift"\nslug: "... | The calculation of data drift metrics in Fiddl... | how is data drift metrics calculated? | pretty good | Like | 2023-10-29 02:35:27.871 |
| 4 | 6d9917af-c889-40a5-bd83-a7860318a26f | Document: ---\ntitle: "Deploying Fiddler"\n... | Yes, it is possible to deploy Fiddler on data ... | Can Fiddler be deployed on data-centers? | nice! | Like | 2023-10-29 02:38:26.411 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 441 | 1b88a642-1d2c-4560-821e-ad3ea338ec99 | Document: ResourceLink:https://www.fiddler.... | Krishna Gade is the Founder and CEO of Fiddler... | Who is Krisha GaDE? |  |  | 2024-02-27 02:38:46.767 |
| 442 | e71adb1a-c425-4e46-ad61-ec731bdeadb4 | Document: ResourceLink:https://www.fiddler.... | Yes, Fiddler can observe LLM applications. Fid... | Can Fiddler Observe LLM APplicaitons? |  |  | 2024-02-27 20:52:41.327 |
| 443 | 76499c4d-77c5-471c-b0aa-2584c535a28d | Document: BlogLink:https://www.fiddler.ai/b... | Yes, Fiddler supports LLM (Language Model Moni... | Can Fiddler Support LLM Observability? |  |  | 2024-02-27 21:07:49.141 |
| 444 | 78845331-55ca-45f1-8624-f4910025bec9 | Document: ResourceLink:https://www.fiddler.... | Fiddler supports LLMs by providing tools and r... | How does Fiddler support LLMs? |  | Like | 2024-02-27 23:17:21.129 |
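The timeshifting step in the cell above moves every timestamp by the same offset, so that the most recent event lands at midnight today while the spacing between events is preserved. Here is a minimal standard-library sketch of the same arithmetic. `shift_to_recent` is a hypothetical helper, and `now` is passed in explicitly so the result is reproducible.

```python
from datetime import datetime


def shift_to_recent(timestamps, now):
    """Shift all timestamps so the latest one falls at midnight of `now`'s date."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    offset = midnight - max(timestamps)
    return [ts + offset for ts in timestamps]


original = [
    datetime(2023, 10, 29, 2, 29, 18),
    datetime(2024, 2, 27, 23, 17, 21),
]
shifted = shift_to_recent(original, now=datetime(2024, 6, 1, 15, 30))
```

Because a single offset is applied to every row, the gaps between events are unchanged; only their position on the calendar moves.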


You can use the client's `publish_events_batch` function to start pumping data into Fiddler!
  
*Just include:*
1. The DataFrame containing your events
2. The name of the column containing event timestamps

In [None]:
client.publish_events_batch(
    project_id=PROJECT_ID,
    model_id=MODEL_ID,
    batch_source=llm_events_df,
    timestamp_field='ts'
)
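A quick local check before publishing can catch events that lack a timestamp or a prompt/response value. The sketch below works on plain dictionaries. `invalid_events` and `_is_blank` are hypothetical helpers, and the choice of required fields is an assumption based on the columns shown earlier.

```python
def _is_blank(value):
    """True for None, empty strings, and NaN-like values (which never equal themselves)."""
    return value is None or value == "" or value != value


def invalid_events(events, timestamp_field="ts", required=("question", "response")):
    """Return the positions of events lacking a timestamp or a required field."""
    fields = (timestamp_field, *required)
    return [
        index
        for index, event in enumerate(events)
        if any(_is_blank(event.get(name)) for name in fields)
    ]
```

With the events DataFrame from this step, the records could be produced with `llm_events_df.to_dict("records")` and passed to `invalid_events`. An empty list means every event has the fields checked here.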

## 8. Get insights

**You're all done!**
  
You can now head to your Fiddler environment and start getting enhanced observability into your LLM application's performance.

<table>
    <tr>
        <td>
            <img src="https://raw.githubusercontent.com/fiddler-labs/fiddler-examples/main/quickstart/images/LLM_chatbot_UMAP.png" />
        </td>
    </tr>
</table>

**What's Next?**

Try the [NLP Monitoring - Quickstart Notebook](https://docs.fiddler.ai/docs/simple-nlp-monitoring-quick-start)

---


**Questions?**  
  
Check out [our docs](https://docs.fiddler.ai/) for a more detailed explanation of what Fiddler has to offer.

Join our [community Slack](http://fiddler-community.slack.com/) to ask any questions!

If you're still looking for answers, fill out a ticket on [our support page](https://fiddlerlabs.zendesk.com/) and we'll get back to you shortly.