<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-assets/phoenix/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://docs.arize.com/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q">Community</a>
    </p>
</center>
<h1 align="center">Toxicity Classification Evals</h1>

## Install Dependencies and Import Libraries

In [None]:
!pip install -qq "arize-phoenix[experimental]" ipython matplotlib openai pycm scikit-learn

In [None]:
import os
from getpass import getpass

import matplotlib.pyplot as plt
import openai
import pandas as pd
from phoenix.experimental.evals import (
    TOXICITY_PROMPT_RAILS_MAP,
    TOXICITY_PROMPT_TEMPLATE_STR,
    OpenAiModel,
    download_benchmark_dataset,
    llm_eval_binary,
)
from pycm import ConfusionMatrix
from sklearn.metrics import classification_report

pd.set_option("display.max_colwidth", None)

## Download Benchmark Dataset

We'll evaluate the evaluation system, which consists of an LLM, its settings, and an evaluation prompt template, against a benchmark dataset of toxic and non-toxic text with ground-truth labels. Currently supported datasets include:

- "wiki_toxic"


In [None]:
df = download_benchmark_dataset(task="toxicity-classification", dataset_name="wiki_toxic-test")
df.head()

## Display Toxicity Classification Template

View the default template used to classify toxicity. You can tweak this template and evaluate its performance relative to the default.

In [None]:
print(TOXICITY_PROMPT_TEMPLATE_STR)

The template variables are:

- **text:** the text to be classified
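
As a rough illustration of how a template variable gets filled in, the sketch below substitutes a row's `text` value into a template string. The template here is a made-up stand-in, not the contents of `TOXICITY_PROMPT_TEMPLATE_STR`, and `fill_template` is a hypothetical helper, not part of Phoenix. It assumes the template uses Python-style `{text}` placeholders.

```python
# Hypothetical template for illustration only; this is NOT the actual
# contents of TOXICITY_PROMPT_TEMPLATE_STR.
example_template = (
    "Decide whether the following text is toxic.\n"
    "Text: {text}\n"
    'Answer with a single word, "toxic" or "non-toxic".'
)


def fill_template(template: str, row: dict) -> str:
    """Substitute the row's values for the template's {placeholders}."""
    # str.format ignores extra keys, so a row may carry other columns too.
    return template.format(**row)


prompt = fill_template(example_template, {"text": "Have a nice day!", "toxic": False})
print(prompt)
```

Because the placeholder is named `text`, the input data needs a column with that same name, which is presumably why the notebook renames `comment_text` to `text` before running the classifications.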

## Configure the LLM

Configure your OpenAI API key.

In [None]:
if not (openai_api_key := os.getenv("OPENAI_API_KEY")):
    openai_api_key = getpass("🔑 Enter your OpenAI API key: ")
openai.api_key = openai_api_key
os.environ["OPENAI_API_KEY"] = openai_api_key

Instantiate the LLM and set parameters.

In [None]:
model = OpenAiModel(
    model_name="gpt-4",
    temperature=0.0,
)

## Run Toxicity Classifications

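Run toxicity classifications against a random sample of 100 rows from the dataset. The `comment_text` column is renamed to `text` so that it matches the template variable.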

In [None]:
df = df.sample(n=100).reset_index(drop=True)
df = df.rename(
    columns={"comment_text": "text"},
)

In [None]:
rails = list(TOXICITY_PROMPT_RAILS_MAP.values())
toxic_classifications = llm_eval_binary(
    dataframe=df,
    template=TOXICITY_PROMPT_TEMPLATE_STR,
    model=model,
    rails=rails,
)
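
The rails constrain each classification to a fixed set of allowed labels. The sketch below shows one way raw model output could be normalized to such a set. It is not Phoenix's implementation: `snap_to_rail`, the `NOT_PARSABLE` sentinel, and the example rail values are all assumptions made for illustration.

```python
from typing import List

# Hypothetical sentinel for output that matches none of the rails.
NOT_PARSABLE = "NOT_PARSABLE"


def snap_to_rail(raw_output: str, rails: List[str]) -> str:
    """Map raw LLM output to one of the allowed rails, or to NOT_PARSABLE."""
    # Trim whitespace, then surrounding quotes and periods, then lowercase.
    cleaned = raw_output.strip().strip(".\"'").lower()
    # Require an exact match: a substring test would be ambiguous here,
    # because "toxic" is contained in "non-toxic".
    for rail in rails:
        if cleaned == rail.lower():
            return rail
    return NOT_PARSABLE


example_rails = ["toxic", "non-toxic"]  # assumed values, for illustration
print(snap_to_rail(" Toxic. ", example_rails))
print(snap_to_rail('"non-toxic"', example_rails))
print(snap_to_rail("I think it is fine", example_rails))
```

Keeping unparsable outputs as a distinct label, rather than forcing them into one of the classes, makes such failures visible when the predictions are later compared with the ground truth.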

## Evaluate Classifications

Evaluate the predictions against human-labeled ground-truth toxicity labels.

In [None]:
true_labels = df["toxic"].map(TOXICITY_PROMPT_RAILS_MAP).tolist()
predicted_labels = toxic_classifications

print(classification_report(true_labels, predicted_labels, labels=rails))
confusion_matrix = ConfusionMatrix(
    actual_vector=true_labels, predict_vector=predicted_labels, classes=rails
)
confusion_matrix.plot(
    cmap=plt.colormaps["Blues"],
    number_label=True,
    normalized=True,
);
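
To make the numbers in the classification report easier to interpret, here is a minimal sketch that computes precision, recall, and F1 for one class by hand. The helper `precision_recall_f1` and the toy labels are invented for illustration. They are not benchmark results.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Compute precision, recall, and F1, treating `positive` as the target class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy labels, made up for this example.
toy_true = ["toxic", "toxic", "non-toxic", "non-toxic", "toxic"]
toy_pred = ["toxic", "non-toxic", "non-toxic", "toxic", "toxic"]

precision, recall, f1 = precision_recall_f1(toy_true, toy_pred, positive="toxic")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

For the "toxic" class, precision is the fraction of texts predicted toxic that are actually toxic, and recall is the fraction of actually toxic texts that the model caught.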