<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-phoenix-assets/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://docs.arize.com/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://arize-ai.slack.com/join/shared_invite/zt-2w57bhem8-hq24MB6u7yE_ZF_ilOYSBw#/shared-invite/email">Community</a>
    </p>
</center>
<h1 align="center">Image Classification with Phoenix</h1>

In this tutorial, you will:
- Upload a dataset of images to Phoenix
- Create an experiment to classify the images and measure the accuracy of the model you use
- View images in the Phoenix UI

ℹ️ This notebook requires an OpenAI API key.

## Install dependencies

In [None]:
%%bash
pip install -q "arize-phoenix>=4.29.0" openinference-instrumentation-openai openai datasets 'httpx<0.28'

## Connect to Phoenix

In [None]:
# Check if PHOENIX_API_KEY is present in the environment variables.
# If it is, we'll use the cloud instance of Phoenix. If it's not, we'll start a local instance.
# A third option is to connect to a docker or locally hosted instance.
# See https://docs.arize.com/phoenix/setup/environments for more information.

import os

if "PHOENIX_API_KEY" in os.environ:
    os.environ["PHOENIX_CLIENT_HEADERS"] = f"api_key={os.environ['PHOENIX_API_KEY']}"
    os.environ["PHOENIX_COLLECTOR_ENDPOINT"] = "https://app.phoenix.arize.com"

else:
    import phoenix as px

    px.launch_app()

In [None]:
from phoenix.otel import register

tracer_provider = register()

## Load dataset of test cases

In [None]:
from datasets import load_dataset

import phoenix as px

df = load_dataset("huggingface/image-classification-test-sample")["train"].to_pandas()

In [None]:
df.head()

We first need to convert the image data to a base64-encoded string. Phoenix expects image data either in this format or as a URL.

In [None]:
import base64

# Extract the bytes object from the dictionary and update the 'img' column
df["img"] = df["img"].apply(lambda x: x["bytes"])

# Base64 encode the value in 'img' column
df["img"] = df["img"].apply(lambda x: base64.b64encode(x).decode("utf-8"))


# Prepend 'data:image/png;base64,' to each value in the 'img' column
df["img"] = df["img"].apply(lambda x: "data:image/png;base64," + x)
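
The three steps above can also be collapsed into a single helper. The sketch below is a minimal, standard-library-only illustration: `to_data_uri` and `from_data_uri` are hypothetical names introduced here, and the `image/png` default mirrors the assumption made in the cell above that every image is a PNG.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(data_uri: str) -> bytes:
    """Recover the raw bytes from a base64 data URI."""
    # Everything after the first comma is the base64 payload.
    _, payload = data_uri.split(",", 1)
    return base64.b64decode(payload)


# The 8-byte PNG signature serves as stand-in image data for this sketch.
sample = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(sample)
print(uri)

# Round trip: decoding the URI returns the original bytes.
assert from_data_uri(uri) == sample
```

Applied to the raw dataset column (before the cell above has run), this would look something like `df["img"].apply(lambda x: to_data_uri(x["bytes"]))`.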

Next, let's map the numerical labels to the actual labels. This will make it easier to compare the model's output to the expected output.

In [None]:
label_map = {
    1: "automobile",
    2: "snakes",
    3: "cat",
    4: "tree",
    5: "dog",
    6: "frog",
    7: "horse",
    8: "ship",
}
df["label"] = df["label"].map(label_map)
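
One thing to watch for: when pandas `Series.map` is given a dictionary, any value that has no key in that dictionary becomes `NaN`. The sketch below shows, in plain Python, one way to spot such labels before mapping. The `label_map` subset and the `raw_labels` list are made-up stand-ins, not values taken from the dataset.

```python
# Hypothetical stand-ins for the notebook's label_map and raw label column.
label_map = {1: "automobile", 3: "cat", 5: "dog"}
raw_labels = [1, 3, 5, 0, 3, 9]

# Raw labels with no entry in label_map would be mapped to a missing value.
unmapped = sorted(set(raw_labels) - set(label_map))
print(unmapped)  # [0, 9]
```

In the notebook itself, a similar check after mapping would be `df["label"].isna().sum()`, which should be zero if every numeric label has an entry.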

In [None]:
df.head()

Our dataset is now ready to upload to Phoenix. Let's take a look at the first image in the dataset just to make sure it looks right.

In [None]:
from IPython.display import Image, display

# Get the image data from the first row
image_data = df.loc[0, "img"]

# Remove the data URI prefix
image_data = image_data.split(",")[1]

# Decode the base64 string
image_bytes = base64.b64decode(image_data)

# Display the image
display(Image(data=image_bytes))
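
The data URI prefix used earlier assumes the images are PNGs. As a rough sanity check, the decoded bytes can be compared against well-known file signatures. This is a minimal sketch: `sniff_image_mime` is a hypothetical helper, and it only recognizes the three formats listed.

```python
from typing import Optional


def sniff_image_mime(image_bytes: bytes) -> Optional[str]:
    """Guess the MIME type of an image from its leading bytes."""
    if image_bytes.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    if image_bytes.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if image_bytes.startswith((b"GIF87a", b"GIF89a")):
        return "image/gif"
    # Unknown or unsupported format.
    return None


# Stand-in byte strings; in the notebook you would pass image_bytes instead.
print(sniff_image_mime(b"\x89PNG\r\n\x1a\n" + b"rest of file"))
print(sniff_image_mime(b"not an image"))
```

If `sniff_image_mime(image_bytes)` returned something other than `"image/png"`, the prefix in the data URI would need to be adjusted to match.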

From here, we can upload the dataset to Phoenix. This dataset will act as our test cases for the experiment we'll run later.

In [None]:
import datetime

test_cases = px.Client().upload_dataset(
    dataset_name=f"image-classification-test-sample-{datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}",
    dataframe=df,
    input_keys=["img"],
    output_keys=["label"],
)

## Create our experiment task

In [None]:
import os

# Raise the limit so that base64 image data is not truncated in traces
os.environ["OPENINFERENCE_BASE64_IMAGE_MAX_LENGTH"] = "10000000000"

We'll be using OpenAI's GPT-4o-mini model to classify the images. To see the traces generated by calls to this model in Phoenix, we instrument the OpenAI client.

In [None]:
from openinference.instrumentation.openai import OpenAIInstrumentor

OpenAIInstrumentor().instrument(tracer_provider=tracer_provider, skip_dep_check=True)

We also need to set the OpenAI API key in the environment variables.

In [None]:
import getpass

if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter your OpenAI API key: ")

Now we can define our task. This task takes an image and uses GPT-4o-mini to classify it as one of the labels in `label_map`.

In [None]:
from openai import OpenAI


def task(input):
    client = OpenAI()

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "What’s in this image? Your answer should be a single word. The word should be one of the following: "
                        + ", ".join(label_map.values()),
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": input["img"],
                        },
                    },
                ],
            }
        ],
        max_tokens=300,
    )

    # Strip surrounding whitespace so that exact-match evaluation is not thrown off
    output_label = response.choices[0].message.content.strip().lower()
    return output_label
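
Even when asked for a single word, a model may add punctuation or extra words, which would make an exact string comparison fail. The sketch below shows one possible way to normalize a raw answer against the allowed labels. `normalize_prediction` and `ALLOWED_LABELS` are hypothetical names, and the label tuple is an illustrative subset rather than the notebook's full `label_map`.

```python
import string

# Illustrative subset of labels; the notebook would use label_map.values().
ALLOWED_LABELS = ("automobile", "cat", "dog", "frog", "horse", "ship")


def normalize_prediction(raw, allowed=ALLOWED_LABELS):
    """Return the cleaned label if it is an allowed label, else None."""
    # Lowercase, then trim whitespace and punctuation from both ends.
    cleaned = raw.strip().lower().strip(string.punctuation + string.whitespace)
    return cleaned if cleaned in allowed else None


print(normalize_prediction(" Cat.\n"))  # cat
print(normalize_prediction("a cat"))    # None
```

Returning `None` for anything outside the allowed set keeps unexpected answers visible as failures instead of silently coercing them to a label.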

## Create our evaluators

Next, we have to set up our evaluators. In this case a single, very simple evaluator is enough, because we have ground-truth labels for each image. All we need to do is check whether the model's output matches the expected output.

In [None]:
def matches_expected_label(expected, output):
    return expected["label"] == output
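
Because this evaluator returns a boolean for each example, accuracy is simply the fraction of examples for which it returns `True`. The sketch below illustrates that arithmetic on made-up `(expected, output)` pairs. The `examples` list is hypothetical and does not come from the dataset or from an actual experiment run.

```python
def matches_expected_label(expected, output):
    return expected["label"] == output


# Hypothetical (expected, output) pairs standing in for experiment results.
examples = [
    ({"label": "cat"}, "cat"),
    ({"label": "dog"}, "cat"),
    ({"label": "ship"}, "ship"),
    ({"label": "frog"}, "frog"),
]

# True counts as 1 and False as 0, so the mean is the accuracy.
results = [matches_expected_label(expected, output) for expected, output in examples]
accuracy = sum(results) / len(results)
print(f"accuracy = {accuracy:.2f}")  # accuracy = 0.75
```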

## Run our experiment

With that, we're ready to run our experiment. `run_experiment` runs each row of our `test_cases` dataset through our task function and evaluates the output using our evaluators. Results are displayed below and uploaded to the Phoenix UI.

In [None]:
import nest_asyncio

from phoenix.experiments import run_experiment

nest_asyncio.apply()

run_experiment(
    task=task,
    evaluators=[matches_expected_label],
    dataset=test_cases,
    experiment_description="Image classification experiment",
    experiment_metadata={"model": "gpt-4o-mini"},
)

If you're running this in Colab, you can view the experiment by clicking on the URL in the cell below.

In [None]:
if px.active_session():
    px.active_session().view()