<center>
    <p style="text-align:center">
        <img alt="phoenix logo" src="https://storage.googleapis.com/arize-assets/phoenix/assets/phoenix-logo-light.svg" width="200"/>
        <br>
        <a href="https://docs.arize.com/phoenix/">Docs</a>
        |
        <a href="https://github.com/Arize-ai/phoenix">GitHub</a>
        |
        <a href="https://join.slack.com/t/arize-ai/shared_invite/zt-1px8dcmlf-fmThhDFD_V_48oU7ALan4Q">Community</a>
    </p>
</center>
<h1 align="center">Phoenix Quickstart</h1>

In this quickstart, you will:

- Download curated datasets of embeddings and predictions and load them into pandas DataFrames
- Define a schema to describe the format of your data
- Launch Phoenix and explore the app

Let's get started!

## 1. Install Dependencies and Import Libraries

In [None]:
%pip install -q arize-phoenix

In [None]:
from dataclasses import replace

import pandas as pd
import phoenix as px

## 2. Download the Data

Download the curated training and production datasets and load each one into a pandas DataFrame.

In [None]:
train_df = pd.read_parquet(
    "https://storage.googleapis.com/arize-assets/phoenix/datasets/unstructured/cv/human-actions/human_actions_training.parquet"
)
prod_df = pd.read_parquet(
    "https://storage.googleapis.com/arize-assets/phoenix/datasets/unstructured/cv/human-actions/human_actions_production.parquet"
)

## 3. Launch Phoenix

### a) Define Your Schema
To launch Phoenix with your data, you first need to define a schema that tells Phoenix which columns of your DataFrames correspond to features, predictions, actuals (i.e., ground truth), embeddings, etc.

The trickiest part is defining embedding features. In this case, the embedding feature has two pieces of information: the embedding vector itself, stored in the "image_vector" column, and a link to the image, stored in the "url" column.

Define a schema for your training data.

In [None]:
train_schema = px.Schema(
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="predicted_action",
    actual_label_column_name="actual_action",
    embedding_feature_column_names={
        "image_embedding": px.EmbeddingColumnNames(
            vector_column_name="image_vector",
            link_to_data_column_name="url",
        ),
    },
)
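
Since the schema refers to DataFrame columns by name, it can help to confirm that those names exist before building a dataset. The helper below is a hypothetical sketch, not part of the Phoenix API: `missing_columns` is an illustrative name, and `example_columns` is a made-up stand-in for `list(train_df.columns)` with one column deliberately left out.

```python
from typing import Iterable, List


def missing_columns(available: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required column names that are absent from `available`, in order."""
    available_set = set(available)
    return [name for name in required if name not in available_set]


# Column names referenced by train_schema above.
schema_columns = [
    "prediction_ts",
    "predicted_action",
    "actual_action",
    "image_vector",
    "url",
]

# Made-up stand-in for list(train_df.columns), with "actual_action" left out.
example_columns = ["prediction_ts", "predicted_action", "image_vector", "url"]

missing = missing_columns(example_columns, schema_columns)
print(missing)  # ['actual_action']
```

With the real data, you would pass `train_df.columns` as the first argument; an empty list means every column named in the schema is present.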

The schema for your production data is the same, except it does not have an actual label column.

In [None]:
prod_schema = replace(train_schema, actual_label_column_name=None)
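
The call above relies on `dataclasses.replace` from the standard library, which copies a dataclass instance while overriding selected fields. A minimal sketch of that behavior, assuming (as the use of `replace` implies) that `px.Schema` is a dataclass; `ToySchema` is a hypothetical stand-in with only two fields, not the real class:

```python
from dataclasses import dataclass, replace
from typing import Optional


# Hypothetical stand-in for px.Schema, reduced to two fields for illustration.
@dataclass(frozen=True)
class ToySchema:
    prediction_label_column_name: Optional[str] = None
    actual_label_column_name: Optional[str] = None


toy_train = ToySchema(
    prediction_label_column_name="predicted_action",
    actual_label_column_name="actual_action",
)

# replace() returns a new instance; toy_train itself is left unchanged.
toy_prod = replace(toy_train, actual_label_column_name=None)

print(toy_prod.actual_label_column_name)   # None
print(toy_train.actual_label_column_name)  # actual_action
```

Fields that are not overridden, such as the prediction label column here, carry over to the copy unchanged.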

### b) Define Your Datasets
Next, define your primary and reference datasets. In this case, your reference dataset contains training data and your primary dataset contains production data.

In [None]:
prod_ds = px.Dataset(prod_df, prod_schema)
train_ds = px.Dataset(train_df, train_schema)

### c) Create a Phoenix Session

In [None]:
session = px.launch_app(prod_ds, train_ds)

### d) Launch the Phoenix UI

You can open Phoenix by copying and pasting the output of `session.url` into a new browser tab.

In [None]:
session.url

Alternatively, you can open the Phoenix UI directly in your notebook:

In [None]:
session.view()

## 4. Explore the App

Click on "image_embedding" in the "Embeddings" section to visualize your embedding data. What insights can you uncover from this page?

## 5. Close the App

When you're done, don't forget to close the app.

In [None]:
px.close_app()