# Arize Tutorial: Model Type - Score Categorical (for Ranking Model)

Let's get started on using Arize! ✨

Arize helps you visualize your model performance, understand drift & data quality issues, and share insights learned from your models.

**In this tutorial, we will send in data to calculate [NDCG](https://en.wikipedia.org/wiki/Discounted_cumulative_gain#Normalized_DCG) (Normalized Discounted Cumulative Gain).**

NDCG is a popular metric of ranking quality. It measures the ability of search engines or related applications to present to users the most relevant results first. NDCG values range between 0 and 1 with 1 being the best. Arize calculates NDCG with the standard `log2` discount.
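
To make the metric concrete, here is a minimal sketch of NDCG with a `log2` discount. It assumes linear gain (each relevance score is divided by `log2(rank + 1)`, with ranks starting at 1) and returns 0 when no result is relevant. Both choices are assumptions made for illustration, and the helper names `dcg` and `ndcg` are hypothetical; the exact formula Arize applies may differ in these details.

```python
import math


def dcg(scores):
    # Discounted cumulative gain: position 0 is divided by log2(2) = 1,
    # position 1 by log2(3), and so on.
    return sum(score / math.log2(i + 2) for i, score in enumerate(scores))


def ndcg(scores):
    # Normalize by the DCG of the ideal ordering (scores sorted descending).
    ideal = dcg(sorted(scores, reverse=True))
    if ideal == 0:
        return 0.0  # assumption: a query with no relevant results scores 0
    return dcg(scores) / ideal


print(ndcg([1, 0, 0]))  # relevant document ranked first
print(ndcg([0, 0, 1]))  # relevant document ranked last
```

A ranking that places the only relevant document first gets the maximum value of 1, while pushing it to the third position halves the value, because the discount at that rank is `log2(4) = 2`.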

For more of our use-case tutorials, visit our other [example tutorials](https://arize.gitbook.io/arize/examples).

### Running This Notebook
1. Save a copy in Google Drive for yourself.
2. Step through each section below, pressing play on the code blocks to run the cells.
3. In Step 2, use your own Space and API key from your Arize account.


## Step 1: Load Example Data

`requestId` is the ID of each query, and `position` is the ranking of the retrieved documents (with 0 being the highest).

The `score` here is just a binary value reflecting whether the document is actually relevant, but it can be any floating-point value, depending on how you want to assign the relevance scores for your NDCG calculations.

In [None]:
import pandas as pd

df = pd.read_csv("https://storage.googleapis.com/arize-assets/fixtures/ndcg.zip")
df

For each `requestId`, collect the `score` values into a single sequence (a list ordered by `position`) and store that sequence on the row at position 0. The remaining rows of the group are left empty.

Since the data is already sorted, we simply convert each group's `score` column into a list. (If your data is not sorted, you would create a list and insert the elements individually according to their positions.)
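
For the unsorted case mentioned above, a minimal sketch might look like the following. The helper `build_sequence` is hypothetical (it is not part of the Arize SDK) and assumes that the positions of one request are exactly `0..n-1`, with no gaps or duplicates.

```python
def build_sequence(rows):
    """Build the score sequence for one request.

    rows: iterable of (position, score) pairs, in any order.
    """
    rows = list(rows)
    sequence = [None] * len(rows)
    for position, score in rows:
        # Assumption: positions are 0..n-1 and each appears exactly once.
        if not 0 <= position < len(rows) or sequence[position] is not None:
            raise ValueError(f"unexpected or duplicate position: {position}")
        sequence[position] = score
    return sequence


# Rows arrive out of order; the result is ordered by position.
print(build_sequence([(2, 0.0), (0, 1.0), (1, 0.5)]))
```

With a DataFrame, the pairs for one request could come from `zip(group["position"], group["score"])`.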

In [None]:
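def transpose(group):
    # Put the full list of scores on the group's first row (position 0)
    # and leave the remaining rows empty.
    return pd.DataFrame(
        {"sequence": [group["score"].tolist()] + [None] * (len(group) - 1)},
        index=group.index,
    )


# group_keys=False keeps the original row index, so the result aligns with df
# when the two are concatenated column-wise.
df = pd.concat(
    [df, df.groupby("requestId", group_keys=False).apply(transpose)], axis=1
)
df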
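
To make the target layout concrete, here is a minimal sketch on a small, made-up DataFrame. The `toy` data is hypothetical and only mirrors the `requestId`, `position`, and `score` columns described in Step 1. It shows one way to end up with the score list on the position-0 row of each request and `None` everywhere else.

```python
import pandas as pd

# Hypothetical toy data: two requests with three and two retrieved documents.
toy = pd.DataFrame(
    {
        "requestId": ["q1", "q1", "q1", "q2", "q2"],
        "position": [0, 1, 2, 0, 1],
        "score": [1.0, 0.0, 1.0, 0.0, 1.0],
    }
)

# One list of scores per request, ordered by position.
sequences = (
    toy.sort_values(["requestId", "position"])
    .groupby("requestId")["score"]
    .agg(list)
)

# Attach each list only to the row at position 0; other rows stay None.
toy["sequence"] = [
    sequences[request_id] if position == 0 else None
    for request_id, position in zip(toy["requestId"], toy["position"])
]
print(toy)
```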

## Step 2: Import and Setup Arize Client
You can find your `API_KEY` and `SPACE_KEY` by navigating to the settings page in your workspace as shown below (only space admins can see the keys). 

<img src="https://storage.cloud.google.com/arize-assets/fixtures/copy-keys.png" width="700">

In [None]:
!pip install -q arize
from arize.pandas.logger import Client, Schema
from arize.utils.types import ModelTypes, Environments

SPACE_KEY = "SPACE_KEY"
API_KEY = "API_KEY"

arize_client = Client(space_key=SPACE_KEY, api_key=API_KEY)

model_id = "logging_ndcg"
model_version = "1.0"
model_type = ModelTypes.SCORE_CATEGORICAL

if SPACE_KEY == "SPACE_KEY" or API_KEY == "API_KEY":
    raise ValueError("❌ NEED TO CHANGE SPACE AND/OR API_KEY")
else:
    print("Step 2 ✅: Import and Setup Arize Client Done! Now we can start using Arize!")
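
If you would rather not paste keys into the notebook, one option is to read them from environment variables. The sketch below is an illustration only: the variable names `ARIZE_SPACE_KEY` and `ARIZE_API_KEY` and the helper `read_arize_keys` are assumptions made here, not names defined by the Arize SDK.

```python
import os


def read_arize_keys(env=os.environ):
    # Hypothetical variable names; choose whatever fits your setup.
    names = ("ARIZE_SPACE_KEY", "ARIZE_API_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return env[names[0]], env[names[1]]
```

The returned pair can then be assigned to `SPACE_KEY` and `API_KEY` before creating the client.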

## Step 3: Log data to Arize
Logging a `pandas.DataFrame` takes three easy steps. See the [docs](https://docs.arize.com/arize/api-reference/python-sdk/arize.pandas) for more details.

1. Define a `Schema` to designate column names
2. Call `arize_client.log()`
3. Check `response.status_code`


In [None]:
# Define a Schema() object for Arize to pick up data from the correct columns for logging
production_schema = Schema(
    prediction_id_column_name="id",  # REQUIRED
    prediction_label_column_name="prediction",
    actual_label_column_name="actual",
    actual_score_column_name="score",
    actual_numeric_sequence_column_name="sequence",
)

# arize_client.log returns a Response object from Python's requests module
response = arize_client.log(
    dataframe=df,
    schema=production_schema,
    model_id=model_id,
    model_version=model_version,
    model_type=model_type,
    environment=Environments.PRODUCTION,
)

# If successful, the server will return a status_code of 200
if response.status_code != 200:
    print(
        f"❌ logging failed with response code {response.status_code}, {response.text}"
    )
else:
    print(f"Step 3 ✅: You have successfully logged {len(df)} data points to Arize!")

### Check Data Ingestion Information

Data becomes available in the UI about 10 minutes after it is received. When you send data for a new model, the model itself appears in the Arize platform almost immediately, but its data will not be visible yet. To verify that data was sent correctly and is being processed, we recommend checking the Data Ingestion tab.

You will be able to see the predictions, actuals, and feature importances that have been sent in the last week, last day or last 30 minutes.

An example view of the Data Ingestion tab from a model, when data is sent continuously over 30 minutes, is shown in the image below.

<img src="https://storage.cloud.google.com/arize-assets/fixtures/data-ingestion-tab.png" width="700">



### Overview
Arize is an end-to-end ML observability and model monitoring platform. The platform is designed to help ML engineers and data science practitioners surface and fix issues with ML models in production faster with:
- Automated ML monitoring and model monitoring
- Workflows to troubleshoot model performance
- Real-time visualizations for model performance monitoring, data quality monitoring, and drift monitoring
- Model prediction cohort analysis
- Pre-deployment model validation
- Integrated model explainability

### Website
Visit Us At: https://arize.com/model-monitoring/

### Additional Resources
- [What is ML observability?](https://arize.com/what-is-ml-observability/)
- [Playbook to model monitoring in production](https://arize.com/the-playbook-to-monitor-your-models-performance-in-production/)
- [Using statistical distance metrics for ML monitoring and observability](https://arize.com/using-statistical-distance-metrics-for-machine-learning-observability/)
- [ML infrastructure tools for data preparation](https://arize.com/ml-infrastructure-tools-for-data-preparation/)
- [ML infrastructure tools for model building](https://arize.com/ml-infrastructure-tools-for-model-building/)
- [ML infrastructure tools for production](https://arize.com/ml-infrastructure-tools-for-production-part-1/)
- [ML infrastructure tools for model deployment and model serving](https://arize.com/ml-infrastructure-tools-for-production-part-2-model-deployment-and-serving/)
- [ML infrastructure tools for ML monitoring and observability](https://arize.com/ml-infrastructure-tools-ml-observability/)

Visit the [Arize Blog](https://arize.com/blog) and [Resource Center](https://arize.com/resource-hub/) for more resources on ML observability and model monitoring.
