# Evaluation
This notebook shows how to use the `evaluate` library to compute the evaluation metric for the AMEX competition.

## Install `evaluate`

In [None]:
%%capture
!pip install evaluate

## Load data

First, let's load the competition data:

In [None]:
import numpy as np
import pandas as pd
from pathlib import Path

input_path = Path('/kaggle/input/amex-default-prediction/')

train_data = pd.read_csv(
    input_path / 'train_data.csv',
    index_col='customer_ID',
    usecols=['customer_ID', 'P_2'])

train_labels = pd.read_csv(input_path / 'train_labels.csv', index_col='customer_ID')

## Make predictions
With the training data we can make some dummy predictions from the per-customer mean of `P_2`:

In [None]:
ave_p2 = (train_data
          .groupby('customer_ID')
          .mean()
          .rename(columns={'P_2': 'prediction'}))

# Scale the mean P_2 by the max value and take the complement
ave_p2['prediction'] = 1.0 - (ave_p2['prediction'] / ave_p2['prediction'].max())
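
To see what this transformation does, here is a minimal sketch on made-up data. The `toy` frame, its customer IDs, and its `P_2` values are hypothetical and only serve to illustrate the scaling; they are not taken from the competition data.

```python
import pandas as pd

# Hypothetical toy statements: three customers with made-up P_2 values.
toy = pd.DataFrame({
    'customer_ID': ['a', 'a', 'b', 'b', 'c'],
    'P_2': [0.2, 0.4, 0.6, 0.6, 0.9],
}).set_index('customer_ID')

# Same steps as above: mean P_2 per customer, renamed to 'prediction'.
toy_pred = (toy
            .groupby('customer_ID')
            .mean()
            .rename(columns={'P_2': 'prediction'}))

# Divide by the largest mean and take the complement, so the customer
# with the highest mean P_2 gets the lowest prediction (0.0).
toy_pred['prediction'] = 1.0 - (toy_pred['prediction'] / toy_pred['prediction'].max())

print(toy_pred)
```

In this toy case the customer with the largest mean `P_2` ends up with a prediction of 0, and customers with smaller means get larger values.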

## Evaluation with `evaluate` 

With `evaluate`, anybody can add community metrics (see the [guide](https://huggingface.co/docs/evaluate/creating_and_sharing) in the documentation). We added the AMEX evaluation metric as a community metric so that anybody can easily use it. It comes with an interactive widget and a metric card:

https://huggingface.co/spaces/lvwerra/amex

Using a community metric with `evaluate` takes just two lines of code once the library is imported:

In [None]:
import evaluate

amex_metric = evaluate.load("lvwerra/amex")
amex_metric.compute(references=train_labels["target"], predictions=ave_p2["prediction"])

## Original implementation
We can verify that this is the same result as with the original implementation provided by the organizers:

In [None]:
def amex_metric_original(y_true: pd.DataFrame, y_pred: pd.DataFrame) -> float:

    def top_four_percent_captured(y_true: pd.DataFrame, y_pred: pd.DataFrame) -> float:
        df = (pd.concat([y_true, y_pred], axis='columns')
              .sort_values('prediction', ascending=False))
        df['weight'] = df['target'].apply(lambda x: 20 if x==0 else 1)
        four_pct_cutoff = int(0.04 * df['weight'].sum())
        df['weight_cumsum'] = df['weight'].cumsum()
        df_cutoff = df.loc[df['weight_cumsum'] <= four_pct_cutoff]
        return (df_cutoff['target'] == 1).sum() / (df['target'] == 1).sum()
        
    def weighted_gini(y_true: pd.DataFrame, y_pred: pd.DataFrame) -> float:
        df = (pd.concat([y_true, y_pred], axis='columns')
              .sort_values('prediction', ascending=False))
        df['weight'] = df['target'].apply(lambda x: 20 if x==0 else 1)
        df['random'] = (df['weight'] / df['weight'].sum()).cumsum()
        total_pos = (df['target'] * df['weight']).sum()
        df['cum_pos_found'] = (df['target'] * df['weight']).cumsum()
        df['lorentz'] = df['cum_pos_found'] / total_pos
        df['gini'] = (df['lorentz'] - df['random']) * df['weight']
        return df['gini'].sum()

    def normalized_weighted_gini(y_true: pd.DataFrame, y_pred: pd.DataFrame) -> float:
        y_true_pred = y_true.rename(columns={'target': 'prediction'})
        return weighted_gini(y_true, y_pred) / weighted_gini(y_true, y_true_pred)

    g = normalized_weighted_gini(y_true, y_pred)
    d = top_four_percent_captured(y_true, y_pred)

    return 0.5 * (g + d)

In [None]:
print(amex_metric_original(train_labels, ave_p2)) 

Indeed, we get the same result :) 
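
For intuition about what the metric rewards, here is a minimal NumPy sketch of the same formula, run on made-up toy labels. `amex_metric_numpy` and the toy arrays are hypothetical names introduced for this illustration; they are not part of `evaluate` or of the organizers' code. The sketch breaks ties with a stable sort, which is an assumption and may differ from the pandas implementation above when predictions are tied.

```python
import numpy as np

def amex_metric_numpy(target, prediction):
    """Sketch of 0.5 * (normalized weighted Gini + top-4% capture rate)."""
    target = np.asarray(target, dtype=float)
    prediction = np.asarray(prediction, dtype=float)

    def sorted_target_and_weight(score):
        # Sort by descending score (stable tie-breaking is an assumption).
        order = np.argsort(-score, kind='stable')
        t = target[order]
        # Negatives are weighted 20x, as in the original implementation.
        w = np.where(t == 0, 20.0, 1.0)
        return t, w

    def top_four_percent_captured():
        t, w = sorted_target_and_weight(prediction)
        cutoff = int(0.04 * w.sum())
        in_top = np.cumsum(w) <= cutoff
        return t[in_top].sum() / t.sum()

    def weighted_gini(score):
        t, w = sorted_target_and_weight(score)
        random = np.cumsum(w / w.sum())
        lorentz = np.cumsum(t * w) / (t * w).sum()
        return ((lorentz - random) * w).sum()

    g = weighted_gini(prediction) / weighted_gini(target)
    d = top_four_percent_captured()
    return float(0.5 * (g + d))

# Made-up labels: 10 defaults followed by 90 non-defaults.
toy_target = np.array([1] * 10 + [0] * 90)
perfect = toy_target.astype(float)   # ranks every default first
inverted = 1.0 - perfect             # ranks every default last

score_perfect = amex_metric_numpy(toy_target, perfect)
score_inverted = amex_metric_numpy(toy_target, inverted)
print(score_perfect, score_inverted)
```

On this toy data the perfect ranking scores 1.0 because all ten defaults fall inside the weighted top 4%. That is specific to the example: with a larger share of defaults, the cutoff would not capture all of them even for a perfect ranking.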