# ImbalanceMetrics (Classification): Usage
## Example 2: Intermediate

## Dependencies
First, we load the required dependencies. We import `classification_metrics` from `imbalance_metrics` to evaluate the predictions of the SVC. We also use pandas for data handling and scikit-learn's `cross_validate` for cross-validation.

In [1]:
# load dependencies
import numpy as np
from sklearn import svm
from sklearn.metrics import make_scorer
from sklearn.model_selection import cross_validate
from imbalance_metrics import classification_metrics as cm
import pandas as pd

## Data
Next, we load the data. The dataset (poker-9_vs_7) is taken from the KEEL repository.

In [2]:
# load data
df = pd.read_csv(
    'https://raw.githubusercontent.com/paobranco/ImbalanceMetrics/main/data/poker-9_vs_7(processed).csv', header=None
)
df.head()

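|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|----|
| 0 | 1 | 2 | 3 | 2 | 4 | 2 | 2 | 2 | 1 | 8 | 0 |
| 1 | 3 | 10 | 3 | 12 | 4 | 10 | 1 | 10 | 2 | 10 | 0 |
| 2 | 1 | 13 | 2 | 6 | 4 | 6 | 3 | 6 | 1 | 6 | 0 |
| 3 | 4 | 1 | 1 | 1 | 3 | 2 | 3 | 1 | 2 | 1 | 0 |
| 4 | 2 | 9 | 4 | 6 | 3 | 9 | 4 | 9 | 1 | 9 | 0 |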


In [3]:
df.describe()


|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|----|
| count | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 | 244.0 |
| mean | 2.454918 | 6.758197 | 2.508197 | 6.92623 | 2.57377 | 6.590164 | 2.545082 | 7.07377 | 2.540984 | 6.790984 | 0.032787 |
| std | 1.085831 | 3.971469 | 1.145727 | 3.822541 | 1.095581 | 3.805463 | 1.112044 | 3.822541 | 1.115897 | 3.758305 | 0.178444 |
| min | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.0 |
| 25% | 2.0 | 3.0 | 1.0 | 3.0 | 2.0 | 3.0 | 2.0 | 3.75 | 2.0 | 3.0 | 0.0 |
| 50% | 2.0 | 6.5 | 2.0 | 7.0 | 3.0 | 7.0 | 3.0 | 7.0 | 2.0 | 7.0 | 0.0 |
| 75% | 3.0 | 10.0 | 4.0 | 10.0 | 4.0 | 10.0 | 4.0 | 11.0 | 4.0 | 10.0 | 0.0 |
| max | 4.0 | 13.0 | 4.0 | 13.0 | 4.0 | 13.0 | 4.0 | 13.0 | 4.0 | 13.0 | 1.0 |


In [4]:
# Assign x and y values from the dataframe
X = df.drop(columns=[10])
y = df[10]
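
The `df.describe()` output already hints at how imbalanced the target is: column 10 is binary, so its mean is the fraction of minority samples. The short calculation below recovers the class counts from the two figures printed above (244 rows, mean 0.032787). It is plain arithmetic on those reported values, not a call into `imbalance_metrics`; on the loaded dataframe, `y.value_counts()` should report the same counts directly.

```python
# Values copied from the df.describe() output for the target column (10).
n_samples = 244
minority_fraction = 0.032787  # mean of a 0/1 column = share of class 1

# Recover integer class counts; rounding undoes the 6-digit truncation.
n_minority = round(n_samples * minority_fraction)
n_majority = n_samples - n_minority

# Imbalance ratio: majority samples per minority sample.
imbalance_ratio = n_majority / n_minority

print(n_minority, n_majority, imbalance_ratio)  # 8 236 29.5
```

With only 8 minority samples, each of the 6 cross-validation folds used later can hold just one or two positives (assuming the stratified splitting that scikit-learn applies to classifiers by default), so per-fold scores can only take a few distinct values.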


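Because `gmean_score`, `pr_davis`, and `pr_manning` are not built-in scikit-learn metrics, they must be wrapped before they can be passed to the `cross_validate` function. scikit-learn's `make_scorer` converts them into compatible scorers. `pr_davis` and `pr_manning` are created with `needs_proba=True` so that they receive predicted probabilities rather than class labels.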

In [5]:
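# gmean_score is computed from predicted class labels
gmean = make_scorer(cm.gmean_score)
# pr_davis and pr_manning are computed from predicted probabilities.
# Note: scikit-learn >= 1.4 deprecates needs_proba in favour of response_method="predict_proba"
davis = make_scorer(cm.pr_davis, needs_proba=True)
manning = make_scorer(cm.pr_manning, needs_proba=True)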
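
To make the role of `make_scorer` more concrete, the sketch below mimics what a scorer does: it is a callable that receives a fitted estimator together with `X` and `y`, obtains predictions from the estimator, and forwards them to the metric. This is a simplified, dependency-free illustration of the idea and not scikit-learn's actual implementation; `simple_make_scorer`, `ConstantClassifier`, `accuracy`, and `mean_score_for_positives` are hypothetical names introduced only for this example.

```python
def simple_make_scorer(metric, use_proba=False):
    """Wrap metric(y_true, y_pred) into scorer(estimator, X, y_true)."""
    def scorer(estimator, X, y_true):
        if use_proba:
            # Probability-based metrics get the positive-class column.
            y_out = [row[1] for row in estimator.predict_proba(X)]
        else:
            # Label-based metrics get hard class predictions.
            y_out = estimator.predict(X)
        return metric(y_true, y_out)
    return scorer


class ConstantClassifier:
    """Toy binary classifier that always predicts the majority class 0."""

    def predict(self, X):
        return [0 for _ in X]

    def predict_proba(self, X):
        # Columns are [P(class 0), P(class 1)].
        return [[0.9, 0.1] for _ in X]


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_score_for_positives(y_true, y_score):
    """Average predicted probability given to the true positives."""
    scores = [s for t, s in zip(y_true, y_score) if t == 1]
    return sum(scores) / len(scores)


X_toy = [[1], [2], [3], [4]]
y_toy = [0, 0, 0, 1]  # one minority sample
model = ConstantClassifier()

label_score = simple_make_scorer(accuracy)(model, X_toy, y_toy)
proba_score = simple_make_scorer(mean_score_for_positives, use_proba=True)(model, X_toy, y_toy)
print(label_score, proba_score)  # 0.75 0.1
```

The toy classifier reaches 0.75 accuracy while missing the only minority sample, which is the kind of misleading result that imbalance-aware metrics are meant to expose.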

## Model
Next, we train and evaluate the model. In this example we use `svm.SVC()` from scikit-learn together with the `cross_validate` function, passing the metrics 'f1', 'gmean', 'davis', and 'manning' through the `scoring` parameter.

In [6]:
clf = svm.SVC(kernel="linear", probability=True, random_state=0)
# Map each result name to a built-in scorer string or a custom scorer
scoring = {"f1": "f1", "gmean": gmean, "davis": davis, "manning": manning}
cv = cross_validate(clf, X, y.to_numpy(), cv=6, scoring=scoring, return_train_score=True)


Using 1 as minority class


## Evaluation
Because the metrics were passed to `cross_validate` through the `scoring` parameter, it computes them for every fold and returns the results as a dictionary. We now print the test-fold results.

In [7]:
# Printing the result
print("F1 Score:", cv['test_f1'])
print("Gmean Score:", cv['test_gmean'])
print("Davis Interpolation:", cv['test_davis'])
print("Manning Interpolation:", cv['test_manning'])


F1 Score: [0.         0.         0.66666667 0.         0.         1.        ]
Gmean Score: [0.         0.         0.70710678 0.         0.         1.        ]
Davis Interpolation: [0.01219512 0.01219512 0.51844512 0.51844512 0.01315789 0.0125    ]
Manning Interpolation: [0.02439024 0.02439024 0.52439024 0.52439024 0.02631579 0.025     ]
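
The per-fold values are easier to read with the underlying formulas in mind. The sketch below computes F1 and the geometric mean of sensitivity and specificity from raw confusion-matrix counts. The counts are hypothetical, chosen so that the result matches the third fold's scores (F1 ≈ 0.667, G-mean ≈ 0.707); the actual confusion matrix of that fold does not appear in the output. The G-mean formula is the standard binary definition and is assumed here rather than taken from the `imbalance_metrics` source.

```python
import math


def f1_and_gmean(tp, fp, fn, tn):
    """Return (F1, G-mean) for a binary confusion matrix."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    gmean = math.sqrt(recall * specificity)
    return f1, gmean


# Hypothetical fold: 2 minority samples, 1 detected, no false alarms.
f1, gmean = f1_and_gmean(tp=1, fp=0, fn=1, tn=39)
print(round(f1, 4), round(gmean, 4))  # 0.6667 0.7071

# Hypothetical fold where the only minority sample is missed.
f1_zero, gmean_zero = f1_and_gmean(tp=0, fp=0, fn=1, tn=40)
print(f1_zero, gmean_zero)  # 0.0 0.0
```

Both metrics drop to zero as soon as no minority sample is detected, regardless of how many majority samples are classified correctly, which is consistent with the four folds above that score 0 on F1 and G-mean.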


## Conclusion

The "ImbalanceMetrics" package provides a set of evaluation metrics designed specifically for imbalanced domains, for assessing the performance of machine learning models trained on imbalanced datasets.

These metrics address the challenges of imbalanced domains and can give a more accurate assessment of model performance than traditional metrics, which can be misleading when one class is rare.

To learn more about our package, please refer to the documentation, which includes detailed descriptions of all the available metrics and their usage.