# ImbalancedDomainsMetrics (Classification): Usage
## Example 1: Beginner

## Dependencies
First, we load the required dependencies. We import classification_metrics from imbalanced_metrics to evaluate the results we get from the SVM. In addition, we use pandas for data handling and train_test_split to split the dataset.

In [1]:
# load dependencies
from imbalanced_metrics import classification_metrics as cm
from sklearn import svm
import pandas as pd
from sklearn.model_selection import train_test_split

## Data
Next, we load the glass dataset. The file has no header row, so the columns are numbered 0 to 9.


In [2]:
# load data
df = pd.read_csv(
    'https://raw.githubusercontent.com/paobranco/ImbalanceMetrics/main/data/glass.csv', header=None
)
df.head()

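|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.52101 | 13.64 | 4.49 | 1.1 | 71.78 | 0.06 | 8.75 | 0.0 | 0.0 | 1 |
| 1 | 1.51761 | 13.89 | 3.6 | 1.36 | 72.73 | 0.48 | 7.83 | 0.0 | 0.0 | 1 |
| 2 | 1.51618 | 13.53 | 3.55 | 1.54 | 72.99 | 0.39 | 7.78 | 0.0 | 0.0 | 1 |
| 3 | 1.51766 | 13.21 | 3.69 | 1.29 | 72.61 | 0.57 | 8.22 | 0.0 | 0.0 | 1 |
| 4 | 1.51742 | 13.27 | 3.62 | 1.24 | 73.08 | 0.55 | 8.07 | 0.0 | 0.0 | 1 |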


In [3]:
# Split the dataframe into features (X) and class labels (y); column 9 holds the label
X = df.drop(columns=[9])
y = df[9]

In [4]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33)
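
The split above is random and unstratified, so the class counts in the test set (and the numbers reported later) can change between runs. A minimal sketch of a reproducible, stratified split follows. The toy data and the names `X_demo`, `y_demo` are hypothetical; `stratify` and `random_state` are standard `train_test_split` parameters. Stratifying requires at least two samples of every class.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy imbalanced data (hypothetical values): 30 samples of class 1, 6 of class 6
X_demo = pd.DataFrame({"feature": range(36)})
y_demo = pd.Series([1] * 30 + [6] * 6)

# stratify keeps the class proportions similar in both splits;
# random_state fixes the shuffle so the split is repeatable
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.33, stratify=y_demo, random_state=0
)

# Inspect how many samples of each class ended up in each split
print(y_tr.value_counts().to_dict())
print(y_te.value_counts().to_dict())
```

With rare classes, a stratified split makes it less likely that the test set ends up with no minority samples at all.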

## Model
Next, we train the model on the training data. In this example, we use `svm.SVC()` from sklearn. Because we call `predict_proba`, `y_pred` holds the predicted class probabilities for the test set, which we compare with the true test labels `y_test` in the evaluation.

In [5]:
clf = svm.SVC(kernel="linear", probability=True, random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict_proba(X_test)
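
Since `predict_proba` returns one column per class, it can help to check how columns map to labels. The sketch below uses scikit-learn's bundled iris data instead of the glass file so it runs offline; the names `clf_demo`, `proba`, and `scores_for_label` are illustrative only.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import load_iris

# Small bundled dataset with three classes labelled 0, 1, 2
X_demo, y_demo = load_iris(return_X_y=True)

clf_demo = svm.SVC(kernel="linear", probability=True, random_state=0)
clf_demo.fit(X_demo, y_demo)

# One row per sample, one column per class; column order follows clf_demo.classes_
proba = clf_demo.predict_proba(X_demo)
print(clf_demo.classes_)
print(proba.shape)

# Look up the column that holds the scores for a given class label
label = 2
col = int(np.where(clf_demo.classes_ == label)[0][0])
scores_for_label = proba[:, col]
```

Each row of `proba` sums to 1, and `scores_for_label` is the per-sample score for the chosen class.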

## Evaluation

Now we evaluate the predictions using the metrics from classification_metrics. We start with the Precision-Recall AUC computed with the Davis method.

First, we keep the default value None for pos_label. When pos_label = None, the minority class is selected as the positive label. In this example, 6 is the minority class, so here pos_label = 6.

In [6]:
# Here we keep the default pos_label=None so the minority class is selected as the positive class.
cm.pr_davis(y_test, y_pred)

Using 6 as minority class


0.5138888888888888
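
To see which label would count as the minority class, one option is to tally the labels directly. The sketch below assumes that "minority" means the least frequent label in the true labels; the library may determine it differently. The label counts are made up for illustration.

```python
import pandas as pd

# Hypothetical label vector: class 6 is the rarest
y_demo = pd.Series([1] * 20 + [2] * 25 + [7] * 9 + [6] * 3)

# Count samples per class and pick the label with the fewest samples
counts = y_demo.value_counts()
minority_label = counts.idxmin()

print(counts.to_dict())
print(minority_label)
```

Running the same two lines on `y_test` shows which class a random split actually left as the rarest.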

Next, we pass 7 as pos_label, so class 7 is treated as the positive label. This time we also pass return_pr=True so the function returns precision and recall alongside the Davis AUC.

In [7]:
# Here we set 7 as the positive label and also request precision and recall
cm.pr_davis(y_test, y_pred, return_pr=True, pos_label=7)


(array([0.24137931, 0.22807018, 0.23214286, 0.23636364, 0.24074074,
        0.24528302, 0.25      , 0.23529412, 0.24      , 0.24489796,
        0.25      , 0.25531915, 0.26086957, 0.26666667, 0.27272727,
        0.27906977, 0.28571429, 0.29268293, 0.3       , 0.30769231,
        0.31578947, 0.32432432, 0.33333333, 0.34285714, 0.35294118,
        0.36363636, 0.375     , 0.38709677, 0.4       , 0.37931034,
        0.39285714, 0.40740741, 0.42307692, 0.44      , 0.45833333,
        0.47826087, 0.5       , 0.47619048, 0.5       , 0.47368421,
        0.5       , 0.52941176, 0.5625    , 0.6       , 0.64285714,
        0.69230769, 0.75      , 0.72727273, 0.8       , 0.88888889,
        0.875     , 0.85714286, 0.83333333, 0.8       , 0.75      ,
        0.66666667, 0.5       , 0.        , 1.        ]),
 array([1.        , 0.92857143, 0.92857143, 0.92857143, 0.92857143,
        0.92857143, 0.92857143, 0.85714286, 0.85714286, 0.85714286,
        0.85714286, 0.85714286, 0.85714286, 0.85714286, 0.

After the Davis method, we compute the Precision-Recall AUC using the Manning method.

As with the Davis method, we first keep the default value None for pos_label. This means the minority class, 6, is treated as the positive label.

In [8]:
# Here we keep the default pos_label=None so the minority class is selected as the positive class.
cm.pr_manning(y_test, y_pred)

Using 6 as minority class


0.75

Next, we pass 7 as pos_label, so class 7 is treated as the positive label. This time we also pass return_pr=True so the function returns precision and recall alongside the Manning AUC.

In [9]:
# Here we set 7 as the positive label and also request precision and recall
cm.pr_manning(y_test, y_pred, return_pr=True, pos_label=7)

(array([1.        , 0.88888889, 0.88888889, 0.88888889, 0.88888889,
        0.88888889, 0.88888889, 0.88888889, 0.88888889, 0.88888889,
        0.8       , 0.75      , 0.75      , 0.69230769, 0.64285714,
        0.6       , 0.5625    , 0.52941176, 0.5       , 0.5       ,
        0.5       , 0.5       , 0.5       , 0.47826087, 0.45833333,
        0.44      , 0.42307692, 0.40740741, 0.4       , 0.4       ,
        0.4       , 0.38709677, 0.375     , 0.36363636, 0.35294118,
        0.34285714, 0.33333333, 0.32432432, 0.31578947, 0.30769231,
        0.3       , 0.29268293, 0.28571429, 0.27906977, 0.27272727,
        0.26666667, 0.26086957, 0.25531915, 0.25      , 0.25      ,
        0.25      , 0.25      , 0.25      , 0.24528302, 0.24137931,
        0.24137931, 0.24137931, 0.24137931, 0.24137931, 0.23728814,
        0.23333333, 0.2295082 , 0.22580645, 0.22222222, 0.21538462,
        0.21212121, 0.20895522, 0.20588235, 0.20289855, 0.2       ,
        0.1971831 ]),
 array([0.        , 0.    
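
The precision values in the Manning output above never increase as recall grows. That matches the textbook notion of interpolated precision, where the precision at a recall level is the highest precision found at that recall or any higher one. The sketch below illustrates that idea only; whether `pr_manning` computes it this way is an assumption, and `interpolate_precision` and `trapezoid_area` are hypothetical helpers, not part of the package.

```python
import numpy as np

def interpolate_precision(precision, recall):
    """Hypothetical helper: max precision at any recall >= each recall level."""
    precision = np.asarray(precision, dtype=float)
    recall = np.asarray(recall, dtype=float)
    # Sort the points by increasing recall
    order = np.argsort(recall, kind="stable")
    p_sorted, r_sorted = precision[order], recall[order]
    # For each recall level, take the best precision at that recall or beyond
    p_interp = np.array([p_sorted[r_sorted >= r].max() for r in r_sorted])
    return p_interp, r_sorted

def trapezoid_area(y, x):
    """Hypothetical helper: area under the points (x, y) by the trapezoidal rule."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

# Made-up precision/recall points with a dip at recall 0.5
recall_demo = [0.25, 0.5, 0.75, 1.0]
precision_demo = [1.0, 0.5, 0.6, 0.4]

p_interp, r_sorted = interpolate_precision(precision_demo, recall_demo)
area = trapezoid_area(p_interp, r_sorted)
print(p_interp)
print(area)
```

The dip at recall 0.5 is lifted to 0.6, so the interpolated curve is non-increasing, like the arrays printed above.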

## Conclusion

This package, "ImbalancedDomainsMetrics", provides a set of evaluation metrics specifically designed for imbalanced domains, for assessing the performance of machine learning models trained on imbalanced datasets.

These metrics address the challenges of imbalanced domains and can give a more accurate assessment of model performance than traditional metrics, which can be misleading when classes are imbalanced.

To learn more about our package, please refer to the documentation, which includes detailed descriptions of all the available metrics and their usage.