# ID2214 Fx Assignment
Abyel Tesfay, Abyel@kth.se

### Instructions
This Jupyter notebook contains solutions to a set of tasks in the form of simulations and tests, together with explanations and any assumptions made. It was written to complete the assignments below. Each assignment consists of an explanation and a simulation (or its results). Below the assignments you will find the code used for data preparation, modelling and evaluation.

## Load packages used

In [None]:
import numpy as np
import pandas as pd

## 1a. Methodology

Whether the best-performing configuration also beats the baseline on new data depends on the models generated from the hyper-parameter settings and the algorithm used. The estimated performance of the best-performing model depends on how the given dataset is randomly split into two samples. Its accuracy may therefore be too optimistic, because the good score partly reflects the particular sample that was randomly generated. To investigate this I performed the following steps.
- I chose the dataset "healthcare-dataset-stroke-data.csv", which has binary class labels
- I prepared two equally sized samples using random sampling
- For modelling I used RandomForest with the hyper-parameters 'n_estimators', 'criterion' and 'max_features'; the best-performing model was the one with the highest average accuracy in a ten-fold cross-validation
- For performance estimation I trained a model with the best configuration and a baseline model, using the first half as the training set. I then tested both models using the second half as the test set.
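
The steps above can be sketched compactly as follows. This is a minimal illustration, not the code used for the reported results: it assumes a synthetic dataset from `make_classification` in place of the stroke data, uses a much smaller grid, and swaps the manual loops over hyper-parameters for scikit-learn's `GridSearchCV`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, KFold, train_test_split

# Synthetic stand-in for the stroke data (assumption: imbalanced binary labels)
X, y = make_classification(n_samples=600, n_features=10, weights=[0.9, 0.1],
                           random_state=0)

# Two equally sized random halves
X_first, X_second, y_first, y_second = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Small illustrative grid; the grid used in the notebook is larger
param_grid = {"n_estimators": [5, 25],
              "criterion": ["gini", "entropy"],
              "max_features": [1, 2, 5]}
cv = KFold(n_splits=10, shuffle=True, random_state=1)

# Select the configuration with the highest mean ten-fold CV accuracy on the
# first half; GridSearchCV then refits it on the whole first half
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                      scoring="accuracy", cv=cv)
search.fit(X_first, y_first)

# Baseline: default hyper-parameters, trained on the same first half
baseline = RandomForestClassifier(random_state=0).fit(X_first, y_first)

# Compare both models on the untouched second half
tuned_acc = search.score(X_second, y_second)
baseline_acc = baseline.score(X_second, y_second)

print("best configuration:", search.best_params_)
print("CV accuracy (first half):", round(search.best_score_, 4))
print("tuned accuracy (second half):", round(tuned_acc, 4))
print("baseline accuracy (second half):", round(baseline_acc, 4))
```

The quantity of interest is the gap between the cross-validated score on the first half and the two accuracies on the second half; which model comes out ahead depends on the data and the random split.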

Using the hyper-parameters 'n_estimators' = [1,10,50,100,250], 'criterion' = ['gini', 'entropy'] and 'max_features' = [1,2,...,10] I received the following results:

The results show that even if the best-performing configuration of hyper-parameters (and algorithm) outperforms the baseline model on the first half of the data, the baseline model may still *outperform the best-performing configuration* on the second half. I also counted the models that performed better than the baseline during modelling, to see whether a majority of them outperformed it on the first half. If so, the best-performing configuration would be *more likely to outperform* the baseline on the second half of the data.
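
The optimism of the selected score can be illustrated with a small simulation. It is a simplified sketch under stated assumptions: all 100 configurations (5 * 2 * 10 settings) are assumed to have the same true accuracy, and their measured accuracies are assumed to be independent, which cross-validation scores on shared data are not. The values of `true_accuracy`, `n_validation` and `n_repeats` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

true_accuracy = 0.90   # assumed: every configuration is equally good
n_configs = 100        # 5 * 2 * 10 hyper-parameter settings
n_validation = 500     # assumed number of validation instances
n_repeats = 2000       # number of simulated model-selection rounds

# Each entry is the accuracy one configuration measures on a finite sample
estimates = rng.binomial(n_validation, true_accuracy,
                         size=(n_repeats, n_configs)) / n_validation

# Score of the configuration that "wins" each round
mean_selected = estimates.max(axis=1).mean()
# Score of one fixed configuration, e.g. a baseline that was not selected
mean_single = estimates[:, 0].mean()

print("mean score of the selected configuration:", round(mean_selected, 4))
print("mean score of a fixed configuration:", round(mean_single, 4))
```

Although no configuration is truly better than another, the selected one scores above the true accuracy on average, because taking the maximum picks up favourable noise. On fresh data that advantage disappears, which is why a baseline can come out ahead on the second half.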

## 1b. Data preparation


Assuming that the model was trained on an imbalanced training set that shares no instances with the test set, we should expect a **lower accuracy but a similar AUC** when evaluating the model on the class-balanced set. The model was trained on an imbalanced set where the majority class is frequent, so it tends to favour that class. When it is evaluated on a class-balanced test set (which has a lower frequency of the majority class), the accuracy will decrease. The AUC, however, will be similar: it only measures the probability that the model ranks an instance with the correct label ahead of an instance with the wrong label, which does not depend on the class proportions. A lower accuracy therefore does not affect this metric.
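
A small numerical sketch of this argument, using made-up scores rather than a trained model: positives and negatives get normally distributed scores, a fixed threshold that favours the majority class turns scores into predictions, and the class-balanced set is a random subset of the imbalanced one. The helper names `rank_auc` and `accuracy_at`, the score distributions and the threshold are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_auc(pos_scores, neg_scores):
    """Probability that a random positive is scored above a random negative
    (ties count as one half)."""
    diff = pos_scores[:, None] - neg_scores[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

def accuracy_at(pos_scores, neg_scores, threshold):
    """Accuracy when instances scoring above the threshold are predicted positive."""
    correct = (pos_scores > threshold).sum() + (neg_scores <= threshold).sum()
    return correct / (len(pos_scores) + len(neg_scores))

# Hypothetical scores: the minority (positive) class scores higher on average
pos = rng.normal(1.0, 1.0, size=500)
neg = rng.normal(0.0, 1.0, size=2000)                     # 4:1 imbalanced set
neg_balanced = rng.choice(neg, size=500, replace=False)   # class-balanced subset

threshold = 1.0  # assumed cut-off that favours the majority class

acc_imbalanced = accuracy_at(pos, neg, threshold)
acc_balanced = accuracy_at(pos, neg_balanced, threshold)
auc_imbalanced = rank_auc(pos, neg)
auc_balanced = rank_auc(pos, neg_balanced)

print("accuracy  imbalanced:", round(acc_imbalanced, 3), " balanced:", round(acc_balanced, 3))
print("AUC       imbalanced:", round(auc_imbalanced, 3), " balanced:", round(auc_balanced, 3))
```

Accuracy is a weighted average of the per-class hit rates, so it drops when the well-predicted majority class loses weight. The AUC compares positives with negatives pairwise, so removing negatives at random only adds sampling noise.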

The following steps were taken with two different datasets:
- Select a data set for the task
- Split the dataset into two halves, one training set and one 'sampling' set
- Use the sampling set to create the following test sets described in 1b:
    - An imbalanced test set in which the majority class is four times more frequent than the minority class
    - A class-balanced test set (has fewer instances than the above data set)
- Perform data preparation on the training set: filtering and imputation
- Generate and train two identical models using a selected algorithm, e.g. RandomForest
- Evaluate the models using both the imbalanced and balanced test sets
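
The construction of the two test sets can be sketched with pandas as below. This is an illustration under assumptions, not the notebook's own sampling code: `sample_test_set` is a hypothetical helper, the sampling set is synthetic, class 0 is assumed to be the majority class, and both classes are drawn without replacement.

```python
import numpy as np
import pandas as pd

def sample_test_set(df, label, n_majority, n_minority, seed=0):
    """Draw n_majority rows of class 0 and n_minority rows of class 1
    without replacement, then shuffle the combined rows."""
    majority = df[df[label] == 0].sample(n=n_majority, random_state=seed)
    minority = df[df[label] == 1].sample(n=n_minority, random_state=seed)
    return (pd.concat([majority, minority])
              .sample(frac=1, random_state=seed)
              .reset_index(drop=True))

# Hypothetical sampling set with 1000 majority and 200 minority instances
rng = np.random.default_rng(0)
sampling_set = pd.DataFrame({
    "feature": rng.normal(size=1200),
    "active": np.r_[np.zeros(1000, dtype=int), np.ones(200, dtype=int)],
})

# Imbalanced test set: majority class four times as frequent as the minority
imbalanced = sample_test_set(sampling_set, "active", n_majority=800, n_minority=200)
# Class-balanced test set: same minority count, fewer majority instances
balanced = sample_test_set(sampling_set, "active", n_majority=200, n_minority=200)

print(imbalanced["active"].value_counts().to_dict())
print(balanced["active"].value_counts().to_dict())
```

Keeping the minority count fixed and only reducing the majority count is what makes the balanced set smaller than the imbalanced one.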

Results, smiles_one_hot.csv:

Results, diabetes_binary_health_indicators_BRFSS2015.csv:

------------------------------

## Code


### Helper functions required for 1a and 1b

In [28]:
import numpy as np
import pandas as pd
import time
from sklearn.tree import DecisionTreeClassifier
from IPython.display import display

def create_column_filter(df):
    df2 = df.copy()
    column_filter = list(df2.columns)
    columns = [col for col in df2.columns if col not in ['active', 'index', 'id', 'class']]
    for col in columns:
        if df2[col].isnull().all():
            df2.drop(columns=col, inplace=True)
            column_filter.remove(col)
            continue

        if len(df2[col].dropna().unique()) <= 1:
            df2.drop(columns=col, inplace=True)
            column_filter.remove(col)

    return df2, column_filter

def apply_column_filter(df, column_filter):
    df2 = df.copy()
    [df2.drop(columns=col, inplace=True) for col in df2.columns if col not in column_filter]
    return df2

def create_normalization(df, normalizationtype='minmax'):
    df2 = df.copy()
    include_types = np.int32, np.int64, np.float32, np.float64
    columns = [col for col in df2.columns if col not in ['active', 'index', 'id', 'class']
               and df2[col].dtype in include_types]
    normalization = {}
    for col in columns:
        if normalizationtype=='minmax':
            # avoid shadowing the built-ins min and max
            col_min = df2[col].min()
            col_max = df2[col].max()
            normalization[col] = normalizationtype, col_min, col_max
        elif normalizationtype=='zscore':
            mean = df2[col].mean()
            std = df2[col].std()
            # store the parameters per column instead of overwriting the dictionary
            normalization[col] = normalizationtype, mean, std

    for col in columns:
        values = list(normalization[col])
        if values[0] == 'minmax':
            df2[col] = [(x-values[1])/(values[2]-values[1]) for x in df[col]]
        elif values[0] == 'zscore':
            df2[col] = [(x-values[1])/values[2] for x in df[col]]

    return df2, normalization

def apply_normalization(df, normalization):
    df2 = df.copy()
    include_types = np.int32, np.int64, np.float32, np.float64
    columns = [col for col in df2.columns if col not in ['active', 'index', 'id', 'class']
               and df2[col].dtype in include_types]
    for col in columns:
        # skip columns for which no normalization was created
        if col not in normalization:
            continue
        values = list(normalization[col])
        if values[0] == 'minmax':
            df2[col] = [(x-values[1])/(values[2]-values[1]) for x in df[col]]
        elif values[0] == 'zscore':
            df2[col] = [(x-values[1])/values[2] for x in df[col]]
    return df2

def create_imputation(df):
    df2 = df.copy()
    numeric_types = np.int32, np.int64, np.float32, np.float64
    columns = [col for col in df2.columns if col not in ['active', 'index', 'id', 'class']]
    imputation = {}
    for col in columns:
        if df2[col].dtype in numeric_types:
            # an all-missing numeric column is filled with 0 before taking the mean
            if df2[col].isnull().all():
                df2[col] = df2[col].fillna(0)
            imputation[col] = df2[col].mean()
        else:
            if df2[col].isnull().all():
                if df2[col].dtype == 'object':
                    df2[col] = df2[col].fillna('')
                else:
                    # an all-missing categorical column is filled with its first category
                    df2[col] = df2[col].astype('category')
                    df2[col] = df2[col].fillna(df2[col].cat.categories[0])
            imputation[col] = df2[col].mode()[0]

        # assign the result back instead of calling fillna in place on a column selection
        df2[col] = df2[col].fillna(imputation[col])

    return df2, imputation

def apply_imputation(df, imputation):
    df2 = df.copy()
    return df2.fillna(value=imputation)

def create_one_hot(df):
    df2=df.copy()
    columns = [col for col in df2.columns if col not in ['active', 'index', 'id', 'class']]
    one_hot={}
    for col in columns:
        if df2[col].dtype.name != 'category' and df2[col].dtype.name != 'object':
            continue
        one_hot[col]=df2[col].unique()
        tmp = pd.get_dummies(df2[col], prefix=col, prefix_sep='-', dtype=np.float64)
        df2.drop(columns=col, inplace=True)
        df2 = pd.concat([df2, tmp], axis=1)

    return df2, one_hot

def apply_one_hot(df, one_hot):
    new_df = df.copy()
    for e in list(new_df.columns):
        if e in one_hot:
            # pd.get_dummies orders the dummy columns by sorted category value;
            # use the same order so the columns line up with the training set
            for i in sorted(one_hot[e]):
                new_df[e + "-" + str(i)] = [1.0 if x == i else 0.0 for x in new_df[e]]
            new_df.drop(e, axis=1, inplace=True)
    return new_df

def accuracy(df, correctlabels):
    highest_probability = df.idxmax(axis=1)
    correct_occurances = 0
    for correct_label, predicted_label in zip(correctlabels, highest_probability):
        if correct_label==predicted_label:
            correct_occurances+=1

    return correct_occurances/df.index.size

def brier_score(df, correctlabels):
    squared_sum = 0
    # assumes df has a default RangeIndex, so the position doubles as the row label
    for row, label in enumerate(correctlabels):
        for col in df.columns:
            squared_sum += (1 - df.loc[row, label])**2 if label==col else df.loc[row, col]**2

    return squared_sum/df.index.size

def auc(df, correctlabels):
    auc=0
    for col in df.columns:
        df2 = pd.concat([df[col], pd.Series(correctlabels.astype('category'), name='correct')], axis=1)
        # get dummies for correct labels and sort descending
        df2 = pd.get_dummies(df2.sort_values(col, ascending=False))

        # move col to first for easier total tp and fp calculation
        tmp=df2.pop('correct_'+str(col))
        # get the col frequency for calculating weighted AUCs
        col_frequency=tmp.sum()/tmp.index.size
        df2.insert(1, tmp.name, tmp)
        scores={}
        # populate the scores dictionary for column i.e. key=score, value=[tp_sum, fp_sum]
        for row in df.index:
            key=df2.iloc[row, 0]
            current=np.zeros(2, dtype=np.uint) if scores.get(key) is None else scores[key]
            to_add=np.array([1,0]) if df2.iloc[row, 1]==1 else np.array([0,1])
            scores[key]=current+to_add

        # calculate auc based on scores
        cov_tp=0
        column_auc=0
        tot_tp=0
        tot_fp=0
        # calculate total tp and fp
        for value in scores.values():
            tot_tp+=int(value[0])
            tot_fp+=int(value[1])

        # same algorithm as in the lecture 
        for i in scores.values():
            if i[1] == 0:
                cov_tp+=i[0]
            elif i[0] == 0:
                column_auc += (cov_tp/tot_tp)*(i[1]/tot_fp)
            else:
                column_auc += (cov_tp/tot_tp)*(i[1]/tot_fp)+(i[0]/tot_tp)*(i[1]/tot_fp)/2
                cov_tp += i[0]

        auc+=col_frequency*column_auc

    return auc

### Task 1a

In [None]:
"""
Data preparation for task 1a
@author: Abyel
"""
import numpy as np
import pandas as pd

####################### - Helper functions
"""
Splits the data into two random, equally sized data sets using simple
random sampling, i.e. without stratifying on the class label
"""
def random_split(df, Class_name):
    df1 = df.copy()
    df1_reshuffled = np.random.choice(df1.index, len(df1.index), replace=False)
    
    # create a list of two samples, both are used to create train and test sets
    two_random_indexSets = np.random.choice(df1_reshuffled, (2,int(len(df1.index)/2)), replace=False)

    # list for both datasets
    df_list = []
    
    for elem in two_random_indexSets:
        df2 = df1.iloc[elem]      
        df2_reshuffled_indexes = np.random.choice(len(df2),len(df2),replace=False)
        data_set = [df2.iloc[i] for i in df2_reshuffled_indexes]
        df3 = pd.DataFrame(data_set, columns = list(df1))
        df3.index = range(len(df3.index))
        df_list.append(df3)

    # determine through coin flip which set that becomes the test set and training set
    flip = np.random.choice(len(df_list), 1, replace=False)
    test = df_list[flip[0]]
    df_list.pop(flip[0])
    training = df_list[0]
    
    return training, test

####################### - Main

#input here!
dataset_name = "healthcare-dataset-stroke-data.csv"
Class_label_name = "stroke" # set class label name here
print("Data set: " + dataset_name + ", class label: " + Class_label_name)
print()

# Get the dataset
data_set = pd.read_csv(dataset_name)
ON_data_set = data_set.copy()

# Count the instances of each class
class1 = sum(ON_data_set[Class_label_name].values == 0)
class2 = sum(ON_data_set[Class_label_name].values == 1)
print("Number of instances per class")
print("0: " + str(class1))
print("1: " + str(class2))

# the number of columns (this count includes the class label)
features = len(ON_data_set.columns)
print("Number of features: " + str(features))
print()

# Split into two sets, one for cross-validation and one for testing (evaluation)
training_set, test_set = random_split(ON_data_set, Class_label_name)

## distribution
train_class0 = sum(training_set[Class_label_name].values == 0)
train_class1 = sum(training_set[Class_label_name].values == 1)
test_class0 = sum(test_set[Class_label_name].values == 0)
test_class1 = sum(test_set[Class_label_name].values == 1)

print("Class distribution")
print("training set --- 0: " + str(train_class0) + ", 1: " + str(train_class1))
print("test set --- 0: " + str(test_class0) + ", 1: " + str(test_class1))

# Save into csv files
training_set.to_csv("training_set.csv", index=False)
test_set.to_csv("test_set.csv", index=False)

In [None]:
"""
Modelling for task 1a
@author: Abyel
"""

import numpy as np
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

############# Init

# input here!
np.random.seed(100) # Pick the same seed, for random generation
Class_label_name = 'stroke' # change depending on class name
Total_features = 12         # number of features

# Get training set
training_set = pd.read_csv("training_set.csv")

# Get the instances (X)
X_rf = training_set.copy()

# Get the labels (Y), then drop them from the training set
Y = training_set[Class_label_name].astype('category')
X_rf.drop(columns=[Class_label_name], inplace=True)

############# Data preparation

# RF
X_rf, column_filter = create_column_filter(X_rf)
X_rf, imputation = create_imputation(X_rf)
X_rf, one_hot = create_one_hot(X_rf)

# drop unnecessary columns (for certain datasets)
# X_rf = X_rf.drop(['id'], axis=1)

############# Modelling

# Prepare cross-validation
cv = KFold(n_splits=10, random_state=1, shuffle=True)

# Prepare hyper-parameters

num_trees = [1,10,50,100,250]
criterion = ["gini", "entropy"]
max_f = range(Total_features + len(one_hot))[1: 11]

scores = []
hyper_parameters = []

# Do cross-validation for RF
for num in num_trees:
    for crit in criterion:
        for no_features in max_f:
            model = RandomForestClassifier(n_estimators=num, criterion=crit, max_features=no_features)
            new_sc = np.mean(cross_val_score(model, X_rf, Y, scoring="accuracy", cv=cv, n_jobs=-1))
            hyper_parameters.append({"trees": num, "criteria": crit, "features": no_features})
            scores.append(new_sc)
        
# Find the best performing model based on score (and the best configuration)
indx = np.argmax(scores)
best_parameters = hyper_parameters[indx]
    
# Create a baseline model & validate it
model = RandomForestClassifier()
base_sc = np.mean(cross_val_score(model, X_rf, Y, scoring="accuracy", cv=cv, n_jobs=-1))

# Optional data: count the hyper-parameter settings that give better accuracy than the baseline
better_than_base = int(np.sum(np.array(scores) > base_sc))

print("Modelling & cross-validation:")
if scores[indx] > base_sc:
    print('Hyper-parameters is better')
else:
    print('baseline model is better or equal')
print('best hyper-parameters: ', best_parameters)
print('hyper-parameters score: ', round(scores[indx], 6))
print('base model score: ', round(base_sc, 6))
print('no. hyper-parameters better than baseline model: ', better_than_base)

In [None]:
"""
Evaluation for task 1a
@author: Abyel
"""
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

############# Start

# Set the seed and the best hyper-parameters
np.random.seed(100)
no_trees, criteria, no_features = 50, "gini", 2
Class_label_name = 'stroke'

# Get both training and test sets
test_set = pd.read_csv("test_set.csv")
train_set = pd.read_csv("training_set.csv")

# Create the X and Y datasets for training the models
X_train = train_set.copy()
Y_train = X_train[Class_label_name].astype('category')

############# Data preparation

X_train.drop(columns=[Class_label_name], inplace=True)
X_train, column_filter = create_column_filter(X_train)
X_train, imputation = create_imputation(X_train)
X_train, one_hot = create_one_hot(X_train)

# Training the best performing model and baseline model on the train set
rf_hyper_parameters = RandomForestClassifier(n_estimators=no_trees, criterion=criteria, max_features=no_features)
rf_hyper_parameters.fit(X_train, Y_train)
rf_base_line = RandomForestClassifier()
rf_base_line.fit(X_train, Y_train)

# Prepare the test set, both X and Y 
results = []
X_test = test_set.copy()
Y_true = X_test[Class_label_name].astype('category')
X_test.drop(columns=[Class_label_name], inplace=True)

# Apply data preparation
X_test = apply_column_filter(X_test, column_filter)
X_test = apply_imputation(X_test, imputation)
X_test = apply_one_hot(X_test, one_hot)

############# Testing (evaluation)
y_pred_hyper = rf_hyper_parameters.predict(X_test)
y_pred_baseline = rf_base_line.predict(X_test)

# Get the accuracy
accuracy_hyper_parameters = round(accuracy_score(Y_true, y_pred_hyper), 10)
accuracy_base_line = round(accuracy_score(Y_true, y_pred_baseline), 10)
results.append([accuracy_hyper_parameters, accuracy_base_line])

print()
print("Evaluation:")
if accuracy_hyper_parameters > accuracy_base_line:
    print('Hyper-parameters is better')
else:
    print('baseline model is better or equal')
print('Accuracy hyper-par:', accuracy_hyper_parameters, ', trees:', no_trees, 
      ', criterion:', criteria ,', features:', no_features)
print('Accuracy baseline: ', accuracy_base_line, ', trees', rf_base_line.n_estimators, 
      ', criterion:', rf_base_line.criterion ,', features:', rf_base_line.max_features)

### Task 1b

In [None]:
"""
Data preparation for 1b
@author: Abyel
"""
import numpy as np
import pandas as pd

####################### - Helper functions

"""
Splits the data into two random, equally sized data sets. Stratified sampling
is used so that both samples contain the same proportion of instances of each
class label, e.g. "1" and "0"
"""
def data_split(df, Class_name):
    df1 = df.copy()
    
    labels_0_indexes = np.where(df1[Class_name].values == 0)[0]
    labels_0_indexes_sets = np.random.choice(labels_0_indexes, (2,int(len(labels_0_indexes)/2)), replace=False)
    
    labels_1_indexes = np.where(df1[Class_name].values == 1)[0]
    labels_1_indexes_sets = np.random.choice(labels_1_indexes, (2,int(len(labels_1_indexes)/2)), replace=False)
    
    data_sets = zip(labels_1_indexes_sets,labels_0_indexes_sets)
    data_sets_list = list(data_sets)

    # list for both datasets
    df_list = []

    # split the dataset into 2 using indexes
    for elem in data_sets_list:
        df2 = pd.concat([df1.iloc[elem[0]], df1.iloc[elem[1]]])       
        df2_reshuffled_indexes = np.random.choice(len(df2),len(df2),replace=False)
        data_set = [df2.iloc[i] for i in df2_reshuffled_indexes]
        df3 = pd.DataFrame(data_set, columns = list(df1))
        df3.index = range(len(df3.index))
        df_list.append(df3)
        
    test = df_list[1]
    training = df_list[0]
    
    return training, test

####################### - Data preparation

dataset_name = "smiles_one_hot.csv"
Class_label_name = "active" # set class label name here
print("Data set: " + dataset_name + ", class label: " + Class_label_name)
print()

# Get the dataset, switch here with different datasets
data_set = pd.read_csv(dataset_name)
ON_data_set = data_set.copy()

# Count the instances of each class
class1 = sum(ON_data_set[Class_label_name].values == 0)
class2 = sum(ON_data_set[Class_label_name].values == 1)
print("Number of instances per class")
print("0: " + str(class1))
print("1: " + str(class2))

# the number of columns (this count includes the class label)
features = len(ON_data_set.columns)
print("Number of features: " + str(features))
print()

# Split into two sets, one for training and one for sampling the test sets
training_set, test_set = data_split(ON_data_set, Class_label_name)

## distribution
train_class0 = sum(training_set[Class_label_name].values == 0)
train_class1 = sum(training_set[Class_label_name].values == 1)
test_class0 = sum(test_set[Class_label_name].values == 0)
test_class1 = sum(test_set[Class_label_name].values == 1)

print("Class distribution")
print("training set --- 0: " + str(train_class0) + ", 1: " + str(train_class1))
print("test set --- 0: " + str(test_class0) + ", 1: " + str(test_class1))

# Save into csv
training_set.to_csv("B_training_set.csv", index=False)
test_set.to_csv("B_sampling_set.csv", index=False)

In [None]:
"""
Preparation of the majority and equal-sized samples described in 1b
@author: Abyel
"""
import numpy as np
import pandas as pd

####################### - Helper functions

## Creates a dataset in which the majority class is as frequent as the minority class
def equal_sampling(df, classname, Half_Class_label):
    df1 = df.copy()
    labels_0_indexes = np.where(df[classname].values == 0)[0]
    labels_0_indexes_sets = np.random.choice(labels_0_indexes, Half_Class_label, replace=False)       # 211 or 528
    
    labels_1_indexes = np.where(df1[classname].values == 1)[0]
    labels_1_indexes_sets = np.random.choice(labels_1_indexes, Half_Class_label, replace=True)       # 211 or 527
    
    df2 = pd.concat([df1.iloc[labels_0_indexes_sets], df1.iloc[labels_1_indexes_sets]])       
    df2_reshuffled_indexes = np.random.choice(len(df2),len(df2),replace=False)
    data_set = [df2.iloc[i] for i in df2_reshuffled_indexes]
    df3 = pd.DataFrame(data_set, columns = list(df1))
    df3.index = range(len(df3.index))
    
    return df3

## Creates a dataset in which the majority class is represented as 4/5 in the dataset
def adjust_sampling(df, classname, Half_Class_label, four_of_five):
    df1 = df.copy()
    labels_0_indexes = np.where(df1[classname].values == 0)[0]
    labels_0_indexes_sets = np.random.choice(labels_0_indexes, four_of_five, replace=False)     # 4/5 of the size
    
    labels_1_indexes = np.where(df1[classname].values == 1)[0]
    labels_1_indexes_sets = np.random.choice(labels_1_indexes, Half_Class_label, replace=True)  # 1/5 of the size

    df2 = pd.concat([df1.iloc[labels_0_indexes_sets], df1.iloc[labels_1_indexes_sets]])       
    df2_reshuffled_indexes = np.random.choice(len(df2),len(df2),replace=False)
    data_set = [df2.iloc[i] for i in df2_reshuffled_indexes]
    df3 = pd.DataFrame(data_set, columns = list(df1))
    df3.index = range(len(df3.index))
    
    return df3

####################### - Data preparation
Class_name = "active"  # set class name here
Half_Class_label = 211 # hardcoded: half the number of minority-class instances in the dataset
                        # e.g. 211 in smiles_one_hot, 1000 in the diabetes_binary dataset
four_of_five = 844     # hardcoded: number of majority-class instances, so that they make up 4/5 of the imbalanced test set
                        # e.g. 844 in smiles_one_hot, 4000 in the diabetes_binary dataset

# Get the sampling set; one copy for the imbalanced set and one for the class-balanced set
data_set = pd.read_csv("B_sampling_set.csv")
major_set = data_set.copy()
equal_set = data_set.copy()

# Build the two test sets from the sampling set
major_set = adjust_sampling(major_set, Class_name, Half_Class_label, four_of_five)   # imbalanced set, majority:minority = 4:1
equal_set = equal_sampling(equal_set, Class_name, Half_Class_label)                  # class-balanced set

#save into csv
major_set.to_csv("B_majority_test.csv", index=False)
equal_set.to_csv("B_undersample_test.csv", index=False)


In [None]:
"""
Modelling and testing of the data sets, 1b
@author: Abyel
"""

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.metrics import accuracy_score

################ HELPER FUNCTIONS

"""
Evaluates the given model with test_set. Uses f,i and o to apply 
filtering, imputation and one_hot encoding
"""
def evaluate_model(model, test_set, f, i, o, classname):
    X_test = test_set.copy()
    Y_true = X_test[classname].astype('category')
    X_test.drop(columns=[classname], inplace=True)
    
    # applying data preparation
    X_test = apply_column_filter(X_test, f)
    X_test = apply_imputation(X_test, i)
    X_test = apply_one_hot(X_test, o)


    # get predictions and score with given test set
    y_pred = model.predict(X_test)
    y_score = model.predict_proba(X_test)

    # get the AUC and accuracy
    accuracy = round(accuracy_score(Y_true, y_pred), 6)
    try:
        AUC = round(roc_auc_score(Y_true, y_score[:, 1]), 6)
    except ValueError:
        print("ERROR AUC")
        AUC = 0
    
    return accuracy, AUC

################ MAIN

# Set seed and no trees, also set the name of the class label
np.random.seed(100)
no_trees = 100
classname = 'active'

# get training set and both test sets
training_set = pd.read_csv("B_training_set.csv")
majority_set = pd.read_csv("B_majority_test.csv")
undersample_set = pd.read_csv("B_undersample_test.csv")

# get the instances (X)
X_train = training_set.copy()
Y_train = X_train[classname].astype('category')

### Data preparation
X_train.drop(columns=[classname, 'index'], inplace=True, errors='ignore')
X_train, column_filter = create_column_filter(X_train)
X_train, imputation = create_imputation(X_train)
X_train, one_hot = create_one_hot(X_train)

### Modelling
model = RandomForestClassifier(n_estimators=no_trees)
model.fit(X_train, Y_train)

# Prepare both test sets and evaluate the AUC and accuracy
results = []
major_accuracy, major_auc = evaluate_model(model, majority_set, column_filter, imputation, one_hot, classname)
under_accuracy, under_auc = evaluate_model(model, undersample_set, column_filter, imputation, one_hot, classname)

## print 
rows = [[major_accuracy, major_auc], [under_accuracy, under_auc]]
results_df = pd.DataFrame(rows, columns=['Accuracy', 'AUC'])
results_df.index = ['Imbalanced', 'class-balanced']
print()
print(results_df)
