# Evaluation of regression and classification models: model assessment

This code complements the PyTerK environment notebooks. It gathers all the desired model evaluations in a table, using a method similar to PyTerK's.

In this notebook, we evaluate different models on the same dataset. The train data, test data and models are loaded for each iteration and k-fold. Predictions are performed from train data. Expected and predicted values are stored.

Regression models are evaluated through the adjusted $R^2$ of the regression between expected and predicted values. For the adjusted $R^2$, the size of the dataset corresponds to ```len(y_pred)``` and the number of independent variables is 1, as predicted values are supposed to be equal to expected values.
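
As a minimal sketch of the formula described above, assuming $n$ = `len(y_pred)` samples and $p = 1$ predictor, the adjusted value is $R^2_{adj} = 1 - (1 - R^2)\frac{n - 1}{n - p - 1}$. The helper name `adjusted_r2` and the sample values are hypothetical and not part of PyTerK.

```python
import numpy as np

def adjusted_r2(y_test, y_pred, n_predictors=1):
    """Adjusted R2 of the linear regression between expected and predicted values."""
    y_test = np.asarray(y_test, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    n = len(y_pred)
    # For a simple linear regression, R2 equals the squared Pearson correlation
    r2 = np.corrcoef(y_test, y_pred)[0, 1] ** 2
    # Penalise for the number of predictors: 1 - (1 - R2)(n - 1)/(n - p - 1)
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Made-up values, for illustration only
y_test = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.1]
print(adjusted_r2(y_test, y_pred))
```

With a single predictor the correction factor is $(n - 1)/(n - 2)$, so the adjusted value is never larger than the plain $R^2$, and the two get closer as the fold size grows.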
__________________________________________________________________________________________________________________________________

*Example of the tree structure obtained from training 3 Neural Network (NN), Random Forest (RF) and SVM regression and classification models with different hyperparameters, for different outputs.*

* campaign08 
    * NN_r : regression NN models
        * Fit_E : output of the model
            * Fit_E_0000 : model n°1 *eg: 50x50 layers*
            * Fit_E_0001 : model n°2 *eg: 100x100 layers*
            * Fit_E_0002 : model n°3 *eg: 50x100 layers*
        * Fit_H
        * Fit_CI
        * Fit_IQ 
    * NN_c
    * RF_r
    * RF_c
    * SVM_r
    * SVM_c
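
The layout above can be traversed with a few lines of standard-library code. The sketch below is illustrative only: the helper name `list_model_dirs` is hypothetical, and the demonstration builds a throwaway tree in a temporary directory instead of reading a real campaign.

```python
import tempfile
from pathlib import Path

def list_model_dirs(campaign_dir, run, output):
    """Return the sorted, non-hidden model directories under campaign/run/output."""
    base = Path(campaign_dir) / run / output
    return sorted(p for p in base.iterdir()
                  if p.is_dir() and not p.name.startswith('.'))

# Build a small throwaway tree that mimics the example layout
with tempfile.TemporaryDirectory() as tmp:
    for name in ['Fit_E_0001', 'Fit_E_0000', '.ipynb_checkpoints']:
        (Path(tmp) / 'NN_r' / 'Fit_E' / name).mkdir(parents=True)
    model_names = [p.name for p in list_model_dirs(tmp, 'NN_r', 'Fit_E')]

print(model_names)
```

Hidden directories (such as notebook checkpoints) are skipped, which mirrors the `startswith('.')` filter used in the cells of this notebook.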



## Import libraries 

In [4]:
import numpy as np
import os, json
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.metrics import classification_report
from sklearn.metrics import hamming_loss

## Regression model

### Get paths for models to evaluate: stored results of iterative k-fold cross-validation trainings (```runs```) for different outputs (```fit```)

In [5]:
run_dir=os.getenv('RUN_DIR')
directory=f"{run_dir}ModelAssessment/"
runs=['SVM_r','NN_r','RF_r']
fit=['Fit_E','Fit_H', 'Fit_CI','Fit_IQ']
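
`os.getenv('RUN_DIR')` returns `None` when the variable is unset, and the f-string above relies on `RUN_DIR` ending with a path separator. A slightly more defensive variant, given here only as a sketch, could use `os.path.join`. The helper name `model_assessment_dir` and the example paths are assumptions.

```python
import os

def model_assessment_dir(run_dir=None):
    """Build the ModelAssessment path whether or not run_dir ends with a separator."""
    if run_dir is None:
        run_dir = os.getenv('RUN_DIR')
    if run_dir is None:
        raise RuntimeError('RUN_DIR is not set')
    return os.path.join(run_dir, 'ModelAssessment')

# Example values, for illustration only
print(model_assessment_dir('/tmp/run'))
print(model_assessment_dir('/tmp/run/'))
```

The returned path has no trailing separator, so any later string concatenation on it would need to go through `os.path.join` as well.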

### Calculation of adjusted $R^2$ for each model and output

In [6]:
# For each run (kind of model) and each output (prediction)
for num_model in range (0,len(runs)):
    for num_pred in range (0,len(fit)):
        # Go through the different kinds of models and outputs, e.g. NN_r/Fit_E
        run_dir=directory+runs[num_model]+'/'+fit[num_pred]+'/'

        # subdirs = all models of a kind, with different hyperparameters, e.g. NN_r/Fit_E/Fit_E_0000
        subdirs = [f.path for f in os.scandir(run_dir) if f.is_dir() and not f.name.startswith('.')]
        subdirs = sorted(subdirs)
        
        # DataFrame to store the metrics of each training: initialised with an empty column
        df_r2_adj_models=pd.DataFrame(columns=['model0']) 
        # DataFrame to store the mean and std of the metrics for each model: initialised with an empty column
        df_r2_adj_mean_std=pd.DataFrame(columns=['model0'],index=['R2 adj mean','R2 adj std'])
        
        
        for s in subdirs:
            r2_adj_table=[]
            # Read the model description; the context manager closes the file
            with open(s+'/about.json') as fd:
                about = json.load(fd)
            name_model=about['args']['model_id']
            # For all iterations and all k-folds
            iteration=[f.path for f in os.scandir(s) if f.is_dir() and not f.name.startswith('.')]
            for it in iteration:
                k_fold=[f.path for f in os.scandir(it) if f.is_dir() and not f.name.startswith('.')]
                for k in k_fold:
                    with open(k+'/yytest.json') as fd:
                        yy = json.load(fd)
                        y_pred=np.reshape(np.array(yy['y_pred']),(np.shape(np.array(yy['y_pred']))[0]))
                        y_test=np.reshape(np.array(yy['y_test']),(np.shape(np.array(yy['y_test']))[0]))

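                        # Compute R2 (squared Pearson correlation between expected and predicted values)
                        corr_matrix = np.corrcoef(y_test,y_pred)
                        corr = corr_matrix[0,1]
                        R2 = corr**2
                        # Adjusted R2 with one predictor; left disabled, so the plain R2 is what is stored below
                        #R2_adj=1-((1-R2)*(len(y_test)-1))/(len(y_test)-1-1)
                        r2_adj_table.append(R2)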
            
            df_r2_adj_models[name_model]=r2_adj_table
            df_r2_adj_mean_std[name_model]=[np.mean(r2_adj_table),np.std(r2_adj_table)]                                              
            

        # Print table of evaluation of each model (mean and std of metrics)
        display(runs[num_model]+'_'+fit[num_pred])
        del df_r2_adj_mean_std['model0'] # delete initial columns
        df_r2_adj_mean_std=df_r2_adj_mean_std.T
        display(df_r2_adj_mean_std)

        # Gives the best model.
        maxR2adj=df_r2_adj_mean_std['R2 adj mean'].max()
        bestmodel=df_r2_adj_mean_std['R2 adj mean'].idxmax()
        print('The best model is '+bestmodel +' with R2_adj='+str(maxR2adj)+' +/- '+str(df_r2_adj_mean_std['R2 adj std'][bestmodel])+'\n')
        
        # Save evaluation of each model. 
        df_r2_adj_mean_std.to_csv(f'{directory}results/regression/R2_adj_mean_std_'+runs[num_model]+'_'+fit[num_pred]+'.csv')                                                    

'SVM_r_Fit_E'

model,R2 adj mean,R2 adj std
sklearn-SVR-poly2-0.1-1,0.648501,0.024092
sklearn-SVR-poly2-5-1,0.652520,0.023574
sklearn-SVR-poly2-10-1,0.659457,0.022924
sklearn-SVR-poly2-0.1-100,0.682925,0.028544
sklearn-SVR-poly2-5-100,0.687212,0.025958
...,...,...
sklearn-nuSVR-rbf-0.5-100,0.947765,0.008881
sklearn-nuSVR-rbf-0.8-100,0.947244,0.009578
sklearn-nuSVR-rbf-0.2-1000,0.953800,0.008958
sklearn-nuSVR-rbf-0.5-1000,0.956740,0.009844


The best model is sklearn-nuSVR-rbf-0.5-1000 with R2_adj=0.9567404561727593 +/- 0.00984405645119613



  c /= stddev[:, None]
  c /= stddev[None, :]


'SVM_r_Fit_H'

model,R2 adj mean,R2 adj std
sklearn-SVR-poly2-0.1-1,0.451807,0.039509
sklearn-SVR-poly2-5-1,0.351267,0.030862
sklearn-SVR-poly2-10-1,,
sklearn-SVR-poly2-0.1-100,0.455348,0.038978
sklearn-SVR-poly2-5-100,0.350446,0.028760
...,...,...
sklearn-nuSVR-rbf-0.5-100,0.934863,0.012765
sklearn-nuSVR-rbf-0.8-100,0.932870,0.014842
sklearn-nuSVR-rbf-0.2-1000,0.944377,0.010687
sklearn-nuSVR-rbf-0.5-1000,0.946245,0.011766


The best model is sklearn-nuSVR-rbf-0.5-1000 with R2_adj=0.9462445762206475 +/- 0.011765842553282712



  c /= stddev[:, None]
  c /= stddev[None, :]


'SVM_r_Fit_CI'

model,R2 adj mean,R2 adj std
sklearn-SVR-poly2-0.1-1,0.534025,0.103509
sklearn-SVR-poly2-5-1,,
sklearn-SVR-poly2-10-1,,
sklearn-SVR-poly2-0.1-100,0.535486,0.101770
sklearn-SVR-poly2-5-100,,
...,...,...
sklearn-nuSVR-rbf-0.5-100,0.941082,0.037777
sklearn-nuSVR-rbf-0.8-100,0.941369,0.035773
sklearn-nuSVR-rbf-0.2-1000,0.937094,0.045953
sklearn-nuSVR-rbf-0.5-1000,0.939955,0.044764


The best model is sklearn-nuSVR-rbf-0.8-100 with R2_adj=0.9413692442782364 +/- 0.03577306472411468



'SVM_r_Fit_IQ'

model,R2 adj mean,R2 adj std
sklearn-SVR-poly2-0.1-1,0.488227,0.087095
sklearn-SVR-poly2-5-1,0.484759,0.090206
sklearn-SVR-poly2-10-1,0.484483,0.097327
sklearn-SVR-poly2-0.1-100,0.555867,0.111346
sklearn-SVR-poly2-5-100,0.566349,0.095987
...,...,...
sklearn-nuSVR-rbf-0.5-100,0.540873,0.059664
sklearn-nuSVR-rbf-0.8-100,0.480313,0.050152
sklearn-nuSVR-rbf-0.2-1000,0.630413,0.076722
sklearn-nuSVR-rbf-0.5-1000,0.629185,0.071454


The best model is sklearn-nuSVR-poly3-0.2-1000 with R2_adj=0.6498214815268614 +/- 0.0970019956571828



'NN_r_Fit_E'

model,R2 adj mean,R2 adj std
keras-50x50,0.951554,0.009784
keras-100x100,0.955538,0.009964
keras-50x50x50,0.955905,0.010585
keras-100x100x100,0.960464,0.010246
keras-50x50x50x50,0.961183,0.010441
keras-100x100x100x100,0.965007,0.010959
keras-100x100x50x50,0.962332,0.011012
keras-50x100x100x50,0.96461,0.011124
keras-50x50x100x100,0.963971,0.011455


The best model is keras-100x100x100x100 with R2_adj=0.9650071744651089 +/- 0.010958810813758945



'NN_r_Fit_H'

model,R2 adj mean,R2 adj std
keras-50x50,0.945531,0.014916
keras-100x100,0.950223,0.014428
keras-50x50x50,0.954197,0.014136
keras-100x100x100,0.95605,0.014896
keras-50x50x50x50,0.955968,0.01445
keras-100x100x100x100,0.956751,0.015021
keras-100x100x50x50,0.956209,0.015289
keras-50x100x100x50,0.956475,0.014923
keras-50x50x100x100,0.956207,0.014333


The best model is keras-100x100x100x100 with R2_adj=0.9567510205973954 +/- 0.015021431982757065



'NN_r_Fit_CI'

model,R2 adj mean,R2 adj std
keras-50x50,0.930356,0.040088
keras-100x100,0.937271,0.038692
keras-50x50x50,0.940192,0.038762
keras-100x100x100,0.944365,0.03407
keras-50x50x50x50,0.944714,0.038352
keras-100x100x100x100,0.945421,0.036865
keras-100x100x50x50,0.945418,0.037693
keras-50x100x100x50,0.94767,0.034963
keras-50x50x100x100,0.946228,0.038222


The best model is keras-50x100x100x50 with R2_adj=0.9476699208352644 +/- 0.034962944534769655



'NN_r_Fit_IQ'

model,R2 adj mean,R2 adj std
keras-50x50,0.627709,0.066335
keras-100x100,0.657957,0.074948
keras-50x50x50,0.681367,0.068353
keras-100x100x100,0.691431,0.072551
keras-50x50x50x50,0.704123,0.065337
keras-100x100x100x100,0.740161,0.067938
keras-100x100x50x50,0.718748,0.058658
keras-50x100x100x50,0.712471,0.066139
keras-50x50x100x100,0.713011,0.070215


The best model is keras-100x100x100x100 with R2_adj=0.7401609730895842 +/- 0.0679381010048114



'RF_r_Fit_E'

model,R2 adj mean,R2 adj std
sklearn-rfr-50-5,0.969386,0.011589
sklearn-rfr-100-5,0.968741,0.012264
sklearn-rfr-150-5,0.969241,0.011866
sklearn-rfr-50-10,0.968512,0.012278
sklearn-rfr-100-10,0.970079,0.011102
sklearn-rfr-150-10,0.969725,0.01099
sklearn-rfr-50-20,0.966311,0.011827
sklearn-rfr-100-20,0.9668,0.011762
sklearn-rfr-150-20,0.966853,0.011911


The best model is sklearn-rfr-100-10 with R2_adj=0.9700791369087292 +/- 0.01110249539750862



'RF_r_Fit_H'

model,R2 adj mean,R2 adj std
sklearn-rfr-50-5,0.955845,0.015165
sklearn-rfr-100-5,0.955445,0.014559
sklearn-rfr-150-5,0.95555,0.014097
sklearn-rfr-50-10,0.956218,0.013586
sklearn-rfr-100-10,0.956536,0.013876
sklearn-rfr-150-10,0.956832,0.014298
sklearn-rfr-50-20,0.952065,0.013803
sklearn-rfr-100-20,0.951926,0.014591
sklearn-rfr-150-20,0.952727,0.014308


The best model is sklearn-rfr-150-10 with R2_adj=0.956832380450869 +/- 0.01429758739312696



'RF_r_Fit_CI'

model,R2 adj mean,R2 adj std
sklearn-rfr-50-5,0.90025,0.053663
sklearn-rfr-100-5,0.904537,0.053976
sklearn-rfr-150-5,0.905085,0.045149
sklearn-rfr-50-10,0.890941,0.059214
sklearn-rfr-100-10,0.89496,0.055015
sklearn-rfr-150-10,0.893577,0.060722
sklearn-rfr-50-20,0.829801,0.067823
sklearn-rfr-100-20,0.832343,0.068304
sklearn-rfr-150-20,0.834071,0.065848


The best model is sklearn-rfr-150-5 with R2_adj=0.9050848881973714 +/- 0.04514917486945086



'RF_r_Fit_IQ'

model,R2 adj mean,R2 adj std
sklearn-rfr-50-5,0.914003,0.037704
sklearn-rfr-100-5,0.908815,0.047858
sklearn-rfr-150-5,0.910695,0.046928
sklearn-rfr-50-10,0.896496,0.049276
sklearn-rfr-100-10,0.897165,0.0574
sklearn-rfr-150-10,0.898342,0.051333
sklearn-rfr-50-20,0.793874,0.082265
sklearn-rfr-100-20,0.795042,0.08976
sklearn-rfr-150-20,0.795194,0.07749


The best model is sklearn-rfr-50-5 with R2_adj=0.9140030067161979 +/- 0.03770443946790342
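
The empty cells in some of the SVM tables above are NaN values. `np.corrcoef` returns NaN when one of its inputs is constant (zero standard deviation), which would happen if a model predicted the same value for every sample of a fold; this is a plausible cause here, not a verified one. The sketch below uses made-up scores and hypothetical names (`scores_by_model`, `summarise`) to show one way to build the mean/std table from a dictionary while making the NaN handling explicit.

```python
import numpy as np
import pandas as pd

# A constant prediction makes the correlation undefined (0/0), hence NaN
with np.errstate(invalid='ignore', divide='ignore'):
    r = np.corrcoef([1.0, 2.0, 3.0], [5.0, 5.0, 5.0])[0, 1]
print(np.isnan(r))

def summarise(scores_by_model):
    """Return one row per model: NaN-aware mean/std and the number of NaN folds."""
    rows = {}
    for name, scores in scores_by_model.items():
        scores = np.asarray(scores, dtype=float)
        valid = scores[~np.isnan(scores)]
        rows[name] = {
            'R2 mean': valid.mean() if valid.size else np.nan,
            'R2 std': valid.std() if valid.size else np.nan,
            'n_nan': int(np.isnan(scores).sum()),
        }
    return pd.DataFrame.from_dict(rows, orient='index')

# Made-up fold scores, for illustration only
scores_by_model = {
    'model_a': [0.90, 0.92, 0.94],
    'model_b': [0.50, np.nan, 0.70],  # one fold gave an undefined correlation
}
table = summarise(scores_by_model)
print(table)
```

Reporting the NaN count alongside the statistics keeps a model with failed folds visible instead of silently dropping it or turning its whole row into NaN.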



## Classification models

### Get paths for models to evaluate: stored results of iterative k-fold cross-validation trainings (```runs```) for different outputs (```classification```)
A prediction type is also given for each run: only class labels are needed to compute the metrics, so predictions of any other type (for example sigmoid outputs) are converted to classes first.

In [7]:
run_dir=os.getenv('RUN_DIR')
directory=f"{run_dir}campaign08/"
runs=['NN_class','RF_class','SVM_class']
classification=['Class_CI','Class_DRX']
predict_type_list=['sigmoid','classes','classes']
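
The conversion from raw model output to class labels can be sketched as a small helper. The function name `to_classes` is hypothetical; the 0.5 threshold matches the one used in the evaluation cell of this notebook, and `'softmax'` is assumed to mean one row of class scores per sample.

```python
import numpy as np

def to_classes(y_pred, predict_type, threshold=0.5):
    """Convert raw predictions to class labels."""
    y_pred = np.asarray(y_pred)
    if predict_type == 'softmax':
        # One score per class: keep the index of the largest score
        return np.argmax(y_pred, axis=-1)
    if predict_type == 'sigmoid':
        # One probability per sample: values below the threshold map to class 0
        return (y_pred.reshape(-1) >= threshold).astype(int)
    if predict_type == 'classes':
        # Already class labels: only flatten
        return y_pred.reshape(-1)
    raise ValueError(f'Unknown prediction type: {predict_type!r}')

# Made-up predictions, for illustration only
print(to_classes([[0.2], [0.7], [0.5]], 'sigmoid'))
print(to_classes([[0.1, 0.9], [0.8, 0.2]], 'softmax'))
```

Raising on an unknown type makes a typo in `predict_type_list` fail loudly instead of passing unconverted probabilities to the metrics.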

### Calculation of F1-score, accuracy, Hamming loss, confusion matrix for each model and output
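
As a toy illustration of these metrics, using made-up labels and not project data: for single-label predictions the Hamming loss is the fraction of misclassified samples, so it equals one minus the accuracy, and `normalize='pred'` scales each column of the confusion matrix to sum to 1.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             f1_score, hamming_loss)

# Made-up labels: 0 = amorphous, 1 = crystalline
y_test = np.array([0, 0, 0, 1, 1, 1, 1, 1])
y_pred = np.array([0, 0, 1, 1, 1, 1, 0, 1])

acc = accuracy_score(y_test, y_pred)
hml = hamming_loss(y_test, y_pred)
f1w = f1_score(y_test, y_pred, average='weighted')
# Rows are true classes, columns are predicted classes (column-normalised)
cm = confusion_matrix(y_test, y_pred, normalize='pred')

print(f'accuracy={acc:.3f}  Hamming loss={hml:.3f}  weighted F1={f1w:.3f}')
print(cm)
```

Because of this relation, ranking models by highest mean accuracy or by lowest mean Hamming loss should give the same answer, ties aside; the weighted F1 score is the metric that can genuinely disagree with them.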

In [8]:
# For each run (kind of model) and each output (prediction)
for num_model in range (0,len(runs)):
    for num_pred in range (0,len(classification)):

        # Go through the different kinds of models and outputs, e.g. RF_c/Class_CI
        run_dir=directory+runs[num_model]+'/'+classification[num_pred]+'/'
        

        # subdirs = all models of a kind, with different hyperparameters, e.g. RF_c/Class_CI/Class_CI_0000
        subdirs = [f.path for f in os.scandir(run_dir) if f.is_dir() and not f.name.startswith('.')]
        subdirs = sorted(subdirs)
        

        # DataFrame to store the metrics of each training: initialised with an empty column
        df_metrics_models=pd.DataFrame(columns=['model0'])

        # DataFrame to store the mean and std of the metrics for each model: initialised with an empty column
        df_metrics_mean_std=pd.DataFrame(columns=['model0'],
                                         index=['accuracy mean','accuracy std',
                                                'recall amorphous mean', 'recall amorphous std', 
                                                'precision amorphous mean','precision amorphous std',
                                                'F1 amorphous mean','F1 amorphous std',
                                                'recall cristalline mean', 'recall cristalline std', 
                                                'precision cristalline mean','precision cristalline std',
                                                'F1 cristalline mean','F1 cristalline std',
                                                'F1 weighted mean' , 'F1 weighted std',
                                                'Hamming loss mean', 'Hamming loss std'])
        
        
        
        for s in subdirs:
            accuracy_table=[]
            recall_table_amorphe=[]
            precision_table_amorphe=[]
            F1_table_amorphe=[]
            recall_table_crist=[]
            precision_table_crist=[]
            F1_table_crist=[]
            F1_table_weighted=[]
            Hml_table=[]
            y_pred_list=[]
            y_test_list=[]
            
            
            # Read the model description; the context manager closes the file
            with open(s+'/about.json') as fd:
                about = json.load(fd)
            name_model=about['args']['model_id']
            
            # For all iterations and all k-folds
            iteration=[f.path for f in os.scandir(s) if f.is_dir() and not f.name.startswith('.')]
            for it in iteration:
                k_fold=[f.path for f in os.scandir(it) if f.is_dir() and not f.name.startswith('.')]
                for k in k_fold:
                    with open(k+'/yytest.json') as fd:
                        yy = json.load(fd)
                        y_pred=np.reshape(np.array(yy['y_pred']),(np.shape(np.array(yy['y_pred']))[0]))
                        
                        y_test=np.reshape(np.array(yy['y_test']),(np.shape(np.array(yy['y_test']))[0]))
                        
                        # Transform output into class
                        predict_type=predict_type_list[num_model]
                        if predict_type=='softmax':
                            y_pred = np.array( [ np.argmax(y) for y in y_pred] )

                        if predict_type=='sigmoid':
                            y_pred = np.array( [ 0 if y<0.5 else 1 for y in y_pred] )

                        if predict_type=='classes':
                            y_pred = y_pred.squeeze()


                        metrics =pd.DataFrame(classification_report(y_test,y_pred,output_dict=True)).T
                        # The first two rows of the report are the two class labels in sorted order
                        # (this assumes both classes occur in the fold)
                        amorph_key=metrics['recall'].keys()[0]
                        crist_key=metrics['recall'].keys()[1]
                        
                        accuracy_table.append(metrics['precision']['accuracy'])
                        recall_table_amorphe.append(metrics['recall'][amorph_key])
                        precision_table_amorphe.append(metrics['precision'][amorph_key])
                        F1_table_amorphe.append(metrics['f1-score'][amorph_key])
                        recall_table_crist.append(metrics['recall'][crist_key])
                        precision_table_crist.append(metrics['precision'][crist_key])
                        F1_table_crist.append(metrics['f1-score'][crist_key])
                        F1_table_weighted.append(metrics['f1-score']['weighted avg'])
                        Hml_table.append(hamming_loss(y_test, y_pred))
                        y_pred_list.append(y_pred)
                        y_test_list.append(y_test)
                        
            y_pred_list=np.concatenate(y_pred_list[:])  
            y_test_list=np.concatenate(y_test_list[:])             
            
            df_metrics_mean_std[name_model]=[np.mean(accuracy_table),np.std(accuracy_table),
                                             np.mean(recall_table_amorphe),np.std(recall_table_amorphe),
                                             np.mean(precision_table_amorphe),np.std(precision_table_amorphe),
                                             np.mean(F1_table_amorphe),np.std(F1_table_amorphe),
                                             np.mean(recall_table_crist),np.std(recall_table_crist),
                                             np.mean(precision_table_crist),np.std(precision_table_crist),
                                             np.mean(F1_table_crist),np.std(F1_table_crist),
                                             np.mean(F1_table_weighted),np.std(F1_table_weighted),
                                             np.mean(Hml_table),np.std(Hml_table)]                                            
                                             
            cm = confusion_matrix( y_test_list, y_pred_list, normalize="pred")
            disp = ConfusionMatrixDisplay(confusion_matrix=cm,display_labels=['amorphous','cristalline'])
            disp.plot()
            plt.title(name_model)
            plt.show()    

        
        display(runs[num_model]+'_'+classification[num_pred])
        del df_metrics_mean_std['model0'] # suppress initialisation empty column
        
        # Print table of evaluation of each model (mean and std of metrics)
        df_metrics_mean_std=df_metrics_mean_std.T
        display(df_metrics_mean_std)

        # Give the best model based on the weighted F1 score and check that it is consistent with accuracy and Hamming loss
        max_F1_w=df_metrics_mean_std['F1 weighted mean'].max()
        bestmodel=df_metrics_mean_std['F1 weighted mean'].idxmax()
        accuracy_best=(df_metrics_mean_std['accuracy mean'].idxmax()==bestmodel)
        hml_best=(df_metrics_mean_std['Hamming loss mean'].idxmin()==bestmodel)
        print('The best model for phase classification is '+bestmodel +' with F1w='+str(max_F1_w)+' +/- '+str(df_metrics_mean_std['F1 weighted std'][bestmodel])+'\n')
        print('The best accuracy is obtained for the same model :'+str(accuracy_best)+'\n')
        print('The lowest Hamming loss is obtained for the same model :'+str(hml_best)+'\n')

        # Save metrics mean and std
        df_metrics_mean_std.to_csv(f'{directory}results/class/metrics_mean_std_'+runs[num_model]+'_'+classification[num_pred]+'.csv')                                                    

FileNotFoundError: [Errno 2] No such file or directory: '/Users/elisegarel/Desktop/PUBLI/run/campaign08/NN_class/Class_CI/'
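
The traceback above shows that the expected run directory does not exist under `campaign08/`. Note that the tree at the top of this notebook lists `NN_c`, while this cell uses `NN_class`; whether that mismatch is the cause is not known. One way to make the loop tolerant of missing directories, sketched here with a hypothetical helper `existing_run_dirs`, is to check each path before scanning it.

```python
import os
import tempfile

def existing_run_dirs(directory, runs, outputs):
    """Yield (run, output, path) for the combinations whose directory exists."""
    for run in runs:
        for output in outputs:
            path = os.path.join(directory, run, output)
            if os.path.isdir(path):
                yield run, output, path
            else:
                print(f'Skipping missing directory: {path}')

# Demonstration on a throwaway tree where only one combination exists
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'RF_class', 'Class_CI'))
    found = [(r, o) for r, o, _ in existing_run_dirs(
        tmp, ['NN_class', 'RF_class'], ['Class_CI', 'Class_DRX'])]

print(found)
```

Skipping with a printed message keeps the remaining runs evaluated while still making the missing directories visible in the output.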