# 2.3A Model selection for machine learning

In the model selection step for machine learning, we:
- import the data obtained in step 2, on which the models will be trained
- convert the molecular structures into fingerprints (a bit-vector encoding of the structure, which gives us the features, i.e. the X matrix)
- choose the most suitable combination of fingerprint, classifier, sampling technique, scaler, and outlier detection technique

# Importing libraries and general functions

In [1]:
%run __A_knjiznice.py

from __A_knjiznice import *
from __B_funkcije import *
import __C_konstante as kon
%matplotlib inline

# Importing the data processed in step 2 (Data processing and analysis)

## Data overview

We are limited to at most 4142 samples. This is a fairly small sample, but we can only use the data that exists.

Two rules of thumb are often used to estimate the required size of a training set.

1. Rule of thumb for prediction classes
The first rule suggests a sample size of at least 50 to 1000 times the number of prediction classes. For binary classification (2 classes), this means a minimum of 100 to 2000 samples. With 4142 samples we exceed even the upper end of this range, so from the perspective of prediction classes alone the sample size is adequate.
2. Rule of thumb for observations vs. features
The more demanding guideline in our case suggests at least 20 times as many observations as features. With up to 4860 features for certain fingerprints, this rule implies roughly 100,000 observations (20 × 4860 = 97,200), far more than our sample size. The guideline matters because it guards against overfitting, where a model learns the noise in the training data instead of the actual signal and generalizes poorly to new data.

To stay as close as possible to the second rule, I decided to exclude fingerprints with more than 1024 features, because they are more likely to lead to overfitting.
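
The arithmetic behind the two rules of thumb can be sketched as follows. This is a minimal illustration using only the figures quoted above; the helper name `min_observations` is hypothetical and not part of the notebook's own modules.

```python
# Sample-size rules of thumb applied to this dataset (figures taken from the text).
n_samples = 4142   # molecules available
n_classes = 2      # binary classification

# Rule 1: 50 to 1000 samples per prediction class.
class_rule = (50 * n_classes, 1000 * n_classes)

# Rule 2: at least `factor` observations per feature.
def min_observations(n_features, factor=20):
    """Hypothetical helper: observations required for a given feature count."""
    return factor * n_features

needed_largest = min_observations(4860)   # widest fingerprint mentioned above
needed_cutoff = min_observations(1024)    # widest fingerprint that is kept
max_features = n_samples // 20            # feature count the rule would allow

print(class_rule, needed_largest, needed_cutoff, max_features)
```

Even at the 1024-feature cutoff the second rule is not met, since it would call for about 20,000 observations. The cutoff narrows the gap rather than closing it.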



In [2]:
df = pd.read_csv(f'{kon.path_files}/dp.csv')
df

Unnamed: 0,Smiles,ROMol,Activity
0,O=C1c2cc([N+](=O)[O-])ccc2-n2c1nc1ccccc1c2=O,<rdkit.Chem.rdchem.Mol object at 0x16fce49e0>,1
1,Cc1cc(C2CC2)ncc1-c1ccc(C2(C(=O)Nc3ccc(F)cc3)CO...,<rdkit.Chem.rdchem.Mol object at 0x16fcb09e0>,1
2,O=C(Nc1ccc(C2(C(=O)Nc3ccc(F)cc3)COC2)cc1)c1ccc...,<rdkit.Chem.rdchem.Mol object at 0x16fcdcba0>,1
3,O=C(Nc1ccc(F)cc1)C1(C2CCC3C(CCCN3c3ccnc(C(F)(F...,<rdkit.Chem.rdchem.Mol object at 0x16fcddcb0>,1
4,O=C1CC(c2c[nH]c3ccc(F)cc23)C(=O)N1,<rdkit.Chem.rdchem.Mol object at 0x16fce25e0>,1
...,...,...,...
4137,FC(F)(F)c1ccc(-c2c[nH]nn2)cc1,<rdkit.Chem.rdchem.Mol object at 0x16fcbef10>,0
4138,c1ccc2[nH]nnc2c1,<rdkit.Chem.rdchem.Mol object at 0x16fcf54d0>,0
4139,Cc1cccc(NC(=O)C(F)(F)F)c1-c1c[nH]nn1,<rdkit.Chem.rdchem.Mol object at 0x16fcb5230>,0
4140,Cc1ccc(N)cc1-c1c[nH]nn1,<rdkit.Chem.rdchem.Mol object at 0x16fcb2500>,0


In [3]:
activity_counts = df['Activity'].value_counts()
print(activity_counts)

Activity
1    2103
0    2039
Name: count, dtype: int64
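
The class counts above also give a reference point for reading the accuracy figures later on. The sketch below computes the share of the majority class, which is the accuracy a trivial "always predict the majority class" model would reach on the full dataset. The counts are copied from the output above; the variable names are illustrative.

```python
# Class counts copied from the value_counts() output above.
counts = {1: 2103, 0: 2039}

total = sum(counts.values())
majority_share = max(counts.values()) / total                  # trivial-baseline accuracy
imbalance_ratio = max(counts.values()) / min(counts.values())  # majority : minority

print(total, round(majority_share, 4), round(imbalance_ratio, 3))
```

With a ratio this close to 1, the classes are nearly balanced, so plain accuracy is a reasonable headline metric here.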


# Overview of combinations of selected fingerprints, classification models, and preprocessing steps

In [4]:
# https://medium.com/artificialis/why-how-we-split-train-valid-and-test-fb4d6746ede

In [5]:
input_directory = f'{kon.path_files}/molekulski_prstni_odtisi'

generated_fingerprints = [
    'df_morgan.csv',
    'df_circular.csv'
]

In [6]:
import os
import pandas as pd
from sklearn.model_selection import train_test_split, StratifiedKFold
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from xgboost import XGBClassifier
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, roc_auc_score
from imblearn.pipeline import Pipeline as ImbPipeline
from sklearn.feature_selection import SelectKBest, chi2, VarianceThreshold
from sklearn.decomposition import PCA
import numpy as np

# Define classifiers with random_state for reproducibility
classifiers = {
    'SupportVectorMachine': SVC(probability=True, random_state=kon.random_seed),
    'RandomForestClassifier': RandomForestClassifier(n_jobs=-1, random_state=kon.random_seed)
}

# Dimensionality reduction methods with fixed, untuned parameters (k=300, n_components=50)
dim_reduction_methods = {
    "None": None,
    "SelectKBest": SelectKBest(score_func=chi2, k=300),
    "PCA": PCA(n_components=50)  
}

# Store results
results_list = []

# Define Stratified k-fold cross-validation
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=kon.random_seed)

# Process each selected fingerprint file (generated_fingerprints is defined in cell In [5])
for filename in generated_fingerprints:
    file_path = os.path.join(input_directory, filename)
    
    if os.path.exists(file_path):  # Check if the file exists
        print(f'Processing fingerprint DataFrame: {filename}')
        
        df = pd.read_csv(file_path)
        y = df[['Activity']].values.ravel()  # Assuming 'Activity' is the target
        X = df.iloc[:, 3:]  # Assuming features start from the 4th column

        # Split the data into train, validation, and test sets
        X_interim, X_test, y_interim, y_test = train_test_split(X, y, test_size=0.10, random_state=kon.random_seed, shuffle=True, stratify=y)
        X_train, X_val, y_train, y_val = train_test_split(X_interim, y_interim, test_size=10/90, random_state=kon.random_seed, shuffle=True, stratify=y_interim)

        # Remove constant features
        selector = VarianceThreshold()
        X_train = pd.DataFrame(selector.fit_transform(X_train), columns=selector.get_feature_names_out())
        
        # Apply the same transformation to the validation and test sets
        X_val = pd.DataFrame(selector.transform(X_val), columns=selector.get_feature_names_out())
        X_test = pd.DataFrame(selector.transform(X_test), columns=selector.get_feature_names_out())

        # Train and evaluate each classifier
        for clf_name, clf in classifiers.items():
            for dr_name, dr_method in dim_reduction_methods.items():
                steps = []
                if dr_method is not None:
                    steps.append(('dim_reduction', dr_method))
                steps.append(('classifier', clf))
                
                # Create the pipeline
                pipeline = ImbPipeline(steps)

                # Perform cross-validation
                cv_results = []
                for train_index, val_index in cv.split(X_train, y_train):
                    X_train_cv, X_val_cv = X_train.iloc[train_index], X_train.iloc[val_index]
                    y_train_cv, y_val_cv = y_train[train_index], y_train[val_index]

                    # Fit the model
                    pipeline.fit(X_train_cv, y_train_cv)

                    # Evaluate on the validation set
                    y_val_pred = pipeline.predict(X_val_cv)
                    val_accuracy = accuracy_score(y_val_cv, y_val_pred)
                    val_f1 = f1_score(y_val_cv, y_val_pred)
                    val_precision = precision_score(y_val_cv, y_val_pred)
                    val_recall = recall_score(y_val_cv, y_val_pred)
                    # Note: ROC AUC is computed from hard class labels (predict), not from predicted probabilities
                    val_roc_auc = roc_auc_score(y_val_cv, y_val_pred)

                    # Store the results for this fold
                    cv_results.append({
                        'Val_Accuracy': val_accuracy,
                        'Val_F1': val_f1,
                        'Val_Precision': val_precision,
                        'Val_Recall': val_recall,
                        'Val_ROC_AUC': val_roc_auc,
                    })

                # Calculate mean metrics across all folds
                mean_cv_results = pd.DataFrame(cv_results).mean()

                # Fit the model on the entire training set
                pipeline.fit(X_train, y_train)

                # Evaluate on the training set (train metrics)
                y_train_pred = pipeline.predict(X_train)
                train_accuracy = accuracy_score(y_train, y_train_pred)
                train_f1 = f1_score(y_train, y_train_pred)
                train_precision = precision_score(y_train, y_train_pred)
                train_recall = recall_score(y_train, y_train_pred)
                train_roc_auc = roc_auc_score(y_train, y_train_pred)

                # Evaluate on the hold-out validation set
                y_val_final_pred = pipeline.predict(X_val)
                val_accuracy_final = accuracy_score(y_val, y_val_final_pred)
                val_f1_final = f1_score(y_val, y_val_final_pred)
                val_precision_final = precision_score(y_val, y_val_final_pred)
                val_recall_final = recall_score(y_val, y_val_final_pred)
                val_roc_auc_final = roc_auc_score(y_val, y_val_final_pred)

                # Append results to the list
                results_temp = {
                    'Fingerprint': filename,  # Use the filename for identification
                    'Dim_Reduction': dr_name,
                    'Classifier': clf_name,
                    'CV_Mean_Accuracy': mean_cv_results['Val_Accuracy'],
                    'CV_Mean_F1': mean_cv_results['Val_F1'],
                    'CV_Mean_Precision': mean_cv_results['Val_Precision'],
                    'CV_Mean_Recall': mean_cv_results['Val_Recall'],
                    'CV_Mean_ROC_AUC': mean_cv_results['Val_ROC_AUC'],
                    'Train_Accuracy': train_accuracy,
                    'Train_F1': train_f1,
                    'Train_Precision': train_precision,
                    'Train_Recall': train_recall,
                    'Train_ROC_AUC': train_roc_auc,
                    'Val_Accuracy': val_accuracy_final,
                    'Val_F1': val_f1_final,
                    'Val_Precision': val_precision_final,
                    'Val_Recall': val_recall_final,
                    'Val_ROC_AUC': val_roc_auc_final,
                }
                results_list.append(results_temp)
                print("\nResults:")
                print(results_temp)

# Create DataFrame from the results list
results_df = pd.DataFrame(results_list)

Processing fingerprint DataFrame: df_morgan.csv

Results:
{'Fingerprint': 'df_morgan.csv', 'Dim_Reduction': 'None', 'Classifier': 'SupportVectorMachine', 'CV_Mean_Accuracy': 0.8759118043169657, 'CV_Mean_F1': 0.879293349152217, 'CV_Mean_Precision': 0.8684204946524023, 'CV_Mean_Recall': 0.8911348267117498, 'CV_Mean_ROC_AUC': 0.875664300981193, 'Train_Accuracy': 0.9296497584541062, 'Train_F1': 0.9315712187958884, 'Train_Precision': 0.919953596287703, 'Train_Recall': 0.9434860202260559, 'Train_ROC_AUC': 0.9294376759622003, 'Val_Accuracy': 0.8987951807228916, 'Val_F1': 0.9027777777777778, 'Val_Precision': 0.8823529411764706, 'Val_Recall': 0.9241706161137441, 'Val_ROC_AUC': 0.8983598178607937}

Results:
{'Fingerprint': 'df_morgan.csv', 'Dim_Reduction': 'SelectKBest', 'Classifier': 'SupportVectorMachine', 'CV_Mean_Accuracy': 0.875313944600153, 'CV_Mean_F1': 0.8785680646751898, 'CV_Mean_Precision': 0.8685128336674237, 'CV_Mean_Recall': 0.8893632009016624, 'CV_Mean_ROC_AUC': 0.875087106959884, 
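
The two chained `train_test_split` calls (`test_size=0.10`, then `test_size=10/90` of the remainder) are meant to give an approximately 80/10/10 split. The sketch below reproduces the resulting set sizes. It assumes that each fingerprint file has 4142 rows, one per molecule in `dp.csv`, and that a fractional test size is rounded up, which is how I understand scikit-learn to behave. The helper `split_sizes` is hypothetical.

```python
import math

def split_sizes(n_samples, test_fraction):
    """Return (n_train, n_test), rounding the test share up (assumed behaviour)."""
    n_test = math.ceil(test_fraction * n_samples)
    return n_samples - n_test, n_test

n_total = 4142  # assumption: same row count as dp.csv
n_interim, n_test = split_sizes(n_total, 0.10)
n_train, n_val = split_sizes(n_interim, 10 / 90)

print(n_train, n_val, n_test)
```

These sizes agree with the recorded metrics: a `Val_Accuracy` of 0.898795 corresponds to 373 of 415 validation molecules classified correctly, and a `Train_Accuracy` of 0.929650 to 3079 of 3312.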

In [7]:
results_df

Unnamed: 0,Fingerprint,Dim_Reduction,Classifier,CV_Mean_Accuracy,CV_Mean_F1,CV_Mean_Precision,CV_Mean_Recall,CV_Mean_ROC_AUC,Train_Accuracy,Train_F1,Train_Precision,Train_Recall,Train_ROC_AUC,Val_Accuracy,Val_F1,Val_Precision,Val_Recall,Val_ROC_AUC
0,df_morgan.csv,,SupportVectorMachine,0.875912,0.879293,0.86842,0.891135,0.875664,0.92965,0.931571,0.919954,0.943486,0.929438,0.898795,0.902778,0.882353,0.924171,0.89836
1,df_morgan.csv,SelectKBest,SupportVectorMachine,0.875314,0.878568,0.868513,0.889363,0.875087,0.902476,0.904804,0.896612,0.913147,0.902312,0.896386,0.900693,0.878378,0.924171,0.895909
2,df_morgan.csv,PCA,SupportVectorMachine,0.873499,0.875779,0.874008,0.878061,0.873416,0.887379,0.889743,0.884254,0.8953,0.887258,0.896386,0.900232,0.881818,0.919431,0.89599
3,df_morgan.csv,,RandomForestClassifier,0.882851,0.885591,0.87815,0.893533,0.882686,0.991848,0.991948,0.994617,0.989292,0.991887,0.886747,0.889412,0.883178,0.895735,0.886593
4,df_morgan.csv,SelectKBest,RandomForestClassifier,0.873791,0.87586,0.875377,0.87687,0.873734,0.989432,0.989543,0.993998,0.985128,0.989498,0.898795,0.900943,0.896714,0.905213,0.898685
5,df_morgan.csv,PCA,RandomForestClassifier,0.865944,0.870224,0.856164,0.885193,0.865638,0.991546,0.991667,0.992257,0.991077,0.991553,0.886747,0.891455,0.869369,0.914692,0.886268
6,df_circular.csv,,SupportVectorMachine,0.880139,0.883731,0.870772,0.897682,0.879855,0.929046,0.930984,0.919374,0.942891,0.928834,0.898795,0.901869,0.889401,0.914692,0.898522
7,df_circular.csv,SelectKBest,SupportVectorMachine,0.869872,0.872587,0.86719,0.878649,0.869726,0.913949,0.915755,0.910106,0.921475,0.913834,0.906024,0.908665,0.898148,0.919431,0.905794
8,df_circular.csv,PCA,SupportVectorMachine,0.874104,0.877486,0.867543,0.888169,0.873875,0.888889,0.891765,0.881908,0.901844,0.88869,0.898795,0.902778,0.882353,0.924171,0.89836
9,df_circular.csv,,RandomForestClassifier,0.88074,0.883293,0.87763,0.889356,0.880601,0.999396,0.999405,0.998812,1.0,0.999387,0.913253,0.915888,0.903226,0.92891,0.912984
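
A note on the `ROC_AUC` columns: the loop passes the output of `pipeline.predict` (hard 0/1 labels) to `roc_auc_score`. With hard labels the ROC curve has a single operating point, and the area under it equals balanced accuracy, which would explain why the `ROC_AUC` values track the accuracy values so closely. The pure-Python sketch below illustrates this on made-up toy data. The functions `roc_auc` and `balanced_accuracy` are illustrative re-implementations, not the scikit-learn ones.

```python
def roc_auc(y_true, scores):
    """Rank-based AUC: chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def balanced_accuracy(y_true, y_pred):
    """Mean of the true-positive rate and the true-negative rate."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)

# Toy data (made up for illustration): true labels and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
proba = [0.9, 0.8, 0.55, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1 if p >= 0.5 else 0 for p in proba]  # thresholded at 0.5

auc_from_scores = roc_auc(y_true, proba)
auc_from_labels = roc_auc(y_true, labels)
bal_acc = balanced_accuracy(y_true, labels)

print(auc_from_scores, auc_from_labels, bal_acc)
```

To obtain a threshold-independent AUC in the notebook, one option would be to pass `pipeline.predict_proba(X)[:, 1]` to `roc_auc_score`. Both classifiers used here expose `predict_proba`, since the `SVC` is created with `probability=True`.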


In [8]:
results_df.sort_values(by=['Val_Accuracy'], ascending=False, inplace = True)
results_df

Unnamed: 0,Fingerprint,Dim_Reduction,Classifier,CV_Mean_Accuracy,CV_Mean_F1,CV_Mean_Precision,CV_Mean_Recall,CV_Mean_ROC_AUC,Train_Accuracy,Train_F1,Train_Precision,Train_Recall,Train_ROC_AUC,Val_Accuracy,Val_F1,Val_Precision,Val_Recall,Val_ROC_AUC
9,df_circular.csv,,RandomForestClassifier,0.88074,0.883293,0.87763,0.889356,0.880601,0.999396,0.999405,0.998812,1.0,0.999387,0.913253,0.915888,0.903226,0.92891,0.912984
7,df_circular.csv,SelectKBest,SupportVectorMachine,0.869872,0.872587,0.86719,0.878649,0.869726,0.913949,0.915755,0.910106,0.921475,0.913834,0.906024,0.908665,0.898148,0.919431,0.905794
10,df_circular.csv,SelectKBest,RandomForestClassifier,0.882853,0.883967,0.888464,0.879843,0.882898,0.998792,0.99881,0.99881,0.99881,0.998792,0.903614,0.906542,0.894009,0.919431,0.903343
0,df_morgan.csv,,SupportVectorMachine,0.875912,0.879293,0.86842,0.891135,0.875664,0.92965,0.931571,0.919954,0.943486,0.929438,0.898795,0.902778,0.882353,0.924171,0.89836
4,df_morgan.csv,SelectKBest,RandomForestClassifier,0.873791,0.87586,0.875377,0.87687,0.873734,0.989432,0.989543,0.993998,0.985128,0.989498,0.898795,0.900943,0.896714,0.905213,0.898685
6,df_circular.csv,,SupportVectorMachine,0.880139,0.883731,0.870772,0.897682,0.879855,0.929046,0.930984,0.919374,0.942891,0.928834,0.898795,0.901869,0.889401,0.914692,0.898522
8,df_circular.csv,PCA,SupportVectorMachine,0.874104,0.877486,0.867543,0.888169,0.873875,0.888889,0.891765,0.881908,0.901844,0.88869,0.898795,0.902778,0.882353,0.924171,0.89836
11,df_circular.csv,PCA,RandomForestClassifier,0.874704,0.877803,0.868589,0.887577,0.874499,0.999094,0.999108,0.998811,0.999405,0.999089,0.898795,0.902778,0.882353,0.924171,0.89836
1,df_morgan.csv,SelectKBest,SupportVectorMachine,0.875314,0.878568,0.868513,0.889363,0.875087,0.902476,0.904804,0.896612,0.913147,0.902312,0.896386,0.900693,0.878378,0.924171,0.895909
2,df_morgan.csv,PCA,SupportVectorMachine,0.873499,0.875779,0.874008,0.878061,0.873416,0.887379,0.889743,0.884254,0.8953,0.887258,0.896386,0.900232,0.881818,0.919431,0.89599


In [9]:
results_df.sort_values(by=['CV_Mean_Accuracy'], ascending=False, inplace = True)
results_df

Unnamed: 0,Fingerprint,Dim_Reduction,Classifier,CV_Mean_Accuracy,CV_Mean_F1,CV_Mean_Precision,CV_Mean_Recall,CV_Mean_ROC_AUC,Train_Accuracy,Train_F1,Train_Precision,Train_Recall,Train_ROC_AUC,Val_Accuracy,Val_F1,Val_Precision,Val_Recall,Val_ROC_AUC
10,df_circular.csv,SelectKBest,RandomForestClassifier,0.882853,0.883967,0.888464,0.879843,0.882898,0.998792,0.99881,0.99881,0.99881,0.998792,0.903614,0.906542,0.894009,0.919431,0.903343
3,df_morgan.csv,,RandomForestClassifier,0.882851,0.885591,0.87815,0.893533,0.882686,0.991848,0.991948,0.994617,0.989292,0.991887,0.886747,0.889412,0.883178,0.895735,0.886593
9,df_circular.csv,,RandomForestClassifier,0.88074,0.883293,0.87763,0.889356,0.880601,0.999396,0.999405,0.998812,1.0,0.999387,0.913253,0.915888,0.903226,0.92891,0.912984
6,df_circular.csv,,SupportVectorMachine,0.880139,0.883731,0.870772,0.897682,0.879855,0.929046,0.930984,0.919374,0.942891,0.928834,0.898795,0.901869,0.889401,0.914692,0.898522
0,df_morgan.csv,,SupportVectorMachine,0.875912,0.879293,0.86842,0.891135,0.875664,0.92965,0.931571,0.919954,0.943486,0.929438,0.898795,0.902778,0.882353,0.924171,0.89836
1,df_morgan.csv,SelectKBest,SupportVectorMachine,0.875314,0.878568,0.868513,0.889363,0.875087,0.902476,0.904804,0.896612,0.913147,0.902312,0.896386,0.900693,0.878378,0.924171,0.895909
11,df_circular.csv,PCA,RandomForestClassifier,0.874704,0.877803,0.868589,0.887577,0.874499,0.999094,0.999108,0.998811,0.999405,0.999089,0.898795,0.902778,0.882353,0.924171,0.89836
8,df_circular.csv,PCA,SupportVectorMachine,0.874104,0.877486,0.867543,0.888169,0.873875,0.888889,0.891765,0.881908,0.901844,0.88869,0.898795,0.902778,0.882353,0.924171,0.89836
4,df_morgan.csv,SelectKBest,RandomForestClassifier,0.873791,0.87586,0.875377,0.87687,0.873734,0.989432,0.989543,0.993998,0.985128,0.989498,0.898795,0.900943,0.896714,0.905213,0.898685
2,df_morgan.csv,PCA,SupportVectorMachine,0.873499,0.875779,0.874008,0.878061,0.873416,0.887379,0.889743,0.884254,0.8953,0.887258,0.896386,0.900232,0.881818,0.919431,0.89599


In [10]:
cv_stats = results_df['CV_Mean_Accuracy'].describe()
train_stats = results_df['Train_Accuracy'].describe()
val_stats = results_df['Val_Accuracy'].describe()

print('\nTrain cross validation accuracy\n')
print(cv_stats)
print('\nTrain accuracy\n')
print(train_stats)
print('\nValidation accuracy\n')
print(val_stats)


Train cross validation accuracy

count    12.000000
mean      0.875810
std       0.005123
min       0.865944
25%       0.873718
50%       0.875009
75%       0.880289
max       0.882853
Name: CV_Mean_Accuracy, dtype: float64

Train accuracy

count    12.000000
mean      0.951791
std       0.046993
min       0.887379
25%       0.911081
50%       0.959541
75%       0.993584
max       0.999396
Name: Train_Accuracy, dtype: float64

Validation accuracy

count    12.000000
mean      0.898594
std       0.007299
min       0.886747
25%       0.896386
50%       0.898795
75%       0.900000
max       0.913253
Name: Val_Accuracy, dtype: float64


In [11]:
results_df['val'] = (results_df['CV_Mean_Accuracy'] + results_df['Val_Accuracy']) /2
results_df['train_vs_val'] = (results_df['Train_Accuracy'] / results_df['CV_Mean_Accuracy']) -1

results_df.sort_values(by=['val'], ascending=False, inplace = True)
results_df

Unnamed: 0,Fingerprint,Dim_Reduction,Classifier,CV_Mean_Accuracy,CV_Mean_F1,CV_Mean_Precision,CV_Mean_Recall,CV_Mean_ROC_AUC,Train_Accuracy,Train_F1,Train_Precision,Train_Recall,Train_ROC_AUC,Val_Accuracy,Val_F1,Val_Precision,Val_Recall,Val_ROC_AUC,val,train_vs_val
9,df_circular.csv,,RandomForestClassifier,0.88074,0.883293,0.87763,0.889356,0.880601,0.999396,0.999405,0.998812,1.0,0.999387,0.913253,0.915888,0.903226,0.92891,0.912984,0.896997,0.134723
10,df_circular.csv,SelectKBest,RandomForestClassifier,0.882853,0.883967,0.888464,0.879843,0.882898,0.998792,0.99881,0.99881,0.99881,0.998792,0.903614,0.906542,0.894009,0.919431,0.903343,0.893234,0.131323
6,df_circular.csv,,SupportVectorMachine,0.880139,0.883731,0.870772,0.897682,0.879855,0.929046,0.930984,0.919374,0.942891,0.928834,0.898795,0.901869,0.889401,0.914692,0.898522,0.889467,0.055568
7,df_circular.csv,SelectKBest,SupportVectorMachine,0.869872,0.872587,0.86719,0.878649,0.869726,0.913949,0.915755,0.910106,0.921475,0.913834,0.906024,0.908665,0.898148,0.919431,0.905794,0.887948,0.050671
0,df_morgan.csv,,SupportVectorMachine,0.875912,0.879293,0.86842,0.891135,0.875664,0.92965,0.931571,0.919954,0.943486,0.929438,0.898795,0.902778,0.882353,0.924171,0.89836,0.887353,0.061351
11,df_circular.csv,PCA,RandomForestClassifier,0.874704,0.877803,0.868589,0.887577,0.874499,0.999094,0.999108,0.998811,0.999405,0.999089,0.898795,0.902778,0.882353,0.924171,0.89836,0.88675,0.142208
8,df_circular.csv,PCA,SupportVectorMachine,0.874104,0.877486,0.867543,0.888169,0.873875,0.888889,0.891765,0.881908,0.901844,0.88869,0.898795,0.902778,0.882353,0.924171,0.89836,0.886449,0.016915
4,df_morgan.csv,SelectKBest,RandomForestClassifier,0.873791,0.87586,0.875377,0.87687,0.873734,0.989432,0.989543,0.993998,0.985128,0.989498,0.898795,0.900943,0.896714,0.905213,0.898685,0.886293,0.132345
1,df_morgan.csv,SelectKBest,SupportVectorMachine,0.875314,0.878568,0.868513,0.889363,0.875087,0.902476,0.904804,0.896612,0.913147,0.902312,0.896386,0.900693,0.878378,0.924171,0.895909,0.88585,0.031031
2,df_morgan.csv,PCA,SupportVectorMachine,0.873499,0.875779,0.874008,0.878061,0.873416,0.887379,0.889743,0.884254,0.8953,0.887258,0.896386,0.900232,0.881818,0.919431,0.89599,0.884942,0.015891
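
As a worked example of the two derived columns, the sketch below recomputes `val` and `train_vs_val` for two rows of the table, using the rounded values shown above, so the results match the table only up to rounding. Note that `train_vs_val`, despite its name, compares the training accuracy with the cross-validation mean, as defined in cell In [11]. The function name `derived_metrics` is hypothetical.

```python
def derived_metrics(cv_acc, train_acc, val_acc):
    """Recompute the 'val' and 'train_vs_val' columns from cell In [11]."""
    val = (cv_acc + val_acc) / 2          # mean of CV and hold-out validation accuracy
    train_vs_val = train_acc / cv_acc - 1  # relative gap between train and CV accuracy
    return val, train_vs_val

# Row 9: df_circular.csv, no dimensionality reduction, RandomForestClassifier
rf_val, rf_gap = derived_metrics(0.880740, 0.999396, 0.913253)

# Row 6: df_circular.csv, no dimensionality reduction, SupportVectorMachine
svm_val, svm_gap = derived_metrics(0.880139, 0.929046, 0.898795)

print(f"RF : val={rf_val:.4f}, gap={rf_gap:.1%}")
print(f"SVM: val={svm_val:.4f}, gap={svm_gap:.1%}")
```

The random forest scores slightly higher on `val`, but its training accuracy exceeds the cross-validation mean by about 13%, compared with about 6% for the SVM, which points to stronger overfitting.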
