# Example Notebook for classifier finder

## 0. set log-level of sam_ml library 

(e.g. for debugging)

NOTE: this has to happen before the sam_ml library is imported

In [1]:
import os
os.environ["SAM_ML_LOG_LEVEL"] = "info"
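The ordering matters if the level is read from the environment only once, when the library is first imported. That is an assumption about sam_ml's behaviour; the note above only says the variable must be set first. A minimal sketch of the pattern, using a hypothetical `build_logger` helper and a hypothetical `DEMO_LOG_LEVEL` variable rather than anything from sam_ml:

```python
import logging
import os


def build_logger(name, env_var="DEMO_LOG_LEVEL"):
    """Hypothetical helper: read the log level from the environment once."""
    level_name = os.environ.get(env_var, "warning").upper()
    logger = logging.getLogger(name)
    logger.setLevel(getattr(logging, level_name, logging.WARNING))
    return logger


os.environ["DEMO_LOG_LEVEL"] = "debug"       # set before the "import"
early_logger = build_logger("demo.early")    # stands in for module import time

os.environ["DEMO_LOG_LEVEL"] = "error"       # changed afterwards: no effect
level_after_change = logging.getLevelName(early_logger.level)
print(level_after_change)
```

The logger keeps the level it saw when it was built, which is why a later change to the variable goes unnoticed.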

## 1. libraries

In [2]:
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from sam_ml.models import CTest, LR

## 2. data

In [3]:
iris = load_iris()  # Bunch object with .data, .target and .feature_names
y = pd.Series(iris.target)
X = pd.DataFrame(iris.data, columns=iris.feature_names)
x_train, x_test, y_train, y_test = train_test_split(X, y, train_size=0.80, random_state=42)

## 3. model

### 3.1. create tester class object

CTest is an AutoML class. You can use it to compare different models and find the best one for your data.

**models**: a list of *Classifier* subclass objects, *'all'* (all integrated classifier wrapper classes), or *'basic'* (a smaller selection of basic classifiers)

**vectorizer**, **scaler**, **selector**, **sampler**: the CTest constructor wraps each given model in a *Pipeline* object configured with these data class parameters
(see the *iris_pipeline.ipynb* notebook for the possible parameters)

In [4]:
tester = CTest("all", scaler="minmax")
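To make the `scaler="minmax"` setting concrete, here is a minimal pure-Python sketch of min-max scaling to the range (0, 1) without clipping, matching the `feature_range=(0, 1)` and `clip=False` values that appear in the pipeline output. The helper names `fit_min_max` and `transform_min_max` are illustrative and not part of sam_ml; how the library implements its scaler internally is not shown in this notebook.

```python
def fit_min_max(rows):
    """Learn the per-column minimum and maximum from the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform_min_max(rows, mins, maxs):
    """Rescale each column with the training statistics (no clipping)."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


train_rows = [[5.0, 3.0], [6.0, 2.0], [7.0, 4.0]]
test_rows = [[6.5, 5.0]]

mins, maxs = fit_min_max(train_rows)
train_scaled = transform_min_max(train_rows, mins, maxs)
test_scaled = transform_min_max(test_rows, mins, maxs)
print(train_scaled)
print(test_scaled)
```

The statistics come from the training rows only, so a test value outside the training range can land outside (0, 1) when clipping is off.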

Get all models in the CTest object:

In [5]:
tester.models

{'LogisticRegression': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=LogisticRegression(random_state=42), model_name='LogisticRegression (pipeline)'),
 'QuadraticDiscriminantAnalysis': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=QuadraticDiscriminantAnalysis(), model_name='QuadraticDiscriminantAnalysis (pipeline)'),
 'LinearDiscriminantAnalysis': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=LinearDiscriminantAnalysis(), model_name='LinearDiscriminantAnalysis (pipeline)'),
 'MLP Classifier': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=MLPClassifier(random_state=42), model_name='MLP Classifier (pipeline)'),
 'Line

You can add models:

In [6]:
tester.add_model(LR(model_name="LogisticRegression (elasticnet penalty)", penalty="elasticnet", solver="saga", l1_ratio=0.5))

In [7]:
tester.models

{'LogisticRegression': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=LogisticRegression(random_state=42), model_name='LogisticRegression (pipeline)'),
 'QuadraticDiscriminantAnalysis': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=QuadraticDiscriminantAnalysis(), model_name='QuadraticDiscriminantAnalysis (pipeline)'),
 'LinearDiscriminantAnalysis': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=LinearDiscriminantAnalysis(), model_name='LinearDiscriminantAnalysis (pipeline)'),
 'MLP Classifier': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=MLPClassifier(random_state=42), model_name='MLP Classifier (pipeline)'),
 'Line

You can remove models:

In [8]:
tester.remove_model("LogisticRegression (elasticnet penalty)")

In [9]:
tester.models

{'LogisticRegression': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=LogisticRegression(random_state=42), model_name='LogisticRegression (pipeline)'),
 'QuadraticDiscriminantAnalysis': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=QuadraticDiscriminantAnalysis(), model_name='QuadraticDiscriminantAnalysis (pipeline)'),
 'LinearDiscriminantAnalysis': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=LinearDiscriminantAnalysis(), model_name='LinearDiscriminantAnalysis (pipeline)'),
 'MLP Classifier': Pipeline(vectorizer=None, scaler=Scaler(scaler='minmax', clip=False, copy=True, feature_range=(0, 1), ), selector=None, sampler=None, model=MLPClassifier(random_state=42), model_name='MLP Classifier (pipeline)'),
 'Line

## 3.1. evaluation of the models

CTest implements three ways to evaluate the models. Which one to use depends on the dataset.

### 3.1.1. one-vs-all cross validation

**Concept:**

The model is trained on all datapoints except one and then tested on the one that was left out (this scheme is commonly known as leave-one-out cross validation). This is repeated for every datapoint, so that we end up with a prediction for each datapoint.

**Advantage:** optimal use of information for training

**Disadvantage:** long train time

This concept is very useful for small datasets (datapoints < 150): the training time stays manageable, and when the model has little information to learn from, it is important to use as much of it as possible for training.
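The procedure described above (often called leave-one-out cross validation) can be sketched in a few lines of plain Python. The toy data and the `nearest_centroid_predict` classifier are illustrative stand-ins, not sam_ml code; the point is only the loop structure that trains on all datapoints but one.

```python
def leave_one_out_splits(n_samples):
    """Yield (train_indices, test_index) pairs, one per datapoint."""
    for test_index in range(n_samples):
        train_indices = [i for i in range(n_samples) if i != test_index]
        yield train_indices, test_index


def nearest_centroid_predict(x_train, y_train, x):
    """Toy 1-D classifier: pick the class whose training mean is closest to x."""
    sums, counts = {}, {}
    for value, label in zip(x_train, y_train):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return min(sums, key=lambda label: abs(sums[label] / counts[label] - x))


X_toy = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
y_toy = [0, 0, 0, 1, 1, 1]

predictions = []
for train_idx, test_idx in leave_one_out_splits(len(X_toy)):
    x_tr = [X_toy[i] for i in train_idx]
    y_tr = [y_toy[i] for i in train_idx]
    predictions.append(nearest_centroid_predict(x_tr, y_tr, X_toy[test_idx]))

accuracy = sum(p == t for p, t in zip(predictions, y_toy)) / len(y_toy)
print(predictions, accuracy)
```

With n datapoints the model is fitted n times, which is where the long train time comes from.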

In [10]:
tester.eval_models_cv(X, y, avg="macro", small_data_eval=True)
tester.output_scores_as_pd(sort_by="recall", console_out=False)

Crossvalidation:   0%|          | 0/21 [00:00<?, ?it/s]

LogisticRegression (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

QuadraticDiscriminantAnalysis (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

LinearDiscriminantAnalysis (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

MLP Classifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

LinearSupportVectorClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

DecisionTreeClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

RandomForestClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

SupportVectorClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

GradientBoostingMachine (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

AdaBoostClassifier (DTC based) (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

AdaBoostClassifier (RFC based) (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

AdaBoostClassifier (mixed based) (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

KNeighborsClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

ExtraTreesClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

GaussianNB (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

BernoulliNB (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

GaussianProcessClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

BaggingClassifier (DTC based) (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

BaggingClassifier (RFC based) (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

BaggingClassifier (mixed based) (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

XGBClassifier (pipeline):   0%|          | 0/150 [00:00<?, ?it/s]

| | accuracy | precision | recall | s_score | l_score | train_score | train_time |
|---|---|---|---|---|---|---|---|
| LinearDiscriminantAnalysis | 0.98 | 0.980125 | 0.98 | 0.9904373 | 1.0 | 0.98 | 0:00:00 |
| QuadraticDiscriminantAnalysis | 0.973333 | 0.973825 | 0.973333 | 0.9894085 | 1.0 | 0.980045 | 0:00:00 |
| BaggingClassifier (DTC based) | 0.96 | 0.96 | 0.96 | 0.9874448 | 1.0 | 0.999911 | 0:00:00 |
| AdaBoostClassifier (RFC based) | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 1.0 | 0:00:00 |
| KNeighborsClassifier | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 0.960045 | 0:00:00 |
| BaggingClassifier (RFC based) | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 0.988456 | 0:00:00 |
| GaussianNB | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 0.959418 | 0:00:00 |
| RandomForestClassifier | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 1.0 | 0:00:00 |
| SupportVectorClassifier | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 0.979732 | 0:00:00 |
| GradientBoostingMachine | 0.953333 | 0.953448 | 0.953333 | 0.9861117 | 1.0 | 1.0 | 0:00:00 |


### 3.1.2. multiple split crossvalidation

Performs **cv_num** splits and averages the scores over all splits to evaluate each model.
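As a rough illustration of splitting into `cv_num` parts and averaging, the sketch below shuffles the indices, cuts them into folds, scores a majority-class baseline on each held-out fold, and takes the mean. The fold construction, the baseline, and the helper names are assumptions made for this example; the notebook does not show how sam_ml builds its splits.

```python
import random


def k_fold_indices(n_samples, cv_num, seed=42):
    """Shuffle the indices and cut them into cv_num folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::cv_num] for i in range(cv_num)]


def mean_cv_score(labels, cv_num):
    """Average the accuracy of a majority-class baseline over all folds."""
    folds = k_fold_indices(len(labels), cv_num)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_labels = [labels[i] for i in range(len(labels)) if i not in held_out]
        majority = max(set(train_labels), key=train_labels.count)
        hits = sum(labels[i] == majority for i in test_idx)
        scores.append(hits / len(test_idx))
    return folds, sum(scores) / len(scores)


toy_labels = [0] * 14 + [1] * 6
folds, mean_score = mean_cv_score(toy_labels, cv_num=5)
print(len(folds), mean_score)
```

Every datapoint is held out exactly once, but each model is trained only `cv_num` times instead of once per datapoint.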

In [11]:
tester.eval_models_cv(X, y, avg="macro", small_data_eval=False, cv_num=10)
tester.output_scores_as_pd(sort_by="recall", console_out=False)

Crossvalidation:   0%|          | 0/21 [00:00<?, ?it/s]

| | accuracy | precision | recall | s_score | l_score | train_score | train_time |
|---|---|---|---|---|---|---|---|
| LinearDiscriminantAnalysis | 0.966667 | 0.85 | 0.833333 | 0.698512 | 0.7 | 0.979259 | 0:00:00 |
| QuadraticDiscriminantAnalysis | 0.966667 | 0.85 | 0.833333 | 0.698512 | 0.7 | 0.982222 | 0:00:00 |
| SupportVectorClassifier | 0.953333 | 0.8 | 0.776667 | 0.599006 | 0.6 | 0.979259 | 0:00:00 |
| AdaBoostClassifier (RFC based) | 0.946667 | 0.8 | 0.773333 | 0.599004 | 0.6 | 1.0 | 0:00:00 |
| RandomForestClassifier | 0.946667 | 0.8 | 0.773333 | 0.599004 | 0.6 | 1.0 | 0:00:00 |
| KNeighborsClassifier | 0.946667 | 0.8 | 0.773333 | 0.599005 | 0.6 | 0.964444 | 0:00:00 |
| ExtraTreesClassifier | 0.946667 | 0.8 | 0.773333 | 0.599005 | 0.6 | 1.0 | 0:00:00 |
| GradientBoostingMachine | 0.926667 | 0.8 | 0.763333 | 0.598997 | 0.6 | 1.0 | 0:00:00 |
| BaggingClassifier (DTC based) | 0.92 | 0.8 | 0.76 | 0.598951 | 0.6 | 0.992593 | 0:00:00 |
| DecisionTreeClassifier | 0.953333 | 0.766667 | 0.743333 | 0.499746 | 0.5 | 1.0 | 0:00:00 |


### 3.1.3. evaluate on given train-test-split

Sometimes a dataset can only sensibly be split in one way, which makes cross validation pointless.
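One situation where a single fixed split is the natural choice is time-ordered data, where the model should be tested only on records that come after the ones it was trained on. This is offered as an illustrative assumption; the notebook does not name a specific case. A minimal sketch with a hypothetical `chronological_split` helper:

```python
def chronological_split(records, train_fraction=0.8):
    """Sort by timestamp and cut once: earlier records train, later ones test."""
    ordered = sorted(records, key=lambda r: r["timestamp"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Toy records in arbitrary order; labels are placeholders.
toy_records = [
    {"timestamp": t, "label": t % 2}
    for t in [5, 1, 9, 3, 7, 2, 10, 4, 8, 6]
]

train_records, test_records = chronological_split(toy_records)
latest_train = max(r["timestamp"] for r in train_records)
earliest_test = min(r["timestamp"] for r in test_records)
print(len(train_records), len(test_records), latest_train, earliest_test)
```

Shuffled cross validation would mix future records into the training folds here, so the resulting train and test sets are better passed to an evaluation on one given split.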

In [12]:
tester.eval_models(x_train, y_train, x_test, y_test, avg="macro")
tester.output_scores_as_pd(sort_by="recall", console_out=False)

Crossvalidation:   0%|          | 0/21 [00:00<?, ?it/s]

| | accuracy | precision | recall | s_score | l_score | train_score | train_time |
|---|---|---|---|---|---|---|---|
| AdaBoostClassifier (RFC based) | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 1.0 | 0:00:00 |
| GradientBoostingMachine | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 1.0 | 0:00:00 |
| BaggingClassifier (RFC based) | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0.966667 | 0:00:00 |
| BaggingClassifier (DTC based) | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 1.0 | 0:00:00 |
| GaussianNB | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0.95 | 0:00:00 |
| ExtraTreesClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 1.0 | 0:00:00 |
| KNeighborsClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0.958333 | 0:00:00 |
| AdaBoostClassifier (DTC based) | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0.966667 | 0:00:00 |
| XGBClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 1.0 | 0:00:00 |
| SupportVectorClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0.975 | 0:00:00 |


### 3.2. find best model

**Idea:**

The find_best_model_randomCV method runs a randomCVsearch for every model type to find its best hyperparameters and then compares the results across model types. To save time, each randomCVsearch should test only a few parameter sets per model type.
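The idea of a small random search per model type followed by a comparison can be sketched as below. Everything here is a toy stand-in: the two "model types", their score functions, and the helper names are hypothetical, and the sketch ranks model types by their search score. It is not sam_ml's implementation, which may rank the tuned models differently (for example on the test split).

```python
import random


def random_search(score_fn, param_space, n_trials, rng):
    """Sample n_trials parameter sets and return (best_score, best_params)."""
    best_score, best_params = float("-inf"), None
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


def find_best_model_type(model_types, n_trials=5, seed=42):
    """Run one small random search per model type, then compare the winners."""
    rng = random.Random(seed)
    results = {
        name: random_search(score_fn, space, n_trials, rng)
        for name, (score_fn, space) in model_types.items()
    }
    best_name = max(results, key=lambda name: results[name][0])
    return best_name, results


# Toy score functions standing in for a cross-validated metric.
toy_model_types = {
    "model_a": (lambda p: 1.0 - abs(p["C"] - 1.0) / 10.0,
                {"C": lambda r: r.uniform(0.0, 10.0)}),
    "model_b": (lambda p: 0.5 + 0.04 * p["depth"],
                {"depth": lambda r: r.randint(1, 10)}),
}

best_name, results = find_best_model_type(toy_model_types)
print(best_name, results[best_name])
```

Keeping `n_trials` small bounds the total cost at roughly the number of model types times the number of trials.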

**Useful parameters:**

- you can change the cross validation used with **small_data_eval**; if *small_data_eval=False*, you can set the number of CV splits with **cv_num**

- with the **scoring** parameter you choose which metric is used to search for the best model (you can use **avg**, **secondary_scoring**, **strength**, and **pos_label** to specify it further)

- you can change the number of parameter sets tested for each model type with **n_trails**

- with **leave_loadbar** you control whether the loading bars of the per-model-type randomCVsearches disappear once they have finished
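For the `scoring="recall"` and `avg="macro"` combination used in this notebook, the sketch below computes a macro-averaged recall by hand: recall is calculated per class and the per-class values are averaged with equal weight. This assumes **avg** has the usual meaning of macro averaging; the helper `macro_recall` and the toy labels are illustrative only.

```python
def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recall values."""
    recalls = []
    for label in sorted(set(y_true)):
        true_positives = sum(
            1 for t, p in zip(y_true, y_pred) if t == label and p == label
        )
        support = sum(1 for t in y_true if t == label)
        recalls.append(true_positives / support)
    return sum(recalls) / len(recalls)


y_true_demo = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 0, 0, 0, 1, 0, 2, 1]

recall_macro = macro_recall(y_true_demo, y_pred_demo)
accuracy_demo = sum(t == p for t, p in zip(y_true_demo, y_pred_demo)) / len(y_true_demo)
print(recall_macro, accuracy_demo)
```

Here the per-class recalls are 1.0, 0.5, and 0.5, so the macro recall (about 0.667) is lower than the accuracy (0.75) because the smaller classes count as much as the large one.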

In [13]:
scores = tester.find_best_model_randomCV(x_train, y_train, x_test, y_test, scoring="recall", avg="macro", small_data_eval=False)

randomCVsearch:   0%|          | 0/21 [00:00<?, ?it/s]

randomCVsearch (LogisticRegression (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:34,878 - sam_ml.models.ClassifierTest - INFO - LogisticRegression (pipeline) - score: 0.9506172839506172 (recall) - parameters: {'C': 63.512210106407046, 'penalty': 'l2', 'solver': 'saga'}


randomCVsearch (QuadraticDiscriminantAnalysis (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:35,018 - sam_ml.models.ClassifierTest - INFO - QuadraticDiscriminantAnalysis (pipeline) - score: 0.9598765432098766 (recall) - parameters: {'reg_param': 0.0}


randomCVsearch (LinearDiscriminantAnalysis (pipeline)):   0%|          | 0/4 [00:00<?, ?it/s]

2023-06-27 12:34:35,126 - sam_ml.models.ClassifierTest - INFO - LinearDiscriminantAnalysis (pipeline) - score: 0.9814814814814815 (recall) - parameters: {'solver': 'svd'}


randomCVsearch (MLP Classifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:35,607 - sam_ml.models.ClassifierTest - INFO - MLP Classifier (pipeline) - score: 0.9363298738298736 (recall) - parameters: {'activation': 'relu', 'alpha': 0.0001, 'hidden_layer_sizes': (100,), 'learning_rate': 'constant', 'solver': 'adam'}


randomCVsearch (LinearSupportVectorClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:35,831 - sam_ml.models.ClassifierTest - INFO - LinearSupportVectorClassifier (pipeline) - score: 0.9907407407407408 (recall) - parameters: {'C': 635.1221010640695, 'dual': True, 'penalty': 'l2'}


randomCVsearch (DecisionTreeClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:35,967 - sam_ml.models.ClassifierTest - INFO - DecisionTreeClassifier (pipeline) - score: 0.9517195767195767 (recall) - parameters: {'criterion': 'entropy', 'max_depth': 9, 'min_samples_leaf': 5, 'min_samples_split': 3}


randomCVsearch (RandomForestClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:36,623 - sam_ml.models.ClassifierTest - INFO - RandomForestClassifier (pipeline) - score: 0.9573412698412698 (recall) - parameters: {'bootstrap': True, 'criterion': 'gini', 'max_depth': 7, 'min_samples_leaf': 4, 'min_samples_split': 4, 'n_estimators': 164}


randomCVsearch (SupportVectorClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:36,940 - sam_ml.models.ClassifierTest - INFO - SupportVectorClassifier (pipeline) - score: 0.9560185185185185 (recall) - parameters: {'C': 24.81040974867808, 'gamma': 0.29154431891537513, 'kernel': 'sigmoid', 'probability': True}


randomCVsearch (GradientBoostingMachine (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:38,103 - sam_ml.models.ClassifierTest - INFO - GradientBoostingMachine (pipeline) - score: 0.9529320987654321 (recall) - parameters: {'criterion': 'friedman_mse', 'learning_rate': 0.009470976192691145, 'loss': 'deviance', 'max_depth': 8, 'max_features': 'auto', 'min_samples_leaf': 13, 'min_samples_split': 10, 'n_estimators': 256, 'subsample': 0.7195154778955838}


randomCVsearch (AdaBoostClassifier (DTC based) (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:39,526 - sam_ml.models.ClassifierTest - INFO - AdaBoostClassifier (DTC based) (pipeline) - score: 0.9529320987654321 (recall) - parameters: {'algorithm': 'SAMME', 'estimator': DecisionTreeClassifier(max_depth=2), 'learning_rate': 1.1666347719377983, 'n_estimators': 871}


randomCVsearch (AdaBoostClassifier (RFC based) (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:43,914 - sam_ml.models.ClassifierTest - INFO - AdaBoostClassifier (RFC based) (pipeline) - score: 0.9529320987654321 (recall) - parameters: {'algorithm': 'SAMME.R', 'estimator': RandomForestClassifier(max_depth=8), 'learning_rate': 1.4075303082606903, 'n_estimators': 131}


randomCVsearch (AdaBoostClassifier (mixed based) (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:47,303 - sam_ml.models.ClassifierTest - INFO - AdaBoostClassifier (mixed based) (pipeline) - score: 0.8192239858906526 (recall) - parameters: {'algorithm': 'SAMME.R', 'estimator': LogisticRegression(), 'learning_rate': 1.0, 'n_estimators': 50}


randomCVsearch (KNeighborsClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:47,558 - sam_ml.models.ClassifierTest - INFO - KNeighborsClassifier (pipeline) - score: 0.9542548500881834 (recall) - parameters: {'leaf_size': 19, 'n_neighbors': 5, 'p': 4, 'weights': 'distance'}


randomCVsearch (ExtraTreesClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:48,218 - sam_ml.models.ClassifierTest - INFO - ExtraTreesClassifier (pipeline) - score: 0.9529320987654321 (recall) - parameters: {'bootstrap': False, 'criterion': 'gini', 'max_depth': 5, 'min_samples_leaf': 1, 'min_samples_split': 2, 'n_estimators': 100}


randomCVsearch (GaussianNB (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:48,444 - sam_ml.models.ClassifierTest - INFO - GaussianNB (pipeline) - score: 0.9380511463844797 (recall) - parameters: {'var_smoothing': 1e-09}


randomCVsearch (BernoulliNB (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:48,572 - sam_ml.models.ClassifierTest - INFO - BernoulliNB (pipeline) - score: 0.3694885361552028 (recall) - parameters: {'binarize': 0, 'fit_prior': True}


randomCVsearch (GaussianProcessClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:48,802 - sam_ml.models.ClassifierTest - INFO - GaussianProcessClassifier (pipeline) - score: 0.9321496404829738 (recall) - parameters: {'max_iter_predict': 47, 'multi_class': 'one_vs_one'}


randomCVsearch (BaggingClassifier (DTC based) (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:34:49,822 - sam_ml.models.ClassifierTest - INFO - BaggingClassifier (DTC based) (pipeline) - score: 0.9529320987654321 (recall) - parameters: {'bootstrap': True, 'bootstrap_features': False, 'estimator': DecisionTreeClassifier(max_depth=4), 'max_features': 1.0, 'max_samples': 1.0, 'n_estimators': 10}


randomCVsearch (BaggingClassifier (RFC based) (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:35:00,466 - sam_ml.models.ClassifierTest - INFO - BaggingClassifier (RFC based) (pipeline) - score: 0.9449955908289241 (recall) - parameters: {'bootstrap': True, 'bootstrap_features': False, 'estimator': RandomForestClassifier(max_depth=4), 'max_features': 1.0, 'max_samples': 1.0, 'n_estimators': 10}


randomCVsearch (BaggingClassifier (mixed based) (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:35:02,549 - sam_ml.models.ClassifierTest - INFO - BaggingClassifier (mixed based) (pipeline) - score: 0.9542548500881834 (recall) - parameters: {'bootstrap': True, 'bootstrap_features': True, 'estimator': KNeighborsClassifier(), 'max_features': 4, 'max_samples': 0.373818018663584, 'n_estimators': 205}


randomCVsearch (XGBClassifier (pipeline)):   0%|          | 0/5 [00:00<?, ?it/s]

2023-06-27 12:35:03,014 - sam_ml.models.ClassifierTest - INFO - XGBClassifier (pipeline) - score: 0.9459876543209876 (recall) - parameters: {'colsample_bytree': 1.0, 'gamma': 0.0, 'learning_rate': 0.1, 'max_depth': 6, 'min_child_weight': 1, 'n_estimators': 100, 'reg_alpha': 0, 'reg_lambda': 1.0}
2023-06-27 12:35:03,083 - sam_ml.models.ClassifierTest - INFO - best model type LogisticRegression - recall: 1.0 - parameters: {'C': 63.512210106407046, 'penalty': 'l2', 'solver': 'saga'}


In [14]:
tester.output_scores_as_pd(sort_by=["recall", "train_time"], console_out=False)

| | accuracy | precision | recall | s_score | l_score | train_time | train_score | best_score (rCVs) | best_hyperparameters (rCVs) |
|---|---|---|---|---|---|---|---|---|---|
| LogisticRegression | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.966667 | 0.950617 | {'C': 63.512210106407046, 'penalty': 'l2', 'so... |
| LinearDiscriminantAnalysis | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.975 | 0.981481 | {'solver': 'svd'} |
| DecisionTreeClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.95 | 0.95172 | {'criterion': 'entropy', 'max_depth': 9, 'min_... |
| RandomForestClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.966667 | 0.957341 | {'bootstrap': True, 'criterion': 'gini', 'max_... |
| GradientBoostingMachine | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.958333 | 0.952932 | {'criterion': 'friedman_mse', 'learning_rate':... |
| AdaBoostClassifier (RFC based) | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 1.0 | 0.952932 | {'algorithm': 'SAMME.R', 'estimator': RandomFo... |
| KNeighborsClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 1.0 | 0.954255 | {'leaf_size': 19, 'n_neighbors': 5, 'p': 4, 'w... |
| ExtraTreesClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.983333 | 0.952932 | {'bootstrap': False, 'criterion': 'gini', 'max... |
| GaussianNB | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.95 | 0.938051 | {'var_smoothing': 1e-09} |
| GaussianProcessClassifier | 1.0 | 1.0 | 1.0 | 0.9926 | 1.0 | 0:00:00 | 0.941667 | 0.93215 | {'max_iter_predict': 47, 'multi_class': 'one_v... |
