# Overview

**GENERAL THOUGHTS:**  
Use PyCaret (`pycaret.classification`) as a general way to investigate which algorithms, combined with automated pre-processing, are well suited to the given task, and to estimate the achievable performance across a large variety of model hyper-parameters.
The notebook covers multiple scenarios of using PyCaret:
- with and without custom data pre-processing (see below)
- with automatic pre-processing by PyCaret
- with multiple classifiers, using:
  - multiple ML algorithms with the base hyper-parameter configuration defined within PyCaret
  - "standard" HPO for each algorithm, with the search space defined by PyCaret and random search as the search strategy, using 15 random hyper-parameter configurations per algorithm. Search spaces: https://github.com/pycaret/pycaret/blob/master/pycaret/containers/models/classification.py
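
The random-search strategy mentioned above can be sketched in plain Python. This is only an illustration of the idea: the search space below is hypothetical and much smaller than the ones PyCaret defines in its model containers, and `sample_configurations` is an illustrative helper, not a PyCaret function.

```python
import random

# Hypothetical search space for a decision tree; PyCaret's real search
# spaces are defined in its model containers and differ from this toy one.
SEARCH_SPACE = {
    "max_depth": [2, 4, 8, 16, None],
    "min_samples_leaf": [1, 2, 5, 10],
    "criterion": ["gini", "entropy"],
}


def sample_configurations(space, n_iter, seed):
    """Draw n_iter random configurations, one value per hyper-parameter."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in space.items()}
        for _ in range(n_iter)
    ]


# 15 random configurations per algorithm, as described in the text above.
configs = sample_configurations(SEARCH_SPACE, n_iter=15, seed=42)
print(len(configs))
print(configs[0])
```

Each sampled configuration would then be scored with cross-validation, and the configuration with the best mean score kept. Random search may draw the same configuration more than once when the space is small.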

**CUSTOM DATA PREPROCESSING:**

Imbalanced data:
- over_sampling for imbalanced data
- cost-sensitive learning for imbalanced data

**PyCaret MULTI-CLASS CLASSIFIERS:**
Class weights are not considered during training when using `compare_models` or `tune_model`. `f1_macro` was chosen as the evaluation metric because it weights all classes equally. Since training is not optimized for this aspect, results could likely be improved by training individual models with `create_model`, which supports class weights. For the comparison, class weights are neglected to keep the use of PyCaret simple. The effect of class weights can vary strongly depending on the machine learning algorithm (e.g. splits in decision trees, distance calculation in KNN).
- Overview of models to be considered using PyCaret:  
  - [X] RandomForest
  - [X] ExtraTrees
  - [X] XGBoost
  - [X] LightGBM
  - [X] KNeighbors
  - [X] CatBoost
  - [X] Decision Tree Classifier
  - [X] Gradient Boosting Classifier
  - [X] Extreme Gradient Boosting
  - [X] CatBoost Classifier
  - [X] Extra Trees Classifier
  - [X] Random Forest Classifier
  - [X] K Neighbors Classifier
  - [X] Linear Discriminant Analysis
  - [X] Ridge Classifier
  - [X] Naive Bayes
  - [X] Quadratic Discriminant Analysis
  - [X] Ada Boost Classifier
  - [X] Light Gradient Boosting Machine
  - [X] Logistic Regression
  - [X] SVM - Linear Kernel
  - [X] Dummy Classifier
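
As a rough illustration of the class weights discussed above, the following sketch computes "balanced" weights in plain Python. It uses the same heuristic as scikit-learn's `class_weight="balanced"`, namely `n_samples / (n_classes * class_count)`. The class counts are toy values, and `balanced_class_weights` is an illustrative helper rather than part of PyCaret or scikit-learn.

```python
def balanced_class_weights(class_counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(class_counts.values())
    n_classes = len(class_counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in class_counts.items()
    }


# Toy class counts (assumption for illustration, not the notebook's data).
toy_counts = {"frequent": 900, "medium": 90, "rare": 10}
weights = balanced_class_weights(toy_counts)

for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

Rare classes receive proportionally larger weights, so every class contributes the same total weight to the loss. A dictionary like this is what the commented-out `create_model('lr', class_weight=...)` call further down in the notebook would receive; whether a given estimator accepts it depends on the algorithm.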

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import mlflow

import os
from datetime import datetime
import yaml
import json
import copy

In [2]:
import sklearn
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder, OneHotEncoder, MinMaxScaler, StandardScaler
from sklearn.preprocessing import PowerTransformer
from sklearn.metrics import classification_report, f1_score
from sklearn.utils import class_weight
from sklearn.utils.class_weight import compute_sample_weight

import imblearn
from imblearn.over_sampling import RandomOverSampler

import pycaret
# import only the OOP experiment API used in this notebook
from pycaret.classification import ClassificationExperiment

# check installed version
pycaret.__version__

'3.3.0'

In [3]:
# import custom functions
import sys
sys.path.append('/Users/dat/Library/CloudStorage/OneDrive-foryouandyourcustomers/GitHub/AutomatedPackagingCategories_Showcase/ml_packaging_classification/src')
import utils

In [4]:
# General settings within the data science workflow

pd.set_option('display.max_columns', None)

SEED = 42

# NOTE: for dev only
subsample = False
subsample_size = 100  # subsample subset of data for faster demo or development


# Get current date and time
now = datetime.now()
# Format date and time
formatted_date_time = now.strftime("%Y-%m-%d_%H:%M:%S")
print(formatted_date_time)

2024-04-03_16:23:13


# Load and prepare data

In [5]:
df = pd.read_csv('../../data/output/df_ml.csv', sep='\t')

df['material_number'] = df['material_number'].astype('object')

df_sub = df[[
    'material_number',
    'brand',
    'product_area',
    'core_segment',
    'component',
    'manufactoring_location',
    'characteristic_value',
    'material_weight', 
    'packaging_code',
    'packaging_category',
]]

## Transform to PyCaret data format
When you execute the `setup` function, PyCaret splits the data into train and test sets (70/30 by default; this notebook sets `train_size=0.8`, i.e. 80/20). Cross-validation is then done on the train set only.
The hold-out set serves as an additional, independent check of the cross-validated results.
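
The split sizes can be checked with a little arithmetic. The sketch below assumes the train share is rounded down, which is consistent with the train and test shapes that `setup` reports for this data set (82977 rows); `split_sizes` is an illustrative helper, not a PyCaret function.

```python
import math


def split_sizes(n_rows, train_size):
    """Return (n_train, n_test), assuming the train share is rounded down."""
    n_train = math.floor(n_rows * train_size)
    return n_train, n_rows - n_train


# train_size=0.8 as used in this notebook
print(split_sizes(82977, 0.8))
# PyCaret's default train_size=0.7, for comparison
print(split_sizes(82977, 0.7))
```

With `fold=5`, each cross-validation fold then holds roughly one fifth of the training rows, and the test rows are never seen during cross-validation.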

In [None]:
df_sub.head()

Unnamed: 0,material_number,brand,product_area,core_segment,component,manufactoring_location,characteristic_value,material_weight,packaging_code,packaging_category
0,75116293,BOT,PA5,Metal Grinding,6035765C21,Distribution Center,CORRUGATED,85.0,PCode_304109,Countertop display
1,75116293,BOT,PA5,Metal Grinding,6035940565,Distribution Center,WOOD FREE,0.54,PCode_440854,Countertop display
2,75116293,BOT,PA5,Metal Grinding,6035822768,Distribution Center,MCB/GT2,22.9,PCode_834649,Countertop display
3,75116293,BOT,PA5,Metal Grinding,6035822768,Distribution Center,MCB/GT2,22.9,PCode_834649,Countertop display
4,75116293,BOT,PA5,Metal Grinding,6035765P54,Distribution Center,CORRUGATED,85.0,PCode_304109,Countertop display


# PyCaret AutoML: without custom pre-processing; unrestricted selection of models including HPO

## PyCaret Base Models Training Pipeline

In [8]:
# init the ClassificationExperiment class
exp_base = ClassificationExperiment()

print(f"Experiment Type: {type(exp_base)}") # check the type of exp

Experiment Type: <class 'pycaret.classification.oop.ClassificationExperiment'>


In [9]:
# init setup on exp
exp_base.setup(
    df_sub,
    target='packaging_category',
    train_size=0.8,
    fold=5,
    fold_strategy='stratifiedkfold',
    session_id=123
)

Unnamed: 0,Description,Value
0,Session id,123
1,Target,packaging_category
2,Target type,Multiclass
3,Target mapping,"Blister and Insert Card: 0, Blister and sealed blist: 1, Book packaging: 2, Cardb. Sleeve w - w/o Shr.: 3, Cardboard hanger w/o bag: 4, Carton cover (Lid box): 5, Carton tube with or w/o: 6, Case: 7, Corrugated carton: 8, Countertop display: 9, Envelope: 10, Fabric packaging: 11, Folding carton: 12, Hanger/ Clip: 13, Metal Cassette: 14, Paperboard pouch: 15, Plastic Box: 16, Plastic Cassette: 17, Plastic Pouch: 18, Plastic bag with header: 19, Shrink film and insert o: 20, Skincard: 21, TightPack: 22, Trap Card: 23, Trap Folding Card: 24, Tray Packer: 25, Tube: 26, Unpacked: 27, Wooden box: 28"
4,Original data shape,"(82977, 10)"
5,Transformed data shape,"(82977, 65)"
6,Transformed train set shape,"(66381, 65)"
7,Transformed test set shape,"(16596, 65)"
8,Numeric features,1
9,Categorical features,8


<pycaret.classification.oop.ClassificationExperiment at 0x17f8364d0>

In [10]:
# add sklearn f1_score macro average
exp_base.add_metric(id='f1_macro', name='F1_Macro', score_func=utils.f1_score_macro, greater_is_better=True)
exp_base.remove_metric('MCC')
exp_base.remove_metric('Kappa')
exp_base.get_metrics()

Unnamed: 0_level_0,Name,Display Name,Score Function,Scorer,Target,Args,Greater is Better,Multiclass,Custom
ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
acc,Accuracy,Accuracy,<function accuracy_score at 0x17c557420>,accuracy,pred,{},True,True,False
auc,AUC,AUC,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(roc_auc_score, response_method='pr...",pred_proba,"{'average': 'weighted', 'multi_class': 'ovr'}",True,True,False
recall,Recall,Recall,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(recall_score, response_method='pre...",pred,{'average': 'weighted'},True,True,False
precision,Precision,Prec.,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(precision_score, response_method='...",pred,{'average': 'weighted'},True,True,False
f1,F1,F1,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(f1_score, response_method='predict...",pred,{'average': 'weighted'},True,True,False
f1_macro,F1_Macro,F1_Macro,<pycaret.internal.metrics.EncodedDecodedLabels...,"make_scorer(f1_score_macro, response_method='p...",pred,{},True,True,True
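
The helper `utils.f1_score_macro` is not shown in this notebook; it presumably wraps scikit-learn's `f1_score` with `average='macro'` (an assumption). To illustrate why the macro average is stricter than the weighted average on imbalanced data, here is a small pure-Python sketch with toy labels. The function names are illustrative only.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """F1 per label, computed as 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def f1_macro(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def f1_weighted(y_true, y_pred):
    """Support-weighted mean of per-class F1: frequent classes dominate."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in support) / len(y_true)


# Toy example: the minority class "b" is never predicted.
y_true = ["a"] * 8 + ["b"] * 2
y_pred = ["a"] * 10

print(round(f1_macro(y_true, y_pred), 4))     # 0.4444
print(round(f1_weighted(y_true, y_pred), 4))  # 0.7111
```

A classifier that ignores the minority class still reaches a weighted F1 of about 0.71 here, while the macro F1 drops to about 0.44. This is the reason for sorting by `F1_Macro` rather than `F1` in the comparisons.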


In [11]:
# train and compare base models

# #NOTE: class_weights are not yet supported in compare_models
# class_weights = class_weight.compute_class_weight(
#     class_weight="balanced",
#     classes=np.unique(df_sub.iloc[:, -1]),
#     y=df_sub.iloc[:, -1]
# )
# class_weight_dict = dict(enumerate(class_weights))
# basemodels = exp_base.compare_models(include=['dt', 'rf'], sort='F1_Macro', fit_kwargs={'class_weight': class_weight_dict})
# #FIXME: To be removed, only for testing of using class weights. Class weights are not supported yet by compare_models and tune_models
# lr_clf = exp_base.create_model('lr', class_weight=class_weight_dict)

base_models = exp_base.compare_models(sort='F1_Macro', n_select=exp_base.models().shape[0])

Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro,TT (Sec)
dt,Decision Tree Classifier,0.8483,0.0,0.8483,0.9836,0.9069,0.8136,0.47
rf,Random Forest Classifier,0.911,0.0,0.911,0.9291,0.9152,0.7616,0.72


In [12]:
leaderboard_base = exp_base.get_leaderboard()
leaderboard_base.sort_values(by='F1_Macro', ascending=False)

Unnamed: 0_level_0,Model Name,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
0,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.8483,0.0,0.8483,0.9836,0.9069,0.8136
1,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.911,0.0,0.911,0.9291,0.9152,0.7616


## PyCaret Tuned Models Training Pipeline

In [13]:
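# NOTE: this is an alias, not a copy; tuning results are logged to the same
# experiment (and therefore the same leaderboard) as the base models
exp_tuned = exp_base

print(f"Experiment Type: {type(exp_tuned)}") # check the type of exp

# NOTE: Uncomment to define a separate PyCaret experiment for tuning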
# # init the ClassificationExperiment class
# exp_tuned = ClassificationExperiment()
# print(f"Experiment Type: {type(exp_tuned)}") # check the type of exp
# # init setup on exp
# exp_tuned.setup(
#     df_train,
#     target='packaging_category',
#     train_size=0.8,
#     fold=5,
#     fold_strategy='stratifiedkfold',
#     session_id=456
# )
# # add sklearn f1_score macro average
# exp_tuned.add_metric(id='f1_macro', name='F1_Macro', score_func=utils.f1_score_macro, greater_is_better=True)
# exp_tuned.remove_metric('MCC')
# exp_tuned.remove_metric('Kappa')
# exp_tuned.get_metrics()

Experiment Type: <class 'pycaret.classification.oop.ClassificationExperiment'>


In [14]:
# Use the previously created base models (with a pre-defined hyper-parameter set) and tune them with a pre-defined hyper-parameter search space

# basemodels = exp_base.compare_models(sort='F1_Macro') # define base models

tuned_models = []
for model in base_models:
    print(f"##### Model Algorithm: {model.__class__} #####")
    tuned_model = exp_tuned.tune_model(
        estimator=model,
        optimize='F1_Macro',
        search_library='scikit-learn',
        search_algorithm='random',
        n_iter=20,
    )
    tuned_models.append(tuned_model)  # Append the tuned model to the list
    print("\n")

##### Model Algorithm: <class 'sklearn.tree._classes.DecisionTreeClassifier'> #####


Unnamed: 0_level_0,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Fold,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
0,0.2711,0.0,0.2711,0.1188,0.1564,0.0405
1,0.2126,0.0,0.2126,0.0728,0.0965,0.0273
2,0.2127,0.0,0.2127,0.0735,0.0969,0.0275
3,0.2118,0.0,0.2118,0.073,0.0962,0.0272
4,0.2117,0.0,0.2117,0.0726,0.096,0.0271
Mean,0.224,0.0,0.224,0.0822,0.1084,0.0299
Std,0.0236,0.0,0.0236,0.0183,0.024,0.0053


Fitting 5 folds for each of 2 candidates, totalling 10 fits
Original model was better than the tuned model, hence it will be returned. NOTE: The display metrics are for the tuned model (not the original one).


##### Model Algorithm: <class 'sklearn.ensemble._forest.RandomForestClassifier'> #####


Unnamed: 0_level_0,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Fold,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
0,0.7829,0.0,0.7829,0.8401,0.789,0.4652
1,0.7793,0.0,0.7793,0.8388,0.7875,0.4657
2,0.7785,0.0,0.7785,0.8362,0.7851,0.4581
3,0.785,0.0,0.785,0.8463,0.7964,0.4714
4,0.785,0.0,0.785,0.8682,0.7938,0.4856
Mean,0.7822,0.0,0.7822,0.8459,0.7904,0.4692
Std,0.0028,0.0,0.0028,0.0116,0.0041,0.0092


Fitting 5 folds for each of 2 candidates, totalling 10 fits
Original model was better than the tuned model, hence it will be returned. NOTE: The display metrics are for the tuned model (not the original one).
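
The message above reflects a simple selection rule: the tuned model is only returned if it beats the base model on the optimized metric. In PyCaret this appears to correspond to the `choose_better` option of `tune_model`. A minimal sketch of that rule, using the mean cross-validated `F1_Macro` values reported in this notebook, could look as follows. `choose_better` here is an illustrative function, and keeping the base model on ties is an assumption.

```python
def choose_better(base_score, tuned_score, greater_is_better=True):
    """Return which model to keep; the base model is kept on ties (assumption)."""
    if greater_is_better:
        return "tuned" if tuned_score > base_score else "base"
    return "tuned" if tuned_score < base_score else "base"


# Mean cross-validated F1_Macro values reported in this notebook.
results = {
    "Decision Tree Classifier": {"base": 0.8136, "tuned": 0.0299},
    "Random Forest Classifier": {"base": 0.7616, "tuned": 0.4692},
}

for name, scores in results.items():
    kept = choose_better(scores["base"], scores["tuned"])
    print(f"{name}: keep {kept} model")
```

Both tuned candidates score far below their base models, so the base models are returned, while the fold tables printed above still show the metrics of the discarded tuned candidates.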




## PyCaret Best Model Evaluation

In [15]:
leaderboard_tuned = exp_tuned.get_leaderboard()
leaderboard_tuned.sort_values(by='F1_Macro', ascending=False)

Unnamed: 0_level_0,Model Name,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
0,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.8483,0.0,0.8483,0.9836,0.9069,0.8136
3,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.8483,0.0,0.8483,0.9836,0.9069,0.8136
1,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.911,0.0,0.911,0.9291,0.9152,0.7616
5,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.911,0.0,0.911,0.9291,0.9152,0.7616
4,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.7822,0.0,0.7822,0.8459,0.7904,0.4692
2,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.224,0.0,0.224,0.0822,0.1084,0.0299


In [16]:
# returns best model based on the defined metric in the given pycaret experiment
best_model = exp_tuned.automl(optimize='F1_Macro')

# predict on test set
holdout_pred = exp_tuned.predict_model(best_model)

# show predictions df
# holdout_pred.head()

# print classification report for holdout test data
print(classification_report(holdout_pred['packaging_category'], holdout_pred['prediction_label']))

Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
0,Decision Tree Classifier,0.9025,0.9508,0.9025,0.9839,0.9386,0.8376


                            precision    recall  f1-score   support

   Blister and Insert Card       0.97      0.92      0.94      1749
  Blister and sealed blist       0.97      0.93      0.95      1582
            Book packaging       1.00      0.50      0.67         2
Cardb. Sleeve w - w/o Shr.       0.95      0.84      0.89       135
  Cardboard hanger w/o bag       0.91      0.88      0.89        80
    Carton cover (Lid box)       0.97      0.88      0.92       130
   Carton tube with or w/o       1.00      0.67      0.80         9
                      Case       0.91      0.79      0.85        97
         Corrugated carton       0.99      0.92      0.96       774
        Countertop display       0.96      0.90      0.93        30
                  Envelope       1.00      0.98      0.99        59
          Fabric packaging       1.00      1.00      1.00         3
            Folding carton       1.00      0.86      0.92      1644
              Hanger/ Clip       1.00      0.94

# PyCaret AutoML: with custom pre-processing; unrestricted selection of models including HPO

## Define features and target, perform oversampling, split data into train and test

In [17]:
# Define features and target
X = df_sub.iloc[:, :-1]
y = df_sub.iloc[:, -1]  # the last column is the target

In [18]:
distribution_classes = y.value_counts()
print('Class distribution before oversampling')
print(distribution_classes.to_dict())

# NOTE: Oversample so that each class has at least 100 samples, to properly apply CV and evaluation
dict_oversampling = {
    'Metal Cassette': 100,
    'Carton tube with or w/o': 100,
    'Wooden box': 100,
    'Fabric packaging': 100,
    'Book packaging': 100
}
# define oversampling strategy
oversampler = RandomOverSampler(sampling_strategy=dict_oversampling, random_state=SEED)
# fit and apply the transform
X_oversample, y_oversample = oversampler.fit_resample(X, y)

distribution_classes = y_oversample.value_counts()
print('\n')
print('Class distribution after oversampling')
print(distribution_classes.to_dict())

Class distribution before oversampling
{'Hanger/ Clip': 13543, 'Tube': 11687, 'Blister and Insert Card': 8744, 'TightPack': 8296, 'Folding carton': 8219, 'Blister and sealed blist': 7912, 'Corrugated carton': 3872, 'Paperboard pouch': 3478, 'Trap Folding Card': 2188, 'Plastic Pouch': 1904, 'Plastic bag with header': 1850, 'Plastic Cassette': 1708, 'Shrink film and insert o': 1499, 'Plastic Box': 1491, 'Unpacked': 1415, 'Skincard': 1143, 'Trap Card': 804, 'Cardb. Sleeve w - w/o Shr.': 676, 'Carton cover (Lid box)': 652, 'Case': 485, 'Tray Packer': 431, 'Cardboard hanger w/o bag': 400, 'Envelope': 295, 'Countertop display': 150, 'Metal Cassette': 50, 'Carton tube with or w/o': 44, 'Wooden box': 16, 'Fabric packaging': 15, 'Book packaging': 10}


Class distribution after oversampling
{'Hanger/ Clip': 13543, 'Tube': 11687, 'Blister and Insert Card': 8744, 'TightPack': 8296, 'Folding carton': 8219, 'Blister and sealed blist': 7912, 'Corrugated carton': 3872, 'Paperboard pouch': 3478, 'Trap 

In [19]:
# Generate data set for PyCaret
df_sub_oversampled = pd.concat([X_oversample, y_oversample], axis=1)
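
Because the oversampled frame is handed to PyCaret before its internal train/test split, duplicated rows of a minority class can end up on both sides of the split, which may inflate hold-out scores for those classes. The following pure-Python sketch shows the alternative ordering (split first, then oversample only the training part) on toy data. It is not the notebook's method; `oversample_to_minimum` and the every-fifth-row split are illustrative stand-ins for `RandomOverSampler` and a stratified split.

```python
import random
from collections import Counter


def oversample_to_minimum(rows, labels, min_count, seed):
    """Randomly duplicate rows so that each class has at least min_count rows."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    out_rows, out_labels = list(rows), list(labels)
    for label, members in by_class.items():
        for _ in range(max(0, min_count - len(members))):
            out_rows.append(rng.choice(members))
            out_labels.append(label)
    return out_rows, out_labels


# Toy data: row ids 0..109, with a rare class of 10 rows.
labels = ["common"] * 100 + ["rare"] * 10
rows = list(range(len(labels)))

# 1) Hold out every fifth row first (simple stand-in for a stratified split).
test_rows = list(range(0, len(rows), 5))
test_set = set(test_rows)
train_rows = [r for r in rows if r not in test_set]
train_labels = [labels[r] for r in train_rows]

# 2) Oversample the training part only.
train_rows_os, train_labels_os = oversample_to_minimum(
    train_rows, train_labels, min_count=50, seed=42
)

print(Counter(train_labels_os))
print(set(train_rows_os) & test_set)  # empty: no test row leaks into training
```

PyCaret's `setup` accepts a `test_data` argument, which may be one way to pass such a separately held-out set instead of relying on the internal split.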

## PyCaret Base Models Training Pipeline

In [20]:
# init the ClassificationExperiment class
exp_base_custom = ClassificationExperiment()

print(f"Experiment Type: {type(exp_base_custom)}") # check the type of exp

Experiment Type: <class 'pycaret.classification.oop.ClassificationExperiment'>


In [21]:
# init setup on exp
exp_base_custom.setup(
    df_sub_oversampled,
    target='packaging_category',
    train_size=0.8,
    fold=5,
    fold_strategy='stratifiedkfold',
    session_id=456
)

Unnamed: 0,Description,Value
0,Session id,123
1,Target,packaging_category
2,Target type,Multiclass
3,Target mapping,"Blister and Insert Card: 0, Blister and sealed blist: 1, Book packaging: 2, Cardb. Sleeve w - w/o Shr.: 3, Cardboard hanger w/o bag: 4, Carton cover (Lid box): 5, Carton tube with or w/o: 6, Case: 7, Corrugated carton: 8, Countertop display: 9, Envelope: 10, Fabric packaging: 11, Folding carton: 12, Hanger/ Clip: 13, Metal Cassette: 14, Paperboard pouch: 15, Plastic Box: 16, Plastic Cassette: 17, Plastic Pouch: 18, Plastic bag with header: 19, Shrink film and insert o: 20, Skincard: 21, TightPack: 22, Trap Card: 23, Trap Folding Card: 24, Tray Packer: 25, Tube: 26, Unpacked: 27, Wooden box: 28"
4,Original data shape,"(83342, 10)"
5,Transformed data shape,"(83342, 66)"
6,Transformed train set shape,"(66673, 66)"
7,Transformed test set shape,"(16669, 66)"
8,Numeric features,1
9,Categorical features,8


<pycaret.classification.oop.ClassificationExperiment at 0x3034be590>

In [22]:
# add sklearn f1_score macro average
exp_base_custom.add_metric(id='f1_macro', name='F1_Macro', score_func=utils.f1_score_macro, greater_is_better=True)
exp_base_custom.remove_metric('MCC')
exp_base_custom.remove_metric('Kappa')
exp_base_custom.get_metrics()

Unnamed: 0_level_0,Name,Display Name,Score Function,Scorer,Target,Args,Greater is Better,Multiclass,Custom
ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1
acc,Accuracy,Accuracy,<function accuracy_score at 0x17c557420>,accuracy,pred,{},True,True,False
auc,AUC,AUC,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(roc_auc_score, response_method='pr...",pred_proba,"{'average': 'weighted', 'multi_class': 'ovr'}",True,True,False
recall,Recall,Recall,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(recall_score, response_method='pre...",pred,{'average': 'weighted'},True,True,False
precision,Precision,Prec.,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(precision_score, response_method='...",pred,{'average': 'weighted'},True,True,False
f1,F1,F1,<pycaret.internal.metrics.BinaryMulticlassScor...,"make_scorer(f1_score, response_method='predict...",pred,{'average': 'weighted'},True,True,False
f1_macro,F1_Macro,F1_Macro,<pycaret.internal.metrics.EncodedDecodedLabels...,"make_scorer(f1_score_macro, response_method='p...",pred,{},True,True,True


In [23]:
# train and compare base models

# #NOTE: class_weights are not yet supported in compare_models
# class_weights = class_weight.compute_class_weight(
#     class_weight="balanced",
#     classes=np.unique(df_sub.iloc[:, -1]),
#     y=df_sub.iloc[:, -1]
# )
# class_weight_dict = dict(enumerate(class_weights))
# basemodels = exp_base.compare_models(include=['dt', 'rf'], sort='F1_Macro', fit_kwargs={'class_weight': class_weight_dict})
# #FIXME: To be removed, only for testing of using class weights. Class weights are not supported yet by compare_models and tune_models
# lr_clf = exp_base.create_model('lr', class_weight=class_weight_dict)

base_models = exp_base_custom.compare_models(sort='F1_Macro', n_select=exp_base_custom.models().shape[0])

Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro,TT (Sec)
dt,Decision Tree Classifier,0.8509,0.0,0.8509,0.9851,0.9088,0.8433,0.236
rf,Random Forest Classifier,0.9107,0.0,0.9107,0.9288,0.9143,0.839,0.57


In [24]:
leaderboard_base_custom = exp_base_custom.get_leaderboard()
leaderboard_base_custom.sort_values(by='F1_Macro', ascending=False)

Unnamed: 0_level_0,Model Name,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
0,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.8509,0.0,0.8509,0.9851,0.9088,0.8433
1,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.9107,0.0,0.9107,0.9288,0.9143,0.839


## PyCaret Tuned Models Training Pipeline

In [25]:
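# NOTE: this is an alias, not a copy; tuning results are logged to the same
# experiment (and therefore the same leaderboard) as the base models
exp_tuned_custom = exp_base_custom

print(f"Experiment Type: {type(exp_tuned_custom)}") # check the type of exp

# NOTE: Uncomment to define a separate PyCaret experiment for tuning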
# # init the ClassificationExperiment class
# exp_tuned_custom = ClassificationExperiment()
# print(f"Experiment Type: {type(exp_tuned_custom)}") # check the type of exp
# # init setup on exp
# exp_tuned_custom.setup(
#     df_train,
#     target='packaging_category',
#     train_size=0.8,
#     fold=5,
#     fold_strategy='stratifiedkfold',
#     session_id=456
# )
# # add sklearn f1_score macro average
# exp_tuned_custom.add_metric(id='f1_macro', name='F1_Macro', score_func=utils.f1_score_macro, greater_is_better=True)
# exp_tuned_custom.remove_metric('MCC')
# exp_tuned_custom.remove_metric('Kappa')
# exp_tuned_custom.get_metrics()

Experiment Type: <class 'pycaret.classification.oop.ClassificationExperiment'>


In [26]:
# Use the previously created base models (with a pre-defined hyper-parameter set) and tune them with a pre-defined hyper-parameter search space

# basemodels = exp_base.compare_models(sort='F1_Macro') # define base models

tuned_models = []
for model in base_models:
    print(f"##### Model Algorithm: {model.__class__} #####")
    tuned_model = exp_tuned_custom.tune_model(
        estimator=model,
        optimize='F1_Macro',
        search_library='scikit-learn',
        search_algorithm='random',
        n_iter=20,
    )
    tuned_models.append(tuned_model)  # Append the tuned model to the list
    print("\n")

##### Model Algorithm: <class 'sklearn.tree._classes.DecisionTreeClassifier'> #####


Unnamed: 0_level_0,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Fold,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
0,0.3211,0.0,0.3211,0.2127,0.2416,0.0655
1,0.1625,0.0,0.1625,0.0264,0.0454,0.0096
2,0.2124,0.0,0.2124,0.0725,0.0963,0.0274
3,0.1625,0.0,0.1625,0.0264,0.0454,0.0096
4,0.1624,0.0,0.1624,0.0264,0.0454,0.0096
Mean,0.2042,0.0,0.2042,0.0729,0.0948,0.0244
Std,0.0616,0.0,0.0616,0.0722,0.076,0.0217


Fitting 5 folds for each of 2 candidates, totalling 10 fits
Original model was better than the tuned model, hence it will be returned. NOTE: The display metrics are for the tuned model (not the original one).


##### Model Algorithm: <class 'sklearn.ensemble._forest.RandomForestClassifier'> #####


Unnamed: 0_level_0,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Fold,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
0,0.7831,0.0,0.7831,0.835,0.7862,0.4625
1,0.7744,0.0,0.7744,0.8324,0.7821,0.4583
2,0.7771,0.0,0.7771,0.8323,0.7864,0.4684
3,0.7667,0.0,0.7667,0.8266,0.7733,0.4514
4,0.78,0.0,0.78,0.8429,0.7879,0.4746
Mean,0.7763,0.0,0.7763,0.8338,0.7831,0.463
Std,0.0056,0.0,0.0056,0.0053,0.0053,0.008


Fitting 5 folds for each of 2 candidates, totalling 10 fits
Original model was better than the tuned model, hence it will be returned. NOTE: The display metrics are for the tuned model (not the original one).




## PyCaret Best Model Evaluation

In [27]:
leaderboard_tuned_custom = exp_tuned_custom.get_leaderboard()
leaderboard_tuned_custom.sort_values(by='F1_Macro', ascending=False)

Unnamed: 0_level_0,Model Name,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
Index,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
0,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.8509,0.0,0.8509,0.9851,0.9088,0.8433
3,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.8509,0.0,0.8509,0.9851,0.9088,0.8433
1,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.9107,0.0,0.9107,0.9288,0.9143,0.839
5,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.9107,0.0,0.9107,0.9288,0.9143,0.839
4,Random Forest Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.7763,0.0,0.7763,0.8338,0.7831,0.463
2,Decision Tree Classifier,"(TransformerWrapperWithInverse(exclude=None, i...",0.2042,0.0,0.2042,0.0729,0.0948,0.0244


In [28]:
# returns best model based on the defined metric in the given pycaret experiment
best_model = exp_tuned_custom.automl(optimize='F1_Macro')

# predict on test set
holdout_pred = exp_tuned_custom.predict_model(best_model)

# show predictions df
# holdout_pred.head()

# print classification report for holdout test data
print(classification_report(holdout_pred['packaging_category'], holdout_pred['prediction_label']))

Unnamed: 0,Model,Accuracy,AUC,Recall,Prec.,F1,F1_Macro
0,Decision Tree Classifier,0.9028,0.9509,0.9028,0.9847,0.9385,0.8784


                            precision    recall  f1-score   support

   Blister and Insert Card       0.98      0.90      0.94      1749
  Blister and sealed blist       0.95      0.95      0.95      1582
            Book packaging       1.00      1.00      1.00        20
Cardb. Sleeve w - w/o Shr.       0.97      0.84      0.90       135
  Cardboard hanger w/o bag       1.00      0.91      0.95        80
    Carton cover (Lid box)       0.99      0.93      0.96       130
   Carton tube with or w/o       0.83      0.95      0.88        20
                      Case       0.95      0.81      0.88        97
         Corrugated carton       1.00      0.91      0.95       774
        Countertop display       0.97      0.97      0.97        30
                  Envelope       1.00      0.90      0.95        59
          Fabric packaging       1.00      1.00      1.00        20
            Folding carton       1.00      0.86      0.93      1644
              Hanger/ Clip       1.00      0.94