### Faculdade de Engenharia Industrial - FEI

### Centro Universitário da Fundação Educacional Inaciana "Padre Sabóia de Medeiros" (FEI)


*FEI's Stricto Sensu Graduate Program in Electrical Engineering*

Concentration area: ARTIFICIAL INTELLIGENCE APPLIED TO AUTOMATION AND ROBOTICS

Master's thesis student Andre Luiz Florentino

***

## Check for GPU

In [None]:
import tensorflow as tf
print(tf.__version__)

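# Named physical_devices (not pd) so it cannot clash with the pandas alias imported later
physical_devices = tf.config.list_physical_devices()
for device in physical_devices:
    print(device)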
print('------------------------------------------------------------------------------------------')


print(tf.config.list_physical_devices('GPU'))
# [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

print(tf.test.is_built_with_cuda())
# True (when TensorFlow was built with CUDA support)

print(tf.test.gpu_device_name())
# /device:GPU:0

#gvd = tf.config.get_visible_devices()
for j in tf.config.get_visible_devices():
    print(j)
# PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')
# PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')

# Chapter 9: Convolutional Neural Network (2D)

***

- The paper entitled "ESC-ConvNet: Environmental Sound Classification with Convolutional Neural Networks" (PICZAK, 2015) serves as the foundation for the following analysis. Convolutional Neural Networks (CNNs) are commonly employed for image classification, taking fixed-dimension images that consist of multiple channels (such as RGB for color images). The network applies successive stages of convolution, pooling, and fully connected layers, and ultimately outputs class probabilities for the given image. To replicate this approach with sound clips, log-scaled mel-spectrograms and their respective deltas are extracted from each sound clip and used instead of the raw amplitude vs. time signal. To meet the fixed-size input requirement, the sound clips are split into 60x41 segments (60 bands and 41 frames, a windowing technique). The log-scaled mel-spectrograms are extracted from all the recordings, which were resampled to 22050 Hz and normalized, using a window size of 1024, a hop length of 512, and 60 mel-bands.

- The human auditory system perceives sound on a logarithmic scale, which makes closely spaced frequencies difficult to distinguish. This effect becomes more pronounced as the frequency increases. Therefore, only the power within different frequency bands is considered. The mel-spectrograms and their corresponding deltas are then stacked as two channels and fed into the CNN for analysis.

- While exploring the files, it was noted that each sound is 5 seconds long and sometimes contains silent periods. To obtain a more representative "sound image", the "extract_feature" methods trim the silent periods of each sound and duplicate it, effectively doubling its trimmed length (augmented audio). The aforementioned features, along with the class labels, are then calculated and appended to arrays.

- As a final result, each audio file is transformed into a spectrogram image consisting of 60 bands, 41 frames, and 2 channels.
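
The segmentation into fixed-size inputs can be sketched as follows. This is a minimal illustration, not the notebook's `extract_feature` implementation: `segment_spectrogram` is a hypothetical helper, the 50% overlap between windows is an assumption, the random array stands in for a real log-scaled mel-spectrogram (216 frames, roughly 5 s at 22050 Hz with a hop length of 512), and the first-order difference is only a simple stand-in for a proper delta feature such as `librosa.feature.delta`.

```python
import numpy as np

N_BANDS  = 60   # mel bands per frame
N_FRAMES = 41   # frames per segment


def segment_spectrogram(log_mel, n_frames=N_FRAMES, overlap=0.5):
    """Split a (bands, time) spectrogram into (n_segments, bands, n_frames, 2) windows.

    Channel 0 holds the log-mel values, channel 1 a first-order difference
    along time (a stand-in for a delta feature).
    """
    hop = max(1, int(n_frames * (1 - overlap)))
    # Simple delta stand-in: difference between consecutive frames
    delta = np.diff(log_mel, axis=1, prepend=log_mel[:, :1])

    segments = []
    for start in range(0, log_mel.shape[1] - n_frames + 1, hop):
        window = np.stack([log_mel[:, start:start + n_frames],
                           delta[:, start:start + n_frames]], axis=-1)
        segments.append(window)

    if not segments:
        return np.empty((0, log_mel.shape[0], n_frames, 2), dtype=np.float32)
    return np.asarray(segments, dtype=np.float32)


# Placeholder for the log-mel spectrogram of one clip: 60 bands x 216 frames
rng = np.random.default_rng(0)
fake_log_mel = rng.normal(size=(N_BANDS, 216))

segments = segment_spectrogram(fake_log_mel)
print(segments.shape)   # (9, 60, 41, 2)
```

Each of the 9 segments has the 60 x 41 x 2 layout described above and would inherit the class label of the clip it was cut from.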

## Import modules

In [None]:
import librosa
import librosa.display
import os
import warnings
import pickle
import itertools
import mimetypes
import time

import pandas     as pd
import seaborn    as sns
import numpy      as np

from matplotlib  import pyplot  as plt
from keras       import backend as K

from tqdm                        import tqdm

from sklearn                     import metrics
from sklearn.model_selection     import train_test_split
from sklearn.metrics             import confusion_matrix, classification_report

from tensorflow                  import keras
from tensorflow.keras.models     import Sequential, load_model
from tensorflow.keras.layers     import Dense, Dropout, Flatten, InputLayer, Conv2D
from tensorflow.keras.layers     import MaxPooling2D, BatchNormalization, Activation

from keras.wrappers.scikit_learn import KerasClassifier
from keras.callbacks             import ModelCheckpoint, EarlyStopping
from keras.optimizers            import SGD
from keras.constraints           import maxnorm

warnings.filterwarnings('ignore')

pd.set_option('display.max_columns', 12)
pd.set_option('display.width', 300)
pd.set_option('display.max_colwidth', 120)

cmap_cm   = plt.cm.Blues

In [None]:
# Globals
current_path = os.getcwd()

# For the picture names
pic_first_name = '09_CNN_2D_'

# For Librosa
FRAME_SIZE  = 1024
HOP_LENGTH  = 512
SEED        = 1000
SR          = 22050

## Loading the dataset

In [None]:
# Select the dataset

opcD = 0
while opcD not in (1, 2, 3, 4):
    print()
    print("1-) ESC-10")
    print("2-) BDLib2")
    print("3-) US8K")
    print("4-) US8K_AV")

    opcD = input("\nSelect the dataset: ")
    if opcD.isdigit():
        opcD = int(opcD)
    else:
        opcD = 0

if opcD == 1:

    path        = os.path.join(current_path, "_dataset", "ESC-10")
    path_pic    = os.path.join(current_path, "ESC-10_results")
    path_models = os.path.join(current_path, "ESC-10_saved_models")
    
    # Check if the folder exists, if not, create it
    if not os.path.exists(path_models):
        os.makedirs(path_models)
   
    subfolders  = next(os.walk(path))[1]
    nom_dataset = 'ESC-10' 
    csv_file    = 'ESC-10.csv'
    fold        = 1
    dog_set     = 'Dog bark'
    
    pkl_features_CNN_2D          = 'ESC-10_features_CNN_2D_original.pkl'
    pkl_aug_features_CNN_2D      = 'ESC-10_features_CNN_2D_augmented_no_windowing.pkl'
    pkl_aug_wind_features_CNN_2D = 'ESC-10_features_CNN_2D_augmented.pkl'
    

    
if opcD == 2:
    
    path        = os.path.join(current_path, "_dataset", "BDLib2")
    path_pic    = os.path.join(current_path, "BDLib2_results")
    path_models = os.path.join(current_path, "BDLib2_saved_models")
    
    # Check if the folder exists, if not, create it
    if not os.path.exists(path_models):
        os.makedirs(path_models)

    subfolders  = next(os.walk(path))[1]
    nom_dataset = 'BDLib2' 
    csv_file    = 'BDLib2.csv'
    fold        = 'fold-1'
    dog_set     = 'dogs'
    
    pkl_features_CNN_2D          = 'BDLib2_features_CNN_2D_original.pkl'
    pkl_aug_features_CNN_2D      = 'BDLib2_features_CNN_2D_augmented_no_windowing.pkl'
    pkl_aug_wind_features_CNN_2D = 'BDLib2_features_CNN_2D_augmented.pkl'
    
    
if opcD == 3:
    
    path        = os.path.join(current_path, "_dataset", "US8K")
    path_pic    = os.path.join(current_path, "US8K_results")
    path_models = os.path.join(current_path, "US8K_saved_models")
    
    # Check if the folder exists, if not, create it
    if not os.path.exists(path_models):
        os.makedirs(path_models)
        
    subfolders  = next(os.walk(path))[1]
    nom_dataset = 'US8K' 
    csv_file    = 'US8K.csv'
    fold        = '1'
    dog_set     = 'dog_bark'

    pkl_features_CNN_2D          = 'US8K_features_CNN_2D_original.pkl'
    pkl_aug_features_CNN_2D      = 'US8K_features_CNN_2D_augmented_no_windowing.pkl'
    pkl_aug_wind_features_CNN_2D = 'US8K_features_CNN_2D_windowed.pkl' # augmented and windowed makes no sense. Dataset is already quite large
    
    
if opcD == 4:

    path        = os.path.join(current_path, "_dataset", "US8K_AV")
    path_pic    = os.path.join(current_path, "US8K_AV_results")
    path_models = os.path.join(current_path, "US8K_AV_saved_models")
    
    # Check if the folder exists, if not, create it
    if not os.path.exists(path_models):
        os.makedirs(path_models)


    subfolders  = next(os.walk(path))[1]
    nom_dataset = 'US8K_AV' 
    csv_file    = 'US8K_AV.csv'
    fold        = '1'
    dog_set     = 'dog_bark'
    
    pkl_features_CNN_2D          = 'US8K_AV_features_CNN_2D_original.pkl'
    pkl_aug_features_CNN_2D      = 'US8K_AV_features_CNN_2D_augmented_no_windowing.pkl'
    pkl_aug_wind_features_CNN_2D = 'US8K_AV_features_CNN_2D_windowed.pkl' # augmented and windowed makes no sense. Dataset is already quite large

In [None]:
def get_next_file_number(folder: str):
    files = [f for f in os.listdir(folder) if os.path.isfile(os.path.join(folder, f)) and f.startswith(pic_first_name)]
    if not files:
        return 1
    else:
        numbers = [int(f.split('.')[0].split('_')[-1]) for f in files]
        return max(numbers) + 1

In [None]:
from MT_loadDataset import loadDataset

In [None]:
# Keep the class name intact so this cell can be re-run
dataset_loader = loadDataset(path)
DB             = dataset_loader.db_B

print("\nClasses:\n--------------------")
print(DB["Class_categorical"].value_counts())
print("\nTotal number of unique files..........: ", len(np.unique(DB["File_name"])))
print("Total number of AUDIO files...........: ", len(DB))
DB

In [None]:
# Analysis of the class balancing

sns.set_style("darkgrid")
gTitle = f'{nom_dataset} - Number of classes = ' + str(len(pd.Series(DB['Class_categorical']).unique()))
g = sns.displot(DB,x='Class_categorical', hue='Class_categorical',height = 5, aspect = 2).set(title=gTitle)
g.set_xticklabels(rotation=90)
g.set_titles('Number of classes')

# Retrieve the axes object from the plot
axes = g.ax

# Iterate over each bar in the plot
for p in axes.patches:
    # Get the coordinates of the bar
    width = p.get_width()
    height = p.get_height()
    cord_x, cord_y = p.get_xy()
    if height > 0:
        axes.annotate(f'{height}', (cord_x + width/2, cord_y + height), ha='center')
        
g._legend.remove()

plt.tight_layout()

In [None]:
# Read the pkl file with the augmented features extracted

opc = 0
while opc not in (1, 2, 3):
    print()
    print("1-) Features original")
    print("2-) Features augmented")
    print("3-) Features augmented and windowed (US8K is only windowed)")

    opc = input("\nSelect the feature set: ")
    if opc.isdigit():
        opc = int(opc)
    else:
        opc = 0

if opc == 1:
    DB_from_pkl      = pd.read_pickle(os.path.join(path_models, pkl_features_CNN_2D))
    model_surname    = '_original'

elif opc == 2:
    DB_from_pkl      = pd.read_pickle(os.path.join(path_models, pkl_aug_features_CNN_2D))
    model_surname    = '_augmented'

elif opc == 3:
    DB_from_pkl      = pd.read_pickle(os.path.join(path_models, pkl_aug_wind_features_CNN_2D))
    model_surname    = '_windowed'
    
else:
    pass

In [None]:
DB_from_pkl

In [None]:
# Analysis of the class balancing

sns.set_style("darkgrid")
gTitle = f'{nom_dataset} - Number of classes = ' + str(len(pd.Series(DB_from_pkl['Class_categorical']).unique()))
g = sns.displot(DB_from_pkl,x='Class_categorical', hue='Class_categorical',height = 5, aspect = 2).set(title=gTitle)
g.set_xticklabels(rotation=90)
g.set_titles('Number of classes')

# Retrieve the axes object from the plot
axes = g.ax

# Iterate over each bar in the plot
for p in axes.patches:
    # Get the coordinates of the bar
    width = p.get_width()
    height = p.get_height()
    cord_x, cord_y = p.get_xy()
    if height > 0:
        axes.annotate(f'{height}', (cord_x + width/2, cord_y + height), ha='center')
        
g._legend.remove()

plt.tight_layout()

In [None]:
# Use a separate loop variable so the global `fold` chosen with the dataset is not overwritten
for fold_id in np.unique(DB_from_pkl['Fold']):
    print(f"Validation fold: {fold_id}")
    
    valsize = len(DB_from_pkl[DB_from_pkl['Fold'] == fold_id])
    trnsize = len(DB_from_pkl[DB_from_pkl['Fold'] != fold_id])
    print(f'dbComplete_VAL size: {valsize}')
    print(f'dbComplete_TRN size: {trnsize}')
    print()

In [None]:
DB_from_pkl.dtypes

In [None]:
DB_from_pkl['Class_OHEV'][0][0]

In [None]:
print(DB_from_pkl['Fold'].shape)
print(DB_from_pkl['Class_OHEV'][0].shape)
print(DB_from_pkl['features'][0].shape)

In [None]:
print(type(DB_from_pkl['Fold'][0][0]))
print(type(DB_from_pkl['Class_OHEV'][0][0]))
print(type(DB_from_pkl['features'][0][0][0][0]))

In [None]:
# Group by the class and get one random sample of each class
k = DB_from_pkl.groupby('Class_categorical')['Class_OHEV'].apply(lambda s: s.sample(1))
print(k)

# Convert the pandas series into a dataframe
temp_k_df = k.reset_index()

# Delete the index from the groupby result
del temp_k_df['level_1']

# Set the "Class" as the dataframe index
temp_k_df.set_index("Class_categorical", inplace=True)

# Convert the dataframe to a dictionary (Class: Class_encoder)
encoder_dict = temp_k_df["Class_OHEV"].to_dict()
encoder_dict

In [None]:
# Number of classes in the dataset

num_classes = len(encoder_dict.keys())
num_classes

In [None]:
# Name of the classes

nom_classes = list(encoder_dict.keys())
nom_classes

## Input split

In [None]:
# Separate 1 fold for validation and create a DB for the training / testing

DB_from_pkl_VAL = DB_from_pkl[DB_from_pkl['Fold'] == fold].copy()
DB_from_pkl_TRN = DB_from_pkl[DB_from_pkl['Fold'] != fold].copy()

X      = DB_from_pkl_TRN['features'].to_numpy()
y      = np.array(DB_from_pkl_TRN.Class_categorical.to_list())
y_OHEV = np.array(DB_from_pkl_TRN.Class_OHEV.to_list())

X_val      = DB_from_pkl_VAL['features'].to_numpy()
y_val      = np.array(DB_from_pkl_VAL.Class_categorical.to_list())
y_OHEV_val = np.array(DB_from_pkl_VAL.Class_OHEV.to_list())


# Stackup and pass all values to float32
X = np.stack(X)
X = np.asarray(X).astype(np.float32)

X_val = np.stack(X_val)
X_val = np.asarray(X_val).astype(np.float32)

y_OHEV     = np.asarray(y_OHEV).astype(np.float32)
y_OHEV_val = np.asarray(y_OHEV_val).astype(np.float32)


# Retrieve the indexes used for training the classifiers
idx_trn = np.genfromtxt(os.path.join(path_models, '_idx_trn_' + nom_dataset + model_surname + '.csv'), delimiter=',', dtype = int)
idx_tst = np.genfromtxt(os.path.join(path_models, '_idx_tst_' + nom_dataset + model_surname + '.csv'), delimiter=',', dtype = int)

X_train      = X[idx_trn]
X_test       = X[idx_tst]
y_train      = y[idx_trn]
y_test       = y[idx_tst]
y_train_OHEV = y_OHEV[idx_trn]
y_test_OHEV  = y_OHEV[idx_tst]

In [None]:
print("\n==================================")
print("Training set\n")

print(f'X_train.........: {np.shape(X_train)}')
print(f'y_train.........: {np.shape(y_train)}')
print(f'y_train_OHEV....: {np.shape(y_train_OHEV)}')

print("\n==================================")
print("Testing set\n")

print(f'X_test..........: {np.shape(X_test)}')
print(f'y_test..........: {np.shape(y_test)}')
print(f'y_test_OHEV.....: {np.shape(y_test_OHEV)}')

print("\n==================================")
print("Validation set\n")

print(f'X_val...........: {np.shape(X_val)}')
print(f'y_val...........: {np.shape(y_val)}')
print(f'y_OHEV_val......: {np.shape(y_OHEV_val)}')

In [None]:
# Simple confusion matrix

def simple_conf_matrix(y_true, y_pred, nom_classes, clf, acc):
    
    picture_name = f'{pic_first_name}{get_next_file_number(path_pic):02d}.png'

    conf_matrix = metrics.confusion_matrix(y_true, y_pred)
    title = nom_dataset + model_surname + norm_type + ' - Classifier ' + clf + ' - Validation accuracy: '+ str("{:0.2f} %".format(acc*100))

    plt.figure(figsize = (10,10))
    sns.heatmap(conf_matrix, 
                annot=True, 
                fmt='g', 
                cmap=cmap_cm, 
                annot_kws={"size": 8}, 
                xticklabels=nom_classes, 
                yticklabels=nom_classes)
    plt.title(title, fontsize = 12)
    plt.savefig(os.path.join(path_pic, picture_name))
    plt.show()

In [None]:
# Plot the confusion matrix

def plot_confusion_matrix(cm, labels, title, cmap, normalize):

    if labels is not None:
        tick_marks = np.arange(len(labels))
        plt.xticks(tick_marks, labels, fontsize=10, rotation=45)
        plt.yticks(tick_marks, labels, fontsize=10)
   
    if cmap is None:
        cmap = plt.get_cmap('Blues')
    
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    thresh = cm.max() / 1.5 if normalize else cm.max() / 2
    
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        if normalize:
            plt.text(j, i, "{:0.4f}".format(cm[i, j]),
                     horizontalalignment="center",
                     color="white" if cm[i, j] > thresh else "black", fontsize = 8)
        else:
            plt.text(j, i, "{:,}".format(cm[i, j]),
                     horizontalalignment="center",
                     color="white" if cm[i, j] > thresh else "black", fontsize = 8)

    plt.imshow(cm, interpolation = 'nearest', cmap = cmap)
    plt.title(title, fontsize=13)
    plt.colorbar(shrink=1)
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.grid(None)
    plt.tight_layout()
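
The row normalization applied when `normalize=True` can be checked on a small example. The matrix below is made up for illustration only; each row is divided by its total, so the diagonal then shows the fraction of each true class that was predicted correctly.

```python
import numpy as np

# Made-up 3-class confusion matrix (rows: true labels, columns: predictions)
cm = np.array([[8, 1, 1],
               [2, 6, 2],
               [0, 5, 5]])

# Same operation as in plot_confusion_matrix: divide each row by its total
cm_norm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

print(cm_norm)
print(cm_norm.sum(axis=1))   # every row sums to 1
```

Note that a class with no true samples has a row total of zero, which would produce NaN values in that row.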

## Classifiers

- **Convolutional Neural Networks** (CNNs) are a class of deep learning algorithms specifically designed for processing grid-like data, such as images and videos. CNNs are highly effective in tasks related to computer vision, including image recognition, object detection, and image segmentation. They are characterized by their ability to automatically and adaptively learn spatial hierarchies of features from input data. CNNs consist of multiple layers, including convolutional layers, pooling layers, and fully connected layers. The convolutional layers apply convolution operations to the input data, enabling the network to automatically learn patterns and features from images, such as edges, textures, and more complex structures. The pooling layers downsample the spatial dimensions of the data, reducing computational complexity while retaining important features. Fully connected layers at the end of the network process the learned features and make predictions based on them. One of the significant advantages of CNNs is their ability to capture local patterns and spatial hierarchies of features. By using shared weights and biases in the convolutional layers, CNNs are capable of learning translation-invariant features, making them well-suited for tasks where the spatial arrangement of features in the input data is essential. Additionally, CNNs can automatically learn relevant features from raw pixel values, eliminating the need for manual feature extraction.
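
To make the convolution and pooling steps concrete, here is a minimal NumPy sketch (it is not how Keras implements them). `conv2d_valid` and `max_pool_2x2` are hypothetical helper names, and the kernel is a hand-picked vertical-edge detector, whereas a CNN learns its kernel values during training.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide one kernel over a 2D image ('valid' padding, stride 1).

    Like deep learning frameworks, this computes a cross-correlation
    (the kernel is not flipped).
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same kernel weights are reused at every position (weight sharing)
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2 x 2 block."""
    h, w = feature_map.shape[0] // 2 * 2, feature_map.shape[1] // 2 * 2
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))


# 6 x 6 image: dark left half, bright right half
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-picked kernel that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = conv2d_valid(image, kernel)   # shape (4, 4)
pooled      = max_pool_2x2(feature_map)     # shape (2, 2)

print(feature_map)
print(pooled)
```

The feature map is non-zero only where the window straddles the edge, and pooling halves each spatial dimension while keeping that response.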

In [None]:
inputShape = X_train[0].shape
inputShape

In [None]:
# Architecture based on Su et al. (2019)

def basemodel_Su(model_name):
       
    model = Sequential(name = model_name)
    
    # Input is 44 x 180
    # For an N x M input and an F x F filter with 'valid' padding and stride 1,
    # the convolution output is (N - F + 1) x (M - F + 1),
    # e.g. a 7 x 7 filter gives (44 - 7 + 1) x (180 - 7 + 1) = (38 x 174).
    # With padding='same' and stride S the output is ceil(N / S) x ceil(M / S).
    
    model.add(Conv2D(32, (3, 3), input_shape=(inputShape), padding='same', strides=(2,2), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(32, (3, 3), strides=(2,2), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))

    model.add(Conv2D(64, (3, 3), padding='same', strides=(1,1), activation='relu'))
    model.add(BatchNormalization())
    model.add(Conv2D(64, (3, 3), padding='same', strides=(1,1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.5))
    
    model.add(Flatten(name='Flatten'))

    model.add(Dense(1024, activation='relu', kernel_constraint=maxnorm(3)))
    model.add(Dropout(0.5))
    
    model.add(Dense(num_classes, activation='softmax'))
    
    # Compile model
    epochs  = 100
    lrate   = 0.001
    decay   = lrate/epochs
    sgd     = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)
    
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    
    return model

In [None]:
# Architecture based on Luz et al. (2021)

def basemodel_Luz(model_name):
       
    model = Sequential(name = model_name)
    
    model.add(Conv2D(24, (5, 5), input_shape=(inputShape), padding='same', strides=(1,1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(48, (5, 5), padding='same', strides=(1,1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    
    model.add(Conv2D(48, (5, 5), padding='same', strides=(1,1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))
    
    model.add(Conv2D(48, (5, 5), padding='same', strides=(1,1), activation='relu'))
    model.add(BatchNormalization())
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten(name='Flatten'))

    model.add(Dense(64, activation='relu', kernel_constraint=maxnorm(3)))
    model.add(Dropout(0.5))
    model.add(Dense(128, activation='relu', kernel_constraint=maxnorm(3)))
    
    model.add(Dense(num_classes, activation='softmax'))
    
    # Compile model
    epochs  = 100
    lrate   = 0.001
    decay   = lrate/epochs
    sgd     = SGD(learning_rate=lrate, momentum=0.9, decay=decay, nesterov=False)
    
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    
    return model

In [None]:
monitor = EarlyStopping(monitor='val_accuracy', min_delta=0.0001, patience=30, verbose=1, mode='auto', restore_best_weights=True)

if not os.path.exists(path_models):
    os.makedirs(path_models)

filepath       = os.path.join(path_models, 'Model_CNN_2D_weights_0_best' + model_surname + '.hdf5')
checkpoint     = ModelCheckpoint(filepath, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint, monitor]

In [None]:
# Select the model

opc = 0
while opc not in (1, 2):
    print()
    print("1-) Architecture based on Su et al. (2019)")
    print("2-) Architecture based on Luz et al. (2021)")

    opc = input("\nSelect the model: ")
    if opc.isdigit():
        opc = int(opc)
    else:
        opc = 0

if opc == 1:
    basemodel = basemodel_Su
    surName = '_Su'

elif opc == 2:
    basemodel = basemodel_Luz
    surName = '_Luz'

else:
    pass

Model_CNN_2D = basemodel('Model_CNN_2D' + surName)
print(Model_CNN_2D.summary())

In [None]:
tf.keras.utils.plot_model(Model_CNN_2D, to_file= os.path.join(path_models, 'Model_CNN_2D' + model_surname + '.png'), show_shapes=True)

### Understanding the column "Param":

1. For a `Conv1D` layer:
   - The number of parameters is `(kernel_size * input_channels + 1) * output_channels`, where `kernel_size` is the length of the convolutional kernel, `input_channels` is the number of channels of the layer input, and `output_channels` is the number of filters. The `+ 1` accounts for one bias per filter.

2. For a `Dense` layer:
   - The number of parameters is `(input_units + 1) * output_units`, where `input_units` is the number of input units and `output_units` is the number of output units (neurons).

3. For a `Conv2D` layer:
   - The number of parameters is `(filter_height * filter_width * input_channels + 1) * number_of_filters`. For every convolutional layer after the first, `input_channels` equals the number of filters of the previous convolutional layer.


For the architecture based on Luz et al. (2021), the values in the summary are obtained as follows:

- 624 parameters: 24 filters * (5 * 5 kernel * 1 channel + 1 bias)
- 28,848 parameters: 48 filters * (5 * 5 kernel * 24 channels + 1 bias)
- 57,648 parameters: 48 filters * (5 * 5 kernel * 48 channels + 1 bias)
- 57,648 parameters: 48 filters * (5 * 5 kernel * 48 channels + 1 bias)
- 67,648 parameters: 64 neurons * 1,056 features + 64 bias values
- 8,320 parameters: 128 neurons * 64 features + 128 bias values
- 645 parameters: 5 neurons * 128 features + 5 bias values
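
The counts above can be reproduced with a few lines of arithmetic. This is a sketch with hypothetical helper names (`conv2d_params`, `dense_params`); the 44 x 180 single-channel input is an assumption taken from the comment in `basemodel_Su` and from the 1-channel first layer, and the 5 output classes come from the last line of the list.

```python
def conv2d_params(filter_h, filter_w, in_channels, n_filters):
    """Weights plus one bias per filter."""
    return (filter_h * filter_w * in_channels + 1) * n_filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units


# Four Conv2D(5 x 5, padding='same') + MaxPooling2D(2 x 2) blocks
height, width, channels = 44, 180, 1
conv_counts = []
for n_filters in (24, 48, 48, 48):
    conv_counts.append(conv2d_params(5, 5, channels, n_filters))
    # 'same' padding with stride 1 keeps the size; 2 x 2 pooling halves it (floor)
    height, width, channels = height // 2, width // 2, n_filters

flat_features = height * width * channels      # input size of the first Dense layer

dense_counts = [dense_params(flat_features, 64),
                dense_params(64, 128),
                dense_params(128, 5)]

print(conv_counts)     # [624, 28848, 57648, 57648]
print(flat_features)   # 1056
print(dense_counts)    # [67648, 8320, 645]
```

The `BatchNormalization` layers add their own parameters (four per channel, two of them non-trainable), which are not included in this sketch.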

### CNN 2D adjustments

In [None]:
print("\n========================================================================")
print("Training set\n")

print(f'X_train.........: {np.shape(X_train)} ...type: {type(X_train[0][0][0][0])}')
print(f'y_train_OHEV....: {np.shape(y_train_OHEV)} ............type: {type(y_train_OHEV[0][0])}')

print("\n========================================================================")
print("Testing set\n")

print(f'X_test..........: {np.shape(X_test)} ....type: {type(X_test[0][0][0][0])}')
print(f'y_test_OHEV.....: {np.shape(y_test_OHEV)} .............type: {type(y_test_OHEV[0][0])}')

print("\n========================================================================")
print("Validation set\n")

print(f'X_val...........: {np.shape(X_val)} ....type: {type(X_val[0][0][0][0])}')
print(f'y_OHEV_val......: {np.shape(y_OHEV_val)} .............type: {type(y_OHEV_val[0][0])}')

In [None]:
batch_size = 32
epochs     = 100
history    = Model_CNN_2D.fit(X_train, y_train_OHEV,
                              batch_size      = batch_size,
                              epochs          = epochs,
                              verbose         = 1,
                              validation_data = (X_test, y_test_OHEV),
                              steps_per_epoch=int(np.ceil(X_train.shape[0] / float(batch_size))),
                              callbacks       = callbacks_list,
                              use_multiprocessing = True)

In [None]:
score_CNN_2D = Model_CNN_2D.evaluate(X_val, y_OHEV_val, verbose=1, batch_size = batch_size)
print('Test loss:', score_CNN_2D[0])
print('Test accuracy:', score_CNN_2D[1])

In [None]:
picture_name = f'{pic_first_name}{get_next_file_number(path_pic):02d}.png'

fig, ax = plt.subplots(1,2, figsize=(16,8))
fig.suptitle('CNN 2D - Training / Validation loss and accuracy')
ax[0].plot(history.history['loss'], color='b', label="Training loss")
ax[0].plot(history.history['val_loss'], color='r', label="Validation loss")
legend = ax[0].legend(loc='best', shadow=True)

ax[1].plot(history.history['accuracy'], color='b', label="Training accuracy")
ax[1].plot(history.history['val_accuracy'], color='r',label="Validation accuracy")
legend = ax[1].legend(loc='best', shadow=True)
fig.tight_layout()
plt.savefig(os.path.join(path_pic, picture_name))

In [None]:
# save model and architecture to single file (not the best model though)

# Model_CNN_2D.save(path_models + "Model_CNN_2D.h5")
# print("Saved model to disk")

In [None]:
y_pred_CNN_2d = np.argmax(Model_CNN_2D.predict(X_val),axis=1)
y_pred_CNN_2d

In [None]:
y_test_enc = np.argmax(y_OHEV_val, axis=1)
y_test_enc

In [None]:
score_CNN_2D[1]

In [None]:
metrics_set_CNN_2D = classification_report(y_test_enc, y_pred_CNN_2d, target_names=nom_classes)
print(metrics_set_CNN_2D)

In [None]:
# Load the model with the highest accuracy

Model_CNN_2D_saved = load_model(os.path.join(path_models, 'Model_CNN_2D_weights_0_best' + model_surname + '.hdf5'))
Model_CNN_2D_saved.summary()

In [None]:
score_CNN_2D_saved = Model_CNN_2D_saved.evaluate(X_val, y_OHEV_val, verbose=1, batch_size = batch_size)
print('Test loss:', score_CNN_2D_saved[0])
print('Test accuracy:', score_CNN_2D_saved[1])

In [None]:
y_pred_CNN_2D_saved = np.argmax(Model_CNN_2D_saved.predict(X_val),axis=1)
y_pred_CNN_2D_saved

In [None]:
prob = np.round(Model_CNN_2D_saved.predict(X_val)[7],6)
for i in prob:
    print(i)

In [None]:
metrics_set_CNN_2D_saved = classification_report(y_test_enc, y_pred_CNN_2D_saved, target_names=nom_classes)
print(metrics_set_CNN_2D_saved)

In [None]:
# Simple confusion matrix
picture_name = f'{pic_first_name}{get_next_file_number(path_pic):02d}.png'

conf_matrix = metrics.confusion_matrix(y_test_enc, y_pred_CNN_2D_saved)
title = nom_dataset + model_surname + ' - Classifier CNN 2D (best model) - Highest accuracy test: '+ str("{:0.2f}%".format(score_CNN_2D_saved[1]*100))

plt.figure(figsize = (10,10))
sns.heatmap(conf_matrix, 
            annot=True, 
            fmt='g', 
            cmap=cmap_cm, 
            annot_kws={"size": 8}, 
            xticklabels=nom_classes, 
            yticklabels=nom_classes)
plt.title(title, fontsize = 12)
plt.savefig(os.path.join(path_pic, picture_name))
plt.show()

In [None]:
Model_CNN_2D_saved.layers

In [None]:
for layer in Model_CNN_2D_saved.layers:
    print(layer.get_weights())

## Metrics for the classifiers


1. Accuracy: Accuracy is a measure of how many correct predictions a model makes overall, i.e., the ratio of correct predictions to the total number of predictions. It's a commonly used metric for evaluating models, but it may not be suitable in certain situations.

2. Precision: Precision measures the ratio of true positives (correctly predicted positive instances) to all instances predicted as positive. It focuses on the accuracy of positive predictions.

3. Recall: Recall, also known as sensitivity or true positive rate, measures the ratio of true positives to all actual positive instances. It focuses on how well a model captures all the positive instances.

4. F1 Score: The F1 score is the harmonic mean of precision and recall. It provides a balanced measure that takes into account both false positives and false negatives. The F1 score is especially useful when you want to strike a balance between precision and recall.


The F1 score is a metric that combines precision and recall, and it is particularly useful in situations where class imbalance or unequal misclassification costs are present. In such contexts, the F1 score can be more informative and meaningful than accuracy.

A context where considering the F1 score makes more sense than accuracy:

**Medical Diagnosis:**

Imagine you're developing a model to diagnose a rare disease, and only 5% of the population has this disease. In this case, you have a significant class imbalance, where the majority of cases are negative (non-disease) and only a small fraction are positive (disease). If you were to use accuracy as the evaluation metric, the model could achieve a high accuracy by simply predicting "negative" for every case, because it would be correct 95% of the time due to the class imbalance. However, this would be entirely useless for detecting the actual disease.

In this scenario, you'd be more interested in the F1 score. The F1 score considers both precision and recall, helping you find a balance between correctly identifying the disease (high recall) and not making too many false positive predictions (high precision). A high F1 score in this context indicates that your model is effective at correctly identifying the disease while minimizing false alarms.
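
The scenario can be put into numbers with a short sketch. The 1000-patient population and the `binary_metrics` helper are assumptions made for illustration; the helper reports a metric as 0 when its denominator is zero.

```python
# 1000 patients, 5% with the disease (label 1), as in the example above
y_true = [1] * 50 + [0] * 950

# A useless "model" that predicts "negative" for everyone
y_pred = [0] * 1000


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

    accuracy  = (tp + tn) / len(y_true)
    # Convention used here: a metric with a zero denominator is reported as 0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall    = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, f1


print(binary_metrics(y_true, y_pred))   # (0.95, 0.0, 0.0, 0.0)
```

The accuracy of 95% hides the fact that not a single sick patient is found, while recall and F1 are both 0. For the multi-class datasets in this notebook, `classification_report` prints the same quantities per class.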

In [None]:
classifiers = ['Model_CNN_2D_Su', 'Model_CNN_2D_Luz']

In [None]:
# Pipeline to run the classifiers and their metrics

def model_classifiers(classifiers:list, db: pd.DataFrame):
    
    # Clear the session to start a new training
    K.clear_session()
                      
    monitor = EarlyStopping(monitor='val_accuracy', 
                        min_delta = 0.0001, 
                        patience = 30, 
                        verbose = 1, 
                        mode = 'auto', 
                        restore_best_weights = True)
                      
    count       = 1
    verbose     = True
    models      = []
    acc_set     = pd.DataFrame(index=None, columns=['Model',
                                                    'Fold',
                                                    'Accuracy(Train)',
                                                    'Accuracy(Val)',
                                                    'F1(Train)',
                                                    'F1(Val)', 
                                                    'Precision(Train)',
                                                    'Precision(Val)', 
                                                    'Recall(Train)',
                                                    'Recall(Val)', 
                                                    'Conf_M',
                                                    'Process_time',                                                     
                                                    'Class_report(Val)'])
                      
    for fold in np.unique(db['Fold']):
        print(f"\nValidation fold: {fold}")

        DB_VAL = db[db['Fold'] == fold]
        DB_TRN = db[db['Fold'] != fold]

        X      = DB_TRN['features'].to_numpy()
        y      = np.array(DB_TRN.Class_categorical.to_list())
        y_OHEV = np.array(DB_TRN.Class_OHEV.to_list())

        X_val      = DB_VAL['features'].to_numpy()
        y_val      = np.array(DB_VAL.Class_categorical.to_list())
        y_OHEV_val = np.array(DB_VAL.Class_OHEV.to_list())


        # Stack the per-sample feature arrays and cast all values to float32
        X = np.stack(X)
        X = np.asarray(X).astype(np.float32)

        X_val = np.stack(X_val)
        X_val = np.asarray(X_val).astype(np.float32)

        y_OHEV     = np.asarray(y_OHEV).astype(np.float32)
        y_OHEV_val = np.asarray(y_OHEV_val).astype(np.float32)

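        # Hold out 10% of the training folds; this "test" split is used only as
        # validation_data during training (early stopping and checkpointing)
        X_train_final, X_test, y_train_final, y_test = train_test_split(X,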
                                                                        y_OHEV, 
                                                                        test_size = 0.1, 
                                                                        random_state = 100, 
                                                                        stratify = y_OHEV)
        
        print("\n========================================================================")
        print("Training set\n")

        print(f'X_train.........: {np.shape(X_train_final)} ...type: {type(X_train_final[0][0][0][0])}')
        print(f'y_train_OHEV....: {np.shape(y_train_final)} ............type: {type(y_train_final[0][0])}')

        print("\n========================================================================")
        print("Testing set\n")

        print(f'X_test..........: {np.shape(X_test)} ....type: {type(X_test[0][0][0][0])}')
        print(f'y_test_OHEV.....: {np.shape(y_test)} .............type: {type(y_test[0][0])}')

        print("\n========================================================================")
        print("Validation set\n")

        print(f'X_val...........: {np.shape(X_val)} ....type: {type(X_val[0][0][0][0])}')
        print(f'y_OHEV_val......: {np.shape(y_OHEV_val)} .............type: {type(y_OHEV_val[0][0])}')
        print()

        
        for i in tqdm(range(len(classifiers))):
            
            name         = classifiers[i]
            model_name   = (classifiers[i] + '_' + str(count))
            count        = count + 1
            
            if not os.path.exists(path_models):
                os.makedirs(path_models)

            filepath       = os.path.join(path_models, classifiers[i] + '_weights_0_best' + model_surname + '.hdf5')
            checkpoint     = ModelCheckpoint(filepath, 
                                             monitor = 'val_accuracy', 
                                             verbose = 1, 
                                             save_best_only = True, 
                                             mode = 'max')
            callbacks_list = [checkpoint, monitor]

            if classifiers[i] == 'Model_CNN_2D_Su':
                model = basemodel_Su(classifiers[i])
                model.summary()
                print(model_name)
            
            elif classifiers[i] == 'Model_CNN_2D_Luz':
                model = basemodel_Luz(classifiers[i])
                model.summary()
                print(model_name)
            else:
                raise ValueError(f"Unknown classifier name: {classifiers[i]}")


            model.fit(X_train_final, 
                      y_train_final,
                      batch_size          = batch_size,
                      epochs              = epochs,
                      verbose             = 1,
                      validation_data     = (X_test, y_test),
                      steps_per_epoch     = int(np.ceil(X_train_final.shape[0] / float(batch_size))),
                      callbacks           = callbacks_list,
                      use_multiprocessing = True)
                      
            # Get the model predictions
            y_train_enc = np.argmax(y_train_final, axis=1)
            y_val_enc   = np.argmax(y_OHEV_val, axis=1)

            y_train_predicted = np.argmax(model.predict(X_train_final), axis=1)

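            # process_time_ns() returns the CPU time (user + system) of this process,
            # not wall-clock time; the difference is converted from ns to ms
            t_srt             = time.process_time_ns()
            y_val_predicted   = np.argmax(model.predict(X_val), axis=1)
            t_end             = time.process_time_ns()
            proc_time         = ((t_end - t_srt) / 1000000)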
            
            # Compute the classifier metrics
            accuracy_train = metrics.accuracy_score(y_train_enc, y_train_predicted)
            accuracy_val   = metrics.accuracy_score(y_val_enc,  y_val_predicted)

            f1_Score_train = metrics.f1_score(y_train_enc, y_train_predicted, average = 'weighted')
            f1_Score_val   = metrics.f1_score(y_val_enc,  y_val_predicted,  average = 'weighted')

            precision_score_train = metrics.precision_score(y_train_enc, y_train_predicted, average = 'weighted')
            precision_score_val   = metrics.precision_score(y_val_enc,  y_val_predicted,  average = 'weighted')

            recall_score_train = metrics.recall_score(y_train_enc, y_train_predicted, average = 'weighted')
            recall_score_val   = metrics.recall_score(y_val_enc,  y_val_predicted,  average = 'weighted')

            class_report_val = classification_report(y_val_enc, y_val_predicted, target_names = nom_classes)
            print(class_report_val)
            
            # Compute the confusion matrix
            CM = metrics.confusion_matrix(y_val_enc, y_val_predicted)
            y_val_enc       = []
            y_val_predicted = []

            # Store the name, validation accuracy and model
            models.append((name, accuracy_val, model))
            
            K.clear_session()
            del model
                    
            acc_set = pd.concat([acc_set, pd.DataFrame({'Model': [name],
                                                        'Fold': [fold],
                                                        'Accuracy(Train)': [accuracy_train],
                                                        'Accuracy(Val)': [accuracy_val],
                                                        'F1(Train)': [f1_Score_train],
                                                        'F1(Val)': [f1_Score_val],
                                                        'Precision(Train)': [precision_score_train],
                                                        'Precision(Val)': [precision_score_val],
                                                        'Recall(Train)': [recall_score_train],
                                                        'Recall(Val)': [recall_score_val],
                                                        'Conf_M': [CM],
                                                        'Process_time': [proc_time],
                                                        'Class_report(Val)': [class_report_val]})], ignore_index = True)
                   
    return acc_set, models
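
The outer loop of `model_classifiers` holds out one fold at a time for validation and trains on the remaining folds. The sketch below reproduces only that splitting logic on a toy list of records. The records and the helper name `leave_one_fold_out` are illustrative assumptions and are not part of the notebook's data or pipeline.

```python
def leave_one_fold_out(records):
    """Yield (fold, train, val) with each distinct fold held out exactly once."""
    for fold in sorted({r["Fold"] for r in records}):
        val = [r for r in records if r["Fold"] == fold]
        train = [r for r in records if r["Fold"] != fold]
        yield fold, train, val

# Toy data: six clips spread evenly over three folds (hypothetical)
toy = [{"clip": i, "Fold": i % 3 + 1} for i in range(6)]

splits = list(leave_one_fold_out(toy))
for fold, train, val in splits:
    print(f"fold {fold}: {len(train)} training clips, {len(val)} validation clips")
```

Every clip is used for validation exactly once, so the per-fold validation scores can be compared or averaged without any clip being counted twice.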

In [None]:
metrics_set, models_set  = model_classifiers(classifiers, DB_from_pkl)

In [None]:
metrics_set
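
The F1, precision and recall values in `metrics_set` are computed with `average = 'weighted'`, which weights each class's score by the number of true samples of that class (its support). A minimal pure-Python sketch of that averaging for F1, using made-up labels, might look as follows. `weighted_f1` is a hypothetical helper written for illustration; it is not scikit-learn's implementation.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted mean of the per-class F1 scores."""
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1  # weight the class score by its support
    return total / len(y_true)

# Made-up labels for three classes with unequal support
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 1, 2, 0]
score = weighted_f1(y_true, y_pred)
print(f"weighted F1 = {score:.4f}")
```

With weighted averaging, frequent classes dominate the reported score, which is worth keeping in mind when the dataset's classes are imbalanced.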

In [None]:
# Sort by model and validation accuracy (ascending), then reset the index.

metrics_set = metrics_set.sort_values(['Model', 'Accuracy(Val)'], ascending = [True, True]).reset_index(drop = True)
metrics_set

In [None]:
metrics_set[['Model', 'Accuracy(Val)']].style.background_gradient(cmap = cmap_cm)

In [None]:
highest_accuracy = metrics_set.groupby('Model')['Accuracy(Val)'].max()
highest_accuracy

In [None]:
# Build a dictionary mapping each classifier to the descriptive statistics of its metrics

unique_models = []
results       = {}

for c in classifiers:
    unique_models.append(c)

for model in unique_models:
    result = metrics_set[metrics_set['Model'] == model].describe().round(4)
    results[model] = result

In [None]:
for model in results.keys():
    print(f'Model...: {model}')
    display(results[model])

In [None]:
metrics_set_no_cm = metrics_set.drop(['Conf_M', 'Class_report(Val)'], axis=1)
metrics_set_no_cm

In [None]:
metrics_set_name       = nom_dataset + '_metrics_set_CNN_2D' +  model_surname + '.pkl'
metrics_set_name_no_cm = nom_dataset + '_metrics_set_CNN_2D' +  model_surname + '_no_cm.csv'

print(metrics_set_name)
print(metrics_set_name_no_cm)

In [None]:
# Write the results to a PKL file and a CSV file

with open(os.path.join(path_models, metrics_set_name), 'wb') as file:
    pickle.dump(metrics_set, file)
    
metrics_set_no_cm.to_csv(os.path.join(path_models, metrics_set_name_no_cm), sep='\t', encoding='utf-8')

In [None]:
metrics_set_from_pkl = pd.read_pickle(os.path.join(path_models, metrics_set_name))
metrics_set_from_pkl

In [None]:
idx = metrics_set.groupby('Model')['Accuracy(Val)'].idxmax()
conf_matrices = metrics_set.loc[idx, ['Model','Accuracy(Val)','Conf_M']]
conf_matrices.set_index('Model', inplace=True)
conf_matrices_dict = conf_matrices.to_dict('index')
conf_matrices_dict
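
The `groupby('Model')['Accuracy(Val)'].idxmax()` lookup above keeps, for each model, the row of the fold with the highest validation accuracy. The same selection can be sketched in plain Python. The rows, model names and accuracy values below are invented purely for illustration.

```python
# Invented rows standing in for metrics_set (values are illustrative only)
rows = [
    {"Model": "model_a", "Fold": 1, "Accuracy(Val)": 0.70},
    {"Model": "model_a", "Fold": 2, "Accuracy(Val)": 0.75},
    {"Model": "model_b", "Fold": 1, "Accuracy(Val)": 0.80},
    {"Model": "model_b", "Fold": 2, "Accuracy(Val)": 0.65},
]

best_per_model = {}
for row in rows:
    current = best_per_model.get(row["Model"])
    # Strict comparison keeps the first row when two folds tie
    if current is None or row["Accuracy(Val)"] > current["Accuracy(Val)"]:
        best_per_model[row["Model"]] = row

for name, row in best_per_model.items():
    print(name, row["Fold"], row["Accuracy(Val)"])
```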

In [None]:
conf_matrices_dict['Model_CNN_2D_Su']['Conf_M']

In [None]:
for i, idx in zip(conf_matrices_dict.keys(), range(1, len(conf_matrices_dict) + 1)):
    print(idx)
    print(i)
    print(conf_matrices_dict[i]['Accuracy(Val)'])
    print(conf_matrices_dict[i]['Conf_M'])

In [None]:
# Plot the confusion matrix of the fold with the highest validation accuracy for each classifier

picture_name = f'{pic_first_name}{get_next_file_number(path_pic):02d}.png'

plt.figure(figsize=(20,8))
plt.suptitle(nom_dataset + model_surname + ' - Confusion matrices of the best results for each classifier', fontsize = 16,  y=0.99)
for i, idx in zip(conf_matrices_dict.keys(), range(1, len(conf_matrices_dict) + 1)):
    title = 'Classifier ' + i + ' (highest validation accuracy: ' + "{:0.4f}".format(conf_matrices_dict[i]['Accuracy(Val)']) + ')'
    plt.subplot(1, len(conf_matrices_dict), idx)
    plot_confusion_matrix(conf_matrices_dict[i]['Conf_M'],  
                          nom_classes, 
                          title,
                          cmap = None,                          
                          normalize = False)

plt.tight_layout()
plt.savefig(os.path.join(path_pic, picture_name))

In [None]:
picture_name = f'{pic_first_name}{get_next_file_number(path_pic):02d}.png'

plt.figure(figsize=(18,8))
plt.suptitle(f'{nom_dataset} - Box plot of validation accuracy for each classifier (batch type: {model_surname})', fontsize = 16,  y=0.97)
box_plot = sns.boxplot(data=metrics_set, x="Model", y="Accuracy(Val)", showfliers = True)

medians = list(metrics_set.groupby(['Model'])['Accuracy(Val)'].median())
medians = [round(element, 2) for element in medians]

vertical_offset = metrics_set['Accuracy(Val)'].median()*0.0001  # offset from median for display

for xtick in box_plot.get_xticks():
    box_plot.text(xtick, medians[xtick] + vertical_offset, medians[xtick], 
            horizontalalignment='center',size='medium',color='w',weight='semibold')
plt.savefig(os.path.join(path_pic, picture_name))

# End of the notebook