<span style="font-size:10pt">AI-ML @ ENSPIMA / v1.3 september 2024 / Jean-Luc CHARLES (Jean-Luc.charles@mailo.com) / CC BY-SA 4.0 /</span>

<div style="color:brown;font-family:arial;font-size:26pt;font-weight:bold;text-align:center"> 
Machine Learning $-$ MiniProject</div><br>
<hr>
<div style="color:blue;font-family:arial;font-size:22pt;font-weight:bold;text-align:center">
Training a neural network to diagnose bearing faults<br><br>
Part 3/3: The mini project.</div>
<hr>

Expected duration : 60 minutes

## Part-3 targeted learning objectives
- Know how to train/operate a DNN to diagnose bearing faults using a labeled temporal dataset.
- The mini project: know how to train/operate a DNN to diagnose bearing faults using the spectral transform of a temporal dataset.

<div class="alert alert-block alert-danger">
<span style="color:brown;font-family:arial;font-size:12pt"> 
It is important to use a <span style="font-weight:bold;">Python Virtual Environment</span> (PVE) for your Python projects: a PVE lets you control, for each project, the versions of the Python interpreter and of the "sensitive" modules (like tensorflow).
    
All the notebooks must be loaded in a `jupyter notebook` or `jupyter lab` launched within the <b><span style="color: rgb(100, 151, 202);" >pyml</span></b> PVE specially created for the session.    
</span></div>

In [None]:
import os, sys
# Suppress the (numerous) warning messages from the tensorflow module:
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

# specific modules:
from utils.tools import scan_dir, plot_loss_accuracy

In [None]:
print(f"Python    : {sys.version.split()[0]}")
print(f"tensorflow: {tf.__version__} including keras {keras.__version__}")
print(f"numpy     : {np.__version__}")

In [None]:
# set the seed of the random generators used by tensorflow:
SEED = 1234

Reminder: the bearing dataset was obtained under the following experimental conditions:
- normal condition (N)
- outer race fault (OF)
- inner race fault (IF)
- roller fault (RF).

|class label|Fault type|Fault diameter (mm)|
|:---------:|:--------:|-------------:|
| 1         | N        | 0            |
| 2         | RF       | 0.18         |
| 3         | RF       | 0.36         |
| 4         | RF       | 0.54         |
| 5         | IF       | 0.18         |
| 6         | IF       | 0.36         |
| 7         | IF       | 0.54         |
| 8         | OF       | 0.18         |
| 9         | OF       | 0.36         |
| 10        | OF       | 0.54         |

Note: in the code below the labels are 0-based (0 to 9), so class label *k* in this table corresponds to the label *k* − 1 stored in `y_full`.
 

# 4 $-$ A first try to train the neural network with the temporal dataset

## 4.1 $-$ Load the *CWRU* data and define some useful objects

In [None]:
npzfile = np.load('CWRU_dadaset.npz')
A, B, C = npzfile.values()
full_dataset = (A, B, C)

Let's have a look at the arrays shape:

In [None]:
A.shape, B.shape, C.shape

As you can see, the dimensions of the arrays are: (#health_conditions, #samples, #data_points).

Let's define:
- `H` $\leadsto$ the number of health conditions
- `S` $\leadsto$ the number of samples per health condition
- `N` $\leadsto$ the number of data points per sample
- `L` $\leadsto$ the number of load cases.

In [None]:
H, S, N = A.shape           # we can use A, B or C: they have the same shape
L = len(full_dataset)

print(f"L={L} load cases of the motor,")
print(f"H={H} health conditions,")
print(f"S={S} samples per health condition,")
print(f"=> giving LxHxS={L*H*S} samples of N={N} data points per sample.")

Let's define the list of the labels for the health conditions:

In [None]:
# create the list of the health condition labels:
health_cond = ['N']
for def_type in 'RF', 'IF', 'OF':
    for size in '18', '36', '54':
        health_cond.append(f"{def_type}.{size}")
print(f"list of {len(health_cond)} health conditions:", health_cond)

## 4.2 $-$ Prepare the labeled dataset for the training

We have two actions to do:
- merge the 200 samples for each of the 10 health conditions and each of the 3 load cases into a single array of 200 $\times$ 10 $\times$ 3 = 6000 samples,
- build the array of the corresponding 6000 labels.

First we define the arrays with the right shapes:

In [None]:
x_full = np.empty((L*H*S, N), dtype=float)    # the array of the temporal samples (uninitialized)
y_full = np.empty((L*H*S,), dtype='uint8')    # the array of the labels of the defects (uninitialized)

Let's verify the shape of the arrays:

In [None]:
x_full.shape, y_full.shape

Then we fill `x_full` with samples and `y_full` with corresponding labels:

In [None]:
i = 0
for data in (A, B, C):       # browse the 3 load cases
    for h in range(H):       # browse the 10 health conditions (gives the label)
        for s in range(S):   # browse the 200 samples
            x_full[i] = data[h, s]
            y_full[i] = h    # the label is given by the health condition loop variable
            i += 1                                                  
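
The triple loop above can also be written without any Python loop. The following is only a sketch on small random stand-ins (the names `A_toy`, `H_t`, `S_t`, `N_t`, `L_t` are hypothetical and chosen so as not to overwrite the notebook variables); it checks that concatenating and reshaping gives the same arrays as the loop:

```python
import numpy as np

# Toy stand-ins for A, B, C (hypothetical small sizes, random values):
rng = np.random.default_rng(0)
H_t, S_t, N_t, L_t = 3, 4, 5, 3
A_toy, B_toy, C_toy = (rng.normal(size=(H_t, S_t, N_t)) for _ in range(L_t))

# Reference: the same explicit triple loop as in the notebook.
x_loop = np.empty((L_t*H_t*S_t, N_t), dtype=float)
y_loop = np.empty((L_t*H_t*S_t,), dtype='uint8')
i = 0
for data in (A_toy, B_toy, C_toy):
    for h in range(H_t):
        for s in range(S_t):
            x_loop[i] = data[h, s]
            y_loop[i] = h
            i += 1

# Vectorized version: stack the load cases along axis 0, then flatten the
# (load case x health condition, sample) axes into a single sample axis.
x_vec = np.concatenate((A_toy, B_toy, C_toy), axis=0).reshape(L_t*H_t*S_t, N_t)
# Labels: each health condition index repeated S_t times, the pattern repeated L_t times.
y_vec = np.tile(np.repeat(np.arange(H_t, dtype='uint8'), S_t), L_t)

print(np.array_equal(x_loop, x_vec), np.array_equal(y_loop, y_vec))
```

Both forms produce the samples in the same order (load case, then health condition, then sample), which is what keeps `x_vec` and `y_vec` aligned.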

To verify, let's plot the sample of rank 10 in `x_full`:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

### Normalization of the temporal samples

We normalize the samples of `x_full` by dividing each one by the max of its absolute values.<br>
There are two ways to do this:

a/ Version with an **explicit loop** to browse through all the samples of `x_full`:

In [None]:
for i in range(len(x_full)):
    x_full[i] = x_full[i]/np.abs(x_full[i]).max()

Let's take a look now at the sample of rank 10 in `x_full`:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

b/ the numpy **vectorized** style (more efficient):

In [None]:
x_full = x_full/np.abs(x_full).max(axis=1, keepdims=True)

Let's take a look now at the sample of rank 10 in `x_full`:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

$\leadsto$ the values of `x_full` are now all in the range [-1, 1].
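
To convince yourself that versions a/ and b/ are equivalent, here is a small check on a random toy array (`x_toy` is a hypothetical stand-in for `x_full`, not the real data):

```python
import numpy as np

rng = np.random.default_rng(1)
x_toy = rng.normal(scale=3.0, size=(6, 8))   # 6 hypothetical samples of 8 points

# a/ explicit loop over the samples
x_a = x_toy.copy()
for i in range(len(x_a)):
    x_a[i] = x_a[i] / np.abs(x_a[i]).max()

# b/ vectorized: keepdims=True keeps a (6, 1) column of maxima,
#    which broadcasts against the (6, 8) array
x_b = x_toy / np.abs(x_toy).max(axis=1, keepdims=True)

same = np.allclose(x_a, x_b)
peak = np.abs(x_b).max(axis=1)   # per-sample peak amplitude after normalization
print(same, peak)
```

Each sample should now have a peak amplitude of 1. Note that a sample made only of zeros would lead to a division by zero; this sketch assumes that no such sample exists.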

## 4.3 Split the full dataset into train and test datasets

Thanks to the `train_test_split` function of the `sklearn.model_selection` module we can split the `x_full` and `y_full` ndarrays into a _train_ dataset (for the training) and a _test_ dataset (for testing).<br>
Samples and labels are selected randomly, while preserving the proportion of each of the 10 classes found in the original dataset (this is the purpose of the `stratify` argument of the `train_test_split` function):

In [None]:
from sklearn.model_selection import train_test_split

x_train, x_test, lab_train, lab_test = train_test_split(x_full, y_full, 
                                                        stratify=y_full,      # use y_full to evenly distribute all classes 
                                                                              # in the train and test datasets
                                                        test_size=0.33,       # 33 % test, 67 % train 
                                                        random_state=SEED, 
                                                        shuffle=True)         # shuffle the data randomly
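
The effect of `stratify` is easy to see on a toy example. The sketch below uses hypothetical balanced labels (3 classes of 30 samples, names suffixed `_toy`/`_tr`/`_te` to avoid clashing with the notebook variables) and counts the classes on each side of the split:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy dataset: 3 classes with 30 samples each
y_toy = np.repeat(np.arange(3), 30)
x_toy = np.arange(len(y_toy), dtype=float).reshape(-1, 1)

x_tr, x_te, y_tr, y_te = train_test_split(
    x_toy, y_toy,
    stratify=y_toy,      # keep the class proportions in both subsets
    test_size=0.33,
    random_state=1234,
    shuffle=True)

train_counts = np.bincount(y_tr)   # number of samples per class in the train set
test_counts = np.bincount(y_te)    # number of samples per class in the test set
print("train counts per class:", train_counts)
print("test  counts per class:", test_counts)
```

With a balanced dataset, every class should appear the same number of times in the train set and the same number of times in the test set.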

## 4.4 Transform labels to *one-hot* format

In [None]:
from tensorflow.keras.utils import to_categorical
# one-hot encoding of the labels:
y_train = to_categorical(lab_train)
y_test  = to_categorical(lab_test)
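
As a reminder of what the *one-hot* format looks like, here is a numpy-only sketch. It is not the implementation of `to_categorical`, just an equivalent construction on hypothetical labels (`labels_toy`, `n_classes_toy`):

```python
import numpy as np

labels_toy = np.array([0, 2, 1, 2], dtype='uint8')   # hypothetical integer labels
n_classes_toy = 3

# Row k of the identity matrix is the one-hot vector of class k:
one_hot_toy = np.eye(n_classes_toy, dtype='float32')[labels_toy]
print(one_hot_toy)

# Going back from one-hot rows to integer labels:
decoded_toy = one_hot_toy.argmax(axis=1)
print(decoded_toy)
```

Each row contains a single 1 in the column of its class, so the number of columns of `y_train` and `y_test` should equal the number of health conditions `H`.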

Just a recap of the shapes of the arrays:

In [None]:
x_train.shape, x_test.shape, lab_train.shape, lab_test.shape, y_train.shape, y_test.shape

## 4.5 $-$ Build the Deep Neural Network

You will build a dense neural network with this structure:

     Input layer : N inputs
     Hidden layer 'h1' : N    neurons, activation function: relu
     Hidden layer 'h2' : 600  neurons, activation function: relu
     Hidden layer 'h3' : 200  neurons, activation function: relu
     Hidden layer 'h4' : 100  neurons, activation function: relu
     Output layer      :  H   neurons, activation function: softmax

In [None]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# set the seed of the random generators used by tensorflow:
SEED = 1234
tf.random.set_seed(SEED)
tf.config.experimental.enable_op_determinism()

# Build the neural network layer by layer:
model = Sequential()
...
...
...
...
...
model.compile(loss='categorical_crossentropy', optimizer='adam',  metrics=['accuracy'])

In [None]:
model.summary()
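
The total number of parameters displayed by `model.summary()` can be cross-checked by hand: a dense layer with `n_in` inputs and `n_out` neurons has `(n_in + 1) * n_out` trainable parameters (weights plus biases). The helper `dense_param_count` below is a hypothetical name, and `N_example = 2048` is only an illustrative value: replace it with the actual `N` of your dataset.

```python
def dense_param_count(layer_sizes):
    """Trainable parameters of a stack of Dense layers.

    layer_sizes = [n_inputs, n_h1, n_h2, ..., n_outputs]
    Each layer contributes n_in * n_out weights and n_out biases.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

N_example = 2048   # hypothetical number of data points per sample
H_example = 10     # the 10 health conditions
sizes_example = [N_example, N_example, 600, 200, 100, H_example]
total_example = dense_param_count(sizes_example)
print(f"{total_example:,} trainable parameters")
```

If the value computed with your own `N` differs from the summary, a layer is probably missing or has the wrong size.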

Let's save the structure and the initial weights of the network, so that we can later restore the network to its initial state:

In [None]:
import os

# Check whether the 'models' directory exists (create it if needed):
if not os.path.exists("models"): os.mkdir("models")

# define a unique key:
key = 'CWRU_temporal_init'

# define the path where to store the network data:
path = os.path.join('models', key)

# save the structure and the weights of the current neural network:
model.save(path)

# display the tree beginning at f'./models/{key}':
tree = scan_dir(f"./models/{key}")
print(f'\nFiles written:\n{tree}')    

Now we train the neural network with `x_train` & `y_train` as the labeled dataset and `x_test` & `y_test` as the labeled validation dataset, used at the end of each epoch to measure the network performance.<br>
To limit *over-fitting* we use the `EarlyStopping` callback:

In [None]:
from tensorflow.keras.callbacks import EarlyStopping

callbacks_list = [ 
    EarlyStopping(monitor='val_loss',    
                  patience=2,       
                  restore_best_weights=True,
                  verbose=1)
]

# in case we run the training several times, we restore the network 
# to its initial state so that the training runs can be compared...

# define the key to reload the initial state & structure of the network:
key = 'CWRU_temporal_init'
# define the path to be used:
model_path = os.path.join('models', key)
# load the network structure & initial weights:
model = tf.keras.models.load_model(model_path)

# Deterministic tensorflow training: 
# see https://blog.tensorflow.org/2022/05/whats-new-in-tensorflow-29.html
tf.keras.utils.set_random_seed(SEED)  # sets seeds for base-python, numpy and tf
tf.config.experimental.enable_op_determinism() 

# train the DNN:
hist = model.fit(x_train, y_train,
                 validation_data=(x_test, y_test), 
                 epochs=50, 
                 batch_size=64,                     # number of samples in the batch
                 callbacks = callbacks_list)

from utils.tools import plot_loss_accuracy
plot_loss_accuracy(hist)
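
To see what `patience` means, here is a simplified pure-Python sketch of the stopping rule. It is an approximation written for this notebook, not the Keras source code: it ignores options such as `min_delta` or `baseline`, and the function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return (stopped_epoch, best_epoch), both 0-based.

    Training stops as soon as the monitored loss has not improved
    for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0                         # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:         # improvement: remember it, reset the counter
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:     # no improvement for `patience` epochs: stop
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch   # never triggered

val_losses_toy = [1.00, 0.80, 0.70, 0.75, 0.72, 0.60]   # hypothetical values
stopped_toy, best_toy = early_stopping_epoch(val_losses_toy, patience=2)
print(f"stopped at epoch {stopped_toy}, best epoch {best_toy}")
```

In this toy sequence the training stops before reaching the last (better) value: a small `patience` can end the training too early, which is why it is worth trying several values. With `restore_best_weights=True`, the weights kept are those of the best epoch, not those of the last one.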

Now we compute the trained network predictions for the test dataset:

In [None]:
results = model.predict(x_test)          # results is an array of probability vectors
inferences = results.argmax(axis=-1)     # index of the highest probability = predicted class

And we can plot the **confusion matrix** to see whether the network is well trained or not:

In [None]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
fig = plt.figure(figsize=(6,6))
axis = plt.axes()
ConfusionMatrixDisplay.from_predictions(lab_test, inferences, 
                                        ax=axis,
                                        display_labels=health_cond, 
                                        xticks_rotation='vertical',
                                        colorbar=False);
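
To read the confusion matrix correctly, it helps to rebuild one by hand on a tiny example. The sketch below uses hypothetical probability vectors and labels (`probs_toy`, `true_toy`); rows are the true classes and columns the predicted classes, as in the plot above:

```python
import numpy as np

# Hypothetical network outputs: 5 samples, 3 classes (each row sums to 1)
probs_toy = np.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.3, 0.4, 0.3],
                      [0.2, 0.2, 0.6],
                      [0.6, 0.3, 0.1]])
true_toy = np.array([0, 1, 2, 2, 0])      # hypothetical true labels

# Predicted class = index of the highest probability in each row
pred_toy = probs_toy.argmax(axis=-1)

# Confusion matrix: cm_toy[i, j] counts the samples of true class i predicted as class j
n_classes_toy = probs_toy.shape[1]
cm_toy = np.zeros((n_classes_toy, n_classes_toy), dtype=int)
np.add.at(cm_toy, (true_toy, pred_toy), 1)

# Per-class recall: fraction of each true class that is correctly predicted
recall_toy = cm_toy.diagonal() / cm_toy.sum(axis=1)
print(cm_toy)
print(recall_toy)
```

A well-trained network gives a matrix that is (almost) diagonal; off-diagonal counts show which health conditions are confused with each other.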

Not all the bearing faults are classified with a good score...<br>
$\leadsto$ the next step is to train the network with the spectral datasets computed from the temporal samples, to see whether the results improve.

# 5 $-$ The mini project: Train the neural network with the spectrum data set

## 5.1 $-$ Compute the spectral datasets

See **3.1 − Compute the spectral datasets** in the notebook *2-process_CWRU_data.ipynb* for help with this step.

In [None]:
H, S, N = A.shape
print(f"array A has <{S}> samples of <{N}> data point for each of the <{H}> health conditions ")

The spectra are computed with [numpy.fft.rfft](https://numpy.org/doc/stable/reference/generated/numpy.fft.rfft.html)<br>
On the web page, you can see how to compute the size of the spectrum:

In [None]:
if N % 2 == 0:                  # the parity of the number of data points N matters, not S
    N_spectrum = N//2 + 1
else:
    N_spectrum = (N + 1)//2
print(f"size of spectra: {N_spectrum}")    
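
The size rule can be checked directly with `np.fft.rfft` on dummy signals. This short sketch (with the hypothetical name `rfft_sizes`) tries an even and an odd number of points:

```python
import numpy as np

# Length of the rfft output for an even (8) and an odd (9) number of points
rfft_sizes = {n: len(np.fft.rfft(np.zeros(n))) for n in (8, 9)}

for n, size in rfft_sizes.items():
    # both the even and the odd formula reduce to n//2 + 1
    print(f"n = {n}: spectrum size = {size}, n//2 + 1 = {n//2 + 1}")
```

In both cases the spectrum size equals `n//2 + 1` in integer arithmetic.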

Now you must define and dimension 3 ndarrays of `floats` to store the spectra of the 3 temporal data arrays.<br>
For the dimensions, you must use `H`, `S` and `N_spectrum`:

In [None]:
A_spectrum = ..........
B_spectrum = ..........
C_spectrum = ..........

Now you can compute the normalized spectra with the `np.fft.rfft` function and fill in the 3 arrays:

In [None]:
from numpy.fft import rfft
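
As a warm-up, here is what a normalized spectrum looks like for a single synthetic signal. This is only a sketch: the normalization used here (magnitude divided by its maximum, mirroring what was done for the temporal samples) is an assumption, so check it against the notebook *2-process_CWRU_data.ipynb*; all the names (`signal_toy`, `k0`, ...) are hypothetical.

```python
import numpy as np
from numpy.fft import rfft

n_points = 256                             # hypothetical number of data points
k0 = 20                                    # number of sine periods in the window
t = np.arange(n_points)
signal_toy = np.sin(2*np.pi*k0*t/n_points)

amplitude_toy = np.abs(rfft(signal_toy))             # magnitude of the complex spectrum
spectrum_toy = amplitude_toy / amplitude_toy.max()   # assumed normalization: peak value = 1

print(spectrum_toy.shape, spectrum_toy.argmax())
```

The spectrum has `n_points//2 + 1` points and its peak sits at the index `k0`. Applied to a 2-D array, `rfft` transforms the last axis by default, so a whole set of samples can be processed in one call.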



## 5.2 $-$ Prepare the data set for supervised learning

As we saw previously, we can keep only the first 400 spectral points in each sample, so you must define `x_full` and `y_full` with the appropriate dimensions:

In [None]:
N_spectrum = 400

x_full = ..............
y_full = ..............

Check:

In [None]:
x_full.shape, y_full.shape

Then you can fill `x_full` and `y_full` appropriately (see **4.2 − Prepare the labeled dataset for the training** to get some help):

Let's have a quick look at the spectrum of rank 10:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

## 5.3 Split the full dataset into train and test datasets

See **4.3 Split the full dataset into train and test datasets** to get some help:

In [None]:
from sklearn.model_selection import train_test_split

x_train, x_test, lab_train, lab_test = ................    

## 5.4 Transform labels to *one-hot* format

See **4.4 Transform labels to one-hot format** to get some help:

In [None]:
from tensorflow.keras.utils import to_categorical
# one-hot encoding of the labels:
y_train = ....................
y_test  = ....................

Check:

In [None]:
x_train.shape, x_test.shape, lab_train.shape, lab_test.shape, y_train.shape, y_test.shape

## 5.5 $-$ Build the Neural Network for the spectral dataset

Define `modelS`, the NN for the training with the spectral dataset (same as previously, except for the dimension of the input layer...):

In [None]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# set the seed of the random generators used by tensorflow:
SEED = 1234
tf.random.set_seed(SEED)
tf.config.experimental.enable_op_determinism()

# Build the neural network layer by layer (note the new name: 'modelS'):
modelS = Sequential()
...
...
...
...
...
modelS.compile(loss='categorical_crossentropy', optimizer='adam',  metrics=['accuracy'])

In [None]:
modelS.summary()

We save the structure and the weights of the initial state of `modelS`:

In [None]:
import os

# Check whether the 'models' directory exists (create it if needed):
if not os.path.exists("models"): os.mkdir("models")

# define a unique key:
key = 'CWRU_spectral_init'

# define the path where to store the network data:
path = os.path.join('models', key)

# save the structure and the weights of the current neural network:
modelS.save(path)

# display the tree beginning at f'./models/{key}':
tree = scan_dir(f"./models/{key}")
print(f'\nFiles written:\n{tree}') 

Train the `modelS` network:

You can try the `EarlyStopping` callback on `val_loss` with a patience of 1, 2, 3... to get the best scores in the confusion matrix.

Now we compute the trained network predictions for the test dataset:

And we can plot the **confusion matrix** to see whether the network is well trained or not: