<span style="font-size:10pt">AI @ ENSPIMA_2022-2023_v1.0_Jean-Luc CHARLES (Jean-Luc.charles@ensam.eu)_CC BY-SA 4.0</span>

# Problem-based learning
# Training a neural network to diagnose bearing faults - part 3 / 3

### Targeted learning objectives:
Part 1:<br>
- Know how to load files in *Matlab MAT-file* format with *Python*.
- Know how to dimension and fill numpy ndarrays with the data of the `.mat` files
- Know how to display a grid of data plots
- Know how to store the numpy ndarrays in a `.npz` file

Part 2:<br>
- Know how to load a `.npz` into numpy ndarrays
- Know how to process the temporal dataset to get a spectral dataset.
- Know how to display a grid of spectra plots.

Part 3:<br>
- Know how to train/operate a DNN to diagnose bearing faults using a labeled temporal dataset.
- The problem part of the PBL: Know how to train/operate a DNN to diagnose bearing faults using a labeled spectral dataset.

<br>
<div class="alert alert-block alert-danger">
<span style="color:brown;font-family:arial;font-size:12pt"> 
It is important to use a <span style="font-weight:bold;">Python Virtual Environment</span> (PVE) for your main Python projects: <br>
    a PVE makes it possible to control for each project the versions of the Python interpreter and the "sensitive" modules (like tensorflow).<br><br>
    All the notebooks you work on must be loaded into a jupyter-notebook or a jupyter-lab launched in the PVE 
    <b><span style="color: rgb(100, 151, 202);" >pyml-pm</span></b> specially created for the session.<br>
</span></div>
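
A quick way to see which interpreter a notebook is using is to print its path. The sketch below is only a heuristic: it assumes that the directory of the environment is named after the PVE (`pyml-pm`), which depends on how the environment was created.

```python
import sys

ENV_NAME = "pyml-pm"   # name of the PVE quoted above

print(f"interpreter: {sys.executable}")
print(f"prefix     : {sys.prefix}")

# Heuristic only: assumes the environment directory is named after the PVE.
in_expected_env = ENV_NAME in sys.prefix
print(f"running inside '{ENV_NAME}': {in_expected_env}")
```

If the last line prints `False`, the notebook server was probably launched outside the PVE.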

In [None]:
import os, sys
# Suppress the (numerous) warning messages from the **tensorflow** module:
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

print(f"Python    : {sys.version.split()[0]}")
print(f"tensorflow: {tf.__version__} including keras {keras.__version__}")
print(f"numpy     : {np.__version__}")

In [None]:
# set the seed of the random generators used by tensorflow:
SEED = 1234

Reminder: The bearing dataset was obtained under the following experimental conditions:
- under normal condition (N)
- with outer race fault (OF)
- with inner race fault (IF)
- with roller fault (RF).

|class label|Fault type|Fault diameter (mm)|
|:---------:|:--------:|-------------:|
| 1         | N        | 0            |
| 2         | RF       | 0.18         |
| 3         | RF       | 0.36         |
| 4         | RF       | 0.54         |
| 5         | IF       | 0.18         |
| 6         | IF       | 0.36         |
| 7         | IF       | 0.54         |
| 8         | OF       | 0.18         |
| 9         | OF       | 0.36         |
| 10        | OF       | 0.54         |
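
The table counts the classes from 1, whereas the code in this notebook uses zero-based labels (0 to 9). The following sketch rebuilds the mapping under the assumption that each fault type comes with the same three diameters, in the order RF, IF, OF used for the `health_cond` list; the name `label_table` is illustrative only.

```python
# Zero-based labels as used by the code (0..9); the table counts from 1.
fault_types = ("RF", "IF", "OF")
diameters = (0.18, 0.36, 0.54)

label_table = {0: ("N", 0.0)}          # normal condition: no fault
for t, fault in enumerate(fault_types):
    for d, diam in enumerate(diameters):
        label_table[1 + 3*t + d] = (fault, diam)

for label, (fault, diam) in label_table.items():
    print(f"table class {label + 1:2d} -> code label {label}: {fault:2s} {diam:.2f}")
```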
 

# 4 $-$ A first try to train the neural network with the temporal dataset

## 4.1 $-$ Load the *CWRU* data and define some useful objects

In [None]:
npzfile = np.load('CWRU_dadaset.npz')
A, B, C = npzfile.values()
full_dataset = (A, B, C)
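
Unpacking `npzfile.values()` relies on the order in which the arrays were stored. A more defensive variant reads each array by its key. The sketch below demonstrates this on a small temporary archive; the keys `A`, `B`, `C` and the file name `demo_dataset.npz` are assumptions for the demo, since the actual keys of the dataset file are not shown in this notebook (they can be listed with `npzfile.files`).

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-ins for the three load-case arrays (tiny shapes for the demo).
arrays = {name: np.full((2, 3, 4), fill) for fill, name in enumerate("ABC")}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo_dataset.npz")
    np.savez(path, **arrays)            # each keyword becomes a key in the archive

    with np.load(path) as npz:          # closes the underlying file on exit
        print(npz.files)                # names of the stored arrays
        A, B, C = (npz[name] for name in ("A", "B", "C"))

print(A.shape, B[0, 0, 0], C[0, 0, 0])
```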

The dimensions of A are (#health_conditions, #samples, #data_points).

Let's define:
- `H` $\leadsto$ the total number of health conditions
- `S` $\leadsto$ the total number of samples per health condition
- `N` $\leadsto$ the total number of data points per sample
- `L` $\leadsto$ the total number of load cases.

In [None]:
H, S, N = A.shape
L = len(full_dataset)

Let's define the list of the health conditions:

In [None]:
# create the list of the health condition labels:
health_cond = ['N']
for def_type in 'RF', 'IF', 'OF':
    for size in '18', '36', '54':
        health_cond.append(f"{def_type}.{size}")
print(f"list of {len(health_cond)} health conditions:", health_cond)

## 4.2 $-$ Prepare the labeled data for the training

We have two actions to do:
- merge the 200 samples for each of the 10 health conditions and each of the 3 load cases into a single array of 200 $\times$ 10 $\times$ 3 = 6000 samples,
- build the array of the corresponding 6000 labels.

First we define the arrays with the right shapes:

In [None]:
x_full = np.empty((L*H*S, N), dtype='float')  # the array of the samples (uninitialized)
y_full = np.empty((L*H*S,), dtype='uint8')    # the array of the labels (uninitialized)

Let's verify the shape of the arrays:

In [None]:
x_full.shape, y_full.shape

Then we fill `x_full` with samples and `y_full` with corresponding labels:

In [None]:
i = 0
for data in (A, B, C):       # browse the 3 load cases
    for h in range(H):       # browse the 10 health conditions -> the labels
        for s in range(S):   # browse the 200 samples
            x_full[i] = data[h, s]
            y_full[i] = h    # the label is given by the health condition loop variable
            i += 1                                                  
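
The triple loop can also be written in vectorized form. The sketch below compares both versions on small synthetic arrays standing in for the three load cases (the shapes and the random data are assumptions for the demo, not the CWRU data).

```python
import numpy as np

# Small synthetic stand-ins for the three load-case arrays.
H, S, N = 4, 5, 6
rng = np.random.default_rng(0)
A, B, C = (rng.normal(size=(H, S, N)) for _ in range(3))
L = 3

# Reference: the explicit triple loop.
x_loop = np.empty((L*H*S, N))
y_loop = np.empty((L*H*S,), dtype='uint8')
i = 0
for data in (A, B, C):
    for h in range(H):
        for s in range(S):
            x_loop[i] = data[h, s]
            y_loop[i] = h
            i += 1

# Vectorized: flatten the first two axes of each load case, then stack them.
x_vec = np.concatenate([data.reshape(H*S, N) for data in (A, B, C)])
# Each label is repeated S times, and the whole pattern is repeated per load case.
y_vec = np.tile(np.repeat(np.arange(H, dtype='uint8'), S), L)

print(np.array_equal(x_loop, x_vec), np.array_equal(y_loop, y_vec))
```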

To verify, let's plot the sample of rank 10 in `x_full`:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

### Normalization of the temporal samples

We normalize the samples of `x_full` by dividing each one by the max of its absolute values.<br>
There are two ways to do this:

a/ Version with an **explicit loop** to browse through all the samples of `x_full`:

In [None]:
for i in range(len(x_full)):
    x_full[i] = x_full[i]/np.abs(x_full[i]).max()

Let's take a look now at the sample of rank 10 in `x_full`:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

b/ the numpy **vectorized** style (more efficient):

In [None]:
x_full = x_full/np.abs(x_full).max(axis=1, keepdims=True)

Let's take a look now at the sample of rank 10 in `x_full`:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10]);

$\leadsto$ the values of `x_full` are now all in the range [-1, 1].
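
The role of `keepdims=True` in the vectorized version can be checked on synthetic data (random values, an assumption for the demo). The per-sample maxima keep a shape of `(n_samples, 1)`, so broadcasting divides each row by its own peak. The sketch also shows that normalizing twice leaves the data unchanged, which is why running both versions a/ and b/ in sequence is harmless.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(scale=5.0, size=(8, 100))      # 8 synthetic samples of 100 points

peak = np.abs(x).max(axis=1, keepdims=True)   # one peak per sample, shape (8, 1)
x_norm = x / peak                             # each row divided by its own peak

print(peak.shape)
print(np.abs(x_norm).max(axis=1))             # peak absolute value of each sample

# Normalizing a second time changes nothing:
x_again = x_norm / np.abs(x_norm).max(axis=1, keepdims=True)
print(np.allclose(x_again, x_norm))
```

Note that a sample made only of zeros would cause a division by zero; this sketch does not handle that case.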

## 4.3 Split the full dataset into train and test datasets

Here we just use the `train_test_split` function from `sklearn.model_selection`:

In [None]:
from sklearn.model_selection import train_test_split

x_train, x_test, lab_train, lab_test = train_test_split(x_full, y_full, 
                                                        stratify=y_full,      # use y_full to evenly distribute all classes 
                                                                              # in the train and test datasets
                                                        test_size=0.5,        # 50% test, 50% train 
                                                        random_state=SEED, 
                                                        shuffle=True)         # shuffle the data randomly
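
The effect of `stratify` can be verified by counting the labels in each subset. The sketch below uses synthetic labels with the same layout as the notebook (10 classes of 600 samples each, an assumption derived from 200 samples times 3 load cases) and dummy features.

```python
import numpy as np
from sklearn.model_selection import train_test_split

y = np.repeat(np.arange(10), 600)      # 10 classes x 600 labels = 6000 labels
x = np.zeros((len(y), 1))              # dummy features, only the labels matter here

x_tr, x_te, y_tr, y_te = train_test_split(x, y,
                                          stratify=y,
                                          test_size=0.5,
                                          random_state=1234,
                                          shuffle=True)

print(np.bincount(y_tr))   # number of training samples per class
print(np.bincount(y_te))   # number of test samples per class
```

With stratification, every class contributes the same number of samples to the train and test subsets.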

## 4.4 Transform labels to *one-hot* format

In [None]:
from tensorflow.keras.utils import to_categorical
# one-hot encoding of the labels:
y_train = to_categorical(lab_train)
y_test  = to_categorical(lab_test)
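
To see what the one-hot format looks like, the encoding can be reproduced with plain numpy. This is a sketch of the idea, not the Keras implementation; the label values are arbitrary examples.

```python
import numpy as np

labels = np.array([0, 3, 9, 1], dtype='uint8')   # example zero-based class labels
num_classes = 10

# Row k of the identity matrix is the one-hot vector of class k.
one_hot = np.eye(num_classes, dtype='float32')[labels]

print(one_hot.shape)
print(one_hot[1])                  # a single 1 at index 3
print(one_hot.argmax(axis=1))      # argmax recovers the original labels
```

When `num_classes` is not given, `to_categorical` infers it as the largest label plus one, which is why the labels must start at 0.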

Just a recap of the shapes of the arrays:

In [None]:
x_train.shape, x_test.shape, lab_train.shape, lab_test.shape, y_train.shape, y_test.shape

## 4.5 $-$ Build the Deep Neural Network

You will build a dense neural network with this structure:

     Input layer       : 1900 inputs
     Hidden layer 'H1' : 1900 neurons, activation function: relu
     Hidden layer 'H2' : 600  neurons, activation function: relu
     Hidden layer 'H3' : 200  neurons, activation function: relu
     Hidden layer 'H4' : 100  neurons, activation function: relu
     Output layer      : 10   neurons, activation function: softmax

In [None]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# set the seed of the random generators used by tensorflow:
SEED = 1234
tf.random.set_seed(SEED)

# build and compile the neural network:
model = Sequential()
model.add(Input(shape=(N,), name='Input'))
model.add(Dense(N, activation='relu', name='H1'))
model.add(Dense(600, activation='relu', name='H2'))
model.add(Dense(200, activation='relu', name='H3'))
model.add(Dense(100, activation='relu', name='H4'))
model.add(Dense(H, activation='softmax', name='Output'))
model.compile(loss='categorical_crossentropy', optimizer='adam',  metrics=['accuracy'])

In [None]:
model.summary()
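
The number of trainable parameters reported by `model.summary()` can be cross-checked by hand: a dense layer with `n_in` inputs and `n_out` neurons has `n_in * n_out` weights plus `n_out` biases. The sketch below assumes `N = 1900` and `H = 10`, as in the structure listed above.

```python
# Layer sizes: input, H1, H2, H3, H4, output (assuming N = 1900 and H = 10).
layer_sizes = [1900, 1900, 600, 200, 100, 10]
layer_names = ["H1", "H2", "H3", "H4", "Output"]

# weights (n_in * n_out) + biases (n_out) for each dense layer
params = [n_in * n_out + n_out
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

for name, p in zip(layer_names, params):
    print(f"{name:6s}: {p:>9,d} parameters")
total = sum(params)
print(f"total : {total:>9,d} parameters")
```

The first hidden layer alone accounts for about three quarters of the parameters.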

Let's save the initial values of the network weights, so that we can later reset the network to its initial state:

In [None]:
# Check whether the folder 'weights' exists and create it if needed:
if not os.path.isdir("weights"): os.mkdir("weights")

# Save the initial DNN (random) weights:
key = 'CWRU_temporal_init'
model.save_weights(os.path.join('weights', key))

# Display the created files:
files=[os.path.join("weights",f) for f in os.listdir("weights") if f.startswith(key)]
for f in files: print(f)

Now we train the neural network with `x_train` & `y_train` as the labeled dataset and `x_test` & `y_test` as the labeled validation dataset used at the end of each epoch to measure the network performance.<br>
To avoid *over-fitting* we use the `EarlyStopping` callback:

In [None]:
from tensorflow.keras.callbacks import EarlyStopping

callbacks_list = [ 
    EarlyStopping(monitor='val_loss',       # watch the validation loss to detect over-fitting
                  patience=3,               # stop after 3 epochs without improvement
                  restore_best_weights=True,
                  verbose=1)
]

# in case we execute this cell several times, we can reset 
# the network to its initial state if we want to compare the training runs...
key = 'CWRU_temporal_init'
model.load_weights(os.path.join('weights', key)) 

# set the seed of the random generators used by tensorflow:
tf.random.set_seed(SEED)

# train the DNN:
hist = model.fit(x_train, y_train,
                 validation_data=(x_test, y_test), 
                 epochs=50, 
                 batch_size=64,                     # number of samples in the batch
                 callbacks = callbacks_list)

from utils.tools import plot_loss_accuracy
plot_loss_accuracy(hist)
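
The `patience` mechanism of early stopping can be illustrated with a few lines of plain Python. This is a simplified sketch of the principle, not the Keras code: the function `early_stop_epoch` and the loss values are hypothetical, and options such as `min_delta` are ignored.

```python
def early_stop_epoch(losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based, for a list of per-epoch losses."""
    best, best_epoch, wait = float("inf"), -1, 0
    for epoch, loss in enumerate(losses):
        if loss < best:
            # improvement: remember it and reset the waiting counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # no improvement during 'patience' consecutive epochs: stop
                return epoch, best_epoch
    return len(losses) - 1, best_epoch   # the training ran to the last epoch

monitored = [0.9, 0.6, 0.5, 0.52, 0.51, 0.55, 0.4]   # made-up monitored losses
stop, best = early_stop_epoch(monitored, patience=3)
print(f"stopped at epoch {stop}, best epoch was {best}")
```

With `restore_best_weights=True`, the network is put back in the state of the best epoch once the training stops.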

Now we compute the trained network predictions for the test dataset:

In [None]:
results = model.predict(x_test)          # results is an array of probability vectors
inferences = results.argmax(axis=-1)     # index of the highest probability = predicted class label

And we can plot the **confusion matrix** to see whether the network is well trained or not:

In [None]:
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
fig = plt.figure(figsize=(8,8))
axis = plt.axes()
ConfusionMatrixDisplay.from_predictions(lab_test, inferences, 
                                        ax=axis,
                                        display_labels=health_cond, 
                                        xticks_rotation='vertical',
                                        colorbar=False);
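
Beyond the plot, the confusion matrix also gives per-class scores. The sketch below uses small made-up label arrays (3 classes, not the notebook results) to show how the recall of each class and the overall accuracy follow from the matrix.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up true labels and predictions for 3 classes.
true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
pred = np.array([0, 0, 1, 1, 1, 1, 2, 0, 2])

cm = confusion_matrix(true, pred)          # rows: true classes, columns: predictions
recall = cm.diagonal() / cm.sum(axis=1)    # fraction of each true class correctly predicted
accuracy = cm.trace() / cm.sum()           # fraction of all samples correctly predicted

print(cm)
print("recall per class:", recall.round(3))
print("accuracy:", round(accuracy, 3))
```

The same two lines applied to `confusion_matrix(lab_test, inferences)` would show which health conditions are poorly recognized.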

Not all the bearing faults are recognized with a good score...<br>
$\leadsto$ the next step is to train the network with the spectral datasets computed from the temporal samples, to see whether the results improve.

# 5 $-$ Finally the problem: Train the neural network with the Fourier spectrum of the temporal data set

## 5.1 $-$ Compute the spectral datasets

See **3.1 − Compute the spectral datasets** in the notebook *2-process_CWRU_data.ipynb* to help you do the work.

In [None]:
H, S, N = A.shape
print(f"array A has <{S}> samples of <{N}> data points for each of the <{H}> health conditions ")

The spectra are computed with [numpy.fft.rfft](https://numpy.org/doc/stable/reference/generated/numpy.fft.rfft.html)<br>
On the web page, you can see how to compute the size of the spectrum:

In [None]:
if N % 2 == 0:
    N_spectrum = int(N/2+1)
else:
    N_spectrum = int((N+1)/2)
print(f"size of spectra: {N_spectrum}")    
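
The size of the spectrum depends on the number of data points per sample, and it can be checked directly on arrays of zeros. The sketch also shows that the integer expression `n//2 + 1` covers both the even and the odd case.

```python
import numpy as np

sizes = {}
for n in (8, 9, 1900):                       # even, odd, and a 1900-point sample
    spectrum = np.fft.rfft(np.zeros(n))      # real-input FFT of an n-point signal
    sizes[n] = len(spectrum)
    print(f"n = {n:4d} -> {len(spectrum):3d} spectral points (n//2 + 1 = {n//2 + 1})")
```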

Now you must define and dimension 3 ndarrays of `floats` to store the spectra of the 3 temporal data arrays.<br>
For the dimensions, you must use `H`, `S` and `N_spectrum`:

In [None]:
A_spectrum = 
B_spectrum = 
C_spectrum = 

Now you can compute the spectra with the `np.fft.rfft` function and fill in the 3 arrays:
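
As a hint, `np.fft.rfft` returns complex values and can process a whole 3-D array at once along its last axis; taking the modulus gives arrays of floats. The sketch below illustrates this on a synthetic batch of sine waves (the dimensions and the signal are assumptions for the demo, not the CWRU data).

```python
import numpy as np

H, S, N = 2, 3, 64                  # small stand-in dimensions
t = np.arange(N)
k = 5                               # the sine completes 5 cycles over the N points
signals = np.broadcast_to(np.sin(2*np.pi*k*t/N), (H, S, N))

# FFT along the last axis (the data points), then modulus -> real-valued spectra
spectra = np.abs(np.fft.rfft(signals, axis=-1))

print(spectra.shape)                # (H, S, N//2 + 1)
print(spectra[0, 0].argmax())       # index of the dominant spectral line
```

The dominant line sits at index `k`, the number of cycles contained in the sample.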

## 5.2 $-$ Prepare the data set for supervised learning

As we saw previously, we can keep only the first 400 spectral points in each sample, so you must define `x_full` and `y_full` with the appropriate dimensions:

In [None]:
N_spectrum = 400



Check:

In [None]:
x_full.shape, y_full.shape

Then you can fill `x_full` and `y_full` appropriately (see **4.2 − Prepare the labeled data for the training** to get some help):

Have a look at the spectrum of rank 10:

In [None]:
plt.figure(figsize=(8,5))
plt.plot(x_full[10])

## 5.3 Split the full dataset into train and test datasets

See **4.3 Split the full dataset into train and test datasets** to get some help:

## 5.4 Transform labels to *one-hot* format

See **4.4 Transform labels to one-hot format** to get some help:

Check:

In [None]:
x_train.shape, x_test.shape, lab_train.shape, lab_test.shape, y_train.shape, y_test.shape

## 5.5 $-$ Build the Neural Network

In [None]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

# set the seed of the random generators used by tensorflow:
SEED = 1234
tf.random.set_seed(SEED)

# the 5 lines to build the neural network:
modelS = Sequential()
......
......
......
modelS.compile(loss='categorical_crossentropy', optimizer='adam',  metrics=['accuracy'])

In [None]:
modelS.summary()

In [None]:
# Check whether the folder 'weights' exists and create it if needed:
if not os.path.isdir("weights"): os.mkdir("weights")

# Save the initial DNN (random) weights:
key = 'CWRU_spectral_init'
modelS.save_weights(os.path.join('weights', key))

# Display the created files:
files=[os.path.join("weights",f) for f in os.listdir("weights") if f.startswith(key)]
for f in files: print(f)

Train the `modelS` network, then save the trained model:

In [None]:
# Check whether the folder 'models' exists and create it if needed:
if not os.path.exists("models"): os.mkdir("models")

# save the trained DNN structure + weights:
key = 'CWRU_spectral_init'
modelS.save(os.path.join('models', key) )

# Display the created files:
files=[os.path.join("models",f) for f in os.listdir("models") if f.startswith(key)]
for f in files: print(f)

Now we compute the trained network predictions for the test dataset:

And we can plot the **confusion matrix** to see whether the network is well trained or not: