<span style="font-size:10pt">AI @ ENSPIMA_2023-2024 / v1.2 september 2023 / Jean-Luc CHARLES (Jean-Luc.charles@mailo.com) / CC BY-SA 4.0 /</span>

<div style="color:brown;font-family:arial;font-size:26pt;font-weight:bold;text-align:center"> 
Machine learning with Python tensorflow2/keras modules</div><br>
<hr>
<div style="color:blue;font-family:arial;font-size:22pt;font-weight:bold;text-align:center"> 
Training a Dense Neural Network to classify handwritten digits<br><br>
Load the MNIST database</div>
<hr>
Expected duration: 20 minutes

<div class="alert alert-block alert-danger">
<span style="color:brown;font-family:arial;font-size:12pt"> 
It is important to use a <span style="font-weight:bold;">Python Virtual Environment</span> (PVE) for your Python projects: a PVE lets you control, for each project, the versions of the Python interpreter and of "sensitive" modules (such as tensorflow).</span></div>

All the notebooks must be opened in a `jupyter notebook` or `jupyter lab` session launched within the <b><span style="color: rgb(200, 151, 102);" >pyml-pm</span></b> PVE specially created for the session.<br>
They should be worked through in this order:
- `ML1_MNIST.ipynb`: check that the <b><span style="color: rgb(200, 151, 102);">pyml-pm</span></b> PVE is fully operational, then load and use the data from the MNIST database (images and labels).
- `ML2_DNN_part1.ipynb`: build a Dense Neural Network (DNN), train it with MNIST data and evaluate its performance.
- `ML2_DNN_part2.ipynb`: reload a previously trained DNN and evaluate its performance with the MNIST test data.
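
To confirm that the notebook kernel really runs inside a virtual environment, a quick check along the following lines can help. This is a minimal sketch: the helper name `describe_environment` is hypothetical, and it assumes the PVE was created either with `venv`/`virtualenv` (detected through `sys.prefix`) or with conda (detected through the `CONDA_DEFAULT_ENV` environment variable).

```python
import os
import sys

def describe_environment():
    """Return a dict describing the interpreter used by the notebook kernel."""
    return {
        "executable": sys.executable,
        # True inside an environment created with venv or virtualenv
        "in_venv": sys.prefix != sys.base_prefix,
        # Name of the active conda environment, or None outside conda
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }

info = describe_environment()
for key, value in info.items():
    print(f"{key:10s}: {value}")
```

When the kernel runs in the `pyml-pm` PVE, the path printed for `executable` should point inside that environment's directory.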

## Part-1 targeted learning objectives:
- Know how to launch a notebook in a dedicated Python Virtual Environment
- Know how to load data from the MNIST database (images and labels)
- Know how to view MNIST images and check the associated labels.

## 1 - Verify importing Python modules
The **keras** module, which allows high-level manipulation of **tensorflow** objects, has been integrated into the **tensorflow** (tf) module since version 2. <br>
The **tf.keras** module documentation is available here: https://www.tensorflow.org/api_docs/python/tf/keras.

Importing the `tensorflow` module in the cell below may generate some warning messages.<br>
If errors appear, they must be fixed before going further, possibly by recreating your <b><span style="color: rgb(200, 51, 102);">pyml-pm</span> PVE</b>:

In [None]:
import os, sys, cv2

# Suppress the (numerous) log messages from the tensorflow module
# (must be set before tensorflow is imported):
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

print(f"Python    : {sys.version.split()[0]}")
print(f"tensorflow: {tf.__version__} including keras {keras.__version__}")
print(f"numpy     : {np.__version__}")
print(f"OpenCV    : {cv2.__version__}")

Embedding matplotlib plots in the notebook:

In [None]:
%matplotlib inline

## 2 - Load the MNIST data (images and labels)

### The MNIST image database

In this practical work we use labeled images from the MNIST database, available on the Internet (http://yann.lecun.com/exdb/mnist/).

The MNIST database contains 70,000 grayscale images of 28 $\times$ 28 pixels representing handwritten digits: each image is a 28 $\times$ 28 matrix of 784 `uint8` values (unsigned 8-bit integers, each encoding a gray level in the interval [0, 255]).<br>
The 70,000 MNIST images are grouped into a set of **60,000 training images** and a set of **10,000 test images**.
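
The array layout described above can be illustrated with NumPy alone. The sketch below uses a small stack of random stand-in images (not real MNIST data), only to show the shapes, the 784 values per image and the `uint8` range.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a few MNIST images: 5 images of 28 x 28 uint8 gray levels
images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

one_image = images[0]
print(images.shape)      # (5, 28, 28): (number of images, rows, columns)
print(one_image.shape)   # (28, 28)
print(one_image.size)    # 784 values per image
print(one_image.dtype)   # uint8
print(np.iinfo(np.uint8).min, np.iinfo(np.uint8).max)   # 0 255
```

The real training and test arrays follow the same layout, with 60,000 and 10,000 images along the first axis.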

<div class="alert alert-block alert-danger">
The performance of a trained network must always be evaluated with a data set different from the training set: this is why the MNIST database provides 10,000 test images <b>different</b> from the 60,000 training images.
</div>
<br>Example of MNIST images:
<p style="text-align:center; font-style:italic; font-size:12px;">
<img src="img/MNIST_digits_sample.png" width="500"><br>
[image credit: JLC]
</p>

Consult the documentation of the `load_data` function on the page [tf.keras.datasets.mnist.load_data](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/mnist/load_data), then complete the cell below to load the MNIST data, naming the returned arrays:<br>
- `im_train`, `lab_train`: the training images and labels,
- `im_test`, `lab_test`: the test images and labels.

(If downloading the MNIST data fails with an _"SSL error..."_ message, see [Python SSL Certification Problems in Tensorflow](https://stackoverflow.com/questions/46858630/python-ssl-certification-problems-in-tensorflow))
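
One possible way to complete the loading cell, following the `load_data` documentation linked above: the function returns two tuples, `(x_train, y_train), (x_test, y_test)`, which are unpacked here directly into the requested names. The first call downloads the data set, so this sketch assumes a working Internet connection.

```python
from tensorflow import keras

# load_data() returns ((train images, train labels), (test images, test labels))
(im_train, lab_train), (im_test, lab_test) = keras.datasets.mnist.load_data()

print(len(im_train), "training images,", len(im_test), "test images")
```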

The cell below displays the `shape` and `dtype` attributes of the resulting numpy arrays: are the values consistent? 

In [None]:
print("im_train.shape :", im_train.shape, ", dtype:", im_train.dtype)
print("lab_train.shape:", lab_train.shape, ", dtype:", lab_train.dtype)
print("im_test.shape  :", im_test.shape, ", dtype:", im_test.dtype)
print("lab_test.shape :", lab_test.shape, ", dtype:", lab_test.dtype)
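
The consistency question can also be answered programmatically. The helper below, `check_mnist_arrays`, is a hypothetical name introduced for illustration; it checks the shapes, the `uint8` type and the label range described earlier. Small stand-in arrays are used so that the sketch runs on its own.

```python
import numpy as np

def check_mnist_arrays(images, labels, n_expected):
    """Raise AssertionError if the arrays do not look like an MNIST subset."""
    assert images.shape == (n_expected, 28, 28), images.shape
    assert labels.shape == (n_expected,), labels.shape
    assert images.dtype == np.uint8 and labels.dtype == np.uint8
    assert labels.max() <= 9, "labels must be digits from 0 to 9"
    return True

# Stand-in arrays (all zeros), used here only to exercise the helper
fake_images = np.zeros((100, 28, 28), dtype=np.uint8)
fake_labels = np.zeros(100, dtype=np.uint8)
print(check_mnist_arrays(fake_images, fake_labels, 100))
```

In the notebook, the equivalent calls would be `check_mnist_arrays(im_train, lab_train, 60000)` and `check_mnist_arrays(im_test, lab_test, 10000)`.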

#### Images and labels visualisation:

#### With the `imshow` function of the `matplotlib.pyplot` module, display the 600th image of the `im_train` array in grayscale.<br>
Tips:
- use `plt.figure(figsize=(2,2))` to set the size of the image,
- use the `cmap='gray'` option of the `imshow` function to display the image in grayscale,
- remove the ticks in X and Y with `plt.axis('off');`.

Check that the associated label in the `lab_train` array matches the digit shown in the image.
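
A possible sketch of this display step, built from the tips above. The function name `show_digit` is hypothetical, and random stand-in arrays replace the MNIST data so that the block runs on its own. It also assumes that "600th image" means index 600; use 599 if you count from 1.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_digit(images, labels, index):
    """Display images[index] in grayscale, with its label as the title."""
    plt.figure(figsize=(2, 2))
    plt.imshow(images[index], cmap='gray')
    plt.title(f"label: {labels[index]}")
    plt.axis('off')
    plt.show()

# Stand-in data so the sketch runs on its own (random pixels, not real digits)
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(601, 28, 28), dtype=np.uint8)
demo_labels = rng.integers(0, 10, size=601, dtype=np.uint8)
show_digit(demo_images, demo_labels, 600)
```

In the notebook, `show_digit(im_train, lab_train, 600)` would show the image together with its label, which makes the visual check immediate.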

Import the `plot_images` function from the module `utils.tools` and display the help on this function:

Using the `plot_images` function, display the training images in a 4 x 6 grid beginning with the 600th image:
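
The `plot_images` helper belongs to the course's `utils.tools` module and its source is not shown here. As a rough stand-in built only on matplotlib, the sketch below draws a grid of consecutive images. The function `show_grid` and its parameters `rows`, `cols` and `start` are names chosen for illustration and are not the signature of `plot_images`.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_grid(images, rows, cols, start=0):
    """Display rows x cols consecutive images, beginning at index `start`."""
    fig, axes = plt.subplots(rows, cols, figsize=(cols, rows))
    for k, ax in enumerate(np.atleast_1d(axes).ravel()):
        ax.imshow(images[start + k], cmap='gray')
        ax.axis('off')
    plt.show()

# Stand-in data (random pixels), only to make the sketch self-contained
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(24, 28, 28), dtype=np.uint8)
show_grid(demo_images, 4, 6)
```

With the MNIST arrays loaded, `show_grid(im_train, 4, 6, start=600)` would display 24 training images starting at index 600.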

In the cell below, we group the training images by label and display one row of 16 images per digit (16 $\times$ '0', then 16 $\times$ '1', and so on) in black on a white background:

In [None]:
R, C = 10, 16                      # 10 rows (one per digit), 16 images per row
data = np.empty((R*C, 28, 28))     # float64 array, filled row by row below
for i in range(R):
    # keep the first C training images whose label is i
    data[i*C:(i+1)*C] = im_train[lab_train == i][:C]

plot_images(data, R, C, reverse=True)
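
The selection step in the cell above relies on NumPy indexing by label. The toy example below (made-up labels and 1 x 1 "images", not MNIST data) shows that `np.where(labels == i)` and the boolean mask `labels == i` select the same items.

```python
import numpy as np

labels = np.array([3, 0, 3, 1, 0, 3])
# Toy "images": one 1 x 1 image per label, whose single pixel stores its own rank
images = np.arange(6).reshape(6, 1, 1)

ranks = np.where(labels == 3)        # tuple holding one array of ranks
print(ranks[0])                      # [0 2 5]

by_where = images[ranks][:2]         # first two images labelled 3
by_mask = images[labels == 3][:2]    # same selection with a boolean mask
print(by_where.ravel())              # [0 2]
print(np.array_equal(by_where, by_mask))   # True
```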

### Further work:
You can now load the `ML2_DNN_part1_en.ipynb` *notebook* to learn how to build a Dense Neural Network and train it to classify MNIST images.