<a href="https://colab.research.google.com/github/Demosthene-OR/Student-AI-and-Data-Management/blob/main/08_Dense_Neural_Network_Tabular_en.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<img src="https://prof.totalenergies.com/wp-content/uploads/2024/09/TotalEnergies_TPA_picto_DegradeRouge_RVB-1024x1024.png" height="150" width="150">


<hr style="border-width:2px;border-color:#75DFC1">
<center><h1>Introduction to Deep Learning with Keras</h1></center>
<center><h2>Predictions using Dense Neural Networks on tabular data</h2></center>
<hr style="border-width:2px;border-color:#75DFC1">

## Context and Objective

>As mentioned previously, a single perceptron is of limited practical use, because it can only separate classes with a linear boundary. A multilayer perceptron is needed to obtain interesting results. The following GIF illustrates this:
>
><img src="https://assets-datascientest.s3-eu-west-1.amazonaws.com/notebooks/masterclass_deeplearning_debutant_intro_dense.gif" style="width:400px">
>
>**We will now look at an example of a multilayer perceptron model on the well-known IRIS dataset.** As a reminder, this dataset is one of the most widely used for introductory classification projects. It contains measurements for three species of iris, each described by 50 observations. The objective is therefore to recognize the species of an iris flower from characteristic measurements grouped together as **tabular** data.

* **(a)** Run the following cell to import the necessary packages:



In [None]:
import numpy as np # For array manipulation

import pandas as pd # For working with pandas DataFrames

import matplotlib.pyplot as plt # For displaying figures
from matplotlib import cm # For additional colormaps
%matplotlib inline

from tensorflow.keras.layers import Input, Dense # To instantiate Dense and Input layers
from tensorflow.keras.models import Model



## Data loading and preprocessing

* **(b)** Read the **Iris.csv** file into a dataframe df, specify the **Id** column to contain the indexes, and display the first 5 rows.


In [None]:
## Insert your code here


In [None]:
url = "https://raw.githubusercontent.com/Demosthene-OR/Student-AI-and-Data-Management/main/data/"
df=pd.read_csv(url+'Iris.csv', index_col = 'Id')
df.head()



* **(c)** Separate the explanatory variables from the target variable.
* **(d)** Encode the target variable, **`'Species'`**, according to each species.



In [None]:
# Insert your code here



* **(e)** Separate the data into a training set and a test set. The test set will account for **one-third** of the dataset.



In [None]:
## Insert your code here



## Multi-Layer Perceptron

> The multi-layer perceptron model is a **sequence of perceptron layers** whose input is the output of the **previous layer**.
>
> Consider a 3-layer perceptron model. For a given input vector $x$, the output of the first layer is:
>
> $$ H_1 = \mathrm{Layer_1}(x) $$ 
>
> Next, the vector $H_1$ is fed into the second layer: 
>
> $$ H_2 = \mathrm{Layer_2}(H_1) $$
>
> Finally, the vector $H_2$ is fed into the third layer to obtain the final output of the model:
>
> $$ O = \mathrm{Layer_3}(H_2) $$
>
> <img src = "https://assets-datascientest.s3-eu-west-1.amazonaws.com/notebooks/perceptron3.png">
>
> Multilayer perceptrons can be constructed **sequentially** by stacking dense layers one after the other. You will sometimes encounter this notation in Keras, although in this module we have decided to use the **functional** construction, which is more versatile and is used more often when writing complex models that may have a non-linear structure or take different inputs.
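
The chain $H_1 \rightarrow H_2 \rightarrow O$ described above can be sketched in plain NumPy. This is only an illustration with random weights: the layer sizes, the `dense` and `softmax` helpers, and the choice of `tanh` for the hidden layers are assumptions made for the example, not values taken from this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(h, W, b, activation):
    """One dense layer: affine map followed by an activation."""
    return activation(h @ W + b)

def softmax(z):
    """Turn scores into probabilities that sum to 1 along the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical sizes: 4 input features, hidden widths 5 and 3, 3 output classes
sizes = [4, 5, 3, 3]
weights = [rng.normal(size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

x = rng.normal(size=(1, 4))                        # one input vector
H1 = dense(x, weights[0], biases[0], np.tanh)      # H1 = Layer_1(x)
H2 = dense(H1, weights[1], biases[1], np.tanh)     # H2 = Layer_2(H1)
O = dense(H2, weights[2], biases[2], softmax)      # O  = Layer_3(H2)
```

Each layer only needs the output of the previous one, which is exactly what the functional construction expresses in Keras.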

## Building and Training a Model




> We will build our model by adding layers one by one, from the input layer to the output layer.
>
> Building a model with **Keras** is very easy with the following steps:
>
>
>* **Step 1**: Import the `Input` and `Dense` classes from the `tensorflow.keras.layers` submodule and the `Model` class from the `tensorflow.keras.models` submodule
>
> ```python
> from tensorflow.keras... import ...
>```
>
> * **Step 2**: Instantiate an input layer that contains the dimensions of our input data.
>
> ```python
> inputs = Input(shape = (..), name = "Input")
>```
>
> * **Step 3**: Instantiate the layers that will make up the model with their constructor. To instantiate a dense layer, use the `Dense` constructor, which we have imported.
>
>```python
> dense1 = Dense(units = ..., activation = "...", name = "Layer_1")
>```
>
> * **Step 4**: Apply the layers one by one (**functional** construction).
>
>```python
> x = dense1(inputs)
> x = dense2(x)
> ...
>```
>
> Instantiating layers involves several nuances:
>
> * The first layer we are going to add to the model must be instantiated **by specifying the dimensions of the input vector** with the **shape** parameter. This specification is not necessary for subsequent layers.
>
>
> * The **number of neurons** in a layer is defined with the **units** parameter.
>
>
> * To add an activation function to a neuron layer, you can either instantiate an activation layer and then add it to the model, or define the activation function in the **activation** parameter of the layer constructor.
>
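
As a rough illustration of the four steps, here is a minimal sketch on toy dimensions. The sizes (2 features, 4 hidden units, 2 classes) and every variable and layer name are assumptions chosen for the example; they are deliberately different from the values used in the exercise below. The sketch also shows the two ways of adding an activation function mentioned above.

```python
from tensorflow.keras.layers import Input, Dense, Activation
from tensorflow.keras.models import Model

# Toy dimensions chosen for illustration only (not the exercise values)
n_features, n_classes = 2, 2

# Step 2: the input layer carries the shape of one observation
toy_inputs = Input(shape=(n_features,), name="Input")

# Step 3: instantiate the layers, with the activation passed to the constructor...
hidden = Dense(units=4, activation="tanh", name="Hidden")
# ...or left out and added afterwards as a separate Activation layer
scores = Dense(units=n_classes, name="Scores")
softmax = Activation("softmax", name="Softmax")

# Step 4: apply the layers one by one (functional construction)
x = hidden(toy_inputs)
x = scores(x)
toy_outputs = softmax(x)

# Tie the input and output tensors together into a model
toy_model = Model(inputs=toy_inputs, outputs=toy_outputs)
```

The hidden layer holds (2 + 1) × 4 = 12 parameters and the output layer (4 + 1) × 2 = 10, so `toy_model.count_params()` should report 22.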

* **(a)** Instantiate an input layer named **`inputs`**, with the number of explanatory variables in the model as its dimension.


* **(b)** Instantiate a dense layer called **`dense1`** with 10 neurons. This layer will have the `tanh` function as its activation function.


* **(c)** Instantiate a second dense layer called **`dense2`** with 8 neurons. The layer will have the `tanh` function as its activation function.


* **(d)** Instantiate a third dense layer called **`dense3`** with 6 neurons. The layer will have the `tanh` activation function.


* **(e)** Instantiate a fourth dense layer called **`dense4`** with 3 neurons. The layer will have the `softmax` activation function.



In [None]:
## Insert your code here


In [None]:
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model





* **(f)** Since we are using a **functional** construction, we must apply the different layers of the model one by one. The first layer takes `inputs` as its input. Store the result of applying the last Dense layer in a variable named `outputs`.



In [None]:
## Insert your code here



> The following commands allow you to finalize the definition of the model and display its structure.



In [None]:
model = Model(inputs = inputs, outputs = outputs)
model.summary()



* **(g)** Compile the model with the loss function **`"sparse_categorical_crossentropy"`**, which is suitable for multi-class classification. Set the optimizer to **`"adam"`** and the metrics to **`["accuracy"]`**.
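
To make the choice of loss more concrete, here is a minimal NumPy sketch of the quantity that sparse categorical cross-entropy measures: the average negative log-probability assigned to the true class. It is not the Keras implementation, and the probabilities and labels are made-up values.

```python
import numpy as np

# Made-up predicted probabilities for 3 samples and 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.3, 0.5]])

# Integer-encoded true classes (no one-hot encoding needed)
y_true = np.array([0, 1, 2])

# Pick the probability assigned to the true class of each sample
p_true = probs[np.arange(len(y_true)), y_true]

# Average negative log-likelihood over the samples
loss = -np.log(p_true).mean()
```

The loss gets smaller as the probability given to the correct class approaches 1.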



In [None]:
## Insert your code here



> We can finally train our model on the data. This is done in the same style as *scikit-learn*:
>
> ```python
> model.fit(X, y, epochs = n, batch_size = 32, validation_split = p)
>```
>
> * The **`batch_size`** argument defines the number of training samples used to calculate the gradient of the loss function at each optimization step. This means that not all the data is used at once, which can have beneficial effects such as regularization and faster training.
>
> * The **`epochs`** argument indicates the number of passes through the dataset performed during the optimization process. Each epoch therefore consists of several optimization steps, and each step uses **`batch_size`** data points. Training for **too many** epochs can lead to **overfitting**, while **too few** epochs can lead to **underfitting**.
> 
> The following illustration provides a clearer view of the epochs and batch_size arguments based on our dataset:
>
><img src="https://datascientest.fr/train/assets/masterclass_deeplearning_batch.png" style="width:250px">
>
> * The **`validation_split`** argument allows us to keep a certain proportion of the dataset as a **validation set**. The model will be **evaluated** on the validation dataset **at the end of each epoch**.
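
The interplay between these three arguments can be checked with a little arithmetic. The sketch below is an approximation under stated assumptions: the helper `optimization_steps` is hypothetical, and the figure of 100 training rows assumes that two-thirds of the 150 Iris observations are kept for training.

```python
import math

def optimization_steps(n_samples, batch_size, epochs, validation_split=0.0):
    """Rough count of gradient updates performed during training."""
    # Rows held out for validation are not used to compute gradients
    n_val = round(n_samples * validation_split)
    n_fit = n_samples - n_val
    # The last batch of an epoch may be smaller than batch_size
    steps_per_epoch = math.ceil(n_fit / batch_size)
    return n_fit, steps_per_epoch, steps_per_epoch * epochs

# Assumed figures: 100 training rows (two-thirds of the 150 Iris observations)
n_fit, steps_per_epoch, total_steps = optimization_steps(
    n_samples=100, batch_size=32, epochs=500, validation_split=0.1
)
print(n_fit, steps_per_epoch, total_steps)  # 90 3 1500
```

With these figures, 90 rows are used for fitting, each epoch performs 3 optimization steps (two batches of 32 and one of 26), and 500 epochs amount to 1500 steps.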
>
> Do you remember the Iris dataset from the first notebook? The following widget will illustrate this concept of batch and epoch through the minimization of the loss function of this problem. For each optimization step, gradient descent is performed based on the loss calculated on the selected batch, and this operation is repeated until the entire dataset has been processed: that is, until one epoch has been completed.
* Run the following cell:



In [None]:
!wget -q https://raw.githubusercontent.com/Demosthene-OR/Student-AI-and-Data-Management/main/widget_batch_gradient_descent.py
%run widget_batch_gradient_descent.py



* **(h)** Train the model on `X_train` and `y_train` for **`500`** epochs with a batch size of **`32`** samples and a validation split of **`0.1`**.



In [None]:
## Insert your code here



## Model performance

> We now want to diagnose our model. To do this, we will calculate a confusion matrix on the test sample.
>
> However, if we try to predict the class of the test sample, the model's **`predict`** method returns a **probability vector** where each element is the probability of belonging to the class corresponding to its index.
>
> To use the `classification_report` function from the **metrics** submodule of **scikit-learn**, the prediction vector and the actual class vector must be composed of integers.
>
> We will then use the `argmax` method of a **numpy** array to find out which class each probability vector corresponds to.
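
As a small illustration of this conversion, the sketch below applies `argmax` to a hand-written array of probability vectors. The values are made up and simply stand in for what `predict` would return for three observations.

```python
import numpy as np

# Made-up probability vectors: one row per observation, one column per class
example_pred = np.array([[0.8, 0.1, 0.1],
                         [0.2, 0.3, 0.5],
                         [0.1, 0.7, 0.2]])

# axis=1 looks across the class columns of each row,
# returning the index of the most probable class per observation
example_pred_class = example_pred.argmax(axis=1)
print(example_pred_class)  # [0 2 1]
```

Without `axis=1`, `argmax` would flatten the array and return a single index instead of one class per observation.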

* **(a)** Predict the classes of the **X_test** sample using the model's `predict` method. Store the result in an array named **test_pred**.


* **(b)** Apply the `argmax` method to the **test_pred** array to obtain a vector of integers corresponding to the predicted classes. You will need to pass the argument `axis = 1` so that the argmax is taken across the class columns of each row, giving one class per observation. Store the result in an array named **test_pred_class** and the actual classes in **y_test_class**.





In [None]:
## Insert your code here



* **(c)** Display a detailed evaluation report of the model's performance using the `classification_report` function of the **metrics** submodule of **scikit-learn**.


In [None]:
## Insert your code here


In [None]:
from sklearn.metrics import classification_report,confusion_matrix




> Compared to the single perceptron model, the classification results appear significantly better with multiple dense layers.

<hr style="border-width:2px;border-color:#75DFC1">
<h2 style="text-align:center"> Key takeaways </h2>
<hr style="border-width:2px;border-color:#75DFC1">

> * The **multilayer perceptron** algorithm is particularly effective for **classification** problems.
> * The **sparse categorical crossentropy** loss function is used for multi-class classification problems.
> * You can easily build an **MLP** with **Keras** using a **sequential** construction when the architecture is simple. The **functional** construction is more versatile.
> * Training is controlled by the **`batch_size`** and **`epochs`** parameters: each epoch is one pass through the data, made in batches of **`batch_size`** samples.
 
>You can find all the features of **`keras`** in **`tensorflow.keras`**. In particular, the neural network layers are in **`tensorflow.keras.layers`**:
>
> | Definition          | Class               |
> | :-----------------: |:--------------------:|
> | Dense layer         | `Dense`              |
> | Convolution 2D      | `Conv2D`             |
> | Dropout             | `Dropout`            |
> | Batch Normalization | `BatchNormalization` |
> | Average Pooling 2D  | `AveragePooling2D`   |
> | Flatten             | `Flatten`            |
> | RNN                 | `RNN`                |
> | LSTM                | `LSTM`               |
> | GRU                 | `GRU`                |

