**D3APL: Aplicações em Ciência de Dados** <br/>
IFSP Campinas

Prof. Dr. Samuel Martins (Samuka) <br/><br/>

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.

# Regression Multilayer Perceptron (MLP) with Keras - V1

## 1. Set up

#### 1.1 TensorFlow + Keras

In [None]:
import tensorflow as tf
from tensorflow import keras

#### 1.2 Fixing the seed for reproducibility (optional)
This is an attempt at reproducibility in Keras. See more at: <br/>
https://machinelearningmastery.com/reproducible-results-neural-networks-keras/ <br/>
https://www.tensorflow.org/api_docs/python/tf/random/set_seed

In [None]:
from numpy.random import seed
from tensorflow.random import set_seed

seed(42)
set_seed(42)

#### 1.3 Other imports

In [None]:
import numpy as np
import matplotlib.pyplot as plt

#### 1.4 Loading California Housing Dataset via Scikit-learn
https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html
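The feature matrix and labels can be extracted in one call. This is a minimal sketch, assuming the `return_X_y=True` option of `fetch_california_housing`; the data is downloaded on first use, so network access is assumed.

```python
from sklearn.datasets import fetch_california_housing

# return_X_y=True returns the (features, target) pair instead of a Bunch object
X, y = fetch_california_housing(return_X_y=True)

print(f'X.shape = {X.shape}')  # 20640 samples, 8 features
print(f'y.shape = {y.shape}')
```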

In [None]:
# extracting the feature matrix and labels


In [None]:
print(f'X.shape = {X.shape}')
print(f'y.shape = {y.shape}')

#### 1.5 Splitting Data into Train and Test Sets
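One possible way to perform the split is sketched here. The synthetic arrays only stand in for the California Housing `X` and `y`, and `test_size=0.2` and `random_state=42` are assumptions rather than settings prescribed by this notebook.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data with the same layout as California Housing (8 features);
# in the notebook, X and y come from section 1.4.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 8))
y = rng.normal(size=1000)

# test_size and random_state are assumptions chosen for illustration
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(f'X_train.shape = {X_train.shape}')
print(f'X_test.shape = {X_test.shape}')
```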

In [None]:
from sklearn.model_selection import train_test_split



In [None]:
print(f'X_train.shape = {X_train.shape}')
print(f'y_train.shape = {y_train.shape}\n')

print(f'X_test.shape = {X_test.shape}')
print(f'y_test.shape = {y_test.shape}\n')

#### 1.6 Saving datasets
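A minimal sketch of saving the split arrays with NumPy follows. The output folder and file name are hypothetical, and the random arrays stand in for the real train and test sets.

```python
import os
import tempfile

import numpy as np

# Stand-in arrays; in the notebook these come from section 1.5
rng = np.random.default_rng(42)
X_train, y_train = rng.normal(size=(800, 8)), rng.normal(size=800)
X_test, y_test = rng.normal(size=(200, 8)), rng.normal(size=200)

# Hypothetical output folder; a temporary directory keeps the sketch free of side effects
out_dir = tempfile.mkdtemp(prefix='california_housing_')
out_path = os.path.join(out_dir, 'train_test.npz')

# Store all four arrays in a single compressed archive
np.savez_compressed(out_path, X_train=X_train, y_train=y_train,
                    X_test=X_test, y_test=y_test)

# Reload one array to check the round trip
with np.load(out_path) as data:
    X_train_loaded = data['X_train']
```

Saving the split once lets later notebooks reuse exactly the same train and test samples.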

#### 1.7 Splitting Training Data into Train-Dev and Validation Sets
NB: For the sake of simplicity, we are going to consider a single, fixed validation set.
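The same `train_test_split` function can carve a validation set out of the training data. In this sketch the 20% validation fraction and the seed are assumptions, and the random arrays stand in for `X_train` and `y_train`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the training set produced in section 1.5
rng = np.random.default_rng(42)
X_train = rng.normal(size=(800, 8))
y_train = rng.normal(size=800)

# Hold out part of the training data as a single, fixed validation set
# (test_size=0.2 is an assumption)
X_train_dev, X_val, y_train_dev, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)
```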

In [None]:
print(f'X_train_dev.shape = {X_train_dev.shape}')
print(f'y_train_dev.shape = {y_train_dev.shape}\n')
print(f'X_val.shape = {X_val.shape}')
print(f'y_val.shape = {y_val.shape}\n')

#### 1.8 Normalization
For simplicity, we are going to use the `StandardScaler()` without taking into account possible problems with outliers, etc.
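A minimal sketch of the normalization step, assuming the scaler is fitted on the train-dev set only and then applied unchanged to the validation and test sets, so that no statistics leak from held-out data. The random arrays are stand-ins for the real splits.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Stand-in splits with non-zero mean and non-unit variance
rng = np.random.default_rng(42)
X_train_dev = rng.normal(loc=5.0, scale=3.0, size=(640, 8))
X_val = rng.normal(loc=5.0, scale=3.0, size=(160, 8))
X_test = rng.normal(loc=5.0, scale=3.0, size=(200, 8))

scaler = StandardScaler()

# Learn per-feature mean and standard deviation from the train-dev set only
X_train_dev = scaler.fit_transform(X_train_dev)

# Reuse the same statistics for the held-out sets
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)
```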

## 2. Building and Training a MLP via Keras

### 2.1 Defining the Network Architecture
Proposed architecture for Regression:
- Input Layer: 8 neurons
- Hidden Layer: 30 neurons, ReLU
- Output Layer: 1 neuron (no activation)


In short: [8], [30 ReLU], [1]

#### **Defining the Network's Architecture**
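The [8], [30 ReLU], [1] architecture could be written with the Keras Sequential API as sketched here. Using `keras.Input` to declare the input shape is one option among several.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(8,)),                    # 8 input features
    keras.layers.Dense(30, activation='relu'),  # hidden layer
    keras.layers.Dense(1),                      # single output, no activation
])
```

With this layout the model has (8 × 30 + 30) + (30 × 1 + 1) = 301 trainable parameters, which `model.summary()` should report.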

In [None]:
model.summary()

In [None]:
X_train_dev.shape

### 2.2 Compiling: Defining the Loss Function, Optimizer, and Metrics
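A minimal compiling sketch follows. The choices of mean squared error as the loss, SGD with a learning rate of 0.01 as the optimizer, and mean absolute error as the metric are assumptions. They are common for regression but are not prescribed by this notebook.

```python
from tensorflow import keras

# Same architecture as in section 2.1
model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(30, activation='relu'),
    keras.layers.Dense(1),
])

# Loss, optimizer, and metric below are illustrative assumptions
optimizer = keras.optimizers.SGD(learning_rate=0.01)
model.compile(
    loss='mean_squared_error',
    optimizer=optimizer,
    metrics=[keras.metrics.MeanAbsoluteError()],
)
```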

### 2.3 Training

If a GPU is available, we can monitor its usage with [_gpustat_](https://github.com/wookayin/gpustat).

In a terminal, run: `gpustat -cpi`
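The training call itself can be sketched as follows. The synthetic data stands in for the normalized train-dev and validation sets, and the number of epochs, the loss, and the optimizer are assumptions.

```python
import numpy as np
from tensorflow import keras

# Stand-in for the normalized train-dev and validation sets
rng = np.random.default_rng(42)
w = rng.normal(size=8)
X_train_dev = rng.normal(size=(640, 8))
y_train_dev = X_train_dev @ w + 0.1 * rng.normal(size=640)
X_val = rng.normal(size=(160, 8))
y_val = X_val @ w + 0.1 * rng.normal(size=160)

model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(30, activation='relu'),
    keras.layers.Dense(1),
])
model.compile(loss='mean_squared_error', optimizer='sgd')

# fit() returns a History object; history.history maps each tracked
# quantity (e.g. 'loss', 'val_loss') to a list with one value per epoch
history = model.fit(
    X_train_dev, y_train_dev,
    epochs=5,                        # assumption: kept small for a quick run
    validation_data=(X_val, y_val),
    verbose=0,
)
```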


In [None]:
# https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#fit



#### **Visualizing the training history**

In [None]:
import pandas as pd
import matplotlib.pyplot as plt

history_df = pd.DataFrame(history.history)

history_df.plot(figsize=(8, 5))
plt.grid(True)
plt.ylim(0, 1) # set the vertical range to [0-1]
plt.xlabel('Epochs')
plt.ylabel('Score')

### 2.4 Evaluating and Predicting New Samples

#### **Evaluation**
https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#evaluate
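A minimal evaluation sketch, assuming a model compiled with a mean squared error loss and no extra metrics, in which case `evaluate` returns the test loss. The data is synthetic and only stands in for the normalized splits.

```python
import numpy as np
from tensorflow import keras

# Stand-in for the normalized training and test sets
rng = np.random.default_rng(42)
w = rng.normal(size=8)
X_train_dev = rng.normal(size=(640, 8))
y_train_dev = X_train_dev @ w
X_test = rng.normal(size=(200, 8))
y_test = X_test @ w

model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(30, activation='relu'),
    keras.layers.Dense(1),
])
model.compile(loss='mean_squared_error', optimizer='sgd')
model.fit(X_train_dev, y_train_dev, epochs=2, verbose=0)

# The test set is used only here, after training is finished
test_mse = model.evaluate(X_test, y_test, verbose=0)
print(f'Test MSE = {test_mse}')
```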

#### **Prediction**
https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#predict
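Prediction on new samples could look like the sketch here. The three "new" samples are random stand-ins, and they are assumed to be normalized in the same way as the training data.

```python
import numpy as np
from tensorflow import keras

# Stand-in for the normalized training data
rng = np.random.default_rng(42)
w = rng.normal(size=8)
X_train_dev = rng.normal(size=(640, 8))
y_train_dev = X_train_dev @ w

model = keras.Sequential([
    keras.Input(shape=(8,)),
    keras.layers.Dense(30, activation='relu'),
    keras.layers.Dense(1),
])
model.compile(loss='mean_squared_error', optimizer='sgd')
model.fit(X_train_dev, y_train_dev, epochs=2, verbose=0)

# Three hypothetical new samples with 8 features each
X_new = rng.normal(size=(3, 8))

# predict() returns one row per sample and one column per output neuron
y_pred = model.predict(X_new, verbose=0)
print(f'y_pred.shape = {y_pred.shape}')
```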

Repeat all steps shown in this notebook for the MNIST dataset available in Keras: <br/>
https://keras.io/api/datasets/mnist/