**D3APL: Aplicações em Ciência de Dados** <br/>
IFSP Campinas

Prof. Dr. Samuel Martins (Samuka) <br/><br/>

<a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License</a>.

# Regression Multilayer Perceptron (MLP) with Keras - V4
In this notebook, we will see how to wrap a Keras model so that it can be used in the Scikit-learn environment. <br/>
We'll see:
- Pipeline: Normalization + MLP

## 1. Set up

#### 1.1 TensorFlow + Keras

In [1]:
import tensorflow as tf
from tensorflow import keras

#### 1.2 Fixing the seed for reproducibility (optional)
This is an attempt at reproducibility in Keras. See more at: <br/>
https://machinelearningmastery.com/reproducible-results-neural-networks-keras/ <br/>
https://www.tensorflow.org/api_docs/python/tf/random/set_seed

In [2]:
from numpy.random import seed
from tensorflow.random import set_seed

seed(42)
set_seed(42)
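As an alternative sketch, recent TensorFlow releases also provide `keras.utils.set_random_seed`, which seeds Python's `random` module, NumPy, and TensorFlow in a single call. The snippet below only illustrates that reseeding reproduces the same random draws; the helper `draw` is hypothetical and not part of the original notebook. Seeding alone may still not give bit-identical training runs, for example on a GPU.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras


def draw():
    """Hypothetical helper: reseed everything, then draw a few random numbers."""
    keras.utils.set_random_seed(42)  # seeds `random`, NumPy, and TensorFlow
    return np.random.rand(3), tf.random.uniform((3,)).numpy()


np_a, tf_a = draw()
np_b, tf_b = draw()

# Both comparisons should report identical draws after reseeding
print(np.array_equal(np_a, np_b))
print(np.array_equal(tf_a, tf_b))
```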

#### 1.3 Other imports

In [3]:
import numpy as np
import matplotlib.pyplot as plt

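#### 1.4 Loading the California Housing Dataset
Dataset reference (Scikit-learn): https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html <br/>
Here, the pre-split train/test arrays are loaded from local `.npy` files.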

In [4]:
X_train = np.load('./datasets/housing_train_data.npy')
y_train = np.load('./datasets/housing_train_labels.npy')
X_test = np.load('./datasets/housing_test_data.npy')
y_test = np.load('./datasets/housing_test_labels.npy')

In [5]:
print(f'X_train.shape = {X_train.shape}')
print(f'y_train.shape = {y_train.shape}\n')

print(f'X_test.shape = {X_test.shape}')
print(f'y_test.shape = {y_test.shape}\n')

X_train.shape = (16512, 8)
y_train.shape = (16512,)

X_test.shape = (4128, 8)
y_test.shape = (4128,)



## 2. Wrapping Keras Model to Sklearn

### 2.1 Defining the Network Architecture
Proposed architecture for Regression:
- Input Layer: 8 neurons
- Hidden Layer: 30 neurons, ReLU
- Output Layer: 1 neuron (no activation)


In short: [8], [30 ReLU], [1]

#### **Wrapping with Hyperparameters to use Scikit-Learn Pipelines**

##### **A function that builds a Keras Model**
Create a function that returns a **Keras model**, taking as _parameters_ the things you want to tune using _cross-validation_ and _model selection_.

https://stackoverflow.com/a/44134591/7069696 <br/>
https://machinelearningmastery.com/use-keras-deep-learning-models-scikit-learn-python/

In [6]:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

def build_model():
    model = Sequential([
        Dense(30, activation='relu', input_shape=(8,)),  # it automatically creates an input layer with `input_shape` neurons
        Dense(1)
    ])
    
    model.compile(loss='mean_squared_error', optimizer='sgd')
    
    return model
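The function above takes no arguments, so its architecture is fixed. A minimal sketch of a parameterized variant is shown below, assuming we want to tune the number of hidden layers, the number of neurons per layer, and the learning rate. The name `build_model_with_params` and its parameters `n_hidden`, `n_neurons`, and `learning_rate` are hypothetical and do not appear in the original notebook.

```python
from tensorflow import keras


def build_model_with_params(n_hidden=1, n_neurons=30, learning_rate=0.01):
    """Hypothetical variant of `build_model` that exposes hyperparameters."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(8,)))  # 8 input features, as in this dataset
    for _ in range(n_hidden):
        model.add(keras.layers.Dense(n_neurons, activation='relu'))
    model.add(keras.layers.Dense(1))  # single output, no activation (regression)

    optimizer = keras.optimizers.SGD(learning_rate=learning_rate)
    model.compile(loss='mean_squared_error', optimizer=optimizer)
    return model


example = build_model_with_params(n_hidden=2, n_neurons=16)
print(len(example.layers))   # hidden Dense layers plus the output Dense layer
print(example.output_shape)
```

According to the SciKeras documentation, wrapper arguments prefixed with `model__` (for instance `model__n_neurons=50`) are forwarded to the build function, which is what would make such parameters searchable with Scikit-learn's model selection tools.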

##### **Wrapping**
**Current alternative: SciKeras** (replaces the deprecated `tf.keras.wrappers.scikit_learn` wrappers) <br/>
https://github.com/adriangb/scikeras <br/>
https://www.adriangb.com/scikeras/stable/migration.html

In [7]:
from scikeras.wrappers import KerasRegressor

# pass the function that creates a Keras model, along with the other training parameters, such as `epochs`, `batch_size`, etc.
model = KerasRegressor(model=build_model, epochs=20)
model

KerasRegressor(
	model=<function build_model at 0x7f0bf6f27e50>
	build_fn=None
	warm_start=False
	random_state=None
	optimizer=rmsprop
	loss=None
	metrics=None
	batch_size=None
	validation_batch_size=None
	verbose=1
	callbacks=None
	validation_split=0.0
	shuffle=True
	run_eagerly=False
	epochs=20
)

### 2.2 Pipeline
https://stackoverflow.com/a/47520976/7069696

If a GPU is available, we can monitor its usage with [_gpustat_](https://github.com/wookayin/gpustat).

In a terminal, run: `gpustat -cpi`


In [8]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('regressor', model)
])
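To make the `'scaler'` step concrete, here is a small sketch of what `StandardScaler` computes: each feature is centered by its mean and divided by its standard deviation. The toy array `X_toy` is made up for illustration and is not the housing data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (hypothetical): two features on very different scales
X_toy = np.array([[1.0, 100.0],
                  [2.0, 300.0],
                  [3.0, 500.0]])

toy_scaler = StandardScaler().fit(X_toy)
X_scaled = toy_scaler.transform(X_toy)

# Same computation by hand: z = (x - mean) / std, per feature (column)
X_manual = (X_toy - X_toy.mean(axis=0)) / X_toy.std(axis=0)

print(np.allclose(X_scaled, X_manual))
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

Inside a `Pipeline`, the mean and standard deviation are learned only from the data passed to `fit` and are then reused unchanged when `predict` is called.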

In [9]:
pipeline.fit(X_train, y_train)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


Pipeline(steps=[('scaler', StandardScaler()),
                ('regressor',
                 KerasRegressor(epochs=20, model=<function build_model at 0x7f0bf6f27e50>))])

### 2.3 Evaluating and Predicting New Samples

#### **Prediction**
https://www.tensorflow.org/api_docs/python/tf/keras/Sequential#predict

In [10]:
print(f'X_test.min() = {X_test.min()}')
print(f'X_test.max() = {X_test.max()}')

X_test.min() = -124.19
X_test.max() = 16122.0


In [11]:
y_pred = pipeline.predict(X_test)
y_pred



array([0.57400614, 1.5409418 , 3.7457018 , ..., 4.797339  , 0.7753818 ,
       1.678353  ], dtype=float32)

In [12]:
from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred)

0.3579981041924993
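For reference, the mean squared error is just the average of the squared residuals, and its square root (RMSE) is expressed in the same units as the target. For the value above, the RMSE would be roughly 0.598. The sketch below checks the formula on made-up toy arrays (`y_true_toy` and `y_pred_toy` are hypothetical, not the notebook's data).

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Toy targets and predictions (hypothetical values)
y_true_toy = np.array([1.0, 2.0, 3.0, 4.0])
y_pred_toy = np.array([1.5, 2.0, 2.0, 4.5])

# MSE by hand: mean of the squared residuals
mse_manual = np.mean((y_true_toy - y_pred_toy) ** 2)
mse_sklearn = mean_squared_error(y_true_toy, y_pred_toy)

# RMSE: same units as the target
rmse = np.sqrt(mse_manual)

print(mse_manual, mse_sklearn, rmse)
```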

#### **Manual Prediction/Evaluation**

In [13]:
pipeline

Pipeline(steps=[('scaler', StandardScaler()),
                ('regressor',
                 KerasRegressor(epochs=20, model=<function build_model at 0x7f0bf6f27e50>))])

In [14]:
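# retrieve the fitted steps from the pipeline by name
scaler = pipeline.named_steps['scaler']
model_wrap = pipeline.named_steps['regressor']
# `model_` is the trained Keras model held by the wrapper (this rebinds `model`)
model = model_wrap.model_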

In [15]:
X_test_norm = scaler.transform(X_test)
X_test_norm

array([[-1.15508475, -0.28632369, -0.52068576, ...,  0.06740798,
         0.1951    ,  0.28534728],
       [-0.70865905,  0.11043502, -0.16581537, ..., -0.03602975,
        -0.23549054,  0.06097472],
       [-0.21040155,  1.85617335, -0.61076476, ..., -0.14998876,
         1.00947776, -1.42487026],
       ...,
       [ 2.80902421, -0.28632369,  0.75501156, ..., -0.02646898,
         0.78014149, -1.23041404],
       [-0.57542978,  0.58654547, -0.06124296, ..., -0.04390537,
         0.52740357, -0.08860699],
       [-0.17259111, -0.92113763, -0.6058703 , ...,  0.05466644,
        -0.66608108,  0.60445493]])

In [16]:
model.evaluate(X_test_norm, y_test)



0.35799816250801086
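The manual route gives essentially the same error as the pipeline; the tiny difference between the two values is presumably due to float32 versus float64 arithmetic. The equivalence itself can be sketched without Keras. The example below assumes a `Ridge` regressor as a stand-in for the wrapped network and uses synthetic data; all names in it are hypothetical.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (hypothetical): 200 samples, 8 unscaled features
rng = np.random.default_rng(42)
X_toy = rng.normal(loc=50.0, scale=10.0, size=(200, 8))
y_toy = X_toy @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)

# Ridge is only a stand-in for the KerasRegressor used in this notebook
toy_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('regressor', Ridge()),
]).fit(X_toy, y_toy)

# Route 1: let the pipeline scale and predict
y_pipe = toy_pipeline.predict(X_toy)

# Route 2: apply each fitted step by hand
toy_scaler = toy_pipeline.named_steps['scaler']
toy_regressor = toy_pipeline.named_steps['regressor']
y_manual = toy_regressor.predict(toy_scaler.transform(X_toy))

print(np.allclose(y_pipe, y_manual))
print(mean_squared_error(y_toy, y_pipe), mean_squared_error(y_toy, y_manual))
```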