# Creating a Keras model

## Keras model building steps
* Specify Architecture
    * How many layers?
    * How many nodes in each layer?
    * What activation function to use in each layer?
        * ReLU - Rectified Linear Unit
        * Identity function
        * Hyperbolic Tangent
* Compile
    * Specify loss function
    * Optimization parameters
* Fit
    
* Predict
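
The three activation functions listed above can be sketched in plain NumPy. This is only an illustration of what each function does to a node's input; the helper names `relu` and `identity` and the sample values in `z` are assumptions made for this example, not part of Keras.

```python
import numpy as np

def relu(x):
    """Rectified Linear Unit: negative inputs become 0, positive inputs pass through."""
    return np.maximum(0.0, x)

def identity(x):
    """Identity: returns the input unchanged."""
    return x

# Hypothetical node inputs, chosen only for illustration
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

relu_out = relu(z)          # negatives clipped to 0
identity_out = identity(z)  # same values as z
tanh_out = np.tanh(z)       # values squashed into the open interval (-1, 1)

print(relu_out)
print(identity_out)
print(tanh_out)
```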

## Model specification

In [1]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
import tensorflow as tf
print(f"{tf.config.list_physical_devices('GPU') = }")

tf.config.list_physical_devices('GPU') = [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]


In [2]:
import numpy as np
# read data
predictors = np.loadtxt('hourly_wages.csv', delimiter=',', skiprows=1, usecols=range(1,10))
target = np.loadtxt('hourly_wages.csv', delimiter=',', skiprows=1, usecols=0)
n_cols = predictors.shape[1]
predictors.shape, target.shape

((534, 9), (534,))

* There are two ways to build a model; we will focus on Sequential, which is the easier of the two.
* Sequential models require that each layer has weights or connections only to the one layer coming directly after it in the network diagram.
* There are more exotic models out there with complex patterns of connections, but Sequential will do the trick for everything we need here.

In [3]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# easier way to create a model
model = Sequential()
model.add(Dense(100, activation='relu', input_shape = (n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(1))

* We start adding layers using the add method of the model.
* The standard layer type you have seen so far is called a Dense layer. It is called Dense because all of the nodes in the previous layer connect to all of the nodes in the current layer.
* As you advance in deep learning, you may start using layers that aren't Dense.
* In each layer, we specify the number of nodes as the first positional argument, and the activation function we want to use in that layer using the keyword argument activation.
* Keras supports the activation functions you are likely to want in practice.
* In the first layer, we need to specify the input shape as shown here. `(n_cols,)` says each input has n_cols columns; the number of rows is left unspecified, so the model accepts any number of data points.
* You'll notice the last layer has 1 node. That is the output layer, and it matches those diagrams where we ended with only a single node as the output or prediction of the model.
* This model has 2 hidden layers, and an output layer.
* You may be struck that each hidden layer has 100 nodes. Keras and TensorFlow do the math for us, so don't be afraid to use much bigger networks than we've seen before. It's quite common to use hundreds or thousands of nodes in a layer.
* You'll learn more about choosing an appropriate number of nodes later. 
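
To make "all of the nodes in the previous layer connect to all of the nodes in the current layer" concrete, here is a minimal NumPy sketch of what a single Dense layer with ReLU activation computes. The random inputs and weights are hypothetical, and the batch of 5 rows is arbitrary; the 9 columns and 100 nodes mirror the first hidden layer above. This illustrates the arithmetic only, not how Keras implements it internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# 9 predictor columns and 100 nodes, as in the first hidden layer; 5 rows is arbitrary
n_rows, n_cols, n_nodes = 5, 9, 100

x = rng.normal(size=(n_rows, n_cols))   # hypothetical input batch
W = rng.normal(size=(n_cols, n_nodes))  # one weight per (input, node) pair
b = np.zeros(n_nodes)                   # one bias per node

# Dense layer with ReLU activation: every input feeds every node
hidden = np.maximum(0.0, x @ W + b)

print(hidden.shape)  # (5, 100)
```

The number of rows never appears in the shapes of `W` or `b`, which is why the layer works for any number of data points.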

In [4]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 100)               1000      
                                                                 
 dense_1 (Dense)             (None, 100)               10100     
                                                                 
 dense_2 (Dense)             (None, 1)                 101       
                                                                 
Total params: 11,201
Trainable params: 11,201
Non-trainable params: 0
_________________________________________________________________
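
The parameter counts in the summary can be checked by hand: a Dense layer has one weight for every input-node pair plus one bias per node. The helper `dense_param_count` below is a hypothetical name used only for this check.

```python
def dense_param_count(n_inputs, n_nodes):
    """Weights (one per input-node pair) plus one bias per node."""
    return n_inputs * n_nodes + n_nodes

# 9 predictors, two hidden layers of 100 nodes, one output node
layer_sizes = [9, 100, 100, 1]

counts = [dense_param_count(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

print(counts)       # [1000, 10100, 101]
print(sum(counts))  # 11201
```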


# Compiling and fitting a model

## Why you need to compile your model
* Specify the optimizer
    * Many options and mathematically complex
    * "Adam" is usually a good choice: **adjusts learning rate as it does gradient descent**
* Loss function
    * "mean_squared_error" common for regression

## Compiling a model

In [5]:
model.compile(optimizer='adam', loss='mean_squared_error')

## What is fitting a model
* Applying backpropagation and gradient descent with your data to update the weights
* Scaling data before fitting can ease optimization
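
The two points above can be sketched together in NumPy: standardize the predictors, then repeatedly update the weights by gradient descent on the MSE loss. This is a deliberately simplified stand-in: it uses a linear model, so the gradient is written out by hand rather than obtained by backpropagation, and the data, learning rate, and number of steps are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: two predictors on very different scales
X = np.column_stack([rng.normal(50, 10, size=200),
                     rng.normal(0.5, 0.1, size=200)])
y = 0.3 * X[:, 0] + 20.0 * X[:, 1] + rng.normal(0, 0.1, size=200)

# Standardize each column to mean 0 and standard deviation 1
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

w = np.zeros(2)
b = 0.0
learning_rate = 0.1
losses = []

for _ in range(100):
    error = X_scaled @ w + b - y
    losses.append(np.mean(error ** 2))
    # Gradients of the MSE loss with respect to w and b
    grad_w = 2 * X_scaled.T @ error / len(y)
    grad_b = 2 * error.mean()
    # Gradient descent step: move against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(losses[0], losses[-1])
```

With both columns on the same scale, a single learning rate suits both weights, which is one reason scaling tends to ease optimization.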

## Fitting a model

In [6]:
model.fit(predictors, target)



<keras.callbacks.History at 0x7fcc90cf05b0>

# Classification models

## Classification
* 'categorical_crossentropy' loss function: it's by far the most common
    * Similar to log loss: Lower is better
* Add `metrics = ['accuracy']` to compile step for easy-to-understand diagnostics
* Output layer has separate node for each possible outcome, and uses 'softmax' activation
    * The softmax activation function ensures the predictions sum to 1, so they can be interpreted as probabilities.
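
A small NumPy sketch can show why softmax outputs sum to 1 and why a lower cross-entropy is better. The functions below are simplified illustrations written for this example, not the Keras implementations, and the raw outputs in `z` and the labels in `y_true` are hypothetical.

```python
import numpy as np

def softmax(z):
    # Subtract the row maximum for numerical stability; the result is unchanged
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob):
    # Mean over rows of -log(probability assigned to the true class)
    return -np.mean(np.sum(y_true * np.log(y_prob), axis=1))

# Hypothetical raw outputs of a 2-node output layer for three rows
z = np.array([[2.0, 0.0], [0.0, 0.0], [-1.0, 3.0]])
y_true = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

probs = softmax(z)
loss = categorical_crossentropy(y_true, probs)

# Flipping the sign of the raw outputs favors the wrong class in rows 1 and 3
worse_loss = categorical_crossentropy(y_true, softmax(-z))

print(probs.sum(axis=1))  # each row sums to 1
print(loss, worse_loss)   # the worse predictions give the higher loss
```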

In [7]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
import pandas as pd
from sklearn.model_selection import train_test_split

In [8]:
df = pd.read_csv('titanic_all_numeric.csv')
df

Unnamed: 0,survived,pclass,age,sibsp,parch,fare,male,age_was_missing,embarked_from_cherbourg,embarked_from_queenstown,embarked_from_southampton
0,0,3,22.000000,1,0,7.2500,1,False,0,0,1
1,1,1,38.000000,1,0,71.2833,0,False,1,0,0
2,1,3,26.000000,0,0,7.9250,0,False,0,0,1
3,1,1,35.000000,1,0,53.1000,0,False,0,0,1
4,0,3,35.000000,0,0,8.0500,1,False,0,0,1
...,...,...,...,...,...,...,...,...,...,...,...
886,0,2,27.000000,0,0,13.0000,1,False,0,0,1
887,1,1,19.000000,0,0,30.0000,0,False,0,0,1
888,0,3,29.699118,1,2,23.4500,0,True,0,0,1
889,1,1,26.000000,0,0,30.0000,1,False,1,0,0


In [9]:
X = df.drop(['survived'], axis=1).values.astype('float64')
X

array([[ 3.        , 22.        ,  1.        , ...,  0.        ,
         0.        ,  1.        ],
       [ 1.        , 38.        ,  1.        , ...,  1.        ,
         0.        ,  0.        ],
       [ 3.        , 26.        ,  0.        , ...,  0.        ,
         0.        ,  1.        ],
       ...,
       [ 3.        , 29.69911765,  1.        , ...,  0.        ,
         0.        ,  1.        ],
       [ 1.        , 26.        ,  0.        , ...,  1.        ,
         0.        ,  0.        ],
       [ 3.        , 32.        ,  0.        , ...,  0.        ,
         1.        ,  0.        ]])

In [10]:
y = to_categorical(df.survived)
y

array([[1., 0.],
       [0., 1.],
       [0., 1.],
       ...,
       [1., 0.],
       [0., 1.],
       [1., 0.]], dtype=float32)
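
The one-hot layout produced by `to_categorical` can be reproduced in NumPy to see what it does: class 0 becomes `[1., 0.]` and class 1 becomes `[0., 1.]`. This is an equivalent sketch with hypothetical labels, not the Keras implementation.

```python
import numpy as np

labels = np.array([0, 1, 1, 0])  # hypothetical class labels
n_classes = 2

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(n_classes, dtype="float32")[labels]

# argmax along each row recovers the original labels
recovered = one_hot.argmax(axis=1)

print(one_hot)
print(recovered)
```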

In [11]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=91, random_state=1)
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(X.shape[1],)))
model.add(Dense(2, activation='softmax'))
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 32)                352       
                                                                 
 dense_4 (Dense)             (None, 2)                 66        
                                                                 
Total params: 418
Trainable params: 418
Non-trainable params: 0
_________________________________________________________________


In [12]:
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train)



<keras.callbacks.History at 0x7fcbb8172c70>

## Using models
* Save
* Reload
* Make predictions

## Saving, reloading, and using your Model

In [13]:
from tensorflow.keras.models import load_model

model.save('model_file.h5')
! ls -lh model_file.h5

-rw-rw-r-- 1 mauricio mauricio 18K Jan 23 19:10 model_file.h5


In [14]:
model = load_model('model_file.h5')
yhat = model.predict(X_test)
yhat[:,1]



array([0.77579623, 0.18321379, 0.5985319 , 0.15043865, 0.93121475,
       0.38392857, 0.8683316 , 0.9983882 , 0.9995751 , 0.4999679 ,
       0.29096454, 0.23951885, 0.99485105, 0.47724587, 0.13835773,
       0.9364748 , 0.9174745 , 0.7056046 , 0.11219339, 0.74623287,
       0.16343929, 0.61775935, 0.99591535, 0.9977272 , 0.6218784 ,
       0.28673714, 1.        , 0.1866011 , 0.44390872, 0.99998796,
       0.4817563 , 0.96155417, 0.9087853 , 0.23576674, 0.06181916,
       0.89824295, 0.8670105 , 0.1621944 , 0.99703515, 0.40954867,
       0.22056195, 0.44707128, 0.99676526, 0.8617817 , 0.9437205 ,
       0.10585638, 0.9686973 , 0.271681  , 1.        , 0.66503197,
       0.5929783 , 0.5362141 , 0.62809694, 0.21482807, 0.28618413,
       0.6430202 , 0.36666778, 0.839049  , 0.92063   , 0.41903684,
       0.99853504, 0.15274681, 0.9762003 , 0.83809966, 0.99757046,
       0.6298674 , 0.7006554 , 0.7443636 , 0.705439  , 0.82876956,
       0.84886765, 0.95961213, 0.9999273 , 0.18321379, 0.03681
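
Predicted probabilities like these are usually turned into class predictions by taking the most probable class in each row, and can then be compared with the true labels. The sketch below uses small hypothetical arrays in place of the real `model.predict(X_test)` output and `y_test`, so the numbers are illustrative only.

```python
import numpy as np

# Hypothetical softmax outputs (one row per passenger) and one-hot true labels
probs = np.array([[0.22, 0.78],
                  [0.82, 0.18],
                  [0.40, 0.60],
                  [0.85, 0.15]])
y_true = np.array([[0, 1],
                   [1, 0],
                   [1, 0],
                   [1, 0]])

predicted_class = probs.argmax(axis=1)  # index of the most probable class
true_class = y_true.argmax(axis=1)      # undo the one-hot encoding
accuracy = (predicted_class == true_class).mean()

print(predicted_class)  # [1 0 1 0]
print(accuracy)         # 0.75
```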

## Verifying model structure

In [15]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 32)                352       
                                                                 
 dense_4 (Dense)             (None, 2)                 66        
                                                                 
Total params: 418
Trainable params: 418
Non-trainable params: 0
_________________________________________________________________
