# Keras

<img src="https://keras.io/img/logo.png" width=300 />

### Current documentation
1. Getting started: https://keras.io/getting_started/
2. Guides: https://keras.io/guides/
3. API reference: https://keras.io/api/

### Notes:
* The TensorFlow team ships Keras as part of TensorFlow Core, as the `tf.keras` module.

### Workflow
1. Create the model
2. Create and add layers to the model
3. Compile the model
4. Train the model
5. Use the model for prediction or evaluation
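
The five steps above can be sketched end to end on a tiny problem. This is a minimal sketch, assuming a TensorFlow 2-style `tensorflow.keras`; the random data, layer sizes, and variable names are made up for illustration and carry no meaning.

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy data: 32 samples, 4 features, 3 classes (random values).
rng = np.random.default_rng(0)
x = rng.random((32, 4)).astype("float32")
y = keras.utils.to_categorical(rng.integers(0, 3, size=32), num_classes=3)

# Steps 1 and 2: create the model and add layers to it.
model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(keras.layers.Dense(8, activation="relu"))
model.add(keras.layers.Dense(3, activation="softmax"))

# Step 3: compile the model with an optimizer, a loss, and metrics.
model.compile(optimizer="sgd", loss="categorical_crossentropy", metrics=["accuracy"])

# Step 4: train the model.
model.fit(x, y, batch_size=8, epochs=2, verbose=0)

# Step 5: evaluate the model and use it for prediction.
loss, accuracy = model.evaluate(x, y, verbose=0)
probabilities = model.predict(x, verbose=0)  # one row of class probabilities per sample
```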

### Sequential vs Functional API
Models in Keras can be created using the **sequential** or the **functional** API. 
The sequential API builds a plain stack of layers with one input and one output; the functional API can build *any model* the sequential API can, and more. 

The **functional** API makes it **easier** to build **complex models** that have multiple inputs, multiple outputs, and shared layers.

We have also observed that building simple models with the functional API makes it easier to grow the models into complex models with branching and sharing. Hence for our work, we always use the functional API.

### Sequential API
```python
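from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.models import Sequential

# Start with an empty model, then call model.add(...) once per layer:
model = Sequential()
model.add(Dense(10, input_shape=(256,)))

# Or, pass all layers to the constructor as a list:
model = Sequential(
    [
        Dense(10, input_shape=(256,)),
        Activation('tanh'),
        Dense(10),
        Activation('softmax'),
    ]
)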
```

### Functional API
> In the functional API, you create the model as an instance of the `Model` class, which takes `inputs` and `outputs` parameters.

The `inputs` and `outputs` parameters represent one or more input and output tensors, respectively.

```python
model = Model(inputs=tensor1, outputs=tensor2)
```
In the above code, `tensor1` and `tensor2` are symbolic tensors: `tensor1` is typically created with `Input(...)`, and `tensor2` is the result of calling one or more Keras layers on it.


If there is more than one input or output tensor, pass them as lists:
```python
model = Model(inputs=[i1,i2,i3], outputs=[o1,o2,o3])
```
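As a concrete case, here is a minimal sketch of a model with two inputs, two outputs, and one shared layer. The shapes, layer sizes, and names (`i1`, `shared`, `o1`, and so on) are hypothetical and chosen only to keep the example small.

```python
from tensorflow.keras.layers import Concatenate, Dense, Input
from tensorflow.keras.models import Model

# Two inputs with the same (hypothetical) feature size.
i1 = Input(shape=(8,))
i2 = Input(shape=(8,))

# One layer object applied to both inputs, so its weights are shared.
shared = Dense(4, activation="relu")
merged = Concatenate()([shared(i1), shared(i2)])

# Two output heads that branch off the merged representation.
o1 = Dense(1, activation="sigmoid")(merged)
o2 = Dense(3, activation="softmax")(merged)

model = Model(inputs=[i1, i2], outputs=[o1, o2])
```

Because `shared` is a single layer object, its weights are counted once even though it is called twice, which is the kind of sharing the sequential API cannot express.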

### Keras Layers
* For an overview (as of *Keras 2*), see book 📚.
* For up-to-date documentation, see the API reference linked at the top of this page.

### Compiling the Model
Signature of `model.compile()` (as documented for Keras 2; newer releases add and drop arguments, so check the API reference):

```python
compile(
    self, 
    optimizer, 
    loss, 
    metrics=None, 
    sample_weight_mode=None
)
```

Where:
* `optimizer`: a custom optimizer or one of the built-ins:
    * SGD
    * RMSprop
    * Adagrad
    * Adadelta
    * Adam
    * Adamax
    * Nadam


* `loss`: a custom loss function or one of the built-ins:
    * mean_squared_error
    * mean_absolute_error
    * mean_absolute_percentage_error
    * mean_squared_logarithmic_error
    * squared_hinge
    * hinge
    * categorical_hinge
    * categorical_crossentropy
    * sparse_categorical_crossentropy
    * binary_crossentropy
    * poisson
    * cosine_proximity


* `metrics`: 
The third argument is a list of metrics to collect while training the model. 
If verbose output is on, the metrics are printed as training progresses. 
Metrics are like loss functions, except that they are only reported and are not used to update the weights; some are provided by Keras, and you can write your own metric functions. 
Any loss function can also be used as a metric. The built-in accuracy metrics are:
    * binary_accuracy
    * categorical_accuracy
    * sparse_categorical_accuracy
    * top_k_categorical_accuracy
    * sparse_top_k_categorical_accuracy
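
To make the loss/metric distinction concrete, here is a NumPy sketch of what `categorical_crossentropy` and `categorical_accuracy` compute for one-hot labels. It is an illustration of the math, not Keras's implementation; the `(y_true, y_pred)` argument order follows the Keras convention for custom losses and metrics, and the sample values are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0), then sum -y * log(p) over the class axis.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

def categorical_accuracy(y_true, y_pred):
    # 1.0 where the most probable class matches the one-hot label, else 0.0.
    return (np.argmax(y_true, axis=-1) == np.argmax(y_pred, axis=-1)).astype(np.float32)

# Two made-up samples with three classes each.
y_true = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1], [0.3, 0.6, 0.1]])

loss = categorical_crossentropy(y_true, y_pred).mean()
accuracy = categorical_accuracy(y_true, y_pred).mean()  # first sample right, second wrong
```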

### Training
Signature of `model.fit()` (as documented for early Keras 2 releases; defaults such as `epochs` differ in later releases):
```python
fit(
    self, 
    x, 
    y, 
    batch_size=32, 
    epochs=10, 
    verbose=1, 
    callbacks=None,
    validation_split=0.0, 
    validation_data=None, 
    shuffle=True,
    class_weight=None, 
    sample_weight=None, 
    initial_epoch=0
)
```
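The `batch_size` and `validation_split` arguments together determine how many weight updates happen per epoch. The helper below is a hypothetical sketch of that arithmetic, assuming the validation fraction is set aside before batching; it is not a Keras function.

```python
import math

def steps_per_epoch(n_samples, batch_size=32, validation_split=0.0):
    # Samples left for training after holding out the validation fraction.
    n_train = int(n_samples * (1.0 - validation_split))
    # The last batch may be smaller than batch_size, hence the ceiling.
    return math.ceil(n_train / batch_size)

# MNIST-sized training set with the batch size used later in this document.
full_training_steps = steps_per_epoch(60000, batch_size=100)
with_validation_steps = steps_per_epoch(60000, batch_size=100, validation_split=0.1)
```

With 60000 samples and batches of 100, each epoch performs 600 updates; holding out 10% for validation leaves 540.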

### Predicting
The trained model can be used:
* either to predict the value with the `model.predict()` method, 
* or to evaluate the model with the `model.evaluate()` method.

Signatures (as documented for Keras 2):
```python
predict(self, x, batch_size=32, verbose=0)
evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)
```
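For a classifier with a softmax output, `predict()` returns one row of class probabilities per sample, so a class label still has to be picked from each row. A minimal NumPy sketch with made-up probabilities:

```python
import numpy as np

# Hypothetical predict() output for two samples and three classes.
probabilities = np.array([
    [0.05, 0.90, 0.05],
    [0.60, 0.30, 0.10],
])

# The predicted class is the index of the largest probability in each row.
predicted_classes = np.argmax(probabilities, axis=1)
```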

### Additional Modules
* The `preprocessing` module provides several functions for the preprocessing of sequence, image, and text data.
* The `datasets` module provides several functions for quick access to several popular datasets, such as CIFAR10 images, CIFAR100 images, IMDB movie reviews, Reuters newswire topics, MNIST handwritten digits, and Boston housing prices.
* The `initializers` module provides several functions to set initial random weight parameters of layers, such as `Zeros`, `Ones`, `Constant`, `RandomNormal`, `RandomUniform`, `TruncatedNormal`, `VarianceScaling`, `Orthogonal`, `Identity`, `lecun_normal`, `lecun_uniform`, `glorot_normal`, `glorot_uniform`, `he_normal`, and `he_uniform`.
* The `models` module provides several functions to restore model architectures and weights, such as `model_from_json`, `model_from_yaml`, and `load_model`. The model architecture can be saved using the `model.to_json()` and `model.to_yaml()` methods (YAML support has been removed from recent TensorFlow releases). The weights alone can be saved by calling `model.save_weights()`, while `model.save()` stores the architecture, weights, and optimizer state together. **In Keras 2, a file name ending in `.h5` saves to an HDF5 file.** 
* The `applications` module provides several pre-built and pre-trained models such as Xception, VGG16, VGG19, ResNet50, Inception V3, InceptionResNet V2, and MobileNet. These can be used as-is to predict on our own datasets, or retrained on datasets from a slightly different domain.
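
As an illustration of the `models` module, the sketch below round-trips an architecture through JSON and copies the weights across in memory. It assumes a TensorFlow 2-style `tensorflow.keras`; the tiny model and the variable names are hypothetical, and no files are written.

```python
import numpy as np
from tensorflow import keras

# A tiny, hypothetical model to serialize.
model = keras.Sequential([
    keras.Input(shape=(4,)),
    keras.layers.Dense(3, activation="softmax"),
])

# The JSON string holds the architecture only, not the weights.
architecture_json = model.to_json()
restored = keras.models.model_from_json(architecture_json)

# The restored model starts with fresh weights, so copy the originals over.
restored.set_weights(model.get_weights())

x = np.ones((2, 4), dtype="float32")
same_output = np.allclose(model.predict(x, verbose=0), restored.predict(x, verbose=0))
```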


## Keras MNIST Example (Sequential)

In [1]:
# Import the keras modules

# Use tensorflow.keras so that Keras doesn't have to be installed separately.

import tensorflow.keras
from tensorflow.keras.datasets import mnist  # <-- Keras' own MNIST dataset.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD
from tensorflow.keras import utils

import numpy as np

In [2]:
# Define some hyperparameters
batch_size = 100
n_inputs = 784
n_classes = 10
n_epochs = 10

In [3]:
# Get the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [4]:
print(type(x_train))
print(x_train.shape)

<class 'numpy.ndarray'>
(60000, 28, 28)


In [5]:
# Flatten each two-dimensional 28 x 28 pixel image into a single vector of 784 pixels.
x_train = x_train.reshape(60000, n_inputs)
x_test = x_test.reshape(10000, n_inputs)
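
The reshape above flattens each image row by row. A small NumPy sketch on made-up data shows the same operation without hard-coding the number of samples, and where a given pixel ends up:

```python
import numpy as np

# Two hypothetical 28 x 28 "images" filled with running integers.
images = np.arange(2 * 28 * 28).reshape(2, 28, 28)

# -1 lets NumPy infer the flattened size (28 * 28 = 784) per image.
flat = images.reshape(len(images), -1)

# Row-major order: the first pixel of the second row lands at index 28.
second_row_first_pixel = flat[0, 28]
```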

In [6]:
# Convert the input values to float32
print(x_train.dtype)
x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)
print(x_train.dtype)

uint8
float32


In [7]:
# Scale the pixel values from [0, 255] to [0, 1].
x_train /= 255
x_test /= 255
print(x_train.max())

1.0
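
The conversion to `float32` has to happen before the in-place division. The NumPy sketch below, on a made-up three-pixel array, shows that dividing a `uint8` array in place is rejected, while converting first gives values in [0, 1]:

```python
import numpy as np

pixels = np.array([0, 128, 255], dtype=np.uint8)

# In-place true division would have to store floats in a uint8 array,
# which NumPy refuses with a TypeError subclass.
try:
    pixels /= 255
    in_place_ok = True
except TypeError:
    in_place_ok = False

# Converting first works and keeps the compact float32 dtype.
scaled = pixels.astype(np.float32) / 255
```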


In [8]:
# Convert the labels to one-hot encoded format
print(y_train.shape)
y_train = utils.to_categorical(y_train, n_classes)
y_test = utils.to_categorical(y_test, n_classes)
print(y_train.shape)

(60000,)
(60000, 10)
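
One-hot encoding turns each integer label into a vector with a single 1 at the label's index. The helper below is a hypothetical NumPy equivalent for illustration, not the Keras implementation:

```python
import numpy as np

def one_hot(labels, n_classes):
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(n_classes, dtype=np.float32)[labels]

labels = np.array([5, 0, 4])  # made-up digit labels
encoded = one_hot(labels, 10)

# argmax recovers the original integer labels.
decoded = np.argmax(encoded, axis=1)
```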


In [9]:
# Build a sequential model

model = Sequential()

# The first layer has to specify the dimensions of the input vector
model.add(Dense(units=128, activation='sigmoid', input_shape=(n_inputs,)))

# Add a dropout layer to reduce overfitting
model.add(Dropout(0.1))
model.add(Dense(units=128, activation='sigmoid'))
model.add(Dropout(0.1))

# The output layer must have exactly one unit per class
model.add(Dense(units=n_classes, activation='softmax'))

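
What `Dropout(0.1)` does during training can be sketched in NumPy: each activation is zeroed with probability 0.1, and the survivors are scaled by 1 / (1 - 0.1) so the expected value is unchanged. The function below is an illustration under that assumption, not the Keras implementation; at inference time the layer passes its input through unchanged.

```python
import numpy as np

def dropout(x, rate, rng):
    # Keep each element with probability (1 - rate).
    keep_mask = rng.random(x.shape) >= rate
    # Scale the kept elements so the expected sum stays the same.
    return np.where(keep_mask, x / (1.0 - rate), 0.0)

rng = np.random.default_rng(0)
activations = np.ones((1000, 128), dtype=np.float32)  # made-up activations
dropped = dropout(activations, rate=0.1, rng=rng)
```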


In [10]:
# Print the summary of our model
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290      
Total params: 118,282
Trainable params: 118,282
Non-trainable params: 0
_________________________________________________________________
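
The parameter counts in the summary can be checked by hand: a `Dense` layer has one weight per input-output pair plus one bias per output, and `Dropout` layers have no parameters. A short sketch (the helper name is hypothetical):

```python
def dense_params(n_in, n_out):
    # Weight matrix (n_in x n_out) plus one bias per output unit.
    return n_in * n_out + n_out

# Input size followed by the three Dense layer sizes from the model above.
layer_sizes = [784, 128, 128, 10]
counts = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(counts)
# counts == [100480, 16512, 1290], total == 118282
```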


In [11]:
# Compile the model!
model.compile(
    loss='categorical_crossentropy',
    optimizer=SGD(),
    metrics=['accuracy']
)

In [12]:
# Train the model
model.fit(
    x_train, 
    y_train,
    batch_size=batch_size,
    epochs=n_epochs
)

Train on 60000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f5195317dd0>

In [13]:
# Evaluate the model and print the accuracy score
scores = model.evaluate(x_test, y_test)

print('\n loss:', scores[0])
print('\n accuracy:', scores[1])


 loss: 0.8174687757492065

 accuracy: 0.8047


## Same Example in Functional API

In [14]:
import tensorflow as tf
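tf.keras.backend.clear_session()  # reset Keras state (layer name counters, graph); tf.reset_default_graph() exists only in TensorFlow 1.x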

In [15]:
# Define some hyperparameters
batch_size = 100
n_inputs = 784
n_classes = 10
n_epochs = 10

In [16]:
# Get the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Flatten each two-dimensional 28 x 28 pixel image into a single vector of 784 pixels.
x_train = x_train.reshape(60000, n_inputs)
x_test = x_test.reshape(10000, n_inputs)

# Convert the input values to float32
x_train = x_train.astype(np.float32)
x_test = x_test.astype(np.float32)

# Scale the pixel values from [0, 255] to [0, 1].
x_train /= 255
x_test /= 255

# Convert the labels to one-hot encoded format
y_train = utils.to_categorical(y_train, n_classes)
y_test = utils.to_categorical(y_test, n_classes)

In [17]:
# Additional imports needed:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Activation

In [18]:
# Build model - functional

input_ = Input(shape=(n_inputs,))

# Build up the model layers in a functional manner:
hidden = Dense(units=128)(input_)
hidden = Activation("sigmoid")(hidden)  # In this example, each activation is applied as a separate Activation layer.
hidden = Dropout(0.1)(hidden)
hidden = Dense(units=128)(hidden)
hidden = Activation("sigmoid")(hidden)
hidden = Dropout(0.1)(hidden)

output = Dense(units=n_classes)(hidden)
output = Activation("softmax")(output)

# Define `Model`:
model = Model(inputs=input_, outputs=output)

In [19]:
# Print the summary of our model

# Note: the summary lists the Input and Activation layers separately, so it differs from the Sequential case, even though the model computes the same function.

model.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 784)]             0         
_________________________________________________________________
dense (Dense)                (None, 128)               100480    
_________________________________________________________________
activation (Activation)      (None, 128)               0         
_________________________________________________________________
dropout (Dropout)            (None, 128)               0         
_________________________________________________________________
dense_1 (Dense)              (None, 128)               16512     
_________________________________________________________________
activation_1 (Activation)    (None, 128)               0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 128)               0     

In [20]:
# Compile the model!
model.compile(
    loss='categorical_crossentropy',
    optimizer=SGD(),
    metrics=['accuracy']
)

In [21]:
# Train the model
model.fit(
    x_train, 
    y_train,
    batch_size=batch_size,
    epochs=n_epochs
)

Train on 60000 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f5196ea7dd0>

In [22]:
# Evaluate the model and print the accuracy score
scores = model.evaluate(x_test, y_test)

print('\n loss:', scores[0])
print('\n accuracy:', scores[1])


 loss: 0.8595962315559387

 accuracy: 0.7932
