# Deep Learning with Keras

___

<img src="https://keras.io/img/logo.png" alt="Keras" width="500"/>

> ## Notes:
> We can think of the **Keras model's life-cycle** in five steps:
> 1. [Defining](#Defining-the-Network)
> 2. [Compiling](#Compiling-the-Network)
> 3. [Training/Fitting](#Training-the-Network)
> 4. [Testing/Evaluating](#Testing-the-Network)
> 5. [Predicting](#Predicting-with-the-Network)
>
> In this notebook, we will take a look at each of these steps in a bit more detail using the `Sequential` class. 
>
> We will also look at the Keras **[Functional API](#Functional-API)**, which gives us more flexibility in creating our models. 
> 
> Finally, we use Keras to create the following **standard neural network models**:
> - [Multilayer Perceptron (MLP)](#Multilayer-Perceptron-(MLP))
> - [Convolutional Neural Network (CNN)](#Convolutional-Neural-Network-(CNN))
> - [Recurrent Neural Network (RNN)](#Recurrent-Neural-Network-(RNN))

___

## Defining the Network

In Keras, **neural networks are defined as a sequence of layers**, and the container for such layers is the `Sequential` class. 

In [2]:
from tensorflow.keras.models import Sequential

A fully connected **layer** can be created using `Dense`, whose first argument is the number of neurons desired in that layer. 

In [3]:
from tensorflow.keras.layers import Dense

A layer can be added to the network by using the `add` method with the `Dense` layer created.

Let's create an instance of the `Sequential` class, then create and add one layer with two neurons to the network. 

In [4]:
model = Sequential()
model.add(Dense(2))

We get the same result if we simply say:

In [5]:
model = Sequential([Dense(2)])

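Keep in mind that the dimensions of the input data should be specified in the first layer of a `Sequential` model; otherwise, Keras cannot create the model's weights until it first sees data. 

For flat feature vectors, this can be done with the `input_dim` argument. For higher-dimensional data, use `input_shape`. Newer Keras releases prefer an explicit `Input(shape=...)` object as the first layer, but both arguments still illustrate the idea. 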

Let's create a multilayer perceptron model with 3 inputs, 6 neurons in the hidden layer, and 2 neurons in the output layer.

In [34]:
model = Sequential()
model.add(Dense(6, input_dim=3))
model.add(Dense(2))

We get the same result if we simply say: 

In [7]:
model = Sequential([
    Dense(6, input_dim=3),
    Dense(2)
])

We can also specify the **activation function** used for that layer. 

We will use the `activation` argument to specify the activation function associated with that layer. 

In [20]:
model = Sequential([
    Dense(6, input_dim=3, activation='relu'),
    Dense(2, activation='sigmoid')
])
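
To make the two layers above less abstract, here is a minimal NumPy sketch of the computation that a `Dense` layer with an activation performs: multiply by a weight matrix, add a bias, then apply the activation. The weights, the random seed, and the batch of four samples are hypothetical placeholders; Keras initialises and trains its own weights.

```python
import numpy as np


def relu(z):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(z, 0.0)


def sigmoid(z):
    """Logistic sigmoid: squashes values to the range between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# Hypothetical weights and biases (placeholders, not what Keras would learn).
W1, b1 = rng.normal(size=(3, 6)), np.zeros(6)  # hidden layer: 3 inputs -> 6 neurons
W2, b2 = rng.normal(size=(6, 2)), np.zeros(2)  # output layer: 6 inputs -> 2 neurons

x = rng.normal(size=(4, 3))  # a made-up batch of 4 samples with 3 features each

hidden = relu(x @ W1 + b1)          # shape (4, 6)
output = sigmoid(hidden @ W2 + b2)  # shape (4, 2)

print(hidden.shape, output.shape)
```

The shapes mirror the model definition: three inputs feed six hidden neurons, which feed two output neurons.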

> For more information on the `Sequential` class, head over to [TensorFlow](https://www.tensorflow.org/api_docs/python/tf/keras/Sequential). 

___

## Compiling the Network

Once our network is defined, it is time to compile it. **Compilation** configures the model for training and lets the backend turn our sequence of layers into a series of matrix operations that can be computed efficiently. 

To compile our network, we need to pass two arguments to the `compile` method: 
- `optimizer`: the optimization algorithm our model will use
- `loss`: the loss function our model will use

Let's compile our model, specifying the stochastic gradient descent (`sgd`) optimization algorithm and the mean squared error (`mean_squared_error`) loss function. 

In [22]:
model.compile(optimizer='sgd', loss='mean_squared_error')
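
As a rough illustration of what the mean squared error loss measures, the sketch below computes it by hand in plain Python. The target and prediction values are made up for the example, and Keras' own implementation operates on tensors rather than lists.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical targets and predictions.
y_true = [1.0, 0.0, 1.0, 1.0]
y_pred = [0.5, 0.0, 1.0, 0.0]

mse = mean_squared_error(y_true, y_pred)
print(mse)  # (0.25 + 0 + 0 + 1) / 4 = 0.3125
```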

We often want to configure our optimizer beforehand and then pass it as an argument when compiling the model. 

Let's create a stochastic gradient descent optimizer with a `learning_rate` of $0.08$ and `momentum` of $0.2$, and then compile our model with it. 

In [26]:
from tensorflow.keras.optimizers import SGD

optimization_algorithm = SGD(learning_rate=0.08, momentum=0.2)
model.compile(optimizer=optimization_algorithm, loss='mean_squared_error')
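
To see what `learning_rate` and `momentum` do, here is a small plain-Python sketch of the update rule that the Keras documentation gives for SGD with momentum (without Nesterov): the velocity accumulates a decayed history of gradient steps, and the weight moves by the velocity. The objective $f(w) = w^2$, the starting point, and the helper name `sgd_momentum_step` are assumptions chosen only for illustration.

```python
def sgd_momentum_step(w, velocity, grad, learning_rate=0.08, momentum=0.2):
    """One SGD-with-momentum update (no Nesterov) for a single scalar weight."""
    velocity = momentum * velocity - learning_rate * grad
    return w + velocity, velocity


# Hypothetical objective f(w) = w ** 2, whose gradient is 2 * w.
w, velocity = 1.0, 0.0
for _ in range(50):
    w, velocity = sgd_momentum_step(w, velocity, grad=2 * w)

print(w)  # close to the minimum at w = 0
```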

We can also define metrics we'd like the model to collect while training. Metrics are passed as a list of strings, even when there is only one. 

Let's compile our model, this time adding `accuracy` as a metric we would like to collect. 

In [29]:
model.compile(optimizer='sgd', loss='mean_squared_error', metrics=['accuracy'])

___

## Training the Network

Once the network is compiled, it is time to train it. **Training** updates the weights of the model based on training data given to the model. As expected, the model is trained using the optimization algorithm and loss function specified when compiling the model. 

To train our network, we need to pass the input and output training data into the `fit` method, alongside additional arguments such as:
- `batch_size`: the number of samples our network is exposed to before each weight update within an epoch
- `epochs`: the number of full passes over the training data
- `verbose`: the amount of information to be displayed on screen (0 = none, 1 = progress bar, 2 = one line per epoch)
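
The relationship between `batch_size`, `epochs`, and the number of weight updates can be sketched with a little arithmetic. The sample counts below are hypothetical, and the helper `weight_updates` is not part of Keras.

```python
import math


def weight_updates(n_samples, batch_size, epochs):
    """Total number of weight updates performed over a whole training run."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    batches_per_epoch = math.ceil(n_samples / batch_size)
    return batches_per_epoch * epochs


print(weight_updates(n_samples=1000, batch_size=20, epochs=200))  # 50 * 200 = 10000
print(weight_updates(n_samples=1010, batch_size=20, epochs=200))  # 51 * 200 = 10200
```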

Running the `fit` method on our model returns a `History` object whose `history` attribute provides a performance summary (which includes the loss alongside any additional metrics specified when compiling) for each epoch. 

Let's train our model using a generic `X_train` input and `y_train` output, with a `batch_size` of 20 for 200 `epochs`, setting the `verbose` value to $0$. 

In [32]:
# history = model.fit(X_train, y_train, batch_size=20, epochs=200, verbose=0)

> Keep in mind that the line above has been commented out because we haven't defined `X_train` and `y_train`; running it as-is would raise an error. 
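
For a version that actually runs, the sketch below trains the same architecture on synthetic data. The data, the random seed, and the labelling rule are invented for illustration, the input shape is declared with an explicit `Input` object, and only 5 epochs are used so that it finishes quickly.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

rng = np.random.default_rng(0)

# Hypothetical training data: 100 samples, 3 features, 2 one-hot target columns.
X_train = rng.normal(size=(100, 3)).astype("float32")
labels = (X_train.sum(axis=1) > 0).astype("int32")
y_train = np.eye(2, dtype="float32")[labels]

model = Sequential([
    Input(shape=(3,)),
    Dense(6, activation="relu"),
    Dense(2, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="mean_squared_error", metrics=["accuracy"])

# Five epochs keep the sketch quick; the text above uses 200.
history = model.fit(X_train, y_train, batch_size=20, epochs=5, verbose=0)

# history.history maps each tracked quantity to a list with one value per epoch.
print(sorted(history.history.keys()))
print(len(history.history["loss"]))
```

Plotting `history.history["loss"]` against the epoch number is a common way to check whether training is still improving.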

___

## Testing the Network

Once the network is trained, it is time to test it. **Testing** evaluates the performance of our model on a set of data that was unused during training. 

To test our network, we need to pass the input and output testing data into the `evaluate` method, alongside additional arguments such as:
- `verbose`: the amount of information to be displayed on screen

Running the `evaluate` method on our model returns the loss followed by any metrics specified when compiling (or just the loss value if no metrics were specified). 

Let's test our model using a generic `X_test` input and `y_test` output, setting the `verbose` value to $0$ and capturing both the loss and the `accuracy` of our model.

In [None]:
# loss, accuracy = model.evaluate(X_test, y_test, verbose=0)

> Keep in mind that the line above has been commented out because we haven't defined `X_test` and `y_test`; running it as-is would raise an error. 
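
A runnable sketch of the evaluation step follows, again on synthetic data. The data set, the 150/50 split, and the short training run are assumptions made only so that the example is self-contained.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

rng = np.random.default_rng(1)

# Hypothetical data set: 200 samples, 3 features, 2 one-hot target columns.
X = rng.normal(size=(200, 3)).astype("float32")
y = np.eye(2, dtype="float32")[(X.sum(axis=1) > 0).astype("int32")]

# Hold out the last 50 samples so they are never seen during training.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

model = Sequential([
    Input(shape=(3,)),
    Dense(6, activation="relu"),
    Dense(2, activation="sigmoid"),
])
model.compile(optimizer="sgd", loss="mean_squared_error", metrics=["accuracy"])
model.fit(X_train, y_train, batch_size=20, epochs=5, verbose=0)

# With one metric compiled in, evaluate returns [loss, accuracy].
loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
print(f"loss={loss:.4f}, accuracy={accuracy:.4f}")
```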

___

## Predicting with the Network

Now that we've defined, compiled, trained, and tested our network, we can finally use it to make predictions. **Predicting** estimates the output associated with a set of data that was unused during training and testing. 

To predict using our network, we need to pass the input data into the `predict` method, alongside additional arguments such as:
- `verbose`: the amount of information to be displayed on screen

Running the `predict` method on our model returns predictions in the format provided by the output layer of the network. 

- For <u>regression problems</u>, the output is typically in the same format as the target values, so it can be used directly. 
- For <u>binary classification problems</u>, the output is likely a probability that needs to be rounded to a $0$ or $1$. 
- For <u>multiclass classification problems</u>, the output is likely an array of probabilities that needs to be converted to a single class prediction.

Let's use our model to predict the output given a generic `X` input, setting the `verbose` value to $0$. 

In [None]:
# predictions = model.predict(X, verbose=0)

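For multiclass classification problems, we can convert each array of probabilities into an integer class value by taking the index of the largest probability with `np.argmax`. Older TensorFlow releases offered a `predict_classes` method for this, but it was removed in TensorFlow 2.6. 

In [None]:
# import numpy as np
# predictions = np.argmax(model.predict(X, verbose=0), axis=-1)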

> Keep in mind that the lines above have been commented out because we haven't defined `X`; running them as-is would raise an error. 
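
The conversion from predicted probabilities to class labels can be sketched with NumPy alone. The probability arrays below are hypothetical stand-ins for what `predict` might return from a binary classifier (one sigmoid output) and a three-class classifier (softmax outputs).

```python
import numpy as np

# Hypothetical sigmoid outputs of a binary classifier (one probability per sample).
binary_probs = np.array([[0.91], [0.12], [0.55]])
# Threshold at 0.5 to obtain 0/1 labels.
binary_labels = (binary_probs > 0.5).astype("int32").ravel()

# Hypothetical softmax outputs of a 3-class classifier (one row per sample).
multi_probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
])
# Pick the index of the largest probability in each row.
multi_labels = np.argmax(multi_probs, axis=-1)

print(binary_labels)  # [1 0 1]
print(multi_labels)   # [0 2 1]
```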

___

## Functional API

___

## Multilayer Perceptron (MLP)

___

## Convolutional Neural Network (CNN)

___

## Recurrent Neural Network (RNN)

In [21]:
model.summary()

Model: "sequential_8"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_11 (Dense)             (None, 6)                 24        
_________________________________________________________________
dense_12 (Dense)             (None, 2)                 14        
=================================================================
Total params: 38
Trainable params: 38
Non-trainable params: 0
_________________________________________________________________
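
The parameter counts in the summary above can be checked by hand: a `Dense` layer has one weight per input-neuron pair plus one bias per neuron. The helper `dense_param_count` below is a hypothetical name used only for this sketch.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


hidden_params = dense_param_count(n_inputs=3, n_units=6)  # 3 * 6 + 6 = 24
output_params = dense_param_count(n_inputs=6, n_units=2)  # 6 * 2 + 2 = 14

print(hidden_params, output_params, hidden_params + output_params)  # 24 14 38
```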
