## Title :
Dropout

## Description :

The goal of this exercise is to understand and use **dropout** for neural network regularization.

This method reduces overfitting by randomly switching off a fraction of the neurons (setting their outputs to zero) at each training step. At prediction time, all neurons are active.
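
To make the mechanism concrete, here is a minimal NumPy sketch of "inverted" dropout, the variant in which the surviving activations are scaled up by 1 / (1 - rate) during training so that their expected value is unchanged. The function name `dropout_forward` and the toy array are illustrative assumptions and are not part of the exercise's helper code.

```python
import numpy as np

def dropout_forward(activations, rate, rng, training=True):
    """Illustrative inverted dropout (hypothetical helper, not a Keras API)."""
    if not training or rate == 0.0:
        # In evaluation mode nothing is dropped or rescaled
        return activations
    keep_prob = 1.0 - rate
    # Each unit is kept independently with probability keep_prob
    mask = rng.random(activations.shape) < keep_prob
    # Scale the kept units so the expected activation stays the same
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
a = np.ones((1000, 100))  # toy activations, all equal to 1

dropped = dropout_forward(a, rate=0.3, rng=rng, training=True)
unchanged = dropout_forward(a, rate=0.3, rng=rng, training=False)

kept_fraction = np.mean(dropped != 0)
print("Fraction of units kept during training:", kept_fraction)
print("Mean activation during training:", dropped.mean())
print("Evaluation output equals input:", np.array_equal(unchanged, a))
```

With a rate of 0.3, roughly 70% of the units survive each training step, and the rescaling keeps the mean activation close to its original value.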

NOTE: The graph below is only a sample.

<img src="../fig/fig4.png" style="width: 500px;">

## Instructions:

- Use the helper function `unregularized_model` to:
    - Generate the predictor and response data using the helper code given.
    - Build a simple neural network with 5 hidden layers with 100 neurons each and display the trace plot. This network has no regularization.
- For the same model architecture, implement dropout by adding appropriate dropout layers.
- Compile the model with MSE as the loss. Fit the model on the training data.
- Use the helper code to visualise the MSE of the train and test data with respect to the epochs.
- Predict on the entire data. 
- Use the helper code to plot the predictions along with the generated data.
- This plot will consist of the predictions of both the neural networks. The graph will look similar to the one given above.
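
As a reminder of what the MSE in the steps above measures, the following sketch computes it by hand on a toy example. The arrays `toy_y_true` and `toy_y_hat` are made-up values for illustration only and are unrelated to the exercise data.

```python
import numpy as np

# Made-up targets and predictions (illustrative values only)
toy_y_true = np.array([1.0, 2.0, 3.0, 4.0])
toy_y_hat = np.array([1.5, 2.0, 2.0, 5.0])

# MSE is the mean of the squared differences between targets and predictions
toy_mse = np.mean((toy_y_true - toy_y_hat) ** 2)
print("Toy MSE:", toy_mse)
```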

## Hints: 

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/Sequential" target="_blank">tf.keras.Sequential()</a>
A sequential model is for a plain stack of layers where each layer has exactly one input tensor and one output tensor.

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/optimizers" target="_blank">tf.keras.optimizers</a>
An optimizer is one of the arguments typically passed when compiling a Keras model (the other being the loss).

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense" target="_blank">model.add()</a>
Adds layers to the model.

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model#compile" target="_blank">model.compile()</a>
Configures the model for training with the given optimizer and loss.

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model" target="_blank">model.fit()</a>
Trains the neural network on the given data.

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/Model" target="_blank">model.predict()</a>
Generates predictions from the trained model for the given inputs.

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/History" target="_blank">History</a>
A `History` object is returned by `model.fit()`. Its `history` attribute is a dictionary that maps each metric name (e.g. `loss`, `val_loss`) to a list of per-epoch values.

<a href="https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dropout" target="_blank">tf.keras.layers.Dropout()</a>
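Applies dropout to its input: during training, it randomly sets a fraction `rate` of the input units to 0 and scales the remaining units by 1 / (1 - rate). At inference time, the input passes through unchanged.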
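
The difference between the training and inference behaviour of the `Dropout` layer can be checked directly by calling the layer with the `training` flag. This is a minimal sketch on a toy tensor of ones; the variable names and the rate of 0.5 are arbitrary choices for illustration, not values prescribed by the exercise.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# A standalone Dropout layer with an illustrative rate of 0.5
drop = tf.keras.layers.Dropout(rate=0.5)
x = tf.ones((4, 5))

# Training mode: some units are zeroed, the rest are scaled by 1 / (1 - 0.5) = 2
out_train = drop(x, training=True).numpy()

# Inference mode: the input passes through unchanged
out_eval = drop(x, training=False).numpy()

print("Training mode:\n", out_train)
print("Inference mode:\n", out_eval)
```

Keras sets this flag automatically: `model.fit()` runs the layers in training mode, while validation, `model.evaluate()` and `model.predict()` run them in inference mode.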

In [0]:
# Import the necessary libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings("ignore")
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import models
from tensorflow.keras import optimizers
from tensorflow.keras import regularizers
np.random.seed(0)
tf.random.set_seed(0)
from helper import unregularized_model
from tensorflow.keras.models import load_model
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
%matplotlib inline

## Implement an unregularized NN 

In [0]:
# Call the helper function unregularized_model() to get the 
# unregularized model along with the data
x_b, x_train, x_test, y_train, y_test, y_pred, mse = unregularized_model()

In [0]:
# Printing the MSE of the unregularized model
print("MSE of the unregularized model is", mse)

## Implement the NN with dropouts
For dropout, we build the same network and add a `Dropout` layer after each hidden layer.

In [0]:
model_2 = models.Sequential(name='Dropout_regularized')

# 5 hidden layers with 100 neurons (or nodes) each
# Add a dropout layer after each hidden layer with some dropout rate (a fraction between 0 and 1)
model_2.add(layers.Dense(100, activation='relu', input_shape=(1,)))
model_2.add(___)

model_2.add(layers.Dense(100, activation='relu'))
model_2.add(___)

model_2.add(layers.Dense(100, activation='relu'))
model_2.add(___)

model_2.add(layers.Dense(100, activation='relu'))
model_2.add(___)

model_2.add(layers.Dense(100, activation='relu'))
model_2.add(___)

# Output layer with one neuron 
model_2.add(layers.Dense(1,  activation='linear'))


In [0]:
# Compile the model with MSE as the loss and the Adam optimizer with a learning rate of 0.001
___

# Fit the model on the train data and save the returned history
# Use a validation split of 0.2, 1500 epochs and a batch size of 10
history_2 = ___


In [0]:
# Helper code to plot the data

# Plot the MSE of the model
plt.rcParams["figure.figsize"] = (10,8)
plt.title("Dropout Regularized")
plt.semilogy(history_2.history['loss'], label='Train Loss', color='#FF9A98', linewidth=2)
plt.semilogy(history_2.history['val_loss'],  label='Validation Loss', color='#75B594', linewidth=2)

# Set the axes labels
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()


### ⏸ In the trace plot above, why is the validation error lower than the training error?


#### A. The dropout percentage is high and hence the model is underfit during validation.
#### B. During the validation phase, the validation loss is multiplied by the percentage of dropout, hence the loss is always lower than the training loss.
#### C. The dropout percentage is low and hence the model overfits on the validation data.
#### D. The validation takes place in the evaluation mode of dropout where the weights are already learned.

In [0]:
### edTest(test_chow1) ###
# Submit an answer choice as a string below
# (e.g. if you choose option A, put 'A')
answer1 = '___'

In [0]:
### edTest(test_mse) ###
# Use your model to predict on x_b (used exclusively for plotting)
y_hat_dropout = ___

# Use your model to predict on the test data
y_dropout_test = ___

# Compute the MSE on the test data
mse_dropout = ___

In [0]:
# Print the MSE of the dropout regularized model
print("MSE of the dropout regularized model is", mse_dropout)

In [0]:
# Use the helper code to plot the predicted data

# Plotting the predicted data using the dropout regularized model
plt.rcParams["figure.figsize"] = (10,8)
plt.plot(x_b, y_hat_dropout, label='Dropout regularized', color='black', linewidth=2)

# Plotting the predicted data using the unregularized model
plt.plot(x_b, y_pred, label = 'Unregularized model', color='#005493', linewidth=2)

# Plotting the training data
plt.plot(x_train, y_train, '.', label='Train data', markersize=15, color='#FF9A98')

# Plotting the testing data
plt.plot(x_test,y_test, '.', label='Test data', markersize=15, color='#75B594')

# Set the axes labels
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()

### ⏸ **After marking the exercise, change the dropout rate to 0.8 first and then to 0.2. Do you notice any change? Which value regularizes the neural network more?**

In [0]:
### edTest(test_chow2) ###
# Type your answer within the quotes given

answer2 = '___'