Deep Learning Regularization
Neural networks are prone to overfitting. To combat this, we regularize the model so that it does not fit the training data too closely and still performs well on new data.
Three common regularization techniques for deep learning are:
 Dropout
 Early stopping
 L1/L2 regularization
Dropout
One of the most common forms of regularization is dropout. Dropout randomly drops a portion of the neurons during training so that the model does not learn weights and biases that fit the training set too closely.
Visually, a dropout layer looks like this:

Notice that in the dropout layer, each neuron has a 50% probability (p = 0.5) of being dropped, meaning it is not included or updated, on a given training step. Keras draws a new random set of neurons to drop for every batch, not once per epoch. When we finalize our model and run it on the testing data, we include all of the neurons and do not drop any out.
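To make the mechanics concrete, here is a minimal NumPy sketch of what a dropout layer does to a batch of activations. It assumes the "inverted dropout" variant, in which the surviving activations are scaled by 1 / (1 - rate) during training so that nothing needs to be rescaled at test time. The function name `dropout_forward` and its arguments are illustrative and are not part of any library.

```python
import numpy as np

def dropout_forward(activations, rate, training, rng):
    """Apply inverted dropout to an array of activations (illustrative helper)."""
    if not training or rate == 0.0:
        # At test time every neuron is kept and nothing is rescaled
        return activations
    # Each unit is kept independently with probability (1 - rate)
    keep_mask = rng.random(activations.shape) >= rate
    # Scale the survivors so the expected activation stays the same
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
hidden = np.ones((4, 6))  # toy batch: 4 samples, 6 hidden units

train_out = dropout_forward(hidden, rate=0.5, training=True, rng=rng)
test_out = dropout_forward(hidden, rate=0.5, training=False, rng=rng)

print(train_out)  # every entry is either 0.0 (dropped) or 2.0 (kept and scaled)
print(test_out)   # unchanged: all ones
```

Calling the function again with `training=True` draws a fresh mask, which is why a different subset of neurons is silenced on each training step.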
Dropout in Keras
Let's try this in Keras! We will look at a neural network with and without dropout.
Note: you can watch a video walkthrough of this code at the end of this module.

In [None]:
# Imports
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout


# We will use the NBA rookie data again to predict whether or not we think a rookie will last at least 5 years in the league. 

df = pd.read_csv('/content/drive/path_to_data/nba.csv', index_col = 'Name')
df.head()


In [None]:
# Clean data & split into X & y
# Drop missings
df.dropna(inplace = True)
# Save X data
X = df.drop(columns = 'TARGET_5Yrs')
# Save y data (the target column; no encoding is applied here)
y = df['TARGET_5Yrs']


# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)


In [None]:
# Scale our data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


In [None]:
# Step 1: Define our network structure
# Save the number of features we have as our input shape
input_shape = X_train.shape[1]
input_shape

In [None]:
# Without dropout
# Sequential model
model = Sequential()
# First hidden layer
model.add(Dense(19, # How many neurons you have in your first hidden layer
                input_dim = input_shape, # What is the shape of your input features (number of columns)
                activation = 'relu')) # What activation function are you using?
model.add(Dense(10, 
                activation = 'relu'))
model.add(Dense(1, activation = 'sigmoid'))
model.compile(loss = 'bce', optimizer = 'adam')
history = model.fit(X_train, y_train,
                    validation_data = (X_test, y_test), 
                    epochs=100)


In [None]:
# Visualize the loss
plt.plot(history.history['loss'], label='Train loss')
plt.plot(history.history['val_loss'], label='Test loss')
plt.legend();


# Yikes, our model is very overfit! Notice how the training loss continues to decrease while the testing loss begins to increase as we train for more epochs. This is a very common problem with neural networks: the model keeps fitting the training data more closely while performing worse on unseen data.
# Let's build this same model with dropout to try to reduce overfitting. In Keras, dropout is coded as another layer, placed after the layer whose outputs you would like to drop out. You also need to specify the dropout rate (the probability that each individual neuron is dropped on a given training step).


In [None]:
# With dropout
# Sequential model
model = Sequential()
# First hidden layer
model.add(Dense(19, # How many neurons you have in your first hidden layer
                input_dim = input_shape, # What is the shape of your input features (number of columns)
                activation = 'relu')) # What activation function are you using?
model.add(Dropout(0.2))
model.add(Dense(10, 
                activation = 'relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation = 'sigmoid'))
model.compile(loss = 'bce', optimizer = 'adam')
history = model.fit(X_train, y_train,
                    validation_data = (X_test, y_test), 
                    epochs=100)
# Visualize the loss
plt.plot(history.history['loss'], label='Train loss')
plt.plot(history.history['val_loss'], label='Test loss')
plt.legend();
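The loss curves can also be compared numerically rather than by eye. The sketch below is one way to do that, under the assumption that the epoch with the lowest validation loss is a reasonable marker for where overfitting begins. The helper name `summarize_val_loss` and the toy loss values are made up for illustration; they are not results from the models above.

```python
import numpy as np

def summarize_val_loss(val_losses):
    """Return (best_epoch, best_loss, final_gap) for per-epoch validation losses.

    best_epoch is 1-indexed; final_gap is how much the last epoch's loss
    exceeds the best loss (0.0 means validation loss never got worse).
    """
    val_losses = np.asarray(val_losses, dtype=float)
    best_idx = int(np.argmin(val_losses))
    best_loss = float(val_losses[best_idx])
    final_gap = float(val_losses[-1] - best_loss)
    return best_idx + 1, best_loss, final_gap

# Toy values only, shaped like an overfit run: the loss falls, then climbs
toy_val_loss = [0.70, 0.62, 0.58, 0.60, 0.65, 0.71]
best_epoch, best_loss, final_gap = summarize_val_loss(toy_val_loss)
print(best_epoch, best_loss, round(final_gap, 2))
```

With the notebook's models, the same helper could be called on `history.history['val_loss']` after each `fit` to check whether dropout shrinks the gap between the best and the final validation loss.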