---------------------
### Dropout is a regularization technique 
- It is commonly used in deep learning models, including those built with Keras, to prevent overfitting.
- Overfitting occurs when a neural network performs well on the training data but fails to generalize to unseen data.
- Dropout mitigates this by randomly deactivating (dropping out) a fraction of a layer's units during each training step.
- Because no unit can count on any particular other unit being present, the network is pushed to learn more robust features instead of relying on a few specific activations.
-----------------------------
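
The mechanism described above can be sketched in a few lines of NumPy. This is a minimal illustration of "inverted" dropout, the variant in which surviving activations are scaled up by 1 / (1 - rate) so that nothing needs to change at inference time. The function name `inverted_dropout` and the array sizes are made up for this example; this is not Keras's internal implementation.

```python
import numpy as np

def inverted_dropout(x, rate, rng):
    """Zero a random fraction `rate` of x and rescale the survivors."""
    keep_prob = 1.0 - rate
    # Boolean mask: True where a unit is kept, False where it is dropped
    mask = rng.random(x.shape) < keep_prob
    # Dividing by keep_prob keeps the expected activation unchanged
    return x * mask / keep_prob

rng = np.random.default_rng(0)                         # fixed seed for repeatability
activations = np.ones((1000, 512), dtype=np.float32)   # hypothetical layer output

dropped = inverted_dropout(activations, rate=0.5, rng=rng)
fraction_zeroed = float(np.mean(dropped == 0.0))
mean_activation = float(dropped.mean())

print(f'Fraction zeroed: {fraction_zeroed:.3f}')
print(f'Mean activation after dropout: {mean_activation:.3f}')
```

With a rate of 0.5, roughly half of the values should be zeroed and the rest doubled, so the mean activation stays close to its original value of 1.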

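In Keras, you can apply dropout with the standalone Dropout layer, or by setting the dropout and recurrent_dropout arguments of recurrent layers such as LSTM and GRU. Dense has no dropout argument, so to regularize a Dense layer, place a Dropout layer after it.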
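
For recurrent layers, a minimal sketch might look like the following. The input shape (20 timesteps, 8 features), the layer width, and the 0.2 rates are arbitrary values chosen for illustration, not recommendations from this notebook.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

timesteps, features = 20, 8  # hypothetical sequence shape

rnn_model = Sequential([
    Input(shape=(timesteps, features)),
    # dropout: applied to the layer inputs
    # recurrent_dropout: applied to the recurrent (hidden-to-hidden) state
    LSTM(32, dropout=0.2, recurrent_dropout=0.2),
    Dense(1, activation='sigmoid'),
])
rnn_model.compile(loss='binary_crossentropy', optimizer='adam')
```

A standalone Dropout layer placed after the LSTM would only mask the layer's outputs; the two arguments above act inside the recurrent computation instead.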

Here's a complete example that uses the Dropout layer in a feedforward network trained on MNIST:

In [28]:
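import numpy as np
# Import everything from tensorflow.keras so the code does not mix two Keras namespaces
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical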

In [29]:
# Load and preprocess the MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(X_train.shape[0], 784).astype('float32') / 255
X_test  = X_test.reshape(X_test.shape[0], 784).astype('float32') / 255

y_train = to_categorical(y_train)  # One-hot encode labels
y_test  = to_categorical(y_test)    # One-hot encode labels

In [31]:
# Create a simple feedforward neural network with dropout
model = Sequential()
model.add(Dense(512, input_dim=784, activation='relu'))
model.add(Dropout(0.5))  # Dropout with a rate of 0.5

model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))  # Dropout with a rate of 0.5

model.add(Dense(10, activation='softmax'))

In [32]:
# Compile the model
model.compile(loss     ='categorical_crossentropy', 
              optimizer='adam', 
              metrics  =['accuracy'])

In [33]:
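# Train the model
# Note: the test set is reused as validation data here; when tuning
# hyperparameters, hold out a separate validation split instead.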
model.fit(X_train, 
          y_train, 
          epochs=10, 
          batch_size=128, 
          verbose=1, 
          validation_data=(X_test, y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x2028aa4e1d0>

In [34]:
# Evaluate the model
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Test loss: {loss:.4f}, Test accuracy: {accuracy:.4f}')

Test loss: 0.0614, Test accuracy: 0.9817
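
Keras applies dropout only in training mode, so `model.evaluate` and `model.predict` run with dropout switched off. The small check below illustrates this on a standalone Dropout layer; the input array and the seed are arbitrary values chosen for this sketch.

```python
import numpy as np
from tensorflow.keras.layers import Dropout

layer = Dropout(0.5, seed=0)
x = np.ones((4, 1000), dtype='float32')  # hypothetical activations

# training=True: units are randomly zeroed and the survivors are scaled up
train_out = np.asarray(layer(x, training=True))
# training=False: the layer passes its input through unchanged
infer_out = np.asarray(layer(x, training=False))

print('Values seen in training mode: ', np.unique(train_out))
print('Values seen in inference mode:', np.unique(infer_out))
```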


#### Tips For Using Dropout

- Generally, use a dropout rate between 20% and 50%, with 20% as a good starting point. A rate that is too low has minimal effect, while a rate that is too high causes the network to underfit.

- Use a __larger network__. You are likely to get better performance when dropout is used on a larger network, giving the model more of an opportunity to learn independent representations.

- Use dropout on __input__ (visible) as well as __hidden__ units. Applying dropout at each layer of the network has shown good results.

- Use a __large learning rate__ with __decay__ and a __large momentum__. Increase the learning rate by a factor of 10 to 100 relative to what you would use without dropout, and use a high momentum value of 0.9 or 0.99.

- __Constrain the size of network weights__. A large learning rate can produce very large weights. Imposing a constraint such as max-norm regularization, with a maximum norm of 4 or 5, has been shown to improve results.
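
Several of these tips can be combined in one model definition. The sketch below is only an illustration under assumed values: the 0.2 input dropout, the max-norm limit of 4, the SGD learning rate of 0.1 (10 times Keras's default SGD rate of 0.01), and the decay schedule settings are example choices, not tuned results from this notebook.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout
from tensorflow.keras.constraints import MaxNorm
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.optimizers.schedules import ExponentialDecay

# Large initial learning rate that decays as training progresses
# (decay_steps and decay_rate are illustrative values)
lr_schedule = ExponentialDecay(initial_learning_rate=0.1,
                               decay_steps=1000,
                               decay_rate=0.9)

tuned_model = Sequential([
    Input(shape=(784,)),
    Dropout(0.2),                                   # dropout on the input (visible) units
    Dense(512, activation='relu',
          kernel_constraint=MaxNorm(4)),            # cap the norm of each unit's weights
    Dropout(0.5),
    Dense(256, activation='relu',
          kernel_constraint=MaxNorm(4)),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])

tuned_model.compile(loss='categorical_crossentropy',
                    optimizer=SGD(learning_rate=lr_schedule, momentum=0.9),
                    metrics=['accuracy'])
```

Whether this configuration beats the Adam-based model above on MNIST is not something this sketch establishes; treat the values as starting points to experiment with.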