# Regularization and Dropout in Neural Networks

Deep neural networks can easily **overfit** the training data. To reduce overfitting, we use:

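- **L2 Regularization (Weight Decay)**: Adds a penalty proportional to the sum of squared weights to the loss, which discourages large weights. (With adaptive optimizers such as Adam, an L2 penalty is not exactly the same as decoupled weight decay.)
- **Dropout**: Randomly zeroes a fraction of a layer's activations on each training step to discourage co-adaptation between neurons. It is disabled at inference.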

In this notebook, we’ll demonstrate both techniques on the MNIST dataset.

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.datasets import mnist
from tensorflow.keras import regularizers
import matplotlib.pyplot as plt

print("TensorFlow version:", tf.__version__)

## Load and preprocess dataset

In [None]:
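# Load the 60,000 training and 10,000 test images (28x28 grayscale digits).
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Scale pixel values from [0, 255] to [0, 1].
x_train, x_test = x_train / 255.0, x_test / 255.0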

print("Training data:", x_train.shape)
print("Test data:", x_test.shape)

## Model with L2 Regularization

In [None]:
model_l2 = Sequential([
    Flatten(input_shape=(28,28)),
    Dense(128, activation='relu', kernel_regularizer=regularizers.l2(0.001)),
    Dense(10, activation='softmax')
])

model_l2.compile(optimizer='adam',
                 loss='sparse_categorical_crossentropy',
                 metrics=['accuracy'])

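# Note: the test set doubles as validation data here, so val_accuracy is only a
# monitoring signal; hyperparameters should be tuned on a separate held-out split.
history_l2 = model_l2.fit(x_train, y_train, epochs=5, batch_size=32,
                          validation_data=(x_test, y_test), verbose=0)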
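
To make the penalty concrete: Keras documents `regularizers.l2(l2)` as adding `l2 * sum(w ** 2)` over the regularized kernel to the training loss. The pure-Python sketch below reproduces that arithmetic on a tiny made-up weight matrix. The names `l2_penalty`, `l2_gradient`, and `toy_kernel` are illustrative only and are not part of this notebook or of Keras.

```python
def l2_penalty(weights, l2=0.001):
    """Return l2 * (sum of squared weights) for a 2-D list of weights."""
    return l2 * sum(w * w for row in weights for w in row)


def l2_gradient(weights, l2=0.001):
    """Gradient of the penalty with respect to each weight: 2 * l2 * w."""
    return [[2.0 * l2 * w for w in row] for row in weights]


# Hypothetical 2x2 kernel, chosen only to keep the arithmetic easy to follow.
toy_kernel = [[0.5, -1.0], [2.0, 0.0]]

penalty = l2_penalty(toy_kernel)    # 0.001 * (0.25 + 1.0 + 4.0 + 0.0)
gradient = l2_gradient(toy_kernel)  # larger weights get a larger pull toward zero

print("penalty:", penalty)
print("gradient:", gradient)
```

Because the gradient of the penalty grows with the weight itself, large weights are shrunk faster than small ones, which is the sense in which the penalty discourages large weights.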

## Model with Dropout

In [None]:
model_dropout = Sequential([
    Flatten(input_shape=(28,28)),
    Dense(128, activation='relu'),
    Dropout(0.5),  # Randomly zero 50% of the activations during training (inactive at inference)
    Dense(10, activation='softmax')
])

model_dropout.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])

history_dropout = model_dropout.fit(x_train, y_train, epochs=5, batch_size=32,
                                    validation_data=(x_test, y_test), verbose=0)
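
The Keras `Dropout` layer uses what is usually called inverted dropout: during training each input is zeroed with probability `rate` and the surviving inputs are scaled by `1 / (1 - rate)`, so the expected sum is unchanged; at inference the inputs pass through untouched. The following is a minimal pure-Python sketch of that behaviour, not the Keras implementation. `inverted_dropout` and its arguments are hypothetical names.

```python
import random


def inverted_dropout(activations, rate=0.5, training=True, rng=None):
    """Zero each activation with probability `rate` and rescale the survivors."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)  # inference: identity
    if rng is None:
        rng = random.Random()
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
hidden = [1.0] * 10_000  # stand-in for a layer's activations

train_out = inverted_dropout(hidden, rate=0.5, training=True, rng=rng)
eval_out = inverted_dropout(hidden, rate=0.5, training=False)

print("fraction zeroed:", train_out.count(0.0) / len(train_out))
print("mean during training:", sum(train_out) / len(train_out))
print("mean at inference:", sum(eval_out) / len(eval_out))
```

The rescaling is why no correction is needed at inference: the mean activation seen by the next layer is roughly the same in both modes.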

## Compare Results

In [None]:
plt.plot(history_l2.history['val_accuracy'], label='L2 Regularization')
plt.plot(history_dropout.history['val_accuracy'], label='Dropout')
plt.title('Validation Accuracy Comparison')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
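
A plot of validation accuracy alone does not show how much each model overfits; the gap between training and validation accuracy is a more direct signal. The helper below is a small sketch that summarizes a Keras-style history dictionary (the `history.history` attribute used above). `summarize_history` is a hypothetical name, and `placeholder` holds made-up numbers purely so the block runs on its own; they are not results from this notebook.

```python
def summarize_history(history_dict):
    """Summarize final accuracies and the train/validation gap."""
    train_acc = history_dict["accuracy"][-1]
    val_curve = history_dict["val_accuracy"]
    val_acc = val_curve[-1]
    # 1-based index of the epoch with the highest validation accuracy.
    best_epoch = max(range(len(val_curve)), key=val_curve.__getitem__) + 1
    return {
        "train_accuracy": train_acc,
        "val_accuracy": val_acc,
        "gap": train_acc - val_acc,  # a large positive gap suggests overfitting
        "best_val_epoch": best_epoch,
    }


# Placeholder values for illustration only, NOT measured results.
placeholder = {
    "accuracy": [0.90, 0.95, 0.99],
    "val_accuracy": [0.89, 0.93, 0.92],
}

summary = summarize_history(placeholder)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In the notebook, the same helper could be called as `summarize_history(history_l2.history)` and `summarize_history(history_dropout.history)` to compare the two models side by side.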

## Key Takeaways
- **L2 Regularization** penalizes large weights, which reduces (but does not eliminate) overfitting.
- **Dropout** makes the network more robust by preventing it from relying on any single neuron.
- In practice, the two techniques are often combined, which can improve generalization further.
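
As one possible way to combine the two, the sketch below puts an L2 penalty on the hidden layer's kernel and a `Dropout` layer after it. The values `0.001` and `0.5` are simply carried over from the models above and are not tuned settings; `build_combined_model` is a hypothetical helper name.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers


def build_combined_model(l2=0.001, rate=0.5):
    """Same architecture as above, with both an L2 penalty and dropout."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        layers.Flatten(),
        layers.Dense(128, activation="relu",
                     kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(rate),  # active only during training
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


model_combined = build_combined_model()
model_combined.summary()
```

The model can be trained with the same `fit` call as the other two. Whether the combination actually generalizes better than either technique alone depends on the data and on the chosen `l2` and `rate`, so it should be checked on held-out data.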