### Part 1: Understanding Regularization

In [None]:
"""1. What is regularization in the context of deep learning? Why is it important?
    Ans: Regularization in deep learning refers to techniques that prevent overfitting by adding extra constraints or penalties to the model during training. 
         It is essential because it helps improve generalization, reducing the risk of the model memorizing the training data and performing poorly on unseen data.
 
2. Explain the bias-variance tradeoff and how regularization helps in addressing this tradeoff.
    Ans: The bias-variance tradeoff describes two competing sources of error. Bias is error from overly simple assumptions, which causes underfitting; variance is error from sensitivity to the particular training set, which causes overfitting.
         Regularization addresses this tradeoff by penalizing model complexity during training. It accepts a small increase in bias in exchange for a larger reduction in variance, which typically improves performance on unseen data.

3. Describe the concept of L1 and L2 regularization. How do they differ in terms of penalty calculation and their effects on the model? 
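    Ans:  L1 regularization adds a penalty proportional to the sum of the absolute values of the weights (lambda * sum(|w|)), which promotes sparsity. 
          L2 regularization adds a penalty proportional to the sum of the squared weights (lambda * sum(w^2)), which encourages small but non-zero weights. 
          The difference comes from the gradients: the L1 gradient has constant magnitude, so it can push weights to exactly zero, while the L2 gradient shrinks in proportion to the weight, so weights decay toward zero without reaching it.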

4. Discuss the role of regularization in preventing overfitting and improving the generalization  of deep learning models. 
    Ans: Regularization prevents overfitting in deep learning models by adding constraints or penalties during training. 
         It discourages the model from becoming too complex and memorizing the training data. By favoring simpler models, regularization improves generalization, 
         allowing the model to perform better on unseen data and make more accurate predictions."""
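
The L1 and L2 penalties described in question 3 can be computed directly. The following is a minimal NumPy sketch, not part of the original notebook: the function names, the example weight vector `w`, and the strength `lam` are illustrative assumptions.

```python
import numpy as np


def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weight values."""
    return lam * np.sum(np.abs(weights))


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weight values."""
    return lam * np.sum(np.square(weights))


def l1_grad(weights, lam):
    """Subgradient of the L1 penalty: constant magnitude lam, zero at w == 0."""
    return lam * np.sign(weights)


def l2_grad(weights, lam):
    """Gradient of the L2 penalty: proportional to each weight."""
    return 2.0 * lam * weights


# Hypothetical weights and regularization strength, chosen only for illustration
w = np.array([0.5, -2.0, 0.0, 0.01])
lam = 0.1

print("L1 penalty:", l1_penalty(w, lam))
print("L2 penalty:", l2_penalty(w, lam))
print("L1 gradient:", l1_grad(w, lam))
print("L2 gradient:", l2_grad(w, lam))
```

For the small weight 0.01, the L1 gradient has the same magnitude as for the large weight -2.0, while the L2 gradient is almost zero. This is why L1 tends to drive small weights all the way to zero and L2 does not.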

### Part 2: Regularization Techniques

In [None]:
"""5. Explain Dropout regularization and how it works to reduce overfitting. Discuss the impact of Dropout on model training and inference. 
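    Ans: Dropout randomly deactivates (sets to zero) a fraction of neurons at each training step. This prevents neurons from co-adapting, i.e. relying too heavily on specific other neurons, which reduces overfitting.
         During inference, all neurons are active. To keep expected activations consistent, the original formulation scales the weights by the keep probability at inference, while the inverted dropout used by Keras scales the kept activations by 1 / (1 - rate) during training instead. Dropout can slow convergence, but it often improves generalization.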

6. Describe the concept of Early Stopping as a form of regularization. How does it help prevent overfitting during the training process? 
    Ans: Early stopping is a form of regularization that halts training once the model's performance on a validation set stops improving, usually after waiting a fixed number of epochs (the patience). 
         Validation loss typically starts rising when the model begins to fit noise in the training data, so stopping at that point, and keeping the weights from the best epoch, limits overfitting and preserves the model's ability to generalize to unseen data.

7. Explain the concept of Batch Normalization and its role as a form of regularization. How does Batch Normalization help in preventing overfitting? 
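    Ans: Batch Normalization normalizes the activations of a layer using the mean and variance of the current mini-batch during training, then applies a learned scale and shift. At inference it uses running averages of those statistics. 
         Its main purpose is to stabilize and speed up training, which the original paper attributed to reduced internal covariate shift. Its regularizing effect is a side effect: because the statistics are computed per mini-batch, each example's activations pick up a small amount of noise, which can mildly reduce overfitting."""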
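
The difference between Dropout at training time and at inference time (question 5) can be illustrated with a small NumPy sketch of inverted dropout, the variant in which kept activations are scaled up during training. This is not part of the original notebook; `dropout_forward` and its arguments are hypothetical names used only for illustration.

```python
import numpy as np


def dropout_forward(x, rate, training, rng):
    """Apply inverted dropout to activations x.

    During training, each unit is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate) so the expected activation is
    unchanged. At inference the input is returned as-is.
    """
    if not training or rate == 0.0:
        return x
    keep_prob = 1.0 - rate
    # Boolean mask: True for units that are kept
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob


rng = np.random.default_rng(0)
activations = np.ones((4, 5))

train_out = dropout_forward(activations, rate=0.2, training=True, rng=rng)
infer_out = dropout_forward(activations, rate=0.2, training=False, rng=rng)

print("training output:\n", train_out)
print("inference output:\n", infer_out)
```

With all-ones input and `rate=0.2`, every training output is either 0 (dropped) or 1.25 (kept and rescaled), while the inference output equals the input.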

### Part 3: Applying Regularization

In [3]:
import tensorflow as tf
import numpy as np
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense, Flatten, Dropout


def model(dropout_rate):
    """Train an MLP on MNIST with the given dropout rate and return its best validation accuracy."""
    # Load the dataset
    (X_train_full, y_train_full), (X_test, y_test) = mnist.load_data()

    # Pixel values are unsigned integers in the range 0-255; divide by 255 to scale them to [0, 1]
    X_valid, X_train = X_train_full[:5000] / 255., X_train_full[5000:] / 255.
    y_valid, y_train = y_train_full[:5000], y_train_full[5000:]

    # Scale the test set as well (not used below; the comparison relies on the validation set)
    X_test = X_test / 255.

    # Use a distinct local name so the network does not shadow this function's name
    net = tf.keras.models.Sequential()
    net.add(Flatten(input_shape=[28, 28], name="inputLayer"))
    net.add(Dense(300, activation='relu', name="hiddenLayer1"))
    net.add(Dropout(dropout_rate))
    net.add(Dense(100, activation='relu', name="hiddenLayer2"))
    net.add(Dropout(dropout_rate))
    net.add(Dense(10, activation="softmax", name="outputLayer"))

    net.compile(loss='sparse_categorical_crossentropy',
                optimizer='adam',
                metrics=["accuracy"])
    history = net.fit(X_train, y_train, epochs=25,
                      validation_data=(X_valid, y_valid), batch_size=1000)

    # Best validation accuracy reached across all epochs
    return max(history.history['val_accuracy'])
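
The training function above always runs for a fixed 25 epochs and reports the best validation accuracy afterwards. The early stopping rule described in question 6 would instead halt training once the validation loss stops improving. The following is a framework-free sketch of that rule under stated assumptions: `early_stopping_epoch`, `patience`, `min_delta`, and the loss values are hypothetical and not taken from the notebook's runs.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch where the loss has failed to improve
    on the best value by more than `min_delta` for `patience` consecutive
    epochs. If that never happens, the last epoch is returned.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0

    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch

    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, made up for illustration only
hypothetical_val_losses = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.63, 0.66]
stop_epoch, best_epoch = early_stopping_epoch(hypothetical_val_losses, patience=3)
print(f"stopped at epoch {stop_epoch}, best epoch was {best_epoch}")
```

In practice the weights from `best_epoch` would be restored after stopping, so the final model is the one with the lowest validation loss rather than the one from the last epoch trained.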

In [4]:
# 8. Implement Dropout regularization in a deep learning model using a framework of your choice. Evaluate its impact on model performance and compare it with a model without Dropout. 

model_performance_with_Dropout_Layer = model(dropout_rate=0.2)
model_performance_without_Dropout_Layer = model(dropout_rate=0)
print("\n\n\n")
print(f"Model Performance With Dropout Layer : {model_performance_with_Dropout_Layer}")
print(f"Model Performance Without Dropout Layer : {model_performance_without_Dropout_Layer}")






Model Performance With Dropout Layer : 0.9837999939918518
Model Performance Without Dropout Layer : 0.9797999858856201


### As the output above shows, adding Dropout layers slightly increased the best validation accuracy in this run (0.9838 vs. 0.9798).

In [None]:
"""9. Discuss the considerations and tradeoffs when choosing the appropriate regularization technique for a given deep learning task. 
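    Ans: When choosing a regularization technique, consider the task complexity, the model size, and the amount of training data. L1 and L2 regularization penalize large weights (L1 also drives some weights to exactly zero), 
         while Dropout randomly deactivates neurons during training and usually needs more epochs to converge. Early stopping adds almost no cost but requires a held-out validation set. 
         These techniques can be combined, and their strengths are hyperparameters, so experimentation on a validation set is the most reliable way to balance performance against computational cost."""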