<a href="https://colab.research.google.com/github/amirmohammadkalateh/overfitting/blob/main/kernel_regularizer_regularization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
import tensorflow as tf
from tensorflow.keras import layers, regularizers
import matplotlib.pyplot as plt
import numpy as np

# Define the ANN model with L1 and L2 regularization
def create_regularized_ann(input_dim, hidden_units, output_dim, l1_reg=0.01, l2_reg=0.01):
    """
    Creates an Artificial Neural Network (ANN) model with L1 and L2 regularization
    applied to the kernel weights of the hidden layers.

    Args:
        input_dim (int): The dimensionality of the input features.
        hidden_units (list of int): A list specifying the number of units in each hidden layer.
        output_dim (int): The dimensionality of the output.
        l1_reg (float): The L1 regularization strength (lambda).
        l2_reg (float): The L2 regularization strength (lambda).

    Returns:
        tf.keras.Model: The Keras model. It is returned uncompiled; call
            model.compile(...) before training.
    """
    model = tf.keras.Sequential()
    model.add(layers.Input(shape=(input_dim,)))

    # Add hidden layers with L1 and L2 regularization on kernel weights
    for units in hidden_units:
        model.add(layers.Dense(units, activation='relu',
                               kernel_regularizer=regularizers.L1L2(l1=l1_reg, l2=l2_reg)))

    # Add the output layer
    model.add(layers.Dense(output_dim, activation='sigmoid'))  # Assuming binary classification

    return model

# Model parameters
input_dimension = 10
hidden_layer_units = [64, 32]
output_dimension = 1
l1_strength = 0.005
l2_strength = 0.001

# Create the regularized ANN model
regularized_model = create_regularized_ann(input_dimension, hidden_layer_units,
                                           output_dimension, l1_strength, l2_strength)

# Visualize the model architecture (plot_model needs pydot and Graphviz installed)
tf.keras.utils.plot_model(regularized_model, to_file='regularized_ann_model.png',
                          show_shapes=True, show_layer_names=True, dpi=96)

print("Regularized ANN model architecture saved to 'regularized_ann_model.png'")

# --- Explanation and Visualization Details ---

print("\n--- Explanation of the Model and Regularization ---")
print("\n**Model Architecture:**")
print("The model is a sequential Artificial Neural Network (ANN) built using TensorFlow/Keras.")
print("It consists of the following layers:")
print(f"- **Input Layer:** Accepts input features of dimension {input_dimension}.")
for i, units in enumerate(hidden_layer_units):
    print(f"- **Hidden Layer {i+1}:** A dense layer with {units} units and ReLU activation.")
    print("  - **Kernel Regularization:** Both L1 and L2 regularization are applied to the weights (kernel) of this layer.")
print(f"- **Output Layer:** A dense layer with {output_dimension} unit and sigmoid activation (for binary classification).")

print("\n**L1 and L2 Regularization:**")
print("Regularization techniques are used to prevent overfitting in machine learning models by adding a penalty term to the loss function.")
print("This penalty discourages the model from learning overly complex patterns from the training data.")

print("\n**1. L1 Regularization (Lasso):**")
print("- Adds a penalty equal to the sum of the absolute values of the weights, multiplied by a regularization strength ($\\lambda_1$).")
print("- The L1 regularization term in the loss function is: $$\\lambda_1 \\sum_{i=1}^{n} |w_i|$$")
print("- **Effect:** Tends to drive some weights to exactly zero, leading to sparse weight matrices. In the first layer this can act as a form of feature selection, because an input whose outgoing weights are all zero no longer affects the prediction.")
print(f"- In this model, the L1 regularization strength ($\\lambda_1$) is set to {l1_strength}.")

print("\n**2. L2 Regularization (Ridge):**")
print("- Adds a penalty equal to the sum of the squared weights, multiplied by a regularization strength ($\\lambda_2$).")
print("- The L2 regularization term in the loss function is: $$\\lambda_2 \\sum_{i=1}^{n} w_i^2$$")
print("- **Effect:** Shrinks the weights towards zero but rarely makes them exactly zero. It helps to reduce the impact of large weights, making the model less sensitive to individual data points.")
print(f"- In this model, the L2 regularization strength ($\\lambda_2$) is set to {l2_strength}.")

print("\n**L1L2 Regularizer in Keras:**")
print("- The `regularizers.L1L2(l1=l1_reg, l2=l2_reg)` in Keras applies both L1 and L2 regularization simultaneously to the kernel weights of the dense layers.")
print("- For each regularized layer, the penalty added to the main loss function is:")
print("  $$\\text{Loss}_{\\text{regularization}} = \\lambda_1 \\sum_{i=1}^{n} |w_i| + \\lambda_2 \\sum_{i=1}^{n} w_i^2$$")
print("- Because the regularizer is passed as the `kernel_regularizer` argument of the hidden `Dense` layers only, large kernel weights in those layers are penalized during training; biases and the output layer are not regularized.")

print("\n**Visualization:**")
print("The `tf.keras.utils.plot_model` function has been used to visualize the architecture of the regularized ANN.")
print("The generated image ('regularized_ann_model.png') shows:")
print("- The layers of the network (Input, Dense hidden layers, Output).")
print("- The shape of the output at each layer.")
print("- The name of each layer.")
print("- While the visualization shows the structure, it **doesn't directly visualize the effect of regularization** on the weights themselves. Visualizing the weight values would typically involve inspecting the trained model's weights after training on some data, perhaps through histograms or heatmaps.")

print("\nTo further visualize the effect of regularization, you would typically:")
print("1. Train the model with and without regularization on the same dataset.")
print("2. Examine the distribution of the weights in the trained models (e.g., using histograms). You would likely observe that the regularized model has smaller-magnitude weights than the unregularized model. With a nonzero L1 term, it might also show more weights close to zero.")
print("3. Compare the performance of the models on a separate test set. A well-regularized model should generalize better and have lower test error than an overfit unregularized model.")
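The three steps above can be sketched roughly as follows. This is a minimal illustration, not a benchmark: the synthetic dataset, the helper names `build_model` and `hidden_kernels`, the optimizer, and the epoch count are all assumptions introduced here. Only the layer sizes and the L1/L2 strengths come from the cell above. Both models start from the same initial weights so that differences after training are easier to attribute to the penalty.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, regularizers

tf.keras.utils.set_random_seed(0)

# Hypothetical synthetic binary-classification data with 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10)).astype("float32")
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype("float32")
X_train, y_train = X[:300], y[:300]
X_test, y_test = X[300:], y[300:]


def build_model(l1=0.0, l2=0.0):
    """Build and compile a 10 -> 64 -> 32 -> 1 network (illustrative helper)."""
    def make_reg():
        # No regularizer at all when both strengths are zero.
        return regularizers.L1L2(l1=l1, l2=l2) if (l1 or l2) else None

    model = tf.keras.Sequential([
        layers.Input(shape=(10,)),
        layers.Dense(64, activation="relu", kernel_regularizer=make_reg()),
        layers.Dense(32, activation="relu", kernel_regularizer=make_reg()),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


def hidden_kernels(model):
    """Return all hidden-layer kernel weights as one flat array."""
    dense = [l for l in model.layers if isinstance(l, layers.Dense)]
    return np.concatenate([l.get_weights()[0].ravel() for l in dense[:-1]])


# Step 1: train with and without regularization from the same starting point.
plain = build_model()
regularized = build_model(l1=0.005, l2=0.001)
regularized.set_weights(plain.get_weights())
for model in (plain, regularized):
    model.fit(X_train, y_train, epochs=20, batch_size=32, verbose=0)

# Step 2: summarize the hidden-layer weight distributions.
for name, model in (("plain", plain), ("regularized", regularized)):
    w = hidden_kernels(model)
    print(f"{name}: mean |w| = {np.abs(w).mean():.4f}, "
          f"share with |w| < 0.01 = {(np.abs(w) < 0.01).mean():.2%}")

# Step 3: compare held-out accuracy. The reported loss of the regularized
# model includes its penalty term, so accuracy is the fairer comparison.
for name, model in (("plain", plain), ("regularized", regularized)):
    loss, acc = model.evaluate(X_test, y_test, verbose=0)
    print(f"{name}: test accuracy = {acc:.3f}")
```

For a visual comparison, `plt.hist(hidden_kernels(plain), bins=50)` and the same call on `regularized` would show the two weight distributions side by side. On a dataset this small and simple the accuracy gap may be negligible or even reversed; the weight statistics are the more reliable signal here.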

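The combined penalty that `regularizers.L1L2` adds for a single kernel, as described in the cell above, can be checked by hand with NumPy. The helper `l1l2_penalty` and the 2x2 `kernel` are hypothetical values chosen for easy arithmetic; the strengths match `l1_strength` and `l2_strength` from the cell.

```python
import numpy as np


def l1l2_penalty(weights, l1=0.0, l2=0.0):
    """Return l1 * sum(|w|) + l2 * sum(w^2) for one weight array."""
    w = np.asarray(weights, dtype=float)
    return l1 * np.abs(w).sum() + l2 * np.square(w).sum()


# Hypothetical kernel: sum(|w|) = 3.5 and sum(w^2) = 5.25.
kernel = np.array([[0.5, -1.0],
                   [0.0, 2.0]])

l1_term = l1l2_penalty(kernel, l1=0.005)           # 0.005 * 3.5
l2_term = l1l2_penalty(kernel, l2=0.001)           # 0.001 * 5.25
total = l1l2_penalty(kernel, l1=0.005, l2=0.001)   # sum of both terms

print(f"L1 term: {l1_term:.5f}, L2 term: {l2_term:.5f}, total: {total:.5f}")
```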
Regularized ANN model architecture saved to 'regularized_ann_model.png'

--- Explanation of the Model and Regularization ---

**Model Architecture:**
The model is a sequential Artificial Neural Network (ANN) built using TensorFlow/Keras.
It consists of the following layers:
- **Input Layer:** Accepts input features of dimension 10.
- **Hidden Layer 1:** A dense layer with 64 units and ReLU activation.
  - **Kernel Regularization:** Both L1 and L2 regularization are applied to the weights (kernel) of this layer.
- **Hidden Layer 2:** A dense layer with 32 units and ReLU activation.
  - **Kernel Regularization:** Both L1 and L2 regularization are applied to the weights (kernel) of this layer.
- **Output Layer:** A dense layer with 1 unit and sigmoid activation (for binary classification).

**L1 and L2 Regularization:**
Regularization techniques are used to prevent overfitting in machine learning models by adding a penalty term to the loss function.
This penalty discourages the model from