## Explanation of the Code
1. Class Initialization:
   - The `__init__` method stores `epsilon` (for numerical stability) and `momentum` (which controls how quickly the running statistics are updated). `gamma`, `beta`, `running_mean`, and `running_var` start as `None`.
2. Parameter Initialization:
   - The `initialize_params` method creates the learnable parameters `gamma` (scale, initialized to ones) and `beta` (shift, initialized to zeros), as well as `running_mean` (zeros) and `running_var` (ones) for use at inference.
3. Forward Pass (Normalization):
   - If `training=True`, the mean and variance are computed over the batch axis (axis 0) of the current batch. These statistics are used to normalize the batch, and the running mean and variance are updated using `momentum`.
   - If `training=False`, the running mean and variance accumulated during training are used for normalization instead.
   - Finally, the normalized values are scaled by `gamma` and shifted by `beta`.
4. Updating Parameters:
   - The `update_params` method overwrites `gamma` and `beta` with new values, which is useful when an external training loop computes their updates through backpropagation.
5. Example Usage:
   - The example creates a random input batch and applies the layer in training mode. Calling `forward` again with `training=False` would normalize with the running statistics instead, which is how the two phases differ.
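For reference, the forward pass described in step 3 can be written compactly as below. The notation is introduced here for the summary and does not come from the code: $m$ is the batch size, $x_i$ a single sample, $\mu_B$ and $\sigma_B^2$ the batch mean and (biased) variance, $\alpha$ the momentum, and $\epsilon$ the stability constant.

```latex
\begin{aligned}
\mu_B &= \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad
\sigma_B^2 = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2 \\
\hat{x}_i &= \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad
y_i = \gamma\, \hat{x}_i + \beta \\
\mu_{\text{run}} &\leftarrow \alpha\, \mu_{\text{run}} + (1 - \alpha)\, \mu_B, \qquad
\sigma^2_{\text{run}} \leftarrow \alpha\, \sigma^2_{\text{run}} + (1 - \alpha)\, \sigma_B^2
\end{aligned}
```

At inference, $\mu_B$ and $\sigma_B^2$ in the second line are replaced by $\mu_{\text{run}}$ and $\sigma^2_{\text{run}}$, and the third line is skipped.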
## Key Differences from TensorFlow's BatchNormalization
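- Backpropagation and Parameter Updates: In a complete training process, `gamma` and `beta` are learned from gradients computed during backpropagation. This code has no backward pass; the `update_params` method only overwrites the two parameters, so it could be called with values produced by an external optimizer.
- Momentum and Running Statistics: This example updates `running_mean` and `running_var` as an exponential moving average controlled by `momentum` and uses them at inference, as TensorFlow's layer does. The defaults differ, though: `tf.keras.layers.BatchNormalization` defaults to `momentum=0.99` and `epsilon=1e-3`, whereas this class uses `0.9` and `1e-5`.
- Normalization Axis: This class always reduces over axis 0 only, so it keeps one mean and variance per position of a sample. The Keras layer defaults to `axis=-1` and keeps one mean and variance per entry of the last axis, reducing over all other axes. For inputs with more than two dimensions the two therefore normalize differently.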
This custom BatchNormalization layer can be integrated into a larger neural network framework or used for experimentation.
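The backward pass that this notebook leaves out could look roughly like the sketch below. It is a standalone illustration, not part of the class: the function names `batchnorm_forward` and `batchnorm_backward`, the cached tuple, and the plain SGD step are all assumptions made for the example. It uses the standard batch-normalization gradients for statistics taken over axis 0 with the biased variance.

```python
import numpy as np

def batchnorm_forward(X, gamma, beta, eps=1e-5):
    """Training-mode forward pass that also returns what backward needs."""
    mean = X.mean(axis=0)
    var = X.var(axis=0)                      # biased variance, as in np.var
    inv_std = 1.0 / np.sqrt(var + eps)
    X_hat = (X - mean) * inv_std
    out = gamma * X_hat + beta
    return out, (X_hat, gamma, inv_std)

def batchnorm_backward(dout, cache):
    """Gradients of the loss w.r.t. X, gamma and beta, given dL/dout."""
    X_hat, gamma, inv_std = cache
    m = dout.shape[0]
    dgamma = np.sum(dout * X_hat, axis=0)
    dbeta = np.sum(dout, axis=0)
    dX_hat = dout * gamma
    # Compact form that accounts for the mean and variance depending on X
    dX = (inv_std / m) * (
        m * dX_hat
        - dX_hat.sum(axis=0)
        - X_hat * np.sum(dX_hat * X_hat, axis=0)
    )
    return dX, dgamma, dbeta

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))
gamma, beta = np.ones(4), np.zeros(4)

out, cache = batchnorm_forward(X, gamma, beta)
dout = rng.normal(size=out.shape)            # stand-in for the upstream gradient
dX, dgamma, dbeta = batchnorm_backward(dout, cache)

# One plain SGD step; these are the values update_params would receive.
learning_rate = 0.1
new_gamma = gamma - learning_rate * dgamma
new_beta = beta - learning_rate * dbeta
print(dX.shape, new_gamma.shape, new_beta.shape)
```

With the class from this notebook, the last step would correspond to a call such as `update_params(new_gamma, new_beta)`; the class itself would still need to cache the normalized input during `forward` for this to work.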

In [53]:
import numpy as np

class BatchNormalization:
    def __init__(self, epsilon=1e-5, momentum=0.9):
        """
        Initializes the BatchNormalization layer.
        
        Parameters:
        - epsilon (float): Small constant to avoid division by zero.
        - momentum (float): Momentum for the running mean and variance.
        """
        self.epsilon = epsilon
        self.momentum = momentum
        self.gamma = None
        self.beta = None
        self.running_mean = None
        self.running_var = None

    def initialize_params(self, shape):
        """
        Initializes gamma and beta parameters, as well as running mean and variance.
        
        Parameters:
        - shape (tuple): Shape of a single sample, i.e. X.shape[1:].
        """
        # Initialize scale (gamma) and shift (beta) parameters
        self.gamma = np.ones(shape)
        self.beta = np.zeros(shape)
        
        # Initialize running statistics for mean and variance
        self.running_mean = np.zeros(shape)
        self.running_var = np.ones(shape)

    def forward(self, X, training=True):
        """
        Applies batch normalization to the input data.
        
        Parameters:
        - X (np.array): Input data with shape (batch_size, ...); statistics are taken over axis 0.
        - training (bool): Indicates whether to use batch statistics or running statistics.
        
        Returns:
        - np.array: The batch-normalized data.
        """
        # Initialize parameters if not done already
        if self.gamma is None or self.beta is None:
            self.initialize_params(X.shape[1:])

        if training:
            # Calculate batch mean and variance
            batch_mean = np.mean(X, axis=0)
            batch_var = np.var(X, axis=0)
            
            # Normalize the input
            X_norm = (X - batch_mean) / np.sqrt(batch_var + self.epsilon)
            
            # Update running mean and variance
            self.running_mean = self.momentum * self.running_mean + (1 - self.momentum) * batch_mean
            self.running_var = self.momentum * self.running_var + (1 - self.momentum) * batch_var
        else:
            # Use running statistics for inference
            X_norm = (X - self.running_mean) / np.sqrt(self.running_var + self.epsilon)
        
        # Scale and shift the normalized data
        out = self.gamma * X_norm + self.beta
        return out

    def update_params(self, gamma, beta):
        """
        Overwrites gamma and beta, e.g. with values computed by an optimizer step.
        
        Parameters:
        - gamma (np.array): Learned scale parameter (of same shape as features).
        - beta (np.array): Learned shift parameter (of same shape as features).
        """
        self.gamma = gamma
        self.beta = beta



In [54]:
import tensorflow as tf
# Example input tensor
X = tf.random.normal((96, 55, 55))

# Custom BatchNormalization layer
bn_custom = BatchNormalization()
output_custom = bn_custom.forward(X, training=True)
output_custom.shape


TensorShape([96, 55, 55])
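The example above only runs the training branch. The sketch below contrasts training and inference on NumPy input of the same shape. It is a hedged, standalone mirror of the notebook's forward pass with `gamma=1` and `beta=0` left out; the helper `bn_forward`, the `state` dictionary, and the choice of data (mean 5, standard deviation 2, 50 training batches) are assumptions made for illustration.

```python
import numpy as np

def bn_forward(X, state, training, momentum=0.9, eps=1e-5):
    """Normalize X over axis 0; update running statistics when training."""
    if training:
        mean, var = X.mean(axis=0), X.var(axis=0)
        state["mean"] = momentum * state["mean"] + (1 - momentum) * mean
        state["var"] = momentum * state["var"] + (1 - momentum) * var
    else:
        # Inference: reuse the statistics accumulated during training
        mean, var = state["mean"], state["var"]
    return (X - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
feature_shape = (55, 55)
state = {"mean": np.zeros(feature_shape), "var": np.ones(feature_shape)}

# Training phase: every batch is normalized with its own statistics
for _ in range(50):
    batch = rng.normal(5.0, 2.0, size=(96, *feature_shape))
    train_out = bn_forward(batch, state, training=True)

# Inference phase: a new batch is normalized with the running statistics
new_batch = rng.normal(5.0, 2.0, size=(96, *feature_shape))
infer_out = bn_forward(new_batch, state, training=False)

print("statistics shape:", state["mean"].shape)
print("max |mean| of training output:", np.abs(train_out.mean(axis=0)).max())
print("max |mean| of inference output:", np.abs(infer_out.mean(axis=0)).max())
print("average running mean / var:", state["mean"].mean(), state["var"].mean())
```

In training mode the per-position mean of the output is zero up to floating-point error, because each batch is centered on its own mean. In inference mode it is only approximately zero, since the running statistics estimate the data distribution rather than match the particular batch. The statistics have shape `(55, 55)` because the reduction runs over axis 0 only.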