
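He initialization (also called Kaiming initialization) sets the initial weights of a neural network layer by drawing them from a zero-mean distribution with variance 2 / fan_in, where fan_in is the number of inputs to the layer. It is designed for layers followed by ReLU (Rectified Linear Unit) activations: the factor of 2 compensates for ReLU zeroing out roughly half of the pre-activations, so the scale of the signal stays roughly constant from layer to layer instead of shrinking or growing with depth. This makes deep networks easier to train.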
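
The variance-preserving effect can be checked numerically. The sketch below is a framework-free illustration that uses only the Python standard library; it is not part of the TensorFlow or PyTorch implementations. The helper name `mean_square_after_relu_stack`, the width of 256, the depth of 4, and the "naive" comparison scale sqrt(1 / fan_in) are all assumptions chosen for illustration.

```python
import math
import random

WIDTH = 256  # assumed layer width; fan_in equals WIDTH for every layer


def mean_square_after_relu_stack(weight_std, width=WIDTH, depth=4, batch=8, seed=0):
    """Push random inputs through `depth` ReLU layers and return the
    mean squared activation of the final layer."""
    rng = random.Random(seed)
    # Standard-normal inputs, so the mean square starts near 1.
    xs = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(batch)]
    for _ in range(depth):
        # One width x width weight matrix per layer, zero-mean Gaussian.
        w = [[rng.gauss(0.0, weight_std) for _ in range(width)] for _ in range(width)]
        # Linear map followed by ReLU, applied to every sample in the batch.
        xs = [
            [max(0.0, sum(wi * xi for wi, xi in zip(row, x))) for row in w]
            for x in xs
        ]
    total = sum(v * v for x in xs for v in x)
    return total / (batch * width)


he_ms = mean_square_after_relu_stack(math.sqrt(2.0 / WIDTH))     # He scaling
naive_ms = mean_square_after_relu_stack(math.sqrt(1.0 / WIDTH))  # no factor of 2
print(f"He scaling:    {he_ms:.3f}")
print(f"naive scaling: {naive_ms:.3f}")
```

With He scaling the mean squared activation should stay close to its starting value of about 1. Without the factor of 2 it should shrink by roughly half per layer, which is the decay He initialization is meant to prevent.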

Below are implementations of He initialization in both TensorFlow and PyTorch.

He Initialization using TensorFlow

In [1]:
import tensorflow as tf

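# Apply a ReLU dense layer whose kernel uses He (normal) initialization.
# Note: each call builds a new layer with freshly initialized weights.
def dense_layer_he(inputs, units):
    # HeNormal samples from a truncated normal distribution with
    # stddev = sqrt(2.0 / fan_in); Keras infers fan_in from the input shape.
    initializer = tf.keras.initializers.HeNormal()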

    # Create a dense layer with He initialization
    layer = tf.keras.layers.Dense(
        units,
        kernel_initializer=initializer,
        activation='relu'
    )

    return layer(inputs)

# Example usage
inputs = tf.random.normal([32, 128])  # Batch of 32 samples, each with 128 features
outputs = dense_layer_he(inputs, 64)  # Output with 64 units
print(outputs.shape)  # Output shape: (32, 64)

(32, 64)


He Initialization using PyTorch

In [2]:
import torch
import torch.nn as nn

# Define a linear layer with He initialization
class DenseLayerHe(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()
        self.linear = nn.Linear(input_dim, output_dim)

        # Apply He initialization
        nn.init.kaiming_normal_(self.linear.weight, mode='fan_in', nonlinearity='relu')
        nn.init.zeros_(self.linear.bias)  # Initialize bias to zeros

    def forward(self, x):
        return torch.relu(self.linear(x))

# Example usage
input_dim = 128
output_dim = 64
inputs = torch.randn(32, input_dim)  # Batch of 32 samples, each with 128 features
layer = DenseLayerHe(input_dim, output_dim)
outputs = layer(inputs)
print(outputs.shape)  # Output shape: torch.Size([32, 64])

torch.Size([32, 64])
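
As a quick sanity check on the PyTorch version, the empirical standard deviation of the initialized weights can be compared with the He formula sqrt(2 / fan_in). This is a minimal sketch that assumes the same 128-to-64 layer as above; the variable names `expected_std` and `empirical_std` and the fixed seed are illustrative choices, not part of the original notebook.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the check is repeatable

fan_in, fan_out = 128, 64
linear = nn.Linear(fan_in, fan_out)
# Same call as in DenseLayerHe: normal distribution, fan_in mode, ReLU gain.
nn.init.kaiming_normal_(linear.weight, mode='fan_in', nonlinearity='relu')

expected_std = math.sqrt(2.0 / fan_in)      # 0.125 for fan_in = 128
empirical_std = linear.weight.std().item()  # estimated from 64 * 128 weights
print(f"expected std:  {expected_std:.4f}")
print(f"empirical std: {empirical_std:.4f}")
```

The two values should agree closely, since the estimate is taken over 8192 sampled weights.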
