# Weight Initialization

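Weight initialization strongly affects how efficiently a neural network trains.
- Poor initialization can cause vanishing or exploding gradients, because signals shrink or grow multiplicatively from layer to layer.
- Proper initialization keeps activations and gradients at a similar scale across layers, which speeds up convergence and often improves final accuracy.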
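
As a rough illustration of the two points above, the sketch below pushes random data through a stack of purely linear layers and tracks the standard deviation of the activations. The helper `forward_std`, the depth of 20, the width of 256, and the three scales are assumptions chosen for the demonstration, not values from this notebook.

```python
import numpy as np

def forward_std(scale, n_layers=20, width=256, seed=0):
    """Return the activation std after each layer of a linear network."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((1000, width))      # 1000 random input samples
    stds = []
    for _ in range(n_layers):
        # Gaussian weights with the given standard deviation
        W = rng.standard_normal((width, width)) * scale
        x = x @ W
        stds.append(x.std())
    return stds

too_small = forward_std(scale=0.01)                 # signal shrinks each layer
too_large = forward_std(scale=1.0)                  # signal grows each layer
balanced = forward_std(scale=1.0 / np.sqrt(256))    # std = sqrt(1 / fan_in)

print("scale=0.01       last-layer std:", too_small[-1])
print("scale=1.0        last-layer std:", too_large[-1])
print("scale=1/sqrt(n)  last-layer std:", balanced[-1])
```

Each layer multiplies the activation scale by roughly `scale * sqrt(width)`, so only the third setting keeps it near 1. Gradients in the backward pass are scaled by the same weight matrices, which is why the same imbalance shows up as vanishing or exploding gradients.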

## Types of Initialization
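1. **Zeros Initialization**: All weights = 0 (not recommended: every neuron in a layer computes the same output and receives the same gradient, so symmetry is never broken).
2. **Random Initialization**: Small random values with a fixed scale, independent of layer size.
3. **Xavier Initialization**: Scales the weight variance by the layer's fan-in (and, in the original formulation, its fan-out) so that activation variance stays roughly stable across layers.
4. **He Initialization**: Uses variance 2 / fan-in to compensate for ReLU zeroing out about half of its inputs.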
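
The symmetry problem from item 1 can be checked numerically. The following is a minimal sketch, assuming a tiny two-layer tanh network with squared-error loss; `hidden_grad`, the layer sizes, and the random data are all hypothetical and only serve the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 4))   # 8 samples, 4 input features
y = rng.standard_normal((8, 1))   # regression targets

def hidden_grad(W1, W2):
    """Gradient of the squared error w.r.t. W1 (up to a constant factor)."""
    h = np.tanh(X @ W1)               # hidden activations, shape (8, 3)
    err = h @ W2 - y                  # prediction error, shape (8, 1)
    dh = (err @ W2.T) * (1 - h**2)    # backpropagate through tanh
    return X.T @ dh / len(X)          # shape (4, 3), one column per hidden unit

# All-zero weights: the gradient for the hidden layer is exactly zero
g_zero = hidden_grad(np.zeros((4, 3)), np.zeros((3, 1)))

# Constant weights: every hidden unit receives the same gradient column
g_const = hidden_grad(np.full((4, 3), 0.5), np.full((3, 1), 0.5))

# Small random weights: the columns differ, so the units can specialize
g_rand = hidden_grad(rng.standard_normal((4, 3)) * 0.01,
                     rng.standard_normal((3, 1)) * 0.01)

print("zero init, any nonzero gradient:", bool(g_zero.any()))
print("constant init, columns equal:   ", np.allclose(g_const[:, 0], g_const[:, 1]))
print("random init, columns equal:     ", np.allclose(g_rand[:, 0], g_rand[:, 1]))
```

With identical starting weights the hidden units get identical updates at every step, so they never diverge from one another. Random values are what break that tie.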

In [1]:
import numpy as np

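# Different initialization techniques.
# shape[0] is treated as the fan-in, i.e. shape = (fan_in, fan_out).
def zeros_init(shape):
    # All-zero weights: neurons in a layer stay identical (no symmetry breaking)
    return np.zeros(shape)

def random_init(shape):
    # Standard normal samples scaled to a fixed small std of 0.01
    return np.random.randn(*shape) * 0.01

def xavier_init(shape):
    # Fan-in variant of Xavier: std = sqrt(1 / fan_in)
    return np.random.randn(*shape) * np.sqrt(1.0 / shape[0])

def he_init(shape):
    # He: std = sqrt(2 / fan_in), suited to ReLU
    return np.random.randn(*shape) * np.sqrt(2.0 / shape[0])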
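
The `xavier_init` above scales by the fan-in only. Glorot and Bengio's original formulation averages fan-in and fan-out, and it is often used with a uniform distribution instead of a normal one. A minimal sketch of both forms follows, assuming `shape = (fan_in, fan_out)`; the function names `glorot_normal_init` and `glorot_uniform_init` are hypothetical and not part of this notebook.

```python
import numpy as np

def glorot_normal_init(shape):
    # Assumes shape = (fan_in, fan_out); std = sqrt(2 / (fan_in + fan_out))
    fan_in, fan_out = shape
    return np.random.randn(*shape) * np.sqrt(2.0 / (fan_in + fan_out))

def glorot_uniform_init(shape):
    # Uniform on [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)),
    # which has the same variance as the normal version above
    fan_in, fan_out = shape
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return np.random.uniform(-limit, limit, size=shape)

W_normal = glorot_normal_init((400, 600))
W_uniform = glorot_uniform_init((400, 600))
print("target std: ", np.sqrt(2.0 / 1000))
print("normal std: ", W_normal.std())
print("uniform std:", W_uniform.std())
```

When fan-in equals fan-out, the normal version reduces to the `sqrt(1 / fan_in)` scale used in `xavier_init`.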


In [2]:
shape = (2, 3)
print("Zeros Init:\n", zeros_init(shape), "\n")
print("Random Init:\n", random_init(shape), "\n")
print("Xavier Init:\n", xavier_init(shape), "\n")
print("He Init:\n", he_init(shape))

Zeros Init:
 [[0. 0. 0.]
 [0. 0. 0.]]

Random Init:
 [[-0.0016  0.0032 -0.0047]
 [-0.0082  0.0111  0.0009]]

Xavier Init:
 [[-0.4351  0.2528 -0.0141]
 [ 0.4372  0.4054  0.6122]]

He Init:
 [[-0.7402 -0.1621 -0.4513]
 [-0.3075 -0.2540  0.5437]]


✅ **Summary:**
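- **Zeros init** fails because all neurons in a layer receive identical updates and learn the same thing.
- **Random init** breaks symmetry, but a fixed small scale can shrink signals in deep networks and slow convergence.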
- **Xavier init** works well for **tanh/sigmoid** activations.
- **He init** is preferred for **ReLU** activations.
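
The pairing of initializer and activation in this summary can be explored with a small experiment. The sketch below is illustrative only: the helper `last_layer_std`, the depth of 20, and the width of 256 are assumptions, and the Xavier scale is the fan-in variant `sqrt(1 / fan_in)` used earlier in this notebook.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def last_layer_std(scale_fn, activation, n_layers=20, width=256, seed=0):
    """Std of the activations after the last layer of a deep random network."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((1000, width))
    for _ in range(n_layers):
        # Gaussian weights whose std is chosen from the layer's fan-in
        W = rng.standard_normal((width, width)) * scale_fn(width)
        x = activation(x @ W)
    return x.std()

# Weight std as a function of fan-in for each scheme
scales = {
    "random": lambda fan_in: 0.01,
    "xavier": lambda fan_in: np.sqrt(1.0 / fan_in),
    "he": lambda fan_in: np.sqrt(2.0 / fan_in),
}

results = {}
for name, scale_fn in scales.items():
    for act_name, act in (("tanh", np.tanh), ("relu", relu)):
        results[(name, act_name)] = last_layer_std(scale_fn, act)
        print(f"{name:>6} + {act_name}: {results[(name, act_name)]:.3e}")
```

In this setup the fixed 0.01 scale lets the signal collapse toward zero for both activations, Xavier keeps tanh activations at a usable scale, and He keeps ReLU activations far larger than Xavier does at the same depth. The exact numbers depend on the seed, width, and depth chosen here.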