# Xavier initialization


Xavier initialization (also called Glorot initialization) is a scheme for setting the initial weights of a neural network so as to mitigate the vanishing and exploding gradient problems.
- It draws the weights from a distribution whose variance depends on the layer's size, chosen so that the variance of the signal stays roughly constant as it passes from layer to layer.

- It is best suited to activation functions such as **Softmax**, **tanh** and **sigmoid**.
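
To make the idea of keeping the signal flowing concrete, here is a minimal sketch that pushes random inputs through a stack of tanh layers and records the spread of the activations. The helper `activation_stds`, the layer width of 256, the depth of 10 and the "too small" scale of 0.01 are all assumptions chosen for illustration; none of them comes from the original notebook.

```python
import numpy as np

def activation_stds(weight_std, n_layers=10, width=256, seed=0):
    """Push random inputs through a stack of tanh layers and record the
    standard deviation of the activations after each layer.
    Hypothetical helper, for illustration only."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((1000, width))  # batch of random inputs
    stds = []
    for _ in range(n_layers):
        # Zero-mean Gaussian weights with the requested standard deviation
        W = rng.standard_normal((width, width)) * weight_std
        h = np.tanh(h @ W)
        stds.append(float(h.std()))
    return stds

width = 256
xavier_std = np.sqrt(2.0 / (width + width))  # Xavier normal with n_in == n_out
xavier_stds = activation_stds(xavier_std, width=width)
small_stds = activation_stds(0.01, width=width)  # arbitrary "too small" scale

print("Xavier, last layer std:    ", xavier_stds[-1])
print("Small init, last layer std:", small_stds[-1])
```

Under these assumptions, the small-scale weights should shrink the activations toward zero within a few layers, while the Xavier-scaled weights keep them at a usable magnitude.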

---

**Uniform Xavier Initialization**

$$x = \sqrt{\frac{6}{n_{inputDim} + n_{outputDim}} }$$

- $n_{inputDim}$: The number of input units to the layer.
- $n_{outputDim}$: The number of output units from the layer.

- Each weight is drawn independently from a uniform distribution over the range $[-x, x]$.
- Because $x$ shrinks as the layer gets wider, the variance of the initial weights scales inversely with the layer's size, which helps keep activations from growing or shrinking as they pass through the layer.
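
As a short worked step, the variance implied by this range follows from the standard result that a uniform distribution on $[a, b]$ has variance $(b - a)^2 / 12$. Applied to $[-x, x]$:

$$\mathrm{Var}(w) = \frac{(2x)^2}{12} = \frac{x^2}{3} = \frac{1}{3} \cdot \frac{6}{n_{inputDim} + n_{outputDim}} = \frac{2}{n_{inputDim} + n_{outputDim}}$$

This is the square of the standard deviation $\alpha$ used by the normal variant, so the two variants target the same weight variance.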

---

**Normal Xavier Initialization**

$$\alpha = \sqrt{\frac{2}{n_{inputDim} + n_{outputDim}} }$$

- $n_{inputDim}$: The number of input units to the layer.
- $n_{outputDim}$: The number of output units from the layer.

- Each weight is drawn independently from a Gaussian (normal) distribution with a mean of $0$ and a standard deviation of $\alpha$.
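
As a quick sanity check on the two formulas, the sketch below samples one weight matrix from each variant and compares the empirical standard deviation with $\alpha$. The layer sizes (300 inputs, 200 outputs) and the seed are arbitrary values assumed for the example.

```python
import numpy as np

n_in, n_out = 300, 200           # hypothetical layer sizes
rng = np.random.default_rng(0)   # seeded so the check is repeatable

limit = np.sqrt(6.0 / (n_in + n_out))       # x in the uniform formula
target_std = np.sqrt(2.0 / (n_in + n_out))  # alpha in the normal formula

w_uniform = rng.uniform(-limit, limit, size=(n_out, n_in))
w_normal = rng.normal(0.0, target_std, size=(n_out, n_in))

print("target std: ", target_std)
print("uniform std:", w_uniform.std())
print("normal std: ", w_normal.std())
```

With this many samples, both empirical values should land close to the target, which illustrates that the uniform and normal variants differ in shape but not in variance.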

In [18]:
import numpy as np

def Xavier_init(n_in, n_out, use_uniform=True):
    """
    Initializes a weight matrix using Xavier (Glorot) initialization.

    Args:
        n_in (int): The number of input neurons.
        n_out (int): The number of output neurons.
        use_uniform (bool): If True, use the uniform distribution.
                            If False, use the normal distribution.

    Returns:
        np.ndarray: Weight matrix of shape (n_out, n_in).
    """

    if use_uniform:
        # Xavier Uniform Initialization
        limit = np.sqrt(6 / (n_in + n_out))
        # Create the weight matrix with random values from a uniform distribution
        weights = np.random.uniform(-limit, limit, size=(n_out, n_in))

    else:
        # Xavier Normal Initialization
        std_dev = np.sqrt(2 / (n_in + n_out))
        # Create weights from a normal distribution with mean 0 and calculated std_dev
        weights = np.random.randn(n_out, n_in) * std_dev

    return weights

In [19]:
# Example Usage
# Using Uniform Xavier (default)
weights_uniform = Xavier_init(n_in=9, n_out=1, use_uniform=True)
print("Uniform Xavier Weights:\n", weights_uniform)

# Using Normal Xavier
weights_normal = Xavier_init(n_in=9, n_out=1, use_uniform=False)
print("\nNormal Xavier Weights:\n", weights_normal)

Uniform Xavier Weights:
 [[-0.17494655 -0.22389002  0.46530059  0.18174468 -0.15329114 -0.40667018
  -0.26673887 -0.30990112 -0.34482336]]

Normal Xavier Weights:
 [[ 0.08935176  0.43392185 -0.23677776 -0.65177905 -0.03343328 -0.33219696
   0.16984359  0.1004534  -0.02954187]]
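
The weights printed above will differ on every run, since `Xavier_init` draws from NumPy's unseeded global random state. One possible way to make the results reproducible is sketched below. The function `xavier_init_seeded` and its `seed` parameter are hypothetical additions for illustration and are not part of the original code.

```python
import numpy as np

def xavier_init_seeded(n_in, n_out, use_uniform=True, seed=None):
    """Hypothetical seeded variant of Xavier_init (illustration only)."""
    rng = np.random.default_rng(seed)  # local generator, leaves global state alone
    if use_uniform:
        limit = np.sqrt(6 / (n_in + n_out))
        return rng.uniform(-limit, limit, size=(n_out, n_in))
    std_dev = np.sqrt(2 / (n_in + n_out))
    return rng.normal(0.0, std_dev, size=(n_out, n_in))

a = xavier_init_seeded(n_in=9, n_out=1, seed=42)
b = xavier_init_seeded(n_in=9, n_out=1, seed=42)
print(np.array_equal(a, b))  # same seed, same weights
print(a.shape)
```

Passing a seed into a local generator, rather than calling `np.random.seed`, keeps the initialization repeatable without affecting other code that relies on the global random state.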
