# Real World Use Case: Deep Training Dynamics

**Scenario**: You are training a very deep network (e.g., ResNet-50). The loss refuses to go down. Is it a bug?
**Goal**: Demonstrate how **Batch Normalization** fixes the internal signal distribution, allowing deep networks to actually train.

```python
import numpy as np
import matplotlib.pyplot as plt

# Simulation: a signal passing through 10 fully connected ReLU layers
rng = np.random.default_rng(0)  # fixed seed so the plot is reproducible
layers = 10
input_signal = rng.standard_normal((1000, 500))  # 1000 samples x 500 features, mean 0, std 1

# 1. Without normalization (the problem)
curr = input_signal
means = []
stds = []

for i in range(layers):
    # Bad initialization: std 0.1 is too large for a fan-in of 500,
    # so the activation scale grows with every layer
    W = rng.standard_normal((500, 500)) * 0.1
    curr = curr @ W
    curr = np.maximum(0, curr)  # ReLU
    means.append(np.mean(curr))
    stds.append(np.std(curr))

# 2. Visualize
depth = range(1, layers + 1)
plt.figure(figsize=(10, 4))
plt.plot(depth, means, label='Mean Activation')
plt.plot(depth, stds, label='Std Dev')
plt.title("Signal Degradation in Deep Networks (Without BN)")
plt.xlabel("Layer Depth")
plt.ylabel("Signal Strength")
plt.legend()
plt.grid()
plt.show()
```
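
Whether the signal vanishes or explodes depends on the weight scale. The sketch below reruns the same 10-layer ReLU stack with three scales and reports the standard deviation at the last layer. The helper `final_layer_std` and the extra scales (`0.01` and `sqrt(2 / width)`, the value commonly called He initialization) are illustrative assumptions, not part of the original experiment.

```python
import numpy as np

def final_layer_std(weight_std, layers=10, width=500, batch=1000, seed=0):
    """Std of the ReLU activations after `layers` layers with N(0, weight_std^2) weights.

    Hypothetical helper for this illustration only.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))
    for _ in range(layers):
        W = rng.standard_normal((width, width)) * weight_std
        x = np.maximum(0, x @ W)  # linear layer followed by ReLU
    return float(np.std(x))

width = 500
scales = [
    ("too small", 0.01),                       # assumed value, chosen to show vanishing
    ("original", 0.1),                         # the scale used in the simulation above
    ("sqrt(2/width)", np.sqrt(2.0 / width)),   # assumed reference point for ReLU
]

results = {}
for label, scale in scales:
    results[label] = final_layer_std(scale, width=width)
    print(f"{label:>14} (std={scale:.4f}): final-layer std = {results[label]:.3e}")
```

The smallest scale should drive the final-layer statistic toward zero, the original scale should inflate it well above the input's, and the middle value should keep it at roughly the input's order of magnitude.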

## Conclusion
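Without normalization, the activation statistics drift with depth. With the weight scale used here (std 0.1 and a fan-in of 500), the mean and standard deviation grow by roughly 1.6x per layer, so by Layer 10 the signal has exploded to tens of times its input scale. With a much smaller weight scale, the same network collapses toward zero instead (vanishing signal).
Batch Normalization removes this dependence on depth: it rescales each layer's pre-activations to mean 0 and std 1 over the batch, so the post-ReLU statistics plotted above stay flat (mean roughly 0.4, std roughly 0.6) instead of drifting. The last layer then receives as strong a signal as the first layer.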
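
To check this claim, here is a minimal sketch of the same simulation with a batch normalization step inserted before each ReLU. It is a simplification under stated assumptions: the hypothetical `batch_norm` helper uses only batch statistics, with no learned scale or shift and no running averages, which a framework layer would normally include.

```python
import numpy as np

def batch_norm(z, eps=1e-5):
    """Normalize each feature (column) over the batch dimension.

    Simplified for illustration: no learned gamma/beta, no running statistics.
    """
    mean = z.mean(axis=0, keepdims=True)
    var = z.var(axis=0, keepdims=True)
    return (z - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
layers, width = 10, 500
curr = rng.standard_normal((1000, width))

pre_means, pre_stds, post_stds = [], [], []
for _ in range(layers):
    W = rng.standard_normal((width, width)) * 0.1  # same poorly scaled weights as above
    z = batch_norm(curr @ W)       # normalized pre-activations
    pre_means.append(float(z.mean()))
    pre_stds.append(float(z.std()))
    curr = np.maximum(0, z)        # ReLU
    post_stds.append(float(curr.std()))

for i, (m, s, p) in enumerate(zip(pre_means, pre_stds, post_stds), start=1):
    print(f"layer {i:2d}: pre-ReLU mean={m:+.4f} std={s:.4f} | post-ReLU std={p:.4f}")
```

The printed statistics should be nearly identical at every layer even though the weights are unchanged, which is the stability the unnormalized run lacks.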