# Neural Network Dimensions: Z[1] and A[1]

## Question

**"What are the dimensions of Z[1] and A[1]?"**

Given a neural network with:
- 4 input features (x₁, x₂, x₃, x₄)
- 2 neurons in the first hidden layer
- 1 output neuron

This notebook explains how to determine the dimensions of Z[1] and A[1].


## Understanding the Notation

### Key Terms:
- **Z[l]**: Pre-activation values (weighted sum + bias) for layer `l`
- **A[l]**: Post-activation values (output after activation function) for layer `l`
- **n[l]**: Number of neurons in layer `l`
- **m**: Number of training examples

### General Rule:
The dimensions of Z[l] and A[l] are:
```
(n[l], m)
```
Where:
- First dimension: Number of neurons in layer `l`
- Second dimension: Number of training examples
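
As a minimal sketch of this rule, the helper below (a hypothetical `activation_shapes`, not part of the original notebook) maps a list of layer sizes and a batch size to the expected shape of Z[l] and A[l] for each layer:

```python
def activation_shapes(layer_sizes, m):
    """Return the expected (n[l], m) shape of Z[l] and A[l] for l = 1..L.

    layer_sizes[0] is the number of input features; it is skipped because
    the input layer has no Z or A of its own.
    """
    return [(n_l, m) for n_l in layer_sizes[1:]]


# Assumed example values: the 4-2-1 network from the question, with m = 5
shapes = activation_shapes([4, 2, 1], m=5)
print(shapes)  # [(2, 5), (1, 5)]
```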


In [None]:
import numpy as np

print("=" * 70)
print("NEURAL NETWORK DIMENSIONS EXPLANATION")
print("=" * 70)

# Network architecture
n_x = 4  # Number of input features
n_1 = 2  # Number of neurons in layer 1 (hidden layer)
n_2 = 1  # Number of neurons in layer 2 (output layer)
m = 5    # Number of training examples (example value)

print("\nNetwork Architecture:")
print(f"  Input features: {n_x}")
print(f"  Hidden layer 1 neurons: {n_1}")
print(f"  Output layer neurons: {n_2}")
print(f"  Training examples: {m}")

print("\n" + "-" * 70)
print("DIMENSIONS FOR LAYER 1:")
print("-" * 70)


## Dimensions of Z[1] and A[1]

### For Layer 1 (First Hidden Layer):

**Z[1] dimensions:**
- Z[1] contains the pre-activation values for each neuron in layer 1
- Number of rows = number of neurons in layer 1 = **2**
- Number of columns = number of training examples = **m**

**A[1] dimensions:**
- A[1] contains the post-activation values for each neuron in layer 1
- Same dimensions as Z[1], because the activation function is applied element-wise and so does not change the shape
- Number of rows = number of neurons in layer 1 = **2**
- Number of columns = number of training examples = **m**

**Answer: Z[1] and A[1] are (2, m)**
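
The claim that the activation does not change the shape can be checked for a few common element-wise activations. This is a sketch only: the array `Z1_demo`, its size (2, 5), and the choice of activations are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
Z1_demo = rng.standard_normal((2, 5))  # stand-in for Z[1] with n[1] = 2, m = 5

# Three element-wise activations; each maps every entry independently
activations = {
    "relu": lambda z: np.maximum(0, z),
    "sigmoid": lambda z: 1 / (1 + np.exp(-z)),
    "tanh": np.tanh,
}

result_shapes = {}
for name, g in activations.items():
    A1_demo = g(Z1_demo)
    result_shapes[name] = A1_demo.shape
    print(f"{name}: Z[1] {Z1_demo.shape} -> A[1] {A1_demo.shape}")
```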


In [None]:
# Demonstrate with actual arrays
print("\nCreating example arrays to demonstrate dimensions:")

# Input X: (n_x, m) = (4, m)
X = np.random.randn(n_x, m)
print(f"\nX (input) shape: {X.shape} = (n_x, m) = ({n_x}, {m})")

# Weights W[1]: (n_1, n_x) = (2, 4)
W1 = np.random.randn(n_1, n_x)
print(f"W[1] shape: {W1.shape} = (n_1, n_x) = ({n_1}, {n_x})")

# Bias b[1]: (n_1, 1) = (2, 1)
b1 = np.random.randn(n_1, 1)
print(f"b[1] shape: {b1.shape} = (n_1, 1) = ({n_1}, 1)")

# Compute Z[1] = W[1] @ X + b[1]
# (2, 4) @ (4, 5) + (2, 1) = (2, 5)
Z1 = W1 @ X + b1
print("\nZ[1] = W[1] @ X + b[1]")
print(f"Z[1] shape: {Z1.shape} = (n_1, m) = ({n_1}, {m})")

# Apply activation function (e.g., ReLU)
A1 = np.maximum(0, Z1)  # ReLU activation
print("\nA[1] = activation(Z[1])")
print(f"A[1] shape: {A1.shape} = (n_1, m) = ({n_1}, {m})")

print("\n" + "=" * 70)
print("✅ ANSWER: Z[1] and A[1] are (2, m)")
print("=" * 70)


## Why Not the Other Options?

### ❌ (4, m) - Wrong
- This would be the shape of X (input), not Z[1] or A[1]
- 4 is the number of input features, not neurons in layer 1

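### ❌ (4, 1) - Wrong
- This is the shape of a single input example x, not of Z[1] or A[1]
- 4 is the number of input features, not the number of neurons in layer 1, and the `m` dimension is missing

### ❌ (2, 1) - Wrong
- The row count is right (2 neurons), but this is the shape for a single training example (it is also the shape of b[1])
- Missing the `m` dimension for multiple training examples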

### ✅ (2, m) - Correct
- 2 = number of neurons in layer 1
- m = number of training examples
- This is the correct shape for both Z[1] and A[1]
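
To see how the (2, 1) single-example shape relates to the (2, m) batched shape, the sketch below computes z[1] one example at a time and compares the stacked result with the vectorized computation. All arrays are random placeholders with assumed sizes (m = 5).

```python
import numpy as np

rng = np.random.default_rng(1)
W1_demo = rng.standard_normal((2, 4))   # (n_1, n_x)
b1_demo = rng.standard_normal((2, 1))   # (n_1, 1)
X_demo = rng.standard_normal((4, 5))    # (n_x, m) with m = 5

# One example: indexing with [0] keeps the column 2-D, so x is (4, 1)
x_single = X_demo[:, [0]]
z_single = W1_demo @ x_single + b1_demo   # (2, 1)

# All examples at once
Z_batch = W1_demo @ X_demo + b1_demo      # (2, 5)

# Stacking the per-example columns side by side reproduces the batch result
Z_stacked = np.hstack(
    [W1_demo @ X_demo[:, [i]] + b1_demo for i in range(X_demo.shape[1])]
)

print(x_single.shape, z_single.shape, Z_batch.shape)
print(np.allclose(Z_stacked, Z_batch))
```

Each column of Z[1] is the (2, 1) result for one training example, which is why the single-example shapes appear among the wrong options.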


## Matrix Multiplication Dimensions

Let's trace through the dimensions:

```
Forward Pass for Layer 1:

X:      (n_x, m)  = (4, m)
W[1]:   (n_1, n_x) = (2, 4)
b[1]:   (n_1, 1)   = (2, 1)

Z[1] = W[1] @ X + b[1]
      (2, 4) @ (4, m) + (2, 1)
      = (2, m) + (2, 1)  [broadcasting]
      = (2, m)

A[1] = activation(Z[1])
      = activation((2, m))
      = (2, m)  [activation doesn't change shape]
```

**Key insight:** The number of columns (m) is inherited from the input X, and the number of rows (2) comes from W[1], which has one row per neuron in layer 1.
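
The broadcasting step in the trace can be made explicit. In the sketch below (random placeholder arrays, assumed m = 5), adding the (2, 1) bias gives the same result as copying it into m columns first. It also shows why the bias is kept as a column: a flat bias of shape (2,) does not line up with the rows of a (2, 5) array.

```python
import numpy as np

rng = np.random.default_rng(2)
W1_demo = rng.standard_normal((2, 4))
X_demo = rng.standard_normal((4, 5))
b_col = rng.standard_normal((2, 1))

# Broadcasting: the (2, 1) bias is implicitly repeated across the m columns
Z_broadcast = W1_demo @ X_demo + b_col

# The same computation with the repetition written out
Z_explicit = W1_demo @ X_demo + np.tile(b_col, (1, X_demo.shape[1]))
print(np.allclose(Z_broadcast, Z_explicit))

# A flat (2,) bias is matched against the last axis (length 5), so it fails here
b_flat = b_col.ravel()
try:
    W1_demo @ X_demo + b_flat
    flat_bias_failed = False
except ValueError as err:
    flat_bias_failed = True
    print("ValueError:", err)
```

Note that if m happened to equal n[1], the flat bias would broadcast without an error but along the wrong axis, so the (n[l], 1) column shape is the safer convention.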


In [None]:
# Visual representation
print("=" * 70)
print("VISUAL REPRESENTATION")
print("=" * 70)

print("""
Network Structure:

Input Layer          Hidden Layer 1        Output Layer
    x₁ ──────────────┐
                      ├──→ a₁^[1] ────────┐
    x₂ ──────────────┤                    ├──→ a₁^[2] → ŷ
                      ├──→ a₂^[1] ────────┘
    x₃ ──────────────┤
                      │
    x₄ ──────────────┘

Dimensions:
  X:    (4, m)  ← 4 features, m examples
  Z[1]: (2, m)  ← 2 neurons, m examples
  A[1]: (2, m)  ← 2 neurons, m examples
  Z[2]: (1, m)  ← 1 neuron, m examples
  A[2]: (1, m)  ← 1 neuron, m examples
""")

print("\n" + "=" * 70)
print("SUMMARY")
print("=" * 70)
print("""
For any layer l:
  - Z[l] shape: (n[l], m)
  - A[l] shape: (n[l], m)
  
Where:
  - n[l] = number of neurons in layer l
  - m = number of training examples

For layer 1 in this network:
  - n[1] = 2 (2 neurons in hidden layer)
  - m is left symbolic (number of training examples)
  
Therefore: Z[1] and A[1] are (2, m)
""")


## General Formula for All Layers

For any layer `l` in a neural network:

| Variable | Dimensions | Description |
|----------|------------|-------------|
| **X** | (n₀, m) | Input features |
| **W[l]** | (n[l], n[l-1]) | Weights for layer l |
| **b[l]** | (n[l], 1) | Bias for layer l |
| **Z[l]** | (n[l], m) | Pre-activation for layer l |
| **A[l]** | (n[l], m) | Post-activation for layer l |

Where:
- `n₀` = n[0] = number of input features (`n_x` in the code cells)
- `n[l]` = number of neurons in layer l
- `m` = number of training examples

**Remember:** Z[l] and A[l] always have the same shape: **(n[l], m)**
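
The table can be exercised for a network of any depth. The function below is a hypothetical `forward_shapes` helper, written as a sketch: it uses random weights, zero biases, and tanh at every layer purely as assumptions, and records the shape of each quantity while asserting the (n[l], m) rule.

```python
import numpy as np


def forward_shapes(layer_sizes, m, seed=0):
    """Run a forward pass with random parameters and record every shape.

    layer_sizes = [n[0], n[1], ..., n[L]]; m is the number of examples.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((layer_sizes[0], m))  # A[0] = X, shape (n[0], m)
    shapes = {"X": A.shape}
    for l in range(1, len(layer_sizes)):
        n_l, n_prev = layer_sizes[l], layer_sizes[l - 1]
        W = rng.standard_normal((n_l, n_prev))    # (n[l], n[l-1])
        b = np.zeros((n_l, 1))                    # (n[l], 1)
        Z = W @ A + b                             # (n[l], m)
        A = np.tanh(Z)                            # element-wise, same shape
        assert Z.shape == A.shape == (n_l, m)
        shapes.update(
            {f"W{l}": W.shape, f"b{l}": b.shape, f"Z{l}": Z.shape, f"A{l}": A.shape}
        )
    return shapes


# The 4-2-1 network from the question, with an assumed m = 5
all_shapes = forward_shapes([4, 2, 1], m=5)
for name, shape in all_shapes.items():
    print(f"{name}: {shape}")
```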


In [None]:
# Complete example showing all layers
print("=" * 70)
print("COMPLETE NETWORK DIMENSIONS")
print("=" * 70)

# Layer 0 (Input)
print("\nLayer 0 (Input):")
print(f"  X shape: (n_x, m) = ({n_x}, {m})")

# Layer 1 (Hidden): each label shows the symbolic shape, then its numeric value
print("\nLayer 1 (Hidden):")
print(f"  W[1] shape: (n_1, n_x) = ({n_1}, {n_x})")
print(f"  b[1] shape: (n_1, 1) = ({n_1}, 1)")
print(f"  Z[1] shape: (n_1, m) = ({n_1}, {m})")
print(f"  A[1] shape: (n_1, m) = ({n_1}, {m})")

# Layer 2 (Output)
print("\nLayer 2 (Output):")
print(f"  W[2] shape: (n_2, n_1) = ({n_2}, {n_1})")
print(f"  b[2] shape: (n_2, 1) = ({n_2}, 1)")
print(f"  Z[2] shape: (n_2, m) = ({n_2}, {m})")
print(f"  A[2] shape: (n_2, m) = ({n_2}, {m})")

print("\n" + "=" * 70)
print("FINAL ANSWER TO THE QUESTION:")
print("=" * 70)
print("Z[1] and A[1] are (2, m)")
print("=" * 70)
