# Weight and Bias Dimensions in Neural Networks

## Question

**"Which of the following statements are True? (Check all that apply)"**

Given a neural network with:
- 2 input features (x₁, x₂)
- 4 neurons in the first hidden layer
- 1 neuron in the output layer

This notebook explains how to determine the dimensions of W^[1], W^[2], b^[1], and b^[2].


## General Formulas for Weights and Biases

### Key Rules:

For any layer `l`:
- **W^[l]** (weights) has shape: **(n[l], n[l-1])**
  - First dimension: Number of neurons in layer `l`
  - Second dimension: Number of neurons in previous layer (or input features for layer 1)
  
- **b^[l]** (biases) has shape: **(n[l], 1)**
  - First dimension: Number of neurons in layer `l`
  - Second dimension: Always 1 (b^[l] is a column vector; the same biases are broadcast across all training examples)

### Notation:
- `n_x` or `n[0]` = number of input features
- `n[1]` = number of neurons in layer 1
- `n[2]` = number of neurons in layer 2
- etc.
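
The shape rules above apply to any number of layers. As a minimal sketch, the helper `parameter_shapes` below (a hypothetical name, not part of the original notebook) takes a list of layer sizes `[n[0], n[1], ..., n[L]]` and returns the expected shape of every `W^[l]` and `b^[l]`:

```python
def parameter_shapes(layer_dims):
    """Return the expected shapes of W^[l] and b^[l] for every layer.

    layer_dims is a hypothetical list [n[0], n[1], ..., n[L]], where n[0]
    is the number of input features.
    """
    shapes = {}
    for l in range(1, len(layer_dims)):
        shapes[f"W{l}"] = (layer_dims[l], layer_dims[l - 1])  # (n[l], n[l-1])
        shapes[f"b{l}"] = (layer_dims[l], 1)                  # (n[l], 1)
    return shapes


# The network from the question: 2 inputs, 4 hidden neurons, 1 output neuron
for name, shape in parameter_shapes([2, 4, 1]).items():
    print(name, shape)
```

For the network in the question this prints the four shapes derived in the sections that follow.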


In [None]:
import numpy as np

print("=" * 70)
print("WEIGHT AND BIAS DIMENSIONS EXPLANATION")
print("=" * 70)

# Network architecture from the question
n_x = 2  # Number of input features (x1, x2)
n_1 = 4  # Number of neurons in hidden layer 1
n_2 = 1  # Number of neurons in output layer

print(f"\nNetwork Architecture:")
print(f"  Input features: {n_x}")
print(f"  Hidden layer 1 neurons: {n_1}")
print(f"  Output layer neurons: {n_2}")

print("\n" + "=" * 70)
print("CALCULATING DIMENSIONS")
print("=" * 70)


## Dimensions for Layer 1 (Input → Hidden Layer)

### W^[1] Dimensions:
- Formula: W^[1] shape = (n[1], n[0]) = (n[1], n_x)
- Calculation: (4, 2)
- **Answer: W^[1] has shape (4, 2)** ✅

**Why?**
- 4 rows: One row for each neuron in layer 1
- 2 columns: One column for each input feature

### b^[1] Dimensions:
- Formula: b^[1] shape = (n[1], 1)
- Calculation: (4, 1)
- **Answer: b^[1] has shape (4, 1)** ✅

**Why?**
- 4 rows: One bias for each neuron in layer 1
- 1 column: b^[1] is a column vector, so the same four biases are broadcast across all training examples
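
To see why each row of W^[1] belongs to one neuron, the sketch below computes the four pre-activations one neuron at a time and compares them with the single matrix expression. The variable names, the seed, and the single-example input `x` are assumptions chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)    # seed chosen only for reproducibility
W = rng.standard_normal((4, 2))   # one row of weights per hidden neuron
b = rng.standard_normal((4, 1))   # one bias per hidden neuron
x = rng.standard_normal((2, 1))   # a single example with 2 features

# Compute each neuron's pre-activation separately: z_i = w_i . x + b_i
z_per_neuron = np.array([[W[i] @ x[:, 0] + b[i, 0]] for i in range(4)])

# Compute all four at once with the (4, 2) weight matrix
z_matrix = W @ x + b

print(z_per_neuron.shape, z_matrix.shape)
print(np.allclose(z_per_neuron, z_matrix))
```

Both approaches give a (4, 1) result with the same values, which is why W^[1] needs 4 rows and 2 columns.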


## Dimensions for Layer 2 (Hidden Layer → Output Layer)

### W^[2] Dimensions:
- Formula: W^[2] shape = (n[2], n[1])
- Calculation: (1, 4)
- **Answer: W^[2] has shape (1, 4)** ✅

**Why?**
- 1 row: One row for the single neuron in layer 2
- 4 columns: One column for each neuron in layer 1

### b^[2] Dimensions:
- Formula: b^[2] shape = (n[2], 1)
- Calculation: (1, 1)
- **Answer: b^[2] has shape (1, 1)** ✅

**Why?**
- 1 row: One bias for the single neuron in layer 2
- 1 column: b^[2] is a column vector, so the same bias is broadcast across all training examples


In [None]:
# Create example arrays to demonstrate dimensions
print("\nCreating example arrays:")

# Layer 1
W1 = np.random.randn(n_1, n_x)  # (4, 2)
b1 = np.random.randn(n_1, 1)    # (4, 1)

print(f"\nLayer 1:")
print(f"  W^[1] shape: {W1.shape} = (n[1], n_x) = ({n_1}, {n_x})")
print(f"  b^[1] shape: {b1.shape} = (n[1], 1) = ({n_1}, 1)")

# Layer 2
W2 = np.random.randn(n_2, n_1)  # (1, 4)
b2 = np.random.randn(n_2, 1)    # (1, 1)

print(f"\nLayer 2:")
print(f"  W^[2] shape: {W2.shape} = (n[2], n[1]) = ({n_2}, {n_1})")
print(f"  b^[2] shape: {b2.shape} = (n[2], 1) = ({n_2}, 1)")

print("\n" + "=" * 70)
print("✅ CORRECT ANSWERS:")
print("=" * 70)
print("  ✓ W^[1] has shape (4, 2)")
print("  ✓ W^[2] has shape (1, 4)")
print("  ✓ b^[1] has shape (4, 1)")
print("  ✓ b^[2] has shape (1, 1)")
print("=" * 70)


## Why These Dimensions Work

Let's verify the dimensions work correctly in matrix multiplication:


In [None]:
# Demonstrate forward pass with correct dimensions
m = 3  # Number of training examples

print("=" * 70)
print("VERIFYING DIMENSIONS WITH FORWARD PASS")
print("=" * 70)

# Input X: (n_x, m) = (2, 3)
X = np.random.randn(n_x, m)
print(f"\nX (input) shape: {X.shape} = (n_x, m) = ({n_x}, {m})")

# Layer 1 forward pass
print("\nLayer 1 Forward Pass:")
print(f"  W^[1] shape: {W1.shape} = (n[1], n_x) = ({n_1}, {n_x})")
print(f"  X shape: {X.shape} = (n_x, m) = ({n_x}, {m})")
print(f"  b^[1] shape: {b1.shape} = (n[1], 1) = ({n_1}, 1)")

# Z[1] = W[1] @ X + b[1]
# (4, 2) @ (2, 3) + (4, 1) = (4, 3) + (4, 1) = (4, 3) [broadcasting]
Z1 = W1 @ X + b1
print(f"\n  Z[1] = W^[1] @ X + b^[1]")
print(f"  Z[1] shape: {Z1.shape} = (n[1], m) = ({n_1}, {m}) ✓")

# Layer 2 forward pass
A1 = np.maximum(0, Z1)  # ReLU activation (doesn't change shape)
print("\nLayer 2 Forward Pass:")
print(f"  A[1] shape: {A1.shape} = (n[1], m) = ({n_1}, {m})")
print(f"  W^[2] shape: {W2.shape} = (n[2], n[1]) = ({n_2}, {n_1})")
print(f"  b^[2] shape: {b2.shape} = (n[2], 1) = ({n_2}, 1)")

# Z[2] = W[2] @ A[1] + b[2]
# (1, 4) @ (4, 3) + (1, 1) = (1, 3) + (1, 1) = (1, 3) [broadcasting]
Z2 = W2 @ A1 + b2
print(f"\n  Z[2] = W^[2] @ A[1] + b^[2]")
print(f"  Z[2] shape: {Z2.shape} = (n[2], m) = ({n_2}, {m}) ✓")

print("\n" + "=" * 70)
print("All shapes match: both matrix multiplications and bias additions ran without errors.")
print("=" * 70)


## Why the Other Options are Wrong

### ❌ W^[1] has shape (2, 4) - Wrong
- This is the shape of the transpose of W^[1]
- It would require (2, 4) @ (2, m), which fails because the inner dimensions (4 and 2) do not match
- Correct is (4, 2) @ (2, m) = (4, m) ✓

### ❌ W^[2] has shape (4, 1) - Wrong
- This is the shape of the transpose of W^[2]
- It would require (4, 1) @ (4, m), which fails because the inner dimensions (1 and 4) do not match
- Correct is (1, 4) @ (4, m) = (1, m) ✓

### ❌ b^[1] has shape (2, 1) - Wrong
- This would be for 2 neurons, but layer 1 has 4 neurons
- Need one bias per neuron → (4, 1) ✓

### ❌ b^[2] has shape (4, 1) - Wrong
- This would be for 4 neurons, but layer 2 has 1 neuron
- Need one bias per neuron → (1, 1) ✓
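
NumPy itself can confirm most of these. The sketch below uses a hypothetical helper `try_op` and zero-filled placeholder arrays (both assumptions for illustration) to attempt each wrong shape. Three of the four raise a `ValueError`, but the wrong b^[2] does not: (1, m) + (4, 1) broadcasts silently to (4, m), so that mistake shows up only as an output with the wrong shape.

```python
import numpy as np

n_x, n_1, n_2, m = 2, 4, 1, 3
X = np.zeros((n_x, m))    # placeholder input, shape (2, 3)
A1 = np.zeros((n_1, m))   # placeholder hidden activations, shape (4, 3)


def try_op(label, fn):
    """Run fn and report either the resulting shape or the error raised."""
    try:
        result = f"ok, result shape {fn().shape}"
    except ValueError as err:
        result = f"ValueError: {err}"
    print(f"{label}: {result}")
    return result


# Wrong W^[1]: (2, 4) @ (2, m) has mismatched inner dimensions (4 vs 2)
r1 = try_op("W1 as (2, 4)", lambda: np.zeros((2, 4)) @ X)

# Wrong W^[2]: (4, 1) @ (4, m) has mismatched inner dimensions (1 vs 4)
r2 = try_op("W2 as (4, 1)", lambda: np.zeros((4, 1)) @ A1)

# Wrong b^[1]: (4, m) + (2, 1) cannot be broadcast
r3 = try_op("b1 as (2, 1)", lambda: np.zeros((n_1, m)) + np.zeros((2, 1)))

# Wrong b^[2]: (1, m) + (4, 1) broadcasts silently to (4, m), so no error
r4 = try_op("b2 as (4, 1)", lambda: np.zeros((n_2, m)) + np.zeros((4, 1)))
```

Because the last case runs without an error, checking `Z2.shape` explicitly is a safer habit than relying on NumPy to complain.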


## Summary Table

| Variable | Formula | Shape in this network |
|----------|---------|-----------------------|
| **W^[1]** | (n[1], n_x) | (4, 2) ✅ |
| **b^[1]** | (n[1], 1) | (4, 1) ✅ |
| **W^[2]** | (n[2], n[1]) | (1, 4) ✅ |
| **b^[2]** | (n[2], 1) | (1, 1) ✅ |

### Memory Trick:
- **Weights W^[l]**: (neurons in layer l, neurons in previous layer)
- **Biases b^[l]**: (neurons in layer l, 1)

Always think: "How many neurons in this layer?" → That's the first dimension!
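
The memory trick can also be turned into an automatic check. The function `check_parameter_shapes` and the `params` dictionary layout below are hypothetical, a minimal sketch of one way to assert the expected shapes before running a forward pass:

```python
import numpy as np


def check_parameter_shapes(params, layer_dims):
    """Raise AssertionError if any W^[l] or b^[l] has an unexpected shape.

    params is a hypothetical dict with keys "W1", "b1", "W2", "b2", ...
    layer_dims is the list [n[0], n[1], ..., n[L]].
    """
    for l in range(1, len(layer_dims)):
        expected_W = (layer_dims[l], layer_dims[l - 1])  # (n[l], n[l-1])
        expected_b = (layer_dims[l], 1)                  # (n[l], 1)
        actual_W = params[f"W{l}"].shape
        actual_b = params[f"b{l}"].shape
        assert actual_W == expected_W, f"W{l}: expected {expected_W}, got {actual_W}"
        assert actual_b == expected_b, f"b{l}: expected {expected_b}, got {actual_b}"
    return True


# Zero-filled placeholders with the shapes from the summary table
params = {
    "W1": np.zeros((4, 2)), "b1": np.zeros((4, 1)),
    "W2": np.zeros((1, 4)), "b2": np.zeros((1, 1)),
}
print(check_parameter_shapes(params, [2, 4, 1]))
```

Passing a parameter with a different shape, for example a (4, 1) array for `b2`, makes the function raise an `AssertionError` that names the offending parameter.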


In [None]:
# Visual summary
print("=" * 70)
print("FINAL ANSWER SUMMARY")
print("=" * 70)

print("""
Network Structure:

Input Layer        Hidden Layer 1        Output Layer

                ┌──→ a₁^[1] ──┐
    x₁ ──┐      ├──→ a₂^[1] ──┤
         ├──────┤             ├──→ a₁^[2] → ŷ
    x₂ ──┘      ├──→ a₃^[1] ──┤
                └──→ a₄^[1] ──┘

(Fully connected: every input feeds every hidden neuron,
 and every hidden neuron feeds the output neuron.)

Dimensions:
  Input:  X:    (2, m)  ← 2 features, m examples
  
  Layer 1:
    W^[1]: (4, 2)  ← 4 neurons, 2 inputs
    b^[1]: (4, 1)  ← 4 neurons, 1 bias each
    Z[1]:  (4, m)  ← 4 neurons, m examples
    A[1]:  (4, m)  ← 4 neurons, m examples
  
  Layer 2:
    W^[2]: (1, 4)  ← 1 neuron, 4 inputs
    b^[2]: (1, 1)  ← 1 neuron, 1 bias
    Z[2]:  (1, m)  ← 1 neuron, m examples
    A[2]:  (1, m)  ← 1 neuron, m examples
""")

print("\n" + "=" * 70)
print("CORRECT ANSWERS TO CHECK:")
print("=" * 70)
print("  ✓ W^[1] will have shape (4, 2)")
print("  ✓ W^[2] will have shape (1, 4)")
print("  ✓ b^[1] will have shape (4, 1)")
print("  ✓ b^[2] will have shape (1, 1)")
print("=" * 70)
