# Exercise 02: Neural Network from Scratch

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shang-vikas/series1-coding-exercises/blob/main/exercises/blog-01/exercise-02.ipynb)

This notebook builds the forward pass of a small neural network from scratch using only NumPy.

## Learning Objectives
- Understand the basic structure of a neural network
- Implement forward propagation manually with multiple hidden layers
- See how matrix multiplication drives neural networks
- Apply activation functions (ReLU) across layers

## 1. Setup and Installation


In [11]:
# Install required packages using the kernel's Python interpreter
import sys
import subprocess
import importlib


def install_if_missing(package, import_name=None):
    """Install package if it's not already installed."""
    if import_name is None:
        import_name = package
    try:
        importlib.import_module(import_name)
        print(f"✓ {package} is already installed")
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✓ {package} installed successfully")


# Check each package; only numpy is actually used in the rest of this notebook
install_if_missing("numpy")
install_if_missing("scikit-learn", "sklearn")
install_if_missing("pandas")

✓ numpy is already installed
✓ scikit-learn is already installed
✓ pandas is already installed


## 2. Import Libraries

In [12]:
import numpy as np

## 3. Define Input Data

In [13]:
# Example input (2 features)
x = np.array([2.0, 3.0])
print(f"Input shape: {x.shape}")
print(f"Input: {x}")

Input shape: (2,)
Input: [2. 3.]


## 4. Define Layer 1 (Hidden Layer)

In [14]:
# Layer 1 weights: shape (2, 3) - 2 inputs, 3 neurons
W1 = np.array([
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5]
])

# Layer 1 bias: shape (3,)
b1 = np.array([0.1, -0.1, 0.05])

print(f"W1 shape: {W1.shape}")
print(f"b1 shape: {b1.shape}")

W1 shape: (2, 3)
b1 shape: (3,)


## 5. Forward Propagation Through Layer 1

In [15]:
# Linear transformation: z1 = x @ W1 + b1
z1 = x @ W1 + b1

# Apply ReLU activation function
a1 = np.maximum(0, z1)  # ReLU: max(0, z)

print(f"Layer 1 pre-activation (z1): {z1}")
print(f"Layer 1 output (a1): {a1}")

Layer 1 pre-activation (z1): [ 2.    1.9  -1.25]
Layer 1 output (a1): [2.  1.9 0. ]
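
To see what the matrix product is doing, the same layer can be computed one neuron at a time. The sketch below is illustrative only: the loop and the name `z1_loop` are not part of the original notebook, and `x`, `W1`, and `b1` are repeated so the block runs on its own.

```python
import numpy as np

x = np.array([2.0, 3.0])
W1 = np.array([
    [0.5, -0.2, 0.1],
    [0.3, 0.8, -0.5]
])
b1 = np.array([0.1, -0.1, 0.05])

# Each column of W1 holds the weights of one neuron.
z1_loop = np.zeros(W1.shape[1])
for j in range(W1.shape[1]):
    # Weighted sum of both inputs for neuron j, plus that neuron's bias
    z1_loop[j] = x[0] * W1[0, j] + x[1] * W1[1, j] + b1[j]

print(z1_loop)
print(np.allclose(z1_loop, x @ W1 + b1))  # True: matches the matrix form
```

The `@` operator performs all three weighted sums in one call, which is why the matrix form is preferred once layers get wider.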


## 6. Define Layer 2 (Second Hidden Layer)

In [16]:
# Layer 2 weights: shape (3, 4) - 3 inputs from layer 1, 4 neurons
W2 = np.array([
    [0.4, -0.1, 0.3, 0.2],
    [-0.2, 0.5, -0.1, 0.4],
    [0.1, -0.3, 0.2, -0.1]
])

# Layer 2 bias: shape (4,)
b2 = np.array([0.05, -0.05, 0.1, -0.1])

print(f"W2 shape: {W2.shape}")
print(f"b2 shape: {b2.shape}")

W2 shape: (3, 4)
b2 shape: (4,)


## 7. Forward Propagation Through Layer 2

In [17]:
# Linear transformation: z2 = a1 @ W2 + b2
z2 = a1 @ W2 + b2

# Apply ReLU activation function
a2 = np.maximum(0, z2)  # ReLU: max(0, z)

print(f"Layer 2 pre-activation (z2): {z2}")
print(f"Layer 2 output (a2): {a2}")

Layer 2 pre-activation (z2): [0.47 0.7  0.51 1.06]
Layer 2 output (a2): [0.47 0.7  0.51 1.06]
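
Here `a2` equals `z2` because every pre-activation in layer 2 is positive, so ReLU passes the values through unchanged. A minimal sketch of that behaviour follows. The helper name `relu` is hypothetical (the notebook calls `np.maximum` inline), and the arrays are copied from the printed outputs above.

```python
import numpy as np


def relu(z):
    """Element-wise ReLU: negative entries become 0, the rest pass through."""
    return np.maximum(0, z)


z1 = np.array([2.0, 1.9, -1.25])         # layer 1 pre-activations
z2 = np.array([0.47, 0.7, 0.51, 1.06])   # layer 2 pre-activations

print(relu(z1))  # the negative third entry is clipped to 0
print(relu(z2))  # all entries are positive, so nothing changes
```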


## 8. Define Output Layer

In [18]:
# Output layer weights: shape (4, 1) - 4 inputs from layer 2, 1 output
W3 = np.array([
    [0.7],
    [-0.3],
    [0.2],
    [0.1]
])

# Output layer bias: shape (1,)
b3 = np.array([0.05])

print(f"W3 shape: {W3.shape}")
print(f"b3 shape: {b3.shape}")

W3 shape: (4, 1)
b3 shape: (1,)


## 9. Forward Propagation Through Output Layer

In [19]:
# Linear transformation: z3 = a2 @ W3 + b3
z3 = a2 @ W3 + b3
output = z3  # no activation on the output layer; the linear value is used directly

print(f"Final output: {output}")
print(f"Output shape: {output.shape}")

Final output: [0.377]
Output shape: (1,)
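
The three steps above repeat the same pattern, so they can be folded into one loop. The following is a sketch under assumptions, not the notebook's own code: the function name `forward`, the `layers` list, and the second row of the batch `X` are introduced here for illustration. The weights are the ones defined earlier.

```python
import numpy as np


def forward(x, layers):
    """Run x through a list of (W, b) pairs.

    ReLU is applied after every layer except the last, which stays linear.
    """
    a = x
    for i, (W, b) in enumerate(layers):
        z = a @ W + b
        is_last = i == len(layers) - 1
        a = z if is_last else np.maximum(0, z)
    return a


W1 = np.array([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]])
b1 = np.array([0.1, -0.1, 0.05])
W2 = np.array([[0.4, -0.1, 0.3, 0.2], [-0.2, 0.5, -0.1, 0.4], [0.1, -0.3, 0.2, -0.1]])
b2 = np.array([0.05, -0.05, 0.1, -0.1])
W3 = np.array([[0.7], [-0.3], [0.2], [0.1]])
b3 = np.array([0.05])
layers = [(W1, b1), (W2, b2), (W3, b3)]

# Single input, as in the notebook
x = np.array([2.0, 3.0])
print(forward(x, layers))

# A batch with one input per row; the second row is made up for illustration
X = np.array([[2.0, 3.0], [-1.0, 0.5]])
print(forward(X, layers).shape)
```

Because `@` and the bias addition broadcast over rows, the same function handles a single input of shape `(2,)` and a batch of shape `(n, 2)` without changes.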


## Summary

### What Just Happened?

You performed a complete forward pass through a **multi-layer neural network**:

**Input → Linear Transform → ReLU → Linear Transform → ReLU → Linear Transform → Output**

That's a deep neural network in its essence:
- **No brains**
- **No magic**  
- **Just matrix multiplications + simple functions stacked together**

### Network Architecture

```
Input (2 features)
    ↓
Layer 1 (Hidden): 2 → 3 neurons (with ReLU activation)
    ↓
Layer 2 (Hidden): 3 → 4 neurons (with ReLU activation)
    ↓
Output Layer: 4 → 1 neuron
    ↓
Final Output
```
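
The layer sizes in the diagram determine every weight and bias shape, and therefore the number of parameters. A small sketch that derives them follows; the `sizes` list and the variable names are illustrative and not part of the notebook.

```python
# Layer sizes from the diagram: input, hidden 1, hidden 2, output
sizes = [2, 3, 4, 1]

total = 0
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    n_params = n_in * n_out + n_out  # one weight per connection, one bias per neuron
    print(f"{n_in} -> {n_out}: W shape ({n_in}, {n_out}), b shape ({n_out},), {n_params} parameters")
    total += n_params

print(f"Total parameters: {total}")
```

The shapes printed here match `W1`, `W2`, and `W3` as defined in the notebook.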

### Key Concepts Demonstrated

1. **Matrix Multiplication**: The core operation (`@` operator) connecting layers
2. **Bias Terms**: Added to each neuron's weighted sum, shifting the input value at which its activation switches on
3. **Activation Functions**: ReLU introduces non-linearity in hidden layers
4. **Forward Propagation**: Data flows sequentially through multiple layers
5. **Deep Networks**: Stacking multiple hidden layers allows a network to represent more complex patterns
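
The non-linearity in point 3 is what makes depth useful. Without ReLU, the three layers collapse into a single linear layer. The sketch below checks this numerically with the notebook's weights; the names `stacked`, `collapsed`, and `with_relu` are introduced here for illustration.

```python
import numpy as np

x = np.array([2.0, 3.0])
W1 = np.array([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]])
b1 = np.array([0.1, -0.1, 0.05])
W2 = np.array([[0.4, -0.1, 0.3, 0.2], [-0.2, 0.5, -0.1, 0.4], [0.1, -0.3, 0.2, -0.1]])
b2 = np.array([0.05, -0.05, 0.1, -0.1])
W3 = np.array([[0.7], [-0.3], [0.2], [0.1]])
b3 = np.array([0.05])

# Three linear layers with no activation in between
stacked = ((x @ W1 + b1) @ W2 + b2) @ W3 + b3

# The same map written as one linear layer
W = W1 @ W2 @ W3
b = (b1 @ W2 + b2) @ W3 + b3
collapsed = x @ W + b

print(np.allclose(stacked, collapsed))  # True

# With ReLU between layers, as in the notebook, the result differs
with_relu = np.maximum(0, np.maximum(0, x @ W1 + b1) @ W2 + b2) @ W3 + b3
print(stacked, with_relu)
```

The two outputs differ for this input because ReLU zeroes the negative third pre-activation in layer 1, which the purely linear version carries forward.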

### Why Multiple Hidden Layers?

- **Layer 1** extracts basic features from the raw input
- **Layer 2** combines those features into more complex representations
- **Output Layer** makes the final prediction from those representations

Each layer builds upon the previous one, enabling the network to model increasingly sophisticated relationships in the data. In this notebook the weights are fixed by hand; in a trained network they are learned from data.