# 03_deep_learning_intro.ipynb

## Week 3: Introduction to Deep Learning & Best Practices

### Notebook Overview
This notebook is structured to help you:
1. Understand **Neural Network Fundamentals** (Monday).
2. Get comfortable with a **Deep Learning Framework** (PyTorch or TensorFlow) & build an MLP for MNIST (Tuesday).
3. Implement a **CNN for image classification** (Wednesday).
4. Explore **regularization & best practices** (Thursday).
5. Complete a **mini-project CNN showcase** (Friday).
6. Over the **Weekend**, review & consolidate your learnings.

By the end of this week, you'll have:
- Built your first neural networks (MLP, CNN) with a popular framework.
- Learned about optimization basics (SGD, Adam, etc.), overfitting vs. underfitting, and ways to mitigate both.
- Gained hands-on experience with experiment tracking.
- Prepared a mini-project showcasing a CNN from start to finish.

---
## 1. Monday: Neural Net Fundamentals

### Topics:
- Perceptron model, activation functions (ReLU, Sigmoid, Tanh)
- Forward propagation and backpropagation concepts
- Gradient descent and its stochastic variant (SGD)

### Notebook Tasks:
1. **Key Concepts**: Write down your own definitions and intuitive explanations of a perceptron, forward pass, backprop.
2. (Optional) **From-scratch example**: Implement a small forward pass in pure Python or illustrate the math.

### Why This Matters
Neural networks are the backbone of deep learning. Understanding the fundamentals (especially how forward/backprop works) helps you debug, tune, and architect more complex models.

---
## 2. Tuesday: Framework Setup & MLP for MNIST

### Topics:
- PyTorch or TensorFlow environment setup
- Building a simple MLP (multi-layer perceptron) for MNIST classification
- Data loading, training loop, evaluating accuracy
 
### Notebook Tasks:
1. **Install** PyTorch or TensorFlow (if not already installed).
2. **Load MNIST** dataset (or Fashion-MNIST if you want a slightly different challenge).
3. **Build** a small MLP, define a loss function (cross-entropy), and optimizer (SGD or Adam).
4. **Train** the network, track training/validation accuracy.
5. **Interpret** results in a markdown cell.

### Industry Context
MNIST might seem basic, but it's a common starting point for learning image classification and for debugging training loops.

---
## 3. Wednesday: CNN for Image Classification

### Topics:
- Convolutional Neural Networks (CNNs): Convolution, pooling, typical architectures
- Comparing CNN performance to MLP for image tasks
- Basic debugging of CNN training

### Notebook Tasks:
1. **Define** a CNN with a few convolution + pooling layers.
2. **Train** on MNIST or CIFAR-10.
3. **Log** metrics (accuracy, loss) over epochs.
4. **Visualize** sample predictions.

### Observations to Note
- Did your CNN reach higher test accuracy than the MLP? How many epochs did each need to get there?
- Common pitfalls: exploding/vanishing gradients, overfitting.
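
One way to spot exploding or vanishing gradients is to print the overall gradient norm after each backward pass. The following is a minimal sketch, assuming PyTorch; `total_grad_norm` and the tiny `nn.Linear` model are hypothetical names used only for illustration.

```python
import torch
import torch.nn as nn

def total_grad_norm(model):
    """L2 norm over all parameter gradients; call after loss.backward()."""
    squared = 0.0
    for param in model.parameters():
        if param.grad is not None:
            squared += param.grad.pow(2).sum().item()
    return squared ** 0.5

# Tiny stand-in model and random batch, just to exercise the helper
torch.manual_seed(0)
tiny = nn.Linear(4, 2)
loss = nn.CrossEntropyLoss()(tiny(torch.randn(8, 4)), torch.randint(0, 2, (8,)))
loss.backward()
print(f"gradient norm: {total_grad_norm(tiny):.4f}")
```

A norm that keeps growing across steps hints at exploding gradients, and one that collapses toward zero hints at vanishing gradients. If the norm explodes, `torch.nn.utils.clip_grad_norm_` is a common remedy to try.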

---
## 4. Thursday: Regularization & Best Practices

### Topics:
- Techniques to reduce overfitting: **Dropout**, **Batch Normalization**, **Data Augmentation**
- **Experiment Tracking**: TensorBoard, W&B, or MLflow

### Notebook Tasks:
1. **Add dropout** to your CNN and see if it helps.
2. **Incorporate batch normalization** layers.
3. **Use data augmentation** on CIFAR-10 or MNIST (random crops and small rotations; horizontal flips suit CIFAR-10 but distort MNIST digits).
4. **Set up** a simple experiment tracking method (TensorBoard or other) to compare runs.

### Industry Context
In real-world projects, controlling overfitting is critical. Data augmentation is standard for image tasks, while logging experiments is vital for systematic iteration.

---
## 5. Friday: Mini-Project – CNN Showcase

### Objective
Build a short but comprehensive **CNN project**:
1. Load dataset (MNIST, CIFAR-10, or a small custom dataset).
2. Define CNN architecture.
3. Integrate dropout, batch norm, or data augmentation.
4. Train and track your experiments.
5. Evaluate final model (accuracy, confusion matrix, sample predictions).
6. Summarize your findings.

### Key Points
- Document everything: hyperparameters, epoch counts, best accuracy, etc.
- Reflect on how you might extend this in a real-world scenario (more data, more complex model, etc.).

---
## 6. Weekend: Review & Consolidation

- **Review** each day’s tasks and make sure you’re comfortable with the code and concepts.
- If time permits, **refine** your mini-project code for clarity.
- **ADHD Tip**: If you feel overwhelmed, break your review into short bursts (Pomodoro style) focusing on one concept at a time.

### Next Steps Preview
In **Week 4**, we’ll dive deeper into **Advanced Deep Learning** concepts (RNNs, LSTM, Transformers, advanced CNNs/transfer learning).


## Practical Implementation Sections
Below are skeleton code cells and markdown cells for each day. Insert your own details, code, or notes.

---

### 1. Monday: Neural Net Fundamentals


**Key Definitions & Intuitive Explanations**
- **Perceptron**: *(Fill in your notes: how does it work, what’s the update rule?)*
- **Activation Functions**: ReLU, Sigmoid, Tanh (pros/cons, typical usage)
- **Forward Pass & Backpropagation**:  *(Write a short summary of chain rule in the context of neural nets.)*
- **Gradient Descent**: *(Why do we need it? Basic process?)*
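
To make the perceptron update rule concrete, here is a minimal sketch, assuming a step activation and the classic rule `w += lr * (target - prediction) * x`. The function names and the logical-AND toy data are illustrative choices, not part of the course material.

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Classic perceptron rule: w += lr * (target - prediction) * x."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if np.dot(w, xi) + b > 0 else 0  # step activation
            error = target - pred                    # 0 when the prediction is right
            w += lr * error * xi
            b += lr * error
    return w, b

def predict(X, w, b):
    return (X @ w + b > 0).astype(int)

# Toy data: logical AND, which is linearly separable
X_and = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])

w, b = train_perceptron(X_and, y_and)
print("weights:", w, "bias:", b)
print("predictions:", predict(X_and, w, b))
```

Swapping the labels for XOR (`[0, 1, 1, 0]`) is a useful experiment: a single perceptron cannot separate it, which motivates hidden layers.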


In [None]:
# (Optional) 1.1 Simple Forward Pass Example
import numpy as np

def relu(x):
    return np.maximum(0, x)

def simple_forward_pass(inputs, weights, biases):
    # Just one hidden layer with ReLU activation
    hidden = relu(np.dot(inputs, weights["w1"]) + biases["b1"])
    out = np.dot(hidden, weights["w2"]) + biases["b2"]
    return out

# Example usage:
inputs = np.array([1.0, 2.0])
weights = {
    "w1": np.random.randn(2, 3),  # from 2 input neurons to 3 hidden neurons
    "w2": np.random.randn(3, 1)   # from 3 hidden neurons to 1 output neuron
}
biases = {
    "b1": np.random.randn(3),
    "b2": np.random.randn(1)
}

output = simple_forward_pass(inputs, weights, biases)
print("Output:", output)


**Your Observations**:
- *(Write notes on how changing `weights` or `biases` might affect the output. How would backprop update these?)*
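
As a possible starting point for the backprop question, the sketch below applies the chain rule by hand to the same one-hidden-layer network and checks the result against finite differences. It assumes a squared-error loss and a made-up target `y_true`; both are illustrative choices, as are all helper names.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([1.0, 2.0])
y_true = np.array([1.0])  # hypothetical target, chosen only for illustration
params = {
    "w1": rng.normal(size=(2, 3)), "b1": rng.normal(size=3),
    "w2": rng.normal(size=(3, 1)), "b2": rng.normal(size=1),
}

def forward(p):
    z1 = x @ p["w1"] + p["b1"]      # hidden pre-activation
    h = np.maximum(0, z1)           # ReLU
    out = h @ p["w2"] + p["b2"]
    return z1, h, out

def loss_fn(p):
    _, _, out = forward(p)
    return 0.5 * np.sum((out - y_true) ** 2)

def backward(p):
    """Chain rule applied layer by layer, from the loss back toward the inputs."""
    z1, h, out = forward(p)
    d_out = out - y_true                 # dL/d(out)
    d_h = p["w2"] @ d_out                # back through the output layer
    d_z1 = d_h * (z1 > 0)                # ReLU passes gradient only where z1 > 0
    return {
        "w2": np.outer(h, d_out), "b2": d_out,
        "w1": np.outer(x, d_z1), "b1": d_z1,
    }

def numerical_grad(p, name, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    grad = np.zeros_like(p[name])
    for idx in np.ndindex(p[name].shape):
        original = p[name][idx]
        p[name][idx] = original + eps
        loss_plus = loss_fn(p)
        p[name][idx] = original - eps
        loss_minus = loss_fn(p)
        p[name][idx] = original
        grad[idx] = (loss_plus - loss_minus) / (2 * eps)
    return grad

grads = backward(params)
max_diff = max(np.abs(grads[k] - numerical_grad(params, k)).max() for k in params)
print("largest analytic vs numerical difference:", max_diff)

# One gradient descent step: move each parameter against its gradient
loss_before = loss_fn(params)
step_size = 0.01
for k in params:
    params[k] -= step_size * grads[k]
print("loss before:", loss_before, "after:", loss_fn(params))
```

Frameworks such as PyTorch compute the same gradients automatically in `loss.backward()`; the numerical check is only a debugging aid.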

### 2. Tuesday: Framework Setup & MLP for MNIST

**Instructions**:
1. Pick **PyTorch** or **TensorFlow**. Below is a PyTorch example skeleton.
2. **Install** if needed: `pip install torch torchvision`.
3. **Load MNIST** using `torchvision.datasets.MNIST` or a similar dataset.
4. **Define** an MLP model, train it, and track accuracy.
5. **Write** your observations.


In [None]:
# TODO: 2.1 PyTorch Setup & MLP Example
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Hyperparameters
batch_size = 64
learning_rate = 0.001
epochs = 2  # Increase for better results

# Data loaders
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # mean and std for MNIST
])

train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Define a simple MLP
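class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(28*28, 128)  # flattened 28x28 image -> 128 hidden units
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)     # 128 hidden units -> 10 digit classes

    def forward(self, x):
        x = x.view(x.size(0), -1)         # flatten each image to a 784-vector
        x = self.relu(self.fc1(x))
        x = self.fc2(x)                   # raw logits; CrossEntropyLoss applies log-softmax internally
        return x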

model = MLP()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

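# Training loop
for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * data.size(0)  # weight each batch by its size

    # Report the mean loss over the whole epoch, not just the final batch
    epoch_loss = running_loss / len(train_loader.dataset)
    print(f"Epoch {epoch+1}/{epochs} - Loss: {epoch_loss:.4f}")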

# Evaluation
model.eval()
correct = 0
total = 0

with torch.no_grad():
    for data, target in test_loader:
        output = model(data)
        _, predicted = torch.max(output, 1)
        total += target.size(0)
        correct += (predicted == target).sum().item()

accuracy = 100.0 * correct / total
print(f"Test Accuracy: {accuracy:.2f}%")

**Your Observations**:
- *(Document final accuracy, how many epochs you trained, any improvements if you tweak hyperparams, etc.)*
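
The example above evaluates on the test set only. To track validation accuracy as well, one option is to hold out part of the training data. The sketch below assumes a 90/10 split and uses random stand-in tensors with MNIST-like shapes so it runs without a download; `accuracy`, `full_train`, and `toy_model` are hypothetical names.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset, random_split

def accuracy(model, loader):
    """Percentage of correctly classified samples in `loader`."""
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for data, target in loader:
            predicted = model(data).argmax(dim=1)
            correct += (predicted == target).sum().item()
            total += target.size(0)
    return 100.0 * correct / total

# Stand-in data (random pixels and labels), NOT real MNIST.
# In the notebook, use `train_dataset` in place of `full_train`.
generator = torch.Generator().manual_seed(0)
full_train = TensorDataset(
    torch.randn(1000, 1, 28, 28, generator=generator),
    torch.randint(0, 10, (1000,), generator=generator),
)

n_val = len(full_train) // 10  # assumed 10% validation share
train_subset, val_subset = random_split(
    full_train, [len(full_train) - n_val, n_val], generator=generator
)
train_loader_split = DataLoader(train_subset, batch_size=64, shuffle=True)
val_loader = DataLoader(val_subset, batch_size=64, shuffle=False)

toy_model = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10)
)
print(f"train size: {len(train_subset)}, val size: {len(val_subset)}")
print(f"untrained val accuracy: {accuracy(toy_model, val_loader):.2f}%")
```

Calling `accuracy(model, val_loader)` at the end of every epoch gives a per-epoch validation curve, while the test set stays untouched until the final evaluation.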

### 3. Wednesday: CNN for Image Classification

**Instructions**:
1. Build a CNN with convolution + pooling layers.
2. Use either MNIST or CIFAR-10 (CIFAR-10 is a bit more challenging).
3. Compare CNN accuracy vs. MLP.
4. Plot training loss and validation accuracy across epochs (if time permits).


In [None]:
# TODO: 3.1 Example CNN on MNIST (Adjust for CIFAR-10 if you prefer)
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(64*7*7, 128)
        self.fc2 = nn.Linear(128, 10)
    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.pool(x)
        x = self.relu(self.conv2(x))
        x = self.pool(x)
        x = x.view(-1, 64*7*7)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the CNN
cnn_model = SimpleCNN()
cnn_optimizer = optim.Adam(cnn_model.parameters(), lr=learning_rate)
cnn_criterion = nn.CrossEntropyLoss()

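# Simple training loop (for demonstration; you can refine or add validation)
for epoch in range(epochs):
    cnn_model.train()
    running_loss = 0.0
    for data, target in train_loader:
        cnn_optimizer.zero_grad()
        output = cnn_model(data)
        loss = cnn_criterion(output, target)
        loss.backward()
        cnn_optimizer.step()
        running_loss += loss.item() * data.size(0)  # weight each batch by its size
    # Report the mean loss over the whole epoch, not just the final batch
    epoch_loss = running_loss / len(train_loader.dataset)
    print(f"[CNN] Epoch {epoch+1}/{epochs} - Loss: {epoch_loss:.4f}")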

# Evaluate CNN
cnn_model.eval()
correct = 0
total = 0

with torch.no_grad():
    for data, target in test_loader:
        output = cnn_model(data)
        _, predicted = torch.max(output, 1)
        total += target.size(0)
        correct += (predicted == target).sum().item()

cnn_accuracy = 100.0 * correct / total
print(f"CNN Test Accuracy: {cnn_accuracy:.2f}%")

**Comparison**:
- *(Write how CNN results compare to MLP. Which is higher accuracy? Any interesting patterns?)*
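
The `64*7*7` input size of `fc1` in the CNN above comes from how each layer changes the spatial size of the image. The sketch below walks through that arithmetic with the standard output-size formula for square inputs without dilation; `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a conv or pooling layer (square inputs, no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 28                                                 # MNIST images are 28x28
size = conv_output_size(size, kernel_size=3, padding=1)   # conv1 keeps 28
size = conv_output_size(size, kernel_size=2, stride=2)    # pool halves to 14
size = conv_output_size(size, kernel_size=3, padding=1)   # conv2 keeps 14
size = conv_output_size(size, kernel_size=2, stride=2)    # pool halves to 7
flattened = 64 * size * size                              # 64 channels from conv2
print(size, flattened)  # 7 3136
```

For CIFAR-10 the same layers take a 32x32 input down to 8x8, so `fc1` would need `64*8*8` inputs and `conv1` would need 3 input channels instead of 1.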

### 4. Thursday: Regularization & Best Practices

**Techniques**:
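- **Dropout**: randomly zeroes a fraction of neurons’ outputs during training; it is disabled at evaluation time.
- **Batch Normalization**: normalizes each layer’s inputs over the mini-batch, which can stabilize training.
- **Data Augmentation**: random flips, crops, and rotations applied to training images to artificially expand the dataset.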
- **Experiment Tracking**: TensorBoard, W&B, MLflow.
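
Dropout behaves differently in training and evaluation mode, which is why the loops in this notebook call `model.train()` and `model.eval()`. The following small sketch, assuming PyTorch's `nn.Dropout` with `p=0.5`, shows the difference on a tensor of ones.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dropout = nn.Dropout(p=0.5)
ones = torch.ones(1, 8)

dropout.train()
train_out = dropout(ones)  # each entry is either 0 or scaled to 1 / (1 - p) = 2.0

dropout.eval()
eval_out = dropout(ones)   # identity: nothing is dropped or rescaled

print("train mode:", train_out)
print("eval mode: ", eval_out)
```

The rescaling during training keeps the expected activation unchanged, so no extra correction is needed at evaluation time.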

**Notebook Tasks**:
1. Implement or add **dropout** in your CNN.
2. Add **batch norm** after convolution layers.
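3. **Augment** data (especially helpful on CIFAR-10) with `transforms.RandomHorizontalFlip()`, `transforms.RandomCrop()`, etc. Skip horizontal flips on MNIST, since a mirrored digit is no longer a valid example of its class.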
4. Use **TensorBoard** or another tool to log accuracy & loss. (Pseudocode or partial code if short on time.)


In [None]:
# (Example) 4.1 A CNN with Dropout & BatchNorm
class CNNWithDropoutBatchNorm(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.bn1 = nn.BatchNorm2d(32)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(64)
        self.pool = nn.MaxPool2d(2, 2)
        self.dropout = nn.Dropout(p=0.5)
        self.fc1 = nn.Linear(64*7*7, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.bn1(self.conv1(x)))
        x = self.pool(x)
        x = self.relu(self.bn2(self.conv2(x)))
        x = self.pool(x)
        x = x.view(-1, 64*7*7)
        x = self.dropout(self.relu(self.fc1(x)))
        x = self.fc2(x)
        return x

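# Train and evaluate with the same loop as above so the results are comparable.
reg_model = CNNWithDropoutBatchNorm()
reg_optimizer = optim.Adam(reg_model.parameters(), lr=learning_rate)
reg_criterion = nn.CrossEntropyLoss()

for epoch in range(epochs):
    reg_model.train()  # enables dropout and batch-statistics updates
    for data, target in train_loader:
        reg_optimizer.zero_grad()
        loss = reg_criterion(reg_model(data), target)
        loss.backward()
        reg_optimizer.step()

reg_model.eval()  # disables dropout and uses running batch-norm statistics
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        correct += (reg_model(data).argmax(dim=1) == target).sum().item()

reg_accuracy = 100.0 * correct / len(test_loader.dataset)
print(f"Regularized CNN Test Accuracy: {reg_accuracy:.2f}%")
# TODO: Compare final accuracy with/without dropout and batchnorm (see `cnn_accuracy` above).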

**Experiment Tracking**:
- *(If using TensorBoard, show code to log your loss/accuracy. If using W&B, log in a similar fashion.)*
- Summarize the runs in a short markdown cell.
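
One possible shape for the logging code is sketched below. It assumes TensorBoard via `torch.utils.tensorboard.SummaryWriter`, which needs the `tensorboard` package installed. `log_epoch`, `open_tensorboard_writer`, and `InMemoryWriter` are hypothetical helpers, and the metric values in the dry run are placeholders, not real results.

```python
def log_epoch(writer, epoch, train_loss, val_accuracy):
    """Log one epoch's metrics; `writer` needs an `add_scalar(tag, value, step)` method."""
    writer.add_scalar("Loss/train", train_loss, epoch)
    writer.add_scalar("Accuracy/val", val_accuracy, epoch)

def open_tensorboard_writer(run_name):
    """Create a TensorBoard writer under runs/<run_name> (requires `pip install tensorboard`)."""
    from torch.utils.tensorboard import SummaryWriter
    return SummaryWriter(log_dir=f"runs/{run_name}")

class InMemoryWriter:
    """Hypothetical stand-in with the same `add_scalar` shape, handy for dry runs."""
    def __init__(self):
        self.records = []

    def add_scalar(self, tag, value, step):
        self.records.append((tag, step, value))

    def close(self):
        pass

# Dry run with placeholder numbers (not real results).
# For real runs, use: writer = open_tensorboard_writer("baseline")
writer = InMemoryWriter()
for epoch, (loss_value, acc_value) in enumerate([(0.45, 91.0), (0.21, 95.5)], start=1):
    log_epoch(writer, epoch, loss_value, acc_value)
writer.close()
print(writer.records)
```

Giving each configuration its own `run_name` (for example a baseline and a dropout variant) lets you overlay the curves after starting `tensorboard --logdir runs`.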


### 5. Friday: Mini-Project – CNN Showcase

**Objective**: Consolidate your CNN knowledge into a short end-to-end project. For example:
1. Load & preprocess data.
2. Define CNN architecture with dropout/batch norm.
3. Train with data augmentation.
4. Track experiments (baseline vs. augmented, dropout vs. no dropout).
5. Evaluate final results. Show confusion matrix or sample predictions.
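
For the evaluation step, a confusion matrix can be built directly from label and prediction tensors. The sketch below assumes PyTorch and uses a tiny hand-made example rather than real model output; `confusion_matrix` is a hypothetical helper name.

```python
import torch

def confusion_matrix(targets, predictions, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = torch.zeros(num_classes, num_classes, dtype=torch.long)
    for t, p in zip(targets.tolist(), predictions.tolist()):
        matrix[t, p] += 1
    return matrix

# Hand-made labels for 3 classes (illustrative only, not model output)
targets = torch.tensor([0, 0, 1, 1, 2, 2])
predictions = torch.tensor([0, 1, 1, 1, 2, 0])

cm = confusion_matrix(targets, predictions, num_classes=3)
per_class_recall = cm.diag().float() / cm.sum(dim=1).float()
print(cm)
print("per-class recall:", per_class_recall)
```

In the mini-project, collect `target` and `predicted` across all test batches (for example with `torch.cat`) and pass them in with `num_classes=10`. Off-diagonal cells then show which classes the model confuses with each other.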

**Your Steps**:
- *(Create code cells that do each step. Document your final results in a markdown cell.)*

In [None]:
# TODO: 5.1 Example skeleton for your mini-project
def run_cnn_showcase():
    # 1. Data loading & augmentation
    # 2. Model definition
    # 3. Training loop & experiment tracking
    # 4. Evaluation
    # 5. Document results
    pass

print("CNN Showcase placeholder. Fill with your own logic.")

### Industry Context
CNNs power many real-world applications: image recognition, medical imaging, self-driving car vision, etc. Documenting your mini-project thoroughly demonstrates practical skill.


### 6. Weekend: Review & Consolidation

### Checklist
- [ ] Familiar with forward/backprop?
- [ ] Comfortable training MLP & CNN?
- [ ] Understanding of overfitting, dropout, batch norm?
- [ ] Basic logging of experiments?

### ADHD Tip
- Break your weekend review into **short, focused sessions**.
- Reward yourself after each milestone (completing 1 or 2 bullet points).

### Looking Ahead
In **Week 4**, we’ll explore **Advanced Deep Learning** topics like RNNs, LSTMs, Transformers, and more advanced CNN or transfer learning methods. Keep your momentum going!

# End of Week 3 Notebook

---
Congratulations on completing your first steps into Deep Learning! Next week, we’ll expand these fundamentals into more specialized or advanced architectures.
