# 📝 Week 6 Homework: Hyperparameter Tuning

**Goal**: Step into the role of a machine learning engineer by experimenting with hyperparameters to see how they affect model performance.

---



## ▶️ Today's Video

If you haven't already, watch this video to understand hyperparameters and why tuning them is one of the most important and creative skills in machine learning.

🔗 [Neural Networks Summary: All hyperparameters](https://www.youtube.com/watch?v=h291CuASDno)

---

In [None]:
#@title üîó Neural Networks Summary: All hyperparameters
from IPython.display import HTML


# Create the HTML for embedding
html_code = f"""

<iframe width="560" height="315" src="https://www.youtube.com/embed/h291CuASDno?si=iko_Nw8BeGjsY0v_" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

"""
# Display the video
display(HTML(html_code))



## 📖 Today's Theory: The Art of Tuning

The values we set ourselves **before training begins**, such as the **learning rate (`lr`)**, the **number of epochs**, or the **choice of optimizer**, are called **hyperparameters**. They control the learning process itself.

Finding a good combination of hyperparameters is often more of an **art than a science** and is a critical skill for building high-performing models.
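
To make the distinction concrete, the sketch below contrasts the handful of values set by hand with the number of weights and biases that training adjusts on its own. The hyperparameter values and layer sizes are taken from this homework's baseline script; the dictionary layout and the `count_learned_parameters` helper are illustrative assumptions, not part of the assignment.

```python
# Hyperparameters: chosen by hand before training (values from this homework's baseline).
hyperparameters = {
    "lr": 0.01,
    "epochs": 3,
    "optimizer": "SGD",
    "momentum": 0.9,
    "batch_size": 64,
}

# Learned parameters: weights and biases that training adjusts.
# Layer sizes match the baseline network (784 -> 128 -> 10).
layer_sizes = [(28 * 28, 128), (128, 10)]


def count_learned_parameters(layers):
    """Count weights (inputs * outputs) plus one bias per output for each linear layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in layers)


n_learned = count_learned_parameters(layer_sizes)
print(f"{len(hyperparameters)} hyperparameters set by hand")
print(f"{n_learned} parameters learned during training")
```

A few hand-picked values steer how roughly a hundred thousand learned values get updated, which is why a single change can matter so much.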

### 📻 Analogy: Tuning a Radio

Finding the right learning rate is like tuning an old radio:
- Turn the dial **too quickly** (`lr` too high) → you overshoot the signal and get noise.
- Turn it **too slowly** (`lr` too low) → it takes forever to find a clear station.

The goal is to find the **sweet spot** where learning is stable and efficient.
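
The radio analogy can be reproduced on a toy problem. The sketch below runs plain gradient descent on `f(w) = w**2`, whose minimum is at `w = 0`. This is not the MNIST model, and the function name and the three learning rates are chosen purely for illustration.

```python
def gradient_descent(lr, steps=20, w0=1.0):
    """Minimize f(w) = w**2 from w0 with a fixed learning rate; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2 * w       # derivative of w**2
        w = w - lr * grad  # one gradient-descent update
    return w


# Too low, reasonable, and too high for this particular toy function.
for lr in (0.001, 0.1, 1.1):
    print(f"lr={lr}: final w = {gradient_descent(lr):.4f}")
```

With the smallest rate, `w` barely moves from its starting point in 20 steps. With the middle rate it lands close to 0. With the largest rate every step overshoots the minimum by more than the previous one, so `w` grows instead of shrinking.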

---

## 🚀 Your Task: Experiment, Document, and Analyze

Use the **baseline script** (provided below) as your starting point. It's a complete, working training and evaluation pipeline for MNIST.

### 🔬 The Experiments

Run **three separate experiments**. For each:

1. Create a **new code cell**.
2. Copy the **entire baseline script** into it.
3. Make **only the specified change**.
4. Run the cell and record the **final test accuracy**.

> 💡 **Important**: Only change the hyperparameter listed for each experiment. Keep everything else identical.

- **Experiment A (High Learning Rate)**: Change `lr` from `0.01` to `0.1`.
- **Experiment B (More Epochs)**: Change `epochs` from `3` to `10`.
- **Experiment C (Different Optimizer)**: Replace  
  `optim.SGD(net.parameters(), lr=0.01, momentum=0.9)`  
  with  
  `optim.Adam(net.parameters(), lr=0.001)`.
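
One way to double-check that an experiment differs from the baseline only where intended is to write each configuration as a dictionary and compare. The sketch below assumes hypothetical names (`BASELINE`, `EXPERIMENTS`, `changed_keys`); the values come from the experiment list above.

```python
BASELINE = {"optimizer": "SGD", "lr": 0.01, "momentum": 0.9, "epochs": 3}

EXPERIMENTS = {
    "A": {**BASELINE, "lr": 0.1},
    "B": {**BASELINE, "epochs": 10},
    # As specified in the task, switching to Adam also sets lr=0.001 and drops momentum.
    "C": {"optimizer": "Adam", "lr": 0.001, "epochs": 3},
}


def changed_keys(config, baseline=BASELINE):
    """Return the sorted names of settings that differ from the baseline."""
    keys = set(config) | set(baseline)
    return sorted(k for k in keys if config.get(k) != baseline.get(k))


for name, config in EXPERIMENTS.items():
    print(name, changed_keys(config))
```

Experiments A and B each change exactly one setting. Experiment C, as written in the task, changes the optimizer together with its learning rate and momentum, which is worth keeping in mind when comparing Adam with SGD.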

---

## 📋 Homework Submission Template

At the **very bottom of your notebook**, create a **new Markdown cell** and use this template to document your findings.

### My Hyperparameter Tuning Experiments

**Experiment A: Learning Rate (0.1)**  
- **Final Accuracy**: *What was the accuracy on the test set?*  
- **Conclusion**: *How did this high learning rate affect the model's ability to generalize compared to the original model?*

**Experiment B: Epochs (10)**  
- **Final Accuracy**: *What was the accuracy after 10 epochs?*  
- **Conclusion**: *Did training for longer improve performance significantly? What is the trade-off with training time?*

**Experiment C: Optimizer (Adam)**  
- **Final Accuracy**: *What was Adam's final accuracy?*  
- **Conclusion**: *How did Adam's performance compare to SGD's? Which would you choose for this problem and why?*
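
The Experiment B question asks about training time, which is easier to answer with a measurement than with an impression. The sketch below is one possible approach: `timed` is a hypothetical helper, and `fake_training` is a stand-in workload used only so the example runs without PyTorch. In the notebook, the training loop would be wrapped in a function and passed to `timed` instead.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_training(epochs):
    """Stand-in for a training loop: does a fixed amount of work per epoch."""
    total = 0
    for _ in range(epochs):
        total += sum(i * i for i in range(100_000))
    return total


_, seconds_3 = timed(fake_training, 3)
_, seconds_10 = timed(fake_training, 10)
print(f"3 epochs: {seconds_3:.3f}s, 10 epochs: {seconds_10:.3f}s")
```

Recording the elapsed time next to each accuracy gives the trade-off in numbers: accuracy points gained versus seconds spent.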

In [None]:
# ========== Week 6 Homework: Hyperparameter Tuning ==========

# --- INSTRUCTIONS ---
# For each experiment (A, B, C):
# 1. Copy this ENTIRE cell into a NEW code cell.
# 2. Make ONLY the specified change (see homework instructions).
# 3. Run the cell and note the final accuracy.

# 1. Imports
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# 2. Load Data
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

# 3. Define Model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)
    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

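# 4. Define Evaluation Function
def evaluate_model(model, loader):
    """Return the model's classification accuracy on `loader`, in percent."""
    model.eval()  # evaluation mode (no effect on this model, but a good habit)
    correct = 0
    total = 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for images, labels in loader:
            outputs = model(images)
            predicted = outputs.argmax(dim=1)  # index of the highest logit = predicted digit
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total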

# ========== SOLUTION: Baseline (lr=0.01, epochs=3, SGD) ==========
# Accuracy in the recorded solution run: 96.34%
net = Net()
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
epochs = 3

# Only ONE configuration may be active per run. To run an experiment, replace the
# baseline `optimizer` / `epochs` lines above with the matching lines below.

# ========== SOLUTION: Experiment A (lr=0.1) ==========
# optimizer = optim.SGD(net.parameters(), lr=0.1, momentum=0.9)
# epochs = 3
# → Training tends to be unstable and accuracy drops (77.69% in the recorded solution run)

# ========== SOLUTION: Experiment B (epochs=10) ==========
# optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
# epochs = 10
# → Accuracy improves slightly, with diminishing returns (97.46% in the recorded solution run)

# ========== SOLUTION: Experiment C (Adam, lr=0.001) ==========
# optimizer = optim.Adam(net.parameters(), lr=0.001)
# epochs = 3
# → Accuracy close to the baseline (96.62% in the recorded solution run)

criterion = nn.CrossEntropyLoss()

print("🚀 Starting Training...")
for epoch in range(epochs):
    for inputs, labels in trainloader:
        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)  # cross-entropy loss on this batch
        loss.backward()                    # backpropagate
        optimizer.step()                   # update the weights
print("🏁 Finished Training!")

accuracy = evaluate_model(net, testloader)
print(f'Final Test Accuracy: {accuracy:.2f} %')

🚀 Starting Training...
🏁 Finished Training!
Final Test Accuracy: 96.22 %


---
### **`My Solution and Explanation`**
---

In [None]:
# ========== Week 6 Homework: Hyperparameter Tuning ==========

# 1. Imports
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms

# 2. Load Data
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
trainset = torchvision.datasets.MNIST(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
testset = torchvision.datasets.MNIST(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

# 3. Define Model
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 128)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)
    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x

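# 4. Define Evaluation Function
def evaluate_model(model, loader):
    """Return the model's classification accuracy on `loader`, in percent."""
    model.eval()  # evaluation mode (no effect on this model, but a good habit)
    correct = 0
    total = 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for images, labels in loader:
            outputs = model(images)
            predicted = outputs.argmax(dim=1)  # index of the highest logit = predicted digit
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    return 100 * correct / total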

# ============================================================
# ========== SOLUTION: Baseline (lr=0.01, epochs=3, SGD) ==========
# Expected accuracy: roughly 96% (96.34% in the recorded run below)
# ============================================================
net = Net()
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.01, momentum=0.9)
epochs = 3  # Baseline epochs

print("🚀 Starting Training (Baseline)...")
for epoch in range(epochs):
    for data in trainloader:
        inputs, labels = data
        optimizer.zero_grad()
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
print("🏁 Finished Training (Baseline)!")
baseline_acc = evaluate_model(net, testloader)
print()
print(f'Baseline Final Test Accuracy: {baseline_acc:.2f} %')
print()
# ============================================================
# ========== SOLUTION: Experiment A (High LR) ==========
# lr = 0.1
# ============================================================
net_a = Net()
optimizer_a = optim.SGD(net_a.parameters(), lr=0.1, momentum=0.9)
epochs_a = 3

for epoch in range(epochs_a):
    for data in trainloader:
        inputs, labels = data
        optimizer_a.zero_grad()
        outputs = net_a(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer_a.step()
acc_a = evaluate_model(net_a, testloader)
print(f'Experiment A Accuracy (High LR): {acc_a:.2f} %')

# ============================================================
# ========== SOLUTION: Experiment B (More Epochs) ==========
# epochs = 10
# ============================================================
net_b = Net()
optimizer_b = optim.SGD(net_b.parameters(), lr=0.01, momentum=0.9)
epochs_b = 10

for epoch in range(epochs_b):
    for data in trainloader:
        inputs, labels = data
        optimizer_b.zero_grad()
        outputs = net_b(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer_b.step()
acc_b = evaluate_model(net_b, testloader)
print(f'Experiment B Accuracy (More Epochs): {acc_b:.2f} %')

# ============================================================
# ========== SOLUTION: Experiment C (Adam Optimizer) ==========
# optimizer = Adam, lr = 0.001
# ============================================================
net_c = Net()
optimizer_c = optim.Adam(net_c.parameters(), lr=0.001)
epochs_c = 3

for epoch in range(epochs_c):
    for data in trainloader:
        inputs, labels = data
        optimizer_c.zero_grad()
        outputs = net_c(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer_c.step()
acc_c = evaluate_model(net_c, testloader)
print(f'Experiment C Accuracy (Adam): {acc_c:.2f} %')



🚀 Starting Training (Baseline)...
🏁 Finished Training (Baseline)!

Baseline Final Test Accuracy: 96.34 %

Experiment A Accuracy (High LR): 77.69 %
Experiment B Accuracy (More Epochs): 97.46 %
Experiment C Accuracy (Adam): 96.62 %


###  **My Hyperparameter Tuning Experiments**

| **Experiment** | **Hyperparameter Change** | **Final Accuracy** | **Conclusion** |
|----------------|----------------------------|--------------------|----------------|
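| **A** | Learning Rate = 0.1 | 77.69% | The high learning rate made training unstable: updates overshot good weight values, and accuracy fell 18.65 points below the baseline (96.34%). |
| **B** | Epochs = 10 | 97.46% | Training longer gave the best result, 1.12 points above the baseline. Since 3 epochs already reach 96.34%, the 7 extra epochs bought about 1 point while more than tripling the number of training passes. |
| **C** | Optimizer = Adam (lr=0.001) | 96.62% | Adam finished 0.28 points above SGD with momentum after the same 3 epochs. The gap is small enough that it may be run-to-run noise (no random seed was fixed), so the two optimizers perform about equally here. |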
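
The comparison with the baseline can be computed directly from the recorded accuracies. The snippet below is a small sketch that uses the numbers from the run above; the variable names are illustrative.

```python
baseline_acc = 96.34  # baseline accuracy from the recorded run
results = {"A": 77.69, "B": 97.46, "C": 96.62}

for name, acc in results.items():
    delta = acc - baseline_acc  # positive means better than the baseline
    print(f"Experiment {name}: {acc:.2f}% ({delta:+.2f} points vs. baseline)")
```

Expressing each result as a difference from the baseline makes it easier to see which change mattered most.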

---

### **Reflection**
Through these experiments, I learned that **a single hyperparameter change can shift test accuracy by anything from a fraction of a point to almost 19 points**.  
- A **too-high learning rate** (0.1) made training unstable and cut accuracy from 96.34% to 77.69%.  
- **More epochs** (10 instead of 3) gave the model more passes over the data and raised accuracy to 97.46%, at the cost of longer training.  
- The **Adam optimizer** reached 96.62% in the same 3 epochs, roughly on par with SGD with momentum (96.34%).  

This exercise helped me understand how tuning hyperparameters is both a **science and an art**: finding the right balance can make a simple model perform impressively well.
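
To see why Adam is used here with a smaller learning rate than SGD, it can help to look at a single update for one weight. The sketch below implements the two update rules in plain Python, assuming the standard formulations (SGD with momentum and no dampening, Adam with bias correction) and the default Adam constants `beta1=0.9`, `beta2=0.999`, `eps=1e-8`. The function names are hypothetical, and this is a simplified single-weight illustration rather than the library code.

```python
import math


def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD-with-momentum update for a single weight; returns (new_w, new_velocity)."""
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias-corrected estimates
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# Compare the size of the very first step for small, medium, and large gradients.
for grad in (0.001, 1.0, 1000.0):
    w_sgd, _ = sgd_momentum_step(0.0, grad, velocity=0.0)
    w_adam, _, _ = adam_step(0.0, grad, m=0.0, v=0.0, t=1)
    print(f"grad={grad}: SGD step={-w_sgd:.6f}, Adam step={-w_adam:.6f}")
```

The SGD step scales with the gradient, while Adam's first step stays close to `lr` whatever the gradient's magnitude. This difference in step scaling is one reason the two optimizers are usually given different learning rates.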


In [None]:
#@title Run to Enter your results

# ========== Record Your Homework Results ==========
# Run this cell to input your experiment accuracies interactively

try:
    expA = float(input("Enter final test accuracy for Experiment A (High LR = 0.1): "))
    expB = float(input("Enter final test accuracy for Experiment B (Epochs = 10): "))
    expC = float(input("Enter final test accuracy for Experiment C (Adam optimizer): "))

    # Validate ranges
    if not all(0 <= acc <= 100 for acc in [expA, expB, expC]):
        print("⚠️ Warning: Accuracy should be between 0 and 100. Please re-run this cell if values are incorrect.")

    # Store in the expected format for self-assessment
    homework_results = {
        'expA_acc': expA,
        'expB_acc': expB,
        'expC_acc': expC
    }

    print("\n✅ Results saved successfully!")
    print(f"Experiment A: {expA:.2f}%")
    print(f"Experiment B: {expB:.2f}%")
    print(f"Experiment C: {expC:.2f}%")

except ValueError:
    print("❌ Error: Please enter numeric values only (e.g., 85.3). Re-run this cell to try again.")
    homework_results = None

Enter final test accuracy for Experiment A (High LR = 0.1): 77.69
Enter final test accuracy for Experiment B (Epochs = 10): 97.46
Enter final test accuracy for Experiment C (Adam optimizer): 96.62

✅ Results saved successfully!
Experiment A: 77.69%
Experiment B: 97.46%
Experiment C: 96.62%


In [None]:
#@title Run to check your homework submission
# ========== Week 6 Homework Self-Assessment ==========

from IPython.display import display, Markdown

def check_homework_submission():
    feedback = []
    score = 0
    total = 1

    # Markdown cells can't be read programmatically in Colab/Jupyter, so this check
    # relies on the `homework_results` dict created by the results-entry cell.
    try:
        # `homework_results` is None if the results-entry cell hit an input error,
        # and missing entirely if that cell was never run.
        results = globals().get('homework_results')
        if results is not None:
            required_keys = {'expA_acc', 'expB_acc', 'expC_acc'}
            if not required_keys.issubset(results.keys()):
                feedback.append("❌ Please define `homework_results` with keys: 'expA_acc', 'expB_acc', 'expC_acc'")
            else:
                # Basic sanity check: accuracies should be between 0 and 100
                valid = all(0 <= v <= 100 for v in [results['expA_acc'], results['expB_acc'], results['expC_acc']])
                if valid:
                    score += 1
                    feedback.append("✅ Homework results recorded correctly!")
                else:
                    feedback.append("❌ Accuracy values must be between 0 and 100.")
        else:
            feedback.append("📝 **Reminder**: After running all 3 experiments, create a code cell with:\n```python\nhomework_results = {\n    'expA_acc': YOUR_ACCURACY_A,\n    'expB_acc': YOUR_ACCURACY_B,\n    'expC_acc': YOUR_ACCURACY_C\n}\n```")
    except Exception as e:
        feedback.append(f"❌ Error checking results: {e}")

    final_message = "**🎯 Week 6 Homework Self-Assessment**\n\n" + "\n".join(feedback)
    final_message += f"\n\n📊 **Score: {score}/{total}**"
    if score == total:
        final_message += "\n\n🎉 Great! You've completed the hyperparameter tuning homework. Well done!"
    else:
        final_message += "\n\n✏️ Please follow the instructions above to record your results."

    display(Markdown(final_message))

check_homework_submission()

**🎯 Week 6 Homework Self-Assessment**

✅ Homework results recorded correctly!

📊 **Score: 1/1**

🎉 Great! You've completed the hyperparameter tuning homework. Well done!