# PyTorch Deep Dive: Advanced Topics & Best Practices

You can build models. Now let's build them **professionally**.

In this notebook, we will learn the tools that make PyTorch scalable and robust.

## Learning Objectives
- **The Vocabulary**: What are a "Dataset", a "DataLoader", a "Batch Size", and a "Checkpoint"?
- **The Intuition**: Why we eat with a spoon (Batches) instead of a shovel.
- **The Practice**: Using `torch.utils.data` to handle massive datasets.
- **The Safety**: Saving and Loading your model so you don't lose work.


In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import numpy as np

torch.manual_seed(42)

## Part 1: The Vocabulary (Definitions First)

### 1. Dataset (`torch.utils.data.Dataset`)
- A class that knows where your data is and how to get one item.
- It must implement `__len__` (How big am I?) and `__getitem__` (Give me item #5).
- Analogy: A Librarian who knows where every book is.

### 2. DataLoader (`torch.utils.data.DataLoader`)
- A worker that grabs items from the Dataset and bundles them into Batches.
- It handles shuffling and, when `num_workers` is greater than 0, parallel loading in multiple worker processes (using multiple CPU cores).
- Analogy: A Delivery Driver who packs books into boxes and brings them to you.

### 3. Batch Size
- How many items the DataLoader puts in one box.
- **Small Batch (e.g., 1)**: Noisy updates, slow training.
- **Large Batch (e.g., 1000)**: Stable updates, memory hungry.
- **Sweet Spot**: Common starting points are 32, 64, or 128; the best value depends on your model and available memory.
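
To make the trade-off concrete, the sketch below counts how many weight updates one pass over the data (one epoch) produces for a few batch sizes. The helper name `updates_per_epoch` is hypothetical, and the 1000-item dataset size is an assumption chosen to match the example dataset built in Part 3.

```python
import math


def updates_per_epoch(num_items, batch_size):
    """Hypothetical helper: one update per batch, last batch may be partial."""
    return math.ceil(num_items / batch_size)


num_items = 1000  # assumed dataset size, for illustration only

for batch_size in (1, 64, 1000):
    count = updates_per_epoch(num_items, batch_size)
    print(f"batch_size={batch_size:>4} -> {count} updates per epoch")
```

A batch size of 1 means 1000 small, noisy steps per epoch, while a batch size of 1000 means a single step that needs the whole dataset in memory at once.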

### 4. Checkpoint
- A file containing the model's weights (and often the optimizer state) at a specific point in time.
- Analogy: A "Save Game" file.

## Part 2: The Intuition (Spoon vs Shovel)

Why do we use Batches?

Imagine you need to eat a mountain of rice (The Dataset).
- **Batch Size = 1 (SGD)**: Eating one grain at a time. You will starve before you finish.
- **Batch Size = All (Batch GD)**: Trying to shove the entire mountain into your mouth at once. You will choke (Out of Memory).
- **Batch Size = 64 (Mini-Batch GD)**: Eating with a spoon. Efficient and manageable.

The `DataLoader` is your spoon.
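
As a rough sketch of what eating with a spoon looks like in code, the loop below trains a small linear model one mini-batch at a time. The synthetic data, the `TensorDataset` wrapper, the learning rate, and the number of epochs are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(42)

# Synthetic regression data (assumed for illustration): target = sum of 5 features
features = torch.randn(1000, 5)
targets = features.sum(dim=1, keepdim=True)

loader = DataLoader(TensorDataset(features, targets), batch_size=64, shuffle=True)

model = nn.Linear(5, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.1)

epoch_losses = []
for epoch in range(5):
    running_loss = 0.0
    for batch_x, batch_y in loader:      # one "spoonful" of 64 items (or fewer)
        optimizer.zero_grad()            # clear gradients from the previous step
        loss = criterion(model(batch_x), batch_y)
        loss.backward()                  # gradients from this mini-batch only
        optimizer.step()                 # one weight update per mini-batch
        running_loss += loss.item() * batch_x.size(0)
    epoch_losses.append(running_loss / len(loader.dataset))
    print(f"Epoch {epoch + 1}: average loss = {epoch_losses[-1]:.4f}")
```

Each pass through the inner loop touches only one batch, so memory use depends on the batch size rather than on the size of the whole dataset.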

## Part 3: Custom Dataset (The Practice)

Let's build a fake dataset of random numbers.

In [None]:
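class RandomDataset(Dataset):
    def __init__(self, size, length):
        # size: number of features per item; length: number of items
        self.len = length
        self.data = torch.randn(length, size)

    def __getitem__(self, index):
        # Return the item (a tensor with `size` values) at the given index
        return self.data[index]

    def __len__(self):
        # Total number of items in the dataset
        return self.len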

# Create a dataset with 1000 items
dataset = RandomDataset(size=5, length=1000)

# Test the Librarian
print(f"Item 0: {dataset[0]}")
print(f"Length: {len(dataset)}")
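
Real datasets usually return an input together with its label. The variant below is a sketch of that pattern; the class name `LabeledRandomDataset` and the rule used to generate labels are hypothetical, chosen only to keep the example self-contained.

```python
import torch
from torch.utils.data import Dataset


class LabeledRandomDataset(Dataset):
    """Hypothetical dataset that returns (features, label) pairs."""

    def __init__(self, size, length):
        self.features = torch.randn(length, size)
        # Made-up labeling rule: 1 if the features sum to a positive number, else 0
        self.labels = (self.features.sum(dim=1) > 0).long()

    def __getitem__(self, index):
        # Returning a tuple lets a DataLoader batch inputs and labels together
        return self.features[index], self.labels[index]

    def __len__(self):
        return len(self.features)


labeled_dataset = LabeledRandomDataset(size=5, length=1000)
x, y = labeled_dataset[0]
print(f"Features shape: {tuple(x.shape)}, label: {y.item()}")
```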

## Part 4: The DataLoader (The Driver)

Now let's put the Driver to work.

In [None]:
# Create a DataLoader
loader = DataLoader(dataset=dataset, batch_size=64, shuffle=True)

# Test the Driver
for batch in loader:
    print(f"Batch Shape: {batch.shape}")
    break # Just show the first one
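
Because 1000 is not a multiple of 64, the final batch of an epoch is smaller than the rest. The sketch below counts the batches and shows the `drop_last` option of `DataLoader`, which discards that incomplete batch. The stand-in `TensorDataset` is an assumption used to keep the block self-contained.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in for the 1000-item dataset above (assumed for illustration)
data = TensorDataset(torch.randn(1000, 5))

full_loader = DataLoader(data, batch_size=64, shuffle=True)
# TensorDataset yields one-element tuples, so unpack each batch with (batch,)
batch_sizes = [batch.shape[0] for (batch,) in full_loader]
print(f"Batches per epoch: {len(batch_sizes)}")
print(f"Last batch size:   {batch_sizes[-1]}")

# drop_last=True skips the final, incomplete batch
trimmed_loader = DataLoader(data, batch_size=64, shuffle=True, drop_last=True)
print(f"Batches with drop_last=True: {len(trimmed_loader)}")
```

Dropping the last batch can be handy when a layer or a metric assumes every batch has the same size, at the cost of skipping a few items each epoch.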

## Part 5: Saving and Loading (The Save Game)

Training can take hours. You don't want to lose progress if your computer crashes.

We use `torch.save` and `torch.load`.

In [None]:
# Define a simple model
model = nn.Linear(5, 1)

# Save the state dictionary (The Weights)
torch.save(model.state_dict(), 'model_weights.pth')
print("Model saved to model_weights.pth")

# Load it back
new_model = nn.Linear(5, 1)
new_model.load_state_dict(torch.load('model_weights.pth'))
print("Model loaded successfully!")

# Verify they are the same
print(f"Original Weight: {model.weight}")
print(f"Loaded Weight:   {new_model.weight}")
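
To resume training rather than just reload weights, a checkpoint typically also stores the optimizer state and a progress counter. The following is a minimal sketch of that pattern; the dictionary keys, the file name `checkpoint.pth`, the temporary directory, and the epoch value are assumptions, not a fixed PyTorch format.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(5, 1)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Bundle everything needed to resume into one dictionary (key names are assumed)
checkpoint = {
    "epoch": 3,
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}
checkpoint_path = os.path.join(tempfile.gettempdir(), "checkpoint.pth")
torch.save(checkpoint, checkpoint_path)

# Later: rebuild the same architecture and optimizer, then restore their states
restored_model = nn.Linear(5, 1)
restored_optimizer = optim.SGD(restored_model.parameters(), lr=0.01, momentum=0.9)

loaded = torch.load(checkpoint_path)
restored_model.load_state_dict(loaded["model_state_dict"])
restored_optimizer.load_state_dict(loaded["optimizer_state_dict"])
start_epoch = loaded["epoch"] + 1  # continue from the next epoch

print(f"Resuming from epoch {start_epoch}")
print(f"Weights match: {torch.equal(model.weight, restored_model.weight)}")
```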

## Summary Checklist

1. **Dataset** = The Librarian (Stores data).
2. **DataLoader** = The Driver (Batches data).
3. **Batch Size** = The Spoon Size (Efficiency).
4. **Checkpoint** (a saved `state_dict`) = The Save File (Weights).

You now have the core tools for feeding data to a model in batches and saving your progress, which larger Deep Learning projects build on.