# Build a mouse detector using a neural network with PyTorch

I have a cat named Pumpkin. Pumpkin is an outdoor cat who brings me surprises on a daily basis: sometimes a leaf, sometimes a mouse! In this exercise I will build a classifier that tells me when she brings a mouse back (and could potentially send me a warning!). Because we are learning about neural networks, I will use PyTorch, one of the most popular deep learning (DL) frameworks in Python.

## Step 1: Preparing data
I have a camera which takes a video of Pumpkin every time she gets home. Images from these videos were collected and manually labelled (a great shout-out to Aiken!!!) into four classes:

* cat: image contains a cat with nothing in its mouth
* empty: image does NOT contain a cat
* leaf: image contains a cat with a leaf in its mouth
* mouse: image contains a cat with a mouse in its mouth

torchvision's `ImageFolder` dataset class is an easy way to start. First we organize the photos into folders named after the labels (the loading code in Step 2 uses `./Pumpkin Dataset/` as the root folder):
```
dataset/
├── cat/
├── empty/
├── leaf/
└── mouse/
```
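
`ImageFolder` turns each folder name into an integer label by sorting the names alphabetically. The sketch below mimics that rule with the standard library only, so it is clear which index will mean "mouse". The temporary directory is a stand-in for the real photo folder, and the loop is an imitation of the rule rather than torchvision's own code; on a real dataset the mapping is available as `dataset.class_to_idx`.

```python
import os
import tempfile

# Build a throwaway folder tree with the same four label folders.
with tempfile.TemporaryDirectory() as root:
    for name in ["mouse", "leaf", "empty", "cat"]:
        os.makedirs(os.path.join(root, name))

    # Sort the subfolder names and number them from 0 (mimics ImageFolder's rule).
    classes = sorted(entry.name for entry in os.scandir(root) if entry.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(classes)}

print(class_to_idx)
```

With these four folders, "mouse" ends up as index 3, which matters later when reading the model's predictions.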

## Step 2: The PyTorch Pipeline
### 2.1 Imports and Transforms
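We need to resize all photos to the same size (e.g., 224x224), convert them to tensors (the multi-dimensional array format PyTorch computes on), and normalize each color channel with the ImageNet mean and standard deviation that pre-trained vision models expect.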

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.utils.data import DataLoader

# Image transformer
transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])  # ImageNet channel means and stds, expected by pre-trained models
]) 

# Load the dataset; ImageFolder reads the class names from the subfolder names
dataset = datasets.ImageFolder(root='./Pumpkin Dataset/', transform=transform)

# Hold out 20% of the images for testing
train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size
train_set, test_set = torch.utils.data.random_split(dataset, [train_size, test_size])

# Shuffle the training data so the model cannot learn the sample order and each batch is a varied mix
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
# No shuffling for testing: the evaluation order stays consistent and there is no shuffling overhead
test_loader = DataLoader(test_set, batch_size=32, shuffle=False)
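
To make the transform concrete, the per-pixel arithmetic of `ToTensor` followed by `Normalize` can be sketched in plain Python. `normalize_pixel` is a hypothetical helper written for illustration, not part of torchvision, and it assumes an 8-bit RGB pixel.

```python
# ImageNet channel statistics, the same values passed to transforms.Normalize
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Hypothetical helper: ToTensor-style scaling, then Normalize, for one RGB pixel."""
    scaled = [value / 255.0 for value in rgb]  # ToTensor: 0..255 -> 0.0..1.0
    # Normalize: (x - mean) / std, channel by channel
    return [(s - m) / sd for s, m, sd in zip(scaled, MEAN, STD)]

black = normalize_pixel([0, 0, 0])
white = normalize_pixel([255, 255, 255])
print([round(v, 3) for v in black])
print([round(v, 3) for v in white])
```

After normalization the values no longer sit in the 0 to 1 range: a black pixel maps to roughly -2.1 in the red channel and a white pixel to roughly 2.6 in the blue channel.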

### 2.2 The Model (Transfer Learning)
Instead of building a neural network from scratch (which is very hard for images and usually needs far more labelled photos than a home camera provides), we use transfer learning. We take a pre-trained model like ResNet18 (trained on ImageNet, more than a million images across 1000 classes) and just change the final layer to output our 4 classes.

In [None]:
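# Load a ResNet18 pre-trained on ImageNet
# (the older `pretrained=True` argument is deprecated since torchvision 0.13)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze all pre-trained layers (so training does not overwrite the pre-trained weights)
for param in model.parameters():
    param.requires_grad = False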

# Replace the final layer (fc) with a new one for our 4 categories
# ResNet18's original final layer has an input size of 512 and an output size of 1000 (the ImageNet classes)
# num_ftrs is the input size of the original final layer, required to create the new layer
num_ftrs = model.fc.in_features
# The new layer has 4 outputs (one per class); its parameters are trainable by default
model.fc = nn.Linear(num_ftrs, 4)

# Select the GPU if available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Move the model to the selected device
model = model.to(device)

### 2.3 Training the model and "Targeting" the Mouse

In this training example we pass over the training dataset 5 times, i.e. for 5 "epochs". An intuitive (if simplified) way to think about epochs: with each pass the network refines what it has learned about the data, e.g. early epochs may pick up coarse distinctions such as cat vs. no cat, while later epochs may pick up finer ones such as a cat carrying a mouse.

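* too few epochs: the model may underfit
* too many epochs: the model may overfit

We will want to watch the loss value from epoch to epoch to decide when to stop: a loss that is still falling suggests more epochs could help, while a loss that has flattened suggests little is left to gain.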
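
One common way to turn "watch the loss" into a rule is early stopping: stop once the loss has not improved for a few epochs in a row. The sketch below is a minimal version under stated assumptions: `should_stop`, `patience`, and `min_delta` are hypothetical names, and the loss values are made-up numbers for illustration (ideally they would be measured on held-out data rather than on the training set).

```python
def should_stop(losses, patience=2, min_delta=0.0):
    """Return True if none of the last `patience` epochs improved on the
    best earlier loss by more than `min_delta`."""
    if len(losses) <= patience:
        return False  # not enough history yet
    best_before = min(losses[:-patience])
    recent_best = min(losses[-patience:])
    return recent_best >= best_before - min_delta

# Illustrative per-epoch losses (made-up numbers): improvement stalls after epoch 3
history = [0.90, 0.55, 0.40, 0.41, 0.43]

for epoch in range(1, len(history) + 1):
    if should_stop(history[:epoch]):
        print(f"stop after epoch {epoch}")
        break
```

A larger `patience` tolerates noisy losses at the cost of a few extra epochs; with only 5 epochs in this exercise, a value of 1 or 2 is about all that makes sense.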

In [None]:
# Cross-entropy is the standard loss for multi-class classification: it penalizes low predicted probability for the correct label
criterion = nn.CrossEntropyLoss()
# Only optimize the final layer's parameters
# The optimizer uses the gradients computed by loss.backward() to update those weights
optimizer = optim.Adam(model.fc.parameters(), lr=0.001)

train_losses = []
train_accuracies = []

for epoch in range(5):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0
    
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)

        # Clear the gradients left over from the previous batch
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        # Backpropagate: compute the gradient of the loss with respect to the trainable weights (here, the final layer)
        loss.backward()
        # Update the weights using those gradients
        optimizer.step()
        
        # Track loss and accuracy for progress reporting
        running_loss += loss.item()
        predicted = outputs.argmax(dim=1)  # index of the highest-scoring class
        total += labels.size(0)
        correct += (predicted == labels).sum().item()
    
    epoch_loss = running_loss / len(train_loader)
    epoch_acc = 100 * correct / total
    train_losses.append(epoch_loss)
    train_accuracies.append(epoch_acc)
    
    print(f"Epoch {epoch+1}: Loss = {epoch_loss:.4f}, Accuracy = {epoch_acc:.2f}%")
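
Overall accuracy can hide how well the model handles the one class we care about. A hedged sketch of a mouse-focused check: given true and predicted class indices (which could be collected by running the model over `test_loader`), compute precision and recall for the mouse class. `precision_recall` is a hypothetical helper, the label lists are made-up illustration data, and index 3 for mouse assumes `ImageFolder`'s alphabetical folder ordering (cat, empty, leaf, mouse).

```python
def precision_recall(y_true, y_pred, target):
    """Hypothetical helper: precision and recall for one class index."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # mice correctly flagged
    fp = sum(1 for t, p in pairs if t != target and p == target)  # false alarms
    fn = sum(1 for t, p in pairs if t == target and p != target)  # missed mice
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

MOUSE = 3  # assumes alphabetical class order: cat=0, empty=1, leaf=2, mouse=3

# Made-up labels for illustration only
y_true = [0, 3, 3, 2, 1, 3, 0, 2]
y_pred = [0, 3, 2, 2, 1, 3, 3, 2]

precision, recall = precision_recall(y_true, y_pred, MOUSE)
print(f"mouse precision = {precision:.2f}, recall = {recall:.2f}")
```

For a warning system, recall (the share of real mice that get flagged) is arguably the number to watch, even if it comes with a few false alarms.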