# Rock Paper Scissors with Jetbot - Train Model
Train an image classifier to detect three classes, ``rock``, ``paper``, and ``scissor``, which we'll use to calculate the result of each game.

*PyTorch* provides the pre-trained model, which we adapt to our classes with transfer learning.

## 1. Initialization
* torch - PyTorch
* optim - contains optimization algorithms
* functional - common NN functions
* torchvision - popular datasets, architectures, and image transformations
* datasets - ImageFolder for accessing dataset
* models - contains AlexNet
* transforms - pre-process images

In [None]:
import torch
import torch.optim as optim
import torch.nn.functional as func
import torchvision
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

### 1. Create and Pre-process Image Dataset

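Now we use the ``ImageFolder`` dataset class from the ``torchvision.datasets`` package, which expects one subfolder per class inside the ``dataset`` directory. The transforms attached from the ``torchvision.transforms`` package jitter the colors, resize each image to 224x224, convert it to a tensor, and normalize it with the ImageNet channel means and standard deviations.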

In [None]:
dataset = datasets.ImageFolder(
    'dataset',
    transforms.Compose([
        transforms.ColorJitter(0.1, 0.1, 0.1, 0.1),
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
)
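
To make the last transform concrete: ``Normalize`` subtracts the per-channel mean and divides by the per-channel standard deviation. The following is a minimal sketch of that arithmetic on a dummy tensor rather than a real image; the helper name ``normalize_image`` is hypothetical and is not part of ``torchvision``.

```python
import torch

# Channel statistics copied from the Normalize call in this notebook.
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

def normalize_image(image):
    """Hypothetical helper: per-channel (value - mean) / std on a CxHxW tensor."""
    return (image - MEAN) / STD

# Dummy 3-channel image with every pixel at 0.5 (ToTensor yields values in [0, 1]).
dummy_image = torch.full((3, 2, 2), 0.5)
normalized = normalize_image(dummy_image)
print(normalized[:, 0, 0])
```

Each channel ends up with a different value because each has its own mean and standard deviation.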

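### 2. Split Dataset

Next, we split the dataset. The intended design uses *training*, *validation*, and *test* sets.
The validation set is used to verify and improve model accuracy.
The test set is run once for the final accuracy.
The cell below creates only two of these, a training set and a held-out set of 50 images named ``test_dataset``, which the training loop evaluates after every epoch.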

In [None]:
train_dataset, test_dataset = torch.utils.data.random_split(dataset, [len(dataset) - 50, 50])
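
If a separate validation set is wanted in addition to the test set, ``random_split`` accepts three lengths just as easily as two. The sketch below assumes 50 images for each held-out set and uses a ``TensorDataset`` of dummy values as a stand-in for the real ``ImageFolder`` dataset; the sizes, the seed, and the variable names are illustrative assumptions, not values from this notebook.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in for the ImageFolder dataset: 200 dummy samples.
stand_in_dataset = TensorDataset(torch.arange(200))

VAL_SIZE = 50   # assumed size of the validation set
TEST_SIZE = 50  # same held-out size the notebook uses for its test set
train_size = len(stand_in_dataset) - VAL_SIZE - TEST_SIZE

# A seeded generator makes the split reproducible across runs.
generator = torch.Generator().manual_seed(0)
train_part, val_part, test_part = random_split(
    stand_in_dataset,
    [train_size, VAL_SIZE, TEST_SIZE],
    generator=generator,
)
print(len(train_part), len(val_part), len(test_part))
```

With such a split, the validation part would drive model selection during training and the test part would be evaluated once at the end.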

### 3. Create Data Loaders

We create a ``DataLoader`` instance for each dataset. A ``DataLoader`` provides utilities for shuffling data, producing *batches* of images, and loading samples in parallel with multiple workers.

In [None]:
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)

test_loader = torch.utils.data.DataLoader(
    test_dataset,
    batch_size=16,
    shuffle=True,
    num_workers=4
)
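
To see what a loader produces, the following sketch iterates over one built from dummy tensors. The tiny 8x8 images, the all-zero labels, and ``num_workers=0`` are assumptions chosen to keep the example light; they are not the settings used above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 50 dummy images (3 channels, 8x8) with dummy labels.
dummy_images = torch.zeros(50, 3, 8, 8)
dummy_labels = torch.zeros(50, dtype=torch.long)

loader = DataLoader(
    TensorDataset(dummy_images, dummy_labels),
    batch_size=16,
    shuffle=True,
    num_workers=0,  # load in the main process for this small example
)

# Each iteration yields one batch of images and the matching labels.
batch_shapes = [tuple(images.shape) for images, labels in loader]
print(len(loader), batch_shapes)
```

With 50 samples and a batch size of 16, the loader yields three full batches and a final batch of 2.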

### 4. Define the Neural Network
We use transfer learning with a pre-trained ``AlexNet`` model from ``torchvision``, so that the features it has already learned can be reused.

In [None]:
model = models.alexnet(pretrained=True)

Replace the final layer with an untrained layer that has 3 outputs, one per class.

In [None]:
model.classifier[6] = torch.nn.Linear(model.classifier[6].in_features, 3)
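
The same replacement pattern can be tried on a small stand-in network without downloading any weights. In this sketch the ``nn.Sequential`` head and its layer sizes are made up for illustration; only the idea of reading ``in_features`` from the old layer and assigning a new ``Linear`` layer in its place mirrors the cell above.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a classifier head; sizes are arbitrary.
classifier = nn.Sequential(
    nn.Linear(32, 16),
    nn.ReLU(),
    nn.Linear(16, 10),  # original output layer with 10 classes
)

# Swap the last layer for a fresh one with 3 outputs, keeping its input size.
classifier[2] = nn.Linear(classifier[2].in_features, 3)

# A batch of 4 dummy feature vectors now yields 3 scores per sample.
outputs = classifier(torch.zeros(4, 32))
print(outputs.shape)
```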

Transfer the model to execute on the GPU using CUDA.

In [None]:
device = torch.device('cuda')
model = model.to(device)
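
``torch.device('cuda')`` assumes a CUDA-capable GPU is present, as on the Jetbot. If the notebook might also be run on a machine without one, a fallback along these lines is a common pattern; it is a sketch and not part of the original notebook.

```python
import torch

# Use the GPU when CUDA is available, otherwise fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Tensors (and models) are moved with .to(device) in either case.
tensor_on_device = torch.zeros(1).to(device)
print(device)
```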

### 5. Train the Neural Network
Train the model for 30 epochs, evaluating on the held-out set after each epoch and saving the model whenever its accuracy beats the best so far.

Learning rate was optimized using X.

Typical momentum values are 0.5, 0.9, and 0.99; 0.9 was selected.
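
As a rough illustration of what the momentum setting does, the sketch below applies the update rule that PyTorch documents for ``optim.SGD`` with momentum (no dampening, no Nesterov) to a single scalar parameter. The function name ``sgd_momentum_step`` and the constant gradient are assumptions made for the example.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """Hypothetical helper: one SGD-with-momentum step on a scalar parameter."""
    # The velocity accumulates a decaying sum of past gradients.
    velocity = momentum * velocity + grad
    # The parameter moves against the accumulated velocity.
    param = param - lr * velocity
    return param, velocity

param, velocity = 1.0, 0.0
history = []
for _ in range(2):
    # A constant gradient of 2.0 stands in for a real loss gradient.
    param, velocity = sgd_momentum_step(param, 2.0, velocity)
    history.append((param, velocity))
print(history)
```

With a constant gradient the velocity grows from step to step, so the second update is larger than the first; a higher momentum value makes this effect stronger.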

In [None]:
NUM_EPOCHS = 30
BEST_MODEL_PATH = 'rps_model.pth'
best_accuracy = 0.0

optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(NUM_EPOCHS):

    # Training pass: enable dropout and update the weights.
    model.train()
    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = func.cross_entropy(outputs, labels)
        loss.backward()
        optimizer.step()

    # Evaluation pass: disable dropout and gradient tracking.
    model.eval()
    test_error_count = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            outputs = model(images)
            # Count each misclassified image once; with three classes,
            # abs(label - prediction) can be 2 and would double-count.
            test_error_count += int((outputs.argmax(1) != labels).sum())

    test_accuracy = 1.0 - test_error_count / len(test_dataset)
    print('%d: %f' % (epoch, test_accuracy))
    # Keep the weights of the best-performing epoch so far.
    if test_accuracy > best_accuracy:
        torch.save(model.state_dict(), BEST_MODEL_PATH)
        best_accuracy = test_accuracy
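
The accuracy in the loop is one minus the fraction of misclassified test images. With three class indices (0, 1, 2), each wrong prediction should count exactly once; summing ``abs(label - prediction)`` gives that count only when there are two classes. A small sketch with made-up labels and predictions illustrates the difference:

```python
import torch

# Made-up ground truth and predictions for four images.
labels = torch.tensor([0, 1, 2, 2])
predictions = torch.tensor([2, 1, 0, 2])

# Counting mismatches treats every wrong prediction as one error.
mismatches = int((predictions != labels).sum())

# Summing absolute differences weights a 0-vs-2 confusion as two errors.
abs_difference = int(torch.abs(labels - predictions).sum())

accuracy = 1.0 - mismatches / len(labels)
print(mismatches, abs_difference, accuracy)
```

Here two of the four predictions are wrong, so the accuracy is 0.5, while the absolute-difference sum of 4 would imply an accuracy of 0.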