# Developing an AI application

In this project, I'll train an image classifier to recognize different species of flowers. You can imagine using something like this in a phone app that tells you the name of the flower your camera is looking at. In practice you'd train this classifier, then export it for use in your application. I'll be using [this dataset](http://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) of 102 flower categories; a few examples are shown below.

<img src='assets/Flowers.png' width=500px>

Going forward, AI algorithms will be incorporated into more and more everyday applications. For example, we might want to include an image classifier in a smartphone app. To do this, we'd use a deep learning model trained on hundreds of thousands of images as part of the overall application architecture. A large part of future software development will consist of using these types of models as common parts of applications.

I break down the project into multiple steps:

* Load and preprocess the image dataset
* Train the image classifier on your dataset
* Use the trained classifier to predict image content

The final product is an application that can be trained on any set of labeled images. Here my network will be learning about flowers and end up as a command line application. Further applications are up to our imagination. For example, imagine an app where you take a picture of a car, it tells you what the make and model is, then looks up information about it. The possibilities are endless.

I completed this project as a part of the Udacity Data Scientist Nanodegree program.

In [1]:
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import time

import torch
from torch import nn
from torch import optim
from torch.utils.data import DataLoader
import torch.nn.functional as F
from torchvision import datasets, transforms
import warnings
warnings.filterwarnings('ignore')

import wandb

from models.resnet import resnet50, resnet152
from models.vgg import select_model
from mbs import MicroBatchStreaming

# Data loader arguments

In [2]:
# Get arguments
model_arch: int = 50
image_size: int = 224
batch_size: int = 8
num_workers: int = 0
pin_memory: bool = False
use_batch_norm: bool = False
device: torch.device = torch.device('cuda')

## Get wandb arguments
use_wandb: bool = True
exp: str = 'flower1'

## Get arguments of MBS
use_mbs: bool = False
micro_size: int = 4
micro_bn: bool = False

## optimizer
learning_rate = 0.001

## Load the data

Here I'll use `torchvision` to load the data ([documentation](http://pytorch.org/docs/0.3.0/torchvision/index.html)). The data should be included alongside this notebook; otherwise you can [download it here](https://s3.amazonaws.com/content.udacity-data.com/nd089/flower_data.tar.gz). The dataset is split into three parts: training, validation, and testing. For training, I apply random transformations (rotation and horizontal flipping in the code below), which helps the network generalize and leads to better performance. The input data is resized and cropped to 224x224 pixels, as required by the pre-trained networks.

The validation and testing sets are used to measure the model's performance on data it hasn't seen yet. For this I won't use any scaling or rotation transformations, but I'll resize then crop the images to the appropriate size.
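
`ImageFolder` expects one subfolder per class inside each split directory and assigns integer labels from the sorted subfolder names. The sketch below rebuilds a tiny version of that layout with the standard library only. The three category names and the helper `class_to_index` are illustrative assumptions, not part of this notebook, and the helper only mimics the sorting behavior rather than calling `torchvision`.

```python
import os
import tempfile


def class_to_index(split_dir):
    """Mimic ImageFolder's labeling: sorted subfolder names map to 0..N-1."""
    classes = sorted(entry.name for entry in os.scandir(split_dir) if entry.is_dir())
    return {name: idx for idx, name in enumerate(classes)}


with tempfile.TemporaryDirectory() as root:
    # Hypothetical miniature dataset: three splits, three category folders each
    for split in ("train", "valid", "test"):
        for category in ("1", "2", "10"):
            os.makedirs(os.path.join(root, split, category))

    mapping = class_to_index(os.path.join(root, "train"))

print(mapping)  # {'1': 0, '10': 1, '2': 2}
```

Because the names are sorted as strings, folder `10` comes before `2`, so a label index is not the same as the numeric category id. On a real `ImageFolder` dataset the actual mapping is available as `train_data.class_to_idx`.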

The pre-trained networks were trained on the ImageNet dataset where each color channel was normalized separately. For all three sets I'll normalize the images with the means and standard deviations the network expects. The means are `[0.485, 0.456, 0.406]` and the standard deviations are `[0.229, 0.224, 0.225]`, calculated from the ImageNet images. Subtracting the mean and dividing by the standard deviation shifts each color channel to be roughly centered at 0 with a standard deviation close to 1.
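
To make the effect of the normalization concrete, the following minimal sketch applies the same per-channel formula, `(x - mean) / std`, to the smallest and largest values that `ToTensor` can produce (0 and 1). It uses plain Python rather than `torchvision`, and the helper name `normalize` is hypothetical.

```python
# ImageNet channel statistics quoted in the text (R, G, B)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize(value, channel):
    """Apply the per-channel normalization (x - mean) / std to one pixel value."""
    return (value - mean[channel]) / std[channel]


for channel, name in enumerate(("R", "G", "B")):
    lowest = normalize(0.0, channel)   # darkest possible pixel after ToTensor
    highest = normalize(1.0, channel)  # brightest possible pixel after ToTensor
    print(f"{name}: [{lowest:.3f}, {highest:.3f}]")
```

Across the three channels the normalized values span roughly -2.12 to 2.64, and a pixel equal to the channel mean maps to exactly 0.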
 

In [3]:
# Define dir paths
data_dir = 'flowers'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'
test_dir = data_dir + '/test'

# Define your transforms for the training, validation, and testing sets

# Add random transforms in training set for better generalization
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.Resize(image_size),
                                       transforms.CenterCrop(image_size),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406],
                                                            [0.229, 0.224, 0.225])])

test_valid_transforms = transforms.Compose([transforms.Resize(image_size), 
                                      transforms.CenterCrop(image_size),
                                      transforms.ToTensor(),
                                      transforms.Normalize([0.485, 0.456, 0.406],
                                                           [0.229, 0.224, 0.225])])

# Load the datasets with ImageFolder
train_data = datasets.ImageFolder(train_dir, transform=train_transforms)
test_data = datasets.ImageFolder(test_dir, transform=test_valid_transforms)
valid_data = datasets.ImageFolder(valid_dir, transform=test_valid_transforms)

# Using the image datasets and the transforms, define the dataloaders
trainloader = DataLoader(train_data, batch_size=batch_size, shuffle=True, pin_memory=pin_memory, num_workers=num_workers)
testloader = DataLoader(test_data, batch_size=batch_size, pin_memory=pin_memory, num_workers=num_workers)
validloader = DataLoader(valid_data, batch_size=batch_size, pin_memory=pin_memory, num_workers=num_workers)

# Get num of classes

In [4]:
# Get dataloader arguments
num_classes: int = len(train_data.classes)
data_type: str = 'flowers'

In [5]:
# Select the model architecture and define the W&B run name and tags
model: torch.nn.modules.Module
name = None
if model_arch == 50:
    name = 'resnet-50'
    model = resnet50(num_classes)
elif model_arch == 152:
    name = 'resnet-152'
    model = resnet152(num_classes)
else:
    name = 'vgg-{}'.format(model_arch)
    model = select_model(use_batch_norm, model_arch, num_classes)

tags = []
tags.append( f'batch {batch_size}' )
tags.append( f'image {image_size}' )
tags.append( f'worker {num_workers}' )
tags.append( f'pin memory {pin_memory}' )
tags.append( f'{data_type} {num_classes}')
tags.append( f'exp {exp}')
if use_mbs:
    # Leading space keeps the run name readable, e.g. 'resnet-50 with MBS'
    if micro_bn:
        name += ' with MBS-BN'
    else:
        name += ' with MBS'
    tags.append( f'micro {micro_size}')
else:
    tags.append( 'base' )

if use_wandb:
    wandb.init(
        project='mbs_paper_results',
        entity='xypiao97',
        name=f'{name}',
        tags=tags,
    )

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
wandb: Currently logged in as: xypiao97 (use `wandb login --relogin` to force relogin)


# MBS

In [6]:
model = model.to(device)
criterion = nn.NLLLoss().to(device)
optimizer = optim.Adam(model.parameters(), lr = learning_rate)

mbs_trainer = None
if use_mbs:
    mbs_trainer, model = MicroBatchStreaming(
                dataloader=trainloader,
                model=model,
                criterion=criterion,
                optimizer=optimizer,
                lr_scheduler=None,
                device_index=0,
                batch_size=batch_size,
                micro_batch_size=micro_size,
                bn_factor=micro_bn,
            ).get_trainer()

# Building and training the classifier

Now that the data is ready, it's time to build and train the classifier. I use one of the pretrained models from `torchvision.models` to get the image features. I build and train a new feed-forward classifier using those features.


I implement the following:
* Load a [pre-trained network](http://pytorch.org/docs/master/torchvision/models.html) (As a starting point, the VGG networks work great and are straightforward to use)
* Define a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
* Train the classifier layers with backpropagation, using the pre-trained network to extract the features
* Track the loss and accuracy on the validation set to determine the best hyperparameters

During training, I update only the weights of the feed-forward network. I try different hyperparameters (learning rate, units in the classifier, epochs, etc.) to find the best model. I save those hyperparameters to use as default values in the next part of the project.
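
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this notebook: `features` is a tiny hypothetical stand-in for a pre-trained network (it is randomly initialized here), the layer sizes are arbitrary, and the input is random data. It shows freezing the feature extractor, attaching a new classifier with ReLU and dropout, and giving only the classifier's parameters to the optimizer.

```python
import torch
from torch import nn, optim

# Hypothetical stand-in for a pre-trained feature extractor (assumption, not a real model)
features = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the feature extractor so backpropagation does not update it
for param in features.parameters():
    param.requires_grad = False

# New, untrained classifier; LogSoftmax output pairs with NLLLoss
classifier = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Dropout(p=0.2),
    nn.Linear(16, 102),
    nn.LogSoftmax(dim=1),
)
model_sketch = nn.Sequential(features, classifier)

criterion_sketch = nn.NLLLoss()
# Only the classifier's parameters are handed to the optimizer
optimizer_sketch = optim.Adam(classifier.parameters(), lr=0.001)

# One training step on random data, just to exercise the setup
images = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 102, (4,))
frozen_before = [p.clone() for p in features.parameters()]

optimizer_sketch.zero_grad()
loss = criterion_sketch(model_sketch(images), labels)
loss.backward()
optimizer_sketch.step()

trainable = sum(p.numel() for p in model_sketch.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")
```

After the step, the frozen weights are unchanged and have no gradients, while the classifier weights have been updated. With a real pre-trained model the same pattern applies, with the classifier's input size matched to the feature extractor's output size.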

In [7]:
# Define validation function
def validation(model, testloader, criterion, device):
    test_loss = 0
    accuracy = 0

    for images, labels in testloader:
        images, labels = images.to(device), labels.to(device)

        # As in the training loop, the model output is passed through
        # log-softmax because NLLLoss expects log-probabilities
        output = F.log_softmax(model(images), dim=1)
        test_loss += criterion(output, labels).item()

        # The predicted class is the index of the largest log-probability
        equality = (labels == output.argmax(dim=1))
        accuracy += equality.float().mean().item()

    return test_loss, accuracy

In [8]:
# Define model test function
def test_model(model: torch.nn.modules.Module, testloader=testloader, device: str = 'cuda'):
    model.to(device)
    model.eval()
    accuracy = 0.0

    # Gradients are not needed for evaluation; disabling them saves GPU memory
    with torch.no_grad():
        for images, labels in testloader:
            images, labels = images.to(device), labels.to(device)

            output: torch.Tensor = model(images)

            # argmax over the raw output selects the same class as over exp(output)
            equality = (labels == output.argmax(dim=1))
            accuracy += equality.float().mean().item()

    accuracy = accuracy / len(testloader)
    print('Testing Accuracy: {:.3f}'.format(accuracy))
    return accuracy

In [9]:
# Define baseline training function
def baseline_trian(
    model: torch.nn.modules.Module, 
    criterion,
    optimizer,
    n_epoch: int = 2,
    device: str = 'cuda'
):
    epochs = n_epoch
    steps = 0 
    running_loss = 0
    print_every = 40

    for e in range(epochs):
        start = time.time()
        # Reset per epoch so the reported loss covers only the current epoch
        one_epoch_loss = 0
        model.train()
        for images, labels in trainloader:
            images, labels = images.to(device), labels.to(device)

            steps += 1

            optimizer.zero_grad()

            output: torch.Tensor = model.forward(images)
            output: torch.Tensor = nn.LogSoftmax(dim=1)(output)
            loss: torch.Tensor = criterion(output, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()
            one_epoch_loss += loss.item()

            # if steps % print_every == 0:
            #     model.eval()
            #     with torch.no_grad():
            #         val_loss, val_accuracy = validation(model, validloader, criterion, device)

            #     print("Epoch: {}/{} - ".format(e+1, epochs),
            #           "Training Loss: {:.3f} - ".format(running_loss/print_every),
            #           "Validation Loss: {:.3f} - ".format(val_loss/len(validloader)),
            #           "Validation Accuracy: {:.3f}".format(val_accuracy/len(validloader)))

            #     running_loss = 0
            #     model.train()
        one_epoch_time = time.time() - start
        test_accuracy = test_model(model)
        print(  f'[{e+1}/{epochs}][{len(trainloader)}]',
                'epoch time: {:.2f}'.format( one_epoch_time ),
                'train loss: {:.4f}'.format( one_epoch_loss/len(trainloader) ),
                'test accuracy: {:.4f}'.format( test_accuracy ),
                # 'last val loss: {:.4f}'.format(val_loss),
                # 'last val accuracy: {:.4f}'.format(val_accuracy)
            )
        if use_wandb:
            # Log the same per-batch average loss that is printed above
            wandb.log( {'train loss': one_epoch_loss/len(trainloader)}, step=e+1 )
            wandb.log( {'epoch time' : one_epoch_time}, step=e+1)
            wandb.log( {'accuracy': test_accuracy}, step=e+1 )
            # wandb.log( {'val loss' : val_loss}, step=e+1)
            # wandb.log( {'val accuracy': val_accuracy}, step=e+1 )
    return model

In [10]:
def mbs_train(
    mbs_trainer: MicroBatchStreaming, model, n_epoch: int = 100
):
    epochs = n_epoch
    
    for e in range(epochs):
        start = time.perf_counter()
        mbs_trainer.train()
        end = time.perf_counter()

        one_epoch_time = end - start
        test_accuracy = test_model(model)
        print(  f'[{e+1}/{epochs}][{len(trainloader)}]',
                'epoch time: {:.2f}'.format( one_epoch_time ),
                'train loss: {:.4f}'.format( mbs_trainer.get_loss() ),
                'test accuracy: {:.4f}'.format( test_accuracy ),
            )
        if use_wandb:
            wandb.log( {'train loss': mbs_trainer.get_loss()}, step=e+1 )
            wandb.log( {'epoch time' : one_epoch_time}, step=e+1)
            wandb.log( {'accuracy': test_accuracy}, step=e+1 )
            # wandb.log( {'val loss' : val_loss}, step=e+1)


# Train

In [11]:
if use_mbs:
    mbs_train( mbs_trainer, model, n_epoch=100 )
else:
    baseline_trian( model, criterion, optimizer, n_epoch=100 )

RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 23.70 GiB total capacity; 19.66 GiB already allocated; 226.94 MiB free; 19.98 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF