# Image-Classifier-TensorFlow-Demo

In this demonstration I train an image classifier with the PyTorch framework to classify 102 species of flowers. For training I use [this dataset](http://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) of 102 flower categories. A few examples are shown below.

<img src='assets/Flowers.png' width=500px>

The project is broken down into multiple steps:

* Load and preprocess the image dataset
* Train the image classifier on the dataset
* Use the trained classifier to predict image content

In [2]:
import torch
from torch import nn, optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models

from matplotlib import pyplot as plt
from PIL import Image
import numpy as np
import time
import json

# Use GPU if it's available, else use cpu
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Run the cuDNN autotuner to select the kernels with the best performance
torch.backends.cudnn.benchmark = True

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

## Load the data

Here we'll use `torchvision` to load the data ([documentation](http://pytorch.org/docs/0.3.0/torchvision/index.html)). The dataset is split into three parts: training, validation, and testing. You can [download the dataset here](https://s3.amazonaws.com/content.udacity-data.com/nd089/flower_data.tar.gz). For the training set, we apply random transformations such as rotation, scaling, cropping, and flipping. This augmentation helps the network generalize, which leads to better performance. The input images must also end up at 224x224 pixels, as required by the pre-trained networks.

The validation and testing sets are used to measure the model's performance on data it hasn't seen yet. For these sets we apply no random scaling or rotation; we only resize and then center-crop the images to the appropriate size.

The pre-trained networks we'll use were trained on the ImageNet dataset, where each color channel was normalized separately. All three sets therefore need to be normalized with the means and standard deviations the network expects: `[0.485, 0.456, 0.406]` for the means and `[0.229, 0.224, 0.225]` for the standard deviations, both calculated from the ImageNet images. Subtracting the mean and dividing by the standard deviation shifts each color channel to be roughly centered at 0 with a standard deviation near 1. Because `ToTensor` produces pixel values in [0, 1], the normalized values fall between roughly -2.1 and 2.6 rather than in a fixed -1 to 1 range.
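As a quick check of what this normalization does to the value range, the sketch below applies `(x - mean) / std` to the smallest and largest possible pixel values. It assumes inputs already scaled to [0, 1], as `ToTensor` produces; the `ranges` dictionary is only an illustrative name.

```python
# ImageNet channel statistics quoted above
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Normalized value of the darkest (0.0) and brightest (1.0) pixel per channel
ranges = {}
for channel, m, s in zip("RGB", mean, std):
    ranges[channel] = ((0.0 - m) / s, (1.0 - m) / s)

for channel, (low, high) in ranges.items():
    print(f"{channel}: [{low:.3f}, {high:.3f}]")
```

The lower bounds come out near -2.1 to -1.8 and the upper bounds near 2.2 to 2.6, depending on the channel.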

In [3]:
data_dir = 'flowers'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'
test_dir = data_dir + '/test'

In [4]:
# Define transforms for the training, validation, and testing sets
train_transforms = transforms.Compose([transforms.RandomRotation(30),
                                       transforms.RandomHorizontalFlip(),
                                       transforms.RandomResizedCrop(224, scale=(0.5, 1.0)),  # scale is a fraction of the image area, so at most 1.0
                                       transforms.ColorJitter(brightness=0.03, hue=0.03),
                                       transforms.ToTensor(),
                                       transforms.Normalize([0.485, 0.456, 0.406], 
                                                            [0.229, 0.224, 0.225])])

valid_test_transforms = transforms.Compose([transforms.Resize(224), 
                                            transforms.CenterCrop(224),
                                            transforms.ToTensor(),
                                            transforms.Normalize([0.485, 0.456, 0.406], 
                                                                 [0.229, 0.224, 0.225])])

# Load the datasets with ImageFolder
train_data = datasets.ImageFolder(train_dir, transform=train_transforms)

valid_data = datasets.ImageFolder(valid_dir, transform=valid_test_transforms)

test_data = datasets.ImageFolder(test_dir, transform=valid_test_transforms)

# Using the image datasets and the transforms, define the dataloaders
trainloader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True, num_workers=6, pin_memory=True)

validloader = torch.utils.data.DataLoader(valid_data, batch_size=64, shuffle=False, num_workers=6, pin_memory=True)

testloader = torch.utils.data.DataLoader(test_data, batch_size=64, shuffle=False, num_workers=6, pin_memory=True)

### Label mapping

We'll need to load a mapping from category label to category name. It is stored in the file `cat_to_name.json`, a JSON object that we can read with the [`json` module](https://docs.python.org/2/library/json.html). This gives us a dictionary mapping the integer-encoded categories (stored as string keys) to the actual names of the flowers.

In [5]:
with open('cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)
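
One detail worth keeping in mind: `ImageFolder` assigns class indices by sorting the folder names as strings, so the index the model predicts is not the category label itself. The sketch below shows one way to get from a predicted index back to a flower name. The two small dictionaries are illustrative stand-ins for `cat_to_name` and `train_data.class_to_idx`, and `index_to_name` is a hypothetical helper.

```python
# Illustrative stand-ins for the real mappings
cat_to_name = {"1": "pink primrose", "10": "globe thistle", "21": "fire lily"}
class_to_idx = {"1": 0, "10": 1, "21": 2}  # folder name -> index, sorted as strings

# Invert class_to_idx so a predicted index leads back to the folder (category) label
idx_to_class = {idx: cat for cat, idx in class_to_idx.items()}


def index_to_name(idx):
    """Map a predicted class index to the flower name."""
    return cat_to_name[idx_to_class[idx]]


print(index_to_name(1))
```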

## Building and training the model and classifier

Now that the data is ready, it's time to build and train the classifier. We will be using one of the pretrained models from `torchvision.models` to get the image features. Many pretrained classification models are available and can be found [here](https://pytorch.org/vision/stable/models.html). We will need to:

* Load a pretrained network and freeze the model's weights
* Define a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
* Train the classifier layers with backpropagation, using the pre-trained network to extract the features
* Track the loss and accuracy on the validation set to determine the best hyperparameters

During training we update only the weights of the new classifier, while the frozen weights of the pretrained model are used to detect image features. We will then try several hyperparameter combinations (learning rate, units in the classifier, epochs, etc.) to see if we can increase the model's validation accuracy.
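
To illustrate what freezing does, here is a minimal sketch on a toy network. The `backbone`, `head`, and `count_parameters` names are hypothetical stand-ins, not the pretrained model used in this notebook. Parameters with `requires_grad = False` receive no gradients, so an optimizer built on the head's parameters only ever changes the head.

```python
import torch
from torch import nn

# Hypothetical stand-ins for a pretrained feature extractor and a new classifier
backbone = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
head = nn.Linear(4, 2)
toy = nn.Sequential(backbone, head)

# Freeze the backbone so backpropagation skips its weights
for param in backbone.parameters():
    param.requires_grad = False


def count_parameters(module):
    """Return (trainable, total) parameter counts for a module."""
    trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
    total = sum(p.numel() for p in module.parameters())
    return trainable, total


trainable, total = count_parameters(toy)
print(f"Trainable parameters: {trainable} of {total}")

# After a backward pass only the head has gradients
toy(torch.randn(3, 8)).sum().backward()
print(backbone[0].weight.grad is None, head.weight.grad is not None)
```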

First I will define some functions and a class to help build the classifier and train the model.

In [6]:
# Name of the pretrained ImageNet weights to load
weights = 'IMAGENET1K_V1'

# Load model using pretrained weights
model = models.maxvit_t(weights=weights)

# Freeze parameters so we don't backprop through them
for param in model.parameters():
    param.requires_grad = False


In [7]:
class Network(nn.Module):
    def __init__(self, input_size, output_size, hidden_layers, drop_p=0.5):
        ''' Builds a feedforward network with arbitrary hidden layers.
        
            Arguments
            ---------
            input_size: integer, size of the input layer
            output_size: integer, size of the output layer
            hidden_layers: list of integers, the sizes of the hidden layers
            drop_p: float, dropout probability
        '''
        super().__init__()
        
        self.drop_p = drop_p
        
        # Input to a hidden layer
        self.hidden_layers = nn.ModuleList([nn.Linear(input_size, hidden_layers[0])])
        
        # Add a variable number of more hidden layers
        layer_sizes = zip(hidden_layers[:-1], hidden_layers[1:])
        self.hidden_layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes])
        
        self.output = nn.Linear(hidden_layers[-1], output_size)
        
    def forward(self, x):
        ''' Forward pass through the network, returns the log-probabilities '''
        
        for each in self.hidden_layers:
            x = F.relu(each(x))
            # Apply dropout only in training mode (F.dropout defaults to training=True)
            x = F.dropout(x, p=self.drop_p, training=self.training)
        x = self.output(x)
        
        return F.log_softmax(x, dim=1)

In [8]:
def train(model, trainloader, validloader, criterion, optimizer, epochs=10, scheduler=None):
    ''' Trains the model on training data for number of epochs
    
        Arguments
        ---------
        model: model to train
        trainloader: training data loader
        validloader: validation data loader
        criterion: loss function
        optimizer: optimizer to update model parameters
        epochs: number of epochs to train the model
        scheduler: learning rate scheduler
    
    '''
    # Loss and Accuracy variables are global for plotting and comparing
    global train_losses
    global validation_losses
    global validation_accuracy
    
    train_losses, validation_losses, validation_accuracy = [], [], []
         
    # Move model to GPU if available, else CPU is used
    model.to(device) 
       
    # Loop over epochs
    for epoch in range(epochs):
        
        # Start timer
        start = time.time() 
        
        # Set the model to training mode
        model.train()
        
        # Keep track of training epoch loss rate
        running_loss = 0 

        # Get our data
        for images,labels in trainloader:

            # Move image and label tensors to the default device
            images, labels = images.to(device), labels.to(device)

            # Clear the gradients, do this because gradients are accumulated
            optimizer.zero_grad()

            # Forward pass, then backward pass, then update weights
            log_ps = model(images) # Forward pass. Outputs log probabilities
            loss = criterion(log_ps, labels) # Calculate Loss
            loss.backward() # Backpropagation
            optimizer.step() # Update Weights
            running_loss += loss.item() # Accumulate training loss

        else: # After all training batches in current epoch are complete

            # Update learning rate if scheduler is supplied
            if scheduler:
                scheduler.step() 

            # Turn off gradients to speed up this part
            with torch.no_grad():

                # Set model to evaluation mode/ inference mode. Turns off dropout
                model.eval()

                # Keep track of validation loss and accuracy
                validation_loss = 0
                accuracy = 0

                # Validation pass here
                # Loop over validation data
                for images, labels in validloader:
                    images, labels = images.to(device), labels.to(device)

                    log_ps = model(images) # Forward pass. Outputs log probabilities
                    batch_loss = criterion(log_ps, labels) # Calculate Loss
                    validation_loss += batch_loss.item() # Accumulate validation loss

                    # Calculate accuracy
                    ps = torch.exp(log_ps) # Exponential of log probabilities for each class
                    top_p, top_class = ps.topk(1, dim=1) # Get top 1 predictions
                    equals = top_class == labels.view(*top_class.shape) # Compare prediction with ground truth
                    accuracy += equals.float().mean().item() # Accumulate per-batch accuracy as a Python float

            # Get mean loss to enable comparison between train and validation sets
            train_losses.append(running_loss/ len(trainloader))
            validation_losses.append(validation_loss/ len(validloader))

            # Get mean validation accuracy
            validation_accuracy.append(accuracy/ len(validloader))

            if (epoch+1) == 1 or ((epoch+1) % 5) == 0: # Print 1st and every 5 epochs
                print(f"Epoch {epoch+1}/{epochs}, "
                      f"Time per epoch: {time.time() - start:.2f} seconds, "
                      f"Training Loss: {train_losses[-1]:.4f}, "
                      f"Validation Loss: {validation_losses[-1]:.4f}, "
                      f"Validation accuracy: {validation_accuracy[-1]:.4f}")
                
            # Set the model back to training mode
            model.train()
            
    print(f"Final Epoch {epoch+1}/{epochs}, Validation accuracy: {validation_accuracy[-1]:.4f}")               

In [9]:
# Hyperparameter Dictionary
hyperparameters = {0: {'hidden_units':[1024], 'learning_rate':0.001},
                   1: {'hidden_units':[1024], 'learning_rate':0.003},
                   2: {'hidden_units':[1024,512], 'learning_rate':0.001},
                   3: {'hidden_units':[1024,512], 'learning_rate':0.003},
                   4: {'hidden_units':[512], 'learning_rate':0.001},
                   5: {'hidden_units':[512], 'learning_rate':0.003}}

In [10]:
# List to hold each hyperparameter combination's best validation accuracy
best_validation_accuracies = []

epochs = 15

# Loop through hyperparameter combinations
for i in range(len(hyperparameters)):
    hidden_layers, learning_rate = hyperparameters[i]['hidden_units'], hyperparameters[i]['learning_rate']
    
    # Print current hyperparameters
    print(f"\nCurrent Training Hyperparameters: Hidden_layer_Size(s): {hidden_layers}, Learning_Rate: {learning_rate}")
    
    # Define new classifier
    new_classifier = Network(512, 102, hidden_layers, drop_p=0.1)
    
    # Replace pretrained model output classifier layer[5] with newly created classifier
    model.classifier[5] = new_classifier
    
    # Define Loss Functions
    criterion = nn.NLLLoss() # Negative Log Likelihood Loss
    
    # Optimizer only updating new classifier layer[5] weights
    optimizer = optim.Adam(model.classifier[5].parameters(), lr=learning_rate)
    
    # Learning rate scheduler
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    
    # Train model using training function with hyperparameters
    train(model, trainloader, validloader, criterion, optimizer, epochs=epochs, scheduler=scheduler)
    
    # Best validation accuracy for this hyperparameter combination after completing all training epochs
    best_validation_accuracies.append(max(validation_accuracy))      
    
    print(f"Best current hyperparameter validation accuracy: {best_validation_accuracies[-1]:.4f}")
    
    # Save model if validation accuracy is the highest
    
    model_name = 'model_best_val_accuracy.pth'
    
    if best_validation_accuracies[-1] == max(best_validation_accuracies):
        best_hyperparameters = hyperparameters[i] # Store hyperparameters for best validation accuracy
        print(f"\nSaving Model with Best Validation Accuracy over all hyperparameters: {best_validation_accuracies[-1]:.4f}"
              f"\nBest hyperparameters: {best_hyperparameters}"
              f"\nModel saved to: '{model_name}'\n")
        
        torch.save({'state_dict': model.state_dict(),
                    'classifier': model.classifier[5],
                    'optimizer_state_dict': optimizer.state_dict(),
                    'class_to_idx': train_data.class_to_idx
                    }, model_name)
        
print(f"\nDone Training!"
      f"\nBest Validation Accuracy over all hyperparameters: {max(best_validation_accuracies):.4f}"
      f"\nBest hyperparameters: {best_hyperparameters}"
      f"\nModel saved to: '{model_name}'")


Current Training Hyperparameters: Hidden_layer_Size(s): [1024], Learning_Rate: 0.001
Epoch 1/15, Time per epoch: 84.28 seconds, Training Loss: 2.1580, Validation Loss: 0.7415, Validation accuracy: 0.8394
Epoch 5/15, Time per epoch: 30.54 seconds, Training Loss: 0.1996, Validation Loss: 0.3235, Validation accuracy: 0.9154
Epoch 10/15, Time per epoch: 31.45 seconds, Training Loss: 0.1004, Validation Loss: 0.2228, Validation accuracy: 0.9437
Epoch 15/15, Time per epoch: 55.07 seconds, Training Loss: 0.0715, Validation Loss: 0.2332, Validation accuracy: 0.9368
Final Epoch 15/15, Validation accuracy: 0.9368
Best current hyperparameter validation accuracy: 0.9440

Saving Model with Best Validation Accuracy over all hyperparameters: 0.9440
Best hyperparameters: {'hidden_units': [1024], 'learning_rate': 0.001}
Model saved to: 'model_best_val_accuracy.pth'


Current Training Hyperparameters: Hidden_layer_Size(s): [1024], Learning_Rate: 0.003
Epoch 1/15, Time per epoch: 55.54 seconds, Training 

## Testing your network

It's good practice to evaluate the trained network on test data: images the network has never seen during training or validation. This gives us a good estimate of the model's performance on completely new images. We will run the test images through the network and measure the accuracy, the same way we did with validation.
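
A minimal sketch of such a test pass is shown below, assuming a model that outputs log-probabilities like the classifier above. The `evaluate` helper, the toy model, and the toy loader are hypothetical stand-ins so the block runs by itself. Unlike the validation loop, which averages per-batch accuracies, this version counts correct predictions over all samples, so a smaller final batch does not skew the result.

```python
import torch
from torch import nn


def evaluate(model, loader, criterion, device="cpu"):
    """Return (mean loss per batch, accuracy over all samples) for a data loader."""
    model.to(device)
    model.eval()  # disable dropout and other training-only behaviour
    total_loss, correct, seen = 0.0, 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            log_ps = model(images)
            total_loss += criterion(log_ps, labels).item()
            correct += (log_ps.argmax(dim=1) == labels).sum().item()
            seen += labels.size(0)
    return total_loss / len(loader), correct / seen


# Toy stand-ins (not the flower model or its data)
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(12, 3), nn.LogSoftmax(dim=1))
toy_loader = [(torch.randn(4, 3, 2, 2), torch.randint(0, 3, (4,))) for _ in range(2)]

test_loss, test_accuracy = evaluate(toy_model, toy_loader, nn.NLLLoss())
print(f"Test loss: {test_loss:.4f}, Test accuracy: {test_accuracy:.4f}")
```

With the names used in this notebook, the call would look like `evaluate(model, testloader, criterion, device)`.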