# Developing an AI application for the identification of flower species

In this project, I will train an image classifier to recognize different species of flowers, save the trained model, and then use it to predict the species shown in a new flower image. The model is trained on [this dataset](http://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) of 102 flower categories. A few examples are shown below.

<img src='assets/Flowers.png'>

The project is broken down into multiple steps:

* Load and preprocess the image dataset
* Load a pre-trained model
* Train the image classifier on the dataset
* Use the trained classifier to predict image content

In [None]:
# Importing necessary modules

import torch
import numpy as np
import torchvision
import torchvision.transforms as transforms
import torchvision.models as models
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
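from time import gmtime, strftime
import matplotlib.pyplot as plt  # used by the plotting cells further down
from PIL import Image  # used to open the test images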

### Loading the data

The dataset is split into two parts: training and validation. For training, I will apply random transformations (flipping, rotation, scaling, and cropping) to help the network generalize. To match the input size expected by the pre-trained network, the images are resized to 224x224 pixels. The validation set is only resized and then center-cropped to that size.

The pre-trained networks available from `torchvision` were trained on the ImageNet dataset, where each color channel was normalized separately. For both datasets, I will therefore normalize each color channel with the means and standard deviations the network expects: `[0.485, 0.456, 0.406]` for the means and `[0.229, 0.224, 0.225]` for the standard deviations, both calculated from the ImageNet images. After this step each channel is approximately centered at 0 with unit standard deviation, so the values fall roughly between -2.1 and 2.6 instead of between 0 and 1.
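As a quick check of the value range this normalization produces, the sketch below applies the same per-channel formula to the two extreme pixel values (0 and 1 after `ToTensor` scaling). It uses plain NumPy rather than `torchvision`, and the names `normalize`, `darkest`, and `brightest` are illustrative only.

```python
import numpy as np

# ImageNet channel statistics used throughout this notebook
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

def normalize(pixel):
    """Apply (x - mean) / std to one RGB pixel with values in [0, 1]."""
    return (np.asarray(pixel, dtype=float) - mean) / std

darkest = normalize([0.0, 0.0, 0.0])    # black pixel
brightest = normalize([1.0, 1.0, 1.0])  # white pixel

print(np.round(darkest, 2))    # lower bound of each channel
print(np.round(brightest, 2))  # upper bound of each channel
```

The bounds differ slightly per channel because each channel has its own mean and standard deviation.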

In [None]:
# Defining directories

data_dir = 'flower_data'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'

In [None]:
# Defining parameters

random_rotation = 20
resize = 224
center_crop = 224

In [None]:
# Defining the transforms for the training and validation sets

train_transforms = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(random_rotation),
    transforms.RandomResizedCrop(resize),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

valid_transforms = transforms.Compose([
    transforms.Resize(resize),
    transforms.CenterCrop(center_crop),
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

In [None]:
# Loading the datasets with ImageFolder

train_dataset = torchvision.datasets.ImageFolder(train_dir, transform = train_transforms)
valid_dataset = torchvision.datasets.ImageFolder(valid_dir, transform = valid_transforms)

In [None]:
# Defining the batch size so that there are roughly 60 batches per epoch
# (integer division can leave one extra, smaller batch at the end)

batch_size = int(len(train_dataset)/60)

# Defining the data loaders

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size = batch_size, shuffle = True)
valid_loader = torch.utils.data.DataLoader(valid_dataset, batch_size = batch_size, shuffle = True)

### Label mapping

I will load a mapping from category label to category name from the file `cat_to_name.json`. This gives me a dictionary that maps the integer-encoded categories (stored as string keys, as in any JSON object) to the actual names of the flowers.
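One detail worth keeping in mind when using this dictionary is that JSON object keys are always strings. The sketch below uses a hypothetical two-entry excerpt in place of the real file; `raw` and `cat_to_name_demo` are illustrative names.

```python
import json

# Hypothetical two-entry excerpt standing in for cat_to_name.json
raw = '{"1": "pink primrose", "21": "fire lily"}'
cat_to_name_demo = json.loads(raw)

# Lookups need the string '21', not the integer 21
print(cat_to_name_demo["21"])  # fire lily
print(21 in cat_to_name_demo)  # False
```

This matches the folder names used by `ImageFolder`, which are strings as well, so no conversion is needed between the two.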

In [None]:
import json

with open('assets/cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)

### Building and training the classifier

To build and train the classifier, I will use the pre-trained ResNet-152 model from `torchvision.models` to extract the image features. The process follows these steps:

* Load the [pre-trained network](http://pytorch.org/docs/master/torchvision/models.html)
* Define a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
* Train the classifier layers with backpropagation, using the pre-trained network to extract the features
* Track the loss and accuracy on the validation set
* Save the model with the lowest validation loss as a checkpoint
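The classifier defined below ends in a log-softmax and is trained with a negative log-likelihood loss, which is why probabilities are later recovered with an exponential. The NumPy sketch below illustrates that relationship on made-up scores. It is a stand-in for `F.log_softmax` and `nn.NLLLoss`, not their actual implementation, and `log_softmax`, `nll_loss`, and the numbers are assumptions chosen for illustration.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def nll_loss(log_probs, labels):
    """Mean negative log-probability assigned to the correct classes."""
    return -log_probs[np.arange(len(labels)), labels].mean()

# Hypothetical raw scores for 2 images and 3 classes
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 0.3, 3.0]])
labels = np.array([0, 1])

logps = log_softmax(logits)
ps = np.exp(logps)  # back to probabilities
loss = nll_loss(logps, labels)
accuracy = (ps.argmax(axis=1) == labels).mean()

print(ps.sum(axis=1))  # each row sums to 1
print(accuracy)        # 0.5: first image right, second wrong
```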

In [None]:
# Building and training the classifier

model = models.resnet152(pretrained = True)

for param in model.parameters():
    param.requires_grad = False

# Defining the classifier class

class FCClassifier(nn.Module):

    def __init__(self, in_features, middle_features_1, middle_features_2, out_features, drop_prob = 0.2):
        super().__init__()

        self.fc1 = nn.Linear(in_features, middle_features_1, bias = True)
        self.fc2 = nn.Linear(middle_features_1, middle_features_2, bias = True)
        self.fc3 = nn.Linear(middle_features_2, out_features, bias = True)
        self.drop = nn.Dropout(p = drop_prob)

    def forward(self, x):
        x = self.drop(F.relu(self.fc1(x)))
        x = self.drop(F.relu(self.fc2(x)))
        x = self.fc3(x)
        x = F.log_softmax(x, dim=1)
        return x

# Defining the classifier in my model, using the classifier class created above

model.fc = FCClassifier(2048, 2048, 1024, 102)

# Defining the loss and optimizer functions

criterion = nn.NLLLoss()

lr = 0.001
optimizer = optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr = lr)

# Moving the model to GPU

model.cuda();

In [None]:
# Defining a function to save the model, the classifier, and the "model class to folder index" dictionary

def save_checkpoint(model, filename):
    checkpoint = {'model': 'resnet152',
                  'classifier': model.fc,
                  'model_state_dict': model.state_dict(),
                  'class_to_idx': train_dataset.class_to_idx}
    torch.save(checkpoint, filename)

In [None]:
# Defining parameters for training

epochs = 10
steps = 0
total_steps = len(train_loader) * epochs
running_loss = 0
print_every = 5
valid_loss_min = np.inf  # np.Inf was removed in NumPy 2.0

# Training

for epoch in range(epochs):
    # Loading a batch of training images
    for images, labels in train_loader:
        steps += 1
        
        # Moving images and labels to GPU
        images, labels = images.cuda(), labels.cuda()
        
        # Training the model and adjusting the weights
        optimizer.zero_grad()
        
        logps = model(images)
        loss = criterion(logps, labels)
        loss.backward()
        optimizer.step()
        
        # Keeping track of the training loss
        running_loss += loss.item()
        
        # Check the validation set at the first step, at the last step, and every 5 training batches
        if (steps == 1) or (steps % print_every == 0) or (steps == total_steps):
            model.eval()
            
            test_loss = 0
            accuracy = 0
            
            # Loading batches of validation images (gradients are not needed for evaluation)
            with torch.no_grad():
                for images, labels in valid_loader:
                    images, labels = images.cuda(), labels.cuda()

                    # Predicting and calculating validation loss
                    logps = model(images)
                    loss = criterion(logps, labels)
                    test_loss += loss.item()

                    # Calculating accuracy
                    ps = torch.exp(logps)
                    top_ps, top_class = ps.topk(1, dim = 1)
                    equality = top_class == labels.view(*top_class.shape)
                    accuracy += torch.mean(equality.type(torch.FloatTensor)).item()
            
            # Printing partial measures
            print(f'{strftime("%H:%M:%S", gmtime())}.. '
                  f'Epoch {epoch+1}/{epochs}.. '
                  f'Step {steps}/{total_steps}.. '
                  f'Train loss: {running_loss/print_every:.3f}.. '
                  f'Test loss: {test_loss/len(valid_loader):.3f}.. '
                  f'Test accuracy: {accuracy/len(valid_loader):.3f}')
            
            running_loss = 0
            model.train()
            
            # Saving the model if the lowest validation loss so far has been reached
            if test_loss/len(valid_loader) <= valid_loss_min:
                print('Saving model ... ', end="")
                save_checkpoint(model, 'checkpoint.pth')
                print('Done!')
                valid_loss_min = test_loss/len(valid_loader)

### Loading the checkpoint

I will load the saved checkpoint, which consists of the model with the lowest validation loss, so I can use it to predict the species of new flower images. 
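The "lowest validation loss" rule that decides which model ends up in the checkpoint can be sketched in isolation. The loss values below are hypothetical, and `saved_at` simply records which validation checks would have triggered a save.

```python
# Hypothetical validation losses observed at successive checks
valid_losses = [2.31, 1.10, 0.74, 0.81, 0.69, 0.70]

best_loss = float("inf")
saved_at = []  # indices of the checks that would trigger a save

for check, loss in enumerate(valid_losses):
    # Same rule as the training loop: save when the loss is no worse than the best so far
    if loss <= best_loss:
        best_loss = loss
        saved_at.append(check)

print(saved_at)   # [0, 1, 2, 4]
print(best_loss)  # 0.69
```

Because each save overwrites the previous file, only the model from the last triggering check remains on disk.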

In [None]:
# Defining a function to load the checkpoint and format the model

def load_checkpoint(filename):
    checkpoint = torch.load(filename, map_location='cpu')
    model = models.resnet152(pretrained = True)
    for param in model.parameters():
        param.requires_grad = False
    model.fc = checkpoint['classifier']
    model.load_state_dict(checkpoint['model_state_dict'])
    model.class_to_idx = checkpoint['class_to_idx']
    return model

In [None]:
# Loading the checkpoint and transferring the model to GPU

model = load_checkpoint('checkpoint.pth')
model.cuda();

### Inference for classification

I am providing 7 test images in the folder `assets`, named `testX.jpg`, where X ranges from 1 to 7. Any other flower image can be tested as well.

In order to infer flower species, first I will write a function to preprocess any input image, so that it can be fed to the model.

First, the images will be resized so that the shortest side is 256 pixels, keeping the aspect ratio. Then they will be center-cropped to generate 224x224 images.
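The arithmetic behind the resize and center crop can be checked without loading an image. The helper `resize_and_crop_box` below is a hypothetical name, and the 500x375 input size is an assumed example; the function only computes the sizes and the crop box.

```python
def resize_and_crop_box(width, height, short_side=256, crop=224):
    """Return the resized (width, height) and the centered crop box."""
    ratio = min(width, height) / short_side
    new_w, new_h = int(width / ratio), int(height / ratio)
    cx, cy = new_w // 2, new_h // 2
    half = crop // 2
    box = (cx - half, cy - half, cx + half, cy + half)  # left, top, right, bottom
    return (new_w, new_h), box

# Hypothetical 500x375 landscape photo
size, box = resize_and_crop_box(500, 375)
print(size)  # (341, 256)
print(box)   # (58, 16, 282, 240)
```

The box uses the (left, top, right, bottom) order that PIL's `Image.crop` expects, and it always spans 224 pixels in each direction.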

Color channels of images are typically encoded as integers 0-255, but the model expects floats in the range 0-1, so the values will be converted accordingly.

As before, the network expects the images to be normalized in a specific way. For the means, it's `[0.485, 0.456, 0.406]` and for the standard deviations `[0.229, 0.224, 0.225]`. I will subtract the means from each color channel, then divide by the standard deviation. 

Finally, PyTorch expects the color channel to be the first dimension, but it is the third dimension in the PIL image and NumPy array. The array is therefore transposed so that the color channel comes first while the other two dimensions keep their order.
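The axis reordering can be illustrated on a small synthetic array. The shape below is deliberately non-square (an assumption for illustration, since the real images are 224x224) so that the order of height and width stays visible.

```python
import numpy as np

# Hypothetical RGB image in (height, width, channel) layout
hwc = np.zeros((200, 300, 3))
hwc[..., 0] = 1.0  # fill the red channel so it can be tracked

# Move channels first; height and width keep their relative order
chw = hwc.transpose((2, 0, 1))

print(chw.shape)      # (3, 200, 300)
print(chw[0].mean())  # 1.0, the red channel is now the first plane
```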

In [None]:
# Defining a function to preprocess the image

def process_image(image):
    ''' Scales, crops, and normalizes a PIL image for a PyTorch model,
        returns a PyTorch tensor on the GPU
    '''
    
    img = image
    
    # Resize
    ratio = min(img.size)/256
    img = img.resize((int(img.size[0]/ratio), int(img.size[1]/ratio)))
    
    # Center crop
    center = (int(img.size[0]/2), int(img.size[1]/2))
    left, right, top, bottom = int(center[0]-224/2), int(center[0]+224/2), int(center[1]-224/2), int(center[1]+224/2)
    img = img.crop((left, top, right, bottom))
    
    # Convert to numpy
    img = np.array(img, dtype = 'float')
    img = img/255.
        
    # Normalize
    mean = np.array([0.485, 0.456, 0.406])
    sd = np.array([0.229, 0.224, 0.225])
    img = (img - mean)/sd
        
    # Transpose
    img = img.transpose((2, 0, 1))
    
    # Change to tensor
    img = torch.tensor(img).cuda()
    
    return img

### Class Prediction

I will write a function that accepts an image path and a model, and returns three lists. The first list gives the 5 highest class probabilities predicted by the model. The second list gives the corresponding 5 class indices, as assigned by the ImageFolder loader. The third list gives the corresponding 5 folder names, which are the dataset's category numbers. This last list can then be translated into flower species using the `cat_to_name` dictionary.
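The distinction between the second and third lists matters because `ImageFolder` assigns indices to folder names in sorted string order, so an index generally differs from the folder number. The sketch below uses four hypothetical folder names to show the mapping and its inversion.

```python
# Hypothetical folder names; the real dataset has one folder per category
folders = ["1", "2", "10", "100"]

# Indices follow the sorted (string) order of the folder names
class_to_idx = {name: i for i, name in enumerate(sorted(folders))}
idx_to_class = {i: name for name, i in class_to_idx.items()}

print(class_to_idx)     # {'1': 0, '10': 1, '100': 2, '2': 3}
print(idx_to_class[3])  # 2
```

Inverting the dictionary is what lets a predicted index be turned back into a folder name and then into a flower name.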

In [None]:
# Defining the prediction function

def predict(image_path, model, topk=5):
    ''' Predict the class (or classes) of an image using a trained deep learning model.
    '''
    from PIL import Image
    img = Image.open(image_path)

    img = process_image(img)
    img = img.unsqueeze(0)
    img = img.float()
    
    # Switching to evaluation mode and predicting the class probabilities
    model.eval()
    
    logps = model(img)
    ps = torch.exp(logps)
    
    top_ps, top_class = ps.topk(topk, dim = 1)
    probs = top_ps.cpu().detach().numpy().tolist()[0]

    classes = top_class.cpu().detach().numpy().tolist()[0]

    class_to_idx = model.class_to_idx
    idx_to_class = {i: ii for ii, i in class_to_idx.items()}
    
    indexes = []
    for i in range(len(classes)):
        index = idx_to_class[classes[i]]
        indexes.append(index)
        
    return probs, classes, indexes

### Visualizing the results

The function below converts a PyTorch tensor back into a displayable image and shows it in the notebook.
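Displaying the tensor requires undoing the normalization applied during preprocessing. The round trip can be verified on a small synthetic array; the 2x2 image and the variable names below are assumptions for illustration.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Hypothetical 2x2 RGB image with values in [0, 1], channels last
rng = np.random.default_rng(0)
original = rng.random((2, 2, 3))

normalized = (original - mean) / std               # what preprocessing does
restored = np.clip(std * normalized + mean, 0, 1)  # what the display step undoes

print(np.allclose(restored, original))  # True
```

The clipping guards against small floating-point overshoots outside the 0 to 1 range that matplotlib expects for float images.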

In [None]:
# Defining a function to view the test image

def imshow(image, ax=None, title=None):
    """Imshow for Tensor."""
    if ax is None:
        fig, ax = plt.subplots()
    
    # PyTorch tensors assume the color channel is the first dimension,
    # but matplotlib assumes it is the third dimension
    image = image.numpy().transpose((1, 2, 0))
    
    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean
    
    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)
    
    ax.imshow(image)
    ax.set_title(title)
    
    return ax

I will use the model on the `test1.jpg` image. The results will be shown in a barplot.

In [None]:
# Predicting the results

test = 'assets/test1.jpg'
probs, classes, indexes = predict(test, model)
names = []
for i in range(len(classes)):
    name = cat_to_name[indexes[i]]
    names.append(name)

In [None]:
# Generating the image plot

fig, (ax1, ax2) = plt.subplots(2, 1)
img = process_image(Image.open(test)).cpu()
imshow(img, ax1, title = names[0])
ax2.barh(names, probs)
ax2.invert_yaxis()

And that's it!