## Prepare the workspace

In [4]:
# Before you proceed, update the PATH
import os
os.environ['PATH'] = f"{os.environ['PATH']}:/root/.local/bin"
os.environ['PATH'] = f"{os.environ['PATH']}:/opt/conda/lib/python3.6/site-packages"
# Restart the Kernel at this point. 

In [1]:
# Do not execute the commands below unless you have restart the Kernel after updating the PATH. 
!pip install torchinfo
#!python -m pip install torch== 1.0.0

Defaulting to user installation because normal site-packages is not writeable
Collecting torchinfo
  Downloading torchinfo-1.8.0-py3-none-any.whl (23 kB)
Installing collected packages: torchinfo
Successfully installed torchinfo-1.8.0


In [2]:
# Check torch version and CUDA status if GPU is enabled.
import torch
print(torch.__version__)
print(torch.cuda.is_available()) # Should return True when GPU is enabled. 

2.0.1
False


# Developing an AI application

Going forward, AI algorithms will be incorporated into more and more everyday applications. For example, one might want to include an image classifier in a smart phone app. To do this, one would use a deep learning model trained on hundreds of thousands of images as part of the overall application architecture. A large part of software development in the future will be using these types of models as common parts of applications. 

In this project, I have trained an image classifier to recognize different species of flowers. The idea is to use something like this in a phone app. This would tell the name of the flower your camera is looking at. I have been using [this dataset](http://www.robots.ox.ac.uk/~vgg/data/flowers/102/index.html) of 102 flower categories, below are few examples. 

<img src='assets/Flowers.png' width=500px>

The project is broken down into multiple steps:

* Load and preprocess the image dataset
* Train the image classifier on the dataset
* Use the trained classifier to predict image content


By the end of this project, there will be an application that can be trained on any set of labeled images. The network will be learning about flowers and end up as a command line application.

First up is importing the packages needed. It's good practice to keep all the imports at the beginning of the code. 

In [3]:
# Imports here
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision.transforms as transforms
import torchvision.datasets as datasets
import torchvision.models as models
from PIL import Image
import os
import torchvision.datasets as datasets
import matplotlib.pyplot as plt
from torchinfo import summary
from torchvision.datasets import ImageFolder

## Load the data

Using
 `torchvision` to load the data ([documentation](http://pytorch.org/docs/0.3.0/torchvision/index.html)). The data should be included alongside this notebook, otherwise you can [download it here](https://s3.amazonaws.com/content.udacity-data.com/nd089/flower_data.tar.gz). 

If you do not find the `flowers/` dataset in the current directory, **/workspace/home/aipnd-project/**, you can download it using the following commands. 

```bash
!wget 'https://s3.amazonaws.com/content.udacity-data.com/nd089/flower_data.tar.gz'
!unlink flowers
!mkdir flowers && tar -xzf flower_data.tar.gz -C flowers
```


## Data Description
The dataset is split into three parts, training, validation, and testing. For the training, I applied transformations such as random scaling, cropping, and flipping. This helped the network generalize leading to better performance. The input data is resized to 224x224 pixels as required by the pre-trained networks.

The validation and testing sets are used to measure the model's performance on data it hasn't seen yet. For this I did not use any scaling or rotation transformations, but I resized then cropped the images to the appropriate size.

The pre-trained networks I've use were trained on the ImageNet dataset where each colour channel was normalized separately. For all three sets I had to normalize the means and standard deviations of the images to what the network expects. For the means, it's `[0.485, 0.456, 0.406]` and for the standard deviations `[0.229, 0.224, 0.225]`, calculated from the ImageNet images.  These values will shift each colour channel to be centered at 0 and range from -1 to 1.
 

In [4]:
data_dir = 'flowers'
train_dir = data_dir + '/train'
valid_dir = data_dir + '/valid'
test_dir = data_dir + '/test'

In [5]:
# Define the transformations for the train, validation, and test sets
data_transforms = {
    'train': transforms.Compose([
        transforms.RandomRotation(15),
        transforms.Resize((224, 224)),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'valid': transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'test': transforms.Compose([
        transforms.Resize((224,224)),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ])
}

# Load the datasets with ImageFolder
train_dataset = ImageFolder(root=train_dir, transform=data_transforms['train'])
validation_dataset = ImageFolder(root=valid_dir, transform=data_transforms['valid'])
test_dataset = ImageFolder(root=test_dir, transform=data_transforms['test'])

# Combine the datasets into image_datasets
image_datasets = {'train': train_dataset, 'valid': validation_dataset, 'test': test_dataset}

# Define the batch size for train, validation, and test sets
train_batch_size = 32
validation_batch_size = 64
test_batch_size = 64

# Check if a GPU is available
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

# Define the dataloaders
train_loader = torch.utils.data.DataLoader(train_dataset,
                                           batch_size = train_batch_size,
                                           shuffle=True,
                                           num_workers=os.cpu_count() if str(device) == "cuda" else 0,      
                                           pin_memory=True if str(device) == "cuda" else False) 
val_loader = torch.utils.data.DataLoader(validation_dataset,
                                         batch_size = validation_batch_size,
                                         shuffle=False,
                                         num_workers=os.cpu_count() if str(device) == "cuda" else 0,
                                         pin_memory=True if str(device) == "cuda" else False)
test_loader = torch.utils.data.DataLoader(test_dataset,
                                          batch_size = test_batch_size,
                                          shuffle=False,
                                          num_workers=os.cpu_count() if str(device) == "cuda" else 0,
                                          pin_memory=True if str(device) == "cuda" else False)



### Label mapping

I have also loaded in a mapping from category label to category name. You can find this in the file `cat_to_name.json`. It's a JSON object which you can read in with the [`json` module](https://docs.python.org/2/library/json.html). This will give you a dictionary mapping the integer encoded categories to the actual names of the flowers.

In [6]:
import json

with open('cat_to_name.json', 'r') as f:
    cat_to_name = json.load(f)

In [7]:
# Define model architecture,

model = models.resnet18(pretrained=True)


# Use the summary function to get model information

summary(model,
        input_size=(1,3,224,224),
        col_names=("input_size",
                   "output_size",
                   "num_params",
                   "trainable"),
        col_width=18,
        row_settings=("var_names",))


Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /home/student/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
100%|██████████| 44.7M/44.7M [00:00<00:00, 193MB/s] 


Layer (type (var_name))                  Input Shape        Output Shape       Param #            Trainable
ResNet (ResNet)                          [1, 3, 224, 224]   [1, 1000]          --                 True
├─Conv2d (conv1)                         [1, 3, 224, 224]   [1, 64, 112, 112]  9,408              True
├─BatchNorm2d (bn1)                      [1, 64, 112, 112]  [1, 64, 112, 112]  128                True
├─ReLU (relu)                            [1, 64, 112, 112]  [1, 64, 112, 112]  --                 --
├─MaxPool2d (maxpool)                    [1, 64, 112, 112]  [1, 64, 56, 56]    --                 --
├─Sequential (layer1)                    [1, 64, 56, 56]    [1, 64, 56, 56]    --                 True
│    └─BasicBlock (0)                    [1, 64, 56, 56]    [1, 64, 56, 56]    --                 True
│    │    └─Conv2d (conv1)               [1, 64, 56, 56]    [1, 64, 56, 56]    36,864             True
│    │    └─BatchNorm2d (bn1)            [1, 64, 56, 56]    [1, 64, 56, 

# Building and training the classifier

Now that the data is ready, it's time to build and train the classifier. I have used one of the pretrained models from `torchvision.models` to get the image features. Built and trained a new feed-forward classifier using those features.

The following steps were:

* Loaded a [pre-trained network](http://pytorch.org/docs/master/torchvision/models.html) 
* Defined a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
* Trained the classifier layers using backpropagation using the pre-trained network to get the features
* Tracked the loss and accuracy on the validation set to determine the best hyperparameters




## Note : 
If the network is over 1 GB when saved as a checkpoint, there might be issues with saving backups in your workspace. Typically this happens with wide dense layers after the convolutional layers. If the saved checkpoint is larger than 1 GB (you can open a terminal and check with `ls -lh`), reduce the size of your hidden layers and train again.

In [8]:
#define manual_seed

def manual_seed(random_seed: int = 42) -> None:
    # for non-cuda
    torch.manual_seed(random_seed)
    # in case cuda exists
    if torch.cuda.is_available():
        torch.cuda.manual_seed(random_seed)
    

In [9]:
#Deffining the class Classifier

class Classifier(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(Classifier, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

In [10]:
manual_seed()

#Freezing the network
# Define the number of output classes 
NUM_CLASSES = len(train_dataset.classes)

# freezing out the complete network first

for param in model.parameters():
    param.requires_grad = False

# now swapping out the MLP classifier head
model.fc = Classifier(input_size = 512, hidden_size = 256, num_classes = NUM_CLASSES)


In [11]:
# Definition of the loss function
criterion = nn.CrossEntropyLoss()

# Definition of the optimizer
optimizer = optim.Adam(model.fc.parameters(), lr= 1e-3)

model.to(device);


In [12]:
# To train the model
num_epochs = 3
for epoch in range(num_epochs):
    model.train()
    train_running_loss = 0.0
    val_running_loss = 0.0
    train_acc = 0.0
    val_acc = 0.0

    for inputs, labels in train_loader:
        inputs = inputs.to(device)
        labels = labels.to(device).type(torch.long)

        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()   # Zero the gradients
        loss.backward()
        optimizer.step()
        
        with torch.no_grad():
            preds = outputs.argmax

        train_running_loss += loss.item()/len(labels)

    # Print the average loss for the epoch
    print(f"Epoch (train): {epoch+1}/{num_epochs}, Loss: {train_running_loss/len(train_loader)}")
    
    model.eval()
 
    for inputs, labels in val_loader:
        inputs = inputs.to(device)
        labels = labels.to(device).type(torch.long)

        # Forward pass
        with torch.no_grad():
            outputs = model(inputs)
            loss = criterion(outputs, labels)

        val_running_loss += loss.item()/len(labels)
        
    # Print the average loss for the epoch
    print(f"Epoch (val): {epoch+1}/{num_epochs}, Loss: {val_running_loss/len(val_loader)}") 
   

Epoch 10/10, Loss: 0.0007152378559112549
Epoch 10/10, Loss: 0.001441108671630301
Epoch 10/10, Loss: 0.0021389238718079358
Epoch 10/10, Loss: 0.002849803683234424
Epoch 10/10, Loss: 0.003526529379007293
Epoch 10/10, Loss: 0.0042423789821019985
Epoch 10/10, Loss: 0.0049456452451101165
Epoch 10/10, Loss: 0.005610853288231826
Epoch 10/10, Loss: 0.006281640398793104
Epoch 10/10, Loss: 0.006950380380560712
Epoch 10/10, Loss: 0.007629798897882787
Epoch 10/10, Loss: 0.008284999466523892
Epoch 10/10, Loss: 0.009576510365416364
Epoch 10/10, Loss: 0.010256260051959898
Epoch 10/10, Loss: 0.010924199950404284
Epoch 10/10, Loss: 0.011588603694264482
Epoch 10/10, Loss: 0.012267202720409487
Epoch 10/10, Loss: 0.012930118601496626
Epoch 10/10, Loss: 0.013561865760058892
Epoch 10/10, Loss: 0.014226855737407033
Epoch 10/10, Loss: 0.014878932077710221
Epoch 10/10, Loss: 0.015505015704689956
Epoch 10/10, Loss: 0.01616574671210312
Epoch 10/10, Loss: 0.016815568252307612
Epoch 10/10, Loss: 0.0174798309076123

In [4]:
# Create the directory if it doesn't exist
os.makedirs('home/flowers', exist_ok=True)

# Save the trained model
torch.save(model.state_dict(), '/home/aipnd-project/flowers/trained_model.pth') 

NameError: name 'model' is not defined

## Testing the network

It's good practice to test the trained network on test data, images the network has never seen either in training or validation. This will give a good estimate for the model's performance on completely new images. I run the test images through the network and measured the accuracy, same for validation.

In [26]:
# TODO: Do validation on the test set

# Validating the model
num_epochs = 10
for epoch in range(num_epochs):
    model.valid()
    running_loss = 0.0

    for inputs, labels in val_loader:
        inputs = inputs.to(device)
        labels = labels.to(device).type(torch.long)
        
        print(inputs.shape)

        # Forward pass
        with torch.no_grad():
            outputs = model(inputs)
            loss = criterion(outputs, labels)

        # Backward pass and optimization
        optimizer.zero_grad()   # Zero the gradients
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    # Print the average loss for the epoch
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {running_loss/len(trainloader)}")
    
    
# Save the tested model
torch.save(model.state_dict(), 'flowers/to/save/model.pth')




AttributeError: 'ResNet' object has no attribute 'valid'

## Save the checkpoint

Now that your network is trained, it is time to save the model to load it later for making predictions. 

```model.class_to_idx = image_datasets['train'].class_to_idx```

To completely rebuild the model later so I can use it for inference, I had to make sure to include any information needed in the checkpoint. 

In [None]:
# TODO: Save the checkpoint 
torch.save(model.state_dict(),'home/aipnd-project/checkpoint.pth')


#Save the mapping of classes to indices:
# Create the DatasetFolder object
dataset = datasets.DatasetFolder(data_dir)

# Get the mapping of classes to indices
model.class_to_idx = image_datasets['train'].class_to_idx

# Save the mapping to a file
torch.save(model.class_to_idx, 'model.class_to_idx.pth')

## Loading the checkpoint

It's good to write a function that can load a checkpoint and rebuild the model. That way I can come back to this project and keep working on it without having to retrain the network.

In [None]:
# TODO: Write a function that loads a checkpoint and rebuilds the model

filepath = 'home/aipnd-project/checkpoint.pth'

def load_checkpoint(filepath):
    # Load the checkpoint
    checkpoint = torch.load(filepath)
    
    # Rebuild the model architecture
    model = models.resnet50(pretrained=False)  
    
    # Load the model's state dictionary
    model.load_state_dict(checkpoint['state_dict'])
    
    # Load the class-to-index mapping
    model.load_class_to_idx(image_datasets['train')
    
    return model

# Inference for classification

Next, I have written a function to use a trained network for inference. That is, I passed an image into the network and predicted the class of the flower in the image. 


First I needed to handle processing the input image so it could be used in my network. 

## Image Preprocessing

Used `PIL` to load the image ([documentation](https://pillow.readthedocs.io/en/latest/reference/Image.html)). It's best practice to write a function that preprocesses the image so it can be used as input for the model. This function processes the images in the same manner used for training. 

First, I resized the images where the shortest side is 256 pixels, keeping the aspect ratio. This was done with the [`thumbnail`](http://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.thumbnail) another option would've been [`resize`](http://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.thumbnail) methods. 

Colour channels of images are typically encoded as integers 0-255, but the model expected floats 0-1. I have converted the values. It's easy with a Numpy array, which you can get from a PIL image like so `np_image = np.array(pil_image)`.

The network expects the images to be normalized in a specific way. For the means, it's `[0.485, 0.456, 0.406]` and for the standard deviations `[0.229, 0.224, 0.225]`. For this, I subtracted the means from each colour channel, then divided by the standard deviation. 

And finally, as PyTorch expects the colour channel to be the first dimension but it's the third dimension in the PIL image and Numpy array. I have reordered dimensions using [`ndarray.transpose`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.ndarray.transpose.html). The colour channel needed to be first and retained the order of the other two dimensions.

In [16]:
def process_image(image):
    ''' Scales, crops, and normalizes a PIL image for a PyTorch model,
        returns a NumPy array
    '''
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    processed_image = transform(image)
    processed_image = processed_image.numpy()  # Convert to NumPy array
    
    return processed_image
  

To check my work, the function below converted a PyTorch tensor and displayed it in the notebook. 

In [17]:
def imshow(image, ax=None, title=None):
    """Imshow for Tensor."""
    if ax is None:
        fig, ax = plt.subplots()
    
    # PyTorch tensors assume the color channel is the first dimension
    # but matplotlib assumes is the third dimension
    image = image.numpy().transpose((1, 2, 0))
    
    # Undo preprocessing
    mean = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    image = std * image + mean
    
    # Image needs to be clipped between 0 and 1 or it looks like noise when displayed
    image = np.clip(image, 0, 1)
    
    ax.imshow(image)
    
    return ax



## Class Prediction

Next is a function for making predictions with my model. It is a common practice to predict the top 5 (usually called top-$K$) most probable classes. I calculated the class probabilities then find the $K$ largest values.

To get the top $K$ largest values in a tensor I used [`x.topk(k)`](http://pytorch.org/docs/master/torch.html#torch.topk). This method returns both the highest `k` probabilities and the indices of those probabilities corresponding to the classes. I had to convert from these indices to the actual class labels using `class_to_idx`.


In [18]:
def predict(image_path, model, topk=5):
    ''' Predict the class (or classes) of an image using a trained deep learning model.
    '''
    # Load and preprocess the image
    image = Image.open(image_path)
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
    processed_image = transform(image)
    processed_image = processed_image.unsqueeze(0)

    # Set the model to evaluation mode
    model.eval()

    # Forward pass through the model
    with torch.no_grad():
        output = model(processed_image)

    # Get the predicted class indices
    _, predicted_indices = torch.topk(output, topk)
    
    # Load the class labels
    with open('path/to/class_labels.txt') as f:
        class_labels = f.readlines()

    # Get the predicted class labels
    predicted_labels = [class_labels[idx].strip() for idx in predicted_indices[0]]

    return predicted_labels

## Sanity Checking

Now that I can use a trained model for predictions, I had to check to make sure it makes sense. Even if the testing accuracy is high, it's always good to check that there aren't obvious bugs. I used `matplotlib` to plot the probabilities for the top 5 classes as a bar graph, along with the input image.



In [3]:
# TODO: Display an image along with the top 5 classes
.head()

matplotlib.pyplot.hist2d(x, y, bins=10, range=None, density=False, weights=None, cmin=None, cmax=None, *, data=None, **kwargs)
                         
x = 'flowers'
plt.hist(x, 55 , ec = 'red')
plt.xlabel('flowers')
plt.ylabel('flower_type')
plt.tittle('classes_of_flowers')





SyntaxError: iterable argument unpacking follows keyword argument unpacking (1388929872.py, line 4)

## Reminder
If your network becomes very large when saved as a checkpoint, you should reduce the size of your hidden layers and train again. 
    

In [2]:
# TODO remove .pth files 
def remove_pth_files(directory):
    # Lists all files in the directory
    file_lists = os.listdir(directory)
    
    for file in file_lists:
        # Checks for files with .pth extension
        if file.endswith(".pth"):
           
            file_path = os.path.join(directory, file)
            
            # Remove the file
            os.remove(file_path)