# Convolutional Neural Networks

## Project: Write an Algorithm for Landmark Classification

---

In this notebook, some template code has already been provided for you, and you will need to implement additional functionality to successfully complete this project. You will not need to modify the included code beyond what is requested. Sections that begin with **'(IMPLEMENTATION)'** in the header indicate that the following block of code will require additional functionality which you must provide. Instructions will be provided for each section, and the specifics of the implementation are marked in the code block with a 'TODO' statement. Please be sure to read the instructions carefully! 

> **Note**: Once you have completed all the code implementations, you need to finalize your work by exporting the Jupyter Notebook as an HTML document. Before exporting the notebook to HTML, all the code cells need to have been run so that reviewers can see the final implementation and output. You can then export the notebook by using the menu above and navigating to **File -> Download as -> HTML (.html)**. Include the finished document along with this notebook as your submission.

In addition to implementing code, there will be questions that you must answer which relate to the project and your implementation. Each section where you will answer a question is preceded by a **'Question X'** header. Carefully read each question and provide thorough answers in the following text boxes that begin with **'Answer:'**. Your project submission will be evaluated based on your answers to each of the questions and the implementation you provide.

>**Note:** Code and Markdown cells can be executed using the **Shift + Enter** keyboard shortcut.  Markdown cells can be edited by double-clicking the cell to enter edit mode.

The rubric contains _optional_ "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. If you decide to pursue the "Stand Out Suggestions", you should include the code in this Jupyter notebook.

---
### Why We're Here

Photo sharing and photo storage services like to have location data for each photo that is uploaded. With the location data, these services can build advanced features, such as automatic suggestion of relevant tags or automatic photo organization, which help provide a compelling user experience. Although a photo's location can often be obtained by looking at the photo's metadata, many photos uploaded to these services will not have location metadata available. This can happen when, for example, the camera capturing the picture does not have GPS or if a photo's metadata is scrubbed due to privacy concerns.

If no location metadata for an image is available, one way to infer the location is to detect and classify a discernible landmark in the image. Given the large number of landmarks across the world and the immense volume of images that are uploaded to photo sharing services, using human judgement to classify these landmarks would not be feasible.

In this notebook, you will take the first steps towards addressing this problem by building models to automatically predict the location of the image based on any landmarks depicted in the image. At the end of this project, your code will accept any user-supplied image as input and suggest the top k most relevant landmarks from 50 possible landmarks from across the world. The image below displays a potential sample output of your finished project.

![Sample landmark classification output](images/sample_landmark_output.png)
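
As a rough illustration of what "top k" means here, the sketch below turns a vector of raw class scores into probabilities and keeps the k highest. It is library-free, and the `top_k` helper, the class names, and the scores are all made up for illustration; in the finished project the scores would come from your trained network.

```python
import math

def top_k(scores, names, k=3):
    """Return the k (name, probability) pairs with the highest softmax probability."""
    # subtract the max score before exponentiating for numerical stability
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # sort class indices by probability, highest first
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(names[i], probs[i]) for i in order[:k]]

# made-up scores for four illustrative classes (the project uses 50)
names = ["Golden Gate Bridge", "Eiffel Tower", "Taj Mahal", "Machu Picchu"]
scores = [2.0, 0.5, 1.0, -1.0]
for name, p in top_k(scores, names, k=3):
    print(f"{name}: {p:.3f}")
```

Sorting the full list is fine for 50 classes; with a tensor of network outputs, a library routine for top-k selection would be the more natural choice.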


### The Road Ahead

We break the notebook into separate steps.  Feel free to use the links below to navigate the notebook.

* [Step 0](#step0): Download Datasets and Install Python Modules
* [Step 1](#step1): Create a CNN to Classify Landmarks (from Scratch)
* [Step 2](#step2): Create a CNN to Classify Landmarks (using Transfer Learning)
* [Step 3](#step3): Write Your Landmark Prediction Algorithm

<a id='step0'></a>
## Step 0: Download Datasets and Install Python Modules

**Note: if you are using the Udacity workspace, *YOU CAN SKIP THIS STEP*. The dataset can be found in the `/data` folder and all required Python modules have been installed in the workspace.**

Download the [landmark dataset](https://udacity-dlnfd.s3-us-west-1.amazonaws.com/datasets/landmark_images.zip).
Unzip the folder and place it in this project's home directory, at the location `/landmark_images`. The landmark images are a subset of the **Google Landmarks Dataset v2**.

Install the following Python modules:
* cv2
* matplotlib
* numpy
* pandas
* PIL
* torch
* torchvision
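
The names above are import names; `cv2` and `PIL` are typically installed from the `opencv-python` and `Pillow` packages. A minimal sketch for checking which modules are importable in the current environment follows. The `required` list and the `missing_modules` helper are assumptions for illustration, based on the imports used in this notebook.

```python
import importlib.util

# import names used by this notebook (assumed list, taken from the imports cell)
required = ["cv2", "matplotlib", "numpy", "pandas", "PIL", "torch", "torchvision"]

def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [n for n in names if importlib.util.find_spec(n) is None]

missing = missing_modules(required)
print("missing:", missing if missing else "none")
```

The check only looks the modules up without importing them, so it runs quickly even when the heavier libraries are installed.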

In [None]:
import cv2
import PIL
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
from torchvision import transforms, datasets, models
from torch.utils.data.sampler import SubsetRandomSampler, RandomSampler

import os
import random as rd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
import zipfile
from google.colab import drive
drive.mount('/content/drive')

zipfile.ZipFile('/content/drive/MyDrive/Colab Notebooks/Udacity/deep_learning/landmark_images.zip').extractall()

In [None]:
use_cuda = torch.cuda.is_available()
#torch.cuda.empty_cache()
#torch.cuda.list_gpu_processes()
#torch.cuda.memory_snapshot()

In [None]:
upload = True

if upload:
    try:
        train_shapes = pd.read_csv('train_shapes.csv',header=0, index_col=0)
        test_shapes = pd.read_csv('test_shapes.csv',header=0, index_col=0)
    except FileNotFoundError:
        print ("Must upload files first.")

else:
    #read train image file data:
    train_path = './landmark_images/train/'
    cat_path = [train_path + cat + '/' for cat in os.listdir(train_path)]
    jpg_list = [[cp + jpg for jpg in os.listdir(cp)] for cp in cat_path]

    train_images = []
    for jl in jpg_list:
        for jpg in jl:
            train_images.append(jpg)

    train_image_shapes = []
    for imfile in train_images:
        train_image_shapes.append(cv2.imread(imfile).shape)
    heights = [shp[0] for shp in train_image_shapes]    
    widths = [shp[1] for shp in train_image_shapes]
    
    train_images_df = pd.DataFrame(columns=['filepath','height','width'])
    train_images_df['filepath'] = train_images
    train_images_df['height'] = heights
    train_images_df['width'] = widths
    train_images_df.to_csv('train_shapes.csv')
    
    #read test image file data:
    test_path = './landmark_images/test/'
    cat_path = [test_path + cat + '/' for cat in os.listdir(test_path)]
    jpg_list = [[cp + jpg for jpg in os.listdir(cp)] for cp in cat_path]

    test_images = []
    for jl in jpg_list:
        for jpg in jl:
            test_images.append(jpg)

    test_image_shapes = []
    for imfile in test_images:
        test_image_shapes.append(cv2.imread(imfile).shape)
    heights = [shp[0] for shp in test_image_shapes]    
    widths = [shp[1] for shp in test_image_shapes] 

    test_images_df = pd.DataFrame(columns=['filepath','height','width'])
    test_images_df['filepath'] = test_images
    test_images_df['height'] = heights
    test_images_df['width'] = widths
    test_images_df.to_csv('test_shapes.csv')

    #cleanup
    del train_images, train_image_shapes, train_path
    del cat_path, jpg_list, widths, heights
    del test_images, test_image_shapes, test_path
    #del train_images_df, test_images_df

#### Data distributions

In [None]:
plt.scatter(train_shapes["height"], train_shapes["width"])
plt.scatter(test_shapes["height"], test_shapes["width"])

In [None]:
# size filters for the training images
critA = train_shapes["width"] >= 700
critB = train_shapes["height"] >= 444
critC = train_shapes["height"] <= 700
train_shapes["height"].hist()
train_shapes.loc[critA & critB]["height"].hist()

# the same filters for the test images
critD = test_shapes["width"] >= 700
critE = test_shapes["height"] >= 444
critF = test_shapes["height"] <= 700
test_shapes["height"].hist()
test_shapes.loc[critD & critE]["height"].hist()

In [None]:
train_shapes.shape,\
train_shapes.loc[critA].loc[critB].loc[critC].shape,\
test_shapes.shape,\
test_shapes.loc[critD].loc[critE].loc[critF].shape

In [None]:
train_shapes["height"].min(), test_shapes["height"].min()

#### Preprocessing

In [None]:
#@ title Image transpose
def bgr2rgb(im_bgr):
    # because cv2.imread() returns BGR image format
    im_rgb = im_bgr.copy()
    im_rgb[:,:,0] = im_bgr[:,:,2]
    im_rgb[:,:,1] = im_bgr[:,:,1]
    im_rgb[:,:,2] = im_bgr[:,:,0]
    return im_rgb

In [None]:
def quick_label(row):
    label = row['filepath'][-40:-20] + "\nWidth: " + str(row['width']) + "  Height: " + str(row['height'])
    return label

choices = rd.sample(list(train_shapes.index), 4)
paths = [(train_shapes.iloc[s])['filepath'] for s in choices]
labels = [quick_label(train_shapes.iloc[s]) for s in choices]
imgs_bgr = [cv2.imread(p) for p in paths]
imgs_rgb = [bgr2rgb(im) for im in imgs_bgr] 

fig, axes = plt.subplots(figsize=(18,16), nrows=2, ncols=2)
axes = axes.reshape(4)
for ii, im in enumerate(imgs_rgb):
    ax0 = axes[ii] 
    ax0.set_title(labels[ii])
    ax0.imshow(im)


In [None]:
irgb = imgs_rgb[0]
irgb.dtype

### (IMPLEMENTATION) Specify Data Loaders for the Landmark Dataset

Use the code cell below to create three separate [data loaders](http://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader): one for training data, one for validation data, and one for test data. Randomly split the images located at `landmark_images/train` to create the train and validation data loaders, and use the images located at `landmark_images/test` to create the test data loader.

All three of your data loaders should be accessible via a dictionary named `loaders_scratch`. Your train data loader should be at `loaders_scratch['train']`, your validation data loader should be at `loaders_scratch['valid']`, and your test data loader should be at `loaders_scratch['test']`.

You may find [this documentation on custom datasets](https://pytorch.org/docs/stable/torchvision/datasets.html#datasetfolder) to be a useful resource.  If you are interested in augmenting your training and/or validation data, check out the wide variety of [transforms](http://pytorch.org/docs/stable/torchvision/transforms.html?highlight=transform)!
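
A random train/validation split is easier to debug when it is reproducible. The sketch below shows one way to do that with a fixed seed; the `split_indices` helper, the 20% validation fraction, and the seed value are assumptions for illustration rather than project requirements. The two index lists could then be handed to samplers such as `SubsetRandomSampler`.

```python
import random

def split_indices(n_items, valid_fraction=0.2, seed=0):
    """Shuffle range(n_items) with a fixed seed and split it into train/valid index lists."""
    indices = list(range(n_items))
    # a private Random instance keeps the split independent of other random calls
    random.Random(seed).shuffle(indices)
    n_valid = int(n_items * valid_fraction)
    return indices[n_valid:], indices[:n_valid]

train_idx, valid_idx = split_indices(100, valid_fraction=0.2, seed=42)
print(len(train_idx), len(valid_idx))
```

Because the same seed always produces the same shuffle, validation losses stay comparable across runs and architectures.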

In [None]:
# image directory location
data_dir = '/content/landmark_images'

# set target height and width for all images
H = int(120)
W = int(180)
h0 = 210
w0 = 280

# Define transforms for the training data and testing data   
# 
xform = False
if xform:
    train_transforms = transforms.Compose([ transforms.RandomHorizontalFlip(),
                                            transforms.Resize((int(3.6*H),
                                                                int(3.6*W))),
                                            transforms.RandomCrop((int(2.1*H),
                                                                int(2.1*W))),
                                            transforms.RandomRotation((-10,10), 
                                                                    expand=True, 
                                                                    fill=128),
                                            transforms.RandomCrop((int(1.2*H), 
                                                                   int(1.2*W))),
                                            transforms.CenterCrop((H, W)),
                                            transforms.ToTensor()]) 
else:
    train_transforms = transforms.Compose([transforms.Resize((h0, w0)),
                                           transforms.RandomCrop((H,W)),
                                           transforms.ToTensor()])

test_transforms = transforms.Compose([ transforms.Resize((H, W)),
                                       transforms.ToTensor()])

# Pass transforms in here, then run the next cell to see how the transforms look
train_data = datasets.ImageFolder(data_dir + '/train', transform=train_transforms)
test_data = datasets.ImageFolder(data_dir + '/test', transform=test_transforms)

######### from "Multi-Layer Perceptron, MNIST" notebook #######
# obtain training indices that will be used for validation
num_train = len(train_data)
indices = list(range(num_train))
np.random.shuffle(indices)
split = int(np.floor(0.2 * num_train))
train_idx, valid_idx = indices[split:], indices[:split]

# define samplers for obtaining training and validation batches
train_sampler = SubsetRandomSampler(train_idx)
valid_sampler = SubsetRandomSampler(valid_idx)
test_sampler = RandomSampler(test_data)
###############################################################

trainloader = torch.utils.data.DataLoader(train_data, 
                                          batch_size=30,
                                          sampler=train_sampler)

validloader = torch.utils.data.DataLoader(train_data, 
                                          batch_size=30,
                                          sampler=valid_sampler)

testloader = torch.utils.data.DataLoader(test_data, 
                                         batch_size=30,
                                         sampler=test_sampler)

## Write data loaders for training, validation, and test sets
## Specify appropriate transforms, and batch_sizes

loaders_scratch = {'train': trainloader, 'valid': validloader, 'test': testloader}

In [None]:
## if only want approx 600x800 images
trainable_df = train_shapes.loc[critA].loc[critB]#.loc[critC]
testable_df = test_shapes.loc[critD].loc[critE]#.loc[critF]

In [None]:
## Two class dicts

classes = test_data.classes
classes.sort()
class2num = {c[3:]:int(c[:2]) for c in classes}

classvals = list(set([(str(pair[0]),pair[1]) for pair in test_shapes[['labnum','label']].values]))
num2class = {int(x):y for x,y in classvals}

assert class2num[num2class[7]] == 7

**Question 1:** Describe your chosen procedure for preprocessing the data. 
- How does your code resize the images (by cropping, stretching, etc)?  What size did you pick for the input tensor, and why?
- Did you decide to augment the dataset?  If so, how (through translations, flips, rotations, etc)?  If not, why not?

**Answer**: 

<a id='step1'></a>
## Step 1: Create a CNN to Classify Landmarks (from Scratch)

In this step, you will create a CNN that classifies landmarks.  You must create your CNN _from scratch_ (so, you can't use transfer learning _yet_!), and you must attain a test accuracy of at least 20%.

Although 20% may seem low at first glance, it seems more reasonable once you realize how difficult this problem is. Often, a photo taken at a landmark shows a fairly mundane subject, such as an animal or plant, as in the following picture.

<img src="images/train/00.Haleakala_National_Park/084c2aa50d0a9249.jpg" alt="Bird in Haleakalā National Park" style="width: 400px;"/>

Just by looking at that image alone, would you have been able to guess that it was taken at the Haleakalā National Park in Hawaii?

An accuracy of 20% is significantly better than random guessing, which would provide an accuracy of just 2%. In Step 2 of this notebook, you will have the opportunity to greatly improve accuracy by using transfer learning to create a CNN.
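
The 2% figure is simply one over the number of classes. A short worked check, assuming the 50 classes are roughly balanced so that uniform guessing is a fair baseline:

```python
num_classes = 50
chance_accuracy = 1 / num_classes   # expected accuracy of uniform random guessing
target_accuracy = 0.20              # minimum test accuracy required in this step

print(f"chance: {chance_accuracy:.0%}, target: {target_accuracy:.0%}, "
      f"ratio: {target_accuracy / chance_accuracy:.0f}x")
```

If the classes were strongly imbalanced, always predicting the most frequent class would be a stronger baseline than this one.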

Remember that practice is far ahead of theory in deep learning.  Experiment with many different architectures, and trust your intuition.  And, of course, have fun!

### (IMPLEMENTATION) Visualize a Batch of Training Data

Use the code cell below to retrieve a batch of images from your train data loader, display at least 5 images simultaneously, and label each displayed image with its class name (e.g., "Golden Gate Bridge").

Visualizing the output of your data loader is a great way to ensure that your data loading and preprocessing are working as expected.

In [None]:
# Custom channel transpose vs np.transpose(im, (1,2,0))
def rechannel(img):
    # img is channels first: shape (C, H, W)
    # new_img is channels last: shape (H, W, C)
    new_img = torch.zeros(img.shape[1], img.shape[2], 3)
    for c in range(3):
        new_img[:,:,c] = img[c, :, :]
    return new_img

In [None]:

## TODO: visualize a batch of the train data loader

## the class names can be accessed at the `classes` attribute
## of your dataset object (e.g., `train_dataset.classes`)

try:
    images, labels = next(train_data_iter)
except NameError:
    loader = loaders_scratch['train']
    train_data_iter = iter(loader)
    images, labels = next(train_data_iter)

fig = plt.figure(figsize=(21,8),clear=True)
axes = fig.subplots(nrows=2, ncols=4, sharex=True, sharey=True)
axes = axes.flatten()
for i, im in enumerate(images[0:8]):
    ax = axes[i]
    title = classes[labels[i].item()][3:]
    ax.set_title(title)
    axes[i].set_xticks([])
    axes[i].set_yticks([])
    ax.imshow(np.transpose(im, (1,2,0)))


In [None]:
### View test images
try:
    images, labels = next(data_iter)
except NameError:
    data_iter = iter(testloader)
    images, labels = next(data_iter)

labels = labels.tolist()

fig = plt.figure(figsize=(22,8.5))#, dpi=72)
axes = fig.subplots(nrows=2, ncols=4)
axes = axes.flatten()

idx = rd.randrange(images.shape[0]-8)
for i in range(idx,idx+8):
    img = images[i]
    img = np.transpose(img, (1,2,0))  #rechannel(img)
    title = classes[labels[i]]
    axes[i-idx].set_title(title[3:])
    axes[i-idx].set_xticks([])
    axes[i-idx].set_yticks([])
    axes[i-idx].imshow(img)


### Initialize use_cuda variable

In [None]:
# useful variable that tells us whether we should use the GPU
if not use_cuda: use_cuda = torch.cuda.is_available()
# only query the GPU when one is actually available
if use_cuda:
    print(torch.cuda.list_gpu_processes())

### (IMPLEMENTATION) Specify Loss Function and Optimizer

Use the next code cell to specify a [loss function](http://pytorch.org/docs/stable/nn.html#loss-functions) and [optimizer](http://pytorch.org/docs/stable/optim.html).  Save the chosen loss function as `criterion_scratch`, and fill in the function `get_optimizer_scratch` below.
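
A useful sanity check when choosing a classification loss: with 50 classes, an untrained model that assigns equal probability to every class should produce a cross-entropy close to ln(50). The sketch below computes that value for a single example in plain Python. The `cross_entropy` helper is written here for illustration and is not the PyTorch implementation, although it follows the same definition (log-softmax followed by negative log-likelihood).

```python
import math

def cross_entropy(logits, target):
    """Cross-entropy of one example: -log(softmax(logits)[target])."""
    # log-sum-exp with the max subtracted for numerical stability
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

num_classes = 50
uniform_logits = [0.0] * num_classes
print(round(cross_entropy(uniform_logits, target=7), 4))  # ln(50), about 3.912
```

If the first epochs report a training loss far above this value, or `nan`, the learning rate or the weight initialization is a likely culprit.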

In [None]:
## TODO: select loss function
criterion_scratch = nn.CrossEntropyLoss()

def get_optimizer_scratch(model):
    ## TODO: select and return an optimizer
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    return optimizer

### (IMPLEMENTATION) Model Architecture

Create a CNN to classify images of landmarks.  Use the template in the code cell below.
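
When stacking layers, it helps to track how the spatial size changes so that the first fully connected layer gets the right number of inputs. For one dimension, a convolution or pooling layer with kernel size F, stride S, and padding P maps an input of length W to floor((W - F + 2P) / S) + 1. The sketch below applies that rule to a hypothetical stack of layers on a 120 x 180 input; the `conv_out` and `trace` helpers and the layer list are illustrative assumptions, not a recommended architecture.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one dimension: floor((size - kernel + 2*padding) / stride) + 1."""
    return (size - kernel + 2 * padding) // stride + 1

def trace(height, width, layers):
    """Apply (kind, kernel, stride, padding) layers in order; return the final (height, width)."""
    for kind, kernel, stride, padding in layers:
        height = conv_out(height, kernel, stride, padding)
        width = conv_out(width, kernel, stride, padding)
        print(f"{kind:>4}  k={kernel} s={stride} p={padding} -> {height} x {width}")
    return height, width

# hypothetical stack: two strided convolutions, then a convolution and a 2x2 max pool
layers = [
    ("conv", 5, 2, 2),
    ("conv", 3, 2, 1),
    ("conv", 5, 1, 2),
    ("pool", 2, 2, 0),
]
h, w = trace(120, 180, layers)
print("flattened features per channel:", h * w)
```

Multiplying the final height and width by the number of output channels of the last convolution gives the input size of the first linear layer.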

In [None]:
# Calculate a layer's output shape
def output_volume(w, f, s, p):
    #params:
    #w = input volume (tuple)
    #f = kernel size (tuple)
    #s = stride (int)
    #p = zero padding (int)
    x = (w[0] - f[0] + 2*p)/s + 1
    y = (w[1] - f[1] + 2*p)/s + 1
    # np.int was removed from recent NumPy releases; the built-in int does the same job
    return (int(np.floor(x)), int(np.floor(y)))

In [None]:
#@title Original Net (no running)
# define the CNN architecture
if False:
    class oNet(nn.Module):
        ## TODO: choose an architecture, and complete the class
        def __init__(self):
            super(oNet, self).__init__()
            ## Define layers of a CNN
            # number of nodes in each hidden layer
            nodes_1 = 2048
            nodes_2 = 512
            # input layer (H * W -> nodes_1)
            self.fc1 = nn.Linear(H * W, nodes_1)
            # hidden layer (nodes_1 -> nodes_2)
            self.fc2 = nn.Linear(nodes_1, nodes_2)
            # output layer (nodes_2 -> 10)
            self.fc3 = nn.Linear(nodes_2, 10)
            # dropout layers
            self.dropout2 = nn.Dropout(0.2)
            self.dropout3 = nn.Dropout(0.3)
            self.dropout4 = nn.Dropout(0.4)
            
        def forward(self, x):
            ## Define forward pass
            # x is input image tensor
            # flatten input
            x = x.view(-1, H * W)
            # first hidden layer
            x = F.relu(self.fc1(x))
            # first dropout layer
            x = self.dropout3(x)
            # second hidden layer
            x = F.relu(self.fc2(x))
            # second dropout layer
            x = self.dropout2(x)
            # output layer
            x = self.fc3(x)
            return x

    #-#-# Do NOT modify the code below this line. #-#-#

    # instantiate the CNN
    model_scratch = oNet()

    # move tensors to GPU if CUDA is available
    if use_cuda:
        model_scratch.cuda()

In [None]:
print("input:", W, H)

(w,h) = output_volume((W,H),(50,50),5,25) #1 
print("shrink1 out:",w, h,"\tshrink2 in:",w, h)

(w,h) = output_volume((w,h),(30,30),1,3) #1 
print("shrink2 out:",w, h,"\t1rst in:",w, h)

(w,h) = output_volume((w,h),(11,11),1,1) #1 
print("1rst out:",w, h,"\t2nd in:",w//2, h//2)

(w,h) = output_volume((w//2,h//2),(7,7),1,1) #2
print("2nd out:",w, h,"\t3rd in:",w//2, h//2)

(w,h) = output_volume((w//2,h//2),(5,5),1,1) #3
print("3rd out:", w, h,"\t4th in:",w//2, h//2)

(w,h) = output_volume((w//2,h//2),(3,3),1,1) 
print("4th out:", w, h,"\tfinal out:",w, h)
print("flattened:", w*h)

#@title original CNN architecture
class cifar_Net(nn.Module):
    def __init__(self):
        super(cifar_Net, self).__init__()

        # (160x120x3 image tensor) --> conv1 
        # W,H --> 160,120 - 11 + 2 +1 = 152,112        
        self.conv1 = nn.Conv2d(3, 20, 11, padding=1)
        #  --> pool /2 --> (76x56x20) --> conv2
        self.conv2 = nn.Conv2d(20, 30, 7, padding=1)
        # 76,56 - 7 + 2 + 1 = 72,52
        # --> pool /2 -->  (36x26x30) --> conv3
        self.conv3 = nn.Conv2d(30, 40, 5, padding=1)
        # 36,26 - 5 + 2 + 1 = 34,24
        # --> pool /2 --> 17,12 (17x12x40) --> conv4
        self.conv4 = nn.Conv2d(40, 50, 3, padding=1)
        # 17,12 - 3 + 2 + 1 = 17,12
        # (17x12x50)=10200 --> flatten

        # max pooling --> conv_layer/2
        self.pool = nn.MaxPool2d(2)

        # dense layers (17 * 12 * 50 -> 10200)
        self.dense1 = nn.Linear(10200, 2040)
        self.dense2 = nn.Linear(2040, 10)

        # dropout layers
        self.dropout1 = nn.Dropout(0.3)
        self.dropout2 = nn.Dropout(0.15)

    def forward(self, x):
        # add sequence of convolutional and max pooling layers
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = F.relu(self.conv4(x))
        # flatten image input
        x = x.view(-1, 10200)
        # add dropout layer
        x = self.dropout1(x)
        # 1st hidden layer, relu activation
        x = F.relu(self.dense1(x))
        # dropout layer
        x = self.dropout2(x)
        # 2nd hidden --> output
        x = self.dense2(x)
        return x

# create a complete CNN
cifar_model = cifar_Net()
print(cifar_model)

# move tensors to GPU if CUDA is available
if use_cuda:
    cifar_model.cuda()

In [None]:
print("input:", W, H)

(w,h) = output_volume((W,H),(50,50),5,25) #1 
print("shrink1 out:",w, h,"\tshrink2 in:",w, h)

(w,h) = output_volume((w,h),(30,30),1,3) #1 
print("shrink2 out:",w, h,"\t1rst in:",w, h)

(w,h) = output_volume((w,h),(11,11),1,1) #1 
print("1rst out:",w, h,"\t2nd in:",w//2, h//2)

(w,h) = output_volume((w//2,h//2),(7,7),1,1) #2
print("2nd out:",w, h,"\t3rd in:",w//2, h//2)

(w,h) = output_volume((w//2,h//2),(5,5),1,1) #3
print("3rd out:", w, h,"\t4th in:",w//2, h//2)

(w,h) = output_volume((w//2,h//2),(3,3),1,1) 
print("4th out:", w, h,"\tfinal out:",w, h)
print("flattened:", w*h)

#@title altered CNN architecture
class cNet(nn.Module):
    def __init__(self):
        super(cNet, self).__init__()

        self.shrink1 = nn.Conv2d(3, 3, 50, stride=5, padding=25)
        self.shrink2 = nn.Conv2d(3, 3, 30, stride=1, padding=3)

        # (138x98x3 image tensor) --> conv1 
        # --> 138,98 - 11 + 2 + 1 = 130,90        
        self.conv1 = nn.Conv2d(3, 20, 11, padding=1)
        #  --> pool /2 --> (65x45x20) --> conv2
        self.conv2 = nn.Conv2d(20, 30, 7, padding=1)
        # 65,45 - 7 + 2 + 1 = 61,41
        # --> pool /2 -->  (30x20x30) --> conv3
        self.conv3 = nn.Conv2d(30, 40, 5, padding=1)
        # 30,20 - 5 + 2 + 1 = 28,18
        # --> pool /2 --> (14x9x40) --> conv4
        self.conv4 = nn.Conv2d(40, 50, 3, padding=1)
        # 14,9 - 3 + 2 + 1 = 14,9
        # (14x9x50)=6300 --> flatten

        # max pooling --> conv_layer/2
        self.pool = nn.MaxPool2d(2)

        # dense layers (14 * 9 * 50 -> 6300)
        self.dense1 = nn.Linear(6300, 2100)
        self.dense2 = nn.Linear(2100, 10)

        # dropout layers
        self.dropout1 = nn.Dropout(0.2)
        self.dropout2 = nn.Dropout(0.1)

    def forward(self, x):
        # add sequence of convolutional and max pooling layers
        x = F.relu(self.shrink1(x))
        x = F.relu(self.shrink2(x))
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = F.relu(self.conv4(x))
        # flatten image input
        x = x.view(-1, 14*9*50)
        # add dropout layer
        x = self.dropout1(x)
        # 1st hidden layer, relu activation
        x = F.relu(self.dense1(x))
        # dropout layer
        x = self.dropout2(x)
        # 2nd hidden --> output
        x = self.dense2(x)
        return x

# create a complete CNN
cnet_model = cNet()
print(cnet_model)

# move tensors to GPU if CUDA is available
if use_cuda:
    cnet_model.cuda()

In [None]:
#@title Skynet

out = output_volume((H,W),(5,5),2,2)
print("first in {} \t first out {}".format((H,W), out))
inn = out
out = output_volume(inn,(3,3),2,1)
print("second in {} \t second out {}".format(inn,out))
inn = out
out = output_volume(inn,(5,5),1,2)
print("third in {} \t third out {}".format(inn,out))
inn = (out[0]//2,out[1]//2)
out = output_volume(inn,(3,3),1,1)
print("fourth in {} \t fourth out {}".format(inn,out))
inn = (out[0]//2,out[1]//2)
out = output_volume(inn,(3,3),1,1)
print("fifth in {} \t fifth out {}".format(inn,out))
#out0 = out[0]//2
#out1 = out[1]//2
#print("pool in {} \t pool out {}".format(out, (out0,out1) ))
print("\nflatout = {}".format(out[0] * out[1]))
flatout = out[0] * out[1]

class skyNet(nn.Module):
    ## TODO: choose an architecture, and complete the class
    def __init__(self):
        super(skyNet, self).__init__()
        ## Define layers of a CNN
        # number of nodes in each hidden layer

        # input layer ((H , W) -> con1)
        self.con1 = nn.Conv2d(3, 12, 5, stride=2, padding=2)
        self.con2 = nn.Conv2d(12, 24, 3, stride=2, padding=1)
        self.con3 = nn.Conv2d(24, 36, 5, stride=1, padding=2)
        self.con4 = nn.Conv2d(36, 36, 3, stride=1, padding=1)
        self.con5 = nn.Conv2d(36, 36, 3, stride=1, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.flat = nn.Flatten()
        self.dense = nn.Linear(flatout*36, 777)
        self.out = nn.Linear(777, 50)
        # dropout layers
        self.dropout1 = nn.Dropout(0.1)
        self.dropout2 = nn.Dropout(0.2)
        
    def forward(self, x):
        ## Define forward pass
        # convolutional layers
        x = F.relu(self.con1(x))
        x = F.relu(self.con2(x))
        x = self.pool(F.relu(self.con3(x)))
        x = self.pool(F.relu(self.con4(x)))
        x = F.relu(self.con5(x))
        # flatten input   
        #print (x.shape)
        x = self.flat(x)
        # first dropout layer
        x = self.dropout2(x)
        # first hidden layer
        x = F.relu(self.dense(x))
        # second dropout layer
        x = self.dropout1(x)
        # output layer
        x = self.out(x)     
        return x

#-#-# Do NOT modify the code below this line. #-#-#

# instantiate the CNN
skynet = skyNet()

# move tensors to GPU if CUDA is available
if use_cuda:
    skynet.cuda()

skynet

In [None]:
for param in skynet.parameters():
    print(param.size())

In [None]:
skynet.apply(custom_weight_init)
skyopt = torch.optim.SGD(skynet.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

skynet = train(12, loaders_scratch, skynet, skyopt, criterion, use_cuda, 'skynet.pt')

In [None]:
# Sample guess

# use one image from the most recently viewed batch (`images` from the cells above)
skynet.eval()
sample = images[7].unsqueeze(0)  # add a batch dimension: (1, C, H, W)
if use_cuda:
    sample = sample.cuda()
with torch.no_grad():
    ans = skynet(sample)
print(sample.shape, ans.shape)

fig = plt.figure(figsize=(16,8),clear=True)
axes = fig.subplots(nrows=1, ncols=1, sharex=True, sharey=True)
im = np.transpose(sample.squeeze().cpu().numpy(), (1,2,0))
axes.set_title(num2class[np.argmax(ans[0].cpu().numpy())])
axes.set_xticks([])
axes.set_yticks([])
axes.imshow(im)

__Question 2:__ Outline the steps you took to get to your final CNN architecture and your reasoning at each step.  

__Answer:__  

### * * * * *  **TRAIN** here!  * * * * *

### (IMPLEMENTATION) Implement the Training Algorithm

Implement your training algorithm in the code cell below.  [Save the final model parameters](http://pytorch.org/docs/master/notes/serialization.html) at the filepath stored in the variable `save_path`.
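
The template's TODO comment suggests tracking the average loss with an incremental update instead of summing and dividing at the end. The sketch below checks, on made-up per-batch losses, that this update reproduces the ordinary mean; the `running_mean` helper and the loss values are assumptions for illustration.

```python
def running_mean(values):
    """Incrementally average values without storing a running sum."""
    mean = 0.0
    for idx, v in enumerate(values):
        # same form as: train_loss = train_loss + (1 / (batch_idx + 1)) * (loss - train_loss)
        mean = mean + (1 / (idx + 1)) * (v - mean)
    return mean

batch_losses = [3.9, 3.7, 3.4, 3.1]  # made-up per-batch losses
print(running_mean(batch_losses), sum(batch_losses) / len(batch_losses))
```

Either approach works as long as the divisor is the number of batches seen so far, not the size of the last batch.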

In [None]:
#@ title train def
def train(n_epochs, loaders, model, optimizer, criterion, use_cuda, save_path):
    """returns trained model"""
    # initialize tracker for minimum validation loss
    valid_loss_min = np.inf
    #total_loss = 0.0
    #total_valid_loss = 0.0

    for epoch in range(1, n_epochs+1):
        # initialize variables to monitor training and validation loss
        train_loss = 0.0
        valid_loss = 0.0
        #if epoch == 1:
        #        print ("train_loss \t\tbatch loss \t\trunning avg")

        ###################
        # train the model #
        ###################
        # set the module to training mode
        model.train()
        for batch_idx, (data, target) in enumerate(loaders['train']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()

            ## TODO: find the loss and update the model parameters accordingly
            ## record the average training loss, using something like
            ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - train_loss))

            # initialize optimizer variables to zero
            optimizer.zero_grad()
            # forward pass
            output = model(data)
            # calculate loss
            loss = criterion(output, target)
            # backward pass
            loss.backward()
            # update parameters
            optimizer.step()
            # update training loss
            train_loss += loss.item()
            #if epoch==1:
            #    print (train_loss,"\t", loss.item(),"\t",
            #            train_loss/(batch_idx+1))
        
        # average over the number of batches, not the size of the last batch
        train_loss = train_loss / len(loaders['train'])
        #total_loss += train_loss/n_epochs

        ######################    
        # validate the model #
        ######################
        # set the model to evaluation mode
        model.eval()
        for batch_idx, (data, target) in enumerate(loaders['valid']):
            # move to GPU
            if use_cuda:
                data, target = data.cuda(), target.cuda()

            ## TODO: update average validation loss 

            # forward pass (no gradients are needed during validation)
            with torch.no_grad():
                output = model(data)
                # compute batch loss
                loss = criterion(output, target)
            # update validation loss 
            valid_loss += loss.item()
            #if epoch==1:
            #    print (valid_loss,"\t", loss.item(),"\t",
            #            valid_loss/(batch_idx+1))
        
        # average over the number of batches, not the size of the last batch
        valid_loss /= len(loaders['valid'])
        #total_valid_loss += valid_loss/n_epochs
        
        # print training/validation statistics 
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(
                                                                    epoch, 
                                                                    train_loss,
                                                                    valid_loss
                                                                    ))
        #print('\t\t\tTotal Loss: {:.6f} \tTotal Validation Loss: {:.6f}'.format( 
        #                                                            total_loss,
        #                                                            total_valid_loss))


        ## TODO: if the validation loss has decreased, save the model 
        ##       at the filepath stored in save_path
        
        if valid_loss < valid_loss_min:
            print('Loss decrease ({:.6f} --> {:.6f})'.format(valid_loss_min,
                                                                 valid_loss))
            torch.save(model.state_dict(), save_path)
            valid_loss_min = valid_loss
        
    return model

### (IMPLEMENTATION) Experiment with the Weight Initialization

Use the code cell below to define a custom weight initialization, and then train with your weight initialization for a few epochs. Make sure that neither the training loss nor validation loss is `nan`.

Later on, you will be able to see how this compares to training with PyTorch's default weight initialization.

In [None]:
def custom_weight_init(m):
    ## TODO: implement a weight initialization strategy
    for p in m.parameters():
        n = p.size(0)
        torch.nn.init.normal_(p, mean=0.0, std=1.0/(n**0.5))
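
A note on initializers used with `model.apply(...)`: `Module.apply(fn)` calls `fn` on every submodule and then on the module itself, so an initializer that loops over `m.parameters()` re-draws nested parameters more than once and treats biases like weights. One possible alternative, sketched below under the assumption that only `nn.Linear` and `nn.Conv2d` layers need initializing, scales a zero-mean normal by each layer's fan-in and zeroes the biases. The name `fan_in_normal_init` and the small `demo` network are hypothetical and not part of the project template.

```python
import math

import torch
import torch.nn as nn


def fan_in_normal_init(m):
    """Hypothetical initializer: only touch layers that own a weight tensor."""
    if isinstance(m, (nn.Linear, nn.Conv2d)):
        # fan-in = number of inputs feeding one output unit
        # (in_features for Linear, in_channels * kH * kW for Conv2d)
        fan_in = m.weight[0].numel()
        nn.init.normal_(m.weight, mean=0.0, std=1.0 / math.sqrt(fan_in))
        if m.bias is not None:
            nn.init.zeros_(m.bias)


# Small stand-in network, used only to show the call pattern
demo = nn.Sequential(nn.Conv2d(3, 4, 3), nn.Flatten(), nn.Linear(4 * 6 * 6, 5))
demo.apply(fan_in_normal_init)
print(demo(torch.randn(2, 3, 8, 8)).shape)
```

Because the function checks the layer type, container modules are skipped and each weight tensor is drawn exactly once.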

In [None]:
#-#-# Do NOT modify the code below this line. #-#-#
model_scratch.apply(custom_weight_init)
optimizer_scratch = get_optimizer_scratch(model_scratch)
model_scratch = train(10, loaders_scratch, model_scratch, optimizer_scratch,
                      criterion_scratch, use_cuda, 'ignore_scratch.pt')

### (IMPLEMENTATION) Train and Validate the Model

Run the next code cell to train your model.

### Scratch

In [None]:
input = torch.randn(33, 3, 600, 800)
m = nn.Sequential(
            nn.Conv2d(3, 3, 3, stride=3, padding=2),
            nn.Flatten(),
            nn.Linear(161604, 1000),
            nn.Linear(1000, 50)
)
output = m(input)
output.size()
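
The `161604` input size of the first `nn.Linear` above follows from the standard convolution output-size formula, `floor((size + 2*padding - kernel) / stride) + 1`, applied to each spatial axis. The sketch below reproduces that arithmetic in plain Python; the helper name `conv_output_size` is hypothetical and assumes a dilation of 1.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output length of a convolution along one axis (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Values taken from the cell above: 600x800 input, 3x3 kernel, stride 3, padding 2
out_h = conv_output_size(600, kernel=3, stride=3, padding=2)
out_w = conv_output_size(800, kernel=3, stride=3, padding=2)
flat_features = 3 * out_h * out_w  # the convolution has 3 output channels

print(out_h, out_w, flat_features)  # 201 268 161604
```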

In [None]:
out = output_volume((H,W),(3,3),3,2)
print("first in {} \t first out {}".format((H,W), out))

print("\nflattened {}".format(out[0] * out[1]))
flatout = out[0] * out[1]

#@title define the mnNet architecture
class mnNet(nn.Module):
    ## TODO: choose an architecture, and complete the class
    def __init__(self):
        super(mnNet, self).__init__()
        ## Define layers of a CNN
        # number of nodes in each hidden layer

        # input layer ((H , W) -> nodes_1)
        self.con1 = nn.Conv2d(3, 3, 3, stride=3, padding=2)
        self.flat = nn.Flatten()
        self.den1 = nn.Linear(flatout*3, 1000)
        self.out = nn.Linear(1000, 50)
        # dropout layers
        self.dropout1 = nn.Dropout(0.1)
        self.dropout2 = nn.Dropout(0.2)
        
    def forward(self, x):
        ## Define forward pass
        # convolutional layers
        x = F.relu(self.con1(x))
        # flatten input
        # x = x.view(-1, 53868)
        x = self.flat(x)
        # first dropout layer
        x = self.dropout2(x)
        # first hidden layer
        x = F.relu(self.den1(x))
        # second dropout layer
        x = self.dropout1(x)
        # output layer: return raw scores (logits); nn.CrossEntropyLoss applies log-softmax itself
        x = self.out(x)
        return x

#-#-# Do NOT modify the code below this line. #-#-#

# instantiate the CNN
mn = mnNet()

# move tensors to GPU if CUDA is available
if use_cuda:
    mn.cuda()

#mn(data)

In [None]:
mn.apply(custom_weight_init)
mnopt= get_optimizer_scratch(mn)
epochs = 1 
optimizer = mnopt
criterion = nn.CrossEntropyLoss()

valid_loss_min = np.inf  # np.Inf was removed in NumPy 2.0
    
train_loss = 0.0
valid_loss = 0.0
sum_loss = 0.0
print ("sum_loss \t\tbatch loss \t\trunning avg")

###################
# train the model #
###################
# set the module to training mode
mn.train()
for batch_idx, (data, target) in enumerate(trainloader):
    # move to GPU
    if use_cuda:
        data, target = data.cuda(), target.cuda()

    ## TODO: find the loss and update the model parameters accordingly
    ## record the average training loss, using something like
    ## train_loss = train_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - train_loss))

    # initialize optimizer variables to zero
    mnopt.zero_grad()
    # forward pass
    output = mn(data)
    #print("output shape",output.shape,"\ttarget shape",target.shape)
    # calculate loss
    loss = criterion(output, target)
    # backward pass
    loss.backward()
    # update parameters
    mnopt.step()
    # update training loss
    sum_loss += loss.item()
    #train_loss = train_loss + (loss.item() - train_loss)/ (batch_idx + 1))
    print (sum_loss,"\t", loss.item(),"\t",
           sum_loss/(batch_idx+1))

# average the summed batch losses over the number of batches
train_loss += sum_loss / (batch_idx + 1)
print ("train_loss",train_loss)
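
The running-average update quoted in the TODO hint and the "sum, then divide" approach give the same number, provided the sum is divided by the number of batches (not by the size of the last batch). A minimal check in plain Python, using made-up loss values purely for illustration:

```python
batch_losses = [2.0, 4.0, 3.0, 7.0]  # made-up values, not real training output

running = 0.0
total = 0.0
for batch_idx, loss_value in enumerate(batch_losses):
    # incremental mean, written the same way as in the TODO hint
    running = running + (1 / (batch_idx + 1)) * (loss_value - running)
    total += loss_value

mean_from_sum = total / len(batch_losses)
print(running, mean_from_sum)  # 4.0 4.0
```

Both quantities are per-batch means; if the last batch is smaller than the others, they differ slightly from a per-sample mean.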

In [None]:
mn = train(1, loaders_scratch, mn.apply(custom_weight_init), 
           mnopt, criterion_scratch, use_cuda, 'mn.pt')

In [None]:
######################    
# validate the model #
######################
# set the model to evaluation mode
mn.eval()
for batch_idx, (data, target) in enumerate(validloader):
    # move to GPU
    if use_cuda:
        data, target = data.cuda(), target.cuda()

    ## TODO: update average validation loss 

    # forward pass
    output = mn(data)
    # compute batch loss
    loss = criterion(output, target)
    # update validation loss 
    valid_loss += loss.item()
    
# calculate the average validation loss over the number of batches
valid_loss = valid_loss / (batch_idx + 1)

# print training/validation statistics 
print('Epoch: {} \tValidation Loss: {:.6f}'.format(0, valid_loss))

In [None]:
#weights = []
for param in mn.parameters():
    print(param.size())

### (IMPLEMENTATION) Test the Model

Run the code cell below to try out your model on the test dataset of landmark images; it calculates and prints the test loss and accuracy.  Ensure that your test accuracy is greater than 20%.

In [None]:
def test(loaders, model, criterion, use_cuda):

    # monitor test loss and accuracy
    test_loss = 0.
    correct = 0.
    total = 0.

    # set the module to evaluation mode
    model.eval()

    for batch_idx, (data, target) in enumerate(loaders['test']):
        # move to GPU
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = model(data)
        # calculate the loss
        loss = criterion(output, target)
        # update average test loss 
        test_loss = test_loss + ((1 / (batch_idx + 1)) * (loss.data.item() - test_loss))
        # convert output probabilities to predicted class
        pred = output.data.max(1, keepdim=True)[1]
        # compare predictions to true label
        correct += np.sum(np.squeeze(pred.eq(target.data.view_as(pred))).cpu().numpy())
        total += data.size(0)
            
    print('Test Loss: {:.6f}\n'.format(test_loss))

    print('\nTest Accuracy: %2d%% (%2d/%2d)' % (
        100. * correct / total, correct, total))

# load the model that got the best validation accuracy
#model_scratch.load_state_dict(torch.load('model_scratch.pt'))
#test(loaders_scratch, model_scratch, criterion_scratch, use_cuda)
test(loaders_scratch, mn, criterion_scratch, use_cuda)

---
<a id='step2'></a>
## Step 2: Create a CNN to Classify Landmarks (using Transfer Learning)

You will now use transfer learning to create a CNN that can identify landmarks from images.  Your CNN must attain at least 60% accuracy on the test set.

### (IMPLEMENTATION) Specify Data Loaders for the Landmark Dataset

Use the code cell below to create three separate [data loaders](http://pytorch.org/docs/stable/data.html#torch.utils.data.DataLoader): one for training data, one for validation data, and one for test data. Randomly split the images located at `landmark_images/train` to create the train and validation data loaders, and use the images located at `landmark_images/test` to create the test data loader.

All three of your data loaders should be accessible via a dictionary named `loaders_transfer`. Your train data loader should be at `loaders_transfer['train']`, your validation data loader should be at `loaders_transfer['valid']`, and your test data loader should be at `loaders_transfer['test']`.

If you like, **you are welcome to use the same data loaders from the previous step**, when you created a CNN from scratch.
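
For readers who prefer to build fresh loaders, the layout described above can be sketched roughly as follows. This is an illustration only: random tensors stand in for the image folders, and the 80/20 split, the batch size of 16 and the name `loaders_demo` are assumptions rather than project requirements.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Stand-in data: random tensors instead of the real landmark images
full_train = TensorDataset(torch.randn(100, 3, 8, 8), torch.randint(0, 50, (100,)))
test_set = TensorDataset(torch.randn(20, 3, 8, 8), torch.randint(0, 50, (20,)))

# Randomly split the training images into train and validation subsets
n_valid = int(0.2 * len(full_train))
train_set, valid_set = random_split(
    full_train,
    [len(full_train) - n_valid, n_valid],
    generator=torch.Generator().manual_seed(0),  # fixed seed for a repeatable split
)

loaders_demo = {
    'train': DataLoader(train_set, batch_size=16, shuffle=True),
    'valid': DataLoader(valid_set, batch_size=16),
    'test': DataLoader(test_set, batch_size=16),
}
print({name: len(loader.dataset) for name, loader in loaders_demo.items()})
```

With real images, the `TensorDataset` objects would typically be replaced by image datasets with suitable transforms; only the training loader needs shuffling.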

In [None]:
### TODO: Write data loaders for training, validation, and test sets
## Specify appropriate transforms, and batch_sizes

loaders_transfer = loaders_scratch #{'train': None, 'valid': None, 'test': None}


### (IMPLEMENTATION) Specify Loss Function and Optimizer

Use the next code cell to specify a [loss function](http://pytorch.org/docs/stable/nn.html#loss-functions) and [optimizer](http://pytorch.org/docs/stable/optim.html).  Save the chosen loss function as `criterion_transfer`, and fill in the function `get_optimizer_transfer` below.

In [None]:
## TODO: select loss function
criterion_transfer = criterion_scratch

def get_optimizer_transfer(model):
    ## TODO: select and return optimizer
    return get_optimizer_scratch(model)


### (IMPLEMENTATION) Model Architecture

Use transfer learning to create a CNN to classify images of landmarks.  Use the code cell below, and save your initialized model as the variable `model_transfer`.

In [None]:
## TODO: Specify model architecture

model_transfer = None




#-#-# Do NOT modify the code below this line. #-#-#

if use_cuda:
    model_transfer = model_transfer.cuda()

__Question 3:__ Outline the steps you took to get to your final CNN architecture and your reasoning at each step.  Describe why you think the architecture is suitable for the current problem.

__Answer:__  

### (IMPLEMENTATION) Train and Validate the Model

Train and validate your model in the code cell below.  [Save the final model parameters](http://pytorch.org/docs/master/notes/serialization.html) at filepath `'model_transfer.pt'`.

In [None]:
# TODO: train the model and save the best model parameters at filepath 'model_transfer.pt'



#-#-# Do NOT modify the code below this line. #-#-#

# load the model that got the best validation accuracy
model_transfer.load_state_dict(torch.load('model_transfer.pt'))

### (IMPLEMENTATION) Test the Model

Try out your model on the test dataset of landmark images. Use the code cell below to calculate and print the test loss and accuracy.  Ensure that your test accuracy is greater than 60%.

In [None]:
test(loaders_transfer, model_transfer, criterion_transfer, use_cuda)

---
<a id='step3'></a>
## Step 3: Write Your Landmark Prediction Algorithm

Great job creating your CNN models! Now that you have put in all the hard work of creating accurate classifiers, let's define some functions to make it easy for others to use your classifiers.

### (IMPLEMENTATION) Write Your Algorithm, Part 1

Implement the function `predict_landmarks`, which accepts a file path to an image and an integer k, and then predicts the **top k most likely landmarks**. You are **required** to use your transfer learned CNN from Step 2 to predict the landmarks.

An example of the expected behavior of `predict_landmarks`:
```
>>> predicted_landmarks = predict_landmarks('example_image.jpg', 3)
>>> print(predicted_landmarks)
['Golden Gate Bridge', 'Brooklyn Bridge', 'Sydney Harbour Bridge']
```
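
The selection step inside `predict_landmarks` can be sketched independently of the model: given one score per class, keep the `k` highest and turn the dataset folder names into readable labels. The helper `top_k_names` and the demo class list are hypothetical; only `09.Golden_Gate_Bridge` is taken from the sample path used in this notebook, and loading the image and running the CNN are left out.

```python
def top_k_names(scores, class_names, k):
    """Return readable names of the k highest-scoring classes, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # '09.Golden_Gate_Bridge' -> 'Golden Gate Bridge'
    return [class_names[i].split('.', 1)[-1].replace('_', ' ') for i in order[:k]]


# Made-up class list and scores, for illustration only
demo_classes = ['00.Demo_Tower', '01.Demo_Bridge', '09.Golden_Gate_Bridge']
demo_scores = [0.1, 0.7, 0.2]
print(top_k_names(demo_scores, demo_classes, 2))  # ['Demo Bridge', 'Golden Gate Bridge']
```

In the real function the scores would come from the transfer-learned model's output for the image, and the class names from the dataset's `classes` attribute.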

In [None]:
import cv2
from PIL import Image

## the class names can be accessed at the `classes` attribute
## of your dataset object (e.g., `train_dataset.classes`)

def predict_landmarks(img_path, k):
    ## TODO: return the names of the top k landmarks predicted by the transfer learned CNN
    


# test on a sample image
predict_landmarks('images/test/09.Golden_Gate_Bridge/190f3bae17c32c37.jpg', 5)

### (IMPLEMENTATION) Write Your Algorithm, Part 2

In the code cell below, implement the function `suggest_locations`, which accepts a file path to an image as input, and then displays the image and the **top 3 most likely landmarks** as predicted by `predict_landmarks`.

Some sample output for `suggest_locations` is provided below, but feel free to design your own user experience!
![](images/sample_landmark_output.png)

In [None]:
def suggest_locations(img_path):
    # get landmark predictions
    predicted_landmarks = predict_landmarks(img_path, 3)
    
    ## TODO: display image and display landmark predictions

    
    

# test on a sample image
suggest_locations('images/test/09.Golden_Gate_Bridge/190f3bae17c32c37.jpg')

### (IMPLEMENTATION) Test Your Algorithm

Test your algorithm by running the `suggest_locations` function on at least four images on your computer. Feel free to use any images you like.

__Question 4:__ Is the output better than you expected :) ?  Or worse :( ?  Provide at least three possible points of improvement for your algorithm.

__Answer:__ (Three possible points for improvement)

In [None]:
## TODO: Execute the `suggest_locations` function on
## at least 4 images on your computer.
## Feel free to use as many code cells as needed.



# Keras model

## load data and modules

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

import os
import random as rd
import numpy as np
import pandas as pd

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.losses import CategoricalCrossentropy as cat_entropy
from tensorflow.keras.losses import SparseCategoricalCrossentropy as sparse_cat_entropy

In [None]:
import zipfile
from google.colab import drive
drive.mount('/content/drive')

zipfile.ZipFile('/content/drive/MyDrive/Colab Notebooks/Udacity/deep_learning/landmark_images.zip').extractall()

In [None]:
try:
    train_shapes = pd.read_csv('train_shapes.csv',header=0, index_col=0)
    test_shapes = pd.read_csv('test_shapes.csv',header=0, index_col=0)
except FileNotFoundError:
    print ("Must upload train_shapes.csv and test_shapes.csv first.")

if False:
    
    #read train image file data:
    train_path = './landmark_images/train/'
    cat_path = [train_path + cat + '/' for cat in os.listdir(train_path)]
    cat_list = [cat for cat in os.listdir(train_path)]
    cats, train_images = [], []
    for c in cat_list:
        cp = str(train_path + c + '/') 
        for imfile in os.listdir(cp):
            train_images.append(str(cp + imfile))
            cats.append(c)

    train_image_shapes = []
    for imfile in train_images:
        train_image_shapes.append(plt.imread(imfile).shape)
    heights = [shp[0] for shp in train_image_shapes]    
    widths = [shp[1] for shp in train_image_shapes]

    train_images_df = pd.DataFrame(columns=['labnum','label',
                                            'height','width',
                                            'filepath'])
    train_images_df['labnum'] = [int(c[:2]) for c in cats]
    train_images_df['label'] = [c[3:] for c in cats]
    train_images_df['height'] = heights
    train_images_df['width'] = widths
    train_images_df['filepath'] = train_images

    #read test image file data:
    test_path = './landmark_images/test/'
    cat_path = [test_path + cat + '/' for cat in os.listdir(test_path)]
    cat_list = [cat for cat in os.listdir(test_path)]
    cats, test_images = [], []
    for c in cat_list:
        cp = str(test_path + c + '/') 
        for imfile in os.listdir(cp):
            test_images.append(str(cp + imfile))
            cats.append(c)

    test_image_shapes = []
    for imfile in test_images:
        test_image_shapes.append(plt.imread(imfile).shape)
    heights = [shp[0] for shp in test_image_shapes]    
    widths = [shp[1] for shp in test_image_shapes]

    test_images_df = pd.DataFrame(columns=['labnum','label',
                                            'height','width',
                                            'filepath'])
    test_images_df['labnum'] = [int(c[:2]) for c in cats]
    test_images_df['label'] = [c[3:] for c in cats]
    test_images_df['height'] = heights
    test_images_df['width'] = widths
    test_images_df['filepath'] = test_images

    #save dfs
    train_images_df.to_csv('train_shapes.csv')
    test_images_df.to_csv('test_shapes.csv')

    #cleanup
    del train_images, train_image_shapes, train_path
    del test_images, test_image_shapes, test_path
    del cats, cat_list, cat_path, widths, heights

In [None]:
## Only want exactly 600x800 images
testable_df = test_shapes.loc[lambda df: df['width'] == 800].loc[lambda df: df['height'] == 600]
trainable_df = train_shapes.loc[lambda df: df['width'] == 800].loc[lambda df: df['height'] == 600]

## Make class dict
classvals = list(set([(str(pair[0]),pair[1]) for pair in test_shapes[['labnum','label']].values]))
num2class = {int(x):y for x,y in classvals}

In [None]:
f = plt.figure(num=1, figsize=(20,20))
for i in range(3):
    randix = rd.choice(list(testable_df.index))
    # randix is an index label drawn from testable_df, so look it up with .loc
    img = plt.imread(test_shapes.loc[randix]['filepath'])/255.
    tar = test_shapes.loc[randix]['labnum']

    ax = f.add_subplot(1,3,i+1)
    ax.set_title("Class #"+str(tar)+" "+num2class[tar])
    ax.imshow(img)

## experimental models

In [None]:
#@title Experimental Keras Model

input_layer = layers.Input(shape=(600,800,3))
z = layers.Conv2D(10, 3, 
                  strides=2,
                  activation='relu',
                  #kernel_initializer='uniform',
                  use_bias=True,
                  padding='same')(input_layer)

y = layers.Conv2D(10, 3, 
                  strides=2,
                  activation='relu',
                  #kernel_initializer='normal',
                  use_bias=True,
                  padding='same')(z)

x = layers.Conv2D(20, 7, 
                  strides=1,
                  activation='relu',
                  use_bias=True,
                  padding='valid')(y)
x = layers.MaxPooling2D(2)(x)

x = layers.Conv2D(30, 5, 
                  strides=1,
                  activation='relu',
                  use_bias=True,
                  padding='valid')(x)
x = layers.MaxPooling2D(2)(x)

x = layers.Conv2D(40, 3, 
                  strides=1,
                  activation='relu',
                  use_bias=True,
                  padding='valid')(x)
x = layers.MaxPooling2D(2)(x)

x = layers.Conv2D(60, 3, 
                  strides=1,
                  activation='relu',
                  use_bias=True,
                  padding='valid')(x)
x = layers.MaxPooling2D(2)(x)

x = layers.Flatten()(x)
x = layers.Dropout(0.2)(x)
x = layers.Dense(1000, activation='relu')(x)
x = layers.Dropout(0.2)(x)
output_layer = layers.Dense(10, activation='softmax')(x)

shrinkX = Model(input_layer, y)
modelX = Model(input_layer, output_layer)
modelX.summary()

In [None]:
def scale(a):
    return (a-a.min())/(a.max()-a.min())

In [None]:
f = plt.figure(num=2,figsize=(20,5))
sp1 = f.add_subplot(1,4,1)
sp2 = f.add_subplot(1,4,2)
sp3 = f.add_subplot(1,4,3)
sp4 = f.add_subplot(1,4,4)

randix = rd.choice(list(trainable_df.index))
# randix is an index label drawn from trainable_df, so look it up with .loc
img = plt.imread(train_shapes.loc[randix]['filepath'])/255.
tar = train_shapes.loc[randix]['labnum']

out_img = shrinkX(np.expand_dims(img,0))
out1 = out_img.numpy()[0,:,:,0:3]
out2 = out_img.numpy()[0,:,:,5:8]
out3 = out_img.numpy()[0,:,:,7:10]


sp1.set_title("Class #"+str(tar)+" "+num2class[tar])
sp1.imshow(scale(img))
sp2.set_title("Model Output One")
sp2.imshow(scale(out1))
sp3.set_title("Model Output Two")
sp3.imshow(scale(out2))
sp4.set_title("Model Output Three")
sp4.imshow(scale(out3))

In [None]:
print("min\t max\t avg\t std")
for i in range(10):  
    print("{:.4f}\t {:.4f}\t {:.4f}\t {:.4f}".format(   out_img.numpy()[0,:,:,i].min(), 
                                                        out_img.numpy()[0,:,:,i].max(),
                                                        out_img.numpy()[0,:,:,i].mean(),
                                                        out_img.numpy()[0,:,:,i].std()) )

#### build and train functions

In [None]:
#traintupleDG, testupleDG = tf.keras.datasets.mnist.load_data()

In [None]:
## Initiate hyperparameters
epochs = 5
learning_rate = 0.01
batch_size = 40
validation_batch_size = 20

## Construct datasets
x_train = np.array([plt.imread(fp) for fp in trainable_df["filepath"]])/255.
y_train = trainable_df[["labnum"]].values.squeeze().astype("float32")  # float() cannot convert a whole array

x_test = np.array([plt.imread(fp) for fp in testable_df["filepath"]])/255.
y_test = testable_df[["labnum"]].values.squeeze().astype("float32")  # float() cannot convert a whole array

traintuple = (x_train[:4], y_train[:4])
testtuple = (x_test[:4], y_test[:4])

In [None]:
del x_test
del y_test
del x_train
del y_train
del traintuple
del testtuple

In [None]:
#@title Define functions for building, training, compiling and evaluating models

def build_model(verbose=False):
    '''Builds a model and returns it uncompiled

    '''
    input_layer = layers.Input(shape=(600,800,3))
    z = layers.Conv2D(10, 3, 
                    strides=2,
                    activation='relu',
                    #kernel_initializer='uniform',
                    use_bias=True,
                    padding='same')(input_layer)

    y = layers.Conv2D(10, 3, 
                    strides=2,
                    activation='relu',
                    #kernel_initializer='normal',
                    use_bias=True,
                    padding='same')(z)

    x = layers.Conv2D(20, 7, 
                    strides=1,
                    activation='relu',
                    use_bias=True,
                    padding='valid')(y)
    x = layers.MaxPooling2D(2)(x)

    x = layers.Conv2D(30, 5, 
                    strides=1,
                    activation='relu',
                    use_bias=True,
                    padding='valid')(x)
    x = layers.MaxPooling2D(2)(x)

    x = layers.Conv2D(40, 3, 
                    strides=1,
                    activation='relu',
                    use_bias=True,
                    padding='valid')(x)
    x = layers.MaxPooling2D(2)(x)

    x = layers.Conv2D(60, 3, 
                    strides=1,
                    activation='relu',
                    use_bias=True,
                    padding='valid')(x)
    x = layers.MaxPooling2D(2)(x)

    x = layers.Flatten()(x)
    x = layers.Dropout(0.2)(x)
    x = layers.Dense(1000, activation='relu')(x)
    x = layers.Dropout(0.2)(x)
    output_layer = layers.Dense(50, activation='softmax')(x)  # one output per landmark class

    return Model(input_layer, output_layer)

## Same training loop for all models
def train(model, traintuple, valtuple, epochs=epochs):
    '''Train a model on the given sets of data
        Params: the given model,
                the train data as a tuple of x,y,
                the test data as a tuple of x,y
        Returns: a dictionary of metric values after each epoch of training
    '''
    (x_train, y_train) = traintuple    
    history = model.fit(x=x_train,
                        y=y_train,
                        batch_size=batch_size,
                        validation_data=valtuple,
                        validation_batch_size=validation_batch_size,
                        epochs=epochs,  
                        verbose=1)   
    return history.history
print("Loaded function train(model, traintuple, valtuple, epochs=epochs)")

## All models are compiled the same
def compile_model(model):    
    model.compile(  loss=sparse_cat_entropy(),  # pass a loss instance, not the class
                    optimizer=SGD(learning_rate=learning_rate),                    
                    metrics=['acc'])
    print ("Compiled model", model.name)
print("loaded function compile_model(model)")

In [None]:
def build_mini(verbose=False):
    '''Builds a minimal experimental model and returns it uncompiled

    '''
    input_layer = layers.Input(shape=(600,800,3))
    x = layers.Conv2D(10, 3, 
                    strides=2,
                    activation='relu',
                    use_bias=True,
                    padding='same')(input_layer)

    x = layers.MaxPooling2D(3)(x)
    x = layers.Flatten()(x)
    x = layers.Dropout(0.1)(x)
    output_layer = layers.Dense(50, activation='softmax')(x)

    return Model(input_layer, output_layer, name='minimodel')

In [None]:
def qtrain(model, traintuple, valtuple, epochs=epochs):
    '''Train a model on the given sets of data
        Params: the given model,
                the train data as a tuple of x,y,
                the test data as a tuple of x,y
        Returns: a dictionary of metric values after each epoch of training
    '''
    (x_train, y_train) = traintuple    
    history = model.fit(x=x_train,
                        y=y_train,
                        batch_size=batch_size,
                        validation_data=valtuple,
                        validation_batch_size=validation_batch_size,
                        epochs=epochs,  
                        verbose=1)   
    return history.history

def qcompile_model(model):    
    model.compile(  loss=sparse_cat_entropy(),  # pass a loss instance, not the class
                    optimizer=SGD(learning_rate=learning_rate))#,                    
                    #metrics=['acc'])
    print ("Compiled model", model.name)

In [None]:
epochs = 1
learning_rate = 0.01
batch_size = 2
validation_batch_size = 2

In [None]:
### Build
mm = build_mini()

### Compile
qcompile_model(mm)


In [None]:
######### Print out summary table
mm.summary()  # summary() prints the table itself and returns None
print()

######## Plot model diagrams
tf.keras.utils.plot_model(mm, 
                          show_layer_names=True, 
                          show_shapes=True, 
                          to_file="mm.png")

In [None]:
### Train
train_stats = qtrain(mm, traintuple, testtuple, epochs=epochs)

In [None]:
randix = rd.choice(list(trainable_df.index))
# randix is an index label drawn from trainable_df, so look it up with .loc
img = plt.imread(train_shapes.loc[randix]['filepath'])/255.
tar = train_shapes.loc[randix]['labnum']

out = np.argmax(mm(np.expand_dims(img,0)).numpy().squeeze())
print("target",tar,"\toutput", out)

In [None]:
traintuple[1].dtype

# Pretrained model

---
## Define the Model

To define a model for training we'll follow these steps:
1. Load in a pre-trained VGG16 model
2. "Freeze" all the parameters, so the net acts as a fixed feature extractor 
3. Remove the last layer
4. Replace the last layer with a linear classifier of our own

**Freezing simply means that the parameters in the pre-trained model will *not* change during training.**
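
The four steps can be sketched on a tiny stand-in network, so the example runs without downloading pre-trained weights. `TinyNet` is hypothetical; it only mimics the `features` / `classifier` attribute layout of VGG16, and the layer sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in with a VGG-like `features` / `classifier` layout."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Sequential(
            nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1000))

    def forward(self, x):
        return self.classifier(torch.flatten(self.features(x), 1))


net = TinyNet()                            # step 1 (no pre-trained weights in this sketch)
for param in net.features.parameters():    # step 2: freeze the feature extractor
    param.requires_grad = False
n_inputs = net.classifier[2].in_features   # step 3: inspect the layer to be replaced
net.classifier[2] = nn.Linear(n_inputs, 50)  # step 4: new head with 50 outputs

print(net(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 50])
```

The replacement layer is created after the freeze loop, so its parameters keep the default `requires_grad=True`.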

In [None]:
# Load the pretrained model from pytorch
vgg16 = models.vgg16(pretrained=True)

# print out the model structure
print(vgg16)



```
(classifier): Sequential(
    (0): Linear(in_features=25088, out_features=4096, bias=True)
    (1): ReLU(inplace=True)
    (2): Dropout(p=0.5, inplace=False)
    (3): Linear(in_features=4096, out_features=4096, bias=True)
    (4): ReLU(inplace=True)
    (5): Dropout(p=0.5, inplace=False)
    (6): Linear(in_features=4096, out_features=1000, bias=True)
  )
```



In [None]:
# Freeze training for all "features" layers
for param in vgg16.features.parameters():
    param.requires_grad = False
    print(param.size(), param.requires_grad) 
    
for clsfr in vgg16.classifier[:2]:
    for prmtr in clsfr.parameters():
        #prmtr.requires_grad = False
        print(prmtr.size(), prmtr.requires_grad) 

---
### Final Classifier Layer

Once you have the pre-trained feature extractor, you just need to modify and/or add to the final, fully-connected classifier layers. In this case, we suggest that you replace the last layer in the VGG16 classifier group of layers. 
> This layer should see as input the number of features produced by the portion of the network that you are not changing, and produce an appropriate number of outputs for the landmark classification task.

You can access any layer in a pretrained network by name and (sometimes) number, e.g. `vgg16.classifier[6]` is the layer at index 6 (the seventh and last layer) in a group of layers named "classifier".

In [None]:
n_inputs = vgg16.classifier[6].in_features

# new layers automatically have requires_grad = True
fork_layer = nn.Linear(n_inputs, len(classes))
#last_layer = nn.Linear(50*len(classes), len(classes))

vgg16.classifier[6] = fork_layer
#vgg16.classifier[7] = last_layer

# if GPU is available, move the model to GPU
if use_cuda:
    vgg16.cuda()

# check to see that your last layer produces the expected number of outputs
print(vgg16)

In [None]:
# Check Freeze training for all layers
for param in vgg16.features.parameters():
    print(param.size(), param.requires_grad) 
for param in vgg16.classifier.parameters():
    print(param.size(), param.requires_grad) 

### Specify [Loss Function](http://pytorch.org/docs/stable/nn.html#loss-functions) and [Optimizer](http://pytorch.org/docs/stable/optim.html)

Below we'll use cross-entropy loss and stochastic gradient descent with a small learning rate. Note that the optimizer accepts as input _only_ the trainable parameters `vgg16.classifier.parameters()`, so the convolutional `features` layers are never updated by it.
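
To illustrate what handing the optimizer only trainable parameters means in practice, here is a minimal sketch on a small stand-in network (not VGG16). The two-layer `net`, the random batch and the variable names are assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Stand-in network: net[0] plays the frozen part, net[2] the trainable head
net = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))
for p in net[0].parameters():
    p.requires_grad = False

# Give the optimizer only the parameters that still require gradients
trainable = [p for p in net.parameters() if p.requires_grad]
opt = optim.SGD(trainable, lr=0.01)

frozen_before = net[0].weight.clone()
head_bias_before = net[2].bias.clone()

loss = nn.CrossEntropyLoss()(net(torch.randn(8, 4)), torch.randint(0, 2, (8,)))
opt.zero_grad()
loss.backward()
opt.step()

print(torch.equal(net[0].weight, frozen_before))   # frozen layer is unchanged
print(torch.equal(net[2].bias, head_bias_before))  # head has been updated
```

Frozen parameters receive no gradient at all (`.grad` stays `None`), which also saves the backward computation for those layers.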

In [None]:
#import torch.optim as optim

# specify loss function (categorical cross-entropy)
criterion = nn.CrossEntropyLoss()

# specify optimizer (stochastic gradient descent) and learning rate = 0.01
optimizer = optim.SGD(vgg16.classifier.parameters(), lr=0.01)

---
## Training

Here, we'll train the network.

> **Exercise:** So far we've been providing the training code for you. Here, I'm going to give you a bit more of a challenge and have you write the code to train the network. Of course, you'll be able to see my solution if you need help.

In [None]:
optimizer = optim.SGD(vgg16.classifier.parameters(), lr=0.01)

In [None]:
# number of epochs to train the model
n_epochs = 2

for epoch in range(1, n_epochs+1):

    # keep track of training and validation loss
    train_loss = 0.0
    
    ###################
    # train the model #
    ###################
    # model by default is set to train
    for batch_i, (data, target) in enumerate(trainloader):
        # move tensors to GPU if CUDA is available
        if use_cuda:
            data, target = data.cuda(), target.cuda()
        # clear the gradients of all optimized variables
        optimizer.zero_grad()
        # forward pass: compute predicted outputs by passing inputs to the model
        output = vgg16(data)
        # calculate the batch loss
        loss = criterion(output, target)
        # backward pass: compute gradient of the loss with respect to model parameters
        loss.backward()
        # perform a single optimization step (parameter update)
        optimizer.step()
        # update training loss 
        train_loss += loss.item()
        
        if batch_i % 20 == 19:    # print training loss every specified number of mini-batches
            print('Epoch %d, Batch %d loss: %.16f' %
                  (epoch, batch_i + 1, train_loss / 20))
            train_loss = 0.0

---
## Testing

Below you see the test accuracy for each landmark class.

In [None]:
testloader = torch.utils.data.DataLoader(test_data, 
                                         batch_size=30,
                                         sampler=test_sampler)

In [None]:
# track test loss 
# over 50 landmark classes
test_loss = 0.0
class_correct = list(0. for i in range(50))
class_total = list(0. for i in range(50))

vgg16.eval() # eval mode

# iterate over test data
for data, target in testloader:
    # move tensors to GPU if CUDA is available
    if use_cuda:
        data, target = data.cuda(), target.cuda()
    # forward pass: compute predicted outputs by passing inputs to the model
    output = vgg16(data)
    # calculate the batch loss
    loss = criterion(output, target)
    # update  test loss 
    test_loss += loss.item()*data.size(0)
    # convert output probabilities to predicted class
    _, pred = torch.max(output, 1)    
    # compare predictions to true label
    correct_tensor = pred.eq(target.data.view_as(pred))
    correct = np.squeeze(correct_tensor.numpy()) if not use_cuda else np.squeeze(correct_tensor.cpu().numpy())
    # calculate test accuracy for each object class
    for i in range(data.size(0)):
        label = target.data[i]
        class_correct[label] += correct[i].item()
        class_total[label] += 1

# calculate avg test loss over the samples actually drawn by the sampler
test_loss = test_loss/np.sum(class_total)
print('Test Loss: {:.6f}\n'.format(test_loss))

for i in range(50):
    if class_total[i] > 0:
        print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (
            num2class[i], 100 * class_correct[i] / class_total[i],
            np.sum(class_correct[i]), np.sum(class_total[i])))
    else:
        print('Test Accuracy of %5s: N/A (no test examples)' % (num2class[i]))

print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (
    100. * np.sum(class_correct) / np.sum(class_total),
    np.sum(class_correct), np.sum(class_total)))

### Visualize Sample Test Results

In [None]:
# obtain one batch of test images

try:
    images, labels = next(dataiter)  # iterators no longer expose a .next() method
except NameError:
    dataiter = iter(testloader)
    images, labels = next(dataiter)
images.numpy()

# move model inputs to cuda, if GPU available
if use_cuda:
    images = images.cuda()

# get sample outputs
output = vgg16(images)
# convert output probabilities to predicted class
_, preds_tensor = torch.max( output, 1)
preds = np.squeeze(preds_tensor.numpy()) if not use_cuda else np.squeeze(preds_tensor.cpu().numpy())

# plot the images in the batch, along with predicted and true labels
fig = plt.figure(figsize=(21, 7.5))
for idx in np.arange(8):
    ax = fig.add_subplot(2, 8//2, idx+1, xticks=[], yticks=[])  # grid sizes must be integers
    plt.imshow(np.transpose(images.cpu()[idx], (1, 2, 0)))
    ax.set_title("{}\n({})".format(num2class[preds[idx]], num2class[labels[idx].item()]),
                                  color=("green" if preds[idx]==labels[idx].item() else "red"))

In [None]:
preds[idx], labels[idx].item()

In [None]:
num2class[preds[idx]], labels[idx].item()
