# CNN

### Test for [CUDA](http://pytorch.org/docs/stable/cuda.html)

Since these are larger (224x224x3) images, training on a GPU can substantially reduce training time. CUDA is NVIDIA's parallel computing platform, and CUDA tensors behave like ordinary tensors, except that their computation runs on the GPU.


In [111]:
import torch
import numpy as np

# check if CUDA is available
train_on_gpu = torch.cuda.is_available()

if not train_on_gpu:
    print('CUDA is not available.  Training on CPU ...')
else:
    print('CUDA is available!  Training on GPU ...')

CUDA is available!  Training on GPU ...
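
To illustrate the claim that CUDA tensors behave like ordinary tensors, here is a minimal sketch that is not part of the original notebook. The `device` variable and the toy tensor are assumptions made for illustration; the same code falls back to the CPU when no GPU is present.

```python
import torch

# pick the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# create a small tensor directly on the chosen device
x = torch.ones(2, 3, device=device)

# arithmetic is written the same way wherever the tensor lives
y = x * 2

print(y.device.type)   # 'cuda' or 'cpu', depending on the machine
print(y.sum().item())  # 12.0 (six elements, each equal to 2)
```

Writing code against a `device` object rather than calling `.cuda()` directly is one way to keep a notebook runnable on machines without a GPU.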


# The Road Ahead

The notebook is broken into separate steps. Use the links below to navigate.

* [Step 0](#step0): Import Datasets
* [Step 1](#step1): Detect Birds
* [Step 2](#step2): Create a CNN to Classify Bird Breeds (from Scratch)
* [Step 3](#step3): Create a CNN to Classify Bird Breeds (using Transfer Learning)
* [Step 4](#step4): Write Algorithm
* [Step 5](#step5): Test Algorithm

# Step 0: Import Datasets

Download the bird dataset from Kaggle. Unzip the folder and place it in the project's home directory, at the location dataset/.

In [112]:
import numpy as np
from glob import glob

# load filenames for bird images
bird_files = np.array(glob("dataset/*/*/*"))

# print number of images in each dataset
print('There are %d total bird images.' % len(bird_files))

There are 33050 total bird images.
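
The glob pattern above implies a dataset/<split>/<class>/<file> layout, so the total can be broken down per split. The sketch below is an illustration only: the `count_by_split` helper and the example paths are hypothetical, and the layout is an assumption inferred from the pattern.

```python
from collections import Counter
from pathlib import PurePath

def count_by_split(paths):
    """Count files per split, assuming paths like dataset/<split>/<class>/<file>."""
    # the split name is the third path component from the end
    return Counter(PurePath(p).parts[-3] for p in paths)

# hypothetical paths for illustration; in the notebook, pass bird_files instead
example_paths = [
    "dataset/train/ALBATROSS/001.jpg",
    "dataset/train/ALBATROSS/002.jpg",
    "dataset/test/ALBATROSS/1.jpg",
]

for split, count in sorted(count_by_split(example_paths).items()):
    print('%s: %d images' % (split, count))
```

A per-split breakdown is a quick sanity check that the archive was unzipped into the expected folders.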


# Step 1: Detect Birds

In this section, we use a pre-trained model (VGG16) to detect birds in images.

In [161]:
import torch
import torchvision.models as models

# define VGG16 model
VGG16 = models.vgg16(pretrained=True)


# check if CUDA is available
use_cuda = torch.cuda.is_available()

# move model to GPU if CUDA is available
if use_cuda:
    VGG16 = VGG16.cuda()

# Making Predictions with a Pre-trained Model

The function below accepts a path to an image (such as 'dataset/train/AFRICAN FIREFINCH/001.jpg') as input and returns the index of the ImageNet class predicted by the pre-trained VGG-16 model. The output should always be an integer between 0 and 999, inclusive.

In [194]:
from PIL import Image
import torch
import torch.nn as nn
from torchvision import datasets, transforms as T

# Set PIL to be tolerant of image files that are truncated.
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

def VGG16_predict(img_path):
    # switch to inference mode (disables dropout)
    VGG16.eval()

    # read the image file; convert to RGB so the tensor always has 3 channels
    image = Image.open(img_path).convert('RGB')

    # preprocessing expected by the ImageNet-trained VGG16
    preprocess = T.Compose([
        T.Resize(256),
        T.CenterCrop(224),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    input_tensor = preprocess(image)
    input_batch = input_tensor.unsqueeze(0)  # add a batch dimension

    # move the input and model to GPU for speed if available
    if torch.cuda.is_available():
        input_batch = input_batch.to('cuda')
        VGG16.to('cuda')

    # run the model without tracking gradients
    with torch.no_grad():
        output = VGG16(input_batch)

    # find the arg max of the output tensor and return it as a Python int
    index = output.argmax(dim=1).item()

    return index # predicted class index

In [202]:
VGG16_predict('dataset/train/ALBATROSS/001.jpg')

146
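
A predicted index such as the one above can be turned into a yes/no bird check. The sketch below is a hypothetical helper, not the notebook's own detector: the name `is_bird_index` and the index ranges are assumptions based on the standard ImageNet-1k class ordering, and they should be verified against the class list before use.

```python
# Assumed bird index ranges in the standard ImageNet-1k ordering (inclusive);
# check them against the class list before relying on them.
BIRD_INDEX_RANGES = [(7, 24), (80, 100), (127, 146)]

def is_bird_index(index):
    """Return True if an ImageNet class index falls in an assumed bird range."""
    return any(low <= index <= high for low, high in BIRD_INDEX_RANGES)

print(is_bird_index(146))  # True: 146 is inside the (127, 146) range
print(is_bird_index(281))  # False: 281 is outside every range above
```

In the notebook, a detector could then be written as `is_bird_index(VGG16_predict(img_path))`, keeping the model call and the range check separate so the ranges are easy to adjust.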