Name : Lany Malis
<br>
Class: ITE-A
<br>
Project: Animal Recognition

# Animal Recognition Using Convolutional Neural Network

In this part, we will take our best training model to make prediction in 30 images in purpose to find the number of error in between 30 images.

## 1. Library

- **os** : provide a way of using operating system depent functionality such as reading and writing to file system. 
- **numpy** : support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
- **torch** : is opensource machine learning framework.
- **glob** : is a module which provide ability to search for the file in the directory.
- **torch.nn** : is module which provice the various neural network layer and loss function.
- **transforms** : is a module which is able to do transformation on the images such as resizing, flipping, cropping, normalizing.
- **DataLoader** : provide the ability to load and iterate through all the data including shuffling and batching.
- **torchvision** : provide access to the model architecture such as CNNs architechture.
- **torch.optim** : provide various optimiser to update the parameter while training.
- **pathlib** : provide class which's accessible to working with file paths.

In [29]:
import torch 
import torch.nn as nn
from torchvision.transforms import transforms
import numpy as np
from torch.autograd import Variable
from torchvision.models import squeezenet1_1
import torch.functional as F
from io import open
import os
from PIL import Image
import pathlib
import glob

## 2. Dataloading 

There are 2 steps for loading dataset to implement the algorithms such as:
  1. **Determine the Path** : Determine the location of the Dataset in Device and it's acessible.
<br>
  2. **Divide Dataset Categories** : First, we create the variable of pathlib to access the file. Second, we split the image of each category by slash '/' and retrieve all images in each folder. Last, we sort the name of each category by alphabet.  
 

In [9]:
#Determine Path of training_set and prediction_set
train_path = r'C:/Users/User/Desktop/Recognition/CatDog/training_set'
prediction_path = r'C:\Users\User\Desktop\Recognition\Prediction\Prediction'

In [10]:
#Divied Dataset Category
root = pathlib.Path(train_path)
classes = sorted([j.name.split('/')[-1] for j in root.iterdir()])

## 3. Convolutional Neural Network Model 

Creating ConvNet subclass of nn.Module class is essential for establishing the neural network architecture.

In below block, There are only 2 Methods ConvNet class:
<br>
1. **__init__** : is a default constructor method which's created when instance of ConvNet is called. In this ConvNet Architecture, There are 5 layers of Convolutional Neural Networks such as:
<br>
  - *First Convolutional Layers* : is responsible for extracting the most basic feature, reduce dimensonality of image and create feature map by  taking 3 colors channel image, using 12 filters with size of 3*3 pixels filter, using padding 1 to make the output size is as same as input size, and using stride 1 (same stride) to move by 1 pixels. Also, BatchNorm2d calculates the mean and variance of the activations within each channel across the mini-batch in purpose to capture different features in detail. As well as, we use Relu to get the result output in tensor format which the negative value will set to 0.
<br>
  - *Pooling Layer* : is used for reduce the image size, computational cost and enhances features. Also, we reduce the image size by factor by 2. 
<br>
  - *Second Convolutional Layer* 
  - *Third Convolutional Layer* 
<br>
  - *Fully Connected Layer* : is used to classifies the output of each category and associate features to a particular label. 
<br>
2. **Forward Function** : is used for specify the input data flow through the network layers to get the final output.


In [11]:
#CNN Network
class ConvNet(nn.Module):
    def __init__(self, num_classes=2):
        super(ConvNet, self).__init__()
        
        #Output size after convolution filter
        #((w-f+2P)/s)+1
        
        #Input shape= (256,3, 100, 100)
        
        self.conv1=nn.Conv2d(in_channels=3, out_channels=12, kernel_size=3, stride=1, padding=1)
        #Shape=(256, 12, 100, 100)
        self.bn1=nn.BatchNorm2d(num_features=12)
        #Shape=(256, 12, 100, 100)
        self.relu1=nn.ReLU()
        #Shape=(256, 12, 100, 100)
        
        self.pool=nn.MaxPool2d(kernel_size=2)
        #Reduce the image size be factor 2
        #Shape=(256, 12, 50, 50)
        
        self.conv2=nn.Conv2d(in_channels=12, out_channels=20, kernel_size=3, stride=1, padding=1)
        #Shape=(256, 20, 50, 50)
        self.relu2=nn.ReLU()
        #Shape=(256, 20, 50, 50)
        
        self.conv3=nn.Conv2d(in_channels=20, out_channels=32, kernel_size=3, stride=1, padding=1)
        #Shape=(256, 32, 50, 50)
        self.bn3=nn.BatchNorm2d(num_features=32)
        #Shape=(256, 32, 50, 50)
        self.relu3=nn.ReLU()
        #Shape=(256, 32, 50, 50)
        
        self.fc=nn.Linear(in_features=32*50*50, out_features=num_classes)
        
    #Feed forward function
    def forward(self, input):
        output=self.conv1(input)
        output=self.bn1(output)
        output=self.relu1(output)

        output=self.pool(output)

        output=self.conv2(output)
        ouput=self.relu2(output)

        output=self.conv3(output)
        output=self.bn3(output)
        output=self.relu3(output)

        #Above output will be in matrix form, with shape(256, 32, 50, 50)
        output=output.view(-1, 32*50*50)

        output=self.fc(output)

        return output

## 4. Loading Saving Best Model 

In [12]:
#Loading and evaluate the best_checkpoint.model
checkpoint = torch.load('best_checkpoint.model')
model=ConvNet(num_classes = 2)
#loading model 
model.load_state_dict(checkpoint)
model.eval()

ConvNet(
  (conv1): Conv2d(3, 12, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (bn1): BatchNorm2d(12, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu1): ReLU()
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(12, 20, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU()
  (conv3): Conv2d(20, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (bn3): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu3): ReLU()
  (fc): Linear(in_features=80000, out_features=2, bias=True)
)

## 5. Image Transformation

Transforming images is to defines the input images's standard which will be fed to build the nueral network model for training. Image Transformation can improve the performance, robustness, and memory efficiency of the model. Same as Training part, we use some transformer function such as Resize, ToTensor, and Normalize.

In [13]:
#Transforms
transformer = transforms.Compose([
    transforms.Resize((100, 100)), 
    transforms.ToTensor(), #0-255 to 0-1, numpy to tensors form
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]) # 0-1 to [-1, 1], result = x-mean/standard deviation
])

## 6. Prediction Function 

In [24]:
#prediction function
def prediction(img_path, transformer):
    
    #Get image from prediction
    image=Image.open(img_path)
    
    #Transform image using Transform method
    image_tensor=transformer(image).float()
    
    image_tensor=image_tensor.unsqueeze_(0)
    #Checking Device
    if torch.cuda.is_available():
        image_tensor.cuda()
    #Input image for prediction as variable  
    input = Variable(image_tensor)
    #Taking image into Cnns Model process
    output = model(input)
    
    #Get index by finding average maximum
    index = output.data.numpy().argmax()
    #Predict
    predict = classes[index]
    
    return predict

In [25]:
#Get Image and add .jpg
images_path=glob.glob(prediction_path + '/*.jpg')

In [26]:
predicting = {}

#Making Prediction all the prediction image dataset
for i in images_path:
    predicting[i[i.rfind('/')+1:]]=prediction(i, transformer)

## 7. Result

For the Prediction result, we get 23 corrects in 30 images.

In [27]:
predicting

{'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat1.jpg': 'dogs',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat10.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat11.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat12.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat13.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat14.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat15.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat2.jpg': 'dogs',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat3.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat4.jpg': 'dogs',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat5.jpg': 'cats',
 'C:\\Users\\User\\Desktop\\Recognition\\Prediction\\Prediction\\cat6.