# Animal Identifier

This algorithm classifies whether the given image contains a cat or a dog.

![catdog](https://upload.wikimedia.org/wikipedia/en/thumb/6/64/CatDog.jpeg/250px-CatDog.jpeg)
Is this a cat, or a dog?

In this program, we train a convolutional neural network on a training set of 20,000 images, then test it on a separate set of 5,000 images that the network has not seen during training.

We will start off by importing the required libraries.

### Importing libraries

In [None]:
%matplotlib inline

import matplotlib.pyplot as plt
import os
from sklearn.metrics import accuracy_score
from skimage import io, transform
from PIL import Image
import cv2
import numpy as np
from keras.models import Sequential # Initialise our neural network model as a sequential network
from keras.layers import Conv2D # Convolution operation
from keras.layers import Activation # Applies activation function
from keras.layers import Dropout # Prevents overfitting by randomly setting a few outputs to zero
from keras.layers import MaxPooling2D # Maxpooling function
from keras.layers import Flatten # Converting 2D arrays into a 1D linear vector
from keras.layers import Dense # Regular fully connected neural network
from keras.models import model_from_json # Rebuild a saved model architecture from JSON
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from IPython.display import display

In [None]:
def reshaped_image(image):
    return transform.resize(image,(50, 50, 3)) #Standardizing all images to fixed shape.

### Function - load_images_from_folder()
The name of this function is intuitive: it loads the images from the training folder and creates two NumPy arrays, one for the images and another for their labels.

The labels are one-hot encoded by hand: if the file name contains `cat`, the label is [1,0]; if it contains `dog`, the label is [0,1].
The images are loaded using imread() from the OpenCV library (cv2).
Each image is resized and appended to the image list.
Its label is appended to the labels list.
Both lists are converted into NumPy arrays before returning.

In [None]:
def load_images_from_folder():
    folder = "./dogscats/train/animals/"
    images = []
    labels = []
    for filename in os.listdir(folder):
        # Work out the one-hot label [cat, dog] from the file name.
        if 'cat' in filename:
            label = [1,0]
        elif 'dog' in filename:
            label = [0,1]
        else:
            continue # Skip files that are neither, so images and labels stay aligned.
        img = cv2.imread(os.path.join(folder, filename))
        if img is None:
            continue # cv2.imread() returns None for files it cannot read.
        images.append(reshaped_image(img))
        labels.append(label)

    return np.array(images), np.array(labels)
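
The labelling rule can be tried on its own, without touching the image files. The following is a minimal sketch: `label_from_filename` is a hypothetical helper that is not part of the notebook, and the file names are assumed examples.

```python
import numpy as np

def label_from_filename(filename):
    """Return the one-hot label [cat, dog], or None if the name matches neither."""
    if "cat" in filename:
        return [1, 0]
    if "dog" in filename:
        return [0, 1]
    return None

# Assumed example file names; the third one is deliberately not an animal image.
names = ["cat.0.jpg", "dog.12.jpg", "notes.txt"]
labels = [label_from_filename(name) for name in names]

# Keep only the files that received a label, then stack them into one array.
one_hot = np.array([label for label in labels if label is not None])
print(one_hot.shape)  # one row per labelled file, one column per class
```

A file that matches neither name has to be skipped entirely, otherwise the image array and the label array would end up with different lengths.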

In [None]:
def train_test_split(train_data, train_labels, fraction):
    index = int(len(train_data)*fraction)
    return train_data[:index], train_labels[:index], train_data[index:], train_labels[index:]
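
To see what the split does, here is a small sketch that repeats the function and applies it to toy arrays (the toy data is made up for illustration). Note that the function cuts the arrays at a fixed position and does not shuffle them.

```python
import numpy as np

def train_test_split(train_data, train_labels, fraction):
    index = int(len(train_data) * fraction)
    return train_data[:index], train_labels[:index], train_data[index:], train_labels[index:]

# Ten toy samples with alternating labels.
data = np.arange(10)
labels = np.arange(10) % 2

x_train, y_train, x_test, y_test = train_test_split(data, labels, 0.8)
print(len(x_train), len(x_test))

# With 25,000 images and fraction = 0.8, the cut falls at image 20,000.
cut = int(25000 * 0.8)
print(cut)
```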

### Defining the C.N.N.
Convolutional neural networks (C.N.N.s) are a class of deep, feed-forward neural networks used to analyze imagery. A small filter is passed over the image and maps out certain features; this process is known as convolution. In the following gif, the blue square is the image and the green square is the filter.
![url](https://cdn-images-1.medium.com/max/750/0*1PSMTM8Brk0hsJuF.)
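
The sliding-filter idea can be sketched in plain NumPy. This is only an illustration, not what Keras runs internally: it handles a single-channel image with no padding and a stride of 1, and the filter values are a made-up vertical-edge detector. Like most deep learning libraries, it does not flip the filter before sliding it.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the patch under the filter element-wise and sum the result.
            patch = image[i:i + kh, j:j + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map

# A 5x5 toy image whose values grow steadily from left to right.
image = np.arange(25, dtype=float).reshape(5, 5)

# Hypothetical 3x3 filter: left column minus right column.
kernel = np.array([[1.0, 0.0, -1.0]] * 3)

feature_map = convolve2d_valid(image, kernel)
print(feature_map.shape)
print(feature_map)
```

Because the toy image brightens at the same rate everywhere, every position in the feature map gets the same response.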

#### Types of layers in this C.N.N.

1. Convolution Layer: The main building block of a convolutional neural network. It consists of a set of independent filters, each of which is convolved with the image to generate its own feature map.

2. Activation Layer: Applies the specified activation function to the outputs of the previous layer. We have used ReLU and sigmoid.

3. Pooling Layer: Progressively reduces the spatial size of the representation, and hence the number of parameters and the amount of computation.

![Max pooling](http://cs231n.github.io/assets/cnn/maxpool.jpeg)


4. Flatten Layer: Flattens the input, i.e., converts the stack of 2D feature maps into a single 1D vector.

5. Dense Layer: A regular fully connected layer.

6. Dropout Layer: Helps prevent overfitting by randomly deactivating a fraction of the neurons in a layer during training.
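
A rough NumPy sketch of what the activation, pooling, and flatten steps do to one small feature map is given below. The input values are made up, and the pooling helper assumes a 2D array with even side lengths, which is simpler than what Keras handles.

```python
import numpy as np

def relu(x):
    """ReLU keeps positive values and replaces negative values with zero."""
    return np.maximum(x, 0)

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a 2D array with even side lengths."""
    h, w = feature_map.shape
    # Group the array into 2x2 blocks, then take the maximum of each block.
    blocks = feature_map.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))

feature_map = np.array([[ 1., -2.,  3.,  0.],
                        [-1.,  5., -6.,  2.],
                        [ 0.,  1., -3., -4.],
                        [ 7., -8.,  2.,  1.]])

activated = relu(feature_map)      # negatives become 0
pooled = max_pool_2x2(activated)   # 4x4 -> 2x2
flat = pooled.flatten()            # 2x2 -> vector of length 4
print(pooled)
print(flat)
```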

The working of this C.N.N. is briefly explained by the following gif:

![url](http://selmandesign.com/wp-content/uploads/2016/12/SelmanDesign_Q-A_CATorDOG-flow.gif)






In [None]:
def cnn_classifier():
    cnn = Sequential()
    cnn.add(Conv2D(32, (3,3), input_shape = (50, 50, 3), padding='same'))
    cnn.add(Activation('relu'))
    cnn.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    
    cnn.add(Conv2D(32, (3,3), padding='same'))
    cnn.add(Activation('relu'))
    cnn.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    
    cnn.add(Conv2D(64, (3,3), padding='same'))
    cnn.add(Activation('relu'))
    cnn.add(MaxPooling2D(pool_size=(2, 2), padding='same'))
    
    cnn.add(Flatten())
    cnn.add(Dense(500))
    cnn.add(Activation('relu'))
    cnn.add(Dropout(0.5))
    cnn.add(Dense(2))
    cnn.add(Activation('sigmoid'))
    cnn.compile(optimizer = 'rmsprop', loss = 'binary_crossentropy', metrics = ['accuracy'])
    cnn.summary() # summary() prints the table itself and returns None
    return cnn
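
The layer sizes that the summary reports can be worked out by hand. The sketch below is plain arithmetic rather than Keras code, under two assumptions: a convolution with padding='same' keeps the height and width unchanged, and a 2x2 pooling layer with padding='same' halves them and rounds up.

```python
import math

input_size, channels = 50, 3
filters = [32, 32, 64]

shapes = []
total_params = 0
size = input_size
for n_filters in filters:
    # 3x3 convolution: one 3x3 kernel per input channel per filter, plus one bias per filter.
    total_params += 3 * 3 * channels * n_filters + n_filters
    # 2x2 max pooling with padding='same': halve the size, rounding up.
    size = math.ceil(size / 2)
    shapes.append((size, size, n_filters))
    channels = n_filters

flat_units = size * size * channels       # length of the vector after Flatten
total_params += flat_units * 500 + 500    # Dense(500): weights plus biases
total_params += 500 * 2 + 2               # Dense(2): weights plus biases

print(shapes)
print(flat_units, total_params)
```

Almost all of the parameters sit in the first dense layer, which is also the layer that dropout is applied to.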

### Training the model
The dataset is split into training sets and testing sets.

Number of images in training set = 20,000

Number of images in testing set = 5,000

Images are propagated through the network in batches of 64, and the C.N.N. is trained for 30 epochs.
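
As a quick check of what these settings mean in practice, the arithmetic below counts the weight updates, assuming the training set holds exactly 20,000 images and that the last batch of each epoch is simply smaller than the others.

```python
import math

train_images = 20000
batch_size = 64
epochs = 30

# Number of batches needed to see every training image once.
steps_per_epoch = math.ceil(train_images / batch_size)

# The final batch of each epoch holds whatever is left over.
last_batch = train_images - (steps_per_epoch - 1) * batch_size

total_updates = steps_per_epoch * epochs
print(steps_per_epoch, last_batch, total_updates)
```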


In [None]:
data, labels = load_images_from_folder()

# Shuffle before splitting: os.listdir() returns files in arbitrary order,
# so an unshuffled split could leave one class over-represented in the test set.
idx = np.random.permutation(data.shape[0])
data, labels = data[idx], labels[idx]

fraction = 0.8
train_data, train_labels, test_data, test_labels = train_test_split(data, labels, fraction)
print("Train data size = ", len(train_data))
print("Test data size = ", len(test_data))

cnn = cnn_classifier()

cnn.fit(train_data, train_labels, batch_size = 64, epochs = 30)
predicted_test_labels = np.argmax(cnn.predict(test_data), axis=1)
test_labels = np.argmax(test_labels, axis=1)

print("Accuracy score = ", accuracy_score(test_labels, predicted_test_labels))
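
The last three lines of the cell turn network outputs into class indices and compare them with the true classes. Here is a small sketch of that step with made-up outputs for four images; the numbers are illustrative and were not produced by the trained network.

```python
import numpy as np

# Made-up network outputs, one row per image: [cat score, dog score].
outputs = np.array([[0.90, 0.20],
                    [0.30, 0.60],
                    [0.55, 0.50],
                    [0.10, 0.80]])

# One-hot true labels for the same four images.
true_one_hot = np.array([[1, 0],
                         [0, 1],
                         [0, 1],
                         [0, 1]])

predicted = np.argmax(outputs, axis=1)    # index of the larger score: 0 = cat, 1 = dog
actual = np.argmax(true_one_hot, axis=1)  # position of the 1 in each label

# Accuracy is the fraction of images where the two indices agree.
accuracy = np.mean(predicted == actual)
print(predicted, actual, accuracy)
```

The third image is the only mistake: its cat score is slightly higher even though the label says dog.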
        

## Saving the model
Retraining the model every time you want to use it would be wasteful, so saving it is an efficient way to reuse it later.
The network architecture is saved as JSON and the weights are stored in HDF5 (.h5) format.

In [None]:
model_json = cnn.to_json()
with open("model.json", "w") as json_file:
    json_file.write(model_json)
# serialize weights to HDF5
cnn.save_weights("model.h5")
print("Saved model to disk")
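
Loading the model back could look roughly like the sketch below. `load_saved_model` is a hypothetical helper, not part of the notebook. It assumes the two files written above sit in the working directory and are read with the same Keras version that wrote them, and it recompiles with the settings used in cnn_classifier().

```python
import os

def load_saved_model(json_path="model.json", weights_path="model.h5"):
    """Rebuild the network from its JSON architecture and its HDF5 weights."""
    for path in (json_path, weights_path):
        if not os.path.exists(path):
            raise FileNotFoundError(path)

    # Imported here so the file checks above run before Keras is loaded.
    from keras.models import model_from_json

    with open(json_path) as json_file:
        model = model_from_json(json_file.read())
    model.load_weights(weights_path)

    # Compile again so that evaluate() can be called on the loaded model.
    model.compile(optimizer="rmsprop", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

Once both files exist, `cnn = load_saved_model()` should give a network that can be used for prediction without retraining.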

## Testing the model on another image

In [None]:
image_path = "./dogscats/test1/9.jpg"
image = cv2.imread(image_path)

test_image = np.expand_dims(reshaped_image(image), axis = 0)
                        
result = cnn.predict(test_image)
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB)) # cv2 loads images as BGR; matplotlib expects RGB

if result[0][0] > result[0][1]:
    print("It is a cat.\nConfidence:",round(result[0][0]*100, 2), "%")
else:
    print("It is a dog.\nConfidence:",round(result[0][1]*100, 2), "%")

## Conclusion
This C.N.N. makes predictions with an accuracy of 84% on the held-out test images.