# Dogs vs. Cats Classification

## Data Analysis:
The provided data consists of two archives:
1. train.zip - Contains 25,000 images of cats and dogs and should be used for training the model we build.
2. test1.zip - Contains 12,500 images which must be classified into dogs or cats based on the information we collect and the model we build. 

### Train Data
In the training data, the name of each image encodes the class it belongs to. For example, the file "cat.001.jpg" belongs to the class "cat" and the file "dog.001.jpg" belongs to the class "dog".

### Test Data
In the test data, the file names carry no information about the content of the files, and it is our job to predict which class (cat/dog) each image belongs to.

# So, why Convolutional Neural Networks?

Why CNNs and why not a vanilla neural network? 

The general applicability of neural networks is one of their advantages, but it turns into a liability when dealing with images. Convolutional neural networks make a conscious tradeoff: a network designed specifically for images sacrifices some generality in exchange for a much more feasible solution.

In any image, proximity is strongly related to similarity, and convolutional neural networks take advantage of this fact: two pixels that are near each other are more likely to be related than two pixels that are far apart. In a regular neural network, however, every pixel is connected to every neuron of the first hidden layer.

Regular neural nets don’t scale well to full images. In CIFAR-10, images are only of size 32x32x3 (32 wide, 32 high, 3 color channels), so a single fully-connected neuron in the first hidden layer of a regular neural network would have 32*32*3 = 3072 weights. This amount still seems manageable, but the fully-connected structure does not scale to larger images. For example, an image of more respectable size, e.g. 200x200x3, would lead to neurons that have 200*200*3 = 120,000 weights each. Moreover, we would almost certainly want several such neurons, so the parameters would add up quickly. This full connectivity is wasteful, and the huge number of parameters would quickly lead to overfitting.
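The arithmetic above can be checked with a few lines of Python. This is only a sketch: the helper name `dense_weights_per_neuron` and the hidden-layer size of 1,000 neurons are illustrative assumptions, not values used elsewhere in this notebook.

```python
def dense_weights_per_neuron(width, height, channels):
    """Number of weights feeding one fully-connected neuron (bias excluded)."""
    return width * height * channels

# CIFAR-10 sized image
cifar_weights = dense_weights_per_neuron(32, 32, 3)
# Larger 200x200 RGB image
large_weights = dense_weights_per_neuron(200, 200, 3)

# Hypothetical first hidden layer with 1,000 such neurons (assumed size)
hidden_neurons = 1000
large_layer_weights = large_weights * hidden_neurons

print(cifar_weights)        # 3072
print(large_weights)        # 120000
print(large_layer_weights)  # 120000000
```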

Convolution solves this problem by removing many of these less significant connections. In technical terms, convolutional neural networks make image processing computationally manageable by filtering the connections by proximity. In a given layer, rather than linking every input to every neuron, a convolutional neural network intentionally restricts the connections so that any one neuron accepts inputs only from a small subsection of the layer before it (say, 5x5 or 3x3 pixels). Hence, each neuron is responsible for processing only a certain portion of the image. (Incidentally, this is roughly how the individual cortical neurons in your brain function: each neuron responds to only a small portion of your complete visual field.)
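To make the saving concrete, the sketch below compares the parameter count of a convolutional layer with that of a fully-connected layer on a 96x96 grayscale input, the size used later in this notebook. The helper names and the choice of 32 units for the dense layer are assumptions made purely for illustration.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h*kernel_w*in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units

# 32 filters of size 3x3 over a single-channel image
conv_count = conv_params(3, 3, 1, 32)
# 32 fully-connected units over the same 96x96x1 image (assumed comparison)
dense_count = dense_params(96 * 96 * 1, 32)

print(conv_count)   # 320
print(dense_count)  # 294944
```

The convolutional layer shares its small set of weights across every image position, which is why its parameter count does not depend on the image size.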


In [None]:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

import os
#We see the contents of the root directory
print("Root directory contains: ")
print(os.listdir("../input"))


# Importing dependencies
We use the following packages:
1. Matplotlib - used to plot the training and validation accuracy/loss graphs. Matplotlib is extremely effective for visualizing any aspect of the data.
2. tqdm - used to show the progress of loops.
3. TensorFlow - a gradient-based deep learning library that works with Python and can use GPUs for computation.
4. Keras - a deep learning library capable of running on top of TensorFlow.
5. OpenCV (cv2) - used to read the images, convert them to grayscale and resize them.

In [None]:
import matplotlib.pyplot as plt
from tqdm import tqdm
import cv2
import tensorflow as tf
from keras import models
from keras import layers
from keras import optimizers
from keras import Sequential
from keras.layers import Dense,MaxPooling2D,Conv2D,Flatten,Dropout, Activation, BatchNormalization


In [None]:
#Declaring the path to the train and test data
train_path = '../input/train/train'
test_path = '../input/test1/test1'

We create explicit labels for the training data from the prefix of each file name:
- If a file name starts with "cat", we assign it to class "0".
- If a file name starts with "dog", we assign it to class "1".

In [None]:
#Initialize two lists for the data and labels respectively
label=[]
data=[]

#Loop iterating over each file in the training folder
for file in tqdm(os.listdir(train_path)):
    #Derive the label from the file name: "cat" -> 0, "dog" -> 1
    if file.startswith("cat"):
        current_label=0
    elif file.startswith("dog"):
        current_label=1
    else:
        #Skip files that do not follow the naming scheme
        continue
    #Reading every image and converting it to grayscale
    image=cv2.imread(os.path.join(train_path,file), cv2.IMREAD_GRAYSCALE)
    #cv2.imread returns None when a file cannot be read; skip such files
    if image is None:
        continue
    #Resizing the image into a manageable size
    image=cv2.resize(image,(96,96))
    #Scaling pixel values to [0,1]; data and label are appended together so they stay aligned
    data.append(image/255)
    label.append(current_label)

In [None]:
#Converting our data and labels into numpy arrays
train_data=np.array(data)
train_labels=np.array(label)

print (train_data.shape)
print (train_labels.shape)

In [None]:
#Displaying the first image along with the class it belongs to

plt.imshow(train_data[0], cmap='gray')
plt.title('Class '+ str(train_labels[0]))

In [None]:
#Reshaping our data from (N,96,96) into (N,96,96,1) by adding a channel axis
train_data = train_data.reshape((train_data.shape)[0],(train_data.shape)[1],(train_data.shape)[2],1)
print(train_data.shape)
print(train_labels.shape)

## General Architecture of a CNN
INPUT - [96x96x1] will hold the raw pixel values of the image, in this case a grayscale image of width 96 and height 96 with a single channel.

CONV - This layer will compute the output of neurons that are connected to local regions in the input, each computing a dot product between their weights and the small region they are connected to in the input volume. This may result in a volume such as [96x96x12] if we decide to use 12 filters with "same" padding.

RELU - This layer will apply an elementwise activation function, such as the max(0,x) thresholding at zero. This leaves the size of the volume unchanged ([96x96x12]).

POOL - This layer will perform a downsampling operation along the spatial dimensions (width, height), resulting in a volume such as [48x48x12] for a 2x2 pool with a stride of 2.

FC - This (fully-connected) layer will compute the class scores. For the 10 categories of CIFAR-10 this results in a volume of size [1x1x10], where each of the 10 numbers corresponds to a class score; for our two-class problem a single output score is enough. As with ordinary neural networks, and as the name implies, each neuron in this layer is connected to all the numbers in the previous volume.
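The spatial sizes quoted above follow from the standard output-size formula for convolution and pooling layers, floor((W - K + 2P) / S) + 1. The sketch below applies it to a 96-pixel-wide input; the function name `output_size` is an assumption chosen for illustration.

```python
def output_size(input_size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer: floor((W - K + 2P) / S) + 1."""
    return (input_size + 2 * padding - kernel) // stride + 1

# 3x3 convolution with one pixel of zero padding keeps the size ("same")
conv_same = output_size(96, kernel=3, stride=1, padding=1)
# 3x3 convolution without padding ("valid") shrinks the size by 2
conv_valid = output_size(96, kernel=3)
# 2x2 pooling with stride 2 halves the size
pooled = output_size(96, kernel=2, stride=2)

print(conv_same)   # 96
print(conv_valid)  # 94
print(pooled)      # 48
```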

## My Model
Total number of layers = 11

Input - [96x96x1]

Layer 1 - Convolution layer with 32 filters, kernel size (3,3) and an activation function of RELU.

Layer 2 - Convolution layer with 64 filters, kernel size (3,3), padding = same and activation function as RELU.

Layer 3 - Max Pooling layer with pool size = (5,5) and strides = (2,2)

Layer 4 - Convolution layer with 10 filters, kernel size (3,3) and an activation function of RELU.

Layer 5 - Convolution layer with 5 filters, kernel size (3,3) and an activation function of RELU.

Layer 6 - Max Pooling layer with pool size = (3,3) and strides = (2,2)

Layer 7 - Convolution layer with 10 filters, kernel size = (2,2) and strides  = 2

Layer 8 - Flatten layer to flatten the feature maps into a single vector

Layer 9 - Dropout layer with a dropout rate of 30%. During training, a random 30% of the inputs to this layer are set to zero.

Layer 10 - Fully connected layer with 100 nodes and activation as Sigmoid

Layer 11 - Output layer with Sigmoid Activation
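As a sanity check on the layer list above, the following sketch walks the 96x96x1 input through the convolution and pooling layers and adds up the trainable parameters. It assumes Keras' default "valid" padding wherever no padding is stated and treats padding = same on a 3x3 kernel as one pixel of padding; the helper and variable names are illustrative, and the numbers should be compared against `model.summary()` rather than taken as authoritative.

```python
def output_size(input_size, kernel, stride=1, padding=0):
    """Spatial output size: floor((W - K + 2P) / S) + 1."""
    return (input_size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, filters); filters=None marks a pooling layer
layers = [
    ("conv1", 3, 1, 0, 32),
    ("conv2", 3, 1, 1, 64),   # padding="same" for a 3x3 kernel -> pad 1
    ("pool1", 5, 2, 0, None),
    ("conv3", 3, 1, 0, 10),
    ("conv4", 3, 1, 0, 5),
    ("pool2", 3, 2, 0, None),
    ("conv5", 2, 2, 0, 10),
]

size, channels, total_params = 96, 1, 0
shapes = []
for name, kernel, stride, padding, filters in layers:
    size = output_size(size, kernel, stride, padding)
    if filters is not None:
        # weights per filter = kernel*kernel*input_channels, plus one bias
        total_params += (kernel * kernel * channels + 1) * filters
        channels = filters
    shapes.append((name, size, size, channels))
    print(name, (size, size, channels))

# Flatten, then Dense(100) and Dense(1)
flat = size * size * channels
total_params += (flat + 1) * 100
total_params += (100 + 1) * 1

print("flatten", flat)
print("total parameters", total_params)
```

Under these assumptions most of the parameters sit in the 100-node fully-connected layer, not in the convolutions.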

In [None]:
#Creating the model
model = Sequential()
input_shape = (96,96,1)
model.add(Conv2D(kernel_size=(3,3),filters=32,input_shape=input_shape,activation="relu"))
model.add(Conv2D(kernel_size=(3,3),filters=64,activation="relu",padding="same"))
model.add(MaxPooling2D(pool_size=(5,5),strides=(2,2)))

model.add(Conv2D(kernel_size=(3,3),filters=10,activation="relu"))
model.add(Conv2D(kernel_size=(3,3),filters=5,activation="relu"))

model.add(MaxPooling2D(pool_size=(3,3),strides=(2,2)))

model.add(Conv2D(kernel_size=(2,2),strides=(2,2),filters=10))

model.add(Flatten())

model.add(Dropout(0.3))

model.add(Dense(100,activation="sigmoid"))
model.add(Dense(1,activation="sigmoid"))

In [None]:

model.summary()

In [None]:
#We use the ADADELTA optimization on the binary crossentropy loss function
model.compile(optimizer="adadelta",loss="binary_crossentropy",metrics=["accuracy"])
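For intuition about the loss being minimized, binary cross-entropy averages -(y*log(p) + (1-y)*log(1-p)) over the samples, where y is the true label and p the predicted probability of class 1. The pure-Python sketch below is an illustration only, not the Keras implementation; the clipping constant `eps` is an assumed value.

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a list of labels and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        # Clip to avoid log(0); eps is an assumed small constant
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# One dog (1) and one cat (0), predicted confidently and correctly
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
# The same two images, predicted confidently but wrongly
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])

print(round(confident_right, 4))
print(round(confident_wrong, 4))
```

Confident wrong predictions are penalized far more heavily than confident correct ones are rewarded, which is what drives the sigmoid output towards calibrated probabilities.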

In [None]:
# We train the model on the training data.
# 25% of the training data is held out for validation (Keras takes the last 25% of the arrays, before shuffling).
# We run the fit function for 20 epochs

model_history = model.fit(train_data,train_labels,validation_split=0.25,epochs=20,batch_size=10)
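Keras takes the `validation_split` fraction from the end of the arrays before any shuffling, and `os.listdir` returns files in arbitrary order, so the held-out 25% is not guaranteed to contain a balanced mix of cats and dogs. One possible precaution, sketched below on toy arrays, is to shuffle data and labels with the same permutation before calling `fit`. The function name `shuffle_in_unison` and the seed are assumptions for illustration.

```python
import numpy as np

def shuffle_in_unison(data, labels, seed=0):
    """Apply one random permutation to both arrays so pairs stay aligned."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(data))
    return data[order], labels[order]

# Toy stand-in for the training set: 4 "cats" (0) followed by 4 "dogs" (1)
toy_data = np.arange(8).reshape(8, 1)
toy_labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

shuffled_data, shuffled_labels = shuffle_in_unison(toy_data, toy_labels)
print(shuffled_data.ravel())
print(shuffled_labels)
```

In the notebook this would be applied to `train_data` and `train_labels` after they are converted to NumPy arrays and before `model.fit` is called.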

In [None]:
#Visualizing accuracy and loss of training the model
history_dict=model_history.history

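#Training accuracy (the key is 'acc' in older Keras versions and 'accuracy' in newer ones)
train_acc = history_dict['acc'] if 'acc' in history_dict else history_dict['accuracy']
#Validation Accuracy
val_acc = history_dict['val_acc'] if 'val_acc' in history_dict else history_dict['val_accuracy']

epochs =range(1,len(train_acc)+1)
#Plotting the training and validation accuracy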
plt.plot(epochs, val_acc, 'bo', label='Validation Accuracy')
plt.plot(epochs, train_acc, 'b', label='Train Accuracy')
plt.title('Train and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

In [None]:
#Training loss
train_loss = history_dict['loss']
#Validation Loss
val_loss = history_dict['val_loss']

epochs =range(1,len(train_loss)+1)
#Plotting the training and validation loss
plt.plot(epochs, val_loss, 'bo', label='Validation Loss')
plt.plot(epochs, train_loss, 'b', label='Training Loss')
plt.title('Train and Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

In [None]:
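test_data=[]
id=[]
for file in tqdm(os.listdir(test_path)):
    image_data=cv2.imread(os.path.join(test_path,file), cv2.IMREAD_GRAYSCALE)
    #cv2.imread returns None when a file cannot be read; report and skip it
    if image_data is None:
        print("Skipping unreadable file:", file)
        continue
    #Same preprocessing as for the training data: resize and scale to [0,1]
    image_data=cv2.resize(image_data,(96,96))
    test_data.append(image_data/255)
    #The file name without its extension is the image id
    id.append((file.split("."))[0])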

In [None]:
test_data1=np.array(test_data)
print (test_data1.shape)
test_data1=test_data1.reshape((test_data1.shape)[0],(test_data1.shape)[1],(test_data1.shape)[2],1)

In [None]:
dataframe_output=pd.DataFrame({"id":id})

In [None]:
predicted_labels=model.predict(test_data1)
#Rounding is only for display; the threshold below uses the raw probabilities
print(np.round(predicted_labels,decimals=2))
#Threshold at 0.5: 1 = dog, 0 = cat
labels=[1 if value>0.5 else 0 for value in predicted_labels.ravel()]

In [None]:
dataframe_output["label"]=labels
dataframe_output.to_csv("submission.csv",index=False)