# Cat vs Dog Recognition from Images using Deep Learning, Keras and Microsoft Kaggle Dataset

## Part 1: Preparing the Data for Training the Neural Network
For this project, we will use the Kaggle Cats and Dogs image dataset hosted by Microsoft.<br>
Download Link: https://www.microsoft.com/en-us/download/confirmation.aspx?id=54765

### Importing Libraries
You may need to install OpenCV if it is not already available on your machine. Run `pip install opencv-python` to install it.

In [12]:
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import random
import pickle

In [3]:
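# Raw string, so the backslashes in the Windows path are not read as escape sequences
DATADIR = r"F:\Computer Vision\Cat and Dog Recognition using Deep Learning and Kaggle Datasets\images"
IMG_SIZE = 50  # every image is resized to IMG_SIZE x IMG_SIZE pixels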

### Reading the Images from Folders and Storing Them in Datasets Array

In [4]:
class_labels = []  # class_labels[i] is the folder name for label i
datasets = []      # list of [image_array, class_index] pairs

for class_name in os.listdir(DATADIR):
    class_labels.append(class_name)
    class_num = class_labels.index(class_name)

    for image in os.listdir(os.path.join(DATADIR, class_name)):
        image_path = os.path.join(DATADIR, class_name, image)
        image_data = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        if image_data is None:
            # cv2.imread returns None for files it cannot decode; skip them
            continue
        resized_image = cv2.resize(image_data, (IMG_SIZE, IMG_SIZE))
        datasets.append([resized_image, class_num])
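
The loop above assigns label indices in whatever order `os.listdir` returns the class folders, and that order is not guaranteed to be the same across machines. Below is a minimal sketch of one way to keep the mapping stable by sorting the folder names. The helper `list_classes` and the folder names `Cat` and `Dog` are assumptions made for illustration and are not part of the original notebook.

```python
import os
import tempfile


def list_classes(data_dir):
    """Return class folder names in sorted order so label indices are stable."""
    return sorted(
        name for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )


# Demo on a throwaway directory with two hypothetical class folders.
with tempfile.TemporaryDirectory() as demo_dir:
    for folder in ("Dog", "Cat"):
        os.mkdir(os.path.join(demo_dir, folder))
    demo_labels = list_classes(demo_dir)
    label_to_index = {name: i for i, name in enumerate(demo_labels)}

print(label_to_index)  # {'Cat': 0, 'Dog': 1}
```

With sorted names, the index stored next to each image always refers to the same class, which matters when the pickled labels are reused later.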
        

### Shuffling Datasets
We need to shuffle the dataset because the images are read one class folder at a time: all images of one class come first, followed by all images of the other. Shuffling mixes the two classes so that the network does not see long runs of a single class during training.

In [9]:
random.shuffle(datasets)
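
As a small illustration of what the shuffle does, the sketch below uses a toy list in place of the real image data. The name `toy_datasets` and the fixed seed are assumptions added here so the run is repeatable; the notebook itself does not set a seed.

```python
import random

# Toy stand-in for the real dataset: [features, label] pairs grouped by class.
toy_datasets = [[f"img{i}", 0] for i in range(3)] + [[f"img{i}", 1] for i in range(3, 6)]

random.seed(42)               # assumption: fixed seed, only to make the run repeatable
random.shuffle(toy_datasets)  # shuffles the list in place and returns None

for features, label in toy_datasets:
    print(features, label)
```

Because each image and its label are stored together as one pair, shuffling the outer list changes the order without breaking the pairing.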

### Separating Features and Labels into Separate Arrays

In [10]:
X = []
y = []

for features,label in datasets:
    X.append(features)
    y.append(label)

### Converting Features to a NumPy Array and Reshaping Them for the Convolutional Layer

In [11]:
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
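
To make the reshape concrete, here is a minimal sketch on three fake images (`toy_X` is a stand-in, not the real data). The `-1` lets NumPy infer the number of images, and the trailing `1` is the single grayscale channel; this matches the (batch, height, width, channels) layout that Keras convolutional layers expect by default.

```python
import numpy as np

IMG_SIZE = 50

# Three fake grayscale images filled with zeros, standing in for the resized images.
toy_X = [np.zeros((IMG_SIZE, IMG_SIZE), dtype=np.uint8) for _ in range(3)]

stacked = np.array(toy_X)                              # one array holding all images
reshaped = stacked.reshape(-1, IMG_SIZE, IMG_SIZE, 1)  # add the channel axis

print(stacked.shape)   # (3, 50, 50)
print(reshaped.shape)  # (3, 50, 50, 1)
```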

### Saving the Features and Labels to Files for Further Training of the Neural Network
We will save the features and labels in separate files so that we don't need to prepare the dataset again every time we run the training code.

In [13]:
# The with statement closes each file automatically, even if pickle.dump fails
with open("features.pickle", "wb") as output_file:
    pickle.dump(X, output_file)

with open("labels.pickle", "wb") as output_file:
    pickle.dump(y, output_file)
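
The saved files can be read back with `pickle.load`. The sketch below shows the round trip on small stand-in values written to a temporary directory; `toy_X`, `toy_y`, and the temporary paths are assumptions for illustration, so the real `features.pickle` and `labels.pickle` are not touched.

```python
import os
import pickle
import tempfile

toy_X = [[0, 1], [2, 3]]  # stand-in features
toy_y = [0, 1]            # stand-in labels

with tempfile.TemporaryDirectory() as tmp_dir:
    features_path = os.path.join(tmp_dir, "features.pickle")
    labels_path = os.path.join(tmp_dir, "labels.pickle")

    # Write both objects in binary mode.
    with open(features_path, "wb") as f:
        pickle.dump(toy_X, f)
    with open(labels_path, "wb") as f:
        pickle.dump(toy_y, f)

    # Read them back, also in binary mode.
    with open(features_path, "rb") as f:
        loaded_X = pickle.load(f)
    with open(labels_path, "rb") as f:
        loaded_y = pickle.load(f)

print(loaded_X == toy_X and loaded_y == toy_y)  # True
```

Only unpickle files you created yourself or otherwise trust, since loading a pickle can execute arbitrary code.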