In this project images of cats and dogs are processed through Convolutional Neural Networks. 
The data set used is : https://www.kaggle.com/c/dogs-vs-cats-redux-kernels-edition/data

We do some preprocessing first. We make all the images of the same size and turn them into grayscale images. We use hot-rod arrays for distinguishing cats and dogs

# Package Requirements

numpy, tqdm and tensorflow

In [1]:
pip install numpy


The following command must be run outside of the IPython shell:

    $ pip install numpy

The Python package manager (pip) can only be used from outside of IPython.
Please reissue the `pip` command in a separate terminal or command prompt.

See the Python documentation for more informations on how to install packages:

    https://docs.python.org/3/installing/


In [2]:
pip install tqdm


The following command must be run outside of the IPython shell:

    $ pip install tqdm

The Python package manager (pip) can only be used from outside of IPython.
Please reissue the `pip` command in a separate terminal or command prompt.

See the Python documentation for more informations on how to install packages:

    https://docs.python.org/3/installing/


In [3]:
pip install --upgrade tensorflow


The following command must be run outside of the IPython shell:

    $ pip install --upgrade tensorflow

The Python package manager (pip) can only be used from outside of IPython.
Please reissue the `pip` command in a separate terminal or command prompt.

See the Python documentation for more informations on how to install packages:

    https://docs.python.org/3/installing/


Once you have tensorflow installed you have to install tflearn using pip install tflearn.
Then we get our imports and constants for preprocessing

In [4]:
import cv2 
import numpy as np 
import os 
from random import shuffle
from tqdm import tqdm

In [5]:
TRAIN_DIR = 'train'
TEST_DIR = 'test/'
IMG_SIZE = 50
LR = 1e-3

MODEL_NAME = 'dogsvscats-{}-{}.model'.format(LR,'2conv-basic') #just so we remember which saved model is which

Differentiation between dogs and cats are done using this function. It returns a hot rod array

In [6]:
def label_img(img):
	word_label = img.split('.')[-3]
	if word_label == 'cat':
		return [1,0]
	elif word_label == 'dog':
		return [0,1]
    

Now we make a feature set of all the images and the label with the below function

In [7]:
def create_train_data():
	training_data = []
	for img in tqdm(os.listdir(TRAIN_DIR)):
		label = label_img(img)
		path = os.path.join(TRAIN_DIR,img)
		img = cv2.imread(path,cv2.IMREAD_GRAYSCALE)
		img = cv2.resize(img,(IMG_SIZE,IMG_SIZE))
		training_data.append([np.array(img),np.array(label)])
	shuffle(training_data)
	np.save('train_data.npy',training_data)
	return training_data

This is the function for making a testing data feature set. This has no label

In [8]:
def process_test_data():
	testing_data = []
	for img in tqdm(os.listdir(TEST_DIR)):
		path = os.path.join(TEST_DIR,img)
		img_num = img.split('.')[0]
		img = cv2.imread(path,cv2.IMREAD_GRAYSCALE)
		img = cv2.resize(img,(IMG_SIZE,IMG_SIZE))
		testing_data.append([np.array(img),img_num])
	shuffle(testing_data)
	np.save('test_data.npy',testing_data)
	return testing_data

Now you can run the train and test data using:

In [9]:
train_data = create_train_data()
test_data = process_test_data()

100%|██████████| 25000/25000 [00:59<00:00, 418.18it/s]
100%|██████████| 12500/12500 [00:48<00:00, 260.15it/s]


If you alraedy have saved the train data and test data then use:

In [10]:
#train_data = np.load('train_data.npy')
#test_data = np.load('test_data.npy')

Next we define our neural network and imports

In [11]:
import tflearn
from tflearn.layers.conv import conv_2d,max_pool_2d
from tflearn.layers.core import input_data,dropout,fully_connected
from tflearn.layers.estimator import regression

In [12]:
convnet = input_data(shape=[None,IMG_SIZE,IMG_SIZE,1],name='input')

convnet = conv_2d(convnet,32,5,activation='relu')
convnet = max_pool_2d(convnet,5)

convnet = conv_2d(convnet,64,5,activation='relu')
convnet = max_pool_2d(convnet,5)

convnet = fully_connected(convnet,1024,activation='relu')
convnet = dropout(convnet,0.8)

convnet = fully_connected(convnet,2,activation='softmax')
convnet = regression(convnet,optimizer='adam',learning_rate=LR,loss='categorical_crossentropy',name='targets')

model = tflearn.DNN(convnet,tensorboard_dir='log')

Using saved model

In [None]:
if os.path.exists('{}.meta'.format(MODEL_NAME)):
    model.load(MODEL_NAME)
    print("Model loaded")

In [None]:
train = train_data[:-500]
print(train)
test = test_data[:-500]

X = np.array([i[0] for i in train]).reshape(-1,IMG_SIZE,IMG_SIZE,1)
Y = [i[1] for i in train]

test_x = np.array([i[0] for i in test]).reshape(-1,IMG_SIZE,IMG_SIZE,1)
test_y = [i[1] for i in train]

model.fit({'input':X},{'targets':Y},n_epoch=3,validation_set=({'input':test_x},{'targets':test_y}),snapshot_step=500,show_metric=True,run_id=MODEL_NAME)

[[array([[117, 118, 118, ..., 110, 112, 112],
       [125, 123, 120, ..., 116, 114, 119],
       [124, 125, 119, ..., 120, 119, 117],
       ..., 
       [132, 167, 164, ..., 151, 160, 169],
       [169, 160, 171, ..., 165, 159, 123],
       [161, 184, 172, ..., 159, 135, 122]], dtype=uint8), array([0, 1])], [array([[250, 250, 249, ..., 252, 254, 248],
       [250, 250, 252, ..., 252, 250, 243],
       [246, 246, 250, ..., 252, 253, 246],
       ..., 
       [244, 244, 237, ..., 241, 237, 247],
       [244, 244, 243, ..., 234, 245, 247],
       [242, 242, 243, ..., 253, 249, 249]], dtype=uint8), array([0, 1])], [array([[ 29,  29,  29, ...,   3,   4,   1],
       [ 31,  31,  31, ...,   3,   9,  11],
       [ 32,  29,  31, ...,   4,   6,   3],
       ..., 
       [176, 185, 196, ...,  70,  74,  69],
       [181, 194, 198, ...,  70,  71,  71],
       [175, 186, 199, ...,  70,  65,  69]], dtype=uint8), array([1, 0])], [array([[ 54,  62,  59, ...,  86,  72,  64],
       [ 41,  39,  44, ...,

       [ 74, 119, 151, ...,  30,  32,  42]], dtype=uint8), array([0, 1])]]
---------------------------------
Run id: dogsvscats-0.001-2conv-basic.model
Log directory: log/
INFO:tensorflow:Summary name Accuracy/ (raw) is illegal; using Accuracy/__raw_ instead.
---------------------------------
Training samples: 24500
Validation samples: 12000
--
