# Dogs vs. Cats Classification

This is a binary classification problem on a dataset of 25,000 JPEG images of dogs and cats. The task is to classify each image as a dog or a cat using a convolutional neural network.

## Importing Libraries

We use the Keras framework to implement the model.

In [1]:
import keras
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Conv2D, Flatten, MaxPooling2D
from keras.utils import to_categorical
from sklearn.metrics import classification_report
import cv2
from tqdm import tqdm

Using TensorFlow backend.


In [2]:
import os
image_ids = os.listdir('../input/train/train/')

The training images have names such as "dog.1.jpg", so the part of the file name before the first dot gives the category. We store the pixel data of each image in the x_train list and its label (1 for dog, 0 for cat) in the y_train list.

In [3]:
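x_train = []
y_train = []
for i in tqdm(image_ids):
    # The part before the first dot is the category, e.g. "dog" in "dog.1.jpg"
    category = i.split(".")[0]
    # Label dogs as 1 and everything else (cats) as 0
    if category == "dog":
        y_train.append(1)
    else:
        y_train.append(0)

    # Load as grayscale and resize to 128x128 so every image has the same shape
    img_arr = cv2.imread("../input/train/train/"+i, cv2.IMREAD_GRAYSCALE)
    img_arr = cv2.resize(img_arr, dsize=(128, 128))
    x_train.append(img_arr)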

100%|██████████| 25000/25000 [00:43<00:00, 580.42it/s]
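The labeling rule based on the file name can be checked in isolation before running it over all 25,000 files. The sketch below is a minimal illustration: the helper name `label_from_filename` and the example file names are hypothetical, chosen only to follow the "dog.1.jpg" pattern described above.

```python
import os

def label_from_filename(filename):
    """Return 1 for a dog image and 0 otherwise, based on the file name prefix."""
    # "dog.1.jpg" -> "dog"; basename() drops any leading directory.
    category = os.path.basename(filename).split(".")[0]
    return 1 if category == "dog" else 0

# Hypothetical file names following the "category.index.jpg" pattern.
examples = ["dog.1.jpg", "cat.42.jpg", "../input/train/train/dog.7.jpg"]
labels = [label_from_filename(name) for name in examples]
print(labels)  # [1, 0, 1]
```

Note that, like the loop above, this treats every prefix other than "dog" as a cat, so a stray file in the directory would silently be labeled 0.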


In [4]:
x_train = np.array(x_train)
x_train.shape

(25000, 128, 128)

In [6]:
x_train = x_train/255
x_train = x_train.reshape(-1, 128, 128, 1)
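Dividing a uint8 array by 255 produces a float64 array in NumPy, which takes 8 bytes per pixel; for 25000 x 128 x 128 pixels that is roughly 3.3 GB. The sketch below shows one possible way to halve this by casting to float32 first. It runs on a small random stand-in array, and the names `batch` and `scaled` are assumptions for illustration, not part of the notebook.

```python
import numpy as np

# A tiny stand-in for the image stack: 4 grayscale images of 128x128 uint8 pixels.
batch = np.random.randint(0, 256, size=(4, 128, 128), dtype=np.uint8)

# Cast to float32 before scaling so the result stays float32 (4 bytes per pixel).
scaled = batch.astype(np.float32) / 255.0

# Add the trailing channel axis that Conv2D expects: (N, 128, 128) -> (N, 128, 128, 1).
scaled = scaled.reshape(-1, 128, 128, 1)

print(scaled.shape, scaled.dtype)
print((batch / 255).dtype)  # plain division of a uint8 array gives float64
```

Whether the saving matters depends on the memory available; the values themselves are the same to within float32 precision.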

In [7]:
import pickle
with open("x_train.pickle", "wb") as f:
    pickle.dump(x_train, f)
with open("y_train.pickle", "wb") as f:
    pickle.dump(y_train, f)

## Defining our model

In [8]:
model = Sequential()
model.add(Conv2D(4,(3,3),strides=1, padding='valid', activation = 'relu', input_shape = x_train.shape[1:]))
model.add(MaxPooling2D(pool_size = (2,2), strides=2))

model.add(Conv2D(16,(3,3), activation = 'relu', strides=1, padding="valid"))
model.add(MaxPooling2D(pool_size = (2,2), strides=2))

model.add(Conv2D(32, (3,3), activation="relu", strides=1, padding="valid"))
model.add(MaxPooling2D(pool_size=(2,2), strides=2))

model.add(Conv2D(64, (3,3), activation="relu", strides=1, padding="valid"))
model.add(MaxPooling2D(pool_size=(2,2), strides=2))

model.add(Flatten())
model.add(Dense(512, activation="relu"))
model.add(Dense(128, activation='relu'))
model.add(Dense(64, activation='sigmoid'))

model.add(Dense(1, activation='sigmoid'))

Instructions for updating:
Colocations handled automatically by placer.


In [9]:
model.compile(optimizer="adam",
              loss='binary_crossentropy',
              metrics=['accuracy'])

In [10]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 126, 126, 4)       40        
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 63, 63, 4)         0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 61, 61, 16)        592       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 30, 30, 16)        0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 28, 28, 32)        4640      
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 12, 12, 64)        18496     
__________
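The output shapes and parameter counts in the summary can be reproduced by hand. With "valid" padding, a 3x3 convolution with stride 1 shrinks each side by 2, a 2x2 max pooling with stride 2 halves it (rounding down), and each convolution has kernel * kernel * input channels * filters weights plus one bias per filter. The sketch below is a plain-Python check of that arithmetic; the helper names are hypothetical and it does not call Keras.

```python
def conv_valid(size, kernel=3, stride=1):
    """Output side length of a convolution with 'valid' padding."""
    return (size - kernel) // stride + 1

def pool(size, window=2, stride=2):
    """Output side length of max pooling with 'valid' padding."""
    return (size - window) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

size, channels = 128, 1  # 128x128 grayscale input
rows = []
for filters in (4, 16, 32, 64):
    size = conv_valid(size)
    rows.append(("conv", size, filters, conv_params(3, channels, filters)))
    channels = filters
    size = pool(size)
    rows.append(("pool", size, filters, 0))

for kind, side, depth, params in rows:
    print(f"{kind}: ({side}, {side}, {depth}) params={params}")
```

The rows up to the fourth convolution, (12, 12, 64) with 18496 parameters, match the part of the summary shown above.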

In [11]:
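# Pass the labels as a NumPy array; some Keras versions do not accept a plain Python list here
model.fit(x_train, np.array(y_train), epochs=7, batch_size=32, validation_split=0.2)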

Instructions for updating:
Use tf.cast instead.
Train on 20000 samples, validate on 5000 samples
Epoch 1/7
Epoch 2/7
Epoch 3/7
Epoch 4/7
Epoch 5/7
Epoch 6/7
Epoch 7/7


<keras.callbacks.History at 0x7f03617a4630>
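Keras takes the validation_split fraction from the last samples of x and y, before any shuffling, so the 5000 validation images here are simply the last 5000 files returned by os.listdir. If that order is not random enough, one option is to shuffle images and labels together before calling fit. The sketch below illustrates the idea on small stand-in arrays; `x`, `y`, `rng`, and the seed are assumptions for illustration.

```python
import numpy as np

# Small stand-ins for x_train and y_train: row i of x holds the value i,
# and the label is 1 for even i so the pairing can be checked after shuffling.
x = np.arange(10, dtype=np.float32).reshape(10, 1)
y = (np.arange(10) % 2 == 0).astype(int)

# One permutation applied to both arrays keeps every image with its label.
rng = np.random.default_rng(seed=0)
order = rng.permutation(len(x))
x_shuffled, y_shuffled = x[order], y[order]

# With validation_split=0.2, the last 20% of these rows would be held out.
n_val = int(len(x_shuffled) * 0.2)
x_val, y_val = x_shuffled[-n_val:], y_shuffled[-n_val:]
print(x_val.ravel(), y_val)
```

Fixing the seed makes the split reproducible between runs, which helps when comparing models.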

In [12]:
with open("model.pickle", "wb") as f:
    pickle.dump(model, f)

In [13]:
x_test = []
test_files = os.listdir("../input/test1/test1/")
for i in tqdm(test_files):    
    img_arr = cv2.imread("../input/test1/test1/"+i, cv2.IMREAD_GRAYSCALE)
    img_arr = cv2.resize(img_arr, dsize=(128, 128))
    x_test.append(img_arr)

100%|██████████| 12500/12500 [00:21<00:00, 588.67it/s]


In [15]:
x_test = np.array(x_test)/255
x_test = x_test.reshape(-1, 128, 128, 1)

In [16]:
x_test.shape

(12500, 128, 128, 1)

In [17]:
predictions = model.predict(x_test)

In [21]:
# Threshold each sigmoid output at 0.5: 1 = dog, 0 = cat
results = []
for p in predictions:
    if p[0] > 0.5:
        results.append(1)
    else:
        results.append(0)
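The same thresholding can be written without a Python loop by comparing the whole prediction array at once. The sketch below uses a small hand-written array as a stand-in for the output of model.predict, which has one sigmoid probability per row; the values are made up for illustration.

```python
import numpy as np

# Stand-in for the model.predict output: one probability per row, shape (n, 1).
predictions = np.array([[0.91], [0.12], [0.50], [0.73]])

# Compare all rows at once, then flatten to a 1-D list of 0/1 labels.
results = (predictions > 0.5).astype(int).ravel().tolist()
print(results)  # [1, 0, 0, 1]
```

A probability of exactly 0.5 maps to 0 here, matching the strict comparison in the loop above.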

In [22]:
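# Take each id from the file name (this assumes test files are named like "1.jpg"),
# because os.listdir does not return the files in numeric order.
df = pd.DataFrame({"id": [int(name.split(".")[0]) for name in test_files],
                   "label": results})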

In [31]:
df.to_csv("submission.csv",index=False)