# Dogs vs Cats Classification

This notebook trains a small convolutional network to classify images as dogs or cats. <br />
The dataset can be downloaded from [Kaggle](https://www.kaggle.com/c/dogs-vs-cats/data). <br />
Before running the cells, split the images into one folder per label (cat and dog) under `train/` and `validation/`: the first 10,000 images of each label go to training and the rest go to validation. <br />
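
The split described above can be scripted. The following is a minimal sketch, not the procedure actually used for this notebook: it assumes the extracted Kaggle images sit in one folder and are named `<label>.<index>.jpg` (for example `cat.0.jpg`), and the helper name `split_dataset` and the per-label subfolder names `cat` and `dog` are hypothetical.

```python
import os
import re
import shutil


def split_dataset(src_dir, dst_dir, n_train=10000, labels=("cat", "dog")):
    """Copy the first n_train images of each label to train/, the rest to validation/."""
    counts = {}
    for label in labels:
        # Assumed file naming: "<label>.<index>.jpg", e.g. "cat.0.jpg".
        pattern = re.compile(r"^%s\.(\d+)\.jpg$" % re.escape(label))
        indexed = []
        for name in os.listdir(src_dir):
            match = pattern.match(name)
            if match:
                indexed.append((int(match.group(1)), name))
        indexed.sort()  # numeric order, so "cat.2.jpg" comes before "cat.10.jpg"

        for position, (_, name) in enumerate(indexed):
            subset = "train" if position < n_train else "validation"
            target_dir = os.path.join(dst_dir, subset, label)
            os.makedirs(target_dir, exist_ok=True)
            shutil.copy2(os.path.join(src_dir, name), os.path.join(target_dir, name))
            counts[(subset, label)] = counts.get((subset, label), 0) + 1
    return counts
```

Calling `split_dataset(<folder with the extracted images>, ".")` would create `train/cat`, `train/dog`, `validation/cat` and `validation/dog` in the working directory, which is the one-subfolder-per-class layout that `flow_from_directory` expects.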

In [1]:
import os, glob
import numpy as np
import pandas as pd
from PIL import Image
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from keras.utils import to_categorical
from keras.preprocessing.image import load_img, ImageDataGenerator
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import SGD

Using TensorFlow backend.


In [18]:
train_datagen = ImageDataGenerator(rescale=1./255,
                                    rotation_range=40,
                                    width_shift_range=0.2,
                                    height_shift_range=0.2,
                                    shear_range=0.2,
                                    zoom_range=0.2,
                                    horizontal_flip=True,
                                    fill_mode='nearest')
val_datagen = ImageDataGenerator(rescale=1./255)

In [19]:
train_gen = train_datagen.flow_from_directory("train/",
                                             target_size=(150,150),
                                             batch_size=20,
                                             class_mode="binary")

Found 20000 images belonging to 2 classes.


In [20]:
val_gen = val_datagen.flow_from_directory("validation",
                                             target_size=(150,150),
                                             batch_size=20,
                                             class_mode="binary")

Found 5000 images belonging to 2 classes.
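
With `class_mode="binary"`, each image gets a label of 0 or 1. `flow_from_directory` infers the classes from the subfolder names and assigns indices in alphanumeric order. The sketch below mimics that ordering in plain Python; the subfolder names `cat` and `dog` are an assumption, since the notebook does not show them.

```python
def infer_class_indices(subfolder_names):
    """Map class subfolder names to integer labels in alphanumeric order."""
    return {name: index for index, name in enumerate(sorted(subfolder_names))}


# Hypothetical subfolder names; the notebook does not show the real ones.
class_indices = infer_class_indices(["dog", "cat"])
print(class_indices)
```

The mapping actually used by a generator can be read from its `class_indices` attribute (for example `train_gen.class_indices`). Under the assumed names, a sigmoid output close to 1 would mean "dog".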


In [21]:
for data_batch, labels_batch in train_gen:
    print(data_batch.shape, labels_batch.shape)
    break

(20, 150, 150, 3) (20,)


In [22]:
model = Sequential()
model.add(Conv2D(32, (3,3), activation="relu", input_shape=(150,150,3)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(64, activation="relu"))
model.add(Dense(1, activation="sigmoid"))

# `learning_rate` is the current name of the SGD argument; `lr` is a deprecated alias.
model.compile(optimizer=SGD(learning_rate=0.002, momentum=0.8),
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()

Model: "sequential_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_2 (Conv2D)            (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 74, 74, 32)        0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 175232)            0         
_________________________________________________________________
dense_3 (Dense)              (None, 64)                11214912  
_________________________________________________________________
dense_4 (Dense)              (None, 1)                 65        
=================================================================
Total params: 11,215,873
Trainable params: 11,215,873
Non-trainable params: 0
_________________________________________________________________
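
The shapes and parameter counts in the summary can be checked by hand. This sketch redoes the arithmetic in plain Python, assuming the layer defaults used in the cell above (no padding and stride 1 for the convolution, a 2x2 pooling window); the helper names are illustrative only.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


conv_side = 150 - 3 + 1                      # 3x3 kernel, no padding -> 148
pool_side = conv_side // 2                   # 2x2 max pooling -> 74
flat_size = pool_side * pool_side * 32       # flattened feature vector -> 175232

conv_count = conv2d_params(3, 3, 3, 32)      # 896
dense1_count = dense_params(flat_size, 64)   # 11214912
dense2_count = dense_params(64, 1)           # 65
total = conv_count + dense1_count + dense2_count

print(conv_side, pool_side, flat_size)
print(conv_count, dense1_count, dense2_count, total)
```

Almost all of the 11,215,873 parameters sit in the first dense layer, because it is fully connected to the 175,232-element flattened feature map.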


In [23]:
history = model.fit(train_gen, steps_per_epoch=100, epochs=10, validation_data=val_gen, validation_steps=50)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
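
It may help to see how much data each epoch draws with these settings. The sketch below uses only numbers that appear in the cells above (batch size 20, 20,000 training and 5,000 validation images, 100 and 50 steps); the step counts for a full pass are a suggestion, not something the notebook uses.

```python
import math

batch_size = 20
train_images = 20000
val_images = 5000
steps_per_epoch = 100
validation_steps = 50

# Images drawn per epoch with the settings used in the fit call.
train_seen = steps_per_epoch * batch_size    # 2000
val_seen = validation_steps * batch_size     # 1000

# Steps that would be needed to cover every image once per epoch.
full_train_steps = math.ceil(train_images / batch_size)   # 1000
full_val_steps = math.ceil(val_images / batch_size)       # 250

print(train_seen, val_seen, full_train_steps, full_val_steps)
```

With `steps_per_epoch=100`, each epoch draws 2,000 augmented images, a tenth of the training set size, and each validation pass uses 1,000 of the 5,000 validation images.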


In [24]:
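# Training accuracy and loss recorded at the end of each epoch.
acc_aug = history.history["accuracy"]
loss_aug = history.history["loss"]
# Mean accuracy over the last five epochs, and mean loss over all ten epochs.
print(np.mean(acc_aug[5:10]), np.mean(loss_aug))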

0.5468 0.68962381118536
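
One way to read these numbers is to compare them with a chance baseline. The sketch below assumes balanced classes (10,000 training images per label, as in the split described at the top) and computes the binary cross-entropy of a model that always predicts 0.5.

```python
import math

# Binary cross-entropy when every prediction is 0.5, regardless of the true label.
chance_loss = -math.log(0.5)   # equals ln(2)
chance_accuracy = 0.5          # balanced classes assumed

# Values printed by the cell above.
reported_accuracy = 0.5468
reported_loss = 0.68962381118536

print(round(chance_loss, 4))
print(round(chance_loss - reported_loss, 4))
print(round(reported_accuracy - chance_accuracy, 4))
```

The reported loss is only slightly below ln(2) ≈ 0.6931 and the accuracy is only a few points above 50%, which suggests the model has learned little after ten short epochs.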
