# Dogs vs. Cats Redux
### This notebook classifies photos as dogs or cats

Below are the Python packages needed.

In [1]:
import os
import cv2
import numpy as np
import random
from tqdm import tqdm

TRAIN_DIR = "../input/train"
TEST_DIR = "../input/test"
IMG_SIZE = 50
LR = 1e-3

MODEL_NAME = "dogsvscats-{}-{}.model".format(LR, "2conv-basic")

I've used one-hot classification labels. Each label is a two-dimensional vector `[x, y]`, where `x` scores how cat-like the photo is and `y` scores how dog-like it is. A photo of a cat is therefore labeled `[1, 0]` and a photo of a dog `[0, 1]`.

In [2]:
def label_img(img):
    # The class name is the part of the file name before the first dot.
    word_label = img.split(".")[0]
    if word_label == "cat":
        return [1, 0]
    elif word_label == "dog":
        return [0, 1]
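
As a quick sanity check of this labeling scheme, the sketch below applies the same rule to a couple of file names and decodes a label vector back to a class name with `argmax`. The example file names, the `CLASS_NAMES` list, the `decode_label` helper, and the `ValueError` for unexpected names are illustrative assumptions, not part of the notebook.

```python
import numpy as np

# Hypothetical class order matching the one-hot layout [cat, dog].
CLASS_NAMES = ["cat", "dog"]


def label_img(img):
    """Return the one-hot label encoded at the start of a file name."""
    word_label = img.split(".")[0]
    if word_label == "cat":
        return [1, 0]
    elif word_label == "dog":
        return [0, 1]
    # Assumption: fail loudly instead of silently returning None.
    raise ValueError("Unexpected file name: {}".format(img))


def decode_label(vector):
    """Map a 2-element score vector back to 'cat' or 'dog' (illustrative helper)."""
    return CLASS_NAMES[int(np.argmax(vector))]


# Example file names are assumptions following a '<label>.<id>.jpg' pattern.
for name in ["cat.0.jpg", "dog.12.jpg"]:
    one_hot = label_img(name)
    print(name, one_hot, decode_label(one_hot))

# A soft prediction decodes the same way: the larger score wins.
print(decode_label([0.2, 0.8]))
```

Raising on an unknown prefix is one possible design choice; it surfaces stray files early rather than letting a `None` label reach the training arrays.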

Let's create training data.

In [3]:
def create_train_data():
    training_data = []
    for img in tqdm(os.listdir(TRAIN_DIR)):
        label = label_img(img)
        path = os.path.join(TRAIN_DIR, img)
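        # Load as grayscale and resize to IMG_SIZE x IMG_SIZE.
        img_resized = cv2.resize(cv2.imread(path, cv2.IMREAD_GRAYSCALE), (IMG_SIZE, IMG_SIZE))
        training_data.append([np.array(img_resized), np.array(label)])
    random.shuffle(training_data)
    # Image and label arrays differ in shape, so save them as an object array.
    np.save("training_data.npy", np.array(training_data, dtype=object))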
    return training_data

Then let's process test data.

In [4]:
def process_test_data():
    testing_data = []
    for img in tqdm(os.listdir(TEST_DIR)):
        path = os.path.join(TEST_DIR, img)
        img_num = img.split(".")[0]
        img_resized = cv2.resize(cv2.imread(path, cv2.IMREAD_GRAYSCALE), (IMG_SIZE, IMG_SIZE))
        testing_data.append([np.array(img_resized), img_num])
    # Each entry pairs an image array with its id, so save as an object array.
    np.save("testing_data.npy", np.array(testing_data, dtype=object))
    return testing_data

If the training data has already been saved, we can load it from the saved NumPy file instead of rebuilding it.

In [5]:
train_data = create_train_data()
# train_data = np.load("training_data.npy", allow_pickle=True)

100%|██████████| 25000/25000 [00:33<00:00, 742.87it/s]
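
The save-and-reload step can be sketched on synthetic data, without the image files. This is a minimal illustration, assuming random pixel arrays as stand-ins for photos and a hypothetical file name `training_data_demo.npy` in a temporary directory.

```python
import os
import tempfile

import numpy as np

IMG_SIZE = 50  # same image size as in the notebook

# Synthetic stand-ins for [image, label] pairs (random pixels, not real photos).
rng = np.random.RandomState(0)
samples = [
    [rng.randint(0, 256, size=(IMG_SIZE, IMG_SIZE)).astype(np.uint8), np.array(label)]
    for label in ([1, 0], [0, 1], [1, 0])
]

# Image and label have different shapes, so store them in an object array.
data = np.empty((len(samples), 2), dtype=object)
for i, (image, label) in enumerate(samples):
    data[i, 0] = image
    data[i, 1] = label

path = os.path.join(tempfile.mkdtemp(), "training_data_demo.npy")
np.save(path, data)

# Object arrays are pickled on disk, so loading needs allow_pickle=True.
loaded = np.load(path, allow_pickle=True)
print(loaded.shape)        # (3, 2)
print(loaded[0][0].shape)  # (50, 50)
print(loaded[1][1])        # [0 1]
```

Only load pickled `.npy` files that you created yourself, since unpickling untrusted data can execute arbitrary code.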


Next, I define a convolutional network in tflearn, which runs on top of TensorFlow.

**Two convolutional layers with ReLU activation, trained with the Adam optimizer, a categorical cross-entropy loss, and a learning rate of 0.001**

In [6]:
import tflearn
import tensorflow as tf
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression

tf.reset_default_graph()

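# Grayscale input: batches of IMG_SIZE x IMG_SIZE images with one channel.
convnet = input_data(shape=[None, IMG_SIZE, IMG_SIZE, 1], name='input')

# First block: 32 filters of size 2x2, then 2x2 max pooling.
convnet = conv_2d(convnet, 32, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

# Second block: 64 filters of size 2x2, then 2x2 max pooling.
convnet = conv_2d(convnet, 64, 2, activation='relu')
convnet = max_pool_2d(convnet, 2)

# Dense layer; the dropout argument is the keep probability (0.8).
convnet = fully_connected(convnet, 1024, activation='relu')
convnet = dropout(convnet, 0.8)

# Two-way softmax output matching the [cat, dog] one-hot labels.
convnet = fully_connected(convnet, 2, activation='softmax')
convnet = regression(convnet, optimizer='adam', learning_rate=LR, loss='categorical_crossentropy', name='targets')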

model = tflearn.DNN(convnet, tensorboard_dir='log')
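
To get a feel for the size of this network, the sketch below traces the feature-map shapes and counts the trainable parameters by hand. It assumes tflearn's default settings for these layers (`'same'` padding, convolution stride 1, pooling stride equal to the kernel size, and a bias on every layer); `same_conv` and `same_pool` are hypothetical helpers written for this illustration.

```python
import math

IMG_SIZE = 50


def same_conv(size):
    """Assumed 'same' padding with stride 1: the spatial size is unchanged."""
    return size


def same_pool(size, kernel):
    """Assumed 'same' padding with stride == kernel: the size is rounded up."""
    return int(math.ceil(size / float(kernel)))


size, channels = IMG_SIZE, 1
params = {}

# conv_2d with 32 filters of size 2x2, followed by max_pool_2d(2)
params["conv1"] = 2 * 2 * channels * 32 + 32
size, channels = same_pool(same_conv(size), 2), 32  # 25 x 25 x 32

# conv_2d with 64 filters of size 2x2, followed by max_pool_2d(2)
params["conv2"] = 2 * 2 * channels * 64 + 64
size, channels = same_pool(same_conv(size), 2), 64  # 13 x 13 x 64

# The fully connected layers work on the flattened feature map.
flat = size * size * channels
params["fc1"] = flat * 1024 + 1024
params["fc2"] = 1024 * 2 + 2

print("final feature map:", size, "x", size, "x", channels, "->", flat, "values")
for name, count in sorted(params.items()):
    print(name, count)
print("total", sum(params.values()))
```

Under these assumptions almost all of the parameters sit in the first fully connected layer, which is typical for small convolutional networks of this shape.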



Let's quickly check what has been saved so far. The listing shows the notebook's own files along with the other saved files.

In [7]:
print(os.listdir("."))

['.ipynb_checkpoints', '__notebook_source__.ipynb', 'training_data.npy']


If the model has already been saved as a checkpoint, we can simply load it and keep working.

In [8]:
if os.path.exists("{}.meta".format(MODEL_NAME)):
    model.load(MODEL_NAME)
    print("Model loaded")

Let's hold out the last 500 samples for validation.

In [9]:
train = train_data[:-500]
test = train_data[-500:]

In [10]:
X = np.array([i[0] for i in train]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
y = [i[1] for i in train]
test_X = np.array([i[0] for i in test]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
test_y = [i[1] for i in test]
# X.shape, test_X.shape
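
The split and reshape can be checked on synthetic data. The sketch below is a minimal illustration that assumes random arrays in place of real images and much smaller sizes (`N_SAMPLES`, `N_VAL`) than the notebook's 25000 images and 500 held-out samples.

```python
import numpy as np

IMG_SIZE = 50
N_SAMPLES, N_VAL = 20, 5  # small illustrative sizes; the notebook holds out 500

# Synthetic [image, label] pairs standing in for train_data.
rng = np.random.RandomState(0)
train_data = [
    [
        rng.randint(0, 256, size=(IMG_SIZE, IMG_SIZE)),
        np.array([1, 0]) if i % 2 == 0 else np.array([0, 1]),
    ]
    for i in range(N_SAMPLES)
]

# Everything except the last N_VAL samples is used for training.
train = train_data[:-N_VAL]
test = train_data[-N_VAL:]

# Stack images into (samples, height, width, channels); grayscale has one channel.
X = np.array([i[0] for i in train]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
y = np.array([i[1] for i in train])
test_X = np.array([i[0] for i in test]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
test_y = np.array([i[1] for i in test])

print(X.shape, y.shape)            # (15, 50, 50, 1) (15, 2)
print(test_X.shape, test_y.shape)  # (5, 50, 50, 1) (5, 2)
```

Because the data was shuffled when it was created, taking the last samples as the hold-out set gives a random mix of both classes.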

Now, let's fit the model and track the validation loss and accuracy at each epoch as well as overall.

In [None]:
model.fit({"input": X},
          {"targets": y},
          n_epoch=5,
          validation_set=({"input": test_X}, {"targets": test_y}),
          snapshot_step=500,
          show_metric=True,
          run_id=MODEL_NAME)

---------------------------------
Run id: dogsvscats-0.001-2conv-basic.model
Log directory: log/
INFO:tensorflow:Summary name Accuracy/ (raw) is illegal; using Accuracy/__raw_ instead.
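
For intuition about the two numbers reported during training, here is a minimal NumPy sketch of categorical cross-entropy and accuracy on one-hot targets. The targets and predictions are made up, and the functions are an illustration of the textbook definitions, not tflearn's own implementation, which may differ in details such as clipping.

```python
import numpy as np


def categorical_crossentropy(y_true, y_pred, eps=1e-10):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))


def accuracy(y_true, y_pred):
    """Fraction of rows whose argmax matches the one-hot target."""
    return float(np.mean(np.argmax(y_pred, axis=1) == np.argmax(y_true, axis=1)))


# Made-up targets and softmax outputs for four images, in [cat, dog] order.
y_true = np.array([[1, 0], [0, 1], [1, 0], [0, 1]])
y_pred = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]])

print("accuracy:", accuracy(y_true, y_pred))  # 0.75 (the third image is misclassified)
print("loss:", categorical_crossentropy(y_true, y_pred))
```

The loss keeps falling as correct predictions become more confident, even when accuracy stays flat, which is why both are worth watching.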
