# Cats vs dogs

## Convnet example

### Data

The data is provided by Kaggle. Unfortunately the download link is not public, so you will need to create an account at kaggle.com (it is free) and then download:

https://www.kaggle.com/c/dogs-vs-cats/download/train.zip

Note that the file name is "train.zip" and that it contains all the labeled data; we will need to split it into our own train, validation and test sets.

Once you have the data, unzip it into a directory named "data" in the same directory as this Jupyter notebook.


In [None]:
# Let's import some stuff
import tensorflow as tf
from skimage import io
from IPython.display import Image
#from skimage.transform import resize
import matplotlib.pyplot as plt
from os import listdir
from os import mkdir
#from skimage.io import imsave
import numpy as np
from sklearn.utils import shuffle
import sys
#import os
import cv2

#print("Installed version of tensorflow is ", tf.__version__)
print("Important! tensorflow MUST be 1.2 or higher for this to work fine...")

### Explore the dataset

Kaggle describes the dataset as follows:

"The training archive contains 25,000 images of dogs and cats. Train your algorithm on these files and predict the labels for test1.zip (1 = dog, 0 = cat)."

Notice! Your directory should look like this (once you have uncompressed the data):
```
.
├── cats-vs-dogs.ipynb
└── data
    └── train [25000 entries exceeds filelimit, not opening dir]
            ├── cat.2976.jpg
            ├── dog.2977.jpg
            ├── cat.2978.jpg
            ├── dog.2979.jpg
            ├── ...

```
Of course, images named cat.xxx.jpg are cats, and the same goes for dogs: their file names start with dog.
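
As a small illustration of this naming convention, the sketch below derives a label from a file name. The helper `label_from_filename` is hypothetical and not part of this notebook (the loading code further down simply checks whether "dog" or "cat" appears in the name); the label values are the ones this notebook uses later.

```python
CAT_LABEL = 1
DOG_LABEL = 0

def label_from_filename(file_name):
    """Return the label encoded in a name such as 'cat.42.jpg' (hypothetical helper)."""
    prefix = file_name.split(".")[0]
    if prefix == "cat":
        return CAT_LABEL
    if prefix == "dog":
        return DOG_LABEL
    raise ValueError("Unexpected file name: " + file_name)

print(label_from_filename("cat.42.jpg"))  # 1
print(label_from_filename("dog.42.jpg"))  # 0
```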

Let's open a few images.


In [None]:
DATA_DIR = "data/train/"
Image(filename=DATA_DIR+'cat.42.jpg')


Also, let's show a dog.

In [None]:
Image(filename=DATA_DIR+'dog.42.jpg')

The first obvious thing is that we are dealing with images of different sizes, so we need to make them all the same size. Bigger images mean more data, which means more processing time. For our example we will resize all images to 100x100 pixels.

In [None]:
IMAGE_WIDTH=100
IMAGE_HEIGHT=100
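
To get a feeling for the size/processing trade-off, here is a rough back-of-the-envelope calculation. It assumes 3 colour channels and one byte per value (`uint8`), and uses the 25,000 images mentioned in the Kaggle description; `CHANNELS` and `NUM_IMAGES` are names introduced only for this sketch.

```python
IMAGE_WIDTH = 100
IMAGE_HEIGHT = 100
CHANNELS = 3        # assumption: colour images
NUM_IMAGES = 25000  # size of the Kaggle training archive

values_per_image = IMAGE_WIDTH * IMAGE_HEIGHT * CHANNELS
total_bytes_uint8 = values_per_image * NUM_IMAGES  # one byte per value

print(values_per_image)         # 30000
print(total_bytes_uint8 / 1e6)  # 750.0 (megabytes)
```

Doubling the side length to 200x200 would multiply both numbers by four, which is why we keep the images small.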

Let's read one cat image and transform it.

In [None]:
sample_cat_file = DATA_DIR+'cat.42.jpg'
original = cv2.imread(sample_cat_file)
print("Original shape is", original.shape)
transformed = cv2.resize(original, (IMAGE_WIDTH, IMAGE_HEIGHT))
print("Transformed shape is", transformed.shape)

Seems fine... let's convince ourselves it is the same image by displaying it.

In [None]:
plt.imshow(cv2.cvtColor(transformed, cv2.COLOR_BGR2RGB))
plt.show()

The image has been resized. Of course this means losing some data, well, nobody is perfect :)
Now we need to convert ALL our images. We will store them in a separate directory and use that directory for subsequent executions.

In [None]:

CLEANED_DATA_DIR = DATA_DIR+"cleaned/"
try:
    mkdir(CLEANED_DATA_DIR)
except OSError:
    # Most likely the directory already exists from a previous run
    pass

def convert_images():
    i = 1
    for image_file in listdir(DATA_DIR):
        if ".jpg" in image_file:
            original = cv2.imread(DATA_DIR+image_file)
            transformed = cv2.resize(original, (IMAGE_WIDTH, IMAGE_HEIGHT))
            final_file = CLEANED_DATA_DIR+image_file
            # cv2.imwrite expects BGR channel order, which is what cv2.imread returned
            cv2.imwrite(final_file, transformed)
            if i % 500 == 0:
                print("Converted ", i, " images so far...")
            i += 1
    print("Done!")

In [None]:
convert_images()

Now we will read the new files and load them. We load cats and dogs separately (just for convenience) and then split them into train, validation and test sets:

* Train set 80% of the images
* Validation set 10% of the images
* Test set 10% of the images

In [None]:
CAT_LABEL = 1
DOG_LABEL = 0

def load_data(limit=None):
    """
    Loads up to `limit` cat images and up to `limit` dog images
    from CLEANED_DATA_DIR and returns them as two lists:
    cats, dogs
    """
    dogs = []
    cats = []
    i = 1
    if limit is None:
        limit = 50000
    for image_file in listdir(CLEANED_DATA_DIR):        
        if ".jpg" in image_file:
            if "dog" in image_file and len(dogs) < limit:
                raw_image = io.imread(CLEANED_DATA_DIR+image_file)
                dogs.append(raw_image)
            elif "cat" in image_file and len(cats) < limit:
                raw_image = io.imread(CLEANED_DATA_DIR+image_file)
                cats.append(raw_image)
        if i % 500 == 0:
            print("Loaded ", i, " images so far...")
        i+=1    
            
    return cats, dogs

In [None]:
cats, dogs = load_data(7500)

Now we need to split our data into train, validation and test sets. As there are as many cats as there are dogs, we want to keep our sets balanced, with approximately 50% of cats and 50% of dogs. 

In [None]:
train_limit = int(len(cats) * 0.8)
validation_limit = train_limit + int(len(cats) * 0.1)

X_train_cats = np.array(cats[:train_limit])
X_train_dogs = np.array(dogs[:train_limit])
X_validation_cats = np.array(cats[train_limit:validation_limit])
X_validation_dogs = np.array(dogs[train_limit:validation_limit])
X_test_cats = np.array(cats[validation_limit:])
X_test_dogs = np.array(dogs[validation_limit:])
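
As a quick sanity check of the arithmetic above, the sketch below computes how many images per class end up in each set. `split_sizes` is a hypothetical helper written only for this illustration; it mirrors the `train_limit` / `validation_limit` computation and assumes 7,500 images per class, as loaded above.

```python
def split_sizes(n_images, train_fraction=0.8, validation_fraction=0.1):
    """Return (train, validation, test) sizes for one class."""
    train_limit = int(n_images * train_fraction)
    validation_limit = train_limit + int(n_images * validation_fraction)
    # Whatever is left after the validation slice goes to the test set
    return train_limit, validation_limit - train_limit, n_images - validation_limit

print(split_sizes(7500))  # (6000, 750, 750)
```

Since cats and dogs are sliced with the same limits, each combined set holds twice these numbers.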

### Now we need to create our labels. Remember that we have defined our labels as variables
```
CAT_LABEL = 1
DOG_LABEL = 0
```

Meaning that cats are labelled as 1 and dogs as 0. Note that this is the opposite of the Kaggle convention quoted earlier (1 = dog, 0 = cat).

In [None]:
y_train_cats = np.ones(len(X_train_cats), dtype=np.int8)
y_train_dogs = np.zeros(len(X_train_dogs), dtype=np.int8)
y_validation_cats = np.ones(len(X_validation_cats), dtype=np.int8)
y_validation_dogs = np.zeros(len(X_validation_dogs), dtype=np.int8)
y_test_cats = np.ones(len(X_test_cats), dtype=np.int8)
y_test_dogs = np.zeros(len(X_test_dogs), dtype=np.int8)

y_train_cats.shape

In [None]:
X_train = np.append(X_train_cats, X_train_dogs, axis=0)
y_train = np.append(y_train_cats, y_train_dogs)
X_validation = np.append(X_validation_cats, X_validation_dogs, axis=0)
y_validation = np.append(y_validation_cats, y_validation_dogs)
X_test = np.append(X_test_cats, X_test_dogs, axis=0)
y_test = np.append(y_test_cats, y_test_dogs)

### Let's convert labels into one-hot-encoded values

In [None]:
y_train_one_hot = np.eye(2)[y_train.reshape(-1)]
y_validation_one_hot = np.eye(2)[y_validation.reshape(-1)]
y_test_one_hot = np.eye(2)[y_test.reshape(-1)]
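
The `np.eye(2)[...]` trick may look cryptic: `np.eye(2)` is the 2x2 identity matrix, and indexing it with an array of labels picks row `label` for every entry. A tiny example with made-up labels:

```python
import numpy as np

labels = np.array([1, 0, 1], dtype=np.int8)  # cat, dog, cat
one_hot = np.eye(2)[labels]                  # row 1, row 0, row 1 of the identity matrix

print(one_hot.tolist())  # [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
```

So a dog (label 0) becomes `[1, 0]` and a cat (label 1) becomes `[0, 1]`.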

### Now, let's make sure the arrays are in the correct shape

In [None]:
print("X_train ", X_train.shape, " y_train_one_hot ", y_train_one_hot.shape)
print("X_validation ", X_validation.shape, " y_validation_one_hot ", y_validation_one_hot.shape)
print("X_test ", X_test.shape, " y_test_one_hot ", y_test_one_hot.shape)

### And finally shuffle the arrays so that our batches are not all dogs or all cats

In [None]:
X_train, y_train_one_hot, y_train = shuffle(X_train, y_train_one_hot, y_train, 
                                            random_state=0)
X_validation, y_validation_one_hot, y_validation = shuffle(X_validation, y_validation_one_hot, y_validation, 
                                                           random_state=0)
X_test, y_test_one_hot, y_test = shuffle(X_test, y_test_one_hot, y_test, 
                                 random_state=0)
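
The important property here is that all arrays passed in one call are shuffled with the same permutation, so image `i` still matches label `i`. The sketch below shows the same idea in plain NumPy; `shuffle_together` is a hypothetical helper for illustration only, not how scikit-learn implements `shuffle`.

```python
import numpy as np

def shuffle_together(*arrays, seed=0):
    """Apply one random permutation to every array (hypothetical helper)."""
    rng = np.random.RandomState(seed)
    permutation = rng.permutation(len(arrays[0]))
    return [array[permutation] for array in arrays]

images = np.arange(5) * 10  # stand-in for image data: 0, 10, 20, 30, 40
labels = np.arange(5)       # stand-in for labels:     0, 1, 2, 3, 4

shuffled_images, shuffled_labels = shuffle_together(images, labels)
# Every "image" still equals ten times its "label"
print(np.array_equal(shuffled_images, shuffled_labels * 10))  # True
```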

In [None]:
print(y_train[0:10])
print(y_train_one_hot[0:10])

### Now, let's verify that we have shuffled correctly. We will display the first ten images of the train set and check them against the first ten labels

In [None]:
for i in range(10):
    label = y_train_one_hot[i]
    print(label)
    if label[DOG_LABEL] == 1:
        print("This should be a dog...")
        
    elif label[CAT_LABEL] == 1:
        print("This should be a cat...")
    
    plt.imshow(X_train[i])
    plt.show()

### We still need to normalize our data. Remember that normalization is done with the following formula
![title](normalization.png)

In [None]:
def normalize(raw_data):
    min_value = np.min(raw_data)
    max_value = np.max(raw_data)
    result = (raw_data - min_value) / (max_value - min_value)
    return result


X_train_normalized = normalize(X_train)
X_validation_normalized = normalize(X_validation)
X_test_normalized = normalize(X_test)
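
To see what this does on a small scale, here is the same min-max function applied to three made-up pixel values. The function is repeated only so that the snippet runs on its own.

```python
import numpy as np

def normalize(raw_data):
    min_value = np.min(raw_data)
    max_value = np.max(raw_data)
    return (raw_data - min_value) / (max_value - min_value)

pixels = np.array([0, 51, 255], dtype=np.uint8)  # made-up values
normalized = normalize(pixels)

print(normalized.min(), normalized.max())  # 0.0 1.0
```

The smallest value maps to 0, the largest to 1, and everything else lands proportionally in between (51 becomes 0.2 here).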


In [None]:
# And remove some data we have in memory but we do not need anymore
del cats
del dogs
del X_train
del X_test
del X_validation
del X_train_cats 
del X_train_dogs
del X_validation_cats
del X_validation_dogs
del X_test_cats
del X_test_dogs

In [None]:
X_train_normalized[1].shape

### Machine learning! (at last!)

Things to do are

* Define hyperparameters
* Build the network itself
  * Placeholder definitions
  * Code per se
* Write the training code

#### Define hyperparameters

In [None]:
FULLY_CONNECTED_LAYER_1 = 1024
FULLY_CONNECTED_LAYER_2 = 1024
CONVOLUTION_1_OUTPUT = 16
CONVOLUTION_2_OUTPUT = 32
BATCH_SIZE = 32
EPOCHS = 100
TOTAL_BATCHES = X_train_normalized.shape[0] // BATCH_SIZE
LABELS = 2 # Either cats or dogs
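
Because `TOTAL_BATCHES` uses integer division, any leftover samples that do not fill a whole batch are simply skipped in each epoch. The sketch below illustrates the slicing; `batch_slices` is a hypothetical helper, and 12,000 is the training set size we would expect from 6,000 cats plus 6,000 dogs.

```python
def batch_slices(n_samples, batch_size):
    """Yield (start, end) indices of every full batch (hypothetical helper)."""
    total_batches = n_samples // batch_size
    for batch_no in range(total_batches):
        start = batch_no * batch_size
        yield start, start + batch_size

slices = list(batch_slices(12000, 32))
print(len(slices))  # 375
print(slices[-1])   # (11968, 12000)

# With a size that is not a multiple of the batch size, the tail is dropped
print(list(batch_slices(100, 32)))  # [(0, 32), (32, 64), (64, 96)]
```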

#### Build the network itself

In [None]:
# Image arrays are laid out as (height, width, channels)
X = tf.placeholder(tf.float32, shape=(None, IMAGE_HEIGHT, IMAGE_WIDTH, 3), name="X")
y = tf.placeholder(tf.float32, shape=(None, LABELS), name="y")
keep_prob = tf.placeholder(tf.float32, name="keep_prob")

In [None]:
# First convolution, turn the image into 100x100xCONVOLUTION_1_OUTPUT

convolution_1 = tf.layers.conv2d(X,    
                                 CONVOLUTION_1_OUTPUT, # Output size
                                 (3,3), # Kernel/patch size
                                 strides=(1,1), 
                                 padding="SAME",
                                 activation=tf.nn.relu)

# Max pool to reduce image from 100x100xCONVOLUTION_1_OUTPUT to 34x34xCONVOLUTION_1_OUTPUT

convolution_1 = tf.layers.max_pooling2d(convolution_1, 
                                        3,  # Kernel/patch size 
                                        3,  # Strides, this will effectively shrink the output dimension, making 
                                            # it ceil(100 / 3) = 34 with "SAME" padding
                                        padding="SAME")

# Second convolution, turn the image into 34x34xCONVOLUTION_2_OUTPUT
convolution_2 = tf.layers.conv2d(convolution_1,    
                                 CONVOLUTION_2_OUTPUT, # Output size
                                 (3,3), # Kernel/patch size
                                 strides=(1,1), 
                                 padding="SAME",
                                 activation=tf.nn.relu)

# Max pool to reduce image from 34x34xCONVOLUTION_2_OUTPUT to 12x12xCONVOLUTION_2_OUTPUT
convolution_2 = tf.layers.max_pooling2d(convolution_2, 
                                        3,  # Kernel/patch size 
                                        3,  # Strides, this will effectively shrink the output dimension, making 
                                            # it ceil(34 / 3) = 12 with "SAME" padding
                                        padding="SAME")


# So the output of the convolution is 12x12x32 = 4608, let's use that for a "normal" neural network

fully_connected_1 = tf.layers.dense(tf.reshape(convolution_2, (-1, 12*12*32)),
                                FULLY_CONNECTED_LAYER_1, 
                                activation=tf.nn.relu)

fully_connected_1 = tf.nn.dropout(fully_connected_1, keep_prob)

fully_connected_2 = tf.layers.dense(fully_connected_1,
                                FULLY_CONNECTED_LAYER_2, 
                                activation=tf.nn.relu)

fully_connected_2 = tf.nn.dropout(fully_connected_2, keep_prob)

predictions = tf.layers.dense(fully_connected_2, 
                              LABELS)  

softmax_calc = tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=predictions)
cost = tf.reduce_mean(softmax_calc)

train_step = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(cost)


#train_step = tf.train.AdamOptimizer().minimize(error)
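
The sizes used in the comments above follow from how "SAME" padding works: the output size is the input size divided by the stride, rounded up. A small sketch of that arithmetic (`same_padding_output_size` is a name made up for this illustration, not a TensorFlow function):

```python
import math

def same_padding_output_size(input_size, stride):
    """Spatial output size of a conv/pool layer that uses "SAME" padding."""
    return int(math.ceil(input_size / float(stride)))

after_pool_1 = same_padding_output_size(100, 3)          # 34
after_pool_2 = same_padding_output_size(after_pool_1, 3)  # 12
flattened = after_pool_2 * after_pool_2 * 32              # 32 = CONVOLUTION_2_OUTPUT

print(after_pool_1, after_pool_2, flattened)  # 34 12 4608
```

The convolutions themselves use a stride of 1, so they leave the 100x100 and 34x34 sizes unchanged; only the pooling layers shrink the image.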

#### Write the training code

In [None]:
session = tf.InteractiveSession()
session.run(tf.global_variables_initializer())
dropout_training = 0.5
dropout_predicting = 1.0
debug = True

# Build the accuracy ops once, outside the loop, so the graph does not grow
# every time we report progress
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(predictions, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

for epoch_no in range(EPOCHS):
    for batch_no in range(TOTAL_BATCHES):
        # TOTAL_BATCHES uses integer division, so every slice is a full batch
        slice_start = batch_no * BATCH_SIZE
        slice_end = slice_start + BATCH_SIZE

        my_X = X_train_normalized[slice_start:slice_end]
        my_y = y_train_one_hot[slice_start:slice_end]
        train_feed = {X: my_X, y: my_y, keep_prob: dropout_training}

        if debug:
            # Print the output shape of each layer once, as a sanity check
            print("Convolution_1", convolution_1.eval(feed_dict=train_feed).shape)
            print("Convolution_2", convolution_2.eval(feed_dict=train_feed).shape)
            print("Fully connected_1", fully_connected_1.eval(feed_dict=train_feed).shape)
            print("Fully connected_2", fully_connected_2.eval(feed_dict=train_feed).shape)
            debug = False

        train_step.run(feed_dict=train_feed)
        if batch_no % 100 == 0:
            # Accuracy is measured with dropout disabled (keep_prob = 1.0)
            print("Epoch ", epoch_no,
                  " batch number ", batch_no,
                  " cost ", cost.eval(feed_dict=train_feed),
                  " \tval accuracy ", accuracy.eval(feed_dict={X: X_validation_normalized,
                                                               y: y_validation_one_hot,
                                                               keep_prob: dropout_predicting}),
                  "\ttrain accuracy", accuracy.eval(feed_dict={X: my_X,
                                                               y: my_y,
                                                               keep_prob: dropout_predicting}))

print("DONE!!")
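
The accuracy reported during training compares the position of the largest network output with the position of the 1 in the one-hot label. The sketch below shows the same computation in plain NumPy on made-up numbers; `accuracy_from_logits` is a hypothetical helper and is not used by the training code.

```python
import numpy as np

def accuracy_from_logits(logits, one_hot_labels):
    """Fraction of rows where the largest logit matches the one-hot label."""
    predicted = np.argmax(logits, axis=1)
    expected = np.argmax(one_hot_labels, axis=1)
    return np.mean(predicted == expected)

# Made-up network outputs for four images (column 0 = dog, column 1 = cat)
logits = np.array([[2.0, -1.0],
                   [0.3, 0.9],
                   [1.5, 0.2],
                   [-0.5, 0.1]])
labels = np.array([[1, 0],
                   [0, 1],
                   [0, 1],
                   [0, 1]])

print(accuracy_from_logits(logits, labels))  # 0.75
```

Three of the four rows agree (the third image is predicted as a dog but labelled as a cat), hence 0.75.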