<a href="https://colab.research.google.com/github/nguyetvo/cat_dog_classification/blob/master/Nguyet_Vo_catdog.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
!python3 -m pip install tensorflow-gpu==2.0.0 --user 

# **Cat & Dog Classification with TensorFlow**
This notebook uses TensorFlow to classify a dataset of 25,000 images of dogs and cats. It walks through loading and preparing the data, defining the model, and training it.

# **Importing Libraries & Loading Data**

In [0]:
#!pip install --upgrade "tensorflow==1.4" "keras>=2.0"

In [0]:
## Import libraries
import numpy as np 
import matplotlib.pyplot as plt 
import tensorflow as tf 
import os 

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense, Activation

In [0]:
!pip install --upgrade --force-reinstall --no-deps kaggle

In [0]:
!kaggle --version

In [0]:
os.environ['KAGGLE_USERNAME'] = "nguyetvo" # username from the json file
os.environ['KAGGLE_KEY'] = "<your-kaggle-key>" # key from the json file; never commit a real key to a public notebook

In [0]:
!kaggle competitions download -c dogs-vs-cats

In [0]:
!unzip dogs-vs-cats.zip

In [0]:
!unzip train.zip
!unzip test1.zip

In [0]:
# Path to train & test data

train_dir = 'train/'
test_dir = 'test1/'

# All image paths 
train_data_path = [os.path.join(train_dir, file_name) for file_name in os.listdir(train_dir)]

In [0]:
os.listdir(train_dir)[:5]

**Load and Preprocess Data**

This dataset consists of 25,000 JPEG images, split equally between cats and dogs.

To process the images, let's apply a few functions from the tensorflow image module, plus one from the io module. 
1.  **tf.image.decode_jpeg**
- From the module tf.image, the decode_jpeg function decodes a JPEG-encoded image to a uint8 tensor. In our implementation, I pass the raw image as contents and set channels to 3, the number of color channels for the decoded image. 
2.  **tf.image.resize** 
- From the same module, I employ resize which, per its name, resizes images to a specific size using the specified method. 
3. **Normalization**
- Next, I divide the image by 255 to normalize the pixel values to the range 0 to 1. Data normalization is an important step which ensures that each input parameter (pixel, in this case) has a similar data distribution. This usually makes convergence faster while training the network.
4. **tf.io.read_file**
- To load the images from their paths, let's use read_file from the tensorflow io module, which reads and outputs the entire contents of the input filename. This step runs first, so that decode_jpeg receives the raw file contents.


In [0]:
IMAGE_SIZE = 299

'''
FIRST, I DEFINE A FUNCTION THAT APPLIES TENSORFLOW FUNCTIONS TO PROCESS RAW IMAGES.
'''

def preprocess_image(image):
    
    #decode image into tensors
    image = tf.image.decode_jpeg(image, channels = 3) 
    
    #resize image to fit with Xception's required input
    image = tf.image.resize(image, [IMAGE_SIZE, IMAGE_SIZE]) 
    
    #normalize pixels to range [0, 1]
    image /= 255.0 

    #return preprocessed images
    return image

'''
NOW, LET'S DEFINE A FUNCTION TO LOAD IMAGES FROM IMAGE PATHS, 
APPLY PREPROCESSING AND RETURN PREPROCESSED IMAGES. 
'''

def load_and_preprocess_image(path):
    image = tf.io.read_file(path)
    return preprocess_image(image)

**Next, let's employ another one of tensorflow's modules - data.**

With the **tf.data.Dataset.from_tensor_slices( )** method, I can slice a list or array along its first dimension, so that each element (here, each image path) becomes one item of the dataset.

In [0]:
''' 
FROM THE LIST OF IMAGE PATHS (train_data_path), 
CREATE A DATASET WITH ONE PATH PER ELEMENT
'''
path_dataset = tf.data.Dataset.from_tensor_slices(train_data_path)

# Create image dataset from path dataset
image_dataset = path_dataset.map(load_and_preprocess_image)

In [0]:
train_data_path[:5]

In [0]:
print(type(train_data_path))
print(type(path_dataset))
print(type(image_dataset))

Our dataset is 'labelled' such that the name of each file contains 'dog' or 'cat'. 

I assign 1 to file names containing 'cat' and 0 to file names containing 'dog'. Let's save this in image_label.

Using tf.cast, I change the image_label datatype to int64 before applying from_tensor_slices on the labels, and save the result in another variable called label_dataset.
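
As a minimal sketch of this labelling rule, the pure-Python snippet below applies it to a few made-up file names. The example paths and the helper name label_from_filename are illustrative assumptions; the real names come from os.listdir(train_dir).

```python
import os

def label_from_filename(path):
    """Return 1 if the file name contains 'cat', otherwise 0 (hypothetical helper)."""
    # Look only at the file name, so a directory such as 'cats/' cannot affect the label
    file_name = os.path.basename(path)
    return 1 if 'cat' in file_name else 0

# Made-up example paths, assumed to resemble the files in train/
example_paths = ['train/cat.0.jpg', 'train/dog.0.jpg', 'train/dog.1.jpg', 'train/cat.1.jpg']
example_labels = [label_from_filename(p) for p in example_paths]
print(example_labels)
```

Checking only the base name is a small design choice: it keeps the label independent of whatever directory the images happen to live in.
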

In [0]:
'''
I ASSIGN THE IMAGE LABEL 1 IF ITS FILE NAME CONTAINS 'CAT'.
OTHERWISE (FILE NAMES CONTAINING 'DOG'), THE LABEL IS 0.
'''

# Derive the labels from train_data_path so they stay in the same order as the images
image_label = [1 if 'cat' in os.path.basename(path) else 0 for path in train_data_path]

'''
NEXT, I USE TF.CAST TO CHANGE THE DATA TYPE TO INT64 BEFORE SLICING.
'''
label_dataset = tf.data.Dataset.from_tensor_slices(tf.cast(image_label, tf.int64))

Before splitting up the dataset into three separate sets (<font color = gray> Training, Validation & Testing </font>), let's <font color = red> zip </font> the images and their respective labels together, so that each element is an (image, label) pair.

In [0]:
# Combine image dataset and image label dataset

dataset = tf.data.Dataset.zip((image_dataset, label_dataset))

In [0]:
dataset

**Split**

Train : Validation : Test = 70 : 15 : 15.
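
To make the 70 : 15 : 15 arithmetic concrete, here is a small pure-Python sketch in which list slicing stands in for take and skip. The list of integers is only a stand-in for the shuffled dataset, and integer arithmetic is used for the sizes as an assumption of convenience.

```python
DATASET_SIZE = 25000

# Split sizes for 70 : 15 : 15, computed with integer arithmetic
train_size = DATASET_SIZE * 70 // 100
val_size = DATASET_SIZE * 15 // 100
test_size = DATASET_SIZE * 15 // 100

items = list(range(DATASET_SIZE))  # stand-in for the shuffled dataset
train = items[:train_size]         # like dataset.take(train_size)
rest = items[train_size:]          # like dataset.skip(train_size)
test = rest[:test_size]            # like rest.take(test_size)
val = rest[val_size:]              # like rest.skip(val_size)

print(len(train), len(val), len(test))
```

The three slices are disjoint only because every slice is taken from the same fixed ordering; the same condition applies to take and skip on a shuffled dataset.
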

**Train Set**

- To get 70% of the data for the train set, we define train_size to be 70% of dataset_size.
- Next, I apply shuffle to tf.data.Dataset.zip((image_dataset, label_dataset)), our earlier defined dataset, using a shuffle_buffer_size of 4096. shuffle keeps a buffer of 4096 elements and draws each output element at random from that buffer, refilling it from the input as it goes. The larger the shuffle_buffer_size, the more thoroughly the data is randomized; a buffer at least as large as the dataset gives a uniform shuffle.
- Then, train_dataset is defined by applying take with train_size (70% * 25,000 = 17,500). It grabs the first 17,500 images after shuffling.
- The remaining 30% is divided with skip and take into test and validation sets of 3,750 images each.
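
The effect of the buffer size can be illustrated with a small pure-Python simulation of a fixed-size shuffle buffer. This is a simplified sketch of the idea, not TensorFlow's implementation; buffered_shuffle and its seed parameter are hypothetical names.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in an order produced by a fixed-size shuffle buffer (sketch)."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)          # still filling the buffer
            continue
        idx = rng.randrange(buffer_size)
        yield buffer[idx]                # emit a random element from the buffer
        buffer[idx] = item               # and replace it with the incoming element
    rng.shuffle(buffer)                  # input exhausted: emit what is left in random order
    yield from buffer

# With a buffer of 4, the element emitted at position k has original index below k + 4
small = list(buffered_shuffle(range(20), buffer_size=4))
# With a buffer of 1, the order is unchanged
unchanged = list(buffered_shuffle(range(10), buffer_size=1))
print(small)
print(unchanged)
```

A small buffer can only move elements a short distance from their original position, which is why a larger buffer gives a more thorough shuffle.
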

In [0]:
DATASET_SIZE = 25000
BATCH_SIZE = 128
SHUFFLE_BUFFER_SIZE = 4096

train_size = int(0.7 * DATASET_SIZE)
val_size = int(0.15 * DATASET_SIZE)
test_size = int(0.15 * DATASET_SIZE)

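# A fixed seed with reshuffle_each_iteration = False is meant to keep the shuffled order
# identical on every pass; otherwise take/skip would select different images each epoch
# and the train, validation and test sets could overlap
dataset = dataset.shuffle(buffer_size = SHUFFLE_BUFFER_SIZE, seed = 42, 
                          reshuffle_each_iteration = False)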
train_dataset = dataset.take(train_size)
test_dataset = dataset.skip(train_size)
val_dataset = test_dataset.skip(val_size)
test_dataset = test_dataset.take(test_size)

Let's perform mini-batching on <font color = gray> train_dataset</font>, <font color = gray> test_dataset </font> & <font color = gray> val_dataset </font>, then save them again.

In [0]:
# Perform mini-batching on train_dataset, test_dataset and val_dataset
train_dataset = train_dataset.shuffle(SHUFFLE_BUFFER_SIZE).batch(BATCH_SIZE, 
                                                                 drop_remainder = True)
test_dataset = test_dataset.batch(BATCH_SIZE, drop_remainder = True)
val_dataset = val_dataset.batch(BATCH_SIZE, drop_remainder = True)
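
Because drop_remainder = True discards the last incomplete batch, each split ends up slightly smaller than its nominal size. The sketch below works through that arithmetic for the split sizes used above (17,500 / 3,750 / 3,750); the dictionary names are illustrative.

```python
BATCH_SIZE = 128
split_sizes = {'train': 17500, 'val': 3750, 'test': 3750}

# divmod gives (number of full batches, images left over and dropped)
batch_stats = {name: divmod(size, BATCH_SIZE) for name, size in split_sizes.items()}

for name, (full_batches, dropped) in batch_stats.items():
    print(f"{name}: {full_batches} full batches, {dropped} images dropped")
```
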

**Building the Model.**

- In this notebook, I employ the pretrained model Xception. Xception is a convolutional neural network trained on more than a million images from the ImageNet database. The network is 71 layers deep and can classify images into 1000 object categories, such as keyboard, mouse, pencil, and many animals. As a result, the network has learned rich feature representations for a wide range of images. The network has an image input size of 299-by-299.
- Here, I load Xception without its classification head (include_top = False), freeze its weights, and add a small binary classifier on top.
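
As a rough worked example of what the new classifier layers in this notebook add, the sketch below counts their parameters. It assumes that Xception without its top produces a 10 x 10 x 2048 feature map for a 299 x 299 x 3 input; that shape is an assumption of this sketch, and model.summary() is the way to confirm it.

```python
# Assumed shape of Xception's last feature map for a 299x299x3 input (include_top = False)
feature_map_shape = (10, 10, 2048)

# Flatten multiplies the dimensions together
flattened = 1
for dim in feature_map_shape:
    flattened *= dim

dense_units = 128
dense_params = flattened * dense_units + dense_units  # weights + biases of Dense(128)
output_params = dense_units * 1 + 1                   # weights + bias of Dense(1)

print(flattened, dense_params, output_params)
```

Under that assumption, nearly all trainable parameters sit in the first Dense layer, since the frozen Xception layers are not trained.
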

In [0]:
# Define model
def define_model():
    # load model
    pretrained_model = tf.keras.applications.Xception(include_top = False, input_shape = (IMAGE_SIZE, IMAGE_SIZE, 3))
    
    # mark pretrained layers as not trainable
    pretrained_model.trainable = False
    
    # add new classifier layers
    model = tf.keras.Sequential([pretrained_model,
                                 tf.keras.layers.Flatten(),
                                 tf.keras.layers.Dense(128, activation = 'relu', kernel_initializer = 'he_uniform'),
                                 tf.keras.layers.Dense(1, activation = 'sigmoid')])
    
    optimizer = tf.keras.optimizers.RMSprop(learning_rate = 0.001, momentum = 0.9)
    model.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['accuracy'])
    
    return model

model = define_model()

**Callbacks**

Callbacks are utilities called at certain points during training. In this notebook, I employ two callbacks:

1. ReduceLROnPlateau to reduce the learning rate by a factor of 0.1 if val_loss does not improve for 3 epochs (patience = 3). min_lr sets the minimum learning rate to 0.00001, and verbose = 1 prints a message whenever the learning rate is reduced.
2. ModelCheckpoint with the param save_best_only = True saves the model to the file model_design.h5 only at epochs where the monitored metric (val_loss by default) improves on the best value so far.
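
The learning-rate rule in the first callback can be sketched in plain Python. This is a simplified simulation that ignores the callback's cooldown and min_delta options; simulate_reduce_lr and the validation losses are made up for illustration.

```python
def simulate_reduce_lr(val_losses, initial_lr=0.001, factor=0.1, patience=3, min_lr=0.00001):
    """Return the learning rate in effect after each epoch (simplified sketch)."""
    lr = initial_lr
    best = float('inf')
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best:
            best = loss   # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1     # no improvement this epoch
            if wait >= patience:
                lr = max(lr * factor, min_lr)  # reduce, but never below min_lr
                wait = 0
        history.append(lr)
    return history

# Made-up validation losses: improvement stalls for three epochs after epoch 2
made_up_losses = [1.0, 0.9, 0.95, 0.96, 0.97, 0.5]
lr_history = simulate_reduce_lr(made_up_losses)
print(lr_history)
```

In this made-up run the learning rate stays at 0.001 for the first four epochs and drops by a factor of 10 at the fifth, after three epochs in a row without improvement.
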

In [0]:
# Callbacks
# Import from tensorflow.keras (not standalone keras) so the callbacks match the tf.keras model
from tensorflow.keras.callbacks import ReduceLROnPlateau, ModelCheckpoint

# Both callbacks monitor val_loss by default
callbacks = [ReduceLROnPlateau(factor = 0.1, patience = 3, min_lr = 0.00001, verbose = 1),
             ModelCheckpoint('model_design.h5', verbose = 1, save_best_only = True)]

In [0]:
# Train model
model_history = model.fit(train_dataset, 
                          epochs = 5, 
                          validation_data = val_dataset, 
                          callbacks = callbacks) 

In [0]:
train_accuracy = model_history.history['accuracy']
validation_accuracy = model_history.history['val_accuracy']

train_loss = model_history.history['loss']
validation_loss = model_history.history['val_loss']

plt.figure(figsize=(12, 12))
plt.subplot(2, 1, 1)
plt.plot(train_accuracy, label='Training')
plt.plot(validation_accuracy, label='Validation')
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([min(plt.ylim()),1])
plt.title('Accuracy')

plt.subplot(2, 1, 2)
plt.plot(train_loss, label='Training')
plt.plot(validation_loss, label='Validation')
plt.legend()
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.ylim([0,max(plt.ylim())])
plt.title('Loss')
plt.show()

In [0]:
# Evaluate model
final_model = tf.keras.models.load_model('model_design.h5')
final_model.evaluate(test_dataset)