# Playing with the Horse-Human dataset from Laurence Moroney

In this notebook we are going to play with more complex image classification. Previously, we played with MNIST and CIFAR and found that those datasets were still fairly simple, and that models trained on them were very much specialized to them. We have learnt a great tool, the convolutional neural network (CNN), that can do feature extraction for complex computer vision, and hence we wish to explore more. We will play with a dataset compiled by Laurence Moroney that contains two image classes: horses and humans.

## HorseHumanSmall dataset
The dataset contains complex images showing horses and humans. These are computer-generated graphics, not photographs of real humans or horses, but the scenes are detailed: there are backgrounds, shading, different poses, zoom levels, and varying positions within the frame.
Here is how you get the dataset:


In [None]:
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/horse-or-human.zip \
    -O /tmp/horse-or-human.zip

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/validation-horse-or-human.zip \
    -O /tmp/validation-horse-or-human.zip

In [None]:
import os
import zipfile

# Extract the training archive; the context manager closes the file when done
with zipfile.ZipFile('/tmp/horse-or-human.zip') as zipRef:
    zipRef.extractall('/tmp/horse-or-human/')

# Extract the validation archive the same way
with zipfile.ZipFile('/tmp/validation-horse-or-human.zip') as zipRef:
    zipRef.extractall('/tmp/validation-horse-or-human')

In [None]:
!echo 'Training Humans:' && ls /tmp/horse-or-human/humans | wc -l
!echo 'Training Horses:' && ls /tmp/horse-or-human/horses | wc -l
!echo 'Validation Humans:' && ls /tmp/validation-horse-or-human/humans | wc -l
!echo 'Validation Horses:' && ls /tmp/validation-horse-or-human/horses | wc -l

There are 527 training images of humans and 500 training images of horses (it is not clear why there are 27 more human images; probably an honest mistake). There are 128 validation images for each class. This turns out to be quite a small dataset.
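The same counts can be obtained from Python instead of the shell. The sketch below is only an illustration: `count_images` is a hypothetical helper, and the demo builds a throwaway directory with made-up file names so that it runs without the downloaded data.

```python
import os
import tempfile

def count_images(dataset_dir):
    """Return {class_name: number_of_files} for each sub-directory of dataset_dir."""
    counts = {}
    for class_name in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(os.listdir(class_dir))
    return counts

# Demo on a temporary directory (hypothetical file names, empty files)
with tempfile.TemporaryDirectory() as demo_dir:
    for class_name, n_files in (("horses", 3), ("humans", 2)):
        os.makedirs(os.path.join(demo_dir, class_name))
        for i in range(n_files):
            open(os.path.join(demo_dir, class_name, f"img{i}.png"), "w").close()
    demo_counts = count_images(demo_dir)

print(demo_counts)
```

On the real data, `count_images('/tmp/horse-or-human/')` should report the per-class training counts described above.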

In [None]:
trainDatasetDir = '/tmp/horse-or-human/'
valDatasetDir = '/tmp/validation-horse-or-human/'
# No leading slash on the sub-directory name: os.path.join would otherwise
# discard the base path and return just '/humans' or '/horses'
trainImgsHumans = os.path.join(trainDatasetDir, 'humans')
trainImgsHorses = os.path.join(trainDatasetDir, 'horses')
valImgsHumans = os.path.join(valDatasetDir, 'humans')
valImgsHorses = os.path.join(valDatasetDir, 'horses')

In [None]:
# model definition
import tensorflow as tf

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300,300,3)),
        tf.keras.layers.MaxPooling2D((2,2)),
        tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2,2)),
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2,2)),
        tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2,2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation='relu'),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(1, activation='sigmoid')
    ]
)

optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['acc'])
# summary() prints the table itself and returns None, so no print() is needed
model.summary()
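To see where the numbers in the model summary come from, the layer arithmetic can be reproduced by hand. This is a sketch under stated assumptions: the Keras defaults of 'valid' padding and stride 1 for `Conv2D`, and a stride equal to the pool size for `MaxPooling2D`. The helper names `conv_out` and `pool_out` are made up for illustration.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' (unpadded) convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling with pool size and stride both `pool`."""
    return size // pool

size, channels = 300, 3          # input_shape=(300, 300, 3)
total_params = 0

# Four Conv2D + MaxPooling2D pairs with 16, 32, 64 and 64 filters
for filters in (16, 32, 64, 64):
    total_params += (3 * 3 * channels + 1) * filters   # kernel weights + one bias per filter
    size = pool_out(conv_out(size))
    channels = filters
    print(f"after Conv2D({filters}) + MaxPooling2D: {size} x {size} x {channels}")

# Flatten, then Dense(512), Dense(128), Dense(1)
flattened = size * size * channels
features = flattened
for units in (512, 128, 1):
    total_params += (features + 1) * units             # weights + one bias per unit
    features = units

print("flattened features:", flattened)
print("total parameters:", total_params)
```

Under these assumptions the feature map shrinks from 300 x 300 to 16 x 16, the flattened vector has 16,384 entries, and almost all of the roughly 8.5 million parameters sit in the first dense layer.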

## Loading custom image datasets
There are loads of datasets available directly from TensorFlow Datasets (TFDS) that can be loaded in basically one line. If we wish to load our own datasets, we need some help from TensorFlow. For images, TensorFlow provides the ImageDataGenerator API, which can be used to load custom image datasets into a format TensorFlow understands.
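When images are loaded with `flow_from_directory`, as done later in this notebook, each sub-directory is treated as one class and label indices are assigned in alphanumeric order of the sub-directory names. The sketch below only mimics that rule in plain Python; `infer_class_indices` is a hypothetical helper, not part of TensorFlow.

```python
def infer_class_indices(subdir_names):
    """Map class sub-directory names to integer labels in alphanumeric order."""
    return {name: index for index, name in enumerate(sorted(subdir_names))}

class_indices = infer_class_indices(["humans", "horses"])
print(class_indices)  # {'horses': 0, 'humans': 1}
```

With `class_mode='binary'`, this means a sigmoid output close to 0 corresponds to a horse and an output close to 1 corresponds to a human. On a real generator, the mapping can be checked through its `class_indices` attribute.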

### Augmentation
One important concept to grasp in the ImageDataGenerator API is augmentation. Augmentation helps against overfitting by applying random transformations to the training images, so that the model sees more variety than the dataset alone provides. There are many augmentation options; see https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator.

#### rescale = <Normalize> 
For example, if pixel values range from 0 to 255, multiplying them by 1/255 normalizes them to the range 0 to 1.
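As a minimal illustration of the rescaling step (plain Python lists stand in for image arrays here, and the pixel values are made up):

```python
rescale = 1.0 / 255

# Three example 8-bit pixel values: black, mid-grey, white
pixels = [0, 128, 255]
scaled = [p * rescale for p in pixels]

print(scaled)
```

Every rescaled value lies between 0 and 1, which is the range the network is trained on.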

#### rotation_range = <Angle of rotation>
Suppose all the cats in your dataset were sitting upright with their ears pointing up. The model might overfit to cats whose ears are upright. With rotation_range you tell TensorFlow to rotate each image by a random angle of up to the given number of degrees in either direction, which helps avoid this kind of overfitting and brings versatility to your model.
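To make the effect of a random rotation concrete, the sketch below draws an angle from the range used later in this notebook (`rotation_range=40`) and rotates a single pixel coordinate around the centre of a 300 x 300 image. It is an illustration only: `rotate_point` is a hypothetical helper, and a real augmentation pipeline transforms and interpolates whole images rather than single points.

```python
import math
import random

def rotate_point(x, y, angle_deg, center=(150.0, 150.0)):
    """Rotate the point (x, y) around `center` by angle_deg degrees."""
    theta = math.radians(angle_deg)
    cx, cy = center
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(theta) - dy * math.sin(theta),
            cy + dx * math.sin(theta) + dy * math.cos(theta))

rotation_range = 40
rng = random.Random(0)                      # fixed seed so the demo is repeatable
angle = rng.uniform(-rotation_range, rotation_range)

ear_tip = (150.0, 50.0)                     # hypothetical feature location
rotated = rotate_point(*ear_tip, angle)
print(f"angle = {angle:.1f} degrees, ear tip moves to ({rotated[0]:.1f}, {rotated[1]:.1f})")
```

The feature keeps its distance from the image centre but appears at a different orientation each time an image is drawn.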

#### width_shift_range = <change position of artifact width wise>
Suppose all the images in your dataset showed a person standing in the middle of the frame. This will result in overfitting, as the model may not recognize persons standing on the left or right. width_shift_range augments the image by randomly shifting it horizontally by up to a given fraction of its width. Values between 0 and 1 are used, where 1 is 100% and 0 is 0%; i.e., 0.20 means shifts of up to 20% of the width.

#### height_shift_range = <change position of the artifact height wise>
Same idea as above, except that the shift is vertical and is a fraction of the image height.

#### shear_range = <shears the image>
Think of a person lying down on the ground. He or she is clearly a human, but if our training data contained only images of persons standing up, the model may fail to recognize them. shear_range applies a shear transformation that slants the image, so an upright figure appears tilted and moves somewhat closer to such poses. Note that the Keras documentation describes this value as a shear angle in degrees, so 0.20 is a very slight shear rather than a 20% skew.

#### zoom_range = <zoom in>
Imagine our dataset had only images showing people's full bodies. If a model trained on this data were fed a zoomed-in image of a person, with his or her legs missing from the frame, it may fail because it has overfitted to full-body shots. zoom_range tackles this by randomly zooming each image, with the zoom factor drawn from the interval [1 - zoom_range, 1 + zoom_range], so that this variation is also accounted for.
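For a single float value, the zoom factor is sampled from an interval around 1. The sketch below shows that interval for the value 0.20 used later in this notebook; `zoom_interval` is a hypothetical helper written for illustration.

```python
import random

def zoom_interval(zoom_range):
    """Interval of zoom factors for a float zoom_range: [1 - z, 1 + z]."""
    return (1.0 - zoom_range, 1.0 + zoom_range)

low, high = zoom_interval(0.20)
rng = random.Random(0)              # fixed seed so the demo is repeatable
factor = rng.uniform(low, high)

print(f"zoom factors are drawn from [{low:.2f}, {high:.2f}]; sampled {factor:.3f}")
```

Because the interval extends on both sides of 1, some images end up magnified and others shrunk.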

#### horizontal_flip = <flip the image>
Suppose all the images in the dataset show persons with their right hand raised, and the model trained on this data is fed an image of a person with the left hand raised. horizontal_flip randomly mirrors images left to right so that this variation is captured as well.
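A horizontal flip simply reverses the order of the pixels in every row. A minimal sketch on a tiny made-up "image" stored as a list of rows:

```python
def horizontal_flip(image):
    """Mirror a 2-D image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]

image = [
    [1, 2, 3],
    [4, 5, 6],
]
flipped = horizontal_flip(image)
print(flipped)  # [[3, 2, 1], [6, 5, 4]]
```

Flipping twice gives back the original image, and the rows keep their vertical order, which is why a raised right hand becomes a raised left hand.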

#### fill_mode = <fill wiped out pixels>
Some of the augmentation options, e.g., shear_range, width_shift_range and height_shift_range, leave empty areas at the image border after the transformation is applied. fill_mode determines how these gaps are filled. One of the fill modes is 'nearest', which fills the empty pixels with the value of the nearest neighbouring pixel.
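The interplay between a shift and the 'nearest' fill mode can be sketched on a single row of pixels. This is a simplified illustration with whole-pixel shifts; `shift_row_nearest` is a hypothetical helper, and the pixel values are made up.

```python
def shift_row_nearest(row, shift):
    """Shift a row of pixels right by `shift` (left if negative).

    Positions that the shift leaves empty are filled with the nearest
    edge pixel, mimicking fill_mode='nearest'.
    """
    last = len(row) - 1
    return [row[min(max(i - shift, 0), last)] for i in range(len(row))]

row = [10, 20, 30, 40, 50]
print(shift_row_nearest(row, 2))    # [10, 10, 10, 20, 30]
print(shift_row_nearest(row, -1))   # [20, 30, 40, 50, 50]
```

For a 300-pixel-wide image, a shift range of 0.20 corresponds to shifts of up to 60 pixels, so the filled band at the border can be fairly wide.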



In [None]:
rescaleVal = 1./255
print(rescaleVal)
trainDataGen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=rescaleVal,
    rotation_range=40,
    width_shift_range=0.20,
    height_shift_range=0.20,
    horizontal_flip=True,
    #vertical_flip=True,
    zoom_range=0.20,
    shear_range=0.20,
    fill_mode='nearest'
)

trainingGenerator = trainDataGen.flow_from_directory(
    trainDatasetDir,
    target_size=(300,300),
    class_mode='binary',
    batch_size=128
)

valDataGen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=rescaleVal
)

validationGenerator = valDataGen.flow_from_directory(
    valDatasetDir,
    target_size=(300,300),
    class_mode='binary'
)

# 8 steps of 128 images cover 1,024 of the 1,027 training images per epoch
model.fit(
    trainingGenerator,
    steps_per_epoch=8,
    epochs=15,
    validation_data=validationGenerator,
    verbose=1
)
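The choice of `steps_per_epoch=8` follows from the dataset size and the batch size. The arithmetic below is a sketch that uses the image counts stated earlier in this notebook and assumes the validation generator keeps the default batch size of 32, since none is passed to it.

```python
import math

train_images = 527 + 500            # humans + horses
train_batch_size = 128
steps_per_epoch = 8

images_per_epoch = steps_per_epoch * train_batch_size
steps_for_full_pass = math.ceil(train_images / train_batch_size)

val_images = 128 + 128
val_batch_size = 32                 # assumed default of flow_from_directory
val_steps = math.ceil(val_images / val_batch_size)

print("images drawn per epoch:", images_per_epoch)            # 1024
print("batches needed for a full pass:", steps_for_full_pass) # 9
print("validation batches:", val_steps)                       # 8
```

Eight steps therefore draw almost, but not quite, one full pass over the training set; a ninth, much smaller batch would be needed to cover the remaining three images.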