### Bird Classifier - Training the model
#### Notebook Author: Nirupam Purushothama

This notebook trains a CNN to identify whether a given image contains a bird.

Refer to my TensorFlow notes before you proceed with this notebook. It does not explain how CNNs work or what pooling, kernel sizes, padding, etc. are. The Medium blog post listed below gives you an intuition for convolutional neural networks, but for the complete details you will need to understand TensorFlow better (at least for one basic model).

#### System requirements:
1. Memory: minimum 16 GB RAM.
2. The training run takes 5 hours to finish. You can speed it up by using a GPU. Refer to the article mentioned below.

#### Tips
1. If you are using this notebook just for learning purposes, two places where you can get free credits are AWS Educate and Azure Student credits. As of March 2019, Azure gives you $\$100$ in credit and AWS Educate (via [GitHub Education](https://education.github.com/students)) gives you $\$150$. These are valid for one year and six months respectively. Azure activation is immediate; GitHub verification takes 3-5 days.
2. These credits are not much, but they should let you run some heavy-duty notebooks like this one.

#### References: 
1. Machine Learning is Fun - Part 3 - [Medium Blog](https://medium.com/@ageitgey/machine-learning-is-fun-part-3-deep-learning-and-convolutional-neural-networks-f40359318721)
2. Datasets 
    * [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html): contains 6,000 birds and 52,000 other objects
    * [Caltech-UCSD Birds-200-2011 dataset](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html): contains 12,000 birds
    * The full data set is available [here](https://s3-us-west-2.amazonaws.com/ml-is-fun/data.zip)
3. This notebook uses TFLearn (a framework on top of TensorFlow)

In [1]:
from __future__ import division, print_function, absolute_import

# Import tflearn and some helpers
import tflearn
from tflearn.data_utils import shuffle
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.estimator import regression
from tflearn.data_preprocessing import ImagePreprocessing
from tflearn.data_augmentation import ImageAugmentation
import pickle

In [8]:
# Load the data set
# Blog author made the dataset available here: 
# https://s3-us-west-2.amazonaws.com/ml-is-fun/data.zip

# The original loading code did not work (at least on my machine), so I modified it accordingly.

with open("../../data/full_dataset.pkl", 'rb') as f:
    X, Y, X_test, Y_test = pickle.load(f, encoding='latin1')

# Shuffle the data
X, Y = shuffle(X, Y)
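
Before training, it can help to sanity-check what was loaded. The sketch below is not part of the original notebook: `summarize_dataset` is a hypothetical helper, and it assumes that `X` is an array of 32x32x3 images and that `Y` is a one-hot label matrix with two columns (consistent with the network defined later, but not verified against the pickle file). It runs on a small synthetic stand-in so that it does not depend on the data set.

```python
import numpy as np


def summarize_dataset(images, labels):
    """Return basic facts about an image data set with one-hot labels."""
    images = np.asarray(images)
    labels = np.asarray(labels)
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same length")
    # Column sums of a one-hot label matrix give the per-class counts.
    class_counts = labels.sum(axis=0).astype(int)
    return {
        "n_samples": len(images),
        "image_shape": images.shape[1:],
        "class_counts": class_counts.tolist(),
    }


# Tiny synthetic stand-in for (X, Y): 5 random images, 2 of them labelled "bird".
rng = np.random.default_rng(0)
demo_X = rng.random((5, 32, 32, 3))
demo_Y = np.array([[1, 0], [0, 1], [1, 0], [0, 1], [1, 0]])
demo_summary = summarize_dataset(demo_X, demo_Y)
print(demo_summary)
```

With the real data you would call `summarize_dataset(X, Y)` and `summarize_dataset(X_test, Y_test)` right after loading.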

#### Just a small bit of info about TFLearn
I looked at the TFLearn documentation and found it thin on explanations. If you are comfortable with TensorFlow, though, understanding TFLearn is not a big deal: you can infer most things from the method signatures in the documentation. It is just not very user-friendly.

I understand TensorFlow well enough at this point that TFLearn is not a problem. But given the state of its documentation, I think the more prudent choice is to migrate to Keras in the future. Some concerns have been raised about the speed of Keras, since it is a more generic framework on top of TensorFlow (it also supports other backends such as Theano and Microsoft's CNTK). Of these, Theano is [dead](https://skymind.ai/wiki/comparison-frameworks-dl4j-tensorflow-pytorch#theano), and I doubt that Microsoft's CNTK (although open source) stands a chance against the other purely open-source frameworks. Despite this, Keras is already more popular, is easy to use, and has a lot of examples floating around. So I will move to Keras after these examples. I already like TFLearn, though, so I will keep checking on it once in a while to see whether it is making stronger progress than Keras.

In [9]:
# Make sure the data is normalized
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()
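
To make the two preprocessing steps concrete, here is a minimal NumPy sketch of featurewise zero-centering and standard-deviation normalization. It is an illustration, not TFLearn's implementation: `featurewise_normalize` is a hypothetical helper, and it assumes a single scalar mean and standard deviation computed over the whole training set (which is my understanding of the TFLearn default, but treat it as an assumption).

```python
import numpy as np


def featurewise_normalize(images, mean=None, std=None):
    """Zero-center and scale images using data-set-wide statistics.

    If mean/std are not given they are computed from `images`; pass the
    training-set values back in when normalizing validation or test data.
    """
    images = np.asarray(images, dtype=np.float64)
    if mean is None:
        mean = images.mean()  # one scalar over all images, pixels and channels
    if std is None:
        std = images.std()
    return (images - mean) / std, mean, std


# Synthetic stand-in for the training images.
rng = np.random.default_rng(0)
demo_images = rng.random((10, 32, 32, 3))
normalized, demo_mean, demo_std = featurewise_normalize(demo_images)

# Reuse the training statistics for held-out data.
demo_test = rng.random((4, 32, 32, 3))
normalized_test, _, _ = featurewise_normalize(demo_test, demo_mean, demo_std)
print(normalized.mean(), normalized.std())
```

The point of returning the mean and standard deviation is that the same statistics must be applied at prediction time; otherwise the network sees inputs on a different scale from the one it was trained on.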

In [10]:
# Create extra synthetic training data by flipping, rotating and blurring the images on our data set.
img_aug = ImageAugmentation()
img_aug.add_random_flip_leftright()
img_aug.add_random_rotation(max_angle=25.)
img_aug.add_random_blur(sigma_max=3.)
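
As an illustration of what augmentation does, the sketch below implements only the left-right flip in plain NumPy. It is not TFLearn's code: `random_flip_leftright` is a hypothetical helper, and the 50% flip probability and the `(N, height, width, channels)` layout are assumptions. Rotation and blur are omitted to keep the example dependency-free.

```python
import numpy as np


def random_flip_leftright(images, rng, flip_probability=0.5):
    """Mirror a random subset of images along the width axis.

    Returns the augmented batch and the boolean mask of flipped images.
    """
    images = np.asarray(images)
    flipped = images.copy()
    flip_mask = rng.random(len(images)) < flip_probability
    # For an (N, H, W, C) batch, axis 2 is the width.
    flipped[flip_mask] = images[flip_mask][:, :, ::-1, :]
    return flipped, flip_mask


rng = np.random.default_rng(42)
demo_batch = rng.random((8, 32, 32, 3))
augmented, mask = random_flip_leftright(demo_batch, rng)
print("flipped images:", int(mask.sum()), "of", len(demo_batch))
```

Because a mirrored bird is still a bird, the labels stay unchanged; that is what makes this kind of augmentation safe for this task.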

#### Build the network
<b>Note:</b> This is super cool. Doing the same thing in plain TensorFlow would be far more tedious.

In [11]:
# Define our network architecture:

# Input is a 32x32 image with 3 color channels (red, green and blue)
network = input_data(shape=[None, 32, 32, 3],
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)

# Step 1: Convolution - number of convolution filters: 32, kernel (filter) size: 3x3
network = conv_2d(network, 32, 3, activation='relu')

# Step 2: Max pooling - kernel size: 2x2
network = max_pool_2d(network, 2)

# Step 3: Convolution again
network = conv_2d(network, 64, 3, activation='relu')

# Step 4: Convolution yet again
network = conv_2d(network, 64, 3, activation='relu')

# Step 5: Max pooling again
network = max_pool_2d(network, 2)

# Step 6: Fully-connected 512 node neural network
network = fully_connected(network, 512, activation='relu')

# Step 7: Dropout - randomly drop some activations during training to prevent over-fitting (0.5 is the keep probability)
network = dropout(network, 0.5)

# Step 8: Fully-connected neural network with two outputs (0=isn't a bird, 1=is a bird) to make the final prediction
network = fully_connected(network, 2, activation='softmax')

Instructions for updating:
Use tf.initializers.variance_scaling instead with distribution=uniform to get equivalent behavior.
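
To see how the image shrinks as it passes through the layers, the following sketch tracks the tensor shape and the number of trainable parameters by hand. It assumes 'same' padding and stride 1 for the convolutions, and a pooling stride equal to the kernel size (which I believe are the TFLearn defaults, but I have not verified this against the library source). The helper names are hypothetical.

```python
import math


def conv_same(shape, n_filters, kernel):
    """Output shape and parameter count of a stride-1, 'same'-padded convolution."""
    height, width, channels = shape
    params = kernel * kernel * channels * n_filters + n_filters  # weights + biases
    return (height, width, n_filters), params


def pool_same(shape, kernel):
    """Output shape of max pooling with stride == kernel size and 'same' padding."""
    height, width, channels = shape
    return (math.ceil(height / kernel), math.ceil(width / kernel), channels)


def dense(n_inputs, n_units):
    """Output size and parameter count of a fully-connected layer."""
    return n_units, n_inputs * n_units + n_units


shape = (32, 32, 3)
total_params = 0

shape, p = conv_same(shape, 32, 3)   # Step 1
total_params += p
shape = pool_same(shape, 2)          # Step 2
shape, p = conv_same(shape, 64, 3)   # Step 3
total_params += p
shape, p = conv_same(shape, 64, 3)   # Step 4
total_params += p
shape = pool_same(shape, 2)          # Step 5
shape_before_flatten = shape

flat_size = shape[0] * shape[1] * shape[2]
units, p = dense(flat_size, 512)     # Step 6
total_params += p
units, p = dense(units, 2)           # Step 8 (dropout in Step 7 adds no parameters)
total_params += p

print(shape_before_flatten, flat_size, total_params)
```

Under these assumptions, almost all of the parameters sit in the first fully-connected layer, which is typical for small CNNs of this shape.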


#### Again, this is awesome. The same thing in TensorFlow would have been a pain to set up

In [12]:
# Tell tflearn how we want to train the network
network = regression(network, optimizer='adam',
                     loss='categorical_crossentropy', learning_rate=0.001)

# Wrap the network in a model object
model = tflearn.DNN(network, tensorboard_verbose=0, checkpoint_path='bird-classifier.tfl.ckpt')

# Train it! We'll do 100 training passes and monitor it as it goes.
model.fit(X, Y, n_epoch=100, shuffle=True, validation_set=(X_test, Y_test),
          show_metric=True, batch_size=96,
          snapshot_epoch=True, run_id='bird-classifier')

Training Step: 59199  | total loss: 0.17229 | time: 262.484s
| Adam | epoch: 100 | loss: 0.17229 - acc: 0.9298 -- iter: 56736/56780
Training Step: 59200  | total loss: 0.17152 | time: 286.752s
| Adam | epoch: 100 | loss: 0.17152 - acc: 0.9296 | val_loss: 0.18057 - val_acc: 0.9483 -- iter: 56780/56780
--
INFO:tensorflow:/home/nirupam/learning/CNN_ImageAnalysis/BirdClassifier/bird-classifier.tfl.ckpt-59200 is not in all_model_checkpoint_paths. Manually adding it.
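
The step counter in the training log follows directly from the data set size, the batch size and the number of epochs. A small arithmetic sketch, using the 56,780 training samples reported in the `iter:` field of the log and the `batch_size` and `n_epoch` values passed to `model.fit`:

```python
import math

n_samples = 56780   # from "iter: 56780/56780" in the training log
batch_size = 96     # batch_size passed to model.fit
n_epochs = 100      # n_epoch passed to model.fit

# The last batch of each epoch is smaller than the others, hence the ceiling.
steps_per_epoch = math.ceil(n_samples / batch_size)
total_steps = steps_per_epoch * n_epochs
last_batch_size = n_samples - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch, total_steps, last_batch_size)
```

This gives 592 steps per epoch and 59,200 steps in total, matching the final `Training Step: 59200` line. The last batch holds 44 images, which is why the second-to-last log line shows `iter: 56736/56780`.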


In [16]:
# Save model when training is complete to a file
model.save("./bird-classifier.tfl")
print("Network trained and saved as bird-classifier.tfl!")

INFO:tensorflow:/home/nirupam/learning/CNN_ImageAnalysis/BirdClassifier/bird-classifier.tfl is not in all_model_checkpoint_paths. Manually adding it.
Network trained and saved as bird-classifier.tfl!
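
Once the model is trained, its softmax output has to be turned into a label. The sketch below shows one way to do that in plain NumPy. `label_predictions` and `CLASS_NAMES` are hypothetical names, and the class order (index 0 = not a bird, index 1 = bird) is taken from the comment on the final layer. The probabilities here are made up for illustration; with the real model they would come from something like `model.predict(images)`.

```python
import numpy as np

# Class order as described for the final softmax layer: 0 = not a bird, 1 = bird.
CLASS_NAMES = ["not a bird", "bird"]


def label_predictions(probabilities):
    """Map rows of per-class probabilities to (label, confidence) pairs."""
    probabilities = np.asarray(probabilities)
    indices = probabilities.argmax(axis=1)
    return [
        (CLASS_NAMES[index], float(probabilities[row, index]))
        for row, index in enumerate(indices)
    ]


# Made-up softmax outputs for two images.
demo_probs = np.array([[0.9, 0.1], [0.2, 0.8]])
demo_labels = label_predictions(demo_probs)
print(demo_labels)
```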
