# Image Classification with Convolutional Neural Networks, CIFAR-10 dataset

### Make the notebook compatible with both Python 2 and 3

http://python-future.org/compatible_idioms.html

In [None]:
from __future__ import absolute_import, division, print_function

In [113]:
import numpy as np
import tensorflow as tf
from keras.datasets import cifar10

### Plot graphs inline

In [114]:
%matplotlib inline

import matplotlib
import matplotlib.pyplot as plt

In [115]:
print(tf.__version__)
print(np.__version__)
print(matplotlib.__version__)

1.6.0-dev20180115
1.14.0
2.1.0


### Download the CIFAR-10 dataset

More information on the dataset can be found here: https://www.cs.toronto.edu/~kriz/cifar.html

The Python version of the archive is about 163 MB, so the download might take a while.

The dataset is split into batches so that it can be loaded piece by piece without running out of memory. The training set consists of 5 batches, named data_batch_1, data_batch_2, and so on, and the test set is a single batch. Every image carries one of the following labels:

* 0 - airplane
* 1 - automobile
* 2 - bird
* 3 - cat
* 4 - deer
* 5 - dog
* 6 - frog
* 7 - horse
* 8 - ship
* 9 - truck

### Untar and unzip the files

* The extracted files (one for each batch) are placed in the folder *cifar-10-batches-py/* under your current working directory 
* Each file is named *data_batch_1, data_batch_2* etc.

### Load and pre-process files

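* Access the images and the labels from a single batch specified by id (1-5)
* Reshape the images into a 4-D tensor, because that is how **they are fed to the convolutional layer**; the raw rows are stored channel-first, so the reshape puts the channels at axis index 1
* Transpose the axes of the reshaped images into the form *[batch_size, height, width, channels]*, so that **channels are the last axis**
* `cifar10.load_data()` from Keras, used in the cell below, already returns the images in this layout, so no manual reshape is needed here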
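
The reshape and transpose steps listed above only matter when the raw batch files are read by hand. The following is a minimal sketch of those two steps on synthetic data. The function name `rows_to_nhwc` and the stand-in array are assumptions made for illustration; the row layout (1024 red, then 1024 green, then 1024 blue values) is the one described on the CIFAR-10 page.

```python
import numpy as np

def rows_to_nhwc(raw_rows):
    """Convert flat CIFAR-style rows of shape (N, 3072) to (N, 32, 32, 3)."""
    # Each row stores 1024 red, then 1024 green, then 1024 blue values,
    # so the first reshape puts the channel axis at index 1.
    channels_first = raw_rows.reshape(-1, 3, 32, 32)
    # Move channels to the last axis: [batch_size, height, width, channels].
    return channels_first.transpose(0, 2, 3, 1)

# Synthetic stand-in for a raw batch (assumption: real batches hold uint8 rows).
fake_rows = np.arange(2 * 3072).reshape(2, 3072)
images = rows_to_nhwc(fake_rows)
print(images.shape)  # (2, 32, 32, 3)
```

Reshaping straight to `(N, 32, 32, 3)` without the transpose would run without error but would scramble the color channels, which is why the two steps are kept separate.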

In [116]:
(features, labels), (X_test_orig, Y_test_orig) = cifar10.load_data()
features = features/255 - 0.5
labels = labels.flatten()

test_images = X_test_orig/255 - 0.5
test_labels = Y_test_orig.flatten()


### Explore the data

In [117]:
(features.shape)

(50000, 32, 32, 3)

In [118]:
features[0].shape

(32, 32, 3)

In [119]:
labels = labels.flatten()
labels

array([6, 9, 9, ..., 9, 1, 1], dtype=uint8)

### Helper functions to display images as well as labels

* Map the integer labels to the actual labels for display
* Plot the image
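
One possible shape for these helpers is sketched below. The names `LABEL_NAMES`, `label_name`, `to_displayable` and `display_image` are hypothetical. The sketch assumes the images were scaled with `/255 - 0.5` as in the loading cell, so 0.5 is added back before plotting.

```python
import numpy as np

# Class names in label order, as listed earlier in this notebook.
LABEL_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def label_name(label):
    """Map an integer label (0-9) to its class name."""
    return LABEL_NAMES[int(label)]

def to_displayable(image):
    """Undo the `/255 - 0.5` scaling so pixel values lie in [0, 1] again."""
    return np.clip(image + 0.5, 0.0, 1.0)

def display_image(image, label):
    """Plot one image with its class name as the title (hypothetical helper)."""
    # Imported here so the two helpers above also work without matplotlib.
    import matplotlib.pyplot as plt
    plt.imshow(to_displayable(image))
    plt.title(label_name(label))
    plt.axis("off")
    plt.show()

print(label_name(6))  # frog
```

With the arrays loaded above, `display_image(features[0], labels[0])` would show the first training image under its class name.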

### Access the *training* data and the corresponding labels

Each batch in the CIFAR-10 dataset has randomly picked images, so the images come pre-shuffled


### Access the *test* data and the corresponding labels

### The CIFAR-10 dataset has color images

* Each image is of size 32x32
* The image is RGB so has 3 channels, and requires 3 numbers to represent each pixel

### Placeholders for training data and labels

* The training dataset placeholder can hold any number of instances, and each instance is a 32x32 image with 3 channels (the loaded data is already in this layout)
* The images are fed to the convolutional layer as a 4-D tensor *[batch_size, height, width, channels]*

### Add a dropout layer to avoid overfitting the training data

* The training flag is False during prediction and True during training, so dropout is applied only in the training phase
* The dropout_rate is the probability that a unit is set to zero during training
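
The behavior described above can be sketched in NumPy as follows. This is an illustration of inverted dropout, in which the kept units are rescaled by `1 / (1 - rate)`, and not the TensorFlow implementation itself. The function name and the explicit `rng` argument are assumptions.

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training:
        return x  # prediction: the input passes through unchanged
    keep_prob = 1.0 - rate
    mask = rng.random_sample(x.shape) < keep_prob  # True for units that are kept
    return x * mask / keep_prob

rng = np.random.RandomState(0)
x = np.ones((2, 5))
print(dropout(x, 0.3, training=False, rng=rng))  # identical to x
print(dropout(x, 0.3, training=True, rng=rng))   # a mix of 0 and 1/0.7 entries
```

The rescaling keeps the expected value of each unit the same in both phases, so nothing has to be adjusted at prediction time.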

### Neural network design

* 2 convolutional layers
* 1 max pooling layer
* 1 convolutional layer
* 1 max pooling layer
* 2 fully connected layers
* Output logits layer

### Pooling reduces the size of the image

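With a 2x2 kernel and a stride of 2, pooling halves both the height and the width, so the pooled image has only 1/4th as many pixels as its input. The second pooling layer in this network uses a stride of 1, so it shrinks its input by just one pixel per dimension.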
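
To see how each layer changes the image size, the output size along one dimension can be computed by hand. The sketch below assumes the usual TensorFlow padding rules (`VALID` keeps only windows that fit entirely inside the input, `SAME` pads so that the output is `ceil(size / stride)`) and uses the kernel sizes and strides from `forward_propagation` in this notebook. The helper name `output_size` is hypothetical.

```python
import math

def output_size(size, kernel, stride, padding):
    """Spatial output size of a conv or pooling layer along one dimension."""
    if padding.upper() == "SAME":
        return int(math.ceil(size / float(stride)))
    # "VALID": only windows that fit entirely inside the input are used.
    return (size - kernel) // stride + 1

# (name, kernel, stride, padding) for each layer of the network
layers = [
    ("conv1", 5, 1, "valid"),
    ("conv2", 3, 2, "valid"),
    ("pool3", 2, 2, "VALID"),
    ("conv4", 4, 3, "SAME"),
    ("pool5", 2, 1, "VALID"),
]

size = 32  # CIFAR-10 images are 32x32
sizes = []
for name, kernel, stride, padding in layers:
    size = output_size(size, kernel, stride, padding)
    sizes.append(size)
    print(name, size)
```

Under these assumptions the sizes run 28, 13, 6, 2, 1, so the last pooling layer produces a 1x1 map with 128 channels and the flattened vector has 128 entries.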

### Reshape the pooled layer to be a 1-D vector (flatten it) 

### There are 10 possible classifications in the CIFAR-10 dataset

The number of outputs of the logits layer should be 10

### The final output layer with softmax activation

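*tf.nn.sparse_softmax_cross_entropy_with_logits* applies the softmax activation and computes the cross-entropy, which is our cost function, in a single step. The network itself therefore outputs raw logits, and the labels are passed as integer class ids rather than one-hot vectors.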
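
As a rough illustration of what this op computes, here is a NumPy sketch of softmax followed by cross-entropy against integer labels. It is not the TensorFlow implementation; the function name and the toy inputs are assumptions.

```python
import numpy as np

def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and integer labels."""
    # Subtract the row maximum first; this keeps exp() from overflowing
    # and does not change the softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of the true class for each row.
    return -log_probs[np.arange(len(labels)), labels]

logits = np.zeros((2, 10))   # equal scores for all 10 classes
labels = np.array([3, 7])
losses = sparse_softmax_cross_entropy(logits, labels)
print(losses)         # each entry equals log(10), about 2.3026
print(losses.mean())  # the scalar that averaging the losses yields
```

A loss near log(10) therefore means the model is doing no better than guessing uniformly among the 10 classes.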

In [121]:
def forward_propagation(X):

  # Convolutional layer #1: 32x32x3 -> 28x28x32
  conv1 = tf.layers.conv2d(
      inputs=X,
      filters=32,
      kernel_size=[5, 5],
      padding="valid",
      activation=tf.nn.relu)

  # Convolutional layer #2: 28x28x32 -> 13x13x64
  conv2 = tf.layers.conv2d(conv1, filters=64,
                           kernel_size=3,
                           strides=2, padding="valid",
                           activation=tf.nn.relu, name="conv2")

  # Max pooling (2x2, stride 2): 13x13x64 -> 6x6x64
  pool3 = tf.nn.max_pool(conv2,
                         ksize=[1, 2, 2, 1],
                         strides=[1, 2, 2, 1],
                         padding="VALID")

  # Convolutional layer #3: 6x6x64 -> 2x2x128
  conv4 = tf.layers.conv2d(pool3, filters=128,
                           kernel_size=4,
                           strides=3, padding="SAME",
                           activation=tf.nn.relu, name="conv4")

  # Max pooling (2x2, stride 1): 2x2x128 -> 1x1x128
  pool5 = tf.nn.max_pool(conv4,
                         ksize=[1, 2, 2, 1],
                         strides=[1, 1, 1, 1],
                         padding="VALID")

  # Flatten to a vector of 128 values per image
  pool5_flat = tf.contrib.layers.flatten(pool5)

  fullyconn1 = tf.layers.dense(pool5_flat, 128,
                               activation=tf.nn.relu, name="fc1")

  fullyconn2 = tf.layers.dense(fullyconn1, 64,
                               activation=tf.nn.relu, name="fc2")

  # Raw scores for the 10 classes; softmax is applied inside the loss
  logits = tf.layers.dense(fullyconn2, 10, name="output")
  return logits

### Check correctness and accuracy of the prediction

* Check whether the highest probability output in logits is equal to the y-label
* Check the accuracy across all predictions (How many predictions did we get right?)
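
The two checks above amount to comparing the index of the largest logit with the label and averaging the result. A minimal NumPy sketch follows; the function name and the toy values are assumptions, and ties between logits may be resolved differently than in `tf.nn.in_top_k`.

```python
import numpy as np

def top1_accuracy(logits, labels):
    """Fraction of rows whose largest logit sits at the labelled class."""
    predictions = np.argmax(logits, axis=1)
    correct = predictions == labels
    return correct.astype(np.float32).mean()

logits = np.array([[0.1, 2.0, -1.0],
                   [1.5, 0.2, 0.3],
                   [0.0, 0.1, 3.0]])
labels = np.array([1, 2, 2])
print(top1_accuracy(logits, labels))  # 2 of the 3 predictions are correct
```

Softmax does not change which entry is largest, so the comparison can be made on the logits directly.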

### Set up a helper method to access training data in batches

In [122]:
def get_next_batch(features, labels, train_size, batch_index, batch_size):
    # Restrict batching to the first train_size examples
    training_images = features[:train_size]
    training_labels = labels[:train_size]

    # Batch number batch_index covers the half-open range [start_index, end_index)
    start_index = batch_index * batch_size
    end_index = start_index + batch_size

    return training_images[start_index:end_index], training_labels[start_index:end_index]
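
The same index arithmetic can also be wrapped in a generator, which avoids tracking `batch_index` by hand. This is only a sketch of an alternative; the name `iterate_batches` and the toy arrays are assumptions. Like a loop over `train_size // batch_size` batches, it drops a trailing partial batch.

```python
import numpy as np

def iterate_batches(features, labels, batch_size):
    """Yield consecutive (features, labels) mini-batches of exactly batch_size."""
    n_batches = len(features) // batch_size  # a trailing partial batch is dropped
    for batch_index in range(n_batches):
        start_index = batch_index * batch_size
        end_index = start_index + batch_size
        yield features[start_index:end_index], labels[start_index:end_index]

# Tiny synthetic stand-in for the training data (assumption: 10 examples)
toy_features = np.zeros((10, 32, 32, 3))
toy_labels = np.arange(10)
batches = list(iterate_batches(toy_features, toy_labels, batch_size=4))
for X_batch, y_batch in batches:
    print(X_batch.shape, y_batch)
```

With 10 examples and a batch size of 4 this yields two batches, and the last two examples are never visited.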

### Train and evaluate the model

* With a small training set the model performs poorly; accuracy improves as you increase the size of the training data (use all batches)
* Ensure that dropout is enabled during training to avoid overfitting

In [131]:
def create_placeholders(n_H0, n_W0, n_C0):
    """
    Creates the placeholders for the tensorflow session.
    
    Arguments:
    n_H0 -- scalar, height of an input image
    n_W0 -- scalar, width of an input image
    n_C0 -- scalar, number of channels of the input
        
    Returns:
    X -- placeholder for the data input, of shape [None, n_H0, n_W0, n_C0] and dtype "float"
    y -- placeholder for the integer class labels, of shape [None] and dtype tf.int32
    """

    X = tf.placeholder(shape=[None, n_H0, n_W0, n_C0], dtype="float", name="X")
    y = tf.placeholder(tf.int32, shape=[None], name="y")

    return X, y

In [136]:
from tensorflow.python.framework import ops
def model(features, labels, test_images, test_labels, learning_rate = 0.001,
          n_epochs = 20, batch_size = 128, print_cost = True):
    
    ops.reset_default_graph()                         # to be able to rerun the model without overwriting tf variables
    tf.set_random_seed(1)                             # to keep results consistent (tensorflow seed)
    (m, n_H0, n_W0, n_C0) = features.shape
    X, y = create_placeholders(n_H0, n_W0, n_C0)
    dropout_rate = 0.3

    # Dropout is active only when `training` is fed as True
    training = tf.placeholder_with_default(False, shape=(), name='training')
    X_drop = tf.layers.dropout(X, dropout_rate, training=training)
    logits = forward_propagation(X_drop)
    xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits,
                                                              labels=y)
    loss = tf.reduce_mean(xentropy)
    # The default of 0.001 is also AdamOptimizer's own default learning rate
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    training_op = optimizer.minimize(loss)
    correct = tf.nn.in_top_k(logits, y, 1)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
    with tf.Session() as sess:
        init.run()
        train_size = m                                # train on every example in CIFAR-10
        for epoch in range(n_epochs):
            for batch_index in range(train_size // batch_size):
                X_batch, y_batch = get_next_batch(features,
                                                  labels,
                                                  train_size,
                                                  batch_index,
                                                  batch_size)

                sess.run(training_op, feed_dict={X: X_batch, y: y_batch, training: True})

            # Train accuracy is measured on the last mini-batch of the epoch only
            acc_train = accuracy.eval(feed_dict={X: X_batch, y: y_batch})
            acc_test = accuracy.eval(feed_dict={X: test_images, y: test_labels})
            if print_cost:
                print(epoch, "Train accuracy:", acc_train, "Test accuracy:", acc_test)
                print ("Train Accuracy:", round(acc_train, 2)*100,"%",  end="\t")
                print ("Test Accuracy:", round(acc_test, 2)*100,"%", end="\n")
            save_path = saver.save(sess, "./cifar_model")

In [None]:
model(features, labels, test_images, test_labels)

0 Train accuracy: 0.5546875 Test accuracy: 0.4739
Train Accuracy: 55.000001192092896 %	Test Accuracy: 46.99999988079071 %
