# Image Classifier on the CIFAR-10 Dataset 

This notebook builds an image classifier for the CIFAR dataset using `TensorFlow`'s estimator module.

> For lack of a GPU, the model is trained for only 500 steps; better accuracy requires more training steps.

### Import Libs 

This section imports all the libraries the model requires.

In [1]:
# import libs 

import tensorflow as tf # for deep learning 

# load helper libs 
import numpy as np # for matrix maths 
import pandas as pd # to load data set in form of tables 

### Load and preprocess data

In this section the CIFAR dataset is loaded and the `y` labels are shifted down by one so that they lie in the range 0-9.

In [2]:
# load cifar dataset 
cifar = pd.read_csv('./cifar_10.csv')

# preprocess cifar labels to make them in between 0-9
cifar.y = cifar.y - 1

# describe cifar dataset 
cifar.describe()

Unnamed: 0,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,...,V1016,V1017,V1018,V1019,V1020,V1021,V1022,V1023,V1024,y
count,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,...,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0,60000.0
mean,0.025701,0.023368,0.026798,0.028833,0.031083,0.033491,0.035288,0.037149,0.039259,0.040771,...,-0.008772,-0.008664,-0.008874,-0.009335,-0.00958,-0.010063,-0.01078,-0.011391,-0.010571,4.5
std,0.283631,0.279697,0.278866,0.277823,0.276529,0.275769,0.274985,0.274774,0.2739,0.273734,...,0.237439,0.237645,0.237352,0.23777,0.237672,0.238511,0.239022,0.24006,0.243355,2.872305
min,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,...,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,0.0
25%,-0.2,-0.2,-0.2,-0.19,-0.19,-0.19,-0.18,-0.18,-0.18,-0.18,...,-0.18,-0.18,-0.18,-0.18,-0.18,-0.18,-0.1825,-0.19,-0.19,2.0
50%,0.02,0.02,0.02,0.02,0.03,0.03,0.03,0.03,0.04,0.04,...,-0.02,-0.02,-0.02,-0.02,-0.02,-0.02,-0.02,-0.025,-0.03,4.5
75%,0.25,0.25,0.25,0.25,0.25,0.25,0.26,0.26,0.26,0.26,...,0.15,0.15,0.15,0.15,0.15,0.15,0.15,0.15,0.15,7.0
max,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,...,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,9.0
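
The label shift matters because the loss used later, `tf.losses.sparse_softmax_cross_entropy`, expects integer class ids in the range `[0, NUM_CLASSES)`. A minimal plain-Python sketch of the same shift with a range check follows; `shift_labels` is a hypothetical helper, and the 1-10 input range is only inferred from the subtraction above.

```python
def shift_labels(labels, num_classes=10):
    """Shift 1-based class ids to 0-based ids and verify the resulting range."""
    shifted = [int(label) - 1 for label in labels]
    bad = sorted({label for label in shifted if not 0 <= label < num_classes})
    if bad:
        raise ValueError(f"labels outside [0, {num_classes}): {bad}")
    return shifted


raw = [1, 10, 5, 3]        # illustrative values, not taken from the CSV
print(shift_labels(raw))   # [0, 9, 4, 2]
```

The `y` column of the `describe()` output above (min 0, max 9) is consistent with this check passing on the real data.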


### Def input function 

This section defines an input function that feeds data to the estimator API.

In [3]:
# build a function that generates inputs for the estimator api 
def genrate_input_fn(df, batch_size, num_epochs=None):
    def input_fun():
        
        # time to load all the values of df 
        pixels = df.iloc[:, :-1].values
        labels = df.iloc[:, -1].values
        
        # convert values of these into a tensor 
        
        pixels = tf.convert_to_tensor(pixels, dtype=tf.float32)
        labels = tf.convert_to_tensor(labels, dtype=tf.int32)
        
        
        # convert all these tensors into a dataset 
        dataset = tf.data.Dataset.from_tensor_slices((pixels,labels))
        
        # batch of dataset 
        dataset = dataset.batch(batch_size)
        
        # num of epochs of dataset 
        dataset = dataset.repeat(num_epochs)
        
        # make oneshot iterator for the data
        iterator = dataset.make_one_shot_iterator()
        
        # get next batch of the pixels 
        batch_pixels, batch_labels = iterator.get_next()
        
        return {'batch_pixels':batch_pixels}, batch_labels
    return input_fun
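
To make the order of `batch` and `repeat` concrete, here is a rough plain-Python imitation of what the pipeline above yields. It is a sketch only, not TensorFlow code: `batch_then_repeat` is a hypothetical name, and lists stand in for tensors.

```python
from itertools import islice


def batch_then_repeat(rows, batch_size, num_epochs=None):
    """Yield batches of rows; repeat num_epochs times, or forever if None."""
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        for start in range(0, len(rows), batch_size):
            # the last batch of each epoch may be smaller than batch_size
            yield rows[start:start + batch_size]
        epoch += 1


rows = [0, 1, 2, 3, 4]
print(list(batch_then_repeat(rows, 2, num_epochs=1)))   # [[0, 1], [2, 3], [4]]
print(list(islice(batch_then_repeat(rows, 2), 4)))      # [[0, 1], [2, 3], [4], [0, 1]]
```

Because batching happens before repeating, every pass over the data ends with a short batch, and with `num_epochs=None` the stream never ends, so the caller has to bound it with a `steps` argument.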

### Hyper Params Def 

This section defines all the hyperparameters used to tune the neural network for better performance. 

In [5]:
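# Hyperparameters for the model 
NUM_CLASSES = 10 # number of label classes 
NUM_ROWS = 32 # number of pixels in a row 
NUM_COLS = 32 # number of pixels in a column 
BATCH_SIZE = 64 # batch size for training 
NUM_TRAIN_STEPS = 4000 # training steps (not used: the train call in this notebook passes steps=500) 
NUM_TEST_STEPS = 100 # evaluation steps (not used: the evaluate call in this notebook passes steps=200) 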
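
A step is one batch, not one pass over the data, so it helps to convert between the two. The sketch below does that arithmetic; the helper names are hypothetical, and the 5000-row training set size is taken from the split made later in this notebook.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def epochs_covered(num_steps, num_examples, batch_size):
    """How many passes over the data a given number of steps amounts to."""
    return num_steps / steps_per_epoch(num_examples, batch_size)


print(steps_per_epoch(5000, 64))                  # 79
print(round(epochs_covered(500, 5000, 64), 2))    # 6.33
print(round(epochs_covered(4000, 5000, 64), 2))   # 50.63
```

So the 500 steps used for training correspond to roughly six passes over the 5000 training rows.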

### Def CNN function 

This section defines the CNN model function that is passed to the estimator API for training, evaluation and prediction.

In [12]:
# cnn model function for the estimator api 

def cnn_fun(features,labels, mode):
    
    # inputs 
    inputs = tf.reshape(features['batch_pixels'], [-1, NUM_ROWS, NUM_COLS, 1])
    
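    # NOTE: describe() above shows the pixels already lie in [-0.5, 0.5], 
    # so this division shrinks them further; it may be worth removing 
    inputs = inputs / 255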
    
    # conv layer 1 
    conv1 = tf.layers.conv2d(inputs=inputs, filters=32, kernel_size=[3,3], padding='same', activation=tf.nn.relu)
    
    # pooling layer 1 
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2,2], strides=2)
    
    # conv layer 2 
    conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[3,3], padding='same', activation=tf.nn.relu)
    
    # pool layer 2 
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2,2], strides=2)
    
    # a full dense layer 
    pool_flat = tf.reshape(pool2, shape=[-1, 8 * 8 * 64])
    
    dense = tf.layers.dense(inputs=pool_flat, units=1024, activation=tf.nn.relu)
    
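    # a dropout layer of 40%, active only in training mode 
    # (tf.layers.dropout defaults to training=False, which disables it) 
    dropout = tf.layers.dropout(inputs=dense, rate=0.4, training=(mode == tf.estimator.ModeKeys.TRAIN))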
    
    # a final logits layer 
    logits = tf.layers.dense(inputs=dropout, units=NUM_CLASSES)
    
    # build the prediction dict 
    prediction = {
        'classes':tf.argmax(logits, axis=1),
        'probabilities':tf.nn.softmax(logits=logits, name='softmax_tensor')
    }
    
    # return a predictions for predict mode 
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, prediction)
    
    # computing loss for the training process
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    
    # to test for the training mode 
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.AdamOptimizer(1e-4)
        train = optimizer.minimize(loss=loss, global_step = tf.train.get_global_step())
        
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train)
    
    # to make an eval dict
    eval_op = {
        'accuracy': tf.metrics.accuracy(labels, prediction['classes'])
    }

    # if eval mode
    if mode == tf.estimator.ModeKeys.EVAL:
        return tf.estimator.EstimatorSpec(mode, loss=loss, eval_metric_ops=eval_op)
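
The hard-coded `8 * 8 * 64` in the flatten step follows from the layer shapes. The sketch below redoes that arithmetic in plain Python and also counts trainable parameters. The helper names are hypothetical, and it assumes the `tf.layers` defaults used above: `'same'` padding for the convolutions, `'valid'` padding for the pooling layers, and a bias term in every layer.

```python
def conv_same(h, w, c_in, filters, k=3):
    """'same' convolution keeps H and W; returns (output shape, parameter count)."""
    params = k * k * c_in * filters + filters   # kernel weights + biases
    return (h, w, filters), params


def max_pool(h, w, c, size=2, stride=2):
    """'valid' max pooling; returns the output shape."""
    return ((h - size) // stride + 1, (w - size) // stride + 1, c)


def dense(n_in, units):
    """Fully connected layer; returns (output width, parameter count)."""
    return units, n_in * units + units


shape, total = (32, 32, 1), 0
shape, p = conv_same(*shape, filters=32); total += p   # (32, 32, 32)
shape = max_pool(*shape)                                # (16, 16, 32)
shape, p = conv_same(*shape, filters=64); total += p   # (16, 16, 64)
shape = max_pool(*shape)                                # (8, 8, 64)
flat = shape[0] * shape[1] * shape[2]
units, p = dense(flat, 1024); total += p
units, p = dense(units, 10); total += p

print(shape, flat, total)   # (8, 8, 64) 4096 4224394
```

Almost all of the roughly 4.2 million parameters sit in the first dense layer (4096 x 1024 weights).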

### Split Dataset 

In this section the CIFAR dataset is split into a training set and a test set: the first 5000 samples form the training set and the remaining 55000 samples are kept for the validation or evaluation phase.

In [7]:
# make a train set from the first 5000 rows of cifar 
cifar_train = cifar[:5000]

cifar_train.describe()

Unnamed: 0,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,...,V1016,V1017,V1018,V1019,V1020,V1021,V1022,V1023,V1024,y
count,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,...,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0,5000.0
mean,0.027432,0.024914,0.028012,0.031368,0.035148,0.038932,0.03945,0.039912,0.041508,0.043002,...,-0.009176,-0.00976,-0.009912,-0.010388,-0.00946,-0.008772,-0.00945,-0.009858,-0.008708,4.526
std,0.287523,0.282959,0.281609,0.279776,0.279121,0.278129,0.27764,0.277811,0.276532,0.276687,...,0.240003,0.238287,0.238256,0.239357,0.239393,0.240392,0.241258,0.242532,0.247056,2.867572
min,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,...,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,0.0
25%,-0.21,-0.21,-0.2,-0.2,-0.19,-0.18,-0.18,-0.18,-0.18,-0.18,...,-0.18,-0.19,-0.18,-0.18,-0.18,-0.18,-0.19,-0.1825,-0.19,2.0
50%,0.02,0.02,0.03,0.03,0.04,0.04,0.04,0.04,0.04,0.04,...,-0.02,-0.02,-0.02,-0.03,-0.02,-0.02,-0.02,-0.02,-0.02,5.0
75%,0.26,0.25,0.26,0.25,0.26,0.26,0.26,0.27,0.27,0.27,...,0.15,0.15,0.15,0.15,0.15,0.15,0.15,0.16,0.16,7.0
max,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,...,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,9.0


In [8]:
# test set: the remaining 55000 rows 
cifar_test = cifar[5000:]

cifar_test.describe()

Unnamed: 0,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,...,V1016,V1017,V1018,V1019,V1020,V1021,V1022,V1023,V1024,y
count,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,...,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0,55000.0
mean,0.025543,0.023227,0.026687,0.028602,0.030713,0.032996,0.034909,0.036897,0.039055,0.040568,...,-0.008735,-0.008565,-0.008779,-0.009239,-0.009591,-0.01018,-0.010901,-0.011531,-0.010741,4.497636
std,0.283277,0.2794,0.278617,0.277646,0.276291,0.275551,0.274741,0.274498,0.273662,0.273465,...,0.237207,0.237589,0.237271,0.237627,0.237517,0.238341,0.23882,0.239835,0.243017,2.87275
min,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,...,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,-0.5,0.0
25%,-0.2,-0.2,-0.2,-0.19,-0.19,-0.19,-0.18,-0.18,-0.18,-0.18,...,-0.18,-0.18,-0.18,-0.18,-0.18,-0.18,-0.18,-0.19,-0.19,2.0
50%,0.02,0.02,0.02,0.02,0.03,0.03,0.03,0.03,0.04,0.04,...,-0.02,-0.02,-0.02,-0.02,-0.02,-0.02,-0.02,-0.03,-0.03,4.0
75%,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.26,0.26,0.26,...,0.15,0.15,0.15,0.15,0.15,0.15,0.15,0.15,0.15,7.0
max,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,...,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,0.5,9.0
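
The split above slices rows in file order. As a sketch of the same split expressed over row indices, with an optional shuffle, consider the following; `split_indices` is a hypothetical helper, and the shuffle is an addition that the notebook itself does not perform.

```python
import random


def split_indices(num_rows, num_train, seed=None):
    """Return (train, test) row indices; shuffle first only if a seed is given."""
    indices = list(range(num_rows))
    if seed is not None:
        random.Random(seed).shuffle(indices)
    return indices[:num_train], indices[num_train:]


train_idx, test_idx = split_indices(60000, 5000)   # same rows as cifar[:5000] / cifar[5000:]
print(len(train_idx), len(test_idx))               # 5000 55000
```

With pandas, the resulting index lists could be applied as `cifar.iloc[train_idx]` and `cifar.iloc[test_idx]`.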


### Def estimator 

In this section an estimator is created from `cnn_fun`, and a model directory is provided to store the model's logs and summaries.

In [13]:
# make a estimator for the cnn model 
cnn_model = tf.estimator.Estimator(model_fn=cnn_fun, model_dir='./logs')

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': './logs', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x0000019C91DE9400>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}


### Train Step 

In this section the neural network is trained on the `cifar_train` dataset.

In [10]:
cnn_model.train(input_fn=genrate_input_fn(cifar_train, batch_size=BATCH_SIZE), steps=500, hooks=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into ./logs\model.ckpt.
INFO:tensorflow:loss = 2.3025465, step = 1
INFO:tensorflow:global_step/sec: 2.44237
INFO:tensorflow:loss = 2.2947392, step = 101 (40.959 sec)
INFO:tensorflow:global_step/sec: 4.34292
INFO:tensorflow:loss = 2.2501962, step = 201 (22.995 sec)
INFO:tensorflow:global_step/sec: 4.44077
INFO:tensorflow:loss = 2.2016666, step = 301 (22.565 sec)
INFO:tensorflow:global_step/sec: 4.33514
INFO:tensorflow:loss = 2.0631025, step = 401 (23.020 sec)
INFO:tensorflow:Saving checkpoints for 500 into ./logs\model.ckpt.
INFO:tensorflow:Loss for final step: 2.143433.


<tensorflow.python.estimator.estimator.Estimator at 0x19c91309fd0>
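
The first logged loss, 2.3025465, is close to ln(10) ≈ 2.3026, which is what softmax cross-entropy gives when the network assigns equal probability to all ten classes. A small plain-Python sketch of that calculation follows; the function names are illustrative and this is not the TensorFlow implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_cross_entropy(logits, label):
    """Negative log-probability of the true class id."""
    return -math.log(softmax(logits)[label])


uniform_logits = [0.0] * 10   # an untrained network that favours no class
print(round(sparse_cross_entropy(uniform_logits, 3), 4))   # 2.3026
```

The loss falling from about 2.30 to about 2.1 over 500 steps therefore means the model has only just begun to move away from uniform guessing.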

### Evaluate Step 

In this step the model is evaluated on the `cifar_test` dataset to check its accuracy.

In [14]:
# eval model on cifar test 
cnn_model.evaluate(input_fn=genrate_input_fn(cifar_test,BATCH_SIZE), steps=200, hooks=None)

INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2019-01-04-16:18:33
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from ./logs\model.ckpt-500
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [20/200]
INFO:tensorflow:Evaluation [40/200]
INFO:tensorflow:Evaluation [60/200]
INFO:tensorflow:Evaluation [80/200]
INFO:tensorflow:Evaluation [100/200]
INFO:tensorflow:Evaluation [120/200]
INFO:tensorflow:Evaluation [140/200]
INFO:tensorflow:Evaluation [160/200]
INFO:tensorflow:Evaluation [180/200]
INFO:tensorflow:Evaluation [200/200]
INFO:tensorflow:Finished evaluation at 2019-01-04-16:19:17
INFO:tensorflow:Saving dict for global step 500: accuracy = 0.25015625, global_step = 500, loss = 2.1145484
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 500: ./logs\model.ckpt-500


{'accuracy': 0.25015625, 'loss': 2.1145484, 'global_step': 500}
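
To put the reported accuracy in context, the sketch below works out how many test rows the evaluation saw and how many of them were classified correctly. It assumes every one of the 200 batches was full, which holds here because 200 x 64 is smaller than the 55000-row test set; `evaluated_examples` is a hypothetical helper.

```python
def evaluated_examples(steps, batch_size):
    """Rows consumed by an evaluation run, assuming every batch is full."""
    return steps * batch_size


n_eval = evaluated_examples(200, 64)
n_correct = round(0.25015625 * n_eval)   # accuracy reported in the log above
coverage = n_eval / 55000                # share of cifar_test that was evaluated

print(n_eval, n_correct, round(coverage, 3))   # 12800 3202 0.233
```

The accuracy is thus measured on about 23% of the test set, and at roughly 25% it sits above the 10% expected from random guessing over ten classes.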

### Predict Step 

In this step predictions are made for the `cifar_test` values.

In [15]:
# predict values for the test set; num_epochs=1 makes the returned generator finite 
cnn_model.predict(input_fn=genrate_input_fn(cifar_test, BATCH_SIZE, num_epochs=1))

<generator object Estimator.predict at 0x0000019C9141A410>
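
`predict` returns a lazy generator, so nothing is computed until it is iterated. The sketch below shows one way to pull the first few predicted classes from such a generator. It uses a hypothetical stand-in, `fake_predictions`, with made-up class ids so that it runs without TensorFlow; with the real estimator, the generator returned by `cnn_model.predict(...)` would be passed in instead, and it yields one dict per example.

```python
from itertools import islice


def fake_predictions():
    """Hypothetical stand-in for the generator returned by cnn_model.predict."""
    for class_id in [3, 8, 8, 0, 6]:   # made-up class ids, not model output
        yield {'classes': class_id}


def first_classes(predictions, n):
    """Collect the predicted class ids of the first n examples."""
    return [int(p['classes']) for p in islice(predictions, n)]


print(first_classes(fake_predictions(), 3))   # [3, 8, 8]
```

Bounding the iteration with `islice` matters when the input function repeats the data indefinitely, because the generator would otherwise never finish.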