## Image Classification
This project implements a Convolutional Neural Network (CNN) twice, once with TensorFlow's low-level API and once with its high-level Layers API, including the required preprocessing steps, and trains it on the widely used MNIST handwritten digit dataset.

Both models perform well: each reaches over 99% accuracy on the held-out test set.

## Preprocessing the Data
Here we load the data and put it into a form that the model can be trained on.

In [2]:
import os
import struct 
import numpy as np

## Here we define a function that loads the MNIST dataset from its IDX files and scales the pixel values to the range [-1, 1], which tends to help the model train
def load_mnist(path,kind = 'train'):
    labels_path = os.path.join(path,'%s-labels.idx1-ubyte' % kind)
    
    images_path = os.path.join(path, '%s-images.idx3-ubyte' % kind)
    
    with open(labels_path,'rb') as lbpath:
        magic, n = struct.unpack('>II',lbpath.read(8))
        
        labels = np.fromfile(lbpath,dtype = np.uint8)
        
    with open(images_path, 'rb') as imgpath:
        magic, num, rows, cols = struct.unpack(">IIII",imgpath.read(16))
        
        images = np.fromfile(imgpath,dtype = np.uint8).reshape(len(labels),784)
        
        images = ((images/255.) - .5)*2
        
    return images, labels

X_data , y_data = load_mnist('./mnist/',kind = 'train')    ## Loading the training and test data respectively
X_test , y_test = load_mnist('./mnist/',kind = 't10k')
X_train , y_train = X_data[:50000] , y_data[:50000]        ## Dividing the training dataset into training and validation sets respectively
X_valid , y_valid = X_data[50000:] , y_data[50000:]
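
The header layout that `load_mnist` relies on can be checked without the real files. The following is a minimal sketch, not the loader itself: it builds a tiny, hypothetical 2-image IDX buffer in memory, reads the 16-byte big-endian header with the same `'>IIII'` format string, and applies the same `[-1, 1]` scaling. The helper name `parse_idx_images` and the sample bytes are assumptions made for illustration.

```python
import io
import struct

def parse_idx_images(buf):
    """Parse an IDX3 image buffer into flat pixel rows scaled to [-1, 1]."""
    # Header: magic number, image count, rows, cols (big-endian unsigned ints).
    magic, num, rows, cols = struct.unpack('>IIII', buf.read(16))
    if magic != 2051:  # 0x00000803 marks unsigned-byte data with 3 dimensions
        raise ValueError('unexpected magic number: %d' % magic)
    size = rows * cols
    raw = buf.read(num * size)
    images = []
    for k in range(num):
        pixels = raw[k * size:(k + 1) * size]
        # Same scaling as load_mnist: 0 -> -1.0 and 255 -> 1.0
        images.append([((p / 255.0) - 0.5) * 2 for p in pixels])
    return images

# Hypothetical 2-image, 2x2-pixel file built in memory for illustration.
header = struct.pack('>IIII', 2051, 2, 2, 2)
payload = bytes([0, 255, 0, 255, 255, 255, 0, 0])
images = parse_idx_images(io.BytesIO(header + payload))
print(images[0])  # [-1.0, 1.0, -1.0, 1.0]
```

The label files use the same idea with an 8-byte header (`'>II'`) and magic number 2049.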

In [3]:
## Here we define a helper function that we can use to divide the 
## dataset into respective mini-batches for training the model
def batch_generator(X,y,batch_size = 64,shuffle = False,random_seed = None):
    idx = np.arange(y.shape[0])
    if shuffle:
        rng = np.random.RandomState(random_seed)
        rng.shuffle(idx)
        X = X[idx]
        y = y[idx]
        
    for i in range(0,X.shape[0],batch_size):
        yield (X[i:i+batch_size,:], y[i:i+batch_size])
## The generator yields one mini-batch at a time instead of building a list of all batches, so training only ever holds the current batch slice
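
To see how many mini-batches `batch_generator` produces per epoch, the slicing can be reproduced on its own. This is a small sketch under the settings used in this notebook (50,000 training samples, default `batch_size = 64`); the helper `batch_bounds` is a hypothetical name and only mirrors the `range(0, n, batch_size)` loop.

```python
def batch_bounds(n_samples, batch_size=64):
    """Yield the (start, stop) slice bounds used for each mini-batch."""
    for start in range(0, n_samples, batch_size):
        yield start, min(start + batch_size, n_samples)

bounds = list(batch_bounds(50000, batch_size=64))
n_batches = len(bounds)
last_size = bounds[-1][1] - bounds[-1][0]
print(n_batches, last_size)  # 782 16
```

So each epoch consists of 781 full batches plus one final batch of 16 samples.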

In [4]:
## Here we standardize the dataset (mean centering and scaling) so that the model trains more efficiently
mean_vals = np.mean(X_train,axis = 0)
std_val = np.std(X_train)             ## we use one global standard deviation instead of a per-pixel one: some pixel positions are constant across all images, so their standard deviation would be zero and lead to division by zero
X_train_centered = (X_train - mean_vals)/std_val    ## standardizing the training, validation and test sets with the training statistics
X_valid_centered = (X_valid - mean_vals)/std_val
X_test_centered = (X_test - mean_vals)/std_val
## The dataset is now preprocessed, and we move on to implementing the model
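
The reason for dividing by a single global standard deviation can be shown on a toy example. The sketch below uses a hypothetical 3-sample dataset whose first feature is constant, similar to a border pixel that is background in every image; `statistics.pstdev` is the population standard deviation, which matches the default behaviour of `np.std`.

```python
import statistics

# Hypothetical 3-sample, 3-feature dataset; the first feature is constant.
X = [[-1.0, 0.2, 0.9],
     [-1.0, -0.4, 0.1],
     [-1.0, 0.6, -0.5]]

columns = list(zip(*X))
per_feature_std = [statistics.pstdev(col) for col in columns]
global_std = statistics.pstdev([v for row in X for v in row])

print(per_feature_std[0])  # 0.0, dividing by this would fail
print(global_std > 0)      # True, safe to divide by
```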

## Implementing the Convolutional Neural Network

### Low level API
Here we implement the CNN with the low-level API, writing the layer wrappers ourselves to understand the intricacies of the model, and then train and test it.

In [7]:
import tensorflow as tf
import numpy as np

## Here we define the convolutional layer, which is used first to extract important features from the images

def conv_layer(input_tensor,name,kernel_size,n_output_channels,padding_mode = 'SAME',strides = (1,1,1,1)):
    with tf.variable_scope(name):
        input_shape = input_tensor.get_shape().as_list()
        n_input_channels = input_shape[-1]
        
        weights_shape = list(kernel_size) + [n_input_channels,n_output_channels]
        weights = tf.get_variable(name = '_weights',shape = weights_shape)
        biases = tf.get_variable(name = '_biases',initializer=tf.zeros(shape = [n_output_channels]))
        conv = tf.nn.conv2d(input = input_tensor,filter = weights,strides = strides,padding = padding_mode)
        conv = tf.nn.bias_add(conv,biases,name = 'net_pre-activation')
        conv = tf.nn.relu(conv,name = 'activation')
        return conv
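
What `conv_layer` computes can be illustrated for a single input channel and a single filter. The following pure-Python sketch is an assumption-laden simplification (no batching, no channels, stride 1) and not how TensorFlow implements it; it shows a 'VALID' cross-correlation, which is what `tf.nn.conv2d` computes (the kernel is not flipped), followed by a bias add and ReLU.

```python
def conv2d_valid_relu(image, kernel, bias=0.0):
    """Single-channel 'VALID' cross-correlation followed by bias add and ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = bias
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(max(acc, 0.0))  # ReLU clips negatives to zero
        out.append(row)
    return out

# Hypothetical 3x3 image and 2x2 kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[0, 1],
          [1, 0]]
result = conv2d_valid_relu(image, kernel, bias=-7.0)
print(result)  # [[0.0, 1.0], [5.0, 7.0]]
```

A 3x3 input with a 2x2 kernel gives a 2x2 output, the same shrinkage that turns 28x28 into 24x24 with a 5x5 kernel.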

In [8]:
## Here we define the fully connected layer, which follows the convolutional layers and is trained on the features they extract
def fc_layer(input_tensor,name,n_output_units,activation_fn=None):
    with tf.variable_scope(name):
        input_shape = input_tensor.get_shape().as_list()[1:]
        n_input_units = np.prod(input_shape)
        if len(input_shape) > 1:
            input_tensor = tf.reshape(input_tensor,shape = (-1,n_input_units))
        
        weights_shape = [n_input_units,n_output_units]
        weights = tf.get_variable(name = '_weights',shape = weights_shape)
        biases = tf.get_variable(name = '_biases',initializer = tf.zeros(shape = [n_output_units]))
        layer = tf.matmul(input_tensor,weights)
        layer = tf.nn.bias_add(layer,biases,name = 'net_pre-activation')

        if activation_fn is None:
            return layer
        layer = activation_fn(layer , name = 'activation')
        return layer
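
The two steps inside `fc_layer`, flattening the feature map and applying `matmul` plus bias, can be sketched with nested lists. The helper names `flatten` and `dense` and all numbers below are hypothetical; the sketch handles one sample with a (height, width, channels) layout.

```python
def flatten(feature_map):
    """Flatten a (height, width, channels) nested list into one vector."""
    return [v for row in feature_map for pixel in row for v in pixel]

def dense(batch, weights, biases, activation=None):
    """Compute activation(batch @ weights + biases) for nested-list inputs."""
    out = []
    for x in batch:
        row = []
        for j in range(len(biases)):
            z = sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
            row.append(activation(z) if activation else z)
        out.append(row)
    return out

def relu(z):
    return max(z, 0.0)

# Hypothetical feature map: height 1, width 2, 2 channels -> 4 input units.
fmap = [[[1.0, -2.0], [0.5, 3.0]]]
x = flatten(fmap)
W = [[1.0, -1.0], [0.0, 1.0], [2.0, 0.0], [0.0, -1.0]]
b = [0.0, 0.5]
print(dense([x], W, b))                   # [[2.0, -5.5]]
print(dense([x], W, b, activation=relu))  # [[2.0, 0.0]]
```

Passing `activation=None` returns the raw pre-activations, which is how the final logits layer is used.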

In [9]:
## In this function we use the helper functions defined above and Build the whole CNN Model from Scratch
def build_cnn():
    tf_x = tf.placeholder(tf.float32,shape = [None,784],name = 'tf_x')
    tf_y = tf.placeholder(tf.int32,shape = [None],name = 'tf_y')
    tf_x_image = tf.reshape(tf_x,shape = [-1,28,28,1],name = 'tf_x_reshaped')
    tf_y_onehot = tf.one_hot(indices = tf_y,depth = 10,dtype = tf.float32,name = 'tf_onehot')
    
    print('\nBuilding 1st layer:')
    ## It has alternate layers of convolution and subsampling
    h1 = conv_layer(tf_x_image,name='conv_1',kernel_size = (5,5),padding_mode = 'VALID',n_output_channels = 32)
    
    h1_pool = tf.nn.max_pool(h1,ksize = [1,2,2,1],strides = [1,2,2,1],padding = 'SAME')
    
    print('\nBuilding 2nd layer:')
    h2 = conv_layer(h1_pool,name = 'conv_2',kernel_size = (5,5),padding_mode='VALID',n_output_channels=64)
    
    h2_pool = tf.nn.max_pool(h2,ksize = [1,2,2,1],strides = [1,2,2,1],padding = 'SAME')
    
    print('\nBuilding 3rd layer:')
    ## After two layers of Convolution we add the Fully Connected Layers
    h3 = fc_layer(h2_pool,name = 'fc_3',n_output_units=1024,activation_fn=tf.nn.relu)
    
    keep_prob = tf.placeholder(tf.float32,name = 'fc_keep_prob')
    ## Here we also add the Dropout Layer as a regularizer so that the model doesn't overfit
    h3_drop = tf.nn.dropout(h3,keep_prob = keep_prob,name = 'dropout_layer')
    
    print('\nBuilding 4th layer:')
    ## Here we build the last layer of Fully connected layer after which we give out the logits
    h4 = fc_layer(h3_drop,name = 'fc_4',n_output_units=10,activation_fn=None)
    ## Here we define the softmax for evaluating respective probabilities and the labels for the images
    predictions = {'probabilities':tf.nn.softmax(h4,name = 'probabilities'),'labels':tf.cast(tf.argmax(h4,axis = 1),tf.int32,name = 'labels')}
    ## Here we define the cross entropy loss for the classifier
    cross_entropy_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = h4,labels = tf_y_onehot),name = 'cross_entropy_loss')
    ## Here we define the optimizer; note that learning_rate is read from the global scope, so it must be defined before build_cnn() is called
    optimizer = tf.train.AdamOptimizer(learning_rate)
    optimizer = optimizer.minimize(cross_entropy_loss,name = 'train_op')
    ## Evaluate the predictions
    correct_predictions = tf.equal(predictions['labels'],tf_y,name = 'correct_preds')
    ## calculate the accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_predictions,tf.float32),name = 'accuracy')
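
The tensor sizes flowing through `build_cnn` can be worked out by hand. The sketch below applies TensorFlow's documented output-size rules for 'VALID' and 'SAME' padding to the layer settings used above and then counts the trainable parameters; the helper `out_size` is a hypothetical name.

```python
import math

def out_size(n, kernel, stride, padding):
    """Spatial output size under TensorFlow's 'VALID' / 'SAME' padding rules."""
    if padding == 'VALID':
        return math.ceil((n - kernel + 1) / stride)
    return math.ceil(n / stride)  # 'SAME'

size = 28
size = out_size(size, 5, 1, 'VALID')   # conv_1: 28 -> 24
size = out_size(size, 2, 2, 'SAME')    # pool:   24 -> 12
size = out_size(size, 5, 1, 'VALID')   # conv_2: 12 -> 8
size = out_size(size, 2, 2, 'SAME')    # pool:    8 -> 4
flat_units = size * size * 64          # input width of fc_3

# Weights plus biases per layer.
params = {
    'conv_1': 5 * 5 * 1 * 32 + 32,
    'conv_2': 5 * 5 * 32 * 64 + 64,
    'fc_3': flat_units * 1024 + 1024,
    'fc_4': 1024 * 10 + 10,
}
total_params = sum(params.values())
print(size, flat_units, total_params)  # 4 1024 1111946
```

Almost all of the roughly 1.1 million parameters sit in `fc_3`, which is also the layer that dropout is applied to.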

In [10]:
## Here we define helper functions to save the model, restore it when needed, and reuse it on further data
## Saving the model
def save(saver,sess,epoch,path = './model/'):
    if not os.path.isdir(path):
        os.makedirs(path)
    print('Saving model in %s' %path)
    saver.save(sess,os.path.join(path,'cnn-model.ckpt'),global_step = epoch)
## Loading the model    
def load(saver,sess,path,epoch):
    print('Loading model from %s' % path)
    saver.restore(sess, os.path.join(path,'cnn-model.ckpt-%d' % epoch))
## Training the model
def train(sess,training_set,validation_set=None,initialize = True,epochs = 20,shuffle = True,dropout = 0.5,random_seed = None):
    X_data = np.array(training_set[0])
    y_data = np.array(training_set[1])
    training_loss = []
    
    if initialize:
        sess.run(tf.global_variables_initializer())
        
    np.random.seed(random_seed)
    for epoch in range(1,epochs+1):
        batch_gen = batch_generator(X_data,y_data,shuffle = shuffle)
        avg_loss = 0.0
        for i,(batch_x,batch_y) in enumerate(batch_gen):
            feed = {'tf_x:0' : batch_x,'tf_y:0': batch_y,'fc_keep_prob:0':dropout}
            loss, _ = sess.run(['cross_entropy_loss:0','train_op'],feed_dict = feed)
            avg_loss += loss
        training_loss.append(avg_loss/(i+1))
        
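        ## Note: the value printed here is the loss summed over the epoch's mini-batches; the per-batch mean is what gets appended to training_loss
        print('Epoch %02d Training Avg. Loss: %7.3f ' %(epoch,avg_loss),end = ' ')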
        
        if validation_set is not None:
            feed = {'tf_x:0':validation_set[0],
                    'tf_y:0':validation_set[1],
                    'fc_keep_prob:0':1.0}
            valid_acc = sess.run('accuracy:0',feed_dict = feed)
            print(' Validation Acc: %7.3f' % valid_acc)
        else:
            print()

## Predicting from the model
def predict(sess,X_test,return_proba = False):
    feed = {'tf_x:0':X_test,'fc_keep_prob:0':1.0}
    if return_proba:
        return sess.run('probabilities:0',feed_dict = feed)
    else:
        return sess.run('labels:0',feed_dict= feed)
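
`predict` returns either the softmax probabilities or the arg-max labels derived from the same logits. As a minimal sketch of how these relate, and of the cross-entropy that is minimized during training, here is a pure-Python version for one sample with hypothetical 3-class logits (the real model has 10 classes).

```python
import math

def softmax(logits):
    """Numerically stable softmax over one vector of logits."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    """Cross-entropy of the softmax distribution against an integer label."""
    return -math.log(softmax(logits)[label])

logits = [2.0, 0.5, -1.0]            # hypothetical logits for one sample
probs = softmax(logits)
predicted = probs.index(max(probs))  # same role as tf.argmax(h4, axis=1)
print(predicted)                     # 0
print(cross_entropy(logits, 0) < cross_entropy(logits, 2))  # True
```

The loss is small when the true label has high probability and grows as that probability shrinks.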
    

In [12]:
## Here we use a learning rate that turned out to work well
learning_rate = 1e-4
random_seed = 123
## building the model
g = tf.Graph()
with g.as_default():
    tf.set_random_seed(random_seed)
    build_cnn()
    
    saver = tf.train.Saver()


Building 1st layer:

Building 2nd layer:

Building 3rd layer:

Building 4th layer:


In [13]:
## Training the model and checking its validation accuracy
with tf.Session(graph = g) as sess:
    train(sess,training_set = (X_train_centered,y_train),validation_set=(X_valid_centered,y_valid),initialize=True,random_seed=123)
    save(saver,sess,epoch = 20)

Epoch 01 Training Avg. Loss: 273.745   Validation Acc:   0.974
Epoch 02 Training Avg. Loss:  74.354   Validation Acc:   0.984
Epoch 03 Training Avg. Loss:  50.427   Validation Acc:   0.984
Epoch 04 Training Avg. Loss:  39.705   Validation Acc:   0.988
Epoch 05 Training Avg. Loss:  32.010   Validation Acc:   0.988
Epoch 06 Training Avg. Loss:  27.765   Validation Acc:   0.988
Epoch 07 Training Avg. Loss:  23.297   Validation Acc:   0.991
Epoch 08 Training Avg. Loss:  20.094   Validation Acc:   0.990
Epoch 09 Training Avg. Loss:  17.445   Validation Acc:   0.992
Epoch 10 Training Avg. Loss:  16.106   Validation Acc:   0.991
Epoch 11 Training Avg. Loss:  12.627   Validation Acc:   0.992
Epoch 12 Training Avg. Loss:  11.549   Validation Acc:   0.992
Epoch 13 Training Avg. Loss:  10.415   Validation Acc:   0.990
Epoch 14 Training Avg. Loss:   9.284   Validation Acc:   0.992
Epoch 15 Training Avg. Loss:   8.617   Validation Acc:   0.992
Epoch 16 Training Avg. Loss:   7.419   Validation Acc: 
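
A note on reading the log above: the training loop prints the running sum `avg_loss` rather than the mean, so the "Avg. Loss" column is the loss summed over all mini-batches of an epoch. Assuming the default batch size of 64 and the 50,000 training samples used here, the per-batch mean can be recovered as follows.

```python
import math

n_train, batch_size = 50000, 64
n_batches = math.ceil(n_train / batch_size)  # mini-batches per epoch

# Value copied from the first line of the training log above.
summed_loss = 273.745
mean_loss = summed_loss / n_batches
print(n_batches, round(mean_loss, 3))  # 782 0.35
```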

In [16]:
del g
## checking if the load and save method works and then predicting from the model
g2 = tf.Graph()
with g2.as_default():
    tf.set_random_seed(random_seed)
    build_cnn()
    saver = tf.train.Saver()
    
with tf.Session(graph = g2) as sess:
    load(saver,sess,epoch = 20,path = './model/')
    
    preds = predict(sess,X_test_centered,return_proba=False)
    print('Test Accuracy: %.3f%%' % (100*np.sum(preds == y_test)/len(y_test)))
## Hence we obtain a test accuracy that is on par with the validation accuracy


Building 1st layer:

Building 2nd layer:

Building 3rd layer:

Building 4th layer:
Loading model from ./model/
INFO:tensorflow:Restoring parameters from ./model/cnn-model.ckpt-20
Test Accuracy: 99.250%


In [18]:
## retraining the model for another 20 epochs from the loaded checkpoint and then again testing the model
with tf.Session(graph = g2) as sess:
    load(saver,sess,epoch = 20,path = './model/')
    train(sess,training_set=(X_train_centered,y_train),validation_set=(X_valid_centered,y_valid),initialize = False,epochs = 20,random_seed = 123)
    save(saver,sess,epoch = 40,path = './model/')
    preds = predict(sess,X_test_centered,return_proba=False)
    print('Test Accuracy: %.3f%%' %(100*np.sum(preds == y_test)/len(y_test)))

Loading model from ./model/
INFO:tensorflow:Restoring parameters from ./model/cnn-model.ckpt-20
Epoch 01 Training Avg. Loss:   4.677   Validation Acc:   0.992
Epoch 02 Training Avg. Loss:   4.158   Validation Acc:   0.992
Epoch 03 Training Avg. Loss:   3.544   Validation Acc:   0.992
Epoch 04 Training Avg. Loss:   3.178   Validation Acc:   0.992
Epoch 05 Training Avg. Loss:   3.171   Validation Acc:   0.992
Epoch 06 Training Avg. Loss:   2.638   Validation Acc:   0.991
Epoch 07 Training Avg. Loss:   3.575   Validation Acc:   0.993
Epoch 08 Training Avg. Loss:   2.499   Validation Acc:   0.992
Epoch 09 Training Avg. Loss:   2.484   Validation Acc:   0.992
Epoch 10 Training Avg. Loss:   1.835   Validation Acc:   0.993
Epoch 11 Training Avg. Loss:   2.733   Validation Acc:   0.993
Epoch 12 Training Avg. Loss:   1.897   Validation Acc:   0.992
Epoch 13 Training Avg. Loss:   2.218   Validation Acc:   0.991
Epoch 14 Training Avg. Loss:   2.267   Validation Acc:   0.993
Epoch 15 Training Avg.

### High Level API
Here we build the same model, but using the high-level Layers API of TensorFlow

In [19]:
import tensorflow as tf
import numpy as np
## The trained model is stored under ./tflayers-model/
## We define the class ConvNN, which contains all the methods required for building and training the model
class ConvNN(object):
    def __init__(self,batchsize = 64,epochs = 20,learning_rate = 1e-4,dropout_rate = 0.5,shuffle = True,random_seed = None):
        np.random.seed(random_seed)
        self.batchsize = batchsize
        self.epochs = epochs
        self.learning_rate = learning_rate
        self.dropout_rate = dropout_rate
        self.shuffle = shuffle
        
        g = tf.Graph()
        with g.as_default():
            tf.set_random_seed(random_seed)
            self.build()
            self.init_op = tf.global_variables_initializer()
            self.saver = tf.train.Saver()
        self.sess = tf.Session(graph = g)
        
    def build(self):
        tf_x = tf.placeholder(tf.float32,shape = [None,784],name = 'tf_x')
        tf_y  =tf.placeholder(tf.int32,shape = [None],name = 'tf_y')
        
        is_train = tf.placeholder(tf.bool,shape = (),name = 'is_train')
        
        tf_x_image = tf.reshape(tf_x,shape = [-1,28,28,1],name = 'input_x_2dimages')
        tf_y_onehot = tf.one_hot(indices = tf_y,depth = 10,dtype = tf.float32,name = 'input_y_onehot')
        
        h1 = tf.layers.conv2d(tf_x_image,kernel_size = (5,5),filters = 32,activation = tf.nn.relu)
        h1_pool = tf.layers.max_pooling2d(h1,pool_size=(2,2),strides=(2,2))
        
        h2 = tf.layers.conv2d(h1_pool,kernel_size=(5,5),filters=64,activation=tf.nn.relu)
        h2_pool = tf.layers.max_pooling2d(h2,pool_size=(2,2),strides=(2,2))
        
        input_shape = h2_pool.get_shape().as_list()
        n_input_units = np.prod(input_shape[1:])
        h2_pool_flat = tf.reshape(h2_pool,shape = [-1,n_input_units])
        h3 = tf.layers.dense(h2_pool_flat,1024,activation=tf.nn.relu)
        
        h3_drop = tf.layers.dropout(h3,rate = self.dropout_rate,training = is_train)
        h4 = tf.layers.dense(h3_drop,10,activation = None)
        
        predictions = {'probabilities':tf.nn.softmax(h4,name = 'probabilities'),'labels':tf.cast(tf.argmax(h4,axis = 1),tf.int32,name ='labels')}
        cross_entropy_loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = h4,labels = tf_y_onehot),name = 'cross_entropy_loss')
        optimizer = tf.train.AdamOptimizer(self.learning_rate)
        optimizer = optimizer.minimize(cross_entropy_loss,name = 'train_op')
        
        correct_predictions = tf.equal(predictions['labels'],tf_y,name = 'correct_preds')
        accuracy = tf.reduce_mean(tf.cast(correct_predictions,tf.float32),name = 'accuracy')
        
    def save(self,epoch,path = './tflayers-model/'):
        if not os.path.isdir(path):
            os.makedirs(path)
        print('Saving model in %s ' % path)
        self.saver.save(self.sess,os.path.join(path,'model.ckpt'),global_step = epoch)
    
    def load(self,epoch,path):
        print('Loading model from %s' % path)
        self.saver.restore(self.sess,os.path.join(path,'model.ckpt-%d' % epoch))
        
    def train(self,training_set,validation_set = None,initialize = True):
        if initialize:
            self.sess.run(self.init_op)
        
        self.train_cost_ = []
        X_data = np.array(training_set[0])
        y_data =np.array(training_set[1])
        
        for epoch in range(1,self.epochs+1):
            batch_gen = batch_generator(X_data,y_data,shuffle = self.shuffle)
            avg_loss = 0.0
            for i,(batch_x,batch_y) in enumerate(batch_gen):
                feed = {'tf_x:0':batch_x,'tf_y:0':batch_y,'is_train:0':True}
                loss, _ = self.sess.run(['cross_entropy_loss:0','train_op'],feed_dict = feed)
                avg_loss += loss

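            self.train_cost_.append(avg_loss/(i+1))   ## store the mean per-batch loss for this epoch
            ## Note: the value printed below is the loss summed over the epoch's mini-batches
            print('Epoch %02d: Training Avg. Loss: %7.3f ' %(epoch,avg_loss),end = ' ')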
            if validation_set is not None:
                feed = {'tf_x:0':validation_set[0],'tf_y:0':validation_set[1],'is_train:0':False}
                valid_acc = self.sess.run('accuracy:0',feed_dict = feed)
                print('Validation Acc: %7.3f ' % valid_acc)
            else:
                print()

    def predict(self,X_test,return_proba = False):
        feed = {'tf_x:0':X_test,'is_train:0':False}
        if return_proba:
            return self.sess.run('probabilities:0',feed_dict = feed)
        else:
            return self.sess.run('labels:0',feed_dict = feed)
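
One difference between the two implementations is how dropout is parameterized: the low-level model feeds `fc_keep_prob` (the probability of keeping a unit) to `tf.nn.dropout`, while `ConvNN` passes `rate` (the probability of dropping a unit) to `tf.layers.dropout`, so `rate = 1 - keep_prob` and 0.5 means the same in both. Both scale the surviving units up during training so that the expected activation is unchanged. A minimal pure-Python sketch of this "inverted dropout" idea, with a hypothetical helper name and seed:

```python
import random

def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1 / (1 - rate)."""
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(123)
activations = [1.0] * 8
dropped = inverted_dropout(activations, rate=0.5, rng=rng)
print(dropped)  # each entry is either 0.0 or 2.0

# With rate 0.0 nothing is dropped, which corresponds to inference behaviour.
unchanged = inverted_dropout([1.0, 2.0], rate=0.0, rng=rng)
print(unchanged)  # [1.0, 2.0]
```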

In [28]:
## Instantiating the model and building and training it
cnn = ConvNN(random_seed = 123)
cnn.train(training_set = (X_train_centered,y_train),validation_set = (X_valid_centered,y_valid),initialize = True)
cnn.save(epoch = 20)
## The warnings that appear here are caused by a version conflict between tensorflow and gast and will be resolved later

Epoch 01: Training Avg. Loss: 271.610  Validation Acc:   0.974 
Epoch 02: Training Avg. Loss:  75.700  Validation Acc:   0.982 
Epoch 03: Training Avg. Loss:  51.315  Validation Acc:   0.987 
Epoch 04: Training Avg. Loss:  39.694  Validation Acc:   0.988 
Epoch 05: Training Avg. Loss:  31.472  Validation Acc:   0.988 
Epoch 06: Training Avg. Loss:  27.187  Validation Acc:   0.989 
Epoch 07: Training Avg. Loss:  23.141  Validation Acc:   0.991 
Epoch 08: Training Avg. Loss:  19.402  Validation Acc:   0.990 
Epoch 09: Training Avg. Loss:  16.660  Validation Acc:   0.991 
Epoch 10: Training Avg. Loss:  15.587  Validation Acc:   0.990 
Epoch 11: Training Avg. Loss:  12.854  Validation Acc:   0.991 
Epoch 12: Training Avg. Loss:  11.148  Validation Acc:   0.992 
Epoch 13: Training Avg. Loss:  10.020  Validation Acc:   0.992 
Epoch 14: Training Avg. Loss:   8.942  Validation Acc:   0.992 
Epoch 15: Training Avg. Loss:   8.371  Validation Acc:   0.991 


Epoch 16: Training Avg. Loss:   7.281  Validation Acc:   0.991 
Epoch 17: Training Avg. Loss:   5.560  Validation Acc:   0.992 
Epoch 18: Training Avg. Loss:   6.090  Validation Acc:   0.993 
Epoch 19: Training Avg. Loss:   4.900  Validation Acc:   0.993 
Epoch 20: Training Avg. Loss:   4.851  Validation Acc:   0.992 
Saving model in ./tflayers-model/ 


In [29]:
del cnn
## Loading the model again and using it to predict on the test dataset
cnn2 = ConvNN(random_seed = 123)
cnn2.load(epoch = 20,path = './tflayers-model/')
preds = cnn2.predict(X_test_centered)
print('Test Accuracy: %.2f%%' %(100*np.sum(y_test == preds)/len(y_test)))
## Hence we again obtain a good test accuracy

Loading model from ./tflayers-model/
INFO:tensorflow:Restoring parameters from ./tflayers-model/model.ckpt-20
Test Accuracy: 99.40%
