# Convolutional Neural Network with Fractional Max Pooling#
## Full Process Jupyter Notebook ##

### Introduction ###
This model is inspired by the all-convolutional network article, which is cited in the main report. In our cnnfmp model, we replace the conv2x2 layers with stride 2 with the fractional max pooling layers. The author of all-CNN model believes that pooling layers decrease information layers contain, so replacing max pooling layers with convolutional layers will be effective. In the results, all-cnn outperforms many methods except for fractional pooling method. We believe that fractional max pooling method has improved pooling. So we replace the stride-2 conv2x2 layers in all-CNN model with fractional max pooling layers to see whether it improve the performance of all-cnn model. For more information about fractional max pooling and all-cnn model, please refer to our report

### Step0：Import Packages

In [1]:
import tensorflow as tf
import numpy as np
import pickle
import random as rd

### Step1: Data Processing
we use CIFAR-10 dataset for python. To unpickle this data file, we first define help function. Then we load data from six batches. We then have four batches of training data of size 10000, one batch validation data of size 10000 and one batch of test data of size 10000. 
Then we need to modify the dimension of y labels. Right now of labels are in 1-d, we will change then to a 10-dimensional vector containing indicators for each category.

In [2]:
def unpickle(file):
   
    with open(file, 'rb') as fo:
        dict = pickle.load(fo, encoding='bytes')
    return dict

In [3]:
x_train=np.array([])
y_train=np.array([])
x_val=np.array([])
y_val=np.array([])
for i in range(1,5):
    
    batch = unpickle("/data/cifar-10-batches-py/data_batch_%d"%(i))
    if len(x_train)==0 & len(y_train)==0:
        x_train = batch[b'data']
        y_train = batch[b'labels']
    else:
        x_train = np.concatenate((x_train, batch[b'data'])) 
        y_train = np.concatenate((y_train, batch[b'labels']))

v_batch = unpickle("/data/cifar-10-batches-py/data_batch_5")
x_val = v_batch[b'data']
y_val = v_batch[b'labels']

In [4]:
y_train0 = np.empty((0,10), int)
for y in y_train:
    indi= [1 if i==y else 0 for i in range(0,10)]
    y_train0 = np.append(y_train0, np.array([indi]), axis=0)
y_val0 = np.empty((0,10), int)
for y in y_val:
    indi= [1 if i==y else 0 for i in range(0,10)]
    y_val0 = np.append(y_val0, np.array([indi]), axis=0)

In [5]:
test_batch = unpickle("/data/cifar-10-batches-py/test_batch")
x_test = test_batch[b'data']
y_test = test_batch[b'labels']

In [6]:
y_test0 = np.empty((0,10), int)
for y in y_test:
    indi= [1 if i==y else 0 for i in range(0,10)]
    y_test0 = np.append(y_test0, np.array([indi]), axis=0)

In addition to the data preprocessing part. We notice that there are many data augmentation methods. The one the all-CNN net author use is changing the 32x32x3 data into 126x126x3 aggressively. We tried this method, however, due to the GPU capacity(Tesla k80), we had a hard time training the model with data of dimension 126x126x3. So we use the original images size.

### Step2: Build The CNNFMP Model
We replaced all the stride2 conv2x2 layers with fractional max pooling layers of pooling size sqrt(2) except for the last stride 2 conv2x2 layer. Because stride 2 conv2x2 layers are mainly used to reduce the dimensions of images, so we use fractional max pooling for same function.

In [7]:
def compute_logits(x):
    """
    CNN-FMP network model
    This function compute the logits for the layers, the input is a x data set with dimension [none:32*32*3]
    
    """
    
    #input x.shape = [,32*32*3]
    x_image = tf.reshape(x,[-1,32,32,3])
    
    #define some constants, stack size for convolutional layers
    n1=320
    n2=640
    n3=960
    n4=1280
    n5=1600
    n6=1920

    
    #Block1
    #cnn_1
    W_conv1 = tf.get_variable('W_conv1', shape=[2, 2, 3, n1])
    b_conv1 = tf.get_variable('b_conv1', shape=[n1])
    h_conv1 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME'), b_conv1))
    #cnn_2
    W_conv2 = tf.get_variable('W_conv2', shape=[2, 2, n1, n1])
    b_conv2 = tf.get_variable('b_conv2', shape=[n1])
    h_conv2 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(h_conv1, W_conv2, strides=[1, 1, 1, 1], padding='SAME'), b_conv2))
    #fractional_max_pooling_1 (pooling size is sqrt(2)=1.414)
    fmp_1=tf.nn.fractional_max_pool(h_conv2,pooling_ratio=[1.0, 1.414, 1.414, 1.0])[0]
    

    
    #block_2
    #dropout rate = 0.1 
    #cnn_3
    W_conv3 = tf.get_variable('W_conv3', shape=[2, 2, n1, n2])
    b_conv3 = tf.get_variable('b_conv3', shape=[n2])
    h_conv3 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(fmp_1, W_conv3, strides=[1, 1, 1, 1], padding='SAME'), b_conv3))
    #dropout_1
    d_1=tf.nn.dropout(h_conv3,0.9)
    #cnn_4
    W_conv4 = tf.get_variable('W_conv4', shape=[2, 2, n2, n2])
    b_conv4 = tf.get_variable('b_conv4', shape=[n2])
    h_conv4 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(d_1, W_conv4, strides=[1, 1, 1, 1], padding='SAME'), b_conv4))
    #dropout_2
    d_2=tf.nn.dropout(h_conv4,0.9)
    #fractional_max_pooling_2
    fmp_2 = tf.nn.fractional_max_pool(d_2,pooling_ratio=[1.0, 1.414, 1.414, 1.0])[0]
    
    
    
    #block_3
    #dropout=0.2
    #cnn_5
    W_conv5 = tf.get_variable('W_conv5', shape=[2, 2, n2, n3])
    b_conv5 = tf.get_variable('b_conv5', shape=[n3])
    h_conv5 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(fmp_2, W_conv5, strides=[1, 1, 1, 1], padding='SAME'), b_conv5))
    #dropout_3
    d_3=tf.nn.dropout(h_conv5,0.8)
    #cnn_6
    W_conv6 = tf.get_variable('W_conv6', shape=[2, 2, n3, n3])
    b_conv6 = tf.get_variable('b_conv6', shape=[n3])
    h_conv6 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(d_3, W_conv6, strides=[1, 1, 1, 1], padding='SAME'), b_conv6))
    #dropout_4
    d_4=tf.nn.dropout(h_conv6,0.8)
    #fractional_max_pooling_3
    fmp_3 = tf.nn.fractional_max_pool(d_4,pooling_ratio=[1.0, 1.414, 1.414, 1.0])[0]
    

    
    #block_4
    #dropout=0.3
    #cnn_7
    W_conv7 = tf.get_variable('W_conv7', shape=[2, 2, n3, n4])
    b_conv7 = tf.get_variable('b_conv7', shape=[n4])
    h_conv7 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(fmp_3, W_conv7, strides=[1, 1, 1, 1], padding='SAME'), b_conv7))
    #dropout_5
    d_5=tf.nn.dropout(h_conv7,0.7)
    #cnn_8
    W_conv8 = tf.get_variable('W_conv8', shape=[2, 2, n4, n4])
    b_conv8 = tf.get_variable('b_conv8', shape=[n4])
    h_conv8 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(d_5, W_conv8, strides=[1, 1, 1, 1], padding='SAME'), b_conv8))
    #dropout_6
    d_6=tf.nn.dropout(h_conv8,0.7)
    #fractional_max_pooling_4
    fmp_4 = tf.nn.fractional_max_pool(d_6,pooling_ratio=[1.0, 1.414, 1.414, 1.0])[0]
    
    
    #block_5
    #dropout=0.4
    #cnn_9
    W_conv9 = tf.get_variable('W_conv9', shape=[2, 2, n4, n5])
    b_conv9 = tf.get_variable('b_conv9', shape=[n5])
    h_conv9 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(fmp_4, W_conv9, strides=[1, 1, 1, 1], padding='SAME'), b_conv9))
    #dropout_7
    d_7=tf.nn.dropout(h_conv9,0.6)
    #cnn_10
    W_conv10 = tf.get_variable('W_conv10', shape=[2, 2, n5, n5])
    b_conv10 = tf.get_variable('b_conv10', shape=[n5])
    h_conv10 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(d_7, W_conv10, strides=[1, 1, 1, 1], padding='SAME'), b_conv10))
    #dropout_8
    d_8=tf.nn.dropout(h_conv10,0.6)
    #fractional_max_pooling_5
    fmp_5 = tf.nn.fractional_max_pool(d_8,pooling_ratio=[1.0, 1.414, 1.414, 1.0])[0]
   


    #block_6
    #dropout=0.5
    #cnn_11
    W_conv11 = tf.get_variable('W_conv11', shape=[2, 2, n5, n6])
    b_conv11 = tf.get_variable('b_conv11', shape=[n6])
    h_conv11 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(fmp_5, W_conv11, strides=[1, 1, 1, 1], padding='SAME'), b_conv11))
    #dropout_9
    d_9=tf.nn.dropout(h_conv11,0.5)
    #cnn_12
    W_conv12 = tf.get_variable('W_conv12', shape=[2, 2, n6, n6])
    b_conv12 = tf.get_variable('b_conv12', shape=[n6])
    h_conv12 = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(d_9, W_conv12, strides=[1, 1, 1, 1], padding='SAME'), b_conv12))
    #dropout_10
    d_10=tf.nn.dropout(h_conv12,0.5)
    #fractional_max_pooling_6
    fmp_6 = tf.nn.fractional_max_pool(d_10,pooling_ratio=[1.0, 1.414, 1.414, 1.0])[0]
    
    
    
    #lcnn with stride=2 
    W_conv = tf.get_variable('W_conv', shape=[2, 2, n6, n6])
    b_conv = tf.get_variable('b_conv', shape=[n6])
    h_conv = tf.nn.leaky_relu(tf.add(tf.nn.conv2d(fmp_6, W_conv, strides=[1, 2, 2, 1], padding='SAME'), b_conv))

    # fc layer to logits 10
    h_flat = tf.reshape(h_conv, [-1, 1*1*n6])
    W_fc1 = tf.get_variable('W_fc1', shape=[1*1*n6, 10])
    b_fc1 = tf.get_variable('b_fc1', shape=[10])
    #layer of output
    logits = tf.add(tf.matmul(h_flat, W_fc1), b_fc1, name='logits')
    
    return(logits)

def compute_cross_entropy(logits, y):
    """compute the prediction and cross_entropy of model"""
    # This function is used from the in-class example code
    numerical_instability_example = 0
    if numerical_instability_example:
        y_pred = tf.nn.softmax(logits, name='y_pred') 
        cross_ent = tf.reduce_mean(-tf.reduce_sum(y * tf.log(y_pred), reduction_indices=[1]))
    else:
        sm_ce = tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=logits, name='cross_ent_terms')
        cross_ent = tf.reduce_mean(sm_ce, name='cross_ent')
    return cross_ent

def compute_accuracy(logits, y):
    "compare prediction to labels(also from class example)"
    prediction = tf.argmax(logits, 1, name='pred_class')
    true_label = tf.argmax(y, 1, name='true_class')
    accuracy = tf.reduce_mean(tf.cast(tf.equal(prediction, true_label), tf.float32))
    return accuracy
    
    

### Step3: Train, Validate and Test the CNN-FMP Model
In this part, we also save all the summaries, which are displayed in Tensorboard.
We train the model using small batches(100) due to the GPU capacity. Also, for validation set, we validate model using a 1000 random sample from the validation 10000 data set for the same reason. Test accuracy is calculated using batches of 1000 and their average is calculated below. This result is for models comparison.

In [8]:
dir_name = 'logs/scratch04x/summary'
with tf.Graph().as_default():

    x = tf.placeholder(tf.float32, [None, 32*32*3], name='x')
    y = tf.placeholder(tf.float32, [None, 10], name='y')

    
 
    with tf.name_scope('model'):
        logits = compute_logits(x)
    with tf.name_scope('loss'):
        loss = compute_cross_entropy(logits=logits, y=y)
    with tf.name_scope('accuracy'):
        accuracy = compute_accuracy(logits, y)
    
    with tf.name_scope('opt'):

        opt = tf.train.AdamOptimizer(1e-4)
        train_step = opt.minimize(loss)
    
    with tf.name_scope('summaries'):
        # create summary for loss and accuracy
        tf.summary.scalar('loss', loss) 
        tf.summary.scalar('accuracy', accuracy)
        # create summary for logits
        tf.summary.histogram('logits', logits)
        # create summary for input image
        tf.summary.image('input', tf.reshape(x, [-1, 32, 32, 3]))
    
        summary_op = tf.summary.merge_all()
    
    with tf.Session() as sess:
        summary_writer = tf.summary.FileWriter(dir_name, sess.graph)
        summary_writer_train = tf.summary.FileWriter(dir_name+'/train', sess.graph)
        summary_writer_test = tf.summary.FileWriter(dir_name+'/test')
        summary_writer_val = tf.summary.FileWriter(dir_name+'/val')
        sess.run(tf.global_variables_initializer())
        batch=0
        for i in range(10001):
            
             
            _ , summary = sess.run((train_step, summary_op),
                                feed_dict={x: x_train[100*batch:100*(batch+1)], y: y_train0[100*batch:100*(batch+1)]})
            batch=batch+1
            if batch == 50:
                batch=0
            if i%10==0:
                t=rd.sample(range(0,10000),1000)
                summary_writer_train.add_summary(summary, i)
                (val_accuracy, summary_t) = sess.run((accuracy,summary_op), {x:x_val[t], y:y_val0[t]})
                summary_writer_val.add_summary(summary_t, i)
                if i%1000 == 0:
                    print("\rAfter step {0:3d}, valiation accuracy {1:0.4f}".format(i, val_accuracy), flush=True)
        #calculate test error
        test_ac=[]
        for i in range(0,10):
            acu = sess.run(accuracy,{x:x_test[1000*i:1000*(i+1)],y:y_test0[1000*i:1000*(i+1)]}）
            test_ac.append(acu)
        all_ac = sum(test_ac)*0.1
        print("\rFinal test accuracy is %.4f"all_ac)

    

After step   0, valiation accuracy 0.0950
After step 1000, valiation accuracy 0.3360
After step 2000, valiation accuracy 0.4430
After step 3000, valiation accuracy 0.5390
After step 4000, valiation accuracy 0.5530
After step 5000, valiation accuracy 0.5620
After step 6000, valiation accuracy 0.5860
After step 7000, valiation accuracy 0.5830
After step 8000, valiation accuracy 0.5900
After step 9000, valiation accuracy 0.6200
After step 10000, valiation accuracy 0.6190


IndexError: tuple index out of range

### Conclusion ###
According to the validation accuracy and test accuracy, it seems that cnn-fmp model doesn't outperform the original all-convolutional network. Compared to the input data dimension, our filter size is too big without expanding the data dimension, which creating many variables to calculate gradients, also might cause overfitting. According to our tensorboard output, we actually have over-fitting issues. Maybe we can work on the variables selection part when developing this model. Also, a better GPU might let us try to bigger dimension input data.

In [11]:
test_ac


NameError: name 'test_ac' is not defined