# Multilayer Perceptron
This notebook is quick MLP (multilayer perceptron) predicting digit for [kaggle competition](https://www.kaggle.com/c/digit-recognizer). 

Please consider follow before reading this notebook.
- We don't verify data in this notebook.
- MLP isn't the suitable model for image classification, but digit recognition problem is simply enough to use, however for more complex problem consider CNN (Convolution Neural Network)
- I would like to make this notebook simple enough for anyone who new to MLP can follow the code easily.

## 1. Load data

In [1]:
import pandas as pd
import tensorflow as tf

train_dataset = pd.read_csv('dataset/train.csv')
test_dataset = pd.read_csv('dataset/test.csv')

In [2]:
print("Train Dataset")
print(train_dataset[:5])
print("Test Dataset")
print(test_dataset[:5])
print("Train Size : {}".format(train_dataset.shape[0]))
print("Test Size : {}".format(test_dataset.shape[0]))

Train Dataset
   label  pixel0  pixel1  pixel2  pixel3  pixel4  pixel5  pixel6  pixel7  \
0      1       0       0       0       0       0       0       0       0   
1      0       0       0       0       0       0       0       0       0   
2      1       0       0       0       0       0       0       0       0   
3      4       0       0       0       0       0       0       0       0   
4      0       0       0       0       0       0       0       0       0   

   pixel8    ...     pixel774  pixel775  pixel776  pixel777  pixel778  \
0       0    ...            0         0         0         0         0   
1       0    ...            0         0         0         0         0   
2       0    ...            0         0         0         0         0   
3       0    ...            0         0         0         0         0   
4       0    ...            0         0         0         0         0   

   pixel779  pixel780  pixel781  pixel782  pixel783  
0         0         0         0     

## 2. Prepare Data
Before training we need to prepare data a little bit. here is the detail
- We need to scale each pixel to the range of 0 - 1
- Since Kaggle don't have validate data for us,so we have to split some from training data.
- We need to spearate label (train_y) out of feature (train_x) so we can feed it to our network as label and feature. 

In [3]:
from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import LabelBinarizer

validation_ratio = 0.1

pixel_scaler = MinMaxScaler()
one_hot = LabelBinarizer()
one_hot.fit(range(10))
train_y = one_hot.transform(train_dataset['label'])
train_x = train_dataset.drop('label', axis=1)
train_x = pixel_scaler.fit_transform(train_x)
test_x = pixel_scaler.transform(test_dataset)

split = int(train_dataset.shape[0] * (1 - validation_ratio))

valid_x = train_x[split:train_x.shape[0]]
valid_y = train_y[split:train_x.shape[0]]
train_x = train_x[0:split]
train_y = train_y[0:split]
print("Train Size : {}".format(train_x.shape[0]))
print("Validation Size : {}".format(valid_x.shape[0]))

Train Size : 37800
Validation Size : 4200


## 3. Create Model
Below is the model code.
- Initialize DigitalRecognizerMLP at first.
- Call make_nn() to set up tensorflow graph.
- Call train() for train network after training the network has been stored and ready for prediction.
- Call predict() for make a prediction with the latest saved network.

In [15]:
class DigitRecognizerMLP(object):
    def __init__(self, feature_size=784, label_size=10):
        self.feature_size = feature_size
        self.label_size = label_size
    
    def build_input(self):
        model_x = tf.placeholder(tf.float32, [None, self.feature_size])
        model_y = tf.placeholder(tf.int32, [None, self.label_size])
        model_lr = tf.placeholder(tf.float32)
        return model_x, model_y, model_lr
    
    def build_mlp(self, model_x, model_y):
        hidden_1 = tf.layers.dense(model_x, 1024, activation=tf.nn.relu)
        hidden_2 = tf.layers.dense(hidden_1, 512, activation=tf.nn.relu)
        hidden_3 = tf.layers.dense(hidden_2, 256, activation=tf.nn.relu)
        hidden_4 = tf.layers.dense(hidden_2, 128, activation=tf.nn.relu)
        logits = tf.layers.dense(hidden_3, self.label_size, activation=None)
        return logits
    
    def build_output(self, model_y, logits):
        output = tf.argmax(logits, axis=1)
        loss = tf.losses.softmax_cross_entropy(model_y, logits)
        accuracy = tf.reduce_mean( tf.cast( tf.equal( tf.argmax(model_y, axis=1), output ), tf.float32) )
        return output, loss, accuracy
    
    def model_opt(self, model_lr, loss):
        opt = tf.train.AdamOptimizer(model_lr).minimize(loss)
        return opt
    
    def make_nn(self):
        tf.reset_default_graph()
        self.model_x, self.model_y, self.model_lr = self.build_input()
        self.logits = self.build_mlp(self.model_x, self.model_y)
        self.output, self.loss, self.accuracy = self.build_output(self.model_y, self.logits)
        self.opt = self.model_opt(self.model_lr, self.loss)
        
    def train(self, epoch, learning_rate, train_x, train_y, batch_size, valid_x, valid_y, get_batch_func, print_every):
        t_loss_list = []
        v_loss_list = []
        accuracy_list = []
        saver = tf.train.Saver()
        with tf.Session() as sess :
            sess.run(tf.global_variables_initializer())
            counter = 0
            for e in range(epoch) :
                for x,y in get_batch_func(train_x, train_y, batch_size):
                    feed_dict = {self.model_x:x, self.model_y:y, self.model_lr:learning_rate}
                    loss, _ = sess.run([self.loss, self.opt], feed_dict=feed_dict)
                    if counter % print_every == 0 :
                        feed_dict = {self.model_x:valid_x, self.model_y:valid_y, self.model_lr:learning_rate}
                        v_loss, accuracy = sess.run([self.loss, self.accuracy], feed_dict=feed_dict)
                        print("Epoch: {}/{}, Step: {}, T Loss: {:.4f}, V Loss: {:.4f}, Accuracy: {:.4f}".format(e+1, 
                                    epoch, counter, loss, v_loss, accuracy))
                        t_loss_list.append( loss )
                        v_loss_list.append( v_loss )
                        accuracy_list.append( accuracy )
                        save_path = saver.save(sess, "tensor/model.ckpt")
                    counter += 1
        return t_loss_list, v_loss_list, accuracy_list
    
    def predict(self, predict_x):
        saver = tf.train.Saver()
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            saver.restore(sess, "tensor/model.ckpt")
            feed_dict = {self.model_x:predict_x}
            output = sess.run(self.output, feed_dict=feed_dict)
            return output

In [16]:
# For Testing --- Delete after use
dr_mlp = DigitRecognizerMLP()
dr_mlp.make_nn()

In [17]:
def get_batch(feature, label, batch_size):
    num_batch = feature.shape[0] // batch_size
    for i in range(num_batch):
        start = i * batch_size
        end = (i + 1) * batch_size
        x = feature[start:end]
        y = label[start:end]
        yield x, y

## 4. Train

In [18]:
learning_rate = 0.0001
batch_size = 64
epoch = 50
print_every = 1000

mlp = DigitRecognizerMLP()
mlp.make_nn()
t_loss, v_loss, acc = mlp.train(epoch, learning_rate, train_x, train_y, batch_size, valid_x, valid_y, get_batch, print_every)

Epoch: 1/50, Step: 0, T Loss: 2.3151, V Loss: 2.2903, Accuracy: 0.1014
Epoch: 2/50, Step: 1000, T Loss: 0.0937, V Loss: 0.1498, Accuracy: 0.9550
Epoch: 4/50, Step: 2000, T Loss: 0.0378, V Loss: 0.1087, Accuracy: 0.9671
Epoch: 6/50, Step: 3000, T Loss: 0.0262, V Loss: 0.0960, Accuracy: 0.9695
Epoch: 7/50, Step: 4000, T Loss: 0.0228, V Loss: 0.0870, Accuracy: 0.9726
Epoch: 9/50, Step: 5000, T Loss: 0.1768, V Loss: 0.0906, Accuracy: 0.9729
Epoch: 11/50, Step: 6000, T Loss: 0.0057, V Loss: 0.0971, Accuracy: 0.9740
Epoch: 12/50, Step: 7000, T Loss: 0.0188, V Loss: 0.1006, Accuracy: 0.9743
Epoch: 14/50, Step: 8000, T Loss: 0.0004, V Loss: 0.1107, Accuracy: 0.9729
Epoch: 16/50, Step: 9000, T Loss: 0.0022, V Loss: 0.0959, Accuracy: 0.9779
Epoch: 17/50, Step: 10000, T Loss: 0.0003, V Loss: 0.1017, Accuracy: 0.9779
Epoch: 19/50, Step: 11000, T Loss: 0.0000, V Loss: 0.0972, Accuracy: 0.9790
Epoch: 21/50, Step: 12000, T Loss: 0.0001, V Loss: 0.1041, Accuracy: 0.9779
Epoch: 23/50, Step: 13000, T Lo

## 5. Predict

In [8]:
mlp = DigitRecognizerMLP()
mlp.make_nn()
predict = mlp.predict(test_x)

INFO:tensorflow:Restoring parameters from tensor/model.ckpt


## 6. Make CSV
This is just making csv file as kaggle required for submittion.

In [14]:
import csv

with open('predict.csv', 'w') as csv_file :
    writer = csv.writer(csv_file)
    writer.writerow(['ImageId', 'Label'])
    for i in range(len(predict)) :
        writer.writerow([str(i+1), str(predict[i])])
    csv_file.close()