Multi Layer Perceptron
=============

The Multi Layer Perceptron (MLP) is an extension of the classical [Perceptron](https://en.wikipedia.org/wiki/Perceptron) with one or more hidden layers.

In its classical form, the MLP consists of an input layer, a hidden layer and an output layer. The transfer function applied to the hidden layer and to the output is a standard sigmoid. The loss function can be defined as the mean squared error between the output and the labels.
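To make the description concrete, here is a minimal sketch in plain Python of a forward pass through a sigmoid MLP followed by the mean squared error. The 2-2-1 layout, the weights and the function names are hypothetical and chosen only for illustration; they are not the values learned later in this notebook.

```python
import math

def sigmoid(z):
    # Standard logistic function
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    # weights[j][i] connects input i to unit j; sigmoid is applied to each unit
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    # Input layer -> hidden layer -> output layer
    hidden = dense(x, hidden_w, hidden_b)
    return dense(hidden, out_w, out_b)

def mean_squared_error(outputs, labels):
    return sum((o - t) ** 2 for o, t in zip(outputs, labels)) / float(len(labels))

# Hypothetical weights for a 2-2-1 network (illustration only)
hidden_w = [[1.0, -1.0], [-1.0, 1.0]]
hidden_b = [0.0, 0.0]
out_w = [[1.0, 1.0]]
out_b = [-1.0]

y = mlp_forward([0.0, 0.0], hidden_w, hidden_b, out_w, out_b)
loss = mean_squared_error(y, [0.0])
```

With these weights and the input `[0.0, 0.0]`, both hidden units output 0.5, so the output unit receives a pre-activation of 0 and also outputs 0.5.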

Implementing the model in Tensorflow
------------------------------------------

It is straightforward to implement the model in Tensorflow. Using the `tf.layers` facilities we can define a perceptron in three lines of code. Here I will use the implementation based on the `Estimator` class, which requires wrapping the model in a function and passing that function to the estimator object. The model is automatically stored in a folder (specified when you create the estimator) and checkpoints are saved during training. Thanks to this, you can resume the training or load a specific checkpoint.

In [None]:
import tensorflow as tf

In [None]:
def my_model_fn(features, labels, mode):
    #Defining the MLP model
    x = tf.reshape(features, [-1, 2])
    w = tf.layers.dense(inputs=x, units=8, activation=tf.nn.sigmoid)   
    y = tf.layers.dense(inputs=w, units=1, activation=tf.nn.sigmoid)
    #PREDICT mode
    if mode == tf.estimator.ModeKeys.PREDICT:
        predictions = {"classes": tf.round(y),
                       "probabilities": y}
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
    #TRAIN mode
    elif mode == tf.estimator.ModeKeys.TRAIN:
        loss = tf.losses.mean_squared_error(labels=labels, predictions=y)
        #optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        optimizer = tf.train.AdamOptimizer()
        train_op = optimizer.minimize(loss=loss, global_step=tf.train.get_global_step())
        accuracy = tf.metrics.accuracy(labels=labels, predictions=tf.round(y))
        tf.summary.scalar('accuracy', accuracy[1]) #<-- accuracy[1] to grab the value
        logging_hook = tf.train.LoggingTensorHook({"accuracy" : accuracy[1]}, every_n_iter=250)
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op, training_hooks =[logging_hook])
    #EVAL mode
    elif mode == tf.estimator.ModeKeys.EVAL:
        loss = tf.losses.mean_squared_error(labels=labels, predictions=y)
        accuracy = tf.metrics.accuracy(labels=labels, predictions=tf.round(y))
        eval_metric = {"accuracy": accuracy}
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, eval_metric_ops=eval_metric)

In [None]:
mlp = tf.estimator.Estimator(model_fn=my_model_fn, model_dir="./tf_mlp_model")

Training the model
---------------------

Once the model is ready, we can train it on a dataset. Here I will use the **XOR dataset** that was created in [another notebook](../xor/xor.ipynb) of this repository. You do not have to run that notebook, since a version of the dataset is included in TensorBag and is ready to be used. The Tensorflow `Estimator` class requires an input function to be passed to the trainer. Here I define this function and parse the dataset, which is stored in TFRecord format. The dataset is loaded as a Tensorflow `Dataset` object, which makes it very easy to return samples from it. Remember that you can monitor the training with **Tensorboard** by passing the model folder through the `--logdir` parameter from the terminal.
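As a rough illustration of what an input pipeline of this kind does, the sketch below shuffles a list of examples and groups them into fixed-size batches in plain Python. It is not the Tensorflow `Dataset` API: the helper name `shuffled_batches` and the toy data are hypothetical, and the sketch covers a single pass only, without the repetition used during training.

```python
import random

def shuffled_batches(examples, batch_size, seed=0):
    """Shuffle a list of examples once and split it into batches."""
    rng = random.Random(seed)
    shuffled = list(examples)  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    # Slice the shuffled list into consecutive chunks of batch_size
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]

examples = list(range(10))  # toy stand-in for the parsed records
batches = shuffled_batches(examples, batch_size=4)
print([len(b) for b in batches])  # [4, 4, 2]
```

When the number of examples is not a multiple of the batch size, the last batch of a single pass is smaller than the others.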

In [None]:
def my_input_fn():  
    def _parse_function(example_proto):
        features = {"feature": tf.VarLenFeature(tf.float32),
                    "label": tf.FixedLenFeature((), tf.int64, default_value=0)}
        parsed_features = tf.parse_single_example(example_proto, features)
        feature = tf.cast(parsed_features["feature"], tf.float32)
        feature = tf.sparse_tensor_to_dense(feature, default_value=0)
        label = tf.reshape(parsed_features["label"], [1])
        label = tf.cast(label, tf.float32)
        return feature, label

    tf_train_dataset = tf.data.TFRecordDataset("../xor/xor_train.tfrecord")
    tf_train_dataset = tf_train_dataset.map(_parse_function)
    # cache() returns a new dataset, so the result must be assigned
    tf_train_dataset = tf_train_dataset.cache() # caches entire dataset
    #Setting a buffer_size greater than the number of examples in the Dataset 
    #ensures that the data is completely shuffled. 
    tf_train_dataset = tf_train_dataset.shuffle(buffer_size = 8000 * 2) # shuffle all the elements
    #The repeat method has the Dataset restart when it reaches the end.
    tf_train_dataset = tf_train_dataset.repeat() # no count given: repeats indefinitely
    #The batch method collects a number of examples and stacks them, to create batches. 
    #This adds a dimension to their shape. The new dimension is added as the first dimension.
    #The batch may have unknown batch size because the last batch can have fewer elements.
    tf_train_dataset = tf_train_dataset.batch(32) # batch size
    
    iterator = tf_train_dataset.make_one_shot_iterator()
    batch_features, batch_labels = iterator.get_next()
    return batch_features, batch_labels

In [None]:
tf.logging.set_verbosity(tf.logging.INFO)

In [None]:
mlp.train(input_fn=my_input_fn, steps=5000)

Evaluation on the test set
------------------------------

The XOR dataset also has a test set that can be used to estimate the accuracy of the MLP.
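The accuracy reported by the evaluation compares the rounded sigmoid outputs with the labels. A minimal plain-Python sketch of that computation, using made-up probabilities and labels rather than real outputs of the model, could look like this:

```python
def accuracy(probabilities, labels):
    # Threshold the sigmoid outputs at 0.5 to obtain hard class predictions
    predicted = [1.0 if p > 0.5 else 0.0 for p in probabilities]
    correct = sum(1 for p, t in zip(predicted, labels) if p == t)
    return correct / float(len(labels))

# Hypothetical outputs and labels (illustration only)
probs = [0.9, 0.2, 0.6, 0.4]
labels = [1.0, 0.0, 0.0, 0.0]
acc = accuracy(probs, labels)  # 3 of the 4 predictions match the labels
```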

In [None]:
def my_eval_input_fn():
    def _parse_function(example_proto):
        features = {"feature": tf.VarLenFeature(tf.float32),
                    "label": tf.FixedLenFeature((), tf.int64, default_value=0)}
        parsed_features = tf.parse_single_example(example_proto, features)
        feature = tf.cast(parsed_features["feature"], tf.float32)
        feature = tf.sparse_tensor_to_dense(feature, default_value=0)
        label = tf.reshape(parsed_features["label"], [1])
        label = tf.cast(label, tf.float32)
        return feature, label

    tf_test_dataset = tf.data.TFRecordDataset("../xor/xor_test.tfrecord")
    tf_test_dataset = tf_test_dataset.map(_parse_function)
    # cache() returns a new dataset, so the result must be assigned
    tf_test_dataset = tf_test_dataset.cache() # caches entire dataset
    tf_test_dataset = tf_test_dataset.repeat(1) # a single pass over the test set
    tf_test_dataset = tf_test_dataset.batch(1) # batch size  
    
    iterator_test = tf_test_dataset.make_one_shot_iterator()
    batch_features, batch_labels = iterator_test.get_next()
    return batch_features, batch_labels

In [None]:
mlp.evaluate(input_fn=my_eval_input_fn, steps=2000)

Using the model on new data
--------------------------------

To use the model on custom data, call the `predict()` method of the estimator class.

In [None]:
def my_predict_input_fn():
    feature_batch = tf.constant([[3.5, 2.9], [3.5, -2.9], [-3.5, 2.9], [-3.5, -2.9]])
    
    tf_predict_dataset = tf.data.Dataset.from_tensor_slices(feature_batch)
    tf_predict_dataset = tf_predict_dataset.repeat(1)
    
    iterator_predict = tf_predict_dataset.make_one_shot_iterator()
    batch_features = iterator_predict.get_next()
    return batch_features

In [None]:
predictions = mlp.predict(input_fn=my_predict_input_fn)

for prediction in predictions:
    print("Predicted class: " + str(prediction['classes']))
    print("Probabilities: " + str(prediction['probabilities']))
    print("")