Here is the multi-layer perceptron in raw TensorFlow. A little bit of math will show through, and we'll end up writing our own training loop.

Once you see how much boilerplate is involved in TensorFlow, you'll really appreciate the Keras solution!

In [1]:
import tensorflow as tf
import numpy as np



Here we control the size of our network.

In [2]:
num_inputs = 28 * 28 # MNIST image size
num_hidden = 256 # units in each hidden layer
num_outputs = 10 # 10 output digits
batch_size = 64 # mini batch
epochs = 10 # total passes over the training set
learning_rate = 0.01 # amount we update parameters
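
To get a feel for what these numbers imply, here is a small back-of-the-envelope sketch. It assumes the two-hidden-layer network built later in this notebook, and it assumes the training split holds 55,000 images (the default split produced by `read_data_sets`); `num_train_examples` and `layer_shapes` are names made up for this illustration.

```python
# Hyperparameters copied from the cell above
num_inputs = 28 * 28
num_hidden = 256
num_outputs = 10
batch_size = 64

# Assumption: 55,000 training images (the default train split of read_data_sets)
num_train_examples = 55000

# One (rows, cols) weight matrix per layer: input->h1, h1->h2, h2->output
layer_shapes = [
    (num_inputs, num_hidden),
    (num_hidden, num_hidden),
    (num_hidden, num_outputs),
]

num_weights = sum(rows * cols for rows, cols in layer_shapes)
num_biases = sum(cols for _, cols in layer_shapes)  # one bias per output unit
batches_per_epoch = num_train_examples // batch_size

print("trainable parameters:", num_weights + num_biases)  # 269322
print("batches per epoch:", batches_per_epoch)            # 859
```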

We'll use the MNIST digits. They're really convenient: essentially every library has them built in. TensorFlow has a relatively convenient `one_hot` option too!

In [3]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use urllib or similar directly.
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
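
To make the `one_hot` option concrete, here is a minimal NumPy sketch of what one-hot encoding does to a digit label. The `one_hot` helper below is a hypothetical stand-in written for this illustration, not the loader's own code.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Turn integer class labels into rows with a single 1 at the label index."""
    # Row i of the identity matrix is the one-hot vector for class i
    return np.eye(num_classes, dtype=np.float32)[labels]

labels = np.array([5, 0, 4])
encoded = one_hot(labels)

print(encoded[0])              # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
print(encoded.argmax(axis=1))  # [5 0 4]
```

Taking `argmax` along each row recovers the original integer label, which is the same trick the accuracy calculation uses later on.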

We start with the input placeholders, which are where we will feed data.

In [4]:
# tf Graph input
x = tf.placeholder("float", [None, num_inputs])
y = tf.placeholder("float", [None, num_outputs])

Next come the variables, which are the learned part of the network. We'll have weights and biases, in the classic network style.

In [5]:
weights = {
    'h1': tf.Variable(tf.random_normal([num_inputs, num_hidden])),
    'h2': tf.Variable(tf.random_normal([num_hidden, num_hidden])),
    'out': tf.Variable(tf.random_normal([num_hidden, num_outputs]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([num_hidden])),
    'b2': tf.Variable(tf.random_normal([num_hidden])),
    'out': tf.Variable(tf.random_normal([num_outputs]))
}

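Now the actual network construction. Each layer computes `xW + b`, so we can see the math a little bit. A `relu` activation (`tf.nn.relu`) would normally wrap each hidden layer, but note that the cell here leaves the hidden layers linear.

Two hidden layers, followed by a final output layer that produces the logits. Softmax is applied to those logits inside the loss function.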

In [6]:
layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
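
As a rough illustration of the `xW + b` plus `relu` pattern described above, here is the same forward pass sketched in plain NumPy. The `relu` and `forward` helpers are names invented for this example, and the random weights are only there to make the shapes visible.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: negative values become zero."""
    return np.maximum(z, 0.0)

def forward(x, weights, biases):
    """Two hidden layers with relu, then a linear output layer (logits)."""
    layer_1 = relu(x @ weights['h1'] + biases['b1'])
    layer_2 = relu(layer_1 @ weights['h2'] + biases['b2'])
    return layer_2 @ weights['out'] + biases['out']

rng = np.random.default_rng(0)
num_inputs, num_hidden, num_outputs = 28 * 28, 256, 10

weights = {
    'h1': rng.normal(size=(num_inputs, num_hidden)),
    'h2': rng.normal(size=(num_hidden, num_hidden)),
    'out': rng.normal(size=(num_hidden, num_outputs)),
}
biases = {
    'b1': rng.normal(size=num_hidden),
    'b2': rng.normal(size=num_hidden),
    'out': rng.normal(size=num_outputs),
}

x_batch = rng.random((4, num_inputs))  # four fake "images" with pixels in [0, 1)
logits = forward(x_batch, weights, biases)
print(logits.shape)  # (4, 10)
```

In the TensorFlow graph, the equivalent change would be wrapping each hidden layer in `tf.nn.relu(...)`.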

And now the standard formula: a loss function, here cross entropy, which is a great choice for multiclass problems, and good old stochastic gradient descent.

In [7]:
# Minimize error using cross entropy
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=out_layer, labels=y))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See @{tf.nn.softmax_cross_entropy_with_logits_v2}.
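
To show the math hiding behind those two lines, here is a NumPy sketch of softmax cross entropy and a single gradient descent update. It is an illustration under simplifying assumptions, not TensorFlow's implementation: `softmax_cross_entropy` and `sgd_step` are hypothetical helper names, and only a bias vector is updated to keep the example short.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-example cross entropy between softmax(logits) and one-hot labels."""
    # Subtract the row max before exponentiating for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

def sgd_step(param, grad, learning_rate=0.01):
    """Plain gradient descent: move against the gradient."""
    return param - learning_rate * grad

logits = np.zeros((2, 10))   # a network that has no opinion yet
labels = np.eye(10)[[3, 7]]  # true classes 3 and 7, one-hot encoded

loss = softmax_cross_entropy(logits, labels).mean()
print(round(float(loss), 4))  # 2.3026, i.e. log(10)

# Gradient of the mean loss with respect to the logits: (softmax - labels) / batch
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
grad_logits = (probs - labels) / logits.shape[0]

# An output bias is added to every row of logits, so its gradient sums over the batch
out_bias = np.zeros(10)
new_bias = sgd_step(out_bias, grad_logits.sum(axis=0))
print(new_bias[3] > 0, new_bias[0] < 0)  # True True
```

The update nudges the bias of the correct classes up and every other class down, which is what `minimize(loss)` does for all the variables at once.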



Time to initialize all the global variables.

In [8]:
init = tf.global_variables_initializer()

And now -- the training loop. We pull out batches of X and Y (images and labels), run them through the loss, and update the variables with the optimizer.

In [9]:
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Build the accuracy ops once, outside the loop, so the graph does not grow every epoch
    correct_prediction = tf.equal(tf.argmax(out_layer, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # Training cycle
    total_batch = mnist.train.num_examples // batch_size
    for epoch in range(epochs):
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run the optimization op (backprop) and the loss op (to get the loss value)
            _, c = sess.run([optimizer, loss], feed_dict={x: batch_xs,
                                                          y: batch_ys})

        # c is the loss on the last mini-batch of the epoch, not an epoch average
        print("Epoch:", '%04d' % epoch, "cost=", "{:.9f}".format(c))
        # Calculate accuracy on the test set
        print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))

Epoch: 0000 cost= 98.675399780
Accuracy: 0.8626
Epoch: 0001 cost= 61.107852936
Accuracy: 0.8659
Epoch: 0002 cost= 98.604949951
Accuracy: 0.8701
Epoch: 0003 cost= 31.784706116
Accuracy: 0.8841
Epoch: 0004 cost= 32.108734131
Accuracy: 0.8614
Epoch: 0005 cost= 37.851261139
Accuracy: 0.8712
Epoch: 0006 cost= 29.841491699
Accuracy: 0.8797
Epoch: 0007 cost= 24.118888855
Accuracy: 0.8715
Epoch: 0008 cost= 19.677768707
Accuracy: 0.882
Epoch: 0009 cost= 34.726871490
Accuracy: 0.8756
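
The accuracy calculation in the training cell compares the `argmax` of the network output with the `argmax` of the one-hot labels. Here is the same idea as a tiny NumPy sketch; the `accuracy` helper and the toy three-class data are made up for illustration.

```python
import numpy as np

def accuracy(logits, one_hot_labels):
    """Fraction of rows where the highest-scoring class matches the label."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))

# Four toy examples with three classes each
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.9, 0.2, 0.4],   # predicts class 0, but the label is 2
                   [0.1, 0.3, 2.2]])
labels = np.eye(3)[[0, 1, 2, 2]]

print(accuracy(logits, labels))  # 0.75
```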
