Let’s first make TensorFlow perform our little example calculation:

a = (b + c) * (c + 2)

This can be broken down into three steps:

d = b + c

e = c + 2

a = d * e

In [1]:
import tensorflow as tf
import numpy as np

In [2]:
# first, create a TensorFlow constant
const = tf.constant(2.0, name="const")

In [3]:
# create TensorFlow variables
b = tf.Variable(2.0, name='b')
c = tf.Variable(1.0, name='c')

In [4]:
# now create some operations
d = tf.add(b, c, name='d')
e = tf.add(c, const, name='e')
a = tf.multiply(d, e, name='a')

The next step is to set up an operation that initialises the variables and the graph structure:

In [5]:
# set up the variable initialisation
init_op = tf.global_variables_initializer()

A TensorFlow session is the object in which operations are actually run; nothing in the graph is computed until it is executed inside a session.

In [6]:
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    # compute the output of the graph
    a_out = sess.run(a)
    print("Variable a is {}".format(a_out))

Variable a is 9.0
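The result is easy to verify by hand. The sketch below repeats the same steps in plain Python, using the same values as the variables above (b = 2.0, c = 1.0). It is only a check of the expected numbers, not TensorFlow code.

```python
# Hand check of a = (b + c) * (c + 2) using ordinary Python floats
b = 2.0
c = 1.0

d = b + c        # first intermediate step
e = c + 2.0      # second intermediate step
a = d * e        # final result

print("a =", a)
```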


# The TensorFlow placeholder
Now suppose we don’t know what the value of b will be during the declaration phase of the TensorFlow problem (i.e. before the with tf.Session() as sess stage). In this case, TensorFlow requires us to declare the basic structure of the data with a tf.placeholder declaration, which is what the cells below do for b.

Because the declaration provides no initial value, we need to tell TensorFlow what data type each element of the tensor will have; here we use tf.float32. The second argument is the shape of the data that will be “injected” into the placeholder. We want a (? x 1) sized array: because we are being cagey about how much data we will supply (hence the “?”), the placeholder accepts None for that dimension. We can then inject as many rows of single-column data as we like into b.

When the graph is run, the feed_dict argument removes the mystery and specifies exactly what b is to be: a single column holding the integers 0 to 9 (np.arange(0, 10) excludes the end point).

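In [7]:
# declare b as a placeholder instead of a variable
b = tf.placeholder(tf.float32, [None, 1], name='b')
# rebuild the operations so that they depend on the placeholder b
d = tf.add(b, c, name='d')
e = tf.add(c, const, name='e')
a = tf.multiply(d, e, name='a')

In [8]:
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    # compute the output of the graph, feeding a (10 x 1) column into b
    a_out = sess.run(a, feed_dict={b: np.arange(0, 10)[:, np.newaxis]})
    print("Variable a is {}".format(a_out))

Variable a is [[ 3.]
 [ 6.]
 [ 9.]
 [12.]
 [15.]
 [18.]
 [21.]
 [24.]
 [27.]
 [30.]]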
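As a hand check, the values that a should take when the range 0 to 9 is fed into b can be worked out without TensorFlow. This sketch assumes that a is computed from the fed values of b as a = (b + c) * (c + 2), with c = 1.0 and the constant 2.0 from earlier. The names b_values and a_values are illustrative only.

```python
# Values from the earlier cells
c = 1.0
const = 2.0

# A (10 x 1) column of inputs, mirroring np.arange(0, 10)[:, np.newaxis]
b_values = [[float(i)] for i in range(10)]

# Apply a = (b + c) * (c + const) to every row
a_values = [[(row[0] + c) * (c + const)] for row in b_values]

for b_row, a_row in zip(b_values, a_values):
    print(b_row[0], "->", a_row[0])
```

Each input row produces one output row, so the result has the same (10 x 1) shape as the data that was fed in.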


# A Neural Network Example

The MNIST dataset is a set of 28×28 pixel grayscale images which represent hand-written digits. It has 55,000 training rows, 10,000 testing rows and 5,000 validation rows.

The one_hot=True argument specifies that the label associated with each image is not the digit itself (e.g. “4”) but a vector with a single “one hot” node set to one and all other nodes set to zero, i.e. [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]. This lets us easily feed it into the output layer of our neural network.

We can load the data by running:

In [9]:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
Instructions for updating:
Please write your own downloading logic.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
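To make the one-hot encoding concrete, here is a minimal sketch in plain Python. The helper one_hot is a hypothetical function written for illustration; it is not part of the MNIST loader or of TensorFlow.

```python
def one_hot(digit, num_classes=10):
    """Return a list of zeros with a single 1 at position `digit`."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in the range [0, num_classes)")
    vec = [0] * num_classes
    vec[digit] = 1
    return vec

print(one_hot(4))
```

The position of the single 1 identifies the digit, so the label can be recovered by taking the index of the largest entry.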


In [10]:
# Python optimisation variables
learning_rate = 0.01
epochs = 100
batch_size = 100

In [11]:
# declare the training data placeholders
# input x - for 28 x 28 pixels = 784
x = tf.placeholder(tf.float32, [None, 784])

In [12]:
# now declare the output data placeholder - 10 digits
y = tf.placeholder(tf.float32, [None, 10])

Now we need to set up the weight and bias variables for the three-layer neural network. There are always L-1 weight/bias tensor pairs, where L is the number of layers, so in this case we need two of each.

This neural network will have 300 nodes in the hidden layer, so the size of the weight tensor W1 is [784, 300]. We initialise the weights from a random normal distribution with a mean of zero and a standard deviation of 0.03. The bias b1 is drawn from the same kind of distribution, but with tf.random_normal's default standard deviation of 1.0.

In [13]:
# now declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')

Likewise, we create W2 and b2 variables to connect the hidden layer to the output layer of the neural network.

In [14]:
# and the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')
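The L-1 rule and the tensor shapes can be checked with a few lines of plain Python. This is a sketch that only does the bookkeeping for the layer sizes used here (784, 300, 10); the names layer_sizes, shapes and num_params are illustrative.

```python
# Number of nodes in the input, hidden and output layers
layer_sizes = [784, 300, 10]

# One (weight shape, bias shape) pair per connection between adjacent layers
shapes = []
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    shapes.append(((n_in, n_out), (n_out,)))

# Total number of trainable values across all weights and biases
num_params = sum(w[0] * w[1] + b[0] for w, b in shapes)

for w, b in shapes:
    print("weights:", w, "biases:", b)
print("trainable parameters:", num_params)
```

With three layers there are two weight/bias pairs, matching W1/b1 and W2/b2 above.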

Next, we have to set up the node inputs and activation functions of the hidden layer nodes:

In [15]:
# calculate the output of the hidden layer
hidden_out = tf.add(tf.matmul(x, W1), b1)
hidden_out = tf.nn.relu(hidden_out)

In [16]:
# now calculate the output layer - in this case, let's use a softmax activated
# output layer
y_ = tf.nn.softmax(tf.add(tf.matmul(hidden_out, W2), b2))
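For readers who want to see what the two activation functions do to a vector of node inputs, here is a small sketch in plain Python. The functions relu and softmax below are simplified stand-ins written for illustration, not the TensorFlow implementations; subtracting the maximum before exponentiating is a common numerical-stability choice and is an assumption of this sketch.

```python
import math

def relu(values):
    """Replace every negative entry with zero."""
    return [max(0.0, v) for v in values]

def softmax(logits):
    """Turn a list of scores into probabilities that sum to one."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]  # shift for stability
    total = sum(exps)
    return [e / total for e in exps]

hidden = relu([-1.5, 0.0, 2.0])
probs = softmax([1.0, 2.0, 3.0])
print(hidden)
print(probs, "sum =", sum(probs))
```

The softmax output can be read as one probability per class, which is why it pairs naturally with one-hot labels.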

Before computing the cost, we convert the output y_ to a clipped version, limited to the range 1e-10 to 0.9999999. This ensures that we never evaluate log(0) during training, which would produce infinite or NaN values and break the training process.

In [17]:
y_clipped = tf.clip_by_value(y_, 1e-10, 0.9999999)
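The effect of clipping can be illustrated in plain Python. The helper clip below is a hypothetical scalar stand-in for tf.clip_by_value, using the same bounds as the cell above.

```python
import math

def clip(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))

for p in (0.0, 0.5, 1.0):
    p_clipped = clip(p, 1e-10, 0.9999999)
    # Both logarithms are now finite, even for p = 0.0 and p = 1.0
    print(p, math.log(p_clipped), math.log(1 - p_clipped))
```

Without clipping, the first and last cases would ask for the logarithm of zero.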

In [18]:
# cross entropy calculation
cross_entropy = -tf.reduce_mean(tf.reduce_sum(y * tf.log(y_clipped)
                         + (1 - y) * tf.log(1 - y_clipped), axis=1))
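Written out, the cost computed in the cell above corresponds to the expression below. The notation is introduced here for illustration and is not from the original code: m is the number of samples in the batch, j runs over the 10 output nodes, y is the one-hot label and ŷ stands for y_clipped.

```latex
J = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{10}
    \left[ y_j^{(i)} \log \hat{y}_j^{(i)}
         + \left(1 - y_j^{(i)}\right) \log\left(1 - \hat{y}_j^{(i)}\right) \right]
```

The inner sum over j is the tf.reduce_sum along axis=1, and the average over the m samples is the tf.reduce_mean.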

# Let’s set up the optimiser in TensorFlow:

Here we are just using the gradient descent optimiser provided by TensorFlow. We initialise it with a learning rate, then specify what we want it to do, i.e. minimise the cross entropy cost operation we created. This function will then perform the gradient descent and the backpropagation for you. How easy is that? TensorFlow also provides a library of other popular neural network training optimisers.

In [19]:
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)
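To show the update rule that gradient descent applies, here is a minimal sketch on a one-dimensional toy cost. The function gradient_descent and the cost f(w) = (w - 3)^2 are assumptions made for illustration; TensorFlow's optimiser works on all the network's variables at once and computes the gradients itself through backpropagation.

```python
def gradient_descent(grad, w0, learning_rate, steps):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Toy cost f(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0),
                           w0=0.0, learning_rate=0.1, steps=100)
print(w_final)
```

The learning rate controls the size of each step, which is the role of learning_rate in the cell above.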

In [20]:
# finally set up the initialisation operator
init_op = tf.global_variables_initializer()

# define an accuracy assessment operation
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
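The accuracy operation compares the index of the largest entry in each label with the index of the largest entry in each prediction, then averages the matches. The plain-Python sketch below mirrors that logic on a made-up batch of four samples with three classes; argmax, accuracy_score and the data are illustrative only.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy_score(labels, predictions):
    """Fraction of rows where label and prediction agree on the class."""
    correct = [argmax(y) == argmax(p) for y, p in zip(labels, predictions)]
    return sum(correct) / len(correct)

labels = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
predictions = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1],
               [0.5, 0.2, 0.3], [0.2, 0.7, 0.1]]

print(accuracy_score(labels, predictions))
```

Three of the four made-up predictions pick the right class, so the sketch reports 0.75.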

# We now have everything we need to set up the training process of our neural network

In [21]:
# start the session
with tf.Session() as sess:
    # initialise the variables
    sess.run(init_op)
    total_batch = int(len(mnist.train.labels) / batch_size)
    for epoch in range(epochs):
        avg_cost = 0
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size=batch_size)
            # run one optimisation step and fetch the cost for this batch
            _, cost = sess.run([optimiser, cross_entropy], feed_dict={x: batch_x, y: batch_y})
            avg_cost += cost / total_batch
        print("Epoch:", (epoch + 1), "cost =", "{:.3f}".format(avg_cost))
    print(sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels}))

Epoch: 1 cost = 2.090
Epoch: 2 cost = 0.904
Epoch: 3 cost = 0.694
Epoch: 4 cost = 0.620
Epoch: 5 cost = 0.578
Epoch: 6 cost = 0.551
Epoch: 7 cost = 0.531
Epoch: 8 cost = 0.514
Epoch: 9 cost = 0.499
Epoch: 10 cost = 0.486
Epoch: 11 cost = 0.474
Epoch: 12 cost = 0.462
Epoch: 13 cost = 0.450
Epoch: 14 cost = 0.439
Epoch: 15 cost = 0.428
Epoch: 16 cost = 0.416
Epoch: 17 cost = 0.405
Epoch: 18 cost = 0.395
Epoch: 19 cost = 0.385
Epoch: 20 cost = 0.375
Epoch: 21 cost = 0.365
Epoch: 22 cost = 0.356
Epoch: 23 cost = 0.347
Epoch: 24 cost = 0.338
Epoch: 25 cost = 0.329
Epoch: 26 cost = 0.321
Epoch: 27 cost = 0.313
Epoch: 28 cost = 0.306
Epoch: 29 cost = 0.298
Epoch: 30 cost = 0.291
Epoch: 31 cost = 0.284
Epoch: 32 cost = 0.278
Epoch: 33 cost = 0.272
Epoch: 34 cost = 0.266
Epoch: 35 cost = 0.260
Epoch: 36 cost = 0.254
Epoch: 37 cost = 0.249
Epoch: 38 cost = 0.243
Epoch: 39 cost = 0.239
Epoch: 40 cost = 0.234
Epoch: 41 cost = 0.229
Epoch: 42 cost = 0.224
Epoch: 43 cost = 0.220
Epoch: 44 cost = 0.2