<a href="https://colab.research.google.com/github/tiensu/Coding-The-Deep-Learning-Revolution/blob/master/A_neural_network_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [0]:
import numpy as np
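# This notebook uses the TensorFlow 1.x graph API (placeholders and sessions).
# The compat.v1 module exposes that API under TensorFlow 2.x as well.
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()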
from tensorflow.keras.datasets import mnist

In [0]:
# Load data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [0]:
# Check data
print(x_train.shape)
print(x_test.shape)
print(y_train.shape)
print(y_test.shape)

The x data holds the images: the training set contains 60,000 grayscale images of 28 x 28 pixels
each, and the test set contains 10,000. Each pixel is an integer intensity from 0 (black) to 255
(white). The x data needs to be scaled to lie between 0 and 1, as this improves training
efficiency. The y data holds the matching labels, which indicate the digit shown in each image.
These need to be converted to “one hot” format.
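
As a rough illustration of the two preprocessing steps described above, the following NumPy sketch scales pixel values and one-hot encodes labels. The arrays `images` and `labels` are small hypothetical stand-ins, not the MNIST data, and the notebook itself performs both steps inside the TensorFlow graph rather than in NumPy.

```python
import numpy as np

# Hypothetical stand-ins for a tiny batch of images and labels (not real MNIST data)
images = np.array([[[0, 128], [255, 64]]], dtype=np.uint8)  # shape (1, 2, 2)
labels = np.array([3, 0, 9])

# Scale pixel intensities from the integer range [0, 255] to floats in [0.0, 1.0]
scaled = images.astype(np.float32) / 255.0

# One-hot encode: row i has a 1 in column labels[i] and 0 everywhere else
num_classes = 10
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

print(scaled.min(), scaled.max())
print(one_hot)
```

Indexing an identity matrix with the label vector is one common way to build one-hot rows; `tf.one_hot` plays the same role inside the graph.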

In [0]:
# Extract the training data in batches of samples
def get_batch(x_data, y_data, batch_size):
  idxs = np.random.randint(0, len(y_data), batch_size)
  return x_data[idxs, :, :], y_data[idxs]
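
A quick way to check the batching helper is to call it on synthetic arrays. The sketch below repeats `get_batch` so it runs on its own; `x_fake` and `y_fake` are hypothetical placeholders shaped like the MNIST arrays.

```python
import numpy as np

def get_batch(x_data, y_data, batch_size):
    # Same logic as the cell above: draw indices uniformly at random, with replacement
    idxs = np.random.randint(0, len(y_data), batch_size)
    return x_data[idxs, :, :], y_data[idxs]

# Hypothetical stand-in data with the same layout as the MNIST arrays
x_fake = np.zeros((1000, 28, 28), dtype=np.uint8)
y_fake = np.arange(1000) % 10

batch_x, batch_y = get_batch(x_fake, y_fake, batch_size=100)
print(batch_x.shape, batch_y.shape)
```

Because indices are drawn with replacement, a sample can appear more than once in a batch, and a pass of `len(y_data) / batch_size` batches is not guaranteed to visit every training sample.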

In [0]:
# Python optimisation variables
learning_rate = 0.5
epochs = 50
batch_size = 100

In [0]:
# Declare the training data placeholders
x = tf.placeholder(tf.float32, [None, 28, 28])
# Reshape input x - for 28x28 pixels = 784
x_rs = tf.reshape(x, [-1, 784])
# Scale the input data (maximum is 1.0, minimum is 0.0)
x_sc = tf.divide(x_rs, 255.0)
# Declare the output data placeholder - 10 digits
y = tf.placeholder(tf.int64, [None, 1])
# Convert the y data to one hot values
y_one_hot = tf.reshape(tf.one_hot(y, 10), [-1, 10])

In [0]:
# Declare the weights connecting the input to the hidden layer
W1 = tf.Variable(tf.random_normal([784, 300], stddev=0.03), name='W1')
b1 = tf.Variable(tf.random_normal([300]), name='b1')

# Declare the weights connecting the hidden layer to the output layer
W2 = tf.Variable(tf.random_normal([300, 10], stddev=0.03), name='W2')
b2 = tf.Variable(tf.random_normal([10]), name='b2')

In [0]:
# Calculate the output of hidden layer
hidden_out = tf.add(tf.matmul(x_sc, W1), b1)
hidden_out = tf.nn.relu(hidden_out)

# Calculate the output layer logits - no activation function applied
logits = tf.add(tf.matmul(hidden_out, W2), b2)
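
To make the shapes in the forward pass concrete, here is a NumPy sketch of the same two-layer computation. The names `x_batch`, `W1_np`, `b1_np`, `W2_np` and `b2_np` are hypothetical stand-ins with the same shapes as the graph variables; the values are random and unrelated to any trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 5 flattened, already scaled images
x_batch = rng.random((5, 784)).astype(np.float32)

# Parameters with the same shapes as W1, b1, W2 and b2 in the graph
W1_np = rng.normal(0.0, 0.03, size=(784, 300)).astype(np.float32)
b1_np = rng.normal(size=300).astype(np.float32)
W2_np = rng.normal(0.0, 0.03, size=(300, 10)).astype(np.float32)
b2_np = rng.normal(size=10).astype(np.float32)

# Hidden layer: affine transform followed by ReLU
hidden = np.maximum(x_batch @ W1_np + b1_np, 0.0)

# Output layer: affine transform only, producing one logit per digit class
logits_np = hidden @ W2_np + b2_np

print(hidden.shape, logits_np.shape)
```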

In [0]:
# Define the cost function which we are going to train the model on
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_one_hot, logits=logits))
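
The cost above is the mean softmax cross-entropy over the batch. The NumPy sketch below shows one numerically stable way to compute the same quantity; `softmax_cross_entropy`, `logits_demo` and `labels_demo` are hypothetical names used only for illustration, and the TensorFlow op may differ in implementation detail.

```python
import numpy as np

def softmax_cross_entropy(logits, one_hot_labels):
    # Subtract the row-wise maximum so that exp() cannot overflow
    shifted = logits - logits.max(axis=1, keepdims=True)
    # Log-softmax: shifted logits minus the log of the normalising sum
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Cross-entropy per example, then the mean over the batch
    per_example = -(one_hot_labels * log_probs).sum(axis=1)
    return per_example.mean()

# Hypothetical logits and one-hot labels for two examples and three classes
logits_demo = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
labels_demo = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

loss = softmax_cross_entropy(logits_demo, labels_demo)
print(loss)
```

For the second example all logits are equal, so the predicted distribution is uniform and its loss is log(3), whichever class is correct.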

In [0]:
# Add an optimiser
optimiser = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cross_entropy)

In [0]:
# Finally, set up the initialisation operator
init_op = tf.global_variables_initializer()

In [0]:
# Define an accuracy assessment operation
correct_prediction = tf.equal(tf.argmax(y_one_hot, 1), tf.argmax(logits, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
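
The accuracy operation compares the index of the largest logit with the true label and averages the matches. A small NumPy sketch of the same idea, using hypothetical values in `acc_logits` and `acc_labels`:

```python
import numpy as np

# Hypothetical logits for four examples and three classes
acc_logits = np.array([
    [0.1, 2.0, 0.3],
    [1.5, 0.2, 0.1],
    [0.0, 0.1, 3.0],
    [0.9, 0.8, 0.1],
])
# Hypothetical true class indices
acc_labels = np.array([1, 0, 2, 1])

# Predicted class is the position of the largest logit in each row
predicted = acc_logits.argmax(axis=1)

# Cast the boolean matches to floats and average them
accuracy_demo = (predicted == acc_labels).astype(np.float32).mean()
print(accuracy_demo)
```

Softmax is not needed here because it preserves the ordering of the logits, so the argmax is unchanged.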

### Train the network

In [0]:
# Start the session
with tf.Session() as sess:
  # initialise the variables
  sess.run(init_op)
  total_batch = int(len(y_train)/batch_size)
  for epoch in range(epochs):
    avg_cost = 0
    for i in range(total_batch):
      batch_x, batch_y = get_batch(x_train, y_train, batch_size=batch_size)
      _,c = sess.run([optimiser, cross_entropy], feed_dict={x:batch_x, y:batch_y.reshape(-1,1)})
      avg_cost += c/total_batch
    # Evaluate test accuracy once per epoch rather than after every batch
    acc = sess.run(accuracy, feed_dict={x:x_test, y:y_test.reshape(-1,1)})
    print("Epoch: {}, cost={:.3f}, test set accuracy={:.3f}%".format(epoch+1, avg_cost, acc*100))
  print("\nTraining complete!")