### This code is by no means original: it is heavily based on https://www.tensorflow.org/tutorials/keras/classification. It serves as the "Hello World" programming example for a neural network.

###############################################################################################################################
### Importing ###
###############################################################################################################################

We first need to import all the tools which have made ML easy in the last few years:

In [None]:
import tensorflow as tf #Tensorflow has made ML simple
from tensorflow import keras #Keras makes using tensorflow even easier!
import numpy as np #Everything needs numpy
import matplotlib.pyplot as plt #For plotting
import seaborn as sns #makes plotting prettier, optional
print(tf.__version__) # Check we are running tensorflow version >=2.0 (there were major changes between 1.x and 2.x)

###############################################################################################################################
### Data loading and normalisation ###
###############################################################################################################################

Keras comes with some datasets which have been curated for us. In this example we are using 70,000 greyscale images of different fashion items (an alternative to the classic handwritten-digit recognition dataset). There are 10 different categories and the dataset even comes with its own data loader.

In [None]:
fashion_mnist = keras.datasets.fashion_mnist
(train_images,train_labels), (test_images,test_labels) = fashion_mnist.load_data()

We now define our class names. These are for our benefit, so we know what the numerical labels mean.

In [None]:
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

Let's inspect the data a bit.

In [None]:
print("Size of training set = ", train_images.shape)
print("First label = ", train_labels[0])
print("Size of label set = ", train_labels.shape)

The dataset has been split into 60,000 training images and 10,000 test images. Note that there is no validation set here (yet).
We can tell that each image is 28x28 pixels (not high resolution, but you will see why shortly).

In [None]:
plt.figure()
plt.imshow(train_images[0])
plt.colorbar()
plt.gca().grid(False)

We now see that, as expected, the first entry in the dataset is an "Ankle boot", but what actually is this image?

In [None]:
# Inspect some pixels:
print("Corner pixel = ",train_images[0][0,0])
print("A middle pixel = ", train_images[0][10,20])

The pixel values are integers in the range [0,255]. We want to normalise them to lie in [0,1] instead, and we must apply the same scaling to the training and test images:

In [None]:
train_images = train_images/255.0
test_images = test_images/255.0
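
As a small standalone illustration of the scaling step, the sketch below applies the same division to a made-up 2x2 array of `uint8` pixel values. The array `fake_image` is hypothetical and not part of the dataset. Note that the division also turns the integers into floats.

```python
import numpy as np

# Hypothetical 2x2 "image" using the same integer dtype as the raw pixels
fake_image = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Dividing by 255.0 maps [0, 255] onto [0, 1] and promotes the dtype to float64
scaled = fake_image / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
```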

In [None]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i],cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])

###############################################################################################################################
### Building a network! ###
###############################################################################################################################

In [None]:
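model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28,28)),        # unroll each 28x28 image into 784 values
    keras.layers.Dense(128, activation=tf.nn.relu),   # one hidden layer of 128 nodes
    keras.layers.Dense(10, activation=tf.nn.softmax)  # one output probability per class
])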

###############################################################################################################################
### Training the network ###
###############################################################################################################################

In [None]:
# We must compile the model before we train it.
# This is the one hint in this example that keras/tensorflow is not just straight python.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
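
To give a feel for what the `sparse_categorical_crossentropy` loss measures, here is a minimal numpy sketch. It assumes made-up probabilities (`probs`) and labels for two images and three classes, and it ignores the numerical safeguards that Keras applies internally, so treat it as an illustration rather than the library's implementation. "Sparse" means the labels are plain integers rather than one-hot vectors.

```python
import numpy as np

# Hypothetical softmax outputs for 2 images over 3 classes (each row sums to 1)
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 1])  # integer labels, not one-hot vectors

# Pick out the probability the network assigned to the true class of each image
true_class_probs = probs[np.arange(len(labels)), labels]

# The loss is the mean negative log of those probabilities
loss = -np.log(true_class_probs).mean()
print(loss)
```

A confident correct prediction (0.7 for the first image) contributes little to the loss, while a poor one (0.3 for the second image) contributes much more.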

In [None]:
model.summary() #A nice summary to show us how big our network is, and how many parameters it has 
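
The parameter counts in the summary can be checked by hand. The sketch below assumes the layer sizes used above (784 flattened inputs, 128 hidden nodes, 10 outputs). Each Dense layer has one weight per input-output pair plus one bias per output, and the Flatten layer has no parameters.

```python
# Layer sizes taken from the model defined above
n_inputs = 28 * 28   # Flatten turns each 28x28 image into 784 values
n_hidden = 128
n_outputs = 10

# A Dense layer has one weight per input-output pair plus one bias per output
hidden_params = n_inputs * n_hidden + n_hidden
output_params = n_hidden * n_outputs + n_outputs
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```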

In [None]:
history = model.fit(train_images, train_labels, epochs=10) #Now we set it training, you will see why we set this as a variable soon

##### Notice that the more we train the better our accuracy becomes!

In [None]:
test_loss, test_acc = model.evaluate(test_images, test_labels)
print("Test Accuracy:", test_acc)

#### But our test accuracy is not as good as our training accuracy

###############################################################################################################################
### Examine the training ###
###############################################################################################################################

In [None]:
#We can look at the history of the training
history_dict = history.history
history_dict.keys()

In [None]:
acc = history.history['accuracy']
loss = history.history['loss']

epochs = range(1, len(acc) + 1) #Giving us something to plot against

plt.plot(epochs, acc, 'b', label='Training accuracy')
plt.plot(epochs, test_acc*np.ones(len(epochs)), 'r', label='Final test accuracy')
plt.title('Training accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()

plt.show()

#### Why is the gap between training and test accuracy increasing?

###############################################################################################################################
### Examine the predictions ###
###############################################################################################################################

In [None]:
predictions = model.predict(test_images)
print(predictions[0])
print(test_labels[0])

In [None]:
# Plot the first 25 test images, their predicted label, and the true label
# Color correct predictions in blue, incorrect predictions in red. True labels are in brackets
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)  # 'off' is a non-empty string, which newer matplotlib treats as True
    plt.imshow(test_images[i], cmap=plt.cm.binary)
    predicted_label = np.argmax(predictions[i])
    true_label = test_labels[i]
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'
    plt.xlabel("{} ({})".format(class_names[predicted_label],
                                class_names[true_label]),
               color=color)

###############################################################################################################################
### Enable single predictions ###
###############################################################################################################################

To test a single image, we first check that it is (as expected) a 28x28 pixel object.

In [None]:
img = test_images[0]
print(img.shape)

However, model.predict expects a collection of images, so we make our image a collection (of one) by giving it an extra dimension.

In [None]:
img = np.expand_dims(img,0)
print(img.shape)
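
The effect of the extra dimension can be seen without the dataset. This sketch uses an array of zeros (`single`, a hypothetical stand-in for one image) and shows that `np.expand_dims` and `np.newaxis` indexing give the same batch-of-one shape.

```python
import numpy as np

single = np.zeros((28, 28))        # stand-in for one 28x28 image
batch = np.expand_dims(single, 0)  # add a leading batch axis
same = single[np.newaxis, ...]     # equivalent indexing form

print(single.shape, batch.shape, same.shape)
```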

In [None]:
#Now we can predict on this:
predictions = model.predict(img)
print(predictions)

This is still not exactly what we want: we get ten probabilities rather than a single answer, so we simply choose the most likely class. For more complex problems we could set a confidence threshold here instead of always taking the top choice.
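
One possible way to apply such a threshold is sketched below. The function name `predict_with_threshold`, the cut-off of 0.8 and the example probabilities are all assumptions made for illustration, and a suitable threshold would depend on the problem.

```python
import numpy as np

def predict_with_threshold(probabilities, names, threshold=0.8):
    """Return the most likely class name, or 'unsure' if its probability is below threshold."""
    best = int(np.argmax(probabilities))
    if probabilities[best] < threshold:
        return "unsure"
    return names[best]

# Hypothetical three-class example
names = ["Sandal", "Sneaker", "Ankle boot"]
confident = predict_with_threshold(np.array([0.05, 0.05, 0.90]), names)
doubtful = predict_with_threshold(np.array([0.40, 0.35, 0.25]), names)
print(confident, doubtful)
```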

In [None]:
prediction = predictions[0]
print(class_names[np.argmax(prediction)])

###############################################################################################################################
### Tasks ###
###############################################################################################################################

1. Play around with the network architecture and see if you can increase the accuracy for training and test: try a different optimizer, more/fewer epochs, or more/fewer nodes or hidden layers (max 5 mins playing advised)

2. Split the training data into a training and validation set, then implement a validation step during training (hint: look at the keras API documentation for model.fit; this step is easier than you think)

3. Plot the training and validation loss/accuracy with epoch

4. Change the prediction so that it admits when it doesn't really know, based on some threshold of confidence (for example, examine an image it got wrong)