# Tutorial week 8 - Neural Networks

#### After installing TensorFlow (see the instructions in 6COSC020W_TutorialWeek8.pdf), run the following code and answer the 5 questions. Then try the exercises described at the end.
#### 1) Download the Fashion MNIST dataset

In [None]:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

seed = 123 # to ensure we always get the same results
np.random.seed(seed) # to ensure we always get the same results
tf.keras.utils.set_random_seed(seed) # to ensure we always get the same results

fashion_mnist = tf.keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()

class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
               'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']

plt.figure(figsize=(3, 3))
plt.imshow(train_images[0])
plt.title(class_names[train_labels[0]])
plt.colorbar()
plt.grid(False)
plt.show()

#### **Question 1: How many images do you have in the training dataset and in the testing dataset?**

In [None]:
# Add code here to answer the question



#### **Question 2: How many images do you have for each class in the training dataset? Is it a balanced dataset?**

In [None]:
# Add code here to answer the question



#### 2) Display some images from the train dataset

In [None]:
plt.figure(figsize=(10,10))
for i in range(25):
    plt.subplot(5,5,i+1)
    plt.xticks([])
    plt.yticks([])
    plt.grid(False)
    plt.imshow(train_images[i], cmap=plt.cm.binary)
    plt.xlabel(class_names[train_labels[i]])
plt.show()

#### 3) Create a neural network in TensorFlow with 2 Dense layers

In [None]:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'), # number of units = 128
    tf.keras.layers.Dense(10) # number of outputs = 10 (10 classes); outputs are logits, not probabilities
])
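
The layer sizes above determine how many trainable parameters the network has. The following is a minimal sketch for checking the count by hand, assuming the usual Dense layer count of (inputs × units) weights plus one bias per unit. The helper name `dense_params` is only illustrative and is not part of TensorFlow.

```python
# Hand count of the trainable parameters in the network defined above.
# Assumption: a Dense layer has (inputs * units) weights plus (units) biases.
def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

n_pixels = 28 * 28                      # Flatten turns each 28x28 image into 784 values
hidden = dense_params(n_pixels, 128)    # first Dense layer (128 units)
output = dense_params(128, 10)          # output layer (one unit per class)
total = hidden + output

print(hidden, output, total)            # 100480 1290 101770
```

These numbers can be compared with the per-layer counts printed by `model.summary()`.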

#### 4) Compile the model. Use the Adam optimiser and the SparseCategoricalCrossentropy loss function

In [None]:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
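
To make the loss function less of a black box, here is a rough NumPy sketch of what a sparse categorical cross-entropy computed from logits looks like: apply a log-softmax to each row, pick the log-probability of the true class, negate it, and average over the batch. The function name `sparse_ce_from_logits` and the toy values are assumptions made for illustration; this is not the TensorFlow implementation.

```python
import numpy as np

def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy for integer labels, computed from raw logits."""
    # Subtract the row maximum first for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    # Log-softmax: log of the probability assigned to each class
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Select the log-probability of the true class in each row, negate, average
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy batch: 2 samples, 3 classes (hypothetical values)
example_logits = np.array([[2.0, 0.5, -1.0],
                           [0.0, 0.0, 0.0]])
example_labels = np.array([0, 2])   # integer class indices, as in train_labels

loss_value = sparse_ce_from_logits(example_logits, example_labels)
print(loss_value)
```

The labels are plain integers rather than one-hot vectors, which is why the "sparse" variant of the loss fits `train_labels` directly.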

#### 5) Train the model

In [None]:
epochs = 10
# validation_data is passed as a tuple (inputs, targets)
history = model.fit(train_images, train_labels, validation_data=(test_images, test_labels), epochs=epochs)

#### 6) Evaluate the model

In [None]:
test_loss, test_acc = model.evaluate(test_images,  test_labels, verbose=2)

print('\nTest accuracy:', test_acc)

In [None]:
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']

loss = history.history['loss']
val_loss = history.history['val_loss']

epochs_range = range(epochs)

plt.figure(figsize=(8, 8))
plt.subplot(1, 2, 1)
plt.plot(epochs_range, acc, label='Training Accuracy')
plt.plot(epochs_range, val_acc, label='Validation Accuracy')
plt.legend(loc='lower right')
plt.title('Training and Validation Accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')

plt.subplot(1, 2, 2)
plt.plot(epochs_range, loss, label='Training Loss')
plt.plot(epochs_range, val_loss, label='Validation Loss')
plt.legend(loc='upper right')
plt.title('Training and Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.show()

#### **Question 3: Is the validation accuracy higher or lower than the training accuracy at epoch 10? Why?**

Answer:

#### **Question 4: Looking at the plot, do you think the accuracy could be improved? What would happen if we increase the number of epochs?**

Answer:

#### 7) Get probabilities. For each test image, the model will tell us the probability that the image belongs to each of the 10 classes (i.e., it outputs an array of 10 values per test image).
Since our model returns logits, we add a Softmax layer to convert the logits to probabilities.

In [None]:
probability_model = tf.keras.Sequential([model, 
                                         tf.keras.layers.Softmax()])
probabilities = probability_model.predict(test_images)
print('Size of variable probabilities: ' + str(probabilities.shape)) # You can see the size of the arrays here (10000, 10)
print('Probabilities:')
print(probabilities) # Returns an array of 10000 arrays with 10 probabilities each (one for each class)
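
The conversion from logits to probabilities can also be written out by hand. Below is a minimal NumPy sketch of the softmax function; the function and the toy logits are illustrative assumptions, not the code behind `tf.keras.layers.Softmax`.

```python
import numpy as np

def softmax(logits):
    """Convert each row of logits into probabilities that sum to 1."""
    # Subtracting the row maximum does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# Toy logits for 2 samples and 3 classes (hypothetical values)
toy_logits = np.array([[1.0, 2.0, 3.0],
                       [0.0, 0.0, 0.0]])
toy_probs = softmax(toy_logits)

print(toy_probs)
print(toy_probs.sum(axis=1))  # each row sums to 1
```

The largest logit always receives the largest probability, so applying softmax does not change which class is predicted.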

#### 8) Get predicted class
For each image, we have a vector of 10 probabilities (one per class), each giving the probability that the image belongs to that class. We now want the class with the highest probability, so we use argmax, which returns the index of the largest value.

In [None]:
predictions = np.argmax(probabilities, axis = 1) # index (class) of the highest probability for each image

print('Size of variable predictions: ' + str(predictions.shape)) # We now have one value (class) for each image.
print('Predictions:')
print(predictions)
print('Labels:')
print(test_labels)
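
As a small illustration of what `np.argmax` with `axis=1` does, here is a sketch on a toy probability matrix (the values are made up for the example):

```python
import numpy as np

# Toy probabilities for 3 images and 3 classes (hypothetical values)
toy_probabilities = np.array([[0.1, 0.7, 0.2],
                              [0.6, 0.3, 0.1],
                              [0.2, 0.2, 0.6]])

# axis=1 searches along each row, giving one class index per image
toy_predictions = np.argmax(toy_probabilities, axis=1)
# np.max returns the probability itself rather than its position
toy_confidence = np.max(toy_probabilities, axis=1)

print(toy_predictions)  # [1 0 2]
print(toy_confidence)
```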

#### 9) Final accuracy (test set)

In [None]:
accuracy_test = np.count_nonzero(predictions==test_labels)/len(test_images)
print('The accuracy on the test set is: ' + str(accuracy_test))
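
The accuracy above is simply the fraction of predictions that match the labels. A short sketch on toy arrays (hypothetical values) shows that taking the mean of the boolean comparison gives the same number as counting the matches and dividing by the total:

```python
import numpy as np

# Toy predicted and true classes for 4 images (hypothetical values)
toy_predictions = np.array([9, 2, 1, 1])
toy_labels = np.array([9, 2, 1, 6])

matches = toy_predictions == toy_labels   # [True, True, True, False]
toy_accuracy = np.mean(matches)           # True counts as 1, False as 0

print(toy_accuracy)                       # 0.75
```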

#### 10) Visualisation
Predicted class of first image:

In [None]:
class_names[predictions[0]]

We now create two functions to plot some results:

In [None]:
def plot_image(i, predictions_array, true_label, img):
  true_label, img = true_label[i], img[i]
  plt.grid(False)
  plt.xticks([])
  plt.yticks([])

  plt.imshow(img, cmap=plt.cm.binary)

  predicted_label = np.argmax(predictions_array)
  if predicted_label == true_label:
    color = 'blue'
  else:
    color = 'red'

  plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                100*np.max(predictions_array),
                                class_names[true_label]),
                                color=color)

def plot_value_array(i, predictions_array, true_label):
  true_label = true_label[i]
  plt.grid(False)
  plt.xticks(range(10))
  plt.yticks([])
  thisplot = plt.bar(range(10), predictions_array, color="#777777")
  plt.ylim([0, 1])
  predicted_label = np.argmax(predictions_array)

  thisplot[predicted_label].set_color('red')
  thisplot[true_label].set_color('blue')

We now use the functions to show the first 5 images and the predicted class:

In [None]:
for i in range(5):
    plt.figure(figsize=(6,3))
    plt.subplot(1,2,1)
    plot_image(i, probabilities[i], test_labels, test_images)
    plt.subplot(1,2,2)
    plot_value_array(i, probabilities[i],  test_labels)
    plt.show()

#### **Question 5: Did the model predict the 5 classes correctly? Which prediction has the lowest probability? Why?**

Answer:

## **Exercises**

Next you will perform a manual search of the hyperparameters to fine-tune the network to find the best accuracy possible on the test set. 
Discuss in class which parameters and values seem to give a better accuracy.

1.- Change the number of epochs and evaluate the results by looking at the plots generated, e.g.: 50, 100.

2.- Change the learning rate and evaluate the results by looking at the plots generated, e.g.: 0.01, 0.0001.

3.- Change the number of units in the dense layer (not the number of outputs), and evaluate the results by looking at the plots generated, e.g.: 256.

4.- Add a new Dense layer to the model and observe the changes. Think carefully about the number of units that the new Dense layer should have.
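
One possible way to organise the manual search in the exercises above is to list every combination of candidate values before running anything. The sketch below only enumerates the combinations; the dictionary name `search_space` and the exact grid are assumptions built from the example values in the exercises plus the notebook's current settings.

```python
from itertools import product

# Candidate values: the notebook's current settings plus the examples from the exercises
search_space = {
    "epochs": [10, 50, 100],
    "learning_rate": [0.01, 0.001, 0.0001],
    "units": [128, 256],
}

# Build one dictionary per combination of hyperparameters
configs = [dict(zip(search_space, values))
           for values in product(*search_space.values())]

print(len(configs))   # 18
print(configs[0])     # {'epochs': 10, 'learning_rate': 0.01, 'units': 128}

# For each configuration, re-run steps 3) to 6) of this notebook with those
# values and write down the test accuracy, so results can be compared in class.
```

Training every combination can take a long time, so it may be more practical to change one hyperparameter at a time, as the exercises suggest.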


#### **Which hyperparameters provided the best accuracy?**

Compare your results with other students and discuss which parameters had a bigger impact.

#### **OPTIONAL - Challenge**
Choose a new dataset from TensorFlow (https://www.tensorflow.org/datasets/catalog/overview) and find the best hyperparameters (epochs, learning rate, number of units).