### Week 2 Lab - getting to learn the Fashion-MNIST dataset

<p>This notebook contains my own re-make of the topics covered in ungraded lab of 
week 2 in course 2.
The main topic of this lab is to get to learn and work with the [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset made by Zalando
</p> 

In [1]:
import tensorflow as tf

In [2]:
# Load the Fashion-MNIST dataset
fmnist = tf.keras.datasets.fashion_mnist

In [3]:
#Calling load_data() on the fmnist object, returns two tuples containing two lists each.
#These tuples will be the training and testing values for the graphics that contain the clothing
#items and their labels

#Load the training and test split of the fashion mnist dataset 
(training_images, training_labels), (test_images, test_labels) = fmnist.load_data()


<p>Let's try to take a look at a training example from the data, to give us an idea of what these look like. Let's look at both an image and a label data entry
</p>

In [10]:
import numpy as np 
import matplotlib.pyplot as plt 

# we can choose an index value between 0 an 59999, since there is a total of 60,000 images in the dataset
index = 175 

# Set number of characters per row when printing
np.set_printoptions(linewidth=320)

# Print the label and image 
print(f'LABEL: {training_labels[index]}')
print(f'\nIMAGE PIXEL ARRAY:\n {training_images[index]}')

LABEL: 7

IMAGE PIXEL ARRAY:
 [[  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   2   0   0   1   1   0   1   0   0   0   0  22   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   1   2   0   0 

From the above output we can see that all of the values are between 0 and 255. If you are training a neural network in image processing, for various reasons it will usually learn better, 
if we scale all values down to values between 0 and 1. 
This process is called _normalization_ 

In Python it is easy to do normalization on arrays, without having to use looping structures

Let's do that next


In [4]:
#Normalize the pixel values of the train and test images 
training_images = training_images/255.0
test_images = test_images/255.0 

Let's now design the model                            

In [12]:
#Build the classification model
model = tf.keras.Sequential([tf.keras.layers.Flatten(),
                            tf.keras.layers.Dense(128, activation=tf.nn.relu),
                            tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

#### A little explanation of the elements in the model structure above: 

**Sequential**:

 - Defines a sequence of layers in the neural network

 **Flatten**:

 - Flatten takes the the square image of n x n pixels matrix, and turns that square into a 1-dimensional array

 **Dense**:

 - Adds a layer of neurons

 Each layer of neurons need an activation function to tell them what to do. There are lots of options but here we just use these:

  - ReLU:
    if x > 0:
        return x
    else:
        return 0

It basically just passes values 0 or greater to the next layer in the network

**Softmax**:

 - Takes a list of values and scales these so the sum of all elements will be equal to 1. When applied to model outputs you can think of the scaled values as the probability for that class. 
 For example, in our classification model which has 10 units in the output dense layer, having the highest value at index=4, means that the model is most confident that the input clothing image belongs to that category (ie. coat)

 If the value is at index=5 the category is sandal and so forth. 

In [23]:
#Declare sample inputs and convert to a tensor
inputs = np.array([[1.0, 3.0, 4.0, 2.0]])
inputs = tf.convert_to_tensor(inputs)

#Feed the inputs to a softmax activation function 
outputs = tf.keras.activations.softmax(inputs)
print(f'output of the softmax function: {outputs.numpy()}')

#Get the sum of all values after the softmax
sum = tf.reduce_sum(outputs)
print(f'sum of outputs: {sum}')

#Get the index with highest value 
prediction = np.argmax(outputs)
print(f'class with highest probability: {prediction}')


output of the softmax function: [[0.0320586  0.23688282 0.64391426 0.08714432]]
sum of outputs: 1.0
class with highest probability: 2


The next thing to do now that the model is defined, is to actually build it. We do this by compiling it with an optimizer and loss function as before -- and then we train it by calling _model.fit()_ asking it to fit our training data to our training labels

The model will then figure out the relationship between the training data and its actual labels 

In [24]:
#Let's actually compile the model
model.compile(optimizer=tf.optimizers.Adam(),
                loss = 'sparse_categorical_crossentropy',
                metrics = ['accuracy'])

model.fit(training_images, training_labels, epochs=5)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x25568726ec0>

After the model is done training, we can see from the above that this model specification found a pattern match between the image and the labels that worked 89.11% of the time (accuracy score)

Next we try to test how the model will do with unseen data, by evaluating the model on test images and test labels

In [25]:
# Evaluate the model on unseen data 
model.evaluate(test_images, test_labels)



[0.3418787717819214, 0.8773999810218811]

As the evaluation shows, the model has an accuray of approx. 88% on unseen data

## Exploration Exercises

#### Exercise 1 

We are asked to run the below code and answer a few questions 

In [28]:
classifications = model.predict(test_images)

print(classifications[0])

[4.5424604e-05 1.4828730e-08 2.6083151e-06 2.2031463e-07 1.5912965e-05 4.6398580e-02 1.5606549e-05 4.5061149e-02 2.8664910e-04 9.0817380e-01]


### E1Q1: What does this list represent?

 1. It's 10 random meaningless values 
 2. It's the first 10 classifications that the computer mande
 3. It's the probability that this item is each of the 10 classes

 ANSWER: 
 The correct answer is number #

### E1Q2: How do you know that this list tells you that the item is an ankle boot?

 1. There's not enough information to answer that question 
 2. The 10th element on the list is the biggest, and the ankle boot is labelled 9
 3. The ankle boot is label 9, and there are 0->9 elements in the list

 ANSWER: 
 The correct answer is number #! Since the 10th element in the list is the highest value -> the highest probability of belonging to that category


#### Exercise 2 

Here we are focusing on the layers of the model. Experiment with different values for the dense layer with 512 neurons. What different results do you get for loss, training time etc. 

In [9]:
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer='adam', 
                loss = 'sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
[1.09742558e-07 2.20290460e-07 1.63339020e-09 6.06345585e-10
 2.88714705e-08 2.36959197e-03 1.26459483e-07 1.89479198e-02
 1.00186375e-08 9.78681982e-01]
9


### E2Q1: increase to 1024 Neurons - what's the impact?

 1. Training takes longer, but is more accurate 
 2. Training takes longer, but no impact on accuracy
 3. Training takes the same time, but is more accurate

ANSWER: # - Adding more Neurons we have to do more calculations, slowing down the process. In this case they have a good impact - we do get more accurate. That doesn't mean it's always a case of "more is better", you can hit the "law of diminishing returns" very quickly. 


#### Exercise 3 

E3Q1: What would happen of you remove the Flatten() layer. Why do you think that's the case? 

In [10]:
model = tf.keras.models.Sequential([tf.keras.layers.Dense(64, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer='adam', 
                loss = 'sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

Epoch 1/5


ValueError: in user code:

    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 890, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 948, in compute_loss
        return self.compiled_loss(
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\losses.py", line 139, in __call__
        losses = call_fn(y_true, y_pred)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\losses.py", line 243, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\losses.py", line 1860, in sparse_categorical_crossentropy
        return backend.sparse_categorical_crossentropy(
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\backend.py", line 5238, in sparse_categorical_crossentropy
        res = tf.nn.sparse_softmax_cross_entropy_with_logits(

    ValueError: `labels.shape` must equal `logits.shape` except for the last dimension. Received: labels.shape=(32,) and logits.shape=(896, 10)


ANSWER: We get an error about the shape of the data. It reinforces the rule of thumb, that the first layer in our network should be the same shape as our data. Right now our data is 28x28 images, and 28 layers of 28 neurons would be infeasible, so it makes more sense to 'flatten' that 28x28 into a 784x1. Instead of writing all the code to handle that ourselves, we add the Flatten() layer at the beginning, and when the arrays are loaded into the model later, they'll automatically be flattened for us.  

#### Exercise 4 

Consider the final (output) layers. Why are there 10 of them? What would happen if you had a different amount than 10? For example, try training network with 5!

In [13]:
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(5, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam', 
                loss='sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

Epoch 1/5


InvalidArgumentError: Graph execution error:

Detected at node 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits' defined at (most recent call last):
    File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
      return _run_code(code, main_globals, None,
    File "C:\Python310\lib\runpy.py", line 86, in _run_code
      exec(code, run_globals)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel_launcher.py", line 16, in <module>
      app.launch_new_instance()
    File "c:\Projekter\py_venv\.venv\lib\site-packages\traitlets\config\application.py", line 976, in launch_instance
      app.start()
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\kernelapp.py", line 677, in start
      self.io_loop.start()
    File "c:\Projekter\py_venv\.venv\lib\site-packages\tornado\platform\asyncio.py", line 199, in start
      self.asyncio_loop.run_forever()
    File "C:\Python310\lib\asyncio\base_events.py", line 595, in run_forever
      self._run_once()
    File "C:\Python310\lib\asyncio\base_events.py", line 1881, in _run_once
      handle._run()
    File "C:\Python310\lib\asyncio\events.py", line 80, in _run
      self._context.run(self._callback, *self._args)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\kernelbase.py", line 473, in dispatch_queue
      await self.process_one()
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\kernelbase.py", line 462, in process_one
      await dispatch(*args)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\kernelbase.py", line 369, in dispatch_shell
      await result
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\kernelbase.py", line 664, in execute_request
      reply_content = await reply_content
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\ipkernel.py", line 355, in do_execute
      res = shell.run_cell(code, store_history=store_history, silent=silent)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\ipykernel\zmqshell.py", line 532, in run_cell
      return super().run_cell(*args, **kwargs)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 2854, in run_cell
      result = self._run_cell(
    File "c:\Projekter\py_venv\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 2900, in _run_cell
      return runner(coro)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\IPython\core\async_helpers.py", line 129, in _pseudo_sync_runner
      coro.send(None)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 3098, in run_cell_async
      has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
    File "c:\Projekter\py_venv\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 3301, in run_ast_nodes
      if await self.run_code(code, result, async_=asy):
    File "c:\Projekter\py_venv\.venv\lib\site-packages\IPython\core\interactiveshell.py", line 3361, in run_code
      exec(code_obj, self.user_global_ns, self.user_ns)
    File "C:\Users\Bruger\AppData\Local\Temp\ipykernel_23820\1865654830.py", line 8, in <cell line: 8>
      model.fit(training_images, training_labels, epochs=5)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\utils\traceback_utils.py", line 64, in error_handler
      return fn(*args, **kwargs)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1409, in fit
      tmp_logs = self.train_function(iterator)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1051, in train_function
      return step_function(self, iterator)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1040, in step_function
      outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 1030, in run_step
      outputs = model.train_step(data)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 890, in train_step
      loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\training.py", line 948, in compute_loss
      return self.compiled_loss(
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\engine\compile_utils.py", line 201, in __call__
      loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\losses.py", line 139, in __call__
      losses = call_fn(y_true, y_pred)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\losses.py", line 243, in call
      return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\losses.py", line 1860, in sparse_categorical_crossentropy
      return backend.sparse_categorical_crossentropy(
    File "c:\Projekter\py_venv\.venv\lib\site-packages\keras\backend.py", line 5238, in sparse_categorical_crossentropy
      res = tf.nn.sparse_softmax_cross_entropy_with_logits(
Node: 'sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits'
Received a label value of 9 which is outside the valid range of [0, 5).  Label values: 5 9 9 4 3 1 6 3 4 4 0 1 1 2 4 2 4 0 2 4 1 9 8 8 6 4 3 1 0 5 6 2
	 [[{{node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits}}]] [Op:__inference_train_function_158807]

ANSWER: We get an error as soon as the model finds an unexpected value. Another rule of thumb - the number of neurons in the last layer should match the number of classes you are classifying for. In this case it's the digits 0-9 so there are 10 of them, hence we should have 10 neurons in our final layer

#### Exercise 5 

Consider the effects of additional layers in the network. What will happen if you add another layer between the one with 512 and the final layer with 10?    

In [15]:
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam', 
                loss='sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)

model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)

print(classifications[0])
print(test_labels[0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
[5.2610169e-08 4.5362207e-09 4.6501690e-08 2.0587354e-08 1.2000093e-08
 1.4191066e-04 3.3703711e-07 6.9291624e-03 1.3480063e-07 9.9292833e-01]
9


ANSWER: There isn't a significant impact - because this data is very simple! but for more complex cases, extra layers are often necessary!

#### Exercise 6

### E6Q1: Consider the impact of training for more or less epochs. why do you think that would be the case? 

 - Try for 15 epochs -- you'll probably get a model with a much better loss than the one with 5
 - Try 30 epochs -- you might see the loss value decrease more slowly, and sometimes increases. You'll also likely see that the results of model.evalute() didn't improve much. It can even be slightly worse

 This is a side effect of something called "overfitting"

In [17]:
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam', 
                loss='sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=30)

model.evaluate(test_images, test_labels)

#classifications = model.predict(test_images)

#print(classifications[0])
#print(test_labels[0])

Epoch 1/30
Epoch 2/30
Epoch 3/30
Epoch 4/30
Epoch 5/30
Epoch 6/30
Epoch 7/30
Epoch 8/30
Epoch 9/30
Epoch 10/30
Epoch 11/30
Epoch 12/30
Epoch 13/30
Epoch 14/30
Epoch 15/30
Epoch 16/30
Epoch 17/30
Epoch 18/30
Epoch 19/30
Epoch 20/30
Epoch 21/30
Epoch 22/30
Epoch 23/30
Epoch 24/30
Epoch 25/30
Epoch 26/30
Epoch 27/30
Epoch 28/30
Epoch 29/30
Epoch 30/30


0.5411862730979919

#### Exercise 7

What is the impact of not normalizing the data (divided by 255) to get values between 0-1? 

In [18]:
training_images = training_images
test_images = test_images


model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam', 
                loss='sparse_categorical_crossentropy')

model.fit(training_images, training_labels, epochs=5)
model.evaluate(test_images, test_labels)

classifications = model.predict(test_images)
print(classifications[0])
print(test_labels[0])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
[5.7215317e-07 7.5133482e-09 9.1857842e-08 1.8318741e-07 1.3786276e-07
 2.2646420e-02 3.5476504e-07 1.0511803e-01 1.8075802e-07 8.7223405e-01]
9


ANSWER: My guess is that it will get more difficult for the model to get good fits and extract more deeply and complex features! When the data is normalized, it should be easier for the model to extract out "differences" from one datapoint to another... So, by not normalizing the data I guess it will be more difficult for the data to learn and "spot" complex differences (hence features) amongst the datapoints! 

#### Exercise 8 

Introduction to callbacks... Callbacks give us an opportunity to stop the training when a certain level of loss is reached, so that we don't have to wait for the full model to finish, but can finish after a desired level of loss is reached 

In [21]:
class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if(logs.get('accuracy') >= 0.5): #Experiment with changing this value! 
            print("\nReached 60% acc. so cancelling training!")
            self.model.stop_training = True 

callbacks = myCallback()

training_images = training_images/255.0
test_images = test_images/255.0

model = tf.keras.models.Sequential([ 
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=['accuracy'])
model.fit(training_images, training_labels, epochs=5, callbacks=[callbacks])


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x1cc47c21630>