# An example of Computer Vision

In the previous notebook, you created a neural network that predicts the price of a house. That was overkill, of course: it would have been easier to write the function Y = 50x + 50 directly than to train a machine learning model.

But what if the rules that define a dataset are more complicated, as in a computer vision problem? Imagine a scenario where you want a model to recognize different items of clothing, trained on a dataset containing 10 different types.

For this, we will use the Fashion MNIST dataset. https://www.kaggle.com/zalando-research/fashionmnist


## Let's go for the code!

Start by importing TensorFlow. You will need version 2.0 or later.

In [None]:
import numpy as np
import tensorflow as tf
print(tf.__version__)

The Fashion MNIST dataset is available directly from the Keras API tf.keras. You can load it like this:

In [None]:
mnist = tf.keras.datasets.fashion_mnist

Calling the load_data method on this object gives you two tuples, each holding two NumPy arrays: the training and test images of garments, along with their labels. The dataset contains 60,000 training images and 10,000 test images, each 28x28 pixels.

In [None]:
(train_X, train_Y), (test_X, test_Y) = mnist.load_data()

What do these values look like? Display a training image and its label to see.
Experiment with different array indices. For example, index 42 shows a different boot than index 0.

In [None]:
import matplotlib.pyplot as plt
plt.imshow(train_X[0], cmap='gray')
print(train_Y[0])
print(train_X[0])

You will notice that all the values in the array are integers between 0 and 255. Neural networks generally train better on small input values, so scale them to the range 0 to 1 as follows:

In [None]:
train_X = train_X / 255.0
test_X = test_X / 255.0
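To see what this division does without loading the dataset, here is a minimal sketch on a tiny hypothetical array (the `pixels` values are made up for illustration, they are not taken from Fashion MNIST):

```python
import numpy as np

# Hypothetical 2x3 "image" stored as 8-bit integers, like the dataset's pixels.
pixels = np.array([[0, 51, 102], [153, 204, 255]], dtype=np.uint8)

# Dividing by a float converts the array to floating point and maps
# the range [0, 255] onto [0.0, 1.0].
scaled = pixels / 255.0

print(scaled.dtype)
print(scaled.min(), scaled.max())
print(scaled)
```

The shape of the array is unchanged; only the value range and the data type differ.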

You can now create your model. There are a few new concepts here, but don't worry, we'll explain them to you one by one.

In [None]:
model = tf.keras.models.Sequential([tf.keras.layers.Flatten(), 
                                    tf.keras.layers.Dense(128, activation=tf.nn.relu), 
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

**Sequential**: This defines a SEQUENCE of layers in the neural network.

**Flatten**: Remember that your images were a square matrix when you displayed them? Flatten takes that 28x28 matrix and converts it into a 1D vector of 784 values.
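As a rough illustration of what flattening does to the data, the sketch below uses a plain NumPy reshape on a hypothetical batch of two images. It is not how Keras implements the layer, only the same shape change:

```python
import numpy as np

# Hypothetical batch of 2 grayscale images, 28x28 pixels each.
batch = np.arange(2 * 28 * 28, dtype=np.float32).reshape(2, 28, 28)

# Keep the batch axis and merge the two image axes into one.
flat = batch.reshape(batch.shape[0], -1)

print(batch.shape)
print(flat.shape)
```

Each image row is laid out one after the other, so pixel (1, 0) of an image ends up at position 28 of its flattened vector.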

**Dense**: Adds a fully connected layer of neurons.

Each layer of neurons needs an **activation function**.

**ReLU** is a function that means "if X > 0, return X, else return 0", so it passes only non-negative values on to the next layer of the network.

**Softmax** takes a set of values and turns them into probabilities that sum to 1, so the largest value stands out. For example, if the last layer produces the scores [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05, 0.05], softmax returns a vector that is very close to [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]: almost all of the probability goes to the largest score.
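The two activation functions can be sketched in a few lines of NumPy. The helper names `relu` and `softmax` and the `scores` vector are illustrative assumptions, not the TensorFlow implementations:

```python
import numpy as np

def relu(x):
    """Return x where it is positive, 0 elsewhere."""
    return np.maximum(x, 0.0)

def softmax(x):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = x - np.max(x)  # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / exps.sum()

# Hypothetical scores for 10 classes; class 4 has by far the largest score.
scores = np.array([0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05, 0.05])
probs = softmax(scores)

print(relu(np.array([-2.0, 0.0, 3.5])))
print(probs.round(4))
print(probs.sum())
print(int(np.argmax(probs)))
```

Softmax does not output exact zeros and ones. It gives every class a probability, and picking the predicted class is a separate step (here `np.argmax`).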

Now that the model is defined, the next step is to compile it. Call the compile method with an optimizer and a loss function, then train the model by calling **model.fit**.

In [None]:
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_X, train_Y, epochs=5)

Once training has finished, you should see the accuracy reported at the end of each epoch. The final value should be something like 0.9091, which means that your neural network classifies the training data with about 91% accuracy. It's not the best result possible, but it's pretty good given the small number of epochs and how quickly the model trains.
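To make the two numbers printed during training less abstract, here is a small NumPy sketch of how a sparse categorical cross-entropy loss and an accuracy can be computed from predicted probabilities and integer labels. The `probs` and `labels` arrays are hypothetical, and the computation is a simplified illustration rather than Keras's exact code:

```python
import numpy as np

# Hypothetical model outputs: 3 samples, 4 classes, each row sums to 1.
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
])
labels = np.array([0, 1, 3])  # integer ("sparse") labels, one per sample

# Loss: average negative log of the probability given to the correct class.
p_true = probs[np.arange(len(labels)), labels]
loss = -np.log(p_true).mean()

# Accuracy: fraction of samples whose most probable class is the label.
predicted = probs.argmax(axis=1)
accuracy = (predicted == labels).mean()

print(loss)
print(accuracy)
```

The third sample is misclassified (the model prefers class 0, the label is 3), which lowers the accuracy and raises the loss at the same time.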

But how does the model perform on data it has not seen? That's what the test images are for. You can use model.evaluate to evaluate your model on the test dataset.

In [None]:
model.evaluate(test_X, test_Y)

For us, it returned an accuracy of 0.8759, which means it's about 88% accurate. As expected, the model is less accurate on data it has never seen.

To further understand the different parameters, try the following exploration exercises:

# Exploration exercises

These exercises don't really require you to write code. You only need to change a few values now and then, so that you better understand the parameters passed to the functions.

## Exercise 1:

For this first exercise, run the following cell: it creates a set of classifications for each image in the test dataset, then displays the first entry. The output is a list of values. What do you think these values represent?

In [None]:
classifications = model.predict(test_X)

print(classifications[0])

Hint: try running `print(test_Y[0])`, and you'll get a 9. Does that help you understand the usefulness of this list?

In [None]:
print(test_Y[0])

### Quiz: What does this list represent?


1. These are 10 random values
2. These are the first 10 classifications that the computer made
3. This is the probability that the item is in each of the 10 classes

#### Answer
The correct answer is 3.

The output of the model is a list of 10 numbers. Each number is the probability that the image being classified belongs to the corresponding class: the first value in the list is the probability for class 0, the next for class 1, and so on.

In our case, the label is 9, and the last value in the list (index 9) should be the one with the highest probability.

### How does this list tell you that the item is a boot?


1. There is not enough information to answer this question
2. The 10th item in the list is the largest, and the boot is represented by the label 9
3. The boot is labeled 9, and the list has items 0 through 9.

#### Answer

The correct answer is 2. List indices start at 0, so label 9, which corresponds to the boot, is the 10th class. Because the largest value in the list is its 10th element, the neural network predicted that the input image is a boot.
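The step from a list of probabilities to a class can be sketched as follows. The class names are the ones published with the Fashion MNIST dataset (they do not appear in this notebook), and the `classification` vector is a hypothetical output, not a real prediction:

```python
import numpy as np

# Class names as documented for Fashion MNIST, indexed by label 0..9.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]

# Hypothetical model output for one image: 10 probabilities summing to 1.
classification = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.02, 0.0, 0.05, 0.0, 0.93])

# The predicted label is the index of the largest probability.
predicted_label = int(np.argmax(classification))

print(predicted_label, CLASS_NAMES[predicted_label])
```

With a trained model, you would pass `classifications[0]` instead of the hypothetical vector.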

## Exercise 2:

Now look at the model layers. Experiment with different values for the dense layer with 512 neurons.
What different results do you get for loss, training time, and so on? Why do you think these differences occur?

In [None]:
import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.fashion_mnist

(train_X, train_Y) ,  (test_X, test_Y) = mnist.load_data()

train_X = train_X/255.0
test_X = test_X/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

# Track accuracy as well, so that fit and evaluate report it alongside the loss.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_X, train_Y, epochs=5)

model.evaluate(test_X, test_Y)

classifications = model.predict(test_X)

print(classifications[0])
print(test_Y[0])

### Question 1. Increase it to 1024 neurons. What is the impact?

1. Training takes longer, but is more accurate
2. Training takes longer, but does not impact accuracy
3. Training takes the same time, but is more accurate

#### Answer

The correct answer is 1. Adding more neurons means more computation, which increases the training time. In this case, it also improves accuracy. However, this does not mean that 'more is better'. You will have the opportunity to understand this later.
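One way to see why more neurons means more computation is to count the trainable parameters. The sketch below does the arithmetic by hand for the two-layer model used here; the helper names `dense_params` and `model_params` are made up for this example:

```python
def dense_params(n_inputs, n_units):
    """Weights (one per input per unit) plus one bias per unit."""
    return n_inputs * n_units + n_units

def model_params(hidden_units, n_inputs=28 * 28, n_classes=10):
    """Parameters of Flatten -> Dense(hidden_units) -> Dense(n_classes)."""
    return dense_params(n_inputs, hidden_units) + dense_params(hidden_units, n_classes)

for hidden in (128, 512, 1024):
    print(hidden, model_params(hidden))
```

Going from 512 to 1024 neurons doubles the number of parameters, from 407,050 to 814,090, and each of them has to be updated at every training step.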

## Exercise 3:

What happens if you remove the Flatten() layer? What do you think training will produce?

You will get an error about the shape of the data. The first Dense layer expects each sample as a 1D vector, so each 28x28 image must first be flattened into 784 values. Instead of writing that conversion yourself, you can simply put Flatten() at the very beginning of the model.

In [None]:
import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.fashion_mnist

(train_X, train_Y) ,  (test_X, test_Y) = mnist.load_data()

train_X = train_X/255.0
test_X = test_X/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(train_X, train_Y, epochs=5)

model.evaluate(test_X, test_Y)

classifications = model.predict(test_X)

print(classifications[0])
print(test_Y[0])

## Exercise 4:

Analyze the final layer (output layer). Why must it contain 10 neurons? What happens if you use a number other than 10? Try 5, for example.

You should get an error as soon as training encounters a label that the layer cannot represent. Your labels cover 10 classes, so the output layer needs one neuron per class.
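The mismatch can be illustrated without TensorFlow. The loss needs the probability that the model assigned to the true label, and with only 5 outputs there is no entry for label 9. This is a simplified NumPy analogy with a hypothetical `probs` vector; the actual error raised by Keras has a different type and message:

```python
import numpy as np

# Hypothetical output of a 5-neuron softmax layer for one image.
probs = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
label = 9  # Fashion MNIST label of the boot

try:
    # Look up the probability assigned to the true class.
    p_true = probs[label]
    outcome = "ok"
except IndexError as err:
    # There is no output neuron for class 9.
    outcome = "IndexError"
    print(err)

print(outcome)
```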

In [None]:
import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.fashion_mnist

(train_X, train_Y) ,  (test_X, test_Y) = mnist.load_data()

train_X = train_X/255.0
test_X = test_X/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(train_X, train_Y, epochs=5)

model.evaluate(test_X, test_Y)

classifications = model.predict(test_X)

print(classifications[0])
print(test_Y[0])

## Exercise 5:

Now try adding layers to the network. What happens if you add another layer between the one containing 512 neurons and the one containing 10?

Answer: There is no significant impact, because it is rather simple data. For more complex data, adding a layer of neurons can be useful.

In [None]:
import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.fashion_mnist

(train_X, train_Y) ,  (test_X, test_Y) = mnist.load_data()

train_X = train_X/255.0
test_X = test_X/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(train_X, train_Y, epochs=5)

model.evaluate(test_X, test_Y)

classifications = model.predict(test_X)

print(classifications[0])
print(test_Y[0])

## Exercise 6:

Try training with more or fewer epochs. What do you think of the result?

Try 15 epochs: you will probably get a model with a lower loss than the one trained for 5 epochs.

Try 30 epochs: you may see that the loss stops decreasing, and sometimes increases. This is a side effect of **overfitting**: the model specializes in the training data and stops improving on data it has not seen. Keep an eye on your training: there's no point in continuing to train a model that has already reached its best performance, right? ;)
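A simple way to keep an eye on training is to compare the loss on the training data with the loss on held-out data, epoch by epoch. The sketch below uses made-up loss values (they are not measurements from this model) and two hypothetical helper functions:

```python
def best_epoch(losses):
    """Return the 1-based epoch with the lowest loss."""
    return min(range(len(losses)), key=losses.__getitem__) + 1

def epochs_without_improvement(losses):
    """Count the epochs that came after the best one."""
    return len(losses) - best_epoch(losses)

# Hypothetical loss curves, for illustration only.
train_loss = [0.50, 0.38, 0.33, 0.30, 0.28, 0.26, 0.24, 0.22]
held_out_loss = [0.45, 0.39, 0.36, 0.35, 0.34, 0.35, 0.36, 0.38]

print(best_epoch(train_loss))
print(best_epoch(held_out_loss))
print(epochs_without_improvement(held_out_loss))
```

In this made-up run, the training loss keeps falling until the last epoch, while the held-out loss is at its best after epoch 5. The three epochs after that only make the model fit the training data more closely.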

In [None]:
import tensorflow as tf
print(tf.__version__)

mnist = tf.keras.datasets.fashion_mnist

(train_X, train_Y) ,  (test_X, test_Y) = mnist.load_data()

train_X = train_X/255.0
test_X = test_X/255.0

model = tf.keras.models.Sequential([tf.keras.layers.Flatten(),
                                    tf.keras.layers.Dense(512, activation=tf.nn.relu),
                                    tf.keras.layers.Dense(10, activation=tf.nn.softmax)])

model.compile(optimizer = 'adam',
              loss = 'sparse_categorical_crossentropy')

model.fit(train_X, train_Y, epochs=5)

model.evaluate(test_X, test_Y)

classifications = model.predict(test_X)

print(classifications[34])
print(test_Y[34])

## Exercise 7:

Earlier, you trained your model for too many epochs and the loss stopped changing. You had to wait for the final result, and you probably thought it would be nice to stop training as soon as a certain value was reached. For example, if 95% accuracy is enough for you and you reach it after 3 epochs, why wait for the remaining epochs? So how do you stop training? With callbacks! The cell below stops training once the loss falls below 0.4. Let's see it in action.

In [None]:
import tensorflow as tf
print(tf.__version__)

class myCallback(tf.keras.callbacks.Callback):
  def on_epoch_end(self, epoch, logs=None):
    # logs holds the metrics of the epoch that just ended; it may be None.
    loss = (logs or {}).get('loss')
    if loss is not None and loss < 0.4:
      print("\nLoss fell below 0.4, so stopping training!")
      self.model.stop_training = True

callbacks = myCallback()
mnist = tf.keras.datasets.fashion_mnist
(train_X, train_Y), (test_X, test_Y) = mnist.load_data()
train_X=train_X/255.0
test_X=test_X/255.0
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(train_X, train_Y, epochs=5, callbacks=[callbacks])
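To show the mechanism without running a real training, here is a framework-free sketch of a training loop that consults a callback after each epoch. Everything in it is a hypothetical stand-in: `StopBelowLoss` and `run_training` are invented names, the losses are simulated, and the stop flag lives on the callback itself, whereas in Keras it lives on the model (`self.model.stop_training`):

```python
class StopBelowLoss:
    """Hypothetical stand-in for a Keras-style callback."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.stop_training = False

    def on_epoch_end(self, epoch, logs=None):
        # logs carries the metrics of the epoch that just finished.
        loss = (logs or {}).get("loss")
        if loss is not None and loss < self.threshold:
            self.stop_training = True

def run_training(loss_per_epoch, callback):
    """Simulate training: one loss per epoch, callback checked after each."""
    completed = 0
    for epoch, loss in enumerate(loss_per_epoch):
        completed += 1
        callback.on_epoch_end(epoch, {"loss": loss})
        if callback.stop_training:
            break
    return completed

# Simulated losses for 5 planned epochs, for illustration only.
simulated_losses = [0.48, 0.41, 0.37, 0.33, 0.31]
epochs_run = run_training(simulated_losses, StopBelowLoss(0.4))
print(epochs_run)
```

The loop stops after the third epoch, the first one whose loss is below the threshold, even though five epochs were planned.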

Now, head over to `3.handwritten_digits.ipynb` to implement your own model from scratch!