# Training a Neural Network to categorize Fashion MNIST data

[TensorFlow 2.0](https://www.geeksforgeeks.org/tensorflow-2-0/)

In [25]:
%reset -sf
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

In [26]:
import tensorflow as tf

The Fashion MNIST data is available directly in the tf.keras.datasets API. You load it like this:

In [27]:
mnist = tf.keras.datasets.fashion_mnist

Calling load_data on this object gives you two pairs of NumPy
arrays: the training and testing values for the images
of the clothing items, together with their labels.

In [28]:
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()

You'll notice that all of the pixel values in the images are between 0 and 255. When training a neural network, it's easier, for various reasons, to treat all values as between 0 and 1, a process called '**normalizing**'. Fortunately, in Python it's easy to normalize an array like this without looping. You do it like this:

In [29]:
training_images = training_images / 255.0
test_images = test_images / 255.0
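To see why no loop is needed, here is a minimal sketch on a tiny stand-in array. The values are made up for illustration and are not taken from the dataset; the point is only that NumPy applies the division to every element at once.

```python
import numpy as np

# A tiny 2x3 stand-in "image" with raw pixel values (hypothetical data,
# not taken from the Fashion MNIST set).
raw_pixels = np.array([[0, 64, 128],
                       [191, 255, 32]], dtype=np.uint8)

# Dividing the array by a float scales every element at once (no loop)
# and promotes the integers to floating point.
scaled_pixels = raw_pixels / 255.0

print(scaled_pixels.dtype)                       # float64
print(scaled_pixels.min(), scaled_pixels.max())  # 0.0 1.0
```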

Now you might be wondering why there are 2 sets, training and testing.
Remember we spoke about this in the intro? The idea is to have 1 set of
data for training, and then another set of data that the model hasn't yet
seen, to see how good it would be at classifying values. After all, when
you're done, you're going to want to try it out with data that it hadn't
previously seen!

Let's now design the model. There are quite a few new concepts here,
but don't worry, you'll get the hang of them.

In [30]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

**Sequential**: That defines a SEQUENCE of layers in the neural network

**Flatten**: Remember earlier where our images were a square when
you printed them out? Flatten just takes that square and turns it
into a 1-dimensional array.
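As a rough illustration of what flattening means, the sketch below reshapes a single 28x28 stand-in image with NumPy. The Keras Flatten layer does the equivalent for each image in a batch; the all-zeros image here is only a placeholder.

```python
import numpy as np

# A stand-in for one 28x28 grayscale image (all zeros; hypothetical data).
image = np.zeros((28, 28))

# Flattening lays the rows end to end: 28 * 28 = 784 values in one row.
flat = image.reshape(-1)

print(image.shape)  # (28, 28)
print(flat.shape)   # (784,)
```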

**Dense**: Adds a layer of neurons

Each layer of neurons needs an **activation function** to tell them what to
do. There are lots of options, but just use these for now.

**ReLU** effectively means "If X>0 return X, else return 0" -- so it
only passes values 0 or greater to the next layer in the network.
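That rule fits in one line of plain Python. The function below is an illustrative sketch, not the TensorFlow implementation, and the input values are made up.

```python
def relu(x):
    """Return x if it is positive, otherwise 0."""
    return max(0.0, x)

# Negative inputs are clipped to 0; positive inputs pass through unchanged.
inputs = [-2.0, -0.5, 0.0, 0.5, 3.0]
outputs = [relu(x) for x in inputs]
print(outputs)  # [0.0, 0.0, 0.0, 0.5, 3.0]
```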

**Softmax** takes a set of values and turns them into probabilities that add
up to 1, so the biggest value stands out. For example, if the output of the
last layer looks like [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05],
softmax turns it into something very close to [0,0,0,0,1,0,0,0,0] (the fifth
entry comes out at about 0.999), which saves you from fishing through it
looking for the biggest value. The goal is to save a lot of coding!
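Strictly speaking, softmax returns probabilities rather than exact zeros and ones; picking the index of the largest probability then gives the predicted class. Here is a minimal sketch in plain Python, applied to the example values above. The helper is written for illustration and is not TensorFlow's own code.

```python
import math

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    # Subtracting the maximum first keeps exp() from overflowing;
    # it does not change the result.
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

scores = [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05]
probs = softmax(scores)

# Index of the largest probability, i.e. the predicted class.
predicted = probs.index(max(probs))

print(round(sum(probs), 6))  # 1.0
print(predicted)             # 4
```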

The next thing to do, now the model is defined, is to actually build it.
You do this by compiling it with an optimizer and loss function as before,
and then you train it by calling **model.fit**, asking it to fit your training
data to your training labels, i.e. have it figure out the relationship between
the training data and its actual labels, so in future, if you have data that
looks like the training data, it can predict what label that data
should have.

In [31]:
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(training_images, training_labels, epochs=10)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f87b00ab910>
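To get a feel for the loss named in the compile call: for a single example, sparse categorical crossentropy is the negative log of the probability the model gave to the true class, and Keras averages this over the examples. The sketch below uses made-up probabilities over three classes for brevity (the model here has ten); it is a hand-written illustration, not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(probs, label):
    """Loss for one example: -log of the probability given to the true class."""
    return -math.log(probs[label])

# Hypothetical softmax outputs over three classes (not from the model above).
confident_right = [0.05, 0.90, 0.05]
unsure = [0.30, 0.40, 0.30]
true_label = 1  # an integer class index, not a one-hot vector

loss_confident = sparse_categorical_crossentropy(confident_right, true_label)
loss_unsure = sparse_categorical_crossentropy(unsure, true_label)

# The more probability the true class receives, the lower the loss.
print(loss_confident < loss_unsure)  # True
```

The "sparse" part refers to the labels being plain integers like `1`, which is the form the training labels are passed in here.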

Once it's done training, you should see an accuracy value at the end of the
final epoch. It might look something like 0.9098. This tells you that your
neural network is about 91% accurate in classifying the training data, i.e.,
it figured out a pattern match between the images and the labels that worked
91% of the time. Not great, but not bad considering it was only trained for 10
epochs and done quite quickly.
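The accuracy figure is simply the fraction of images whose predicted label matches the true label. A minimal sketch, using made-up labels rather than real model output:

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)

# Hypothetical labels for ten images (not real model output).
true_labels      = [9, 2, 1, 1, 6, 1, 4, 6, 5, 7]
predicted_labels = [9, 2, 1, 1, 6, 1, 4, 6, 5, 3]

# Nine of the ten predictions match, so the accuracy is 9 / 10.
print(accuracy(predicted_labels, true_labels))  # 0.9
```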

But how would it work with unseen data? That's why we have the test images. We
can call model.evaluate, pass in the test images and test labels, and it will
report back the loss and the accuracy on them. Let's give it a try:

In [32]:
model.evaluate(test_images, test_labels)



[0.34095972776412964, 0.8816999793052673]

For me, that returned an accuracy of about 0.8817, which means it was
about 88% accurate. As expected, it does not do as well with *unseen*
data as it did with data it was trained on!
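Because the model was compiled with metrics=['accuracy'], the list returned by model.evaluate holds the loss first and the accuracy second. The sketch below unpacks the values printed above, copied by hand so that it runs without the trained model; the variable names are chosen here for illustration.

```python
# The list printed by model.evaluate in the cell above, copied by hand.
evaluation = [0.34095972776412964, 0.8816999793052673]

# With metrics=['accuracy'], the first entry is the loss, the second the accuracy.
test_loss, test_accuracy = evaluation

print(f"loss: {test_loss:.4f}, accuracy: {test_accuracy:.2%}")
# loss: 0.3410, accuracy: 88.17%
```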