# Working with Neural Networks using TensorFlow

<a target="_blank" href="https://colab.research.google.com/github/LuWidme/uk259/blob/05f7e58e35048d2ee227109791520f41d34b7343/demos/NN%20in%20tensowrflow.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [17]:
import tensorflow.keras as ks
import tensorflow as tf
mnist = ks.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data() # 70'000 handwritten digits
# scale data to [0,1] range
x_train, x_test = x_train / 255.0, x_test / 255.0
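
To make the scaling step concrete, here is a minimal sketch on a tiny made-up array instead of the real images (the array `pixels` is hypothetical). MNIST pixels are stored as 8-bit integers from 0 to 255, so dividing by `255.0` maps them into the range [0, 1] and converts them to floating point.

```python
import numpy as np

# Hypothetical 2x2 "image" with 8-bit grayscale values, standing in for x_train[0]
pixels = np.array([[0, 51], [204, 255]], dtype=np.uint8)

# Dividing by a float promotes the integers to floating point and rescales to [0, 1]
scaled = pixels / 255.0

print(scaled.dtype)                # a floating-point dtype
print(scaled.min(), scaled.max())  # smallest and largest scaled value
```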

In [18]:
# Attention: using matplotlib and TensorFlow in the same notebook may cause crashes in some cases

# from matplotlib import pyplot as plt
# plt.imshow(x_train[0], interpolation='nearest')
# plt.show()

# print("Label: ", y_train[0])

## Build a machine learning model

Build a `tf.keras.Sequential` model by stacking layers.

In [56]:
model = ks.models.Sequential([
  # The input layer is created implicitly
  ks.layers.Flatten(input_shape=(28, 28)),  # reshape each 28 x 28 input array into a 1-D array of 784 nodes, one per pixel
  ks.layers.Dense(128, activation='relu'),  # Dense: each of the 128 nodes is connected to all nodes of the preceding layer
  ks.layers.Dense(10)  # output layer with one node for each digit (0-9)
])

model.summary()


Model: "sequential_16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_16 (Conv2D)          (None, 26, 26, 3)         30        
                                                                 
 max_pooling2d_3 (MaxPooling  (None, 13, 13, 3)        0         
 2D)                                                             
                                                                 
 flatten_4 (Flatten)         (None, 507)               0         
                                                                 
 dense_29 (Dense)            (None, 100)               50800     
                                                                 
 dense_30 (Dense)            (None, 10)                1010      
                                                                 
Total params: 51,840
Trainable params: 51,840
Non-trainable params: 0
_________________________________________________
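
Parameter counts in a summary table can be checked by hand: a Dense layer with `n` inputs and `m` units has `n * m` weights plus `m` biases. The sketch below applies this rule to the Flatten/Dense model defined in the cell above (784 inputs, 128 hidden units, 10 outputs). The helper `dense_params` is illustrative and not part of Keras. Note that the table printed above lists `Conv2D` and `MaxPooling2D` layers, so it appears to come from a different model run; the same rule still reproduces its Dense rows (507 * 100 + 100 = 50,800).

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights plus biases of a fully connected layer (illustrative helper)."""
    return n_inputs * n_units + n_units

flatten_outputs = 28 * 28                     # Flatten has no trainable parameters
hidden = dense_params(flatten_outputs, 128)   # Dense(128, activation='relu')
output = dense_params(128, 10)                # Dense(10)

print(flatten_outputs, hidden, output, hidden + output)
```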

For each example, the model returns a vector of [logits](https://developers.google.com/machine-learning/glossary#logits) or [log-odds](https://developers.google.com/machine-learning/glossary#log-odds) scores, one for each class.

In [57]:
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.24996652, -0.01613806, -0.3140773 , -0.5564004 ,  0.6531193 ,
         0.23476408, -1.0719975 ,  1.716434  , -0.60812116, -0.24009037]],
      dtype=float32)

The `tf.nn.softmax` function converts these logits to *probabilities* for each class: 

In [58]:
tf.nn.softmax(predictions).numpy()

array([[0.05773113, 0.07293911, 0.05414608, 0.04249399, 0.14243451,
        0.0937402 , 0.02537502, 0.41248384, 0.04035204, 0.05830411]],
      dtype=float32)
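
As a rough illustration of what `tf.nn.softmax` computes, the following NumPy sketch exponentiates the logits and normalizes them so they sum to 1. The function `softmax` here is a hand-written stand-in, not TensorFlow's implementation; the logits are copied from the output printed above.

```python
import numpy as np

def softmax(logits):
    """Softmax over the last axis (illustrative NumPy version)."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

# Logits copied from the prediction printed above
logits = np.array([[-0.24996652, -0.01613806, -0.3140773, -0.5564004, 0.6531193,
                    0.23476408, -1.0719975, 1.716434, -0.60812116, -0.24009037]])

probs = softmax(logits)
print(probs.round(4))
print(probs.sum())      # probabilities sum to 1 (up to rounding)
print(probs.argmax())   # index of the most likely class
```

The largest logit (index 7) also receives the largest probability, because softmax preserves the ordering of its inputs.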

Note: It is possible to bake the `tf.nn.softmax` function into the activation function for the last layer of the network. While this can make the model output more directly interpretable, this approach is discouraged as it's impossible to provide an exact and numerically stable loss calculation for all models when using a softmax output. 
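
One way to see why computing the loss directly from logits is preferred: applying softmax first and taking the logarithm afterwards can overflow for large logits, while the log-sum-exp formulation stays finite. This is a plain-Python sketch of the general idea under made-up, deliberately extreme logits; it is not TensorFlow's actual implementation, and both function names are hypothetical.

```python
import math

def naive_loss(logits, true_index):
    """Softmax first, then log: exp() overflows for large logits."""
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[true_index] / sum(exps))

def stable_loss(logits, true_index):
    """Cross-entropy directly from logits via the log-sum-exp trick."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[true_index]

big_logits = [1000.0, 0.0, -1000.0]   # hypothetical, deliberately extreme values

try:
    print(naive_loss(big_logits, 0))
except OverflowError as err:
    print("naive version failed:", err)

print(stable_loss(big_logits, 0))   # finite: the model is "sure" of class 0
print(stable_loss(big_logits, 1))   # finite: a large loss for the wrong class
```

For moderate logits both functions agree, so the stable form loses nothing in ordinary cases.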

Define a loss function for training using `losses.SparseCategoricalCrossentropy`, which takes a vector of logits and a `True` index and returns a scalar loss for each example.

In [59]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

This loss is equal to the negative log probability of the true class: The loss is zero if the model is sure of the correct class.

This *untrained model* gives probabilities close to random (1/10 for each class), so the initial loss should be close to `-tf.math.log(1/10) ~= 2.3`.

In [60]:
loss_fn(y_train[:1], predictions).numpy()

2.8420825
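
The loss value can be reproduced by hand as the negative log of the softmax probability of the true class. The sketch below uses the logits printed earlier; the label index 9 is an assumption, chosen because it is the class whose probability reproduces the printed loss. The function `sparse_cross_entropy` is an illustrative helper, not the Keras class.

```python
import math

# Logits copied from the prediction printed earlier
logits = [-0.24996652, -0.01613806, -0.3140773, -0.5564004, 0.6531193,
          0.23476408, -1.0719975, 1.716434, -0.60812116, -0.24009037]

def sparse_cross_entropy(logits, true_index):
    """Negative log of the softmax probability of the true class (illustrative)."""
    exps = [math.exp(z) for z in logits]
    probability = exps[true_index] / sum(exps)
    return -math.log(probability)

# Uniform logits give every class probability 1/10, so the loss is -log(1/10)
print(sparse_cross_entropy([0.0] * 10, 3))   # about 2.3026

# Assumed label index 9 (see the note above)
print(sparse_cross_entropy(logits, 9))       # about 2.842
```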

Before you start training, configure and compile the model using Keras `Model.compile`. Set the [`optimizer`](https://www.tensorflow.org/api_docs/python/tf/keras/optimizers) class to `adam`, set the `loss` to the `loss_fn` function you defined earlier, and specify a metric to be evaluated for the model by setting the `metrics` parameter to `accuracy`.

In [61]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])


## Train and evaluate your model
Now it's time to **train** our model, meaning we use labeled data to tune the model so that the predicted labels match the actual labels as closely as possible. We do this over many *steps* in multiple *epochs*:

* Epoch: One complete pass over all training data for gradient calculation and optimization (training the model).

* Step: One update of the model parameters using a single batch of training data.
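
The relationship between epochs and steps is simple arithmetic. The sketch below assumes the standard MNIST split (60,000 training and 10,000 test images) and the Keras default batch size of 32, which applies when `Model.fit` is called without a `batch_size` argument.

```python
import math

n_train = 60_000   # MNIST training images
n_test = 10_000    # MNIST test images
batch_size = 32    # Keras default when no batch_size is passed
epochs = 5

steps_per_epoch = math.ceil(n_train / batch_size)
print(steps_per_epoch)                  # 1875 steps in each epoch
print(steps_per_epoch * epochs)         # 9375 parameter updates in total
print(math.ceil(n_test / batch_size))   # 313 batches when evaluating the test set
```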

Use the `Model.fit` method to adjust your model parameters and minimize the loss (i.e. *train* the model). Training takes up to a minute, and you can observe how the loss and accuracy on the training dataset change over time:

In [62]:
model.fit(x_train, y_train, epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x2235a1b0670>


Congratulations! You just trained your first neural network model!


The `Model.evaluate` method checks the model's performance, usually on a "[Validation-set](https://developers.google.com/machine-learning/glossary#validation-set)" or "[Test-set](https://developers.google.com/machine-learning/glossary#test-set)".

For this, we let the model predict the labels of a dataset it was not trained on and check how accurate it was.

In [63]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 1s - loss: 0.2962 - accuracy: 0.8930 - 941ms/epoch - 3ms/step


[0.296237051486969, 0.8930000066757202]
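
The accuracy metric is the fraction of examples whose highest-scoring class equals the true label. A minimal sketch with made-up logits and labels (4 examples, 3 classes; none of these numbers come from the model above):

```python
import numpy as np

# Hypothetical logits for 4 examples and 3 classes (not real model output)
logits = np.array([[2.0, 0.1, -1.0],
                   [0.3, 0.2, 1.5],
                   [-0.5, 0.9, 0.0],
                   [1.2, 1.1, -0.3]])
labels = np.array([0, 2, 1, 1])   # hypothetical true class indices

predicted = logits.argmax(axis=1)         # most likely class per example
accuracy = np.mean(predicted == labels)   # fraction of correct predictions

print(predicted)   # [0 2 1 0]
print(accuracy)    # 0.75
```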

In the run shown above, the image classifier reaches about 89% accuracy on the test set (the original TensorFlow tutorial reports roughly 98% for the MNIST digits). To learn more, read the [TensorFlow tutorials](https://www.tensorflow.org/tutorials/).

If you want your model to return a probability, you can wrap the trained model, and attach the softmax to it:

In [64]:
probability_model = tf.keras.Sequential([
  model,
  tf.keras.layers.Softmax()
])

For example, let's predict the label of the first entry in the test dataset:

In [65]:
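import numpy as np

# Class probabilities for the first test image, shape (1, 10)
prediction = probability_model(x_test[:1])
print(prediction)

# The predicted label is the index of the highest probability
predicted_label = np.argmax(prediction)
print("True Label: {} \nPredicted Label: {} with probability: {:0.3f} %".format(
    y_test[0], predicted_label, prediction[0, predicted_label] * 100))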

tf.Tensor(
[[7.65593450e-06 8.93769879e-07 8.90037981e-08 1.15536466e-07
  2.51213550e-07 6.46695669e-04 1.27561805e-06 8.90392531e-03
  2.25380063e-04 9.90213811e-01]], shape=(1, 10), dtype=float32)
True Label: 9 
Predicted Label: 9 with probability: 99.021 %


## Conclusion

Congratulations! You have trained a machine learning model using a prebuilt dataset using the [Keras](https://www.tensorflow.org/guide/keras/overview) API.

For more examples of using Keras, check out the [tutorials](https://www.tensorflow.org/tutorials/keras/). To learn more about building models with Keras, read the [guides](https://www.tensorflow.org/guide/keras). If you want to learn more about loading and preparing data, see the tutorials on [image data loading](https://www.tensorflow.org/tutorials/load_data/images) or [CSV data loading](https://www.tensorflow.org/tutorials/load_data/csv).


*Adapted from: https://www.tensorflow.org/tutorials/quickstart/beginner*

## Exercise

Load the [Fashion-MNIST clothing dataset](https://www.tensorflow.org/datasets/catalog/fashion_mnist) and repeat the steps above to train a neural network model that can categorize different clothing items based on images. You may need to use a more complex network, or you can try [convolution layers](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D) instead of fully connected (Dense) layers, as they usually perform better on image-recognition tasks.
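
When planning a convolutional network, it helps to work out the layer shapes before building the model. The sketch below is plain arithmetic under assumed settings: a 28 x 28 grayscale input, one convolution with 3 filters of size 3 x 3 and no padding, followed by 2 x 2 max pooling. Both helper functions are hypothetical, and the filter count is only one possible choice. These settings reproduce the shapes and the 30 convolution parameters in the model summary printed earlier in this notebook.

```python
def conv_output_size(size: int, kernel: int, stride: int = 1) -> int:
    """Output width/height of a convolution without padding ('valid')."""
    return (size - kernel) // stride + 1

def conv2d_params(kernel: int, in_channels: int, filters: int) -> int:
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

side = conv_output_size(28, kernel=3)   # 28x28 image, 3x3 kernel -> 26
pooled = side // 2                      # 2x2 max pooling halves each side -> 13
filters = 3                             # assumed number of filters
flattened = pooled * pooled * filters   # values passed on to the Dense layers

print(side, pooled, flattened)                            # 26 13 507
print(conv2d_params(3, in_channels=1, filters=filters))   # 30
```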

In [67]:
clothing = ks.datasets.fashion_mnist