## Neural Networks
### Hand-in: Solve MNIST classification with basic tensorflow or pytorch

The goal of this exercise is to learn one of the current state-of-the-art frameworks for building and training neural networks: TensorFlow or PyTorch. Each has a quickstart notebook:
https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/tutorials/quickstart/beginner.ipynb
https://colab.research.google.com/github/omarsar/pytorch_notebooks/blob/master/pytorch_quick_start.ipynb

We start with MNIST, the dataset from the basic exercise, so that you can reproduce those results here. Then go carefully through the API documentation and tutorial of your chosen framework and put together an MLP that classifies MNIST.

For the hand-in, work through the tasks below and answer the questions that follow.

Tasks:
#### 1. Go through the API to understand what documentation and details you can find
#### 2. Learn how to set up and use a data loader.
#### 3. Figure out and reimplement how to build a basic MLP model.
#### 4. Understand and test how to set up the prerequisites of a model: data(loader), task & metrics.
#### 5. Understand and test what the respective loss function and optimisers are.
#### 6. Plot some training progress (e.g. plot the loss)
#### 7. Have a brief (!) experiment with different settings for the hyperparameters: batch_size, learning_rate, hidden_layer_sizes.

Questions:
1. What is the best accuracy that you found?
2. What are good values of batch_size, learning_rate and hidden_layer_sizes? Are they the same as in the basic (sklearn) training?
3. What optimiser options do you have, and which one did you choose (and why)?
4. What did you observe in the results for good and bad hyperparameters in comparison to the basic (sklearn) training?

In [5]:
from IPython.display import display
import tensorflow as tf
import numpy as np
print("TensorFlow version:", tf.__version__)

TensorFlow version: 2.19.0


#### Load MNIST directly from tensorflow or pytorch
Make sure to use training and test (and potentially validation) data properly!

In [14]:
batch_size = 32

# Tensorflow:
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
#(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
# Rescale the images from [0,255] to the [0.0,1.0] range and add a trailing channel axis: (N, 28, 28) -> (N, 28, 28, 1).
x_train, x_test = x_train[..., np.newaxis]/255.0, x_test[..., np.newaxis]/255.0

# Pytorch (needs `import torch, torchvision`):
# ToTensor converts the PIL images to float tensors in [0.0,1.0] so that the DataLoader can batch them.
#transform = torchvision.transforms.ToTensor()
#trainset = torchvision.datasets.MNIST(root='../data', train=True, download=True, transform=transform)
#trainset = torchvision.datasets.FashionMNIST(root='../data', train=True, download=True, transform=transform)
#testset = torchvision.datasets.MNIST(root='../data', train=False, download=True, transform=transform)
#testset = torchvision.datasets.FashionMNIST(root='../data', train=False, download=True, transform=transform)
#trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=2)
#testloader = torch.utils.data.DataLoader(testset, batch_size=1, shuffle=False, num_workers=2)
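
On the TensorFlow side, the counterpart of a PyTorch `DataLoader` is a `tf.data.Dataset` pipeline. The following is a minimal sketch, not the only way to do it: `make_dataset` is a hypothetical helper name, and the random arrays merely stand in for `x_train` and `y_train` so that the snippet runs without a download. It also holds out part of the training data for validation, so the test set stays untouched until the end.

```python
import numpy as np
import tensorflow as tf

def make_dataset(x, y, batch_size, shuffle):
    """Wrap a pair of arrays in a batched tf.data pipeline."""
    ds = tf.data.Dataset.from_tensor_slices((x, y))
    if shuffle:
        # A buffer as large as the dataset gives a full shuffle,
        # redone on every pass over the data.
        ds = ds.shuffle(buffer_size=len(x))
    return ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Stand-in data with MNIST's shape (assumption: swap in x_train, y_train).
rng = np.random.default_rng(0)
x_demo = rng.random((100, 28, 28, 1), dtype=np.float32)
y_demo = rng.integers(0, 10, size=100)

# Hold out the last 20 samples for validation.
x_tr, y_tr = x_demo[:80], y_demo[:80]
x_val, y_val = x_demo[80:], y_demo[80:]

train_ds = make_dataset(x_tr, y_tr, batch_size=32, shuffle=True)
val_ds = make_dataset(x_val, y_val, batch_size=32, shuffle=False)

# 80 samples with batch_size=32 give batches of 32, 32 and 16.
num_train_batches = sum(1 for _ in train_ds)
first_x, first_y = next(iter(val_ds))
```

A dataset built this way can be passed straight to Keras, e.g. `model.fit(train_ds, validation_data=val_ds, epochs=5)`; the batch size is then set by the pipeline rather than by `fit`.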

#### Preprocessing

#### Build the model

In [15]:
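model = tf.keras.models.Sequential([
  tf.keras.Input(shape=(28, 28, 1)),              # matches the channel axis added when loading the data
  tf.keras.layers.Flatten(),                      # 28*28*1 -> 784 features
  tf.keras.layers.Dense(128, activation='relu'),  # single hidden layer
  tf.keras.layers.Dropout(0.2),                   # only active during training
  tf.keras.layers.Dense(10)                       # raw logits, one per digit class
])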

#### Train the model

In [16]:
predictions = model(x_train[:1]).numpy()
predictions

array([[-0.09475642,  0.39204934, -0.2996678 ,  0.21791562, -0.5918864 ,
         0.5064708 ,  0.49885872, -0.07770862,  0.0036536 ,  0.39944556]],
      dtype=float32)

In [17]:
tf.nn.softmax(predictions).numpy()

array([[0.07805271, 0.12700039, 0.06359107, 0.10670378, 0.04747743,
        0.14239597, 0.14131616, 0.07939475, 0.08612455, 0.1279432 ]],
      dtype=float32)

In [18]:
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
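
To make the role of `from_logits=True` concrete, here is a minimal pure-Python sketch of what the loss computes mathematically: a softmax over the logits, followed by the negative log-probability of the true class. Keras uses its own numerically stable implementation, so this is an illustration rather than its actual code. The logits are the ones printed above for the first training image; the label `5` is an assumption made for illustration, and `softmax` and `sparse_cross_entropy` are hypothetical helper names.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_cross_entropy(logits, label):
    """Negative log-probability assigned to the integer class `label`."""
    return -math.log(softmax(logits)[label])

# Logits of the untrained model for the first training image (output above).
logits = [-0.09475642, 0.39204934, -0.2996678, 0.21791562, -0.5918864,
          0.5064708, 0.49885872, -0.07770862, 0.0036536, 0.39944556]

probs = softmax(logits)
loss = sparse_cross_entropy(logits, label=5)  # label 5 is assumed for illustration

# A model that guesses uniformly over 10 classes scores -log(1/10).
uniform_loss = -math.log(1 / 10)
```

The values in `probs` reproduce the `tf.nn.softmax` output shown above, and `uniform_loss` is roughly 2.30, which is the loss to expect from an untrained model before the first optimiser step.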

In [19]:
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

#### Show prediction and visualise the loss over training steps

In [20]:
model.fit(x_train, y_train, batch_size=batch_size, epochs=5)

Epoch 1/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 871us/step - accuracy: 0.8545 - loss: 0.4866
Epoch 2/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 827us/step - accuracy: 0.9544 - loss: 0.1543
Epoch 3/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 833us/step - accuracy: 0.9656 - loss: 0.1106
Epoch 4/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 852us/step - accuracy: 0.9731 - loss: 0.0887
Epoch 5/5
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 2s 889us/step - accuracy: 0.9772 - loss: 0.0733


<keras.src.callbacks.history.History at 0x32c666a50>
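
The `History` object returned by `model.fit` stores the per-epoch metrics in its `history` attribute, a dict of lists, which is what the loss plot needs. The sketch below assumes matplotlib is installed; `plot_training_curves` is a hypothetical helper name, and the numbers are copied from the training log above so that the snippet runs on its own. In a live session, assign `history = model.fit(...)` and pass `history.history` instead.

```python
import matplotlib.pyplot as plt

# Per-epoch values copied from the training log above.
logged = {
    "loss": [0.4866, 0.1543, 0.1106, 0.0887, 0.0733],
    "accuracy": [0.8545, 0.9544, 0.9656, 0.9731, 0.9772],
}

def plot_training_curves(history_dict):
    """Plot loss and accuracy per epoch side by side and return the figure."""
    epochs = range(1, len(history_dict["loss"]) + 1)
    fig, (ax_loss, ax_acc) = plt.subplots(1, 2, figsize=(9, 3.5))

    ax_loss.plot(epochs, history_dict["loss"], marker="o")
    ax_loss.set_xlabel("epoch")
    ax_loss.set_ylabel("training loss")

    ax_acc.plot(epochs, history_dict["accuracy"], marker="o")
    ax_acc.set_xlabel("epoch")
    ax_acc.set_ylabel("training accuracy")

    fig.tight_layout()
    return fig

fig = plot_training_curves(logged)
```

If validation data is passed to `fit`, the dict also contains `val_loss` and `val_accuracy`; plotting those on the same axes makes overfitting visible as a gap between the training and validation curves.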

In [21]:
model.evaluate(x_test,  y_test, verbose=2)

313/313 - 0s - 535us/step - accuracy: 0.9757 - loss: 0.0766


[0.07660997658967972, 0.9757000207901001]
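
For the brief hyperparameter experiment, it helps to wrap model construction in a function so that `hidden_layer_sizes` and `learning_rate` become arguments. The following is one possible sketch under stated assumptions: `build_mlp` is a hypothetical helper, dropout is applied after every hidden layer (a design choice here; the model above has only one), and random stand-in data with a single epoch keeps the snippet fast. For real results, replace the stand-in arrays with `x_train` and `y_train` and train for several epochs.

```python
import numpy as np
import tensorflow as tf

def build_mlp(hidden_layer_sizes=(128,), learning_rate=1e-3, dropout=0.2):
    """Build and compile an MLP for 28x28x1 images with 10 output logits."""
    layers = [tf.keras.Input(shape=(28, 28, 1)), tf.keras.layers.Flatten()]
    for units in hidden_layer_sizes:
        layers.append(tf.keras.layers.Dense(units, activation="relu"))
        layers.append(tf.keras.layers.Dropout(dropout))
    layers.append(tf.keras.layers.Dense(10))  # raw logits
    model = tf.keras.Sequential(layers)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=learning_rate),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model

# Random stand-in data (assumption: swap in the real MNIST arrays).
rng = np.random.default_rng(0)
x_demo = rng.random((256, 28, 28, 1), dtype=np.float32)
y_demo = rng.integers(0, 10, size=256)

# Small grid over two hyperparameters; compare on held-out validation data.
results = {}
for hidden in [(128,), (64, 32)]:
    for lr in [1e-2, 1e-3]:
        model = build_mlp(hidden_layer_sizes=hidden, learning_rate=lr)
        history = model.fit(x_demo, y_demo, batch_size=32, epochs=1,
                            validation_split=0.25, verbose=0)
        results[(hidden, lr)] = history.history["val_accuracy"][-1]

for setting, val_acc in results.items():
    print(setting, round(val_acc, 4))
```

Comparing settings on the validation split rather than on the test set keeps the final test accuracy an honest estimate. Swapping `Adam` for another optimiser such as `tf.keras.optimizers.SGD(learning_rate=..., momentum=0.9)` only requires changing the `optimizer` argument.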