# Week 9 Exercise 1: NN Classification of MNIST

## MNIST Data
We return to the MNIST data set of handwritten digits to compare non-linear classification algorithms ...

### Task
* Build a "fully connected" neural network for MNIST
  * Hint: use the code "[Example: MLP for MNIST](https://colab.research.google.com/github/keuperj/DataScience23/blob/main/week_9/keras_intro.ipynb)" from the tutorial to get started.
  * Why do we need to use "flatten"?
* Evaluate different network layouts (number of layers, number of neurons per layer)
* Evaluate different hyper-parameter settings (e.g. learning rate)

Keras API: https://keras.io/api/

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow import keras

In [None]:
#get data
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
# Build a simple model
inputs = keras.Input(shape=(28, 28))
# keras.layers.Rescaling replaces the deprecated experimental.preprocessing path (TensorFlow 2.6+)
l1 = keras.layers.Rescaling(1.0 / 255)(inputs)
l2 = keras.layers.Flatten()(l1)
# NOTE: in Keras, "fully connected" (matrix multiplication) layers are called "dense layers"
l3 = keras.layers.Dense(128, activation="tanh")(l2)
l4 = keras.layers.Dense(128, activation="tanh")(l3)
outputs = keras.layers.Dense(10, activation="softmax")(l4)
model = keras.Model(inputs, outputs)
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 28, 28)]          0         
                                                                 
 rescaling (Rescaling)       (None, 28, 28)            0         
                                                                 
 flatten (Flatten)           (None, 784)               0         
                                                                 
 dense (Dense)               (None, 128)               100480    
                                                                 
 dense_1 (Dense)             (None, 128)               16512     
                                                                 
 dense_2 (Dense)             (None, 10)                1290      
                                                                 
Total params: 118,282
Trainable params: 118,282
Non-trainable
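The parameter counts in the summary can be checked by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. A minimal sketch in plain Python (the helper name `dense_param_count` is made up for illustration):

```python
def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per unit (n_out)."""
    return n_in * n_out + n_out

# Layer widths of the model above: 784 flattened pixels -> 128 -> 128 -> 10
widths = [28 * 28, 128, 128, 10]
per_layer = [dense_param_count(a, b) for a, b in zip(widths, widths[1:])]
total = sum(per_layer)

print(per_layer, total)  # [100480, 16512, 1290] 118282
```

The three values match the `Param #` column, and their sum matches `Total params`. The `Rescaling` and `Flatten` layers contribute nothing because they have no weights.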

In [None]:
# Compile the model
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

In [None]:
# Train the model for 1 epoch on NumPy data
batch_size = 64
print("Fit on NumPy data")
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=1)

Fit on NumPy data


In [None]:
print(history.history)

{'loss': [0.2857133746147156]}
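To read this number: sparse categorical cross-entropy is the mean of `-log p(true class)`, and the value Keras reports for an epoch is an average over the training batches. The following plain-Python sketch shows the computation on made-up probabilities; the function and the toy numbers are illustrative assumptions, not Keras internals.

```python
import math

def sparse_categorical_crossentropy(probs, labels):
    """Mean negative log-probability assigned to the true class.

    probs:  list of per-sample probability lists (each sums to 1)
    labels: list of integer class indices (the "sparse" label format)
    """
    losses = [-math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

# Toy example: two samples, two classes (made-up numbers)
toy_probs = [[0.5, 0.5], [0.25, 0.75]]
toy_labels = [0, 1]
toy_loss = sparse_categorical_crossentropy(toy_probs, toy_labels)

# Reading a loss value: exp(-loss) is the geometric mean of the
# probabilities the model assigned to the correct classes.
typical_prob = math.exp(-0.2857133746147156)
print(toy_loss, typical_prob)
```

For the loss recorded above, `exp(-loss)` is about 0.75, so after one epoch the model puts roughly three quarters of its probability mass on the correct digit, in the geometric-mean sense.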


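Why do we need to use "flatten"?<br>
Each MNIST image is a 28 × 28 array, but a fully connected (dense) layer expects one feature vector per sample. The `Flatten` layer reshapes each image into a 784-element vector (28 · 28 = 784) without changing any values, which is the `(None, 784)` output shape in the model summary. The same layer is commonly used at the transition from convolutional layers to fully connected layers.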
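What flattening does to the data can be reproduced with a NumPy reshape. This is a sketch on random stand-in data, not the Keras implementation:

```python
import numpy as np

# Stand-in for a batch of 3 MNIST-sized images (random values, not real data)
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(3, 28, 28), dtype=np.uint8)

# Flatten every image but keep the batch axis: (3, 28, 28) -> (3, 784)
flat = batch.reshape(batch.shape[0], -1)

print(batch.shape, "->", flat.shape)

# Row-major order: pixel (row r, column c) ends up at index r * 28 + c
assert flat[0, 5 * 28 + 7] == batch[0, 5, 7]
```

The pixel values are unchanged; only the 2-D neighbourhood structure is dropped, which is why a dense network treats the image as an unordered list of 784 features.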

In [None]:
# Build a simple model
inputs = keras.Input(shape=(28, 28))
l1 = keras.layers.Rescaling(1.0 / 255)(inputs)
l2 = keras.layers.Flatten()(l1)
# NOTE: in Keras, "fully connected" (matrix multiplication) layers are called "dense layers"
l3 = keras.layers.Dense(128, activation="tanh")(l2) # layer 1
l4 = keras.layers.Dense(64, activation="tanh")(l3) # layer 2
# Layer 3 must take l4 as its input. Feeding it l3 instead bypasses the 64-unit layer,
# which is the 128 -> 32 -> 10 layout shown in the recorded summary below.
l5 = keras.layers.Dense(32, activation="tanh")(l4) # layer 3
outputs = keras.layers.Dense(10, activation="softmax")(l5)
model = keras.Model(inputs, outputs)
model.summary()

# Compile the model
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Train the model for 1 epoch on NumPy data
batch_size = 64
print("Fit on NumPy data")
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=1)

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 28, 28)]          0         
                                                                 
 rescaling_1 (Rescaling)     (None, 28, 28)            0         
                                                                 
 flatten_1 (Flatten)         (None, 784)               0         
                                                                 
 dense_3 (Dense)             (None, 128)               100480    
                                                                 
 dense_5 (Dense)             (None, 32)                4128      
                                                                 
 dense_6 (Dense)             (None, 10)                330       
                                                                 
Total params: 104,938
Trainable params: 104,938
Non-trainab

In [None]:
print(history.history)

{'loss': [0.33871346712112427]}


In [None]:
# Build a simple model
inputs = keras.Input(shape=(28, 28))
l1 = keras.layers.Rescaling(1.0 / 255)(inputs)
l2 = keras.layers.Flatten()(l1)
# NOTE: in Keras, "fully connected" (matrix multiplication) layers are called "dense layers"
l3 = keras.layers.Dense(128, activation="tanh")(l2) # layer 1
l4 = keras.layers.Dense(64, activation="tanh")(l3) # layer 2
# Layer 3 must take l4 as its input. Feeding it l3 instead bypasses the 64-unit layer,
# which is the 128 -> 32 -> 10 layout shown in the recorded summary below.
l5 = keras.layers.Dense(32, activation="tanh")(l4) # layer 3
outputs = keras.layers.Dense(10, activation="softmax")(l5)
model = keras.Model(inputs, outputs)
model.summary()

# Compile the model
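# learning_rate=1e-3 is the Adam default in Keras, so this matches optimizer="adam" above;
# change the value (e.g. 1e-2 or 1e-4) to evaluate other learning rates
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3), loss="sparse_categorical_crossentropy")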

# Train the model for 1 epoch on NumPy data
batch_size = 64
print("Fit on NumPy data")
history = model.fit(x_train, y_train, batch_size=batch_size, epochs=1)

Model: "model_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_3 (InputLayer)        [(None, 28, 28)]          0         
                                                                 
 rescaling_2 (Rescaling)     (None, 28, 28)            0         
                                                                 
 flatten_2 (Flatten)         (None, 784)               0         
                                                                 
 dense_7 (Dense)             (None, 128)               100480    
                                                                 
 dense_9 (Dense)             (None, 32)                4128      
                                                                 
 dense_10 (Dense)            (None, 10)                330       
                                                                 
Total params: 104,938
Trainable params: 104,938
Non-trainab

In [None]:
print(history.history)

{'loss': [0.3222421407699585]}
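To compare layouts and learning rates more systematically, the steps above can be wrapped in a helper and run over a small grid. The following is a sketch, not part of the original notebook: `build_mlp`, `run_grid`, the grid values, and the use of test-set accuracy are assumptions, and `keras.layers.Rescaling` assumes TensorFlow 2.6 or newer.

```python
from itertools import product

def build_mlp(hidden_units, learning_rate):
    """Fully connected MNIST classifier with one Dense layer per entry in hidden_units."""
    # Imported here so the grid below can be built without loading TensorFlow
    from tensorflow import keras

    inputs = keras.Input(shape=(28, 28))
    x = keras.layers.Rescaling(1.0 / 255)(inputs)
    x = keras.layers.Flatten()(x)
    for units in hidden_units:
        # Each layer consumes the output of the previous one
        x = keras.layers.Dense(units, activation="tanh")(x)
    outputs = keras.layers.Dense(10, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model

def run_grid(grid, x_train, y_train, x_test, y_test, epochs=1, batch_size=64):
    """Train one model per (layout, learning rate) pair; return test loss and accuracy."""
    results = {}
    for hidden_units, learning_rate in grid:
        model = build_mlp(hidden_units, learning_rate)
        model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, verbose=0)
        test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
        results[(hidden_units, learning_rate)] = (test_loss, test_acc)
    return results

# Illustrative grid: 3 layouts x 3 learning rates = 9 runs
layouts = [(128,), (128, 128), (128, 64, 32)]
learning_rates = [1e-2, 1e-3, 1e-4]
grid = list(product(layouts, learning_rates))

# With the data loaded as in the cells above:
# results = run_grid(grid, x_train, y_train, x_test, y_test)
```

Building the hidden layers in a loop avoids wiring mistakes between hand-numbered layer variables. For actual model selection, a validation split of the training data (for example `validation_split=0.1` in `model.fit`) is the cleaner choice, keeping the test set for the final estimate.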
