# ANN Intro

This notebook is designed so that it also runs in reasonable time on laptop CPUs (e.g. an i3).

## Local Setup

There are several ways to set up the TensorFlow library (which now includes [Keras](https://keras.io) as its high-level API) on your own computer. The simplest uses only the CPU and can be installed with a single command via [`conda`](https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/). In an Anaconda shell, run:

```
conda install tensorflow
```

**NOTE**: TF might not be compatible with your current environment, so here we create a [new environment](https://conda.io/docs/user-guide/tasks/manage-environments.html#creating-an-environment-with-commands) first:

```
conda create -n tf tensorflow
conda activate tf
```

In the new environment you need to install jupyter, scikit-learn, matplotlib, numpy and pandas again, e.g. with:

```
conda install jupyter scikit-learn matplotlib numpy pandas
```
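
To confirm that the environment is complete before working through the notebook, a small check like the one below can help. It is only a sketch: the helper name `find_missing` is hypothetical, and the list of import names (note that scikit-learn is imported as `sklearn`) is an assumption based on the packages mentioned above.

```python
import importlib.util


def find_missing(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names of the packages installed above (assumed list).
required = ["tensorflow", "sklearn", "matplotlib", "numpy", "pandas"]
missing = find_missing(required)

if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```

Run it with the Python interpreter of the activated `tf` environment, since the check only sees packages installed there.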

(If you have a [supported Nvidia graphics card](https://developer.nvidia.com/cuda-gpus) in your machine and would like to use it to accelerate network training, follow [this guide](https://www.tensorflow.org/install/gpu) to install the required packages and then use the `tensorflow-gpu` library.)

The usage of the TensorFlow library in Python will be the same for CPU and GPU.

In [None]:
## fetch MNIST dataset (as done in last NB)


In [None]:
## Scale the input data into the range [0, 1]
## use sklearn's train_test_split to split the data into 
## 50000 instances for training (X_train, y_train), 10000 for validation (X_val, y_val) and 10000 for testing (X_test, y_test)


In [None]:
## load an MLP classifier from sklearn with all its defaults, only specifying `random_state=42`

## try printing out the sizes of the hidden layers, the number of layers and the number of output neurons/units

## train the MLP with the train set, time its execution

## try again printing out the sizes of the hidden layers, the number of layers and the number of output neurons/units


In [None]:
## print the scores of the trained MLP on the train and on the test set:
print("Training set score: %f" % )
print("Test set score: %f" % )

### Questions 1

1. What are the default values assumed for the MLPClassifier of sklearn?
2. What MLP is constructed with the defaults? \
   I.e. how many hidden layers and how many input, hidden and output neurons/units does the MLP have?

### Answers

1. 
2. 

In [None]:
## Now construct another MLP classifier as above but with 2 hidden layers of 100 and 50 neurons/units.
## In addition it should use mini-batch gradient descent (mBGD) with a mini-batch size of 100
## and train only for 100 epochs.
## Read the docs carefully to figure out what you need to specify!

## try printing out the sizes of the hidden layers, the number of layers and the number of output neurons/units

## train the MLP with the train set, time its execution

## try again printing out the sizes of the hidden layers, the number of layers and the number of output neurons/units


In [None]:
## print the scores of the trained MLP on the train and on the test set:
print("Training set score: %f" % )
print("Test set score: %f" % )

In [None]:
# adapted from: https://stackoverflow.com/questions/59078110/way-to-count-the-number-of-parameters-in-a-scikit-learn-model
def n_params(model):
    """Return the total number of parameters in a fitted
    Scikit-Learn MLP.

    This relies on the `coefs_` and `intercepts_` attributes,
    so it works for the following model types:
     - sklearn.neural_network.MLPClassifier
     - sklearn.neural_network.MLPRegressor
    """
    # weights of all layers plus the bias terms of all layers
    return (sum(a.size for a in model.coefs_) +
            sum(a.size for a in model.intercepts_))

## use the given function to get the number of model parameters of the last MLP


### Questions 2

1. Does the returned number of parameters match your expectations? Write down your own calculation!


### Answers

1. 

In [None]:
## Now use the example from: https://scikit-learn.org/stable/auto_examples/neural_networks/plot_mnist_filters.html
## to plot ALL weight matrices of the first layer of the MLP trained above
## using subplots with 20 columns.


Now it's time to test your TensorFlow installation by importing the package. The following code cell should execute without errors:

In [None]:
import tensorflow as tf

Now let's check which computing devices TensorFlow has found on this machine. If you have not set up GPU support on your computer, the list should contain just one CPU: `/device:CPU:0`

In [None]:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

The cell below creates an MLP similar to the one above using tf.keras. See also this [tutorial network](https://github.com/keras-team/keras/blob/fcc0bfa354c5a47625d681d0297a66ef9ff43a9e/examples/mnist_mlp.py), which also uses the MNIST dataset.

Keras has a nice method `model.summary()` that prints a tabular overview of your network architecture, together with the input/output dimensions and number of parameters for each layer.
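
The per-layer parameter counts in such a summary follow from a simple rule for fully connected layers: one weight per input-output pair plus one bias per output unit. The sketch below illustrates the rule on a deliberately tiny, hypothetical network (4 inputs, 3 hidden units, 2 outputs), not on the MNIST model; the helper `dense_param_counts` is an illustrative name, not part of Keras.

```python
def dense_param_counts(layer_sizes):
    """Return the parameter count of each fully connected layer.

    layer_sizes lists the number of units per layer, starting with the
    input layer, e.g. [4, 3, 2] for 4 inputs, 3 hidden units, 2 outputs.
    """
    counts = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # n_in * n_out weights plus one bias per output unit
        counts.append(n_in * n_out + n_out)
    return counts


toy_sizes = [4, 3, 2]  # hypothetical toy network, not the MNIST model
per_layer = dense_param_counts(toy_sizes)
print(per_layer)       # [15, 8]
print(sum(per_layer))  # 23
```

The same rule can be used to check the numbers Keras prints for the network below.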

In [None]:
from tensorflow import keras

print(tf.__version__)
print(keras.__version__)

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import RMSprop

batch_size = 100
num_classes = 10
epochs = 100

print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

## convert class vectors to binary class matrices
y_train_c = keras.utils.to_categorical(y_train, num_classes)
y_test_c = keras.utils.to_categorical(y_test, num_classes)

model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(784,)))
model.add(Dense( 50, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

history = model.fit(X_train, y_train_c,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    validation_data=(X_test, y_test_c))

In [None]:
score = model.evaluate(X_test, y_test_c, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
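
To make the reported accuracy concrete: the labels were one-hot encoded with `to_categorical`, the network outputs one probability per class, and a sample counts as correct when the most probable class matches the label. The following is a minimal pure-Python sketch of that idea on hypothetical toy data (4 samples, 3 classes). It is not how Keras computes the metric internally, and the helper names are made up for illustration.

```python
def one_hot(label, num_classes):
    """Encode an integer class label as a list with a single 1."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def argmax(values):
    """Return the index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(targets, probabilities):
    """Fraction of samples whose most probable class equals the target class."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(targets, probabilities))
    return hits / len(targets)


# Hypothetical toy data: 4 samples, 3 classes (not MNIST).
labels = [0, 2, 1, 2]
targets = [one_hot(label, 3) for label in labels]
probabilities = [
    [0.7, 0.2, 0.1],  # predicts class 0, correct
    [0.1, 0.3, 0.6],  # predicts class 2, correct
    [0.5, 0.4, 0.1],  # predicts class 0, wrong
    [0.2, 0.2, 0.6],  # predicts class 2, correct
]
print(accuracy(targets, probabilities))  # 0.75
```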

You should see the training progress through the epochs; at the end, the trained network is evaluated on the test set.
It should reach a classification accuracy of at least 97%.
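
When reading the training log, it helps to know how many weight updates one epoch contains. The sketch below works this out from the values used in this notebook (50000 training instances, `batch_size = 100`, `epochs = 100`). It assumes that a final, smaller batch would also count as one step and that training is not stopped early; the helper `steps_per_epoch` is an illustrative name.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


n_train = 50000   # training instances, as in the split above
batch_size = 100
epochs = 100

steps = steps_per_epoch(n_train, batch_size)
print(steps)           # 500
print(steps * epochs)  # 50000
```

Changing the batch size therefore changes both the number of updates per epoch and the time each epoch takes, which is worth keeping in mind when tuning.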

### Exercise

Now try to tune the hyper-parameters of the MLP to achieve more than 98% accuracy.\
List the parameters you changed to achieve this score.