# Task 1

Start by reading the section “Implementing MLPs with Keras” from Chapter 10 of Géron’s textbook (pages 292-325).
Then install `TensorFlow 2.0+` and experiment with the code included in this section.
Additionally, study the official documentation (https://keras.io/) and get an idea of the numerous options offered by Keras (layers, loss functions, metrics, optimizers, activations, initializers, regularizers).
Don’t get overwhelmed with the number of options – you will frequently return to this site in the coming months.

### Imports

In [None]:
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import fashion_mnist
from tensorflow.keras.models import Sequential

---
## Part 1

Check out the [official repository](https://github.com/keras-team/keras/tree/tf-keras-2/examples) with many example Keras implementations of various kinds of deep neural networks.
We recommend cloning this repository and trying to get some of these examples running on your system (or Colab/DeepNote).
In particular, experiment with the `mnist_mlp.py` and `mnist_cnn.py` scripts, which show you how to build simple neural networks for the MNIST dataset (useful for the next task).

*insert findings*

---

## Part 2

Next, take the two well-known datasets: Fashion MNIST (introduced in _Ch. 10, p. 295_) and CIFAR-10.
Fashion MNIST contains 28x28 grayscale images split into 10 categories, with 60,000 images for training and 10,000 for testing. CIFAR-10 contains 32x32x3 RGB images (50,000/10,000 train/test).
Apply two reference networks to the Fashion MNIST dataset: an MLP described in detail in _Ch. 10, pp. 297-307_ and a CNN described in _Ch. 14, p. 447_.
Experiment with both networks, trying various options: initializations, activations, optimizers (and their hyperparameters), regularizations (L1, L2, Dropout, no Dropout).
You may also experiment with changing the architecture of both networks: adding/removing layers, number of convolutional filters, their sizes, etc.

After you have found the best performing hyperparameter sets, take the 3 best ones and train new models on the CIFAR-10 dataset to see whether your performance gains translate to a different dataset.
Provide your thoughts on these results in the report.
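
One way to organise the experiments described above is to enumerate the option combinations up front and rank the results afterwards. The following is a minimal sketch using only the standard library. The search space, the `expand_grid` and `top_k` helpers, and the idea of ranking by validation accuracy are illustrative assumptions, not part of the assignment.

```python
from itertools import product

# Hypothetical search space: the option names and values are illustrative
# assumptions, not the settings used elsewhere in this notebook.
SEARCH_SPACE = {
    "initializer": ["glorot_uniform", "he_normal"],
    "activation": ["relu", "tanh"],
    "optimizer": ["sgd", "adam"],
    "dropout_rate": [0.0, 0.2],
}


def expand_grid(space):
    """Return one dict per combination of the listed option values."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*space.values())]


def top_k(results, k=3):
    """Return the k (config, score) pairs with the highest score."""
    return sorted(results, key=lambda pair: pair[1], reverse=True)[:k]


configs = expand_grid(SEARCH_SPACE)
print(len(configs))  # 2 * 2 * 2 * 2 = 16 combinations
```

Each dict in `configs` can then be passed to a model-building function, and `top_k` picks the three configurations to carry over to CIFAR-10 once every run has a score.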

First we create an MLP model for the Fashion MNIST dataset.
We use the same model as in the book, but add a dropout layer after the first dense layer.
We train with the Adam optimizer at a learning rate of 0.001, for 5 epochs with a batch size of 32.
For the CIFAR-10 dataset we reuse this model with the input shape changed to (32, 32, 3), and train for 20 epochs with a batch size of 64 and a learning rate of 0.0001.

### Fashion MNIST

load dataset

In [None]:
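# Images arrive as uint8 arrays of shape (N, 28, 28); labels are integers 0-9
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255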

create model

In [None]:
model = Sequential()
model.add(layers.Flatten(input_shape=(28, 28)))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dropout(0.2))
model.add(layers.Dense(10, activation='softmax'))
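
To get a feel for the size of this model, the number of trainable parameters can be worked out by hand. The sketch below does the arithmetic in plain Python for the layer sizes used above (512 hidden units, 10 outputs). The helper names `dense_params` and `mlp_params` are hypothetical. `Flatten` and `Dropout` contribute no trainable parameters.

```python
def dense_params(n_inputs, n_units):
    """Weights plus one bias per unit for a fully connected layer."""
    return n_inputs * n_units + n_units


def mlp_params(input_shape, hidden_units=512, n_classes=10):
    """Trainable parameters of Flatten -> Dense -> Dropout -> Dense."""
    n_inputs = 1
    for dim in input_shape:
        n_inputs *= dim
    hidden = dense_params(n_inputs, hidden_units)
    output = dense_params(hidden_units, n_classes)
    return hidden + output


fashion_total = mlp_params((28, 28))    # 784 inputs  -> 407,050 parameters
cifar_total = mlp_params((32, 32, 3))   # 3072 inputs -> 1,578,506 parameters
print(fashion_total, cifar_total)
```

The same architecture is almost four times larger on CIFAR-10, purely because of the bigger flattened input. The Fashion MNIST figure can be checked against the output of `model.summary()`.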

compile & train model

In [None]:
model.compile(
    optimizer = keras.optimizers.Adam(learning_rate=0.001),
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)

model.fit(x_train, y_train, epochs=5, batch_size=32)
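
The step count Keras prints per epoch follows directly from the dataset size and the batch size. A small sketch of that arithmetic, assuming the full training sets are used with no validation split (the helper name `steps_per_epoch` is hypothetical):

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the data.

    The last batch may be smaller than batch_size, hence the ceiling.
    """
    return math.ceil(n_samples / batch_size)


fashion_steps = steps_per_epoch(60_000, 32)  # Fashion MNIST, batch size 32
cifar_steps = steps_per_epoch(50_000, 64)    # CIFAR-10, batch size 64
print(fashion_steps, cifar_steps)  # 1875 782
```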

evaluate model on test set

In [None]:
model.evaluate(x_test, y_test)

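We get a test loss of about 0.34, which is not too bad for a single-hidden-layer MLP. `model.evaluate` also returns the test accuracy, which is the easier of the two numbers to interpret.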
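
To put a loss of about 0.34 in context, it helps to recall what sparse categorical cross-entropy measures: for each sample, the negative log of the probability the model assigns to the true class, averaged over the test set. The sketch below is a plain-Python illustration, not the Keras implementation. The three-class probability vector is made up for the example.

```python
import math


def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: negative log of the true-class probability."""
    return -math.log(probs[label])


# Made-up softmax output for a 3-class problem (illustrative only).
probs = [0.1, 0.7, 0.2]
loss = sparse_categorical_crossentropy(probs, label=1)

# A mean loss of 0.34 corresponds to a geometric-mean probability of
# exp(-0.34) assigned to the correct class.
equivalent_prob = math.exp(-0.34)
print(round(loss, 3), round(equivalent_prob, 3))  # 0.357 0.712
```

In other words, a mean loss of 0.34 means the model puts roughly 71% probability on the correct class in the geometric-mean sense. This is not the same quantity as accuracy.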