# Keras Implementation of FF Network for MNIST

This notebook serves as a quick introduction to `tf.keras`. It implements the neural net we built in NumPy in a few lines, and invites you to explore the Keras API and train more complex nets on MNIST.

In [None]:
import numpy as np
import matplotlib.pyplot as plt

from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, Dropout, Activation, Input
from tensorflow.keras.utils import to_categorical

from sklearn.model_selection import train_test_split
from sklearn.metrics import log_loss

%matplotlib inline

## 1. Get the MNIST Dataset (from `keras.datasets`)

In [None]:
(X_train, y_train), (X_test, y_test) = mnist.load_data()

In [None]:
RESHAPED = 784
NB_CLASSES = 10

X_train = X_train.reshape(60000, RESHAPED)
X_test = X_test.reshape(10000, RESHAPED)
X_train = X_train.astype("float32")
X_test = X_test.astype("float32")

# Normalize
X_train /= 255
X_test /= 255
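As a rough illustration of what the reshaping and normalization cell does, here is a minimal NumPy sketch on synthetic data. The array `fake_images` is a hypothetical stand-in for the MNIST images, not part of the dataset.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 random 28x28 uint8 "images".
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Flatten each image into a 784-vector; -1 lets NumPy infer the row count.
flat = fake_images.reshape(-1, 28 * 28).astype("float32")

# Scale pixel values from [0, 255] to [0, 1].
flat /= 255

print(flat.shape, flat.dtype)
```

Passing `-1` as the first dimension avoids hard-coding the number of samples, which can be convenient if you later work with a subset of the data.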

In [None]:
# Convert class vectors to categoricals
y_train = to_categorical(y_train, NB_CLASSES)
y_test = to_categorical(y_test, NB_CLASSES)
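For intuition, one-hot encoding can be sketched in plain NumPy. This is only an illustration of the idea behind `to_categorical`, not its actual implementation; the `labels` array is made up.

```python
import numpy as np

labels = np.array([3, 0, 9])  # hypothetical integer class labels
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[labels]

# argmax along the class axis recovers the original integer labels.
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(recovered)
```

The `argmax` round trip is the same trick used later in the notebook to turn one-hot targets and predicted probabilities back into class indices.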

In [None]:
# finally let's create a validation set
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=42
)

## 2. Neural Net Architecture (Sequential API)

Keras has two neural net APIs: the `Sequential` API and the `Functional` API. With the Sequential API, we add layers one by one from input to output, and there are no branches. The Functional API is more flexible: it simply requires the user to specify the input and output tensors and how they are connected, so the network can branch.

We'll look at the Sequential API first.

In [None]:
model = Sequential()
model.add(Dense(NB_CLASSES, input_shape=(RESHAPED,), activation="softmax"))
model.summary()

model.compile(
    loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"]
)
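To connect this model back to the NumPy version, here is a minimal sketch of the forward pass that a single softmax `Dense` layer computes. The weights `W` and `b` below are hypothetical random values; Keras uses its own initializers.

```python
import numpy as np

n_features, n_classes = 784, 10
rng = np.random.default_rng(0)

# Hypothetical parameters (Keras picks its own initial values).
W = rng.normal(scale=0.01, size=(n_features, n_classes))
b = np.zeros(n_classes)


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


x = rng.random((5, n_features))  # 5 fake flattened images
probs = softmax(x @ W + b)       # shape (5, 10), each row sums to 1

n_params = W.size + b.size       # 784 * 10 + 10
print(probs.shape, n_params)
```

The parameter count, 784 × 10 weights plus 10 biases, should match the number of trainable parameters reported by `model.summary()`.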

## 3. Neural Net Architecture (Functional API)

In [None]:
inputs = Input(shape=(RESHAPED,))
predictions = Dense(NB_CLASSES, activation="softmax")(inputs)
functional_model = Model(inputs=inputs, outputs=predictions)

functional_model.summary()

functional_model.compile(
    loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"]
)

## 4. Training

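Choose one of the two APIs above and train your model. The cells below use the Sequential `model`; substitute `functional_model` if you prefer the Functional version.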

In [None]:
BATCH_SIZE = 128
NB_EPOCH = 20

In [None]:
history = model.fit(
    X_train,
    y_train,
    validation_data=(X_val, y_val),
    batch_size=BATCH_SIZE,
    epochs=NB_EPOCH,
)
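It can help to know how many gradient updates this training run performs. The sketch below assumes the 80/20 split of the 60,000 training images made earlier, which leaves 48,000 samples for training.

```python
import math

n_train = 48000   # assumes 80% of the 60,000 MNIST training images
batch_size = 128
epochs = 20

# One update per mini-batch; the last batch may be smaller, hence ceil.
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, total_updates)
```

Under these assumptions each epoch consists of 375 mini-batches, so 20 epochs amount to 7,500 parameter updates.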

In [None]:
plt.plot(history.history["loss"], label="Training Loss")
# TF 2.x stores this metric under "val_accuracy" (older releases used "val_acc")
plt.plot(history.history["val_accuracy"], label="Validation Accuracy")
plt.legend(frameon=False)

In [None]:
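def get_accuracy(keras_model, X, y):
    # Use the model passed in, not the global `model`
    predictions = keras_model.predict(X).argmax(axis=1)
    # y is one-hot encoded, so argmax recovers the integer labels
    return (predictions == y.argmax(axis=1)).mean()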

In [None]:
# test scores
test_logloss = log_loss(y_test, model.predict(X_test))
test_accuracy = get_accuracy(model, X_test, y_test)
print(f"Test logloss is {test_logloss:.5f}")
print(f"Test accuracy is {test_accuracy:.3f}")
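To make the two test metrics concrete, here is a small NumPy sketch that computes them by hand. The targets `y_true` and probabilities `y_prob` are made-up values for three samples and three classes, not model output.

```python
import numpy as np

# Hypothetical one-hot targets and predicted probabilities.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype="float32")
y_prob = np.array([[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]])

# Log loss: mean negative log-probability assigned to the true class.
# Clipping guards against log(0).
eps = 1e-15
logloss = -np.mean(np.sum(y_true * np.log(np.clip(y_prob, eps, 1.0)), axis=1))

# Accuracy: fraction of samples whose most probable class is the true one.
accuracy = np.mean(y_prob.argmax(axis=1) == y_true.argmax(axis=1))

print(f"logloss={logloss:.4f}, accuracy={accuracy:.3f}")
```

Note that the third sample is misclassified yet still contributes a finite loss: log loss rewards confident correct predictions, while accuracy only checks the top class.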

## 5. Challenge!

What's the highest accuracy you can get on MNIST using only `Dense` layers? Feel free to experiment with CNNs as well if you're already familiar with them.