Dataset:
Digits: 10-class handwritten digits
http://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html#sklearn.datasets.load_digits

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import load_digits
from sklearn.preprocessing import StandardScaler

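digits = load_digits()
print(digits.images.shape)  # 8x8 pixel images
print(digits.data.shape)    # the same images flattened to 64 features
print(digits.target.shape)  # one integer label per image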
sample_index = 45
plt.figure(figsize=(3, 3))
plt.imshow(digits.images[sample_index], cmap=plt.cm.gray_r,
           interpolation='nearest')
plt.title("image label: %d" % digits.target[sample_index]);

Train / Test Split

Let's keep some held-out data to be able to measure the generalization performance of our model.

In [None]:
from sklearn.model_selection import train_test_split


data = np.asarray(digits.data, dtype='float32')
target = np.asarray(digits.target, dtype='int32')

X_train, X_test, y_train, y_test = train_test_split(
    data, target, test_size=0.15, random_state=37)
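# A bare expression is only displayed when it is the last line of a cell,
# so print the shapes explicitly.
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)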
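
To make the idea of a held-out set concrete, here is a minimal NumPy sketch of a shuffled hold-out split. It is not scikit-learn's implementation: the function name `holdout_split`, its `seed` parameter and the toy arrays are assumptions made for illustration, and the shuffling will not reproduce the split obtained with `random_state=37`.

```python
import numpy as np


def holdout_split(X, y, test_size=0.15, seed=0):
    """Shuffle the sample indices once, then cut them into train and test parts."""
    n_samples = X.shape[0]
    # Round the test set size up so that it is never empty.
    n_test = int(np.ceil(test_size * n_samples))
    rng = np.random.default_rng(seed)
    permutation = rng.permutation(n_samples)
    test_idx, train_idx = permutation[:n_test], permutation[n_test:]
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


# Toy arrays (hypothetical) with the same number of rows and columns as digits.data.
X_toy = np.arange(1797 * 64, dtype="float32").reshape(1797, 64)
y_toy = np.arange(1797, dtype="int32") % 10

Xa, Xb, ya, yb = holdout_split(X_toy, y_toy)
print(Xa.shape, Xb.shape, ya.shape, yb.shape)
```

Every sample ends up in exactly one of the two parts, which is what allows the test part to act as an estimate of performance on unseen data.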

Preprocessing of the Input Data:

Make sure that all input variables are approximately on the same scale via input normalization:

In [None]:
from sklearn import preprocessing


# mean = 0 ; standard deviation = 1.0
scaler = preprocessing.StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# print(scaler.mean_)
# print(scaler.scale_)
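print(X_train.shape)
print(X_train.mean(axis=0))  # approximately 0 for every feature
print(X_train.std(axis=0))   # approximately 1, or 0 for constant features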

Let's display one of the transformed samples (after feature standardization):

In [None]:
sample_index = 45
plt.figure(figsize=(3, 3))
plt.imshow(X_train[sample_index].reshape(8, 8),
           cmap=plt.cm.gray_r, interpolation='nearest')
plt.title("transformed sample\n(standardization)");

The scaler object makes it possible to recover the original sample:

In [None]:
plt.figure(figsize=(3, 3))
sample_data = X_train[sample_index].reshape(1, -1)
# inverse_transform undoes the standardization, so the result is the original sample
original_sample = scaler.inverse_transform(sample_data)
plt.imshow(original_sample.reshape(8, 8), cmap=plt.cm.gray_r, interpolation='nearest')
plt.title("Original Sample");
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
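
The standardization and its inverse can be sketched in plain NumPy as follows. This is an illustration on a hypothetical toy matrix, not the scikit-learn source: it uses the population standard deviation and leaves constant features with a scale of 1 to avoid dividing by zero, which matches the documented behaviour of `StandardScaler`.

```python
import numpy as np

# Hypothetical toy matrix: 4 samples, 3 features; the last feature is constant.
X = np.array([[1.0, 10.0, 5.0],
              [2.0, 20.0, 5.0],
              [3.0, 30.0, 5.0],
              [4.0, 40.0, 5.0]])

# Statistics are computed per feature (column) on the training data only.
mean = X.mean(axis=0)
std = X.std(axis=0)                      # population standard deviation (ddof=0)
scale = np.where(std == 0.0, 1.0, std)   # constant features keep a scale of 1

X_scaled = (X - mean) / scale            # forward transform
X_restored = X_scaled * scale + mean     # inverse transform

print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

The constant feature ends up with a standard deviation of 0 rather than 1 after scaling. The digits images seem to contain a few such constant pixels, which would explain any zeros in `X_train.std(axis=0)`.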

Preprocessing of the Target Data

To train a first neural network we also need to turn the target variable into a one-hot encoded vector representation. Here are the labels of the first samples in the training set, encoded as integers:

In [None]:
y_train[:3]


Keras provides a utility function to convert integer-encoded categorical variables to one-hot encoded values:

In [None]:
from tensorflow.keras.utils import to_categorical

Y_train = to_categorical(y_train)
print(Y_train[:3])
print(Y_train.shape)
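
For intuition, one-hot encoding can also be written in a couple of lines of NumPy. The sketch below uses hypothetical labels and is only an illustration of the encoding, not the Keras implementation of `to_categorical`.

```python
import numpy as np

labels = np.array([2, 0, 9, 2])   # hypothetical integer class labels
n_classes = 10

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with the labels encodes all of them at once.
one_hot = np.eye(n_classes, dtype="float32")[labels]

# argmax along the class axis recovers the integer labels.
recovered = one_hot.argmax(axis=1)

print(one_hot.shape)
print(recovered)
```

Each row contains a single 1 at the position of its class, so the rows can be read as target probability distributions for the softmax output of the network.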

Feed Forward Neural Networks with Keras

Objectives of this section:

Build and train a first feedforward network using Keras
https://www.tensorflow.org/guide/keras/overview

Experiment with different optimizers, activations, size of layers, initializations

A First Keras Model

We can now build and train our first feedforward neural network using the high-level API from Keras:

first we define the model by stacking layers with the right dimensions,
then we define a loss function and plug in the SGD optimizer,
then we feed the model the training data for a fixed number of epochs.

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import optimizers

input_dim = X_train.shape[1]
hidden_dim = 100
output_dim = Y_train.shape[1]

model = Sequential()
model.add(Dense(hidden_dim, input_dim=input_dim, activation="tanh"))
model.add(Dense(output_dim, activation="softmax"))

model.compile(optimizer=optimizers.SGD(learning_rate=0.1),
              loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, Y_train, validation_split=0.2, epochs=15, batch_size=32)
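
To see what this two-layer model computes, the forward pass and the loss can be sketched in NumPy. Everything below is an illustrative assumption: the inputs are random, the weights use a simple normal initialization (not the Keras default initializer), and the function names are made up for this sketch. No training takes place.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, input_dim, hidden_dim, output_dim = 5, 64, 100, 10

# Hypothetical inputs, labels and randomly initialized parameters.
X = rng.normal(size=(n_samples, input_dim))
y = rng.integers(0, output_dim, size=n_samples)
Y = np.eye(output_dim)[y]                      # one-hot targets

W_h = rng.normal(scale=0.1, size=(input_dim, hidden_dim))
b_h = np.zeros(hidden_dim)
W_o = rng.normal(scale=0.1, size=(hidden_dim, output_dim))
b_o = np.zeros(output_dim)


def softmax(z):
    # Subtract the row-wise maximum for numerical stability.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def forward(X):
    hidden = np.tanh(X @ W_h + b_h)            # Dense(hidden_dim, activation="tanh")
    return softmax(hidden @ W_o + b_o)         # Dense(output_dim, activation="softmax")


def categorical_crossentropy(Y_true, Y_pred, eps=1e-7):
    # Mean over samples of the negative log-probability of the true class.
    return -np.mean(np.sum(Y_true * np.log(np.clip(Y_pred, eps, 1.0)), axis=1))


probs = forward(X)
loss = categorical_crossentropy(Y, probs)
accuracy = np.mean(probs.argmax(axis=1) == y)
n_params = W_h.size + b_h.size + W_o.size + b_o.size

print(probs.shape, loss, accuracy, n_params)
```

Training with SGD then amounts to repeatedly nudging `W_h`, `b_h`, `W_o` and `b_o` in the direction that lowers this loss on mini-batches of 32 samples.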

Visualizing the Convergence

Let's wrap the Keras history info into a pandas DataFrame for easier plotting:

In [None]:
import pandas as pd

history_df = pd.DataFrame(history.history)
history_df["epoch"] = history.epoch
fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True, figsize=(12, 6))
history_df.plot(x="epoch", y=["loss", "val_loss"], ax=ax0)
history_df.plot(x="epoch", y=["accuracy", "val_accuracy"], ax=ax1);
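
Beyond plotting, the same kind of DataFrame can be queried directly, for example to find the epoch with the lowest validation loss. The sketch below uses made-up numbers in place of `history.history`; they are not results from the model above, and the variable names are hypothetical.

```python
import pandas as pd

# Made-up values standing in for `history.history`; not actual training results.
toy_history = {
    "loss": [1.2, 0.6, 0.4, 0.3],
    "val_loss": [0.9, 0.5, 0.45, 0.5],
    "accuracy": [0.60, 0.85, 0.90, 0.93],
    "val_accuracy": [0.70, 0.86, 0.88, 0.87],
}
toy_df = pd.DataFrame(toy_history)
toy_df["epoch"] = range(len(toy_df))

# Epoch at which the validation loss is lowest.
best_epoch = int(toy_df.loc[toy_df["val_loss"].idxmin(), "epoch"])

# A growing gap between validation and training loss hints at overfitting.
gap = toy_df["val_loss"] - toy_df["loss"]

print(best_epoch)
print(gap.round(2).tolist())
```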

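Monitoring Convergence with TensorBoard

TensorBoard is the visualization toolkit that ships with TensorFlow. It reads the logs written during training and displays metrics such as loss and accuracy over the epochs.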

In [None]:
%load_ext tensorboard
!rm -rf tensorboard_logs
import datetime
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.models import Sequential

model = Sequential()
model.add(Dense(hidden_dim, input_dim=input_dim, activation="tanh"))
model.add(Dense(output_dim, activation="softmax"))

model.compile(optimizer=optimizers.Adam(),
              loss='categorical_crossentropy', metrics=['accuracy'])

timestamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
log_dir = "tensorboard_logs/" + timestamp
tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

model.fit(x=X_train, y=Y_train, validation_split=0.2, epochs=15,
          callbacks=[tensorboard_callback]);
%tensorboard --logdir tensorboard_logs
