# 1. Generalization

In [1]:
from tensorflow import keras
from sklearn.utils import resample

In [2]:
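def mnist_classification(sampling: int = 1, depth: int = 1, width: int = 32, optimizer: str = 'adam', batch_size: int = 128, epochs: int = 5):
    """Train a fully connected network on MNIST and return the Keras History object.

    sampling > 1 keeps only 1/sampling of the training set; the test set is always used in full.
    """
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    if sampling > 1:
        # Subsample without replacement (no random_state, so the subset differs between runs).
        x_train, y_train = resample(x_train, y_train, replace=False, n_samples=x_train.shape[0]//sampling)
    # Scale pixel values from [0, 255] to [0, 1].
    x_train = x_train / 255.0
    x_test = x_test / 255.0
    # One-hot encode the 10 digit classes.
    y_train = keras.utils.to_categorical(y_train, 10)
    y_test = keras.utils.to_categorical(y_test, 10)

    # Flatten -> `depth` hidden ReLU layers of `width` units -> 10-way softmax.
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))
    for _ in range(depth):
        model.add(keras.layers.Dense(width, activation='relu'))
    model.add(keras.layers.Dense(10, activation='softmax'))

    model.compile(
        loss='categorical_crossentropy',
        optimizer=optimizer,
        metrics=['accuracy']
    )

    # The test set is passed as validation data, so val_accuracy is the test accuracy after each epoch.
    history = model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, validation_data=(x_test, y_test))
    return history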

>Perform a classification task. Note how many epochs the training takes, and in testing, how well it generalizes.

- Try one hidden layer of width 32 (the function defaults), batch size 128, and 5 epochs

In [3]:
res = mnist_classification()


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
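
The epoch lines above show no metrics, so the numbers have to be read from the returned object. A minimal sketch, assuming `res.history` holds the `accuracy` and `val_accuracy` lists that Keras records when `metrics=['accuracy']` and `validation_data` are set; `summarize_history` is a hypothetical helper, not part of Keras:

```python
def summarize_history(history: dict) -> dict:
    """Summarize per-epoch accuracy lists from a Keras-style history dict."""
    train_acc = history["accuracy"]
    val_acc = history["val_accuracy"]
    # Index of the epoch with the highest held-out accuracy.
    best_epoch = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return {
        "final_train_acc": train_acc[-1],
        "final_val_acc": val_acc[-1],
        # Positive gap: the model fits the training set better than the held-out set.
        "generalization_gap": train_acc[-1] - val_acc[-1],
        "best_epoch": best_epoch + 1,  # 1-based, matching the "Epoch k/5" log lines
    }


# Made-up numbers for illustration only; they are not results from this notebook.
example = {"accuracy": [0.80, 0.90, 0.95], "val_accuracy": [0.82, 0.91, 0.93]}
summary = summarize_history(example)
print(summary)
```

With a real run, the call would be `summarize_history(res.history)`.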


>Perform the classification on a smaller training set. How does the learning rate change, and how does generalization change?

- Reducing the training samples to 1/10 lowers accuracy slightly

In [4]:
res = mnist_classification(sampling=10)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
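
One likely reason the smaller run learns less in the same five epochs is that each epoch contains fewer weight updates. The arithmetic can be sketched as follows, assuming the standard 60,000-image MNIST training split and that the final partial batch still counts as one update; `updates_per_epoch` is a hypothetical helper:

```python
import math


def updates_per_epoch(n_samples: int, batch_size: int) -> int:
    """Number of gradient updates in one epoch (the last partial batch counts)."""
    return math.ceil(n_samples / batch_size)


full = updates_per_epoch(60000, 128)         # sampling=1
small = updates_per_epoch(60000 // 10, 128)  # sampling=10
print(full, small)          # 469 47
print(full * 5, small * 5)  # total updates over 5 epochs: 2345 235
```

To separate the effect of less data from the effect of fewer updates, the `sampling=10` run could be given more epochs so that the update counts are comparable.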


>Vary other elements: try a different optimizer than adam, try a different learning rate, try a different (deeper) architecture, try wider hidden layers.

In [5]:
# With optimizer = SGD, accuracy dropped slightly
res = mnist_classification(optimizer='sgd')

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
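
The exercise also asks for a different learning rate, which a string such as `'sgd'` cannot express. A sketch under assumptions: `model.compile` accepts an optimizer instance as well as a name, so an instance can be passed through the `optimizer` argument of `mnist_classification` despite its `str` annotation. `make_optimizer` and `learning_rate_sweep` are hypothetical helpers, and the learning-rate values are arbitrary examples (0.01 is the Keras default for SGD), not tuned settings:

```python
def make_optimizer(name: str, learning_rate: float):
    """Build a Keras optimizer instance with an explicit learning rate."""
    # Imported lazily so the sweep helper below runs without TensorFlow installed.
    from tensorflow import keras

    classes = {"sgd": keras.optimizers.SGD, "adam": keras.optimizers.Adam}
    return classes[name](learning_rate=learning_rate)


def learning_rate_sweep(name: str, learning_rates):
    """Return one settings dict per learning rate to try."""
    return [{"optimizer_name": name, "learning_rate": lr} for lr in learning_rates]


sweep = learning_rate_sweep("sgd", [0.01, 0.1])
print(sweep)

# Usage with the notebook's function (requires TensorFlow and downloads MNIST):
# for spec in sweep:
#     opt = make_optimizer(spec["optimizer_name"], spec["learning_rate"])
#     res = mnist_classification(optimizer=opt)
```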


In [6]:
# A deeper architecture (depth of about 4) makes little difference
res = mnist_classification(depth=4)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [7]:
# Widening the hidden layer (width=256) increased accuracy
res = mnist_classification(width=256)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
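
A rough way to compare the depth and width experiments is to count trainable parameters. The sketch below assumes the architecture built in `mnist_classification` (784 flattened inputs, `depth` dense layers of `width` units, and a 10-unit output layer, all with biases); `mlp_param_count` is a hypothetical helper, and parameter count is only a coarse proxy for capacity:

```python
def mlp_param_count(depth: int, width: int, n_inputs: int = 28 * 28, n_classes: int = 10) -> int:
    """Count weights and biases of a fully connected network."""
    total = 0
    fan_in = n_inputs
    for _ in range(depth):
        total += fan_in * width + width  # weight matrix plus bias vector
        fan_in = width
    total += fan_in * n_classes + n_classes  # softmax output layer
    return total


print(mlp_param_count(depth=1, width=32))   # 25450
print(mlp_param_count(depth=4, width=32))   # 28618
print(mlp_param_count(depth=1, width=256))  # 203530
```

Going from depth 1 to depth 4 at width 32 adds about 12% more parameters, while widening to 256 multiplies the count by about eight. That is consistent with the observations in the cells above, though it does not by itself explain them.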
