# Keras and the Last Number Problem

Let's see if Keras can do better on the last number problem than our simple single-hidden-layer neural network did.

In [1]:
import numpy as np
import keras
from keras.utils import to_categorical

We'll use the same data class as before:

In [2]:
class ModelDataCategorical:
    """this is the model data for our "last number" training set.  We
    produce input of length N, consisting of numbers 0-9 and store
    the result in a 10-element array as categorical data.

    """
    def __init__(self, N=10):
        self.N = N
        
        # our model input data
        self.x = np.random.randint(0, high=10, size=N)
        self.x_scaled = self.x / 10 + 0.05
        
        # our scaled model output data
        self.y = np.array([self.x[-1]])
        self.y_scaled = np.zeros(10) + 0.01
        self.y_scaled[self.x[-1]] = 0.99
        
    def interpret_result(self, out):
        """take the network output and return the number we predict"""
        return np.argmax(out)

For Keras, we need to pack the data (both the scaled inputs and the labels) into arrays.  We'll use
Keras's `to_categorical()`, imported above from `keras.utils`, to one-hot encode the labels.
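To make the encoding concrete, here is a minimal plain-Python sketch of what one-hot encoding does conceptually. The helper name `one_hot` is hypothetical and this is not Keras's implementation; it only illustrates the shape of the result for integer labels 0-9.

```python
def one_hot(labels, num_classes):
    """Return one row per label: 1.0 at the label's index, 0.0 elsewhere."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# three example labels, encoded into 10 classes
encoded = one_hot([9, 0, 3], 10)
print(encoded[0])
```

A label of 9 becomes a row with a single 1 in the last position, which is the form the softmax output layer is compared against.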

Let's make both a training set and a test set:

In [3]:
x_train = []
y_train = []
for _ in range(10000):
    m = ModelDataCategorical()
    x_train.append(m.x_scaled)
    y_train.append(m.y)

x_train = np.asarray(x_train)
y_train = to_categorical(y_train, 10)

In [4]:
x_test = []
y_test = []
for _ in range(1000):
    m = ModelDataCategorical()
    x_test.append(m.x_scaled)
    y_test.append(m.y)

x_test = np.asarray(x_test)
y_test = to_categorical(y_test, 10)

Check to make sure the data looks like we expect:

In [5]:
x_train[0]

array([0.65, 0.55, 0.55, 0.45, 0.25, 0.95, 0.05, 0.55, 0.15, 0.95])

In [6]:
y_train[0]

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 1.])

Now let's build our network.  We'll use just a single hidden layer,
but instead of the sigmoid used before, we'll use ReLU for the hidden layer and softmax for the output layer.
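As a rough illustration of the two activations, the sketch below implements them in plain Python on short lists. The function names are our own and this is not how Keras computes them internally; it only shows the math: ReLU zeroes out negative values, and softmax turns arbitrary scores into probabilities that sum to 1.

```python
import math

def relu(values):
    """Replace every negative entry with 0, leave the rest unchanged."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Exponentiate and normalize so the outputs are positive and sum to 1."""
    shift = max(values)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

hidden = relu([-1.5, 0.0, 2.0])
probs = softmax([1.0, 2.0, 3.0])
print(hidden, probs)
```

Because softmax outputs can be read as class probabilities, it pairs naturally with one-hot labels and a cross-entropy loss.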

In [9]:
from keras.models import Sequential
from keras.layers import Input, Dense, Dropout, Activation
from keras.optimizers import RMSprop

In [10]:
model = Sequential()
model.add(Input((10,)))
model.add(Dense(100, activation="relu"))
model.add(Dropout(0.1))
model.add(Dense(10, activation="softmax"))

In [11]:
rms = RMSprop()
model.compile(loss='categorical_crossentropy',
              optimizer=rms, metrics=['accuracy'])
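To give a feel for the `categorical_crossentropy` loss, here is a minimal single-sample sketch in plain Python. It is an illustration under simplifying assumptions (our own function, with an `eps` clip we chose to avoid `log(0)`), not Keras's implementation, which also averages over the batch.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

target = [0.0] * 9 + [1.0]            # one-hot label for the digit 9

uniform = [0.1] * 10                  # a network that is just guessing
loss_uniform = categorical_crossentropy(target, uniform)

confident = [0.001] * 9 + [0.991]     # a network that is nearly certain
loss_confident = categorical_crossentropy(target, confident)

print(loss_uniform, loss_confident)
```

A uniform guess over 10 classes gives a loss of ln(10), about 2.30, which is close to the loss reported for the first epoch in the training log; the loss falls toward 0 as the predicted probability of the correct class approaches 1.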

Now we can train the network, evaluating it on the test set after each epoch to see how we do:

In [12]:
epochs = 100
batch_size = 256
model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size,
          validation_data=(x_test, y_test), verbose=2)

Epoch 1/100
40/40 - 0s - 3ms/step - accuracy: 0.1495 - loss: 2.2613 - val_accuracy: 0.2090 - val_loss: 2.1986
Epoch 2/100
40/40 - 0s - 4ms/step - accuracy: 0.2058 - loss: 2.1617 - val_accuracy: 0.2590 - val_loss: 2.0943
Epoch 3/100
40/40 - 0s - 4ms/step - accuracy: 0.2544 - loss: 2.0591 - val_accuracy: 0.3100 - val_loss: 1.9917
Epoch 4/100
40/40 - 0s - 3ms/step - accuracy: 0.2919 - loss: 1.9560 - val_accuracy: 0.3210 - val_loss: 1.8866
Epoch 5/100
40/40 - 0s - 4ms/step - accuracy: 0.3197 - loss: 1.8577 - val_accuracy: 0.3720 - val_loss: 1.7878
Epoch 6/100
40/40 - 0s - 3ms/step - accuracy: 0.3465 - loss: 1.7700 - val_accuracy: 0.3560 - val_loss: 1.7039
Epoch 7/100
40/40 - 0s - 3ms/step - accuracy: 0.3625 - loss: 1.6950 - val_accuracy: 0.4190 - val_loss: 1.6320
Epoch 8/100
40/40 - 0s - 3ms/step - accuracy: 0.4042 - loss: 1.6220 - val_accuracy: 0.4200 - val_loss: 1.5649
Epoch 9/100
40/40 - 0s - 3ms/step - accuracy: 0.4266 - loss: 1.5589 - val_accuracy: 0.4840 - val_loss: 1.5043
Epoch 10/1

<keras.src.callbacks.history.History at 0x7fa959aba900>

By the end of training, the network is essentially perfect on this problem.
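The `accuracy` and `val_accuracy` values in the log are the fraction of samples whose most probable class matches the label, the same argmax rule that `interpret_result()` applies. A minimal plain-Python sketch, using made-up three-class probability rows rather than real model output, might look like this:

```python
def argmax(values):
    """Index of the largest entry (the first one, if there are ties)."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(y_true, y_pred):
    """Fraction of rows where the predicted class equals the labeled class."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

# hypothetical one-hot labels and predicted probabilities (3 classes)
labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
preds = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3]]

score = accuracy(labels, preds)
print(score)
```

Here the third row is misclassified, so three of the four predictions count as correct.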