# Understanding Stateful LSTM Recurrent Neural Networks

A powerful and popular recurrent neural network is the long short-term memory network, or LSTM. It is widely used because its architecture mitigates the vanishing and exploding gradient problems that plague simple recurrent neural networks, allowing large and deep networks to be trained. Like other recurrent neural networks, LSTM networks maintain state, and the specifics of how this is implemented in the Keras framework can be confusing. This lesson shows exactly how state is maintained in LSTM networks by the Keras deep learning library. After reading this lesson, you will know:
* How to develop a naive LSTM network for a sequence prediction problem.
* How to carefully manage state through batches and features with an LSTM network.
* How to manually manage state in an LSTM network for stateful prediction.

Let's get started.

## Problem Description: Learn the Alphabet

In this tutorial, we will develop and contrast a number of different LSTM recurrent neural network models. The context for these comparisons is a simple sequence prediction problem: learning the alphabet. That is, given a letter of the alphabet, predict the next letter. Once understood, this simple problem can be generalized to other sequence prediction problems such as time series prediction and sequence classification. Let's prepare the problem with some Python code that we can reuse from example to example. First, let's import all of the classes and functions we plan to use in this tutorial.


In [1]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from tensorflow.keras import utils

import tensorflow as tf
physical_devices = tf.config.list_physical_devices('GPU')
for gpu in physical_devices:
    tf.config.experimental.set_memory_growth(gpu, enable=True)
assert tf.executing_eagerly()

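Next, we can seed the random number generator to make the results more repeatable each time the code is executed. Note that this seeds only NumPy; TensorFlow keeps its own random state (which can be seeded with `tf.random.set_seed()`), so results can still differ slightly from run to run.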

In [2]:
# fix random seed for reproducibility
np.random.seed(7)

We can now define our dataset, the alphabet. We define the alphabet in uppercase characters for readability. Neural networks model numbers, so we need to map the letters of the alphabet to integer values. We can do this easily by creating a dictionary (map) from each character to its index in the alphabet. We can also create a reverse lookup for converting predictions back into characters later.

In [3]:
# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))
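
As a quick sanity check, the two dictionaries should invert each other. The sketch below rebuilds them and round-trips a short string; the `encode` and `decode` helpers are illustrative names and are not part of the tutorial's code.

```python
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
char_to_int = {c: i for i, c in enumerate(alphabet)}
int_to_char = {i: c for i, c in enumerate(alphabet)}


def encode(text):
    """Map each character to its integer index (hypothetical helper)."""
    return [char_to_int[c] for c in text]


def decode(indices):
    """Map integer indices back to a string (hypothetical helper)."""
    return "".join(int_to_char[i] for i in indices)


codes = encode("LSTM")
print(codes)          # [11, 18, 19, 12]
print(decode(codes))  # LSTM
```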

Now we need to create our input and output pairs on which to train our neural network. We can do this by defining an input sequence length, then reading sequences from the input alphabet sequence. For example, we use an input length of 1. Starting at the beginning of the raw input data, we can read off the first letter A and the next letter as the prediction B. We move along one character and repeat until we reach a prediction of Z.

In [4]:
# prepare the dataset of input to output pairs encoded as integers
seq_length = 1
dataX = []
dataY = []

for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    # print each input -> output pair as a sanity check; running the code
    # produces the output below, summarizing input sequences of length one
    # and a single output character
    print(seq_in, '->', seq_out)

A -> B
B -> C
C -> D
D -> E
E -> F
F -> G
G -> H
H -> I
I -> J
J -> K
K -> L
L -> M
M -> N
N -> O
O -> P
P -> Q
Q -> R
R -> S
S -> T
T -> U
U -> V
V -> W
W -> X
X -> Y
Y -> Z


We need to convert the list of input sequences into a NumPy array in the format expected by LSTM networks: [samples, time steps, features].

In [5]:
# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), seq_length, 1))

Once reshaped, we can then normalize the input integers to the range 0-to-1, the range of the sigmoid activation functions used by the LSTM network.

In [6]:
# normalize
X = X / float(len(alphabet))
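
The reshape and normalization steps above can be checked in isolation. This is a minimal NumPy-only sketch that rebuilds the integer-encoded inputs directly (rather than from the loop in the tutorial) and inspects the resulting shape and value range. Because Z is never used as an input, the largest normalized value is 24/26 rather than 1.

```python
import numpy as np

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
seq_length = 1

# integer-encoded inputs A..Y, one single-element sequence per sample
dataX = [[i] for i in range(len(alphabet) - seq_length)]

# [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), seq_length, 1))
X = X / float(len(alphabet))

print(X.shape)           # (25, 1, 1)
print(X.min(), X.max())  # smallest is 0.0, largest is 24/26
```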

Finally, we can think of this problem as a sequence classification task, where each of the 26 letters represents a different class. As such, we can convert the output (y) to one-hot encoding, using the Keras built-in function `to_categorical()`.

In [7]:
# one hot encode the output variable
y = utils.to_categorical(dataY)
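
To make the encoding concrete, here is a NumPy-only sketch of what one-hot encoding produces for this dataset. It is an illustrative equivalent, not the Keras implementation, and `one_hot` is a hypothetical helper. When the number of classes is not given, it is inferred as the largest label plus one, so the result has 26 columns even though A (label 0) never appears as an output.

```python
import numpy as np


def one_hot(labels, num_classes=None):
    """One-hot encode integer labels (hypothetical stand-in for to_categorical)."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded


dataY = list(range(1, 26))  # integer codes for the outputs B..Z
y = one_hot(dataY)

print(y.shape)               # (25, 26)
print(int(np.argmax(y[0])))  # 1, the code for B
```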

We are now ready to fit different LSTM models.

## LSTM for Learning One-Char to One-Char Mapping

Let's start by designing a simple LSTM to learn how to predict the next character in the alphabet, given the context of just one character. We will frame the problem as a random collection of one-letter input to one-letter output pairs. As we will see, this is a difficult framing of the problem for the LSTM to learn. Let's define an LSTM network with 32 units and an output layer that uses the softmax activation function to make predictions. Because this is a multiclass classification problem, we can use the log loss function (called `categorical_crossentropy` in Keras) and optimize the network using the Adam optimization algorithm. The model is fit over 500 epochs with a batch size of 1.
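
A useful reference point when reading the training log is the loss of a model that knows nothing. The sketch below is plain Python and not part of the original listing. It computes the categorical cross-entropy for one sample when the softmax output is uniform over the 26 classes. A freshly initialized network typically starts near this value, which is consistent with the loss of about 3.26 reported in the first epoch.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy for a single sample: -sum(t * log(p))."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


num_classes = 26

# one-hot target for the letter B (class index 1)
y_true = [0.0] * num_classes
y_true[1] = 1.0

# a model with no knowledge assigns equal probability to every class
uniform = [1.0 / num_classes] * num_classes

baseline = categorical_crossentropy(y_true, uniform)
print(round(baseline, 4))  # 3.2581, i.e. ln(26)
```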

In [8]:
# create and fit the model
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=500, batch_size=1, verbose=2)

Epoch 1/500
25/25 - 2s - loss: 3.2633 - accuracy: 0.0000e+00
Epoch 2/500
25/25 - 0s - loss: 3.2555 - accuracy: 0.0000e+00
Epoch 3/500
25/25 - 0s - loss: 3.2529 - accuracy: 0.0000e+00
Epoch 4/500
25/25 - 0s - loss: 3.2498 - accuracy: 0.0000e+00
Epoch 5/500
25/25 - 0s - loss: 3.2467 - accuracy: 0.0000e+00
Epoch 6/500
25/25 - 0s - loss: 3.2438 - accuracy: 0.0400
Epoch 7/500
25/25 - 0s - loss: 3.2404 - accuracy: 0.0000e+00
Epoch 8/500
25/25 - 0s - loss: 3.2375 - accuracy: 0.0400
Epoch 9/500
25/25 - 0s - loss: 3.2338 - accuracy: 0.0000e+00
Epoch 10/500
25/25 - 0s - loss: 3.2304 - accuracy: 0.0000e+00
Epoch 11/500
25/25 - 0s - loss: 3.2263 - accuracy: 0.0400
Epoch 12/500
25/25 - 0s - loss: 3.2224 - accuracy: 0.0400
Epoch 13/500
25/25 - 0s - loss: 3.2182 - accuracy: 0.0400
Epoch 14/500
25/25 - 0s - loss: 3.2133 - accuracy: 0.0400
Epoch 15/500
25/25 - 0s - loss: 3.2086 - accuracy: 0.0400
Epoch 16/500
25/25 - 0s - loss: 3.2033 - accuracy: 0.0400
Epoch 17/500
25/25 - 0s - loss: 3.1983 - accuracy

25/25 - 0s - loss: 2.2795 - accuracy: 0.2400
Epoch 142/500
25/25 - 0s - loss: 2.2757 - accuracy: 0.2800
Epoch 143/500
25/25 - 0s - loss: 2.2727 - accuracy: 0.3200
Epoch 144/500
25/25 - 0s - loss: 2.2686 - accuracy: 0.2400
Epoch 145/500
25/25 - 0s - loss: 2.2649 - accuracy: 0.2000
Epoch 146/500
25/25 - 0s - loss: 2.2608 - accuracy: 0.2800
Epoch 147/500
25/25 - 0s - loss: 2.2575 - accuracy: 0.2800
Epoch 148/500
25/25 - 0s - loss: 2.2538 - accuracy: 0.2800
Epoch 149/500
25/25 - 0s - loss: 2.2508 - accuracy: 0.2800
Epoch 150/500
25/25 - 0s - loss: 2.2477 - accuracy: 0.2800
Epoch 151/500
25/25 - 0s - loss: 2.2441 - accuracy: 0.2800
Epoch 152/500
25/25 - 0s - loss: 2.2388 - accuracy: 0.2400
Epoch 153/500
25/25 - 0s - loss: 2.2381 - accuracy: 0.2400
Epoch 154/500
25/25 - 0s - loss: 2.2357 - accuracy: 0.2800
Epoch 155/500
25/25 - 0s - loss: 2.2311 - accuracy: 0.2800
Epoch 156/500
25/25 - 0s - loss: 2.2253 - accuracy: 0.3200
Epoch 157/500
25/25 - 0s - loss: 2.2227 - accuracy: 0.3200
Epoch 158/5

25/25 - 0s - loss: 1.9358 - accuracy: 0.5600
Epoch 281/500
25/25 - 0s - loss: 1.9336 - accuracy: 0.6000
Epoch 282/500
25/25 - 0s - loss: 1.9320 - accuracy: 0.6800
Epoch 283/500
25/25 - 0s - loss: 1.9315 - accuracy: 0.4800
Epoch 284/500
25/25 - 0s - loss: 1.9284 - accuracy: 0.6400
Epoch 285/500
25/25 - 0s - loss: 1.9298 - accuracy: 0.5600
Epoch 286/500
25/25 - 0s - loss: 1.9286 - accuracy: 0.6000
Epoch 287/500
25/25 - 0s - loss: 1.9229 - accuracy: 0.6400
Epoch 288/500
25/25 - 0s - loss: 1.9207 - accuracy: 0.6800
Epoch 289/500
25/25 - 0s - loss: 1.9218 - accuracy: 0.5200
Epoch 290/500
25/25 - 0s - loss: 1.9189 - accuracy: 0.5200
Epoch 291/500
25/25 - 0s - loss: 1.9164 - accuracy: 0.6000
Epoch 292/500
25/25 - 0s - loss: 1.9157 - accuracy: 0.5600
Epoch 293/500
25/25 - 0s - loss: 1.9139 - accuracy: 0.5600
Epoch 294/500
25/25 - 0s - loss: 1.9126 - accuracy: 0.6000
Epoch 295/500
25/25 - 0s - loss: 1.9109 - accuracy: 0.6000
Epoch 296/500
25/25 - 0s - loss: 1.9091 - accuracy: 0.5600
Epoch 297/5

25/25 - 0s - loss: 1.7507 - accuracy: 0.7200
Epoch 420/500
25/25 - 0s - loss: 1.7490 - accuracy: 0.7600
Epoch 421/500
25/25 - 0s - loss: 1.7481 - accuracy: 0.8000
Epoch 422/500
25/25 - 0s - loss: 1.7485 - accuracy: 0.6800
Epoch 423/500
25/25 - 0s - loss: 1.7455 - accuracy: 0.7600
Epoch 424/500
25/25 - 0s - loss: 1.7477 - accuracy: 0.7600
Epoch 425/500
25/25 - 0s - loss: 1.7457 - accuracy: 0.7200
Epoch 426/500
25/25 - 0s - loss: 1.7452 - accuracy: 0.7600
Epoch 427/500
25/25 - 0s - loss: 1.7443 - accuracy: 0.8000
Epoch 428/500
25/25 - 0s - loss: 1.7439 - accuracy: 0.7600
Epoch 429/500
25/25 - 0s - loss: 1.7399 - accuracy: 0.7600
Epoch 430/500
25/25 - 0s - loss: 1.7388 - accuracy: 0.8000
Epoch 431/500
25/25 - 0s - loss: 1.7384 - accuracy: 0.8000
Epoch 432/500
25/25 - 0s - loss: 1.7371 - accuracy: 0.7600
Epoch 433/500
25/25 - 0s - loss: 1.7375 - accuracy: 0.7600
Epoch 434/500
25/25 - 0s - loss: 1.7360 - accuracy: 0.8000
Epoch 435/500
25/25 - 0s - loss: 1.7367 - accuracy: 0.7600
Epoch 436/5

After we fit the model, we can evaluate and summarize the performance on the entire training dataset.

In [9]:
# summarize performance of the model
scores = model.evaluate(X, y, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))

Model Accuracy: 88.00%


We can then re-run the training data through the network and generate predictions, converting both the input and output pairs back into their original character format to visualize how well the network learned the problem.

In [11]:
# demonstrate some model predictions
for pattern in dataX:
    x = np.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)

['A'] -> B
['B'] -> C
['C'] -> D
['D'] -> E
['E'] -> F
['F'] -> G
['G'] -> H
['H'] -> I
['I'] -> J
['J'] -> K
['K'] -> L
['L'] -> M
['M'] -> N
['N'] -> O
['O'] -> P
['P'] -> Q
['Q'] -> R
['R'] -> S
['S'] -> T
['T'] -> U
['U'] -> V
['V'] -> X
['W'] -> Z
['X'] -> Z
['Y'] -> Z
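
The reported accuracy can be reproduced by hand from the predictions listed above. The following sketch hard-codes the three mismatches visible in that output (the inputs V, W, and X) and counts the correct predictions. The `predicted` dictionary is constructed for illustration and is not produced by a model here.

```python
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# the true next letter for every input A..Y
expected = {alphabet[i]: alphabet[i + 1] for i in range(len(alphabet) - 1)}

# start from a perfect set of predictions, then apply the errors seen above
predicted = dict(expected)
predicted.update({"V": "X", "W": "Z", "X": "Z"})

correct = sum(predicted[c] == expected[c] for c in expected)
accuracy = correct / len(expected)
print("%d/%d correct -> %.2f%%" % (correct, len(expected), accuracy * 100))
# 22/25 correct -> 88.00%
```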


The entire code listing is provided below for completeness.

In [14]:
# fix random seed for reproducibility
np.random.seed(7)

# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# prepare the dataset of input to output pairs encoded as integers
seq_length = 1
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)

# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), seq_length, 1))

# normalize
X = X / float(len(alphabet))

# one hot encode the output variable
y = utils.to_categorical(dataY)

# create and fit the model
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=500, batch_size=1, verbose=2)

# summarize performance of the model
scores = model.evaluate(X, y, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))

# demonstrate some model predictions
for pattern in dataX:
    x = np.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)

Running this example produces the following output.

A -> B
B -> C
C -> D
D -> E
E -> F
F -> G
G -> H
H -> I
I -> J
J -> K
K -> L
L -> M
M -> N
N -> O
O -> P
P -> Q
Q -> R
R -> S
S -> T
T -> U
U -> V
V -> W
W -> X
X -> Y
Y -> Z
Epoch 1/500
25/25 - 1s - loss: 3.2667 - accuracy: 0.0400
Epoch 2/500
25/25 - 0s - loss: 3.2593 - accuracy: 0.0400
Epoch 3/500
25/25 - 0s - loss: 3.2570 - accuracy: 0.0400
Epoch 4/500
25/25 - 0s - loss: 3.2545 - accuracy: 0.0400
Epoch 5/500
25/25 - 0s - loss: 3.2518 - accuracy: 0.0400
Epoch 6/500
25/25 - 0s - loss: 3.2494 - accuracy: 0.0400
Epoch 7/500
25/25 - 0s - loss: 3.2473 - accuracy: 0.0000e+00
Epoch 8/500
25/25 - 0s - loss: 3.2447 - accuracy: 0.0000e+00
Epoch 9/500
25/25 - 0s - loss: 3.2420 - accuracy: 0.0400
Epoch 10/500
25/25 - 0s - loss: 3.2392 - accuracy: 0.0000e+00
Epoch 11/500
25/25 - 0s - loss: 3.2360 - accuracy: 0.0400
Epoch 12/500
25/25 - 0s - loss: 3.2333 - accuracy: 0.0000e+00
Epoch 13/500
25/25 - 0s - loss: 3.2295 - accuracy: 0.0800
Epoch 14/500
25/25 - 0s - loss: 3.2258 - accuracy: 0.0400
Epoch 

Epoch 139/500
25/25 - 0s - loss: 2.3109 - accuracy: 0.2000
Epoch 140/500
25/25 - 0s - loss: 2.3087 - accuracy: 0.2000
Epoch 141/500
25/25 - 0s - loss: 2.3043 - accuracy: 0.2000
Epoch 142/500
25/25 - 0s - loss: 2.2998 - accuracy: 0.1600
Epoch 143/500
25/25 - 0s - loss: 2.2968 - accuracy: 0.2000
Epoch 144/500
25/25 - 0s - loss: 2.2932 - accuracy: 0.2000
Epoch 145/500
25/25 - 0s - loss: 2.2899 - accuracy: 0.2400
Epoch 146/500
25/25 - 0s - loss: 2.2876 - accuracy: 0.2000
Epoch 147/500
25/25 - 0s - loss: 2.2836 - accuracy: 0.2800
Epoch 148/500
25/25 - 0s - loss: 2.2797 - accuracy: 0.2800
Epoch 149/500
25/25 - 0s - loss: 2.2767 - accuracy: 0.2000
Epoch 150/500
25/25 - 0s - loss: 2.2730 - accuracy: 0.2400
Epoch 151/500
25/25 - 0s - loss: 2.2686 - accuracy: 0.2000
Epoch 152/500
25/25 - 0s - loss: 2.2663 - accuracy: 0.2800
Epoch 153/500
25/25 - 0s - loss: 2.2639 - accuracy: 0.2000
Epoch 154/500
25/25 - 0s - loss: 2.2579 - accuracy: 0.2800
Epoch 155/500
25/25 - 0s - loss: 2.2562 - accuracy: 0.28

Epoch 278/500
25/25 - 0s - loss: 1.9711 - accuracy: 0.5200
Epoch 279/500
25/25 - 0s - loss: 1.9680 - accuracy: 0.5200
Epoch 280/500
25/25 - 0s - loss: 1.9658 - accuracy: 0.4000
Epoch 281/500
25/25 - 0s - loss: 1.9642 - accuracy: 0.4400
Epoch 282/500
25/25 - 0s - loss: 1.9613 - accuracy: 0.5200
Epoch 283/500
25/25 - 0s - loss: 1.9618 - accuracy: 0.4800
Epoch 284/500
25/25 - 0s - loss: 1.9576 - accuracy: 0.5200
Epoch 285/500
25/25 - 0s - loss: 1.9564 - accuracy: 0.5200
Epoch 286/500
25/25 - 0s - loss: 1.9549 - accuracy: 0.4800
Epoch 287/500
25/25 - 0s - loss: 1.9557 - accuracy: 0.5600
Epoch 288/500
25/25 - 0s - loss: 1.9526 - accuracy: 0.4800
Epoch 289/500
25/25 - 0s - loss: 1.9512 - accuracy: 0.4800
Epoch 290/500
25/25 - 0s - loss: 1.9496 - accuracy: 0.5200
Epoch 291/500
25/25 - 0s - loss: 1.9474 - accuracy: 0.5600
Epoch 292/500
25/25 - 0s - loss: 1.9461 - accuracy: 0.4800
Epoch 293/500
25/25 - 0s - loss: 1.9449 - accuracy: 0.6400
Epoch 294/500
25/25 - 0s - loss: 1.9413 - accuracy: 0.52

Epoch 417/500
25/25 - 0s - loss: 1.7852 - accuracy: 0.7200
Epoch 418/500
25/25 - 0s - loss: 1.7826 - accuracy: 0.7600
Epoch 419/500
25/25 - 0s - loss: 1.7833 - accuracy: 0.7200
Epoch 420/500
25/25 - 0s - loss: 1.7827 - accuracy: 0.7600
Epoch 421/500
25/25 - 0s - loss: 1.7815 - accuracy: 0.6400
Epoch 422/500
25/25 - 0s - loss: 1.7820 - accuracy: 0.8000
Epoch 423/500
25/25 - 0s - loss: 1.7808 - accuracy: 0.7200
Epoch 424/500
25/25 - 0s - loss: 1.7781 - accuracy: 0.7200
Epoch 425/500
25/25 - 0s - loss: 1.7772 - accuracy: 0.7200
Epoch 426/500
25/25 - 0s - loss: 1.7759 - accuracy: 0.7600
Epoch 427/500
25/25 - 0s - loss: 1.7746 - accuracy: 0.8000
Epoch 428/500
25/25 - 0s - loss: 1.7727 - accuracy: 0.7200
Epoch 429/500
25/25 - 0s - loss: 1.7741 - accuracy: 0.8000
Epoch 430/500
25/25 - 0s - loss: 1.7723 - accuracy: 0.7600
Epoch 431/500
25/25 - 0s - loss: 1.7703 - accuracy: 0.7200
Epoch 432/500
25/25 - 0s - loss: 1.7702 - accuracy: 0.6400
Epoch 433/500
25/25 - 0s - loss: 1.7693 - accuracy: 0.68

We can see that this problem is indeed difficult for the network to learn. The reason is that the poor LSTM units have no context to work with. Each input-output pattern is shown to the network in a random order, and the state of the network is reset after each pattern (each batch contains a single pattern). This is an abuse of the LSTM network architecture, treating it like a standard Multilayer Perceptron. Next, let's try a different framing of the problem that provides more sequence for the network to learn from.

## LSTM for a Feature Window to One-Char Mapping

A popular approach to adding more context to data for Multilayer Perceptrons is to use the window method. This is where previous steps in the sequence are provided as additional input features to the network. We can try the same trick to provide more context to the LSTM network. Here, we increase the sequence length from 1 to 3, for example:

```
# prepare the dataset of input to output pairs encoded as integers
seq_length = 3
```

Which creates training patterns like:

```
ABC -> D
BCD -> E
CDE -> F
```
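
The sliding-window construction behind these patterns can be written as a small standalone function. This is a sketch in plain Python; `make_windows` is a hypothetical helper name that mirrors the loop used in the full listings.

```python
def make_windows(text, seq_length):
    """Return (input, output) pairs from a sliding window (illustrative helper)."""
    pairs = []
    for i in range(len(text) - seq_length):
        seq_in = text[i:i + seq_length]   # the window of input characters
        seq_out = text[i + seq_length]    # the character that follows it
        pairs.append((seq_in, seq_out))
    return pairs


alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
pairs = make_windows(alphabet, 3)

print(len(pairs))           # 23
print(pairs[0], pairs[-1])  # ('ABC', 'D') ('WXY', 'Z')
```

With a window of 3 there are 26 - 3 = 23 patterns, which is why the training log for this section reports 23 steps per epoch.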

Each element in the sequence is then provided as a new input feature to the network. This requires a modification of how the input sequences are reshaped in the data preparation step:

```
# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), 1, seq_length))
```

It also requires a modification for how the sample patterns are reshaped when demonstrating predictions from the model.

```
x = np.reshape(pattern, (1, 1, len(pattern)))
```

The entire code listing is provided below for completeness.

In [17]:
# fix random seed for reproducibility
np.random.seed(7)

# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# prepare the dataset of input to output pairs encoded as integers
seq_length = 3
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)

# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), 1, seq_length))

# normalize
X = X / float(len(alphabet))

# one hot encode the output variable
y = utils.to_categorical(dataY)

# create and fit the model
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=500, batch_size=1, verbose=2)

# summarize performance of the model
scores = model.evaluate(X, y, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))

# demonstrate some model predictions
for pattern in dataX:
    x = np.reshape(pattern, (1, 1, len(pattern)))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)

Running this example produces the following output.

ABC -> D
BCD -> E
CDE -> F
DEF -> G
EFG -> H
FGH -> I
GHI -> J
HIJ -> K
IJK -> L
JKL -> M
KLM -> N
LMN -> O
MNO -> P
NOP -> Q
OPQ -> R
PQR -> S
QRS -> T
RST -> U
STU -> V
TUV -> W
UVW -> X
VWX -> Y
WXY -> Z
Epoch 1/500
23/23 - 1s - loss: 3.2634 - accuracy: 0.0435
Epoch 2/500
23/23 - 0s - loss: 3.2516 - accuracy: 0.0435
Epoch 3/500
23/23 - 0s - loss: 3.2452 - accuracy: 0.0435
Epoch 4/500
23/23 - 0s - loss: 3.2390 - accuracy: 0.0435
Epoch 5/500
23/23 - 0s - loss: 3.2318 - accuracy: 0.0435
Epoch 6/500
23/23 - 0s - loss: 3.2248 - accuracy: 0.0000e+00
Epoch 7/500
23/23 - 0s - loss: 3.2176 - accuracy: 0.0435
Epoch 8/500
23/23 - 0s - loss: 3.2097 - accuracy: 0.0000e+00
Epoch 9/500
23/23 - 0s - loss: 3.2016 - accuracy: 0.0000e+00
Epoch 10/500
23/23 - 0s - loss: 3.1924 - accuracy: 0.0000e+00
Epoch 11/500
23/23 - 0s - loss: 3.1833 - accuracy: 0.0000e+00
Epoch 12/500
23/23 - 0s - loss: 3.1728 - accuracy: 0.0000e+00
Epoch 13/500
23/23 - 0s - loss: 3.1627 - accuracy: 0.0435
Epoch 14/500
23/23 - 0s 

Epoch 138/500
23/23 - 0s - loss: 2.2429 - accuracy: 0.2609
Epoch 139/500
23/23 - 0s - loss: 2.2389 - accuracy: 0.2174
Epoch 140/500
23/23 - 0s - loss: 2.2356 - accuracy: 0.2609
Epoch 141/500
23/23 - 0s - loss: 2.2307 - accuracy: 0.2174
Epoch 142/500
23/23 - 0s - loss: 2.2263 - accuracy: 0.2174
Epoch 143/500
23/23 - 0s - loss: 2.2254 - accuracy: 0.2609
Epoch 144/500
23/23 - 0s - loss: 2.2206 - accuracy: 0.2174
Epoch 145/500
23/23 - 0s - loss: 2.2160 - accuracy: 0.2174
Epoch 146/500
23/23 - 0s - loss: 2.2135 - accuracy: 0.2174
Epoch 147/500
23/23 - 0s - loss: 2.2076 - accuracy: 0.2609
Epoch 148/500
23/23 - 0s - loss: 2.2043 - accuracy: 0.2174
Epoch 149/500
23/23 - 0s - loss: 2.2011 - accuracy: 0.2609
Epoch 150/500
23/23 - 0s - loss: 2.1969 - accuracy: 0.2174
Epoch 151/500
23/23 - 0s - loss: 2.1951 - accuracy: 0.2174
Epoch 152/500
23/23 - 0s - loss: 2.1907 - accuracy: 0.2174
Epoch 153/500
23/23 - 0s - loss: 2.1856 - accuracy: 0.3043
Epoch 154/500
23/23 - 0s - loss: 2.1826 - accuracy: 0.30

Epoch 277/500
23/23 - 0s - loss: 1.8779 - accuracy: 0.6087
Epoch 278/500
23/23 - 0s - loss: 1.8742 - accuracy: 0.5217
Epoch 279/500
23/23 - 0s - loss: 1.8738 - accuracy: 0.5652
Epoch 280/500
23/23 - 0s - loss: 1.8716 - accuracy: 0.5652
Epoch 281/500
23/23 - 0s - loss: 1.8683 - accuracy: 0.5652
Epoch 282/500
23/23 - 0s - loss: 1.8692 - accuracy: 0.6522
Epoch 283/500
23/23 - 0s - loss: 1.8666 - accuracy: 0.5652
Epoch 284/500
23/23 - 0s - loss: 1.8642 - accuracy: 0.5652
Epoch 285/500
23/23 - 0s - loss: 1.8628 - accuracy: 0.4348
Epoch 286/500
23/23 - 0s - loss: 1.8590 - accuracy: 0.5652
Epoch 287/500
23/23 - 0s - loss: 1.8602 - accuracy: 0.5217
Epoch 288/500
23/23 - 0s - loss: 1.8581 - accuracy: 0.5652
Epoch 289/500
23/23 - 0s - loss: 1.8539 - accuracy: 0.6522
Epoch 290/500
23/23 - 0s - loss: 1.8544 - accuracy: 0.6087
Epoch 291/500
23/23 - 0s - loss: 1.8521 - accuracy: 0.6087
Epoch 292/500
23/23 - 0s - loss: 1.8503 - accuracy: 0.5652
Epoch 293/500
23/23 - 0s - loss: 1.8498 - accuracy: 0.69

Epoch 416/500
23/23 - 0s - loss: 1.6741 - accuracy: 0.7826
Epoch 417/500
23/23 - 0s - loss: 1.6723 - accuracy: 0.7826
Epoch 418/500
23/23 - 0s - loss: 1.6706 - accuracy: 0.8261
Epoch 419/500
23/23 - 0s - loss: 1.6715 - accuracy: 0.8696
Epoch 420/500
23/23 - 0s - loss: 1.6715 - accuracy: 0.7391
Epoch 421/500
23/23 - 0s - loss: 1.6678 - accuracy: 0.7391
Epoch 422/500
23/23 - 0s - loss: 1.6675 - accuracy: 0.7826
Epoch 423/500
23/23 - 0s - loss: 1.6654 - accuracy: 0.7391
Epoch 424/500
23/23 - 0s - loss: 1.6654 - accuracy: 0.7826
Epoch 425/500
23/23 - 0s - loss: 1.6645 - accuracy: 0.8261
Epoch 426/500
23/23 - 0s - loss: 1.6625 - accuracy: 0.6957
Epoch 427/500
23/23 - 0s - loss: 1.6651 - accuracy: 0.8261
Epoch 428/500
23/23 - 0s - loss: 1.6609 - accuracy: 0.7826
Epoch 429/500
23/23 - 0s - loss: 1.6607 - accuracy: 0.7826
Epoch 430/500
23/23 - 0s - loss: 1.6613 - accuracy: 0.6957
Epoch 431/500
23/23 - 0s - loss: 1.6573 - accuracy: 0.8261
Epoch 432/500
23/23 - 0s - loss: 1.6572 - accuracy: 0.73

We can see a small lift in performance that may or may not be real. This is a simple problem that the LSTM still could not learn, even with the window method. Again, this is a misuse of the LSTM network caused by a poor framing of the problem. The sequences of letters are really time steps of one feature rather than one time step of separate features. We have given the network more context, but not more sequence as it expects. In the next section, we will give the network more context in the form of time steps.

## LSTM for a Time Step Window to One-Char Mapping

In Keras, the intended use of LSTMs is to provide the context in the form of time steps rather than windowed features like with other network types. We can take our first example and change the sequence length from 1 to 3:

```
seq_length = 3
```

Again, this creates input-output pairs that look like:

```
ABC -> D
BCD -> E
CDE -> F
DEF -> G
```

The difference is that the reshaping of the input data takes the sequence as a time step sequence of one feature rather than a single time step of multiple features.

```
# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), seq_length, 1))
```
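
The difference between the two framings is easiest to see by reshaping the same data both ways. The sketch below uses only the first two integer-encoded windows (ABC and BCD) as toy data; the variable names `as_features` and `as_time_steps` are illustrative.

```python
import numpy as np

# first two integer-encoded windows: ABC and BCD
dataX = [[0, 1, 2], [1, 2, 3]]
seq_length = 3

# feature window: one time step carrying three features
as_features = np.reshape(dataX, (len(dataX), 1, seq_length))

# time step window: three time steps carrying one feature each
as_time_steps = np.reshape(dataX, (len(dataX), seq_length, 1))

print(as_features.shape)          # (2, 1, 3)
print(as_time_steps.shape)        # (2, 3, 1)
print(as_features[0].tolist())    # [[0, 1, 2]]
print(as_time_steps[0].tolist())  # [[0], [1], [2]]
```

The values are identical in both arrays; only the axis along which the LSTM steps through them changes.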

This is the intended way to provide sequence context to your LSTM in Keras. The complete code example is provided below.

In [18]:
# fix random seed for reproducibility
np.random.seed(7)

# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# prepare the dataset of input to output pairs encoded as integers
seq_length = 3
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)
    
# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), seq_length, 1))

# normalize
X = X / float(len(alphabet))

# one hot encode the output variable
y = utils.to_categorical(dataY)

# create and fit the model
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=500, batch_size=1, verbose=2)

# summarize performance of the model
scores = model.evaluate(X, y, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))

# demonstrate some model predictions
for pattern in dataX:
    x = np.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)

Running this example provides the following output.

ABC -> D
BCD -> E
CDE -> F
DEF -> G
EFG -> H
FGH -> I
GHI -> J
HIJ -> K
IJK -> L
JKL -> M
KLM -> N
LMN -> O
MNO -> P
NOP -> Q
OPQ -> R
PQR -> S
QRS -> T
RST -> U
STU -> V
TUV -> W
UVW -> X
VWX -> Y
WXY -> Z
Epoch 1/500
23/23 - 1s - loss: 3.2777 - accuracy: 0.0435
Epoch 2/500
23/23 - 0s - loss: 3.2613 - accuracy: 0.0435
Epoch 3/500
23/23 - 0s - loss: 3.2537 - accuracy: 0.0435
Epoch 4/500
23/23 - 0s - loss: 3.2460 - accuracy: 0.0435
Epoch 5/500
23/23 - 0s - loss: 3.2385 - accuracy: 0.0435
Epoch 6/500
23/23 - 0s - loss: 3.2304 - accuracy: 0.0435
Epoch 7/500
23/23 - 0s - loss: 3.2225 - accuracy: 0.0435
Epoch 8/500
23/23 - 0s - loss: 3.2140 - accuracy: 0.0435
Epoch 9/500
23/23 - 0s - loss: 3.2057 - accuracy: 0.0435
Epoch 10/500
23/23 - 0s - loss: 3.1936 - accuracy: 0.0435
Epoch 11/500
23/23 - 0s - loss: 3.1845 - accuracy: 0.0435
Epoch 12/500
23/23 - 0s - loss: 3.1694 - accuracy: 0.0435
Epoch 13/500
23/23 - 0s - loss: 3.1547 - accuracy: 0.0435
Epoch 14/500
23/23 - 0s - loss: 3.1354 - accurac

Epoch 138/500
23/23 - 0s - loss: 1.2735 - accuracy: 0.8696
Epoch 139/500
23/23 - 0s - loss: 1.2730 - accuracy: 0.7391
Epoch 140/500
23/23 - 0s - loss: 1.2694 - accuracy: 0.8696
Epoch 141/500
23/23 - 0s - loss: 1.2615 - accuracy: 0.7826
Epoch 142/500
23/23 - 0s - loss: 1.2468 - accuracy: 0.8696
Epoch 143/500
23/23 - 0s - loss: 1.2454 - accuracy: 0.8261
Epoch 144/500
23/23 - 0s - loss: 1.2464 - accuracy: 0.7826
Epoch 145/500
23/23 - 0s - loss: 1.2341 - accuracy: 0.8261
Epoch 146/500
23/23 - 0s - loss: 1.2223 - accuracy: 0.7391
Epoch 147/500
23/23 - 0s - loss: 1.2234 - accuracy: 0.7826
Epoch 148/500
23/23 - 0s - loss: 1.2141 - accuracy: 0.9130
Epoch 149/500
23/23 - 0s - loss: 1.2131 - accuracy: 0.7391
Epoch 150/500
23/23 - 0s - loss: 1.2013 - accuracy: 0.8261
Epoch 151/500
23/23 - 0s - loss: 1.1941 - accuracy: 0.8261
Epoch 152/500
23/23 - 0s - loss: 1.1900 - accuracy: 0.8261
Epoch 153/500
23/23 - 0s - loss: 1.1922 - accuracy: 0.8696
Epoch 154/500
23/23 - 0s - loss: 1.1769 - accuracy: 0.91

Epoch 277/500
23/23 - 0s - loss: 0.6574 - accuracy: 0.9565
Epoch 278/500
23/23 - 0s - loss: 0.6585 - accuracy: 0.9130
Epoch 279/500
23/23 - 0s - loss: 0.6568 - accuracy: 0.9565
Epoch 280/500
23/23 - 0s - loss: 0.6496 - accuracy: 0.9565
Epoch 281/500
23/23 - 0s - loss: 0.6468 - accuracy: 0.9565
Epoch 282/500
23/23 - 0s - loss: 0.6395 - accuracy: 0.9565
Epoch 283/500
23/23 - 0s - loss: 0.6395 - accuracy: 0.9565
Epoch 284/500
23/23 - 0s - loss: 0.6465 - accuracy: 0.9130
Epoch 285/500
23/23 - 0s - loss: 0.6331 - accuracy: 0.9565
Epoch 286/500
23/23 - 0s - loss: 0.6373 - accuracy: 0.9565
Epoch 287/500
23/23 - 0s - loss: 0.6320 - accuracy: 0.9130
Epoch 288/500
23/23 - 0s - loss: 0.6301 - accuracy: 0.9565
Epoch 289/500
23/23 - 0s - loss: 0.6271 - accuracy: 0.9130
Epoch 290/500
23/23 - 0s - loss: 0.6255 - accuracy: 1.0000
Epoch 291/500
23/23 - 0s - loss: 0.6217 - accuracy: 0.9130
Epoch 292/500
23/23 - 0s - loss: 0.6189 - accuracy: 0.9565
Epoch 293/500
23/23 - 0s - loss: 0.6141 - accuracy: 0.95

Epoch 416/500
23/23 - 0s - loss: 0.3631 - accuracy: 1.0000
Epoch 417/500
23/23 - 0s - loss: 0.3678 - accuracy: 0.9565
Epoch 418/500
23/23 - 0s - loss: 0.3609 - accuracy: 1.0000
Epoch 419/500
23/23 - 0s - loss: 0.3589 - accuracy: 1.0000
Epoch 420/500
23/23 - 0s - loss: 0.3567 - accuracy: 0.9565
Epoch 421/500
23/23 - 0s - loss: 0.3566 - accuracy: 0.9565
Epoch 422/500
23/23 - 0s - loss: 0.3575 - accuracy: 1.0000
Epoch 423/500
23/23 - 0s - loss: 0.3530 - accuracy: 1.0000
Epoch 424/500
23/23 - 0s - loss: 0.3491 - accuracy: 1.0000
Epoch 425/500
23/23 - 0s - loss: 0.3512 - accuracy: 0.9565
Epoch 426/500
23/23 - 0s - loss: 0.3495 - accuracy: 1.0000
Epoch 427/500
23/23 - 0s - loss: 0.3459 - accuracy: 1.0000
Epoch 428/500
23/23 - 0s - loss: 0.3465 - accuracy: 1.0000
Epoch 429/500
23/23 - 0s - loss: 0.3483 - accuracy: 1.0000
Epoch 430/500
23/23 - 0s - loss: 0.3407 - accuracy: 1.0000
Epoch 431/500
23/23 - 0s - loss: 0.3385 - accuracy: 1.0000
Epoch 432/500
23/23 - 0s - loss: 0.3397 - accuracy: 1.00

We can see that the model learns the problem perfectly, as evidenced by the model evaluation and the example predictions. But it has learned a simpler problem. Specifically, it has learned to predict the next letter from a sequence of three letters in the alphabet. It can be shown any random sequence of three letters from the alphabet and predict the next letter.

It cannot enumerate the alphabet. I expect that a large enough Multilayer Perceptron network might learn the same mapping using the window method. LSTM networks are stateful. They should be able to learn the whole alphabet sequence, but by default, the Keras implementation resets the network state after each training batch.

## LSTM State Maintained Between Samples Within A Batch

The Keras implementation of LSTMs resets the state of the network after each batch. This suggests that if we had a large batch size to hold all input patterns and if all the input patterns were ordered sequentially, the LSTM could use the sequence's context to learn the sequence better. We can demonstrate this easily by modifying the first example for learning a one-to-one mapping and increasing the batch size from 1 to the size of the training dataset. Additionally, Keras shuffles the training dataset before each training epoch. To ensure the training data patterns remain sequential, we can disable this shuffling.

```
model.fit(X, y, epochs=5000, batch_size=len(dataX), verbose=2, shuffle=False)
```
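The effect of the batch size on the number of batches per epoch can be checked with a little arithmetic. The sketch below is plain Python and only illustrates how an ordered dataset is split into batches when shuffling is disabled; the helpers `batches_per_epoch` and `iter_batches` are hypothetical names and do not describe how Keras implements batching internally.

```python
import math


# Hypothetical helpers (assumptions for illustration only).
def batches_per_epoch(n_samples, batch_size):
    # number of batches needed to cover every sample once
    return math.ceil(n_samples / batch_size)


def iter_batches(samples, batch_size):
    # slice the samples in their original order, i.e. without shuffling
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
# one-to-one mapping: each letter paired with the letter that follows it
pairs = [(alphabet[i], alphabet[i + 1]) for i in range(len(alphabet) - 1)]

print(batches_per_epoch(len(pairs), 1))           # 25 batches of one pattern
print(batches_per_epoch(len(pairs), len(pairs)))  # 1 batch holding all patterns

full_batch = next(iter_batches(pairs, len(pairs)))
print(full_batch[:3])  # [('A', 'B'), ('B', 'C'), ('C', 'D')]
```

With the batch size set to the dataset size, each epoch consists of a single batch containing all 25 patterns in alphabetical order, which corresponds to the `1/1` step counter shown in the training log.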

The network will learn the mapping of characters using the within-batch sequence, but this context will not be available to the network when making predictions. We can evaluate the ability of the network to make predictions both on patterns presented in sequence and on patterns chosen at random. The full code example is provided below for completeness.

In [20]:
from tensorflow.keras.preprocessing.sequence import pad_sequences

# fix random seed for reproducibility
np.random.seed(7)

# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# prepare the dataset of input to output pairs encoded as integers
seq_length = 1
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)

# convert list of lists to array and pad sequences if needed
X = pad_sequences(dataX, maxlen=seq_length, dtype='float32')

# reshape X to be [samples, time steps, features]
X = np.reshape(X, (X.shape[0], seq_length, 1))

# normalize
X = X / float(len(alphabet))

# one hot encode the output variable
y = utils.to_categorical(dataY)

# create and fit the model
model = Sequential()
model.add(LSTM(16, input_shape=(X.shape[1], X.shape[2])))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=5000, batch_size=len(dataX), verbose=2, shuffle=False)

# demonstrate some model predictions
for pattern in dataX:
    x = np.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)

# demonstrate predicting random patterns
print("Test a Random Pattern:")
for i in range(0, 20):
    pattern_index = np.random.randint(len(dataX))
    pattern = dataX[pattern_index]
    x = np.reshape(pattern, (1, len(pattern), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)

Running the example provides the following output.

A -> B
B -> C
C -> D
D -> E
E -> F
F -> G
G -> H
H -> I
I -> J
J -> K
K -> L
L -> M
M -> N
N -> O
O -> P
P -> Q
Q -> R
R -> S
S -> T
T -> U
U -> V
V -> W
W -> X
X -> Y
Y -> Z
Epoch 1/5000
1/1 - 1s - loss: 3.2583 - accuracy: 0.0400
Epoch 2/5000
1/1 - 0s - loss: 3.2580 - accuracy: 0.0400
Epoch 3/5000
1/1 - 0s - loss: 3.2577 - accuracy: 0.0400
Epoch 4/5000
1/1 - 0s - loss: 3.2574 - accuracy: 0.0400
Epoch 5/5000
1/1 - 0s - loss: 3.2572 - accuracy: 0.0400
Epoch 6/5000
1/1 - 0s - loss: 3.2569 - accuracy: 0.0400
Epoch 7/5000
1/1 - 0s - loss: 3.2566 - accuracy: 0.0400
Epoch 8/5000
1/1 - 0s - loss: 3.2563 - accuracy: 0.0400
Epoch 9/5000
1/1 - 0s - loss: 3.2560 - accuracy: 0.0400
Epoch 10/5000
1/1 - 0s - loss: 3.2557 - accuracy: 0.0400
Epoch 11/5000
1/1 - 0s - loss: 3.2554 - accuracy: 0.0400
Epoch 12/5000
1/1 - 0s - loss: 3.2551 - accuracy: 0.0400
Epoch 13/5000
1/1 - 0s - loss: 3.2549 - accuracy: 0.0400
Epoch 14/5000
1/1 - 0s - loss: 3.2546 - accuracy: 0.0400
Epoch 15/5000
1/1 - 0s - loss: 3.254

Epoch 141/5000
1/1 - 0s - loss: 3.1881 - accuracy: 0.0400
Epoch 142/5000
1/1 - 0s - loss: 3.1871 - accuracy: 0.0400
Epoch 143/5000
1/1 - 0s - loss: 3.1862 - accuracy: 0.0400
Epoch 144/5000
1/1 - 0s - loss: 3.1852 - accuracy: 0.0400
Epoch 145/5000
1/1 - 0s - loss: 3.1843 - accuracy: 0.0400
Epoch 146/5000
1/1 - 0s - loss: 3.1833 - accuracy: 0.0400
Epoch 147/5000
1/1 - 0s - loss: 3.1823 - accuracy: 0.0400
Epoch 148/5000
1/1 - 0s - loss: 3.1813 - accuracy: 0.0400
Epoch 149/5000
1/1 - 0s - loss: 3.1803 - accuracy: 0.0400
Epoch 150/5000
1/1 - 0s - loss: 3.1793 - accuracy: 0.0400
Epoch 151/5000
1/1 - 0s - loss: 3.1783 - accuracy: 0.0400
Epoch 152/5000
1/1 - 0s - loss: 3.1772 - accuracy: 0.0400
Epoch 153/5000
1/1 - 0s - loss: 3.1762 - accuracy: 0.0400
Epoch 154/5000
1/1 - 0s - loss: 3.1752 - accuracy: 0.0400
Epoch 155/5000
1/1 - 0s - loss: 3.1741 - accuracy: 0.0400
Epoch 156/5000
1/1 - 0s - loss: 3.1730 - accuracy: 0.0400
Epoch 157/5000
1/1 - 0s - loss: 3.1720 - accuracy: 0.0400
Epoch 158/5000

1/1 - 0s - loss: 2.9781 - accuracy: 0.1600
Epoch 283/5000
1/1 - 0s - loss: 2.9762 - accuracy: 0.2000
Epoch 284/5000
1/1 - 0s - loss: 2.9744 - accuracy: 0.2000
Epoch 285/5000
1/1 - 0s - loss: 2.9725 - accuracy: 0.2000
Epoch 286/5000
1/1 - 0s - loss: 2.9707 - accuracy: 0.2000
Epoch 287/5000
1/1 - 0s - loss: 2.9688 - accuracy: 0.2000
Epoch 288/5000
1/1 - 0s - loss: 2.9669 - accuracy: 0.2000
Epoch 289/5000
1/1 - 0s - loss: 2.9651 - accuracy: 0.2000
Epoch 290/5000
1/1 - 0s - loss: 2.9632 - accuracy: 0.2000
Epoch 291/5000
1/1 - 0s - loss: 2.9614 - accuracy: 0.2000
Epoch 292/5000
1/1 - 0s - loss: 2.9595 - accuracy: 0.2000
Epoch 293/5000
1/1 - 0s - loss: 2.9577 - accuracy: 0.2000
Epoch 294/5000
1/1 - 0s - loss: 2.9558 - accuracy: 0.2000
Epoch 295/5000
1/1 - 0s - loss: 2.9539 - accuracy: 0.2000
Epoch 296/5000
1/1 - 0s - loss: 2.9521 - accuracy: 0.2000
Epoch 297/5000
1/1 - 0s - loss: 2.9502 - accuracy: 0.2000
Epoch 298/5000
1/1 - 0s - loss: 2.9484 - accuracy: 0.2000
Epoch 299/5000
1/1 - 0s - los

Epoch 424/5000
1/1 - 0s - loss: 2.7314 - accuracy: 0.3200
Epoch 425/5000
1/1 - 0s - loss: 2.7299 - accuracy: 0.3200
Epoch 426/5000
1/1 - 0s - loss: 2.7283 - accuracy: 0.3200
Epoch 427/5000
1/1 - 0s - loss: 2.7268 - accuracy: 0.3200
Epoch 428/5000
1/1 - 0s - loss: 2.7253 - accuracy: 0.3200
Epoch 429/5000
1/1 - 0s - loss: 2.7238 - accuracy: 0.3200
Epoch 430/5000
1/1 - 0s - loss: 2.7224 - accuracy: 0.3200
Epoch 431/5000
1/1 - 0s - loss: 2.7208 - accuracy: 0.3200
Epoch 432/5000
1/1 - 0s - loss: 2.7194 - accuracy: 0.3200
Epoch 433/5000
1/1 - 0s - loss: 2.7179 - accuracy: 0.3200
Epoch 434/5000
1/1 - 0s - loss: 2.7164 - accuracy: 0.3200
Epoch 435/5000
1/1 - 0s - loss: 2.7149 - accuracy: 0.3200
Epoch 436/5000
1/1 - 0s - loss: 2.7134 - accuracy: 0.3200
Epoch 437/5000
1/1 - 0s - loss: 2.7120 - accuracy: 0.3200
Epoch 438/5000
1/1 - 0s - loss: 2.7105 - accuracy: 0.3200
Epoch 439/5000
1/1 - 0s - loss: 2.7090 - accuracy: 0.3200
Epoch 440/5000
1/1 - 0s - loss: 2.7076 - accuracy: 0.3200
Epoch 441/5000

1/1 - 0s - loss: 2.5474 - accuracy: 0.3200
Epoch 566/5000
1/1 - 0s - loss: 2.5463 - accuracy: 0.3200
Epoch 567/5000
1/1 - 0s - loss: 2.5451 - accuracy: 0.3200
Epoch 568/5000
1/1 - 0s - loss: 2.5440 - accuracy: 0.3200
Epoch 569/5000
1/1 - 0s - loss: 2.5429 - accuracy: 0.3200
Epoch 570/5000
1/1 - 0s - loss: 2.5418 - accuracy: 0.3200
Epoch 571/5000
1/1 - 0s - loss: 2.5407 - accuracy: 0.3200
Epoch 572/5000
1/1 - 0s - loss: 2.5395 - accuracy: 0.3200
Epoch 573/5000
1/1 - 0s - loss: 2.5384 - accuracy: 0.3200
Epoch 574/5000
1/1 - 0s - loss: 2.5373 - accuracy: 0.3200
Epoch 575/5000
1/1 - 0s - loss: 2.5362 - accuracy: 0.3200
Epoch 576/5000
1/1 - 0s - loss: 2.5351 - accuracy: 0.3200
Epoch 577/5000
1/1 - 0s - loss: 2.5340 - accuracy: 0.3200
Epoch 578/5000
1/1 - 0s - loss: 2.5329 - accuracy: 0.3200
Epoch 579/5000
1/1 - 0s - loss: 2.5318 - accuracy: 0.3200
Epoch 580/5000
1/1 - 0s - loss: 2.5307 - accuracy: 0.3200
Epoch 581/5000
1/1 - 0s - loss: 2.5296 - accuracy: 0.3200
Epoch 582/5000
1/1 - 0s - los

Epoch 707/5000
1/1 - 0s - loss: 2.4031 - accuracy: 0.4000
Epoch 708/5000
1/1 - 0s - loss: 2.4022 - accuracy: 0.4000
Epoch 709/5000
1/1 - 0s - loss: 2.4013 - accuracy: 0.4000
Epoch 710/5000
1/1 - 0s - loss: 2.4003 - accuracy: 0.4000
Epoch 711/5000
1/1 - 0s - loss: 2.3994 - accuracy: 0.4000
Epoch 712/5000
1/1 - 0s - loss: 2.3985 - accuracy: 0.4000
Epoch 713/5000
1/1 - 0s - loss: 2.3976 - accuracy: 0.4000
Epoch 714/5000
1/1 - 0s - loss: 2.3967 - accuracy: 0.4000
Epoch 715/5000
1/1 - 0s - loss: 2.3958 - accuracy: 0.4000
Epoch 716/5000
1/1 - 0s - loss: 2.3949 - accuracy: 0.4000
Epoch 717/5000
1/1 - 0s - loss: 2.3939 - accuracy: 0.4000
Epoch 718/5000
1/1 - 0s - loss: 2.3930 - accuracy: 0.4000
Epoch 719/5000
1/1 - 0s - loss: 2.3921 - accuracy: 0.4000
Epoch 720/5000
1/1 - 0s - loss: 2.3912 - accuracy: 0.4000
Epoch 721/5000
1/1 - 0s - loss: 2.3903 - accuracy: 0.4000
Epoch 722/5000
1/1 - 0s - loss: 2.3894 - accuracy: 0.4000
Epoch 723/5000
1/1 - 0s - loss: 2.3885 - accuracy: 0.4000
Epoch 724/5000

1/1 - 0s - loss: 2.2824 - accuracy: 0.4400
Epoch 849/5000
1/1 - 0s - loss: 2.2816 - accuracy: 0.4400
Epoch 850/5000
1/1 - 0s - loss: 2.2808 - accuracy: 0.4400
Epoch 851/5000
1/1 - 0s - loss: 2.2800 - accuracy: 0.4400
Epoch 852/5000
1/1 - 0s - loss: 2.2792 - accuracy: 0.4400
Epoch 853/5000
1/1 - 0s - loss: 2.2784 - accuracy: 0.4400
Epoch 854/5000
1/1 - 0s - loss: 2.2776 - accuracy: 0.4400
Epoch 855/5000
1/1 - 0s - loss: 2.2768 - accuracy: 0.4400
Epoch 856/5000
1/1 - 0s - loss: 2.2760 - accuracy: 0.4400
Epoch 857/5000
1/1 - 0s - loss: 2.2753 - accuracy: 0.4400
Epoch 858/5000
1/1 - 0s - loss: 2.2744 - accuracy: 0.4400
Epoch 859/5000
1/1 - 0s - loss: 2.2737 - accuracy: 0.4400
Epoch 860/5000
1/1 - 0s - loss: 2.2729 - accuracy: 0.4400
Epoch 861/5000
1/1 - 0s - loss: 2.2721 - accuracy: 0.4400
Epoch 862/5000
1/1 - 0s - loss: 2.2713 - accuracy: 0.4400
Epoch 863/5000
1/1 - 0s - loss: 2.2705 - accuracy: 0.4400
Epoch 864/5000
1/1 - 0s - loss: 2.2697 - accuracy: 0.4400
Epoch 865/5000
1/1 - 0s - los

Epoch 990/5000
1/1 - 0s - loss: 2.1760 - accuracy: 0.4800
Epoch 991/5000
1/1 - 0s - loss: 2.1753 - accuracy: 0.4800
Epoch 992/5000
1/1 - 0s - loss: 2.1746 - accuracy: 0.4800
Epoch 993/5000
1/1 - 0s - loss: 2.1739 - accuracy: 0.4800
Epoch 994/5000
1/1 - 0s - loss: 2.1732 - accuracy: 0.4800
Epoch 995/5000
1/1 - 0s - loss: 2.1725 - accuracy: 0.4800
Epoch 996/5000
1/1 - 0s - loss: 2.1718 - accuracy: 0.4800
Epoch 997/5000
1/1 - 0s - loss: 2.1711 - accuracy: 0.4800
Epoch 998/5000
1/1 - 0s - loss: 2.1704 - accuracy: 0.4800
Epoch 999/5000
1/1 - 0s - loss: 2.1697 - accuracy: 0.4800
Epoch 1000/5000
1/1 - 0s - loss: 2.1690 - accuracy: 0.4800
Epoch 1001/5000
1/1 - 0s - loss: 2.1683 - accuracy: 0.4800
Epoch 1002/5000
1/1 - 0s - loss: 2.1676 - accuracy: 0.4800
Epoch 1003/5000
1/1 - 0s - loss: 2.1669 - accuracy: 0.4800
Epoch 1004/5000
1/1 - 0s - loss: 2.1662 - accuracy: 0.4800
Epoch 1005/5000
1/1 - 0s - loss: 2.1655 - accuracy: 0.4800
Epoch 1006/5000
1/1 - 0s - loss: 2.1648 - accuracy: 0.4800
Epoch 1

1/1 - 0s - loss: 2.0837 - accuracy: 0.5600
Epoch 1130/5000
1/1 - 0s - loss: 2.0830 - accuracy: 0.5600
Epoch 1131/5000
1/1 - 0s - loss: 2.0824 - accuracy: 0.5600
Epoch 1132/5000
1/1 - 0s - loss: 2.0817 - accuracy: 0.5600
Epoch 1133/5000
1/1 - 0s - loss: 2.0811 - accuracy: 0.5600
Epoch 1134/5000
1/1 - 0s - loss: 2.0805 - accuracy: 0.5600
Epoch 1135/5000
1/1 - 0s - loss: 2.0799 - accuracy: 0.5600
Epoch 1136/5000
1/1 - 0s - loss: 2.0792 - accuracy: 0.5600
Epoch 1137/5000
1/1 - 0s - loss: 2.0786 - accuracy: 0.5600
Epoch 1138/5000
1/1 - 0s - loss: 2.0780 - accuracy: 0.5600
Epoch 1139/5000
1/1 - 0s - loss: 2.0774 - accuracy: 0.5600
Epoch 1140/5000
1/1 - 0s - loss: 2.0768 - accuracy: 0.5600
Epoch 1141/5000
1/1 - 0s - loss: 2.0761 - accuracy: 0.5600
Epoch 1142/5000
1/1 - 0s - loss: 2.0755 - accuracy: 0.5600
Epoch 1143/5000
1/1 - 0s - loss: 2.0749 - accuracy: 0.5600
Epoch 1144/5000
1/1 - 0s - loss: 2.0742 - accuracy: 0.5200
Epoch 1145/5000
1/1 - 0s - loss: 2.0736 - accuracy: 0.5600
Epoch 1146/50

1/1 - 0s - loss: 2.0007 - accuracy: 0.6400
Epoch 1269/5000
1/1 - 0s - loss: 2.0001 - accuracy: 0.6400
Epoch 1270/5000
1/1 - 0s - loss: 1.9995 - accuracy: 0.6400
Epoch 1271/5000
1/1 - 0s - loss: 1.9990 - accuracy: 0.6400
Epoch 1272/5000
1/1 - 0s - loss: 1.9984 - accuracy: 0.6400
Epoch 1273/5000
1/1 - 0s - loss: 1.9978 - accuracy: 0.6400
Epoch 1274/5000
1/1 - 0s - loss: 1.9973 - accuracy: 0.6400
Epoch 1275/5000
1/1 - 0s - loss: 1.9967 - accuracy: 0.6400
Epoch 1276/5000
1/1 - 0s - loss: 1.9962 - accuracy: 0.6400
Epoch 1277/5000
1/1 - 0s - loss: 1.9956 - accuracy: 0.6400
Epoch 1278/5000
1/1 - 0s - loss: 1.9951 - accuracy: 0.6400
Epoch 1279/5000
1/1 - 0s - loss: 1.9945 - accuracy: 0.6400
Epoch 1280/5000
1/1 - 0s - loss: 1.9939 - accuracy: 0.6400
Epoch 1281/5000
1/1 - 0s - loss: 1.9933 - accuracy: 0.6400
Epoch 1282/5000
1/1 - 0s - loss: 1.9927 - accuracy: 0.6400
Epoch 1283/5000
1/1 - 0s - loss: 1.9922 - accuracy: 0.6400
Epoch 1284/5000
1/1 - 0s - loss: 1.9917 - accuracy: 0.6400
Epoch 1285/50

1/1 - 0s - loss: 1.9257 - accuracy: 0.7200
Epoch 1408/5000
1/1 - 0s - loss: 1.9252 - accuracy: 0.7200
Epoch 1409/5000
1/1 - 0s - loss: 1.9246 - accuracy: 0.7200
Epoch 1410/5000
1/1 - 0s - loss: 1.9241 - accuracy: 0.7200
Epoch 1411/5000
1/1 - 0s - loss: 1.9236 - accuracy: 0.7200
Epoch 1412/5000
1/1 - 0s - loss: 1.9231 - accuracy: 0.7200
Epoch 1413/5000
1/1 - 0s - loss: 1.9226 - accuracy: 0.7200
Epoch 1414/5000
1/1 - 0s - loss: 1.9221 - accuracy: 0.7200
Epoch 1415/5000
1/1 - 0s - loss: 1.9216 - accuracy: 0.7200
Epoch 1416/5000
1/1 - 0s - loss: 1.9211 - accuracy: 0.7200
Epoch 1417/5000
1/1 - 0s - loss: 1.9205 - accuracy: 0.7200
Epoch 1418/5000
1/1 - 0s - loss: 1.9200 - accuracy: 0.7200
Epoch 1419/5000
1/1 - 0s - loss: 1.9195 - accuracy: 0.7200
Epoch 1420/5000
1/1 - 0s - loss: 1.9190 - accuracy: 0.7200
Epoch 1421/5000
1/1 - 0s - loss: 1.9185 - accuracy: 0.7200
Epoch 1422/5000
1/1 - 0s - loss: 1.9180 - accuracy: 0.7200
Epoch 1423/5000
1/1 - 0s - loss: 1.9175 - accuracy: 0.7200
Epoch 1424/50

1/1 - 0s - loss: 1.8574 - accuracy: 0.7200
Epoch 1547/5000
1/1 - 0s - loss: 1.8570 - accuracy: 0.7200
Epoch 1548/5000
1/1 - 0s - loss: 1.8566 - accuracy: 0.7200
Epoch 1549/5000
1/1 - 0s - loss: 1.8561 - accuracy: 0.7200
Epoch 1550/5000
1/1 - 0s - loss: 1.8556 - accuracy: 0.7200
Epoch 1551/5000
1/1 - 0s - loss: 1.8551 - accuracy: 0.7200
Epoch 1552/5000
1/1 - 0s - loss: 1.8547 - accuracy: 0.7200
Epoch 1553/5000
1/1 - 0s - loss: 1.8542 - accuracy: 0.7200
Epoch 1554/5000
1/1 - 0s - loss: 1.8537 - accuracy: 0.7200
Epoch 1555/5000
1/1 - 0s - loss: 1.8533 - accuracy: 0.7200
Epoch 1556/5000
1/1 - 0s - loss: 1.8528 - accuracy: 0.7200
Epoch 1557/5000
1/1 - 0s - loss: 1.8523 - accuracy: 0.7200
Epoch 1558/5000
1/1 - 0s - loss: 1.8518 - accuracy: 0.7200
Epoch 1559/5000
1/1 - 0s - loss: 1.8514 - accuracy: 0.7200
Epoch 1560/5000
1/1 - 0s - loss: 1.8509 - accuracy: 0.7200
Epoch 1561/5000
1/1 - 0s - loss: 1.8505 - accuracy: 0.7200
Epoch 1562/5000
1/1 - 0s - loss: 1.8500 - accuracy: 0.7200
Epoch 1563/50

1/1 - 0s - loss: 1.7951 - accuracy: 0.8000
Epoch 1686/5000
1/1 - 0s - loss: 1.7947 - accuracy: 0.8000
Epoch 1687/5000
1/1 - 0s - loss: 1.7943 - accuracy: 0.8000
Epoch 1688/5000
1/1 - 0s - loss: 1.7939 - accuracy: 0.8000
Epoch 1689/5000
1/1 - 0s - loss: 1.7934 - accuracy: 0.8000
Epoch 1690/5000
1/1 - 0s - loss: 1.7930 - accuracy: 0.8000
Epoch 1691/5000
1/1 - 0s - loss: 1.7926 - accuracy: 0.8000
Epoch 1692/5000
1/1 - 0s - loss: 1.7922 - accuracy: 0.8000
Epoch 1693/5000
1/1 - 0s - loss: 1.7917 - accuracy: 0.8000
Epoch 1694/5000
1/1 - 0s - loss: 1.7912 - accuracy: 0.8000
Epoch 1695/5000
1/1 - 0s - loss: 1.7909 - accuracy: 0.8000
Epoch 1696/5000
1/1 - 0s - loss: 1.7904 - accuracy: 0.8000
Epoch 1697/5000
1/1 - 0s - loss: 1.7900 - accuracy: 0.8000
Epoch 1698/5000
1/1 - 0s - loss: 1.7896 - accuracy: 0.8000
Epoch 1699/5000
1/1 - 0s - loss: 1.7891 - accuracy: 0.8000
Epoch 1700/5000
1/1 - 0s - loss: 1.7888 - accuracy: 0.8000
Epoch 1701/5000
1/1 - 0s - loss: 1.7883 - accuracy: 0.8000
Epoch 1702/50

1/1 - 0s - loss: 1.7378 - accuracy: 0.8400
Epoch 1825/5000
1/1 - 0s - loss: 1.7374 - accuracy: 0.8400
Epoch 1826/5000
1/1 - 0s - loss: 1.7370 - accuracy: 0.8400
Epoch 1827/5000
1/1 - 0s - loss: 1.7366 - accuracy: 0.8400
Epoch 1828/5000
1/1 - 0s - loss: 1.7363 - accuracy: 0.8400
Epoch 1829/5000
1/1 - 0s - loss: 1.7358 - accuracy: 0.8400
Epoch 1830/5000
1/1 - 0s - loss: 1.7354 - accuracy: 0.8400
Epoch 1831/5000
1/1 - 0s - loss: 1.7350 - accuracy: 0.8400
Epoch 1832/5000
1/1 - 0s - loss: 1.7347 - accuracy: 0.8400
Epoch 1833/5000
1/1 - 0s - loss: 1.7343 - accuracy: 0.8400
Epoch 1834/5000
1/1 - 0s - loss: 1.7339 - accuracy: 0.8400
Epoch 1835/5000
1/1 - 0s - loss: 1.7334 - accuracy: 0.8400
Epoch 1836/5000
1/1 - 0s - loss: 1.7330 - accuracy: 0.8400
Epoch 1837/5000
1/1 - 0s - loss: 1.7326 - accuracy: 0.8400
Epoch 1838/5000
1/1 - 0s - loss: 1.7323 - accuracy: 0.8400
Epoch 1839/5000
1/1 - 0s - loss: 1.7319 - accuracy: 0.8400
Epoch 1840/5000
1/1 - 0s - loss: 1.7315 - accuracy: 0.8400
Epoch 1841/50

1/1 - 0s - loss: 1.6847 - accuracy: 0.8400
Epoch 1964/5000
1/1 - 0s - loss: 1.6844 - accuracy: 0.8400
Epoch 1965/5000
1/1 - 0s - loss: 1.6840 - accuracy: 0.8400
Epoch 1966/5000
1/1 - 0s - loss: 1.6836 - accuracy: 0.8400
Epoch 1967/5000
1/1 - 0s - loss: 1.6832 - accuracy: 0.8400
Epoch 1968/5000
1/1 - 0s - loss: 1.6829 - accuracy: 0.8400
Epoch 1969/5000
1/1 - 0s - loss: 1.6825 - accuracy: 0.8400
Epoch 1970/5000
1/1 - 0s - loss: 1.6821 - accuracy: 0.8400
Epoch 1971/5000
1/1 - 0s - loss: 1.6818 - accuracy: 0.8400
Epoch 1972/5000
1/1 - 0s - loss: 1.6814 - accuracy: 0.8400
Epoch 1973/5000
1/1 - 0s - loss: 1.6810 - accuracy: 0.8400
Epoch 1974/5000
1/1 - 0s - loss: 1.6806 - accuracy: 0.8400
Epoch 1975/5000
1/1 - 0s - loss: 1.6803 - accuracy: 0.8400
Epoch 1976/5000
1/1 - 0s - loss: 1.6800 - accuracy: 0.8400
Epoch 1977/5000
1/1 - 0s - loss: 1.6796 - accuracy: 0.8400
Epoch 1978/5000
1/1 - 0s - loss: 1.6792 - accuracy: 0.8400
Epoch 1979/5000
1/1 - 0s - loss: 1.6788 - accuracy: 0.8400
Epoch 1980/50

1/1 - 0s - loss: 1.6351 - accuracy: 0.8400
Epoch 2103/5000
1/1 - 0s - loss: 1.6347 - accuracy: 0.8400
Epoch 2104/5000
1/1 - 0s - loss: 1.6344 - accuracy: 0.8400
Epoch 2105/5000
1/1 - 0s - loss: 1.6341 - accuracy: 0.8400
Epoch 2106/5000
1/1 - 0s - loss: 1.6337 - accuracy: 0.8400
Epoch 2107/5000
1/1 - 0s - loss: 1.6334 - accuracy: 0.8400
Epoch 2108/5000
1/1 - 0s - loss: 1.6330 - accuracy: 0.8400
Epoch 2109/5000
1/1 - 0s - loss: 1.6327 - accuracy: 0.8400
Epoch 2110/5000
1/1 - 0s - loss: 1.6324 - accuracy: 0.8400
Epoch 2111/5000
1/1 - 0s - loss: 1.6319 - accuracy: 0.8400
Epoch 2112/5000
1/1 - 0s - loss: 1.6317 - accuracy: 0.8400
Epoch 2113/5000
1/1 - 0s - loss: 1.6313 - accuracy: 0.8400
Epoch 2114/5000
1/1 - 0s - loss: 1.6310 - accuracy: 0.8400
Epoch 2115/5000
1/1 - 0s - loss: 1.6306 - accuracy: 0.8400
Epoch 2116/5000
1/1 - 0s - loss: 1.6303 - accuracy: 0.8400
Epoch 2117/5000
1/1 - 0s - loss: 1.6299 - accuracy: 0.8400
Epoch 2118/5000
1/1 - 0s - loss: 1.6295 - accuracy: 0.8400
Epoch 2119/50

1/1 - 0s - loss: 1.5884 - accuracy: 0.8400
Epoch 2242/5000
1/1 - 0s - loss: 1.5881 - accuracy: 0.8400
Epoch 2243/5000
1/1 - 0s - loss: 1.5878 - accuracy: 0.8400
Epoch 2244/5000
1/1 - 0s - loss: 1.5875 - accuracy: 0.8400
Epoch 2245/5000
1/1 - 0s - loss: 1.5871 - accuracy: 0.8400
Epoch 2246/5000
1/1 - 0s - loss: 1.5869 - accuracy: 0.8400
Epoch 2247/5000
1/1 - 0s - loss: 1.5865 - accuracy: 0.8400
Epoch 2248/5000
1/1 - 0s - loss: 1.5861 - accuracy: 0.8400
Epoch 2249/5000
1/1 - 0s - loss: 1.5858 - accuracy: 0.8400
Epoch 2250/5000
1/1 - 0s - loss: 1.5855 - accuracy: 0.8400
Epoch 2251/5000
1/1 - 0s - loss: 1.5852 - accuracy: 0.8000
Epoch 2252/5000
1/1 - 0s - loss: 1.5848 - accuracy: 0.8400
Epoch 2253/5000
1/1 - 0s - loss: 1.5845 - accuracy: 0.8400
Epoch 2254/5000
1/1 - 0s - loss: 1.5842 - accuracy: 0.8400
Epoch 2255/5000
1/1 - 0s - loss: 1.5839 - accuracy: 0.8400
Epoch 2256/5000
1/1 - 0s - loss: 1.5836 - accuracy: 0.8400
Epoch 2257/5000
1/1 - 0s - loss: 1.5833 - accuracy: 0.8400
Epoch 2258/50

1/1 - 0s - loss: 1.5441 - accuracy: 0.8400
Epoch 2381/5000
1/1 - 0s - loss: 1.5438 - accuracy: 0.8800
Epoch 2382/5000
1/1 - 0s - loss: 1.5435 - accuracy: 0.9200
Epoch 2383/5000
1/1 - 0s - loss: 1.5432 - accuracy: 0.9200
Epoch 2384/5000
1/1 - 0s - loss: 1.5429 - accuracy: 0.8800
Epoch 2385/5000
1/1 - 0s - loss: 1.5426 - accuracy: 0.9200
Epoch 2386/5000
1/1 - 0s - loss: 1.5423 - accuracy: 0.9200
Epoch 2387/5000
1/1 - 0s - loss: 1.5420 - accuracy: 0.9200
Epoch 2388/5000
1/1 - 0s - loss: 1.5416 - accuracy: 0.8800
Epoch 2389/5000
1/1 - 0s - loss: 1.5413 - accuracy: 0.9200
Epoch 2390/5000
1/1 - 0s - loss: 1.5411 - accuracy: 0.9200
Epoch 2391/5000
1/1 - 0s - loss: 1.5407 - accuracy: 0.8800
Epoch 2392/5000
1/1 - 0s - loss: 1.5404 - accuracy: 0.8800
Epoch 2393/5000
1/1 - 0s - loss: 1.5401 - accuracy: 0.9200
Epoch 2394/5000
1/1 - 0s - loss: 1.5398 - accuracy: 0.8800
Epoch 2395/5000
1/1 - 0s - loss: 1.5395 - accuracy: 0.8800
Epoch 2396/5000
1/1 - 0s - loss: 1.5391 - accuracy: 0.8800
Epoch 2397/50

1/1 - 0s - loss: 1.5018 - accuracy: 0.9200
Epoch 2520/5000
1/1 - 0s - loss: 1.5015 - accuracy: 0.9200
Epoch 2521/5000
1/1 - 0s - loss: 1.5013 - accuracy: 0.9200
Epoch 2522/5000
1/1 - 0s - loss: 1.5009 - accuracy: 0.9200
Epoch 2523/5000
1/1 - 0s - loss: 1.5006 - accuracy: 0.9200
Epoch 2524/5000
1/1 - 0s - loss: 1.5004 - accuracy: 0.9200
Epoch 2525/5000
1/1 - 0s - loss: 1.5000 - accuracy: 0.9200
Epoch 2526/5000
1/1 - 0s - loss: 1.4997 - accuracy: 0.9200
Epoch 2527/5000
1/1 - 0s - loss: 1.4994 - accuracy: 0.9200
Epoch 2528/5000
1/1 - 0s - loss: 1.4992 - accuracy: 0.9200
Epoch 2529/5000
1/1 - 0s - loss: 1.4989 - accuracy: 0.9200
Epoch 2530/5000
1/1 - 0s - loss: 1.4986 - accuracy: 0.9200
Epoch 2531/5000
1/1 - 0s - loss: 1.4983 - accuracy: 0.9200
Epoch 2532/5000
1/1 - 0s - loss: 1.4979 - accuracy: 0.9200
Epoch 2533/5000
1/1 - 0s - loss: 1.4976 - accuracy: 0.9200
Epoch 2534/5000
1/1 - 0s - loss: 1.4973 - accuracy: 0.9200
Epoch 2535/5000
1/1 - 0s - loss: 1.4971 - accuracy: 0.9200
Epoch 2536/50

1/1 - 0s - loss: 1.4612 - accuracy: 0.9200
Epoch 2659/5000
1/1 - 0s - loss: 1.4609 - accuracy: 0.9200
Epoch 2660/5000
1/1 - 0s - loss: 1.4606 - accuracy: 0.9200
Epoch 2661/5000
1/1 - 0s - loss: 1.4603 - accuracy: 0.9200
Epoch 2662/5000
1/1 - 0s - loss: 1.4600 - accuracy: 0.9200
Epoch 2663/5000
1/1 - 0s - loss: 1.4597 - accuracy: 0.9200
Epoch 2664/5000
1/1 - 0s - loss: 1.4595 - accuracy: 0.9200
Epoch 2665/5000
1/1 - 0s - loss: 1.4591 - accuracy: 0.9200
Epoch 2666/5000
1/1 - 0s - loss: 1.4588 - accuracy: 0.9200
Epoch 2667/5000
1/1 - 0s - loss: 1.4586 - accuracy: 0.9200
Epoch 2668/5000
1/1 - 0s - loss: 1.4583 - accuracy: 0.9200
Epoch 2669/5000
1/1 - 0s - loss: 1.4581 - accuracy: 0.9200
Epoch 2670/5000
1/1 - 0s - loss: 1.4577 - accuracy: 0.9200
Epoch 2671/5000
1/1 - 0s - loss: 1.4574 - accuracy: 0.9200
Epoch 2672/5000
1/1 - 0s - loss: 1.4572 - accuracy: 0.9200
Epoch 2673/5000
1/1 - 0s - loss: 1.4569 - accuracy: 0.9200
Epoch 2674/5000
1/1 - 0s - loss: 1.4566 - accuracy: 0.9200
Epoch 2675/50

1/1 - 0s - loss: 1.4219 - accuracy: 0.9200
Epoch 2798/5000
1/1 - 0s - loss: 1.4217 - accuracy: 0.9200
Epoch 2799/5000
1/1 - 0s - loss: 1.4214 - accuracy: 0.9200
Epoch 2800/5000
1/1 - 0s - loss: 1.4211 - accuracy: 0.9200
Epoch 2801/5000
1/1 - 0s - loss: 1.4209 - accuracy: 0.9200
Epoch 2802/5000
1/1 - 0s - loss: 1.4205 - accuracy: 0.9200
Epoch 2803/5000
1/1 - 0s - loss: 1.4203 - accuracy: 0.9200
Epoch 2804/5000
1/1 - 0s - loss: 1.4200 - accuracy: 0.9200
Epoch 2805/5000
1/1 - 0s - loss: 1.4197 - accuracy: 0.9200
Epoch 2806/5000
1/1 - 0s - loss: 1.4194 - accuracy: 0.9200
Epoch 2807/5000
1/1 - 0s - loss: 1.4192 - accuracy: 0.9200
Epoch 2808/5000
1/1 - 0s - loss: 1.4189 - accuracy: 0.9200
Epoch 2809/5000
1/1 - 0s - loss: 1.4186 - accuracy: 0.9200
Epoch 2810/5000
1/1 - 0s - loss: 1.4184 - accuracy: 0.9200
Epoch 2811/5000
1/1 - 0s - loss: 1.4181 - accuracy: 0.9200
Epoch 2812/5000
1/1 - 0s - loss: 1.4178 - accuracy: 0.9200
Epoch 2813/5000
1/1 - 0s - loss: 1.4176 - accuracy: 0.9200
Epoch 2814/50

1/1 - 0s - loss: 1.3840 - accuracy: 0.9200
Epoch 2937/5000
1/1 - 0s - loss: 1.3838 - accuracy: 0.9200
Epoch 2938/5000
1/1 - 0s - loss: 1.3834 - accuracy: 0.9200
Epoch 2939/5000
1/1 - 0s - loss: 1.3831 - accuracy: 0.9200
Epoch 2940/5000
1/1 - 0s - loss: 1.3829 - accuracy: 0.9200
Epoch 2941/5000
1/1 - 0s - loss: 1.3827 - accuracy: 0.9200
Epoch 2942/5000
1/1 - 0s - loss: 1.3824 - accuracy: 0.9200
Epoch 2943/5000
1/1 - 0s - loss: 1.3821 - accuracy: 0.9200
Epoch 2944/5000
1/1 - 0s - loss: 1.3818 - accuracy: 0.9200
Epoch 2945/5000
1/1 - 0s - loss: 1.3816 - accuracy: 0.9200
Epoch 2946/5000
1/1 - 0s - loss: 1.3813 - accuracy: 0.9200
Epoch 2947/5000
1/1 - 0s - loss: 1.3810 - accuracy: 0.9200
Epoch 2948/5000
1/1 - 0s - loss: 1.3807 - accuracy: 0.9200
Epoch 2949/5000
1/1 - 0s - loss: 1.3805 - accuracy: 0.9200
Epoch 2950/5000
1/1 - 0s - loss: 1.3802 - accuracy: 0.9200
Epoch 2951/5000
1/1 - 0s - loss: 1.3800 - accuracy: 0.9200
Epoch 2952/5000
1/1 - 0s - loss: 1.3797 - accuracy: 0.9200
Epoch 2953/50

1/1 - 0s - loss: 1.3471 - accuracy: 0.9200
Epoch 3076/5000
1/1 - 0s - loss: 1.3469 - accuracy: 0.9200
Epoch 3077/5000
1/1 - 0s - loss: 1.3466 - accuracy: 0.9200
Epoch 3078/5000
1/1 - 0s - loss: 1.3464 - accuracy: 0.9200
Epoch 3079/5000
1/1 - 0s - loss: 1.3461 - accuracy: 0.9200
Epoch 3080/5000
1/1 - 0s - loss: 1.3458 - accuracy: 0.9200
Epoch 3081/5000
1/1 - 0s - loss: 1.3456 - accuracy: 0.9200
Epoch 3082/5000
1/1 - 0s - loss: 1.3453 - accuracy: 0.9200
Epoch 3083/5000
1/1 - 0s - loss: 1.3451 - accuracy: 0.9200
Epoch 3084/5000
1/1 - 0s - loss: 1.3448 - accuracy: 0.9200
Epoch 3085/5000
1/1 - 0s - loss: 1.3445 - accuracy: 0.9200
Epoch 3086/5000
1/1 - 0s - loss: 1.3443 - accuracy: 0.9200
Epoch 3087/5000
1/1 - 0s - loss: 1.3440 - accuracy: 0.9200
Epoch 3088/5000
1/1 - 0s - loss: 1.3437 - accuracy: 0.9200
Epoch 3089/5000
1/1 - 0s - loss: 1.3435 - accuracy: 0.9200
Epoch 3090/5000
1/1 - 0s - loss: 1.3433 - accuracy: 0.9200
Epoch 3091/5000
1/1 - 0s - loss: 1.3429 - accuracy: 0.9200
Epoch 3092/50

1/1 - 0s - loss: 1.3113 - accuracy: 0.9200
Epoch 3215/5000
1/1 - 0s - loss: 1.3111 - accuracy: 0.9200
Epoch 3216/5000
1/1 - 0s - loss: 1.3109 - accuracy: 0.9200
Epoch 3217/5000
1/1 - 0s - loss: 1.3105 - accuracy: 0.9200
Epoch 3218/5000
1/1 - 0s - loss: 1.3103 - accuracy: 0.9200
Epoch 3219/5000
1/1 - 0s - loss: 1.3101 - accuracy: 0.9200
Epoch 3220/5000
1/1 - 0s - loss: 1.3098 - accuracy: 0.9200
Epoch 3221/5000
1/1 - 0s - loss: 1.3095 - accuracy: 0.9200
Epoch 3222/5000
1/1 - 0s - loss: 1.3093 - accuracy: 0.9200
Epoch 3223/5000
1/1 - 0s - loss: 1.3090 - accuracy: 0.9200
Epoch 3224/5000
1/1 - 0s - loss: 1.3088 - accuracy: 0.9200
Epoch 3225/5000
1/1 - 0s - loss: 1.3085 - accuracy: 0.9200
Epoch 3226/5000
1/1 - 0s - loss: 1.3082 - accuracy: 0.9200
Epoch 3227/5000
1/1 - 0s - loss: 1.3081 - accuracy: 0.9200
Epoch 3228/5000
1/1 - 0s - loss: 1.3077 - accuracy: 0.9200
Epoch 3229/5000
1/1 - 0s - loss: 1.3075 - accuracy: 0.9200
Epoch 3230/5000
1/1 - 0s - loss: 1.3073 - accuracy: 0.9200
Epoch 3231/50

1/1 - 0s - loss: 1.2765 - accuracy: 0.9200
Epoch 3354/5000
1/1 - 0s - loss: 1.2762 - accuracy: 0.9200
Epoch 3355/5000
1/1 - 0s - loss: 1.2760 - accuracy: 0.9200
Epoch 3356/5000
1/1 - 0s - loss: 1.2758 - accuracy: 0.9200
Epoch 3357/5000
1/1 - 0s - loss: 1.2755 - accuracy: 0.9200
Epoch 3358/5000
1/1 - 0s - loss: 1.2752 - accuracy: 0.9200
Epoch 3359/5000
1/1 - 0s - loss: 1.2750 - accuracy: 0.9200
Epoch 3360/5000
1/1 - 0s - loss: 1.2747 - accuracy: 0.9200
Epoch 3361/5000
1/1 - 0s - loss: 1.2745 - accuracy: 0.9200
Epoch 3362/5000
1/1 - 0s - loss: 1.2743 - accuracy: 0.9200
Epoch 3363/5000
1/1 - 0s - loss: 1.2741 - accuracy: 0.9200
Epoch 3364/5000
1/1 - 0s - loss: 1.2738 - accuracy: 0.9200
Epoch 3365/5000
1/1 - 0s - loss: 1.2735 - accuracy: 0.9200
Epoch 3366/5000
1/1 - 0s - loss: 1.2733 - accuracy: 0.9200
Epoch 3367/5000
1/1 - 0s - loss: 1.2731 - accuracy: 0.9200
Epoch 3368/5000
1/1 - 0s - loss: 1.2728 - accuracy: 0.9200
Epoch 3369/5000
1/1 - 0s - loss: 1.2726 - accuracy: 0.9200
Epoch 3370/50

1/1 - 0s - loss: 1.2425 - accuracy: 0.9200
Epoch 3493/5000
1/1 - 0s - loss: 1.2424 - accuracy: 0.9200
Epoch 3494/5000
1/1 - 0s - loss: 1.2421 - accuracy: 0.9200
Epoch 3495/5000
1/1 - 0s - loss: 1.2419 - accuracy: 0.9200
Epoch 3496/5000
1/1 - 0s - loss: 1.2416 - accuracy: 0.9200
Epoch 3497/5000
1/1 - 0s - loss: 1.2414 - accuracy: 0.9200
Epoch 3498/5000
1/1 - 0s - loss: 1.2412 - accuracy: 0.9200
Epoch 3499/5000
1/1 - 0s - loss: 1.2409 - accuracy: 0.9200
Epoch 3500/5000
1/1 - 0s - loss: 1.2407 - accuracy: 0.9200
Epoch 3501/5000
1/1 - 0s - loss: 1.2405 - accuracy: 0.9200
Epoch 3502/5000
1/1 - 0s - loss: 1.2402 - accuracy: 0.9200
Epoch 3503/5000
1/1 - 0s - loss: 1.2399 - accuracy: 0.9200
Epoch 3504/5000
1/1 - 0s - loss: 1.2397 - accuracy: 0.9200
Epoch 3505/5000
1/1 - 0s - loss: 1.2395 - accuracy: 0.9200
Epoch 3506/5000
1/1 - 0s - loss: 1.2393 - accuracy: 0.9200
Epoch 3507/5000
1/1 - 0s - loss: 1.2390 - accuracy: 0.9200
Epoch 3508/5000
1/1 - 0s - loss: 1.2387 - accuracy: 0.9200
Epoch 3509/50

1/1 - 0s - loss: 1.2096 - accuracy: 0.9200
Epoch 3632/5000
1/1 - 0s - loss: 1.2094 - accuracy: 0.9200
Epoch 3633/5000
1/1 - 0s - loss: 1.2091 - accuracy: 0.9200
Epoch 3634/5000
1/1 - 0s - loss: 1.2089 - accuracy: 0.9200
Epoch 3635/5000
1/1 - 0s - loss: 1.2087 - accuracy: 0.9200
Epoch 3636/5000
1/1 - 0s - loss: 1.2084 - accuracy: 0.9200
Epoch 3637/5000
1/1 - 0s - loss: 1.2082 - accuracy: 0.9200
Epoch 3638/5000
1/1 - 0s - loss: 1.2079 - accuracy: 0.9200
Epoch 3639/5000
1/1 - 0s - loss: 1.2078 - accuracy: 0.9200
Epoch 3640/5000
1/1 - 0s - loss: 1.2075 - accuracy: 0.9200
Epoch 3641/5000
1/1 - 0s - loss: 1.2072 - accuracy: 0.9200
Epoch 3642/5000
1/1 - 0s - loss: 1.2070 - accuracy: 0.9200
Epoch 3643/5000
1/1 - 0s - loss: 1.2067 - accuracy: 0.9200
Epoch 3644/5000
1/1 - 0s - loss: 1.2066 - accuracy: 0.9200
Epoch 3645/5000
1/1 - 0s - loss: 1.2063 - accuracy: 0.9200
Epoch 3646/5000
1/1 - 0s - loss: 1.2061 - accuracy: 0.9200
Epoch 3647/5000
1/1 - 0s - loss: 1.2059 - accuracy: 0.9200
Epoch 3648/50

1/1 - 0s - loss: 1.1775 - accuracy: 0.9200
Epoch 3771/5000
1/1 - 0s - loss: 1.1772 - accuracy: 0.9200
Epoch 3772/5000
1/1 - 0s - loss: 1.1770 - accuracy: 0.9200
Epoch 3773/5000
1/1 - 0s - loss: 1.1767 - accuracy: 0.9200
Epoch 3774/5000
1/1 - 0s - loss: 1.1765 - accuracy: 0.9200
Epoch 3775/5000
1/1 - 0s - loss: 1.1764 - accuracy: 0.9200
Epoch 3776/5000
1/1 - 0s - loss: 1.1761 - accuracy: 0.9200
Epoch 3777/5000
1/1 - 0s - loss: 1.1759 - accuracy: 0.9200
Epoch 3778/5000
1/1 - 0s - loss: 1.1756 - accuracy: 0.9200
Epoch 3779/5000
1/1 - 0s - loss: 1.1754 - accuracy: 0.9200
Epoch 3780/5000
1/1 - 0s - loss: 1.1752 - accuracy: 0.9200
Epoch 3781/5000
1/1 - 0s - loss: 1.1749 - accuracy: 0.9200
Epoch 3782/5000
1/1 - 0s - loss: 1.1747 - accuracy: 0.9200
Epoch 3783/5000
1/1 - 0s - loss: 1.1745 - accuracy: 0.9200
Epoch 3784/5000
1/1 - 0s - loss: 1.1743 - accuracy: 0.9200
Epoch 3785/5000
1/1 - 0s - loss: 1.1740 - accuracy: 0.9200
Epoch 3786/5000
1/1 - 0s - loss: 1.1738 - accuracy: 0.9200
Epoch 3787/50

1/1 - 0s - loss: 1.1462 - accuracy: 0.9200
Epoch 3910/5000
1/1 - 0s - loss: 1.1459 - accuracy: 0.9200
Epoch 3911/5000
1/1 - 0s - loss: 1.1457 - accuracy: 0.9200
Epoch 3912/5000
1/1 - 0s - loss: 1.1455 - accuracy: 0.9200
Epoch 3913/5000
1/1 - 0s - loss: 1.1452 - accuracy: 0.9200
Epoch 3914/5000
1/1 - 0s - loss: 1.1450 - accuracy: 0.9200
Epoch 3915/5000
1/1 - 0s - loss: 1.1448 - accuracy: 0.9200
Epoch 3916/5000
1/1 - 0s - loss: 1.1446 - accuracy: 0.9200
Epoch 3917/5000
1/1 - 0s - loss: 1.1444 - accuracy: 0.9200
Epoch 3918/5000
1/1 - 0s - loss: 1.1442 - accuracy: 0.9200
Epoch 3919/5000
1/1 - 0s - loss: 1.1439 - accuracy: 0.9200
Epoch 3920/5000
1/1 - 0s - loss: 1.1437 - accuracy: 0.9200
Epoch 3921/5000
1/1 - 0s - loss: 1.1435 - accuracy: 0.9200
Epoch 3922/5000
1/1 - 0s - loss: 1.1432 - accuracy: 0.9200
Epoch 3923/5000
1/1 - 0s - loss: 1.1431 - accuracy: 0.9200
Epoch 3924/5000
1/1 - 0s - loss: 1.1428 - accuracy: 0.9200
Epoch 3925/5000
1/1 - 0s - loss: 1.1426 - accuracy: 0.9200
Epoch 3926/50

1/1 - 0s - loss: 1.1157 - accuracy: 0.9200
Epoch 4049/5000
1/1 - 0s - loss: 1.1154 - accuracy: 0.9200
Epoch 4050/5000
1/1 - 0s - loss: 1.1153 - accuracy: 0.9200
Epoch 4051/5000
1/1 - 0s - loss: 1.1151 - accuracy: 0.9200
Epoch 4052/5000
1/1 - 0s - loss: 1.1148 - accuracy: 0.9200
Epoch 4053/5000
1/1 - 0s - loss: 1.1146 - accuracy: 0.9200
Epoch 4054/5000
1/1 - 0s - loss: 1.1144 - accuracy: 0.9200
Epoch 4055/5000
1/1 - 0s - loss: 1.1142 - accuracy: 0.9200
Epoch 4056/5000
1/1 - 0s - loss: 1.1140 - accuracy: 0.9200
Epoch 4057/5000
1/1 - 0s - loss: 1.1138 - accuracy: 0.9200
Epoch 4058/5000
1/1 - 0s - loss: 1.1135 - accuracy: 0.9200
Epoch 4059/5000
1/1 - 0s - loss: 1.1133 - accuracy: 0.9200
Epoch 4060/5000
1/1 - 0s - loss: 1.1131 - accuracy: 0.9200
Epoch 4061/5000
1/1 - 0s - loss: 1.1128 - accuracy: 0.9200
Epoch 4062/5000
1/1 - 0s - loss: 1.1126 - accuracy: 0.9200
Epoch 4063/5000
1/1 - 0s - loss: 1.1124 - accuracy: 0.9200
Epoch 4064/5000
1/1 - 0s - loss: 1.1122 - accuracy: 0.9200
...

1/1 - 0s - loss: 1.0860 - accuracy: 0.9600
Epoch 4188/5000
1/1 - 0s - loss: 1.0858 - accuracy: 0.9600
Epoch 4189/5000
1/1 - 0s - loss: 1.0856 - accuracy: 1.0000
Epoch 4190/5000
1/1 - 0s - loss: 1.0854 - accuracy: 0.9200
Epoch 4191/5000
1/1 - 0s - loss: 1.0852 - accuracy: 1.0000
Epoch 4192/5000
1/1 - 0s - loss: 1.0849 - accuracy: 0.9600
Epoch 4193/5000
1/1 - 0s - loss: 1.0847 - accuracy: 0.9600
Epoch 4194/5000
1/1 - 0s - loss: 1.0845 - accuracy: 0.9600
Epoch 4195/5000
1/1 - 0s - loss: 1.0843 - accuracy: 0.9600
Epoch 4196/5000
1/1 - 0s - loss: 1.0840 - accuracy: 0.9200
Epoch 4197/5000
1/1 - 0s - loss: 1.0839 - accuracy: 0.9600
Epoch 4198/5000
1/1 - 0s - loss: 1.0837 - accuracy: 0.9600
Epoch 4199/5000
1/1 - 0s - loss: 1.0835 - accuracy: 1.0000
Epoch 4200/5000
1/1 - 0s - loss: 1.0833 - accuracy: 0.9600
Epoch 4201/5000
1/1 - 0s - loss: 1.0830 - accuracy: 0.9200
Epoch 4202/5000
1/1 - 0s - loss: 1.0828 - accuracy: 1.0000
Epoch 4203/5000
1/1 - 0s - loss: 1.0826 - accuracy: 0.9600
...

1/1 - 0s - loss: 1.0570 - accuracy: 1.0000
Epoch 4327/5000
1/1 - 0s - loss: 1.0568 - accuracy: 1.0000
Epoch 4328/5000
1/1 - 0s - loss: 1.0567 - accuracy: 1.0000
Epoch 4329/5000
1/1 - 0s - loss: 1.0565 - accuracy: 1.0000
Epoch 4330/5000
1/1 - 0s - loss: 1.0562 - accuracy: 1.0000
Epoch 4331/5000
1/1 - 0s - loss: 1.0560 - accuracy: 1.0000
Epoch 4332/5000
1/1 - 0s - loss: 1.0558 - accuracy: 1.0000
Epoch 4333/5000
1/1 - 0s - loss: 1.0556 - accuracy: 1.0000
Epoch 4334/5000
1/1 - 0s - loss: 1.0554 - accuracy: 1.0000
Epoch 4335/5000
1/1 - 0s - loss: 1.0552 - accuracy: 1.0000
Epoch 4336/5000
1/1 - 0s - loss: 1.0550 - accuracy: 1.0000
Epoch 4337/5000
1/1 - 0s - loss: 1.0548 - accuracy: 1.0000
Epoch 4338/5000
1/1 - 0s - loss: 1.0546 - accuracy: 1.0000
Epoch 4339/5000
1/1 - 0s - loss: 1.0544 - accuracy: 1.0000
Epoch 4340/5000
1/1 - 0s - loss: 1.0542 - accuracy: 1.0000
Epoch 4341/5000
1/1 - 0s - loss: 1.0540 - accuracy: 1.0000
Epoch 4342/5000
1/1 - 0s - loss: 1.0538 - accuracy: 1.0000
...

1/1 - 0s - loss: 1.0288 - accuracy: 1.0000
Epoch 4466/5000
1/1 - 0s - loss: 1.0286 - accuracy: 1.0000
Epoch 4467/5000
1/1 - 0s - loss: 1.0285 - accuracy: 1.0000
Epoch 4468/5000
1/1 - 0s - loss: 1.0283 - accuracy: 1.0000
Epoch 4469/5000
1/1 - 0s - loss: 1.0281 - accuracy: 1.0000
Epoch 4470/5000
1/1 - 0s - loss: 1.0279 - accuracy: 1.0000
Epoch 4471/5000
1/1 - 0s - loss: 1.0276 - accuracy: 1.0000
Epoch 4472/5000
1/1 - 0s - loss: 1.0274 - accuracy: 1.0000
Epoch 4473/5000
1/1 - 0s - loss: 1.0273 - accuracy: 1.0000
Epoch 4474/5000
1/1 - 0s - loss: 1.0271 - accuracy: 1.0000
Epoch 4475/5000
1/1 - 0s - loss: 1.0268 - accuracy: 1.0000
Epoch 4476/5000
1/1 - 0s - loss: 1.0266 - accuracy: 1.0000
Epoch 4477/5000
1/1 - 0s - loss: 1.0265 - accuracy: 1.0000
Epoch 4478/5000
1/1 - 0s - loss: 1.0264 - accuracy: 1.0000
Epoch 4479/5000
1/1 - 0s - loss: 1.0261 - accuracy: 1.0000
Epoch 4480/5000
1/1 - 0s - loss: 1.0259 - accuracy: 1.0000
Epoch 4481/5000
1/1 - 0s - loss: 1.0257 - accuracy: 1.0000
...

1/1 - 0s - loss: 1.0014 - accuracy: 1.0000
Epoch 4605/5000
1/1 - 0s - loss: 1.0012 - accuracy: 1.0000
Epoch 4606/5000
1/1 - 0s - loss: 1.0011 - accuracy: 1.0000
Epoch 4607/5000
1/1 - 0s - loss: 1.0008 - accuracy: 1.0000
Epoch 4608/5000
1/1 - 0s - loss: 1.0006 - accuracy: 1.0000
Epoch 4609/5000
1/1 - 0s - loss: 1.0005 - accuracy: 1.0000
Epoch 4610/5000
1/1 - 0s - loss: 1.0003 - accuracy: 1.0000
Epoch 4611/5000
1/1 - 0s - loss: 1.0000 - accuracy: 1.0000
Epoch 4612/5000
1/1 - 0s - loss: 0.9999 - accuracy: 1.0000
Epoch 4613/5000
1/1 - 0s - loss: 0.9996 - accuracy: 1.0000
Epoch 4614/5000
1/1 - 0s - loss: 0.9994 - accuracy: 1.0000
Epoch 4615/5000
1/1 - 0s - loss: 0.9993 - accuracy: 1.0000
Epoch 4616/5000
1/1 - 0s - loss: 0.9990 - accuracy: 1.0000
Epoch 4617/5000
1/1 - 0s - loss: 0.9989 - accuracy: 1.0000
Epoch 4618/5000
1/1 - 0s - loss: 0.9987 - accuracy: 1.0000
Epoch 4619/5000
1/1 - 0s - loss: 0.9985 - accuracy: 1.0000
Epoch 4620/5000
1/1 - 0s - loss: 0.9983 - accuracy: 1.0000
...

1/1 - 0s - loss: 0.9746 - accuracy: 1.0000
Epoch 4744/5000
1/1 - 0s - loss: 0.9745 - accuracy: 1.0000
Epoch 4745/5000
1/1 - 0s - loss: 0.9743 - accuracy: 1.0000
Epoch 4746/5000
1/1 - 0s - loss: 0.9740 - accuracy: 1.0000
Epoch 4747/5000
1/1 - 0s - loss: 0.9738 - accuracy: 1.0000
Epoch 4748/5000
1/1 - 0s - loss: 0.9736 - accuracy: 1.0000
Epoch 4749/5000
1/1 - 0s - loss: 0.9735 - accuracy: 1.0000
Epoch 4750/5000
1/1 - 0s - loss: 0.9733 - accuracy: 1.0000
Epoch 4751/5000
1/1 - 0s - loss: 0.9731 - accuracy: 1.0000
Epoch 4752/5000
1/1 - 0s - loss: 0.9730 - accuracy: 1.0000
Epoch 4753/5000
1/1 - 0s - loss: 0.9728 - accuracy: 1.0000
Epoch 4754/5000
1/1 - 0s - loss: 0.9726 - accuracy: 1.0000
Epoch 4755/5000
1/1 - 0s - loss: 0.9724 - accuracy: 1.0000
Epoch 4756/5000
1/1 - 0s - loss: 0.9722 - accuracy: 1.0000
Epoch 4757/5000
1/1 - 0s - loss: 0.9721 - accuracy: 1.0000
Epoch 4758/5000
1/1 - 0s - loss: 0.9718 - accuracy: 1.0000
Epoch 4759/5000
1/1 - 0s - loss: 0.9716 - accuracy: 1.0000
...

1/1 - 0s - loss: 0.9486 - accuracy: 1.0000
Epoch 4883/5000
1/1 - 0s - loss: 0.9484 - accuracy: 1.0000
Epoch 4884/5000
1/1 - 0s - loss: 0.9482 - accuracy: 1.0000
Epoch 4885/5000
1/1 - 0s - loss: 0.9481 - accuracy: 1.0000
Epoch 4886/5000
1/1 - 0s - loss: 0.9478 - accuracy: 1.0000
Epoch 4887/5000
1/1 - 0s - loss: 0.9476 - accuracy: 1.0000
Epoch 4888/5000
1/1 - 0s - loss: 0.9475 - accuracy: 1.0000
Epoch 4889/5000
1/1 - 0s - loss: 0.9473 - accuracy: 1.0000
Epoch 4890/5000
1/1 - 0s - loss: 0.9471 - accuracy: 1.0000
Epoch 4891/5000
1/1 - 0s - loss: 0.9469 - accuracy: 1.0000
Epoch 4892/5000
1/1 - 0s - loss: 0.9467 - accuracy: 1.0000
Epoch 4893/5000
1/1 - 0s - loss: 0.9465 - accuracy: 1.0000
Epoch 4894/5000
1/1 - 0s - loss: 0.9464 - accuracy: 1.0000
Epoch 4895/5000
1/1 - 0s - loss: 0.9462 - accuracy: 1.0000
Epoch 4896/5000
1/1 - 0s - loss: 0.9459 - accuracy: 1.0000
Epoch 4897/5000
1/1 - 0s - loss: 0.9458 - accuracy: 1.0000
Epoch 4898/5000
1/1 - 0s - loss: 0.9456 - accuracy: 1.0000
...

As we expected, the network can use the within-sequence context to learn the alphabet, achieving 100% accuracy on the training data. Notably, the network can make accurate predictions for the next letter in the alphabet for randomly selected characters. Very impressive.

## Stateful LSTM for a One-Char to One-Char Mapping

We have seen that we can break up our raw data into fixed-size sequences and that the LSTM can learn this representation, but only to learn random mappings of 3 characters to 1 character. We have also seen that we can exploit the batch size to offer more of the sequence to the network, but only during training. Ideally, we want to expose the network to the entire sequence and let it learn the inter-dependencies, rather than defining those dependencies explicitly in the framing of the problem.

We can do this in Keras by making the LSTM layers stateful and manually resetting the state of the network at the end of each epoch, which is also the end of the training sequence. This is truly how LSTM networks are intended to be used. We first need to define our LSTM layer as stateful. In doing so, we must explicitly specify the batch size as a dimension of the input shape. This also means that we must specify and adhere to the same batch size when we evaluate the network or make predictions. This is not a problem now, as we are using a batch size of 1. It could introduce difficulties when the batch size is not 1, as predictions will need to be made in batches and in sequence.

```
batch_size = 1
model.add(LSTM(50,
               batch_input_shape=(batch_size, X.shape[1], X.shape[2]),
               stateful=True))
```

An important difference in training the stateful LSTM is that we train it manually one epoch at a time and reset the state after each epoch. We can do this in a for loop. Again, we do not shuffle the input, preserving the sequence in which the input training data was created.

```
for i in range(300):
    model.fit(X, y, epochs=1, 
              batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
```
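
To build intuition for what `stateful=True` and `reset_states()` change, the following toy sketch may help. It is not Keras code: `ToyRecurrentLayer` is a hypothetical stand-in whose single number of state plays the role of the LSTM's internal state, and the decay factor of 0.5 is an arbitrary choice made for illustration.

```python
class ToyRecurrentLayer:
    """Hypothetical stand-in for a recurrent layer; not the Keras LSTM."""

    def __init__(self, stateful):
        self.stateful = stateful
        self.state = 0.0

    def reset_states(self):
        # Forget everything seen so far.
        self.state = 0.0

    def process(self, sample):
        # A stateless layer starts every sample from a zero state;
        # a stateful layer continues from where the previous sample ended.
        if not self.stateful:
            self.state = 0.0
        for value in sample:
            self.state = 0.5 * self.state + value
        return self.state


stateless = ToyRecurrentLayer(stateful=False)
stateful = ToyRecurrentLayer(stateful=True)

# Three one-step samples, presented in order (batch size of 1).
samples = [[1.0], [1.0], [1.0]]
stateless_out = [stateless.process(s) for s in samples]
stateful_out = [stateful.process(s) for s in samples]

# Resetting returns the stateful layer to its cold-start behaviour.
stateful.reset_states()
after_reset = stateful.process([1.0])

print(stateless_out)  # [1.0, 1.0, 1.0]
print(stateful_out)   # [1.0, 1.5, 1.75]
print(after_reset)    # 1.0
```

With a batch size of 1, each sample is its own batch, so within one epoch the stateful variant effectively sees the samples as one long sequence. Resetting between epochs makes every pass over the alphabet start from the same clean state.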

As mentioned, we specify the batch size when evaluating the performance of the network on the entire training dataset.

```
# summarize performance of the model
scores = model.evaluate(X, y, batch_size=batch_size, verbose=0)
model.reset_states()
print("Model Accuracy: %.2f%%" % (scores[1]*100))
```

Finally, we can demonstrate that the network has indeed learned the entire alphabet. We can seed it with the first letter A, request a prediction, feed the prediction back in as an input, and repeat the process all the way to Z.

```
# demonstrate some model predictions
seed = [char_to_int[alphabet[0]]]
for i in range(0, len(alphabet)-1):
    x = np.reshape(seed, (1, len(seed), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    print(int_to_char[seed[0]], "->", int_to_char[index])
    seed = [index]
model.reset_states()
```

We can also see if the network can make predictions starting from an arbitrary letter.

```
# demonstrate a random starting point
letter = "K"
seed = [char_to_int[letter]]
print("New start: ", letter)
for i in range(0, 5):
    x = np.reshape(seed, (1, len(seed), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    print(int_to_char[seed[0]], "->", int_to_char[index])
    seed = [index]
model.reset_states()
```
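
Because the layer is stateful, a prediction depends on everything the network has been fed since the last reset, not only on the current letter. A cold start from a mid-alphabet letter therefore gives the network no history to work with. One possible way to prime the state first is sketched below. `predict_after_warm_up` is a hypothetical helper, not part of Keras, and it assumes `model` is the stateful network trained above with a batch size of 1.

```python
import numpy as np


def predict_after_warm_up(model, alphabet, char_to_int, letter):
    """Feed the letters from the start of the alphabet up to `letter`,
    then return the index predicted after `letter` (hypothetical helper)."""
    model.reset_states()  # start from a clean state
    prediction = None
    for ch in alphabet[:alphabet.index(letter) + 1]:
        # same encoding as in training: shape [1, 1, 1], scaled by alphabet size
        x = np.reshape([char_to_int[ch]], (1, 1, 1)) / float(len(alphabet))
        prediction = model.predict(x, verbose=0)
    model.reset_states()  # leave the model clean for the next caller
    return int(np.argmax(prediction))


# Example call, assuming the objects defined in this tutorial:
# index = predict_after_warm_up(model, alphabet, char_to_int, "K")
# print("K ->", int_to_char[index])
```

Only the last prediction is kept; the earlier calls exist purely to move the internal state to where it would be after seeing the preceding letters.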

The entire code listing is provided below for completeness.

In [21]:
# fix random seed for reproducibility
np.random.seed(7)

# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# prepare the dataset of input to output pairs encoded as integers
seq_length = 1
dataX = []
dataY = []
for i in range(0, len(alphabet) - seq_length, 1):
    seq_in = alphabet[i:i + seq_length]
    seq_out = alphabet[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    dataY.append(char_to_int[seq_out])
    print(seq_in, '->', seq_out)

# reshape X to be [samples, time steps, features]
X = np.reshape(dataX, (len(dataX), seq_length, 1))

# normalize
X = X / float(len(alphabet))

# one hot encode the output variable
y = utils.to_categorical(dataY)

# create and fit the model
batch_size = 1
model = Sequential()
model.add(LSTM(50, batch_input_shape=(batch_size, X.shape[1], X.shape[2]), stateful=True))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

for i in range(300):
    model.fit(X, y, epochs=1, batch_size=batch_size, verbose=2, shuffle=False)
    model.reset_states()
    
# summarize performance of the model
scores = model.evaluate(X, y, batch_size=batch_size, verbose=0)
model.reset_states()
print("Model Accuracy: %.2f%%" % (scores[1]*100))

# demonstrate some model predictions
seed = [char_to_int[alphabet[0]]]
for i in range(0, len(alphabet)-1):
    x = np.reshape(seed, (1, len(seed), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    print(int_to_char[seed[0]], "->", int_to_char[index])
    seed = [index]
model.reset_states()

# demonstrate a random starting point
letter = "K"
seed = [char_to_int[letter]]
print("New start: ", letter)

for i in range(0, 5):
    x = np.reshape(seed, (1, len(seed), 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    print(int_to_char[seed[0]], "->", int_to_char[index])
    seed = [index]
model.reset_states()

A -> B
B -> C
C -> D
D -> E
E -> F
F -> G
G -> H
H -> I
I -> J
J -> K
K -> L
L -> M
M -> N
N -> O
O -> P
P -> Q
Q -> R
R -> S
S -> T
T -> U
U -> V
V -> W
W -> X
X -> Y
Y -> Z
25/25 - 1s - loss: 3.2760 - accuracy: 0.0000e+00
25/25 - 0s - loss: 3.2497 - accuracy: 0.0800
25/25 - 0s - loss: 3.2329 - accuracy: 0.0800
25/25 - 0s - loss: 3.2130 - accuracy: 0.1200
25/25 - 0s - loss: 3.1842 - accuracy: 0.1200
25/25 - 0s - loss: 3.1352 - accuracy: 0.1200
25/25 - 0s - loss: 3.0580 - accuracy: 0.1200
25/25 - 0s - loss: 2.9913 - accuracy: 0.0800
25/25 - 0s - loss: 2.9626 - accuracy: 0.0800
25/25 - 0s - loss: 3.0128 - accuracy: 0.0800
25/25 - 0s - loss: 3.0950 - accuracy: 0.1600
25/25 - 0s - loss: 2.8958 - accuracy: 0.1200
25/25 - 0s - loss: 2.8794 - accuracy: 0.1600
25/25 - 0s - loss: 2.8212 - accuracy: 0.1600
25/25 - 0s - loss: 2.7366 - accuracy: 0.2400
25/25 - 0s - loss: 2.6159 - accuracy: 0.2400
25/25 - 0s - loss: 2.5488 - accuracy: 0.2800
25/25 - 0s - loss: 2.4587 - accuracy: 0.2800
...

25/25 - 0s - loss: 0.1086 - accuracy: 1.0000
25/25 - 0s - loss: 0.1064 - accuracy: 1.0000
25/25 - 0s - loss: 0.1041 - accuracy: 1.0000
25/25 - 0s - loss: 0.1019 - accuracy: 1.0000
25/25 - 0s - loss: 0.0998 - accuracy: 1.0000
25/25 - 0s - loss: 0.0977 - accuracy: 1.0000
25/25 - 0s - loss: 0.0956 - accuracy: 1.0000
25/25 - 0s - loss: 0.0936 - accuracy: 1.0000
25/25 - 0s - loss: 0.0917 - accuracy: 1.0000
25/25 - 0s - loss: 0.0897 - accuracy: 1.0000
25/25 - 0s - loss: 0.0878 - accuracy: 1.0000
25/25 - 0s - loss: 0.0860 - accuracy: 1.0000
25/25 - 0s - loss: 0.0842 - accuracy: 1.0000
25/25 - 0s - loss: 0.0824 - accuracy: 1.0000
25/25 - 0s - loss: 0.0807 - accuracy: 1.0000
25/25 - 0s - loss: 0.0790 - accuracy: 1.0000
25/25 - 0s - loss: 0.0773 - accuracy: 1.0000
25/25 - 0s - loss: 0.0757 - accuracy: 1.0000
25/25 - 0s - loss: 0.0741 - accuracy: 1.0000
25/25 - 0s - loss: 0.0726 - accuracy: 1.0000
25/25 - 0s - loss: 0.0711 - accuracy: 1.0000
25/25 - 0s - loss: 0.0696 - accuracy: 1.0000
...

We can see that the network has memorized the entire alphabet perfectly. It used the context of the samples themselves and learned whatever dependency it needed to predict the next character in the sequence. We can also see that if we seed the network with the first letter, it can correctly rattle off the rest of the alphabet. However, it has only learned the full alphabet sequence, and only from a cold start. When asked to predict the next letter from K, it predicts B and falls back into regurgitating the entire alphabet. To truly predict K, the network state would need to be warmed up by iteratively feeding it the letters from A to J. This tells us that we could achieve the same effect with a stateless LSTM by preparing training data like:

```
---a -> b
--ab -> c
-abc -> d
abcd -> e
```

The input sequence length is fixed at 25 (a to y, to predict z), and patterns are prefixed with zero padding. Finally, this raises the question of training an LSTM network using variable-length input sequences to predict the next character.
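
As a rough illustration of that framing, the padded prefixes could be generated as follows. This is a minimal sketch: the `-` placeholder stands in for a zero-padded position, and the names `pad_char` and `patterns` are assumptions introduced here rather than part of the tutorial code.

```python
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
pad_char = "-"                 # visual placeholder for a zero-padded position
max_len = len(alphabet) - 1    # 25: A..Y is the longest input

patterns = []
for end in range(1, len(alphabet)):
    prefix = alphabet[:end]                    # A, AB, ABC, ...
    padded = prefix.rjust(max_len, pad_char)   # left-pad to a fixed width
    target = alphabet[end]                     # the letter that follows the prefix
    patterns.append((padded, target))

# show the first few patterns
for padded, target in patterns[:3]:
    print(padded, "->", target)
```

Every pattern has the same width, so a stateless LSTM would see the full history of each target letter inside a single sample instead of relying on state carried over between samples.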

## LSTM with Variable Length Input to One-Char Output

In the previous section, we discovered that the Keras stateful LSTM was only a shortcut to replaying the first n-sequences but didn't help us learn a generic alphabet model. This section explores a variation of the stateless LSTM that learns random subsequences of the alphabet, in an effort to build a model that can be given arbitrary letters or subsequences of letters and predict the next letter in the alphabet.

Firstly, we are changing the framing of the problem. To simplify, we will define a maximum input sequence length and set it to a small value like 5 to speed up training. This defines the maximum length of subsequences of the alphabet which will be drawn for training. In extensions, this could just be set to the entire alphabet (26) or longer if we allow looping back to the start of the sequence. We also need to define the number of random sequences to create, in this case, 1,000. This, too, could be more or less. I expect fewer patterns are required.

```
# prepare the dataset of input to output pairs encoded as integers
num_inputs = 1000
max_len = 5
dataX = []
dataY = []
for i in range(num_inputs):
    start = np.random.randint(len(alphabet)-2)
    end = np.random.randint(start, min(start+max_len,len(alphabet)-1))
    sequence_in = alphabet[start:end+1]
    sequence_out = alphabet[end + 1]
    dataX.append([char_to_int[char] for char in sequence_in])
    dataY.append(char_to_int[sequence_out])
    print(sequence_in, '->', sequence_out)
```

Running this code in the broader context will create input patterns that look like the following:

```
PQRST -> U
W -> X
O -> P
OPQ -> R
IJKLM -> N
QRSTU -> V
ABCD -> E
X -> Y
GHIJ -> K
```
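
The bounds used when sampling `start` and `end` can be checked exhaustively. The sketch below enumerates every pair the loop could draw, relying on the fact that `np.random.randint` excludes its upper bound. The names `lengths` and `last_target_index` are introduced here for illustration only.

```python
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
max_len = 5

lengths = set()
last_target_index = 0
# np.random.randint(n) draws from 0..n-1 and randint(low, high) from low..high-1,
# so these ranges cover every (start, end) pair the sampling loop can produce.
for start in range(len(alphabet) - 2):
    for end in range(start, min(start + max_len, len(alphabet) - 1)):
        sequence_in = alphabet[start:end + 1]
        lengths.add(len(sequence_in))
        last_target_index = max(last_target_index, end + 1)

print(sorted(lengths))              # [1, 2, 3, 4, 5]
print(alphabet[last_target_index])  # Z
```

Input lengths therefore always fall between 1 and `max_len`, and the target index never runs past the final letter.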

The input sequences vary in length between 1 and `max_len` and therefore require zero padding. Here, we use left-hand-side (prefix) padding with the Keras built-in `pad_sequences()` function.

```
X = pad_sequences(dataX, maxlen=max_len, dtype='float32')
```
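
To make the effect of prefix padding concrete, here is a pure-Python sketch that mimics it on integer-encoded sequences. `left_pad` is a hypothetical helper written for illustration, not a Keras function. It assumes a padding value of 0 and, for sequences that are too long, keeps the last `maxlen` items.

```python
def left_pad(sequences, maxlen, value=0):
    """Hypothetical pure-Python helper mimicking prefix ('pre') padding."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep at most the last maxlen items
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


# P, Q, R encode to 15, 16, 17 and W encodes to 22 under char_to_int
print(left_pad([[15, 16, 17], [22]], maxlen=5))
# [[0, 0, 15, 16, 17], [0, 0, 0, 0, 22]]
```

Note that in this encoding the letter A is also 0, so a leading A is indistinguishable from padding once the sequence is padded.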

The trained model is evaluated on randomly selected input patterns. These could just as easily be new randomly generated sequences of characters. I believe the evaluation could also use a linear sequence seeded with A, with outputs fed back in as single-character inputs. The complete code listing is provided below.

In [22]:
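# import the padding helper used below
from tensorflow.keras.preprocessing.sequence import pad_sequences

# fix random seed for reproducibility
np.random.seed(7)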

# define the raw dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# create mapping of characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# prepare the dataset of input to output pairs encoded as integers
num_inputs = 1000
max_len = 5
dataX = []
dataY = []
for i in range(num_inputs):
    start = np.random.randint(len(alphabet)-2)
    end = np.random.randint(start, min(start+max_len,len(alphabet)-1))
    sequence_in = alphabet[start:end+1]
    sequence_out = alphabet[end + 1]
    dataX.append([char_to_int[char] for char in sequence_in])
    dataY.append(char_to_int[sequence_out])
    print(sequence_in, '->', sequence_out)
    
# convert list of lists to array and pad sequences if needed
X = pad_sequences(dataX, maxlen=max_len, dtype='float32')

# reshape X to be [samples, time steps, features]
X = np.reshape(X, (X.shape[0], max_len, 1))

# normalize
X = X / float(len(alphabet))

# one hot encode the output variable
y = utils.to_categorical(dataY)

# create and fit the model
batch_size = 1
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], 1)))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=500, batch_size=batch_size, verbose=2)

# summarize performance of the model
scores = model.evaluate(X, y, verbose=0)
print("Model Accuracy: %.2f%%" % (scores[1]*100))

# demonstrate some model predictions
for i in range(20):
    pattern_index = np.random.randint(len(dataX))
    pattern = dataX[pattern_index]
    x = pad_sequences([pattern], maxlen=max_len, dtype='float32')
    x = np.reshape(x, (1, max_len, 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = np.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    # show the input letters and the predicted next letter
    print(seq_in, "->", result)

PQRST -> U
W -> X
O -> P
OPQ -> R
IJKLM -> N
QRSTU -> V
ABCD -> E
X -> Y
GHIJ -> K
M -> N
XY -> Z
QRST -> U
ABC -> D
JKLMN -> O
OP -> Q
XY -> Z
D -> E
T -> U
B -> C
QRSTU -> V
HIJ -> K
JKLM -> N
ABCDE -> F
X -> Y
V -> W
DE -> F
DEFG -> H
BCDE -> F
EFGH -> I
BCDE -> F
FG -> H
RST -> U
TUV -> W
STUV -> W
LMN -> O
P -> Q
MNOP -> Q
JK -> L
MNOP -> Q
OPQRS -> T
UVWXY -> Z
PQRS -> T
D -> E
EFGH -> I
IJK -> L
WX -> Y
STUV -> W
MNOPQ -> R
P -> Q
WXY -> Z
VWX -> Y
V -> W
HI -> J
KLMNO -> P
UV -> W
JKL -> M
ABCDE -> F
WXY -> Z
M -> N
CDEF -> G
KLMNO -> P
RST -> U
RS -> T
W -> X
J -> K
WX -> Y
JKLMN -> O
MN -> O
L -> M
BCDE -> F
TU -> V
MNOPQ -> R
NOPQR -> S
HIJ -> K
JKLM -> N
STUVW -> X
QRST -> U
N -> O
VWXY -> Z
B -> C
UVWX -> Y
OP -> Q
K -> L
C -> D
X -> Y
ST -> U
JKLM -> N
B -> C
QR -> S
RS -> T
VWXY -> Z
S -> T
NOP -> Q
KLMNO -> P
IJ -> K
EF -> G
MNOP -> Q
WXY -> Z
HI -> J
P -> Q
STUVW -> X
Q -> R
MN -> O
O -> P
C -> D
L -> M
JKLM -> N
K -> L
IJKLM -> N
FGHIJ -> K
LM -> N
OPQ -> R
U -> V
...

Epoch 1/500
1000/1000 - 2s - loss: 3.1137 - accuracy: 0.0780
Epoch 2/500
1000/1000 - 2s - loss: 2.8142 - accuracy: 0.1170
Epoch 3/500
1000/1000 - 2s - loss: 2.5085 - accuracy: 0.1750
Epoch 4/500
1000/1000 - 2s - loss: 2.2492 - accuracy: 0.2500
Epoch 5/500
1000/1000 - 2s - loss: 2.0930 - accuracy: 0.2930
Epoch 6/500
1000/1000 - 2s - loss: 1.9727 - accuracy: 0.3200
Epoch 7/500
1000/1000 - 2s - loss: 1.8685 - accuracy: 0.3470
Epoch 8/500
1000/1000 - 2s - loss: 1.7833 - accuracy: 0.3670
Epoch 9/500
1000/1000 - 2s - loss: 1.6993 - accuracy: 0.4130
Epoch 10/500
1000/1000 - 2s - loss: 1.6233 - accuracy: 0.4310
Epoch 11/500
1000/1000 - 2s - loss: 1.5596 - accuracy: 0.4600
Epoch 12/500
1000/1000 - 2s - loss: 1.4821 - accuracy: 0.4750
Epoch 13/500
1000/1000 - 2s - loss: 1.4316 - accuracy: 0.5220
Epoch 14/500
1000/1000 - 2s - loss: 1.3793 - accuracy: 0.5190
Epoch 15/500
1000/1000 - 2s - loss: 1.3350 - accuracy: 0.5380
Epoch 16/500
1000/1000 - 2s - loss: 1.2838 - accuracy: 0.5750
Epoch 17/500
...

Epoch 133/500
1000/1000 - 2s - loss: 0.3197 - accuracy: 0.8930
Epoch 134/500
1000/1000 - 2s - loss: 0.3182 - accuracy: 0.9000
Epoch 135/500
1000/1000 - 2s - loss: 0.3690 - accuracy: 0.8870
Epoch 136/500
1000/1000 - 2s - loss: 0.3148 - accuracy: 0.9020
Epoch 137/500
1000/1000 - 2s - loss: 0.3122 - accuracy: 0.9050
Epoch 138/500
1000/1000 - 2s - loss: 0.3165 - accuracy: 0.9020
Epoch 139/500
1000/1000 - 2s - loss: 0.4027 - accuracy: 0.8730
Epoch 140/500
1000/1000 - 2s - loss: 0.3040 - accuracy: 0.9060
Epoch 141/500
1000/1000 - 2s - loss: 0.3093 - accuracy: 0.8990
Epoch 142/500
1000/1000 - 2s - loss: 0.3184 - accuracy: 0.8980
Epoch 143/500
1000/1000 - 2s - loss: 0.3404 - accuracy: 0.9040
Epoch 144/500
1000/1000 - 2s - loss: 0.2970 - accuracy: 0.9130
Epoch 145/500
1000/1000 - 2s - loss: 0.3044 - accuracy: 0.9080
Epoch 146/500
1000/1000 - 2s - loss: 0.3025 - accuracy: 0.9080
Epoch 147/500
1000/1000 - 2s - loss: 0.3377 - accuracy: 0.8940
Epoch 148/500
1000/1000 - 2s - loss: 0.3700 - accuracy:

1000/1000 - 2s - loss: 0.1914 - accuracy: 0.9500
Epoch 264/500
1000/1000 - 2s - loss: 0.1968 - accuracy: 0.9430
Epoch 265/500
1000/1000 - 2s - loss: 0.1969 - accuracy: 0.9360
Epoch 266/500
1000/1000 - 2s - loss: 0.1971 - accuracy: 0.9410
Epoch 267/500
1000/1000 - 2s - loss: 0.1941 - accuracy: 0.9360
Epoch 268/500
1000/1000 - 2s - loss: 0.2024 - accuracy: 0.9370
Epoch 269/500
1000/1000 - 2s - loss: 0.3425 - accuracy: 0.9030
Epoch 270/500
1000/1000 - 2s - loss: 0.1849 - accuracy: 0.9450
Epoch 271/500
1000/1000 - 2s - loss: 0.1870 - accuracy: 0.9470
Epoch 272/500
1000/1000 - 2s - loss: 0.1911 - accuracy: 0.9400
Epoch 273/500
1000/1000 - 2s - loss: 0.1929 - accuracy: 0.9440
Epoch 274/500
1000/1000 - 2s - loss: 0.2373 - accuracy: 0.9330
Epoch 275/500
1000/1000 - 2s - loss: 0.1856 - accuracy: 0.9480
Epoch 276/500
1000/1000 - 2s - loss: 0.2280 - accuracy: 0.9310
Epoch 277/500
1000/1000 - 2s - loss: 0.2005 - accuracy: 0.9440
Epoch 278/500
1000/1000 - 2s - loss: 0.1814 - accuracy: 0.9460
...

Epoch 394/500
1000/1000 - 2s - loss: 0.1340 - accuracy: 0.9630
Epoch 395/500
1000/1000 - 2s - loss: 0.1337 - accuracy: 0.9600
Epoch 396/500
1000/1000 - 2s - loss: 0.1365 - accuracy: 0.9560
Epoch 397/500
1000/1000 - 2s - loss: 0.1351 - accuracy: 0.9620
Epoch 398/500
1000/1000 - 2s - loss: 0.1722 - accuracy: 0.9530
Epoch 399/500
1000/1000 - 2s - loss: 0.1310 - accuracy: 0.9660
Epoch 400/500
1000/1000 - 2s - loss: 0.1318 - accuracy: 0.9640
Epoch 401/500
1000/1000 - 2s - loss: 0.1325 - accuracy: 0.9620
Epoch 402/500
1000/1000 - 2s - loss: 0.1337 - accuracy: 0.9630
Epoch 403/500
1000/1000 - 2s - loss: 0.1334 - accuracy: 0.9630
Epoch 404/500
1000/1000 - 2s - loss: 0.1352 - accuracy: 0.9510
Epoch 405/500
1000/1000 - 2s - loss: 0.1311 - accuracy: 0.9590
Epoch 406/500
1000/1000 - 2s - loss: 0.1369 - accuracy: 0.9630
Epoch 407/500
1000/1000 - 2s - loss: 0.2406 - accuracy: 0.9410
Epoch 408/500
1000/1000 - 2s - loss: 0.1357 - accuracy: 0.9670
Epoch 409/500
1000/1000 - 2s - loss: 0.1244 - accuracy:

We can see that although the model did not learn the alphabet perfectly from the randomly generated subsequences, it did very well. The model was not tuned and may require more training, a larger network, or both (an exercise for the reader). This is a natural extension of the model above that learned the alphabet from all sequential input examples in each batch: it can also handle ad hoc queries, but this time of arbitrary sequence length (up to the maximum length).

## Summary

In this lesson, you discovered LSTM recurrent neural networks in Keras and how they manage state. Specifically, you learned:
* How to develop a naive LSTM network for one-character to one-character prediction.
* How to configure a naive LSTM to learn a sequence across time steps within a sample.
* How to configure an LSTM to learn a sequence across samples by manually managing state.