## LSTMs with Variable-Length Inputs

Let's now explore a variation of the stateless LSTM that learns random subsequences of the alphabet, with the aim of building a model that can take arbitrary letters or subsequences of letters and predict the next letter of the alphabet.

First, we change the framing of the problem. To keep things simple, we define a maximum input sequence length and set it to a small value, 5, to speed up training. This sets the maximum length of the alphabet subsequences used for training. We also need to define the number of random sequences to generate, in this case 1,000; this number could just as well be larger or smaller.
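To get a feel for how much variety 1,000 random draws can contain, the sketch below enumerates every input/output pair this framing allows. It assumes the same sampling bounds as the data-preparation cell of this notebook (`start` below `len(alphabet) - 2`, `end` below `min(start + max_len, len(alphabet) - 1)`, both upper bounds exclusive); the `pairs` and `lengths` names are introduced here only for illustration.

```python
# Enumerate every (input, output) pair the sampling scheme can produce.
# Assumption: same bounds as the notebook's sampling loop, with NumPy's
# randint treating the upper bound as exclusive.
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
max_len = 5

pairs = set()
for start in range(len(alphabet) - 2):
    upper = min(start + max_len, len(alphabet) - 1)
    for end in range(start, upper):
        pairs.add((alphabet[start:end + 1], alphabet[end + 1]))

lengths = sorted({len(seq) for seq, _ in pairs})
print(len(pairs), "distinct pairs; input lengths:", lengths)
# 114 distinct pairs; input lengths: [1, 2, 3, 4, 5]
```

With only 114 distinct pairs, 1,000 draws necessarily contain many repeats, which is worth keeping in mind when reading the accuracy figures later on.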

The input sequences vary in length between 1 and max_len and therefore require zero-padding. Here we use left-hand-side (prefix) padding via Keras's built-in pad_sequences() function.
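As a rough illustration of what prefix padding does to these sequences, here is a minimal NumPy sketch. `left_pad` is a hypothetical helper written for this example, not part of Keras; it is meant to mimic the default behaviour of `pad_sequences()` (pad on the left, and keep the last `maxlen` items of anything too long).

```python
import numpy as np

def left_pad(sequences, maxlen, value=0.0):
    """Hypothetical helper: prefix-pad integer sequences to a fixed length."""
    out = np.full((len(sequences), maxlen), value, dtype="float32")
    for row, seq in enumerate(sequences):
        seq = seq[-maxlen:]              # keep the last maxlen items if too long
        if len(seq) > 0:
            out[row, -len(seq):] = seq   # right-align; the fill value stays on the left
    return out

# "W" -> [22], "OPQ" -> [14, 15, 16], "PQRST" -> [15, 16, 17, 18, 19]
padded = left_pad([[22], [14, 15, 16], [15, 16, 17, 18, 19]], maxlen=5)
print(padded)
```

Every row ends up with exactly five entries, with the real letters aligned to the right. One thing to keep in mind: in the mapping used in this notebook, 0 is also the integer code for "A", so a padded position and an "A" carry the same value.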

In [0]:
# Imports
import numpy
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.utils import np_utils
from keras.preprocessing.sequence import pad_sequences

# Random seed
numpy.random.seed(7)

# Dataset
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Create a mapping from characters to integers (0-25) and the reverse
char_to_int = dict((c, i) for i, c in enumerate(alphabet))
int_to_char = dict((i, c) for i, c in enumerate(alphabet))

# Prepare the dataset of input/output pairs encoded as integers
num_inputs = 1000
max_len = 5
dataX = []
dataY = []

for i in range(num_inputs):
    start = numpy.random.randint(len(alphabet)-2)
    end = numpy.random.randint(start, min(start+max_len,len(alphabet)-1))
    sequence_in = alphabet[start:end+1]
    sequence_out = alphabet[end + 1]
    dataX.append([char_to_int[char] for char in sequence_in])
    dataY.append(char_to_int[sequence_out])
    print(sequence_in, '->', sequence_out)

Using TensorFlow backend.


PQRST -> U
W -> X
O -> P
OPQ -> R
IJKLM -> N
QRSTU -> V
ABCD -> E
X -> Y
GHIJ -> K
M -> N
XY -> Z
QRST -> U
ABC -> D
JKLMN -> O
OP -> Q
XY -> Z
D -> E
T -> U
B -> C
QRSTU -> V
HIJ -> K
JKLM -> N
ABCDE -> F
X -> Y
V -> W
DE -> F
DEFG -> H
BCDE -> F
EFGH -> I
BCDE -> F
FG -> H
RST -> U
TUV -> W
STUV -> W
LMN -> O
P -> Q
MNOP -> Q
JK -> L
MNOP -> Q
OPQRS -> T
UVWXY -> Z
PQRS -> T
D -> E
EFGH -> I
IJK -> L
WX -> Y
STUV -> W
MNOPQ -> R
P -> Q
WXY -> Z
VWX -> Y
V -> W
HI -> J
KLMNO -> P
UV -> W
JKL -> M
ABCDE -> F
WXY -> Z
M -> N
CDEF -> G
KLMNO -> P
RST -> U
RS -> T
W -> X
J -> K
WX -> Y
JKLMN -> O
MN -> O
L -> M
BCDE -> F
TU -> V
MNOPQ -> R
NOPQR -> S
HIJ -> K
JKLM -> N
STUVW -> X
QRST -> U
N -> O
VWXY -> Z
B -> C
UVWX -> Y
OP -> Q
K -> L
C -> D
X -> Y
ST -> U
JKLM -> N
B -> C
QR -> S
RS -> T
VWXY -> Z
S -> T
NOP -> Q
KLMNO -> P
IJ -> K
EF -> G
MNOP -> Q
WXY -> Z
HI -> J
P -> Q
STUVW -> X
Q -> R
MN -> O
O -> P
C -> D
L -> M
JKLM -> N
K -> L
IJKLM -> N
FGHIJ -> K
LM -> N
OPQ -> R
U -> V
HIJ

In [0]:
# Convert the list of lists to an array, padding the sequences where necessary
# https://keras.io/preprocessing/sequence/
X = pad_sequences(dataX, maxlen = max_len, dtype='float32')

# Reshape X to [samples, time steps, features]
X = numpy.reshape(X, (X.shape[0], max_len, 1))

# Normalization
X = X / float(len(alphabet))

# One-hot encode the output variable
y = np_utils.to_categorical(dataY)

# Define and compile the model
batch_size = 1
model = Sequential()
model.add(LSTM(32, input_shape=(X.shape[1], 1)))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 32)                4352      
_________________________________________________________________
dense_1 (Dense)              (None, 26)                858       
Total params: 5,210.0
Trainable params: 5,210
Non-trainable params: 0.0
_________________________________________________________________
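The parameter counts in the summary above can be checked by hand. The sketch below assumes the standard LSTM layout (four gate blocks, each with input weights, recurrent weights, and a bias) and a dense layer with bias; `lstm_params` and `dense_params` are hypothetical helper names used only for this check.

```python
def lstm_params(units, input_dim):
    # Four gate blocks (input, forget, cell candidate, output), each with
    # input weights, recurrent weights, and a bias vector.
    return 4 * (units * input_dim + units * units + units)

def dense_params(inputs, outputs):
    # One weight per input/output pair plus one bias per output.
    return inputs * outputs + outputs

lstm = lstm_params(units=32, input_dim=1)
dense = dense_params(inputs=32, outputs=26)
print(lstm, dense, lstm + dense)  # 4352 858 5210
```

Note that the sequence length (5 time steps) does not appear in the formula: the same weights are reused at every step.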


In [0]:
model.fit(X, y, epochs=500, batch_size=batch_size, verbose=2)

# Model performance
scores = model.evaluate(X, y, verbose=0)
print("\nModel accuracy: %.2f%%" % (scores[1]*100))

Epoch 1/500
2s - loss: 3.0779 - acc: 0.0650
Epoch 2/500
2s - loss: 2.7660 - acc: 0.1290
Epoch 3/500
2s - loss: 2.4371 - acc: 0.1960
Epoch 4/500
2s - loss: 2.2109 - acc: 0.2600
Epoch 5/500
2s - loss: 2.0606 - acc: 0.3110
Epoch 6/500
2s - loss: 1.9385 - acc: 0.3270
Epoch 7/500
2s - loss: 1.8379 - acc: 0.3490
Epoch 8/500
2s - loss: 1.7541 - acc: 0.3760
Epoch 9/500
2s - loss: 1.6755 - acc: 0.4180
Epoch 10/500
2s - loss: 1.6035 - acc: 0.4490
Epoch 11/500
2s - loss: 1.5338 - acc: 0.4680
Epoch 12/500
2s - loss: 1.4755 - acc: 0.4930
Epoch 13/500
2s - loss: 1.4209 - acc: 0.5080
Epoch 14/500
2s - loss: 1.3661 - acc: 0.5460
Epoch 15/500
2s - loss: 1.3227 - acc: 0.5540
Epoch 16/500
2s - loss: 1.2758 - acc: 0.5810
Epoch 17/500
2s - loss: 1.2318 - acc: 0.5900
Epoch 18/500
2s - loss: 1.1993 - acc: 0.6160
Epoch 19/500
2s - loss: 1.1497 - acc: 0.6180
Epoch 20/500
2s - loss: 1.1226 - acc: 0.6380
Epoch 21/500
2s - loss: 1.0854 - acc: 0.6530
Epoch 22/500
2s - loss: 1.0623 - acc: 0.6600
Epoch 23/500
2s - l

2s - loss: 0.2342 - acc: 0.9320
Epoch 183/500
2s - loss: 0.2397 - acc: 0.9320
Epoch 184/500
2s - loss: 0.2445 - acc: 0.9270
Epoch 185/500
2s - loss: 0.3301 - acc: 0.8930
Epoch 186/500
2s - loss: 0.2326 - acc: 0.9320
Epoch 187/500
2s - loss: 0.2316 - acc: 0.9350
Epoch 188/500
2s - loss: 0.2330 - acc: 0.9280
Epoch 189/500
2s - loss: 0.2394 - acc: 0.9340
Epoch 190/500
2s - loss: 0.3140 - acc: 0.9100
Epoch 191/500
2s - loss: 0.3259 - acc: 0.9140
Epoch 192/500
2s - loss: 0.2266 - acc: 0.9340
Epoch 193/500
2s - loss: 0.2314 - acc: 0.9250
Epoch 194/500
2s - loss: 0.2276 - acc: 0.9370
Epoch 195/500
2s - loss: 0.2335 - acc: 0.9350
Epoch 196/500
2s - loss: 0.3367 - acc: 0.8910
Epoch 197/500
2s - loss: 0.2321 - acc: 0.9340
Epoch 198/500
2s - loss: 0.2222 - acc: 0.9370
Epoch 199/500
2s - loss: 0.2241 - acc: 0.9380
Epoch 200/500
2s - loss: 0.2339 - acc: 0.9310
Epoch 201/500
2s - loss: 0.2196 - acc: 0.9370
Epoch 202/500
2s - loss: 0.3115 - acc: 0.9010
Epoch 203/500
2s - loss: 0.2190 - acc: 0.9340
Ep

2s - loss: 0.1274 - acc: 0.9660
Epoch 362/500
2s - loss: 0.1281 - acc: 0.9710
Epoch 363/500
2s - loss: 0.1302 - acc: 0.9690
Epoch 364/500
2s - loss: 0.1335 - acc: 0.9650
Epoch 365/500
2s - loss: 0.1359 - acc: 0.9640
Epoch 366/500
2s - loss: 0.1322 - acc: 0.9660
Epoch 367/500
2s - loss: 0.2792 - acc: 0.9370
Epoch 368/500
2s - loss: 0.1231 - acc: 0.9730
Epoch 369/500
2s - loss: 0.1220 - acc: 0.9740
Epoch 370/500
2s - loss: 0.1243 - acc: 0.9700
Epoch 371/500
2s - loss: 0.1266 - acc: 0.9670
Epoch 372/500
2s - loss: 0.1259 - acc: 0.9700
Epoch 373/500
2s - loss: 0.1714 - acc: 0.9580
Epoch 374/500
2s - loss: 0.1442 - acc: 0.9700
Epoch 375/500
2s - loss: 0.1209 - acc: 0.9740
Epoch 376/500
2s - loss: 0.1229 - acc: 0.9700
Epoch 377/500
2s - loss: 0.1259 - acc: 0.9700
Epoch 378/500
2s - loss: 0.1250 - acc: 0.9680
Epoch 379/500
2s - loss: 0.1263 - acc: 0.9640
Epoch 380/500
2s - loss: 0.1263 - acc: 0.9670
Epoch 381/500
2s - loss: 0.2173 - acc: 0.9410
Epoch 382/500
2s - loss: 0.1172 - acc: 0.9820
Ep

In [0]:
# Show some of the model's predictions
true_data = 0
for i in range(20):
    pattern_index = numpy.random.randint(len(dataX))
    pattern = dataX[pattern_index]
    x = pad_sequences([pattern], maxlen=max_len, dtype='float32')
    x = numpy.reshape(x, (1, max_len, 1))
    x = x / float(len(alphabet))
    prediction = model.predict(x, verbose=0)
    index = numpy.argmax(prediction)
    result = int_to_char[index]
    seq_in = [int_to_char[value] for value in pattern]
    print(seq_in, "->", result)
    if result == int_to_char[dataY[pattern_index]]:
        true_data += 1
print("true data:", true_data/20 * 100)

['T', 'U', 'V', 'W', 'X'] -> Y
['V', 'W', 'X', 'Y'] -> Z
['A', 'B', 'C', 'D'] -> E
['C'] -> D
['K', 'L', 'M', 'N'] -> O
['B'] -> C
['C', 'D', 'E', 'F', 'G'] -> H
['Q', 'R'] -> S
['T', 'U', 'V', 'W', 'X'] -> Y
['D', 'E', 'F', 'G', 'H'] -> I
['B', 'C', 'D', 'E', 'F'] -> G
['C', 'D', 'E', 'F'] -> G
['C'] -> D
['K', 'L', 'M'] -> N
['B', 'C', 'D', 'E'] -> F
['N', 'O'] -> P
['P'] -> Q
['W'] -> X
['V', 'W', 'X'] -> Y
['C'] -> D
true data: 100.0


We can see that, although the model did not learn the alphabet perfectly from randomly generated sequences, the result was very good. The model was not tuned and may need more training, a larger network, or both (an exercise for you).
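When experimenting further, it can help to feed the model letters typed by hand rather than patterns drawn from `dataX`. The sketch below is one possible way to prepare such an input; `encode_letters` is a hypothetical helper, and it assumes the same preprocessing as in this notebook (integer codes 0-25, prefix padding to `max_len`, division by the alphabet size).

```python
import numpy as np

alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
char_to_int = {c: i for i, c in enumerate(alphabet)}
max_len = 5

def encode_letters(letters, max_len=max_len):
    """Hypothetical helper: turn a string such as "KLM" into a model input."""
    codes = [char_to_int[c] for c in letters.upper()][-max_len:]
    x = np.zeros((1, max_len, 1), dtype="float32")
    if codes:
        x[0, -len(codes):, 0] = codes      # prefix padding, as in training
    return x / float(len(alphabet))        # same scaling as the training data

x = encode_letters("KLM")
print(x.reshape(-1))

# With the trained model from this notebook one could then run, for example:
# prediction = model.predict(x, verbose=0)
# print(int_to_char[np.argmax(prediction)])
```

The result has shape `(1, 5, 1)`, matching the `[samples, time steps, features]` layout the LSTM layer was built with.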