<h1 style="text-align: center; color:#01872A; font-size: 80px;
background:#daf2e1; border-radius: 20px;
">LSTM calculator.</h1>

## Please use nbviewer to read this notebook so that all its features work:

https://nbviewer.org/github/sersonSerson/Projects/blob/master/NaturalLanguage/LSTMCalculator/LSTMcalculator.ipynb

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Contents </span>

## 1. [Introduction.](#Step1)
## 2. [Generate data.](#Step2)
## 3. [Data preprocessing.](#Step3)
## 4. [Create and fit NN model.](#Step4)
## 5. [Make a prediction and check the results.](#Step5)
## 6. [Conclusion.](#Step6)

<div id="Step1">
</div>

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Step 1. Introduction. </span>


## Goal: predict the product of two numbers using an RNN (LSTM).

The notebook is an exercise in sequence-to-sequence (seq2seq) prediction techniques.

### Idea:
1. Step 1: create a dataset (3*5=15)
2. Step 2: convert to symbols (['3', '*', '5', ...])
3. Step 3: encode the symbols ([3, 10, 5, ...])
4. Step 4: convert into one-hot encoded format ([0, 0, 0, 1, ...])
5. Step 5: create an LSTM model and fit it.
6. Step 6: make a prediction.

Overall result:
Input: 5*3
Output: 15
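
To make Steps 2 to 4 concrete, here is a minimal sketch that pushes the single expression `5*3` through them in plain Python. It assumes the same 12-symbol vocabulary that is built in Step 3; the variable names are chosen for illustration only.

```python
# Illustrative walk-through of Steps 2-4 for one expression.
symbols = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '*', ' ']
vocabulary = {symbol: index for index, symbol in enumerate(symbols)}

expression = '5*3'
chars = list(expression)                        # Step 2: split into symbols
indices = [vocabulary[c] for c in chars]        # Step 3: symbol -> integer code
one_hot = [[1 if position == index else 0 for position in range(len(symbols))]
           for index in indices]                # Step 4: one row per symbol

print(chars)       # ['5', '*', '3']
print(indices)     # [5, 10, 3]
print(one_hot[0])  # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```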

<div id="Step2">
</div>

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Step 2. Generate data. </span>

In [73]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import random

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, TimeDistributed, Dense, RepeatVector

In [74]:
n_samples = 5
n_numbers = 2
max_num = 9

In [None]:
def generate_data(n_samples, n_numbers, max_num):
    X = []
    y = []
    for i in range(n_samples):
        numbers = [random.randint(0, max_num) for _ in range(n_numbers)]
        X.append(numbers)
        y.append(numbers[0] * numbers[1])
    return X, y
X, y = generate_data(n_samples, n_numbers, max_num)
X

[[3, 4], [9, 2], [5, 8], [0, 7], [4, 7]]

In [75]:
y

[12, 18, 40, 0, 28]
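
Because both operands are single digits, the space of possible problems is small. The sketch below (not part of the original notebook) enumerates it with `itertools.product`, using the same `max_num` as above. It shows that there are only 100 distinct ordered pairs and that the largest product has two digits.

```python
from itertools import product

max_num = 9  # same upper bound as in the notebook

# Every ordered pair (a, b) with 0 <= a, b <= max_num.
all_pairs = list(product(range(max_num + 1), repeat=2))
all_products = [a * b for a, b in all_pairs]

print(len(all_pairs))     # 100 distinct problems
print(max(all_products))  # 81, so two characters are enough for any answer
```

One consequence worth keeping in mind: a large random sample from this space necessarily contains many repeated problems, and a randomly generated test set will most likely overlap with the training data.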

<div id="Step3">
</div>

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Step 3. Data preprocessing. </span>

## Convert data to string

In [76]:
def data_to_str(X, y, max_y_length):
    X_str = X.copy()
    for index, numbers in enumerate(X):
        X_str[index] = '*'.join([str(number) for number in numbers])
    y_str = \
        [' ' * (max_y_length - len(str(number))) + str(number) for number in y]
    return X_str, y_str
X, y = data_to_str(X, y, max_y_length=2)
X

['3*4', '9*2', '5*8', '0*7', '4*7']

In [77]:
y

['12', '18', '40', ' 0', '28']
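
The left-padding of the answers can also be written with the built-in `str.rjust`. This is a sketch of an equivalent helper; `pad_answers` is a hypothetical name, not a function from the notebook.

```python
def pad_answers(y, max_y_length):
    """Left-pad each answer with spaces to a fixed width (illustrative helper)."""
    return [str(number).rjust(max_y_length) for number in y]

print(pad_answers([12, 18, 40, 0, 28], max_y_length=2))
# ['12', '18', '40', ' 0', '28']
```

Padding every answer to the same width is what lets the targets be stacked into a single array of shape (n_samples, max_y_length, vocab_length) later on.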

## Create symbol vocabulary

In [78]:
symbols = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '*', ' ']
vocabulary = dict()
for number, symbol in enumerate(symbols):
    vocabulary[symbol] = number
vocab_length = len(vocabulary)
vocabulary

{'0': 0,
 '1': 1,
 '2': 2,
 '3': 3,
 '4': 4,
 '5': 5,
 '6': 6,
 '7': 7,
 '8': 8,
 '9': 9,
 '*': 10,
 ' ': 11}

## Encode data

In [79]:
def encode_data(X, y, vocabulary):
    for num, seq in enumerate(X):
        X[num] = [vocabulary[symbol] for symbol in seq]
    for num, seq in enumerate(y):
        y[num] = [vocabulary[symbol] for symbol in seq]
    return X, y
encode_data(X, y, vocabulary)

([[3, 10, 4], [9, 10, 2], [5, 10, 8], [0, 10, 7], [4, 10, 7]],
 [[1, 2], [1, 8], [4, 0], [11, 0], [2, 8]])

## One-hot encode data

In [80]:
def one_hot_encode(dataset):
    dataset_one_hot = []
    for seq in dataset:
        seq_one_hot = []
        for value in seq:
            encoded_seq = [0 for i in range(vocab_length)]
            encoded_seq[value] = 1
            seq_one_hot.append(encoded_seq)
        dataset_one_hot.append(seq_one_hot)

    return np.array(dataset_one_hot)
X = one_hot_encode(X)
y = one_hot_encode(y)
X[:1]

array([[[0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
        [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]]])
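
The nested loops above can be replaced by indexing into an identity matrix. The following is a sketch of that alternative, assuming all sequences in the dataset have the same length; `one_hot_encode_eye` is a hypothetical name and takes `vocab_length` as an argument instead of reading the global.

```python
import numpy as np

def one_hot_encode_eye(dataset, vocab_length):
    """Vectorised one-hot encoding (illustrative alternative).

    Assumes every sequence in `dataset` has the same length.
    """
    indices = np.asarray(dataset, dtype=int)         # (n_samples, seq_len)
    # Row k of the identity matrix is the one-hot vector for code k.
    return np.eye(vocab_length, dtype=int)[indices]  # (n_samples, seq_len, vocab_length)

encoded = one_hot_encode_eye([[3, 10, 4], [9, 10, 2]], vocab_length=12)
print(encoded.shape)  # (2, 3, 12)
print(encoded[0, 1])  # one-hot row for '*', which has code 10
```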

## Function to decode data from one-hot encoded format back to its original form (like 5*3)

In [81]:
def decode(dataset, vocabulary):
    dataset_dec = []
    vocabulary_inv = {value: key for key, value in vocabulary.items()}
    for sample_seq in dataset:
        sample = []
        for symbol_seq in sample_seq:
            # argmax works for exact one-hot rows and for softmax probabilities
            result = np.argmax(symbol_seq)
            symbol = vocabulary_inv[result]
            sample.append(symbol)
        sample = ''.join(sample)
        dataset_dec.append(sample)
    return dataset_dec
decode(X, vocabulary)

['3*4', '9*2', '5*8', '0*7', '4*7']

In [82]:
decode(y, vocabulary)

['12', '18', '40', ' 0', '28']
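
Since `np.argmax` accepts an `axis` argument, the whole batch can be decoded in one call instead of row by row. This is a sketch of such a variant; `decode_vectorised` is a hypothetical name, and the small vocabulary and input are rebuilt here only to keep the example self-contained.

```python
import numpy as np

def decode_vectorised(dataset, vocabulary):
    """Decode a (n_samples, seq_len, vocab_length) array into strings."""
    inverse = {index: symbol for symbol, index in vocabulary.items()}
    indices = np.argmax(dataset, axis=-1)  # (n_samples, seq_len)
    return [''.join(inverse[int(i)] for i in row) for row in indices]

vocabulary = {symbol: index for index, symbol in enumerate('0123456789* ')}
codes = np.array([[3, 10, 4], [9, 10, 2]])
one_hot = np.eye(len(vocabulary), dtype=int)[codes]

print(decode_vectorised(one_hot, vocabulary))  # ['3*4', '9*2']
```

The same function works on model output, because taking the argmax of a softmax row picks the most probable symbol.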

## Function to create data end-to-end

In [83]:
def create_data(n_samples, n_numbers, max_num):
    X, y = generate_data(n_samples, n_numbers, max_num)
    X, y = data_to_str(X, y, max_y_length=2)
    X, y = encode_data(X, y, vocabulary)
    X = one_hot_encode(X)
    y = one_hot_encode(y)

    return X, y

n_samples = 1000
n_numbers = 2
max_num = 9
n_chars = len(vocabulary)

X, y = create_data(n_samples, n_numbers, max_num)
X.shape, y.shape

((1000, 3, 12), (1000, 2, 12))

<div id="Step4">
</div>

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Step 4. Create and fit NN model. </span>

In [84]:
model = Sequential()
model.add(LSTM(100, input_shape=(n_numbers + 1, n_chars)))
model.add(RepeatVector(2))
model.add(LSTM(50, return_sequences=True))
model.add(TimeDistributed(Dense(n_chars, activation='softmax')))
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()
print(model.output.shape)
print(model.input.shape)

Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_18 (LSTM)               (None, 100)               45200     
_________________________________________________________________
repeat_vector_9 (RepeatVecto (None, 2, 100)            0         
_________________________________________________________________
lstm_19 (LSTM)               (None, 2, 50)             30200     
_________________________________________________________________
time_distributed_6 (TimeDist (None, 2, 12)             612       
Total params: 76,012
Trainable params: 76,012
Non-trainable params: 0
_________________________________________________________________
(None, 2, 12)
(None, 3, 12)
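
The parameter counts in the summary can be checked by hand. An LSTM layer has four gates, each with input weights, recurrent weights and a bias, and a Dense layer has one weight per input-output pair plus a bias per output. The sketch below reproduces the numbers; the helper names are illustrative.

```python
def lstm_params(units, input_dim):
    """4 gates, each with input weights, recurrent weights and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(units, input_dim):
    """One weight per input-output pair plus one bias per output."""
    return units * input_dim + units

n_chars = 12
encoder = lstm_params(100, n_chars)  # first LSTM reads one-hot symbols
decoder = lstm_params(50, 100)       # second LSTM reads the repeated encoder state
output = dense_params(n_chars, 50)   # Dense layer applied at every output step

print(encoder, decoder, output, encoder + decoder + output)
# 45200 30200 612 76012
```

`RepeatVector` has no weights, and `TimeDistributed` shares the same Dense weights across both output steps, so neither adds parameters of its own.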


## Fit model

In [85]:
model.fit(X, y, batch_size=10, epochs=20)

Epoch 1/20
...
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x8b6ca700>

<div id="Step5">
</div>

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Step 5. Make a prediction and check the results. </span>

## Create test data and make a prediction

In [86]:
n_test_samples = 10
X_test, y_test = create_data(n_samples=n_test_samples, n_numbers=n_numbers,
                             max_num=max_num)
y_pred = model.predict(X_test, batch_size=10, verbose=0)



In [87]:
X_test_dec = decode(X_test, vocabulary)
y_test_dec = decode(y_test, vocabulary)
y_pred_dec = decode(y_pred, vocabulary)

for sample_n in range(n_test_samples):
    # Print the true answer next to the model's prediction
    print(f'Sample: {X_test_dec[sample_n]}={y_test_dec[sample_n]} Predicted: '
          f'{y_pred_dec[sample_n]}')

Sample: 7*7=49 Predicted: 49
Sample: 6*4=24 Predicted: 24
Sample: 3*9=27 Predicted: 27
Sample: 1*8= 8 Predicted:  8
Sample: 6*9=54 Predicted: 54
Sample: 3*9=27 Predicted: 27
Sample: 1*4= 4 Predicted:  4
Sample: 7*4=28 Predicted: 28
Sample: 0*2= 0 Predicted:  0
Sample: 0*8= 0 Predicted:  0
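
Reading the printed pairs by eye does not scale beyond a handful of samples. A minimal sketch of an exact-match score over the decoded strings follows; `exact_match_accuracy` is a hypothetical helper, and the strings in the usage line are made up for illustration, not model output.

```python
def exact_match_accuracy(y_true_dec, y_pred_dec):
    """Fraction of samples whose predicted string equals the true string."""
    if len(y_true_dec) != len(y_pred_dec):
        raise ValueError('Both lists must have the same length.')
    if not y_true_dec:
        return 0.0
    matches = sum(true == pred for true, pred in zip(y_true_dec, y_pred_dec))
    return matches / len(y_true_dec)

# Made-up strings for illustration only: two of three answers match.
print(exact_match_accuracy(['49', '24', ' 8'], ['49', '24', '18']))
```

In the notebook this would be called as `exact_match_accuracy(y_test_dec, y_pred_dec)`. It is stricter than the per-character `accuracy` metric reported during training, because a whole answer counts as wrong if any character differs.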


<div id="Step6">
</div>

# <span style="color:#01872A; display: block; padding:10px; background:#daf2e1;border-radius:20px; text-align: center; font-size: 40px; "> Step 6. Conclusion. </span>

## The model for sequence prediction was built and works correctly.
* The model could be expanded to accept more operations ('+', '-', '/').
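
As a rough sketch of what such an extension might look like on the data side, the snippet below adds '+' and '-' next to '*'. Everything in it (`OPERATIONS`, `generate_expression`, the extended symbol list) is an assumption for illustration. Division is left out because its results are not always integers and division by zero would need separate handling.

```python
import random

# Hypothetical extension: '+' and '-' alongside '*'.
OPERATIONS = {
    '+': lambda a, b: a + b,
    '-': lambda a, b: a - b,
    '*': lambda a, b: a * b,
}

def generate_expression(max_num, rng=random):
    """Return one (expression, answer) pair of strings, e.g. ('3-7', '-4')."""
    a, b = rng.randint(0, max_num), rng.randint(0, max_num)
    op = rng.choice(sorted(OPERATIONS))
    return f'{a}{op}{b}', str(OPERATIONS[op](a, b))

# Extended vocabulary: '-' serves both as operator and as minus sign.
symbols = [str(digit) for digit in range(10)] + ['+', '-', '*', ' ']

# Longest answer over all single-digit operands and all operations.
max_answer_length = max(len(str(op(a, b)))
                        for op in OPERATIONS.values()
                        for a in range(10) for b in range(10))

print(generate_expression(9))
print(len(symbols), max_answer_length)
```

With single-digit operands the answers range from -9 to 81, so two output characters still suffice and only the vocabulary size (and with it `n_chars`) changes.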
