# Model zamieniający cyfry rzymskie na arabskie
- input: sekwencja cyfr rzymskich
- output: cyfry arabskie 


Examples: 
- input: sequence 'X' output: 10
- input: sequence 'XII' output: 12


Import modułów

In [1]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM
import numpy as np
import roman_numerals as cnv
import random

## Model

Utworzenie modelu

In [2]:
model = Sequential()
model.add(LSTM(128,input_shape=(None,1),return_sequences=True)) # sequences of singlen numbers
model.add(LSTM(128))
model.add(Dense(1))

model.compile(loss='mean_squared_error', optimizer="adam",metrics=['mae','mse'])
num_epochs = 0
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm (LSTM)                  (None, None, 128)         66560     
_________________________________________________________________
lstm_1 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dense (Dense)                (None, 1)                 129       
Total params: 198,273
Trainable params: 198,273
Non-trainable params: 0
_________________________________________________________________


## Tworzenie zbioru danych

Metody pomocnicze

In [3]:
# helper method, converts sequence of numbers to text
def to_text(sample):
    return ''.join([idx2char[int(x)] for x in sample])
# helper method, converts text to sequence of numbers
def to_number(words):
    return np.array([char2idx[char] for char in words])

Utworzenie samples i labels

In [4]:
DATASET_SIZE=200

samples = []
labels = []
all_words = ''
max_len = 0

for i in range(DATASET_SIZE):
    labels.append(i + 1)
    words = cnv.convert(i + 1)
    samples.append(words)
    all_words += words
    if len(words)>max_len: 
        max_len = len(words)

all_words += ' '

print('Max len of text',max_len)
vocab = sorted(set(all_words))
vocab_size = len(vocab)
print('vocabulary (used letters)',vocab)
print ('unique characters',vocab_size)

Max len of text 9
vocabulary (used letters) [' ', 'C', 'I', 'L', 'V', 'X']
unique characters 6


#### Creating a mapping from unique characters to indices

In [5]:
char2idx = {char:index for index, char in enumerate(vocab)}
print('char2idx:\n',char2idx)
idx2char = np.array(vocab)
print('idx2char\n',idx2char)

char2idx:
 {' ': 0, 'C': 1, 'I': 2, 'L': 3, 'V': 4, 'X': 5}
idx2char
 [' ' 'C' 'I' 'L' 'V' 'X']


#### Convert letters to numbers using char2idx

In [6]:
samples_int = []
for s in samples:
    v = np.array([char2idx[char] for char in s])
    samples_int.append(v) # different sizes!
print(samples[123],' ->becomes-> ',samples_int[123])

CXXIV  ->becomes->  [1 5 5 2 4]


#### From list of lists to numpy - must have a fixed number of characters (30 -> max_len)

In [7]:
samples = np.zeros((DATASET_SIZE,max_len))
for i in range(len(samples_int)):
    for j in range(len(samples_int[i])):
        samples[i,j] = np.array(samples_int[i][j]) # all not used have '0' which is ' '
print('SAMPLES\n\n',samples)
print(samples.shape)

SAMPLES

 [[2. 0. 0. ... 0. 0. 0.]
 [2. 2. 0. ... 0. 0. 0.]
 [2. 2. 2. ... 0. 0. 0.]
 ...
 [1. 5. 1. ... 2. 0. 0.]
 [1. 5. 1. ... 0. 0. 0.]
 [1. 1. 0. ... 0. 0. 0.]]
(200, 9)


Rozszerzenie wymiaru samples i konwersja labels

In [8]:
samples = np.expand_dims(samples,axis=2) #add the third dimension
labels = np.array(labels,dtype=float)

print("Sample (for 123):\n",samples[123])
print("Sample decoded",to_text(samples[123]))
print("Label (output):",labels[123])

print('samples shape',samples.shape)
print('labels shape',labels.shape)

Sample (for 123):
 [[1.]
 [5.]
 [5.]
 [2.]
 [4.]
 [0.]
 [0.]
 [0.]
 [0.]]
Sample decoded CXXIV    
Label (output): 124.0
samples shape (200, 9, 1)
labels shape (200,)


Rozdzielenie danych na testowe i treningowe

In [9]:
TRAINING_SIZE = .5
from sklearn.model_selection import train_test_split
(trainSamples, testSamples, trainLabels, testLabels) = train_test_split(samples, labels,train_size=TRAINING_SIZE, random_state=1)
print('Training samples:',len(trainSamples),' test samples',len(testSamples))

Training samples: 100  test samples 100


Sprawdzanie modelu (i wywołanie sprawdzenia jak wygląda nieprzetrenowany)

In [10]:
def check_model(verbose=0,how_many=5):
    pred = model.predict(samples)
    print('text => [predicted value] error=[error]')
    error = []
    for i in range(len(pred)):
        res = samples[i]
        error.append(abs(i-pred[i]))
        if verbose==1:
            train = ''
            if i in trainLabels: train='[T]'
            print(i + 1,to_text(res),'=> {:.2f} error = {:.2f}'.format(pred[i,0],abs((i+1)-pred[i,0])),train)
    if verbose<1: # if not verbose just display 'how_many' random samples
        for i in range(how_many):        
            x = random.randrange(DATASET_SIZE)
            res = samples[x]
            print(to_text(res),'=>  {:.2f} error = {:.2f}'.format(pred[x,0],abs(x-pred[x,0])))      
    print('Mean error =',np.mean(error))        
    return np.mean(error)
check_model(1)

text => [predicted value] error=[error]
1 I         => 0.01 error = 0.99 
2 II        => 0.01 error = 1.99 [T]
3 III       => 0.02 error = 2.98 [T]
4 IV        => 0.02 error = 3.98 [T]
5 V         => 0.01 error = 4.99 [T]
6 VI        => 0.02 error = 5.98 
7 VII       => 0.03 error = 6.97 
8 VIII      => 0.03 error = 7.97 [T]
9 IX        => 0.02 error = 8.98 [T]
10 X         => 0.02 error = 9.98 [T]
11 XI        => 0.02 error = 10.98 [T]
12 XII       => 0.03 error = 11.97 [T]
13 XIII      => 0.04 error = 12.96 
14 XIV       => 0.04 error = 13.96 
15 XV        => 0.03 error = 14.97 
16 XVI       => 0.04 error = 15.96 
17 XVII      => 0.04 error = 16.96 [T]
18 XVIII     => 0.05 error = 17.95 
19 XIX       => 0.04 error = 18.96 
20 XX        => 0.03 error = 19.97 
21 XXI       => 0.04 error = 20.96 
22 XXII      => 0.05 error = 21.95 [T]
23 XXIII     => 0.05 error = 22.95 [T]
24 XXIV      => 0.05 error = 23.95 [T]
25 XXV       => 0.04 error = 24.96 [T]
26 XXVI      => 0.05 error = 25.95 [T

99.45477

Trenowanie modelu

In [11]:
EPOCHS=500
BATCH_SIZE = int(len(trainSamples)/10)
print('Training with',len(trainSamples),'samples',EPOCHS,'epochs and batch_size=',BATCH_SIZE)
for x in range(15):
    H = model.fit(trainSamples, trainLabels, epochs=EPOCHS,verbose=0,batch_size=BATCH_SIZE)
    num_epochs += EPOCHS
    print("\n{}/15 Epochs: {} - loss={:6.3f}, loss improvement={:6.3f}".
          format(x + 1, num_epochs,H.history['loss'][-1], H.history['loss'][0]-H.history['loss'][-1]))
    check_model()
print("Done")

Training with 100 samples 500 epochs and batch_size= 10

1/15 Epochs: 500 - loss=3099.394, loss improvement=8904.222
text => [predicted value] error=[error]
IV        =>  96.62 error = 93.62
LXXXVII   =>  96.62 error = 10.62
CXIV      =>  96.62 error = 16.38
XXX       =>  96.62 error = 67.62
CI        =>  96.62 error = 3.38
Mean error = 50.04136

2/15 Epochs: 1000 - loss=3099.076, loss improvement=-0.081
text => [predicted value] error=[error]
XXXVII    =>  96.49 error = 60.49
CX        =>  96.49 error = 12.51
XCVIII    =>  96.49 error = 0.51
CXCIII    =>  96.49 error = 95.51
CXCIX     =>  96.49 error = 101.51
Mean error = 50.045254

3/15 Epochs: 1500 - loss=3099.899, loss improvement=-0.745
text => [predicted value] error=[error]
XCIV      =>  96.67 error = 3.67
LXXX      =>  96.67 error = 17.67
CLXII     =>  96.67 error = 64.33
XXVIII    =>  96.67 error = 69.67
XIII      =>  96.67 error = 84.67
Mean error = 50.0398

4/15 Epochs: 2000 - loss=135.191, loss improvement=2963.930
text => 

Sprawdzenie przetrenowanego modelu

In [12]:
check_model(1)

text => [predicted value] error=[error]
1 I         => 1.03 error = 0.03 
2 II        => 1.98 error = 0.02 [T]
3 III       => 3.02 error = 0.02 [T]
4 IV        => 4.01 error = 0.01 [T]
5 V         => 6.34 error = 1.34 [T]
6 VI        => 5.63 error = 0.37 
7 VII       => 7.07 error = 0.07 
8 VIII      => 7.97 error = 0.03 [T]
9 IX        => 9.01 error = 0.01 [T]
10 X         => 10.03 error = 0.03 [T]
11 XI        => 11.01 error = 0.01 [T]
12 XII       => 10.64 error = 1.36 [T]
13 XIII      => 12.06 error = 0.94 
14 XIV       => 16.68 error = 2.68 
15 XV        => 13.71 error = 1.29 
16 XVI       => 16.07 error = 0.07 
17 XVII      => 18.72 error = 1.72 [T]
18 XVIII     => 20.23 error = 2.23 
19 XIX       => 37.46 error = 18.46 
20 XX        => 18.04 error = 1.96 
21 XXI       => 21.01 error = 0.01 
22 XXII      => 22.18 error = 0.18 [T]
23 XXIII     => 23.04 error = 0.04 [T]
24 XXIV      => 23.93 error = 0.07 [T]
25 XXV       => 24.90 error = 0.10 [T]
26 XXVI      => 26.13 error = 0.13 

4.3801203

Sprawdzenie przykładowej cyfry rzymskiej

In [19]:
x = to_number('X         ') 
x = np.expand_dims(x,axis=1)
x = np.expand_dims(x,axis=0)
model.predict(x)

array([[10.183619]], dtype=float32)

Zapisanie modelu

In [14]:
model.save('model_romans2numbers.h5')