# Adder & Subtractor Practice

## Outline
* 1. [Coding](#coding)
* 2. [Run & Analysis](#analysis)
    * 2.1 [Adder (and test more number of digits)](#adder)
    * 2.2 [Subtractor](#subtractor)
    * 2.3 [Combine (and test more data, epoch)](#combine)
    * 2.4 [Multiplicator (and test binary input)](#multiplicator)
* 3. [Conclusion](#conclusion)

---

## <a name="coding"></a>1. Coding

In [1]:
import numpy as np
from keras import layers
from keras.models import Sequential
from six.moves import range

Using TensorFlow backend.


In [2]:
class CharacterTable(object):
    def __init__(self, chars):
        self.chars = sorted(set(chars))
        self.char_indices = dict((c, i) for i, c in enumerate(self.chars))
        self.indices_char = dict((i, c) for i, c in enumerate(self.chars))
    
    def encode(self, C, num_rows):
        x = np.zeros((num_rows, len(self.chars)))
        for i, c in enumerate(C):
            x[i, self.char_indices[c]] = 1
        return x
    
    def decode(self, x, calc_argmax=True):
        if calc_argmax:
            x = x.argmax(axis=-1)
        return ''.join(self.indices_char[x] for x in x)
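
A quick round trip may make the one-hot encoding easier to follow. The sketch below repeats the class so it runs on its own; the `demo_*` names are illustrative and not part of the original notebook. It also shows a side effect that holds for this vocabulary because the space character sorts first: rows left all-zero by a short input decode as padding spaces.

```python
import numpy as np

class CharacterTable(object):
    """Copy of the codec above, repeated so this sketch runs standalone."""
    def __init__(self, chars):
        self.chars = sorted(set(chars))
        self.char_indices = {c: i for i, c in enumerate(self.chars)}
        self.indices_char = {i: c for i, c in enumerate(self.chars)}

    def encode(self, C, num_rows):
        # One row per character position, one column per vocabulary entry.
        x = np.zeros((num_rows, len(self.chars)))
        for i, c in enumerate(C):
            x[i, self.char_indices[c]] = 1
        return x

    def decode(self, x, calc_argmax=True):
        if calc_argmax:
            x = x.argmax(axis=-1)
        return ''.join(self.indices_char[i] for i in x)

demo_table = CharacterTable('0123456789+ ')
demo_query = '12+345 '                      # already padded to 7 characters
demo_onehot = demo_table.encode(demo_query, 7)
demo_roundtrip = demo_table.decode(demo_onehot)

# A shorter string leaves trailing rows all-zero; argmax picks index 0,
# which is ' ' here because the space sorts before '+' and the digits.
demo_short = demo_table.decode(demo_table.encode('1+2', 7))

print(demo_onehot.shape)
print(repr(demo_roundtrip), repr(demo_short))
```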

In [3]:
class colors:
    ok = '\033[92m'
    fail = '\033[91m'
    close = '\033[0m'

In [4]:
def vectorize(ctable, questions, expected, INPUT_MAXLEN, OUTPUT_MAXLEN, chars):
    x = np.zeros((len(questions), INPUT_MAXLEN, len(chars)), dtype=bool)
    y = np.zeros((len(questions), OUTPUT_MAXLEN, len(chars)), dtype=bool)
    for i, sentence in enumerate(questions):
        x[i] = ctable.encode(sentence, INPUT_MAXLEN)
    for i, sentence in enumerate(expected):
        y[i] = ctable.encode(sentence, OUTPUT_MAXLEN)
    
    indices = np.arange(len(y))
    np.random.shuffle(indices)
    x = x[indices]
    y = y[indices]
    
    split_at = len(x) - len(x) // 10
    (x_train, x_val) = x[:split_at], x[split_at:]
    (y_train, y_val) = y[:split_at], y[split_at:]
    
    print('Training Data:')
    print(x_train.shape)
    print(y_train.shape)
    
    print('Validation Data:')
    print(x_val.shape)
    print(y_val.shape)
    
    return x_train, y_train, x_val, y_val
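
The shuffle and 90/10 split inside `vectorize` can be tried in isolation. This is a minimal sketch on toy arrays; the helper name `shuffle_and_split` and its `seed` argument are assumptions added for reproducibility (the original uses the global `np.random` state).

```python
import numpy as np

def shuffle_and_split(x, y, seed=0):
    """Shuffle x and y with one shared permutation, then hold out the last tenth."""
    rng = np.random.RandomState(seed)   # assumption: seeded for repeatability
    indices = np.arange(len(y))
    rng.shuffle(indices)
    x, y = x[indices], y[indices]       # same permutation keeps pairs aligned
    split_at = len(x) - len(x) // 10
    return x[:split_at], y[:split_at], x[split_at:], y[split_at:]

x_demo = np.arange(50).reshape(-1, 1)
y_demo = x_demo * 2
x_tr, y_tr, x_va, y_va = shuffle_and_split(x_demo, y_demo)
print(x_tr.shape, x_va.shape)
```

Because both arrays are indexed with the same permutation, every question stays paired with its answer after shuffling.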

In [5]:
def build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars):
    model = Sequential()
    model.add(layers.LSTM(HIDDEN_SIZE, input_shape=(INPUT_MAXLEN, len(chars))))
    model.add(layers.RepeatVector(OUTPUT_MAXLEN))
    for _ in range(LAYERS):
        model.add(layers.LSTM(HIDDEN_SIZE, return_sequences=True))
    
    model.add(layers.TimeDistributed(layers.Dense(len(chars), activation='softmax')))
    model.compile(loss='categorical_crossentropy',
                  optimizer='adam',
                  metrics=['accuracy'])
    model.summary()
    
    return model
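
As a sanity check on this architecture, the parameter counts that `model.summary()` prints in the runs below can be reproduced by hand. The sketch assumes the standard Keras LSTM layout (four gates, each with input weights, recurrent weights and a bias); the helper names are illustrative.

```python
def lstm_params(input_dim, units):
    # 4 gates x (input weights + recurrent weights + bias)
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    return input_dim * units + units

def model_params(vocab_size, hidden_size=128, decoder_layers=1):
    total = lstm_params(vocab_size, hidden_size)                      # encoder LSTM
    total += decoder_layers * lstm_params(hidden_size, hidden_size)   # decoder LSTM(s)
    total += dense_params(hidden_size, vocab_size)                    # per-step softmax
    return total

print(model_params(12))  # 12-character vocabulary -> 205324
print(model_params(13))  # 13-character vocabulary -> 205965
```

`RepeatVector` and `TimeDistributed` add no weights of their own, which is why only the two LSTM layers and the dense layer contribute.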

In [6]:
def training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE, test=True):
    for iteration in range(ITERATION):
        print()
        print('-' * 50)
        print('Iteration', iteration + 1)
        model.fit(x_train, y_train,
                  batch_size=BATCH_SIZE,
                  epochs=1,
                  validation_data=(x_val, y_val),
                  verbose=2)
        
        if test:
            testing(model, ctable, x_val, y_val, REVERSE)

In [7]:
def testing(model, ctable, x_val, y_val, REVERSE):
    for i in range(10):
        ind = np.random.randint(0, len(x_val))
        rowx, rowy = x_val[np.array([ind])], y_val[np.array([ind])]
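        # predict_classes was removed from newer Keras releases;
        # taking the argmax of the softmax output gives the same class ids.
        preds = model.predict(rowx, verbose=0).argmax(axis=-1)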
        q = ctable.decode(rowx[0])
        correct = ctable.decode(rowy[0])
        guess = ctable.decode(preds[0], calc_argmax=False)
        print('Q', q[::-1] if REVERSE else q, end=' ')
        print('T', correct, end=' ')
        if correct == guess:
            print(colors.ok + '☑' + colors.close, end=' ')
        else:
            print(colors.fail + '☒' + colors.close, end=' ')
        print(guess)
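
When reading the logs below, it may help to separate two notions of accuracy. As far as I can tell, the `acc` and `val_acc` values Keras reports for this model are averaged over character positions, while the ☑/☒ marks from `testing` require the whole answer to match. The NumPy sketch below contrasts the two on made-up predictions; the function names and data are illustrative only.

```python
import numpy as np

def per_char_accuracy(y_true, y_prob):
    # Fraction of character positions predicted correctly.
    return float((y_true.argmax(-1) == y_prob.argmax(-1)).mean())

def exact_match_accuracy(y_true, y_prob):
    # Fraction of samples where every position is correct.
    return float((y_true.argmax(-1) == y_prob.argmax(-1)).all(axis=1).mean())

# Two toy answers of length 4 over a 12-symbol vocabulary (made-up ids).
true_ids = np.array([[1, 2, 3, 0], [4, 5, 6, 0]])
pred_ids = np.array([[1, 2, 3, 0], [4, 5, 7, 0]])   # one wrong character
y_true_demo = np.eye(12)[true_ids]
y_prob_demo = np.eye(12)[pred_ids]

char_acc = per_char_accuracy(y_true_demo, y_prob_demo)
exact_acc = exact_match_accuracy(y_true_demo, y_prob_demo)
print(char_acc, exact_acc)  # 0.875 0.5
```

A single wrong digit costs one eighth of the per-character score here but half of the exact-match score, so a per-character accuracy around 70% can coincide with almost no fully correct answers.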

## <a name="analysis"></a>2. Run & Analysis

### <a name="adder"></a>2.1 Adder

#### 2.1.1 Parameters

In [18]:
TRAINING_SIZE = 50000
DIGITS = 3
INPUT_MAXLEN = DIGITS + 1 + DIGITS
OUTPUT_MAXLEN = DIGITS + 1
REVERSE = True
    
chars = '0123456789+ '
ctable = CharacterTable(chars)
    
HIDDEN_SIZE = 128
LAYERS = 1
    
ITERATION = 50
BATCH_SIZE = 128

#### 2.1.2 Generate data

In [19]:
def gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars):
    questions = []
    expected = []
    seen = set()
    
    while len(questions) < TRAINING_SIZE:
        fn = lambda: int(''.join(np.random.choice(list('0123456789')) for _ in range(np.random.randint(1, DIGITS + 1))))
        a, b = fn(), fn()
        
        key = tuple(sorted((a, b)))
        if key in seen:
            continue
        seen.add(key)
        
        q = '{}+{}'.format(a, b)
        query = q + ' ' * (INPUT_MAXLEN - len(q))
        
        ans = str(a + b)
        ans += ' ' * (OUTPUT_MAXLEN - len(ans))
        
        if REVERSE:
            query = query[::-1]
        
        questions.append(query)
        expected.append(ans)
    print('Total questions:', len(questions))
    
    return vectorize(ctable, questions, expected, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
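
To see what a single training pair looks like, the padding and reversal steps of `gen_data` can be applied to one example. `format_example` is a hypothetical helper that mirrors those steps; it is not part of the original notebook.

```python
def format_example(a, b, digits=3, reverse=True):
    """Build one padded (query, answer) pair the way gen_data does."""
    input_maxlen = digits + 1 + digits      # e.g. '999+999'
    output_maxlen = digits + 1              # e.g. '1998'
    q = '{}+{}'.format(a, b)
    query = q + ' ' * (input_maxlen - len(q))
    ans = str(a + b)
    ans += ' ' * (output_maxlen - len(ans))
    if reverse:
        query = query[::-1]                 # only the input is reversed
    return query, ans

print(format_example(17, 34))    # ('  43+71', '51  ')
print(format_example(999, 999))  # ('999+999', '1998')
```

Note that only the question is reversed; the target answer stays in its natural order, which is why `testing` reverses `q` again before printing.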

#### 2.1.3 Training

In [14]:
x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS,REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE)

Total questions: 50000
Training Data:
(45000, 7, 12)
(45000, 4, 12)
Validation Data:
(5000, 7, 12)
(5000, 4, 12)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 128)               72192     
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 4, 128)            0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 4, 128)            131584    
_________________________________________________________________
time_distributed_1 (TimeDist (None, 4, 12)             1548      
Total params: 205,324
Trainable params: 205,324
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 9s - loss: 1.8831 - acc: 0.32

 - 6s - loss: 0.3021 - acc: 0.9146 - val_loss: 0.2670 - val_acc: 0.9311
Q 17+34   T 51   ☑ 51
Q 35+994  T 1029 ☑ 1029
Q 888+73  T 961  ☑ 961
Q 603+2   T 605  ☑ 605
Q 343+72  T 415  ☑ 415
Q 458+26  T 484  ☑ 484
Q 36+29   T 65   ☒ 66
Q 28+296  T 324  ☑ 324
Q 832+342 T 1174 ☑ 1174
Q 848+763 T 1611 ☑ 1611

--------------------------------------------------
Iteration 16
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 6s - loss: 0.2209 - acc: 0.9478 - val_loss: 0.1854 - val_acc: 0.9617
Q 827+71  T 898  ☑ 898
Q 626+204 T 830  ☑ 830
Q 26+269  T 295  ☑ 295
Q 15+562  T 577  ☑ 577
Q 78+684  T 762  ☑ 762
Q 510+970 T 1480 ☑ 1480
Q 581+609 T 1190 ☑ 1190
Q 23+747  T 770  ☑ 770
Q 85+645  T 730  ☑ 730
Q 834+7   T 841  ☑ 841

--------------------------------------------------
Iteration 17
Train o

 - 6s - loss: 0.0372 - acc: 0.9899 - val_loss: 0.0199 - val_acc: 0.9958
Q 5+213   T 218  ☑ 218
Q 29+979  T 1008 ☑ 1008
Q 498+583 T 1081 ☑ 1081
Q 481+99  T 580  ☑ 580
Q 519+9   T 528  ☑ 528
Q 87+20   T 107  ☑ 107
Q 331+17  T 348  ☑ 348
Q 133+95  T 228  ☑ 228
Q 698+91  T 789  ☑ 789
Q 576+785 T 1361 ☑ 1361

--------------------------------------------------
Iteration 32
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 6s - loss: 0.0092 - acc: 0.9994 - val_loss: 0.0116 - val_acc: 0.9984
Q 479+2   T 481  ☑ 481
Q 346+97  T 443  ☑ 443
Q 953+33  T 986  ☑ 986
Q 6+336   T 342  ☑ 342
Q 40+704  T 744  ☑ 744
Q 72+350  T 422  ☑ 422
Q 42+481  T 523  ☑ 523
Q 7+329   T 336  ☑ 336
Q 793+7   T 800  ☑ 800
Q 396+398 T 794  ☑ 794

--------------------------------------------------
Iteration 33
Train o

 - 6s - loss: 0.0022 - acc: 1.0000 - val_loss: 0.0043 - val_acc: 0.9993
Q 486+9   T 495  ☑ 495
Q 638+41  T 679  ☑ 679
Q 491+50  T 541  ☑ 541
Q 699+44  T 743  ☑ 743
Q 246+123 T 369  ☑ 369
Q 960+13  T 973  ☑ 973
Q 410+389 T 799  ☑ 799
Q 681+54  T 735  ☑ 735
Q 80+814  T 894  ☑ 894
Q 698+26  T 724  ☑ 724

--------------------------------------------------
Iteration 48
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 6s - loss: 0.0020 - acc: 1.0000 - val_loss: 0.0042 - val_acc: 0.9992
Q 87+546  T 633  ☑ 633
Q 742+236 T 978  ☑ 978
Q 392+43  T 435  ☑ 435
Q 0+687   T 687  ☑ 687
Q 28+524  T 552  ☑ 552
Q 236+83  T 319  ☑ 319
Q 64+682  T 746  ☑ 746
Q 24+535  T 559  ☑ 559
Q 316+41  T 357  ☑ 357
Q 37+73   T 110  ☑ 110

--------------------------------------------------
Iteration 49
Train o

#### 2.1.4 Analysis

- Accuracy reaches 99% at epoch 22 and climbs to 99.9% in later epochs
- Adding three-digit numbers is a very easy problem for this model

#### 2.1.5 Testing a different number of digits, with analysis

In [20]:
DIGITS = 10
INPUT_MAXLEN = DIGITS + 1 + DIGITS
OUTPUT_MAXLEN = DIGITS + 1

x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS,REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE)

Total questions: 50000
Training Data:
(45000, 21, 12)
(45000, 11, 12)
Validation Data:
(5000, 21, 12)
(5000, 11, 12)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_5 (LSTM)                (None, 128)               72192     
_________________________________________________________________
repeat_vector_3 (RepeatVecto (None, 11, 128)           0         
_________________________________________________________________
lstm_6 (LSTM)                (None, 11, 128)           131584    
_________________________________________________________________
time_distributed_3 (TimeDist (None, 11, 12)            1548      
Total params: 205,324
Trainable params: 205,324
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 16s - loss: 1.6508 - acc:

 - 15s - loss: 1.0274 - acc: 0.6069 - val_loss: 1.0216 - val_acc: 0.6092
Q 9247276+4171          T 9251447     ☒ 9247079
Q 947+1                 T 948         ☒ 950
Q 484+5590              T 6074        ☒ 5676
Q 235806+7646692218     T 7646928024  ☒ 7646668800
Q 48173+1229621217      T 1229669390  ☒ 1229666650
Q 68087836+19009        T 68106845    ☒ 68097000
Q 4261+8970932842       T 8970937103  ☒ 8990006668
Q 390+46381             T 46771       ☒ 46590
Q 431+429638322         T 429638753   ☒ 429633558
Q 3+60756069            T 60756072    ☒ 60750066

--------------------------------------------------
Iteration 11
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 14s - loss: 1.0066 - acc: 0.6168 - val_loss: 0.9949 - val_acc: 0.6210
Q 3+240547313           T 240547316   ☒ 220544333
Q 4052974771+54209377   T 4107184148  ☒ 4171333333

 - 15s - loss: 0.8414 - acc: 0.6787 - val_loss: 0.8549 - val_acc: 0.6719
Q 7926266499+3500       T 7926269999  ☒ 7926267335
Q 74072+8278543         T 8352615     ☒ 8303377
Q 0+9067628             T 9067628     ☒ 9067637
Q 38069458+7345463      T 45414921    ☒ 42723662
Q 9979150662+348002     T 9979498664  ☒ 9979606777
Q 915489963+7915        T 915497878   ☒ 915496378
Q 9296324893+0          T 9296324893  ☒ 9996328977
Q 346+222894            T 223240      ☒ 223076
Q 3574637+1206          T 3575843     ☒ 3575067
Q 5158761734+430811877  T 5589573611  ☒ 5666666677

--------------------------------------------------
Iteration 22
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 14s - loss: 0.8275 - acc: 0.6835 - val_loss: 0.8455 - val_acc: 0.6730
Q 75+322394288          T 322394363   ☒ 322294572
Q 74868+5206197364      T 5206272232  ☒ 5206256222

 - 14s - loss: 0.7450 - acc: 0.7158 - val_loss: 0.7697 - val_acc: 0.7025
Q 33054656+614          T 33055270    ☒ 33055295
Q 13+530                T 543         ☒ 547
Q 423215+309955         T 733170      ☒ 761202
Q 114154+28             T 114182      ☒ 114189
Q 477837+7              T 477844      ☑ 477844
Q 5778475508+625913894  T 6404389402  ☒ 6440223199
Q 7592+2321650537       T 2321658129  ☒ 2221659666
Q 77132999+3131         T 77136130    ☒ 77136140
Q 8841201041+6528245862 T 15369446903 ☒ 15000000011
Q 454+30                T 484         ☒ 483

--------------------------------------------------
Iteration 33
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 14s - loss: 0.7392 - acc: 0.7176 - val_loss: 0.7656 - val_acc: 0.7038
Q 608+51                T 659         ☒ 654
Q 9179+96               T 9275        ☒ 9265

 - 14s - loss: 0.6886 - acc: 0.7376 - val_loss: 0.7306 - val_acc: 0.7168
Q 95606565+31           T 95606596    ☒ 95606597
Q 8122730+4742          T 8127472     ☒ 8127588
Q 63126494+9721619      T 72848113    ☒ 72299392
Q 5040941+43            T 5040984     ☒ 5040986
Q 38875+6417401311      T 6417440186  ☒ 6417452223
Q 3+9962339             T 9962342     ☒ 9962341
Q 84549725+4            T 84549729    ☑ 84549729
Q 2189+85565953         T 85568142    ☒ 85567328
Q 74+749                T 823         ☒ 836
Q 28023+1               T 28024       ☑ 28024

--------------------------------------------------
Iteration 44
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 14s - loss: 0.6836 - acc: 0.7401 - val_loss: 0.7275 - val_acc: 0.7188
Q 60335+583957499       T 584017834   ☒ 583012221
Q 8814381839+6          T 8814381845  ☒ 8814481848

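- Accuracy only reaches about 72%
- The more digits there are, the more training data and epochs the model needs to learn well
- For arithmetic on even more digits, a better approach would mirror hardware design: first build a 1-bit full adder, then chain copies of it together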
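
The full-adder idea in the last bullet can be illustrated in plain Python. This is only a sketch of the hardware analogy (a ripple-carry adder built from 1-bit full adders), not a trained model; how the author would learn or wire such a unit is not specified in the notebook, and the function names and `width` default are assumptions.

```python
def full_adder(a_bit, b_bit, carry_in):
    """1-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a_bit ^ b_bit ^ carry_in
    carry_out = (a_bit & b_bit) | (carry_in & (a_bit ^ b_bit))
    return sum_bit, carry_out

def ripple_carry_add(a, b, width=40):
    """Add two non-negative integers by chaining `width` full adders.

    width=40 is an assumption; it comfortably covers 10-digit operands.
    """
    carry = 0
    result = 0
    for i in range(width):
        sum_bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= sum_bit << i
    return result | (carry << width)    # final carry becomes the top bit

# One of the 10-digit questions from the log above.
ripple_result = ripple_carry_add(8841201041, 6528245862)
print(ripple_result)
```

Each stage only needs to get a three-input, two-output function right, so the difficulty no longer grows with the number of digits; the length of the chain does.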

---

### <a name="subtractor"></a>2.2 Subtractor

#### 2.2.1 Parameters

In [16]:
TRAINING_SIZE = 50000
DIGITS = 3
INPUT_MAXLEN = DIGITS + 1 + DIGITS
OUTPUT_MAXLEN = DIGITS + 1
REVERSE = True
    
chars = '0123456789- '
ctable = CharacterTable(chars)
    
HIDDEN_SIZE = 128
LAYERS = 1
    
ITERATION = 50
BATCH_SIZE = 128

#### 2.2.2 Generate data

In [18]:
def gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars):
    questions = []
    expected = []
    seen = set()
    
    while len(questions) < TRAINING_SIZE:
        fn = lambda: int(''.join(np.random.choice(list('0123456789')) for _ in range(np.random.randint(1, DIGITS + 1))))
        a, b = fn(), fn()
        
        key = tuple(sorted((a, b)))
        if key in seen or a < b:
            continue
        seen.add(key)
        
        q = '{}-{}'.format(a, b)
        query = q + ' ' * (INPUT_MAXLEN - len(q))
        
        ans = str(a - b)
        ans += ' ' * (OUTPUT_MAXLEN - len(ans))
        
        if REVERSE:
            query = query[::-1]
        
        questions.append(query)
        expected.append(ans)
    print('Total questions:', len(questions))
    
    return vectorize(ctable, questions, expected, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)

#### 2.2.3 Training

In [19]:
x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS,REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE)

Total questions: 50000
Training Data:
(45000, 7, 12)
(45000, 4, 12)
Validation Data:
(5000, 7, 12)
(5000, 4, 12)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_3 (LSTM)                (None, 128)               72192     
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 4, 128)            0         
_________________________________________________________________
lstm_4 (LSTM)                (None, 4, 128)            131584    
_________________________________________________________________
time_distributed_2 (TimeDist (None, 4, 12)             1548      
Total params: 205,324
Trainable params: 205,324
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 1.7535 - acc: 0.36

 - 6s - loss: 0.0801 - acc: 0.9814 - val_loss: 0.0831 - val_acc: 0.9796
Q 794-2   T 792  ☑ 792
Q 629-109 T 520  ☑ 520
Q 599-35  T 564  ☑ 564
Q 581-61  T 520  ☑ 520
Q 362-56  T 306  ☑ 306
Q 771-96  T 675  ☑ 675
Q 617-3   T 614  ☑ 614
Q 65-9    T 56   ☒ 57
Q 804-2   T 802  ☑ 802
Q 214-13  T 201  ☑ 201

--------------------------------------------------
Iteration 16
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 6s - loss: 0.0754 - acc: 0.9815 - val_loss: 0.1177 - val_acc: 0.9601
Q 83-52   T 31   ☑ 31
Q 385-92  T 293  ☑ 293
Q 854-88  T 766  ☑ 766
Q 760-41  T 719  ☑ 719
Q 928-23  T 905  ☑ 905
Q 690-70  T 620  ☑ 620
Q 975-808 T 167  ☑ 167
Q 475-9   T 466  ☑ 466
Q 707-8   T 699  ☑ 699
Q 700-17  T 683  ☑ 683

--------------------------------------------------
Iteration 17
Train o

 - 6s - loss: 0.0140 - acc: 0.9977 - val_loss: 0.0183 - val_acc: 0.9952
Q 865-177 T 688  ☑ 688
Q 664-48  T 616  ☑ 616
Q 968-63  T 905  ☑ 905
Q 55-23   T 32   ☑ 32
Q 339-95  T 244  ☑ 244
Q 751-126 T 625  ☑ 625
Q 228-22  T 206  ☑ 206
Q 608-482 T 126  ☑ 126
Q 118-25  T 93   ☑ 93
Q 881-10  T 871  ☑ 871

--------------------------------------------------
Iteration 32
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 6s - loss: 0.0298 - acc: 0.9921 - val_loss: 0.0305 - val_acc: 0.9911
Q 309-82  T 227  ☑ 227
Q 488-79  T 409  ☑ 409
Q 958-50  T 908  ☑ 908
Q 278-184 T 94   ☑ 94
Q 321-75  T 246  ☑ 246
Q 582-81  T 501  ☑ 501
Q 731-9   T 722  ☑ 722
Q 616-78  T 538  ☑ 538
Q 974-755 T 219  ☑ 219
Q 240-24  T 216  ☑ 216

--------------------------------------------------
Iteration 33
Train o

 - 6s - loss: 0.0042 - acc: 0.9997 - val_loss: 0.0064 - val_acc: 0.9989
Q 581-31  T 550  ☑ 550
Q 950-737 T 213  ☑ 213
Q 638-16  T 622  ☑ 622
Q 866-702 T 164  ☑ 164
Q 538-96  T 442  ☑ 442
Q 50-25   T 25   ☑ 25
Q 339-95  T 244  ☑ 244
Q 91-7    T 84   ☑ 84
Q 961-740 T 221  ☑ 221
Q 319-34  T 285  ☑ 285

--------------------------------------------------
Iteration 48
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 6s - loss: 0.0371 - acc: 0.9893 - val_loss: 0.0139 - val_acc: 0.9964
Q 92-62   T 30   ☑ 30
Q 351-19  T 332  ☑ 332
Q 949-853 T 96   ☑ 96
Q 844-8   T 836  ☑ 836
Q 681-32  T 649  ☑ 649
Q 19-13   T 6    ☒ 7
Q 796-6   T 790  ☑ 790
Q 925-6   T 919  ☑ 919
Q 941-600 T 341  ☑ 341
Q 816-70  T 746  ☑ 746

--------------------------------------------------
Iteration 49
Train o

#### 2.2.4 Analysis

- Accuracy reaches 99% at epoch 22
- Three-digit subtraction takes longer to train than addition, but it is still a relatively easy problem

---

### <a name="combine"></a>2.3 Combine

#### 2.3.1 Parameters

In [12]:
TRAINING_SIZE = 50000
DIGITS = 3
INPUT_MAXLEN = DIGITS + 1 + DIGITS + 1 + DIGITS
OUTPUT_MAXLEN = DIGITS + 1
REVERSE = True
    
chars = '0123456789+- '
ctable = CharacterTable(chars)
    
HIDDEN_SIZE = 128
LAYERS = 1
    
ITERATION = 100
BATCH_SIZE = 128

#### 2.3.2 Generate data

In [13]:
def gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars):
    questions = []
    expected = []
    seen = set()
    
    while len(questions) < TRAINING_SIZE:
        fn = lambda: int(''.join(np.random.choice(list('0123456789')) for _ in range(np.random.randint(1, DIGITS + 1))))
        fs = lambda: ''.join(np.random.choice(list('+-')))
        
        a, b, c = fn(), fn(), fn()
        sign1, sign2 = fs(), fs()
        
        key = tuple(sorted((a, b, c)))
        ans = a + b if sign1 == '+' else a - b
        ans = ans + c if sign2 == '+' else ans - c
        if key in seen or ans < 0:
            continue
        seen.add(key)

        q = '{}{}{}{}{}'.format(a, sign1, b, sign2, c)
        query = q + ' ' * (INPUT_MAXLEN - len(q))
        
        ans = str(ans)
        ans += ' ' * (OUTPUT_MAXLEN - len(ans))
        
        if REVERSE:
            query = query[::-1]
        
        questions.append(query)
        expected.append(ans)
    print('Total questions:', len(questions))
    
    return vectorize(ctable, questions, expected, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
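
One consequence of the deduplication key in this version of `gen_data` may be worth noting: `tuple(sorted((a, b, c)))` ignores both the operand order and the two signs, so different expressions over the same three numbers count as duplicates and only the first one drawn is kept. The sketch below demonstrates this; `expression_key` is a hypothetical alternative shown for comparison and is not in the original notebook.

```python
def sorted_key(a, b, c):
    # Key used in gen_data above: ignores operand order and both signs.
    return tuple(sorted((a, b, c)))

def expression_key(a, sign1, b, sign2, c):
    # Hypothetical alternative (not in the original): keeps order and signs.
    return (a, sign1, b, sign2, c)

first = (1, '+', 2, '+', 3)     # "1+2+3" evaluates to 6
second = (3, '-', 2, '+', 1)    # "3-2+1" evaluates to 2

same_sorted_key = sorted_key(1, 2, 3) == sorted_key(3, 2, 1)
same_expression_key = expression_key(*first) == expression_key(*second)
print(same_sorted_key, same_expression_key)  # True False
```

Whether this restriction helps or hurts training is not something the logs here can settle; it does mean each triple of operands appears with only one sign pattern and ordering across the training and validation sets.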

#### 2.3.3 Training

In [37]:
x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS,REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE)

Total questions: 50000
Training Data:
(45000, 11, 13)
(45000, 4, 13)
Validation Data:
(5000, 11, 13)
(5000, 4, 13)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_7 (LSTM)                (None, 128)               72704     
_________________________________________________________________
repeat_vector_4 (RepeatVecto (None, 4, 128)            0         
_________________________________________________________________
lstm_8 (LSTM)                (None, 4, 128)            131584    
_________________________________________________________________
time_distributed_4 (TimeDist (None, 4, 13)             1677      
Total params: 205,965
Trainable params: 205,965
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 9s - loss: 1.8457 - acc: 0.

 - 8s - loss: 1.0142 - acc: 0.6241 - val_loss: 1.0238 - val_acc: 0.6123
Q 792-51+115  T 856  ☑ 856
Q 3+163+8     T 174  ☒ 179
Q 74+7-5      T 76   ☒ 78
Q 196+48+8    T 252  ☒ 266
Q 57+555-1    T 611  ☒ 628
Q 463-1+21    T 483  ☒ 498
Q 57+206-0    T 263  ☒ 268
Q 481+9+53    T 543  ☒ 548
Q 867+87+7    T 961  ☒ 978
Q 54-20+128   T 162  ☒ 153

--------------------------------------------------
Iteration 15
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 0.9878 - acc: 0.6351 - val_loss: 1.0044 - val_acc: 0.6188
Q 9+163-1     T 171  ☒ 179
Q 8+559-1     T 566  ☒ 563
Q 65-28+708   T 745  ☒ 750
Q 738+3+895   T 1636 ☒ 1650
Q 599+388+217 T 1204 ☒ 1298
Q 3+69+6      T 78   ☒ 72
Q 526+1-2     T 525  ☒ 528
Q 2+85+832    T 919  ☒ 926
Q 1+51+166    T 218  ☒ 201
Q 82+7+493    T 582  ☒

 - 8s - loss: 0.8049 - acc: 0.7023 - val_loss: 0.8494 - val_acc: 0.6796
Q 9+78-27     T 60   ☒ 64
Q 3-98+608    T 513  ☒ 523
Q 910+862-55  T 1717 ☒ 1728
Q 8-16+98     T 90   ☒ 99
Q 4+392+82    T 478  ☒ 573
Q 4-78+737    T 663  ☒ 660
Q 391+97-6    T 482  ☒ 483
Q 293+234-3   T 524  ☒ 515
Q 712+659+521 T 1892 ☒ 1817
Q 182+935+52  T 1169 ☒ 1161

--------------------------------------------------
Iteration 30
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 0.7964 - acc: 0.7050 - val_loss: 0.8448 - val_acc: 0.6814
Q 862-3+135   T 994  ☒ 900
Q 903-62+14   T 855  ☒ 852
Q 774+7-286   T 495  ☒ 410
Q 57-5-40     T 12   ☒ 13
Q 14-4+0      T 10   ☒ 12
Q 445+3+3     T 451  ☒ 453
Q 33+74-92    T 15   ☒ 1
Q 31-72+179   T 138  ☒ 133
Q 98+865-97   T 866  ☒ 863
Q 908-95-638  T 175  ☒

 - 8s - loss: 0.7013 - acc: 0.7429 - val_loss: 0.8021 - val_acc: 0.6976
Q 482-299-10  T 173  ☒ 106
Q 5+0+581     T 586  ☒ 584
Q 8+51+530    T 589  ☒ 586
Q 8+505-45    T 468  ☒ 471
Q 78-70+79    T 87   ☒ 81
Q 582+477-682 T 377  ☒ 311
Q 298+92+2    T 392  ☒ 387
Q 39-5+19     T 53   ☒ 51
Q 614-47-355  T 212  ☒ 291
Q 3+607-152   T 458  ☒ 430

--------------------------------------------------
Iteration 45
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 0.6999 - acc: 0.7424 - val_loss: 0.8042 - val_acc: 0.6969
Q 0+299+0     T 299  ☒ 300
Q 43+954-39   T 958  ☒ 955
Q 44+103+848  T 995  ☒ 990
Q 2+288-48    T 242  ☒ 239
Q 877+87-61   T 903  ☒ 899
Q 1-70+247    T 178  ☑ 178
Q 3+2+12      T 17   ☑ 17
Q 862-3+135   T 994  ☒ 990
Q 188-3-90    T 95   ☒ 90
Q 425+77+1    T 503  ☒

 - 8s - loss: 0.6284 - acc: 0.7717 - val_loss: 0.7791 - val_acc: 0.7105
Q 269-7+40    T 302  ☒ 300
Q 958-65+419  T 1312 ☒ 1315
Q 3+53+3      T 59   ☒ 50
Q 820-30+96   T 886  ☒ 889
Q 83+872+6    T 961  ☒ 964
Q 315+8+9     T 332  ☒ 331
Q 25+48-69    T 4    ☑ 4
Q 260-1+834   T 1093 ☒ 1099
Q 3+3+76      T 82   ☑ 82
Q 7+59+0      T 66   ☑ 66

--------------------------------------------------
Iteration 60
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 0.6246 - acc: 0.7722 - val_loss: 0.7689 - val_acc: 0.7135
Q 8-2+100     T 106  ☒ 104
Q 297-3+3     T 297  ☑ 297
Q 92+693-381  T 404  ☒ 484
Q 42-2+75     T 115  ☒ 116
Q 658+238+364 T 1260 ☒ 1250
Q 495-245+49  T 299  ☒ 250
Q 70+697-92   T 675  ☒ 670
Q 30+197+7    T 234  ☒ 236
Q 14-4+0      T 10   ☑ 10
Q 737-52-1    T 684  ☒

 - 8s - loss: 0.5662 - acc: 0.7948 - val_loss: 0.7874 - val_acc: 0.7176
Q 438+985-634 T 789  ☒ 742
Q 65-28+708   T 745  ☒ 744
Q 3+493-29    T 467  ☒ 470
Q 378+5+16    T 399  ☒ 400
Q 625+683+21  T 1329 ☒ 1326
Q 358+8-70    T 296  ☒ 299
Q 15+933+6    T 954  ☒ 951
Q 61-0+15     T 76   ☒ 78
Q 9+95+14     T 118  ☒ 121
Q 7-5+73      T 75   ☑ 75

--------------------------------------------------
Iteration 75
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 0.5634 - acc: 0.7957 - val_loss: 0.7788 - val_acc: 0.7198
Q 1+88+26     T 115  ☒ 114
Q 1-61+373    T 313  ☒ 315
Q 931+794+941 T 2666 ☒ 2456
Q 0+318+276   T 594  ☒ 682
Q 423-70-8    T 345  ☒ 349
Q 39+92-40    T 91   ☒ 93
Q 64+64-35    T 93   ☒ 96
Q 73+84+9     T 166  ☑ 166
Q 104-0+1     T 105  ☑ 105
Q 46+816+6    T 868  ☒

 - 8s - loss: 0.5205 - acc: 0.8117 - val_loss: 0.8229 - val_acc: 0.7141
Q 134-1-6     T 127  ☒ 125
Q 5-75+336    T 266  ☒ 264
Q 835+71+568  T 1474 ☒ 1472
Q 1+950+27    T 978  ☒ 979
Q 539+84-8    T 615  ☒ 610
Q 8-7+2       T 3    ☑ 3
Q 35-12+9     T 32   ☒ 39
Q 835+88+2    T 925  ☒ 920
Q 198-42+16   T 172  ☒ 173
Q 58+111-6    T 163  ☒ 160

--------------------------------------------------
Iteration 90
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 0.5038 - acc: 0.8183 - val_loss: 0.8165 - val_acc: 0.7188
Q 85+618-1    T 702  ☒ 704
Q 483+801-30  T 1254 ☒ 1260
Q 60+4+711    T 775  ☒ 770
Q 53-1+77     T 129  ☒ 120
Q 547-20+4    T 531  ☑ 531
Q 7+540+851   T 1398 ☒ 1309
Q 94-1+9      T 102  ☑ 102
Q 533+33-6    T 560  ☒ 561
Q 7+93+0      T 100  ☑ 100
Q 71+15+52    T 138  ☒

#### 2.3.4 Analysis

- Accuracy peaks at only about 72%
- Compared with adding or subtracting two numbers, this problem is harder and needs more data and more epochs

#### 2.3.5 Testing with more data and epochs, with analysis

In [14]:
TRAINING_SIZE = 100000
ITERATION = 200

x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS,REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE, test=False)
testing(model, ctable, x_val, y_val, REVERSE)

Total questions: 100000
Training Data:
(90000, 11, 13)
(90000, 4, 13)
Validation Data:
(10000, 11, 13)
(10000, 4, 13)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 128)               72704     
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 4, 128)            0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 4, 128)            131584    
_________________________________________________________________
time_distributed_1 (TimeDist (None, 4, 13)             1677      
Total params: 205,965
Trainable params: 205,965
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 18s - loss: 1.8219 - ac

 - 16s - loss: 0.6525 - acc: 0.7533 - val_loss: 0.7136 - val_acc: 0.7260

--------------------------------------------------
Iteration 38
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.6496 - acc: 0.7549 - val_loss: 0.7113 - val_acc: 0.7277

--------------------------------------------------
Iteration 39
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.6389 - acc: 0.7591 - val_loss: 0.6984 - val_acc: 0.7302

--------------------------------------------------
Iteration 40
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.6358 - acc: 0.7599 - val_loss: 0.6796 - val_acc: 0.7368

--------------------------------------------------
Iteration 41
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.6289 - acc: 0.7623 - val_loss: 0.6972 - val_acc: 0.7342

--------------------------------------------------
Iteration 42
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - l

 - 16s - loss: 0.4882 - acc: 0.8174 - val_loss: 0.6235 - val_acc: 0.7686

--------------------------------------------------
Iteration 80
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4963 - acc: 0.8141 - val_loss: 0.6446 - val_acc: 0.7590

--------------------------------------------------
Iteration 81
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4912 - acc: 0.8160 - val_loss: 0.6413 - val_acc: 0.7612

--------------------------------------------------
Iteration 82
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4882 - acc: 0.8171 - val_loss: 0.6293 - val_acc: 0.7676

--------------------------------------------------
Iteration 83
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4849 - acc: 0.8187 - val_loss: 0.6185 - val_acc: 0.7717

--------------------------------------------------
Iteration 84
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - l


--------------------------------------------------
Iteration 121
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4065 - acc: 0.8487 - val_loss: 0.6898 - val_acc: 0.7666

--------------------------------------------------
Iteration 122
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4061 - acc: 0.8493 - val_loss: 0.6918 - val_acc: 0.7617

--------------------------------------------------
Iteration 123
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4105 - acc: 0.8469 - val_loss: 0.6393 - val_acc: 0.7798

--------------------------------------------------
Iteration 124
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.3993 - acc: 0.8517 - val_loss: 0.6438 - val_acc: 0.7773

--------------------------------------------------
Iteration 125
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.4034 - acc: 0.8488 - val_loss: 0.6541 - val_acc: 0.7755

----

 - 16s - loss: 0.3528 - acc: 0.8689 - val_loss: 0.7011 - val_acc: 0.7776

--------------------------------------------------
Iteration 163
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16s - loss: 0.3476 - acc: 0.8708 - val_loss: 0.6736 - val_acc: 0.7871

--------------------------------------------------
Iteration 164
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 17s - loss: 0.3470 - acc: 0.8712 - val_loss: 0.6776 - val_acc: 0.7869

--------------------------------------------------
Iteration 165
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 46s - loss: 0.3399 - acc: 0.8746 - val_loss: 0.7449 - val_acc: 0.7688

--------------------------------------------------
Iteration 166
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 34s - loss: 0.3477 - acc: 0.8705 - val_loss: 0.6960 - val_acc: 0.7798

--------------------------------------------------
Iteration 167
Train on 90000 samples, validate on 10000 samples
Epoch 1/1
 - 16

- Increasing the amount of training data and the number of epochs improves accuracy, from 72% to 79%.
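To read the trend off a long log rather than by eye, the per-epoch summary lines can be parsed and the best validation accuracy picked out. The following is a minimal sketch, not part of the original notebook: `LOG_LINE`, `parse_log` and `best_epoch` are hypothetical names, and the sample lines are copied from iterations 163 to 166 of the log above.

```python
import re

# Matches a complete per-epoch summary line such as
# " - 16s - loss: 0.3476 - acc: 0.8708 - val_loss: 0.6736 - val_acc: 0.7871".
# Truncated lines (e.g. " - 16s - l") simply do not match.
LOG_LINE = re.compile(
    r"loss: (?P<loss>[\d.]+) - acc: (?P<acc>[\d.]+) - "
    r"val_loss: (?P<val_loss>[\d.]+) - val_acc: (?P<val_acc>[\d.]+)"
)

def parse_log(text):
    """Return one dict of floats per complete epoch line found in `text`."""
    return [
        {name: float(value) for name, value in m.groupdict().items()}
        for m in LOG_LINE.finditer(text)
    ]

def best_epoch(rows):
    """Index (0-based, within `rows`) of the row with the highest val_acc."""
    return max(range(len(rows)), key=lambda i: rows[i]["val_acc"])

sample = """
 - 16s - loss: 0.3476 - acc: 0.8708 - val_loss: 0.6736 - val_acc: 0.7871
 - 17s - loss: 0.3470 - acc: 0.8712 - val_loss: 0.6776 - val_acc: 0.7869
 - 46s - loss: 0.3399 - acc: 0.8746 - val_loss: 0.7449 - val_acc: 0.7688
 - 34s - loss: 0.3477 - acc: 0.8705 - val_loss: 0.6960 - val_acc: 0.7798
"""

rows = parse_log(sample)
i = best_epoch(rows)
print(i, rows[i]["val_acc"])
```

Comparing `loss` with `val_loss` in the parsed rows also makes it easy to see whether the training and validation curves are drifting apart.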

---

### <a name="multiplicator"></a>2.4 Multiplicator

#### 2.4.1 Parameters

In [8]:
TRAINING_SIZE = 50000
DIGITS = 3
INPUT_MAXLEN = DIGITS + 1 + DIGITS
OUTPUT_MAXLEN = DIGITS + DIGITS
REVERSE = True
    
chars = '0123456789* '
ctable = CharacterTable(chars)
    
HIDDEN_SIZE = 128
LAYERS = 1
    
ITERATION = 100
BATCH_SIZE = 128
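These hyperparameters fix the size of the network. As a rough check, the parameter counts listed in the model summary of 2.4.3 can be reproduced by hand from `HIDDEN_SIZE` and the alphabet size. The sketch below assumes that `build_model` (defined earlier) stacks an LSTM encoder, a `RepeatVector`, one LSTM decoder layer and a time-distributed dense output, as that summary suggests; `lstm_params` and `dense_params` are hypothetical helper names.

```python
def lstm_params(input_dim, units):
    # An LSTM has four gates; each has an input kernel, a recurrent kernel and a bias.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus bias vector.
    return input_dim * units + units

HIDDEN_SIZE = 128
n_chars = len(set('0123456789* '))  # 12 symbols in the multiplication alphabet

encoder = lstm_params(n_chars, HIDDEN_SIZE)      # reads the one-hot question
decoder = lstm_params(HIDDEN_SIZE, HIDDEN_SIZE)  # reads the repeated encoder state
output = dense_params(HIDDEN_SIZE, n_chars)      # per-position character scores

print(encoder, decoder, output, encoder + decoder + output)
# 72192 131584 1548 205324
```

Only the encoder and the output layer depend on the alphabet, so switching to a smaller character set changes the total very little.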

#### 2.4.2 Generate data

In [18]:
def gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars):
    questions = []
    expected = []
    seen = set()
    
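    while len(questions) < TRAINING_SIZE:
        # Draw an operand with 1 to DIGITS random decimal digits.
        fn = lambda: int(''.join(np.random.choice(list('0123456789')) for _ in range(np.random.randint(1, DIGITS + 1))))
        a, b = fn(), fn()
        
        # Multiplication is commutative, so skip a pair already seen in either order.
        key = tuple(sorted((a, b)))
        if key in seen:
            continue
        seen.add(key)
        
        # Pad the question with spaces up to INPUT_MAXLEN.
        q = '{}*{}'.format(a, b)
        query = q + ' ' * (INPUT_MAXLEN - len(q))
        
        # Pad the answer with spaces up to OUTPUT_MAXLEN.
        ans = str(a * b)
        ans += ' ' * (OUTPUT_MAXLEN - len(ans))
        
        # Optionally reverse the padded question before encoding.
        if REVERSE:
            query = query[::-1]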
        
        questions.append(query)
        expected.append(ans)
    print('Total questions:', len(questions))
    
    return vectorize(ctable, questions, expected, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
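To make the padding and reversal concrete, here is a small sketch that formats a single pair the same way `gen_data` does. `format_pair` is a hypothetical helper introduced only for illustration; the pair 7 and 704 is one of the sample questions that appears in the training log.

```python
DIGITS = 3
INPUT_MAXLEN = DIGITS + 1 + DIGITS   # two operands plus the '*' sign
OUTPUT_MAXLEN = DIGITS + DIGITS      # room for the widest possible product

def format_pair(a, b, reverse=True):
    """Hypothetical helper: build one padded (query, answer) pair of strings."""
    q = '{}*{}'.format(a, b)
    query = q + ' ' * (INPUT_MAXLEN - len(q))
    ans = str(a * b)
    ans += ' ' * (OUTPUT_MAXLEN - len(ans))
    if reverse:
        query = query[::-1]
    return query, ans

query, ans = format_pair(7, 704)
print(repr(query), repr(ans))   # '  407*7' '4928  '

# The largest product of two 3-digit operands still fits in OUTPUT_MAXLEN characters.
print(len(str(999 * 999)))      # 6
```

Only the question is reversed; the answer keeps its natural digit order and is padded on the right.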

#### 2.4.3 Training

In [17]:
x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE)

Total questions: 50000
Training Data:
(45000, 7, 12)
(45000, 6, 12)
Validation Data:
(5000, 7, 12)
(5000, 6, 12)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_3 (LSTM)                (None, 128)               72192     
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 6, 128)            0         
_________________________________________________________________
lstm_4 (LSTM)                (None, 6, 128)            131584    
_________________________________________________________________
time_distributed_2 (TimeDist (None, 6, 12)             1548      
Total params: 205,324
Trainable params: 205,324
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 8s - loss: 1.8556 - acc: 0.31

 - 7s - loss: 1.1069 - acc: 0.5844 - val_loss: 1.0850 - val_acc: 0.5941
Q 568*456 T 259008 [91m☒[0m 251568
Q 7*704   T 4928   [91m☒[0m 4248  
Q 82*253  T 20746  [91m☒[0m 21154 
Q 71*442  T 31382  [91m☒[0m 26662 
Q 332*135 T 44820  [91m☒[0m 55660 
Q 697*39  T 27183  [91m☒[0m 26783 
Q 5*476   T 2380   [91m☒[0m 2270  
Q 1*888   T 888    [92m☑[0m 888   
Q 996*735 T 732060 [91m☒[0m 756630
Q 148*142 T 21016  [91m☒[0m 27774 

--------------------------------------------------
Iteration 15
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 1.0459 - acc: 0.6120 - val_loss: 1.0221 - val_acc: 0.6215
Q 6*536   T 3216   [91m☒[0m 3206  
Q 8*294   T 2352   [91m☒[0m 2372  
Q 84*3    T 252    [91m☒[0m 222   
Q 906*374 T 338844 [91m☒[0m 334408
Q 0*956   T 0      [92m☑[0m 0     
Q 688*653 T 449264 [91m☒[0m 422424
Q 90*857  T 77130  [91m☒[0m 72230 
Q 412*868 T 357616 [91m☒[0m 317992
Q 709*822 T 582798 [91m☒[0m 512288
Q 61*939  T 57279  [91m☒[0m

 - 7s - loss: 0.7719 - acc: 0.7023 - val_loss: 0.8037 - val_acc: 0.6794
Q 123*64  T 7872   [91m☒[0m 7212  
Q 40*696  T 27840  [91m☒[0m 27640 
Q 477*4   T 1908   [91m☒[0m 1888  
Q 2*618   T 1236   [92m☑[0m 1236  
Q 966*102 T 98532  [91m☒[0m 90952 
Q 74*564  T 41736  [91m☒[0m 44616 
Q 263*923 T 242749 [91m☒[0m 236319
Q 956*459 T 438804 [91m☒[0m 412004
Q 27*200  T 5400   [92m☑[0m 5400  
Q 143*78  T 11154  [91m☒[0m 11474 

--------------------------------------------------
Iteration 30
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 0.7619 - acc: 0.7065 - val_loss: 0.7974 - val_acc: 0.6838
Q 992*0   T 0      [92m☑[0m 0     
Q 179*21  T 3759   [91m☒[0m 3359  
Q 65*156  T 10140  [91m☒[0m 10050 
Q 573*483 T 276759 [91m☒[0m 263119
Q 30*269  T 8070   [91m☒[0m 8670  
Q 434*273 T 118482 [91m☒[0m 116522
Q 942*8   T 7536   [91m☒[0m 7596  
Q 283*738 T 208854 [91m☒[0m 203114
Q 66*166  T 10956  [91m☒[0m 11536 
Q 706*464 T 327584 [91m☒[0m

 - 7s - loss: 0.6392 - acc: 0.7544 - val_loss: 0.7246 - val_acc: 0.7109
Q 862*5   T 4310   [91m☒[0m 4250  
Q 319*854 T 272426 [91m☒[0m 266666
Q 3*214   T 642    [91m☒[0m 622   
Q 21*82   T 1722   [91m☒[0m 1702  
Q 60*404  T 24240  [92m☑[0m 24240 
Q 468*501 T 234468 [91m☒[0m 238548
Q 29*990  T 28710  [92m☑[0m 28710 
Q 575*328 T 188600 [91m☒[0m 197300
Q 319*70  T 22330  [91m☒[0m 21531 
Q 7*22    T 154    [92m☑[0m 154   

--------------------------------------------------
Iteration 45
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 0.6385 - acc: 0.7543 - val_loss: 0.7020 - val_acc: 0.7172
Q 345*41  T 14145  [91m☒[0m 13745 
Q 369*806 T 297414 [91m☒[0m 297234
Q 57*535  T 30495  [91m☒[0m 30205 
Q 373*350 T 130550 [91m☒[0m 131050
Q 361*18  T 6498   [91m☒[0m 6038  
Q 4*814   T 3256   [91m☒[0m 3296  
Q 205*94  T 19270  [91m☒[0m 19770 
Q 370*70  T 25900  [91m☒[0m 27900 
Q 757*560 T 423920 [91m☒[0m 422520
Q 297*908 T 269676 [91m☒[0m

 - 7s - loss: 0.5680 - acc: 0.7839 - val_loss: 0.6797 - val_acc: 0.7353
Q 4*861   T 3444   [91m☒[0m 3484  
Q 97*53   T 5141   [91m☒[0m 5021  
Q 159*977 T 155343 [91m☒[0m 156543
Q 66*597  T 39402  [91m☒[0m 40302 
Q 75*594  T 44550  [91m☒[0m 47050 
Q 801*915 T 732915 [91m☒[0m 738215
Q 461*477 T 219897 [91m☒[0m 221397
Q 870*729 T 634230 [91m☒[0m 645630
Q 71*44   T 3124   [91m☒[0m 3164  
Q 222*901 T 200022 [91m☒[0m 209922

--------------------------------------------------
Iteration 60
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 0.5642 - acc: 0.7859 - val_loss: 0.6679 - val_acc: 0.7387
Q 83*43   T 3569   [91m☒[0m 3529  
Q 573*483 T 276759 [91m☒[0m 278199
Q 915*68  T 62220  [91m☒[0m 61920 
Q 69*292  T 20148  [91m☒[0m 20688 
Q 783*446 T 349218 [91m☒[0m 343858
Q 910*356 T 323960 [91m☒[0m 316760
Q 799*929 T 742271 [91m☒[0m 744551
Q 126*596 T 75096  [91m☒[0m 79896 
Q 889*496 T 440944 [91m☒[0m 424884
Q 866*46  T 39836  [91m☒[0m

 - 7s - loss: 0.5140 - acc: 0.8077 - val_loss: 0.6646 - val_acc: 0.7453
Q 209*2   T 418    [92m☑[0m 418   
Q 27*152  T 4104   [91m☒[0m 4864  
Q 995*61  T 60695  [92m☑[0m 60695 
Q 83*740  T 61420  [91m☒[0m 61020 
Q 296*531 T 157176 [91m☒[0m 153436
Q 339*22  T 7458   [91m☒[0m 7018  
Q 211*594 T 125334 [91m☒[0m 122754
Q 5*299   T 1495   [92m☑[0m 1495  
Q 38*76   T 2888   [91m☒[0m 2848  
Q 831*218 T 181158 [91m☒[0m 171558

--------------------------------------------------
Iteration 75
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 0.5032 - acc: 0.8130 - val_loss: 0.6592 - val_acc: 0.7490
Q 68*362  T 24616  [91m☒[0m 24696 
Q 93*335  T 31155  [91m☒[0m 30255 
Q 5*436   T 2180   [92m☑[0m 2180  
Q 8*412   T 3296   [92m☑[0m 3296  
Q 52*898  T 46696  [91m☒[0m 46456 
Q 114*5   T 570    [91m☒[0m 530   
Q 886*77  T 68222  [91m☒[0m 68882 
Q 10*399  T 3990   [92m☑[0m 3990  
Q 7*253   T 1771   [92m☑[0m 1771  
Q 75*864  T 64800  [91m☒[0m

 - 7s - loss: 0.4682 - acc: 0.8284 - val_loss: 0.6824 - val_acc: 0.7488
Q 860*839 T 721540 [91m☒[0m 721740
Q 895*870 T 778650 [92m☑[0m 778650
Q 890*462 T 411180 [91m☒[0m 402180
Q 69*285  T 19665  [91m☒[0m 19765 
Q 331*9   T 2979   [92m☑[0m 2979  
Q 4*293   T 1172   [91m☒[0m 1132  
Q 8*535   T 4280   [91m☒[0m 4480  
Q 0*927   T 0      [92m☑[0m 0     
Q 887*27  T 23949  [91m☒[0m 23969 
Q 99*548  T 54252  [91m☒[0m 55152 

--------------------------------------------------
Iteration 90
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 7s - loss: 0.4636 - acc: 0.8304 - val_loss: 0.6978 - val_acc: 0.7446
Q 79*467  T 36893  [91m☒[0m 38853 
Q 39*874  T 34086  [91m☒[0m 34166 
Q 180*14  T 2520   [91m☒[0m 2420  
Q 358*27  T 9666   [91m☒[0m 9406  
Q 521*480 T 250080 [91m☒[0m 258880
Q 943*99  T 93357  [91m☒[0m 92557 
Q 442*16  T 7072   [91m☒[0m 6692  
Q 566*75  T 42450  [91m☒[0m 42250 
Q 88*573  T 50424  [91m☒[0m 50944 
Q 160*35  T 5600   [92m☑[0m

#### 2.4.4 Analysis

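- Accuracy peaks at only about 75%.
- The model first learns to multiply any number by 0 and by 1.
- Accuracy is also relatively high when multiplying by 5.
- By the middle of training it can solve some cases of any number times a single digit, or times a single digit followed by zeros (e.g. *30, *200).
- Late in training it can solve some cases of any number times any number, but overall accuracy remains low.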
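The sample rows in the log show far more ☒ than ☑ even when `val_acc` is near 75%. A likely explanation, stated here as an assumption since `build_model` is not shown in this section, is that the reported `acc` is averaged over output positions (padding included) rather than over whole answers. The sketch below contrasts the two measures on ten (target, prediction) pairs copied from one block of sample rows in the log above; the function names are hypothetical.

```python
OUTPUT_MAXLEN = 6

# (target, prediction) pairs copied from one block of sample rows in the training log.
samples = [
    ('721540', '721740'), ('778650', '778650'), ('411180', '402180'),
    ('19665', '19765'), ('2979', '2979'), ('1172', '1132'),
    ('4280', '4480'), ('0', '0'), ('23949', '23969'), ('54252', '55152'),
]

def pad(s, width=OUTPUT_MAXLEN):
    # Answers are right-padded with spaces, as in gen_data.
    return s.ljust(width)

def per_character_accuracy(pairs):
    """Fraction of output positions (padding included) that match."""
    hits = sum(t == p
               for target, pred in pairs
               for t, p in zip(pad(target), pad(pred)))
    return hits / (len(pairs) * OUTPUT_MAXLEN)

def exact_match_accuracy(pairs):
    """Fraction of answers that are correct in every position."""
    return sum(target == pred for target, pred in pairs) / len(pairs)

print(per_character_accuracy(samples))  # 0.85
print(exact_match_accuracy(samples))    # 0.3
```

On this small sample the per-position score is 0.85 while only 3 of 10 answers are fully correct, which matches the impression given by the ☒ marks.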

#### 2.4.5 Test binary multiplicator and Analysis

In [9]:
DIGITS = 10
INPUT_MAXLEN = DIGITS + 1 + DIGITS
OUTPUT_MAXLEN = DIGITS + DIGITS
REVERSE = True
    
chars = '01* '
ctable = CharacterTable(chars)

def gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars):
    questions = []
    expected = []
    seen = set()
    
    while len(questions) < TRAINING_SIZE:
        # Draw a bit string of 1 to DIGITS random bits and read it as an integer.
        fn = lambda: ''.join(np.random.choice(list('01')) for _ in range(np.random.randint(1, DIGITS + 1)))
        a, b = int(fn(), 2), int(fn(), 2)
        
        # Multiplication is commutative, so skip a pair already seen in either order.
        key = tuple(sorted((a, b)))
        if key in seen:
            continue
        seen.add(key)
        
        # bin() adds a '0b' prefix, which [2:] strips.
        q = '{}*{}'.format(bin(a)[2:], bin(b)[2:])
        query = q + ' ' * (INPUT_MAXLEN - len(q))
        
        ans = bin(a * b)[2:]
        ans += ' ' * (OUTPUT_MAXLEN - len(ans))
        
        if REVERSE:
            query = query[::-1]
        
        questions.append(query)
        expected.append(ans)
    print('Total questions:', len(questions))
    
    return vectorize(ctable, questions, expected, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)

x_train, y_train, x_val, y_val = gen_data(ctable, TRAINING_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, DIGITS, REVERSE, chars)
model = build_model(LAYERS, HIDDEN_SIZE, INPUT_MAXLEN, OUTPUT_MAXLEN, chars)
training(model, ctable, BATCH_SIZE, ITERATION, x_train, y_train, x_val, y_val, REVERSE)

Total questions: 50000
Training Data:
(45000, 21, 4)
(45000, 20, 4)
Validation Data:
(5000, 21, 4)
(5000, 20, 4)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
lstm_1 (LSTM)                (None, 128)               68096     
_________________________________________________________________
repeat_vector_1 (RepeatVecto (None, 20, 128)           0         
_________________________________________________________________
lstm_2 (LSTM)                (None, 20, 128)           131584    
_________________________________________________________________
time_distributed_1 (TimeDist (None, 20, 4)             516       
Total params: 200,196
Trainable params: 200,196
Non-trainable params: 0
_________________________________________________________________

--------------------------------------------------
Iteration 1
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 21s - loss: 0.5700 - acc: 0.6

 - 18s - loss: 0.3788 - acc: 0.7519 - val_loss: 0.3661 - val_acc: 0.7552
Q 1010010100*1100001    T 1111101000010100     [91m☒[0m 1000111000000100    
Q 11010101*1110111001   T 110001100011101101   [91m☒[0m 100011100000000001  
Q 1001110*11            T 11101010             [91m☒[0m 101110100           
Q 1011000*1100011110    T 10001001001010000    [91m☒[0m 10011111000110000   
Q 10111*100100          T 1100111100           [91m☒[0m 1010000100          
Q 1*1011000111          T 1011000111           [91m☒[0m 1001100001          
Q 1000100*100001        T 100011000100         [91m☒[0m 100000000100        
Q 10101*11010011        T 1000101001111        [91m☒[0m 1000100000011       
Q 0*1101001             T 0                    [92m☑[0m 0                   
Q 1000110*11000110      T 11011000100100       [91m☒[0m 10001000000100      

--------------------------------------------------
Iteration 10
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 18s - loss

Q 1000001*11010010      T 11010101010010       [91m☒[0m 11100000000010      

--------------------------------------------------
Iteration 18
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 18s - loss: 0.3368 - acc: 0.7873 - val_loss: 0.3207 - val_acc: 0.7966
Q 100000011*10001       T 1000100110011        [91m☒[0m 1000000000001       
Q 10111*11111101        T 1011010111011        [91m☒[0m 1010000000001       
Q 10011101*1011         T 11010111111          [91m☒[0m 11000000111         
Q 111001111*1001        T 1000001000111        [91m☒[0m 1000000000111       
Q 1111011010*1011001    T 10101011011001010    [91m☒[0m 10101000000011110   
Q 10000000*101          T 1010000000           [91m☒[0m 1000000000          
Q 11110001*101110       T 10101101001110       [91m☒[0m 10100000001110      
Q 101101*10000101       T 1011101100001        [91m☒[0m 1011011111111       
Q 1100101*1100100000    T 10011101110100000    [91m☒[0m 10000000001100000   
Q 11*11001111

Q 10011100*101000       T 1100001100000        [91m☒[0m 1011101100000       

--------------------------------------------------
Iteration 27
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 20s - loss: 0.2509 - acc: 0.8435 - val_loss: 0.2319 - val_acc: 0.8532
Q 10100011*10010        T 101101110110         [91m☒[0m 101100000110        
Q 10101100*1010000001   T 11010111010101100    [91m☒[0m 11010001000001100   
Q 1111011100*111010110  T 1110001010111101000  [91m☒[0m 1110101110000001000 
Q 111001*110100001      T 101110011011001      [91m☒[0m 101111110001001     
Q 10*1110110100         T 11101101000          [91m☒[0m 11101001000         
Q 11*1100110            T 100110010            [91m☒[0m 100100010           
Q 110001111*11001       T 10011011110111       [91m☒[0m 10011111000111      
Q 111010*1101111        T 1100100100110        [91m☒[0m 1100000000110       
Q 10101*10101000        T 110111001000         [91m☒[0m 110100001000        
Q 1100101*100

Q 1*1011110100          T 1011110100           [92m☑[0m 1011110100          

--------------------------------------------------
Iteration 36
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 20s - loss: 0.2018 - acc: 0.8774 - val_loss: 0.1903 - val_acc: 0.8827
Q 11100001*1010010      T 100100000010010      [91m☒[0m 100011100000010     
Q 1001111*100010100     T 101010100101100      [91m☒[0m 101000000001100     
Q 10100110*111          T 10010001010          [91m☒[0m 10001011010         
Q 11011110*1111010      T 110100111001100      [91m☒[0m 110100000001100     
Q 10101111*110011111    T 10001101110110001    [91m☒[0m 10001100000000001   
Q 10101*1000111111      T 10111100101011       [91m☒[0m 10111110011011      
Q 1101110*10000111      T 11101000000010       [91m☒[0m 11100110000010      
Q 1001010*110111011     T 1000000000001110     [91m☒[0m 1000000000011110    
Q 1110111101*1110110    T 11011100100011110    [91m☒[0m 11011110000111110   
Q 110110100*1

Q 10010001*101010       T 1011111001010        [91m☒[0m 1011111101010       
Q 1111100*101101111     T 1011000111000100     [91m☒[0m 1011000000000100    
Q 111101011*1110        T 1101011011010        [92m☑[0m 1101011011010       

--------------------------------------------------
Iteration 45
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 19s - loss: 0.2490 - acc: 0.8514 - val_loss: 0.2217 - val_acc: 0.8659
Q 1101101101*1011       T 10010110101111       [91m☒[0m 10010110110011      
Q 11000100*101100       T 10000110110000       [91m☒[0m 10000101110000      
Q 1011*110110001        T 1001010011011        [91m☒[0m 1001010111111       
Q 1000000010*1101       T 1101000011010        [91m☒[0m 1101000000010       
Q 11110100*10110011     T 1010101010011100     [91m☒[0m 1010110111001100    
Q 11101110*1001011      T 100010110111010      [91m☒[0m 100011000000010     
Q 111010010*1000100     T 111101111001000      [91m☒[0m 111101000001000     
Q 110110100*1

 - 19s - loss: 0.1676 - acc: 0.8978 - val_loss: 0.1619 - val_acc: 0.8996
Q 1101011*1101          T 10101101111          [92m☑[0m 10101101111         
Q 1111110*10011010      T 100101111001100      [91m☒[0m 100100100001100     
Q 10111*110             T 10001010             [92m☑[0m 10001010            
Q 1100001001*111        T 1010100111111        [91m☒[0m 1010101101011       
Q 10100100*110010       T 10000000001000       [92m☑[0m 10000000001000      
Q 1100111*1001010010    T 1110111011111110     [91m☒[0m 1111000100101110    
Q 10*100111000          T 1001110000           [92m☑[0m 1001110000          
Q 10000*111011110       T 1110111100000        [92m☑[0m 1110111100000       
Q 11111001*1100         T 101110101100         [91m☒[0m 101111101100        
Q 1011100110*1010       T 1110011111100        [91m☒[0m 1110110011100       

--------------------------------------------------
Iteration 54
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 19s - loss

Q 10101*10110101        T 111011011001         [91m☒[0m 111011111001        
Q 111*1110111           T 1101000001           [92m☑[0m 1101000001          

--------------------------------------------------
Iteration 62
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 20s - loss: 0.1813 - acc: 0.8911 - val_loss: 0.1709 - val_acc: 0.8991
Q 110010000*100000      T 11001000000000       [92m☑[0m 11001000000000      
Q 10*1111010010         T 11110100100          [92m☑[0m 11110100100         
Q 1011000*10110001      T 11110011011000       [91m☒[0m 11110001011000      
Q 111011110*11111101    T 11101100001100110    [91m☒[0m 11101111100000110   
Q 1000111010*1111010100 T 10001000011000001000 [91m☒[0m 10001000000000001000
Q 1001110*1000110000    T 1010101010100000     [91m☒[0m 1010110110100000    
Q 111000001*111111100   T 110111101011111100   [91m☒[0m 110111100010111100  
Q 1110001111*10000      T 11100011110000       [91m☒[0m 1110011111000       
Q 110011010*1

Q 110011111*1000        T 110011111000         [92m☑[0m 110011111000        
Q 1101101*10010110      T 11111111011110       [91m☒[0m 100000001011110     
Q 100110*100011         T 10100110010          [92m☑[0m 10100110010         
Q 100*100001001         T 10000100100          [92m☑[0m 10000100100         
Q 110110000*110110      T 101101100100000      [91m☒[0m 101101000100000     
Q 100110*110111010      T 100000110011100      [91m☒[0m 100000111111100     
Q 110*1101001000        T 1001110110000        [92m☑[0m 1001110110000       
Q 1110101*10100010      T 100101000001010      [91m☒[0m 100101011111010     
Q 1*11001011            T 11001011             [92m☑[0m 11001011            
Q 100*1010100           T 101010000            [92m☑[0m 101010000           

--------------------------------------------------
Iteration 71
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 22s - loss: 0.1551 - acc: 0.9101 - val_loss: 0.1486 - val_acc: 0.9088
Q 111100001*1

Q 100101011*1111000100  T 1000110010111101100  [91m☒[0m 1000101100000101100 
Q 111110001*1           T 111110001            [92m☑[0m 111110001           

--------------------------------------------------
Iteration 79
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 20s - loss: 0.1418 - acc: 0.9181 - val_loss: 0.1455 - val_acc: 0.9139
Q 1101*1001100110       T 1111100101110        [91m☒[0m 1111101011110       
Q 1010011*1110          T 10010001010          [92m☑[0m 10010001010         
Q 1001000*101000        T 101101000000         [92m☑[0m 101101000000        
Q 100001000*1100101     T 110100000101000      [91m☒[0m 110100110101000     
Q 110*1101001000        T 1001110110000        [92m☑[0m 1001110110000       
Q 100000110*110         T 11000100100          [92m☑[0m 11000100100         
Q 10101*110010110       T 10000101001110       [91m☒[0m 10000101111110      
Q 110001*101011         T 100000111011         [92m☑[0m 100000111011        
Q 1010111*101

Q 1110011010*110001111  T 1011001110100000110  [91m☒[0m 1011000010001100110 
Q 101101011*111110110   T 101100011111010010   [91m☒[0m 101100010111000010  
Q 11010*110001110       T 10100001101100       [91m☒[0m 10100011101100      
Q 100*1001010           T 100101000            [92m☑[0m 100101000           
Q 11101000*10010010     T 1000010001010000     [92m☑[0m 1000010001010000    
Q 1000100*11011000      T 11100101100000       [92m☑[0m 11100101100000      
Q 1011010*1101000110    T 10010011010011100    [91m☒[0m 10010011100011100   
Q 1110111111*11         T 101100111101         [92m☑[0m 101100111101        
Q 101000*100110         T 10111110000          [92m☑[0m 10111110000         
Q 1111*111101110        T 1110011110010        [92m☑[0m 1110011110010       

--------------------------------------------------
Iteration 88
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 19s - loss: 0.1311 - acc: 0.9267 - val_loss: 0.1302 - val_acc: 0.9239
Q 110110*1001

Q 1110100000*101        T 1001000100000        [92m☑[0m 1001000100000       
Q 100110010*100011      T 10100111010110       [91m☒[0m 10101011110110      

--------------------------------------------------
Iteration 96
Train on 45000 samples, validate on 5000 samples
Epoch 1/1
 - 19s - loss: 0.1153 - acc: 0.9335 - val_loss: 0.1252 - val_acc: 0.9273
Q 11111001*1110010000   T 110111011100010000   [91m☒[0m 110110111000010000  
Q 100111011*110001      T 11110001001011       [91m☒[0m 11110000001011      
Q 10111111*110011       T 10011000001101       [91m☒[0m 10010100001101      
Q 1011011*1001111       T 1110000010101        [91m☒[0m 1110000100101       
Q 101000*1011           T 110111000            [92m☑[0m 110111000           
Q 111*1011100111        T 1010001010001        [91m☒[0m 1010001000001       
Q 11000011*10           T 110000110            [92m☑[0m 110000110           
Q 100111110*110101      T 100000111010110      [91m☒[0m 100000000110110     
Q 100011*1101

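- Accuracy rises from 75% to 88% (the log above reaches a val_acc of 0.9273 by iteration 96); shrinking the input alphabet from 0-9 to 0-1 is an effective way to improve accuracy.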
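To see what actually changes between the two encodings, the sketch below formats one question in binary and compares operand range, alphabet size and sequence length. `to_binary_question` is a hypothetical helper that mirrors the string formatting in the binary `gen_data`; the pair 23 and 6 corresponds to the row `10111*110` in the log above.

```python
def to_binary_question(a, b):
    """Hypothetical helper: format a*b the way the binary gen_data does, without padding."""
    q = '{}*{}'.format(bin(a)[2:], bin(b)[2:])
    ans = bin(a * b)[2:]
    return q, ans

print(to_binary_question(23, 6))   # ('10111*110', '10001010')

# Operand range: 3 decimal digits versus 10 bits cover a similar span.
DIGITS_DEC, DIGITS_BIN = 3, 10
print(10 ** DIGITS_DEC - 1, 2 ** DIGITS_BIN - 1)   # 999 1023

# Alphabet size shrinks from 12 to 4 symbols ...
print(len('0123456789* '), len('01* '))            # 12 4

# ... while input/output sequences grow from 7/6 to 21/20 positions.
print(DIGITS_DEC * 2 + 1, DIGITS_DEC * 2)          # 7 6
print(DIGITS_BIN * 2 + 1, DIGITS_BIN * 2)          # 21 20
```

Because the two setups differ in both alphabet size and sequence length, the accuracy figures of the decimal and binary runs are not measured on exactly the same terms and are best read as a rough comparison.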

---

## <a name="conclusion"></a>3. Conclusion

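- This exercise tried an NLP-based adder, subtractor, and multiplier, as well as a combined adder and subtractor.
- The basic adder and subtractor both reach over 99% accuracy.
- The combined adder and subtractor reaches only 72%, and 79% with more data and epochs; more epochs might improve accuracy further.
- The basic multiplier also reaches only 75%.
- Switching the input to binary is an effective way to improve accuracy.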