https://github.com/CVxTz/keras/blob/master/examples/cnn_seq2seq.py

## Sequence to sequence example in Keras (character-level).

This example demonstrates a basic character-level sequence-to-sequence model built from dilated causal 1D convolutions with dot-product attention. We apply it to sentence compression: rewriting English sentences as shorter ones, character by character. Note that character-level machine translation is fairly unusual; word-level models are more common in this domain.

In [1]:
from keras.models import Model
from keras.layers import Input, Convolution1D, Dot, Dense, Activation, Concatenate
from keras.utils import Sequence
import numpy as np
import random
from typing import List, Dict, Tuple
from tqdm.auto import tqdm

Using TensorFlow backend.


In [2]:
def read_pairs(fn: str)-> List[Tuple]:
    res = []
    with open(fn, 'rt', encoding='utf-8') as f:
        next(f)  # skip header
        for line in f:
            line = line.strip()
            original, compressed = line.split('\t')
            res.append((original,compressed))
    return res
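
`read_pairs` assumes every line after the header contains exactly one tab; a line with a missing or extra tab would raise a `ValueError` on unpacking. Below is a minimal sketch of a more defensive variant, assuming the same two-column layout. The name `read_pairs_safe` and the policy of skipping malformed lines are illustrative choices, not part of the original notebook.

```python
import io
from typing import Iterable, List, Tuple


def read_pairs_safe(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse (original, compressed) pairs, skipping lines without exactly two fields."""
    res = []
    it = iter(lines)
    next(it, None)  # skip header, tolerating empty input
    for line in it:
        fields = line.rstrip('\n').split('\t')
        if len(fields) != 2:
            continue  # hypothetical policy: silently drop malformed rows
        res.append((fields[0].strip(), fields[1].strip()))
    return res


# Toy in-memory "file" standing in for the real TSV
demo = io.StringIO("original\tcompressed\nA long sentence here.\tA sentence.\nbroken line\n")
pairs = read_pairs_safe(demo)
print(pairs)
```

Because the function accepts any iterable of lines, it can be called on an open file handle in the same way as on the `StringIO` object above.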

In [3]:
train_samples = read_pairs('sent-comp.train.tsv')
print(f"Total samples: {len(train_samples)}")
for s in train_samples[:3]:
    print(s)

Total samples: 200000
('Serge Ibaka -- the Oklahoma City Thunder forward who was born in the Congo but played in Spain -- has been granted Spanish citizenship and will play for the country in EuroBasket this summer, the event where spots in the 2012 Olympics will be decided.', 'Serge Ibaka has been granted Spanish citizenship and will play in EuroBasket.')
('MILAN -Catania held Roma to a 1-1 draw in Serie A on Wednesday as the teams played out the remaining 25 minutes of a game that was called off last month.', 'Catania held Roma to a 1 1 draw in Serie A.')
('State Street Corporation, a provider of investment servicing, investment management and investment research and trading services, has launched a new investment servicing solution to support small to mid-sized asset managers with their investment operations needs.', 'State Street Corporation, has launched a new investment servicing solution.')


In [4]:
eval_samples = read_pairs('comp-data.eval.tsv')
print(f"Total samples: {len(eval_samples)}")
for s in eval_samples[:3]:
    print(s)

Total samples: 10000
('Five people have been taken to hospital with minor injuries following a crash on the A17 near Sleaford this morning.', 'Five people have been taken to hospital with minor injuries following a crash on the A17 near Sleaford.')
("Several school districts in Hampton Roads are holding classes this Presidents' Day to make up for days missed because of the snow.", "Several school districts are holding classes this Presidents ' Day to make up for days missed.")
('Luis Suarez was spotted in London this afternoon and this has led the Daily Star to link the Liverpool striker to a potential move to Chelsea or Arsenal.', 'Luis Suarez was spotted in London.')


In [5]:
samples = train_samples+eval_samples

lengths_orig = sorted([len(s[0]) for s in samples])
lengths_comp = sorted([len(s[1]) for s in samples])

print(lengths_orig[:100])
print('...')
print(lengths_orig[-300:])
print()
print(lengths_comp[:100])
print('...')
print(lengths_comp[-100:])

[22, 22, 23, 25, 25, 25, 25, 26, 26, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 34, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35]
...
[543, 543, 543, 544, 545, 545, 547, 548, 549, 551, 551, 552, 554, 554, 554, 555, 556, 556, 558, 559, 559, 560, 561, 562, 563, 563, 563, 564, 565, 565, 568, 568, 569, 569, 570, 571, 571, 571, 572, 573, 574, 575, 575, 576, 576, 577, 577, 578, 579, 580, 583, 583, 584, 586, 586, 586, 587, 588, 588, 588, 589, 589, 590, 591, 591, 593, 594, 595, 596, 599, 599, 600, 601, 601, 601, 602, 603, 603, 606, 606, 607, 609, 612, 613, 613, 616, 619, 619, 621, 622, 623, 623, 623, 624, 625, 626, 628, 630, 631, 634, 634, 638, 640, 642, 642, 646, 647, 648, 648, 648, 651, 653, 654, 654, 655, 656, 658, 661, 665,

In [6]:
print("======= Shortest sentences: =======")
print("Original(s):")
print([s for s in samples if len(s[0])==lengths_orig[0]])
print("Compressed:")
print([s for s in samples if len(s[1])==lengths_comp[0]])
print("======= Longest sentences: =======")
print("Original(s):")
print([s for s in samples if len(s[0])==lengths_orig[-1]])
print("Compressed:")
print([s for s in samples if len(s[1])==lengths_comp[-1]])

Original(s):
[('Oh God, I am not back.', 'I am not back.'), ('What is Siri, you ask?', 'What is Siri.')]
Compressed:
[("WITH Gillian McKeith in the I'm A Celebrity jungle, Anorak has sent one of old Mr Anorak's relief nurses out to buy a degree from Debenham's and stand in the lavatories at Euston train station.", "I 'm."), ("Yesterday in the I'm A Celebrity jungle, Sam Fox and George Hamilton had quite the heart to heart.", "I 'm."), ("The I'm A Celebrity star was first propositioned by Sinitta, 48, when they discussed going Christmas shopping.", "I 'm."), ("With jungle fever taking hold on ITV's award-winning I'm a Celebrity ....", "I 'm."), ("The cross-dressing cagefighter has held secret talks with ITV1 chiefs over a dramatic vine rope entry into the I'm A Celebrity camp in the next few days.", "I 'm."), ("Daniel Baldwin was voted off I'm a Celebrity...", "I 'm."), ("Benidorm actress Crissy Rock has become the next star to be booted off I'm A Celebrity...", "I 'm."), ("The first st

In [7]:
# Clean sentences: drop very long originals and very short compressions

In [8]:
samples = [s for s in samples if len(s[0])<512 and len(s[1])>32]
print(len(samples))

193000
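
The filter above keeps pairs whose original is under 512 characters and whose compression is over 32 characters. Here is a small sketch, on made-up toy pairs, of how one might check what fraction of the data a given pair of thresholds keeps. `keep_fraction` and the toy data are assumptions for illustration.

```python
from typing import List, Tuple


def keep_fraction(pairs: List[Tuple[str, str]], max_orig: int, min_comp: int) -> float:
    """Fraction of pairs with len(original) < max_orig and len(compressed) > min_comp."""
    if not pairs:
        return 0.0
    kept = [p for p in pairs if len(p[0]) < max_orig and len(p[1]) > min_comp]
    return len(kept) / len(pairs)


toy_pairs = [
    ("x" * 40, "y" * 35),   # kept
    ("x" * 600, "y" * 50),  # dropped: original too long
    ("x" * 100, "y" * 4),   # dropped: compression too short
    ("x" * 511, "y" * 33),  # kept: just inside both limits
]
fraction = keep_fraction(toy_pairs, max_orig=512, min_comp=32)
print(fraction)  # 0.5
```

On the notebook's data the same criterion keeps 193000 of the 210000 pairs, roughly 92%.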


In [9]:
lengths_orig = sorted([len(s[0]) for s in samples])
lengths_comp = sorted([len(s[1]) for s in samples])

print(lengths_orig[:100])
print('...')
print(lengths_orig[-300:])
print()
print(lengths_comp[:100])
print('...')
print(lengths_comp[-100:])

[38, 38, 39, 39, 40, 40, 40, 41, 41, 41, 42, 42, 42, 42, 42, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 43, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 44, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 45, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 46, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47, 47]
...
[429, 430, 430, 430, 430, 430, 431, 431, 431, 431, 431, 431, 431, 431, 431, 432, 432, 432, 432, 432, 432, 432, 432, 432, 433, 433, 433, 433, 433, 434, 434, 434, 434, 434, 434, 434, 435, 435, 435, 435, 436, 436, 436, 436, 437, 437, 437, 437, 438, 438, 438, 439, 439, 439, 439, 439, 440, 440, 440, 440, 441, 441, 441, 441, 441, 442, 443, 443, 443, 443, 443, 443, 444, 444, 444, 444, 444, 445, 445, 445, 445, 445, 445, 445, 445, 446, 447, 447, 447, 447, 447, 447, 448, 448, 448, 448, 449, 449, 449, 449, 450, 450, 450, 451, 451, 451, 451, 451, 452, 452, 452, 452, 452, 453, 453, 453, 453, 453, 453,

In [10]:
print("======= Shortest sentences: =======")
print("Original(s):")
print([s for s in samples if len(s[0])==lengths_orig[0]])
print("Compressed:")
print([s for s in samples if len(s[1])==lengths_comp[0]])
print("======= Longest sentences: =======")
print("Original(s):")
print([s for s in samples if len(s[0])==lengths_orig[-1]])
print("Compressed:")
print([s for s in samples if len(s[1])==lengths_comp[-1]])

Original(s):
[('The iTunes feed is working once again.', 'The iTunes feed is working again.'), ("-Jessie J used to ``boo'' her critics.", 'Jessie J used to boo her critics.')]
Compressed:


Original(s):
[("Erislandy Lara sends message to 154lb collective: 'They can run, but they can't hide!' Fast-rising Cuban hotshot Erislandy Lara clashes with former welterweight world title challenger Freddy Hernandez on Saturday, June 30 at the Fantasy Springs Casino in Indio, California but believes that a replication of his first round drubbing of Ronald Hearns could mean he becomes an even more avoided fighter than he currently is: ``The (big) fights will come sooner or later,'' the 29-year-old southpaw recently hoped.", "Erislandy Lara sends message to 154lb collective : They can run, but they can't hide.")]
Compressed:
[('Dynamics Research Corporation, a leading technology and management consulting company focused on driving performance, process and results for government clients, today announced that it has entered into a Cooperative Research and Development Agreement with the Intelligence and Information Warfare Directorate of the US Army Communications-Electronics Research, Dev

In [11]:
batch_size = 64  # Batch size for training.
epochs = 100  # Number of epochs to train for.
latent_dim = 256  # Latent dimensionality of the encoding space.

In [12]:
# Vectorize the data.
input_texts = []
target_texts = []
input_characters = set()
target_characters = set()
for s in samples:
    input_text, target_text = s
    # We use "tab" as the "start sequence" character
    # for the targets, and "\n" as "end sequence" character.
    target_text = '\t' + target_text + '\n'
    input_texts.append(input_text)
    target_texts.append(target_text)
    # Sets ignore duplicates, so every character can be added directly.
    input_characters.update(input_text)
    target_characters.update(target_text)

input_characters = sorted(list(input_characters))
target_characters = sorted(list(target_characters))
num_encoder_tokens = len(input_characters)
num_decoder_tokens = len(target_characters)
max_encoder_seq_length = max([len(txt) for txt in input_texts])
max_decoder_seq_length = max([len(txt) for txt in target_texts])

print('Number of samples:', len(input_texts))
print('Number of unique input tokens:', num_encoder_tokens)
print('Number of unique output tokens:', num_decoder_tokens)
print('Max sequence length for inputs:', max_encoder_seq_length)
print('Max sequence length for outputs:', max_decoder_seq_length)

Number of samples: 193000
Number of unique input tokens: 249
Number of unique output tokens: 180
Max sequence length for inputs: 511
Max sequence length for outputs: 245
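
To make the vectorization concrete, here is a toy sketch of the same scheme on a single made-up target: the text is wrapped in `\t` and `\n`, the vocabulary is the sorted set of characters, and each character becomes a one-hot row. It uses plain lists instead of NumPy arrays, and `one_hot` is a hypothetical helper name.

```python
def one_hot(text, token_index):
    """Return a list of one-hot rows, one per character of `text`."""
    rows = []
    for char in text:
        row = [0.0] * len(token_index)
        row[token_index[char]] = 1.0
        rows.append(row)
    return rows


target_text = '\t' + 'ab a' + '\n'       # start and end markers as in the notebook
characters = sorted(set(target_text))    # ['\t', '\n', ' ', 'a', 'b']
token_index = {char: i for i, char in enumerate(characters)}

encoded = one_hot(target_text, token_index)
print(len(encoded), len(encoded[0]))     # 6 5
print(encoded[0])                        # [1.0, 0.0, 0.0, 0.0, 0.0]
```

In the notebook the second dimension is padded to the maximum sequence length, so shorter sentences end in all-zero rows.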


In [13]:
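# Note: `samples` was filtered after train and eval were concatenated, so the
# last len(eval_samples) entries may also contain some training pairs.
train_input_texts = input_texts[0:-len(eval_samples)]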
train_target_texts = target_texts[0:-len(eval_samples)]
eval_input_texts = input_texts[-len(eval_samples):]
eval_target_texts = target_texts[-len(eval_samples):]
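
Because the length filter runs on the concatenated list, slicing off the last `len(eval_samples)` entries recovers the evaluation set exactly only if no evaluation pair was filtered out. One way to keep the boundary exact is to filter the two sets separately and count the surviving evaluation pairs. This is a sketch under that assumption, with toy data and a hypothetical `length_ok` helper.

```python
def length_ok(pair, max_orig=512, min_comp=32):
    """Same length criterion as the filtering cell."""
    return len(pair[0]) < max_orig and len(pair[1]) > min_comp


toy_train = [("a" * 50, "b" * 40), ("a" * 700, "b" * 40), ("a" * 60, "b" * 45)]
toy_eval = [("c" * 50, "d" * 10), ("c" * 80, "d" * 40)]

train_kept = [p for p in toy_train if length_ok(p)]
eval_kept = [p for p in toy_eval if length_ok(p)]

# Concatenate afterwards and remember how many evaluation pairs survived.
# This slicing assumes at least one evaluation pair is left.
all_kept = train_kept + eval_kept
n_eval = len(eval_kept)
train_part, eval_part = all_kept[:-n_eval], all_kept[-n_eval:]

print(len(train_part), len(eval_part))  # 2 1
```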

In [14]:
input_token_index = dict(
    [(char, i) for i, char in enumerate(input_characters)])
target_token_index = dict(
    [(char, i) for i, char in enumerate(target_characters)])

In [15]:
class DataGenerator(Sequence):
    'Generates data for Keras'
    def __init__(self, input_texts, target_texts, batch_size=batch_size, shuffle=True):
        'Initialization'
        self.input_texts = input_texts
        self.target_texts = target_texts
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.on_epoch_end()

    def __len__(self):
        'Denotes the number of batches per epoch'
        return int(np.floor(len(self.input_texts) / self.batch_size))

    def __getitem__(self, index):        
        'Generate one batch of data'
        encoder_input_data = np.zeros(
            (self.batch_size, max_encoder_seq_length, num_encoder_tokens),
            dtype='float32')
        decoder_input_data = np.zeros(
            (self.batch_size, max_decoder_seq_length, num_decoder_tokens),
            dtype='float32')
        decoder_target_data = np.zeros(
            (self.batch_size, max_decoder_seq_length, num_decoder_tokens),
            dtype='float32')

        from_ind = index*self.batch_size
        to_ind = from_ind+self.batch_size
        for i, (input_text, target_text) in enumerate(zip(self.input_texts[from_ind:to_ind], self.target_texts[from_ind:to_ind])):
            for t, char in enumerate(input_text):
                encoder_input_data[i, t, input_token_index[char]] = 1.
            for t, char in enumerate(target_text):
                # decoder_input_data holds the full target, including the start character.
                decoder_input_data[i, t, target_token_index[char]] = 1.
                if t > 0:
                    # decoder_target_data will be ahead by one timestep
                    # and will not include the start character.
                    decoder_target_data[i, t - 1, target_token_index[char]] = 1.
        
        return [encoder_input_data, decoder_input_data], decoder_target_data

    def on_epoch_end(self):
        'Updates indexes after each epoch'
        if self.shuffle:
            c = list(zip(self.input_texts, self.target_texts))
            random.shuffle(c)
            self.input_texts, self.target_texts = zip(*c)
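
The generator implements teacher forcing: at step `t` the decoder sees the true character at `t` and is trained to predict the character at `t + 1`. Here is a tiny sketch of that one-step shift on a made-up target string, with characters kept as text rather than one-hot vectors for readability.

```python
target_text = '\t' + 'Hi.' + '\n'       # start marker, text, end marker

decoder_input = list(target_text)       # includes the start character
decoder_target = list(target_text[1:])  # same sequence shifted left by one step

# Each decoder input position is paired with the character it should predict.
for t, expected in enumerate(decoder_target):
    print(repr(decoder_input[t]), '->', repr(expected))
```

In the generator itself, positions past the end of a sequence stay all-zero in both arrays.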


In [26]:
training_generator = DataGenerator(train_input_texts, train_target_texts)
val_generator = DataGenerator(eval_input_texts, eval_target_texts, shuffle=False)

In [17]:
# Define an input sequence and process it.
encoder_inputs = Input(shape=(None, num_encoder_tokens))
# Encoder
x_encoder = Convolution1D(256, kernel_size=3, activation='relu',
                          padding='causal')(encoder_inputs)
x_encoder = Convolution1D(256, kernel_size=3, activation='relu',
                          padding='causal', dilation_rate=2)(x_encoder)
x_encoder = Convolution1D(256, kernel_size=3, activation='relu',
                          padding='causal', dilation_rate=4)(x_encoder)

decoder_inputs = Input(shape=(None, num_decoder_tokens))
# Decoder
x_decoder = Convolution1D(256, kernel_size=3, activation='relu',
                          padding='causal')(decoder_inputs)
x_decoder = Convolution1D(256, kernel_size=3, activation='relu',
                          padding='causal', dilation_rate=2)(x_decoder)
x_decoder = Convolution1D(256, kernel_size=3, activation='relu',
                          padding='causal', dilation_rate=4)(x_decoder)
# Attention
attention = Dot(axes=[2, 2])([x_decoder, x_encoder])
attention = Activation('softmax')(attention)

context = Dot(axes=[2, 1])([attention, x_encoder])
decoder_combined_context = Concatenate(axis=-1)([context, x_decoder])

decoder_outputs = Convolution1D(64, kernel_size=3, activation='relu',
                                padding='causal')(decoder_combined_context)
decoder_outputs = Convolution1D(64, kernel_size=3, activation='relu',
                                padding='causal')(decoder_outputs)
# Output
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Define the model that will turn
# `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.summary()

Instructions for updating:
Colocations handled automatically by placer.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, None, 180)    0                                            
__________________________________________________________________________________________________
input_1 (InputLayer)            (None, None, 249)    0                                            
__________________________________________________________________________________________________
conv1d_4 (Conv1D)               (None, None, 256)    138496      input_2[0][0]                    
__________________________________________________________________________________________________
conv1d_1 (Conv1D)               (None, None, 256)    191488      input_1[0][0]                    
_____________________________________
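
The attention block in the model is plain dot-product attention: each decoder position scores every encoder position, the scores are normalized with a softmax over encoder positions, and the context is the weighted sum of encoder features. Below is a minimal pure-Python sketch for a single example with tiny made-up feature vectors. The function names are illustrative; the real model does this on batched tensors through the `Dot` and `Activation` layers.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_attention(decoder_feats, encoder_feats):
    """Return (weights, context) for one example.

    decoder_feats: list of T_dec vectors; encoder_feats: list of T_enc vectors;
    all vectors share the same dimension.
    """
    dim = len(encoder_feats[0])
    weights, context = [], []
    for d in decoder_feats:
        # Score this decoder position against every encoder position.
        scores = [sum(di * ei for di, ei in zip(d, e)) for e in encoder_feats]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of encoder vectors.
        context.append([sum(w[k] * encoder_feats[k][j] for k in range(len(w)))
                        for j in range(dim)])
    return weights, context


encoder_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # T_enc = 3, dim = 2
decoder_feats = [[2.0, 0.0], [0.0, 0.0]]               # T_dec = 2

weights, context = dot_attention(decoder_feats, encoder_feats)
print(len(weights), len(weights[0]))   # 2 3, i.e. (T_dec, T_enc)
print(len(context), len(context[0]))   # 2 2, i.e. (T_dec, dim)
```

In the model, the resulting context is concatenated with the decoder features (`decoder_combined_context`) before the final convolutions.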

In [18]:
# Run training
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.fit_generator(generator=training_generator,
                    validation_data=val_generator,
                    epochs=5)
# Save model
# model.save('data/s2s.h5')

Instructions for updating:
Use tf.cast instead.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
Epoch 1/5
 314/2859 [==>...........................] - loss: 1.20 ... 0.944 (visible updates only; output truncated)
Epoch 2/5
 314/2859 [==>...........................] - loss: 0.061 ... 0.053 (visible updates only; output truncated)
Epoch 3/5
 314/2859 [==>...........................] - loss: 0.045 ... 0.041 (visible updates only; output truncated)
Epoch 4/5
 314/2859 [==>...........................] - loss: 0.037 ... 0.035 (visible updates only; output truncated)
Epoch 5/5
 314/2859 [==>...........................] - loss: 0.038 ... 0.033 (visible updates only; output truncated)

<keras.callbacks.History at 0x1b3ff88b710>
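
The `2859` in the training progress output is the number of batches per epoch returned by `DataGenerator.__len__`. As a quick check of the arithmetic, using the counts printed earlier (193000 pairs after filtering, of which the last 10000 are held out):

```python
import math

n_samples = 193000      # pairs left after the length filter
n_eval = 10000          # size of the held-out tail, len(eval_samples)
batch_size = 64

n_train = n_samples - n_eval
steps_per_epoch = math.floor(n_train / batch_size)
dropped = n_train - steps_per_epoch * batch_size  # pairs in the incomplete last batch

print(steps_per_epoch)  # 2859
print(dropped)          # 24
```

Since `__len__` rounds down, the incomplete final batch is skipped in every epoch; with shuffling enabled, a different set of pairs is left out each time.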

In [31]:
# Save model
model.save('data/s2s.h5')

In [27]:
inp_data, _ = val_generator[0]
encoder_input_data = inp_data[0]

In [28]:
if 'in_decoder' in locals():
    del in_decoder
    
print(len(input_texts), max_decoder_seq_length, num_decoder_tokens)
print(len(input_texts)*max_decoder_seq_length*num_decoder_tokens)

193000 245 180
8511300000
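
The product printed above is the number of elements a single dense one-hot decoder array would need for the whole dataset, which is presumably why batches are built on the fly by `DataGenerator`. Assuming 4 bytes per `float32` element, a rough size estimate is:

```python
n_samples = 193000
max_decoder_seq_length = 245
num_decoder_tokens = 180
bytes_per_float32 = 4

n_elements = n_samples * max_decoder_seq_length * num_decoder_tokens
size_gb = n_elements * bytes_per_float32 / 1e9  # decimal gigabytes

print(n_elements)          # 8511300000
print(round(size_gb, 1))   # 34.0
```

The decoder input and decoder target arrays would each need this much, and the encoder array would be larger still because its sequences are longer.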


In [32]:
def print_predictions():
    # Index-to-character lookups for turning predictions back into text
    reverse_input_char_index = dict(
        (i, char) for char, i in input_token_index.items())
    reverse_target_char_index = dict(
        (i, char) for char, i in target_token_index.items())

    # Upper bound only: encoder_input_data holds a single batch (batch_size rows)
    nb_examples = 100
    in_encoder = encoder_input_data[:nb_examples]
    in_decoder = np.zeros(
        (len(in_encoder), max_decoder_seq_length, num_decoder_tokens),
        dtype='float32')

    in_decoder[:, 0, target_token_index["\t"]] = 1

    predict = np.zeros(
        (len(in_encoder), max_decoder_seq_length),
        dtype='float32')

    for i in tqdm(range(max_decoder_seq_length - 1), total=max_decoder_seq_length - 1):
        predict = model.predict([in_encoder, in_decoder])
        predict = predict.argmax(axis=-1)
        predict_ = predict[:, i].ravel().tolist()
        for j, x in enumerate(predict_):
            in_decoder[j, i + 1, x] = 1

    for seq_index in range(len(in_encoder)):
        # Decode one sequence from the evaluation batch,
        # stopping at the end-of-sequence character.
        output_seq = predict[seq_index, :].ravel().tolist()
        decoded = []
        for x in output_seq:
            if reverse_target_char_index[x] == "\n":
                break
            else:
                decoded.append(reverse_target_char_index[x])
        decoded_sentence = "".join(decoded)
        print('-')
        print('Input sentence:', eval_input_texts[seq_index])
        print('Decoded sentence:', decoded_sentence)
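
`print_predictions` performs greedy decoding: it starts from the start character, repeatedly asks the model for the most likely next character, and feeds that choice back in. The loop can be sketched independently of Keras with a stand-in scoring function. `toy_next_char_scores` below is a made-up stub that simply spells out a fixed string; it is not the trained model.

```python
def toy_next_char_scores(prefix, vocab):
    """Hypothetical stand-in for the model: scores each candidate next character."""
    script = "\tOil prices.\n"
    nxt = script[len(prefix)] if len(prefix) < len(script) else "\n"
    return {c: (1.0 if c == nxt else 0.0) for c in vocab}


def greedy_decode(score_fn, vocab, max_len, start="\t", end="\n"):
    """Pick the highest-scoring character at each step until `end` or `max_len`."""
    prefix = start
    while len(prefix) < max_len:
        scores = score_fn(prefix, vocab)
        best = max(scores, key=scores.get)
        if best == end:
            break
        prefix += best
    return prefix[len(start):]


vocab = sorted(set("\tOil prices.\n"))
decoded = greedy_decode(toy_next_char_scores, vocab, max_len=245)
print(decoded)  # Oil prices.
```

Unlike this sketch, the notebook decodes a whole batch at once and always runs for `max_decoder_seq_length - 1` steps, cutting each output at the first `\n` afterwards.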

In [33]:
print_predictions()


-
Input sentence: The St. Louis Downtown Airport in Cahokia is showing off its new control tower, which is 75 feet higher than the one it replaced.
Decoded sentence: The St. Louis Downtown Airport.
-
Input sentence: Coldplay paid tribute to the late Amy Winehouse Wednesday night during a concert in Los Angeles to benefit the Grammy Foundation.
Decoded sentence: Coldplay paid tribute to the late Amy Winehouse.
-
Input sentence: Colombia unveiled a tax reform bill on Tuesday aimed at creating jobs, closing loopholes and simplifying the tax system, but not increasing the tax take as the Andean country was on track for record collections this year.
Decoded sentence: Colombia unveiled a tax reform collections.
-
Input sentence: Oil prices are not reflecting fundamentals, the United Arab Emirates oil minister said on Monday, while his Qatari counterpart said there will be no output increase at OPEC's meeting later this month.
Decoded sentence: Oil prices are not reflecting fundamentals.
-
I

Input sentence: Other songs from Jessie J's debut studio album ``Who You Are'' that she performed at the 'VEVO Lift' showcase included: ``Price Tag'', ``Nobody's Perfect'', ``Mamma Knows Best'' and the title track ``Who You Are''.
Decoded sentence: Price Tag ' Price Tag.
-
Input sentence: First Financial Bancorp has paid a cash dividend every quarter since theformation of the holding company in April 1983.
Decoded sentence: First Financial Bancorp has paid a cash dividend every quarter.
-
Input sentence: NASA video shows solar storm Updated: 09:35, Saturday February 11, 2012 Stunning new videos showing a solar storm hitting the Earth have been released by NASA.
Decoded sentence: NASA video shows solar storm Updated.
-
Input sentence: Ontario Sen. Michael Pitfield resigned Tuesday night after serving 27 years in the Senate.
Decoded sentence: Michael Pitfield resigned after serving 27 years.
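
A simple way to quantify outputs like these is the character-level compression ratio, the decoded length divided by the input length. Here is a sketch on two of the pairs printed above. `compression_ratio` is an illustrative helper, not a metric used in the notebook.

```python
def compression_ratio(original: str, compressed: str) -> float:
    """Length of the compressed sentence relative to the original (lower is shorter)."""
    return len(compressed) / len(original)


examples = [
    ("Ontario Sen. Michael Pitfield resigned Tuesday night after serving 27 years in the Senate.",
     "Michael Pitfield resigned after serving 27 years."),
    ("First Financial Bancorp has paid a cash dividend every quarter since theformation of the holding company in April 1983.",
     "First Financial Bancorp has paid a cash dividend every quarter."),
]

ratios = [compression_ratio(o, c) for o, c in examples]
for r in ratios:
    print(round(r, 2))
```

The ratio says nothing about whether the shortened sentence is grammatical or faithful, so it only complements reading the outputs.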


In [34]:
model.fit_generator(generator=training_generator,
                    validation_data=val_generator,
                    epochs=5)

Epoch 1/5
 314/2859 [==>...........................] - loss: 0.031 ... 0.033 (visible updates only; output truncated)
Epoch 2/5
 314/2859 [==>...........................] - loss: 0.026 ... 0.031 (visible updates only; output truncated)
Epoch 3/5
 314/2859 [==>...........................] - loss: 0.026 ... 0.030 (visible updates only; output truncated)
Epoch 4/5
 314/2859 [==>...........................] - loss: 0.033 ... 0.029 (visible updates only; output truncated)
Epoch 5/5
 314/2859 [==>...........................] - loss: 0.027 ... 0.028 (visible updates only; output truncated)

<keras.callbacks.History at 0x1b3847979b0>

In [35]:
print_predictions()


-
Input sentence: The St. Louis Downtown Airport in Cahokia is showing off its new control tower, which is 75 feet higher than the one it replaced.
Decoded sentence: The St. Louis Downtown Airport in Cahokia is showing off its new control tower.
-
Input sentence: Coldplay paid tribute to the late Amy Winehouse Wednesday night during a concert in Los Angeles to benefit the Grammy Foundation.
Decoded sentence: Coldplay paid tribute to the Amy Winehouse.
-
Input sentence: Colombia unveiled a tax reform bill on Tuesday aimed at creating jobs, closing loopholes and simplifying the tax system, but not increasing the tax take as the Andean country was on track for record collections this year.
Decoded sentence: Colombia unveiled a tax reform bill.
-
Input sentence: Oil prices are not reflecting fundamentals, the United Arab Emirates oil minister said on Monday, while his Qatari counterpart said there will be no output increase at OPEC's meeting later this month.
Decoded sentence: Oil prices 

Input sentence: SKIPPER Paul Wellens makes his 400th appearance for St Helens tonight as they look to pick themselves up from their Challenge Cup semi-final defeat.
Decoded sentence: Paul Wellens makes his 400th appearance for St Helens tonight.
-
Input sentence: President Gloria Macapagal Arroyo leaves for the People's Republic of China today to attend the World Expo 2010 in Shanghai.
Decoded sentence: Gloria Macapagal Arroyo leaves for the People's Republic of China today to attend the World Expo 2010.
-
Input sentence: The Deputy Minister of Information and Communications, Hon. Sheka Tarawalie, said on Saturday 16th April 2011 in Addis Ababa, Ethiopia that His Excellency President Dr. Ernest Bai Koroma wants Sierra Leone to be a role model of good governance in the world ``whereby other countries can learn from our bright and shimmering example.''
Decoded sentence: His Excellency President Dr. Ernest Bai Koroma wants Sierra Leone to be a role model of good governance.
-
Input senten

In [47]:
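# Inspect the optimizer's current hyperparameters (learning rate, betas, ...).
config = model.optimizer.get_config()
print(config)
# Disabled: would divide the learning rate by 5 and rebuild the optimizer
# from the edited config.
#config['lr'] /= 5
#print(config)
#model.optimizer = model.optimizer.from_config(config)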

{'lr': 0.00020000000949949026, 'beta_1': 0.8999999761581421, 'beta_2': 0.9990000128746033, 'decay': 0.0, 'epsilon': 1e-07, 'amsgrad': False}
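
The commented-out lines in the cell above hint at lowering the learning rate by editing the optimizer config. A minimal sketch of that edit, written against a plain dict so it runs without Keras; `scale_learning_rate` is a hypothetical helper, not part of the notebook, and the dict is copied from the printed output:

```python
from typing import Any, Dict


def scale_learning_rate(config: Dict[str, Any], factor: float,
                        key: str = "lr") -> Dict[str, Any]:
    """Return a copy of an optimizer config with config[key] divided by factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    new_config = dict(config)  # shallow copy, so the original stays untouched
    new_config[key] = config[key] / factor
    return new_config


# Values copied from the config printed by the notebook cell.
printed_config = {'lr': 0.00020000000949949026, 'beta_1': 0.8999999761581421,
                  'beta_2': 0.9990000128746033, 'decay': 0.0,
                  'epsilon': 1e-07, 'amsgrad': False}

smaller = scale_learning_rate(printed_config, 5)
print(smaller['lr'])
```

Whether assigning a rebuilt optimizer to `model.optimizer` takes effect on an already compiled model may depend on the Keras version. Setting the learning-rate variable in place with `keras.backend.set_value` is a commonly used alternative, though it is not verified against this notebook.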


In [48]:
model.fit_generator(generator=training_generator,
                    validation_data=val_generator,
                    epochs=5)

Epoch 1/5


 314/2859 [==>...........................] - ETA: 6:25 - loss: 0.027 ... ETA: 4:52 - loss: 0.027 [...]

Epoch 2/5


 314/2859 [==>...........................] - ETA: 4:34 - loss: 0.029 ... ETA: 4:44 - loss: 0.026 [...]

Epoch 3/5


 314/2859 [==>...........................] - ETA: 4:48 - loss: 0.027 ... ETA: 4:50 - loss: 0.026 [...]

Epoch 4/5


 314/2859 [==>...........................] - ETA: 4:45 - loss: 0.028 ... ETA: 4:53 - loss: 0.025 [...]

Epoch 5/5


 314/2859 [==>...........................] - ETA: 4:34 - loss: 0.026 ... ETA: 4:49 - loss: 0.024 [...]

<keras.callbacks.History at 0x1b389e38240>

In [49]:
print_predictions()


-
Input sentence: The St. Louis Downtown Airport in Cahokia is showing off its new control tower, which is 75 feet higher than the one it replaced.
Decoded sentence: The St. Louis Downtown Airport in Cahokia is showing off its new control tower.
-
Input sentence: Coldplay paid tribute to the late Amy Winehouse Wednesday night during a concert in Los Angeles to benefit the Grammy Foundation.
Decoded sentence: Coldplay paid tribute to the Amy Winehouse.
-
Input sentence: Colombia unveiled a tax reform bill on Tuesday aimed at creating jobs, closing loopholes and simplifying the tax system, but not increasing the tax take as the Andean country was on track for record collections this year.
Decoded sentence: Colombia unveiled a tax reform bill.
-
Input sentence: Oil prices are not reflecting fundamentals, the United Arab Emirates oil minister said on Monday, while his Qatari counterpart said there will be no output increase at OPEC's meeting later this month.
Decoded sentence: Oil prices 

Input sentence: Other songs from Jessie J's debut studio album ``Who You Are'' that she performed at the 'VEVO Lift' showcase included: ``Price Tag'', ``Nobody's Perfect'', ``Mamma Knows Best'' and the title track ``Who You Are''.
Decoded sentence: Mamma Knows B Price Tag.
-
Input sentence: First Financial Bancorp has paid a cash dividend every quarter since theformation of the holding company in April 1983.
Decoded sentence: First Financial Bancorp has paid a cash dividend.
-
Input sentence: NASA video shows solar storm Updated: 09:35, Saturday February 11, 2012 Stunning new videos showing a solar storm hitting the Earth have been released by NASA.
Decoded sentence: NASA video showing a solar storm Updated.
-
Input sentence: Ontario Sen. Michael Pitfield resigned Tuesday night after serving 27 years in the Senate.
Decoded sentence: Ontario Sen. Michael Pitfield resigned after serving 27 years.
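
Some decoded sentences above are clean deletions from the input, while others reorder or alter words. A rough sketch of how this could be checked, using two input/decoded pairs copied from the output; the helpers `words`, `is_word_subsequence` and `compression_ratio` are hypothetical and not part of the notebook:

```python
import re
from typing import List


def words(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def is_word_subsequence(decoded: str, original: str) -> bool:
    """True if the decoded words appear in the original in the same order."""
    remaining = iter(words(original))
    # `token in remaining` consumes the iterator, which enforces word order.
    return all(token in remaining for token in words(decoded))


def compression_ratio(decoded: str, original: str) -> float:
    """Character length of the decoded sentence relative to the input."""
    return len(decoded) / len(original)


# (input sentence, decoded sentence) pairs copied from the printed predictions.
pairs = [
    ("Ontario Sen. Michael Pitfield resigned Tuesday night after serving "
     "27 years in the Senate.",
     "Ontario Sen. Michael Pitfield resigned after serving 27 years."),
    ("NASA video shows solar storm Updated: 09:35, Saturday February 11, 2012 "
     "Stunning new videos showing a solar storm hitting the Earth have been "
     "released by NASA.",
     "NASA video showing a solar storm Updated."),
]

results = [(is_word_subsequence(d, o), compression_ratio(d, o)) for o, d in pairs]
for (original, decoded), (extractive, ratio) in zip(pairs, results):
    print(f"{decoded!r}: extractive={extractive}, ratio={ratio:.2f}")
```

The first pair is a pure deletion of input words, whereas the second pulls "Updated" from before "showing a solar storm" in the input, so the order check fails. This is only a word-level heuristic and does not measure whether the output is grammatical or faithful.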


In [50]:
# Save model
model.save('data/s2s.h5')
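
The save call above writes into a `data` directory. Assuming that directory might not exist yet (in which case writing the HDF5 file would typically fail), a small sketch of a guard is shown here. `ensure_parent_dir` is a hypothetical helper, demonstrated in a temporary directory so it leaves nothing behind:

```python
import tempfile
from pathlib import Path


def ensure_parent_dir(path: str) -> Path:
    """Create the parent directory of `path` if it is missing; return the path."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


# Demonstration in a throwaway directory instead of the notebook's working dir.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_parent_dir(str(Path(tmp) / 'data' / 's2s.h5'))
    parent_created = target.parent.is_dir()

print(parent_created)
```

In the notebook this could be used as `model.save(str(ensure_parent_dir('data/s2s.h5')))`.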