# Text Translator
In this project, we build a sequence-to-sequence (Seq2Seq) model that translates English text into French. It is trained on a dataset of paired English and French sentences.

## Import the Data

Because training a translator on the complete English-to-French language pair would take a long time, we provide a small portion of the English corpus.

In [4]:
import helper  # Used to save/load the data
import problem_unittests as tests  # Used to run the unit tests for each block

source_path = 'data/small_vocab_en'
target_path = 'data/small_vocab_fr'
source_text = helper.load_data(source_path)
target_text = helper.load_data(target_path)

## Explore the Data
Change the `view_sentence_range` tuple to view different parts of the dataset.

In [5]:
view_sentence_range = (0, 10)

import numpy as np

print('Dataset statistics')
print('Number of unique words: {}'.format(len(set(source_text.split()))))

sentences = source_text.split('\n')
word_counts = [len(sentence.split()) for sentence in sentences]

print('Approximate number of sentences: {}'.format(len(sentences)))
print('Average words per sentence: {}'.format(np.average(word_counts)))
print()
print('English sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(source_text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))
print()
print('French sentences {} to {}:'.format(*view_sentence_range))
print('\n'.join(target_text.split('\n')[view_sentence_range[0]:view_sentence_range[1]]))

Dataset statistics
Number of unique words: 227
Approximate number of sentences: 137861
Average words per sentence: 13.225277634719028

English sentences 0 to 10:
new jersey is sometimes quiet during autumn , and it is snowy in april .
the united states is usually chilly during july , and it is usually freezing in november .
california is usually quiet during march , and it is usually hot in june .
the united states is sometimes mild during june , and it is cold in september .
your least liked fruit is the grape , but my least liked is the apple .
his favorite fruit is the orange , but my favorite is the grape .
paris is relaxing during december , but it is usually chilly in july .
new jersey is busy during spring , and it is never hot in march .
our least liked fruit is the lemon , but my least liked is the grape .
the united states is sometimes busy during january , and it is sometimes warm in november .

French sentences 0 to 10:
new jersey est parfois calme penda

## Implement the Preprocessing Function
### Text to ID

As with other RNNs, the text must be converted into numbers so that the computer can process it.
In the function `text_to_ids()`, `source_text` and `target_text` are converted from words to ids. The `<EOS>` word id must also be appended to the end of each sentence in `target_text`. This helps the neural network predict where a sentence should end.

You can get the `<EOS>` word id with:
- `target_vocab_to_int['<EOS>']`

You can get other word ids using `source_vocab_to_int` and `target_vocab_to_int`.

In [6]:
def text_to_ids(source_text, target_text, source_vocab_to_int, target_vocab_to_int):
 
    source_id_text = []
    for indice, linha in enumerate(source_text.split('\n')):
        source_id_text.append([])
        for palavra  in linha.split():
            source_id_text[indice].append(source_vocab_to_int[palavra])
            
    target_id_text =[]
    for indice, linha in enumerate(target_text.split('\n')):
        target_id_text.append([])
        for palavra in linha.split():
            target_id_text[indice].append(target_vocab_to_int[palavra])
        target_id_text[indice].append(target_vocab_to_int['<EOS>'])
    return (source_id_text, target_id_text)

tests.test_text_to_ids(text_to_ids)

Tests Passed
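
To make the output format concrete, here is a minimal sketch of the same conversion on a toy corpus. The two vocabularies and their ids are made up for illustration (the real ones come from `helper`), and `toy_text_to_ids` is a compact stand-in for the function above, not a replacement for it.

```python
# Hypothetical toy vocabularies; the real ones are built by the helper module.
toy_source_vocab = {'new': 0, 'jersey': 1, 'is': 2, 'quiet': 3}
toy_target_vocab = {'<EOS>': 1, 'new': 4, 'jersey': 5, 'est': 6, 'calme': 7}


def toy_text_to_ids(source_text, target_text, source_vocab, target_vocab):
    """Map each line to a list of word ids; target lines also get <EOS>."""
    source_ids = [[source_vocab[w] for w in line.split()]
                  for line in source_text.split('\n')]
    target_ids = [[target_vocab[w] for w in line.split()] + [target_vocab['<EOS>']]
                  for line in target_text.split('\n')]
    return source_ids, target_ids


src_ids, tgt_ids = toy_text_to_ids('new jersey is quiet\njersey is quiet',
                                   'new jersey est calme\njersey est calme',
                                   toy_source_vocab, toy_target_vocab)
print(src_ids)  # [[0, 1, 2, 3], [1, 2, 3]]
print(tgt_ids)  # [[4, 5, 6, 7, 1], [5, 6, 7, 1]]
```

Only the target side ends with the `<EOS>` id, since only the decoder has to learn when to stop.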


### Process and Save All the Data
Run this block to process all the data and save it to a file.

In [7]:
helper.preprocess_and_save_data(source_path, target_path, text_to_ids)

# Checkpoint
This is the first checkpoint. If you restart the notebook, you can resume from here. The preprocessed data has been saved to disk.

In [8]:
import numpy as np
import helper
import problem_unittests as tests

(source_int_text, target_int_text), (source_vocab_to_int, target_vocab_to_int), _ = helper.load_preprocess()

### Check the TensorFlow Version and GPU Access
This block checks that the TensorFlow version is correct and that you have access to a GPU.

In [9]:
from distutils.version import LooseVersion
import warnings
import tensorflow as tf
from tensorflow.python.layers.core import Dense

# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.1'), 'Please use TensorFlow version 1.1 or newer'
print('TensorFlow Version: {}'.format(tf.__version__))

# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))

TensorFlow Version: 1.8.0




## Build the Neural Network

Build the components needed for a Sequence-to-Sequence (Seq2Seq) model by implementing the functions below:

- `model_inputs`
- `process_decoder_input`
- `encoding_layer`
- `decoding_layer_train`
- `decoding_layer_infer`
- `decoding_layer`
- `seq2seq_model`

### Input

Implement the `model_inputs()` function to create the TF placeholders for the neural network. It should create the following placeholders:


- Input text placeholder named "input" using the TF Placeholder name parameter with rank 2.
- Targets placeholder with rank 2.
- Learning rate placeholder with rank 0.
- Keep probability placeholder named "keep_prob" using the TF Placeholder name parameter with rank 0.
- Target sequence length placeholder named "target_sequence_length" with rank 1
- Max target sequence length tensor named "max_target_len" getting its value from applying tf.reduce_max on the target_sequence_length placeholder. Rank 0.
- Source sequence length placeholder named "source_sequence_length" with rank 1

Return the placeholders in the following tuple: `(input, targets, learning_rate, keep_prob, target_sequence_len, max_target_sequence_len, source_sequence_len)`


In [10]:
def model_inputs():
    
    inputs =  tf.placeholder(tf.int32, shape=[None,None], name='input')
    targets = tf.placeholder(tf.int32, shape=[None,None], name='targets')
    learning_rate = tf.placeholder(tf.float32, name='learning_rate')
    keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    target_sequence_length = tf.placeholder(tf.int32, shape=[None], name= "target_sequence_length")
    max_target_length = tf.reduce_max(target_sequence_length, name= 'max_target_len')
    
    source_sequence_length = tf.placeholder(tf.int32, shape=[None], name= "source_sequence_length")
    return (inputs, targets, learning_rate, keep_prob, target_sequence_length, max_target_length, source_sequence_length)
    
tests.test_model_inputs(model_inputs)

Tests Passed


#### Process Decoder Input
Implement `process_decoder_input` by removing the last word id from each batch in `target_data` and concatenating the `<GO>` id to the beginning of each batch.


In [11]:
def process_decoder_input(target_data, target_vocab_to_int, batch_size):

    id_go = tf.constant(target_vocab_to_int['<GO>'], shape=(batch_size, 1) , dtype=tf.int32)
    t_data = tf.concat([id_go, target_data[:, :-1]],1)    
    return t_data

tests.test_process_encoding_input(process_decoder_input)

Tests Passed
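
The shift performed by `process_decoder_input` can be sketched with plain Python lists. The helper `toy_process_decoder_input` and the ids below are hypothetical and only illustrate the idea. During training, the decoder input at step t is the target word from step t-1, with `<GO>` standing in for the missing first predecessor.

```python
def toy_process_decoder_input(target_batch, go_id):
    """Drop the last id of every row and prepend the <GO> id."""
    return [[go_id] + row[:-1] for row in target_batch]


# Hypothetical ids: 0 = <PAD>, 1 = <EOS>, 2 = <GO>.
targets = [[10, 11, 12, 1],
           [20, 21, 1, 0]]
decoder_input = toy_process_decoder_input(targets, go_id=2)
print(decoder_input)  # [[2, 10, 11, 12], [2, 20, 21, 1]]
```

Each row keeps its length, so the decoder input and the targets stay aligned position by position.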


### Encoding
Implement `encoding_layer()` to create an encoder RNN layer:

 * Embed the encoder input using [`tf.contrib.layers.embed_sequence`](https://www.tensorflow.org/api_docs/python/tf/contrib/layers/embed_sequence)
 
 * Construct a [stacked](https://github.com/tensorflow/tensorflow/blob/6947f65a374ebf29e74bb71e36fd82760056d82c/tensorflow/docs_src/tutorials/recurrent.md#stacking-multiple-lstms) [`tf.contrib.rnn.LSTMCell`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/LSTMCell) wrapped in a [`tf.contrib.rnn.DropoutWrapper`](https://www.tensorflow.org/api_docs/python/tf/contrib/rnn/DropoutWrapper)
 * Pass cell and embedded input to [`tf.nn.dynamic_rnn()`](https://www.tensorflow.org/api_docs/python/tf/nn/dynamic_rnn)

In [12]:
from importlib import reload  # `imp` is deprecated in favor of `importlib`
reload(tests)

def encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob, 
                   source_sequence_length, source_vocab_size, 
                   encoding_embedding_size):
    
    
    def build_cell(rnn_size, keep_prob):
        lstm = tf.contrib.rnn.LSTMCell(rnn_size)
        lstm_dropout = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
        return lstm_dropout
    
    # Build the multi-layer RNN cell
    stacked_lstm = tf.contrib.rnn.MultiRNNCell([build_cell(rnn_size, keep_prob) for _ in range(num_layers)])
    
    # Build the embedding layer
    embed_encoder = tf.contrib.layers.embed_sequence(rnn_inputs, vocab_size = source_vocab_size, embed_dim = encoding_embedding_size)
    
    output, state = tf.nn.dynamic_rnn(stacked_lstm, embed_encoder, source_sequence_length, dtype=tf.float32)
    return (output, state)

tests.test_encoding_layer(encoding_layer)

Tests Passed


### Decoding - Training
Create the training decoding layer:
* Create a [`tf.contrib.seq2seq.TrainingHelper`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/TrainingHelper)
* Create a [`tf.contrib.seq2seq.BasicDecoder`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BasicDecoder)
* Obtain the decoder outputs from [`tf.contrib.seq2seq.dynamic_decode`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_decode)

In [13]:

def decoding_layer_train(encoder_state, dec_cell, dec_embed_input, target_sequence_length, max_summary_length, output_layer, keep_prob):

    training = tf.contrib.seq2seq.TrainingHelper(dec_embed_input, target_sequence_length)
    decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell, training, encoder_state, output_layer)
    output, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder=decoder, maximum_iterations=max_summary_length)
    return output

tests.test_decoding_layer_train(decoding_layer_train)

Tests Passed


### Decoding - Inference
Create the inference decoder:
* Create a [`tf.contrib.seq2seq.GreedyEmbeddingHelper`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/GreedyEmbeddingHelper)
* Create a [`tf.contrib.seq2seq.BasicDecoder`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/BasicDecoder)
* Obtain the decoder outputs from [`tf.contrib.seq2seq.dynamic_decode`](https://www.tensorflow.org/api_docs/python/tf/contrib/seq2seq/dynamic_decode)

In [14]:
def decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id,
                         end_of_sequence_id, max_target_sequence_length,
                         vocab_size, output_layer, batch_size, keep_prob):
    
    
    start_id = tf.tile([start_of_sequence_id], [batch_size])
   
    embedding = tf.contrib.seq2seq.GreedyEmbeddingHelper(dec_embeddings, start_id, end_of_sequence_id)
    decoder = tf.contrib.seq2seq.BasicDecoder(dec_cell, embedding, encoder_state, output_layer)
    f_output, _, _ = tf.contrib.seq2seq.dynamic_decode(decoder,maximum_iterations=max_target_sequence_length)
    return f_output

tests.test_decoding_layer_infer(decoding_layer_infer)

Tests Passed
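
The idea behind greedy decoding can be sketched without TensorFlow. At every step the highest-scoring token is chosen and fed back as the next input, until the end-of-sequence id appears or a step limit is reached. The function `greedy_decode` and the score table are hypothetical. In the real decoder the scores come from the RNN state and the output layer, not from a fixed table keyed on the previous token.

```python
def greedy_decode(next_token_scores, start_id, end_id, max_steps):
    """Pick the highest-scoring token at each step until end_id or max_steps."""
    output = []
    current = start_id
    for _ in range(max_steps):
        scores = next_token_scores(current)
        # argmax over the vocabulary
        current = max(range(len(scores)), key=scores.__getitem__)
        output.append(current)
        if current == end_id:
            break
    return output


# Hypothetical score table: row i holds the scores for the token that follows i.
# Ids: 0 = <GO>, 1 = <EOS>, 2 and 3 are ordinary words.
TOY_SCORES = [
    [0.0, 0.1, 0.8, 0.1],    # after <GO>, token 2 scores highest
    [0.0, 1.0, 0.0, 0.0],    # after <EOS> (never consulted)
    [0.0, 0.2, 0.1, 0.7],    # after 2, token 3 scores highest
    [0.0, 0.9, 0.05, 0.05],  # after 3, <EOS> scores highest
]

decoded = greedy_decode(lambda token: TOY_SCORES[token],
                        start_id=0, end_id=1, max_steps=10)
print(decoded)  # [2, 3, 1]
```

The `max_steps` argument plays the role of `maximum_iterations` in `dynamic_decode`: it bounds the output length if the end-of-sequence id is never produced.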


### Build the Decoding Layer
Implement `decoding_layer()` to create a decoder RNN layer.

* Embed the target sequences
* Build the decoder LSTM cell (just as you built the encoder cell above)
* Create an output layer to map the decoder outputs to the elements of our vocabulary
* Use your `decoding_layer_train(encoder_state, dec_cell, dec_embed_input, target_sequence_length, max_target_sequence_length, output_layer, keep_prob)` function to get the training logits.
* Use your `decoding_layer_infer(encoder_state, dec_cell, dec_embeddings, start_of_sequence_id, end_of_sequence_id, max_target_sequence_length, vocab_size, output_layer, batch_size, keep_prob)` function to get the inference logits.

Note: You will need to use [tf.variable_scope](https://www.tensorflow.org/api_docs/python/tf/variable_scope) to share variables between training and inference.

In [15]:
def decoding_layer(dec_input, encoder_state,
                   target_sequence_length, max_target_sequence_length,
                   rnn_size,
                   num_layers, target_vocab_to_int, target_vocab_size,
                   batch_size, keep_prob, decoding_embedding_size):
    """
    Create decoding layer
    :param dec_input: Decoder input
    :param encoder_state: Encoder state
    :param target_sequence_length: The lengths of each sequence in the target batch
    :param max_target_sequence_length: Maximum length of target sequences
    :param rnn_size: RNN Size
    :param num_layers: Number of layers
    :param target_vocab_to_int: Dictionary to go from the target words to an id
    :param target_vocab_size: Size of target vocabulary
    :param batch_size: The size of the batch
    :param keep_prob: Dropout keep probability
    :param decoding_embedding_size: Decoding embedding size
    :return: Tuple of (Training BasicDecoderOutput, Inference BasicDecoderOutput)
    """
    # Use the same process as in the encoding layer.
    def build_cell(rnn_size, keep_prob):
        lstm = tf.contrib.rnn.LSTMCell(rnn_size)
        lstm_drop = tf.contrib.rnn.DropoutWrapper(lstm, output_keep_prob=keep_prob)
        return lstm_drop
    # Stack them all
    stacked_lstm = tf.contrib.rnn.MultiRNNCell([build_cell(rnn_size, keep_prob) for _ in range(num_layers)])
    
    dec_embeddings = tf.Variable(tf.random_uniform([target_vocab_size, decoding_embedding_size]))
    dec_embed_input = tf.nn.embedding_lookup(dec_embeddings, dec_input)

    dense_layer = Dense(target_vocab_size,
                         kernel_initializer = tf.truncated_normal_initializer(mean = 0.0, stddev=0.1))
    
    with tf.variable_scope("decode") as scope:
        tr_decoder_output = decoding_layer_train(
            encoder_state, stacked_lstm, dec_embed_input, 
            target_sequence_length, max_target_sequence_length, 
            dense_layer, keep_prob)
        scope.reuse_variables()
        inf_decoder_output = decoding_layer_infer(
            encoder_state, stacked_lstm, dec_embeddings, 
            target_vocab_to_int['<GO>'], target_vocab_to_int['<EOS>'], 
            max_target_sequence_length, target_vocab_size, 
            dense_layer, batch_size, keep_prob)
    
    return tr_decoder_output, inf_decoder_output

tests.test_decoding_layer(decoding_layer)

Tests Passed
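
In `decoding_layer`, `tf.nn.embedding_lookup(dec_embeddings, dec_input)` turns a batch of word ids into a batch of embedding vectors. A minimal pure-Python sketch of that lookup follows. The function name, the table values and the ids are all made up for illustration.

```python
def embedding_lookup(table, ids):
    """Replace every id in a batch of sequences with its row of the table."""
    return [[table[i] for i in row] for row in ids]


# Hypothetical 4-word vocabulary with 2-dimensional embeddings.
toy_table = [[0.0, 0.0],
             [0.1, 0.2],
             [0.3, 0.4],
             [0.5, 0.6]]
embedded = embedding_lookup(toy_table, [[2, 0], [3, 1]])
print(embedded)  # [[[0.3, 0.4], [0.0, 0.0]], [[0.5, 0.6], [0.1, 0.2]]]
```

An input of shape `[batch, time]` becomes an output of shape `[batch, time, embedding_size]`. The inference helper receives the embedding table itself because it has to look up whichever token it has just predicted.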


### Build the Neural Network
Apply the functions you implemented above to:

- Encode the input using your `encoding_layer(rnn_inputs, rnn_size, num_layers, keep_prob, source_sequence_length, source_vocab_size, encoding_embedding_size)`.
- Process the target data using your `process_decoder_input(target_data, target_vocab_to_int, batch_size)` function.
- Decode the encoded input using your `decoding_layer(dec_input, enc_state, target_sequence_length, max_target_sentence_length, rnn_size, num_layers, target_vocab_to_int, target_vocab_size, batch_size, keep_prob, dec_embedding_size)` function.

In [16]:
def seq2seq_model(input_data, target_data, keep_prob, batch_size,
                  source_sequence_length, target_sequence_length,
                  max_target_sentence_length,
                  source_vocab_size, target_vocab_size,
                  enc_embedding_size, dec_embedding_size,
                  rnn_size, num_layers, target_vocab_to_int):
    output, state = encoding_layer(input_data, rnn_size, num_layers, keep_prob, 
                   source_sequence_length, source_vocab_size, 
                   enc_embedding_size)
    
    processed_input = process_decoder_input(target_data, target_vocab_to_int, batch_size)
    
    tr_decoder_output, inf_decoder_output = decoding_layer(processed_input, state,
                   target_sequence_length, max_target_sentence_length,
                   rnn_size, num_layers, target_vocab_to_int, target_vocab_size,
                   batch_size, keep_prob, dec_embedding_size)
    
    return tr_decoder_output, inf_decoder_output

tests.test_seq2seq_model(seq2seq_model)

Tests Passed


## Neural Network Training
### Hyperparameters
Tune the following parameters:

- Set `epochs` to the number of epochs.
- Set `batch_size` to the batch size.
- Set `rnn_size` to the size of the RNNs.
- Set `num_layers` to the number of layers.
- Set `encoding_embedding_size` to the size of the embedding for the encoder.
- Set `decoding_embedding_size` to the size of the embedding for the decoder.
- Set `learning_rate` to the learning rate.
- Set `keep_probability` to the dropout keep probability.
- Set `display_step` to the number of batches between each debug output statement.

In [25]:
# Number of Epochs
epochs = 10
# Batch Size
batch_size = 512
# RNN Size
rnn_size = 128
# Number of Layers
num_layers = 2
# Embedding Size
encoding_embedding_size = 128
decoding_embedding_size = 128
# Learning Rate
learning_rate = 0.001
# Dropout Keep Probability
keep_probability = 0.55
# Print progress every `display_step` batches
display_step = 1
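
As a rough worked example of what `batch_size = 512` implies, the arithmetic below estimates the number of batches per epoch. It assumes the preprocessed data keeps all 137861 sentences reported in the dataset statistics, and that the training cell holds out the first `batch_size` sentences for validation and drops a trailing partial batch, as the code later in this notebook does.

```python
num_sentences = 137861   # from the dataset statistics printed earlier
batch_size = 512

# The first batch_size sentences are held out for validation.
num_train = num_sentences - batch_size
# get_batches uses integer division, so a trailing partial batch is dropped.
batches_per_epoch = num_train // batch_size
# The progress line prints len(source_int_text) // batch_size as the denominator.
printed_total = num_sentences // batch_size

print(num_train)          # 137349
print(batches_per_epoch)  # 268
print(printed_total)      # 269
```

Under these assumptions, the denominator shown in the training log is one larger than the number of batches actually processed per epoch.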

### Build the Graph
Build the graph using the neural network you implemented.

In [26]:
save_path = 'checkpoints/dev'
(source_int_text, target_int_text), (source_vocab_to_int, target_vocab_to_int), _ = helper.load_preprocess()
max_target_sentence_length = max([len(sentence) for sentence in source_int_text])

train_graph = tf.Graph()
with train_graph.as_default():
    input_data, targets, lr, keep_prob, target_sequence_length, max_target_sequence_length, source_sequence_length = model_inputs()

    #sequence_length = tf.placeholder_with_default(max_target_sentence_length, None, name='sequence_length')
    input_shape = tf.shape(input_data)

    train_logits, inference_logits = seq2seq_model(tf.reverse(input_data, [-1]),
                                                   targets,
                                                   keep_prob,
                                                   batch_size,
                                                   source_sequence_length,
                                                   target_sequence_length,
                                                   max_target_sequence_length,
                                                   len(source_vocab_to_int),
                                                   len(target_vocab_to_int),
                                                   encoding_embedding_size,
                                                   decoding_embedding_size,
                                                   rnn_size,
                                                   num_layers,
                                                   target_vocab_to_int)


    training_logits = tf.identity(train_logits.rnn_output, name='logits')
    inference_logits = tf.identity(inference_logits.sample_id, name='predictions')

    masks = tf.sequence_mask(target_sequence_length, max_target_sequence_length, dtype=tf.float32, name='masks')

    with tf.name_scope("optimization"):
        cost = tf.contrib.seq2seq.sequence_loss(training_logits,targets,masks)
        optimizer = tf.train.AdamOptimizer(lr)
        gradients = optimizer.compute_gradients(cost)
        capped_gradients = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gradients if grad is not None]
        train_op = optimizer.apply_gradients(capped_gradients)
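
The role of `masks` in the loss can be sketched in plain Python. `tf.sequence_mask` marks the first `length` positions of each row with 1 and the rest with 0. With its default averaging options, `sequence_loss` divides the sum of the weighted per-token losses by the sum of the weights. The helpers and the loss values below are hypothetical stand-ins, not the TensorFlow implementations.

```python
def sequence_mask(lengths, max_len):
    """Row i has 1.0 in its first lengths[i] positions and 0.0 elsewhere."""
    return [[1.0 if step < length else 0.0 for step in range(max_len)]
            for length in lengths]


def masked_mean(per_token_loss, mask):
    """Average the per-token losses, counting only positions where mask is 1."""
    total = sum(loss * m
                for loss_row, mask_row in zip(per_token_loss, mask)
                for loss, m in zip(loss_row, mask_row))
    weight = sum(m for mask_row in mask for m in mask_row)
    return total / weight


mask = sequence_mask([3, 1], max_len=3)
print(mask)  # [[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]]

# Made-up per-token losses; the last two entries of row 2 sit on padding.
losses = [[2.0, 1.0, 3.0],
          [2.0, 9.0, 9.0]]
print(masked_mean(losses, mask))  # 2.0
```

The large losses on the padded positions do not affect the result, which is the purpose of the mask.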


Batch and pad the source and target sequences

In [27]:
def pad_sentence_batch(sentence_batch, pad_int):
    max_sentence = max([len(sentence) for sentence in sentence_batch])
    return [sentence + [pad_int] * (max_sentence - len(sentence)) for sentence in sentence_batch]


def get_batches(sources, targets, batch_size, source_pad_int, target_pad_int):
    for batch_i in range(0, len(sources)//batch_size):
        start_i = batch_i * batch_size

        sources_batch = sources[start_i:start_i + batch_size]
        targets_batch = targets[start_i:start_i + batch_size]

        pad_sources_batch = np.array(pad_sentence_batch(sources_batch, source_pad_int))
        pad_targets_batch = np.array(pad_sentence_batch(targets_batch, target_pad_int))

        pad_targets_lengths = []
        for target in pad_targets_batch:
            pad_targets_lengths.append(len(target))

        pad_source_lengths = []
        for source in pad_sources_batch:
            pad_source_lengths.append(len(source))

        yield pad_sources_batch, pad_targets_batch, pad_source_lengths, pad_targets_lengths
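
A quick usage sketch of `pad_sentence_batch` on a toy batch. The function is restated so the snippet runs on its own, and the ids are made up.

```python
def pad_sentence_batch(sentence_batch, pad_int):
    """Right-pad every sentence with pad_int up to the longest one in the batch."""
    max_sentence = max(len(sentence) for sentence in sentence_batch)
    return [sentence + [pad_int] * (max_sentence - len(sentence))
            for sentence in sentence_batch]


batch = [[5, 6, 7], [8], [9, 10]]
padded = pad_sentence_batch(batch, pad_int=0)
print(padded)                        # [[5, 6, 7], [8, 0, 0], [9, 10, 0]]
print([len(row) for row in padded])  # [3, 3, 3]
```

Because `get_batches` measures the lengths after padding, every length it reports equals the padded width of the batch rather than the original sentence length.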


### Train
Train the neural network on the preprocessed data. If you have trouble getting a good loss, check the forums to see whether anyone else is having the same problem.

In [28]:
def get_accuracy(target, logits):
    """
    Calculate accuracy
    """
    max_seq = max(target.shape[1], logits.shape[1])
    if max_seq - target.shape[1]:
        target = np.pad(
            target,
            [(0,0),(0,max_seq - target.shape[1])],
            'constant')
    if max_seq - logits.shape[1]:
        logits = np.pad(
            logits,
            [(0,0),(0,max_seq - logits.shape[1])],
            'constant')

    return np.mean(np.equal(target, logits))

# Split data to training and validation sets
train_source = source_int_text[batch_size:]
train_target = target_int_text[batch_size:]
valid_source = source_int_text[:batch_size]
valid_target = target_int_text[:batch_size]
(valid_sources_batch, valid_targets_batch, valid_sources_lengths, valid_targets_lengths ) = next(get_batches(valid_source,
                                                                                                             valid_target,
                                                                                                             batch_size,
                                                                                                             source_vocab_to_int['<PAD>'],
                                                                                                             target_vocab_to_int['<PAD>']))                                                                                                  
with tf.Session(graph=train_graph) as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(epochs):
        for batch_i, (source_batch, target_batch, sources_lengths, targets_lengths) in enumerate(
                get_batches(train_source, train_target, batch_size,
                            source_vocab_to_int['<PAD>'],
                            target_vocab_to_int['<PAD>'])):

            _, loss = sess.run(
                [train_op, cost],
                {input_data: source_batch,
                 targets: target_batch,
                 lr: learning_rate,
                 target_sequence_length: targets_lengths,
                 source_sequence_length: sources_lengths,
                 keep_prob: keep_probability})


            if batch_i % display_step == 0 and batch_i > 0:


                batch_train_logits = sess.run(
                    inference_logits,
                    {input_data: source_batch,
                     source_sequence_length: sources_lengths,
                     target_sequence_length: targets_lengths,
                     keep_prob: 1.0})


                batch_valid_logits = sess.run(
                    inference_logits,
                    {input_data: valid_sources_batch,
                     source_sequence_length: valid_sources_lengths,
                     target_sequence_length: valid_targets_lengths,
                     keep_prob: 1.0})

                train_acc = get_accuracy(target_batch, batch_train_logits)

                valid_acc = get_accuracy(valid_targets_batch, batch_valid_logits)

                print('Epoch {:>3} Batch {:>4}/{} - Train Accuracy: {:>6.4f}, Validation Accuracy: {:>6.4f}, Loss: {:>6.4f}'
                      .format(epoch_i, batch_i, len(source_int_text) // batch_size, train_acc, valid_acc, loss))

    # Save Model
    saver = tf.train.Saver()
    saver.save(sess, save_path)
    print('Model trained and saved')

Epoch   0 Batch    1/269 - Train Accuracy: 0.2329, Validation Accuracy: 0.3096, Loss: 5.7553
Epoch   0 Batch    2/269 - Train Accuracy: 0.2655, Validation Accuracy: 0.3096, Loss: 5.5650
Epoch   0 Batch    3/269 - Train Accuracy: 0.2444, Validation Accuracy: 0.3096, Loss: 5.4162
Epoch   0 Batch    4/269 - Train Accuracy: 0.2317, Validation Accuracy: 0.3096, Loss: 5.2458
Epoch   0 Batch    5/269 - Train Accuracy: 0.2325, Validation Accuracy: 0.3096, Loss: 5.0633
Epoch   0 Batch    6/269 - Train Accuracy: 0.2785, Validation Accuracy: 0.3096, Loss: 4.7383
Epoch   0 Batch    7/269 - Train Accuracy: 0.2765, Validation Accuracy: 0.3096, Loss: 4.5589
Epoch   0 Batch    8/269 - Train Accuracy: 0.2503, Validation Accuracy: 0.3182, Loss: 4.5569
Epoch   0 Batch    9/269 - Train Accuracy: 0.2797, Validation Accuracy: 0.3220, Loss: 4.3420
Epoch   0 Batch   10/269 - Train Accuracy: 0.2567, Validation Accuracy: 0.3312, Loss: 4.3775
Epoch   0 Batch   11/269 - Train Accuracy: 0.3034, Validation Accuracy
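
The accuracy figures in the log come from `get_accuracy`, which zero-pads the shorter of the two arrays and then compares them element by element. A pure-Python sketch of the same computation follows. The function name and the ids are hypothetical.

```python
def padded_accuracy(target, predicted, pad_value=0):
    """Pad both batches to the same width, then compare position by position."""
    width = max(len(target[0]), len(predicted[0]))

    def pad(rows):
        return [row + [pad_value] * (width - len(row)) for row in rows]

    target, predicted = pad(target), pad(predicted)
    matches = [t == p
               for t_row, p_row in zip(target, predicted)
               for t, p in zip(t_row, p_row)]
    return sum(matches) / len(matches)


target = [[4, 5, 1, 0],
          [6, 1, 0, 0]]
predicted = [[4, 5, 1],
             [6, 7, 1]]
print(padded_accuracy(target, predicted))  # 0.75
```

Positions where both sides hold the padding value count as correct, so this metric is a rough progress indicator rather than a measure of translation quality.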

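### Save Parameters
Save the `save_path` parameter for inference. Note that the call below stores only `save_path`, so `batch_size` must still be defined when you run the translation cell.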

In [29]:
helper.save_params(save_path)

# Checkpoint

In [30]:
import tensorflow as tf
import numpy as np
import helper
import problem_unittests as tests

_, (source_vocab_to_int, target_vocab_to_int), (source_int_to_vocab, target_int_to_vocab) = helper.load_preprocess()
load_path = helper.load_params()

## Sentence to Sequence
To feed a sentence into the model for translation, you first need to preprocess it. Implement the function `sentence_to_seq()` to preprocess new sentences.

- Convert the sentence to lowercase
- Convert words into ids using `vocab_to_int`
 - Convert words not in the vocabulary to the `<UNK>` word id.

In [31]:
def sentence_to_seq(sentence, vocab_to_int):
    word_ids = []
    for word in sentence.lower().split():
        if word in vocab_to_int:
            word_ids.append(vocab_to_int[word])
        else:
            word_ids.append(vocab_to_int['<UNK>'])
    return word_ids

tests.test_sentence_to_seq(sentence_to_seq)

Tests Passed
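
A short usage sketch with a made-up vocabulary shows how unknown words and capitalization are handled. The function is restated in compact form so the snippet runs on its own, and the ids are hypothetical.

```python
def sentence_to_seq(sentence, vocab_to_int):
    """Lowercase the sentence and map each word to its id, or to <UNK>."""
    unk_id = vocab_to_int['<UNK>']
    return [vocab_to_int.get(word, unk_id) for word in sentence.lower().split()]


# Hypothetical vocabulary for illustration only.
toy_vocab = {'<UNK>': 2, 'he': 4, 'saw': 13, 'truck': 61, '.': 162}
seq = sentence_to_seq('He saw a TRUCK .', toy_vocab)
print(seq)  # [4, 13, 2, 61, 162]
```

The word `a` is missing from the toy vocabulary and therefore maps to the `<UNK>` id. Punctuation must be separated by spaces to be treated as its own token.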


## Translate
This will translate `translate_sentence` from English to French.

In [33]:
translate_sentence = 'he saw a old yellow truck .'

translate_sentence = sentence_to_seq(translate_sentence, source_vocab_to_int)

loaded_graph = tf.Graph()
with tf.Session(graph=loaded_graph) as sess:
    # Load saved model
    loader = tf.train.import_meta_graph(load_path + '.meta')
    loader.restore(sess, load_path)

    input_data = loaded_graph.get_tensor_by_name('input:0')
    logits = loaded_graph.get_tensor_by_name('predictions:0')
    target_sequence_length = loaded_graph.get_tensor_by_name('target_sequence_length:0')
    source_sequence_length = loaded_graph.get_tensor_by_name('source_sequence_length:0')
    keep_prob = loaded_graph.get_tensor_by_name('keep_prob:0')

    translate_logits = sess.run(logits, {input_data: [translate_sentence]*batch_size,
                                         target_sequence_length: [len(translate_sentence)*2]*batch_size,
                                         source_sequence_length: [len(translate_sentence)]*batch_size,
                                         keep_prob: 1.0})[0]

print('Input')
print('  Word Ids:      {}'.format([i for i in translate_sentence]))
print('  English Words: {}'.format([source_int_to_vocab[i] for i in translate_sentence]))

print('\nPrediction')
print('  Word Ids:      {}'.format([i for i in translate_logits]))
print('  French Words: {}'.format(" ".join([target_int_to_vocab[i] for i in translate_logits])))


INFO:tensorflow:Restoring parameters from checkpoints/dev
Input
  Word Ids:      [4, 13, 223, 209, 214, 61, 162]
  English Words: ['he', 'saw', 'a', 'old', 'yellow', 'truck', '.']

Prediction
  Word Ids:      [353, 30, 268, 176, 194, 234, 64, 1]
  French Words: il conduit une petite voiture jaune . <EOS>


## Imperfect Translation
You may notice that some sentences translate better than others. Since the dataset you are using has a vocabulary of only 227 English words, out of the thousands you use in everyday language, you will only see good results with sentences built from those words. For this project, you do not need a perfect translation. However, if you want to build a better translation model, you will need better data.

You can train on the [WMT10 French-English corpus](http://www.statmt.org/wmt10/training-giga-fren.tar). This dataset has a larger vocabulary and covers a richer range of topics. However, it will take days to train, so make sure you have a GPU and that the neural network performs well on the dataset we provided. Just make sure you play with the WMT10 corpus after you have submitted this project.

## Submitting This Project
When submitting this project, make sure to run all the cells before saving the notebook. Save the notebook file as "dlnd_language_translation.ipynb" and also save it as an HTML file via "File" -> "Download as". Include the "helper.py" and "problem_unittests.py" files in your submission.