From 792f314da7d21678a536dd9833fec0d2216d080f Mon Sep 17 00:00:00 2001
From: Stefan Schweter
Date: Sat, 3 Mar 2018 18:42:00 +0100
Subject: [PATCH 1/4] readme: formatting fix

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 755d080b6..cd9ab9331 100644
--- a/README.md
+++ b/README.md
@@ -154,7 +154,7 @@ For all translation problems, we suggest to try the Transformer model:
 this should reach a BLEU score of about 28 on the English-German data-set,
 which is close to state-of-the art. If training on a single GPU, try the
 `--hparams_set=transformer_base_single_gpu` setting. For very good results
-or larger data-sets (e.g., for English-French)m, try the big model
+or larger data-sets (e.g., for English-French), try the big model
 with `--hparams_set=transformer_big`.
 
 ## Basics

From 6c4ef8109e27d51cbb9e70c949d9062661ea612b Mon Sep 17 00:00:00 2001
From: Stefan Schweter
Date: Sat, 3 Mar 2018 18:42:21 +0100
Subject: [PATCH 2/4] docs: formatting fix

---
 docs/index.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/index.md b/docs/index.md
index 8860e03b7..b7d0236c9 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -69,7 +69,7 @@ For language modeling, we have these data-sets in T2T:
 * LM1B (a billion-word corpus): `--problems=languagemodel_lm1b32k` for
   subword-level modeling and `--problems=languagemodel_lm1b_characters`
   for character-level modeling.
- 
+
 We suggest to start with `--model=transformer` on this task and use
 `--hparams_set=transformer_small` for PTB and
 `--hparams_set=transformer_base` for LM1B.
@@ -95,7 +95,7 @@ For speech-to-text, we have these data-sets in T2T:
 For summarizing longer text into shorter one we have these data-sets:
 * CNN/DailyMail articles summarized into a few sentences:
   `--problems=summarize_cnn_dailymail32k`
- 
+
 We suggest to use `--model=transformer` and
 `--hparams_set=transformer_prepend` for this task.
 This yields good ROUGE scores.
@@ -118,5 +118,5 @@ For all translation problems, we suggest to try the Transformer model:
 this should reach a BLEU score of about 28 on the English-German data-set,
 which is close to state-of-the art. If training on a single GPU, try the
 `--hparams_set=transformer_base_single_gpu` setting. For very good results
-or larger data-sets (e.g., for English-French)m, try the big model
+or larger data-sets (e.g., for English-French), try the big model
 with `--hparams_set=transformer_big`.

From 7b2791800f675306592f802d42a499c320a4cc32 Mon Sep 17 00:00:00 2001
From: Stefan Schweter
Date: Sat, 3 Mar 2018 18:42:44 +0100
Subject: [PATCH 3/4] lstm: minor spelling fixes

---
 tensor2tensor/models/lstm.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tensor2tensor/models/lstm.py b/tensor2tensor/models/lstm.py
index 9c6144e9f..d05c1f599 100644
--- a/tensor2tensor/models/lstm.py
+++ b/tensor2tensor/models/lstm.py
@@ -65,7 +65,7 @@ def dropout_lstm_cell():
     attention_mechanism_class = tf.contrib.seq2seq.BahdanauAttention
   else:
     raise ValueError("Unknown hparams.attention_mechanism = %s, must be "
-                     "luong or bahdanu." % hparams.attention_mechanism)
+                     "luong or bahdanau." % hparams.attention_mechanism)
   attention_mechanism = attention_mechanism_class(
       hparams.hidden_size, encoder_outputs)
 
@@ -338,7 +338,7 @@ def lstm_attention():
 
 @registry.register_hparams
 def lstm_bahdanau_attention_multi():
-  """Multi-head Bahdanu attention."""
+  """Multi-head Bahdanau attention."""
   hparams = lstm_bahdanau_attention()
   hparams.num_heads = 4
   return hparams

From 033b667539cbff17f39c8473d65599f2f65a274a Mon Sep 17 00:00:00 2001
From: Stefan Schweter
Date: Sat, 3 Mar 2018 18:43:05 +0100
Subject: [PATCH 4/4] distributed: reference to tf.contrib.learn is
 deprecated. Link to tf.estimator.RunConfig is used instead. Introduced with
 https://github.com/tensorflow/tensorflow/commit/c7caa2d87daa37b66811ac99f997ad02acd4ecc8

---
 docs/distributed_training.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/distributed_training.md b/docs/distributed_training.md
index 9ed9778da..95b499f87 100644
--- a/docs/distributed_training.md
+++ b/docs/distributed_training.md
@@ -5,7 +5,7 @@ training.
 
 T2T uses TensorFlow Estimators and so distributed training is configured with
 the `TF_CONFIG` environment variable that is read by the
-[RunConfig](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/estimator/run_config.py)
+[RunConfig](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/estimator/run_config.py)
 along with a set of flags.
 
 ## `TF_CONFIG`
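
Context for PATCH 4/4: `tf.estimator.RunConfig` reads the cluster layout and the current task's role from the `TF_CONFIG` environment variable referenced in the patched doc. Below is a minimal sketch of exporting that variable from Python; the host:port addresses and the one-master/one-ps layout are illustrative placeholders, not values taken from these patches or from the T2T docs.

```python
# Minimal sketch of the TF_CONFIG structure parsed by tf.estimator.RunConfig.
# The cluster addresses and task assignment are placeholder assumptions.
import json
import os

os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {
        "master": ["host1:2222"],  # training process
        "ps": ["host2:2222"],      # parameter server
    },
    "task": {"type": "master", "index": 0},  # role of this process
})
```

Each process in the cluster would export the same `cluster` map and change only the `task` entry to its own role, e.g. `{"type": "ps", "index": 0}` on the parameter server.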