[s2s README] Add more dataset download instructions #6737

sshleifer · 2020-08-26T01:32:17Z

Improves formatting in seq2seq readme and adds download instructions for wmt-en-de and cnn_dm_v2 (cleaned cnn_dm without empty examples).

codecov · 2020-08-26T01:39:19Z

Codecov Report

Merging #6737 into master will decrease coverage by 0.10%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #6737      +/-   ##
==========================================
- Coverage   80.02%   79.92%   -0.11%     
==========================================
  Files         157      157              
  Lines       28586    28586              
==========================================
- Hits        22877    22848      -29     
- Misses       5709     5738      +29

Impacted Files	Coverage Δ
src/transformers/configuration_openai.py	`34.28% <0.00%> (-62.86%)`	⬇️
src/transformers/tokenization_albert.py	`28.84% <0.00%> (-58.66%)`	⬇️
src/transformers/modeling_openai.py	`23.87% <0.00%> (-57.10%)`	⬇️
src/transformers/modeling_tf_distilbert.py	`64.47% <0.00%> (-34.36%)`	⬇️
src/transformers/tokenization_dpr.py	`53.15% <0.00%> (-4.51%)`	⬇️
src/transformers/configuration_bart.py	`90.00% <0.00%> (-4.00%)`	⬇️
src/transformers/generation_tf_utils.py	`84.96% <0.00%> (-1.76%)`	⬇️
src/transformers/generation_utils.py	`96.66% <0.00%> (-0.28%)`	⬇️
src/transformers/file_utils.py	`82.66% <0.00%> (+0.25%)`	⬆️
src/transformers/modeling_tf_utils.py	`87.29% <0.00%> (+0.32%)`	⬆️
... and 13 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 22933e6...b4c1c2f.

examples/seq2seq/README.md

* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU tensors as it's terrible for performance causing TPU<>CPU communication at each step. On RoBERTa MLM for example, it reduces step time by 30%, should be larger for smaller step time models/tasks.
* Train batch size was not correct in case a user uses the `per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards
* Fix style (#6803)
* t5 model should make decoder_attention_mask (#6800)
* [s2s] Test hub configs in self-scheduled CI (#6809)
* [s2s] round runtime in run_eval (#6798)
* Pegasus finetune script: add --adafactor (#6811)
* [bart] rename self-attention -> attention (#6708)
* [tests] fix typos in inputs (#6818)
* Fixed open in colab link (#6825)
* Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827)
* BR_BERTo model card (#6793)
* clearly indicate shuffle=False (#6312)
* Clarify shuffle
* clarify shuffle
  Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* [s2s README] Add more dataset download instructions (#6737)
* Style
* Patch logging issue
* Set default logging level to `WARNING` instead of `INFO`
* TF Flaubert w/ pre-norm (#6841)
* Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)
* add datacollator and dataset for next sentence prediction task
* bug fix (numbers of special tokens & truncate sequences)
* bug fix (+ dict inputs support for data collator)
* add padding for nsp data collator; renamed cached files to avoid conflict.
* add test for nsp data collator
* Style
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
* Fix in Adafactor docstrings (#6845)
* Fix resuming training for Windows (#6847)
* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU tensors as it's terrible for performance causing TPU<>CPU communication at each step. On RoBERTa MLM for example, it reduces step time by 30%, should be larger for smaller step time models/tasks.
* Train batch size was not correct in case a user uses the `per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards
* comments

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Thomas Ashish Cherian <6967017+PandaWhoCodes@users.noreply.github.com>
Co-authored-by: Zane Lim <zyuanlim@gmail.com>
Co-authored-by: Rodolfo De Nadai <rdenadai@gmail.com>
Co-authored-by: xujiaze13 <37360975+xujiaze13@users.noreply.github.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Huang Lianzhe <hlz@pku.edu.cn>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
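The "only access loss tensor every logging_steps" item in the commit message above can be illustrated with a minimal sketch. This is not the Trainer code from that commit: `CountingTensor` and `train` are hypothetical names, and the class is a pure-Python stand-in for a device tensor that only counts how often `.item()` forces a device-to-host transfer.

```python
class CountingTensor:
    """Hypothetical stand-in for a device tensor; counts host transfers."""

    syncs = 0  # number of simulated device-to-host transfers

    def __init__(self, value):
        self.value = float(value)

    def __add__(self, other):
        # Addition stays "on device": no host transfer is counted.
        return CountingTensor(self.value + other.value)

    def item(self):
        # Mimics tensor.item(): each call counts as one host transfer.
        CountingTensor.syncs += 1
        return self.value


def train(step_losses, logging_steps):
    """Return the average loss per logging window, reading it only when logging."""
    running = CountingTensor(0.0)
    logged_total = 0.0
    logs = []
    for step, loss in enumerate(step_losses, start=1):
        running = running + loss  # accumulate without calling .item()
        if step % logging_steps == 0:
            total = running.item()  # one host transfer per logging window
            logs.append((total - logged_total) / logging_steps)
            logged_total = total
    return logs


CountingTensor.syncs = 0
losses = [CountingTensor(v) for v in (4.0, 2.0, 3.0, 1.0)]
logs = train(losses, logging_steps=2)
```

With four steps and `logging_steps=2`, the sketch reads the loss twice instead of four times; the averages it logs are the same as if `.item()` had been called at every step.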



Add wmt-en-de fetching instructions

873197d

sshleifer added 2 commits August 29, 2020 17:48

more md

9f482fe

Cnn cleaned link

4854893

sshleifer changed the title ~~Add wmt-en-de fetching instructions~~ [s2s README] Add wmt-en-de fetching instructions Aug 29, 2020

Merge branch 'master' into en-de-instructions

56a7659

JetRunner reviewed Aug 30, 2020

examples/seq2seq/README.md (outdated, resolved)

rename cnn_cln -> cnn_dm_v2

d010de4

sshleifer changed the title ~~[s2s README] Add wmt-en-de fetching instructions~~ [s2s README] Add more dataset download instructions Aug 30, 2020

cleanup

b4c1c2f

sshleifer merged commit dfa10a4 into huggingface:master Aug 30, 2020

sshleifer deleted the en-de-instructions branch August 30, 2020 20:29

stas00 pushed a commit to stas00/transformers that referenced this pull request Sep 1, 2020

[s2s README] Add more dataset download instructions (huggingface#6737)

1e4f348

Zigur pushed a commit to Zigur/transformers that referenced this pull request Oct 26, 2020

[s2s README] Add more dataset download instructions (huggingface#6737)

7fb4cc4

fabiocapsouza pushed a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020

[s2s README] Add more dataset download instructions (huggingface#6737)

6aaf7f6

fabiocapsouza added a commit to fabiocapsouza/transformers that referenced this pull request Nov 15, 2020

Revert "[s2s README] Add more dataset download instructions (huggingf…

032411e

…ace#6737)" This reverts commit 6aaf7f6.
