
Only access loss tensor every logging_steps #6802

Merged · 23 commits · Aug 31, 2020

Commits on Aug 28, 2020

  1. Only access loss tensor every logging_steps

    * tensor.item() was being called every step. This must not be done
    for XLA:TPU tensors, as it forces TPU<>CPU communication at each
    step and is terrible for performance. On RoBERTa MLM, for example,
    it reduces step time by 30%; the gain should be larger for
    models/tasks with shorter step times.
    * Train batch size was not correct when a user uses the
    `per_gpu_train_batch_size` flag.
    * Average-reduce loss across eval shards.

    (Sketches of the loss-logging and eval-reduction patterns follow
    this day's commit list.)
    jysohn23 committed Aug 28, 2020 · 18fc69a
  2. 20f7786
  3. 3cac867
  4. 5ab21b0
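To illustrate the first bullet of commit 1 above: a minimal sketch of the pattern, not the PR's actual Trainer code. The loop structure, names, and `logging_steps` default here are illustrative.

```python
# Minimal sketch of "only access the loss tensor every logging_steps".
# Illustrative only; not the Trainer's actual code.
import torch

def train_loop(model, optimizer, dataloader, logging_steps=100):
    device = next(model.parameters()).device
    tr_loss = torch.zeros(1, device=device)  # running loss stays on device
    for step, batch in enumerate(dataloader, start=1):
        loss = model(**batch)[0]  # assumes the model returns the loss first
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        tr_loss += loss.detach()  # tensor-to-tensor add: no TPU<>CPU sync
        if step % logging_steps == 0:
            # .item() materializes the scalar on the host; doing this only
            # every logging_steps avoids a device sync on every step.
            print(f"step {step}: avg loss {tr_loss.item() / step:.4f}")
```

And for the third bullet, a sketch of averaging the eval loss across TPU shards. `xm.mesh_reduce` is torch_xla's cross-process reduction; the PR's exact call site may differ.

```python
# Sketch of averaging an eval loss across TPU shards (third bullet above).
# mesh_reduce collects the value from every process and applies the given
# reduction on each of them.
import torch_xla.core.xla_model as xm

def reduce_eval_loss(local_eval_loss: float) -> float:
    return xm.mesh_reduce("eval_loss", local_eval_loss,
                          lambda losses: sum(losses) / len(losses))
```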

Commits on Aug 29, 2020

  1. ac47458
  2. 0f58903
  3. 22933e6

Commits on Aug 30, 2020

  1. 563485b
  2. a584761
  3. d176aaa
  4. 0eecace
  5. clearly indicate shuffle=False (huggingface#6312)

    * Clarify shuffle
    
    * clarify shuffle
    
    Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
    xujiaze13 and JetRunner committed Aug 30, 2020 · 32fe440
  6. dfa10a4
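Commit 5 above is a documentation clarification; as a hedged illustration of what spelling out `shuffle=False` looks like, a toy example (not the PR's diff):

```python
# Toy illustration of making shuffle=False explicit (not the PR's diff).
from torch.utils.data import DataLoader

eval_dataset = list(range(16))  # stand-in for a real evaluation dataset

# Passing shuffle=False explicitly documents that examples are read in
# order, rather than relying on the DataLoader default.
eval_dataloader = DataLoader(eval_dataset, batch_size=4, shuffle=False)

for batch in eval_dataloader:
    print(batch)  # tensor([0, 1, 2, 3]), tensor([4, 5, 6, 7]), ...
```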

Commits on Aug 31, 2020

  1. Style

    LysandreJik committed Aug 31, 2020 · 0e83769
  2. Patch logging issue

    LysandreJik committed Aug 31, 2020 · 05c3214
  3. 4561f05
  4. 895d394
  5. Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (huggingface#6644)
    
    * add datacollator and dataset for next sentence prediction task
    
    * bug fix (number of special tokens & sequence truncation)
    
    * bug fix (+ dict inputs support for data collator)
    
    * add padding for nsp data collator; renamed cached files to avoid conflict.
    
    * add test for nsp data collator
    
    * Style
    
    Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
    3 people committed Aug 31, 2020 · 2de7ee0
  6. d2f9cb8
  7. c48546c
  8. Only access loss tensor every logging_steps

    * tensor.item() was being called every step. This must not be done
    for XLA:TPU tensors, as it forces TPU<>CPU communication at each
    step and is terrible for performance. On RoBERTa MLM, for example,
    it reduces step time by 30%; the gain should be larger for
    models/tasks with shorter step times.
    * Train batch size was not correct when a user uses the
    `per_gpu_train_batch_size` flag.
    * Average-reduce loss across eval shards.
    jysohn23 committed Aug 31, 2020 · ac03af4
  9. db74df3
  10. comments

    jysohn23 committed Aug 31, 2020 · 2b981cd
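As an illustration of commit 5 above (huggingface#6644), a hedged usage sketch of the NSP dataset and data collator it describes. The class names, arguments, and input-file format below are assumptions based on the transformers API of that era, not copied from the diff; `corpus.txt` is a hypothetical path.

```python
# Hedged usage sketch for the NSP utilities described in commit 5 above.
# Class names and arguments are assumptions, not copied from the PR's diff.
from transformers import (
    BertTokenizer,
    DataCollatorForNextSentencePrediction,
    TextDatasetForNextSentencePrediction,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Builds sentence-pair examples (with random "not next" sentences) from a
# plain-text corpus; file format and block_size are assumptions.
dataset = TextDatasetForNextSentencePrediction(
    tokenizer=tokenizer,
    file_path="corpus.txt",  # hypothetical path
    block_size=128,
)

# Pads batches of sentence-pair examples, per the commit's description.
data_collator = DataCollatorForNextSentencePrediction(tokenizer=tokenizer)
```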