Only access loss tensor every logging_steps #6802
Commits on Aug 28, 2020
- 18fc69a: Only access loss tensor every logging_steps
  * tensor.item() was being called every step. This must not be done for XLA:TPU tensors, as it forces TPU<->CPU communication at each step and is terrible for performance. On RoBERTa MLM, for example, skipping it reduces step time by 30%; the gain should be larger for models/tasks with shorter step times.
  * The train batch size was not correct when a user set the `per_gpu_train_batch_size` flag.
  * Average-reduce the loss across eval shards.
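  For context, the pattern this commit targets looks roughly like the sketch below: accumulate the running loss as a device tensor and only materialize it on the host every `logging_steps`. This is a minimal illustration under assumed names (the model, dataloader, and loss are stand-ins), not the Trainer's actual code:

  ```python
  import torch

  # Hypothetical stand-ins for the real model/dataloader/optimizer.
  model = torch.nn.Linear(8, 1)
  dataloader = [(torch.randn(4, 8), torch.randn(4, 1)) for _ in range(200)]
  optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
  loss_fn = torch.nn.MSELoss()

  logging_steps = 50
  tr_loss = torch.tensor(0.0)  # running loss stays a tensor on the device

  for step, (x, y) in enumerate(dataloader):
      loss = loss_fn(model(x), y)
      loss.backward()
      optimizer.step()
      optimizer.zero_grad()

      tr_loss += loss.detach()  # tensor-only accumulation; no device->host sync

      # .item() forces a device->host transfer (and a graph execution under
      # XLA:TPU), so defer it until a log line is actually emitted.
      if (step + 1) % logging_steps == 0:
          print(f"step {step + 1}: avg loss {tr_loss.item() / logging_steps:.4f}")
          tr_loss = torch.zeros_like(tr_loss)
  ```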
- 20f7786
- 3cac867
- 5ab21b0
Commits on Aug 29, 2020
- ac47458
- 0f58903
- 22933e6
Commits on Aug 30, 2020
- 563485b
- a584761
- d176aaa: Add model card for singbert lite. Update widget for singbert and singbert-large. (huggingface#6827)
- 0eecace
- 32fe440: Clearly indicate shuffle=False (huggingface#6312)
  * Clarify shuffle
  Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
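  The clarification presumably concerns PyTorch's DataLoader, where shuffling is off unless requested explicitly; a minimal illustration (the dataset here is a stand-in):

  ```python
  import torch
  from torch.utils.data import DataLoader, TensorDataset

  dataset = TensorDataset(torch.arange(10).float())

  # shuffle=False (the default): batches follow dataset order, which matters
  # for evaluation and reproducible debugging.
  eval_loader = DataLoader(dataset, batch_size=4, shuffle=False)

  # shuffle=True: indices are reshuffled at the start of every epoch.
  train_loader = DataLoader(dataset, batch_size=4, shuffle=True)
  ```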
- dfa10a4
Commits on Aug 31, 2020
- 0e83769
- 05c3214
- 4561f05
- 895d394
- 2de7ee0: Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (huggingface#6644)
  * Add a data collator and dataset for the next sentence prediction task
  * Bug fix (number of special tokens & truncate sequences)
  * Bug fix (+ dict input support for the data collator)
  * Add padding for the NSP data collator; rename cached files to avoid conflicts
  * Add a test for the NSP data collator
  * Style
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
  Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
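  A hedged sketch of how the new pieces plug into the Trainer. The class names (`TextDatasetForNextSentencePrediction`, `DataCollatorForNextSentencePrediction`) match what huggingface#6644 introduced, but constructor arguments varied across transformers 3.x releases and the collator was later folded into `DataCollatorForLanguageModeling`, so treat the exact signatures as assumptions; the file path and output dir are placeholders:

  ```python
  from transformers import (
      BertForPreTraining,
      BertTokenizer,
      DataCollatorForNextSentencePrediction,
      TextDatasetForNextSentencePrediction,
      Trainer,
      TrainingArguments,
  )

  tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
  model = BertForPreTraining.from_pretrained("bert-base-uncased")

  # One sentence per line, documents separated by blank lines.
  dataset = TextDatasetForNextSentencePrediction(
      tokenizer=tokenizer,
      file_path="corpus.txt",
      block_size=128,
  )

  # Pads batches and builds the MLM + NSP labels BertForPreTraining expects.
  data_collator = DataCollatorForNextSentencePrediction(tokenizer=tokenizer)

  trainer = Trainer(
      model=model,
      args=TrainingArguments(output_dir="nsp-out"),
      data_collator=data_collator,
      train_dataset=dataset,
  )
  trainer.train()
  ```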
- d2f9cb8
- c48546c
- ac03af4: Only access loss tensor every logging_steps (same message as 18fc69a above)
- db74df3
- 2b981cd