Optimize llm/GPT3 performance #8172
Conversation
Thanks for your contribution!
Force-pushed from 2c30fcb to 44b5b99.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #8172      +/-   ##
===========================================
- Coverage    55.41%   55.16%    -0.26%
===========================================
  Files          597      601       +4
  Lines        91593    91780     +187
===========================================
- Hits         50754    50628     -126
- Misses       40839    41152     +313

☔ View full report in Codecov by Sentry.
Force-pushed from d8e6538 to 4688465.
LGTM
PR types
Performance optimization
PR changes
Others
Description
Migrate the optimization strategies from model_zoo/gpt-3 to llm/gpt-3:
Fast Layer Norm op
How to enable: pass --use_fast_layer_norm true when running run_pretrain.py.
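For orientation, below is the standard unfused layer-norm computation that the fast kernel replaces with a single fused GPU kernel; the shapes and epsilon are illustrative, not taken from this PR:

    import paddle
    import paddle.nn.functional as F

    # Unfused reference: a mean/variance reduction plus several
    # elementwise ops, each a separate kernel launch. The fast
    # layer-norm op computes the same result in one fused kernel.
    hidden = 4096                                # illustrative hidden size
    x = paddle.randn([2, 1024, hidden])          # [batch, seq_len, hidden]
    weight = paddle.ones([hidden])
    bias = paddle.zeros([hidden])
    y = F.layer_norm(x, normalized_shape=hidden, weight=weight,
                     bias=bias, epsilon=1e-5)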
Fused Linear
How to enable: pass --use_fused_linear true when running run_pretrain.py.
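As a sketch (the sizes are illustrative), the reference computation below is a matmul followed by a bias add; the fused-linear path performs the bias add inside the GEMM epilogue, saving a kernel launch:

    import paddle
    import paddle.nn.functional as F

    in_features, out_features = 4096, 4096      # illustrative sizes
    x = paddle.randn([8, in_features])
    w = paddle.randn([in_features, out_features])
    b = paddle.randn([out_features])
    # Reference computation: y = x @ w + b.
    y = F.linear(x, w, b)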
Fused Dropout + Add Residual
How to enable: pass --use_fused_dropout_add true when running run_pretrain.py.
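For reference, this is the unfused dropout-plus-residual pattern the fused kernel replaces: two kernels and one extra intermediate tensor per call (a minimal sketch, with illustrative shapes and dropout rate):

    import paddle
    import paddle.nn.functional as F

    # Unfused reference: dropout writes an intermediate tensor, then a
    # separate elementwise add folds in the residual. The fused op does
    # both in one kernel.
    def dropout_add_reference(x, residual, p=0.1, training=True):
        return residual + F.dropout(x, p=p, training=training)

    out = dropout_add_reference(paddle.randn([2, 1024, 4096]),
                                paddle.randn([2, 1024, 4096]))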
enable_linear_fused_grad_add
How to enable: requires paddlenlp >= a5d87f5. Pass --enable_linear_fused_grad_add true when running run_pretrain.py.
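Conceptually (a sketch of the idea, not the PaddleNLP implementation), fusing the gradient add removes one elementwise kernel and one extra pass over the weight-gradient tensor during gradient accumulation:

    import paddle

    # In a linear layer's backward pass the weight gradient is
    # dW = x^T @ dy, and under gradient accumulation it is added into
    # the stored gradient:
    #
    #   unfused:  tmp = x^T @ dy      # one GEMM, one temporary
    #             w_grad += tmp       # one extra elementwise add
    #
    #   fused:    the accumulation happens inside the GEMM epilogue.
    x = paddle.randn([8, 4096])
    dy = paddle.randn([8, 4096])
    w_grad = paddle.zeros([4096, 4096])
    # Conceptual fused form: compute and accumulate in one expression.
    w_grad = paddle.matmul(x, dy, transpose_x=True) + w_grad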
use SPInnerOverlap
How to enable: requires paddlenlp >= a092775 and paddle >= cfaa001. Pass --tensor_parallel_config enable_mp_async_allreduce enable_fused_linear_param_grad_add when running run_pretrain.py, and enable sequence parallelism (sp) for the training run.
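The two tensor_parallel_config values below are taken from this PR description; the surrounding Python is only an illustration of how they would appear in a training-arguments object, and the comments reflect a common reading of what each switch does, not the PR's own documentation:

    # Values from this PR description; variable names are illustrative.
    tensor_parallel_config = [
        "enable_mp_async_allreduce",           # overlap the tensor-parallel
                                               # allreduce of the input grad
                                               # with the weight-grad GEMM
        "enable_fused_linear_param_grad_add",  # fuse the weight-grad add
                                               # (see the previous item)
    ]
    sequence_parallel = True                   # sp must also be enabled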
disable transmission of attention_mask while pp
How to enable: on by default; the attention_mask is not transmitted between stages when pipeline parallelism (pp) is used.
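The reason the mask can be skipped: for GPT-style causal attention the mask depends only on the sequence length, so each pipeline stage can rebuild it locally instead of receiving it from the previous stage. An illustrative reconstruction (not the PR's exact code):

    import paddle

    # Each stage regenerates the causal mask from seq_len alone,
    # avoiding a [batch, 1, seq_len, seq_len] transfer between stages.
    seq_len = 1024
    causal_mask = paddle.tril(paddle.ones([seq_len, seq_len])).astype("bool")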