Optimize llm/GPT3 performance #8172
Conversation
Thanks for your contribution!
Force-pushed from 2c30fcb to 44b5b99.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #8172      +/-   ##
===========================================
- Coverage    55.41%   55.16%    -0.26%
===========================================
  Files          597      601       +4
  Lines        91593    91780     +187
===========================================
- Hits         50754    50628     -126
- Misses       40839    41152     +313

☔ View full report in Codecov by Sentry.
Force-pushed from d8e6538 to 4688465.
LGTM
PR types
Performance optimization
PR changes
Others
Description
Migrate the optimization strategies from model_zoo/gpt-3 to llm/gpt-3:
Fast Layer Norm op
How to enable: pass --use_fast_layer_norm true when running run_pretrain.py.
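For orientation, below is the standard unfused layer-norm computation that the fast kernel replaces with a single fused GPU kernel; the shapes and epsilon are illustrative, not taken from this PR:

    import paddle
    import paddle.nn.functional as F

    # Unfused reference: a mean/variance reduction plus several
    # elementwise ops, each a separate kernel launch. The fast
    # layer-norm op computes the same result in one fused kernel.
    hidden = 4096                                # illustrative hidden size
    x = paddle.randn([2, 1024, hidden])          # [batch, seq_len, hidden]
    weight = paddle.ones([hidden])
    bias = paddle.zeros([hidden])
    y = F.layer_norm(x, normalized_shape=hidden, weight=weight,
                     bias=bias, epsilon=1e-5)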
Fused Linear
How to enable: pass --use_fused_linear true when running run_pretrain.py.
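As a sketch (the sizes are illustrative), the reference computation below is a matmul followed by a bias add; the fused-linear path performs the bias add inside the GEMM epilogue, saving a kernel launch:

    import paddle
    import paddle.nn.functional as F

    in_features, out_features = 4096, 4096      # illustrative sizes
    x = paddle.randn([8, in_features])
    w = paddle.randn([in_features, out_features])
    b = paddle.randn([out_features])
    # Reference computation: y = x @ w + b.
    y = F.linear(x, w, b)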
Fused Dropout + Add Residual
How to enable: pass --use_fused_dropout_add true when running run_pretrain.py.
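For reference, this is the unfused dropout-plus-residual pattern the fused kernel replaces: two kernels and one extra intermediate tensor per call (a minimal sketch, with illustrative shapes and dropout rate):

    import paddle
    import paddle.nn.functional as F

    # Unfused reference: dropout writes an intermediate tensor, then a
    # separate elementwise add folds in the residual. The fused op does
    # both in one kernel.
    def dropout_add_reference(x, residual, p=0.1, training=True):
        return residual + F.dropout(x, p=p, training=training)

    out = dropout_add_reference(paddle.randn([2, 1024, 4096]),
                                paddle.randn([2, 1024, 4096]))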
enable_linear_fused_grad_add
How to enable: requires paddlenlp >= a5d87f5. Pass --enable_linear_fused_grad_add true when running run_pretrain.py.
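Conceptually (a sketch of the idea, not the PaddleNLP implementation), fusing the gradient add removes one elementwise kernel and one extra pass over the weight-gradient tensor during gradient accumulation:

    import paddle

    # In a linear layer's backward pass the weight gradient is
    # dW = x^T @ dy, and under gradient accumulation it is added into
    # the stored gradient:
    #
    #   unfused:  tmp = x^T @ dy      # one GEMM, one temporary
    #             w_grad += tmp       # one extra elementwise add
    #
    #   fused:    the accumulation happens inside the GEMM epilogue.
    x = paddle.randn([8, 4096])
    dy = paddle.randn([8, 4096])
    w_grad = paddle.zeros([4096, 4096])
    # Conceptual fused form: compute and accumulate in one expression.
    w_grad = paddle.matmul(x, dy, transpose_x=True) + w_grad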
use SPInnerOverlap
How to enable: requires paddlenlp >= a092775 and paddle >= cfaa001. Pass --tensor_parallel_config enable_mp_async_allreduce enable_fused_linear_param_grad_add when running run_pretrain.py, and enable sequence parallelism (sp) for the training run.
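The two tensor_parallel_config values below are taken from this PR description; the surrounding Python is only an illustration of how they would appear in a training-arguments object, and the comments reflect a common reading of what each switch does, not the PR's own documentation:

    # Values from this PR description; variable names are illustrative.
    tensor_parallel_config = [
        "enable_mp_async_allreduce",           # overlap the tensor-parallel
                                               # allreduce of the input grad
                                               # with the weight-grad GEMM
        "enable_fused_linear_param_grad_add",  # fuse the weight-grad add
                                               # (see the previous item)
    ]
    sequence_parallel = True                   # sp must also be enabled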
disable transmission of attention_mask while pp
How to enable: on by default; the attention_mask is not transmitted between stages when pipeline parallelism (pp) is used.
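The reason the mask can be skipped: for GPT-style causal attention the mask depends only on the sequence length, so each pipeline stage can rebuild it locally instead of receiving it from the previous stage. An illustrative reconstruction (not the PR's exact code):

    import paddle

    # Each stage regenerates the causal mask from seq_len alone,
    # avoiding a [batch, 1, seq_len, seq_len] transfer between stages.
    seq_len = 1024
    causal_mask = paddle.tril(paddle.ones([seq_len, seq_len])).astype("bool")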