
[LLM] Add sequence_parallel support for qwen #8558

Merged
11 commits merged on Jun 21, 2024

Conversation

@Difers Difers (Contributor) commented Jun 6, 2024

PR types

Performance optimization

PR changes

Models

Description

Add sequence_parallel support to the Qwen model.

Self-test records:
With sequence_parallel disabled:
workerlog_disable_sp.0.log
With sequence_parallel enabled:
workerlog_sp.0.log
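For context, sequence parallelism shards the activations along the sequence dimension across tensor-parallel ranks, so each rank only holds and normalizes its own slice of tokens. A minimal NumPy sketch of the scatter/gather idea (function names and shapes are illustrative, not PaddleNLP's actual API):

```python
import numpy as np

def split_sequence(hidden, rank, world_size):
    """Scatter a [seq_len, batch, hidden] activation along the sequence
    axis so each rank holds seq_len // world_size tokens."""
    seq_len = hidden.shape[0]
    assert seq_len % world_size == 0, "seq_len must divide evenly across ranks"
    chunk = seq_len // world_size
    return hidden[rank * chunk:(rank + 1) * chunk]

def gather_sequence(chunks):
    """All-gather the per-rank chunks back to the full sequence,
    e.g. before attention, which needs to see every token."""
    return np.concatenate(chunks, axis=0)

seq_len, batch, hidden_size, world_size = 8, 2, 4, 2
x = np.arange(seq_len * batch * hidden_size, dtype=np.float32)
x = x.reshape(seq_len, batch, hidden_size)

shards = [split_sequence(x, r, world_size) for r in range(world_size)]
assert shards[0].shape == (4, 2, 4)          # each rank holds half the tokens
assert np.array_equal(gather_sequence(shards), x)  # lossless round trip
```

In the real implementation the split/gather are collective communication ops inside the model, not plain slicing, but the shape bookkeeping is the same.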

paddle-bot bot commented Jun 6, 2024

Thanks for your contribution!

codecov bot commented Jun 6, 2024

Codecov Report

Attention: Patch coverage is 40.35088% with 34 lines in your changes missing coverage. Please review.

Project coverage is 55.80%. Comparing base (5619cc3) to head (396bdc9).
Report is 1 commit behind head on develop.

Files Patch % Lines
paddlenlp/transformers/qwen/modeling.py 45.09% 28 Missing ⚠️
paddlenlp/transformers/qwen/modeling_pp.py 0.00% 6 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #8558      +/-   ##
===========================================
- Coverage    55.81%   55.80%   -0.02%     
===========================================
  Files          620      620              
  Lines        96599    96642      +43     
===========================================
+ Hits         53917    53928      +11     
- Misses       42682    42714      +32     


@@ -476,15 +525,22 @@ def get_tensor_parallel_split_mappings(num_hidden_layers):
base_actions = {
# Column Linear
"lm_head.weight": partial(fn, is_column=True),
"qwen.h.0.mlp.w2.weight": partial(fn, is_column=True),
"qwen.h.0.mlp.w1.weight": partial(fn, is_column=True),
"qwen.h.0.attn.c_attn.weight": partial(fn, is_column=True, is_naive_3fuse=True),
"qwen.h.0.attn.c_attn.bias": partial(fn, is_column=True, is_naive_3fuse=True),
# Row Linear
"qwen.wte.weight": partial(fn, is_column=False),
"qwen.h.0.mlp.c_proj.weight": partial(fn, is_column=False),
"qwen.h.0.attn.c_proj.weight": partial(fn, is_column=False),
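The `is_column` flag in these mappings decides which axis of each weight is sharded across tensor-parallel ranks: column-parallel layers split the output dimension, row-parallel layers split the input dimension. A hypothetical sketch of that split rule (the real `fn` in PaddleNLP does more, e.g. handling bias and fusion):

```python
import numpy as np

def split_weight(w, is_column, rank, world_size):
    # For a [in_features, out_features] weight, column-parallel layers
    # shard the output dim (axis 1); row-parallel layers shard the
    # input dim (axis 0), whose partial outputs are later all-reduced.
    axis = 1 if is_column else 0
    return np.split(w, world_size, axis=axis)[rank]

w = np.arange(4 * 6, dtype=np.float32).reshape(4, 6)  # [in=4, out=6]
col_shard = split_weight(w, is_column=True, rank=0, world_size=2)
row_shard = split_weight(w, is_column=False, rank=0, world_size=2)
assert col_shard.shape == (4, 3)  # output dim halved
assert row_shard.shape == (2, 6)  # input dim halved
```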
Contributor:
c_proj->o_proj

Contributor:
By the way, why was c_proj renamed to o_proj? Could that affect loading of the model weights? Please double-check this.

Contributor Author:
It was just a naming habit; I'll change it back...

input_is_parallel=True,
)
else:
self.c_attn = nn.Linear(config.hidden_size, 3 * self.projection_size, bias_attr=True)
self.c_proj = nn.Linear(config.hidden_size, self.projection_size, bias_attr=not config.no_bias)
self.o_proj = nn.Linear(
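The `is_naive_3fuse=True` flag in the mappings above exists because `c_attn` fuses Q, K, and V into one weight: a tensor-parallel shard must take its slice from each of the three segments rather than one contiguous block. A hypothetical NumPy sketch of that split:

```python
import numpy as np

def split_fused_qkv(w, rank, world_size):
    """Naive 3-fuse split: the fused c_attn weight concatenates Q, K, V
    along the output dim, so each tp rank takes its slice from each of
    the three segments rather than one contiguous chunk."""
    q, k, v = np.split(w, 3, axis=-1)
    parts = [np.split(m, world_size, axis=-1)[rank] for m in (q, k, v)]
    return np.concatenate(parts, axis=-1)

hidden = 4
w = np.arange(hidden * 3 * hidden, dtype=np.float32).reshape(hidden, 3 * hidden)
shard = split_fused_qkv(w, rank=0, world_size=2)
# Each rank ends up with its Q, K, V slices re-fused side by side.
assert shard.shape == (hidden, 3 * hidden // 2)
```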
Contributor:
Have the changes here been tested?

@Difers Difers (Contributor Author) commented Jun 12, 2024

@DesmonDay Hi, could you take another look?

@@ -252,18 +284,26 @@ def forward(
encoder_hidden_states=None,
encoder_attention_mask=None,
output_attentions=False,
alibi=None,
Contributor:
Why was alibi added?

Contributor Author:
Fixed.

)
else:
attn_output, attn_weight = self._attn(query, key, value, attention_mask)
context_layer = self._merge_heads(attn_output, self.num_heads, self.head_dim)
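For reference, `_merge_heads` is the inverse of head splitting: it folds the per-head outputs back into a single hidden dimension. A minimal sketch of the shape transformation (illustrative, not the exact PaddleNLP code):

```python
import numpy as np

def merge_heads(x, num_heads, head_dim):
    """[batch, num_heads, seq, head_dim] -> [batch, seq, num_heads * head_dim].
    Transpose brings the head axis next to head_dim so the reshape
    concatenates heads without mixing tokens."""
    batch, _, seq, _ = x.shape
    return x.transpose(0, 2, 1, 3).reshape(batch, seq, num_heads * head_dim)

x = np.zeros((2, 4, 8, 16))          # batch=2, heads=4, seq=8, head_dim=16
out = merge_heads(x, num_heads=4, head_dim=16)
assert out.shape == (2, 8, 64)       # hidden = 4 * 16
```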
Contributor:
Why was this removed? Has it been tested?

Contributor Author:
This step was moved into _attn; the self-test records have been added to the Description.

@DesmonDay (Contributor):

According to the self-test records, the loss does not match between runs with sequence_parallel enabled and disabled; the precision needs to be aligned.
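One way to check such alignment is to extract the per-step losses from the two worker logs and compare them within a tolerance. A small sketch, assuming a hypothetical `loss: <float>` log format (the actual workerlog format may differ):

```python
import re

def extract_losses(log_text):
    """Pull 'loss: <float>' entries out of a worker log.
    The regex is an assumption about the log format."""
    return [float(m) for m in re.findall(r"loss: ([0-9.]+)", log_text)]

def losses_aligned(a, b, atol=1e-5):
    """True if both runs produced the same number of steps and every
    pair of losses agrees within the absolute tolerance."""
    return len(a) == len(b) and all(abs(x - y) <= atol for x, y in zip(a, b))

base_log = "step 1, loss: 2.3051\nstep 2, loss: 2.1040\n"
sp_log   = "step 1, loss: 2.3051\nstep 2, loss: 2.1040\n"
assert losses_aligned(extract_losses(base_log), extract_losses(sp_log))
```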

@DrownFish19 DrownFish19 changed the title Add sequence_parallel support for qwen [llm] Add sequence_parallel support for qwen Jun 20, 2024
@DrownFish19 DrownFish19 changed the title [llm] Add sequence_parallel support for qwen [LLM] Add sequence_parallel support for qwen Jun 20, 2024
@DrownFish19 (Collaborator):

LGTM

DrownFish19 previously approved these changes Jun 20, 2024
DesmonDay previously approved these changes Jun 20, 2024
@DesmonDay DesmonDay (Contributor) left a comment:

LGTM

Comment on lines 43 to 45
tensor_parallel_output=True,
sequence_parallel=False,
fuse_sequence_parallel_allreduce=False,
Collaborator:
Remove all of them; this is already built in.

Contributor Author:
done

@Difers Difers dismissed stale reviews from DesmonDay and DrownFish19 via 396bdc9 June 20, 2024 09:41
@ZHUI ZHUI (Collaborator) left a comment:

LGTM

@ZHUI ZHUI merged commit 65e721e into PaddlePaddle:develop Jun 21, 2024
8 of 11 checks passed