llama_with_auto_pp #10648
Conversation
Thanks for your contribution!
Codecov Report

Attention: ❌ Your patch check has failed because the patch coverage (16.27%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

@@            Coverage Diff             @@
##           develop   #10648      +/-  ##
===========================================
- Coverage    46.98%   46.84%    -0.14%
===========================================
  Files          799      800        +1
  Lines       132255   132774      +519
===========================================
+ Hits         62135    62195       +60
- Misses       70120    70579      +459
|
# output_attentions and use_cache can be controlled by config and have default values; delete them
# inputs_embeds and past_key_values are not used in the manual PP network; delete them and use the defaults

# attn_mask_startend_row_indices is not used in the auto-parallel network; not considered
These comments can be removed.
local_chunk_id = stage_idx // pp_degree
if stage_idx == 0:  # special input handling for the first model_chunk
    new_model = _Pipeline_model_chunk(layer_lists[:chunk_size])
    def forward0(
Please standardize the naming; this could be renamed to forward_with_emb.
    new_model.forward = forward0.__get__(new_model)
else:
    new_model = _Pipeline_model_chunk(layer_lists[stage_idx * chunk_size : (stage_idx + 1) * chunk_size])
    def forward1(self, *args, **kwargs):
Please standardize the naming; this could be renamed to forward_with_decode.
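For readers unfamiliar with the `forward0.__get__(new_model)` pattern used in the diff: `__get__` turns a plain function into a method bound to one specific instance, so each chunk can get its own `forward` without subclassing. A minimal stand-alone illustration (the `Chunk` class and return values here are illustrative, not from the PR):

```python
# func.__get__(obj) produces a bound method: calling it supplies obj as self,
# so per-instance forward replacement works without defining subclasses.

class Chunk:
    pass

def forward_with_emb(self, x):
    # Would handle embedding inputs on the first stage.
    return ("emb", x)

def forward_with_decode(self, x):
    # Would consume the previous stage's hidden states.
    return ("decode", x)

first = Chunk()
later = Chunk()
first.forward = forward_with_emb.__get__(first)     # bound to `first`
later.forward = forward_with_decode.__get__(later)  # bound to `later`
```

Calling `first.forward(x)` now dispatches with `self` filled in automatically, which is exactly why the diff can rebind `forward` per model chunk.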
assert mode in ["VPP", "1F1B", "GPipe"]
stages = manual_model_split(model, group.rank, group, mode, pp_degree)
if mode == "VPP":
    schedule = ScheduleInterleaved1F1B(stages, n_microbatches=n_microbatches, loss_fn=loss_fn)
Later on, the schedule names will be aligned with the manual (dynamic-graph) implementation: ScheduleGPipe should be replaced with ScheduleFThenB, and ScheduleInterleaved1F1B renamed to ScheduleVPP.
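One way to make the rename easy later is to centralize the mode-to-schedule dispatch in a table instead of an if/elif chain. A minimal sketch, assuming the names proposed in the review (ScheduleFThenB, ScheduleVPP); the stub classes stand in for the real schedule implementations:

```python
# Mode -> schedule-class dispatch table. Renaming a schedule class then
# touches exactly one line. The classes below are placeholders, not the
# real PR implementations.

class ScheduleFThenB:    # proposed replacement name for ScheduleGPipe
    pass

class Schedule1F1B:
    pass

class ScheduleVPP:       # proposed replacement name for ScheduleInterleaved1F1B
    pass

SCHEDULE_BY_MODE = {
    "GPipe": ScheduleFThenB,
    "1F1B": Schedule1F1B,
    "VPP": ScheduleVPP,
}

def get_schedule_cls(mode):
    assert mode in SCHEDULE_BY_MODE, f"unknown pipeline mode: {mode}"
    return SCHEDULE_BY_MODE[mode]
```

The `assert mode in [...]` check in the diff then falls out of the table for free.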
return ret

class _Pipeline_model_chunk(nn.Layer):
Let's name this llama_chunk.
def __init__(self, layers):
    super(_Pipeline_model_chunk, self).__init__()
    self.layers = layers

def forward(self, *args, **kwargs):
Rewrite the forward-registration code below at this point, using if/else to take the different branches; that will be easier for users to understand.
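A minimal sketch of what the reviewer is asking for: instead of rebinding `forward0`/`forward1` onto each chunk with `__get__`, branch inside a single `forward()`. Everything here is illustrative (a plain class stands in for `nn.Layer`, and `is_first_stage`/`forward_with_emb`/`forward_with_decode` are assumed names, not the PR's):

```python
# One forward() entry point; an if/else replaces the per-instance method
# rebinding. A plain-Python class stands in for nn.Layer in this sketch.

class PipelineModelChunk:
    def __init__(self, layers, is_first_stage=False):
        self.layers = layers
        self.is_first_stage = is_first_stage

    def forward(self, *args, **kwargs):
        # The branch makes the two code paths explicit at the call site.
        if self.is_first_stage:
            return self.forward_with_emb(*args, **kwargs)
        return self.forward_with_decode(*args, **kwargs)

    def forward_with_emb(self, hidden):
        # First chunk: would run the embedding plus its share of layers.
        for layer in self.layers:
            hidden = layer(hidden)
        return hidden

    def forward_with_decode(self, hidden):
        # Later chunks: consume the previous stage's hidden states directly.
        for layer in self.layers:
            hidden = layer(hidden)
        return hidden
```

The `__get__` rebinding still works, but the if/else version keeps both paths visible in one class definition, which is the readability gain the review is after.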
Before submitting

tests folder. If there are codecov issues, please add test cases first.

PR types

PR changes

Description