Conversation
Force-pushed from d321871 to 86fab2b
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
Force-pushed from 86fab2b to 6d5e96b
Will need to add eagle_decoder_type support in megatron_eagle.py, as well as export support. Can leave to the next PR.
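For illustration, a minimal sketch of how eagle_decoder_type could be threaded through megatron_eagle.py. The option names ("llama", "mla") and the builder registry are assumptions, not the actual ModelOpt code:

from torch import nn

# Sketch only: stand-in builders; a real implementation would construct the
# corresponding Megatron decoder layers here.
_DECODER_BUILDERS = {
    "llama": lambda hidden: nn.TransformerDecoderLayer(
        d_model=hidden, nhead=8, batch_first=True
    ),
    "mla": lambda hidden: nn.TransformerDecoderLayer(  # stand-in for an MLA layer
        d_model=hidden, nhead=8, batch_first=True
    ),
}

def build_eagle_decoder_layer(hidden_size: int, eagle_decoder_type: str) -> nn.Module:
    # Dispatch on the new eagle_decoder_type instead of eagle_reuse_base_decoder.
    try:
        return _DECODER_BUILDERS[eagle_decoder_type](hidden_size)
    except KeyError:
        raise ValueError(f"unknown eagle_decoder_type: {eagle_decoder_type!r}") from None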
benchislett left a comment
Also, are we loading an MoE layer here? Are we overriding it with an MLP somehow?
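For context on the question, a minimal sketch of what "overriding MoE with an MLP" could look like when cloning a base layer into the draft model. The attribute name "mlp" and the helper are assumptions for illustration, not the code under review:

from torch import nn

class DenseMLP(nn.Module):
    # Plain feed-forward block used in place of the base model's MoE block.
    def __init__(self, hidden: int, intermediate: int):
        super().__init__()
        self.up = nn.Linear(hidden, intermediate)
        self.act = nn.SiLU()
        self.down = nn.Linear(intermediate, hidden)

    def forward(self, x):
        return self.down(self.act(self.up(x)))

def densify_draft_layer(layer: nn.Module, hidden: int, intermediate: int) -> nn.Module:
    # If the copied layer carries an MoE block (attribute name assumed),
    # replace it so the draft decoder stays dense and cheap.
    if hasattr(layer, "mlp"):
        layer.mlp = DenseMLP(hidden, intermediate)
    return layer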
Force-pushed from 86fb0f4 to d56532f
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #689      +/-   ##
==========================================
- Coverage   74.73%   74.65%   -0.09%
==========================================
  Files         192      192
  Lines       18870    18909      +39
==========================================
+ Hits        14103    14117      +14
- Misses       4767     4792      +25

☔ View full report in Codecov by Sentry.
Please test the e2e pipeline for parallel draft, from convert to train to export, before merging.
yeyu-nvidia left a comment
Please address the comments.
@@ -43,3 +44,4 @@ def modify(
    self.eagle_report_acc = eagle_report_acc
    self.eagle_reuse_base_decoder = eagle_reuse_base_decoder
This should be removed since you have eagle_decoder_type now
Doesn't Megatron still use this argument?
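If Megatron still consumes the flag, one option is a shim that keeps eagle_reuse_base_decoder accepted but maps it onto eagle_decoder_type. A sketch under assumed option names ("base", "llama"), not the actual config code:

import warnings

def resolve_decoder_type(eagle_decoder_type=None, eagle_reuse_base_decoder=None):
    # Bridge the legacy flag onto the new one so Megatron callers keep working.
    if eagle_reuse_base_decoder is not None:
        warnings.warn(
            "eagle_reuse_base_decoder is deprecated; use eagle_decoder_type instead",
            DeprecationWarning,
        )
        if eagle_decoder_type is None:
            eagle_decoder_type = "base" if eagle_reuse_base_decoder else "llama"
    return eagle_decoder_type or "llama"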
Training failed on a8. Command to reproduce:
bash launch_train.sh --save_steps 20 --data_path /workspace/scratch.yeyu_hw/Daring-Anteater/llama3.2_1B_fp8.jsonl --training_seq_len 512
tested
fixed
yeyu-nvidia left a comment
Megatron will need some API refactoring due to this PR. We will need to add MLA to Megatron as well.
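Not the decoder implemented in this PR, just context on the technique: a toy sketch of multi-head latent attention (MLA), where K/V are reconstructed from a low-rank latent so only the latent needs to be cached. All sizes and names below are illustrative assumptions:

import torch
from torch import nn
import torch.nn.functional as F

class MLASketch(nn.Module):
    # Toy MLA block: keys/values are rebuilt from a small latent projection.
    def __init__(self, hidden: int, n_heads: int, kv_latent: int):
        super().__init__()
        assert hidden % n_heads == 0
        self.n_heads, self.head_dim = n_heads, hidden // n_heads
        self.q_proj = nn.Linear(hidden, hidden)
        self.kv_down = nn.Linear(hidden, kv_latent)  # compress to latent
        self.k_up = nn.Linear(kv_latent, hidden)     # reconstruct keys
        self.v_up = nn.Linear(kv_latent, hidden)     # reconstruct values
        self.o_proj = nn.Linear(hidden, hidden)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        latent = self.kv_down(x)  # (b, t, kv_latent): the only per-token KV state

        def split(z):
            return z.view(b, t, self.n_heads, self.head_dim).transpose(1, 2)

        q = split(self.q_proj(x))
        k, v = split(self.k_up(latent)), split(self.v_up(latent))
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))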
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
Signed-off-by: yeyu-nvidia <yeyu@nvidia.com>
Force-pushed from 868ce71 to 11187be
import torch.distributed
from huggingface_hub import snapshot_download
from torch import nn
from transformers.cache_utils import DynamicCache
These utils require transformers, so they would be better moved to /plugins. No need to change now, just a remark.
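One common pattern for that kind of move, sketched here under an assumed file location (the repo's actual plugin layout may differ), is to guard the transformers import at plugin load time:

# plugins/transformers_utils.py (hypothetical location)
try:
    from transformers.cache_utils import DynamicCache
    HAS_TRANSFORMERS = True
except ImportError:
    HAS_TRANSFORMERS = False

def make_cache():
    # Fail loudly only when the optional dependency is actually needed.
    if not HAS_TRANSFORMERS:
        raise ImportError("this utility requires the optional 'transformers' package")
    return DynamicCache()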
ChenhanYu left a comment
I left a short comment regarding the dependency on transformers. This is great work.
I think we should only refactor something if it's needed for a new feature; we should not refactor it just because of this PR. I would appreciate knowing if there is a better way to implement this feature. Thanks
Isn't MLA a need for Megatron? This PR disables eagle_reuse_base_decoder for HF and introduces an MLA decoder. Don't we need to refactor to enable eagle_decoder_type in Megatron as well and deprecate eagle_reuse_base_decoder? What is the definition of "need for new feature" if we claim to support something but only halfway support it?
What does this PR do?
Type of change: New Feature
Overview:
Usage
# Add a code snippet demonstrating how to use this (a hedged sketch follows at the end of this description)

Testing
Before your PR is "Ready for review"
Additional Information
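To fill the Usage placeholder above: a guess at how the new option might be exercised end to end. The module path, the convert signature, and the "mla" value are assumptions drawn from this thread, not verified ModelOpt API:

# Hypothetical usage; API names are assumptions, not verified ModelOpt calls.
import modelopt.torch.speculative as mtsp
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
eagle_config = {
    "eagle_decoder_type": "mla",  # new option from this PR (value name assumed)
}
model = mtsp.convert(base, [("eagle", eagle_config)])  # signature assumed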