[QUESTION] Could you release the script and code for converting the HuggingFace version of the Aquila models to Megatron models? #32

Open
stainswei opened this issue Jan 23, 2024 · 1 comment

@stainswei

How can I convert the HuggingFace versions of the Aquila2-7B and Aquila2-34B models to Megatron models? The convert_hf_to_megatron.py file in scripts requires a ref_model_path, but none is provided in the project.

@gushu333
Collaborator

Hello, the relevant code has been updated; please see #38.

For the conversion script, please refer to the following commands:

export PYTHONPATH=/path/to/FlagScale/megatron

# 34B
python convert_hf_to_megatron.py -i {34b_hf_ckpt} -o {tgt_dir} -r ref_model/34b_ref_model --num-layers 60 --hidden-size 6144 --num-attention-heads 48 --data-type bf16 --group-query-attention --num-query-groups 8

# 7B
python convert_hf_to_megatron.py -i {7b_hf_ckpt} -o {tgt_dir} -r ref_model/7b_ref_model --num-layers 32 --hidden-size 4096 --num-attention-heads 32 --data-type bf16
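
The architecture flags in the commands above have to match the checkpoint being converted. As a rough sanity check, the sketch below derives them from a HuggingFace config instead of typing them by hand. It is not part of FlagScale: the function names are hypothetical, and it assumes the checkpoint's config.json uses the Llama-style keys num_hidden_layers, hidden_size, num_attention_heads and (optionally) num_key_value_heads, which may not hold for every Aquila release. The -i, -o, -r and --data-type arguments are not derived and still have to be supplied manually.

```python
from __future__ import annotations

import json
from pathlib import Path


def flags_from_hf_config(cfg: dict) -> list[str]:
    """Build the architecture flags used in the commands above.

    Assumption: `cfg` uses Llama-style key names; adjust if the
    checkpoint's config.json names these fields differently.
    """
    heads = cfg["num_attention_heads"]
    flags = [
        "--num-layers", str(cfg["num_hidden_layers"]),
        "--hidden-size", str(cfg["hidden_size"]),
        "--num-attention-heads", str(heads),
    ]
    # Emit the grouped-query-attention flags only when the number of
    # key/value heads differs from the number of attention heads.
    kv_heads = cfg.get("num_key_value_heads", heads)
    if kv_heads != heads:
        flags += ["--group-query-attention", "--num-query-groups", str(kv_heads)]
    return flags


def flags_from_checkpoint(hf_ckpt_dir: str) -> list[str]:
    """Read config.json from a HuggingFace checkpoint directory (hypothetical helper)."""
    cfg = json.loads((Path(hf_ckpt_dir) / "config.json").read_text())
    return flags_from_hf_config(cfg)


if __name__ == "__main__":
    # Illustrative values copied from the 34B command in this thread.
    example_cfg = {
        "num_hidden_layers": 60,
        "hidden_size": 6144,
        "num_attention_heads": 48,
        "num_key_value_heads": 8,
    }
    print(" ".join(flags_from_hf_config(example_cfg)))
```

If the printed flags differ from the ones in the commands above, it is worth checking the config before converting, since this sketch only reflects the assumed key names.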
