简介

MindSpeed 是针对华为昇腾设备的大模型加速库。

大模型训练是一种非常复杂的过程，涉及到许多技术和挑战，其中大模型训练需要大量的显存资源是一个难题，对计算卡提出了不小的挑战。为了在单个计算卡显存资源不足时，可以通过多张计算卡进行计算，业界出现了类似 Megatron、DeepSpeed 等第三方大模型加速库，对模型、输入数据等进行切分并分配到不同的计算卡上，最后再通过集合通信对结果进行汇总。

昇腾提供 MindSpeed 加速库，使能客户大模型业务快速迁移至昇腾设备，并且支持昇腾专有算法，确保开箱可用。

安装

1. 安装依赖

在安装MindSpeed之前，请参考版本配套表，安装最新昇腾软件栈：https://www.hiascend.com/zh/。

依赖软件
Driver
Firmware
CANN
Kernel
PyTorch
torch_npu
apex

2. 安装 MindSpeed

下载源码安装：

git clone https://gitee.com/ascend/MindSpeed.git
pip install -e MindSpeed

如需使用ATB算子，请在安装前添加环境变量ENABLE_ATB=1，例如：

git clone https://gitee.com/ascend/MindSpeed.git
ENABLE_ATB=1 pip install -e MindSpeed

3. 获取 Megatron-LM 并指定 commit id

# 目前版本基于core_r0.7.0的release版本
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM
git checkout core_r0.7.0

快速上手

以 GPT 模型为例：

在 Megatron-LM 目录下修改pretrain_gpt.py文件，在import torch下新增一行import mindspeed.megatron_adaptor

 import os
 import torch
+import mindspeed.megatron_adaptor
 from torch import Tensor
 from functools import partial
 from typing import Union

在 Megatron-LM 目录下修改pretrain_gpt.py文件，在model_provider函数中删除assert(args.context_parallel_size == 1), "Context parallelism is only supported with Megatron Core!"。

else:
-   assert(args.context_parallel_size == 1), "Context parallelism is only supported with Megatron Core!"

    model = megatron.model.GPTModel(
        config,
        num_tokentypes=0,
        parallel_output=True,
        pre_process=pre_process,
        post_process=post_process
    )

在 Megatron-LM 目录下，准备好训练数据，并在示例脚本中填写对应路径，然后执行。
```
bash examples/pretrain_gpt_distributed.sh
```

特性介绍

特性	介绍
Megatron 数据并行	link
Megatron 张量并行	link
Megatron 流水并行	link
Megatron 虚拟流水并行	link
Megatron 序列并行	link
Megatron 重计算	link
Megatron 分布式优化器	link
Megatron 异步DDP	link
Megatron 权重更新通信隐藏	link
Megatron Mcore MoE	link
Ascend DeepSpeed MoE	link
Ascend Mask归一	link
Ascend 内存碎片优化	link
Ascend 自适应选择重计算	link
Ascend 激活函数重计算	link
Ascend 计算通信并行优化	link
Ascend BF16 参数副本复用	link
Ascend rms_norm 融合算子	link
Ascend swiglu 融合算子	link
Ascend rotary_embedding 融合算子	link
Ascend flash attention 适配	link
Ascend nano-pipe流水线并行	link
Ascend MLP 通信隐藏	link
Ascend 重计算流水线独立调度	link
Ulysses 长序列并行	link
Ring Attention 长序列并行	link
【Prototype】混合长序列并行	link
【Prototype】Ascend MC2	link
【Prototype】alibi	link
【Prototype】PP自动并行	link
【Prototype】其他昇腾亲和优化	暂无

自定义算子

算子	介绍
npu_dropout_add_layer_norm	link
npu_rotary_position_embedding	link
【Prototype】ffn	link
【Prototype】fusion_attention	link
【Prototype】rms_norm	link
【Prototype】swiglu	link
【Prototype】lcal_coc	link
【Prototype】npu_mm_all_reduce_add_rms_norm	link
【Prototype】npu_mm_all_reduce_add_rms_norm_	link
【Prototype】npu_grouped_mat_mul	link
【Prototype】npu_grouped_mat_mul_all_reduce	link

MindSpeed中采集Profile数据

MindSpeed支持命令式开启Profile采集数据，命令配置介绍如下：

配置命令	默认值	命令含义
--profile	False	profile数据采集开关
--profile-step-start	10	开始采集步骤
--profile-step-end	12	结束采集步骤
--profile-level	level0 (可选：level0, level1, level2)	数据信息采集等级
--profile-with-cpu	False	是否采集cpu信息
--profile-with-stack	False	是否采集stack信息
--profile-with-memory	False	是否采集memory信息
--profile-record-shapes	False	是否采集shapes信息
--profile-save-path	./profile_dir	采集信息保存路径
--profile-ranks	0 (可选配置举例：0 1 2 3 4 5 6 7)	被采集的ranks

版本配套表

PyTorch Extension版本号采用{PyTorch版本}-{昇腾版本}命名规则，前者为PyTorch Extension匹配的PyTorch版本，后者用于匹配CANN版本，详细匹配如下：

MindSpeed版本	PyTorch版本	Extension版本	Python版本
1.0	2.1.0	2.1.0.post3	Python3.8.x, Python3.9.x, Python3.10.x

昇腾辅助软件中有更多关于PyTorch和CANN的版本信息。

分支维护策略

MindSpeed版本分支的维护阶段如下：

状态	时间	说明
计划	1—3 个月	计划特性
开发	3 个月	开发特性
维护	6-12 个月	合入所有已解决的问题并发布版本，针对不同的MindSpeed版本采取不同的维护策略，常规版本和长期支持版本维护周期分别为6个月和12个月
无维护	0—3 个月	合入所有已解决的问题，无专职维护人员，无版本发布
生命周期终止（EOL）	N/A	分支不再接受任何修改

MindSpeed版本维护策略

MindSpeed版本	维护策略	当前状态	发布时间	后续状态	EOL日期
1.0	常规版本	维护	2024/03/30

安全声明

MindSpeed 安全声明

常见问题

现象	介绍
Data helpers 数据预处理出错	link

Name		Name	Last commit message	Last commit date
Latest commit History 363 Commits
ci		ci
docs		docs
mindspeed		mindspeed
sources/images		sources/images
tests_extend		tests_extend
.gitignore		.gitignore
LICENSE		LICENSE
OWNERS		OWNERS
README.md		README.md
SECURITYNOTE.md		SECURITYNOTE.md
Third_Party_Open_Source_Software_Notice		Third_Party_Open_Source_Software_Notice
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

简介

安装

1. 安装依赖

2. 安装 MindSpeed

3. 获取 Megatron-LM 并指定 commit id

快速上手

特性介绍

自定义算子

MindSpeed中采集Profile数据

版本配套表

分支维护策略

MindSpeed版本维护策略

安全声明

常见问题

About

Releases

Packages

Contributors 14

Languages

License

wangwz6666/MindSpeed

Folders and files

Latest commit

History

Repository files navigation

简介

安装

1. 安装依赖

2. 安装 MindSpeed

3. 获取 Megatron-LM 并指定 commit id

快速上手

特性介绍

自定义算子

MindSpeed中采集Profile数据

版本配套表

分支维护策略

MindSpeed版本维护策略

安全声明

常见问题

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 14

Languages

Packages