PyTorch GPT model runs out of GPU memory (OOM); cannot run on CUDA 11 #17

bigprince97 · 2020-12-10T06:51:48Z

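Thank you for the open-source work. After converting my own model, I was able to run inference with LightSeq successfully, and the speedup is indeed significant. However, the vocab size can only be very small; once it gets too large, GPU memory runs out. The example model's vocab size is 5004. Is there a solution for this? Also, LightSeq does not work on the 3090. Are there plans to support it?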

Taka152 · 2020-12-10T10:56:22Z

@bigprince97 Thank you for using LightSeq and successfully applying it to your own model. To answer your two questions:

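1. GPU memory runs out when the vocab is set too large.
Besides vocab size, the max_batch_size and max_step settings also affect GPU memory usage. I suggest lowering these two parameters to free up room for a larger vocab size.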

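2. Using LightSeq on the 3090.
Currently there is no restriction on the specific GPU model; the only requirement is cuda>=10.1, so in theory it should work on the 3090. Could you share more information so we can look into the cause, including the error message, CUDA version, and so on?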

bigprince97 · 2020-12-11T02:15:54Z

The CUDA version is 11.0.
When creating the container, the TensorRT server reports that the 3090 is not supported by this container.

Running the example code fails with the following error:

bigprince97 · 2020-12-11T02:20:59Z

By max_step, I assume you mean the maximum length of the position embeddings? Changing it to 100 does let me increase the vocab size to 50257. Thanks!
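
The trade-off discussed above can be illustrated with a back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that the dominant buffer is a float32 logits tensor of shape [max_batch_size, max_step, vocab_size]. This is not taken from LightSeq's source, and the function names, the max_batch_size value, and the larger max_step value are all hypothetical.

```python
def logits_buffer_bytes(max_batch_size, max_step, vocab_size, bytes_per_elem=4):
    """Estimate the size of a [max_batch_size, max_step, vocab_size] buffer.

    Assumption: one such float32 buffer dominates GPU memory use; the real
    allocations inside LightSeq may differ.
    """
    return max_batch_size * max_step * vocab_size * bytes_per_elem


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


if __name__ == "__main__":
    vocab_size = 50257      # vocab size mentioned in the thread
    max_batch_size = 128    # hypothetical value
    # 1024 is a hypothetical starting point; 100 is the value from the thread.
    for max_step in (1024, 100):
        size = logits_buffer_bytes(max_batch_size, max_step, vocab_size)
        print(f"max_step={max_step}: ~{to_gib(size):.2f} GiB")
```

Under this assumption the buffer grows linearly in each of the three parameters, which is why lowering max_step or max_batch_size frees room for a larger vocab.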

bigprince97 · 2020-12-11T02:38:16Z

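In addition, I converted the parameters of the PyTorch GPT-2 into the corresponding model file following the proto. LightSeq runs inference normally, but its predictions differ significantly from those of the PyTorch GPT-2. The PyTorch version follows the architecture in the GPT-2 paper, which may differ slightly from the GPT model structure in LightSeq. Could you provide a PyTorch or TF implementation of the exact model structure?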

bigprince97 · 2020-12-11T05:45:19Z

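The inconsistent-results problem is solved. It was not a model-structure mismatch; it was a transposition issue with the PyTorch parameter matrices.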

Taka152 · 2020-12-11T07:33:52Z

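Glad to see your problem is solved. The remaining issue is running on the 3090. The main cause is that the Nvidia Triton inference server image that the build depends on does not support the 3090.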

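A few days ago we updated the CMake build method, which removes the dependency on the Triton inference server image. You are welcome to try building with the method described in doc/build.md; it should resolve running on the 3090, or more generally under CUDA 11.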

bigprince97 · 2020-12-15T01:42:52Z

The build failed. I am using a CUDA 11 environment.

Taka152 · 2020-12-15T02:29:49Z

The project has submodules. Try git submodule update --init.

Majokiki · 2021-04-21T09:05:13Z

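> The inconsistent-results problem is solved. It was not a model-structure mismatch; it was a transposition issue with the PyTorch parameter matrices.

@bigprince97 Hello, I have also run into LightSeq predictions not matching the PyTorch GPT-2 (https://github.com/yangjianxin1/GPT2-chitchat). After puzzling over it for a long time, I was lucky to find your explanation. Does the PyTorch parameter matrix transposition issue you mentioned occur in torch.load()? Could you share more details? Thanks.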

Taka152 · 2021-04-21T09:13:12Z

@Majokiki Many PyTorch weights are stored as [out_dim, in_dim], whereas LightSeq needs them stored as [in_dim, out_dim].
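
The layout difference described above can be sketched as follows. This is a minimal, framework-free illustration using nested lists. The helper name transpose_and_flatten is hypothetical, and flattening to a row-major list is an assumption about what an export script would need, not LightSeq's documented conversion API. Which weights need transposing depends on the layers in your checkpoint, so check each tensor's shape before converting.

```python
def transpose_and_flatten(weight):
    """Convert an [out_dim, in_dim] matrix to a flat row-major [in_dim, out_dim] list.

    `weight` is a list of rows. For a torch.Tensor, the equivalent would be
    weight.t().contiguous().flatten().tolist().
    """
    # zip(*weight) yields the columns of `weight`, i.e. the rows of its transpose.
    transposed = [list(column) for column in zip(*weight)]
    return [value for row in transposed for value in row]


if __name__ == "__main__":
    # A toy weight with out_dim=2, in_dim=3, laid out as [out_dim, in_dim].
    w = [
        [1, 2, 3],
        [4, 5, 6],
    ]
    print(transpose_and_flatten(w))  # [1, 4, 2, 5, 3, 6]
```

Flattening without the transpose would give [1, 2, 3, 4, 5, 6]. That list has the same length, so a model built from it can load and run without error while producing different predictions, which is consistent with the symptom reported in this thread.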

YINGPENGZH · 2022-09-06T02:46:31Z

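Hello, I have also run into the GPT-2 mismatch between PyTorch and LightSeq. How exactly did you solve it? Does something need to be done before converting the PyTorch model?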

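Taka152 changed the title ~~Support for the 3090, and excessive GPU memory usage~~ PyTorch GPT model runs out of GPU memory (OOM) when the vocab setting is enlarged; cannot run on CUDA 11 Dec 11, 2020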

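Taka152 changed the title ~~PyTorch GPT model runs out of GPU memory (OOM) when the vocab setting is enlarged; cannot run on CUDA 11~~ PyTorch GPT model runs out of GPU memory (OOM); cannot run on CUDA 11 Dec 11, 2020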

Taka152 closed this as completed Dec 18, 2020
