Inference is very slow #87

Open
mynamedaike opened this issue Apr 23, 2023 · 14 comments

Comments

@mynamedaike

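I deployed the FP16 model across multiple 32 GB V100 GPUs, and the 8-bit and 4-bit quantized models on a single 32 GB V100, and inference was very slow in every case. An ordinary question takes 100-120 seconds, or even longer, to get an answer.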

@duanyu

duanyu commented Apr 24, 2023

+1

@wellingtonyl

+1

@zacario-li

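I've noticed this too. I saw no noticeable speed difference between running on two 3090s and on six 3090s. Meanwhile, alpaca on a single 3090 is fast: it usually answers in under a second (0.X s). Writing code is a bit slower, but still takes only a few seconds. So I'm not sure where the problem lies.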

@liushengyi

+1

@lzlz99

lzlz99 commented Apr 24, 2023

Same with two A10s or four A10s: very slow...

@SkySlity

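Even with four Tesla V100s it is still extremely slow, and the output is short too: important information gets cut off halfway, and I don't know how to make it continue.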

@smallshen

+1, same situation

@kevindany

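The answers are all good, but it's slow. On two A100s, inference typically starts at around 10 s and reaches 100 s when it is slow. I tested common Chinese questions, SQL ability, and Python ability.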

@Deali-Axy

+1, same situation

@xxxxuee

xxxxuee commented May 6, 2023

+1

@dandanzou-hust

+1, waiting for optimization

@wanglaiqi

Generating a 1,000-character article took a little over a minute in my test.

@Deali-Axy

Any follow-up?

@zjmwqx

zjmwqx commented Sep 25, 2023

+1
