Inference speed is very slow #87
Comments
+1
+1
I noticed this too. With two 3090s versus six 3090s, I saw no difference in speed. Meanwhile, running alpaca on a single 3090 is very fast: it usually answers in a fraction of a second, and code generation is only slightly slower, a few seconds at most. So I'm not sure where the problem is.
+1
Same with two A10s or four A10s: extremely slow.
Still painfully slow even on four Tesla V100s, and the output is short: important information gets cut off halfway through, and I don't know how to make it continue.
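On the truncated-output point: decoders stop once they hit their per-call output budget, so one common workaround is to feed the partial answer back in and ask the model to continue. A minimal sketch, assuming a placeholder `generate` callable (prompt in, text out, with a fixed output budget) and a hypothetical `stop_marker` end signal; neither is this project's actual API:

```python
def generate_long(generate, prompt, max_rounds=3, stop_marker="<END>"):
    """Chain several bounded generations until the model signals it is done.

    `generate` is any prompt -> text callable with a fixed output budget
    (a placeholder for the real model call); `stop_marker` stands in for
    whatever end-of-answer signal the real model emits.
    """
    answer = ""
    for _ in range(max_rounds):
        chunk = generate(prompt + answer)
        answer += chunk
        if stop_marker in chunk:  # model finished on its own
            break
    return answer
```

This trades one long (and truncated) call for several short ones; whether the continuation stays coherent depends on the model seeing its own partial answer in the prompt.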
+1, same situation here
Closed
The answers are good, but slow. On two A100s, inference typically starts at 10 s and can take up to 100 s. I tested common Chinese questions plus SQL and Python tasks.
+1, same situation here
+1 |
+1, waiting for an optimization
In my test, generating a 1000-character article took over a minute.
Any updates on this?
+1 |
On 32 GB V100 GPUs, I deployed the FP16 model across multiple cards, and the 8-bit and 4-bit quantized models on a single card; inference was very slow in every case. An ordinary question takes 100-120 seconds or more to answer.
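To make reports like the ones in this thread comparable, it helps to measure seconds per reply and rough tokens per second rather than wall-clock impressions. A minimal sketch of such a probe; `generate` is a placeholder for the real model call (e.g. a wrapper around a Transformers-style `model.generate`), not this project's API, and whitespace splitting is only a crude token proxy:

```python
import time

def measure_generation(generate, prompt, runs=3):
    """Time a text-generation callable over several runs.

    `generate` is any prompt -> text callable (placeholder for the real
    model call). Returns (average seconds per reply, tokens per second),
    using whitespace-split word count as a crude token proxy.
    """
    latencies, token_counts = [], []
    for _ in range(runs):
        start = time.perf_counter()
        output = generate(prompt)
        latencies.append(time.perf_counter() - start)
        token_counts.append(len(output.split()))
    avg_latency = sum(latencies) / runs
    tokens_per_sec = sum(token_counts) / sum(latencies)
    return avg_latency, tokens_per_sec
```

Run the same prompt on one card and on several, and compare tokens per second: if throughput is unchanged, the extra GPUs are only sharding weights (to fit memory), not parallelizing decoding, which would be consistent with the 2x-vs-6x 3090 observation above.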