Matrix multiplication performance data #10

Open
yyfcc17 opened this issue May 29, 2024 · 2 comments
Comments

@yyfcc17

yyfcc17 commented May 29, 2024

Hello, do you have concrete performance data comparing W4A16 and FP16 matrix multiplication?

@gavinchen430
Collaborator

On an A30, with m=1, n=16384, k=4096, FP16 achieves a bandwidth of roughly 750 GB/s, and W4A16 roughly 600 GB/s.
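
For a rough sense of what those bandwidth figures could mean in wall-clock terms, here is a back-of-the-envelope sketch, not a measurement. It assumes the quoted numbers are the achieved bandwidth for reading the weight matrix, that 1 GB = 1e9 bytes, and that activation, output, and quantization scale/zero-point traffic are negligible at m=1. The helper names `weight_bytes` and `gemv_time_ms` are hypothetical and are not part of this project.

```python
def weight_bytes(n: int, k: int, bits_per_weight: float) -> float:
    """Bytes needed to store an n x k weight matrix at the given bit width."""
    return n * k * bits_per_weight / 8


def gemv_time_ms(n: int, k: int, bits_per_weight: float, bandwidth_gb_s: float) -> float:
    """Estimated kernel time in ms, assuming the kernel is purely bound by
    reading the weights once at the given achieved bandwidth (assumption)."""
    seconds = weight_bytes(n, k, bits_per_weight) / (bandwidth_gb_s * 1e9)
    return seconds * 1e3


# Shape and bandwidth figures quoted in the reply above.
n, k = 16384, 4096
fp16_ms = gemv_time_ms(n, k, bits_per_weight=16, bandwidth_gb_s=750)
w4a16_ms = gemv_time_ms(n, k, bits_per_weight=4, bandwidth_gb_s=600)
speedup = fp16_ms / w4a16_ms

print(f"FP16  estimate: {fp16_ms:.3f} ms")
print(f"W4A16 estimate: {w4a16_ms:.3f} ms")
print(f"Estimated speedup: {speedup:.2f}x")
```

Under these assumptions, W4A16 moves a quarter of the weight bytes at 80% of the FP16 bandwidth, so the estimate is about 0.179 ms versus 0.056 ms, roughly a 3.2x speedup. Real kernels also read scales and zero points and pay dequantization overhead, so measured numbers may be lower.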

@yyfcc17
Author

yyfcc17 commented Jun 3, 2024

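Thanks for the reply. Could you share how much overall speedup, in terms of total execution time, W4A16 and W2A16 matrix multiplication achieve over FP16 matrix multiplication?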
