Why do we use q, k, k_max = (64, 64, 32) / (64, 128, 128) / (64, 128, 2e16) for sm80? #847
ZhangDY-6483 started this conversation in Ideas
Replies: 1 comment 3 replies
-
Hi, |
-
In generate_kernel.py, the kernels for a given arch are generated with a fixed set of shapes, for example q, k, k_max = (64, 64, 32) / (64, 128, 128) / (64, 128, 2e16) for sm80. Why were these three shapes selected rather than others?
(Did we enumerate all the possible shapes, test the performance of each, and select the best combination?)
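To make the question concrete, here is a minimal sketch of the kind of sweep I have in mind. Everything in it is hypothetical: `candidate_shapes`, `pick_best`, `fake_cost`, and the candidate values are illustrative stand-ins, not code from generate_kernel.py, and the cost function is only a placeholder for a real on-GPU benchmark.

```python
import itertools


def candidate_shapes(q_tiles, k_tiles, k_maxes):
    """Enumerate every (q, k, k_max) combination from the given options."""
    return list(itertools.product(q_tiles, k_tiles, k_maxes))


def pick_best(shapes, cost_fn):
    """Return the shape with the lowest cost according to cost_fn."""
    return min(shapes, key=cost_fn)


def fake_cost(shape):
    # Placeholder only (assumption): a real sweep would compile the kernel
    # for this shape and time it on the target GPU instead.
    q, k, k_max = shape
    return abs(q - 64) + abs(k - 128) + abs(k_max - 128)


# Hypothetical candidate values, chosen only to keep the example small.
shapes = candidate_shapes([32, 64], [64, 128], [32, 128])
best = pick_best(shapes, fake_cost)
print(len(shapes), "candidates, best under the placeholder cost:", best)
```

If the three shapes came out of something like this, it would help to know which workloads and which metric were used to rank them.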