#define LDST64BITS(value) (reinterpret_cast<float2 *>(&(value))[0])
...
// s_a is 64*16; each thread loads 4 half values, so each row needs 4 threads; 64 rows, 256 threads in total
const int load_smem_a_m = tid / 4; // 0~63
const int load_smem_a_k = (tid % 4) * 4; // 0,4,8,12
...
LDST64BITS(s_a[load_smem_a_m][load_smem_a_k]) = LDST64BITS(A[load_gmem_a_addr]);
Each thread reads 4 elements for s_a, so why is LDST64BITS used for the copy? Doesn't LDST64BITS move only 2 numbers?
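For reference, here is a rough host-side sketch of the size arithmetic behind this question. It is not the kernel code: `half_bits`, `Float2`, `copy4_halves_as_64bits` and `halves_per_64bit_access` are hypothetical stand-ins, and it assumes `half` is 16 bits wide and `float2` is two 32-bit floats. Under those assumptions one `float2`-sized access covers 64 bits, which is the same number of bytes as 4 `half` values, so the cast would be counting bytes rather than elements.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-side stand-ins (assumptions, not the CUDA types themselves):
// half is taken to be 16 bits, float2 to be two 32-bit floats.
using half_bits = std::uint16_t;
struct Float2 { float x, y; };

static_assert(sizeof(Float2) == 8, "float2 stand-in should be 64 bits");
static_assert(sizeof(Float2) == 4 * sizeof(half_bits),
              "64 bits should hold exactly 4 half values");

// Number of half elements covered by one 64-bit access.
constexpr int halves_per_64bit_access() {
    return static_cast<int>(sizeof(Float2) / sizeof(half_bits));
}

// Copies 4 consecutive half values as one 64-bit chunk, mimicking the
// effect of a single LDST64BITS load/store pair. memcpy is used on the
// host instead of reinterpret_cast to stay clear of aliasing issues.
inline void copy4_halves_as_64bits(half_bits* dst, const half_bits* src) {
    Float2 tmp;
    std::memcpy(&tmp, src, sizeof(Float2));  // one 64-bit "load"
    std::memcpy(dst, &tmp, sizeof(Float2));  // one 64-bit "store"
}
```

If this reading is right, the `float2` in the macro only fixes the width of the transfer (8 bytes); the values are never interpreted as floats.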