Xe rearchitecture #477
@rolandschulz I will address the merge conflicts once review is done, to avoid rebasing.
Hi @petercad, will this PR support ST_T?
No, because block 2D store messages don't support transposition. However, since these kinds of stores are occasionally useful, it might be a good idea to introduce an emulated transpose store operation using D32 scattered writes. If you have some specific use cases, let me know.
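The reply above does not spell out how such an emulation would look. As a rough host-side sketch only, assuming a row-major destination and 32-bit elements, the idea is that each element gets its own individually computed address instead of going through one block 2D message. The function name `transposed_store_d32` and all of its parameters are hypothetical and are not part of CuTe or this PR; on the device, each write in the inner loop would be a scattered D32 write issued by a work-item.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (illustration only, not a CuTe API).
// Writes a row-major `rows x cols` tile of 32-bit elements into `dst`
// transposed, so that tile(r, c) lands at dst(dst_row0 + c, dst_col0 + r).
// `dst_pitch` is the number of elements per destination row.
inline void transposed_store_d32(const std::vector<std::uint32_t>& tile,
                                 std::size_t rows, std::size_t cols,
                                 std::vector<std::uint32_t>& dst,
                                 std::size_t dst_pitch,
                                 std::size_t dst_row0, std::size_t dst_col0)
{
    assert(tile.size() == rows * cols);
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // One scattered 32-bit write per element: the address is
            // computed per element, which is what makes transposition possible.
            const std::size_t addr = (dst_row0 + c) * dst_pitch + (dst_col0 + r);
            assert(addr < dst.size());
            dst[addr] = tile[r * cols + c];
        }
    }
}
```

Calling it with a 2x3 tile and a destination pitch of 4 fills a 3x2 region of the destination. The trade-off of this approach is that per-element addressing is more flexible but generally less efficient than a single block message.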
Yes, there is a case in the SDPA backward pass where dV = Pt*dQ might be calculated as dVt = dQt*P. Computing dVt requires a transposed write. For now, I just compute dV = Pt*dQ directly for simplicity/delivery.
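The two formulations in this comment are related by the transpose-of-a-product rule. Using the comment's own symbols:

$$
dV^{T} = \left(P^{T}\, dQ\right)^{T} = dQ^{T} \left(P^{T}\right)^{T} = dQ^{T} P
$$

So computing dQt*P yields dV already transposed, which is why writing it back in the original layout needs a transposed store.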
Do you think the commit history is useful, or would we lose nothing by squashing?
@petercad - there are a couple of errors I am seeing while running xe_gemm.
These are all in sycl_cute_common.hpp, which is pushed now.
I personally like having the more specific commit messages when looking back through Git history (e.g. blaming), as long as the commits cover logically independent parts.
This PR introduces a new architecture for Xe CuTe atoms (CUTLASS-level changes to come later).
Current status:
- make_block_2d_copy
- make_block_2d_copy_{A,B,C}

Link to rendered documentation here.
Note
This branch requires a very recent IGC version: ci-comp_igc-30311 or later. This IGC build has important bug fixes and improvements to inline vISA that are needed to properly implement the new atoms.