Incorrect Results with TF32 on Main #1937
Comments
There might be a problem with … if I use …
Same result when changing …
@3gx I'm also looking into similar issues. Is it possible to express the transposed block ptrs as …?
Same result when using block ptrs by making the following changes: … and …
… swizzling code. This fixes bug triton-lang#1937
I created #2180 to address the issue.
btw, I think the reason fp16 works is just luck: in this case the mma shape is m=n=k=8, so even if we swap n and k the result is the same.
… swizzling code. Also some additional fixes in the Pipeline pass. This fixes bug triton-lang#1937
Results look closer after the merge, but how close are we expecting compared to torch? Running the above script gives: …
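For context on what tolerance is reasonable here: TF32 keeps float32's 8-bit exponent but only 10 mantissa bits, so relative errors on the order of 1e-3 to 1e-4 against a float64 reference are expected, while a full fp32 matmul stays several orders of magnitude closer. A rough sketch emulating TF32 input rounding in NumPy (the truncation mask and matrix sizes are illustrative, not Triton's actual code path):

```python
import numpy as np

def to_tf32(x: np.ndarray) -> np.ndarray:
    """Emulate TF32 precision by zeroing the low 13 mantissa bits
    of float32 (23 - 13 = 10 bits kept). This truncates rather than
    rounds to nearest, but it is close enough for an error estimate."""
    u = x.astype(np.float32).view(np.uint32)
    return (u & np.uint32(0xFFFFE000)).view(np.float32)

rng = np.random.default_rng(0)
a = rng.standard_normal((256, 256)).astype(np.float32)
b = rng.standard_normal((256, 256)).astype(np.float32)

# float64 reference, fp32 result, and TF32-rounded-input result
ref = a.astype(np.float64) @ b.astype(np.float64)
fp32_out = (a @ b).astype(np.float64)
tf32_out = to_tf32(a).astype(np.float64) @ to_tf32(b).astype(np.float64)

rel_fp32 = np.abs(fp32_out - ref).max() / np.abs(ref).max()
rel_tf32 = np.abs(tf32_out - ref).max() / np.abs(ref).max()
print(f"fp32 rel error: {rel_fp32:.2e}, tf32 rel error: {rel_tf32:.2e}")
```

The TF32 error lands orders of magnitude above the fp32 one, which is why comparisons against torch should use a looser tolerance (e.g. `atol`/`rtol` around 1e-2/1e-3) when `allow_tf32` is on.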
… correct swizzling code (triton-lang#2180). Fixes bug triton-lang#1937. Co-authored-by: Philippe Tillet <phil@openai.com>
Running on Ampere A6000, Triton commit fd89aa1d2bca4652f383b70f81d993f258e4440f.
Taken from this issue: #1840
Output on my end:
Setting `allow_tf32=False` makes everything work.