[BUG] Matmul gives wrong output for large sizes #1051

Closed
awni opened this issue Apr 29, 2024 · 5 comments · Fixed by #1058
awni commented Apr 29, 2024

Decreasing 131072 to 131071 produces the right output, but at 131072 and above the outputs don't match as they should.

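import mlx.core as mx

# Weight of shape (32, 128); the full output has 131072 * 128 * 128 = 2**31 elements
w = mx.random.uniform(shape=(32, 32 * 4))
x = mx.random.uniform(shape=(131072, 128, 32))

y1 = x[:10] @ w  # matmul on the first 10 batches only
y2 = x @ w       # matmul on the full batch

# Maximum absolute difference; expected to be 0 if the two agree
print((y1 - y2[:10]).abs().max())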

awni commented Apr 29, 2024

@jagrit06 it seems we are overflowing an integer index into the output, since it starts to break in the 2B range (the output here has 131072 * 128 * 128 = 2^31 elements). INT_MAX is on the small side for the largest output we can support, though.

Anything we can do to support larger sizes?

If not, we should put some throws in the ops as these are sneaky to debug.
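
The element count behind the 2B observation can be checked with plain arithmetic. The following is a minimal sketch, assuming the index in question is a signed 32-bit int; `output_elements` and `check_fits_int32` are hypothetical helpers written for illustration, not MLX functions, and the guard only mirrors the idea of throwing from an op rather than any actual MLX check:

```python
import math

INT32_MAX = 2**31 - 1  # INT_MAX for a 32-bit signed int


def output_elements(x_shape, w_shape):
    """Number of elements in x @ w for a batched x and a 2D w."""
    *batch, m, k = x_shape
    k2, n = w_shape
    if k != k2:
        raise ValueError("inner dimensions do not match")
    return math.prod(batch) * m * n


def check_fits_int32(x_shape, w_shape):
    """Hypothetical guard: raise instead of silently overflowing an index."""
    size = output_elements(x_shape, w_shape)
    if size > INT32_MAX:
        raise ValueError(
            f"output has {size} elements, more than INT32_MAX ({INT32_MAX})"
        )
    return size


print(output_elements((131071, 128, 32), (32, 128)))  # 2147467264, fits in int32
print(output_elements((131072, 128, 32), (32, 128)))  # 2147483648, one past INT32_MAX
```

With the shapes from the report, a batch of 131071 stays just under the limit, while 131072 lands exactly one element past it.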

awni added the bug label Apr 29, 2024
jagrit06 self-assigned this Apr 30, 2024
jagrit06 commented

This particular case is simple: when we compute auto batch_size_out = out.size() / (M * N);, the product of the int values M and N overflows, and batch_size_out comes out to 0.
The simple fix here is to do that computation in size_t, and I can make a couple of other changes to make sure we can handle the large shapes.
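
The overflow described above can be imitated in Python, which has no fixed-width integers, by wrapping values by hand. This is only a sketch under stated assumptions: the values of M and N are guesses (batch and row dimensions collapsed into M, which is one way the product reaches 2^31), signed overflow is undefined behavior in C++ and merely tends to wrap like this in practice, and `wrap_int32` and `to_size_t` are hypothetical helpers:

```python
def wrap_int32(v):
    """Wrap an integer to signed 32 bits, as two's-complement overflow typically does."""
    v &= 0xFFFFFFFF
    return v - 2**32 if v >= 2**31 else v


def to_size_t(v):
    """Convert a signed value to an unsigned 64-bit size_t."""
    return v & 0xFFFFFFFFFFFFFFFF


# Assumed values: the batch and row dimensions are collapsed into M.
M = 131072 * 128
N = 128
out_size = 131072 * 128 * 128  # out.size() for the shapes in the report

mn_int = wrap_int32(M * N)                  # int product wraps to -2147483648
batch_int = out_size // to_size_t(mn_int)   # divisor becomes huge, result is 0
batch_size_t = out_size // (M * N)          # product computed in a wide type, result is 1

print(mn_int, batch_int, batch_size_t)  # -2147483648 0 1
```

The wrapped product turns into a very large unsigned divisor, which is why the batch size comes out as 0, while the wide computation gives the expected 1.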

The only thing I'm wondering about is the case batch_size_out >= UINT32_MAX, where we would need to launch multiple matmul kernels since the grid dims can only be uint.
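
Splitting an oversized batch across several launches could look roughly like the sketch below. It is not how MLX handles this; `batch_chunks` and its `max_per_launch` parameter are hypothetical names, and the default limit is an assumption based on the grid dimensions being unsigned 32-bit:

```python
UINT32_MAX = 2**32 - 1


def batch_chunks(batch_size, max_per_launch=UINT32_MAX):
    """Yield (start, count) pairs covering range(batch_size), each count <= max_per_launch."""
    if max_per_launch <= 0:
        raise ValueError("max_per_launch must be positive")
    start = 0
    while start < batch_size:
        count = min(max_per_launch, batch_size - start)
        yield start, count
        start += count


# Small numbers to show the pattern: one kernel launch per (start, count) pair.
print(list(batch_chunks(10, max_per_launch=4)))  # [(0, 4), (4, 4), (8, 2)]
```

Each pair would correspond to one kernel launch over a slice of the batch, so no single grid dimension exceeds the limit.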

jagrit06 commented

I tacked on a quick fix with #1058

jagrit06 linked a pull request Apr 30, 2024 that will close this issue

awni commented Apr 30, 2024

> The only thing I'm wondering about is the case batch_size_out >= UINT32_MAX, where we would need to launch multiple matmul kernels since the grid dims can only be uint.

That seems like a much more rare case.

thegodone commented

Thanks guys, really cool work and fast fix!
