Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In SetEncoding, only set one of matmul_narrow_{M,N} #17641

Closed
bjacob opened this issue Jun 11, 2024 · 0 comments
Closed

In SetEncoding, only set one of matmul_narrow_{M,N} #17641

bjacob opened this issue Jun 11, 2024 · 0 comments
Assignees

Comments

@bjacob
Copy link
Contributor

bjacob commented Jun 11, 2024

@lialan reports:

Question: I hit cases where matmul N dimension is too narrow:

[cpu-materialize-encoding]: tile (16, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (8, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (4, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (2, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (1, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)

In such cases, do we assume the input is illegal? because we cannot find legal tiles

In SetEncoding for matmuls, when the M or N dimension is small, we may set the matmul_narrow_{M,N} attribute on the EncodingAttr, which has the effect of limiting padding in that dimension. Then, later, in MaterializeEncoding, we may transpose the matmul to reduce the matmul_narrow_ case to transpose_narrow_M, and we select an appropriate tile size for the small M value. Then that allows picking up a corresponding narrow-M code path in the mmt4d ukernel. That tile size is only narrow in the M dimension, it is generic large in the N dimension.

But what happens if both matmul_narrow_{M,N} are set? Then, we have limited padding in both dimensions. The situation is still the same after transposing, and so we now can't use a narrow-M tile, because that tile is for generic large-N and now we also have narrow-N.

By the time we reach MaterializeEncoding, it's too late to fix that up, since we have already limited padding (and buffer allocations) in both M and N dimensions. So this must be dealt with at SetEncoding time.

=> Solution: in SetEncoding, we should not set both matmul_narrow_{M,N} encoding attributes. If both M and N are small static values for which we would normally set the attribute, we should compare them and only set matmul_narrow_{M,N} for the smaller of the two. For example, if we would normally set matmul_narrow_M=4 and matmul_narrow_N=1 then in fact we should only set matmul_narrow_N=1 .

If there is a tie, that is if we would set both matmul_narrow_{M,N} to the same value, then we should only set matmul_narrow_M. Avoiding the transposition business in that case.

bjacob pushed a commit that referenced this issue Jun 18, 2024
This is respond to #17641

Signed-off-by: Alan Li <me@alanli.org>
@lialan lialan closed this as completed Jun 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants