In `SetEncoding`, only set one of `matmul_narrow_{M,N}` #17641

bjacob · 2024-06-11T14:35:16Z

Question: I hit cases where the matmul N dimension is too narrow:

[cpu-materialize-encoding]: tile (16, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (8, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (4, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (2, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (1, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)

In such cases, do we assume the input is illegal, because we cannot find legal tiles?

In SetEncoding for matmuls, when the M or N dimension is small, we may set the matmul_narrow_{M,N} attribute on the EncodingAttr, which has the effect of limiting padding in that dimension. Later, in MaterializeEncoding, we may transpose the matmul to reduce the matmul_narrow_N case to matmul_narrow_M, and we select an appropriate tile size for the small M value. That in turn allows picking up a corresponding narrow-M code path in the mmt4d ukernel. That tile size is narrow only in the M dimension; in the N dimension it is a generic large size.
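To illustrate why transposing turns a narrow-N matmul into a narrow-M one, here is a minimal sketch in plain Python. It relies only on the identity (A·B)ᵀ = Bᵀ·Aᵀ. The helper names `matmul`, `transpose` and `shape` are hypothetical and are not IREE APIs.

```python
def matmul(a, b):
    """Naive matrix product of two row-major lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Transpose a row-major list of lists."""
    return [list(r) for r in zip(*m)]

def shape(m):
    return (len(m), len(m[0]))

# A is M x K = 4 x 2 and B is K x N = 2 x 1, so N is the narrow dimension.
a = [[1, 2], [3, 4], [5, 6], [7, 8]]
b = [[1], [1]]

c = matmul(a, b)                          # result is M x N = 4 x 1
ct = matmul(transpose(b), transpose(a))   # transposed problem: N x M = 1 x 4

# In the transposed problem, the narrow dimension is now the row ("M") dimension,
# and transposing its result recovers the original product.
assert shape(c) == (4, 1)
assert shape(ct) == (1, 4)
assert transpose(ct) == c
```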

But what happens if both matmul_narrow_{M,N} are set? Then, we have limited padding in both dimensions. The situation is still the same after transposing, and so we now can't use a narrow-M tile, because that tile is for generic large-N and now we also have narrow-N.
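The log lines quoted above can be reproduced with a small sketch. It assumes, purely for illustration, that a tile (M0, N0, K0) is rejected whenever any of its sizes exceeds the corresponding entry of the upper bound; the actual check in cpu-materialize-encoding may differ. The function `tile_is_valid` and the value 1024 used for a "large N" bound are hypothetical.

```python
def tile_is_valid(tile, upper_bound):
    """Assumed rule: every tile size must fit within the bound for its dimension."""
    return all(t <= ub for t, ub in zip(tile, upper_bound))

# Narrow-M candidate tiles from the log: narrow in M only, generic (16) in N.
narrow_m_tiles = [(16, 16, 2), (8, 16, 2), (4, 16, 2), (2, 16, 2), (1, 16, 2)]

# Both M and N were marked narrow, so padding is limited in both dimensions.
both_narrow = (1, 1, 16)
valid_both = [t for t in narrow_m_tiles if tile_is_valid(t, both_narrow)]

# Only M marked narrow: N keeps a large bound (1024 is an arbitrary stand-in).
only_m_narrow = (1, 1024, 16)
valid_only_m = [t for t in narrow_m_tiles if tile_is_valid(t, only_m_narrow)]

print(valid_both)    # no candidate survives
print(valid_only_m)  # the narrow-M tile (1, 16, 2) survives
```

Under this assumed rule, every candidate is rejected in the doubly-narrow case because of its N0 = 16, which matches the symptom in the log.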

By the time we reach MaterializeEncoding, it's too late to fix that up, since we have already limited padding (and buffer allocations) in both M and N dimensions. So this must be dealt with at SetEncoding time.

=> Solution: in SetEncoding, we should not set both matmul_narrow_{M,N} encoding attributes. If both M and N are small static values for which we would normally set the attribute, we should compare them and only set matmul_narrow_{M,N} for the smaller of the two. For example, if we would normally set matmul_narrow_M=4 and matmul_narrow_N=1, then we should only set matmul_narrow_N=1.

If there is a tie, that is, if we would set both matmul_narrow_{M,N} to the same value, then we should only set matmul_narrow_M, which avoids the transposition business in that case.
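The proposed rule can be sketched as follows. This is an illustration of the selection logic only, not the SetEncoding implementation: the function `choose_narrow_dim` is hypothetical, and `None` stands for "the attribute would not be set".

```python
from typing import Optional, Tuple

def choose_narrow_dim(
    narrow_m: Optional[int], narrow_n: Optional[int]
) -> Tuple[Optional[int], Optional[int]]:
    """Return the (matmul_narrow_M, matmul_narrow_N) pair to actually set.

    Inputs are the values that would normally be set, or None if the
    dimension is not small enough to qualify. At most one output is non-None.
    """
    if narrow_m is not None and narrow_n is not None:
        if narrow_n < narrow_m:
            # N is strictly smaller: keep only narrow-N.
            return (None, narrow_n)
        # M is smaller, or there is a tie: keep only narrow-M,
        # which avoids the transposition.
        return (narrow_m, None)
    # Zero or one candidate: nothing to arbitrate.
    return (narrow_m, narrow_n)

print(choose_narrow_dim(4, 1))     # (None, 1)
print(choose_narrow_dim(2, 2))     # (2, None)
print(choose_narrow_dim(None, 3))  # (None, 3)
```

The first call mirrors the example in the text (matmul_narrow_M=4 and matmul_narrow_N=1), and the second mirrors the tie case.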



bjacob assigned lialan Jun 11, 2024

lialan mentioned this issue Jun 12, 2024

Only set one narrow M/N at a time #17647

Merged

bjacob pushed a commit that referenced this issue Jun 18, 2024

Only set one narrow M/N at a time (#17647)

6f17869

This is in response to #17641. Signed-off-by: Alan Li <me@alanli.org>

lialan closed this as completed Jun 18, 2024
