@lialan reports:
Question: I hit cases where the matmul N dimension is too narrow:
[cpu-materialize-encoding]: tile (16, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (8, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (4, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (2, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
[cpu-materialize-encoding]: tile (1, 16, 2) is skipped because it is not valid for upper_bound (1, 1, 16)
In such cases, do we assume the input is illegal, because we cannot find legal tiles?
In `SetEncoding` for matmuls, when the M or N dimension is small, we may set the `matmul_narrow_{M,N}` attribute on the `EncodingAttr`, which has the effect of limiting padding in that dimension. Then, later, in MaterializeEncoding, we may transpose the matmul to reduce the `matmul_narrow_` case to `transpose_narrow_M`, and we select an appropriate tile size for the small M value. That in turn allows picking up a corresponding narrow-M code path in the `mmt4d` ukernel. That tile size is narrow only in the M dimension; in the N dimension it is a generic large size.
But what happens if both `matmul_narrow_{M,N}` are set? Then, we have limited padding in both dimensions. The situation is still the same after transposing, and so we now can't use a narrow-M tile, because that tile is for generic large-N and now we also have narrow-N.
By the time we reach MaterializeEncoding, it's too late to fix that up, since we have already limited padding (and buffer allocations) in both M and N dimensions. So this must be dealt with at SetEncoding time.
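To make the failure concrete, here is a minimal Python sketch. It is not IREE's code. It assumes that a tile (M0, N0, K0) is usable only when every tile dimension fits within the corresponding upper bound; that rule is inferred from the log lines above, not confirmed by them. The name `valid_tiles` and the N bound of 64 in the second call are hypothetical.

```python
# Hypothetical illustration. The validity rule (each tile dimension must not
# exceed the matching upper bound) is an assumption inferred from the log.
def valid_tiles(tiles, upper_bound):
    """Return the tiles whose every dimension fits within upper_bound."""
    return [t for t in tiles if all(d <= ub for d, ub in zip(t, upper_bound))]

# Candidate (M0, N0, K0) tiles copied from the log: narrow in M only, N0 = 16.
CANDIDATES = [(16, 16, 2), (8, 16, 2), (4, 16, 2), (2, 16, 2), (1, 16, 2)]

# Both M and N are narrow: every candidate is rejected because N0 = 16 > 1.
both_narrow = valid_tiles(CANDIDATES, (1, 1, 16))

# Only M is narrow (the N bound of 64 is an arbitrary example value):
# the narrow-M tile (1, 16, 2) survives.
only_m_narrow = valid_tiles(CANDIDATES, (1, 64, 16))

print(both_narrow)
print(only_m_narrow)
```

Under that assumed rule, limiting padding in both M and N leaves no candidate tile, which matches the situation described above.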
=> Solution: in SetEncoding, we should not set both `matmul_narrow_{M,N}` encoding attributes. If both M and N are small static values for which we would normally set the attribute, we should compare them and only set `matmul_narrow_{M,N}` for the smaller of the two. For example, if we would normally set `matmul_narrow_M=4` and `matmul_narrow_N=1` then in fact we should only set `matmul_narrow_N=1`.
If there is a tie, that is, if we would set both `matmul_narrow_{M,N}` to the same value, then we should only set `matmul_narrow_M`, which avoids the transposition business in that case.
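The proposed selection rule can be sketched as follows. This is a Python illustration of the logic only, not the actual `SetEncoding` implementation. The function name `pick_narrow_dims` is hypothetical, as is the convention of using `None` to mean that the attribute would not be set for that dimension.

```python
from typing import Optional, Tuple

def pick_narrow_dims(
    narrow_m: Optional[int], narrow_n: Optional[int]
) -> Tuple[Optional[int], Optional[int]]:
    """Keep at most one of the two narrow sizes.

    Inputs are the values we would normally set for matmul_narrow_M and
    matmul_narrow_N (None meaning "not narrow, attribute not set").
    """
    if narrow_m is not None and narrow_n is not None:
        if narrow_n < narrow_m:
            # N is strictly smaller: keep only matmul_narrow_N.
            return None, narrow_n
        # M is smaller, or there is a tie: keep only matmul_narrow_M,
        # which avoids the transposition in the tie case.
        return narrow_m, None
    # Zero or one dimension is narrow: nothing to resolve.
    return narrow_m, narrow_n

print(pick_narrow_dims(4, 1))  # the example from the text
print(pick_narrow_dims(2, 2))  # the tie case
```

With the example from the text, `pick_narrow_dims(4, 1)` keeps only the N value, and a tie such as `pick_narrow_dims(2, 2)` keeps only the M value.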