[AIROCMLIR-552] Added Broadcasting Linalg Lowering Path #2270
Conversation
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[expanded_0]] {{.*}} into tensor<64xf32>
// CHECK-DAG: %[[expanded_1:.*]] = tensor.expand_shape %[[collapsed]] {{.*}} into tensor<64xf32>
This seems unnecessary. We should add a simplifier to remove such patterns so they don't clutter the IR.
I think this comes from the fact that reshapeValue always emits a pair of collapse_shape and expand_shape, even though we already have the desired shape after the tensor.collapse_shape.
Also, if you run the canonicalize pass, the IR looks like the following:
func.func @mbcast_add(%arg0: tensor<802816xf32>, %arg1: tensor<64xf32>) -> tensor<802816xf32> {
%expanded = tensor.expand_shape %arg0 [[0, 1, 2, 3]] output_shape [1, 64, 112, 112] : tensor<802816xf32> into tensor<1x64x112x112xf32>
%0 = tensor.empty() : tensor<1x64x112x112xf32>
%broadcasted = linalg.broadcast ins(%arg1 : tensor<64xf32>) outs(%0 : tensor<1x64x112x112xf32>) dimensions = [0, 2, 3]
%1 = tensor.empty() : tensor<1x64x112x112xf32>
%2 = linalg.add ins(%expanded, %broadcasted : tensor<1x64x112x112xf32>, tensor<1x64x112x112xf32>) outs(%1 : tensor<1x64x112x112xf32>) -> tensor<1x64x112x112xf32>
%collapsed = tensor.collapse_shape %2 [[0, 1, 2, 3]] : tensor<1x64x112x112xf32> into tensor<802816xf32>
return %collapsed : tensor<802816xf32>
}
// CHECK-DAG: %[[cst:.*]] = arith.constant dense<0.000000e+00> : tensor<4x3xf32>
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[cst]]
It is not clear why tensor.collapse_shape is necessary here. Can you explain?
It would also be better to add shape information to the CHECK-DAG for tensor.collapse_shape.
The tensor.collapse_shape comes from as_underlying_shape so that we can return. I can add a CHECK-NEXT/CHECK-DAG at the end to make that clearer.
func.func @literal_splat_f32() -> tensor<12xf32> {
%cst = arith.constant dense<0.000000e+00> : tensor<4x3xf32>
%collapsed = tensor.collapse_shape %cst [[0, 1]] : tensor<4x3xf32> into tensor<12xf32>
return %collapsed : tensor<12xf32>
}
This is unfortunately a side effect of the !migraphx.shaped type carrying strides. It causes us to emit as.underlying.shape and as.logical.shape at the function boundaries, which sometimes yields quite ugly IR.
This is the IR before we run the function-boundary conversion passes:
"builtin.module"() ({
"func.func"() <{function_type = () -> tensor<12xf32>, sym_name = "literal_splat_f32"}> ({
%0 = "arith.constant"() <{value = dense<0.000000e+00> : tensor<4x3xf32>}> : () -> tensor<4x3xf32>
%1 = "migraphx.mlir.as.underlying.shape"(%0) : (tensor<4x3xf32>) -> !migraphx.shaped<4x3xf32, 3x1>
"func.return"(%1) : (!migraphx.shaped<4x3xf32, 3x1>) -> ()
}) : () -> ()
}) : () -> ()
// CHECK-DAG: %[[cst:.*]] = arith.constant dense<0.000000e+00> : tensor<4x3xf32>
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[cst]]
Avoid using CHECK-DAG if possible; it doesn't guarantee the order of instructions.
Use CHECK and CHECK-NEXT.
My understanding is that this is OK and even desired?
We can have:
%one = tensor.expand_shape ...
%two = tensor.expand_shape ...
%three = linalg.generic ins(%one, %two) ....
This program also has the same semantics:
%two = tensor.expand_shape ...
%one = tensor.expand_shape ...
%three = linalg.generic ins(%one, %two) ....
My understanding is that CHECK and CHECK-NEXT can only match one of the two cases, while CHECK-DAG matches both. Does the MLIR printer always guarantee a certain ordering? I can change it to CHECK and CHECK-NEXT without much work.
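For reference, a minimal sketch of directives (variable names hypothetical) that accept either ordering; a contiguous block of CHECK-DAG lines may match in any order before the next plain CHECK:
// CHECK-DAG: %[[one:.*]] = tensor.expand_shape %arg0
// CHECK-DAG: %[[two:.*]] = tensor.expand_shape %arg1
// CHECK: linalg.generic ins(%[[one]], %[[two]]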
convertedElement =
    FloatAttr::get(newType.getElementType(), floatAttr.getValue());
else
  return failure();
We also use BoolAttr in some cases. Can you check and add cases for it?
For the BoolAttr case, I cast it into an IntegerAttr so that we don't run target materialization. For dialect-conversion purposes, the final value of the arith.constant must have the same type as what the TypeConverter produces.
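As a hedged illustration (hypothetical, not a test from this PR), an i1 literal then lowers straight to a constant that already has the converted type:
// The BoolAttr payload is rebuilt as an IntegerAttr, so the emitted
// arith.constant already carries the type the TypeConverter produces and
// no target materialization is inserted.
%cst = arith.constant dense<true> : tensor<4xi1>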
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[cst]]
// CHECK-DAG: return %[[collapsed]]
func.func @literal_splat_f32() -> !migraphx.shaped<4x3xf32, 3x1> {
  %0 = migraphx.literal (dense<0.0> : tensor<4x3xf32>) : <4x3xf32, 3x1>
Is it splat? It says dense here.
Can you add a non-splat dense literal test?
> Is it splat? It says dense here.

My understanding is that dense<1.0> : tensor<2xf32> is splat, but dense<[1, 2]> : tensor<2xf32> is not splat? https://mlir.llvm.org/docs/Dialects/Builtin/#denseintorfpelementsattr
> Can you add a non-splat dense literal test?

Sure, see literal_dense_si32.
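A sketch of the shape such a test's lowered output could take (the name, values, and shape are illustrative, not copied from the actual literal_dense_si32 test):
func.func @literal_dense() -> tensor<2xi32> {
  %cst = arith.constant dense<[1, 2]> : tensor<2xi32>
  return %cst : tensor<2xi32>
}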
if (outRank < inRank) {
  return op.emitError("MultiBroadcastOp shouldn't reduce rank");
}
This check should come before the trivial broadcast of literals. Ideally it should live in the verifier.
Pull request overview
Adds MIGraphX→Linalg lowering support for literal constants and explicit broadcasting ops, aligning the structure with the existing MIGraphX→TOSA lowering approach.
Changes:
- Add Linalg conversion patterns for migraphx.literal, migraphx.broadcast, and migraphx.multibroadcast.
- Add new lit checks covering broadcast/multibroadcast lowering and literal lowering.
- Remove broadcast/multibroadcast from the "not implemented" conversion test list.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp | Implements new conversion patterns and helper reshaping logic for broadcast + literal lowering. |
| mlir/test/Conversion/MIGraphXToLinalg/mixr-to-linalg-ops.mlir | Adds lit coverage for the new lowering paths. |
| mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-not-implemented.mlir | Removes expected-failure cases now that broadcast ops are implemented. |
027e966 to fd730d0
Automated weekly review of merged PRs #2234 #2240 #2248 #2249 #2251 #2254 #2257 #2258 #2259 #2270 #2271. Identifies 6 areas with weak test coverage and meaningful business risk:
1. ConcurrentQueue (no unit tests, multi-threaded, silent deadlock risk)
2. parse_tuning_db_line / read_tuning_db key schema change (no Python tests)
3. BooleanElementwiseConverter missing f16/unsigned dtype coverage
4. Attention MaxNumFOp vs MaximumFOp NaN correctness (no dedicated test)
5. firstCausalMaskIter off-by-one risk (no non-trivial shape test)
6. Sliding window attention edge cases (windowSize=0/>=seqLen/unaligned)
The GitHub discussion API returned FORBIDDEN (read-only CI token); analysis committed here as a permanent record.
Co-authored-by: Djordje Antic <djordje.antic@amd.com>
Motivation
Added lowering for the following operations: migraphx.literal, migraphx.broadcast, and migraphx.multibroadcast.
Technical Details
The code structure closely follows the existing MIGraphXToTosa lowering.
migraphx.broadcast and migraphx.multibroadcast have similar semantics.
For migraphx.broadcast:
Uses the axis attribute to place the input within the output shape. Dimensions outside [axis, axis+inputRank) are broadcast, plus any input dimensions of size 1.
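For instance, a hedged sketch with hypothetical shapes (not taken from this PR's tests): an input of shape 64 placed at axis 1 of a 2x64x8 output broadcasts dimensions 0 and 2:
%init = tensor.empty() : tensor<2x64x8xf32>
%b = linalg.broadcast ins(%in : tensor<64xf32>) outs(%init : tensor<2x64x8xf32>) dimensions = [0, 2]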
For migraphx.multibroadcast:
Uses output strides: stride == 0 means that dimension is broadcast. Input is reshaped to just the non-broadcast dimensions, then linalg.broadcast expands it. Special case: splat constants are folded directly into an arith.constant of the output shape.
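A hedged sketch with hypothetical shapes: lowering a multibroadcast from <1x64xf32, 0x1> (stride 0 marks dimension 0 as broadcast) to shape 8x64 first collapses away the broadcast dimension, then expands it with linalg.broadcast:
%collapsed = tensor.collapse_shape %in [[0, 1]] : tensor<1x64xf32> into tensor<64xf32>
%init = tensor.empty() : tensor<8x64xf32>
%b = linalg.broadcast ins(%collapsed : tensor<64xf32>) outs(%init : tensor<8x64xf32>) dimensions = [0]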
Test Plan
Added lit tests.
Test Result
Passed the e2e tests and lit tests.
Submission Checklist