
[AIROCMLIR-552] Added Broadcasting Linalg Lowering Path #2270

Merged
Mr-Anyone merged 2 commits into develop from pr-migraphx-to-linalg-broadcasting on Mar 6, 2026

Conversation

@Mr-Anyone
Contributor

Motivation

Added lowering for the following operations:

  1. migraphx.literal
  2. migraphx.broadcast
  3. migraphx.multibroadcast

Technical Details

The code structure closely follows the existing MIGraphXToTosa lowering.

migraphx.broadcast and migraphx.multibroadcast have similar semantics.

For migraphx.broadcast:

The axis attribute places the input within the output shape; dimensions outside [axis, axis + inputRank) are broadcast, as are any input dimensions of size 1.
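As a minimal sketch (the shapes and axis value here are illustrative, not taken from the PR's tests), broadcasting a tensor<64xf32> into a 2x64x4 output with axis = 1 places the input at output dimension 1, so dimensions 0 and 2 become the linalg.broadcast dimensions:

  func.func @broadcast_axis1(%arg0: tensor<64xf32>) -> tensor<2x64x4xf32> {
    // axis = 1 places the input at output dim 1; dims 0 and 2 are broadcast.
    %0 = tensor.empty() : tensor<2x64x4xf32>
    %1 = linalg.broadcast ins(%arg0 : tensor<64xf32>)
        outs(%0 : tensor<2x64x4xf32>) dimensions = [0, 2]
    return %1 : tensor<2x64x4xf32>
  }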

For migraphx.multibroadcast:

Uses output strides: a stride of 0 marks a dimension as broadcast. The input is reshaped down to just its non-broadcast dimensions, and linalg.broadcast then expands it. Special case: splat constants are folded directly into an arith.constant of the output shape.
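As a hedged sketch of the non-splat path (shapes are illustrative, not from the PR), an output viewed as 2x64x8 with strides 0x1x0 has stride-0 dimensions 0 and 2, so the input is collapsed to the single non-broadcast dimension and then expanded:

  func.func @mbcast_sketch(%arg0: tensor<1x64x1xf32>) -> tensor<2x64x8xf32> {
    // Drop the broadcast (stride-0) dimensions, keeping only the real data.
    %collapsed = tensor.collapse_shape %arg0 [[0, 1, 2]]
        : tensor<1x64x1xf32> into tensor<64xf32>
    %0 = tensor.empty() : tensor<2x64x8xf32>
    // Re-expand along the broadcast dimensions.
    %1 = linalg.broadcast ins(%collapsed : tensor<64xf32>)
        outs(%0 : tensor<2x64x8xf32>) dimensions = [0, 2]
    return %1 : tensor<2x64x8xf32>
  }

In the splat special case, the same output would instead be a single arith.constant dense<...> : tensor<2x64x8xf32>, with no broadcast op emitted.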

Test Plan

Added lit tests.

Test Result

Passed the e2e and lit tests.

Submission Checklist

@Mr-Anyone Mr-Anyone requested a review from causten as a code owner March 4, 2026 23:25
@Mr-Anyone Mr-Anyone changed the title from "[AIROCMLIR-552] Added broadcasting" to "[AIROCMLIR-552] Added Broadcasting Linalg Lowering Path" on Mar 4, 2026
Comment on lines +181 to +182
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[expanded_0]] {{.*}} into tensor<64xf32>
// CHECK-DAG: %[[expanded_1:.*]] = tensor.expand_shape %[[collapsed]] {{.*}} into tensor<64xf32>
Member

This seems unnecessary. We should add a simplifier to remove such patterns so they don't clutter the IR.

Contributor Author

I think this comes from the fact that reshapeValue always emits a pair of collapse_shape and expand_shape ops, even when the tensor.collapse_shape alone already produces the desired shape.

Also, if you run the canonicalize pass, the IR looks like the following:

  func.func @mbcast_add(%arg0: tensor<802816xf32>, %arg1: tensor<64xf32>) -> tensor<802816xf32> {
    %expanded = tensor.expand_shape %arg0 [[0, 1, 2, 3]] output_shape [1, 64, 112, 112] : tensor<802816xf32> into tensor<1x64x112x112xf32>
    %0 = tensor.empty() : tensor<1x64x112x112xf32>
    %broadcasted = linalg.broadcast ins(%arg1 : tensor<64xf32>) outs(%0 : tensor<1x64x112x112xf32>) dimensions = [0, 2, 3]
    %1 = tensor.empty() : tensor<1x64x112x112xf32>
    %2 = linalg.add ins(%expanded, %broadcasted : tensor<1x64x112x112xf32>, tensor<1x64x112x112xf32>) outs(%1 : tensor<1x64x112x112xf32>) -> tensor<1x64x112x112xf32>
    %collapsed = tensor.collapse_shape %2 [[0, 1, 2, 3]] : tensor<1x64x112x112xf32> into tensor<802816xf32>
    return %collapsed : tensor<802816xf32>
  }

Comment on lines +197 to +198
// CHECK-DAG: %[[cst:.*]] = arith.constant dense<0.000000e+00> : tensor<4x3xf32>
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[cst]]
Member

It is not clear why the tensor.collapse_shape is necessary here. Can you explain?
It would also be better to add shape information to the CHECK-DAG for tensor.collapse_shape.

Contributor Author

The tensor.collapse_shape comes from as.underlying.shape so that we can return the converted type. I can add a CHECK-NEXT/CHECK-DAG at the end to make that clearer.

  func.func @literal_splat_f32() -> tensor<12xf32> {
    %cst = arith.constant dense<0.000000e+00> : tensor<4x3xf32>
    %collapsed = tensor.collapse_shape %cst [[0, 1]] : tensor<4x3xf32> into tensor<12xf32>
    return %collapsed : tensor<12xf32>
  }

This is unfortunately a side effect of the !migraphx.shaped type carrying strides. It forces us to emit as.underlying.shape and as.logical.shape at function boundaries, which sometimes produces quite ugly IR.

This is the IR before we run the function-boundary conversion passes:

"builtin.module"() ({
  "func.func"() <{function_type = () -> tensor<12xf32>, sym_name = "literal_splat_f32"}> ({
    %0 = "arith.constant"() <{value = dense<0.000000e+00> : tensor<4x3xf32>}> : () -> tensor<4x3xf32>
    %1 = "migraphx.mlir.as.underlying.shape"(%0) : (tensor<4x3xf32>) -> !migraphx.shaped<4x3xf32, 3x1>
    "func.return"(%1) : (!migraphx.shaped<4x3xf32, 3x1>) -> ()
  }) : () -> ()
}) : () -> ()

Comment on lines +197 to +198
// CHECK-DAG: %[[cst:.*]] = arith.constant dense<0.000000e+00> : tensor<4x3xf32>
// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[cst]]
Member

Avoid using CHECK-DAG if possible; it doesn't guarantee the order of instructions. Use CHECK and CHECK-NEXT instead.

Contributor Author

My understanding is that this is OK, and even desired?

We can have:

%one = tensor.expand_shape ... 
%two = tensor.expand_shape ... 
%three = linalg.generic ins(%one, %two) .... 

This program also has the same semantics:

%two = tensor.expand_shape ... 
%one = tensor.expand_shape ... 
%three = linalg.generic ins(%one, %two) .... 

My understanding is that CHECK and CHECK-NEXT can only match one of the two cases, while CHECK-DAG matches both. Does the MLIR printer always guarantee a particular ordering? I can change it to CHECK and CHECK-NEXT without much work.
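For illustration (a hypothetical sketch; the operand names are made up, not taken from the test file), adjacent CHECK-DAG directives accept either emission order of the two expand_shape ops, whereas CHECK followed by CHECK-NEXT pins exactly one order:

  // CHECK-DAG: tensor.expand_shape %arg0
  // CHECK-DAG: tensor.expand_shape %arg1
  // CHECK: linalg.generic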

  convertedElement =
      FloatAttr::get(newType.getElementType(), floatAttr.getValue());
else
  return failure();
Member

We also use BoolAttr in some cases. Can you check and add handling for those as well?

Contributor Author

For the BoolAttr case, I cast it into an IntegerAttr so that we don't run target materialization. For dialect conversion purposes, the final arith.constant must have the same type as what the TypeConverter produces.
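To sketch the constraint (the i8 element type below is an assumed TypeConverter mapping, purely for illustration), the lowering must emit the constant directly at the converted type, rather than a tensor<2xi1> constant followed by a materialization cast:

  // Assuming the converter maps i1 elements to i8 (illustrative only):
  %cst = arith.constant dense<[1, 0]> : tensor<2xi8>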

// CHECK-DAG: %[[collapsed:.*]] = tensor.collapse_shape %[[cst]]
// CHECK-DAG: return %[[collapsed]]
func.func @literal_splat_f32() -> !migraphx.shaped<4x3xf32, 3x1> {
%0 = migraphx.literal (dense<0.0> : tensor<4x3xf32>) : <4x3xf32, 3x1>
Member

Is it splat? It says dense here.

Member

Can you add a non-splat dense literal test?

Contributor Author

Is it splat? It says dense here.

My understanding is that dense<1.0> : tensor<2xf32> is splat, but dense<[1, 2]> : tensor<2xf32> is not: https://mlir.llvm.org/docs/Dialects/Builtin/#denseintorfpelementsattr

Can you add a non-splat dense literal test?

Sure, see literal_dense_si32.

Comment on lines +503 to +505
if (outRank < inRank) {
return op.emitError("MultiBroadcastOp shouldn't reduce rank");
}
Member

This check should come before the trivial broadcast of literals. Ideally, it should live in the verifier.

Contributor

Copilot AI left a comment

Pull request overview

Adds MIGraphX→Linalg lowering support for literal constants and explicit broadcasting ops, aligning the structure with the existing MIGraphX→TOSA lowering approach.

Changes:

  • Add Linalg conversion patterns for migraphx.literal, migraphx.broadcast, and migraphx.multibroadcast.
  • Add new lit checks covering broadcast/multibroadcast lowering and literal lowering.
  • Remove broadcast / multibroadcast from the “not implemented” conversion test list.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

  • mlir/lib/Conversion/MIGraphXToLinalg/MIGraphXToLinalg.cpp: Implements the new conversion patterns and helper reshaping logic for broadcast and literal lowering.
  • mlir/test/Conversion/MIGraphXToLinalg/mixr-to-linalg-ops.mlir: Adds lit coverage for the new lowering paths.
  • mlir/test/Conversion/MIGraphXToLinalg/migraphx-to-linalg-not-implemented.mlir: Removes expected-failure cases now that the broadcast ops are implemented.

@Mr-Anyone Mr-Anyone requested a review from umangyadav March 5, 2026 16:44
@Mr-Anyone Mr-Anyone force-pushed the pr-migraphx-to-linalg-broadcasting branch from 027e966 to fd730d0 on March 6, 2026 15:49
@Mr-Anyone Mr-Anyone merged commit a8ae8ac into develop Mar 6, 2026
7 of 14 checks passed
@Mr-Anyone Mr-Anyone deleted the pr-migraphx-to-linalg-broadcasting branch March 6, 2026 21:10
cursor Bot pushed a commit that referenced this pull request Mar 13, 2026
Automated weekly review of merged PRs #2234 #2240 #2248 #2249 #2251
#2254 #2257 #2258 #2259 #2270 #2271.

Identifies 6 areas with weak test coverage and meaningful business risk:
1. ConcurrentQueue (no unit tests, multi-threaded, silent deadlock risk)
2. parse_tuning_db_line / read_tuning_db key schema change (no Python tests)
3. BooleanElementwiseConverter missing f16/unsigned dtype coverage
4. Attention MaxNumFOp vs MaximumFOp NaN correctness (no dedicated test)
5. firstCausalMaskIter off-by-one risk (no non-trivial shape test)
6. Sliding window attention edge cases (windowSize=0/>=seqLen/unaligned)

The GitHub discussion API returned FORBIDDEN (read-only CI token);
analysis committed here as a permanent record.

Co-authored-by: Djordje Antic <djordje.antic@amd.com>