MLIR Affine Dialect Loop Fusion pass appears to not work #61604

@rohany

I’m playing around with the Affine dialect of MLIR, in particular the Loop Fusion pass. I haven’t been able to get it to work with mlir-opt on some simple examples, including the ones from the documentation page here: https://mlir.llvm.org/docs/Passes/#-affine-loop-fusion-fuse-affine-loop-nests.

At a high level, I’m doing the following:

  • Copying some MLIR source with fusable loops into a file "testing.mlir"
  • Running bin/mlir-opt testing.mlir --affine-loop-fusion --dump-pass-pipeline

Here is concrete input and output:

➜  build git:(main) ✗ cat ../testing.mlir
func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
  %0 = memref.alloc() : memref<10xf32>
  %1 = memref.alloc() : memref<10xf32>
  %cst = arith.constant 0.000000e+00 : f32
  affine.for %arg2 = 0 to 10 {
    affine.store %cst, %0[%arg2] : memref<10xf32>
    affine.store %cst, %1[%arg2] : memref<10xf32>
  }
  affine.for %arg2 = 0 to 10 {
    %2 = affine.load %0[%arg2] : memref<10xf32>
    %3 = arith.addf %2, %2 : f32
    affine.store %3, %arg0[%arg2] : memref<10xf32>
  }
  affine.for %arg2 = 0 to 10 {
    %2 = affine.load %1[%arg2] : memref<10xf32>
    %3 = arith.mulf %2, %2 : f32
    affine.store %3, %arg1[%arg2] : memref<10xf32>
  }
  return
}

func.func @sibling_fusion(%arg0: memref<10x10xf32>, %arg1: memref<10x10xf32>,
                     %arg2: memref<10x10xf32>, %arg3: memref<10x10xf32>,
                     %arg4: memref<10x10xf32>) {
  affine.for %arg5 = 0 to 3 {
    affine.for %arg6 = 0 to 3 {
      %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
      %1 = affine.load %arg1[%arg5, %arg6] : memref<10x10xf32>
      %2 = arith.mulf %0, %1 : f32
      affine.store %2, %arg3[%arg5, %arg6] : memref<10x10xf32>
    }
  }
  affine.for %arg5 = 0 to 3 {
    affine.for %arg6 = 0 to 3 {
      %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
      %1 = affine.load %arg2[%arg5, %arg6] : memref<10x10xf32>
      %2 = arith.addf %0, %1 : f32
      affine.store %2, %arg4[%arg5, %arg6] : memref<10x10xf32>
    }
  }
  return
}

➜  build git:(main) ✗ bin/mlir-opt ../testing.mlir --affine-loop-fusion --dump-pass-pipeline
Pass Manager with 1 passes:
builtin.module(affine-loop-fusion{fusion-compute-tolerance=3.000000e-01 fusion-fast-mem-space=0 fusion-local-buf-threshold=0 fusion-maximal=false mode=producer})

module {
  func.func @producer_consumer_fusion(%arg0: memref<10xf32>, %arg1: memref<10xf32>) {
    %alloc = memref.alloc() : memref<10xf32>
    %alloc_0 = memref.alloc() : memref<10xf32>
    %cst = arith.constant 0.000000e+00 : f32
    affine.for %arg2 = 0 to 10 {
      affine.store %cst, %alloc[%arg2] : memref<10xf32>
      affine.store %cst, %alloc_0[%arg2] : memref<10xf32>
    }
    affine.for %arg2 = 0 to 10 {
      %0 = affine.load %alloc[%arg2] : memref<10xf32>
      %1 = arith.addf %0, %0 : f32
      affine.store %1, %arg0[%arg2] : memref<10xf32>
    }
    affine.for %arg2 = 0 to 10 {
      %0 = affine.load %alloc_0[%arg2] : memref<10xf32>
      %1 = arith.mulf %0, %0 : f32
      affine.store %1, %arg1[%arg2] : memref<10xf32>
    }
    return
  }
  func.func @sibling_fusion(%arg0: memref<10x10xf32>, %arg1: memref<10x10xf32>, %arg2: memref<10x10xf32>, %arg3: memref<10x10xf32>, %arg4: memref<10x10xf32>) {
    affine.for %arg5 = 0 to 3 {
      affine.for %arg6 = 0 to 3 {
        %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
        %1 = affine.load %arg1[%arg5, %arg6] : memref<10x10xf32>
        %2 = arith.mulf %0, %1 : f32
        affine.store %2, %arg3[%arg5, %arg6] : memref<10x10xf32>
      }
    }
    affine.for %arg5 = 0 to 3 {
      affine.for %arg6 = 0 to 3 {
        %0 = affine.load %arg0[%arg5, %arg6] : memref<10x10xf32>
        %1 = affine.load %arg2[%arg5, %arg6] : memref<10x10xf32>
        %2 = arith.addf %0, %1 : f32
        affine.store %2, %arg4[%arg5, %arg6] : memref<10x10xf32>
      }
    }
    return
  }
}
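
To make the expectation concrete, here is a minimal Python sketch (not MLIR, and not output from the pass) of what fusing the three loops in @producer_consumer_fusion would mean semantically. The function names `unfused` and `fused` and the `cst` parameter are hypothetical and used only for illustration; the MLIR above hard-codes the constant to 0.0. The sketch assumes the intermediate buffers are not read after the loops, so fusion can shrink them to per-iteration scalars.

```python
def unfused(cst, n=10):
    """Mirrors the three separate affine.for loops in the input."""
    buf0 = [0.0] * n  # stands in for the first memref.alloc
    buf1 = [0.0] * n  # stands in for the second memref.alloc
    out0 = [0.0] * n  # stands in for %arg0
    out1 = [0.0] * n  # stands in for %arg1
    for i in range(n):  # producer loop
        buf0[i] = cst
        buf1[i] = cst
    for i in range(n):  # first consumer (addf)
        out0[i] = buf0[i] + buf0[i]
    for i in range(n):  # second consumer (mulf)
        out1[i] = buf1[i] * buf1[i]
    return out0, out1


def fused(cst, n=10):
    """Single loop; each intermediate buffer shrinks to one scalar."""
    out0 = [0.0] * n
    out1 = [0.0] * n
    for i in range(n):
        t0 = cst  # value that the producer stored at index i
        t1 = cst
        out0[i] = t0 + t0
        out1[i] = t1 * t1
    return out0, out1


if __name__ == "__main__":
    # Both versions should compute identical outputs.
    print(unfused(1.5) == fused(1.5))
```

The output I pasted above still has the shape of `unfused`: three separate loops and two full-size buffers.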

A related problem is that I seem to be unable to set the mode option of the loop fusion pass. Even if I pass --affine-loop-fusion="mode=greedy", the pipeline printed by the pass manager still reports mode=producer.

Labels: mlir:affine, question
