# Using DebugResult

Here, we will show how to use `DebugResult` to debug some problems we might encounter when using our mlir-opt CLI Wrapper.

Let’s first import some necessary classes and generate an instance of our mlir-opt CLI Wrapper.

In [1]:
from mlir_graphblas import MlirOptCli

cli = MlirOptCli(executable=None, options=None)

Using development graphblas-opt: /Users/pnguyen/code/mlir-graphblas/mlir_graphblas/src/build/bin/graphblas-opt


## Generate Example Input

Let's say we have a bunch of MLIR code that we're not familiar with. 

In [2]:
mlir_string = """
#trait_sum_reduction = {
  indexing_maps = [
    affine_map<(i,j,k) -> (i,j,k)>,  // A
    affine_map<(i,j,k) -> ()>        // x (scalar out)
  ],
  iterator_types = ["reduction", "reduction", "reduction"],
  doc = "x += SUM_ijk A(i,j,k)"
}

#sparseTensor = #sparse_tensor.encoding<{
  dimLevelType = [ "compressed", "compressed", "compressed" ],
  dimOrdering = affine_map<(i,j,k) -> (i,j,k)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

func @func_f32(%argA: tensor<10x20x30xf32, #sparseTensor>) -> f32 {
  %out_tensor = linalg.init_tensor [] : tensor<f32>
  %reduction = linalg.generic #trait_sum_reduction
     ins(%argA: tensor<10x20x30xf32, #sparseTensor>)
    outs(%out_tensor: tensor<f32>) {
      ^bb(%a: f32, %x: f32):
        %0 = arith.addf %x, %a : f32
        linalg.yield %0 : f32
  } -> tensor<f32>
  %answer = tensor.extract %reduction[] : tensor<f32>
  return %answer : f32
}
"""
mlir_bytes = mlir_string.encode()

Since we're not familiar with this code, we don't know exactly which passes are necessary or in what order they should run.

Let's say that this is the first set of passes we try. 

In [3]:
passes = [
    "--sparsification",
    "--sparse-tensor-conversion",
    "--linalg-bufferize",
    "--arith-bufferize",
    "--func-bufferize",
    "--tensor-bufferize",
    "--finalizing-bufferize",
    "--convert-linalg-to-loops",
    "--convert-vector-to-llvm",
    "--convert-math-to-llvm",
    "--convert-math-to-libm",
    "--convert-memref-to-llvm",
    "--convert-openmp-to-llvm",
    "--convert-arith-to-llvm",
    "--convert-std-to-llvm",
    "--reconcile-unrealized-casts"
]

Let's see what results we get. 

In [4]:
result = cli.apply_passes(mlir_bytes, passes)

[stderr] <stdin>:20:16: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
[stderr]   %reduction = linalg.generic #trait_sum_reduction
[stderr]                ^
[stderr] <stdin>:20:16: note: see current operation: %4 = "builtin.unrealized_conversion_cast"(%3) : (i64) -> index


MlirOptError: <stdin>:20:16: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
  %reduction = linalg.generic #trait_sum_reduction
               ^

We get an exception. 

Unfortunately, the exception message isn't very clear: it gives us only the immediate error and none of the context in which it occurred, e.g. which pass raised the error (if any) or whether any necessary passes are missing. 

We only know that the operation `builtin.unrealized_conversion_cast` shows up somewhere and that it's a problem.

Let's try using the `debug_passes` method instead of `apply_passes` to get more information. 

In [5]:
result = cli.debug_passes(mlir_bytes, passes)

In [6]:
result

  Error when running reconcile-unrealized-casts  
<stdin>:24:10: error: failed to legalize operation 'builtin.unrealized_conversion_cast' that was explicitly marked illegal
    %4 = builtin.unrealized_conversion_cast %3 : i64 to index
         ^
<stdin>:24:10: note: see current operation: %4 = "builtin.unrealized_conversion_cast"(%3) : (i64) -> index loc("<stdin>":24:10)


  Input to reconcile-unrealized-casts  
             10        20        30        40        50        60        70        80        90        100       110       120       130       140       150       160       170       180       190       200       
    12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
    --------------------------------------------------------------------------------------------------------------------------------------------------------------

This large output may seem intimidating due to its size, but it's mostly large because it shows the input to each pass. 
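The two calls above suggest a simple pattern: try `apply_passes` first and fall back to `debug_passes` only when it fails. Below is a minimal sketch of that idea. The helper name `apply_or_debug` and the `_FakeCli` stand-in are hypothetical and exist only so the example runs without mlir-opt installed. Catching a bare `Exception` is also an assumption; in real use it would be better to catch the library's `MlirOptError` seen in the traceback above.

```python
def apply_or_debug(cli, mlir_bytes, passes):
    """Return (ok, result): lowered IR on success, a debug report on failure."""
    try:
        return True, cli.apply_passes(mlir_bytes, passes)
    except Exception:  # assumption: narrow this to the library's MlirOptError
        # Re-run the same pipeline in debug mode to get per-pass context.
        return False, cli.debug_passes(mlir_bytes, passes)


class _FakeCli:
    """Hypothetical stand-in that mimics the two methods used above."""

    def apply_passes(self, mlir_bytes, passes):
        if "--bad-pass" in passes:
            raise RuntimeError("failed to legalize operation")
        return "lowered IR"

    def debug_passes(self, mlir_bytes, passes):
        return "debug report"


ok, result = apply_or_debug(_FakeCli(), b"", ["--good-pass"])
failed_ok, failed_result = apply_or_debug(_FakeCli(), b"", ["--bad-pass"])
```

With a real `MlirOptCli` instance in place of `_FakeCli`, the second element would be either the lowered IR or the debug result shown above.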

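We know that the error is triggered by a `builtin.unrealized_conversion_cast` operation. 

The report above is headed `Error when running reconcile-unrealized-casts`, so the cast is still unresolved when that final cleanup pass runs. The last conversion pass before it is `convert-std-to-llvm`. 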
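When the report is long, the name of the failing pass can also be pulled out programmatically. The sketch below assumes that `str(result)` contains a header of the form `Error when running <pass>`, as in the output above. The function name `failing_pass` and the sample text are hypothetical.

```python
import re


def failing_pass(report):
    """Return the pass named in an 'Error when running <pass>' line, or None."""
    match = re.search(r"Error when running\s+(\S+)", report)
    return match.group(1) if match else None


# Hypothetical, shortened report text in the same shape as the output above.
sample_report = (
    "  Error when running reconcile-unrealized-casts  \n"
    "<stdin>:24:10: error: failed to legalize operation\n"
)
failed = failing_pass(sample_report)
print(failed)
```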

It's likely that something in the input to `convert-std-to-llvm` is problematic, so it's worth looking at the IR that was given to that pass, which appears under the section labelled `Input to convert-std-to-llvm`. We'll show a short snippet of it below. 

In [7]:
result_string = str(result)
lines = result_string.splitlines()
lines = lines[lines.index("  Input to convert-std-to-llvm  ")-1:]
lines = lines[:lines.index("")]
print("\n".join(lines))

  Input to convert-std-to-llvm  
module {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) attr

While this is a good idea in general, it doesn't seem to be useful here. When MLIR applies a pass, that pass is applied until quiescence, i.e. it keeps applying the pass until nothing changes (or until some limit on the number of applications is reached). 

It seems that the `convert-std-to-llvm` pass has already been applied a few times since we see several ops from the LLVM dialect already present in the IR shown under the `Input to convert-std-to-llvm` section (for example, we see `llvm.mlir.constant`). 
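Since we'll want to look at the input of more than one pass, the slicing done in the cell above can be wrapped in a small helper. This is a sketch under two assumptions taken from the output shown so far: each section starts with a line reading `Input to <pass>` (possibly padded with spaces) and ends at the first blank line. The name `section_for_pass` and the sample report are hypothetical.

```python
def section_for_pass(report, pass_name, max_lines=None):
    """Return the 'Input to <pass_name>' section of a debug report as text."""
    lines = report.splitlines()
    header = f"Input to {pass_name}"
    # Match on stripped text so padding around the header does not matter.
    start = next(
        (i for i, line in enumerate(lines) if line.strip() == header), None
    )
    if start is None:
        raise ValueError(f"no section found for pass {pass_name!r}")
    body = []
    for line in lines[start + 1:]:
        if line.strip() == "":  # assumed section terminator
            break
        body.append(line)
    if max_lines is not None:
        body = body[:max_lines]
    return "\n".join([header] + body)


# Hypothetical miniature report with two sections.
sample_report = "\n".join([
    "  Input to pass-a  ",
    "module {",
    "}",
    "",
    "  Input to pass-b  ",
    "module {",
    "  func @f()",
    "}",
    "",
])
section = section_for_pass(sample_report, "pass-b")
print(section)
```

Applied to the real report, this would be called as `section_for_pass(str(result), "convert-std-to-llvm", max_lines=10)`.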

Another good place to look is the IR just before the conversion passes start rewriting it. Let's look at the input to the `convert-math-to-llvm` pass. 

In [8]:
lines = result_string.splitlines()
lines = lines[lines.index("  Input to convert-math-to-llvm  ")-1:]
lines = lines[:lines.index("")]
print("\n".join(lines))

  Input to convert-math-to-llvm  
module {
  func private @sparseValuesF32(!llvm.ptr<i8>) -> memref<?xf32> attributes {llvm.emit_c_interface}
  func private @sparsePointers64(!llvm.ptr<i8>, index) -> memref<?xi64> attributes {llvm.emit_c_interface}
  func @func_f32(%arg0: !llvm.ptr<i8>) -> f32 {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %c2 = arith.constant 2 : index
    %cst = arith.constant 0.000000e+00 : f32
    %0 = call @sparsePointers64(%arg0, %c0) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %1 = call @sparsePointers64(%arg0, %c1) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %2 = call @sparsePointers64(%arg0, %c2) : (!llvm.ptr<i8>, index) -> memref<?xi64>
    %3 = call @sparseValuesF32(%arg0) : (!llvm.ptr<i8>) -> memref<?xf32>
    %4 = memref.alloc() : memref<f32>
    memref.store %cst, %4[] : memref<f32>
    %5 = memref.load %4[] : memref<f32>
    %6 = memref.load %0[%c0] : memref<?xi64>
    %7 = arith.index_cast %6 : i64 to index
    %8 = memr

We see that the ops are mostly from the standard, llvm, and builtin dialects. However, there are also some ops from the `scf` dialect. 

It would make sense for the `convert-std-to-llvm` pass to handle ops from the builtin dialect, and also ops from the llvm dialect since that's the target dialect. It's unclear whether the `convert-std-to-llvm` pass can handle ops from the `scf` dialect. Given the name of the pass, we can infer that it mostly handles ops from the `std` dialect and cannot handle ops from the `scf` dialect. 

Let's see if there are any passes that can convert from the `scf` dialect to the `std` dialect. 
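Spotting stray dialects by eye gets harder as the IR grows. The sketch below tallies the dialect prefix of each operation in a piece of textual IR. It is a rough, regex-based heuristic and not a real MLIR parser: it only looks at the first op on each line, and it does not count ops printed without a prefix (such as `func`, `call`, and `return`). The name `count_dialects` and the sample IR are hypothetical.

```python
import re
from collections import Counter

# An op name looks like "dialect.op" or "dialect.sub.op", optionally quoted.
_OP_NAME = re.compile(r'^"?([A-Za-z_]\w*)\.[\w.$]+')


def count_dialects(ir_text):
    """Count dialect prefixes of the ops that start each line of textual IR."""
    counts = Counter()
    for raw_line in ir_text.splitlines():
        line = raw_line.strip()
        # Drop the result list ("%0 = ", "%0:2 = ") so the op name comes first.
        if line.startswith("%") and " = " in line:
            line = line.split(" = ", 1)[1]
        match = _OP_NAME.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical snippet in the style of the IR shown above.
sample_ir = """
module {
  func @f(%arg0: memref<?xi64>) -> index {
    %c0 = arith.constant 0 : index
    %c1 = arith.constant 1 : index
    %0 = memref.load %arg0[%c0] : memref<?xi64>
    %1 = arith.index_cast %0 : i64 to index
    scf.for %i = %c0 to %1 step %c1 {
    }
    return %1 : index
  }
}
"""
dialect_counts = count_dialects(sample_ir)
print(dict(dialect_counts))
```

Running this on the full input to a conversion pass gives a quick view of which dialects still need a lowering pass.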

In [9]:
!mlir-opt --help | grep "scf"

Available Dialects: acc, affine, amx, arith, arm_neon, arm_sve, async, bufferization, builtin, cf, complex, dlti, emitc, gpu, linalg, llvm, math, memref, nvvm, omp, pdl, pdl_interp, quant, rocdl, scf, shape, sparse_tensor, spv, std, tensor, test, tosa, vector, x86vector
      --async-parallel-for                              -   Convert scf.parallel operations to multiple async compute ops executed concurrently for non-overlapping iteration ranges
      --convert-linalg-tiled-loops-to-scf               -   Lower linalg tiled loops to SCF loops and parallel loops
      --convert-openacc-to-scf                          -   Convert the OpenACC ops to OpenACC with SCF dialect
      --convert-parallel-loops-to-gpu                   -   Convert mapped scf.parallel ops to gpu launch operations
      --convert-scf-to-cf                               -   Convert SCF dialect to ControlFlow dialect, replacing structured control flow with a CFG
      --convert-scf-to-openmp                  

The pass `convert-scf-to-cf` seems promising since it converts the `scf` dialect to the `cf` dialect. 

Let's see if running the `convert-scf-to-cf` pass before any of the conversion passes will get rid of our exception. 

In [14]:
passes = [
    "--sparsification",
    "--sparse-tensor-conversion",
    "--linalg-bufferize",
    "--arith-bufferize",
    "--func-bufferize",
    "--tensor-bufferize",
    "--finalizing-bufferize",
    "--convert-scf-to-cf", # newly added
    "--convert-linalg-to-loops",
    "--convert-vector-to-llvm",
    "--convert-math-to-llvm",
    "--convert-math-to-libm",
    "--convert-memref-to-llvm",
    "--convert-openmp-to-llvm",
    "--convert-arith-to-llvm",
    "--convert-std-to-llvm",
    "--reconcile-unrealized-casts"
]
result = cli.apply_passes(mlir_bytes, passes)
print(result[:1500])

module attributes {llvm.data_layout = ""} {
  llvm.func @malloc(i64) -> !llvm.ptr<i8>
  llvm.func @sparseValuesF32(%arg0: !llvm.ptr<i8>) -> !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> attributes {llvm.emit_c_interface, sym_visibility = "private"} {
    %0 = llvm.mlir.constant(1 : index) : i64
    %1 = llvm.alloca %0 x !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)> : (i64) -> !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.call @_mlir_ciface_sparseValuesF32(%1, %arg0) : (!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) -> ()
    %2 = llvm.load %1 : !llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>
    llvm.return %2 : !llvm.struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>
  }
  llvm.func @_mlir_ciface_sparseValuesF32(!llvm.ptr<struct<(ptr<f32>, ptr<f32>, i64, array<1 x i64>, array<1 x i64>)>>, !llvm.ptr<i8>) at

It looks like it fixed our issue!
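The change to the pass list amounts to splicing one flag in front of an existing one. When experimenting with several candidate passes, a small helper avoids retyping the whole list each time. This is a sketch only: the name `insert_before` is hypothetical, and the list is shortened for illustration.

```python
def insert_before(passes, new_pass, anchor):
    """Return a copy of `passes` with `new_pass` placed just before `anchor`."""
    index = passes.index(anchor)  # raises ValueError if `anchor` is absent
    return passes[:index] + [new_pass] + passes[index:]


# Shortened pass list, for illustration only.
old_passes = [
    "--finalizing-bufferize",
    "--convert-linalg-to-loops",
    "--convert-std-to-llvm",
]
new_passes = insert_before(
    old_passes, "--convert-scf-to-cf", "--convert-linalg-to-loops"
)
print(new_passes)
```

Because the helper returns a new list, the original pass list stays available for comparison.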