# Fusion of graphblas.matrix_select Ops

This example shows how to use the `--graphblas-optimize` pass of `graphblas-opt` to fuse `graphblas.matrix_select` ops.

When fusing `graphblas.matrix_select` ops, `--graphblas-optimize` combines several sequential `graphblas.matrix_select` ops into a single `graphblas.matrix_select` op whose `selectors` attribute lists all of their selectors.
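To make the idea concrete, here is a minimal plain-Python sketch of the fusion described above. It is only a conceptual model, not the pass's implementation: `fuse_selects` is a hypothetical helper, ops are modeled as `(source, selectors)` pairs, thunk operands are ignored, and the real pass may emit the selectors in a different order.

```python
def fuse_selects(ops):
    """Group select ops by source tensor and merge their selector lists.

    Conceptual model only: each op is a (source, selectors) pair, thunks
    are ignored, and the real pass may order the selectors differently.
    """
    fused = {}
    for source, selectors in ops:
        # Ops that share a source tensor end up in the same fused op.
        fused.setdefault(source, []).extend(selectors)
    return list(fused.items())


# Two sequential selects on the same (hypothetically named) source tensor.
ops = [
    ("%sparse_tensor", ["gt", "triu"]),
    ("%sparse_tensor", ["tril", "gt"]),
]
fused_ops = fuse_selects(ops)
print(fused_ops)
```

In this model, ops on different source tensors stay in separate groups, which matches the fact that a single `graphblas.matrix_select` takes one source tensor.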

Let's first import some necessary libraries.

In [11]:
import tempfile
from mlir_graphblas.cli import GRAPHBLAS_OPT_EXE

Since [sparse tensor encodings](https://mlir.llvm.org/docs/Dialects/SparseTensorOps/#sparsetensorencodingattr) can be very verbose in MLIR, let's import a helper to make the MLIR code more readable.

In [12]:
from mlir_graphblas.tools import tersify_mlir

## Fusing graphblas.matrix_select Ops With Same Source Tensor

If we have several uses of `graphblas.matrix_select` on the same source tensor, then `--graphblas-optimize` fuses them into one op with many selectors.

Here's some example code using two sequential `graphblas.matrix_select` ops on the same tensor, with four selectors in total.

In [13]:
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
  dimLevelType = [ "dense", "compressed" ],
  dimOrdering = affine_map<(i,j) -> (i,j)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

func @select_fuse_multi(%sparse_tensor: tensor<?x?xf64, #CSR64>) -> (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) {
    %thunk_a = constant 1.2 : f64
    %thunk_b = constant 3.4 : f64
    %answer1, %answer2 = graphblas.matrix_select %sparse_tensor, %thunk_a { selectors = ["gt", "triu"] } : tensor<?x?xf64, #CSR64>, f64 to tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
    %answer3 = graphblas.matrix_select %sparse_tensor, %thunk_b { selectors = ["tril", "gt"] } : tensor<?x?xf64, #CSR64>, f64 to tensor<?x?xf64, #CSR64>
    return %answer1, %answer2, %answer3 : tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
}
"""

Let's see what code we get when we run it through `graphblas-opt` with the `--graphblas-optimize` pass.

In [14]:
with tempfile.NamedTemporaryFile(mode="w") as temp:
    temp_file_name = temp.name
    # Write the MLIR text and flush so graphblas-opt sees the complete file.
    temp.write(mlir_text)
    temp.flush()

    output_mlir = ! cat $temp_file_name | $GRAPHBLAS_OPT_EXE --graphblas-optimize
    output_mlir = "\n".join(output_mlir)
    output_mlir = tersify_mlir(output_mlir)

print(output_mlir)

#CSR64 = #sparse_tensor.encoding<{ 
    dimLevelType = [ "dense", "compressed" ], 
    dimOrdering = affine_map<(d0, d1) -> (d0, d1)>, 
    pointerBitWidth = 64, 
    indexBitWidth = 64 
}>

builtin.module  {
  builtin.func @select_fuse_multi(%arg0: tensor<?x?xf64, #CSR64>) -> (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) {
    %cst = constant 1.200000e+00 : f64
    %cst_0 = constant 3.400000e+00 : f64
    %0:3 = graphblas.matrix_select %arg0, %cst_0, %cst {selectors = ["tril", "gt", "gt", "triu"]} : tensor<?x?xf64, #CSR64>, f64, f64 to tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
    return %0#1, %0#2, %0#0 : tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
  }
}




As shown above, `--graphblas-optimize` combined the original 2 uses of `graphblas.matrix_select` into one!
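One quick way to confirm the fusion programmatically is to count the `graphblas.matrix_select` ops before and after the pass. The sketch below is a hedged illustration: `count_matrix_selects` is a hypothetical helper, and the `before` and `after` strings are abbreviated copies of the lines shown above (type signatures elided) rather than fresh tool output. In the notebook, `mlir_text` and `output_mlir` could be passed instead.

```python
import re


def count_matrix_selects(mlir: str) -> int:
    """Count occurrences of the graphblas.matrix_select op in MLIR text."""
    return len(re.findall(r"\bgraphblas\.matrix_select\b", mlir))


# Abbreviated input: two separate selects on the same source tensor.
before = """
%answer1, %answer2 = graphblas.matrix_select %sparse_tensor, %thunk_a { selectors = ["gt", "triu"] }
%answer3 = graphblas.matrix_select %sparse_tensor, %thunk_b { selectors = ["tril", "gt"] }
"""

# Abbreviated output of --graphblas-optimize: a single fused select.
after = """
%0:3 = graphblas.matrix_select %arg0, %cst_0, %cst {selectors = ["tril", "gt", "gt", "triu"]}
"""

print(count_matrix_selects(before), count_matrix_selects(after))  # prints "2 1"
```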

## Fusing graphblas.matrix_select Ops With Different Source Tensors

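Our previous example fused ops that all selected from the same source tensor.

`--graphblas-optimize` also handles ops that select from different source tensors: it fuses the ops that share a source tensor and keeps one `graphblas.matrix_select` per source, as shown here. Note that this cell runs `--graphblas-structuralize` before `--graphblas-optimize`.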

In [15]:
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
  dimLevelType = [ "dense", "compressed" ],
  dimOrdering = affine_map<(i,j) -> (i,j)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

func @select_fuse_separate(%sparse_tensor1: tensor<?x?xf64, #CSR64>, %sparse_tensor2: tensor<?x?xf64, #CSR64>) -> (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) {
    %c0_f64 = constant 0.0 : f64
    %answer1 = graphblas.matrix_select %sparse_tensor1, %c0_f64 { selectors = ["gt"] } : tensor<?x?xf64, #CSR64>, f64 to tensor<?x?xf64, #CSR64>
    %answer2 = graphblas.matrix_select %sparse_tensor2 { selectors = ["triu"] } : tensor<?x?xf64, #CSR64> to tensor<?x?xf64, #CSR64>
    %answer3 = graphblas.matrix_select %sparse_tensor1 { selectors = ["tril"] } : tensor<?x?xf64, #CSR64> to tensor<?x?xf64, #CSR64>
    return %answer1, %answer2, %answer3 : tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
}
"""
with tempfile.NamedTemporaryFile(mode="w") as temp:
    temp_file_name = temp.name
    # Write the MLIR text and flush so graphblas-opt sees the complete file.
    temp.write(mlir_text)
    temp.flush()

    output_mlir = ! cat $temp_file_name | $GRAPHBLAS_OPT_EXE --graphblas-structuralize --graphblas-optimize
    output_mlir = "\n".join(output_mlir)
    output_mlir = tersify_mlir(output_mlir)

print(output_mlir)

#CSR64 = #sparse_tensor.encoding<{ 
    dimLevelType = [ "dense", "compressed" ], 
    dimOrdering = affine_map<(d0, d1) -> (d0, d1)>, 
    pointerBitWidth = 64, 
    indexBitWidth = 64 
}>

builtin.module  {
  builtin.func @select_fuse_separate(%arg0: tensor<?x?xf64, #CSR64>, %arg1: tensor<?x?xf64, #CSR64>) -> (tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>) {
    %cst = constant 0.000000e+00 : f64
    %0 = graphblas.matrix_select %arg1 {selectors = ["triu"]} : tensor<?x?xf64, #CSR64> to tensor<?x?xf64, #CSR64>
    %1:2 = graphblas.matrix_select %arg0, %cst {selectors = ["tril", "gt"]} : tensor<?x?xf64, #CSR64>, f64 to tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
    return %1#1, %0, %1#0 : tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>, tensor<?x?xf64, #CSR64>
  }
}




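Note that the result necessarily contains two `graphblas.matrix_select` ops, one per source tensor, since each `graphblas.matrix_select` takes exactly one source tensor: the two ops on `%sparse_tensor1` were fused, while the op on `%sparse_tensor2` was left on its own.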
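The cells above rely on IPython's `!` shell magic, which is not available in a plain Python script. As a rough sketch of an alternative, the same pipe-through-stdin pattern can be written with the standard `subprocess` module. `run_through` is a hypothetical helper, and the demonstration uses a stand-in command that simply echoes its input so the block runs without `graphblas-opt` installed.

```python
import subprocess
import sys


def run_through(cmd, mlir_text):
    """Pipe mlir_text to cmd's stdin and return its stdout as a string.

    Raises subprocess.CalledProcessError if the command exits non-zero.
    """
    result = subprocess.run(
        cmd, input=mlir_text, capture_output=True, text=True, check=True
    )
    return result.stdout


# Stand-in command that echoes stdin to stdout (not graphblas-opt).
echo_cmd = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read())"]
sample = "func @f() { return }\n"
echoed = run_through(echo_cmd, sample)
print(echoed)

# With mlir_graphblas installed, and assuming GRAPHBLAS_OPT_EXE is the path
# to the executable as in the cells above, the call would look like:
# output_mlir = run_through([GRAPHBLAS_OPT_EXE, "--graphblas-optimize"], mlir_text)
```

Passing the text through `input=` avoids the temporary file entirely, since the cells above already feed `graphblas-opt` through its standard input.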