# graphblas.matrix_apply

This example will go over how to compile MLIR code (using the GraphBLAS dialect) to a function callable from Python.

The MLIR code in this example demonstrates how the `graphblas.matrix_apply` op from the GraphBLAS dialect works.

Let’s first import some necessary modules and generate an instance of our JIT engine.

In [1]:
import mlir_graphblas
import mlir_graphblas.sparse_utils
import numpy as np

engine = mlir_graphblas.MlirJitEngine()

Here are the passes we'll use. The pass `--graphblas-lower` is necessary to lower the GraphBLAS dialect.

In [2]:
passes = [
    "--graphblas-lower",
    "--sparsification",
    "--sparse-tensor-conversion",
    "--linalg-bufferize",
    "--func-bufferize",
    "--tensor-bufferize",
    "--tensor-constant-bufferize",
    "--finalizing-bufferize",
    "--convert-linalg-to-loops",
    "--convert-scf-to-std",
    "--convert-std-to-llvm",
]

As in our other examples using the GraphBLAS dialect, we'll need a helper function that converts a sparse tensor to a dense tensor so that results are easy to inspect.

In [3]:
mlir_text = """
#trait_densify_csr = {
  indexing_maps = [
    affine_map<(i,j) -> (i,j)>,
    affine_map<(i,j) -> (i,j)>
  ],
  iterator_types = ["parallel", "parallel"]
}

#CSR64 = #sparse_tensor.encoding<{
  dimLevelType = [ "dense", "compressed" ],
  dimOrdering = affine_map<(i,j) -> (i,j)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

func @csr_densify4x4(%argA: tensor<4x4xf64, #CSR64>) -> tensor<4x4xf64> {
  %output_storage = constant dense<0.0> : tensor<4x4xf64>
  %0 = linalg.generic #trait_densify_csr
    ins(%argA: tensor<4x4xf64, #CSR64>)
    outs(%output_storage: tensor<4x4xf64>) {
      ^bb(%A: f64, %x: f64):
        linalg.yield %A : f64
    } -> tensor<4x4xf64>
  return %0 : tensor<4x4xf64>
}
"""

Let's compile our MLIR code. 

In [4]:
engine.add(mlir_text, passes)

['csr_densify4x4']

## Overview of graphblas.matrix_apply

Here, we'll show how to use the `graphblas.matrix_apply` op. 

`graphblas.matrix_apply` takes one sparse matrix operand in CSR format, a [thunk](https://en.wikipedia.org/wiki/Thunk) operand, and an `apply_operator` attribute.

`graphblas.matrix_apply` applies the function named by the `apply_operator` attribute element-wise, with each element of the matrix and the thunk as its two arguments. The result is a CSR matrix.

Here's an example use of the `graphblas.matrix_apply` op:
```
%answer = graphblas.matrix_apply %sparse_tensor, %thunk { apply_operator = "min" } : (tensor<?x?xf64, #CSR64>, f64) to tensor<?x?xf64, #CSR64>
```

The only currently supported option for the `apply_operator` attribute is "min".

Note that `graphblas.matrix_apply` will fail if the given sparse matrix is not in CSR format.
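To make the semantics described above concrete, here is a minimal NumPy sketch. It is a reference model, not the dialect's actual lowering. It assumes that the op only touches the explicitly stored values of the CSR matrix and leaves the sparsity structure unchanged. The helper name `matrix_apply_min` and the three-array CSR layout are hypothetical and used for illustration only.

```python
import numpy as np

def matrix_apply_min(indptr, col_indices, values, thunk):
    """Hypothetical reference model of apply_operator = "min".

    Assumption: only stored values are transformed; the CSR structure
    (row pointers and column indices) is passed through unchanged.
    """
    return indptr.copy(), col_indices.copy(), np.minimum(values, thunk)

# CSR arrays for a 4x4 matrix with one stored value per row.
indptr = np.array([0, 1, 2, 3, 4], dtype=np.uint64)
col_indices = np.array([3, 3, 0, 0], dtype=np.uint64)
values = np.array([111.0, 222.0, 333.0, 444.0])

_, _, clipped = matrix_apply_min(indptr, col_indices, values, 200.0)
print(clipped)  # [111. 200. 200. 200.]
```

Under that assumption, positions without a stored value stay empty regardless of the thunk.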

Let's create an example input CSR matrix.

In [5]:
# Each row of `indices` is the (row, column) coordinate of one stored value.
indices = np.array(
    [
        [0, 3],
        [1, 3],
        [2, 0],
        [3, 0],
    ],
    dtype=np.uint64,
)
values = np.array([111, 222, 333, 444], dtype=np.float64)
sizes = np.array([4, 4], dtype=np.uint64)
# One flag per dimension, matching dimLevelType = ["dense", "compressed"].
sparsity = np.array([False, True], dtype=np.bool_)

csr_matrix = mlir_graphblas.sparse_utils.MLIRSparseTensor(indices, values, sizes, sparsity)

In [6]:
dense_matrix = engine.csr_densify4x4(csr_matrix)

In [7]:
dense_matrix

array([[  0.,   0.,   0., 111.],
       [  0.,   0.,   0., 222.],
       [333.,   0.,   0.,   0.],
       [444.,   0.,   0.,   0.]])
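As a cross-check that does not depend on the JIT engine, the same dense layout can be rebuilt with plain NumPy. This sketch assumes that each row of `indices` is a `(row, column)` coordinate and that `values` lists the stored entries in the same order. The name `expected_dense` is introduced here for illustration.

```python
import numpy as np

indices = np.array([[0, 3], [1, 3], [2, 0], [3, 0]], dtype=np.uint64)
values = np.array([111, 222, 333, 444], dtype=np.float64)

# Scatter the stored values into a zero-filled 4x4 array.
expected_dense = np.zeros((4, 4), dtype=np.float64)
rows = indices[:, 0].astype(np.intp)  # cast to a signed index type
cols = indices[:, 1].astype(np.intp)
expected_dense[rows, cols] = values

print(expected_dense)
```

In the notebook, `np.array_equal(expected_dense, dense_matrix)` should evaluate to `True` if that reading of the inputs is right.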

## graphblas.matrix_apply (Min)

Here, we'll clip the values of a sparse matrix to be no higher than a given limit.

In [8]:
mlir_text = """
#CSR64 = #sparse_tensor.encoding<{
  dimLevelType = [ "dense", "compressed" ],
  dimOrdering = affine_map<(i,j) -> (i,j)>,
  pointerBitWidth = 64,
  indexBitWidth = 64
}>

module {
    func @clip(%sparse_tensor: tensor<?x?xf64, #CSR64>, %limit: f64) -> tensor<?x?xf64, #CSR64> {
        %answer = graphblas.matrix_apply %sparse_tensor, %limit { apply_operator = "min" } : (tensor<?x?xf64, #CSR64>, f64) to tensor<?x?xf64, #CSR64>
        return %answer : tensor<?x?xf64, #CSR64>
    }
}
"""

In [9]:
engine.add(mlir_text, passes)

['clip']

In [10]:
sparse_result = engine.clip(csr_matrix, 200)

In [11]:
engine.csr_densify4x4(sparse_result)

array([[  0.,   0.,   0., 111.],
       [  0.,   0.,   0., 200.],
       [200.,   0.,   0.,   0.],
       [200.,   0.,   0.,   0.]])

The result looks as expected. Let's verify that it matches the equivalent NumPy computation.

In [12]:
expected_result = dense_matrix.copy()
expected_result[expected_result>200] = 200
np.all(expected_result == engine.csr_densify4x4(sparse_result))

True
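The masked assignment used in the check can also be written with `np.minimum`. The following standalone sketch hard-codes the dense matrix printed earlier so that it runs without the JIT engine, and confirms that the two formulations agree for this input.

```python
import numpy as np

dense = np.array(
    [
        [0.0, 0.0, 0.0, 111.0],
        [0.0, 0.0, 0.0, 222.0],
        [333.0, 0.0, 0.0, 0.0],
        [444.0, 0.0, 0.0, 0.0],
    ]
)
limit = 200.0

# Formulation used in the notebook: overwrite entries above the limit.
masked = dense.copy()
masked[masked > limit] = limit

# Equivalent formulation: element-wise minimum against the scalar limit.
via_minimum = np.minimum(dense, limit)

print(np.array_equal(masked, via_minimum))  # True
```

Both dense formulations also act on the zero entries. They leave them untouched here only because the limit is non-negative. If the op applies the operator to stored values only (an assumption, not something this example demonstrates), a negative limit would make the dense reference differ from the sparse result at the empty positions.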