
Sparsity RFC #1143

Merged 7 commits into openxla:main on Apr 12, 2023

Conversation

@aartbik (Member) commented Feb 10, 2023

A proposal to add sparse tensor types to StableHLO.

@burmako burmako added the RFC label Feb 10, 2023
@burmako burmako self-assigned this Feb 10, 2023
Comment on lines +205 to +213
#BCSR = #sparse_tensor.encoding<
( i = ib * 3 + ii
, j = jb * 2 + jj
) ->
( ib = i floordiv 2 : compressed
, jb = j floordiv 3 : compressed
, ii = i mod 2 : dense
, jj = j mod 3 : dense
)

A reviewer commented:

I really like this concept of expressing the affine mapping of physical (data-structure) indices to logical indices. I do think it's a bit unclear how this mapping will be used to generate good sparse code. For example, how would this mapping enable the compiler to generate an element-wise CSR * BCSR that properly coiterates the j dimension?

@aartbik (Member Author) commented Mar 16, 2023


First, thanks for all your feedback! Much appreciated.

You bring up a good point about the tension between keeping the sparse tensor types very general and the ability of a sparse compiler to ultimately map them to efficient code. I was trying to address this in my "healthy sparse ecosystem" comment below, but should probably spell it out a bit more clearly. The general direction I want to go is that sparse compilers should map any combination of sparsity to functional code, even if some combinations are not very efficient (I am a bit wary of just giving up on the inefficient cases). Finding which combinations work well is then a "healthy" cooperation between the ML developer, the framework that generates StableHLO, and the sparse compiler.
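
For reference, here is a rough sketch of the two layouts under discussion in the upstream MLIR sparse_tensor "map" syntax (the existing upstream notation, not necessarily the one the RFC will adopt), together with a mixed-encoding element-wise op written in the style of the RFC's own examples; the shapes, operand names, and the choice of multiply are illustrative only.

#CSR = #sparse_tensor.encoding<{
  map = (i, j) -> (i : dense, j : compressed)
}>

#BCSR = #sparse_tensor.encoding<{
  map = (i, j) -> (i floordiv 2 : compressed,
                   j floordiv 3 : compressed,
                   i mod 2      : dense,
                   j mod 3      : dense)
}>

// Element-wise op whose operands use different encodings; a sparse
// compiler has to co-iterate the j dimension across both layouts.
%result = stablehlo.multiply %matrix1, %matrix2
    : tensor<10x12xf32, #CSR>,
      tensor<10x12xf32, #BCSR> -> tensor<10x12xf32, #CSR>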

Comment on lines +288 to +290
%result = stablehlo.add %matrix1, %matrix2
: tensor<10x10xf32, #CSR>,
tensor<10x10xf32, #CSC> -> tensor<10x10xf32, #???>

A reviewer commented:

Has there been any consideration given to disallowing certain operations on certain combinations of data structures? Element-wise combining of CSR and CSC matrices requires a discordant traversal of at least one of them, or a data-structure conversion to be inserted, both of which are generally slow.

The reviewer added:

Ah, I should have read further.

> Furthermore, the IR is not necessarily concerned with the efficiency of any of the expressed operations.

It does seem like supporting code generation for all possible combinations may place a lot of strain on compiler writers unless there is a default (e.g., "always insert a conversion if the mode orderings do not line up").
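
As a rough illustration of that default (not part of the RFC; the operand names, shapes, and the choice of #CSR as the common layout are made up), a compiler could materialize a layout conversion with the existing upstream sparse_tensor.convert op before the element-wise op:

// Convert the CSC operand to CSR so both operands share one layout,
// then perform the element-wise op entirely in CSR.
%matrix2_csr = sparse_tensor.convert %matrix2
    : tensor<10x10xf32, #CSC> to tensor<10x10xf32, #CSR>
%result = stablehlo.add %matrix1, %matrix2_csr
    : tensor<10x10xf32, #CSR>,
      tensor<10x10xf32, #CSR> -> tensor<10x10xf32, #CSR>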

@aartbik (Member Author) commented Mar 16, 2023


In my PhD thesis sparse compiler, I had a "panic" default fallback. If none of the efficient techniques worked, it would fall back to simply "looking up" each next element. Very inefficient, but it would at least generate functional code, together with a prompt back to the user that this version would need a better sparsity design. I feel this approach (or something similar) is slightly better than a priori disallowing sparsity combinations at the StableHLO IR level (perhaps some clever researcher finds an efficient way of doing something that was originally deemed too hard ;--)

The reviewer replied:

That seems like a reasonable approach. To clarify, I did not intend to restrict the IR, as that seems particularly difficult to do; I was arguing more for disallowing some types of codegen (i.e. "panic" by failing to codegen), but you make a good point that it is better to generate inefficient but functional code than to fail at codegen.

Comment on lines +342 to +345
into the code. Nevertheless, in a healthy sparse ecosystem,
ML frameworks are expected to generate IR that can ultimately map
to efficient sparse code with the current state-of-the-art
sparse compilation technology.

A reviewer commented:

I won't pretend to be an ML framework expert (I am far from it), but has there been any debate on whether the compiler should simply not support inefficient operations (such as the discordant traversal in my comment above), instead of requiring the upstream frameworks to just not generate inefficient IR?

@aartbik (Member Author) replied

Open to that debate, although, as phrased above, I am leaning towards a feedback loop of always generating functionally correct code, but prompting back (somehow) on cases that were not dealt with efficiently.

@burmako (Contributor) commented Apr 12, 2023

It would be great to have sparsity support in StableHLO, and the proposed high-level vision sounds compelling, especially given that it is based on the work that has been going on for 2.5 years.

By design, this RFC doesn't introduce any changes to StableHLO, so there's nothing specific to approve at this point. However, I think it would be great to merge this PR anyway to capture the importance of sparsity to StableHLO, as well as the community interest in this work.

As a meta comment, evaluating manifesto RFCs is a gap in our RFC process at the moment, and this will need to be addressed in the future, once we complete the transition to OpenXLA governance and roll out a formal RFC process.

Looking forward to subsequent RFCs that would propose to introduce sparse tensor types etc! Figuring out the details would be an interesting discussion to have, and we'll evaluate these RFCs separately.

@burmako burmako merged commit 5313209 into openxla:main Apr 12, 2023
@burmako burmako mentioned this pull request Apr 13, 2023
@aartbik aartbik deleted the bik branch September 26, 2023 19:30
GleasonK pushed a commit that referenced this pull request Apr 2, 2024
After StableHLO => Linalg lowering was implemented in the [linalg
RFC](#1610), most of the ops listed by @aartbik in the [sparsity
RFC](#1143) now generate correct linalg code with sparse encodings.
That code can then be lowered through mlir-opt with the --sparsifier
pipeline and run with mlir-runner.

The `sparse.mlir` file in this PR checks the generated linalg code with
sparse encodings. These are the same tests as for the dense ops, but
with sparse inputs and outputs.

The only op that couldn't be lowered to correct linalg with sparsity is
`stablehlo.concatenate`. Therefore, after discussing with @GleasonK, I
added an `enable-sparse-ops` flag to the linalg pass that rewrites
`stablehlo.concatenate` to `sparse_tensor.concatenate` when the flag is
set to true and the inputs/outputs have sparse encodings. If more
linalg converters need modification, a new sparsity pass will be
created to run before the linalg pass.
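
A rough sketch of that rewrite (the shapes, the #CSR encoding, and the dimension are illustrative and not taken from the PR):

// Before: sparse operands still use the StableHLO op (generic form shown).
%r = "stablehlo.concatenate"(%a, %b) { dimension = 0 : i64 }
    : (tensor<4x8xf32, #CSR>, tensor<4x8xf32, #CSR>) -> tensor<8x8xf32, #CSR>

// After, with enable-sparse-ops set: the op is rewritten to its
// sparse_tensor dialect counterpart, which the sparsifier can lower.
%r = sparse_tensor.concatenate %a, %b { dimension = 0 : index }
    : tensor<4x8xf32, #CSR>, tensor<4x8xf32, #CSR> to tensor<8x8xf32, #CSR>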