TACO cannot generally generate parallel code for expressions whose output is stored in a format that does not support random inserts (such as CSR). This branch contains modifications that allow parallel code to be generated in the special case where the output has the same sparsity pattern as one of the inputs. This is needed to evaluate kernels such as SDDMM (sampled dense-dense matrix multiplication) and TTV (tensor-times-vector) where the output is stored in CSR. However, these modifications break taco's ability to be used as a C++ library or to generate correct assembly code, which causes many of taco's test cases to fail. This branch can still be used to generate kernels using the instructions below, but we recommend using the master branch to generate kernels with a dense output.
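To illustrate why a shared sparsity pattern makes parallelization possible, here is a minimal hand-written sketch of SDDMM with a CSR output. It is not the code this branch generates: the `Csr` struct, the `sddmm` function, the row-major dense layout, and the OpenMP pragma are all assumptions made for illustration. Because the output reuses the `pos` and `crd` arrays of the sparse input, every result is written to a slot that is already known, so no inserts are needed and rows can be processed independently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container for illustration; not TACO's storage API.
struct Csr {
    int rows = 0;
    int cols = 0;
    std::vector<int> pos;      // row i owns entries pos[i] .. pos[i+1]-1
    std::vector<int> crd;      // column index of each stored entry
    std::vector<double> vals;  // value of each stored entry
};

// SDDMM: A(i,j) = B(i,j) * sum_k C(i,k) * D(k,j).
// C is rows x K and D is K x cols, both dense and row-major (an assumption).
Csr sddmm(const Csr& B, const std::vector<double>& C,
          const std::vector<double>& D, int K) {
    assert(C.size() == static_cast<std::size_t>(B.rows) * K);
    assert(D.size() == static_cast<std::size_t>(K) * B.cols);

    // The output has exactly B's sparsity pattern, so its index arrays
    // are copied up front and only the values remain to be computed.
    Csr A;
    A.rows = B.rows;
    A.cols = B.cols;
    A.pos = B.pos;
    A.crd = B.crd;
    A.vals.assign(B.vals.size(), 0.0);

    // Each iteration writes only to A.vals[p] for positions p in row i,
    // so rows do not conflict and the loop can run in parallel.
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < B.rows; ++i) {
        for (int p = B.pos[i]; p < B.pos[i + 1]; ++p) {
            const int j = B.crd[p];
            double dot = 0.0;
            for (int k = 0; k < K; ++k) {
                dot += C[i * K + k] * D[k * B.cols + j];
            }
            A.vals[p] = B.vals[p] * dot;
        }
    }
    return A;
}
```

When built with OpenMP enabled (for example `-fopenmp` on GCC or Clang), the outer loop is distributed across threads; without it the pragma is ignored and the loop runs serially with the same result. If the output pattern were not known in advance, each thread would have to append to `crd` and `vals`, which is the random-insert problem described above.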