matmul produces wrong (empty) results on ROCm builds #294

On the `master` branch, the `matmul` function returns what look like uninitialized results on ROCm builds. It is relatively easy to reproduce with the following script:

```
import numpy
import torch
from torch_sparse.matmul import matmul
from torch_sparse.tensor import SparseTensor


def main():
    # ROCm builds of PyTorch expose the GPU under the "cuda" device name.
    src = torch.from_numpy(numpy.ones((3, 3)).astype("f4")).to("cuda:0")
    other = torch.from_numpy(numpy.ones((3, 3)).astype("f4")).to("cuda:0")
    src = SparseTensor.from_dense(src)
    out = matmul(src, other)  # default reduce is "sum"
    print(out)  # expected: a 3x3 tensor filled with 3.0


if __name__ == "__main__":
    main()
```

With `reduce="sum"`, `out` is all 0 or all `nan`. With `reduce="min"`, `out` is filled with the maximum value that the output buffer is initialized with.
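For comparison, here is a minimal pure-Python sketch of what a CSR sparse-times-dense product with a reduction should return for the inputs above. It is a reference written for this report, not `torch_sparse`'s implementation. The helper names `dense_to_csr` and `spmm_reference` are hypothetical, and returning 0 for rows without nonzeros is an assumption.

```python
# Reference semantics for a sparse (CSR) @ dense product with a reduction.
# Assumption: out[i][k] = reduce over nonzeros j of value[i][j] * other[j][k].

def dense_to_csr(dense):
    """Convert a list-of-lists matrix to (rowptr, col, value), skipping zeros."""
    rowptr, col, value = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                col.append(j)
                value.append(v)
        rowptr.append(len(col))
    return rowptr, col, value


def spmm_reference(rowptr, col, value, other, reduce="sum"):
    """Compute the reduced product row by row on the CPU."""
    num_cols = len(other[0])
    out = []
    for i in range(len(rowptr) - 1):
        start, end = rowptr[i], rowptr[i + 1]
        row_out = []
        for k in range(num_cols):
            terms = [value[e] * other[col[e]][k] for e in range(start, end)]
            if not terms:
                row_out.append(0.0)  # assumption: empty rows yield 0
            elif reduce == "sum":
                row_out.append(float(sum(terms)))
            elif reduce == "mean":
                row_out.append(sum(terms) / len(terms))
            elif reduce == "min":
                row_out.append(float(min(terms)))
            elif reduce == "max":
                row_out.append(float(max(terms)))
            else:
                raise ValueError(f"unsupported reduce: {reduce!r}")
        out.append(row_out)
    return out


# Same inputs as the reproduction script: two 3x3 matrices of ones.
ones = [[1.0] * 3 for _ in range(3)]
rowptr, col, value = dense_to_csr(ones)
expected_sum = spmm_reference(rowptr, col, value, ones, reduce="sum")
expected_min = spmm_reference(rowptr, col, value, ones, reduce="min")
print(expected_sum)  # every entry is 3.0
print(expected_min)  # every entry is 1.0
```

A correct build should therefore print a tensor of 3.0 for the default `sum` reduction and 1.0 for `min`, which makes the all-zero, `nan`, and maximum-value outputs easy to recognize as untouched or garbage output buffers.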

I am not sure whether the bug is rooted in `pytorch_sparse` itself or in ROCm, since the unit tests all pass in related libraries (e.g. `pytorch_scatter`). Interestingly, the result depends on the optimization level used to compile the kernel code: with `-O2`, `out` is filled with 0; with `-O1`, `out` is filled with `nan`; with `-O0`, the program simply hangs due to an error reported by `amdgpu`.

Any debugging pointers would be helpful.
