
Sparse matrix multiplication is too slow #16187

@stefanonardo

Description


I'm comparing torch.sparse matrix multiplication to SciPy, and it is clearly too slow: it can be 100x slower on CPU, which makes it effectively unusable. Running on my GPU improves the timings, but it is still very slow. Below are my benchmarks and the code I used.

CPU: i5 4690k @ 4.4GHz
GPU: GTX 1060

SIZE: 5000 DENSITY: 0.01 DEVICE: cpu
torch: 0.0306358 seconds
np:    0.000252247 seconds
torch/np: 121.452
----------------------------------------
SIZE: 5000 DENSITY: 0.01 DEVICE: cuda
torch: 0.0127137 seconds
np:    0.000259161 seconds
torch/np: 49.057
----------------------------------------

SIZE: 10000 DENSITY: 0.01 DEVICE: cpu
torch: 0.155527 seconds
np:    0.00106144 seconds
torch/np: 146.524
----------------------------------------
SIZE: 10000 DENSITY: 0.01 DEVICE: cuda
torch: 0.0476248 seconds
np:    0.000991583 seconds
torch/np: 48.0291
----------------------------------------

SIZE: 50000 DENSITY: 0.01 DEVICE: cpu
torch: 5.94856 seconds
np:    0.0456181 seconds
torch/np: 130.399
----------------------------------------
SIZE: 50000 DENSITY: 0.01 DEVICE: cuda
torch: 1.06403 seconds
np:    0.0419693 seconds
torch/np: 25.3527
----------------------------------------

SIZE: 50000 DENSITY: 0.0001 DEVICE: cpu
torch: 0.0423768 seconds
np:    0.000562191 seconds
torch/np: 75.3779
----------------------------------------
SIZE: 50000 DENSITY: 0.0001 DEVICE: cuda
torch: 0.0175352 seconds
np:    0.000589371 seconds
torch/np: 29.7524
----------------------------------------

import torch
import numpy as np
from scipy import sparse
import time

size = 50000
density = 0.0001
device = 'cuda'
print('SIZE:', size, 'DENSITY:', density, 'DEVICE:', device)

A = sparse.rand(size, size, format='coo', density=density).astype(np.float32)
b = torch.rand(size, 1, device=device)

# Rebuild A as a torch sparse COO tensor from its values and indices.
values = A.data
indices = np.vstack((A.row, A.col))

i = torch.LongTensor(indices)
v = torch.FloatTensor(values)
shape = A.shape

A_torch = torch.sparse.FloatTensor(i, v, torch.Size(shape)).to(device)

# Time the sparse @ dense product in torch.
s = time.time()
A_torch.mm(b)
t_torch = time.time() - s
print('torch: {:g} seconds'.format(t_torch))

# Time the same product in SciPy (always on CPU).
b = b.cpu().numpy()

s = time.time()
A.dot(b)
t_np = time.time() - s
print('np:    {:g} seconds'.format(t_np))

print('torch/np: {:g}'.format(t_torch / t_np))
print('-'*40)
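
Note on the CUDA rows: CUDA kernels launch asynchronously, so timing a single .mm() call with time.time() and no torch.cuda.synchronize() may not capture the full kernel time (the CPU numbers are unaffected, since CPU ops run synchronously). A minimal sketch of a synchronized, warmed-up GPU timing, assuming a CUDA-enabled PyTorch build and using the torch.sparse_coo_tensor constructor, could look like this:

import time

import numpy as np
import torch
from scipy import sparse

size, density, device = 50000, 0.0001, 'cuda'

A = sparse.rand(size, size, format='coo', density=density).astype(np.float32)
indices = torch.as_tensor(np.vstack((A.row, A.col)), dtype=torch.int64)
values = torch.as_tensor(A.data)
A_torch = torch.sparse_coo_tensor(indices, values, A.shape, device=device)
b = torch.rand(size, 1, device=device)

A_torch.mm(b)                # warm-up run (CUDA context creation, kernel caching)
torch.cuda.synchronize()

s = time.time()
A_torch.mm(b)
torch.cuda.synchronize()     # wait for the kernel before reading the clock
print('torch (synchronized): {:g} seconds'.format(time.time() - s))

The synchronized numbers may differ somewhat from the single-run CUDA timings in the table above; the CPU comparison is unaffected.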

cc @nikitaved @pearu @cpuhrsch @IvanYashchuk

    Labels

    module: sparse (Related to torch.sparse), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
