
Sparse matrix multiplication is too slow #16187

@stefanonardo

Description


I'm comparing torch.sparse matrix multiplication to SciPy, and it is clearly too slow: it can be 100x slower on CPU, which makes it effectively unusable. Running on my GPU improves the timings, but it is still very slow. Below are my benchmarks and the code I used.

CPU: i5 4690k @ 4.4GHz
GPU: GTX 1060

SIZE: 5000 DENSITY: 0.01 DEVICE: cpu
torch: 0.0306358 seconds
np:    0.000252247 seconds
torch/np: 121.452
----------------------------------------
SIZE: 5000 DENSITY: 0.01 DEVICE: cuda
torch: 0.0127137 seconds
np:    0.000259161 seconds
torch/np: 49.057
----------------------------------------

SIZE: 10000 DENSITY: 0.01 DEVICE: cpu
torch: 0.155527 seconds
np:    0.00106144 seconds
torch/np: 146.524
----------------------------------------
SIZE: 10000 DENSITY: 0.01 DEVICE: cuda
torch: 0.0476248 seconds
np:    0.000991583 seconds
torch/np: 48.0291
----------------------------------------

SIZE: 50000 DENSITY: 0.01 DEVICE: cpu
torch: 5.94856 seconds
np:    0.0456181 seconds
torch/np: 130.399
----------------------------------------
SIZE: 50000 DENSITY: 0.01 DEVICE: cuda
torch: 1.06403 seconds
np:    0.0419693 seconds
torch/np: 25.3527
----------------------------------------

SIZE: 50000 DENSITY: 0.0001 DEVICE: cpu
torch: 0.0423768 seconds
np:    0.000562191 seconds
torch/np: 75.3779
----------------------------------------
SIZE: 50000 DENSITY: 0.0001 DEVICE: cuda
torch: 0.0175352 seconds
np:    0.000589371 seconds
torch/np: 29.7524
----------------------------------------

import torch
import numpy as np
from scipy import sparse
import time

size = 50000
density = 0.0001
device = 'cuda'
print('SIZE:', size, 'DENSITY:', density, 'DEVICE:', device)

A = sparse.rand(size, size, format='coo', density=density).astype(np.float32)
b = torch.rand(size, 1, device=device)

# Rebuild A as a torch sparse COO tensor from its values and indices.
values = A.data
indices = np.vstack((A.row, A.col))

i = torch.LongTensor(indices)
v = torch.FloatTensor(values)
shape = A.shape

A_torch = torch.sparse.FloatTensor(i, v, torch.Size(shape)).to(device)

# Time the sparse @ dense product in torch.
s = time.time()
A_torch.mm(b)
t_torch = time.time() - s
print('torch: {:g} seconds'.format(t_torch))

# Time the same product in SciPy (always on CPU).
b = b.cpu().numpy()

s = time.time()
A.dot(b)
t_np = time.time() - s
print('np:    {:g} seconds'.format(t_np))

print('torch/np: {:g}'.format(t_torch / t_np))
print('-'*40)
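
Note on the CUDA rows: CUDA kernels launch asynchronously, so timing a single .mm() call with time.time() and no torch.cuda.synchronize() may not capture the full kernel time (the CPU numbers are unaffected, since CPU ops run synchronously). A minimal sketch of a synchronized, warmed-up GPU timing, assuming a CUDA-enabled PyTorch build and using the torch.sparse_coo_tensor constructor, could look like this:

import time

import numpy as np
import torch
from scipy import sparse

size, density, device = 50000, 0.0001, 'cuda'

A = sparse.rand(size, size, format='coo', density=density).astype(np.float32)
indices = torch.as_tensor(np.vstack((A.row, A.col)), dtype=torch.int64)
values = torch.as_tensor(A.data)
A_torch = torch.sparse_coo_tensor(indices, values, A.shape, device=device)
b = torch.rand(size, 1, device=device)

A_torch.mm(b)                # warm-up run (CUDA context creation, kernel caching)
torch.cuda.synchronize()

s = time.time()
A_torch.mm(b)
torch.cuda.synchronize()     # wait for the kernel before reading the clock
print('torch (synchronized): {:g} seconds'.format(time.time() - s))

The synchronized numbers may differ somewhat from the single-run CUDA timings in the table above; the CPU comparison is unaffected.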

cc @nikitaved @pearu @cpuhrsch @IvanYashchuk

    Labels

    module: sparse (Related to torch.sparse), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
