Skip to content

COO to CSR tensor conversion is slow #57381

@pearu

Description

@pearu

Originally raised in #50937 (comment)

The COO to CSR tensor conversion is implemented in

pytorch/torch/_tensor.py

Lines 928 to 940 in 3a777b6

if self.is_sparse:
coalesced_self = self.coalesce()
row_indices = coalesced_self.indices()[0]
ro = [0]
i = 0
for irow in range(self.shape[0]):
while i < row_indices.size()[0] and row_indices[i] == irow:
i += 1
ro.append(i)
return torch.sparse_csr_tensor(torch.tensor(ro, dtype=row_indices.dtype),
coalesced_self.indices()[1], coalesced_self.values(),
size=coalesced_self.shape, dtype=coalesced_self.dtype)

and it consists of two time-consuming parts:

  • coalescing if the COO tensor is uncoaleasced
  • row indices compression procedure

Indeed, taking a coalesced COO sample with size=(1000, 1000) and nnz=500, about 1.5% of the to_sparse_csr call time is spend in coalescing and 98% of the time in running the row indices compression loop.

The slowness of the row indices compression loop has two origins:

  • it is implemented in Python
  • it uses a suboptimal algorithm

Indeed, a sample of row indices with size[0] == 1000 and nnz=500, a pure Python algorithm in
https://github.com/pearu/gcs/blob/b54ba0cba9c853b797274ef26b6c42386f2cafa3/gcs/storage.py#L24-L45
is about 3x faster than used in to_sparse_csr, and when applying numba.njit to the compress_indices function, about 83x speedup is achieved compared to to_sparse_csr.

The performance of COO to CSR tensor conversion can be improved by implementing a more efficient row indices compression algorithm (see compress_indices referenced above) in C++.

cc @VitalyFedyunin @ngimel @aocsa @nikitaved @pearu @mruberry

Metadata

Metadata

Assignees

Labels

enhancementNot as big of a feature, but technically not a bug. Should be easy to fixmodule: performanceIssues related to performance, either of kernel code or framework gluemodule: sparseRelated to torch.sparseopen sourcetriagedThis issue has been looked at a team member, and triaged and prioritized into an appropriate module

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions