BUG: CSR/CSC Division Yields COO #20169

ilan-gold · 2024-03-01T10:37:38Z

Describe your issue.

I would expect dividing a CSR or CSC matrix by an ndarray of the same size would yield a CSR/CSC matrix since the operation should just scale data but instead we get COO. I see there's an in place row scale that should be similar except it doesn't need to broadcast. But the problem then becomes one of just indexing the divisor, and then calling this?

Reproducing Code Example

from scipy import sparse as sp
import numpy as np

dividend = sp.random(1_000, 10_000, density=0.01, random_state=0, format="csr")
divisor = np.ones((1_000, 10_000))

dividend / divisor # coo
sp.csr_array(dividend) / divisor # coo again

Error message

N/A

SciPy/NumPy/Python version and system information

1.12.0 1.26.3 sys.version_info(major=3, minor=11, micro=6, releaselevel='final', serial=0)
Build Dependencies:
  blas:
    detection method: pkgconfig
    found: true
    include directory: /opt/arm64-builds/include
    lib directory: /opt/arm64-builds/lib
    name: openblas
    openblas configuration: USE_64BITINT= DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS=
      NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= SANDYBRIDGE MAX_THREADS=3
    pc file directory: /opt/arm64-builds/lib/pkgconfig
    version: 0.3.21.dev
  lapack:
    detection method: pkgconfig
    found: true
    include directory: /opt/arm64-builds/include
    lib directory: /opt/arm64-builds/lib
    name: openblas
    openblas configuration: USE_64BITINT= DYNAMIC_ARCH=1 DYNAMIC_OLDER= NO_CBLAS=
      NO_LAPACK= NO_LAPACKE= NO_AFFINITY=1 USE_OPENMP= SANDYBRIDGE MAX_THREADS=3
    pc file directory: /opt/arm64-builds/lib/pkgconfig
    version: 0.3.21.dev
  pybind11:
    detection method: config-tool
    include directory: unknown
    name: pybind11
    version: 2.11.1
Compilers:
  c:
    commands: cc
    linker: ld64
    name: clang
    version: 14.0.0
  c++:
    commands: c++
    linker: ld64
    name: clang
    version: 14.0.0
  cython:
    commands: cython
    linker: cython
    name: cython
    version: 3.0.8
  fortran:
    commands: gfortran
    linker: ld64
    name: gcc
    version: 12.1.0
  pythran:
    include directory: ../../pip-build-env-ewarq4oq/overlay/lib/python3.11/site-packages/pythran
    version: 0.15.0
Machine Information:
  build:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
  cross-compiled: false
  host:
    cpu: aarch64
    endian: little
    family: aarch64
    system: darwin
Python Information:
  path: /private/var/folders/76/zy5ktkns50v6gt5g8r0sf6sc0000gn/T/cibw-run-73jfxkej/cp311-macosx_arm64/build/venv/bin/python
  version: '3.11'

The text was updated successfully, but these errors were encountered:

ilan-gold · 2024-03-01T10:38:19Z

cc: @ivirshup

ilan-gold · 2024-03-01T10:41:27Z

For example, could we do:

dividend.data / divisor[dividend.nonzero()]

To get the new data?

ivirshup · 2024-03-01T11:09:44Z

I would add:

CSR / scalar -> CSR, but CSR / vector -> COO
the behaviour occurs for broadcast multiply, so I assume is using a common code path.

from scipy import sparse
import numpy as np

csr = sparse.random_array((100, 100), format="csr", density=0.01)
type(csr / 2)
# scipy.sparse._csr.csr_array
type(csr / np.full(100, 2))
# scipy.sparse._coo.coo_array

type(csr * 2)
# scipy.sparse._csr.csr_array
type(csr * np.full(100, 2))
# scipy.sparse._coo.coo_array

perimosocordiae · 2024-03-01T14:19:08Z

The code path for the CSR / vector case does this (given inputs csr and vector):

recip = np.true_divide(1., vector)
return csr.multiply(recip)

Then, in the handling of multiply, we do:

coo = csr.tocoo()
coo.data = np.multiply(coo.data, recip[coo.row, coo.col])
return coo

This strips away a bunch of branches for shape-checking and other stuff, but that's the core of the operation.

perimosocordiae · 2024-03-01T14:22:17Z

I wouldn't necessarily call this a correctness bug, because we don't generally guarantee that sparse formats will be preserved, but I can see a case for it being a performance bug in some situations.

Note: using csr.nonzero() to index into the dense argument would be slower, because it's implemented using tocoo() under the hood.

ivirshup · 2024-03-01T14:50:12Z

I would also say more performance bug, and probably unexpected behaviours

because we don't generally guarantee that sparse formats will be preserved

Based on that, would changing this be an acceptable change in behavior?

For reference, here is how scikit-learn implements the similar inplace_row_scale functions:

inplace_csr_column_scale isX.data *= scale.take(X.indices, mode="clip")
inplace_csr_row_scale is X.data *= np.repeat(scale, np.diff(X.indptr))

So pure python, though with allocations we could skip in compiled code. But can look like:

from scipy import sparse
import numpy as np
from operator import mul, truediv

def broadcast_csr_by_vec(X, vec, op, axis):
    if axis == 0:
        new_data = op(X.data, np.repeat(vec, np.diff(X.indptr)))
    elif axis == 1:
        new_data = op(X.data, vec.take(X.indices, mode="clip"))
    return X._with_data(new_data)

With some promise for performance:

In [28]: X = sparse.random_array((10_000, 1_000), format="csr", density=0.1)
    ...: y = np.random.randn(10_000)

In [29]: broadcast_csr_by_vec(X, y, mul, axis=0) != (X * y[:, None])
Out[29]: 
<10000x1000 sparse array of type '<class 'numpy.bool_'>'
	with 0 stored elements in Compressed Sparse Row format>

In [30]: %timeit broadcast_csr_by_vec(X, y, mul, axis=0)
4.89 ms ± 453 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [31]: %timeit (X * y[:, None])
18.9 ms ± 974 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

ivirshup · 2024-03-05T13:25:42Z

Added bonus of the approach above: since we're not "faking" division the results are actually equivalent to numpy (which they currently aren't):

dense_op = X.toarray() / y[:, None]
sparse_op = (X / y[:, None]).toarray()
new_op = broadcast_csr_by_vec(X, y, truediv, axis=0).toarray()

np.testing.assert_array_equal(dense_op, new_op)
assert not np.array_equal(dense_op, sparse_op)

ivirshup · 2024-03-05T13:31:31Z

I think a PR to fix this may be a little bigger than expected due to the number of cases that currently return COO, though don't have to. For example, currently $CSR_{m,n} * Dense_{m, n} \rightarrow COO_{m,n}$. This could (with fewer allocations) result in CSR

ilan-gold added the defect A clear bug or issue that prevents SciPy from being installed or used as expected label Mar 1, 2024

lucascolley added the scipy.sparse label Mar 1, 2024

ivirshup mentioned this issue Mar 5, 2024

(feat): pre-processing functions for dask with sparse chunks scverse/scanpy#2856

Merged

3 tasks

ivirshup mentioned this issue Mar 28, 2024

updates sparse scale scverse/scanpy#2942

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: CSR/CSC Division Yields COO #20169

BUG: CSR/CSC Division Yields COO #20169

ilan-gold commented Mar 1, 2024 •

edited

ilan-gold commented Mar 1, 2024

ilan-gold commented Mar 1, 2024

ivirshup commented Mar 1, 2024 •

edited

perimosocordiae commented Mar 1, 2024

perimosocordiae commented Mar 1, 2024

ivirshup commented Mar 1, 2024 •

edited

ivirshup commented Mar 5, 2024

ivirshup commented Mar 5, 2024

BUG: CSR/CSC Division Yields COO #20169

BUG: CSR/CSC Division Yields COO #20169

Comments

ilan-gold commented Mar 1, 2024 • edited

Describe your issue.

Reproducing Code Example

Error message

SciPy/NumPy/Python version and system information

ilan-gold commented Mar 1, 2024

ilan-gold commented Mar 1, 2024

ivirshup commented Mar 1, 2024 • edited

perimosocordiae commented Mar 1, 2024

perimosocordiae commented Mar 1, 2024

ivirshup commented Mar 1, 2024 • edited

ivirshup commented Mar 5, 2024

ivirshup commented Mar 5, 2024

ilan-gold commented Mar 1, 2024 •

edited

ivirshup commented Mar 1, 2024 •

edited

ivirshup commented Mar 1, 2024 •

edited