BUG: sparse in-place operations is not in-place (change data-type and object) #7826

zerothi · 2017-09-04T11:56:13Z

When doing in-place additions of sparse matrices, the data type of the left-hand operand is not retained.

I.e. the sparse matrix that is supposedly modified in place gets upcast.

Reproducing code example:

import numpy as np
import scipy.sparse as sp

cc = sp.csr_matrix((10, 10), dtype=np.float64)
cc[0, 0] = 1.

# WRONG
cf = sp.csr_matrix((10, 10), dtype=np.float32)
cf[0, 0] = 1.
print('float32 == ', cf.dtype)
cf += cc
print('float32 == ', cf.dtype)

# WORKS AS EXPECTED
cf = sp.csr_matrix((10, 10), dtype=np.float32)
cf[0, 0] = 1.
print('float32 == ', cf.dtype)
cf[:, :] += cc
print('float32 == ', cf.dtype)

Error message:

There is no error message; the result is silently upcast. The same happens when adding a complex matrix (say, one with only a real part).

Scipy/Numpy/Python version information:

scipy: 0.19.1
numpy: 1.13.1
Python: 2.7.13 and 3.6.2

pv · 2017-09-04T11:58:22Z

There's no inplace operation for sparse matrices. `a += b` expands to the default Python operation `a = a + b`.
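
A toy sketch of the fallback described above, independent of scipy: `Plain` and `accumulate` are hypothetical names invented for this illustration. When a class offers no usable `__iadd__`, Python evaluates `a += b` as `a = a + b`, so the name is rebound to a new object and the original is left untouched.

```python
class Plain:
    """Hypothetical value type that defines __add__ but no __iadd__."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Always builds a new object; nothing is modified in place.
        return Plain(self.value + other.value)


def accumulate(a, b):
    # Without __iadd__ this is `a = a + b`: only the local name is rebound.
    a += b
    return a


x = Plain(1)
y = Plain(2)
result = accumulate(x, y)

print(result is x)    # False: a new object was created
print(x.value)        # 1: the caller's object is unchanged
print(result.value)   # 3
```

This is why a function that does `A += b` on such an object cannot change the caller's `A` unless it returns the new object.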

zerothi · 2017-09-04T12:07:06Z

So you suggest that it works as expected? I.e. upcasting is an accepted side-effect?

My expectations would be that of numpy (where the above thing works as expected, cf is always np.float32).
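
For comparison, a minimal sketch of the numpy behaviour referred to here, using dense arrays of the same shapes and dtypes as in the report. The variable `alias` is added only to check object identity.

```python
import numpy as np

cc = np.zeros((10, 10), dtype=np.float64)
cc[0, 0] = 1.0

cf = np.zeros((10, 10), dtype=np.float32)
cf[0, 0] = 1.0

alias = cf   # second reference to the same array
cf += cc     # true in-place add: the result is written into cf's buffer

print(cf is alias)   # True: same object
print(cf.dtype)      # float32: dtype of the left-hand side is kept
```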

pv · 2017-09-04T14:55:44Z

That there is no in-place operation indeed differs from numpy arrays. This late in the game, with scipy.sparse 10+ years old, I do not really see a way to change this. The way forward is likely to add new `csr_array` etc. classes, which have fully numpy-array-compatible semantics, and soft-deprecate the `csr_matrix` etc. classes.

matthew-brett · 2017-09-04T14:57:38Z

csr_array etc sounds like an excellent way forwards ...

zerothi · 2017-09-04T18:38:30Z

I agree. This sounds like a better game-plan.

In my own code I had to implement a replacement for csr_matrix, i.e. one with dim == 3, where the first two dimensions form the sparse matrix and the last can have any length, to bypass some of the problems I had with csr_matrix. However, it has other shortcomings... :(

denis-bz · 2017-09-13T13:10:40Z

It would be nice if += could do the equivalent of

def inc( X, Y, c=1. ):
    """ X += c * Y, X Y sparse or dense
    NB not tested on all combinations of: ndarray () (1,) (1,1), numpy matrix, sparse matrix
    """
    if (not hasattr( X, "indices" )  # dense += sparse
            and hasattr( Y, "indices" )):
        # inc an ndarray view, because ndarray += sparse -> matrix --
        X = getattr( X, "A", X ).squeeze()
        X[Y.indices] += c * Y.data
    else:
        X += c * Y  # sparse + different sparse: SparseEfficiencyWarning
    return X

(A matrix of all "X op Y" combinations that work / don't work / upcast
would I think help users and implementers too.)

zerothi · 2018-11-08T09:55:02Z

I have just rediscovered this bug (one forgets over time).

I think the proper solution (which may break backward compatibility) would be to raise NotImplementedError when users try to use in-place operations.

I.e. a user may expect:

import numpy as np
from scipy.sparse import csr_matrix, diags
def func(A, b):
    A += b
A = csr_matrix((10, 10))
A[1, 2] = 1
b = diags(np.arange(10))
func(A, b)
print(A)

this will yield:

  (1, 2)	1.0

it should however yield:

  (1, 1)	1.0
  (1, 2)	1.0
  (2, 2)	2.0
  (3, 3)	3.0
  (4, 4)	4.0
  (5, 5)	5.0
  (6, 6)	6.0
  (7, 7)	7.0
  (8, 8)	8.0
  (9, 9)	9.0

The above func will work correctly for numpy arrays.

PS. for others, my current fix is this (which only works for CSC/CSR matrices):

def func(A, b):
    AA = A + b
    # Restore data in the A array
    A.indices = AA.indices
    A.indptr = AA.indptr
    A.data = AA.data
    del AA
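
A variant of this workaround that also restores the original dtype could look like the sketch below. `iadd_keep_object` is a hypothetical helper name, not a scipy function, and like the workaround above it assumes CSR/CSC operands of equal shape. It is still a "copy back" emulation: new index and data arrays are allocated, and only the Python object and its dtype are preserved.

```python
import numpy as np
import scipy.sparse as sp


def iadd_keep_object(A, B):
    """Emulate ``A += B`` for CSR/CSC, keeping A's identity and dtype.

    Hypothetical helper for illustration only.
    """
    if A.format not in ("csr", "csc"):
        raise TypeError("only CSR/CSC matrices are handled in this sketch")
    # Out-of-place add, then convert back to A's format and dtype.
    result = (A + B).asformat(A.format).astype(A.dtype)
    # Swap the internal arrays so every reference to A sees the sum.
    A.indices = result.indices
    A.indptr = result.indptr
    A.data = result.data
    return A


A = sp.csr_matrix(([1.0], ([1], [2])), shape=(4, 4), dtype=np.float32)
b = sp.diags(np.arange(4.0))   # float64 diagonal matrix

alias = A
iadd_keep_object(A, b)

print(A is alias)   # True
print(A.dtype)      # float32
```

Casting back with `astype` silently drops precision (or the imaginary part, for complex input), so a real implementation would probably want to check the cast first.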

rgommers · 2018-11-11T04:25:56Z

I don't think we want to break backwards compat at this point. At most maybe a warning if a user uses an in-place operation and gets a different dtype back (if this can be checked without costing too much performance).

pydata/sparse#146 adds support for in-place operations to pydata/sparse. I don't see an explicit test for preserving dtype though, and the implementation is similarly a convenience rather than a real in-place operation. So it may suffer from the same issue. @hameerabbasi do you know?

hameerabbasi · 2018-11-11T09:29:51Z

Hi! There’s no real way that I know of to do in-place addition on Sparse arrays. The reason is that the size of the arrays may change so there’s no real way to do it.

That said, PyData/Sparse does “faux in-place” operations like @pv described, mainly for compatibility with NumPy.

The way it’s laid out currently, we do not preserve the dtype, but preserving that is a few line change at the very most. Since PyData/Sparse prefers NumPy compatibility over all else, doing this is certainly desired for us.

ev-br · 2018-11-11T09:49:17Z

It's certainly possible to do in-place binops on some formats of sparse arrays. E.g. DOK-type arrays, https://github.com/ev-br/sparr/blob/master/sparr/sp_map.h#L312

zerothi · 2018-11-11T11:21:09Z

@hameerabbasi I don't see a problem with doing in-place additions since the problem is that the object retaining the data should be the same, see e.g. my func which is basically __iadd__. I see that you cannot ensure the sparsity pattern to be the same, but that is fine.

hameerabbasi · 2018-11-11T11:24:10Z

Yes, the object remains the same, but all the arrays and memory inside it change. That's why I call them "faux in-place" operations.

As for maintaining the dtype, I've added an issue pydata/sparse#205 and a PR pydata/sparse#206 to fix it and bring behavior in line with NumPy.

zerothi · 2018-11-11T11:39:22Z

Ok, I see.

cdouglass added the scipy.sparse label Sep 12, 2017

zerothi changed the title ~~BUG: Inplace additions of sparse matrices changes data-type~~ BUG: sparse in-place operations is not in-place (change data-type and object) Nov 8, 2018

rgommers added the needs-decision label Nov 11, 2018

perimosocordiae mentioned this issue Apr 14, 2020

Operations between numpy.array and scipy.sparse matrix return inconsistent array type #7510

Closed
