toarray on sparse matrices can't handle dtype=bool (Trac #1533) #2058

Closed
scipy-gitbot opened this issue Apr 25, 2013 · 5 comments · Fixed by #2527
Labels: defect (A clear bug or issue that prevents SciPy from being installed or used as expected), Migrated from Trac, scipy.sparse

Comments

@scipy-gitbot

Original ticket http://projects.scipy.org/scipy/ticket/1533 on 2011-10-08 by trac user larsmans, assigned to @wnbell.

It seems toarray on sparse matrices doesn't work if they were constructed with an explicit dtype=bool.

The following works:

>>> import numpy as np
>>> from scipy.sparse import *
>>> X = np.ones(10, dtype=bool)
>>> csr_matrix(X).toarray()
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1]], dtype=int8)

But now the dtype has changed. I guess the space requirements are the same, but it prints 1 instead of True. So,

>>> csr_matrix(X, dtype=bool)
<1x10 sparse matrix of type '<type 'numpy.bool_'>'
        with 10 stored elements in Compressed Sparse Row format>

Ok, that works too, but I want to print the elements...

>>> csr_matrix(X, dtype=bool).toarray()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/scratch/lars/apps/lib/python2.6/site-packages/scipy-0.9.0-py2.6-linux-x86_64.egg/scipy/sparse/compressed.py", line 550, in toarray
    return self.tocoo(copy=False).toarray()
  File "/scratch/lars/apps/lib/python2.6/site-packages/scipy-0.9.0-py2.6-linux-x86_64.egg/scipy/sparse/coo.py", line 221, in toarray
    coo_todense(M, N, self.nnz, self.row, self.col, self.data, B.ravel())
  File "/scratch/lars/apps/lib/python2.6/site-packages/scipy-0.9.0-py2.6-linux-x86_64.egg/scipy/sparse/sparsetools/coo.py", line 172, in coo_todense
    return _coo.coo_todense(*args)
TypeError: Array of type 'byte' required.  Array of type 'bool' given

The problem seems to be somewhere in the coo_matrix code; the same happens with all other sparse matrix types, except for lil_matrix:

>>> lil_matrix(X, dtype=bool).toarray()
array([[ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True]], dtype=bool)
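
For anyone who wants to check which formats are affected on their own install, here is a minimal sketch. The helper name `toarray_dtypes` and the choice of formats are illustrative assumptions, not part of SciPy. On the SciPy 0.9.0 build from the report, most entries would presumably hold the TypeError shown above; on releases that include the fix, each entry should be `bool`.

```python
import numpy as np
from scipy import sparse

# Formats to probe; an illustrative subset, not an exhaustive list.
FORMATS = ("csr", "csc", "coo", "lil")


def toarray_dtypes(dense):
    """Map each format name to the dtype returned by toarray(), or to the error text."""
    report = {}
    for name in FORMATS:
        sparse_cls = getattr(sparse, name + "_matrix")
        try:
            result = sparse_cls(dense, dtype=dense.dtype).toarray()
            report[name] = str(result.dtype)
        except TypeError as exc:
            # Older builds fail here for bool data; record the message instead of raising.
            report[name] = "TypeError: %s" % exc
    return report


X = np.ones((1, 10), dtype=bool)
report = toarray_dtypes(X)
for name in FORMATS:
    print(name, "->", report[name])
```

Recording the error text instead of re-raising keeps the loop going, so a single run shows every format side by side.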
@scipy-gitbot
Author

trac user agrothberg wrote on 2013-03-05

I am seeing the same issue with:

>>> csr_matrix(numpy.array([True, True, False]), dtype=numpy.bool)
<1x3 sparse matrix of type '<type 'numpy.bool_'>'
    with 2 stored elements in Compressed Sparse Row format>

and then

>>> csr_matrix(numpy.array([True, True, False]), dtype=numpy.bool).toarray()

TypeError                                 Traceback (most recent call last)
<ipython-input-31-3cea59b044f4> in <module>()
----> 1 csr_matrix(numpy.array([True, True, False]),dtype=numpy.bool).toarray()

/usr/lib/python2.7/dist-packages/scipy/sparse/compressed.pyc in toarray(self)
    548 
    549     def toarray(self):
--> 550         return self.tocoo(copy=False).toarray()
    551 
    552     ##############################################################

/usr/lib/python2.7/dist-packages/scipy/sparse/coo.pyc in toarray(self)
    238         B = np.zeros(self.shape, dtype=self.dtype)
    239         M,N = self.shape
--> 240         coo_todense(M, N, self.nnz, self.row, self.col, self.data, B.ravel())
    241         return B
    242 

/usr/lib/python2.7/dist-packages/scipy/sparse/sparsetools/coo.pyc in coo_todense(*args)
    170         npy_clongdouble_wrapper Bx)
    171     """
--> 172     return _coo.coo_todense(*args)
    173 
    174 def coo_matvec(*args):

TypeError: Array of type 'byte' required.  Array of type 'bool' given

Why does the type show up as numpy.bool_ rather than numpy.bool?
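
On the naming question: in NumPy releases of that era, `numpy.bool` was just an alias for Python's built-in `bool`, while `numpy.bool_` is NumPy's own boolean scalar type. An array dtype always reports the NumPy scalar type, whichever spelling was passed in. The sketch below avoids `numpy.bool` itself, because that name has changed meaning across NumPy versions.

```python
import numpy as np

# Passing the built-in bool as a dtype resolves to NumPy's boolean dtype.
bool_dtype = np.dtype(bool)

# The scalar type behind that dtype is numpy.bool_, which is what the repr shows.
print(bool_dtype.type is np.bool_)

# NumPy's boolean scalar is a separate type, not a subclass of Python's bool.
print(isinstance(np.True_, np.bool_))
print(isinstance(np.True_, bool))
```

So the repr in the sparse matrix summary is expected and is unrelated to the TypeError.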

@scipy-gitbot
Author

trac user cowlicks wrote on 2013-04-17

If you instantiate a csr_matrix with an ndarray, what basically happens is:

class csr_matrix(_cs_matrix):
    ...

class _cs_matrix(...):
    def __init__(self, arg1, ...):
        self.format = 'csr'
        if isspmatrix(arg1):
            if arg1.format == self.format and copy:
                ...
            else:
                arg1 = arg1.asformat(self.format)
            self._set_self( arg1 )
        else:
            # dense input such as an ndarray: convert via coo_matrix first
            self._set_self( self.__class__(coo_matrix(arg1, dtype=dtype)) )

So the ndarray is passed through _cs_matrix, where it becomes a coo_matrix, and is then passed through _cs_matrix again. Since coo_matrix.format == self.format is False, the coo_matrix is cast to csr_matrix by arg1.asformat(self.format). asformat uses tocsr, which is where this happens:

data    = np.empty(self.nnz, dtype=upcast(self.dtype))
...
return csr_matrix((data, ...))

The upcast changes the dtype from bool to int8.

This is odd behavior. Why is the dtype being changed? Is there a reason a csr_matrix should not have a bool dtype? Or is this just a good example of why we need to specify boolean data type handling?
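
To make the effect of the upcast concrete, here is a toy sketch of the mechanism. `toy_upcast` and `TOY_SUPPORTED` are hypothetical names, not SciPy's actual implementation. The assumption is that an upcast helper returns the first dtype, from a list of types the compiled routines accept, that the input can be safely cast to. If bool is missing from that list, int8 is the first match.

```python
import numpy as np

# Hypothetical list of dtypes accepted by compiled routines; bool is deliberately absent.
TOY_SUPPORTED = [np.int8, np.int16, np.int32, np.int64, np.float32, np.float64]


def toy_upcast(dtype):
    """Return the first supported dtype that `dtype` can be safely cast to."""
    for candidate in TOY_SUPPORTED:
        if np.can_cast(dtype, candidate):
            return np.dtype(candidate)
    raise TypeError("no supported conversion for %r" % (dtype,))


data = np.ones(3, dtype=bool)
upcast_data = data.astype(toy_upcast(data.dtype))

print(toy_upcast(np.dtype(bool)))        # bool has no entry of its own, so int8 is chosen
print(toy_upcast(np.dtype(np.float32)))  # float32 is supported and maps to itself
print(upcast_data.dtype)                 # the stored data is no longer bool
```

Under that assumption the stored data ends up as int8 while the output buffer in toarray is still allocated with the declared bool dtype. That would be consistent with the "Array of type 'byte' required. Array of type 'bool' given" message in the tracebacks.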

@rgommers
Member

@larsmans gh-2527 closes this issue; I guess you'll want to check that out.

@cowlicks
Contributor

Indeed this can be closed.

In [1]: import numpy as np

In [2]: from scipy.sparse import *

In [3]: X = np.ones(10, dtype=bool)

In [4]: csr_matrix(X).toarray()
Out[4]: 
array([[ True,  True,  True,  True,  True,  True,  True,  True,  True,
         True]], dtype=bool)

@pv closed this as completed Aug 12, 2013
@larsmans
Contributor

I see there's a unit test in c6ea8cc, thanks!
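
For reference, a regression check along these lines could look like the sketch below. It is not the test from c6ea8cc, only an illustration of the property being guarded. `check_bool_roundtrip` is a hypothetical helper name. It assumes a SciPy release that includes the fix.

```python
import numpy as np
from scipy import sparse


def check_bool_roundtrip(sparse_cls):
    """Dense bool -> sparse -> dense should keep both the dtype and the values."""
    dense = np.array([[True, False, True],
                      [False, False, True]])
    result = sparse_cls(dense, dtype=bool).toarray()
    assert result.dtype == np.bool_
    assert np.array_equal(result, dense)
    return result


# Classes named in this thread, plus csc as the column-oriented counterpart of csr.
for cls in (sparse.csr_matrix, sparse.csc_matrix, sparse.coo_matrix, sparse.lil_matrix):
    check_bool_roundtrip(cls)
```

Using a mixed True/False pattern rather than all ones also checks that unstored entries come back as False.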
