Description
When using list inputs for (data, (i, j)), the COO matrix constructor infers the data type from the type of the items in data, just as np.array would.
In contrast, when constructing a CSR matrix with a (data, indices, indptr) tuple, either the data value must be an array, or the dtype= keyword argument must be passed to the constructor:
In [30]: i = [0, 1, 1, 1, 1, 2, 3, 4, 4]
In [31]: j = [2, 0, 1, 3, 4, 1, 0, 3, 4]
In [32]: data = [6, 1, 2, 4, 5, 1, 9, 6, 7]
In [33]: indptr = [0, 1, 5, 6, 7, 9]
In [34]: coo = sparse.coo_matrix((data, (i, j)))
In [35]: csr = sparse.csr_matrix((data, j, indptr))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
/Users/nuneziglesiasj/anaconda/envs/elegant/lib/python3.4/site-packages/scipy/sparse/sputils.py in getdtype(dtype, a, default)
116 try:
--> 117 newdtype = a.dtype
118 except AttributeError:
AttributeError: 'list' object has no attribute 'dtype'
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call last)
<ipython-input-35-8de58fac1a34> in <module>()
----> 1 csr = sparse.csr_matrix((data, j, indptr))
/Users/nuneziglesiasj/anaconda/envs/elegant/lib/python3.4/site-packages/scipy/sparse/compressed.py in __init__(self, arg1, shape, dtype, copy)
54 self.indices = np.array(indices, copy=copy, dtype=idx_dtype)
55 self.indptr = np.array(indptr, copy=copy, dtype=idx_dtype)
---> 56 self.data = np.array(data, copy=copy, dtype=getdtype(dtype, data))
57 else:
58 raise ValueError("unrecognized %s_matrix constructor usage" %
/Users/nuneziglesiasj/anaconda/envs/elegant/lib/python3.4/site-packages/scipy/sparse/sputils.py in getdtype(dtype, a, default)
121 canCast = False
122 else:
--> 123 raise TypeError("could not interpret data type")
124 else:
125 newdtype = np.dtype(dtype)
TypeError: could not interpret data type
In [36]: sparse.csr_matrix?
In [37]: csr = sparse.csr_matrix((data, j, indptr), dtype=int)
In [38]: data2 = np.array(data)
In [39]: csr = sparse.csr_matrix((data2, j, indptr))
In [40]: coo.todense()
Out[40]:
matrix([[0, 0, 6, 0, 0],
        [1, 2, 0, 4, 5],
        [0, 1, 0, 0, 0],
        [9, 0, 0, 0, 0],
        [0, 0, 0, 6, 7]])
In [41]: csr.todense()
Out[41]:
matrix([[0, 0, 6, 0, 0],
        [1, 2, 0, 4, 5],
        [0, 1, 0, 0, 0],
        [9, 0, 0, 0, 0],
        [0, 0, 0, 6, 7]])
I would argue that CSR should infer the data type in the same way that COO does, even if the input is a list (or other array-like).
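For reference, here is a minimal sketch of the behaviour I have in mind, written as a user-side workaround rather than a patch to SciPy. The helper name csr_from_lists is hypothetical and not part of scipy.sparse; it simply passes data through np.asarray so that the dtype is inferred from the list items before the constructor sees it.

```python
import numpy as np
from scipy import sparse


def csr_from_lists(data, indices, indptr, shape=None):
    """Hypothetical helper (not a SciPy API): build a CSR matrix from lists.

    np.asarray infers the dtype from the items in `data`, the same way
    np.array would, so the constructor always receives an array.
    """
    data = np.asarray(data)
    return sparse.csr_matrix((data, indices, indptr), shape=shape)


# Same inputs as in the session above.
j = [2, 0, 1, 3, 4, 1, 0, 3, 4]
data = [6, 1, 2, 4, 5, 1, 9, 6, 7]
indptr = [0, 1, 5, 6, 7, 9]

csr = csr_from_lists(data, j, indptr)
dense = csr.toarray().tolist()  # nested lists, convenient for comparison
```

With integer items the resulting matrix has an integer dtype, and with float items a float dtype, which matches what the COO constructor does with list input.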