Bug in SparseArray.__array_ufunc__ for reduce #27080

Closed
TomAugspurger opened this issue Jun 27, 2019 · 2 comments · Fixed by #27890

Comments

@TomAugspurger
Contributor

commented Jun 27, 2019

Code Sample, a copy-pastable example if possible

In [2]: a = pd.SparseArray([0, 10, 1])

In [3]: np.maximum.reduce(a)
Out[3]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/Envs/pandas-dev/lib/python3.7/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/Envs/pandas-dev/lib/python3.7/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    400                         if cls is not object \
    401                                 and callable(cls.__dict__.get('__repr__')):
--> 402                             return _repr_pprint(obj, self, cycle)
    403
    404             return _default_pprint(obj, self, cycle)

~/Envs/pandas-dev/lib/python3.7/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    695     """A pprint that just redirects to the normal repr function."""
    696     # Find newlines and replace them with p.break_()
--> 697     output = repr(obj)
    698     for idx,output_line in enumerate(output.splitlines()):
    699         if idx:

~/sandbox/pandas/pandas/core/arrays/sparse.py in __repr__(self)
   1815     def __repr__(self):
   1816         return '{self}\nFill: {fill}\n{index}'.format(
-> 1817             self=printing.pprint_thing(self),
   1818             fill=printing.pprint_thing(self.fill_value),
   1819             index=printing.pprint_thing(self.sp_index))

~/sandbox/pandas/pandas/io/formats/printing.py in pprint_thing(thing, _nest_lvl, escape_chars, default_escapes, quote_strings, max_seq_items)
    215         result = _pprint_seq(thing, _nest_lvl, escape_chars=escape_chars,
    216                              quote_strings=quote_strings,
--> 217                              max_seq_items=max_seq_items)
    218     elif isinstance(thing, str) and quote_strings:
    219         result = "'{thing}'".format(thing=as_escaped_unicode(thing))

~/sandbox/pandas/pandas/io/formats/printing.py in _pprint_seq(seq, _nest_lvl, max_seq_items, **kwds)
    111     r = [pprint_thing(next(s),
    112                       _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
--> 113          for i in range(min(nitems, len(seq)))]
    114     body = ", ".join(r)
    115

~/sandbox/pandas/pandas/io/formats/printing.py in <listcomp>(.0)
    111     r = [pprint_thing(next(s),
    112                       _nest_lvl + 1, max_seq_items=max_seq_items, **kwds)
--> 113          for i in range(min(nitems, len(seq)))]
    114     body = ", ".join(r)
    115

~/sandbox/pandas/pandas/core/arrays/base.py in __iter__(self)
    283         # calls to ``__getitem__``, which may be slower than necessary.
    284         for i in range(len(self)):
--> 285             yield self[i]
    286
    287     # ------------------------------------------------------------------------

~/sandbox/pandas/pandas/core/arrays/sparse.py in __getitem__(self, key)
   1092
   1093         if is_integer(key):
-> 1094             return self._get_val_at(key)
   1095         elif isinstance(key, tuple):
   1096             data_slice = self.to_dense()[key]

~/sandbox/pandas/pandas/core/arrays/sparse.py in _get_val_at(self, loc)
   1135             return self.fill_value
   1136         else:
-> 1137             return libindex.get_value_at(self.sp_values, sp_loc)
   1138
   1139     def take(self, indices, allow_fill=False, fill_value=None):

TypeError: Argument 'arr' has incorrect type (expected numpy.ndarray, got numpy.int64)

In [4]: result = np.maximum.reduce(a)

In [5]: type(result)
Out[5]: pandas.core.arrays.sparse.SparseArray

`result` should be the scalar 10, not a SparseArray.
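For illustration, here is a minimal sketch of the behaviour I'd expect, using a toy container rather than pandas itself. `ToySparse` and everything in it are hypothetical names made up for this example, and this is not the code in `pandas/core/arrays/sparse.py`. The idea is just that `__array_ufunc__` should look at `method` and avoid re-wrapping the output of a `reduce`. The sketch ignores `out=` and ufuncs with multiple outputs.

```python
import numpy as np


class ToySparse:
    """Hypothetical stand-in for a sparse array: stores only non-fill values."""

    def __init__(self, data, fill_value=0):
        dense = np.asarray(data)
        self.fill_value = fill_value
        self.dtype = dense.dtype
        self.length = len(dense)
        mask = dense != fill_value
        self.sp_index = np.flatnonzero(mask)  # positions of the stored values
        self.sp_values = dense[mask]          # the stored values themselves

    def to_dense(self):
        out = np.full(self.length, self.fill_value, dtype=self.dtype)
        out[self.sp_index] = self.sp_values
        return out

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap any ToySparse inputs to plain ndarrays before dispatching.
        dense_inputs = tuple(
            x.to_dense() if isinstance(x, ToySparse) else x for x in inputs
        )
        result = getattr(ufunc, method)(*dense_inputs, **kwargs)
        if method == "reduce":
            # A reduction over a 1-D array collapses to a scalar,
            # so hand it back as-is instead of wrapping it again.
            return result
        # Elementwise calls keep the container type.
        return ToySparse(result, fill_value=self.fill_value)


a = ToySparse([0, 10, 1])
reduced = np.maximum.reduce(a)   # a NumPy scalar, not a ToySparse
clipped = np.maximum(a, 5)       # still a ToySparse
```

Densifying first is the simplest thing that works for the sketch; a real implementation would presumably want to operate on the stored values and the fill value directly.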

@jbrockmendel

Member

commented Jun 27, 2019

It looks like there is also a problem with __repr__? Or is the SparseArray constructor accepting something it shouldn't?

@TomAugspurger

Contributor Author

commented Jun 27, 2019

@jbrockmendel jbrockmendel referenced this issue Aug 13, 2019
4 of 5 tasks complete

@jreback jreback added this to the 1.0 milestone Aug 13, 2019

3 participants