Categorical fillna with custom objects raises TypeError #21097

Closed
jbrockmendel opened this issue May 17, 2018 · 2 comments
Labels
Categorical, good first issue, Missing-data, Regression

@jbrockmendel
Member

This behavior appears new in 0.23.0.

Setup copied from #21002:

import pandas as pd
from pandas.core.base import StringMixin

class County(StringMixin):
    name = u'San Sebastián'
    state = u'PR'
    def __unicode__(self):
        return self.name + u', ' + self.state

cat = pd.Categorical([County() for n in range(61)])
>>> cat.fillna(cat[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/pandas/util/_decorators.py", line 177, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/pandas/core/arrays/categorical.py", line 1769, in fillna
    '"{0}"'.format(type(value).__name__))
TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "County"

I don't see any reason why passing one of the categories to fillna should be disallowed, so the check for what qualifies as a scalar looks too strict.

@jbrockmendel
Member Author

@TomAugspurger I think this might be up your alley.

@TomAugspurger
Contributor

Comes from #19684

We check is_scalar(value), and for County that's false.
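
To illustrate why the check rejects these values, here is a minimal sketch using the public `pandas.api.types.is_scalar`. The `County` class below is a simplified, hypothetical stand-in for the one in the report (it drops `StringMixin`), and the `results` dict is only for illustration.

```python
from pandas.api.types import is_scalar


class County:
    """Simplified stand-in for the custom class in the report (assumption)."""

    name = "San Sebastián"
    state = "PR"

    def __repr__(self):
        return f"{self.name}, {self.state}"


# is_scalar recognises built-in and NumPy scalar types, but not
# tuples or instances of arbitrary user-defined classes.
results = {
    "int": is_scalar(1),
    "str": is_scalar("a"),
    "tuple": is_scalar((0, 1)),
    "County": is_scalar(County()),
}
print(results)
```

Tuples and custom objects can be perfectly valid categories, yet both fail this test, which is why `fillna` raises for them.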

is_scalar doesn't quite seem like the right check. We want to see if it's in the categories... Here's a simpler repro:

In [25]: c = pd.Categorical([(0, 1), (1, 2)])

In [26]: c.fillna((0, 1))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-69f5393b61c6> in <module>()
----> 1 c.fillna((0, 1))

~/sandbox/pandas/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    175                 else:
    176                     kwargs[new_arg_name] = new_arg_value
--> 177             return func(*args, **kwargs)
    178         return wrapper
    179     return _deprecate_kwarg

~/sandbox/pandas/pandas/core/arrays/categorical.py in fillna(self, value, method, limit)
   1767                 raise TypeError('"value" parameter must be a scalar, dict '
   1768                                 'or Series, but you passed a '
-> 1769                                 '"{0}"'.format(type(value).__name__))
   1770
   1771         return self._constructor(values, categories=self.categories,

TypeError: "value" parameter must be a scalar, dict or Series, but you passed a "tuple"

We'll need to watch out for non-hashable values.
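
A rough sketch of what a membership-based check could look like, assuming a hypothetical helper `is_valid_fill_value` (not part of pandas, and not necessarily what the eventual fix does). It guards against non-hashable values first, then accepts missing-value markers and anything present in the categories. `County` here is again a simplified stand-in class.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_hashable, is_scalar


def is_valid_fill_value(cat, value):
    """Hypothetical helper: can `value` be used to fill `cat`?"""
    # Non-hashable values (lists, dicts, ...) can never be categories,
    # and a membership test on the categories Index would raise for them.
    if not is_hashable(value):
        return False
    # Missing-value markers are acceptable: filling with them is a no-op.
    if is_scalar(value) and pd.isna(value):
        return True
    # Otherwise the value must already be one of the categories.
    return bool(value in cat.categories)


class County:
    """Simplified stand-in for the custom class in the report (assumption)."""


county = County()
cat = pd.Categorical([county, None, county])

print(is_valid_fill_value(cat, county))    # an existing category
print(is_valid_fill_value(cat, County()))  # a different, unknown instance
print(is_valid_fill_value(cat, [county]))  # non-hashable list
print(is_valid_fill_value(cat, np.nan))    # missing-value marker
```

The point of the design is that membership in `cat.categories`, rather than `is_scalar`, decides whether a value is acceptable, with the hashability guard keeping lists and similar objects from raising during the lookup.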

TomAugspurger added the Regression, Missing-data, Categorical, Effort Low, and good first issue labels on May 22, 2018
TomAugspurger added this to the 0.23.1 milestone on May 22, 2018
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue May 26, 2018
jreback pushed a commit that referenced this issue May 28, 2018
jorisvandenbossche pushed a commit to jorisvandenbossche/pandas that referenced this issue Jun 8, 2018
jorisvandenbossche pushed a commit that referenced this issue Jun 9, 2018
Closes #19788
Closes #21097
(cherry picked from commit 36c1f6b)
david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this issue Jun 18, 2018