Fillna a categories series with another series breaks #13628

Closed
rsdenijs opened this issue Jul 12, 2016 · 2 comments · Fixed by #27933
Labels: Bug, Categorical, good first issue, Testing

@rsdenijs

rsdenijs commented Jul 12, 2016

On pandas 0.18.1, for some values of a, b

Code Sample

pd.Categorical([1,2,None, None]).fillna(pd.Series([1,1, a, b]))

Expected Output

Either the same as

pd.Categorical([1, 2, a, b])

or

ValueError: fill value must be in categories

if a or b do not match the categories. This is what we get when we do pd.Categorical([1, 2, None, None]).fillna(3)

Actual output

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
This error message seems rather unenlightening...

@jorisvandenbossche
Member

jorisvandenbossche commented Jul 12, 2016

I think it should be your second suggestion, the ValueError. This is also what you get when filling with scalars:

In [4]: pd.Series(pd.Categorical([1,2,None, None])).fillna(1)
Out[4]:
0    1
1    2
2    1
3    1
dtype: category
Categories (2, int64): [1, 2]

In [5]: pd.Series(pd.Categorical([1,2,None, None])).fillna('a')
...
ValueError: fill value must be in categories
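
For anyone who wants to check the scalar behaviour on their own install, here is a minimal sketch. It calls fillna on the Categorical directly rather than through a Series, and try_fill is a hypothetical helper name, not part of pandas. The exception type and message may differ between pandas versions, so the sketch catches both TypeError and ValueError instead of assuming one.

```python
import pandas as pd


def try_fill(fill_value):
    """Return the filled values as a list, or the name of the exception raised."""
    cat = pd.Categorical([1, 2, None, None])
    try:
        return list(cat.fillna(fill_value))
    except (TypeError, ValueError) as err:
        # Assumption: which of the two is raised depends on the pandas version.
        return type(err).__name__


# A scalar that is already a category fills both missing entries.
print(try_fill(1))

# A scalar that is not a category is rejected with an exception.
print(try_fill("a"))
```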

@jorisvandenbossche
Member

This seems to be working now:

In [132]: pd.Categorical([1,2,None, None]).fillna(pd.Series([1,1, 'a', 'b']))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-132-77e152fafe70> in <module>()
----> 1 pd.Categorical([1,2,None, None]).fillna(pd.Series([1,1, 'a', 'b']))

/home/joris/scipy/pandas/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    136                 else:
    137                     kwargs[new_arg_name] = new_arg_value
--> 138             return func(*args, **kwargs)
    139         return wrapper
    140     return _deprecate_kwarg

/home/joris/scipy/pandas/pandas/core/arrays/categorical.py in fillna(self, value, method, limit)
   1639             if isinstance(value, ABCSeries):
   1640                 if not value[~value.isin(self.categories)].isna().all():
-> 1641                     raise ValueError("fill value must be in categories")
   1642 
   1643                 values_codes = _get_codes_for_values(value, self.categories)

ValueError: fill value must be in categories

So it would be good to add a test for this (or to check whether this was fixed specifically in a PR that already added a test)
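
A rough sketch of what such a test could look like, written as plain assertions rather than in the pandas test suite's own style. fill_from_series is a hypothetical helper, and this is not the test that was added in the linked PR. It assumes that filling from a Series of the same length replaces each missing entry with the value at the same position, and it accepts either TypeError or ValueError because the exact exception may vary by pandas version.

```python
import pandas as pd


def fill_from_series(fill_values):
    """Fill the missing entries of a small Categorical from a Series.

    Returns the filled values as a list, or the name of the exception
    raised when a fill value is not among the categories.
    """
    cat = pd.Categorical([1, 2, None, None])
    try:
        return list(cat.fillna(pd.Series(fill_values)))
    except (TypeError, ValueError) as err:
        # Assumption: the exception type and message are version-dependent.
        return type(err).__name__


# All fill values are existing categories: positions 2 and 3 get filled.
assert fill_from_series([1, 1, 2, 1]) == [1, 2, 2, 1]

# Fill values outside the categories should be rejected, not silently accepted.
assert fill_from_series([1, 1, "a", "b"]) in ("TypeError", "ValueError")
```

On 0.18.1 the second case would also raise a ValueError, but with the ambiguous-truth-value message from the original report, so a real regression test would probably want to match on the message as well.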

jorisvandenbossche added the Testing and good first issue labels and removed the API Design label on Feb 20, 2018
jreback added this to the 1.0 milestone on Aug 16, 2019
jreback pushed a commit that referenced this issue Aug 16, 2019
…t error message (#27933)

* Fixes issue #13628

* Fixes issue #13628 fix

* isort Fixes #13628

* move to 1.0