.remove_category(np.nan) fails on Categorical with floats #10156

Closed
wikiped opened this issue May 16, 2015 · 2 comments
Labels: Bug, Categorical (Categorical Data Type)

wikiped commented May 16, 2015

Removing a NaN category from a Categorical Series fails when the categories are floats.
The docs say:

Note: As integer Series can’t include NaN, the categories were converted to object.

So the failure is probably related: float categories stay float64 instead of being converted to object, and nan != nan.

If this is intended behavior, could it be added to the docs?
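
The `nan != nan` suspicion can be checked without pandas. The sketch below is only an analogy, not pandas internals: it assumes that float64 categories behave like a packed buffer that builds a fresh float object on every read (modelled here with the standard library's `array.array("d")`), while object categories keep a reference to the very NaN object that was added (modelled with a plain list). Set membership checks identity before equality, so only the second case finds the NaN.

```python
from array import array

nan = float("nan")

# NaN never compares equal to itself, so equality alone cannot match it.
assert nan != nan

# Stand-in for float64 categories (assumption): values are stored as raw
# doubles, and every read builds a new float object.
packed = array("d", [1.1, 2.1, 3.1, nan])

# Stand-in for object categories (assumption): the container keeps a
# reference to the same NaN object that was added.
boxed = [1, 2, 3, nan]

removals = {nan}

# Same kind of check as in remove_categories: removals - set(categories).
missing_from_packed = removals - set(packed)
missing_from_boxed = removals - set(boxed)

print(len(missing_from_packed))  # 1: NaN is reported as "not in old categories"
print(len(missing_from_boxed))   # 0: the identical NaN object is found
```

If the analogy holds, it would explain why the object-typed columns pass the check while the float64 column raises.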

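import numpy as np
import pandas as pd

df = pd.DataFrame({'a': pd.Categorical([1,2,3]),
                   'b': pd.Categorical(list('abc')),
                   'c': pd.Categorical([1.1,2.1,3.1])})
for col in df.columns:
    # Add NaN as a category, show the result, then try to remove it again.
    df[col].cat.add_categories(np.nan, inplace=True)
    print(df[col])
    # Works for 'a' and 'b' (object categories); raises ValueError for 'c' (float64).
    df[col].cat.remove_categories(np.nan)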

0    1
1    2
2    3
Name: a, dtype: category
Categories (4, object): [1, 2, 3, NaN]
0    a
1    b
2    c
Name: b, dtype: category
Categories (4, object): [a, b, c, NaN]
0    1.1
1    2.1
2    3.1
Name: c, dtype: category
Categories (4, float64): [1.1, 2.1, 3.1, NaN]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
d:\Anaconda\envs\py2k\lib\site-packages\pandas\core\categorical.pyc in _delegate_method(self, name, *args, **kwargs)
   1643         from pandas import Series
   1644         method = getattr(self.categorical, name)
-> 1645         res = method(*args, **kwargs)
   1646         if not res is None:
   1647             return Series(res, index=self.index)

d:\Anaconda\envs\py2k\lib\site-packages\pandas\core\categorical.pyc in remove_categories(self, removals, inplace)
    753         not_included = removals - set(self._categories)
    754         if len(not_included) != 0:
--> 755             raise ValueError("removals must all be in old categories: %s" % str(not_included))
    756         new_categories = [ c for c in self._categories if c not in removals ]
    757         return self.set_categories(new_categories, ordered=self.ordered, rename=False,

ValueError: removals must all be in old categories: set([nan])

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.9.final.0
python-bits: 64
pandas: 0.16.1
numpy: 1.9.2
    ...
jreback (Contributor) commented May 18, 2015

Looks like a bug. The set operation is not NaN-friendly. Pull requests are welcome.
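
One way to make such a check NaN-friendly is to handle NaN separately from the set difference. The following is a minimal sketch on plain Python lists, not the change that was made in pandas; `remove_categories_nan_aware` and `_is_nan` are hypothetical names introduced only for illustration.

```python
import math


def _is_nan(value):
    """True only for float NaN; non-float values such as strings are never NaN."""
    return isinstance(value, float) and math.isnan(value)


def remove_categories_nan_aware(categories, removals):
    """Return categories without removals, matching NaN by value, not identity."""
    remove_nan = any(_is_nan(r) for r in removals)
    plain_removals = {r for r in removals if not _is_nan(r)}

    has_nan = any(_is_nan(c) for c in categories)
    plain_categories = {c for c in categories if not _is_nan(c)}

    # Validate the non-NaN removals with an ordinary set difference,
    # and the NaN removal with an explicit flag.
    not_included = plain_removals - plain_categories
    if remove_nan and not has_nan:
        not_included.add(float("nan"))
    if not_included:
        raise ValueError(
            "removals must all be in old categories: %s" % not_included
        )

    kept = []
    for c in categories:
        if _is_nan(c):
            if not remove_nan:
                kept.append(c)
        elif c not in plain_removals:
            kept.append(c)
    return kept


# The NaN passed in is a different object from the one stored in the list.
categories = [1.1, 2.1, 3.1, float("nan")]
print(remove_categories_nan_aware(categories, [float("nan")]))  # [1.1, 2.1, 3.1]
```

Splitting NaN out before building the sets keeps the validation message and the existing set-based logic for every other value.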

jreback (Contributor) commented Jun 9, 2015

closed by #10304

jreback closed this as completed Jun 9, 2015