Commit

Merge pull request #9331 from shoyer/categorical-unique-order
BUG: don't sort unique values from categoricals
shoyer committed Feb 13, 2015
2 parents c37f8df + b787bf8 commit e266c3d
Showing 3 changed files with 12 additions and 8 deletions.
2 changes: 2 additions & 0 deletions doc/source/whatsnew/v0.16.0.txt
@@ -193,6 +193,8 @@ Bug Fixes
   SQLAlchemy type (:issue:`9083`).
 
 
+- Items in ``Categorical.unique()`` (and ``s.unique()`` if ``s`` is of dtype ``category``) now appear in the order in which they are originally found, not in sorted order (:issue:`9331`). This is now consistent with the behavior for other dtypes in pandas.
+
 
 - Fixed bug on big endian platforms which produced incorrect results in ``StataReader`` (:issue:`8688`).
 
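A minimal sketch of the behavior this release note describes, assuming a pandas version that includes this change. The sample values are made up for illustration, and the result is converted to a list so the check does not depend on the exact return type of `unique()`.

```python
import pandas as pd

# Illustrative data: "b" is seen first, although "a" sorts before it.
values = ["b", "a", "b", "c"]

as_category = pd.Series(values, dtype="category")
as_object = pd.Series(values)

# Both dtypes report unique values in order of first appearance.
category_unique = list(as_category.unique())
object_unique = list(as_object.unique())

print(category_unique)
print(object_unique)
```

With the fix in place, the two lists should match, which is the consistency across dtypes that the note refers to.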
11 changes: 5 additions & 6 deletions pandas/core/categorical.py
@@ -1385,17 +1385,16 @@ def unique(self):
         """
         Return the unique values.
 
-        Unused categories are NOT returned.
+        Unused categories are NOT returned. Unique values are returned in order
+        of appearance.
 
         Returns
         -------
         unique values : array
         """
-        unique_codes = np.unique(self.codes)
-        # for compatibility with normal unique, which has nan last
-        if unique_codes[0] == -1:
-            unique_codes[0:-1] = unique_codes[1:]
-            unique_codes[-1] = -1
+        from pandas.core.nanops import unique1d
+        # unlike np.unique, unique1d does not sort
+        unique_codes = unique1d(self.codes)
         return take_1d(self.categories.values, unique_codes)
 
     def equals(self, other):
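The difference between the removed and added code paths can be sketched with NumPy alone. This is an illustration, not the commit's implementation: the `codes` and `categories` arrays are hypothetical, and the first-occurrence trick below is a stand-in for the `unique1d` helper, whose internals are not shown in this diff.

```python
import numpy as np

# Hypothetical integer codes for the values ["b", "b", NaN, "a"];
# -1 marks a missing value, as in the removed code.
codes = np.array([1, 1, -1, 0], dtype=np.int8)
categories = np.array(["a", "b", "c"], dtype=object)

# Old approach: np.unique returns the codes in sorted order.
sorted_codes = np.unique(codes)

# Order-preserving alternative (stand-in for unique1d): take the index of
# each code's first occurrence, then restore the original ordering.
_, first_idx = np.unique(codes, return_index=True)
ordered_codes = codes[np.sort(first_idx)]

# Map codes back to category values, with -1 becoming NaN.
ordered_values = [categories[c] if c != -1 else np.nan for c in ordered_codes]

print(sorted_codes.tolist())
print(ordered_codes.tolist())
print(ordered_values)
```

Because the order-preserving path keeps the missing-value code wherever it first occurs, the special case that moved `-1` to the end is no longer needed. The unused category `"c"` never appears in the result either way.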
7 changes: 5 additions & 2 deletions pandas/tests/test_categorical.py
@@ -818,12 +818,15 @@ def test_unique(self):
         exp = np.asarray(["a","b"])
         res = cat.unique()
         self.assert_numpy_array_equal(res, exp)
+
         cat = Categorical(["a","b","a","a"], categories=["a","b","c"])
         res = cat.unique()
         self.assert_numpy_array_equal(res, exp)
-        cat = Categorical(["a","b","a", np.nan], categories=["a","b","c"])
+
+        # unique should not sort
+        cat = Categorical(["b", "b", np.nan, "a"], categories=["a","b","c"])
         res = cat.unique()
-        exp = np.asarray(["a","b", np.nan], dtype=object)
+        exp = np.asarray(["b", np.nan, "a"], dtype=object)
         self.assert_numpy_array_equal(res, exp)
 
     def test_mode(self):
