Regression from 0.19.2 to 0.20.1 in pandas.unique() when applied to list of tuples #16519

Closed
jotterbach opened this issue May 27, 2017 · 1 comment
Labels
Blocker: Blocking issue or pull request for an upcoming release
Regression: Functionality that used to work in a prior pandas version
Reshaping: Concat, Merge/Join, Stack/Unstack, Explode
jotterbach commented May 27, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd

input = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
print(pd.unique(input))

Problem description

The call raises a ValueError instead of returning the unique tuples:

Traceback (most recent call last):
  File "pandas_bug.py", line 6, in <module>
    pd.unique(input)
  File "/Users/johannes/.virtualenvs/pandas/lib/python2.7/site-packages/pandas/core/algorithms.py", line 351, in unique
    uniques = table.unique(values)
  File "pandas/_libs/hashtable_class_helper.pxi", line 1271, in pandas._libs.hashtable.PyObjectHashTable.unique (pandas/_libs/hashtable.c:21384)
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)
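The "expected 1, got 2" part of the message is consistent with the list of tuples being converted to a two-dimensional array before it reaches the hash table. The following is a minimal sketch using only NumPy, not the pandas internals themselves; the variable names are chosen for illustration.

```python
import numpy as np

# Same data as in the report above.
values = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]

# NumPy treats each tuple as a row, so even with dtype=object the
# result is a 2-D array of ints rather than a 1-D array of tuples.
as_2d = np.asarray(values, dtype=object)

print(as_2d.shape)
print(as_2d.ndim)
```

A routine that only accepts one-dimensional buffers would reject `as_2d`, which matches the dimension mismatch in the traceback.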

Expected Output

The code works on pandas version 0.19.2 and produces the expected output

[(0, 0) (0, 1) (1, 0) (1, 1)]

Moreover, this problem is not limited to Mac OS X; it was also encountered on an Ubuntu CI server.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 35.0.2
Cython: None
numpy: 1.12.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None
None

jreback commented May 30, 2017

This is related to #16394 and needs the same fix, along with some tests to ensure that nothing else breaks.

diff --git a/pandas/core/algorithms.py b/pandas/core/algorithms.py
index 77d79c9..9cfaf04 100644
--- a/pandas/core/algorithms.py
+++ b/pandas/core/algorithms.py
@@ -163,7 +163,7 @@ def _ensure_arraylike(values):
                                ABCIndexClass, ABCSeries)):
         inferred = lib.infer_dtype(values)
         if inferred in ['mixed', 'string', 'unicode']:
-            values = np.asarray(values, dtype=object)
+            values = lib.list_to_object_array(values)
         else:
             values = np.asarray(values)
     return values
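To illustrate the idea behind the change, here is a rough sketch of building a one-dimensional object array whose elements are the tuples themselves. It does not reproduce `lib.list_to_object_array`; `to_1d_object_array` and `ordered_unique` are hypothetical helpers written for this example, and the pure-Python de-duplication only stands in for the hash-table step.

```python
import numpy as np


def to_1d_object_array(values):
    """Hypothetical helper: store each item of `values` as one element."""
    out = np.empty(len(values), dtype=object)
    # Assign item by item so NumPy never tries to unpack the tuples
    # into a second dimension.
    for i, item in enumerate(values):
        out[i] = item
    return out


def ordered_unique(arr):
    """Hypothetical stand-in for the hash-table unique: keep first occurrences."""
    seen = set()
    result = []
    for item in arr:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return to_1d_object_array(result)


values = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
arr = to_1d_object_array(values)
uniques = ordered_unique(arr)

print(arr.shape)
print(list(uniques))
```

With the tuples kept as single elements, the array is one-dimensional and the unique values come out in order of first appearance, as in the expected output from 0.19.2.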

jreback added the Regression and Reshaping labels on May 30, 2017
jreback added this to the 0.20.2 milestone on May 30, 2017
TomAugspurger added the Blocker label on May 30, 2017
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue May 30, 2017
TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue May 31, 2017
jreback pushed a commit to TomAugspurger/pandas that referenced this issue May 31, 2017