Regression from 0.19.2 to 0.20.1 in pandas.unique() when applied to list of tuples #16519

jotterbach · 2017-05-27T01:11:50Z

Code Sample, a copy-pastable example if possible

import pandas as pd

input = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
print(pd.unique(input))

Problem description

The call raises a ValueError instead of returning the unique tuples:

Traceback (most recent call last):
  File "pandas_bug.py", line 6, in <module>
    pd.unique(input)
  File "/Users/johannes/.virtualenvs/pandas/lib/python2.7/site-packages/pandas/core/algorithms.py", line 351, in unique
    uniques = table.unique(values)
  File "pandas/_libs/hashtable_class_helper.pxi", line 1271, in pandas._libs.hashtable.PyObjectHashTable.unique (pandas/_libs/hashtable.c:21384)
ValueError: Buffer has wrong number of dimensions (expected 1, got 2)

Expected Output

The code works on pandas version 0.19.2 and produces the expected output

[(0, 0) (0, 1) (1, 0) (1, 1)]
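
For reference, the expected result is just the input with duplicates dropped in first-seen order. A minimal pure-Python sketch of that behavior follows; `unique_in_order` is a hypothetical helper written for illustration, not part of pandas, and it returns a plain list rather than the object array that pandas produces.

```python
def unique_in_order(items):
    """Hypothetical helper: keep the first occurrence of each hashable item."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


values = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]
print(unique_in_order(values))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```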

Moreover, this problem is not limited to Mac OS X; it was also encountered on an Ubuntu CI server.

Output of `pd.show_versions()`

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.10.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None

pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 35.0.2
Cython: None
numpy: 1.12.1
scipy: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
pandas_gbq: None
pandas_datareader: None
None


jreback · 2017-05-30T11:18:37Z

This is related to #16394 and needs the same fix, along with some tests to ensure that nothing else breaks.

diff --git a/pandas/core/algorithms.py b/pandas/core/algorithms.py
index 77d79c9..9cfaf04 100644
--- a/pandas/core/algorithms.py
+++ b/pandas/core/algorithms.py
@@ -163,7 +163,7 @@ def _ensure_arraylike(values):
                                ABCIndexClass, ABCSeries)):
         inferred = lib.infer_dtype(values)
         if inferred in ['mixed', 'string', 'unicode']:
-            values = np.asarray(values, dtype=object)
+            values = lib.list_to_object_array(values)
         else:
             values = np.asarray(values)
     return values

Closes pandas-dev#16519
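
The "expected 1, got 2" in the traceback is consistent with how NumPy coerces a list of equal-length tuples: each tuple becomes a row, so the array is 2-D even with `dtype=object`. The sketch below illustrates the difference using only NumPy; `to_1d_object_array` is a hypothetical stand-in for the idea of keeping one slot per list element, not the pandas-internal `lib.list_to_object_array` itself.

```python
import numpy as np

values = [(0, 0), (0, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 0), (1, 1)]

# NumPy unpacks the equal-length tuples into rows, giving a 2-D array.
as_2d = np.asarray(values, dtype=object)
print(as_2d.shape)  # (8, 2)


def to_1d_object_array(items):
    """Hypothetical helper: store each list element in its own slot."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # scalar index assignment keeps the tuple intact
    return out


as_1d = to_1d_object_array(values)
print(as_1d.shape)  # (8,)
```

A 1-D object array of tuples is the shape a hash table expecting a one-dimensional buffer can consume, which is why the conversion step matters here.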

jreback added the Regression and Reshaping labels May 30, 2017

jreback added this to the 0.20.2 milestone May 30, 2017

jreback added Difficulty Novice labels May 30, 2017

TomAugspurger added the Blocker label May 30, 2017

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue May 30, 2017

BUG: Fixed pd.unique on array of tuples

35da5b9

Closes pandas-dev#16519

TomAugspurger mentioned this issue May 30, 2017

BUG: Fixed pd.unique on array of tuples #16543

Merged

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue May 31, 2017

BUG: Fixed pd.unique on array of tuples

40babf0

Closes pandas-dev#16519

jreback pushed a commit to TomAugspurger/pandas that referenced this issue May 31, 2017

BUG: Fixed pd.unique on array of tuples

658f1ab

Closes pandas-dev#16519

jreback closed this as completed in #16543 Jun 1, 2017
