Series.isin(values) raises ValueError if values is a set #12988

Closed
ch41rmn opened this Issue Apr 26, 2016 · 5 comments

ch41rmn commented Apr 26, 2016

When trying to use Series.isin(values) for Timestamps, an exception is raised if values is a set.

import pandas as pd

d = {'Dates':[pd.Timestamp('2013-01-02'),
              pd.Timestamp('2013-01-03'),
              pd.Timestamp('2013-01-04')],
     'Num1':[1,2,3],
     'Num2':[-1,-2,-3]}

df = pd.DataFrame(data=d)

>>> df['Dates'].isin({pd.Timestamp('2013-01-04')})
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)

>>> df['Dates'].isin([pd.Timestamp('2013-01-04')])
0    False
1    False
2     True
Name: Dates, dtype: bool
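
As a workaround on the affected version, the set can be converted to a list before it is passed to isin. A minimal sketch, assuming the same dates as above (the names `dates` and `wanted` are illustrative, not from the report):

```python
import pandas as pd

# Same three dates as in the report, held in a standalone Series.
dates = pd.Series([pd.Timestamp('2013-01-02'),
                   pd.Timestamp('2013-01-03'),
                   pd.Timestamp('2013-01-04')], name='Dates')

# The lookup values start out as a set (illustrative name).
wanted = {pd.Timestamp('2013-01-04')}

# Convert the set to a list before calling isin.
mask = dates.isin(list(wanted))
print(mask.tolist())  # [False, False, True]
```

The order of the list does not matter here, since isin only tests membership.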

output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 4.1.13-100.fc21.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_AU.utf8

pandas: 0.18.0
nose: None
pip: 8.1.1
setuptools: 20.2.2
Cython: None
numpy: 1.11.0
scipy: 0.17.0
statsmodels: None
xarray: 0.7.2
IPython: 4.1.2
sphinx: 1.3.5
patsy: None
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None

@ch41rmn ch41rmn changed the title from Series isin set to Series.isin(values) raises ValueError if values is a set Apr 26, 2016

ch41rmn commented Apr 26, 2016

Related stackoverflow: http://stackoverflow.com/questions/19070194/isin-function-does-not-work-for-dates

The bug raised in that SO post has been fixed, but the discussion between @cpcloud and the poster of the answer sheds light on why .isin works differently for sets (via __hash__) and lists (via __eq__).

Contributor

jreback commented Apr 26, 2016

yeh we should coerce to a list before passing.

pull-requests welcome.

@jreback jreback added this to the Next Major Release milestone Apr 26, 2016

ch41rmn commented Apr 28, 2016

Error is raised by pd.to_datetime:

>>> import pandas as pd
>>> s = set(['20160420'])
>>> pd.to_datetime(s)
Traceback (most recent call last):
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 402, in _convert_listlike
    values, tz = tslib.datetime_to_datetime64(arg)
  File "pandas/tslib.pyx", line 1560, in pandas.tslib.datetime_to_datetime64 (pandas/tslib.c:29286)
    def datetime_to_datetime64(ndarray[object] values):
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/accounts/mliu/code/pandas/pandas/util/decorators.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 291, in to_datetime
    unit=unit, infer_datetime_format=infer_datetime_format)
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 420, in _to_datetime
    return _convert_listlike(arg, box, format)
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 405, in _convert_listlike
    raise e
  File "/home/accounts/mliu/code/pandas/pandas/tseries/tools.py", line 391, in _convert_listlike
    require_iso8601=require_iso8601
  File "pandas/tslib.pyx", line 1964, in pandas.tslib.array_to_datetime (pandas/tslib.c:39434)
    cpdef array_to_datetime(ndarray[object] values, errors='raise',
ValueError: Buffer has wrong number of dimensions (expected 1, got 0)
>>> pd.to_datetime(list(s))
DatetimeIndex(['2016-04-20'], dtype='datetime64[ns]', freq=None)

Contributor

jreback commented Apr 28, 2016

a set is not a legit input to to_datetime, as the outputs are ordered and a set is not

I suppose the error message could be better though
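
Because a set has no defined order, one way to get a predictable result is to sort its values into a list before converting. A minimal sketch (the sample strings, the name `raw`, and the explicit format string are assumptions for illustration):

```python
import pandas as pd

# Date strings held in a set; iteration order is not guaranteed.
raw = {'20160422', '20160420', '20160421'}

# Sorting yields a list with a deterministic order.
ordered = sorted(raw)

# Lists are accepted by to_datetime; the format is given explicitly.
idx = pd.to_datetime(ordered, format='%Y%m%d')
print(idx)
```

Sorting the strings works here only because the YYYYMMDD layout sorts chronologically; other layouts would need to be parsed first and sorted afterwards.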

@ch41rmn ch41rmn referenced this issue Apr 28, 2016

Closed

BUG: .isin(...) now coerces sets to lists #13014

0 of 4 tasks complete

ch41rmn commented Apr 28, 2016

I added the set --> list coercion in pandas.core.algorithms.isin. See pull request.
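
The idea of the coercion can be sketched in plain Python. This is not the code from the pull request; `coerce_isin_values` is a hypothetical helper written only to illustrate the approach:

```python
def coerce_isin_values(values):
    """Hypothetical helper: turn set-like input into a list.

    Illustrates the set -> list coercion described above; it is
    not the actual pandas implementation.
    """
    if isinstance(values, (set, frozenset)):
        return list(values)
    # Lists, tuples, arrays, etc. are passed through unchanged.
    return values


as_list = coerce_isin_values({3, 1, 2})
print(sorted(as_list))  # [1, 2, 3]
```

A caller could then write something like `series.isin(coerce_isin_values(values))`, so that sets and lists take the same code path.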

@jreback jreback modified the milestones: 0.18.1, Next Major Release Apr 29, 2016

@jreback jreback closed this in 05e734a May 1, 2016

nps added a commit to nps/pandas that referenced this issue May 17, 2016