str.get fails if Series contains dict #20671

Closed
datapythonista opened this Issue Apr 12, 2018 · 0 comments

datapythonista (Member) commented Apr 12, 2018

Code Sample, a copy-pastable example if possible

>>> s = pandas.Series([{0: 'a', 1: 'b'}])
>>> s
0    {0: 'a', 1: 'b'}
dtype: object
>>> s.str.get(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 1556, in get
    result = str_get(self._data, i)
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 1264, in str_get
    return _na_map(f, arr)
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 156, in _na_map
    return _map(f, arr, na_mask=True, na_value=na_result, dtype=dtype)
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 171, in _map
    result = lib.map_infer_mask(arr, f, mask.view(np.uint8), convert)
  File "pandas/_libs/src/inference.pyx", line 1482, in pandas._libs.lib.map_infer_mask
  File "/home/mgarcia/.anaconda3/lib/python3.6/site-packages/pandas/core/strings.py", line 1263, in <lambda>
    f = lambda x: x[i] if len(x) > i >= -len(x) else np.nan
KeyError: -1

Problem description

str.get is designed for strings, but it is also useful with other structures such as lists, for which it works fine. When the values of the Series contain a dict, str.get uses the provided index as a dictionary key. The bounds check on len(x) still passes, so a key that is not in the dict raises a KeyError.

I think it's more consistent with the rest of pandas to simply return numpy.nan when this happens.
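
To make the failure mode concrete, here is a minimal sketch that runs the accessor from the traceback on plain Python objects, next to one possible dict-aware variant. `get_current` copies the lambda shown in the traceback. `get_safe` and `NAN` are hypothetical names used only for illustration, and the variant is an assumption about how the behaviour could be changed, not necessarily the fix that was merged.

```python
import math

NAN = float("nan")  # stand-in for numpy.nan so the sketch has no third-party imports


def get_current(x, i):
    """Accessor as it appears in the traceback (pandas/core/strings.py)."""
    return x[i] if len(x) > i >= -len(x) else NAN


def get_safe(x, i):
    """Hypothetical variant: look dicts up by key, bounds-check everything else."""
    if isinstance(x, dict):
        return x.get(i, NAN)
    return x[i] if len(x) > i >= -len(x) else NAN


d = {0: 'a', 1: 'b'}

# Sequences: the bounds check protects the lookup in both variants.
assert get_current("ab", -1) == "b"
assert get_safe(["a", "b"], -1) == "b"
assert math.isnan(get_safe("ab", 5))

# Dict: the bounds check passes (2 > -1 >= -2), then d[-1] raises KeyError.
try:
    get_current(d, -1)
except KeyError as exc:
    print("KeyError:", exc)

# The dict-aware variant returns NaN for a missing key and the value otherwise.
assert math.isnan(get_safe(d, -1))
assert get_safe(d, 1) == "b"
```

The sketch treats the argument as a key whenever the value is a dict, so negative integers are only reinterpreted as positions for sequences.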

Expected Output

>>> s = pandas.Series([{0: 'a', 1: 'b'}])
>>> s
0    {0: 'a', 1: 'b'}
dtype: object
>>> s.str.get(-1)
0    NaN

Output of pd.show_versions()

>>> pandas.show_versions()

INSTALLED VERSIONS

commit: fa231e8
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.8.13-100.fc23.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.utf8
LOCALE: en_GB.UTF-8

pandas: 0.23.0.dev0+740.gfa231e8.dirty
pytest: 3.1.3
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: 0.8.0
xarray: 0.10.0
IPython: 6.2.1
sphinx: 1.5
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.1.2
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.1
pymysql: 0.8.0
psycopg2: None
jinja2: 2.10
s3fs: 0.1.3
fastparquet: 0.1.4
pandas_gbq: None
pandas_datareader: None

datapythonista added a commit to datapythonista/pandas that referenced this issue Apr 12, 2018

jreback modified the milestones: 0.23.0, Next Major Release Apr 14, 2018

datapythonista added a commit to datapythonista/pandas that referenced this issue Apr 15, 2018

jreback modified the milestones: Next Major Release, 0.23.0 Apr 24, 2018

jreback added a commit that referenced this issue Apr 24, 2018
