BUG: melt changes type of tz-aware columns #15785

Closed
stigviaene opened this Issue Mar 23, 2017 · 1 comment

stigviaene commented Mar 23, 2017

Code Samples

import pandas as pd
frame = pd.DataFrame({
    'klass': range(5),
    'ts': [
        pd.Timestamp('2017-03-23 08:22:42.173378+01'),
        pd.Timestamp('2017-03-23 08:22:42.178578+01'),
        pd.Timestamp('2017-03-23 08:22:42.173578+01'),
        pd.Timestamp('2017-03-23 08:22:42.178378+01'),
        pd.Timestamp('2017-03-23 08:22:42.163378+01'),
    ],
    'attribute': ['att1', 'att2', 'att3', 'att4', 'att5'],
    'value': ['a', 'b', 'c', 'd', 'd'],
})
# At this point, frame.ts is of dtype datetime64[ns, pytz.FixedOffset(60)]
frame.set_index(['ts', 'klass'], inplace=True)
queried_index = frame.query('value=="d"').index
pivoted_frame = frame.reset_index().pivot_table(index=['klass', 'ts'], columns='attribute', values='value', aggfunc='first')
melted_frame = pd.melt(pivoted_frame.reset_index(), id_vars=['klass', 'ts'], var_name='attribute', value_name='value')
# At this point, melted_frame.ts is of dtype datetime64[ns]
queried_after_melted_index = melted_frame.query('value=="d"').set_index(['ts', 'klass']).index
frame.loc[queried_index]  # Works
frame.loc[queried_index] = 'test'  # Works
frame.loc[queried_after_melted_index]  # Works
frame.loc[queried_after_melted_index] = 'test'  # Breaks

The last statement gives:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 140, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 127, in _get_setitem_indexer
    return self._convert_to_indexer(key, is_setter=True)
  File "/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py", line 1230, in _convert_to_indexer
    raise KeyError('%s not in index' % objarr[mask])
KeyError: "MultiIndex(levels=[[2017-03-23 07:22:42.163378, 2017-03-23 07:22:42.173378, 2017-03-23 07:22:42.173578, 2017-03-23 07:22:42.178378, 2017-03-23 07:22:42.178578], [0, 1, 2, 3, 4]],\n           labels=[[3, 0], [3, 4]],\n           names=['ts', 'klass']) not in index"
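
The timestamps in the KeyError above are one hour earlier than the ones in the original frame and carry no offset, which suggests the melted index holds naive UTC wall times. A minimal sketch of how to confirm that by comparing the dtypes of the index levels; `level_dtypes` is a hypothetical helper, not a pandas function, and the one-row indexes are made up for illustration:

```python
import pandas as pd

# One tz-aware timestamp taken from the report.
aware = pd.Series([pd.Timestamp('2017-03-23 08:22:42.173378+01')])

# Converting to UTC and dropping the timezone yields the naive wall time
# that shows up in the KeyError message.
naive = aware.dt.tz_convert('UTC').dt.tz_localize(None)


def level_dtypes(index):
    """Hypothetical helper: map each MultiIndex level name to its dtype."""
    return {name: index.get_level_values(name).dtype for name in index.names}


aware_index = pd.MultiIndex.from_arrays([aware, [0]], names=['ts', 'klass'])
naive_index = pd.MultiIndex.from_arrays([naive, [0]], names=['ts', 'klass'])

print(level_dtypes(aware_index))  # 'ts' level is tz-aware
print(level_dtypes(naive_index))  # 'ts' level is tz-naive
```

Running `level_dtypes` on `frame.index` and on `queried_after_melted_index` from the report should show the same mismatch on the `ts` level.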

Problem description

  • It is counter-intuitive for an operation to change the dtype of a column unless its documentation explicitly says that it does.
  • It is also counter-intuitive that frame.loc behaves differently in a lookup (frame.loc[key]) than in an assignment (frame.loc[key] = ...).

Expected Output

  • melted_frame.ts and frame.ts have the same dtype.
  • DataFrame.loc either fails in both cases (lookup and assignment) or succeeds in both.
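
Until melt keeps the dtype, a possible workaround is to restore the timezone after melting. This is only a sketch: `melt_keep_tz` is a hypothetical helper, the small `wide` frame is made up for illustration, and the code assumes the naive values produced by melt are UTC wall times (as the traceback suggests).

```python
import pandas as pd


def melt_keep_tz(df, id_vars, **melt_kwargs):
    """Hypothetical helper: melt, then restore tz-aware dtypes lost on id_vars."""
    melted = pd.melt(df, id_vars=id_vars, **melt_kwargs)
    for col in id_vars:
        original_dtype = df[col].dtype
        was_aware = isinstance(original_dtype, pd.DatetimeTZDtype)
        is_aware = isinstance(melted[col].dtype, pd.DatetimeTZDtype)
        if was_aware and not is_aware:
            # Assumption: the naive values are UTC wall times.
            melted[col] = (
                melted[col].dt.tz_localize('UTC').dt.tz_convert(original_dtype.tz)
            )
    return melted


# Made-up wide frame with a tz-aware 'ts' column.
wide = pd.DataFrame({
    'klass': [0, 1],
    'ts': [
        pd.Timestamp('2017-03-23 08:22:42.173378+01'),
        pd.Timestamp('2017-03-23 08:22:42.178578+01'),
    ],
    'att1': ['a', None],
    'att2': [None, 'b'],
})

tidy = melt_keep_tz(wide, id_vars=['klass', 'ts'],
                    var_name='attribute', value_name='value')
print(tidy['ts'].dtype, wide['ts'].dtype)
```

On a pandas version where melt already preserves the dtype, the restore branch is simply skipped.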

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-66-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.19.2
nose: None
pip: 9.0.1
setuptools: 20.7.0
Cython: None
numpy: 1.12.0
scipy: None
statsmodels: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.7.3
lxml: 3.5.0
bs4: 4.4.1
html5lib: 0.999
httplib2: 0.9.1
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None
pandas_datareader: None

jreback commented Mar 23, 2017

@stigviaene .melt doesn't have the battery of tests that most other things have, so it's not surprising that this doesn't convert correctly. You're welcome to submit a patch to fix it, or at least to see if you can locate the problem.

Your comments on indexing are orthogonal. If you have a specific bug or comment, you can raise it in another issue.
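
For the missing test coverage mentioned above, a regression test might look roughly like the following. This is a sketch, not the test that was eventually committed; the function name and the sample data are assumptions. It checks that a tz-aware id column comes out of melt with its values repeated once per value column and its dtype unchanged.

```python
import pandas as pd


def test_melt_preserves_tz_aware_id_vars():
    # Made-up frame: one tz-aware id column and two value columns.
    df = pd.DataFrame({
        'ts': [
            pd.Timestamp('2017-03-23 08:22:42.173378+01'),
            pd.Timestamp('2017-03-23 08:22:42.178578+01'),
        ],
        'a': [1, 2],
        'b': [3, 4],
    })

    result = pd.melt(df, id_vars=['ts'], value_vars=['a', 'b'])

    # melt repeats the id column once per value column.
    expected_ts = pd.concat([df['ts'], df['ts']], ignore_index=True)

    # assert_series_equal also compares dtypes, so a lost timezone fails here.
    pd.testing.assert_series_equal(result['ts'], expected_ts)


test_melt_preserves_tz_aware_id_vars()
```

On a pandas version affected by this bug, the test is expected to fail on the dtype comparison.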

@jreback jreback added this to the Next Major Release milestone Mar 23, 2017

@jreback jreback changed the title from melt changes type of timestamp columns to BUG: melt changes type of timestamp columns Mar 23, 2017

@jreback jreback changed the title from BUG: melt changes type of timestamp columns to BUG: melt changes type of tz-aware columns Mar 23, 2017

@jreback jreback modified the milestones: Next Major Release, 0.23.0 Mar 12, 2018

jreback added a commit that referenced this issue Mar 13, 2018
