to_msgpack() fails when data has "period" #13463

Closed
amanhanda opened this issue Jun 16, 2016 · 4 comments · Fixed by #19975
Labels
Bug Period Period data type

@amanhanda

to_msgpack() raises a TypeError when the data contains Period values (seen on pandas 0.18.1). The same call works on 0.16.2.

ISSUE

import pandas as pd
pd.DataFrame({'C': pd.period_range('2015-01-01', periods=3)}).to_msgpack()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-65-8e5ad9cd6036> in <module>()
----> 1 pd.DataFrame({'C': pd.period_range('2015-01-01', periods=3)}).to_msgpack()

/auto/energymdl2/anaconda/envs/commod_20160516_pd18/lib/python2.7/site-packages/pandas/core/generic.pyc in to_msgpack(self, path_or_buf, encoding, **kwargs)
   1120         from pandas.io import packers
   1121         return packers.to_msgpack(path_or_buf, self, encoding=encoding,
-> 1122                                   **kwargs)
   1123
   1124     def to_sql(self, name, con, flavor='sqlite', schema=None, if_exists='fail',

/auto/energymdl2/anaconda/envs/commod_20160516_pd18/lib/python2.7/site-packages/pandas/io/packers.pyc in to_msgpack(path_or_buf, *args, **kwargs)
    152     elif path_or_buf is None:
    153         buf = compat.BytesIO()
--> 154         writer(buf)
    155         return buf.getvalue()
    156     else:

/auto/energymdl2/anaconda/envs/commod_20160516_pd18/lib/python2.7/site-packages/pandas/io/packers.pyc in writer(fh)
    145     def writer(fh):
    146         for a in args:
--> 147             fh.write(pack(a, **kwargs))
    148
    149     if isinstance(path_or_buf, compat.string_types):

/auto/energymdl2/anaconda/envs/commod_20160516_pd18/lib/python2.7/site-packages/pandas/io/packers.pyc in pack(o, default, encoding, unicode_errors, use_single_float, autoreset, use_bin_type)
    699                   use_single_float=use_single_float,
    700                   autoreset=autoreset,
--> 701                   use_bin_type=use_bin_type).pack(o)
    702
    703

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer.pack (pandas/msgpack/_packer.cpp:3434)()

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer.pack (pandas/msgpack/_packer.cpp:3276)()

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer._pack (pandas/msgpack/_packer.cpp:2433)()

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer._pack (pandas/msgpack/_packer.cpp:3006)()

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer._pack (pandas/msgpack/_packer.cpp:2433)()

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer._pack (pandas/msgpack/_packer.cpp:3006)()

pandas/msgpack/_packer.pyx in pandas.msgpack._packer.Packer._pack (pandas/msgpack/_packer.cpp:3088)()

/auto/energymdl2/anaconda/envs/commod_20160516_pd18/lib/python2.7/site-packages/pandas/io/packers.pyc in encode(obj)
    510         return {u'typ': u'period',
    511                 u'ordinal': obj.ordinal,
--> 512                 u'freq': u(obj.freq)}
    513     elif isinstance(obj, BlockIndex):
    514         return {u'typ': u'block_index',

/auto/energymdl2/anaconda/envs/commod_20160516_pd18/lib/python2.7/site-packages/pandas/compat/__init__.pyc in u(s)
    262
    263     def u(s):
--> 264         return unicode(s, "unicode_escape")
    265
    266     def u_safe(s):

TypeError: coercing to Unicode: need string or buffer, Day found

Expected Output

Works on 0.16.2

In [30]: pd.DataFrame({'C': pd.period_range('2015-01-01', periods=3)}).to_msgpack()
Out[30]: '\x84\xa6blocks\x91\x86\xa5items\x86\xa4name\xc0\xa5dtype\x11\xa8compress\xc0\xa4data\x91\xa1C\xa5klass\xa5Index\xa3typ\xa5index\xa8compress\xc0\xa5shape\x92\x01\x03\xa6values\x93\x83\xa7ordinal\xcd@4\xa4freq\xa1D\xa3typ\xa6period\x83\xa7ordinal\xcd@5\xa4freq\xa1D\xa3typ\xa6period\x83\xa7ordinal\xcd@6\xa4freq\xa1D\xa3typ\xa6period\xa5klass\xabObjectBlock\xa5dtype\x11\xa4axes\x92\x86\xa4name\xc0\xa5dtype\x11\xa8compress\xc0\xa4data\x91\xa1C\xa5klass\xa5Index\xa3typ\xa5index\x86\xa4name\xc0\xa5dtype\x07\xa8compress\xc0\xa4data\xb8\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\xa5klass\xaaInt64Index\xa3typ\xa5index\xa3typ\xadblock_manager\xa5klass\xa9DataFrame'

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-573.7.1.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: C
LANG: en_US.UTF-8

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.1
setuptools: 20.7.0
Cython: 0.24
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: 0.7.2
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.1
dateutil: 2.5.2
pytz: 2016.4
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.4.3
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9.2
apiclient: 1.5.0
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None
@jreback
Contributor

jreback commented Jun 16, 2016

Periods work as part of a PeriodIndex, but general support for non-string objects is not well covered by the tests. Currently a column of Periods is stored as object dtype.

@jreback jreback added Bug Msgpack Period Period data type labels Jun 16, 2016
@jreback jreback added this to the Next Major Release milestone Jun 16, 2016
@jreback
Contributor

jreback commented Jun 16, 2016

The quickest way to get this fixed is to write a patch and submit it as a PR.

@amanhanda
Author

For Period, the patch would use obj.freqstr instead of obj.freq on the marked line:

    509     elif isinstance(obj, Period):
    510         return {u'typ': u'period',
    511                 u'ordinal': obj.ordinal,
--> 512                 u'freq': u(obj.freq)}
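
To make the proposal concrete, here is a rough sketch of the failure and of the suggested change. It does not import pandas: `Day` and `Period` below are hypothetical stand-ins that expose only the attributes the encoder touches, and `u()` is an assumed Python 3 emulation of the Python 2 helper shown in the traceback (`unicode(s, "unicode_escape")`). The stand-in `freqstr` returns bytes to mimic a Python 2 `str`. The ordinal 16436 is the value visible in the 0.16.2 output above (`\xcd@4`). Treat this as an illustration of the idea, not as the actual patch.

```python
class Day:
    """Hypothetical stand-in for a pandas offset object such as Day."""
    freqstr = b"D"  # bytes, to mimic a Python 2 str


class Period:
    """Hypothetical stand-in; only ordinal, freq and freqstr are modelled."""

    def __init__(self, ordinal, freq):
        self.ordinal = ordinal
        self.freq = freq

    @property
    def freqstr(self):
        return self.freq.freqstr


def u(s):
    # Assumed emulation of the Python 2 helper from the traceback:
    # it decodes bytes and raises TypeError for any other object.
    return str(s, "unicode_escape")


def encode_period_broken(obj):
    # Mirrors the failing line: obj.freq is an offset object, not a string.
    return {"typ": "period", "ordinal": obj.ordinal, "freq": u(obj.freq)}


def encode_period_patched(obj):
    # Proposed change: hand u() the string alias instead of the offset object.
    return {"typ": "period", "ordinal": obj.ordinal, "freq": u(obj.freqstr)}


p = Period(16436, Day())

try:
    encode_period_broken(p)
    broken_error = None
except TypeError as exc:
    broken_error = exc  # same class of error as reported in the issue

patched = encode_period_patched(p)
```

With the stand-ins, the broken encoder fails because `u()` receives an object that is not a string, while the patched encoder yields a plain mapping of `typ`, `ordinal` and `freq` that a msgpack packer can serialize.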

@jreback
Contributor

jreback commented Jun 16, 2016

Yes, this is probably a simple fix, but it needs adequate testing, and probably a check-in of some msgpack files generated on older versions (use pandas/io/tests/generate_legacy_storage_files.py to generate them from a working version).

Liam3851 added a commit to Liam3851/pandas that referenced this issue Mar 2, 2018
Adds support for IntervalIndex and Interval; fixes Period and
TimedeltaIndex. Closes pandas-dev#19939 and pandas-dev#13463.
Liam3851 added a commit to Liam3851/pandas that referenced this issue Mar 2, 2018
Adds support for IntervalIndex and Interval; fixes Period and
TimedeltaIndex. Closes pandas-dev#19967 and pandas-dev#13463.
Liam3851 added a commit to Liam3851/pandas that referenced this issue Mar 5, 2018
Adds support for IntervalIndex and Interval; fixes Period and
TimedeltaIndex. Closes pandas-dev#19967 and pandas-dev#13463.
@jreback jreback modified the milestones: Next Major Release, 0.23.0 Mar 7, 2018