msgpack unpack dataframe error: InvalidIndexError #9618

Closed
jmavila opened this Issue Mar 9, 2015 · 4 comments

jmavila commented Mar 9, 2015

This happens with some pandas DataFrames that have an index.

from pandas import read_msgpack
msg = df.to_msgpack()
read_msgpack(msg)  # ==> raises InvalidIndexError

Full Traceback (most recent call last):

File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 163, in read_msgpack
return read(fh)
File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 141, in read
l = list(unpack(fh))
File "pandas/msgpack.pyx", line 662, in pandas.msgpack.Unpacker.next (pandas/msgpack.cpp:8303)
File "pandas/msgpack.pyx", line 591, in pandas.msgpack.Unpacker._unpack (pandas/msgpack.cpp:7328)
File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 490, in decode
blocks = [create_block(b) for b in obj['blocks']]
File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/io/packers.py", line 488, in create_block
placement=axes[0].get_indexer(b['items']))
File "/srv/python/venv/local/lib/python2.7/site-packages/pandas/core/index.py", line 1488, in get_indexer
raise InvalidIndexError('Reindexing only valid with uniquely'
InvalidIndexError: Reindexing only valid with uniquely valued Index objects

Contributor

jreback commented Mar 9, 2015

Please show the output of pd.show_versions(), a sample of the DataFrame, df.index, and df.index.is_unique.

jreback added the Msgpack label Mar 9, 2015

jmavila commented Mar 9, 2015

pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.8.final.0
python-bits: 64
OS: Linux
OS-release: 3.8.0-29-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.0

df.index

Int64Index([22, 29, 15, 11, 16, 43, 23, 30, 38, 24, 5, 9, 25, 7, 47, 4, 12, 26, 10, 17, 31, 39, 46, 6, 27, 18, 44, 41, 13, 8, 19, 20, 32, 52, 48, 0, 2, 3, 34, 35, 33, 49, 51, 1, 50], dtype='int64')

df.index.is_unique

result_df.index.is_unique == True

Sample dataframe

,d-37_id,d-37,m5010,m5010
22,29,201202,24330.47,24330.47
29,30,201203,56512.66,56512.66
15,31,201204,40812.07,40812.07
11,32,201205,115167.1,115167.1
16,33,201206,65513.84,65513.84
43,34,201207,20229.75,20229.75
23,35,201208,47008.03,47008.03
30,36,201209,69365.29,69365.29
38,37,201210,23771.42,23771.42
24,38,201211,61179.85,61179.85
5,39,201212,35658.55,35658.55
9,16,201301,162725.08,162725.08
25,17,201302,30242.13,30242.13
7,18,201303,100384.58,100384.58
47,19,201304,90338.54,90338.54
4,20,201305,75657.68,75657.68
12,21,201306,187771.41,187771.41
26,22,201307,70994.8,70994.8
10,23,201308,39255.37,39255.37
17,24,201309,232711.12,232711.12
31,25,201310,65818.43,65818.43
39,26,201311,184484.82,184484.82
46,27,201312,113374.1,113374.1
6,15,201401,129522.36,129522.36
27,40,201402,111226.45,111226.45
18,41,201403,64987.46,64987.46
44,42,201404,201649.92,201649.92
41,43,201405,44253.82,44253.82
13,44,201406,177624.59,177624.59
8,45,201407,51742.21,51742.21
19,62,201408,62129.06,62129.06
20,63,201409,196536.66,196536.66
32,64,201410,83913.08,83913.08
52,65,201411,6765.87,6765.87
48,66,201412,4585.0,4585.0
0,53,201501,282906.59,282906.59
2,54,201502,243472.38,243472.38
3,55,201503,150515.79,150515.79
34,56,201504,262266.21,262266.21
35,57,201505,57971.76,57971.76
33,58,201506,26562.02,26562.02
49,59,201507,53539.97,53539.97
51,70,201512,4231.5,4231.5
1,49,201601,120828.2,120828.2
50,71,201701,10448.92,10448.92
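A note on the sample above: its header lists `m5010` twice. The sketch below is an assumption-laden illustration, not part of the original report. It loads the first two rows both through `read_csv` and through a direct `DataFrame` constructor to show that the two routes can yield different column labels, which matters when trying to reproduce the error from a CSV dump. The variable names `csv_text`, `from_csv`, and `direct` are made up for the example.

```python
import io

import pandas as pd

# First two rows of the sample; the header repeats the label "m5010".
csv_text = """,d-37_id,d-37,m5010,m5010
22,29,201202,24330.47,24330.47
29,30,201203,56512.66,56512.66
"""

# read_csv renames repeated header labels by default, so the columns end up unique.
from_csv = pd.read_csv(io.StringIO(csv_text), index_col=0)
csv_columns = from_csv.columns.tolist()

# Building the frame directly keeps the duplicate label as given.
direct = pd.DataFrame(
    from_csv.values,
    index=from_csv.index,
    columns=["d-37_id", "d-37", "m5010", "m5010"],
)

print(csv_columns)
print(from_csv.columns.is_unique, direct.columns.is_unique)
```

If the original frame was built in memory rather than read from CSV, checking `df.columns.is_unique` in addition to `df.index.is_unique` may be the more telling diagnostic.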

Contributor

jreback commented Mar 9, 2015

This works in 0.15.2

You can try upgrading (I recall this bug being fixed, but can't find the fix at the moment), or show an example that fails.

In [15]: pd.read_msgpack(df.to_msgpack())
Out[15]: 
    d-37_id    d-37      m5010    m5010.1
22       29  201202   24330.47   24330.47
29       30  201203   56512.66   56512.66

Contributor

filmor commented Mar 13, 2015

This does not seem to be fixed and happens when the column index is not unique:

pd.read_msgpack(pd.DataFrame(columns=["a", "a"]).to_msgpack())

I thought I had reported this error, will try to find it.
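To make the failure mode concrete without relying on the msgpack reader itself, here is a minimal sketch that calls `Index.get_indexer` (the method named at the bottom of the traceback) on a column index with a repeated label. The one-row frame and the variable names are assumptions chosen for illustration; the exception is caught generically because the import location of `InvalidIndexError` has varied across pandas versions.

```python
import pandas as pd

# Hypothetical one-row frame whose column labels repeat "m5010".
df = pd.DataFrame(
    [[29, 201202, 24330.47, 24330.47]],
    index=[22],
    columns=["d-37_id", "d-37", "m5010", "m5010"],
)

# The row index is unique; the column index is not.
row_index_unique = df.index.is_unique
column_index_unique = df.columns.is_unique
duplicated_labels = df.columns[df.columns.duplicated()].tolist()

# get_indexer requires a uniquely valued Index, so this lookup is expected to raise.
try:
    df.columns.get_indexer(["m5010"])
    error_name = None
except Exception as exc:  # caught broadly; see the note above
    error_name = type(exc).__name__

print(row_index_unique, column_index_unique, duplicated_labels, error_name)
```

Until a fixed release is available, one possible workaround is to give the columns unique labels before serializing and restore the original names after reading.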

jreback added the Bug label Mar 13, 2015

jreback added this to the 0.16.1 milestone Mar 13, 2015

jreback modified the milestone: 0.17.0, 0.16.1 Apr 28, 2015

jreback closed this in #10527 Jul 18, 2015
