hierarchical index + frame_table + data_columns=True -> TypeError #4710

Closed
kghose opened this issue Aug 30, 2013 · 6 comments · Fixed by #4716

Comments

@kghose
commented Aug 30, 2013

import pandas as pd, numpy

r = numpy.empty((3,4))
index = pd.MultiIndex.from_tuples([('A','a'), ('A','b'), ('B','a'), ('B','b')])
df = pd.DataFrame(r, columns=index)

store = pd.HDFStore('df.h5')
store.put('data',df) #->OK
store.put('data1',df,table=True) #-> Ok
store.put('data2',df,table=True,data_columns=['A']) #-> Ok
store.put('data3',df,table=True,data_columns=True) #-> raises hell 
#TypeError: not all arguments converted during string formatting

store['data']['A'] #->OK
store['data1']['A'] #-> KeyError: u'no item named A'

In [43]: df = store['data']

In [44]: df
Out[44]: 
   A                            B               
   a             b              a              b
0  2 -1.727234e-77  2.964394e-323   0.000000e+00
1  0  0.000000e+00   0.000000e+00   0.000000e+00
2  0  0.000000e+00   0.000000e+00  8.344027e-309

In [45]: df = store['data1']

In [46]: df
Out[46]: 
   (A, a)        (A, b)         (B, a)         (B, b)
0       2 -1.727234e-77  2.964394e-323   0.000000e+00
1       0  0.000000e+00   0.000000e+00   0.000000e+00
2       0  0.000000e+00   0.000000e+00  8.344027e-309


pd.__version__ -> '0.12.0'
@jreback

Contributor

commented Aug 30, 2013

hmm might be a bug; just specify the data columns that you need in any event

@kghose

Author

commented Aug 30, 2013

Once again, man: an instant reply. Thanks so much! Sorry I had to edit the code several times. I was copying and pasting from my REPL and my editor and made a hash of it. I verified that the code as it stands now replicates the issue.

Also, exploiting your kindness, could you send me a link to how I can use MultiIndex for the select and select_as_multiple usages?

I see your link. Thanks!

Thanks again!!

@jreback

Contributor

commented Aug 30, 2013

yep...this is a bug....it actually wasn't really tested with a column mi

@jreback

Contributor

commented Aug 31, 2013

@kghose

There was a somewhat related bug that the multi-index columns were not being recreated properly. The PR #4716 fixes this.

But, I have disallowed a multi-index when you specify data_columns, because this is quite complicated. If you are interested in solving it, go for it.
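
For anyone who still needs queryable data columns with hierarchical column labels, here is a minimal sketch of one possible workaround. It is an assumption on my part, not something proposed or tested in this thread: flatten the MultiIndex columns into plain strings before storing, then rebuild the MultiIndex after reading. The helper names flatten_columns and restore_columns and the "__" separator are hypothetical, and the sketch assumes every level label is a string that does not contain the separator.

```python
import numpy as np
import pandas as pd


def flatten_columns(df, sep="__"):
    """Return a copy of df with MultiIndex columns joined into flat strings.

    Hypothetical helper: assumes no level label contains `sep`.
    """
    flat = df.copy()
    flat.columns = [sep.join(map(str, col)) for col in df.columns]
    return flat


def restore_columns(flat, sep="__"):
    """Rebuild MultiIndex columns from the flat names made by flatten_columns."""
    restored = flat.copy()
    restored.columns = pd.MultiIndex.from_tuples(
        [tuple(name.split(sep)) for name in flat.columns]
    )
    return restored


# Same column layout as the frame in the report, with zeros instead of
# uninitialized memory so the values are reproducible.
columns = pd.MultiIndex.from_tuples(
    [("A", "a"), ("A", "b"), ("B", "a"), ("B", "b")]
)
df = pd.DataFrame(np.zeros((3, 4)), columns=columns)

flat = flatten_columns(df)         # single-level string column names
roundtrip = restore_columns(flat)  # MultiIndex columns again
```

The flat frame has ordinary single-level column names, so it should be the kind of frame that a table store with data_columns accepts; restore_columns would then be applied to whatever comes back from the store.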

@kghose

Author

commented Aug 31, 2013

Thanks @jreback . I'll pass for now on the mi ;-)
