Cannot append DataFrames with uint dtypes to HDFStore #3493

Closed
jmellen opened this issue Apr 30, 2013 · 4 comments
Labels: Bug, IO Data

Comments

@jmellen
Contributor

jmellen commented Apr 30, 2013

Trying to store DataFrames with unsigned integer dtypes in HDFStore fails due to a bug in the get_atom_data methods of the DataCol and DataIndexableCol classes in pandas.io.pytables. These methods use Python's capitalize method to map dtypes to PyTables column types, and fail because PyTables' unsigned int classes start with two capital letters (e.g., UInt32Col). This error occurs on 0.11 and the current master:

import pandas as pd
import numpy as np

# randint's upper bound is exclusive, so 256 yields values in 0..255
uint8_series = pd.Series(np.random.randint(0, 256, size=5), dtype=np.uint8)
udf = pd.DataFrame({'u08': uint8_series}, index=np.arange(5))

store = pd.HDFStore('uint.h5')

# this invocation will throw an error in pandas 0.11 and current master
store.append('uints', udf)

I have a commit + test on my fork that fixes this, just needed to submit the bug here first to have the right commit message on my fork, per contribution standards. First time committer.
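The naming mismatch described in the report can be seen without PyTables installed. The following is only a minimal sketch: `naive_col_name` is a hypothetical stand-in for the capitalize-based lookup, not the actual pandas code, and the set holds a few real PyTables column class names for comparison.

```python
# A few real PyTables column class names, listed by hand for comparison.
PYTABLES_COLS = {"Int8Col", "Int32Col", "UInt8Col", "UInt32Col"}


def naive_col_name(kind):
    # Hypothetical stand-in for a capitalize-based lookup:
    # str.capitalize() upper-cases only the first character.
    return "%sCol" % kind.capitalize()


for kind in ("int32", "uint32"):
    name = naive_col_name(kind)
    status = "found" if name in PYTABLES_COLS else "missing"
    print(kind, "->", name, "(%s)" % status)
```

The signed kind resolves to an existing class name, while the unsigned kind produces `Uint32Col`, which does not match `UInt32Col`.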

@jreback
Contributor

jreback commented Apr 30, 2013

Good catch. I should have used a mapping table rather than capitalize, for just this reason (after all, there are not that many types), and it could provide a more informative message as well. Put up your PR and we'll take a look (also enable Travis; see CONTRIBUTION in the main dir for how).
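A mapping table along these lines might look like the sketch below. It is an assumption about the shape of such a table, not the code in pandas: `KIND_TO_COL` and `col_name_for` are hypothetical names, the lowercase keys are assumed, and only a subset of types is listed. The values are real PyTables column class names.

```python
# Hypothetical explicit mapping from a lowercase kind string
# to the name of the matching PyTables column class.
KIND_TO_COL = {
    "int8": "Int8Col",
    "int16": "Int16Col",
    "int32": "Int32Col",
    "int64": "Int64Col",
    "uint8": "UInt8Col",
    "uint16": "UInt16Col",
    "uint32": "UInt32Col",
    "uint64": "UInt64Col",
    "float32": "Float32Col",
    "float64": "Float64Col",
    "bool": "BoolCol",
}


def col_name_for(kind):
    # An explicit lookup allows a clear error for unsupported kinds.
    try:
        return KIND_TO_COL[kind]
    except KeyError:
        raise TypeError("no PyTables column type known for kind %r" % kind)


print(col_name_for("uint8"))
```

An unknown kind then fails with a message that names the offending value rather than a missing attribute.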

@jmellen
Contributor Author

jmellen commented Apr 30, 2013

Travis is running right now; the fix just checks for startswith('uint') but a mapping table would also work. I didn't have enough confidence in what values self.kind could have (is it guaranteed to be lowercase?) to make a mapping.

Will send the PR when Travis finishes.
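For illustration, a prefix check of the kind mentioned above could be sketched as follows. This is not the actual commit: `col_name` is a hypothetical helper, and it assumes the kind string is lowercase, which is exactly the uncertainty raised in this comment.

```python
def col_name(kind):
    # Unsigned kinds need two leading capitals (e.g. UInt8Col),
    # which str.capitalize() cannot produce.
    if kind.startswith("uint"):
        return "UInt%sCol" % kind[4:]
    # Other kinds (int32, float64, ...) only need the first letter raised.
    return "%sCol" % kind.capitalize()


for kind in ("uint8", "int64", "float64"):
    print(kind, "->", col_name(kind))
```

The check is small, but it relies on the input casing; a kind such as `UINT8` would fall through to the capitalize branch.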

@jreback
Contributor

jreback commented Apr 30, 2013

Your PR is good (it just needs a release notes mention). If it can't find the dtype it'll raise anyhow (as you discovered).

Also, if you are interested, please check out #2391.

@jreback
Contributor

jreback commented May 1, 2013

closed by #3493

@jreback jreback closed this as completed May 1, 2013