HDFStore silently coerces number to string in 'where' clause but not in put #512

bshanks opened this issue Dec 20, 2011 · 1 comment


commented Dec 20, 2011

Thanks for pandas! Once written, tables with numerical values in the index cannot be selected. Either the values should be coerced to strings in store.put, or they should not be coerced in the 'where' clause of store.select:

from pandas import DataFrame, HDFStore

store = HDFStore('test.h5')
store.put('test', DataFrame([0, 1, 2], [10, 11, 12], ['col1']), table=True)
store.select('test', where=[{'field': 'index', 'op': '>=', 'value': 11}])

NotImplementedError                       Traceback (most recent call last)

/home/bshanks/prog/atr/<ipython console> in <module>()

/usr/local/lib/python2.7/dist-packages/pandas/io/pytables.pyc in select(self, key, where)
    235             raise Exception('can only select on objects written as tables')
    236         if group is not None:
--> 237             return self._read_group(group, where)
    239     def put(self, key, value, table=False, append=False,

/usr/local/lib/python2.7/dist-packages/pandas/io/pytables.pyc in _read_group(self, group, where)
    619         kind = _LEGACY_MAP.get(kind, kind)
    620         handler = self._get_handler(op='read', kind=kind)
--> 621         return handler(group, where)
    623     def _read_series(self, group, where=None):

/usr/local/lib/python2.7/dist-packages/pandas/io/pytables.pyc in _read_frame_table(self, group, where)
    647     def _read_frame_table(self, group, where=None):
--> 648         return self._read_panel_table(group, where)['value']
    650     def _read_panel_table(self, group, where=None):

/usr/local/lib/python2.7/dist-packages/pandas/io/pytables.pyc in _read_panel_table(self, group, where)
    655         # create the selection
    656         sel = Selection(table, where)
--> 657
    658         fields = table._v_attrs.fields

/usr/local/lib/python2.7/dist-packages/pandas/io/pytables.pyc in select(self)
    861         """
    862         if self.the_condition:
--> 863             self.values = self.table.readWhere(self.the_condition)
    865         else:

/usr/lib/python2.7/dist-packages/tables/table.pyc in readWhere(self, condition, condvars, field, start, stop, step)
   1272         coords = [ p.nrow for p in
-> 1273                    self._where(condition, condvars, start, stop, step) ]
   1274         self._whereCondition = None  # reset the conditions
   1275         return self.readCoordinates(coords, field)

/usr/lib/python2.7/dist-packages/tables/table.pyc in _where(self, condition, condvars, start, stop, step)
   1225         # Compile the condition and extract usable index conditions.
   1226         condvars = self._requiredExprVars(condition, condvars, depth=3)
-> 1227         compiled = self._compileCondition(condition, condvars)
   1229         # Can we use indexes?

/usr/lib/python2.7/dist-packages/tables/table.pyc in _compileCondition(self, condition, condvars)
   1101         indexedcols = frozenset(indexedcols)
   1102         # Now let ``compile_condition()`` do the Numexpr-related job.
-> 1103         compiled = compile_condition(condition, typemap, indexedcols, copycols)
   1105         # Check that there actually are columns in the condition.

/usr/lib/python2.7/dist-packages/tables/conditions.pyc in compile_condition(condition, typemap, indexedcols, copycols)
    154     except NotImplementedError, nie:
    155         # Try to make this Numexpr error less cryptic.
--> 156         raise _unsupported_operation_error(nie)
    157     params = varnames

NotImplementedError: unsupported operand types for *ge*: long, str
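
The final error line comes from a numeric column being compared against a quoted value. The sketch below is only an illustration of the second suggestion above, not pandas' actual Selection code: format_term is a hypothetical helper, invented here, that renders a where-term dict as a condition string and quotes only values that are already strings, so a number stays a number in the comparison.

```python
def format_term(term):
    """Render a where-term dict as a condition string (hypothetical helper)."""
    value = term['value']
    if isinstance(value, str):
        # Real strings are quoted so they compare as strings.
        rendered = repr(value)
    else:
        # Numbers are left unquoted so they compare as numbers.
        rendered = str(value)
    return '(%s %s %s)' % (term['field'], term['op'], rendered)


term = {'field': 'index', 'op': '>=', 'value': 11}
print(format_term(term))
```

With this kind of formatting the value 11 would reach the condition compiler as a number rather than as the string '11'.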

commented Jan 27, 2012

This is actually unsupported in the current implementation of table read/write. The table definition requires the index to be a Time64Col, so you must currently pass a datetime object, which is converted via mktime(value.timetuple()) to the value required for comparison. I believe your example doesn't work because the passed value is not a datetime, so it generates a string comparison on the 'index' column, and that produces the error.
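
The conversion mentioned above can be sketched with the standard library alone. This is a minimal illustration, not the pandas source: to_index_value is a hypothetical name, and the where list simply reuses the dict format from the original report with a datetime as the value.

```python
import time
from datetime import datetime


def to_index_value(value):
    """Convert a datetime to seconds since the epoch, interpreted as local time."""
    if not isinstance(value, datetime):
        raise TypeError('expected a datetime, got %s' % type(value).__name__)
    # Same conversion the comment describes: mktime(value.timetuple()).
    return time.mktime(value.timetuple())


cutoff = datetime(2011, 12, 20, 12, 0)
stamp = to_index_value(cutoff)

# A where term in the format used in the report, with a datetime value.
where = [{'field': 'index', 'op': '>=', 'value': cutoff}]
```

Note that mktime works in local time and drops microseconds, so the resulting number depends on the machine's time zone.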

It would be possible to add another table format with, say, a float index. Figuring out how existing data was written is easy via the pandas attribute recorded in the table, but specifying which table format you want would require an option passed via append.

I tend to use columns when I have a panel of the form items x time x tickers, which lends itself to the current format.
