min_itemsize not working on MultiIndex columns for Series, with format="table" #11412

Closed
toobaz opened this Issue Oct 22, 2015 · 3 comments

@toobaz
Member

toobaz commented Oct 22, 2015

If I do

import pandas as pd

ddf = pd.DataFrame([['a', 'b', 1],
                    ['a', 'b', 2]],
                   columns=['A', 'B', 'C']).set_index(['A', 'B'])

and then

ddf['C'].to_hdf('/tmp/store.hdf', 'test',
          format="table",
          min_itemsize={'A' : 3})

I get the following:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-46-66f05c11146d> in <module>()
      1 ddf['C'].to_hdf('/tmp/store.hdf', 'test',
      2           format="table",
----> 3           min_itemsize={'A' : 3})

/usr/lib/python2.7/dist-packages/pandas/core/generic.pyc in to_hdf(self, path_or_buf, key, **kwargs)
    936 
    937         from pandas.io import pytables
--> 938         return pytables.to_hdf(path_or_buf, key, self, **kwargs)
    939 
    940     def to_msgpack(self, path_or_buf=None, **kwargs):

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in to_hdf(path_or_buf, key, value, mode, complevel, complib, append, **kwargs)
    268         with HDFStore(path_or_buf, mode=mode, complevel=complevel,
    269                        complib=complib) as store:
--> 270             f(store)
    271     else:
    272         f(path_or_buf)

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in <lambda>(store)
    263         f = lambda store: store.append(key, value, **kwargs)
    264     else:
--> 265         f = lambda store: store.put(key, value, **kwargs)
    266 
    267     if isinstance(path_or_buf, string_types):

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in put(self, key, value, format, append, **kwargs)
    825             format = get_option("io.hdf.default_format") or 'fixed'
    826         kwargs = self._validate_format(format, kwargs)
--> 827         self._write_to_group(key, value, append=append, **kwargs)
    828 
    829     def remove(self, key, where=None, start=None, stop=None):

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in _write_to_group(self, key, value, format, index, append, complib, encoding, **kwargs)
   1263 
   1264         # write the object
-> 1265         s.write(obj=value, append=append, complib=complib, **kwargs)
   1266 
   1267         if s.is_table and index:

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in write(self, obj, **kwargs)
   4104         cols.append(name)
   4105         obj.columns = cols
-> 4106         return super(AppendableMultiSeriesTable, self).write(obj=obj, **kwargs)
   4107 
   4108 

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in write(self, obj, data_columns, **kwargs)
   4071             obj.columns = [name]
   4072         return super(AppendableSeriesTable, self).write(
-> 4073             obj=obj, data_columns=obj.columns, **kwargs)
   4074 
   4075     def read(self, columns=None, **kwargs):

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in write(self, obj, axes, append, complib, complevel, fletcher32, min_itemsize, chunksize, expectedrows, dropna, **kwargs)
   3769         self.create_axes(axes=axes, obj=obj, validate=append,
   3770                          min_itemsize=min_itemsize,
-> 3771                          **kwargs)
   3772 
   3773         for a in self.axes:

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in create_axes(self, axes, obj, validate, nan_rep, data_columns, min_itemsize, **kwargs)
   3371             axis, axis_labels = self.non_index_axes[0]
   3372             data_columns = self.validate_data_columns(
-> 3373                 data_columns, min_itemsize)
   3374             if len(data_columns):
   3375                 mgr = block_obj.reindex_axis(

/usr/lib/python2.7/dist-packages/pandas/io/pytables.pyc in validate_data_columns(self, data_columns, min_itemsize)
   3247 
   3248             existing_data_columns = set(data_columns)
-> 3249             data_columns.extend([
   3250                 k for k in min_itemsize.keys()
   3251                 if k != 'values' and k not in existing_data_columns

AttributeError: 'Index' object has no attribute 'extend'

Everything works, however, if I don't specify format="table", if I don't specify min_itemsize, or if I save a DataFrame (ddf[['C']]) rather than a Series.

Tested with up-to-date pandas from git and pytables 3.2.2-1.
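
For what it's worth, the last frames of the traceback suggest the mechanism: the Series writer passes obj.columns (a pandas Index) as data_columns, and validate_data_columns then calls .extend on it, which only lists have. Below is a minimal sketch of just that step, with no HDF file involved. The variable names are stand-ins, and the list conversion at the end is only one assumed way to avoid the error, not necessarily what the eventual fix does.

```python
import pandas as pd

# Stand-ins for the values in the traceback: the Series writer passes
# obj.columns (a pandas Index) as data_columns.
data_columns = pd.Index(["C"])
min_itemsize = {"A": 3}

# An Index has no .extend method, so this reproduces the AttributeError.
message = None
try:
    data_columns.extend([k for k in min_itemsize if k != "values"])
except AttributeError as exc:
    message = str(exc)
print(message)

# Assumed workaround (illustrative only): convert to a plain list first,
# then add the min_itemsize keys that are not already present.
columns_as_list = list(data_columns)
existing = set(columns_as_list)
columns_as_list.extend(
    k for k in min_itemsize if k != "values" and k not in existing
)
print(columns_as_list)
```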

@jreback

Contributor

jreback commented Oct 22, 2015

Duplicate of #11364.

It's a bug; specify 'index' as the key in min_itemsize to make it work.

@jreback jreback closed this Oct 22, 2015

@jreback jreback added the IO HDF5 label Oct 22, 2015

@toobaz

Member

toobaz commented Oct 22, 2015

Sorry for the dupe (and for the ridiculous bug title).

But that said,

ddf['C'].to_hdf('/tmp/store.hdf', 'test',
          format="table",
          min_itemsize={'index' : 3})

still gives exactly the same error.
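
Until this is fixed, the variant reported to work in the original post is to write the one-column DataFrame instead of the Series. A minimal sketch of that, assuming PyTables is installed: write_frame_as_table is a hypothetical helper name, and the call is left commented out because the result has not been re-checked here.

```python
import pandas as pd

ddf = pd.DataFrame([["a", "b", 1],
                    ["a", "b", 2]],
                   columns=["A", "B", "C"]).set_index(["A", "B"])

series = ddf["C"]
# Same data as ddf[["C"]]: a one-column DataFrame that keeps the MultiIndex.
frame = series.to_frame()


def write_frame_as_table(frame, path, key, min_itemsize):
    """Hypothetical helper: store the DataFrame (not the Series) in table format."""
    frame.to_hdf(path, key=key, format="table", min_itemsize=min_itemsize)


# Usage (needs PyTables; not executed here):
# write_frame_as_table(frame, "/tmp/store.hdf", "test", {"A": 3})
```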

@toobaz toobaz changed the title from 'min_itemsize' to 'min_itemsize not working on MultiIndex columns for Series, with format="table"' Oct 22, 2015

@jreback

Contributor

jreback commented Oct 22, 2015

You can post that as an example in the other issue, then. It's the same bug, or a related one.

toobaz added a commit to toobaz/pandas that referenced this issue Nov 24, 2016

toobaz added a commit to toobaz/pandas that referenced this issue Dec 5, 2016

@jreback jreback added this to the 0.19.2 milestone Dec 5, 2016

jreback added a commit that referenced this issue Dec 5, 2016

BUG: Ensure min_itemsize is always a list (#11412)
closes #11412

Author: Pietro Battiston <me@pietrobattiston.it>

Closes #14728 from toobaz/minitemsizefix and squashes the following commits:

e25cd1f [Pietro Battiston] Whatsnew
b9bb88f [Pietro Battiston] Tests for previous commit
6406ee8 [Pietro Battiston] BUG: Ensure min_itemsize is always a list

jorisvandenbossche added a commit that referenced this issue Dec 15, 2016

BUG: Ensure min_itemsize is always a list (#11412)
(cherry picked from commit 53bf1b2)