Missing table properties #6

Closed
andrewcollette opened this issue Dec 27, 2012 · 4 comments

Comments

@andrewcollette

Original author: bioinformed@gmail.com (November 18, 2008 21:01:06)

It would be wonderful if tables published maxsize and the various filter
flags as properties. In addition, a property called 'extendable' as an
alias for 'None in self.maxsize' would be a useful helper.

Original issue: http://code.google.com/p/h5py/issues/detail?id=6

@andrewcollette

From andrew.c...@gmail.com on November 18, 2008 21:38:39
These properties exist in the current svn version; they'll be part of the upcoming
1.0 release. One issue with an "extendable" flag may be that not every axis is
guaranteed to be extendable; if you declare the maxshape as (10,None,10), then only
the second axis can be extended. I'll see if there's a way to deal with this.
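
The per-axis point can be sketched in plain Python. This is only an illustration: `extendable_axes` is a hypothetical helper name, not part of h5py, and it assumes nothing beyond the convention stated in this thread that `None` in the maximum shape marks an unlimited axis.

```python
def extendable_axes(maxshape):
    """Return the indices of axes whose maximum size is unlimited (None)."""
    return [axis for axis, limit in enumerate(maxshape) if limit is None]


maxshape = (10, None, 10)
# The single boolean flag proposed in the original request.
print(None in maxshape)
# The per-axis view: which axes can actually grow.
print(extendable_axes(maxshape))
```

A boolean flag answers "can anything grow?", while the list answers "what can grow?", which is the information a caller needs before resizing.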

@andrewcollette

From bioinformed@gmail.com on November 19, 2008 13:48:47
Thanks! Regarding 'extendable', a developer will need to know the axes and
chunksizes to extend, so the 'None in t.maxsize' is about as good as we can get as a
single boolean flag. I could foresee uses like:

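    def extend(t, newchunks):
        # Grow table t by a whole number of chunks along each axis.
        rank = len(t.shape)
        if isinstance(newchunks, int):
            # A single int means "the same number of chunks on every axis".
            newchunks = [newchunks] * rank
        if len(newchunks) != rank:
            raise ValueError('Invalid rank')
        newsize = []
        for nchunks, n, msize, csize in zip(newchunks, t.shape, t.maxsize, t.chunksize):
            # A maxsize of None means unlimited; any other value is an upper bound.
            if msize is not None and n + nchunks * csize > msize:
                raise ValueError('Cannot resize axis')
            newsize.append(n + nchunks * csize)
        t.resize(tuple(newsize))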

I'd rename the current 'extend' to 'resize', since it takes an absolute size, and
rework 'extend' to do as above.

@andrewcollette

From andrew.c...@gmail.com on November 19, 2008 15:39:22
It turns out you don't need to make the dataset size an exact multiple of the chunk
size; although HDF5 will allocate storage space in chunks, you can specify any shape
you want. Dataset chunking is supposed to be transparent; I'm a little suspicious of
any method that wants you to do chunk arithmetic.

Extend() has been deprecated because it no longer hooks into the HDF5 H5Dextend()
function, which actually did something slightly different (guaranteed a minimum
size). H5Dextend has coincidentally been deprecated in HDF5 1.8.

If you want to grow an existing array by a fixed amount you can do something like:

    dset.resize([x+y for x, y in zip(dset.shape, extension_sequence)])

or for a single axis

    dset.resize(dset.shape[ax]+amount, axis=ax)
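
For reference, the grow-by-an-amount idea can be wrapped in a small function that checks the maximum shape before resizing. This is a minimal sketch under stated assumptions: `grow` and `FakeDataset` are hypothetical names introduced here, and `FakeDataset` is only a stand-in exposing `shape`, `maxshape` and `resize()` so the example runs without an HDF5 file.

```python
class FakeDataset:
    """Hypothetical stand-in exposing shape, maxshape and resize()."""

    def __init__(self, shape, maxshape):
        self.shape = tuple(shape)
        self.maxshape = tuple(maxshape)

    def resize(self, size):
        self.shape = tuple(size)


def grow(dset, extension):
    """Grow dset by extension[i] elements along axis i, checking maxshape first."""
    if len(extension) != len(dset.shape):
        raise ValueError("extension must have one entry per axis")
    new_shape = [n + extra for n, extra in zip(dset.shape, extension)]
    for axis, (size, limit) in enumerate(zip(new_shape, dset.maxshape)):
        # None means unlimited; any other limit is a hard upper bound.
        if limit is not None and size > limit:
            raise ValueError("axis %d cannot grow past %d" % (axis, limit))
    dset.resize(new_shape)
    return dset.shape


dset = FakeDataset(shape=(10, 5, 10), maxshape=(10, None, 10))
# Grow only the unlimited axis; no chunk arithmetic is involved.
print(grow(dset, (0, 7, 0)))
```

The function never looks at the chunk size, in line with the point that chunking is meant to be transparent to the caller.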

@andrewcollette

From bioinformed@gmail.com on November 19, 2008 16:09:37
That simplifies everything. Never mind-- a plain resize works fine for me now that I
know that the chunking is a back-end detail (to my great relief). I'd rather not
have to worry about growing by fixed amounts and simply resize arbitrarily as needed,
provided I don't change fixed dimensions.
