Nested compound types TypeError #1197

Open
ckoerber opened this issue Apr 1, 2019 · 5 comments

@ckoerber ckoerber commented Apr 1, 2019

First of all, thank you for helping with the h5py package.

I believe I have encountered a bug related to nested compound data types. Below you can find a minimal example.

import h5py
import numpy as np

# create compound data
d = np.dtype((('i', 2), 3))
a = np.arange(2*3).reshape((3,2))
data = np.array([tuple(a)], dtype=d)
# data.shape = (1, 3, 2)

# create dataset (explicit mode, since the default differs between h5py versions)
with h5py.File("test.h5", "w") as fle:
    dset = fle.create_dataset(
        "test",
        shape=[1,],
        dtype=d,
    )
    # dset.dtype = dtype((('<i4', (2,)), (3,)))
    dset[...] = data  # raises TypeError

The last line raises the error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-23-14e95409766a> in <module>()
     13     )
     14     # dset.dtype = dtype((('<i4', (2,)), (3,)))
---> 15     dset[...] = data

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

/usr/local/lib/python3.7/site-packages/h5py/_hl/dataset.py in __setitem__(self, args, val)
    613                 raise TypeError(
    614                     "When writing to array types, last N dimensions have to match (got %s, but should be %s)"
--> 615                     % (valshp, shp)
    616                 )
    617             mtype = h5t.py_create(numpy.dtype((val.dtype, shp)))

TypeError: When writing to array types, last N dimensions have to match (got (2,), but should be (3,))

Transposing the format d or the array a does not help.

Instead, if I create the dataset with a different type, the write works without problems:

dset = fle.create_dataset(
    "test",
    shape=[1,],
    dtype=np.dtype(('i', (3, 2))),
)
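
For reference, a self-contained sketch of this workaround. It assumes the flat `('i', (3, 2))` dtype is an acceptable stand-in for the nested one. The file name `flat_demo.h5` and the in-memory `core` driver are illustrative choices so that nothing is written to disk; they are not part of the original report.

```python
import h5py
import numpy as np

# Flat array dtype: a single (3, 2) array of ints per element,
# instead of "3 elements, each an array of 2 ints".
flat = np.dtype(("i", (3, 2)))

# Plain integer array with the same layout as `data` in the report.
data = np.arange(6, dtype="i").reshape(1, 3, 2)

# In-memory file (assumption for this sketch): nothing is saved to disk.
with h5py.File("flat_demo.h5", "w", driver="core", backing_store=False) as fle:
    dset = fle.create_dataset("test", shape=(1,), dtype=flat)
    dset[...] = data       # trailing dims (3, 2) match the dtype's shape
    roundtrip = dset[...]  # read back as a plain integer array
```

The element layout in memory is the same as for the nested dtype; only the type description stored in the file differs.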

My version info is as follows:

Summary of the h5py configuration
---------------------------------

h5py    2.9.0
HDF5    1.10.4
Python  3.7.2 (default, Jan 13 2019, 12:50:01) 
[Clang 10.0.0 (clang-1000.11.45.5)]
sys.platform    darwin
sys.maxsize     9223372036854775807
numpy   1.16.2
@tacaswell tacaswell added this to the 2.9.1 milestone Apr 1, 2019
@takluyver takluyver removed this from the 2.9.1 milestone Jun 3, 2019
@takluyver takluyver added this to the 2.10 milestone Jun 3, 2019

@takluyver takluyver commented Jun 5, 2019

Does anyone want to investigate this? Otherwise I'll drop the milestone soon. I'm not really familiar with regular compound types, let alone nested ones, so I'm not intending to work on this myself.


@aragilar aragilar commented Jun 14, 2019

Looking at where it's failing, https://github.com/h5py/h5py/blob/master/h5py/_hl/dataset.py#L646 looks to be assuming that there is only one level of compound types. We could document that nested compound types are currently not supported, and that we'd accept a PR to support them. There are probably changes needed beyond that line, so I don't think it's an easy fix.


@takluyver takluyver commented Jun 14, 2019

Thanks @aragilar - let's do that for 2.10, and leave this issue open without a specific milestone.

@takluyver takluyver removed this from the 2.10 milestone Jun 14, 2019

@scopatz scopatz commented Jun 17, 2019

Wow, hard to believe that this is missing. It definitely shouldn't be scheduled for v2.10 if no one is interested in working on it.


@takluyver takluyver commented Oct 30, 2020

@aragilar it's not obvious to me which line you were pointing to. Looking at the Git history, when you wrote your comment last year, master should have been like this:

valshp = val.shape[-len(shp):]

But that line appears to be about array dtypes, rather than compound ones.
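
To make the mismatch in the traceback concrete, here is a NumPy-only sketch. It assumes the check compares the trailing dimensions of the value against `dtype.subdtype[1]`, which is consistent with the quoted line and with the `(3,)` in the error message, but it is not a copy of the h5py source. `full_array_shape` is a hypothetical helper, not an h5py or NumPy function.

```python
import numpy as np

# Nested array dtype from the report: 3 elements, each an array of 2 ints.
d = np.dtype((("i", 2), 3))

# Shape of the outermost array level only; the traceback reports this as (3,).
outer_shape = d.subdtype[1]


def full_array_shape(dt):
    """Hypothetical helper: unwrap nested array dtypes into (base, shape)."""
    shape = ()
    while dt.subdtype is not None:
        dt, sub = dt.subdtype
        shape += sub
    return dt, shape


base, shape = full_array_shape(d)

# Plain integer array with the same layout as `data` in the report.
val = np.arange(6, dtype=base).reshape(1, 3, 2)

# Comparing against one level only looks at too few trailing dimensions.
one_level = val.shape[-len(outer_shape):]
# Comparing against the fully unwrapped shape looks at all of them.
all_levels = val.shape[-len(shape):]

print("one level:  ", one_level, "vs", outer_shape)
print("all levels: ", all_levels, "vs", shape)
```

Under this reading, the line is about array dtypes, and the failure comes from an array dtype whose base is itself an array dtype. Unwrapping every level before comparing is one possible direction, though other parts of the write path would likely need the same treatment.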
