VLEN string problem on Win64 #593

Closed
andrewcollette opened this Issue Jun 30, 2015 · 18 comments

Member

andrewcollette commented Jun 30, 2015

From this thread:

https://groups.google.com/forum/#!topic/h5py/nF_EkdOpCd8

Evidently, storing a variable-length string causes a segmentation fault with h5py 2.5.0 on Win64 (conda).

rossant commented Jul 3, 2015

It seems like I have exactly the same problem with booleans. Fixed by casting to an int.

groutr commented Jul 7, 2015

I'm pretty sure this is related.
PyTables/PyTables#471

cpcloud commented Jul 7, 2015

This isn't limited to variable length string dtypes:

In [1]: import numpy as np

In [2]: import h5py

In [3]: f = h5py.File('foo.h5')

In [4]: ds = f.create_dataset('/x', shape=(20, 24), dtype=np.dtype([('a', 'i4'), ('b', 'i4')]), chunks=(4, 6))

In [5]: ds
Out[5]:  # crashes the interpreter

Contributor

cgohlke commented Jul 8, 2015

I can't reproduce these crashes with the builds from http://www.lfd.uci.edu/~gohlke/pythonlibs/#h5py

>>> import numpy as np
>>> import h5py
>>> f = h5py.File('foo.h5')
>>> ds = f.create_dataset('/x', shape=(20, 24), dtype=np.dtype([('a', 'i4'), ('b', 'i4')]), chunks=(4, 6))
>>> ds
<HDF5 dataset "x": shape (20, 24), type "|V8">

groutr commented Jul 10, 2015

I can confirm that Python 3.4 on Windows 32bit does not seem to be affected by this issue.

rossant commented Jul 10, 2015

Is this a conda-only problem?

groutr commented Jul 10, 2015

I don't know if it is a conda-only problem. As far as I can tell, we are compiling hdf5 and h5py correctly.
From what I'm seeing, the difference is that the conda packages link against the latest hdf5 library (1.8.15patch1). Is there a change between 1.8.14 and 1.8.15patch1 that might have affected h5py?

When using hdf5 1.8.15patch1 and h5py 2.5.0, the interpreter crashes while running the test suite. When using hdf5 1.8.14 and h5py 2.5.0, the test suite passes.
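For anyone comparing a crashing install with a working one, here is a minimal sketch (not from the thread) that collects the details being discussed: the interpreter's pointer width and the h5py and HDF5 versions in use. The helper name `describe_environment` is hypothetical. `h5py.version.version` and `h5py.version.hdf5_version` are existing h5py attributes. The import is guarded so the snippet also runs where h5py is not installed.

```python
import platform
import struct


def describe_environment():
    """Collect version details worth attaching to a crash report."""
    info = {
        "python": platform.python_version(),
        "platform": platform.platform(),
        # 64 on a 64-bit interpreter, 32 on a 32-bit one
        "pointer_bits": struct.calcsize("P") * 8,
    }
    try:
        import h5py
    except ImportError:
        # h5py is not installed in this environment
        info["h5py"] = None
        info["hdf5"] = None
    else:
        info["h5py"] = h5py.version.version
        # Version of the HDF5 library that h5py was built against
        info["hdf5"] = h5py.version.hdf5_version
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this in both the conda environment and a working one (for example WinPython) should make any difference in the linked HDF5 version visible at a glance.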

Contributor

cgohlke commented Jul 11, 2015

I rebuilt against HDF5 1.8.15-patch1: No crashes.

groutr commented Jul 11, 2015

Hmm... Maybe it is just affecting the conda builds. I'll look over things again today.

groutr commented Jul 31, 2015

I'm wondering if the threadsafe option in HDF5 has anything to do with this. @cgohlke, are you compiling your HDF5 Windows binaries with HDF5_ENABLE_THREADSAFE?

Contributor

cgohlke commented Jul 31, 2015

HDF5_ENABLE_THREADSAFE? No, not for h5py and pytables.

subhacom commented Aug 22, 2015

I am facing what seems to be the same issue, but when reading a dataset containing strings. I cloned master and used setuptools to install on both conda and Cygwin. It occurs only with conda, not with Cygwin, on the same (Windows 7) system.

jdreaver commented Sep 10, 2015

I am experiencing the same issue. If I set an attribute using a string, I get a segfault. It is fixed if I wrap the string in an array:

group.attrs["key"] = np.array(["string"], dtype=np.string_)

However, I think we can agree that this is a bug and this workaround is a band-aid. I am also using the latest h5py and hdf5 packages from conda on Windows 64 bit.

jdreaver commented Sep 23, 2015

I'll add that I'm also getting a segfault with boolean attributes. (My workaround is to just store them as ints and then restore to bool when loading.)
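A minimal sketch of the store-as-int workaround described above, using a plain dict as a stand-in for `group.attrs`. The helper names `encode_attrs` and `decode_attrs` are hypothetical, and how the list of boolean keys gets persisted is left to the caller. Only built-in `bool` values are converted. NumPy booleans would need an extra check.

```python
def encode_attrs(values):
    """Replace bool values with ints and record which keys were bools."""
    encoded = {}
    bool_keys = []
    for key, value in values.items():
        if isinstance(value, bool):
            encoded[key] = int(value)  # True -> 1, False -> 0
            bool_keys.append(key)
        else:
            encoded[key] = value
    return encoded, sorted(bool_keys)


def decode_attrs(stored, bool_keys):
    """Restore the recorded keys to bool; leave every other value unchanged."""
    return {
        key: bool(value) if key in bool_keys else value
        for key, value in stored.items()
    }


if __name__ == "__main__":
    encoded, bool_keys = encode_attrs({"enabled": True, "count": 3})
    print(encoded, bool_keys)
    print(decode_attrs(encoded, bool_keys))
```

With a real file, the encoded values would be assigned to `group.attrs` one key at a time, and `decode_attrs` would be applied to the values read back.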

eegroopm commented Sep 24, 2015

I can confirm the issue with arrays of variable-length strings. jdreaver's workaround also seems to work, albeit only with dtype=np.string_. Trying to use h5py's special dtypes, e.g., dt = h5py.special_dtype(vlen=str), from the documentation still ends with a segfault.

This DOES appear to be a conda issue. No such troubles using WinPython (Python 3.4, 64-bit).

Rebuilding h5py with pip from cgohlke's page http://www.lfd.uci.edu/~gohlke/pythonlibs/#h5py fixes the issue, too, in the conda distro.

jpiersol commented Nov 5, 2015

This seems to be fixed in the latest Win64 Anaconda release (Anaconda 2.4.0/Python 3.5).

jdreaver commented Nov 5, 2015

Thanks for the heads up @jpiersol!

TheQuantumPhysicist commented Feb 9, 2017

We are facing exactly the same symptoms with Python 3.5 64-bit and h5py 2.6. A create_dataset command without a list wrapping the dataset name crashes the interpreter. Also we tried h5py.run_tests() and it works fine.
