Cannot access project data through signac interface #274

Closed
klywang opened this issue Jan 22, 2020 · 4 comments

@klywang
Contributor

klywang commented Jan 22, 2020

Description

Accessing arrays from project.data through signac raises a ValueError. The same arrays can be read with h5py, so I believe the data was written correctly. The error does not occur when accessing arrays through job.data, or when accessing scalars and strings from project.data.

To reproduce

In a project directory:

$ signac shell
>>> import numpy as np
>>> import h5py
>>> pr.data['a'] = np.zeros([10])
>>> with pr.data:
...     pr.data.a[:]
... 

This will return a ValueError.

Accessing the same arrays with h5py returns the correct values:

>>> f = h5py.File('signac_data.h5', mode='r')
>>> f['a']
<HDF5 dataset "a": shape (10,), type "<f8">
>>> f['a'][:]
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
>>> f.close()

Error output

Traceback (most recent call last):
  File "<console>", line 2, in <module>
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/Users/kelwang/miniconda3/envs/ENV/lib/python3.8/site-packages/h5py/_hl/dataset.py", line 489, in __getitem__
    if is_empty_dataspace(self.id):
  File "/Users/kelwang/miniconda3/envs/ENV/lib/python3.8/site-packages/h5py/_hl/base.py", line 87, in is_empty_dataspace
    if obj.get_space().get_simple_extent_type() == h5s.NULL:
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 289, in h5py.h5d.DatasetID.get_space
ValueError: Not a dataset (not a dataset)

System configuration

  • Operating System: macOS and Linux (Debian)
  • Version of Python: 3.8.1
  • Version of signac: 1.3.0

@klywang klywang added the bug Something isn't working label Jan 22, 2020
@bdice
Member

bdice commented Jan 22, 2020

I can reproduce this on my machine. I tried copying the signac_data.h5 file to a job directory and it seems like that worked fine. I am guessing that there is something different in the implementation of project.data vs. job.data that causes this issue.

@bdice
Member

bdice commented Jan 22, 2020

I think I solved the problem. The underlying issue appears to be that project.stores returns a new instance of H5StoreManager on every access: once in the first line (with pr.data:) and again in the second line (pr.data.a[:]).

The Job class keeps a single H5StoreManager in its private job._stores variable, but the Project class returns a new instance each time the project.stores attribute is called (in this case, it's called internally by project.data).

@property
def stores(self):
    """Access HDF5-stores associated with this project.

    Use this property to access an HDF5 file within the project's root
    directory using the H5Store dict-like interface.

    This is an example for accessing an HDF5 file called ``'my_data.h5'``
    within the project's root directory:

    .. code-block:: python

        project.stores['my_data']['array'] = np.random((32, 4))

    This is equivalent to:

    .. code-block:: python

        H5Store(project.fn('my_data.h5'))['array'] = np.random((32, 4))

    Both the `project.stores` and the `H5Store` itself support attribute
    access. The above example could therefore also be expressed as:

    .. code-block:: python

        project.stores.my_data.array = np.random((32, 4))

    :return: The HDF5-Store manager for this project.
    :rtype: :class:`~..core.h5store.H5StoreManager`
    """
    return H5StoreManager(self._rd)
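
To illustrate why a fresh manager per property access could produce this behavior, here is a minimal toy sketch in plain Python. It does not use signac or h5py: ToyStore, FreshProject, CachedProject, and read_inside_context are hypothetical names invented for this example, and the real H5Store fails in a different way (h5py reports "Not a dataset"). The sketch only shows the general pattern, in which the object opened by the with statement is not the object that is later read.

```python
class ToyStore:
    """Hypothetical stand-in for a file-backed store; it only tracks open/closed state."""

    def __init__(self):
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.is_open = False
        return False  # never swallow exceptions

    def read(self):
        # Reading is only valid while this particular instance is open.
        if not self.is_open:
            raise ValueError("store is not open")
        return [0.0] * 10


class FreshProject:
    """Mimics the reported pattern: the property builds a new object on every access."""

    @property
    def data(self):
        return ToyStore()


class CachedProject:
    """Mimics the pattern described for Job: one instance is created once and reused."""

    def __init__(self):
        self._data = ToyStore()

    @property
    def data(self):
        return self._data


def read_inside_context(project):
    """Open project.data as a context manager, then read through project.data again."""
    with project.data:
        return project.data.read()
```

With these toy classes, read_inside_context(CachedProject()) returns the ten zeros, while read_inside_context(FreshProject()) raises ValueError, because the with statement opens one ToyStore and the read happens on another. Whether the actual fix caches the manager in exactly this way is an assumption; the thread only states that Job keeps a single H5StoreManager in job._stores.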

@klywang, would you be willing to take these next steps?

  1. Try the fix/project-data-interface branch.
  2. Add tests. Copy some of the testing logic from test_job.py into test_project.py since I don't think we're testing project.data or project.stores at all right now.
  3. Make sure the tests fail on master and succeed on fix/project-data-interface.
  4. Create a pull request, update changelog, etc.

@klywang klywang self-assigned this Jan 22, 2020
@bdice
Member

bdice commented Jan 26, 2020

@klywang Can you open a PR with your current progress on this?

@bdice
Member

bdice commented Feb 22, 2020

Resolved by #278.

@bdice bdice closed this as completed Feb 22, 2020
@bdice bdice modified the milestones: v1.4.0, v1.3.1 Feb 22, 2020