Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle legacy nested datasets (with tests) #850

Merged
merged 7 commits into from
Oct 19, 2021

Conversation

joshmoore
Copy link
Member

@joshmoore joshmoore commented Oct 18, 2021

fix #840
As pointed out @abred, zarr 2.10 (and earlier) was no longer properly handling the opening of nested datasets if they did not contain the dimension_separator metadata in .zarray. This is fixed with this PR. However, there are a number of cases which still xfail:

$ pytest zarr/tests/test_dim_separator.py -sv | grep XFAIL
zarr/tests/test_dim_separator.py::test_open[static_nested_legacy] XFAIL
zarr/tests/test_dim_separator.py::test_fsstore[static_nested_legacy] XFAIL
zarr/tests/test_dim_separator.py::test_directory[static_nested_legacy] XFAIL
zarr/tests/test_dim_separator.py::test_nested[static_flat_legacy] XFAIL
zarr/tests/test_dim_separator.py::test_nested[directory_default] XFAIL
zarr/tests/test_dim_separator.py::test_nested[fs_default] XFAIL

Essentially for legacy datasets without the dimension_separator the following holds:

  • All nested datasets MUST be opened with NestedDirectoryStore.
  • All flat datasets MUST NOT be opened with NestedDirectoryStore.

These restrictions can be circumvented by updating your .zarray files to contain the proper metadata:

$ git grep dimension_separator fixture/{nested,flat}{,_legacy}
fixture/flat/.zarray:    "dimension_separator": ".",
fixture/nested/.zarray:    "dimension_separator": "/",

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

Each fixture (flat & nested) should have a version
which does not include any new metadata.
@pep8speaks
Copy link

pep8speaks commented Oct 18, 2021

Hello @joshmoore! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-10-19 19:48:47 UTC

@codecov
Copy link

codecov bot commented Oct 19, 2021

Codecov Report

Merging #850 (6754503) into master (4f8cb35) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 6754503 differs from pull request most recent head ff1b9a9. Consider uploading reports for the commit ff1b9a9 to get more accurate results

@@           Coverage Diff           @@
##           master     #850   +/-   ##
=======================================
  Coverage   99.94%   99.94%           
=======================================
  Files          31       31           
  Lines       10936    10952   +16     
=======================================
+ Hits        10930    10946   +16     
  Misses          6        6           
Impacted Files Coverage Δ
zarr/core.py 100.00% <100.00%> (ø)
zarr/meta.py 100.00% <100.00%> (ø)
zarr/tests/test_dim_separator.py 100.00% <100.00%> (ø)

@joshmoore
Copy link
Member Author

joshmoore commented Oct 19, 2021

I'd propose to get this out as a quick 2.10.2 if there are no objections.

@joshmoore joshmoore merged commit d1dc987 into zarr-developers:master Oct 19, 2021
@joshmoore joshmoore deleted the issue-840 branch October 19, 2021 19:49
@jakirkham
Copy link
Member

Thanks Josh! 😄

@joshmoore
Copy link
Member Author

Note that one of the tests that passed here is failing in the feedstock:

1 similar comment
@joshmoore
Copy link
Member Author

Note that one of the tests that passed here is failing in the feedstock:

python -m pip install -e .
python -m pip install .
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, why this was needed? Did it fail without this? If so, that may be indicative of the issue we are seeing in conda-forge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropping the editable install fixed the error "No module named 'pip'". More info in pypa/pip#10573

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

zarr 2.10.0 fails to load pre-existing nested files
3 participants