Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0 dim arrays: indexing #1980

Merged
merged 5 commits into from
Jun 22, 2024
Merged

Conversation

d-v-b
Copy link
Contributor

@d-v-b d-v-b commented Jun 18, 2024

Support (basic) indexing of 0 dimensional arrays. I parameterized the old v2 test for this w.r.t. the out keyword and also the dtype of the array, marking the structured dtype case as an xfail. Builds on #1979.

@normanrz git blame tells me that you marked the 0d indexing tests as xfail, with the reasoning being that v3 does not support 0d arrays. I couldn't find anything in the v3 spec about 0d arrays (and I actually couldn't find anything in the v2 spec about it either!). Did I miss something in the spec? If not, we might want to make this explicit one way or another.

cc @tomwhite if you have the time to test this I would appreciate it. It's possible that there are other broken things that I haven't found via our test suite.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@tomwhite
Copy link
Contributor

Thanks for working on this @d-v-b!

I tried running the example from #1976, and it gets further, but doesn't seem to write the value:

>>> import zarr
>>> z = zarr.open(store='example-v3.zarr', mode='w', shape=())
>>> z[...] = 3
>>> z[...]
array(1.5e-323)

Did I miss something in the spec? If not, we might want to make this explicit one way or another.

I agree that the spec should say something about this case. 0-d arrays are still arrays, and should be supported in Zarr v3 IMO.

@normanrz
Copy link
Contributor

The spec says that the shape needs to be an array of integers. I think this includes empty arrays (ie. 0d arrays). I think it is fine to add support. I would have expected more things to break.

How does chunk key encoding work for 0d arrays?

@d-v-b
Copy link
Contributor Author

d-v-b commented Jun 18, 2024

How does chunk key encoding work for 0d arrays?

@normanrz I was wondering the same thing. It turns out that I did miss something in the v3 spec: in the section describing the default chunk key encoding, it states:

Arrays may have 0 dimensions (when for example representing scalars), in which case the coordinate of a chunk is the empty tuple, and the chunk key will consist of the string c.

I wonder if this is a typo, and the intention was to use the v2 behavior with a single chunk at c/0 or c.0, depending on the separator metadata? Seems that there's already an open issue about this, which I had completely forgotten about.

@normanrz
Copy link
Contributor

Arrays may have 0 dimensions (when for example representing scalars), in which case the coordinate of a chunk is the empty tuple, and the chunk key will consist of the string c.

I wonder if this is a typo, and the intention was to use the v2 behavior with a single chunk at c/0 or c.0, depending on the separator metadata? Seems that there's already an open issue about this, which I had completely forgotten about.

Reading it now, I think the spec pretty clear. For default, the chunk key will be c. v2 is more tricky intuitively. But the spec explicitly sets it to 0 for all 0d arrays. zarrita also implements it like that: https://github.com/scalableminds/zarrita/blob/async/zarrita/metadata.py#L138-L139

@d-v-b
Copy link
Contributor Author

d-v-b commented Jun 21, 2024

@tomwhite I'm going to merge this and open a subsequent PR that fixes the problem you observed.

@d-v-b
Copy link
Contributor Author

d-v-b commented Jun 21, 2024

I ended up needing to implement __array__ on the Array class, and I fixed a bug in Array._set_basic_selection wherein scalars were not being cast to the correct dtype.

@d-v-b d-v-b requested a review from normanrz June 21, 2024 20:26
@d-v-b d-v-b merged commit 65dc4cc into zarr-developers:v3 Jun 22, 2024
18 checks passed
dcherian added a commit to dcherian/zarr-python that referenced this pull request Jun 25, 2024
* v3: (22 commits)
  [v3] `Buffer` ensure correct subclass based on the `BufferPrototype` argument (zarr-developers#1974)
  Fix doc build (zarr-developers#1987)
  Fix doc build warnings (zarr-developers#1985)
  Automatically generate API reference docs (zarr-developers#1918)
  Update `RemoteStore.__str__` and add UPath tests (zarr-developers#1964)
  [v3] Elevate codec pipeline (zarr-developers#1932)
  0 dim arrays: indexing (zarr-developers#1980)
  `parse_shapelike` allows 0 (zarr-developers#1979)
  Clean up typing and docs for indexing (zarr-developers#1961)
  add json indentation to config (zarr-developers#1952)
  chore: update pre-commit hooks (zarr-developers#1973)
  Bump pypa/gh-action-pypi-publish in the actions group (zarr-developers#1969)
  chore: update pre-commit hooks (zarr-developers#1957)
  Update release.rst (zarr-developers#1960)
  doc: update release notes for 3.0.0.alpha (zarr-developers#1959)
  Basic working FsspecStore (zarr-developers#1785)
  Feature: Top level V3 API (zarr-developers#1884)
  Buffer Prototype Argument (zarr-developers#1910)
  Create issue-metrics.yml
  fixes bug in transpose (zarr-developers#1949)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants