Dimension scales #167

Closed
alimanfoo opened this issue Oct 25, 2017 · 8 comments
Labels
enhancement New features or improvements

Comments

@alimanfoo
Member

Implement the h5py dimension scales API?

@alimanfoo alimanfoo added the enhancement New features or improvements label Nov 21, 2017
@weatherfrog

+1
Is this still on the wishlist? Would a PR be appreciated?

@jhamman
Member

jhamman commented Feb 12, 2018

We would also make use of something like this feature in Xarray.

@alimanfoo
Member Author

alimanfoo commented Feb 21, 2018 via email

@jhamman
Member

jhamman commented Feb 23, 2018

@alimanfoo - thinking about this more, let me clarify how we would likely use this in xarray.

  • We would mostly need a DIMENSION_LIST attribute on each array (we make one of these ourselves for now; see ref 1).
  • If we could query the dataset/group for a list of dimensions and their sizes, that would be awesome.
  • The dimension scales themselves, in the common data model, are less important and we would probably continue to use normal variables to store coordinate labels (see ref 2).

References:

  1. Pointer to how we're currently handling this in Xarray
  2. Unidata description of how NetCDF uses the "DIMENSION_LIST" attribute
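
To illustrate the second bullet, here is a minimal sketch of querying a group for its dimensions and their sizes. It assumes each array stores its dimension names under the `_ARRAY_DIMENSIONS` attribute that xarray writes. The helper `collect_dimensions`, the array names, and the plain-dict input layout are hypothetical and are not part of the Zarr API.

```python
from typing import Any, Dict, Mapping, Tuple

DIMENSION_KEY = "_ARRAY_DIMENSIONS"  # attribute name that xarray writes

ArrayInfo = Tuple[Tuple[int, ...], Mapping[str, Any]]  # (shape, attrs)


def collect_dimensions(arrays: Mapping[str, ArrayInfo]) -> Dict[str, int]:
    """Return {dimension name: size} across all arrays (hypothetical helper)."""
    sizes: Dict[str, int] = {}
    for name, (shape, attrs) in arrays.items():
        dims = attrs.get(DIMENSION_KEY)
        if dims is None:
            continue  # this array carries no dimension names
        if len(dims) != len(shape):
            raise ValueError(f"{name}: {len(dims)} names for {len(shape)} axes")
        for dim, size in zip(dims, shape):
            # The first array to use a dimension fixes its size; later ones must agree.
            if sizes.setdefault(dim, size) != size:
                raise ValueError(f"{name}: dimension {dim!r} has conflicting sizes")
    return sizes


# Stand-in for a group: array name -> (shape, attributes). Names are made up.
group = {
    "zeta": ((720, 9228245), {DIMENSION_KEY: ["time", "node"]}),
    "time": ((720,), {DIMENSION_KEY: ["time"]}),
    "x": ((9228245,), {DIMENSION_KEY: ["node"]}),
}
dimensions = collect_dimensions(group)
print(dimensions)
```

With a real store, the shapes and attributes would come from each array's metadata rather than from a hand-built dict.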

@alimanfoo
Member Author

alimanfoo commented Feb 23, 2018 via email

@rsignell-usgs

Was talking with @shoyer about this at SciPy 2018.

Currently xarray puts the _ARRAY_DIMENSIONS info, which allows the data to be interpreted as NetCDF, into the .zattrs for the variable, like:

{
    "_ARRAY_DIMENSIONS": [
        "time",
        "node"
    ],
    "coordinates": "y x",
    "location": "node",
    "long_name": "water surface elevation above geoid",
    "mesh": "adcirc_mesh",
    "standard_name": "sea_surface_height_above_geoid",
    "units": "m"
}

but perhaps it could find a home in .zarray, an example of which currently looks like:

{
    "chunks": [
        10,
        141973
    ],
    "compressor": {
        "blocksize": 0,
        "clevel": 5,
        "cname": "lz4",
        "id": "blosc",
        "shuffle": 1
    },
    "dtype": "<f8",
    "fill_value": -99999.0,
    "filters": null,
    "order": "C",
    "shape": [
        720,
        9228245
    ],
    "zarr_format": 2
}

as a new field (dimensions or something), and be part of the Zarr functionality/documentation?
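
A rough sketch of how a reader might handle this proposal, assuming the new .zarray field were called `dimensions`. That field name is only the placeholder floated in this comment, not something Zarr defines, and the function `dimension_names` is hypothetical. The sketch prefers the proposed field and falls back to the `_ARRAY_DIMENSIONS` attribute in .zattrs.

```python
import json
from typing import List, Optional

# Trimmed-down metadata documents. "dimensions" in .zarray is the
# hypothetical field proposed here; Zarr format 2 does not define it.
zarray_with_field = json.loads(
    '{"shape": [720, 9228245], "zarr_format": 2, "dimensions": ["time", "node"]}'
)
zarray_plain = json.loads('{"shape": [720, 9228245], "zarr_format": 2}')
zattrs = json.loads('{"_ARRAY_DIMENSIONS": ["time", "node"], "units": "m"}')


def dimension_names(zarray: dict, zattrs: dict) -> Optional[List[str]]:
    """Look up dimension names: proposed .zarray field first, then .zattrs."""
    names = zarray.get("dimensions")  # hypothetical field
    if names is None:
        names = zattrs.get("_ARRAY_DIMENSIONS")  # current xarray convention
    if names is not None and len(names) != len(zarray["shape"]):
        raise ValueError("number of dimension names does not match array rank")
    return names


print(dimension_names(zarray_with_field, {}))
print(dimension_names(zarray_plain, zattrs))
print(dimension_names(zarray_plain, {}))
```

Keeping the fallback would let data written with the existing attribute convention stay readable if such a field were ever adopted.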

I imagine there will be other packages soon trying to use Zarr for NetCDF, and it would be nice to have a convention.

Whatever is decided, @shoyer suggests we call these the "NetZDF" conventions! 😸

@shoyer
Contributor

shoyer commented Jul 15, 2018

See #276 for a proposed Zarr spec v3, incorporating optional dimension names and the "netzdf" format.

@jhamman
Member

jhamman commented Dec 7, 2023

Closing, as this is now part of the v3 spec and is in the v3 dev branch.

@jhamman jhamman closed this as completed Dec 7, 2023