Should iterating over a Dataset include coordinates? #211

@shoyer

My inclination is no: the contents of a Dataset (e.g., list(ds), ds.keys() and ds.values()) should only include non-coordinates.

Membership checks (__contains__) for a coordinate such as 'time' would need to look in ds.dimensions or ds.coordinates instead of ds, but I see no need to change __getitem__: ds['time'] can still work.
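
To make the proposed semantics concrete, here is a minimal sketch in plain Python. ToyDataset is a hypothetical stand-in written only for illustration, not the xray implementation; the variable names and values are made up. It assumes iteration and membership cover non-coordinates only, while lookup by name still resolves coordinates.

```python
from collections.abc import Mapping


class ToyDataset(Mapping):
    """Hypothetical stand-in for a Dataset; not the xray implementation."""

    def __init__(self, variables, coordinates):
        # variables: non-coordinate arrays; coordinates: index-like arrays
        self._variables = dict(variables)
        self.coordinates = dict(coordinates)

    def __iter__(self):
        # Proposed behavior: iterate over non-coordinates only.
        return iter(self._variables)

    def __len__(self):
        return len(self._variables)

    def __contains__(self, key):
        # Mapping's default __contains__ calls __getitem__, which would
        # report coordinates as members, so restrict it explicitly.
        return key in self._variables

    def __getitem__(self, key):
        # Lookup by name still resolves coordinates.
        if key in self._variables:
            return self._variables[key]
        return self.coordinates[key]


ds = ToyDataset(
    variables={"temperature": [11.2, 12.5, 9.8]},
    coordinates={"time": [0, 1, 2]},
)

print(list(ds))                  # ['temperature']
print("time" in ds)              # False
print("time" in ds.coordinates)  # True
print(ds["time"])                # [0, 1, 2]
```

One wrinkle the sketch exposes: with a Mapping-style base class, __contains__ has to be overridden explicitly, otherwise the default implementation falls back to __getitem__ and 'time' in ds would still be True.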

Pluses:

  1. This change would more closely align xray.Dataset with pandas.DataFrame, which also does not include any elements of the index in the contents of the frame.
  2. It would eliminate the need for using ds.noncoordinates -- which, as @ToddSmall has pointed out, is not very intuitive.
  3. In my experience, I have been using ds.noncoordinates.items() more often than ds.items() (which contains redundant information, as coordinates are repeated). The only time I really want to iterate over all variables in a dataset is when I'm using the lower-level Variable API.

Negatives:

  1. This would break the existing API.
