addition of `esm_datastore.to_dask()` #401

d70-t · 2021-11-22T14:44:14Z

Is your feature request related to a problem? Please describe.
I often encounter the situation where my search on intake-ESM returns a result that can be represented by a single dataset. I know this is the case because I craft my search so that it will be. However, extracting that single dataset from the return value of to_dataset_dict() is always a bit lengthy.

Describe the solution you'd like
As many intake libraries provide the to_dask() method to convert a result into an xarray dataset (which probably is a misnomer, but that's what people are using), I propose to add to_dask() to esm_datastore which should return a single dataset in case to_dataset_dict() would return exactly one item and fail in other cases.

Describe alternatives you've considered
I'd consider adding to_dataset() with the same functionality as above, but as other libraries seem to have settled on to_dask() already, I'd rather stay consistent than pedantic.

Draft implementation

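def to_dask(self, *args, **kwargs):
    """Return the single dataset of this catalog; fail unless there is exactly one."""
    if len(self) != 1:
        raise ValueError(f"expected exactly one result, got {len(self)}")
    # to_dataset_dict() returns a dict; unwrap its only value
    return next(iter(self.to_dataset_dict(*args, **kwargs).values()))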
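To see the proposed contract in isolation, here is a minimal sketch that runs without intake-esm. `FakeDatastore` is a hypothetical stand-in (not the real `esm_datastore`), backed by a plain dict, and plain strings stand in for xarray datasets. It only illustrates the "exactly one result or fail" behaviour described above.

```python
class FakeDatastore:
    """Hypothetical stand-in for esm_datastore; not the real intake-esm class."""

    def __init__(self, datasets):
        # Maps dataset key -> dataset (plain strings here instead of xarray objects).
        self._datasets = dict(datasets)

    def __len__(self):
        return len(self._datasets)

    def to_dataset_dict(self, *args, **kwargs):
        return dict(self._datasets)

    def to_dask(self, *args, **kwargs):
        # Succeed only when the search result holds exactly one dataset.
        if len(self) != 1:
            raise ValueError(f"expected exactly one result, got {len(self)}")
        return next(iter(self.to_dataset_dict(*args, **kwargs).values()))


single = FakeDatastore({"CMIP.tas": "dataset-for-tas"})
print(single.to_dask())  # prints "dataset-for-tas"

several = FakeDatastore({"a": "ds-a", "b": "ds-b"})
try:
    several.to_dask()
except ValueError as err:
    print(err)  # prints "expected exactly one result, got 2"
```

Failing loudly on zero or multiple results keeps a search that is accidentally too broad from silently returning an arbitrary dataset.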


andersy005 · 2021-11-22T14:57:39Z

> I know that this is the case, because I craft my search in a way that this will be the case.

Since you know that your search will result in a single dataset, have you tried accessing the key in question and calling .to_dask() on this key?

ds = cat[cat.keys()[0]].to_dask()

d70-t · 2021-11-22T15:04:26Z

Thanks for getting back so quickly!

> Since you know that your search will result in a single dataset, have you tried accessing the key in question and calling .to_dask() on this key?

I didn't. Instead I used something like:

ds = next(iter(cat.to_dataset_dict().values()))

which is a little bit longer, but doesn't require naming cat twice and is thus simpler to use in some places.

The goal I want to achieve is to shorten the code I have to write a bit, i.e. I would like to write something like:

ds = cat.search(...).to_dask()

instead of:

res = cat.search(...)
ds = res[res.keys()[0]].to_dask()

which seems to be a bit redundant.
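As a stopgap, the dictionary-based one-liner above can be wrapped in a small helper that also checks the single-result assumption. This is only a sketch: `only_value` is a made-up name, and a plain dict stands in for the output of to_dataset_dict().

```python
from collections.abc import Mapping


def only_value(mapping: Mapping):
    """Return the sole value of a one-item mapping, else raise ValueError."""
    if len(mapping) != 1:
        raise ValueError(f"expected exactly one item, got {len(mapping)}")
    return next(iter(mapping.values()))


# A plain dict stands in for the result of cat.search(...).to_dataset_dict().
dsets = {"some.key": "some-dataset"}
ds = only_value(dsets)
```

With a real catalog this would read `ds = only_value(cat.search(...).to_dataset_dict())`, which still names the search result only once.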

andersy005 · 2021-11-22T15:19:12Z

@d70-t, this would be a reasonable, straightforward addition. Would you be interested in submitting a pull request?

andersy005 added the enhancement label Nov 22, 2021

d70-t mentioned this issue Nov 23, 2021: Add to_dask() method to esm_datastore #403 (merged)

andersy005 closed this as completed in #403 Nov 24, 2021
