addition of esm_datastore.to_dask() #401

Closed
d70-t opened this issue Nov 22, 2021 · 3 comments · Fixed by #403
Labels
enhancement Issues that are found to be a reasonable candidate feature additions

Comments

@d70-t
Contributor

d70-t commented Nov 22, 2021

Is your feature request related to a problem? Please describe.
I often encounter the situation where my search on intake-ESM returns a result which is representable by a single dataset. I know that this is the case, because I craft my search in a way that this will be the case. However, extracting out the single dataset from the return value of to_dataset_dict() is always a bit lengthy.

Describe the solution you'd like
As many intake libraries provide the to_dask() method to convert a result into an xarray dataset (which probably is a misnomer, but that's what people are using), I propose to add to_dask() to esm_datastore, which should return a single dataset if to_dataset_dict() would return exactly one item, and fail otherwise.

Describe alternatives you've considered
I'd consider adding to_dataset() with the same functionality as above, but as other libraries seem to have settled on to_dask() already, I'd rather stay consistent than pedantic.

Draft implementation

def to_dask(self, *args, **kwargs):
    if len(self) != 1:
        raise ValueError("not exactly one result")
    return next(iter(self.to_dataset_dict(*args, **kwargs).values()))
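
To make the intended contract of the draft concrete, here is a minimal sketch using a stand-in class. `FakeStore` is hypothetical and not part of intake-esm; it only mimics `__len__` and `to_dataset_dict()` with plain strings in place of xarray datasets, so the real esm_datastore may differ in details.

```python
class FakeStore:
    """Hypothetical stand-in for esm_datastore; not part of intake-esm."""

    def __init__(self, datasets):
        # Maps dataset key -> dataset (plain strings here, for illustration).
        self._datasets = dict(datasets)

    def __len__(self):
        return len(self._datasets)

    def to_dataset_dict(self, *args, **kwargs):
        return dict(self._datasets)

    def to_dask(self, *args, **kwargs):
        # Same logic as the draft: succeed only for exactly one result.
        if len(self) != 1:
            raise ValueError("not exactly one result")
        return next(iter(self.to_dataset_dict(*args, **kwargs).values()))


single = FakeStore({"key_a": "dataset_a"})
print(single.to_dask())  # dataset_a

multiple = FakeStore({"key_a": "dataset_a", "key_b": "dataset_b"})
try:
    multiple.to_dask()
except ValueError as err:
    print(err)  # not exactly one result
```

The length check runs before any data is loaded, so a search that matches several datasets fails early instead of opening all of them first.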
@andersy005
Member

I know that this is the case, because I craft my search in a way that this will be the case.

Since you know that your search will result in a single dataset, have you tried accessing the key in question and calling .to_dask() on this key?

ds = cat[cat.keys()[0]].to_dask()

@d70-t
Contributor Author

d70-t commented Nov 22, 2021

Thanks for getting back so quickly!

Since you know that your search will result in a single dataset, have you tried accessing the key in question and calling .to_dask() on this key?

I didn't. Instead I used something like:

ds = next(iter(cat.to_dataset_dict().values()))

which is a little bit longer, but doesn't require naming cat twice and is thus simpler to use in some places.


The goal I want to achieve is to shorten the code I have to write a bit, i.e. I would like to write something like:

ds = cat.search(...).to_dask()

instead of:

res = cat.search(...)
ds = res[res.keys()[0]].to_dask()

which seems to be a bit redundant.
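
Without a change to the library, a similar one-liner is possible with a small user-side helper applied to the dictionary that to_dataset_dict() returns. This is only a sketch: `only_value` is a hypothetical name, not an intake-esm function, and it checks the number of entries after the dictionary has been built rather than before.

```python
from typing import Mapping, TypeVar

V = TypeVar("V")


def only_value(mapping: Mapping[str, V]) -> V:
    """Return the single value of `mapping`, or raise ValueError."""
    if len(mapping) != 1:
        raise ValueError(f"expected exactly one entry, got {len(mapping)}")
    # Tuple unpacking of the values view yields the one and only value.
    (value,) = mapping.values()
    return value
```

Under that assumption, the call would read ds = only_value(cat.search(...).to_dataset_dict()), which keeps everything in one expression but still loads every matching dataset before the check.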

@andersy005 added the enhancement label Nov 22, 2021
@andersy005
Member

andersy005 commented Nov 22, 2021

@d70-t, this would be a reasonable, straightforward addition. Would you be interested in submitting a pull request?
