Feature request: dask.dataframe.to_xarray() #6058

Open
raybellwaves opened this issue Apr 2, 2020 · 5 comments

Comments

@raybellwaves
Member

See https://stackoverflow.com/questions/60896303/dask-convert-a-dask-dataframe-to-an-xarray-dataset/60904272#60904272

pandas has an equivalent method (pandas.DataFrame.to_xarray), and xarray has a method for converting to a dask.dataframe (xarray.Dataset.to_dask_dataframe).

My motivation: I was showing how xskillscore, a package I developed, can be dropped into your workflow if you are using pandas.DataFrames (https://github.com/raybellwaves/xskillscore-tutorial/blob/master/01_Determinisitic.ipynb). Because xskillscore is an extension of xarray, it works with dask. I was looking to repeat this workflow using a dask.dataframe. In the end I went with the dask.array approach (https://github.com/raybellwaves/xskillscore-tutorial/blob/master/03_Big_Data.ipynb).

@mrocklin @TomAugspurger apologies for the tagging. You may know whether there are xarray devs here who would be interested.

@jrbourbeau
Member

Thanks for raising an issue @raybellwaves. In theory a .to_xarray() method seems in scope as there's an equivalent pandas method.

cc @shoyer @jhamman who may have thoughts on this topic

@shoyer
Member

shoyer commented Apr 2, 2020

We have xarray.Dataset.to_dask_dataframe, but not from_dask_dataframe yet. I would suggest implementing that method in xarray first, which dask could call. That's the equivalent of what we do for pandas's to_xarray.

I think this could probably be done, but some care would need to be taken to do so in an efficient manner.

@TomAugspurger
Member

@shoyer in general, how well does xarray handle arrays with unknown chunk sizes? We can optionally compute the chunk sizes if desired, but the default would result in unknown sizes.

@shoyer
Member

shoyer commented Apr 2, 2020 via email

@martindurant
Member

XREF PR in xarray: pydata/xarray#4659
