Dask Persist #1344
I'm happy with either or both of these solutions.
since we already have
We do eventually want to support some sort of duck typing. This way people can do things like the following while still benefiting from shared intermediates:
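A minimal sketch of the kind of call this describes, assuming `dask.persist` is handed several collections at once. The names `x`, `shared`, `y` and `z` are illustrative only, and plain dask arrays stand in for whatever mixed types would eventually be supported:

```python
import dask
import dask.array as da

# Illustrative data: a chunked array and an intermediate used twice.
x = da.ones((1000, 1000), chunks=(250, 250))
shared = (x + 1).sum(axis=0)  # intermediate shared by y and z

y = shared * 2
z = shared - 1

# Persisting both in one call lets dask merge the graphs, so the shared
# intermediate is computed once rather than once per result.
y, z = dask.persist(y, z)

# Both results are still dask arrays, now backed by in-memory chunks.
print(type(y), type(z))
```

Persisting `y` and `z` in separate calls would also work, but each call would recompute `shared` on its own.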
But this may not happen as quickly as a
In this example, are x, y and z dask objects or xarray objects?
I think the idea is that they could be mixed types, e.g., a dask-dataframe, a dask-array and an xarray Dataset or DataArray.
* Add persist method to DataSet (Fixes #1344)
* add persist method to DataArray
* add whats new entry
* add doc section on persist
* doc: dask array now supports automatic chunk alignment (at least on most operations)
It would be convenient to load constituent dask.arrays into memory as dask.arrays rather than as numpy arrays. This would help with distributed computations where we want to load a large amount of data into distributed memory once and then iterate on the full xarray dataset repeatedly without reloading from disk every time.
We can probably solve this from either side:
`.persist` method that replaced all of its dask.arrays with a persisted version of that array

cc @shoyer @jcrist @rabernat @pwolfram
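As a rough sketch of the `.persist` idea, assuming a hypothetical dict-backed `Container` class standing in for a dataset (this is not xarray's actual implementation), the method could swap every dask-backed variable for its persisted counterpart and leave everything else alone:

```python
import dask
import dask.array as da
import numpy as np


class Container:
    """Hypothetical stand-in for a dataset: a mapping of names to arrays."""

    def __init__(self, variables):
        self.variables = dict(variables)

    def persist(self):
        # Pick out only the dask-backed variables; numpy arrays are
        # already in memory and are passed through unchanged.
        lazy = {k: v for k, v in self.variables.items()
                if isinstance(v, da.Array)}
        # Persist them in a single call so shared work is done once.
        persisted = dask.persist(*lazy.values())
        new_vars = dict(self.variables)
        new_vars.update(zip(lazy.keys(), persisted))
        return Container(new_vars)


ds = Container({
    "temperature": da.random.random((100, 100), chunks=(50, 50)) + 1,
    "mask": np.ones((100, 100), dtype=bool),
})
ds2 = ds.persist()

# The persisted variable is still a dask array with the same chunking.
print(type(ds2.variables["temperature"]), ds2.variables["temperature"].chunks)
```

Returning a new container rather than mutating in place keeps the original lazy version available, which mirrors how `dask.persist` itself returns new collections.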