Use Dask for lazy loading and delayed operations #10

Open
dmentipl opened this issue Jan 13, 2020 · 0 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@dmentipl
Owner

Accessing a particle array on a snapshot is currently lazy, in the sense that the data is only loaded from disc into memory when requested, e.g. with snap['position']. However, once loaded, the array stays in memory, held in the dictionary snap._arrays.

An alternative is to load the array as a Dask array and use the resulting object as if it were a NumPy array loaded into memory. Complicated expressions can then be written before anything is loaded into memory, and the computation is only executed when the .compute() method is called. Dask also allows for easy parallelization on both local hardware, e.g. a multi-core laptop, and remote hardware, e.g. a supercomputing cluster.
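
To illustrate the idea, here is a minimal sketch of building an expression on a Dask array and evaluating it only at the end. The `position` array here is a small in-memory stand-in (an assumption for the example); in practice it would be backed by the snapshot file on disc.

```python
import dask.array as da
import numpy as np

# Stand-in for particle positions: 10 particles, 3 coordinates each.
# In a real snapshot this data would live on disc, not in memory.
position = da.from_array(
    np.arange(30, dtype=float).reshape(10, 3), chunks=(5, 3)
)

# Build the expression lazily: these lines only construct a task graph.
radius = da.sqrt((position ** 2).sum(axis=1))
mean_radius = radius.mean()

# The computation is executed only when .compute() is called.
result = mean_radius.compute()
```

Until `.compute()` is called, `radius` and `mean_radius` are Dask arrays describing the work to be done, not NumPy arrays holding results.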

A suggestion is to adjust the __getitem__() method to return a Dask array, and to not store the array in snap._arrays.
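
One way this could look is sketched below, using `dask.delayed` and `dask.array.from_delayed`. The `LazySnap` class, its constructor arguments, and the `load_position` function are all hypothetical and only for illustration; they are not the project's actual API. The sketch also assumes the shape and dtype of each array are known without reading the data, e.g. from file metadata.

```python
import dask
import dask.array as da
import numpy as np


class LazySnap:
    """Hypothetical stand-in for a snapshot class (not the real API)."""

    def __init__(self, loaders, shapes, dtypes):
        # loaders: array name -> zero-argument function that reads from disc.
        # shapes, dtypes: array name -> shape / dtype, assumed known up front.
        self._loaders = loaders
        self._shapes = shapes
        self._dtypes = dtypes

    def __getitem__(self, name):
        # Wrap the loader so that it runs only when .compute() is called.
        delayed_array = dask.delayed(self._loaders[name])()
        # Nothing is stored on the snapshot: each access builds a new lazy array.
        return da.from_delayed(
            delayed_array, shape=self._shapes[name], dtype=self._dtypes[name]
        )


# Count how often the (fake) loader actually runs.
load_count = {"position": 0}


def load_position():
    load_count["position"] += 1
    return np.ones((4, 3))


snap = LazySnap(
    loaders={"position": load_position},
    shapes={"position": (4, 3)},
    dtypes={"position": float},
)

x = snap["position"][:, 0]   # still lazy, the loader has not run
total = x.sum().compute()    # the loader runs here
```

A trade-off of not caching is that the data is re-read from disc on every `.compute()`, so results that are reused would need to be kept explicitly by the caller.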

@dmentipl dmentipl self-assigned this Jan 13, 2020
@dmentipl dmentipl added enhancement New feature or request help wanted Extra attention is needed labels Jan 13, 2020
@dmentipl dmentipl removed their assignment Feb 14, 2022