Commit
Showing 1 changed file with 22 additions and 0 deletions.
@@ -0,0 +1,22 @@
Best Practices
==============

This page contains a list of best practices for Dask arrays.

- When choosing chunk sizes, be aware that the scheduler may impose overheads
  of up to 1ms per operation per chunk. Make your chunks large enough that
  operations on each chunk take 100ms or so, which keeps that overhead to
  around 1% of the runtime.

  You also want chunks to be small enough that several of them fit in memory
  at once, probably more than twice the number of threads you're using, even
  for simple computations.

  Chunks of 50MB-500MB are usually a good rule of thumb, but you should
  experiment.

- When loading data from chunked data sources (like HDF5), arrange your chunks
  to align with the chunks of the underlying data source; otherwise you may
  read through all of your data many more times than is necessary.

- It is difficult to make the HDF5 library is not |
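
The chunk-size and alignment guidelines above can be checked with a little arithmetic before building an array. The following is a minimal sketch, not part of Dask: the helpers ``chunk_nbytes`` and ``is_aligned`` are hypothetical names introduced here for illustration, and the dataset (float64, stored on disk in ``(100, 1000)`` chunks) is an assumed example.

```python
from math import prod


def chunk_nbytes(chunk_shape, itemsize):
    """Size in bytes of one chunk with the given shape and element size."""
    return prod(chunk_shape) * itemsize


def is_aligned(dask_chunks, storage_chunks):
    """True if each Dask chunk edge is a whole multiple of the storage chunk edge."""
    return all(d % s == 0 for d, s in zip(dask_chunks, storage_chunks))


# Assumed example: a float64 dataset stored on disk in (100, 1000) chunks.
storage_chunks = (100, 1000)
dask_chunks = (4000, 4000)  # covers 40 x 4 storage chunks per Dask chunk

size_mb = chunk_nbytes(dask_chunks, itemsize=8) / 1e6
print(size_mb)                                   # 128.0
print(50 <= size_mb <= 500)                      # True: inside the 50MB-500MB range
print(is_aligned(dask_chunks, storage_chunks))   # True
print(is_aligned((4050, 4000), storage_chunks))  # False: 4050 is not a multiple of 100
```

A chunk shape that passes both checks could then be passed as the ``chunks=`` argument of ``dask.array.from_array``. These checks only narrow the candidates, so timing real operations on your own data is still worthwhile.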