While a job runs, intermediate output is written to Zarr files. If a job is interrupted, it could be resumed by continuing from the point at which it was stopped.
Cubed `Array` objects can be pickled, so it is already possible to reload them (and the underlying `Plan` object containing the DAG) as long as they have been saved first.
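A minimal sketch of the save/reload step, using only the standard library. The helper names `save_for_resume` and `load_for_resume` are hypothetical, and a plain dict stands in for a Cubed array and its plan so that the example runs without Cubed installed:

```python
import pickle
import tempfile
from pathlib import Path


def save_for_resume(obj, path):
    """Pickle an object to disk so a later process can reload it."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_for_resume(path):
    """Reload an object that was previously saved with save_for_resume."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a picklable array/plan object (an assumption for illustration).
plan_like = {"dag": {"a": [], "b": ["a"]}, "target": "b"}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "array.pkl"
    save_for_resume(plan_like, path)
    restored = load_for_resume(path)
```

The save has to happen before the job starts computing; otherwise there is nothing to reload after an interruption.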
To detect where in the DAG to resume from, we can skip any intermediate Zarr files that have all of their chunks written. The check we need to do is `nchunks_initialized == nchunks`. (Since we specify that `write_empty_chunks` is `True`, we know that every chunk will be written out, even if it is composed of empty fill values.)
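The check could be sketched roughly as follows. This is an illustration rather than Cubed's implementation: `FakeArray`, `is_complete` and `arrays_to_compute` are hypothetical names, and `FakeArray` only mimics the two attributes that the check reads from a Zarr array. The DAG is a plain mapping from each array name to the names it depends on.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass
class FakeArray:
    """Stand-in for a Zarr array; only the attributes used by the check."""
    nchunks: int
    nchunks_initialized: int


def is_complete(arr) -> bool:
    """An array is complete when every one of its chunks has been written."""
    return arr.nchunks_initialized == arr.nchunks


def arrays_to_compute(dag, arrays):
    """Return the names of incomplete arrays, in dependency order.

    dag maps each array name to the set of names it depends on.
    """
    order = TopologicalSorter(dag).static_order()
    return [name for name in order if not is_complete(arrays[name])]


dag = {"a": set(), "b": {"a"}, "c": {"b"}}
arrays = {
    "a": FakeArray(nchunks=4, nchunks_initialized=4),  # fully written
    "b": FakeArray(nchunks=4, nchunks_initialized=2),  # interrupted part-way
    "c": FakeArray(nchunks=4, nchunks_initialized=0),  # never started
}
pending = arrays_to_compute(dag, arrays)
```

Here `pending` contains `b` and `c` but not `a`. The partially written `b` is recomputed in full, which matches resuming at the level of a whole array.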
In the future it might be possible to resume at the level of individual chunks, but for now starting at the level of a Zarr array (and rewriting any chunks that have already been written) is sufficient.
This depends on #10 for testing - we can use it to see how many tasks actually ran, to check that earlier arrays were not re-written after resuming a partially completed job.
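A test along these lines might look like the toy sketch below. Everything in it is an assumption for illustration: `TaskCounter`, `on_task_end` and `run` are hypothetical and do not reflect whatever mechanism #10 provides, and the executor is a plain loop rather than a real Cubed executor.

```python
class TaskCounter:
    """Hypothetical task-counting hook; counts one task per chunk written."""

    def __init__(self):
        self.value = 0

    def on_task_end(self, name):
        self.value += 1


def run(arrays, counter, fail_after=None):
    """Toy executor. arrays maps name -> [nchunks, nchunks_initialized]."""
    for name, state in arrays.items():
        nchunks, initialized = state
        if initialized == nchunks:
            continue  # array already complete, so skip it on resume
        for i in range(nchunks):  # rewrite every chunk of an incomplete array
            if fail_after is not None and counter.value >= fail_after:
                raise RuntimeError("simulated interruption")
            state[1] = max(state[1], i + 1)
            counter.on_task_end(name)


arrays = {"a": [4, 0], "b": [4, 0]}

# First attempt is interrupted after 6 tasks: "a" is done, "b" is half done.
first = TaskCounter()
try:
    run(arrays, first, fail_after=6)
except RuntimeError:
    pass

# Resume: only the tasks for "b" should run.
second = TaskCounter()
run(arrays, second)
```

If the resumed run had rewritten `a`, the second counter would report 8 tasks rather than 4.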