Saving progress for datasets #30

shahbuland · 2022-10-14T20:44:26Z

Saving progress for datasets, namely IterablePipelines, is currently a bit clunky. The output dataset is agnostic of progress/location in the source. With respect to the source iterator, all that is really being saved is an index into the dataset being read from. On resume, we currently just call next on the iterator repeatedly until it gets back to the saved index. Leaving a note here to revisit this later, as it might have unforeseen consequences at scale.
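For reference, a minimal sketch of the "run next until we reach the saved index" approach described above. The function names `resume_iterator` and `resume_iterator_islice` are hypothetical and not part of the project's code; the sketch also assumes the source yields items in the same order on every run, otherwise the saved index does not point at the same item.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def resume_iterator(source: Iterable[T], saved_index: int) -> Iterator[T]:
    """Hypothetical helper: skip ahead by calling next() saved_index times."""
    it = iter(source)
    # Every skipped item is still read from the source, so resuming
    # costs O(saved_index) reads before any new work is done.
    for _ in range(saved_index):
        try:
            next(it)
        except StopIteration:
            # The source is shorter than the saved index; nothing left to read.
            break
    return it


def resume_iterator_islice(source: Iterable[T], saved_index: int) -> Iterator[T]:
    """Same behaviour using itertools.islice, which discards items lazily."""
    return islice(iter(source), saved_index, None)


# Resume a toy source from a saved index of 4.
it = resume_iterator(range(10), 4)
first_item = next(it)  # item at index 4 of the source
```

Both variants still read every skipped item, which is likely where the concern about scale comes from: the cost of resuming grows linearly with how far the previous run got.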

shahbuland · 2022-10-27T22:33:48Z

#31 partially addresses this. It needs more debugging to ensure it is consistent and fault tolerant across all pipelines.

shahbuland · 2022-11-18T01:36:47Z

Need to add saving for client statistics

shahbuland · 2022-12-15T21:47:39Z

Solved with #37

shahbuland closed this as completed Dec 15, 2022
