Fix resuming arrow format (#6964)
* fix resuming in arrow format

* one more

* fix arrow resuming
lhoestq committed Jun 14, 2024
1 parent 087671d commit ef2fb35
Showing 3 changed files with 357 additions and 215 deletions.
2 changes: 1 addition & 1 deletion docs/source/stream.mdx
@@ -415,6 +415,6 @@ This can be used with the `StatefulDataLoader` from `torchdata`:

<Tip>

-Resuming returns exactly where the checkpoint was saved except in two cases: 1) examples from shuffle buffers are lost when resuming and the buffers are refilled with new data and 2) combinations of `.with_format(arrow)` and batched `.map()` may skip one batch.
+Resuming returns exactly where the checkpoint was saved except if `.shuffle()` is used: examples from shuffle buffers are lost when resuming and the buffers are refilled with new data.

</Tip>
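The remaining caveat in the tip (examples held in a shuffle buffer are lost on resume) can be illustrated with a toy model. The sketch below is not the `datasets` implementation: `ToyShuffledStream` and its `source_pos` field are hypothetical names, and it assumes only that a checkpoint records how far the source has been read, not what is currently sitting in the buffer.

```python
import itertools
import random


class ToyShuffledStream:
    """Toy stand-in for a shuffled stream (hypothetical, not the `datasets` code)."""

    def __init__(self, n, buffer_size, seed=0):
        self.n = n                      # the source yields the integers 0..n-1
        self.buffer_size = buffer_size
        self.seed = seed
        self.source_pos = 0             # number of source examples read so far

    def state_dict(self):
        # Assumption: only the source position is saved, not the buffer contents.
        return {"source_pos": self.source_pos}

    def load_state_dict(self, state):
        self.source_pos = state["source_pos"]

    def __iter__(self):
        rng = random.Random(self.seed)
        buffer = []                     # always starts empty, also after a resume
        while self.source_pos < self.n:
            example = self.source_pos
            self.source_pos += 1
            if len(buffer) < self.buffer_size:
                buffer.append(example)  # still filling the buffer
            else:
                i = rng.randrange(self.buffer_size)
                yield buffer[i]         # emit a random buffered example
                buffer[i] = example     # and put the new one in its slot
        rng.shuffle(buffer)
        yield from buffer               # drain what is left at the end


# Consume three examples, then checkpoint.
stream = ToyShuffledStream(n=20, buffer_size=4)
first = list(itertools.islice(iter(stream), 3))
state = stream.state_dict()

# Resume in a fresh object: the buffer is refilled with new data only.
resumed = ToyShuffledStream(n=20, buffer_size=4)
resumed.load_state_dict(state)
rest = list(resumed)

# Examples that were read before the checkpoint but never emitted.
lost = sorted(set(range(20)) - set(first) - set(rest))
print(f"checkpoint: {state}, lost on resume: {lost}")
```

In this toy model the number of lost examples equals the buffer size, which is why a smaller shuffle buffer loses less data across a resume, at the cost of a weaker shuffle.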