Move the loader snapshot queue to the reader #5085

Merged
merged 1 commit into from
Oct 9, 2023

@stiepan stiepan commented Oct 6, 2023

Category:

Refactoring (Redesign of existing code that doesn't affect functionality)

Description:

This PR moves the snapshot queue from the loader to the reader.
The reader is already responsible for prefetching batches, so it makes sense for it to also keep a snapshot of the loader's state after each produced batch. This should simplify the loader's code and remove the need for an extra mutex guarding the loader's snapshot queue. The snapshot queue is now advanced alongside the batch prefetch queue, so accessing snapshots (the reader's by the executor, and the loader's by the reader) has no side effects.
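To illustrate the idea of advancing both queues together, here is a minimal single-threaded sketch. It is not the DALI implementation: `ToyLoader`, `ToyReader`, and all their methods are hypothetical names invented for this example, and the snapshot is reduced to a plain dict.

```python
from collections import deque


class ToyLoader:
    """Hypothetical stand-in for a loader: yields consecutive sample indices."""

    def __init__(self):
        self.position = 0

    def read_batch(self, batch_size):
        batch = list(range(self.position, self.position + batch_size))
        self.position += batch_size
        return batch

    def snapshot(self):
        # State of the loader right after the most recently produced batch.
        return {"position": self.position}


class ToyReader:
    """Hypothetical reader that keeps batches and snapshots in lockstep."""

    def __init__(self, loader, batch_size):
        self.loader = loader
        self.batch_size = batch_size
        self.batches = deque()
        self.snapshots = deque()

    def prefetch(self):
        # One snapshot is stored for every batch that is produced.
        self.batches.append(self.loader.read_batch(self.batch_size))
        self.snapshots.append(self.loader.snapshot())

    def current_snapshot(self):
        # Pure read: nothing is popped or advanced here.
        return self.snapshots[0]

    def consume(self):
        # Both queues advance together, so they never drift apart.
        self.snapshots.popleft()
        return self.batches.popleft()
```

Because only `consume` advances the queues, calling `current_snapshot` any number of times returns the same value, which is the "no side effects" property described above.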

Snapshots are put in the queue by the prefetching thread. If the loader cannot be checkpointed at a given moment, the reader simply puts None in the queue, and the relevant error is raised on an attempt to access that snapshot. This could be extended to the executor as well (so that it does not have to compute epoch sizes in advance).
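The deferred-error behaviour can be sketched as follows. This is an assumption-laden illustration rather than the code in this PR: `CheckpointUnavailable`, `prefetch_worker`, `snapshot_at`, and `run_prefetch` are hypothetical names, and the loader is modelled as a list of `(state, can_checkpoint)` pairs.

```python
import threading
from collections import deque


class CheckpointUnavailable(RuntimeError):
    """Hypothetical error for a batch that has no loader snapshot."""


def prefetch_worker(loader_states, snapshots):
    # Runs on the prefetching thread: exactly one slot per produced batch.
    # A None placeholder keeps the queue aligned with the batch queue.
    for state, can_checkpoint in loader_states:
        snapshots.append(dict(state) if can_checkpoint else None)


def snapshot_at(snapshots, index):
    # The failure surfaces here, on access, not on the prefetching thread.
    snapshot = snapshots[index]
    if snapshot is None:
        raise CheckpointUnavailable(f"no snapshot for prefetched batch {index}")
    return snapshot


def run_prefetch(loader_states):
    snapshots = deque()
    worker = threading.Thread(target=prefetch_worker, args=(loader_states, snapshots))
    worker.start()
    worker.join()  # wait so the example is deterministic
    return snapshots
```

Storing a placeholder instead of raising on the prefetching thread means prefetching is never interrupted, and only callers that actually ask for the missing snapshot see the error.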

In the future, the snapshot queue could be extracted from the reader and provided to it as a dependency, similarly to the loader. This way, we could provide different checkpointing mechanisms to different readers and loaders.
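One possible shape for that future extraction is sketched below. Nothing here exists in DALI: the `SnapshotStore` protocol, both store classes, `CountingLoader`, and `Reader` are hypothetical names used only to show the snapshot queue being passed in alongside the loader.

```python
from collections import deque
from typing import Optional, Protocol


class SnapshotStore(Protocol):
    """Hypothetical interface the reader would depend on."""

    def push(self, snapshot: Optional[dict]) -> None: ...
    def front(self) -> Optional[dict]: ...
    def pop(self) -> None: ...


class InMemorySnapshots:
    """Keeps every snapshot in a FIFO queue."""

    def __init__(self):
        self._queue = deque()

    def push(self, snapshot):
        self._queue.append(snapshot)

    def front(self):
        return self._queue[0]

    def pop(self):
        self._queue.popleft()


class NoCheckpointing:
    """Discards snapshots, for readers that opt out of checkpointing."""

    def push(self, snapshot):
        pass

    def front(self):
        return None

    def pop(self):
        pass


class CountingLoader:
    """Hypothetical loader producing one-element batches."""

    def __init__(self):
        self.position = 0

    def read_batch(self):
        self.position += 1
        return [self.position - 1]

    def snapshot(self):
        return {"position": self.position}


class Reader:
    """Receives both the loader and the snapshot store as dependencies."""

    def __init__(self, loader, snapshots: SnapshotStore):
        self.loader = loader
        self.snapshots = snapshots
        self.batches = deque()

    def prefetch(self):
        self.batches.append(self.loader.read_batch())
        self.snapshots.push(self.loader.snapshot())
```

With this split, `Reader(CountingLoader(), InMemorySnapshots())` and `Reader(CountingLoader(), NoCheckpointing())` share the same prefetch logic and differ only in the injected checkpointing mechanism.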

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: DALI-3653

@stiepan stiepan force-pushed the move_snapshost_queue_to_reader branch from c417048 to 0d9bea0 Compare October 8, 2023 23:16
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>
@stiepan stiepan changed the title [WIP] Move the loader snapshot queue to the reader Move the loader snapshot queue to the reader Oct 8, 2023
@stiepan
Member Author

stiepan commented Oct 8, 2023

!build

@dali-automaton
Collaborator

CI MESSAGE: [10161684]: BUILD STARTED

@stiepan stiepan marked this pull request as ready for review October 9, 2023 00:10
@dali-automaton
Collaborator

CI MESSAGE: [10161684]: BUILD PASSED

@szkarpinski szkarpinski self-assigned this Oct 9, 2023
@szkarpinski szkarpinski merged commit 56d4950 into NVIDIA:main Oct 9, 2023
5 checks passed
JanuszL pushed a commit to JanuszL/DALI that referenced this pull request Oct 13, 2023
Signed-off-by: Kamil Tokarski <ktokarski@nvidia.com>