This repository was archived by the owner on Sep 11, 2023. It is now read-only.

Experiment with calling dataset.load() _after_ joining examples into batch #475

@JackKelly

For the Zarr DataSources, it may be faster to join the lazily loaded examples into a batch first, and only then load the data into memory.

i.e. call .load() once, towards the end of get_batch(), instead of at the end of every get_example() call.

This should give dask a single task graph covering the whole batch, rather than one small graph per example, so it can do a better job of scheduling the reads. That might result in faster times per batch, but it needs to be measured.
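As a rough illustration of the idea, here is a minimal sketch. It is not the repository's actual code: `get_example_lazy()` and this `get_batch()` are hypothetical stand-ins, and an in-memory array chunked with dask stands in for a Zarr-backed dataset (assumption: both behave the same with respect to laziness). It needs `numpy`, `xarray` and `dask` installed.

```python
import numpy as np
import xarray as xr

# Stand-in for a Zarr-backed dataset (assumption): an in-memory array that is
# chunked with dask, so selections stay lazy until .load() is called.
data = xr.DataArray(
    np.arange(100 * 8 * 8, dtype=np.float32).reshape(100, 8, 8),
    dims=("time", "y", "x"),
    name="data",
).chunk({"time": 10})
dataset = data.to_dataset()


def get_example_lazy(ds, t_start, length=12):
    """Hypothetical get_example(): select a time window but do NOT load it."""
    return ds.isel(time=slice(t_start, t_start + length))


def get_batch(ds, t_starts):
    """Hypothetical get_batch(): join the lazy examples, then load once."""
    examples = [get_example_lazy(ds, t) for t in t_starts]
    # Concatenating dask-backed examples is still lazy; nothing is read yet.
    batch = xr.concat(examples, dim="example")
    # A single .load() hands dask one task graph for the whole batch.
    return batch.load()


batch = get_batch(dataset, t_starts=[0, 20, 40])
```

To find out whether this is actually faster, the same batch could be timed against a version that calls .load() inside get_example_lazy().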
