Batched IterableDataset #6279

Open
lneukom opened this issue Oct 5, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

lneukom commented Oct 5, 2023

Feature request

Hi,

could you add an implementation of a batched IterableDataset? It already supports batch iteration via .iter(batch_size=...), but this cannot be used in combination with a torch DataLoader since it just returns an iterator.
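A possible workaround, sketched here rather than taken from the library: wrap the batched iterator in a torch `IterableDataset` and pass `batch_size=None` to the `DataLoader` so that it does not batch a second time. `BatchedIterable` and `ToyDataset` are hypothetical names introduced for illustration; `ToyDataset` only stands in for a dataset that exposes `.iter(batch_size=...)`.

```python
from torch.utils.data import DataLoader, IterableDataset


class BatchedIterable(IterableDataset):
    """Hypothetical wrapper: yields whole batches from any object
    that exposes .iter(batch_size=...)."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size

    def __iter__(self):
        # Delegate to the batched iterator the dataset already provides.
        yield from self.dataset.iter(batch_size=self.batch_size)


class ToyDataset:
    """Hypothetical stand-in for a streaming dataset (illustration only)."""

    def __init__(self, n):
        self.n = n

    def iter(self, batch_size):
        # Yield dicts of column -> list, one dict per batch.
        for start in range(0, self.n, batch_size):
            stop = min(start + batch_size, self.n)
            yield {"x": list(range(start, stop))}


# batch_size=None disables the DataLoader's automatic batching, so each
# item yielded by the wrapper is passed through as one ready-made batch.
loader = DataLoader(BatchedIterable(ToyDataset(10), batch_size=4), batch_size=None)
batches = [batch["x"] for batch in loader]
```

This sketch does not address sharding across workers; with `num_workers > 0`, every worker could iterate the full dataset unless the wrapper splits the work itself.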

Motivation

The current implementation loads each element of a batch individually, which can be very slow when batch_size is large. I did some experiments here, and batched iteration would speed up data loading significantly.

Your contribution

N/A

@lneukom lneukom added the enhancement New feature or request label Oct 5, 2023
@VascoSch92

This is exactly what I was looking for. It would also be very useful for me :-)
