Read videos from memory #6603

Open
vedantroy opened this issue Sep 18, 2022 · 3 comments


@vedantroy

vedantroy commented Sep 18, 2022

🚀 The feature

I have a lot of small videos (3 million files). To reduce the disk bottleneck, I store all of them in LMDB (https://lmdb.readthedocs.io/en/release/). I would love the ability to read videos from memory.

Motivation, pitch

My training process (8x A100 GPUs) is data-starved. I'm pretty sure the bottleneck is disk throughput, which is why I use LMDB. If torchvision could read videos from memory, I could keep the video files as binary values in LMDB.

Alternatives

I am about to try doing this manually with PyAV.
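For reference, a minimal sketch of what the PyAV route could look like. It relies on `av.open` accepting a file-like object, so the encoded bytes can be wrapped in `io.BytesIO` instead of being written to a temporary file. The function name `decode_video_bytes` and the `max_frames` parameter are hypothetical, chosen only for illustration; this is not torchvision API.

```python
import io


def decode_video_bytes(data, max_frames=None):
    """Decode an encoded video held in memory into a list of RGB frames.

    `data` is the raw file content (e.g. the bytes of an .mp4).
    Hypothetical helper for illustration, not part of torchvision.
    """
    import av  # PyAV, imported lazily so it is only required when decoding

    frames = []
    # av.open takes a file-like object, so the bytes never touch the disk.
    with av.open(io.BytesIO(data)) as container:
        for i, frame in enumerate(container.decode(video=0)):
            if max_frames is not None and i >= max_frames:
                break
            # Each frame becomes an H x W x 3 uint8 NumPy array.
            frames.append(frame.to_ndarray(format="rgb24"))
    return frames
```

The returned arrays can be stacked and converted to a tensor afterwards; whether this is fast enough for 3 million clips would need to be measured.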

Additional context

No response

cc @pmeier

@pmeier
Collaborator

pmeier commented Sep 19, 2022

Could you clarify what you mean by "read videos from memory"? Do you want a dataset that allows you to read data from an LMDB?

If yes, you can have a look at the LSUN dataset that stores images inside multiple LMDBs. There is probably room for optimization, but this should get you started.
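A rough sketch of that idea, loosely modeled on the LSUN pattern of one read-only LMDB environment plus a list of keys. The class name `LMDBVideoStore`, its constructor arguments, and the pluggable `decode` callable are assumptions made for illustration; none of this is existing torchvision code. The environment is opened lazily so that each `DataLoader` worker process would get its own handle.

```python
class LMDBVideoStore:
    """Map-style dataset over encoded videos stored as LMDB values.

    Hypothetical sketch: `keys` is a sequence of byte keys and `decode`
    is any callable that turns the stored bytes into frames.
    """

    def __init__(self, path, keys, decode):
        self.path = path
        self.keys = list(keys)
        self.decode = decode
        self._env = None  # opened on first read, once per process

    def _read(self, key):
        if self._env is None:
            import lmdb  # py-lmdb, imported lazily

            # Read-only, lock-free settings suited to many reader processes.
            self._env = lmdb.open(
                self.path, readonly=True, lock=False,
                readahead=False, meminit=False,
            )
        with self._env.begin(write=False) as txn:
            value = txn.get(key)
        if value is None:
            raise KeyError(key)
        return bytes(value)

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, index):
        return self.decode(self._read(self.keys[index]))
```

Because the class only needs `__len__` and `__getitem__`, it could be passed to `torch.utils.data.DataLoader` as is, with `decode` being whatever in-memory video decoder is available.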

@YosuaMichael
Contributor

cc @jdsgomes

@jdsgomes
Contributor

Hi @vedantroy, thank you for the feature request. I am actually planning to add this functionality to our VideoReader. I will keep you posted on progress; the ETA is about 2 weeks.


4 participants