implement input sharder #24

Closed
rom1504 opened this issue Nov 13, 2022 · 3 comments
Labels
v0 https://github.com/iejMac/video2dataset/issues/18

Comments

@rom1504
Collaborator

rom1504 commented Nov 13, 2022

https://docs.google.com/document/d/1_TD2KQLkEegszq4Eip568fc6cWnh9h0Jqj4Lc88t9Y0/edit#bookmark=id.g59sfbda6v9c

DoD: input sharder implemented and tested

@rom1504 rom1504 added the v0 https://github.com/iejMac/video2dataset/issues/18 label Nov 13, 2022
@iejMac
Owner

iejMac commented Nov 15, 2022

how do we do this? is this already done in img2dataset?

@rom1504
Collaborator Author

rom1504 commented Nov 15, 2022

Yeah it's https://github.com/rom1504/img2dataset/blob/main/img2dataset/reader.py#L139

Today in img2dataset only one input file is sharded at a time, but that might change in the future

Here I'd suggest sharding everything beforehand

I think that file from img2dataset can mostly be copy-pasted
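
For illustration, here is a minimal sketch of what "shard everything beforehand" could look like. It is not the img2dataset reader; the function name `shard_input`, the `url` column, and the CSV shard layout are assumptions made for this example only.

```python
import csv
import math
import os
import tempfile


def shard_input(rows, shard_size, output_dir):
    """Split all input rows into fixed-size shards up front.

    Hypothetical helper: writes one CSV file per shard into output_dir
    and returns a list of (shard_id, shard_path) tuples.
    """
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    os.makedirs(output_dir, exist_ok=True)

    num_shards = math.ceil(len(rows) / shard_size)
    shards = []
    for shard_id in range(num_shards):
        begin = shard_id * shard_size
        end = min(begin + shard_size, len(rows))
        # Zero-padded names keep shard files sorted in creation order.
        shard_path = os.path.join(output_dir, f"{shard_id:05d}.csv")
        with open(shard_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["url"])  # assumed single-column schema
            writer.writerows([url] for url in rows[begin:end])
        shards.append((shard_id, shard_path))
    return shards


if __name__ == "__main__":
    urls = [f"https://example.com/video_{i}.mp4" for i in range(5)]
    with tempfile.TemporaryDirectory() as tmp_dir:
        for shard_id, path in shard_input(urls, 2, tmp_dir):
            print(shard_id, os.path.basename(path))
```

Writing every shard before any download starts means workers only need a shard path to begin, at the cost of one full pass over the input up front.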

@rom1504
Collaborator Author

rom1504 commented Nov 19, 2022

done

@rom1504 rom1504 closed this as completed Nov 19, 2022

2 participants