-
Notifications
You must be signed in to change notification settings - Fork 71
Speed-up samplers by avoiding backwards seeks #245
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
c090e44
6b52cfd
c618160
b5545a9
2c5a559
ce46196
5fed662
1873174
e86e017
63462e9
a97afe7
83c6763
71a839a
a333e9b
08833b0
3dcbe1e
054b72e
2bb1d58
9b93214
7a500de
75945a8
0c5c537
5585b46
523dff9
5814439
6e18e07
6378a34
9ae87cc
5eba3e4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -160,28 +160,54 @@ def _build_all_clips_indices( | |
| def _decode_all_clips_indices( | ||
| decoder: VideoDecoder, all_clips_indices: list[int], num_frames_per_clip: int | ||
| ) -> list[FrameBatch]: | ||
| # This takes the list of all the frames to decode, decode all the frames, | ||
| # and then packs them into clips of length num_frames_per_clip. | ||
| # This is slow, unoptimized, and u.g.l.y. It is not meant to stay. | ||
| # TODO: | ||
| # - sort the frames to avoid backward seeks, dedup, decode, and re-organize frames. | ||
| # - write most of this in C++ | ||
| # This takes the list of all the frames to decode (in arbitrary order), | ||
| # decode all the frames, and then packs them into clips of length | ||
| # num_frames_per_clip. | ||
| # | ||
| # To avoid backwards seeks (which are slow), we: | ||
| # - sort all the frame indices to be decoded | ||
| # - dedup them | ||
| # - decode all unique frames in sorted order | ||
| # - re-assemble the decoded frames back to their original order | ||
| # | ||
| # TODO: Write this in C++ so we can avoid the copies that happen in `to_framebatch` | ||
|
|
||
| def chunk_list(lst, chunk_size): | ||
| # return list of sublists of length chunk_size | ||
| return [lst[i : i + chunk_size] for i in range(0, len(lst), chunk_size)] | ||
|
|
||
| def to_framebatch(frames: list[Frame]) -> FrameBatch: | ||
| # IMPORTANT: see other IMPORTANT note below | ||
| data = torch.stack([frame.data for frame in frames]) | ||
| pts_seconds = torch.tensor([frame.pts_seconds for frame in frames]) | ||
| duration_seconds = torch.tensor([frame.duration_seconds for frame in frames]) | ||
| return FrameBatch( | ||
| data=data, pts_seconds=pts_seconds, duration_seconds=duration_seconds | ||
| ) | ||
|
|
||
| all_decoded_frames: list[Frame] = [ | ||
| decoder.get_frame_at(index) for index in all_clips_indices | ||
| ] | ||
| all_clips_indices_sorted, argsort = zip( | ||
| *sorted((frame_index, i) for (i, frame_index) in enumerate(all_clips_indices)) | ||
| ) | ||
| previous_decoded_frame = None | ||
| all_decoded_frames = [None] * len(all_clips_indices) | ||
| for i, j in enumerate(argsort): | ||
| frame_index = all_clips_indices_sorted[i] | ||
| if ( | ||
| previous_decoded_frame is not None # then we know i > 0 | ||
| and frame_index == all_clips_indices_sorted[i - 1] | ||
| ): | ||
| # Avoid decoding the same frame twice. | ||
| # IMPORTANT: this is only correct because a copy of the frame will | ||
| # happen within `to_framebatch` when we call torch.stack. | ||
| # If a copy isn't made, the same underlying memory will be used for | ||
| # the 2 consecutive frames. When we re-write this, we should make | ||
| # sure to explicitly copy the data. | ||
| decoded_frame = previous_decoded_frame | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is setting it to the same python object, right? Will there be any issues with that? Example, if the user modifies that tensor or something else in FrameBatch -- they will modify both entries in the list, right?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To be slightly safer w.r.t. future changes this should be decoded_frame = copy(previous_decoded_frame)but we don't implement Note that a copy still happens within We can either:
LMK.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am OK with this as-is.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. OK, I'll add a comment in |
||
| else: | ||
| decoded_frame = decoder.get_frame_at(index=frame_index) | ||
| previous_decoded_frame = decoded_frame | ||
| all_decoded_frames[j] = decoded_frame | ||
|
|
||
| all_clips: list[list[Frame]] = chunk_list( | ||
| all_decoded_frames, chunk_size=num_frames_per_clip | ||
| ) | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that we don't have to chunk the clips. The implementation already allows us to return a single 5D |
||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.