@NicolasHug
Contributor

@NicolasHug NicolasHug commented Feb 9, 2025

Towards #476. Our VideoDecoder is inherently single-stream and almost all streamIndex parameters aren't used, so this PR removes them. This also removes most but not all multi-stream related logic (there is some left in the scan for example).

I haven't updated the custom ops APIs or the core APIs because this may create conflicts internally, and I want to deal with that separately in a follow-up PR.

Other clean-ups we could do afterwards:

  • Update Python core APIs and ops APIs to remove the stream_index parameter.
  • Remove the AVFrameStream struct and just use AVFrame. We already know the stream, so we don't need to keep track of it.
  • Remove the streamInfos_ map and only keep a single StreamInfo for the active stream.
  • Simplify the scan to ignore inactive streams (at the very least we don't need to set the StreamInfo_ for inactive streams).

@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 9, 2025
@NicolasHug NicolasHug marked this pull request as ready for review February 10, 2025 10:10
videoStreamOptions.colorConversionLibrary.value_or(defaultLibrary);
}

void VideoDecoder::updateMetadataWithCodecContext(
Contributor Author

I inlined this above, as it became hard to justify this being a single separate method with only one call-site.

-VideoDecoder::FrameOutput VideoDecoder::getNextFrameNoDemux() {
-auto output = getNextFrameNoDemuxInternal();
-output.data = maybePermuteHWC2CHW(output.streamIndex, output.data);
+VideoDecoder::FrameOutput VideoDecoder::getNextFrame() {
Contributor Author

@NicolasHug NicolasHug Feb 10, 2025

Removed all the NoDemux suffixes: they weren't informative (and were wrong).

break;
}
VideoDecoder::FrameOutput VideoDecoder::getFramePlayedAt(double seconds) {
StreamInfo& streamInfo = streamInfos_[activeStreamIndex_];
Contributor Author

Removed the for loop over streamInfos_ and dedented its body, which now uses activeStreamIndex_ directly. This is probably a minor fix.
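
The shape of that change can be sketched roughly as follows. This is a simplified illustration, not the torchcodec code: `StreamInfo` is a stand-in struct with a hypothetical `label` field, both function names are made up, and it assumes the old loop body only ever did useful work for the active stream.

```cpp
#include <map>
#include <string>

// Stand-in for the decoder's per-stream state (the field is hypothetical).
struct StreamInfo {
  std::string label;
};

// Before: iterate over every stream, acting only on the active one.
std::string labelViaLoop(
    std::map<int, StreamInfo>& streamInfos,
    int activeStreamIndex) {
  for (auto& [streamIndex, streamInfo] : streamInfos) {
    if (streamIndex == activeStreamIndex) {
      return streamInfo.label;
    }
  }
  return "";
}

// After: look up the active stream directly, with no loop and one less
// level of indentation.
std::string labelViaLookup(
    std::map<int, StreamInfo>& streamInfos,
    int activeStreamIndex) {
  StreamInfo& streamInfo = streamInfos[activeStreamIndex];
  return streamInfo.label;
}
```

Both functions return the same value when the active stream exists in the map; the second simply no longer pretends there could be several streams to process.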

OpsFrameOutput get_frame_at_index(
at::Tensor& decoder,
-int64_t stream_index,
+[[maybe_unused]] int64_t stream_index,
Contributor Author

As mentioned in the PR description, I chose not to update the ops and core APIs. I will deal with that in a follow-up.
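
For context on the attribute in the diff: `[[maybe_unused]]` (C++17) lets a parameter stay in the signature, so existing callers keep compiling, while suppressing the unused-parameter warning now that the body no longer reads it. A minimal sketch, using hypothetical names (`Decoder`, `frameAtIndex`) rather than the real ops code:

```cpp
#include <cstdint>

// Hypothetical stand-in for a single-stream decoder.
struct Decoder {
  int64_t frameCount = 10;
};

// stream_index is kept for API compatibility but deliberately ignored.
int64_t frameAtIndex(
    const Decoder& decoder,
    [[maybe_unused]] int64_t stream_index,
    int64_t frame_index) {
  // Clamp into the valid range, purely for illustration.
  return frame_index < decoder.frameCount ? frame_index
                                          : decoder.frameCount - 1;
}
```

Callers can pass any value for `stream_index` and get the same result, which is what makes removing the parameter in a later PR a purely mechanical change.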

-const StreamInfo& streamInfo,
-int64_t pts) const {
+int VideoDecoder::getKeyFrameIndexForPts(int64_t pts) const {
+const StreamInfo& streamInfo = streamInfos_.at(activeStreamIndex_);
Contributor Author

@NicolasHug NicolasHug Feb 10, 2025

Using .at() instead of [] to preserve the const qualifier of the method. I would otherwise get:

  /home/nicolashug/dev/torchcodec/src/torchcodec/decoders/_core/VideoDecoder.cpp:1475:65: error: passing ‘const std::map<int, facebook::torchcodec::VideoDecoder::StreamInfo>’ as ‘this’ argument discards qualifiers [-fpermissive]
   1475 |   const StreamInfo& streamInfo = streamInfos_[activeStreamIndex_];
        |                                                                 ^

@scotts is there a more idiomatic way of dealing with this?

Contributor

This is the correct thing to do. On a map, operator[] has no const overload: it always returns a non-const reference, and if the key is not there it inserts a value-initialized element for that key, so it cannot be called on a const map. The at() member function has a const overload that returns a const reference, and it checks that the key exists, so a missing key throws std::out_of_range.
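
A minimal, self-contained sketch of the difference. The `Decoder` class and its `width` field are hypothetical; only the `std::map` behavior is the point here:

```cpp
#include <map>
#include <stdexcept>

// Hypothetical per-stream state.
struct StreamInfo {
  int width = 0;
};

class Decoder {
 public:
  // Non-const context: operator[] is allowed and inserts the key.
  Decoder() { streamInfos_[0] = StreamInfo{640}; }

  // Const method: streamInfos_ is const here, so only at() compiles.
  int activeWidth() const {
    // return streamInfos_[activeStreamIndex_].width;  // error: discards qualifiers
    return streamInfos_.at(activeStreamIndex_).width;
  }

  // find() is another const-friendly lookup that avoids exceptions.
  bool hasStream(int index) const {
    return streamInfos_.find(index) != streamInfos_.end();
  }

  // at() throws std::out_of_range for a missing key instead of inserting one.
  bool lookupThrows(int index) const {
    try {
      (void)streamInfos_.at(index);
      return false;
    } catch (const std::out_of_range&) {
      return true;
    }
  }

 private:
  std::map<int, StreamInfo> streamInfos_;
  int activeStreamIndex_ = 0;
};
```

If throwing on a missing key is undesirable, `find()` followed by a check against `end()` keeps the method const without relying on exceptions.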

@facebook-github-bot
Contributor

@NicolasHug has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@NicolasHug NicolasHug merged commit 0f50aba into meta-pytorch:main Feb 10, 2025
47 of 50 checks passed
@scotts
Contributor

scotts commented Feb 10, 2025

I considered whether we should eventually make the C++ video decoder take the stream index as a constructor parameter, similar to what we do in Python. But then I remembered that there is one valid case of a C++ video decoder with no active stream: reading the metadata from the header.
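
One way to picture that constraint is a decoder whose active stream is optional. This is only a sketch under assumptions: the class, its methods, and the use of `std::optional` are hypothetical and say nothing about how torchcodec actually represents "no active stream".

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical decoder that can exist without an active stream.
class Decoder {
 public:
  explicit Decoder(std::string path) : path_(std::move(path)) {}

  // Usable with no active stream, e.g. for header-only metadata reads.
  const std::string& path() const { return path_; }

  // Selecting a stream is a separate, later step.
  void addStream(int streamIndex) { activeStreamIndex_ = streamIndex; }

  bool hasActiveStream() const { return activeStreamIndex_.has_value(); }

  // Decoding paths would require an active stream and fail loudly otherwise.
  int activeStreamIndex() const {
    if (!activeStreamIndex_) {
      throw std::logic_error("no active stream");
    }
    return *activeStreamIndex_;
  }

 private:
  std::string path_;
  std::optional<int> activeStreamIndex_;
};
```

Requiring the stream index in the constructor would rule out the first state, which is exactly the metadata-only case described above.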
