Conversation

@NicolasHug (Contributor) commented Nov 6, 2024:

This PR adds checks for the shape of the output tensors we allocate. This should help us avoid some potential memory corruption issues.

This PR does not change any existing logic (well, not really; see details in the comments below). It only adds safety checks that validate existing assumptions.

@facebook-github-bot added the "CLA Signed" label (managed by the Meta Open Source bot) on Nov 6, 2024.
expectedOutputWidth,
"x3, got ",
shape);
}
@NicolasHug (author) commented:

The check above validates the shape of the pre-allocated output tensor. It was previously only done for swscale, but it should also be done for filtergraph, which is why I moved it up.
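As an illustration, here is a minimal plain-C++ sketch of the same idea (the `validateOutputShape` helper is hypothetical, with `std::vector<int64_t>` standing in for torch's `IntArrayRef`):

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper mirroring the TORCH_CHECK added in this PR: verify that
// a pre-allocated HWC uint8 output buffer has exactly the shape we are about
// to write into, so a bad allocation fails loudly instead of corrupting memory.
void validateOutputShape(
    const std::vector<int64_t>& shape,
    int64_t expectedHeight,
    int64_t expectedWidth) {
  bool ok = shape.size() == 3 && shape[0] == expectedHeight &&
      shape[1] == expectedWidth && shape[2] == 3;
  if (!ok) {
    throw std::runtime_error(
        "Expected pre-allocated tensor of shape " +
        std::to_string(expectedHeight) + "x" +
        std::to_string(expectedWidth) + "x3");
  }
}
```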

int VideoDecoder::convertFrameToBufferUsingSwsScale(
int streamIndex,
const AVFrame* frame,
torch::Tensor& outputTensor) {
@NicolasHug (author) commented:

I have updated the signature of convertFrameToBufferUsingSwsScale for it to be very similar to convertFrameToTensorUsingFilterGraph. The signatures are now:

int VideoDecoder::convertFrameToBufferUsingSwsScale(
    int streamIndex,
    const AVFrame* frame,
    torch::Tensor& outputTensor);

torch::Tensor VideoDecoder::convertFrameToTensorUsingFilterGraph(
    int streamIndex,
    const AVFrame* frame);

Their signatures being so different previously was a constant source of confusion for me.

I realize that this change seemingly goes against our TODO of folding the pre-allocated tensor into RawOutput. But I think that TODO will in fact be easier to address after this change, since both APIs are now more closely aligned.

A reviewer commented:

Why is this function now returning the height?

@NicolasHug (author) commented:

So that we can check against it and have the swscale check next to the filtergraph check. It makes the logic of both libraries more symmetric.
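As a toy illustration of that point (hypothetical names, plain C++; the real code gets the height from the swscale conversion), returning the height lets the caller keep the swscale check right next to the filtergraph one:

```cpp
#include <stdexcept>

// Toy stand-in for the conversion: returns the number of output rows written,
// the way sws_scale reports the height of the converted slice.
int convertFrameToBuffer(int inputHeight) {
  return inputHeight;
}

// Caller-side check, symmetric with the filtergraph path: the returned height
// must match the height of the pre-allocated output tensor.
int checkedConvert(int expectedHeight) {
  int resultHeight = convertFrameToBuffer(expectedHeight);
  if (resultHeight != expectedHeight) {
    throw std::runtime_error("conversion produced an unexpected height");
  }
  return resultHeight;
}
```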

TORCH_CHECK_EQ(filteredFrame->format, AV_PIX_FMT_RGB24);
auto frameDims = getHeightAndWidthFromOptionsOrAVFrame(
streams_[streamIndex].options, *filteredFrame.get());
auto frameDims = getHeightAndWidthFromResizedAVFrame(*filteredFrame.get());
@NicolasHug (author) commented Nov 6, 2024:

I have reverted this logic to what it was before #332. I already had some doubts in #332 (comment), and now I think it's clear that this logic for finding height and width should be treated as separate from the 2 existing ones.

I have updated the big note below, with my updated understanding.

(I know it's a bit complicated. But the complexity isn't caused by this PR!)


// TODO height and width info of output tensor comes from the metadata, which
// may not be accurate. How do we make sure we won't corrupt memory if the
// allocated tensor is too short/large?
@NicolasHug (author) commented:

@ahmadsharif1 any idea on how to do this?

A reviewer commented:

You can look at rawOutput.frame's dimensions and use those, right?

@NicolasHug (author) commented:

Are you suggesting to get height and width from the AVFrame instead of from the Metadata then? I don't mind but that's a change of logic. Do you know why it wasn't done like that before?

@NicolasHug (author) commented:

Discussed offline; I'll submit a follow-up to do that.
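A sketch of that follow-up idea, under the assumption that the decoded frame's own fields are the source of truth (the `FakeAVFrame` struct below is a stand-in for FFmpeg's `AVFrame`, whose real `width`/`height` fields describe the decoded picture):

```cpp
// Stand-in for FFmpeg's AVFrame; the real struct exposes width/height fields
// describing the decoded picture, which may disagree with stream metadata.
struct FakeAVFrame {
  int width;
  int height;
};

struct FrameDims {
  int height;
  int width;
};

// Taking dimensions from the decoded frame itself, rather than from container
// metadata, guarantees the allocation matches what will actually be written.
FrameDims getDimsFromFrame(const FakeAVFrame& frame) {
  return FrameDims{frame.height, frame.width};
}
```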

TORCH_CHECK(
(shape.size() == 3) && (shape[0] == expectedOutputHeight) &&
(shape[1] == expectedOutputWidth) && (shape[2] == 3),
"Expected pre-allocated tensor of shape ",
A reviewer commented:

You should also be able to do (shape == {expectedOutputHeight, expectedOutputWidth, 3}) if you think that's clearer. I'm not sure.
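With a plain `std::vector<int64_t>` the whole-shape comparison the comment suggests reads like this (a sketch only; with torch's `IntArrayRef`, check which comparison operators actually exist before writing `shape == {...}`):

```cpp
#include <cstdint>
#include <vector>

// Compare the full shape in one expression instead of element by element.
bool shapeMatches(
    const std::vector<int64_t>& shape,
    int64_t expectedHeight,
    int64_t expectedWidth) {
  return shape == std::vector<int64_t>{expectedHeight, expectedWidth, 3};
}
```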

}
}

FrameDims getHeightAndWidthFromResizedAVFrame(const AVFrame& resizedAVFrame) {
@scotts (Contributor) commented Nov 6, 2024:

Because the function itself has no opinion on what kind of frame it is (resized or not), I think the name should just be getHeightAndWidthFromAVFrame().

Read comment below. I get the motivation for grepping through the code. I don't think this is amazing, but it should help us get to a cleaner place, and it does make explicit how convoluted things are.

@scotts (Contributor) commented Nov 6, 2024:

The changes look good to me, but I think @ahmadsharif1 should also review.

// for pre-allocating batched output tensors (we could pre-allocate those only
// once we decode the first frame to get the info from the AVFrame, but that's
// more complex logic).
// The source of truth for height and width really is the (resized) AVFrame:
A reviewer commented:

The decoded output from FFmpeg isn't resized. The resize happens later.


// There's nothing preventing you from calling this on a non-resized frame, but
// please don't.
FrameDims getHeightAndWidthFromResizedAVFrame(const AVFrame& resizedAVFrame);
A reviewer commented:

getHeightAndWidth can just be called getFrameDims


@NicolasHug merged commit c2bea4b into meta-pytorch:main on Nov 8, 2024 (37 of 40 checks passed).