Skip to content

Conversation

@scotts
Copy link
Contributor

@scotts scotts commented Dec 4, 2025

Depends on #1081.

Implementation of torchcodec.transforms.CenterCrop.

The only potential surprise here is the change to the C++ Crop transform, where we are using it to implement both CenterCrop and RandomCrop. My rationale here is that the C++ objects have a 1:1 mapping to the FFmpeg filters, and the objects in torchcodec.transforms.DecoderTransform can map to either a TorchVision transform or directly to an FFmpeg filter. (We don't have any of the latter yet.) The code in custom_ops.cpp is the bridge between the two.

I'm open to instead making the C++ side not share crop implementations and define a C++ CenterCrop that takes no x and y coordinates.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Dec 4, 2025
@scotts scotts marked this pull request as ready for review December 4, 2025 02:41
Copy link
Contributor

@NicolasHug NicolasHug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

:template: dataclass.rst

DecoderTransform
RandomCrop
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is #1081 's driveby, we also need CenterCrop here now :)

Comment on lines +49 to +53
std::string coordinates = x_.has_value()
? (":" + std::to_string(x_.value()) + ":" + std::to_string(y_.value()))
: "";
return "crop=" + std::to_string(outputDims_.width) + ":" +
std::to_string(outputDims_.height) + ":" + std::to_string(x_) + ":" +
std::to_string(y_) + ":exact=1";
std::to_string(outputDims_.height) + coordinates + ":exact=1";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to check my understanding, it's a filtergraph built-in behavior that passing "crop=w:h" leads to a center crop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. I can update the comment to make that clear. From the docs:

x
The horizontal position, in the input video, of the left edge of the output video. It defaults to (in_w-out_w)/2. This expression is evaluated per-frame.

y
The vertical position, in the input video, of the top edge of the output video. It defaults to (in_h-out_h)/2. This expression is evaluated per-frame.

CropTransform(const FrameDims& dims, int x, int y);

// Becomes a center crop if x and y are not specified.
CropTransform(const FrameDims& dims);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To follow-up on your main comment about using CropTransform for both RandomCrop and CenterCrop, I think what you have here, and the rationale behind it, make sense.

@scotts scotts merged commit 7bc569a into meta-pytorch:main Dec 4, 2025
74 checks passed
@scotts scotts deleted the center_crop branch December 4, 2025 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants