Contributor

@Dan-Flores Dan-Flores commented Nov 7, 2025

Since crf was already utilized in the C++ layer, this PR adds crf to the Python API and moves the tests from test_ops.py to test_encoders.py.

Validation

The function validateNumericOption lets us validate an argument when its AVOption has min and max fields. This check is applied to crf to improve our error message. Previously, an out-of-range value surfaced FFmpeg's own error:

RuntimeError: avcodec_open2 failed: Result too large

With this PR, the error reports the valid range for the codec:

RuntimeError: crf=-10 is out of valid range [-1, 3.40282e+38] for this codec. For more details, run 'ffmpeg -h encoder=libx264'

RuntimeError: crf=-10 is out of valid range [0, 63] for this codec. For more details, run 'ffmpeg -h encoder=libsvtav1'

RuntimeError: crf=-10 is out of valid range [-1, 63] for this codec. For more details, run 'ffmpeg -h encoder=libvpx-vp9'
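
The range check described above can be sketched in Python. This is only an illustration of the logic, not the PR's C++ code: `validate_numeric_option` and its parameters are hypothetical names, and the `[0, 63]` bounds are copied from the libsvtav1 message quoted above rather than read from FFmpeg.

```python
def validate_numeric_option(name, value, min_value, max_value, encoder_name):
    """Raise RuntimeError if value lies outside [min_value, max_value].

    Hypothetical Python mirror of the check. In the PR, the bounds come from
    the codec's AVOption min/max fields on the C++ side.
    """
    if not (min_value <= value <= max_value):
        raise RuntimeError(
            f"{name}={value:g} is out of valid range "
            f"[{min_value:g}, {max_value:g}] for this codec. "
            f"For more details, run 'ffmpeg -h encoder={encoder_name}'"
        )


# Bounds below are taken from the libsvtav1 error message quoted above.
try:
    validate_numeric_option("crf", -10, 0, 63, "libsvtav1")
except RuntimeError as err:
    message = str(err)
    print(message)
```

An in-range value such as `validate_numeric_option("crf", 23, 0, 63, "libsvtav1")` returns without raising.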

Testing

The tests are updated to use the Python API encoding pattern:

# Previous ops pattern:
encode_video_to_file(
    frames=source_frames,
    frame_rate=frame_rate,
    filename=encoder_output_path,
    pixel_format=pixel_format,
    crf=crf,
)
# Updated Python pattern:
encoder = VideoEncoder(frames=source_frames, frame_rate=frame_rate)
encoder.to_file(dest=encoder_output_path, pixel_format=pixel_format, crf=crf)

@meta-cla meta-cla bot added the CLA Signed label Nov 7, 2025
@Dan-Flores Dan-Flores marked this pull request as ready for review November 10, 2025 19:21
pixel_format (str, optional): The pixel format to encode frames into (e.g.,
"yuv420p", "yuv444p"). If not specified, uses codec's default format.
crf (int, optional): Constant Rate Factor for encoding quality. Lower values
mean better quality. Valid range depends on the encoder (commonly 0-51).
Contributor

Is it ever valid to be less than 0?

Contributor Author

I believe -1 is valid and is equivalent to leaving crf unset. Otherwise, no negative values are valid.

Contributor

scotts commented Nov 10, 2025

I think this is great! We should add some tests for invalid crf values: values less than 0 (which I think is always invalid?), values we know are outside the range for a given codec, and values of the wrong type. It's fine if these result in exceptions on the C++ side, but we want to make sure users get a clean Python exception and not a segfault.

Args:
format (str): The container format of the encoded frames, e.g. "mp4", "mov",
- "mkv", "avi", "webm", "flv", or "gif"
+ "mkv", "avi", "webm", "flv", etc.
Contributor

Q - why remove "gif"? Do we not support it anymore?

Contributor Author

We do not test explicitly for it anymore, but it still works. I mostly wanted to amend the docstring to make it seem less like a finalized, exhaustive list of supported formats.


for s_frame, rt_frame in zip(source_frames, round_trip_frames):
    assert psnr(s_frame, rt_frame) > 30
    torch.testing.assert_close(s_frame, rt_frame, atol=2, rtol=0)
Contributor

This seems to be failing for webm; you might need to use the previous logic:

        # If FFmpeg selects a codec or pixel format that does lossy encoding, assert 99% of pixels
        # are within a higher tolerance.
        if ffmpeg_version == 6:
            assert_close = partial(assert_tensor_close_on_at_least, percentage=99)
            atol = 15
        else:
            assert_close = torch.testing.assert_close
            atol = 3 if format == "webm" else 2

Contributor Author

Thanks for the reminder - it seems I applied the webm tolerance to the wrong test. We can simply use atol = 3 if format == "webm" else 2 on the round_trip_test, though I'm not sure why webm needs this special handling.
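
For readers unfamiliar with the two kinds of check discussed in this thread, here is a framework-free sketch of a PSNR check and an "at least N percent of pixels within tolerance" check. The real test helpers (`psnr`, `assert_tensor_close_on_at_least`) operate on torch tensors and their implementations are not shown here, so the function bodies below, the 8-bit peak value of 255, the flat-list inputs, and the names `fraction_within` and `tolerance_for` are all assumptions.

```python
import math


def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_val**2 / mse)


def fraction_within(a, b, atol):
    """Fraction of positions whose absolute difference is at most atol."""
    close = sum(1 for x, y in zip(a, b) if abs(x - y) <= atol)
    return close / len(a)


def tolerance_for(container_format):
    # Per the thread: webm gets a slightly looser absolute tolerance.
    return 3 if container_format == "webm" else 2


source = [10, 20, 30, 40]
round_trip = [11, 19, 30, 43]  # made-up values standing in for decoded pixels

assert psnr(source, round_trip) > 30
assert fraction_within(source, round_trip, atol=tolerance_for("webm")) == 1.0
assert fraction_within(source, round_trip, atol=tolerance_for("mp4")) == 0.75
```

With these made-up pixels, a single value that is off by 3 passes under the webm tolerance but fails a strict check at atol=2, which is the kind of difference the format-specific tolerance absorbs.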

frame_rate=30,
)
getattr(encoder, method)(**valid_params, crf=-10)

Contributor Author

This is the only new test case - all other tests are copied over from test_ops.py.

Contributor

@NicolasHug NicolasHug left a comment

Thanks @Dan-Flores, let's address the comments before merging, but LGTM

if (option->type == AV_OPT_TYPE_INT || option->type == AV_OPT_TYPE_INT64 ||
    option->type == AV_OPT_TYPE_FLOAT || option->type == AV_OPT_TYPE_DOUBLE) {
  TORCH_CHECK(
      value >= option->min && value <= option->max,
Contributor

This compares an int (value) to a double (min and max); we should cast.

Contributor Author

This comment led me to realize that codecs can implement 'crf' as either a double or an int.
I'll update the PR to accept either type and treat it as a double in the C++ layer, so this cast will not be necessary.
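
One way to picture "accept either type and treat it as a double" is a small normalization step on the Python side before the value crosses into C++. This is a sketch only: the helper name `normalize_crf` is hypothetical, and rejecting `bool` is an assumption rather than something stated in this PR.

```python
from typing import Optional, Union


def normalize_crf(crf: Optional[Union[int, float]]) -> Optional[float]:
    """Return crf as a float, or None when it is unset."""
    if crf is None:
        return None
    # bool is a subclass of int in Python; rejecting it here is an assumption.
    if isinstance(crf, bool) or not isinstance(crf, (int, float)):
        raise TypeError(f"crf must be an int or float, got {type(crf).__name__}")
    return float(crf)


print(normalize_crf(23))    # 23.0
print(normalize_crf(23.5))  # 23.5
print(normalize_crf(None))  # None
```

Passing a string such as `"23"` raises a `TypeError` in this sketch, which matches the goal stated earlier in the review of giving users a clean Python exception for a wrong-typed value.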

const char* optionName,
int value) {
// First determine if codec's private class is defined
if (!avCodec.priv_class) {
Contributor

Are we OK to use priv_class? I.e., is it meant to be "private"?

Contributor Author

In avcodec.h, priv_class is defined in the section for public fields: https://www.ffmpeg.org/doxygen/2.0/libavcodec_2avcodec_8h_source.html

TORCH_CHECK(false, errorMsg.str());
}

void validateNumericOption(
Contributor

For now this expects an int, not a float or double. Let's reflect that in the name. We may define this with a template later so that it truly covers generic numeric options.

Suggested change
- void validateNumericOption(
+ void validateIntOption(

@Dan-Flores Dan-Flores merged commit c69739f into meta-pytorch:main Nov 13, 2025
71 of 79 checks passed
@Dan-Flores Dan-Flores deleted the crf_encode_option branch November 13, 2025 16:06