-
Notifications
You must be signed in to change notification settings - Fork 70
Add crf to VideoEncoder API #1031
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
715db20 to
8fcc181
Compare
| pixel_format (str, optional): The pixel format to encode frames into (e.g., | ||
| "yuv420p", "yuv444p"). If not specified, uses codec's default format. | ||
| crf (int, optional): Constant Rate Factor for encoding quality. Lower values | ||
| mean better quality. Valid range depends on the encoder (commonly 0-51). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it ever valid to be less than 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe -1 is valid and is equivalent to leaving crf unset. Otherwise, no negative values are valid.
|
I think this is great! We should add some tests for invalid crf values, both less than 0 (which I think is always invalid?), values we know are outside of the range for a given codec and the wrong type. It's fine if these result in exceptions on the C++ side, but we want to make sure users get a clean Python exception and not a segfault. |
| Args: | ||
| format (str): The container format of the encoded frames, e.g. "mp4", "mov", | ||
| "mkv", "avi", "webm", "flv", or "gif" | ||
| "mkv", "avi", "webm", "flv", etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q - why remove "gif"? Do we not support it anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not test explicitly for it anymore, but it still works. I mostly wanted to amend the docstring to make it seem less like a finalized, exhaustive list of supported formats.
test/test_encoders.py
Outdated
|
|
||
| for s_frame, rt_frame in zip(source_frames, round_trip_frames): | ||
| assert psnr(s_frame, rt_frame) > 30 | ||
| torch.testing.assert_close(s_frame, rt_frame, atol=2, rtol=0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be failing for webm, you might need to use the previous logic
# If FFmpeg selects a codec or pixel format that does lossy encoding, assert 99% of pixels
# are within a higher tolerance.
if ffmpeg_version == 6:
assert_close = partial(assert_tensor_close_on_at_least, percentage=99)
atol = 15
else:
assert_close = torch.testing.assert_close
atol = 3 if format == "webm" else 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reminder - it seems I applied the webm tolerance to the wrong test. We can simply use atol = 3 if format == "webm" else 2 on the round_trip_test, though I'm not sure why webm needs this special handling.
| frame_rate=30, | ||
| ) | ||
| getattr(encoder, method)(**valid_params, crf=-10) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only new test case - all other tests are copied over from test_ops.py.
NicolasHug
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dan-Flores , let's address the comments before merging but LGTM
| if (option->type == AV_OPT_TYPE_INT || option->type == AV_OPT_TYPE_INT64 || | ||
| option->type == AV_OPT_TYPE_FLOAT || option->type == AV_OPT_TYPE_DOUBLE) { | ||
| TORCH_CHECK( | ||
| value >= option->min && value <= option->max, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is comparing an int (value) to a double (min and max), we should cast.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment led me to realize that codecs can implement 'crf' as a double or an int.
I'll update the PR to accept either type, and treat it as a double in the C++, so this casting will not be necessary.
| const char* optionName, | ||
| int value) { | ||
| // First determine if codec's private class is defined | ||
| if (!avCodec.priv_class) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we OK to use priv_class? I.e. is it meant to be "private"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the avcodec.h, priv_class is defined in the section for public fields: https://www.ffmpeg.org/doxygen/2.0/libavcodec_2avcodec_8h_source.html
src/torchcodec/_core/Encoder.cpp
Outdated
| TORCH_CHECK(false, errorMsg.str()); | ||
| } | ||
|
|
||
| void validateNumericOption( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now this expects an int, not a float or double. Let's reflect that in the name. We may define this with a template later to be truely about generic numeric options.
| void validateNumericOption( | |
| void validateIntOption( |
Since
crfwas already utlized in the C++ layer, this PR addscrfto the python API, and moves the tests fromtest_ops.pytotest_encoders.py.Validation
The function
validateNumericOptionlets us validate an argument if itsAVOptionhasminandmaxfields. This error checking is applied tocrfto improve our error message.FFmpeg's output error message:
To our own message:
Testing
The tests are updated to use the python API encoding pattern: