-
Notifications
You must be signed in to change notification settings - Fork 72
Add codec options to VideoEncoder API #1050
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add codec options to VideoEncoder API #1050
Conversation
NicolasHug
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Dan-Flores , looks great! Made a few comments below.
| pixel_format: Optional[str] = None, | ||
| crf: Optional[Union[int, float]] = None, | ||
| preset: Optional[Union[str, int]] = None, | ||
| codec_options: Optional[Dict[str, str]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we make it Dict[str, Any]? I think there's value in allowing "some_param":16 instead of the un-pythonic "some_param: "16". We just need to call str() on all the values, which is OK.
| a string: "fast", "medium", "slow"). Defaults to None | ||
| (which will use encoder's default). | ||
| codec_options (dict[str, str], optional): A dictionary of codec-specific | ||
| options to pass to the encoder, e.g. ``{"preset": "slow", "tune": "film"}``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have preset as a built-in parameter so it might be confusing to document it here.
src/torchcodec/_core/Encoder.h
Outdated
| void sortCodecOptions( | ||
| const std::map<std::string, std::string>& codecOptions, | ||
| AVDictionary** codecDict, | ||
| AVDictionary** formatDict); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it could be a pure function in an anonymous namespace rather than a method?
| from torchcodec import _core | ||
|
|
||
|
|
||
| def _flatten_codec_options(codec_options: Optional[Dict[str, str]]) -> Optional[list]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: write that at the bottom like the rest of the helpers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably don't need a function to do this in Python, I'll remove the function altogether.
| "avcodec_open2 failed: Invalid argument", | ||
| ), | ||
| ], | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job on the parametrization above
src/torchcodec/_core/ops.py
Outdated
| crf: Optional[Union[int, float]] = None, | ||
| pixel_format: Optional[str] = None, | ||
| preset: Optional[str] = None, | ||
| codec_options: Optional[list[str]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, extra_options is probably better.
src/torchcodec/_core/Encoder.cpp
Outdated
| av_dict_set(formatDict, key.c_str(), value.c_str(), 0); | ||
| } else { | ||
| // Default to codec option (includes AVCodecContext + encoder-private) | ||
| // validateCodecOption(*avCodecContext_->codec, key.c_str(), value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove:
| // validateCodecOption(*avCodecContext_->codec, key.c_str(), value); |
src/torchcodec/_core/Encoder.cpp
Outdated
| AVDictionary** codecDict, | ||
| AVDictionary** formatDict) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This interface is OK for now but we should consider changing it once we use RAII types for the dics (see my other follow-up suggestion).
src/torchcodec/_core/Encoder.cpp
Outdated
| const std::map<std::string, std::string>& codecOptions, | ||
| AVDictionary** codecDict, | ||
| AVDictionary** formatDict) { | ||
| // Search AVFormatContext's AVClass for options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's remove this comment, it doesn't add much value. Let's also document what this function is doing: it takes some options as input and sorts them into codec options and format options, which are returned into two separate dicts.
…o codec_options_encode_option
NicolasHug
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @Dan-Flores !
| extra_options=[ | ||
| x for k, v in (extra_options or {}).items() for x in (k, str(v)) | ||
| ], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was going to suggest something like that instead of the previous _flatten_codec_options but refrained as I thought it might be too crazy lol.
I like it, I'd just suggest the following, which is a tiiiiiny bit safer, and makes it more obvious that everything in the list is a string.
| extra_options=[ | |
| x for k, v in (extra_options or {}).items() for x in (k, str(v)) | |
| ], | |
| extra_options=[ | |
| str(x) for k, v in (extra_options or {}).items() for x in (k, v) | |
| ], |
| extra_options (dict[str, Any], optional): A dictionary of additional | ||
| encoder options to pass, e.g. ``{"preset": "slow", "tune": "film"}``. | ||
| Values will be converted to strings before passing to the encoder. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure to align the docs (this one still have preset), same with the one below
test/test_encoders.py
Outdated
| actual_codec_spec = self._get_video_metadata(dest, fields=["codec_name"]).get( | ||
| "codec_name" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and everywhere, use the [] syntax instead of .get(). We usually get() only when we want to specify a fallback value, which I don't think is intended here.
test/test_encoders.py
Outdated
| # Validate profile (case-insensitive, baseline is reported as "Constrained Baseline") | ||
| assert profile in metadata.get("profile", "").lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest the following, because using in is less strict and can sometimes be surprisingly easy to pass, like assert "" in "abc"
| # Validate profile (case-insensitive, baseline is reported as "Constrained Baseline") | |
| assert profile in metadata.get("profile", "").lower() | |
| expected_profile = "constrained baseline" if profile == "baseline" else profile | |
| assert metadata["profile"].lower() == expected_profile |
This PR adds the
codec_optionsdictionary arg to enable all remaining codec arguments.The Python API accepts a
Dict[str, str], converts it to a flattenedList[str]to pass tocustom_ops.py, which rebuilds the dict inunflattenCodecOptions().Validation
To validate numeric options passed into
codec_options,validateDoubleOptionis refactored intotryToValidateCodecOption:Testing
The error handling done in
tryToValidateCodecOptionis documented intest_codec_options_errors.Via local testing, I also checked that:
sortCodecOptionsproperly finds and sets format options correctly{"-x264-params": "keyint=120:bframes=3"}are utilized