Description
This RFC proposes the design / parameters exposed in our VideoEncoder Python API.
The implementation aims to be simple to use without a deep understanding of FFmpeg. To achieve this, the encoder should rely on FFmpeg to select the default codec and most, if not all, other parameters, as the FFmpeg CLI does.
Proposed API
def VideoEncoder(
    frames: Tensor,
    frame_rate: int,
) -> None:
    pass

def to_file(
    dest: str
) -> None:
    pass

VideoEncoder(frames=frames, frame_rate=25).to_file("output.mov")

def to_tensor(
    format: str
) -> torch.Tensor:
    pass

encoded_tensor = VideoEncoder(frames=frames, frame_rate=25).to_tensor("mov")
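The two output methods differ in how the container format is known: `to_tensor` receives it explicitly, while `to_file` only receives a destination. A minimal sketch of that resolution step, assuming `to_file` infers the format from the file extension much like the FFmpeg CLI guesses it from the output name. `resolve_container_format` is a hypothetical helper and not part of the proposed API; the bare extension is only a stand-in here, since FFmpeg itself maps extensions to muxer names.

```python
from pathlib import Path
from typing import Optional


def resolve_container_format(
    dest: Optional[str] = None,
    format: Optional[str] = None,
) -> str:
    """Hypothetical helper: choose the container format for an encode call.

    Assumption: to_file() infers the format from the extension of `dest`,
    while to_tensor() passes `format` explicitly.
    """
    if format is not None:
        # to_tensor("mov") path: the caller names the format directly.
        return format.lower()
    if dest is None:
        raise ValueError("either dest or format must be provided")
    suffix = Path(dest).suffix  # e.g. ".mov"
    if not suffix:
        raise ValueError(f"cannot infer a container format from {dest!r}")
    return suffix[1:].lower()
```

With this sketch, `resolve_container_format(dest="output.mov")` and `resolve_container_format(format="mov")` resolve to the same format name, which is what lets both methods share one encoding path.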
This exposes the following parameters:
- frame_rate, frames
  - We cannot reasonably assume a frame_rate, so it must be provided.
These parameters were considered, but will not be exposed:
- codec
  - We will rely on FFmpeg to select a valid default codec for the provided container format.
- crf, bit_rate
  - Internally, we can set crf to ensure the encoder is handling the frame data correctly. Generally, we will allow FFmpeg to determine quality via bit_rate or crf based on the codec's defaults.
- gop_size, max_b_frames
  - These parameters affect frame compression and seeking performance. We can rely on the codec's default values for now.
- output_pixel_format
  - We can rely on FFmpeg to select the best pixel format using avcodec_find_best_pix_fmt_of_list, which aims to minimize loss. For lossless encoding, this should default to YUV444 when available.
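For comparison, the behaviour we want to mirror can be sketched as the FFmpeg CLI invocation it corresponds to. The function below only builds the argument list; it describes the raw input (size, frame rate, pixel layout) and passes no codec, quality, GOP, or output pixel format options, so FFmpeg falls back to its defaults for the container implied by the destination. `build_default_encode_command` is a hypothetical name, and the `rgb24` input layout is an assumption about how the frames would be handed over, not something this RFC specifies.

```python
from typing import List


def build_default_encode_command(
    width: int,
    height: int,
    frame_rate: int,
    dest: str,
) -> List[str]:
    """Build an ffmpeg argv that encodes raw frames read from stdin.

    Only input-side options are set. No codec, crf, bit rate, GOP, or
    output pixel format is passed, so FFmpeg chooses its own defaults.
    """
    if frame_rate <= 0:
        raise ValueError("frame_rate must be positive")
    return [
        "ffmpeg",
        "-y",                       # overwrite dest if it already exists
        "-f", "rawvideo",           # input is headerless raw frames
        "-pix_fmt", "rgb24",        # assumption: packed 8-bit RGB input
        "-s", f"{width}x{height}",  # raw input carries no size metadata
        "-r", str(frame_rate),      # raw input carries no frame rate either
        "-i", "-",                  # read the frames from stdin
        dest,
    ]
```

The resulting list could be passed to `subprocess.run` with the raw frame bytes as `input`. Note that the frame rate appears among the input options for the same reason it is a required parameter above: raw frames carry no timing information.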
Questions
- What would a multistream encoder look like, given this proposal and the AudioEncoder?
  - Modular approach, similar to the one laid out in Audio Encoder Design:
    (
        Encoder()
        .add_audio(AudioEncoder(samples, sample_rate))
        .add_video(VideoEncoder(frames, frame_rate))
        .to_file(filename)
    )
  - Combined approach, similar to torchvision's write_video:
    Encoder(
        audio_samples=samples,
        sample_rate=sample_rate,
        video_frames=frames,
        frame_rate=frame_rate,
    ).to_file(filename)
- How (if at all) should the VideoEncoder report to the user that their FFmpeg installation is missing a default codec?
  - For reference, this is an error raised by the FFmpeg CLI when attempting to encode to webm on FFmpeg 4.3.2:
    Automatic encoder selection failed for output stream #0:0. Default encoder for format webm (codec vp8) is probably disabled. Please choose an encoder manually.
- Which (if any) additional quality parameters should be exposed? For example: preset, crf.
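To make the multistream question more concrete, here is a rough sketch of the modular approach, with the combined approach expressed as a thin wrapper on top of it. Everything here is a stand-in: the classes only record their inputs and encode nothing, and `encode_combined` is a hypothetical name. It is meant to illustrate the call shapes being compared, not to suggest an implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class AudioEncoder:
    """Stand-in for the AudioEncoder; only stores its inputs."""
    samples: Any
    sample_rate: int


@dataclass
class VideoEncoder:
    """Stand-in for the proposed VideoEncoder; only stores its inputs."""
    frames: Any
    frame_rate: int


@dataclass
class Encoder:
    """Hypothetical multistream encoder that collects per-stream encoders."""
    audio: Optional[AudioEncoder] = None
    video: Optional[VideoEncoder] = None

    def add_audio(self, encoder: AudioEncoder) -> "Encoder":
        self.audio = encoder
        return self  # returning self is what enables the chained calls

    def add_video(self, encoder: VideoEncoder) -> "Encoder":
        self.video = encoder
        return self

    def to_file(self, filename: str) -> None:
        if self.audio is None and self.video is None:
            raise ValueError("no streams were added")
        # A real implementation would mux the streams into `filename` here.


def encode_combined(audio_samples: Any, sample_rate: int,
                    video_frames: Any, frame_rate: int) -> Encoder:
    """Combined approach, built on top of the modular one."""
    return (
        Encoder()
        .add_audio(AudioEncoder(audio_samples, sample_rate))
        .add_video(VideoEncoder(video_frames, frame_rate))
    )
```

One observation from the sketch: the combined form can be layered on the modular one without extra machinery, whereas the reverse would require the combined constructor to accept every per-stream parameter up front.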