[RFC] VideoEncoder Python API #907

@Dan-Flores

This RFC proposes the design / parameters exposed in our VideoEncoder Python API.

The implementation aims to be simple to use without a deep understanding of FFmpeg. To achieve this, the encoder should rely on FFmpeg to select the default codec and most, if not all, other parameters, as the FFmpeg CLI does.

Proposed API

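import torch
from torch import Tensor


class VideoEncoder:
    def __init__(self, frames: Tensor, frame_rate: int) -> None:
        pass

    # Encode the frames and write them to the file at `dest`.
    def to_file(self, dest: str) -> None:
        pass

    # Encode the frames into the container `format` and return the bytes as a tensor.
    def to_tensor(self, format: str) -> Tensor:
        pass


# `frames` is a video tensor supplied by the caller.
VideoEncoder(frames=frames, frame_rate=25).to_file("output.mov")

encoded_tensor = VideoEncoder(frames=frames, frame_rate=25).to_tensor("mov")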

This exposes the following parameters:

  • frame_rate, frames
    • We cannot reasonably assume a frame_rate, so it must be provided.
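
As a rough illustration of what a mandatory frame_rate could look like in practice, the sketch below checks the constructor arguments before any encoding happens. The function name check_encoder_args, the (N, C, H, W) frame layout, and the error messages are assumptions made for this example; none of them are part of the proposal.

```python
from typing import Sequence


def check_encoder_args(frames_shape: Sequence[int], frame_rate: int) -> None:
    """Hypothetical argument check; the name and the layout are assumptions."""
    # Assumed layout: (num_frames, channels, height, width).
    if len(frames_shape) != 4:
        raise ValueError(
            f"expected 4D frames (N, C, H, W), got {len(frames_shape)}D"
        )
    # frame_rate has no default, so the caller must always choose a value.
    if frame_rate <= 0:
        raise ValueError(f"frame_rate must be positive, got {frame_rate}")


# A well-formed call returns None without raising.
check_encoder_args((10, 3, 64, 64), frame_rate=25)
```

Leaving frame_rate without a default means a missing value fails at call time instead of silently producing a video that plays at the wrong speed.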

These parameters were considered, but will not be exposed:

  • codec
    • We will rely on FFmpeg to select a valid default codec for the provided container format.
  • crf, bit_rate
    • Internally, we can set crf to ensure the encoder is handling the frame data correctly. Generally, we will allow FFmpeg to determine quality via bit_rate or crf based on the codec's defaults.
  • gop_size, max_b_frames
    • These parameters affect frame compression and seeking performance. We can rely on the codec's default values for now.
  • output_pixel_format
    • We can rely on FFmpeg to select the best pixel format using avcodec_find_best_pix_fmt_of_list, which aims to minimize loss. For lossless encoding, this should default to YUV444 when available.
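
For comparison, the behaviour described above roughly matches what the FFmpeg CLI does when it is given raw frames and no encoding flags. The sketch below only builds such a command line and does not run it. The helper name build_ffmpeg_command and the packed rgb24 input layout are assumptions for illustration; the proposal itself does not go through the CLI.

```python
from typing import List


def build_ffmpeg_command(
    width: int, height: int, frame_rate: int, dest: str
) -> List[str]:
    """Build an ffmpeg CLI invocation that reads raw RGB frames from stdin.

    Hypothetical helper: no codec, bit rate, CRF, GOP, or output pixel
    format flags are passed, so FFmpeg falls back to its own defaults for
    the container implied by the extension of `dest`.
    """
    return [
        "ffmpeg",
        "-y",                        # overwrite dest if it already exists
        "-f", "rawvideo",            # input is headerless raw frames
        "-pix_fmt", "rgb24",         # assumed input layout: packed 8-bit RGB
        "-s", f"{width}x{height}",   # raw input carries no size information
        "-r", str(frame_rate),       # input frame rate, supplied by the caller
        "-i", "pipe:0",              # read the frames from stdin
        dest,
    ]


cmd = build_ffmpeg_command(640, 480, 25, "output.mov")
print(" ".join(cmd))
```

The only values the caller supplies are the frame geometry, the frame rate, and the destination, which mirrors the two parameters the proposed API exposes.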

Questions

  • What would a multistream encoder look like, given this proposal and the AudioEncoder?
  1. Modular approach, similar to the one laid out in Audio Encoder Design
(
    Encoder()
    .add_audio(AudioEncoder(samples, sample_rate))
    .add_video(VideoEncoder(frames, frame_rate))
    .to_file(filename)
)
  2. Combined approach, similar to torchvision's write_video
Encoder(
    audio_samples=samples,
    sample_rate=sample_rate,
    video_frames=frames,
    frame_rate=frame_rate,
).to_file(filename)
  • How (if at all) should the VideoEncoder report to the user that their FFmpeg installation is missing a default codec?
    • For reference, this is an error raised by FFmpeg CLI when attempting to encode to webm on FFmpeg 4.3.2:
      Automatic encoder selection failed for output stream #0:0. Default encoder for format webm (codec vp8) is probably disabled. Please choose an encoder manually.
  • Which (if any) additional quality parameters should be exposed, for example preset or crf?
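
On the missing default codec question, one possible direction is to turn FFmpeg's message into a more actionable Python-side explanation. The sketch below is only an illustration: it pattern-matches the CLI message quoted above, and the names explain_ffmpeg_error and _DISABLED_ENCODER are hypothetical. Other FFmpeg versions may word the message differently, and an implementation built on the FFmpeg libraries would more likely check encoder availability directly than parse text.

```python
import re
from typing import Optional

# Pattern derived from the FFmpeg 4.3.2 CLI message quoted in this RFC.
_DISABLED_ENCODER = re.compile(
    r"Default encoder for format (?P<format>\S+) "
    r"\(codec (?P<codec>[^)]+)\) is probably disabled"
)


def explain_ffmpeg_error(message: str) -> Optional[str]:
    """Return a user-facing hint, or None if the message is unrelated.

    Hypothetical helper, not part of the proposed API.
    """
    match = _DISABLED_ENCODER.search(message)
    if match is None:
        return None
    return (
        f"Your FFmpeg installation has no encoder for '{match['codec']}', "
        f"the default codec for the '{match['format']}' container. "
        "Install an FFmpeg build that includes it, or choose another format."
    )


cli_message = (
    "Automatic encoder selection failed for output stream #0:0. "
    "Default encoder for format webm (codec vp8) is probably disabled. "
    "Please choose an encoder manually."
)
print(explain_ffmpeg_error(cli_message))
```

Because the proposal does not expose a codec parameter, the hint points the user at their FFmpeg build or the container format instead of asking them to choose an encoder manually.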
