Opus inband FEC and other parameters #252

Closed
robin-raymond opened this issue Oct 17, 2015 · 3 comments

@robin-raymond

https://tools.ietf.org/html/draft-ietf-rtcweb-fec-01#ref-I-D.ietf-payload-flexible-fec-scheme

   Support for codec-specific FEC mechanisms are typically indicated via
   "a=fmtp" parameters.  For Opus specifically, this is controlled by
   the "useinbandfec=1" parameter, as specified in
   [I-D.ietf-payload-rtp-opus].  These parameters are declarative and
   can be negotiated separately for either media direction.
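
As a rough illustration of how the "useinbandfec=1" parameter might be read out of an SDP "a=fmtp" line, here is a minimal TypeScript sketch. The helper names `parseFmtp` and `wantsInbandFec` are hypothetical and are not part of ORTC or any library; the parsing assumes the usual semicolon-separated `key=value` layout.

```typescript
// Hypothetical helper: split an SDP "a=fmtp" line into payload type and parameters.
interface ParsedFmtp {
  payloadType: number;
  params: Record<string, string>;
}

function parseFmtp(line: string): ParsedFmtp {
  const match = /^a=fmtp:(\d+)\s+(.*)$/.exec(line.trim());
  if (match === null) {
    throw new Error("not an fmtp line: " + line);
  }
  const [, pt = "", rest = ""] = match;
  const params: Record<string, string> = {};
  for (const part of rest.split(";")) {
    const trimmed = part.trim();
    if (trimmed === "") continue; // tolerate a trailing ";"
    const eq = trimmed.indexOf("=");
    if (eq < 0) {
      params[trimmed] = ""; // parameter without a value
    } else {
      params[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
    }
  }
  return { payloadType: Number(pt), params };
}

// True only when the line explicitly carries useinbandfec=1.
function wantsInbandFec(line: string): boolean {
  return parseFmtp(line).params["useinbandfec"] === "1";
}
```

For example, `wantsInbandFec("a=fmtp:111 minptime=10; useinbandfec=1")` evaluates to `true`, while a line without the parameter yields `false`, matching the default of 0.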

https://tools.ietf.org/html/draft-ietf-payload-rtp-opus-08#section-6.1

The Opus payload format draft defines the following parameters:

Required parameters:

   rate:  the RTP timestamp is incremented with a 48000 Hz clock rate
      for all modes of Opus and all sampling rates.  For data encoded
      with sampling rates other than 48000 Hz, the sampling rate has to
      be adjusted to 48000 Hz.

 Optional parameters:

   maxplaybackrate:  a hint about the maximum output sampling rate that
      the receiver is capable of rendering in Hz.  The decoder MUST be
      capable of decoding any audio bandwidth but due to hardware
      limitations only signals up to the specified sampling rate can be
      played back.  Sending signals with higher audio bandwidth results
      in higher than necessary network usage and encoding complexity, so
      an encoder SHOULD NOT encode frequencies above the audio bandwidth
      specified by maxplaybackrate.  This parameter can take any value
      between 8000 and 48000, although commonly the value will match one
      of the Opus bandwidths (Table 1).  By default, the receiver is
      assumed to have no limitations, i.e. 48000.


   sprop-maxcapturerate:  a hint about the maximum input sampling rate
      that the sender is likely to produce.  This is not a guarantee
      that the sender will never send any higher bandwidth (e.g. it
      could send a pre-recorded prompt that uses a higher bandwidth),
      but it indicates to the receiver that frequencies above this
      maximum can safely be discarded.  This parameter is useful to
      avoid wasting receiver resources by operating the audio processing
      pipeline (e.g. echo cancellation) at a higher rate than necessary.
      This parameter can take any value between 8000 and 48000, although
      commonly the value will match one of the Opus bandwidths
      (Table 1).  By default, the sender is assumed to have no
      limitations, i.e. 48000.


   maxptime:  the maximum duration of media represented by a packet
      (according to Section 6 of [RFC4566]) that a decoder wants to
      receive, in milliseconds rounded up to the next full integer
      value.  Possible values are 3, 5, 10, 20, 40, 60, or an arbitrary
      multiple of an Opus frame size rounded up to the next full integer
      value, up to a maximum value of 120, as defined in Section 4.  If
      no value is specified, the default is 120.

   ptime:  the preferred duration of media represented by a packet
      (according to Section 6 of [RFC4566]) that a decoder wants to
      receive, in milliseconds rounded up to the next full integer
      value.  Possible values are 3, 5, 10, 20, 40, 60, or an arbitrary
      multiple of an Opus frame size rounded up to the next full integer
      value, up to a maximum value of 120, as defined in Section 4.  If
      no value is specified, the default is 20.

   maxaveragebitrate:  specifies the maximum average receive bitrate of
      a session in bits per second (b/s).  The actual value of the
      bitrate can vary, as it is dependent on the characteristics of the
      media in a packet.  Note that the maximum average bitrate MAY be
      modified dynamically during a session.  Any positive integer is
      allowed, but values outside the range 6000 to 510000 SHOULD be
      ignored.  If no value is specified, the maximum value specified in
      Section 3.1.1 for the corresponding mode of Opus and corresponding
      maxplaybackrate is the default.

   stereo:  specifies whether the decoder prefers receiving stereo or
      mono signals.  Possible values are 1 and 0 where 1 specifies that
      stereo signals are preferred, and 0 specifies that only mono
      signals are preferred.  Independent of the stereo parameter every
      receiver MUST be able to receive and decode stereo signals but
      sending stereo signals to a receiver that signaled a preference
      for mono signals may result in higher than necessary network
      utilization and encoding complexity.  If no value is specified,
      the default is 0 (mono).


   sprop-stereo:  specifies whether the sender is likely to produce
      stereo audio.  Possible values are 1 and 0, where 1 specifies that
      stereo signals are likely to be sent, and 0 specifies that the
      sender will likely only send mono.  This is not a guarantee that
      the sender will never send stereo audio (e.g. it could send a pre-
      recorded prompt that uses stereo), but it indicates to the
      receiver that the received signal can be safely downmixed to mono.
      This parameter is useful to avoid wasting receiver resources by
      operating the audio processing pipeline (e.g. echo cancellation)
      in stereo when not necessary.  If no value is specified, the
      default is 0 (mono).


   cbr:  specifies if the decoder prefers the use of a constant bitrate
      versus variable bitrate.  Possible values are 1 and 0, where 1
      specifies constant bitrate and 0 specifies variable bitrate.  If
      no value is specified, the default is 0 (vbr).  When cbr is 1, the
      maximum average bitrate can still change, e.g. to adapt to
      changing network conditions.


   useinbandfec:  specifies that the decoder has the capability to take
      advantage of the Opus in-band FEC.  Possible values are 1 and 0.
      Providing 0 when FEC cannot be used on the receiving side is
      RECOMMENDED.  If no value is specified, useinbandfec is assumed to
      be 0.  This parameter is only a preference and the receiver MUST
      be able to process packets that include FEC information, even if
      it means the FEC part is discarded.

   usedtx:  specifies if the decoder prefers the use of DTX.  Possible
      values are 1 and 0.  If no value is specified, the default is 0.
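
The defaults quoted above can be collected in one place. The sketch below is an assumption-laden illustration, not anything from the draft or ORTC: the `OpusFmtpPrefs` shape and `resolveOpusFmtp` name are made up, and falling back to the default when a value is malformed or out of range is a choice made here (the draft only says that out-of-range maxaveragebitrate values SHOULD be ignored).

```typescript
// Hypothetical shape holding the receiver preferences after defaults are applied.
interface OpusFmtpPrefs {
  maxplaybackrate: number;
  spropMaxcapturerate: number;
  maxptime: number;
  ptime: number;
  maxaveragebitrate?: number; // no fixed default: depends on Opus mode and maxplaybackrate
  stereo: boolean;
  spropStereo: boolean;
  cbr: boolean;
  useinbandfec: boolean;
  usedtx: boolean;
}

// Parse a decimal integer in [min, max]; otherwise return undefined.
function intInRange(raw: string | undefined, min: number, max: number): number | undefined {
  if (raw === undefined || !/^\d+$/.test(raw)) return undefined;
  const n = Number(raw);
  return n >= min && n <= max ? n : undefined;
}

function resolveOpusFmtp(p: Record<string, string | undefined>): OpusFmtpPrefs {
  const prefs: OpusFmtpPrefs = {
    maxplaybackrate: intInRange(p["maxplaybackrate"], 8000, 48000) ?? 48000,
    spropMaxcapturerate: intInRange(p["sprop-maxcapturerate"], 8000, 48000) ?? 48000,
    maxptime: intInRange(p["maxptime"], 3, 120) ?? 120,
    ptime: intInRange(p["ptime"], 3, 120) ?? 20,
    stereo: p["stereo"] === "1",
    spropStereo: p["sprop-stereo"] === "1",
    cbr: p["cbr"] === "1",
    useinbandfec: p["useinbandfec"] === "1",
    usedtx: p["usedtx"] === "1",
  };
  // Values outside 6000..510000 are ignored, leaving the mode-dependent default.
  const rate = intInRange(p["maxaveragebitrate"], 6000, 510000);
  if (rate !== undefined) prefs.maxaveragebitrate = rate;
  return prefs;
}
```

Calling `resolveOpusFmtp({})` gives the all-defaults case: ptime 20, maxptime 120, both rates 48000, and every flag off.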

The ORTC spec doesn't include all of the parameters listed above, so we should review which ones we want to include in the specification.

http://ortc.org/wp-content/uploads/2015/10/ortc.html#opus-codec-capabilities*

maxplaybackrate (unsigned long): A hint about the maximum output sampling rate that the receiver is capable of rendering in Hz.
stereo (boolean): Specifies whether the decoder prefers receiving stereo (if true) or mono signals (if false).

@rshpount

In addition to the negotiated parameters, the Opus codec has parameters that affect only the encoder and are not negotiated. These parameters include:

  • Complexity with value in the range 0-10
  • Whether the encoder should use constrained VBR
  • Signal type: AUTO, Voice, or Music
  • Encoder intended application: VOIP, AUDIO, or LOWDELAY
  • Expected packet loss percent for FEC
  • Disable encoder prediction, making frames almost completely independent

See https://www.opus-codec.org/docs/html_api-1.1.0/opus__defines_8h.html for more details.
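
To make the list above concrete, here is a hedged sketch of how such encoder-only settings could be translated into libopus encoder CTLs. The `EncoderOnlySettings` shape and `toCtlCalls` function are hypothetical. The CTL names are the macro names from opus_defines.h; the numeric constants are quoted from memory of that header and should be checked against it before use.

```typescript
// Hypothetical settings object; field names are illustrative, not from ORTC.
interface EncoderOnlySettings {
  complexity?: number; // 0-10
  constrainedVbr?: boolean;
  signal?: "auto" | "voice" | "music";
  application?: "voip" | "audio" | "lowdelay";
  packetLossPerc?: number; // 0-100, expected loss used to tune FEC
  predictionDisabled?: boolean;
}

interface CtlCall {
  ctl: string; // libopus macro name
  value: number;
}

// Assumed values of OPUS_AUTO, OPUS_SIGNAL_* and OPUS_APPLICATION_* (verify against opus_defines.h).
const SIGNAL_VALUES = { auto: -1000, voice: 3001, music: 3002 };
const APPLICATION_VALUES = { voip: 2048, audio: 2049, lowdelay: 2051 };

function requireIntInRange(name: string, v: number, min: number, max: number): number {
  if (Math.floor(v) !== v || v < min || v > max) {
    throw new RangeError(name + " must be an integer in " + min + ".." + max);
  }
  return v;
}

function toCtlCalls(s: EncoderOnlySettings): CtlCall[] {
  const calls: CtlCall[] = [];
  if (s.complexity !== undefined) {
    calls.push({ ctl: "OPUS_SET_COMPLEXITY", value: requireIntInRange("complexity", s.complexity, 0, 10) });
  }
  if (s.constrainedVbr !== undefined) {
    calls.push({ ctl: "OPUS_SET_VBR_CONSTRAINT", value: s.constrainedVbr ? 1 : 0 });
  }
  if (s.signal !== undefined) {
    calls.push({ ctl: "OPUS_SET_SIGNAL", value: SIGNAL_VALUES[s.signal] });
  }
  if (s.application !== undefined) {
    calls.push({ ctl: "OPUS_SET_APPLICATION", value: APPLICATION_VALUES[s.application] });
  }
  if (s.packetLossPerc !== undefined) {
    calls.push({ ctl: "OPUS_SET_PACKET_LOSS_PERC", value: requireIntInRange("packetLossPerc", s.packetLossPerc, 0, 100) });
  }
  if (s.predictionDisabled !== undefined) {
    calls.push({ ctl: "OPUS_SET_PREDICTION_DISABLED", value: s.predictionDisabled ? 1 : 0 });
  }
  return calls;
}
```

Only the settings that are present produce a CTL call, so an empty object leaves the encoder at its own defaults.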

@robin-raymond

9.3.2 Codec capability parameters

We need to duplicate this section for "parameters". That way we can define capabilities for the encoder/decoder as well as the settings that apply.

@robin-raymond

For the Opus codec, we need to add some definitions inside the extended options/parameters:

In RTCRtpCodecCapability.options we need:
boolean complexity (sender cap) true = sender supports changing the setting https://www.opus-codec.org/docs/html_api-1.1.0/group__opus__encoderctls.html#ga3483877bf1687a75dd4a1de6f85f291c
boolean signal (sender cap) true = sender supports changing the setting https://www.opus-codec.org/docs/html_api-1.1.0/group__opus__encoderctls.html#gaaa87ccee4ae46aa6c9528e03c5122b89
boolean application (sender cap) true = sender supports changing the setting https://www.opus-codec.org/docs/html_api-1.1.0/group__opus__encoderctls.html#ga18fa17dae52ff8f3eaea314204bf1a36
boolean packet-loss-perc (sender cap) true = sender supports changing the setting https://www.opus-codec.org/docs/html_api-1.1.0/group__opus__encoderctls.html#gafda1c951dea919ba54432cd03827f1a9
boolean prediction-disabled (sender cap) true = sender supports changing the setting https://www.opus-codec.org/docs/html_api-1.1.0/group__opus__encoderctls.html#ga0a73d613f6d9d601b32535fd37f58482

In RTCRtpCodecCapability.parameters we need:
ulong maxplaybackrate (receiver cap - optional)
ulong ptime (receiver cap - optional) -- doesn't this need to be more generalized too?
ulong maxaveragebitrate (receiver cap - optional)
boolean stereo (receiver cap - optional)
boolean cbr (receiver cap - optional)
boolean useinbandfec (receiver cap - optional)
boolean usedtx (receiver cap - optional)
ulong sprop-maxcapturerate (sender cap - optional)
boolean sprop-stereo (sender cap - optional)

In RTCRtpCodecParameters.parameters we need:
ulong maxplaybackrate (sender param - optional)
ulong ptime (sender param - optional) -- doesn't this need to be more generalized too?
ulong maxaveragebitrate (sender param - optional)
boolean stereo (sender param - optional)
boolean cbr (sender param)
boolean useinbandfec (sender param)
boolean usedtx (sender param - optional)
ulong complexity (sender param - optional) range 0-10 https://www.opus-codec.org/docs/html_api-1.1.0/group__opus__encoderctls.html#ga3483877bf1687a75dd4a1de6f85f291c
string signal (sender param - optional) possible values are "auto", "music", "voice"
string application (sender param - optional) possible values are "voip", "audio", "lowdelay"
ulong packet-loss-perc (sender param - optional) possible values are 0-100 [i.e. expected packet loss]
boolean prediction-disabled (sender param - optional) true = disable prediction
ulong sprop-maxcapturerate (receiver param - optional)
boolean sprop-stereo (receiver param - optional)
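
One way to read the split proposed above: the boolean options advertise which encoder settings a sender can change, and the parameters carry the requested values. The following sketch checks a requested parameter set against advertised options. The type names and the `unsupportedSettings` function are hypothetical; only the property names are taken from the lists above.

```typescript
// Property names taken from the proposal; the types themselves are illustrative.
type TunableName = "complexity" | "signal" | "application" | "packet-loss-perc" | "prediction-disabled";

type OpusCapabilityOptions = Partial<Record<TunableName, boolean>>;

interface OpusSenderSettings {
  complexity?: number;
  signal?: "auto" | "music" | "voice";
  application?: "voip" | "audio" | "lowdelay";
  "packet-loss-perc"?: number;
  "prediction-disabled"?: boolean;
}

const TUNABLE_NAMES: TunableName[] = [
  "complexity",
  "signal",
  "application",
  "packet-loss-perc",
  "prediction-disabled",
];

// Return the names of requested settings that the sender did not advertise as changeable.
function unsupportedSettings(options: OpusCapabilityOptions, requested: OpusSenderSettings): TunableName[] {
  return TUNABLE_NAMES.filter(
    (name) => requested[name] !== undefined && options[name] !== true
  );
}
```

With options `{ complexity: true }` and a request for `{ complexity: 5, signal: "voice" }`, the function reports `["signal"]`, since only complexity was advertised as adjustable.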
