Periodic noise bursts in DTX #89

Closed
gustafullberg opened this issue May 8, 2018 · 3 comments

In calls with DTX enabled, periodic "noise bursts" are sometimes audible during DTX periods. The issue appears when the background noise is slightly non-stationary. In my experience the problem is common and can be triggered by, for example, ventilation noise.

To illustrate the problem I have encoded a 10-second file containing speech and slowly increasing noise.

I encoded the file with opus_demo from master as of May 8th 2018 (commit 1b58446). I used DTX and the highest complexity mode.

./opus_demo voip 48000 1 32000 -complexity 10 -dtx original.raw encoded.raw
libopus 1.3-beta-33-g1b584467
Encoding 48000 Hz input at 32.000 kb/s in auto bandwidth with 960-sample frames.
average bitrate:              15.400 kb/s
maximum bitrate:              64.400 kb/s
active bitrate:               25.930 kb/s
bitrate standard deviation:   17.448 kb/s

The issue can easily be seen in the following spectrograms. The first spectrogram shows the original file with the encoded file below.
[Spectrogram image: original]
[Spectrogram image: encoded]

The reason for these clicks is a mismatch between two voice activity detectors: the Opus VAD and the Silk VAD.
The Opus VAD decides when to go into DTX. During DTX, a packet containing an update of the background noise is transmitted every 420 ms. If the VAD in the Silk layer of the codec considers the signal to be active (type TYPE_UNVOICED or TYPE_VOICED instead of TYPE_NO_VOICE_ACTIVITY), the decoder conceals the DTX region using packet loss concealment (PLC) instead of pure comfort noise (CNG). This causes a noise burst every time a packet is decoded.
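
The mismatch described above can be sketched in a few lines of C. This is only an illustration of the reasoning, not code from libopus: the frame type names come from the description, while the numeric enum values, the `concealment` enum and all function names are assumptions made for the example.

```c
#include <stdbool.h>

/* Frame types named in the text; the numeric values are illustrative assumptions. */
enum frame_type { TYPE_NO_VOICE_ACTIVITY = 0, TYPE_UNVOICED = 1, TYPE_VOICED = 2 };

/* Hypothetical labels for how the decoder fills the gap between DTX packets. */
enum concealment { CONCEAL_CNG, CONCEAL_PLC };

/* Per the description: only a no-activity frame leads to pure comfort noise. */
static enum concealment conceal_after(enum frame_type last_type)
{
    return last_type == TYPE_NO_VOICE_ACTIVITY ? CONCEAL_CNG : CONCEAL_PLC;
}

/* A DTX update is expected to produce a burst when the Opus VAD said "inactive"
   (so DTX was entered) but Silk still coded the frame as active. */
static bool dtx_update_bursts(bool opus_vad_active, enum frame_type coded_type)
{
    return !opus_vad_active && conceal_after(coded_type) == CONCEAL_PLC;
}

/* Number of frame slots per DTX update interval (integer division). */
static int frames_per_dtx_update(int update_ms, int frame_ms)
{
    return update_ms / frame_ms;
}
```

With the 960-sample frames at 48000 Hz used in this report (20 ms per frame), `frames_per_dtx_update(420, 20)` gives 21 frame slots per update, so a mismatched update would be heard roughly twice per second.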

I have created two competing pull requests: #84 and #87. The two PRs solve the issue slightly differently.

PR #84 avoids DTX when the Opus and Silk VADs do not agree.
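
As a rough sketch of that idea (not the actual patch; the function name and the boolean inputs are hypothetical), the DTX decision becomes a conjunction of the two detectors:

```c
#include <stdbool.h>

/* Hypothetical helper: allow DTX only when both detectors report inactivity.
   If the Opus VAD says "inactive" but the Silk VAD says "active", the
   frame is coded normally instead of entering DTX. */
static bool allow_dtx_pr84_sketch(bool opus_vad_active, bool silk_vad_active)
{
    return !opus_vad_active && !silk_vad_active;
}
```

Because any disagreement falls back to normal coding, DTX is entered less often, which is consistent with the higher average bit-rate reported below.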

Pros of #84:
Behaves similarly to lower complexity modes (where only the Silk VAD is used).

Cons of #84:
Results in higher bit-rates (in this example DTX is almost not used at all).

Encoding with opus_demo from pull request 84:

./opus_demo_pr84 voip 48000 1 32000 -complexity 10 -dtx original.raw encoded_pr84.raw
libopus 1.3-beta-34-g2e635837
Encoding 48000 Hz input at 32.000 kb/s in auto bandwidth with 960-sample frames.
average bitrate:              31.296 kb/s
maximum bitrate:              64.400 kb/s
active bitrate:               33.355 kb/s
bitrate standard deviation:    6.355 kb/s

Spectrogram showing the file encoded with PR84
[Spectrogram image: encoded_pr84]

PR #87 passes the result of the Opus VAD to Silk. If the Opus VAD reports no activity, the Silk VAD value is capped at just below the activity threshold. The Silk encoder then produces a frame with type TYPE_NO_VOICE_ACTIVITY.
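
The clamping step might look roughly like the following. This is a minimal sketch under assumptions, not the code in the pull request: the function names are hypothetical, and the activity level and threshold are treated as plain integers on the same scale.

```c
#include <stdbool.h>

/* Hypothetical helper: when the Opus VAD reports no activity, cap the Silk
   activity level just below the threshold so Silk also treats the frame as
   inactive. Otherwise the Silk value is passed through unchanged. */
static int clamp_silk_activity_sketch(int silk_activity, int activity_threshold,
                                      bool opus_vad_active)
{
    if (!opus_vad_active && silk_activity >= activity_threshold) {
        return activity_threshold - 1;
    }
    return silk_activity;
}

/* Hypothetical check mirroring the threshold comparison on the Silk side. */
static bool silk_frame_is_active(int silk_activity, int activity_threshold)
{
    return silk_activity >= activity_threshold;
}
```

The decision of when to enter DTX is untouched in this approach; only the frame type that Silk ends up signalling changes, which is why the bit-rate stays the same as on master.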

Pros of #87:
Does not alter the decision of when to enter DTX (same bit-rate as master)

Cons of #87:
Slightly more code than #84.

Encoding with opus_demo from pull request 87:

./opus_demo_pr87 voip 48000 1 32000 -complexity 10 -dtx original.raw encoded_pr87.raw
libopus 1.3-beta-34-gdbc27362
Encoding 48000 Hz input at 32.000 kb/s in auto bandwidth with 960-sample frames.
average bitrate:              15.386 kb/s
maximum bitrate:              64.400 kb/s
active bitrate:               25.941 kb/s
bitrate standard deviation:   17.431 kb/s

Spectrogram showing the file encoded with PR87
[Spectrogram image: encoded_pr87]

I made all audio files available here (raw and wav formats):
https://drive.google.com/drive/folders/1wY-_yz5I44QTccmV0lFohTWHVIGH2Vqx

Conclusion:
Both pull requests solve the issue, but the DTX behavior of #87 is closer to the current DTX behavior. I suggest that #87 be merged to master and #84 be withdrawn.

hlundin commented May 8, 2018

Thanks @gustafullberg.

The issue can actually be heard in real calls. It was found from user complaints.

minyuel commented May 14, 2018

It would be great to get this landed soon.

jmvalin commented May 25, 2018

Patch landed

jmvalin closed this as completed May 25, 2018