screenshare: add RaP processing for screen streams with audio (complements #11622) #11626
Conversation
- use libopus decoder and encoder, it's better than built-in ffmpeg/flac
- don't mix screenshare audio with mics, it was generating desync with bad audio segments
- encode it together with the video file (TODO: needs adjustments in playback)
…lems in iOS, fixed desync with aresample
Hmm. I don't really want to do the audio mixing with an extra step after rendering like this - instead, the render function should be modified to allow multiple parallel audio streams in the edl that it mixes together during the render as a single step. This will speed it up, and also remove a generation of lossy encoding. This would be analogous to how the video rendering code supports multiple "areas".

There's no reason for there to be a separate internal audio format for screensharing audio. We should probably just switch everything to opus.

There are some complications about the volume levels when using amix like this - by default it attenuates both audio sources when mixing, so the result will have noticeably lower volume in the sections when only one of the two audio sources was active. (The ability to disable that behavior is only present in unreleased git ffmpeg.)
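The single-pass mixing described here could be sketched roughly as follows. This is a hypothetical helper for illustration only, not the actual render code; the function name and file names are made up.

```ruby
# Hypothetical sketch of mixing all EDL audio tracks during the render itself,
# instead of a separate post-render mixing pass. Not the actual render code.
def build_single_pass_mix_cmd(audio_inputs, output)
  cmd = ['ffmpeg', '-y']
  audio_inputs.each { |f| cmd += ['-i', f] }
  # amix attenuates every source by default, so sections where only one
  # source is active come out quieter; the option to disable that behavior
  # only exists in unreleased git ffmpeg at the time of this discussion.
  cmd += ['-filter_complex', "amix=inputs=#{audio_inputs.length}"]
  cmd + ['-c:a', 'libopus', output]
end
```

Mixing in one pass this way avoids the extra generation of lossy encoding that a separate post-render amix step would introduce.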
```diff
@@ -155,6 +185,7 @@ def self.render(edl, output_basename)
   if audioinfo[input[:filename]][:format][:format_name] == 'wav'
     ffmpeg_cmd += ['-ignore_length', '1']
   end
+  screenshare ? ffmpeg_cmd += ['-c:a', 'libopus'] : nil
```
Why are you using the libopus decoder here instead of ffmpeg's builtin opus decoder? Have you had any specific issues with the builtin decoder causing problems?
Yes. Higher frequency of A/V de-sync occurrences and a bit worse handling of lossy raw streams.
Don't have any data on that anymore, though.
Can you confirm whether libopus is enabled in the ffmpeg package 2.3 is using?
If you're having problems like that with the internal opus decoder, we should probably switch decoding all opus to use libopus. In any case, selecting the libopus decoder needs to be made conditional on the audioinfo saying that the codec is 'opus'.
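The conditional decoder selection suggested here could look something like this sketch. The `:codec_name` key is an assumption about how audioinfo exposes the probed codec; the real key in the recording scripts may differ.

```ruby
# Sketch: request the libopus decoder only when the probed stream is opus.
# ASSUMPTION: audioinfo stores the probed codec under :codec_name; the actual
# recording scripts may expose it differently.
def decoder_args(audioinfo, filename)
  audioinfo[filename][:codec_name] == 'opus' ? ['-c:a', 'libopus'] : []
end
```

The returned arguments would be appended before the corresponding `-i` option, since a `-c:a` placed before an input selects the decoder for that input.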
I added the libopus decoder to the ffmpeg build a while back, search for "Enabled decoders" in the build log to see what's there: https://launchpadlibrarian.net/500061740/buildlog_ubuntu-bionic-amd64.ffmpeg_7%3A4.2.4-1ubuntu0.1bbb2~18.04_BUILDING.txt.gz
```diff
@@ -22,8 +22,11 @@ module EDL
   module Audio
     FFMPEG_AEVALSRC = "aevalsrc=s=48000:c=stereo:exprs=0|0"
     FFMPEG_AFORMAT = "aformat=sample_fmts=s16:sample_rates=48000:channel_layouts=stereo"
+    FFMPEG_AFORMAT_SCREENSHARE = "aresample=async=1000,aformat=sample_fmts=s16:sample_rates=48000:channel_layouts=stereo"
```
Thinking about it a bit, we should probably apply the aresample=async=XXX filter to all of the audio inputs. It won't hurt anything and who knows, it might help improve audio sync in a few cases?
> it might help improve audio sync in a few cases
Yeah. While I didn't implement this, I suggested the resample to try to tackle desync issues @germanocaumo was dealing with on streams riddled with gaps due to simulated packet drops. It improved things by a large margin, so it might be a good idea to put it in the other inputs as well.

I'd do it in a separate PR if possible?
I'd like this PR to use the same audio format for all audio, there's no reason to special-case the screenshare audio.
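One way to unify the formats, sketched under the assumption that the constant keeps its current name, is to fold the aresample filter into the shared format string so the screenshare-specific constant is no longer needed:

```ruby
# Sketch: one audio filter chain for every input. The aresample=async filter
# fills timestamp gaps before conforming to the common format, so a separate
# FFMPEG_AFORMAT_SCREENSHARE constant would no longer be needed.
FFMPEG_AFORMAT =
  'aresample=async=1000,' \
  'aformat=sample_fmts=s16:sample_rates=48000:channel_layouts=stereo'
```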
This incorporates only the audio desync related changes from bigbluebutton#11626:
* Add the aresample filter with async option to fill in timestamp gaps
* Use the libopus decoder for opus audio instead of ffmpeg's builtin decoder
This pull request has conflicts ☹
```ruby
inputs.each do |input|
  ffmpeg_cmd += ['-i', input]
end
ffmpeg_cmd += ['-filter_complex', "amix"]
```
We should explicitly set the number of inputs to the amix filter here, in case a number of inputs ≠ 2 is provided.
```suggestion
ffmpeg_cmd += ['-filter_complex', "amix=inputs=#{inputs.length}"]
```
```ruby
  :audio => nil
}
```

```ruby
events.xpath('/recording/event[@module="Deskshare" or (@module="bbb-webrtc-sfu" and (@eventname="StartWebRTCDesktopShareEvent" or @eventname="StopWebRTCDesktopShareEvent"))]').each do |event|
```
I don't think the "Deskshare" module (I think that's the old java screensharing?) can ever have audio, so this should only look at the WebRTC events.
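The narrowed query could look like the sketch below, which simply drops the `@module="Deskshare"` branch from the original XPath. The sample events fragment is made up for illustration.

```ruby
require 'rexml/document'

# Sketch: restrict the event query to the WebRTC screenshare events only,
# since the old "Deskshare" module never carried audio.
WEBRTC_SCREENSHARE_EVENTS_XPATH =
  '/recording/event[@module="bbb-webrtc-sfu" and ' \
  '(@eventname="StartWebRTCDesktopShareEvent" or ' \
  '@eventname="StopWebRTCDesktopShareEvent")]'

# Tiny demo document (hypothetical events.xml fragment):
doc = REXML::Document.new(<<~XML)
  <recording>
    <event module="bbb-webrtc-sfu" eventname="StartWebRTCDesktopShareEvent"/>
    <event module="Deskshare" eventname="DeskshareStartedEvent"/>
  </recording>
XML
matches = REXML::XPath.match(doc, WEBRTC_SCREENSHARE_EVENTS_XPATH)
```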
What does this PR do?
Lifts @germanocaumo's (Mconf) work on making RaP correctly process raw screen sharing webm files with audio tracks in them.
Complements #11622 by adding the full recording capability to it.
Closes Issue(s)
#8632 (combined with #11622)
More
A few things about this:
* libopus is used to handle the decoding of the webm audio tracks. I'm almost certain it is built with our 2.3 ffmpeg package?