Recording processing using MP4/OGG temporary files, stream copy, and … #11589
…customizable (lower) frame rates. Discussion from bigbluebutton#2483
@abautu Thank you for your contribution!
@antobinary I just emailed my CLA. Thank you!
Asking the same question here that I asked in the issue and didn't get an answer to: are there any licensing implications for encoding the intermediate file with libx264 by default? If so, would it be possible to use libopenh264 instead, and make it the default option for BigBlueButton (since H264 is required for playback on Safari)?
@fcecagno Someone else would have to answer that. I'm no expert in licensing, but from what I just read tonight (e.g. https://video.stackexchange.com/questions/14694/mp4-h-264-patent-issues) licensing should not be an issue here. Switching to libopenh264 could create a few problems:
I don't understand your comment about Safari. How is Safari related to the temporary files or to libopenh264?
Hi, I've put together some thoughts after looking through this PR. I think in general we do want to merge this change: most of it looks straightforward and provides an improvement with minimal loss of flexibility.
I like the use of a concat file (playlist) to generate the final video file, rather than concatenating an mpeg-ts file. That gives you a lot more flexibility in the choice of intermediate codecs, since you can use basically any container for the intermediate (mpegts was kind of limited there).
I'm a bit iffy on changing the audio codec on the intermediate to ogg vorbis; I see that it would save time when generating the audio files for recordings without video, but I expect it to be a wash when doing video encoding, since for mp4 output you have to transcode to aac afterwards. (And the extra lossy step is more likely to be noticeable in audio than in video.) Do you have any numbers on the performance difference with and without this change for video recordings?
The use of the -force_key_frames option to set the keyframe interval to a fixed 10 seconds regardless of framerate is kind of clever. I was going to suggest just calculating a value for -g using the framerate, but this works. It could probably use a comment.
I think the addition of
Does this work with the ffmpeg we're currently building for BigBlueButton use here: https://launchpad.net/~bigbluebutton/+archive/ubuntu/support ? That's an ffmpeg 4.2.4 build, but a bunch of codecs and options have been disabled on it.
In a follow-up PR (not this one, I'd like to keep it nice and simple like it is!) it would be nice to pass the desired final output codec up to be used in the intermediate file, so you can also get the encoding speed improvements when webm output is selected (and allow people who want to avoid h264 completely to do so).
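To illustrate the two approaches compared above, here is a minimal Ruby sketch. The helper names (`gop_args`, `forced_keyframe_args`) and the `KEYFRAME_INTERVAL_SECONDS` constant are hypothetical and not taken from the PR; only the `-g` option and the `-force_key_frames expr:gte(t,n_forced*10)` expression come from the discussion.

```ruby
# Hypothetical helpers for illustration only; not code from the PR.
KEYFRAME_INTERVAL_SECONDS = 10

# Option A: derive -g (GOP size in frames) from the frame rate.
# The value has to be recomputed whenever the frame rate changes.
def gop_args(frame_rate)
  ['-g', (frame_rate * KEYFRAME_INTERVAL_SECONDS).round.to_s]
end

# Option B: force a keyframe every N seconds of output time,
# independent of the frame rate (the expression used in the PR).
def forced_keyframe_args
  ['-force_key_frames', "expr:gte(t,n_forced*#{KEYFRAME_INTERVAL_SECONDS})"]
end

p gop_args(5)            # deskshare-like frame rate
p gop_args(15)           # webcam-like frame rate
p forced_keyframe_args
```

With option A the argument differs per frame rate (50 frames at 5 fps, 150 frames at 15 fps), while option B yields the same argument in both cases, which seems to be why it works here without knowing the frame rate.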
I ran some tests just now. For testing I used one audio recording of a male voice talking and one of instrumental music. I got the first sample by cutting 30 minutes from the raw recording (/var/bigbluebutton/recording/raw/.....audio/meedingid.opus) of my lectures. The second sample is 2m30s long, downloaded from https://www.bensound.com/bensound-music/bensound-creativeminds.mp3. I converted the original files (opus and mp3) to 2 flac files to use as sources. Then I converted each file to aac directly (flac -> aac) and indirectly (flac -> vorbis -> aac). I computed the correlation factor of each output file against the source (flac) file and got the following results:
So the quality loss is minimal (below 1%). While doing these tests, I also noticed that freeswitch records the audio in mono mode. I'm not sure if this is a custom setting on our servers, but I doubt it. Can you confirm that default raw audio recordings are in mono mode? In that case, we could adjust the edl settings to also encode in mono mode. The file size would be about the same (because the bitrate is the same), but the quality improves a bit and the encoding time is reduced by 40%. Using the same steps as before, but with mono sources, I got the following correlation level
I added them now.
Indeed, it's a useless leftover from my attempts to concat via pipes fragmented MP4 segments (instead of mpegts).
I didn't test, but I expect it to work ok, since I didn't use any new codecs or features (compared to what was already used). |
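The correlation comparison described earlier in this comment could be reproduced roughly as follows. This is a sketch under assumptions: it treats the "correlation factor" as a Pearson correlation over decoded, time-aligned PCM samples, and the tool actually used for the numbers above is not stated. The function name `pearson_correlation` is hypothetical.

```ruby
# Pearson correlation between two equally long arrays of PCM samples.
# Assumes both signals were already decoded and time-aligned.
def pearson_correlation(a, b)
  if a.empty? || a.length != b.length
    raise ArgumentError, 'sample arrays must have equal, non-zero length'
  end

  mean_a = a.sum.to_f / a.length
  mean_b = b.sum.to_f / b.length

  covariance = 0.0
  variance_a = 0.0
  variance_b = 0.0
  a.each_index do |i|
    delta_a = a[i] - mean_a
    delta_b = b[i] - mean_b
    covariance += delta_a * delta_b
    variance_a += delta_a * delta_a
    variance_b += delta_b * delta_b
  end

  # Result is NaN if either signal is constant (zero variance).
  covariance / Math.sqrt(variance_a * variance_b)
end

puts pearson_correlation([1, 2, 3, 4], [2, 4, 6, 8])
```

A value close to 1.0 means the transcoded signal follows the source closely; encoder delay would need to be compensated before comparing, otherwise the score drops for reasons unrelated to quality.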
Yes, the default freeswitch conference mixing is mono. It is possible to switch it to do stereo instead (you can have dynamic positional audio for speakers in the conference!), but that requires some manual changes and I don't think we even document how to do it. Go ahead and make the changes to do mono audio if you like (should just require changing the values for |
%w[-c:v libvpx-vp9 -b:v 750K -minrate 375K -maxrate 1088K -crf 33 -quality good -speed 1 -g 240 -tile-columns 1 -threads 2
-c:a libopus -b:a 48K
# We use force_key_frames instead of -g to set the GOP size independent of the frame rate
%w[-c:v libvpx-vp9 -crf 32 -deadline realtime -cpu-used 8 -force_key_frames expr:gte(t,n_forced*10) -tile-columns 2 -tile-rows 2 -threads 4
I'd suggest reverting this to maximum use of 2 threads again. Otherwise, you'll have to increase the minimum hardware requirements.
This especially causes problems for default configs with only 6 CPU cores / 12 CPU threads.
Regarding this, please also see the discussion in #9119
Also, why is -cpu-used set to 8? Wouldn't that tell ffmpeg to use at most 50% of the available time of the (4) CPU threads, making it less efficient? (See https://trac.ffmpeg.org/wiki/Encode/VP9#speed )
Ah, I missed this. We also shouldn't be using -deadline realtime when doing batch processing like this. The realtime mode is designed to run as close to 1× speed as possible by lowering encoding quality when needed, but we'd prefer having constant quality that simply encodes as fast as it can at the requested settings.
@basisbit I see #9119 is also about switching from "good" to "realtime" and changing speed from 1 or 2 to 5. I got faster encoding with -cpu-used 8 than with -speed 5, while keeping the same quality. Here is an example I ran just now on a 37-minute deskshare video:
-speed 5: encoding at 68fps (real time 2m34s, cpu time 5m35s)
SSIM Y:0.975077 (16.033934) U:0.993177 (21.660456) V:0.993852 (22.112394) All:0.981223 (17.263649)
PSNR y:28.416285 u:43.825114 v:43.393513 average:30.111927 min:7.914798 max:inf
-cpu-used 8: encoding at 84fps (real time 2m4s, cpu time 4m29s)
SSIM Y:0.974545 (15.942346) U:0.992998 (21.548044) V:0.993694 (22.002233) All:0.980812 (17.169781)
PSNR y:28.374842 u:43.735763 v:43.316691 average:30.069864 min:7.914682 max:inf
I believe https://trac.ffmpeg.org/wiki/Encode/VP9 has some obsolete/wrong information. For example, their CPU usage formula (100*(16-cpu-used)/16)% gives lower values as cpu-used gets higher (e.g. 0% when cpu-used is 16). Also, it says that cpu-used ranges from 1 to 16, but when you use a value higher than 8, ffmpeg fails with this error (tested on 16.04 and 20.04):
Value 16.000000 for parameter 'cpu-used' out of range [-8 - 8]
Error setting option cpu-used to value 16.
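For reference, the wiki formula quoted above can be evaluated directly. This is only a sketch of the formula as written on the wiki page, not a statement about how libvpx actually behaves; the function name `wiki_cpu_percent` is hypothetical.

```ruby
# Evaluates the wiki's formula (100*(16-cpu-used)/16)% for a given value.
def wiki_cpu_percent(cpu_used)
  100.0 * (16 - cpu_used) / 16
end

# The formula falls linearly from 100% to 0% as cpu-used rises to 16.
(0..16).step(4) do |value|
  puts format('cpu-used=%2d -> %5.1f%%', value, wiki_cpu_percent(value))
end
```

At the maximum value ffmpeg accepted in these tests (8), the formula gives 50%, which matches the figure raised in the review comment, while the measurements above show -cpu-used 8 encoding faster than -speed 5.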
Regarding -threads 4, we need to keep in mind that 1 thread doesn't mean 1 CPU busy at 100%. In fact, the vp9 codec doesn't scale very well with the number of threads. When you have 4 threads, most of them are working at around 40%, so you're using less than 2 cores. Here is an example I ran on that same file:
- 8 threads (actually limited by libvpx to 4 due to input resolution and tile-rows/columns)
339.43user 13.18system 3:09.33elapsed 186%CPU (0avgtext+0avgdata 197688maxresident)k
- 4 threads
346.91user 14.32system 3:17.90elapsed 182%CPU (0avgtext+0avgdata 197816maxresident)k
- 3 threads
segmentation fault
- 2 threads
319.64user 7.71system 3:31.64elapsed 154%CPU (0avgtext+0avgdata 190108maxresident)k
- 1 thread
212.43user 1.87system 3:06.47elapsed 114%CPU (0avgtext+0avgdata 186460maxresident)k
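The "less than 2 cores" reading can be checked from the `time` output above: the average number of busy cores is (user + system) / elapsed. Below is a small sketch that parses lines in that format; the names `TIME_LINE`, `elapsed_seconds`, and `average_cores` are hypothetical helpers.

```ruby
# Matches the "<user>user <sys>system <m:ss.ss>elapsed" part of the
# resource summary lines shown above.
TIME_LINE = /(?<user>[\d.]+)user (?<sys>[\d.]+)system (?<elapsed>[\d:.]+)elapsed/

# Converts "m:ss.ss" or "h:mm:ss" into seconds.
def elapsed_seconds(text)
  text.split(':').map(&:to_f).reduce(0.0) { |total, part| total * 60 + part }
end

# Average number of CPU cores kept busy over the whole run.
def average_cores(line)
  match = TIME_LINE.match(line)
  raise ArgumentError, "unrecognized time output: #{line}" unless match

  (match[:user].to_f + match[:sys].to_f) / elapsed_seconds(match[:elapsed])
end

four_threads = '346.91user 14.32system 3:17.90elapsed 182%CPU'
one_thread   = '212.43user 1.87system 3:06.47elapsed 114%CPU'
puts average_cores(four_threads).round(2)
puts average_cores(one_thread).round(2)
```

The computed ratios agree with the %CPU column printed by `time`, so the 4-thread run keeps roughly 1.8 cores busy on average rather than 4.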
However, your comment made me think of something else: the settings for x264 do not limit the number of threads, and unlike vp9, x264 will happily use as much CPU as possible:
- without threads (current pull)
324.41user 21.46system 1:08.27elapsed 506%CPU (0avgtext+0avgdata 273496maxresident)k
- with threads 4
296.78user 11.51system 1:38.49elapsed 312%CPU (0avgtext+0avgdata 156868maxresident)k
So maybe we should add -threads 4 to FFMPEG_WF_ARGS.
@kepstin I don't consider -deadline a real issue, as long as the encoder can do more than 1x (which means 5fps for deskshare or 15fps for webcams). My VMs do around 60fps (or 11.4x for deskshare, 3.8x for webcams). My guess is that anyone with a vp9 encoding speed lower than 1.5x (in realtime mode) will decide not to use vp9 at all, as the "good" encoding mode takes far too long.
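The speed multipliers mentioned above follow from dividing the encoder throughput by the source frame rate. A minimal sketch, assuming the 5 fps (deskshare) and 15 fps (webcam) rates stated in the comment; `realtime_factor` is a hypothetical helper name, and 57 fps is inferred from the quoted 11.4x and 3.8x figures rather than stated in the thread.

```ruby
# How many times faster than realtime the encoder runs.
def realtime_factor(encode_fps, source_fps)
  encode_fps.to_f / source_fps
end

DESKSHARE_FPS = 5   # from the comment above
WEBCAM_FPS = 15     # from the comment above

# 57 fps is an inferred throughput ("around 60fps" in the comment).
puts realtime_factor(57, DESKSHARE_FPS).round(1)
puts realtime_factor(57, WEBCAM_FPS).round(1)
```

Anything above 1.0 means the realtime deadline never forces the encoder to drop quality to keep up, which is the condition the comment relies on.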
%w[-c:v libvpx-vp9 -b:v 1024K -minrate 512K -maxrate 1485K -crf 32 -quality good -speed 2 -g 240 -tile-columns 2 -threads 2
-c:a libopus -b:a 48K
# We use force_key_frames instead of -g to set the GOP size independent of the frame rate
%w[-c:v libvpx-vp9 -crf 32 -deadline realtime -cpu-used 8 -force_key_frames expr:gte(t,n_forced*10) -tile-columns 2 -tile-rows 2 -threads 4
See above regarding the thread count.
Related PR that can probably be closed now: #9796
What does this PR do?
Processing meeting recording files using MP4/OGG temporary files, stream copy, and customizable (lower) frame rates.
Closes Issue(s)
closes #2483
Motivation
Speed up the processing of raw recording files.