Recording processing using MP4/OGG temporary files, stream copy, and … #11589

Merged

abautu
Contributor

@abautu abautu commented Mar 8, 2021

What does this PR do?

Processing meeting recording files using MP4/OGG temporary files, stream copy, and customizable (lower) frame rates.

Closes Issue(s)

closes #2483

Motivation

Speed up the processing of raw recording files.

More

  • Two new parameters were added to presentation.yml

@antobinary
Member

@abautu Thank you for your contribution!
I have seen your username in relation to a few areas of BBB, most notably recording, but I also spot this is the first official PR contribution you have sent personally.
Could you please confirm you have filled out a CLA? https://docs.bigbluebutton.org/support/faq.html#why-do-i-need-to-sign-a-contributor-license-agreement-to-contribute-source-code

@antobinary antobinary added this to the Release 2.3 milestone Mar 8, 2021
@abautu
Contributor Author

abautu commented Mar 8, 2021

@antobinary I just emailed my CLA. Thank you!

@fcecagno
Member

fcecagno commented Mar 8, 2021

Asking here the same question I asked in the issue and didn't get an answer to: are there any licensing implications for encoding the intermediary file with libx264 by default? If so, would it be possible to use libopenh264 instead, and make it the default option for BigBlueButton (since H264 is required for playback on Safari)?

@abautu
Contributor Author

abautu commented Mar 8, 2021

@fcecagno Someone else would have to answer that. I'm no expert in licensing, but from what I just read tonight (e.g. https://video.stackexchange.com/questions/14694/mp4-h-264-patent-issues) licensing should not be an issue here.

Switching to libopenh264 could create a few problems:

  • it only supports baseline profile (i.e. video files will be larger)
  • it doesn't support constant rate factor mode (i.e. -crf parameter to keep quality within certain level)
  • it's not compiled with ffmpeg for Ubuntu 16.04 (but might be in 18.04, planned for BBB 2.3)

I don't understand your comment related to Safari. How is Safari related to the temporary files or to libopenh264?

@kepstin
Contributor

kepstin commented Mar 8, 2021

Hi, I've just put together some of my thoughts after looking through this PR. I think in general we do want to merge this change - most of the stuff looks straightforward and provides an improvement with minimal loss of flexibility.

I like the use of a concat file (playlist) to generate the final video file from rather than concatenating an mpeg-ts file. That does give you a lot more flexibility in terms of choice of intermediate codecs, since you can use basically any container for the intermediate (mpegts was kind of limited there).
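
As a rough illustration of the concat-file approach described above, the sketch below builds a playlist for ffmpeg's concat demuxer and the matching stream-copy command. The helper names (`concat_playlist`, `concat_command`) and the segment paths are hypothetical; this is not the PR's actual code.

```ruby
# Hypothetical helpers, not the PR's actual code: build a playlist for
# ffmpeg's concat demuxer and a command that joins segments by stream copy.

# Each playlist entry is a `file '<path>'` directive. A single quote inside
# a path is written as '\'' (close quote, escaped quote, reopen quote).
def concat_playlist(segments)
  lines = ['ffconcat version 1.0']
  segments.each do |path|
    escaped = path.gsub("'") { "'\\''" }
    lines << "file '#{escaped}'"
  end
  lines.join("\n") + "\n"
end

# -f concat reads the playlist, -safe 0 allows absolute paths,
# -c copy joins the segments without re-encoding them.
def concat_command(playlist_path, output_path)
  ['ffmpeg', '-y', '-f', 'concat', '-safe', '0',
   '-i', playlist_path, '-c', 'copy', output_path]
end

puts concat_playlist(['/tmp/seg-0.mp4', '/tmp/seg-1.mp4'])
puts concat_command('/tmp/concat.txt', '/tmp/output.mp4').join(' ')
```

Because the playlist only names files, the intermediate segments can use any container the concat demuxer can read, which is the flexibility the comment refers to.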

I'm a bit iffy on changing the audio codec on the intermediate to ogg vorbis instead; I see that it would save time when generating the audio files for recordings without video, but I expect it to be a wash when doing video encoding, since for mp4 output you have to transcode to aac afterwards. (And the extra lossy step is more likely to be noticeable in audio than in video). Do you have any numbers on the performance difference with and without this change for video recordings?

The use of the -force_key_frames option to set the keyframe interval to a fixed 10 seconds regardless of framerate is kind of clever. I was going to suggest just calculating a value for -g using the framerate, but this works. It could probably use a comment.
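
To make the two options concrete, here is a minimal sketch that compares deriving `-g` from the frame rate with the time-based `-force_key_frames` expression used in the PR. The helper names and the example frame rates are illustrative assumptions, not code from the PR.

```ruby
# Hypothetical helpers comparing two ways to request a keyframe every
# `interval` seconds; neither is taken verbatim from the PR.

# Option 1: derive a GOP size in frames for -g. The argument depends on
# the output frame rate and must be recomputed whenever it changes.
def gop_args(framerate, interval = 10)
  ['-g', (framerate * interval).round.to_s]
end

# Option 2: -force_key_frames with a time-based expression. A keyframe is
# forced once the timestamp t reaches n_forced * interval seconds, so the
# same argument works for any frame rate.
def force_key_frames_args(interval = 10)
  ['-force_key_frames', "expr:gte(t,n_forced*#{interval})"]
end

p gop_args(5)            # example: a 5 fps stream
p gop_args(15)           # example: a 15 fps stream
p force_key_frames_args  # identical for both
```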

I think the addition of -bsf:v h264_mp4toannexb is unnecessary in the intermediate format, since it's doing a video encode, not copy. I could see that it might be needed when doing the concatenation later, but not in the temporary files. (And even then, I think it should be auto-inserted in the ffmpeg versions we require?)

Does this work with the ffmpeg we're currently building for BigBlueButton use here: https://launchpad.net/~bigbluebutton/+archive/ubuntu/support ? That's an ffmpeg 4.2.4 build, but a bunch of codecs and options have been disabled on it.

In a follow-up PR (not this one, I'd like to keep it nice and simple like it is!) it would be nice to pass the desired final output codec up to be used in the intermediate file, so you can also get the encoding speed improvements when webm output is selected (and allow people who want to avoid h264 completely to do so).

@abautu
Contributor Author

abautu commented Mar 10, 2021

> I'm a bit iffy on changing the audio codec on the intermediate to ogg vorbis instead; I see that it would save time when generating the audio files for recordings without video, but I expect it to be a wash when doing video encoding, since for mp4 output you have to transcode to aac afterwards. (And the extra lossy step is more likely to be noticeable in audio than in video). Do you have any numbers on the performance difference with and without this change for video recordings?

I did some tests now. For testing, I used one audio recording of male speech and one of instrumental music. I got the first sample by cutting 30 minutes from the raw recording (/var/bigbluebutton/recording/raw/.....audio/meetingid.opus) of my lectures. The second sample is 2m30s long, downloaded from https://www.bensound.com/bensound-music/bensound-creativeminds.mp3. I converted the original files (opus and mp3) to 2 flac files to use as sources. Then I converted each file to aac directly (flac -> aac) and indirectly (flac -> vorbis -> aac). I computed the correlation factor of each output file compared to the source (flac) file, and I got the following results:

  • voice: flac -> aac: 0.97934
  • voice: flac -> ogg -> aac: 0.97400
  • music: flac -> aac: 0.99997
  • music: flac -> ogg -> aac: 0.99744

So the quality loss is minimal (below 1%).
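
The comment does not say which tool or correlation measure produced these numbers. As one plausible reading, the sketch below computes a Pearson correlation between two equally long arrays of decoded samples; the function name and the sample values are made up for illustration.

```ruby
# A minimal sketch of a Pearson correlation between two equally long
# arrays of audio samples. Assumption: the original comment does not name
# the measure or tool it used; this is just one common choice.
def pearson_correlation(xs, ys)
  if xs.empty? || xs.length != ys.length
    raise ArgumentError, 'inputs must be non-empty and equally long'
  end

  n = xs.length.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  covariance = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  spread_x = Math.sqrt(xs.sum { |x| (x - mean_x)**2 })
  spread_y = Math.sqrt(ys.sum { |y| (y - mean_y)**2 })
  covariance / (spread_x * spread_y)
end

# Made-up values standing in for decoded PCM samples.
source  = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
decoded = [0.0, 0.4, 0.9, 0.6, 0.1, -0.5, -0.9, -0.6]
puts pearson_correlation(source, source).round(5)
puts pearson_correlation(source, decoded).round(5)
```

A real comparison would also need the two signals to be time-aligned and at the same sample rate, since encoder delay alone can lower the correlation.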

While doing these tests, I also noticed that freeswitch records the audio in mono mode. I'm not sure if this is a custom setting on our servers, but I doubt it. Can you confirm that default raw audio recordings are in mono mode?

In this case, we could adjust the edl settings to also encode in mono mode. The file size would be about the same (because the bitrate is the same), but the quality improves a bit and the encoding time is reduced by 40%. Using the same steps as before, but with mono sources, I got the following correlation levels:

  • voice: flac -> aac: 0.99553
  • voice: flac -> ogg -> aac: 0.99153
  • music: flac -> aac: 0.99997
  • music: flac -> ogg -> aac: 0.99748

> The use of the -force_key_frames option to set the keyframe interval to a fixed 10 seconds regardless of framerate is kind of clever. I was going to suggest just calculating a value for -g using the framerate, but this works. It could probably use a comment.

I added the comments now.

> I think the addition of -bsf:v h264_mp4toannexb is unnecessary in the intermediate format, since it's doing a video encode, not copy. I could see that it might be needed when doing the concatenation later, but not in the temporary files. (And even then, I think it should be auto-inserted in the ffmpeg versions we require?)

Indeed, it's a useless leftover from my attempts to concatenate fragmented MP4 segments via pipes (instead of mpegts).

> Does this work with the ffmpeg we're currently building for BigBlueButton use here: https://launchpad.net/~bigbluebutton/+archive/ubuntu/support ? That's an ffmpeg 4.2.4 build, but a bunch of codecs and options have been disabled on it.

I didn't test, but I expect it to work ok, since I didn't use any new codecs or features (compared to what was already used).

@kepstin
Contributor

kepstin commented Mar 10, 2021

> While doing these tests, I also noticed that freeswitch records the audio in mono mode. I'm not sure if this is a custom setting on our servers, but I doubt it. Can you confirm that default raw audio recordings are in mono mode?

Yes, the default freeswitch conference mixing is mono. It is possible to switch it to do stereo instead (you can have dynamic positional audio for speakers in the conference!), but that requires some manual changes and I don't think we even document how to do it.

Go ahead and make the changes to do mono audio if you like (should just require changing the values for FFMPEG_AEVALSRC and FFMPEG_AFORMAT), but either way this looks ready to merge.

@antobinary antobinary merged commit dff0f8b into bigbluebutton:develop Mar 11, 2021
@abautu abautu deleted the 2483-speedup-record-processing branch March 11, 2021 18:36
%w[-c:v libvpx-vp9 -b:v 750K -minrate 375K -maxrate 1088K -crf 33 -quality good -speed 1 -g 240 -tile-columns 1 -threads 2
-c:a libopus -b:a 48K
# We use force_key_frames instead of -g to set the GOP size independent of the frame rate
%w[-c:v libvpx-vp9 -crf 32 -deadline realtime -cpu-used 8 -force_key_frames expr:gte(t,n_forced*10) -tile-columns 2 -tile-rows 2 -threads 4
Collaborator

I'd suggest reverting this to maximum use of 2 threads again. Otherwise, you'll have to increase the minimum hardware requirements.
This especially causes problems for default configs with only 6 CPU cores / 12 CPU threads.

cc @antobinary @ffdixon

Collaborator

Regarding this, please also see the discussion in #9119.

Collaborator

@basisbit basisbit Mar 15, 2021

Also, why is -cpu-used set to 8? Wouldn't that tell ffmpeg to use at most 50% of the available time of the CPU threads (4), making it less efficient? (See https://trac.ffmpeg.org/wiki/Encode/VP9#speed )

Contributor

Ah, I missed this. We also shouldn't be using -deadline realtime when doing batch processing like this. The realtime mode is designed to run as close to 1× speed as possible by lowering encoding quality when needed, but we'd prefer having constant quality that simply encodes as fast as it can at the requested settings.

Contributor Author

@basisbit I see #9119 is also about switching from "good" to "realtime" and changing speed from 1 or 2 to 5. I got faster encoding with -cpu-used 8 than with -speed 5, while keeping the same quality. This is an example I ran just now on a 37-minute deskshare video:

-speed 5
encoding at 68fps (real time 2m34s, cpu time 5m35s)
SSIM Y:0.975077 (16.033934) U:0.993177 (21.660456) V:0.993852 (22.112394) All:0.981223 (17.263649)
PSNR y:28.416285 u:43.825114 v:43.393513 average:30.111927 min:7.914798 max:inf

-cpu_used 8
encoding at 84fps (real time 2m4s, cpu time 4m29s)
SSIM Y:0.974545 (15.942346) U:0.992998 (21.548044) V:0.993694 (22.002233) All:0.980812 (17.169781)
PSNR y:28.374842 u:43.735763 v:43.316691 average:30.069864 min:7.914682 max:inf

I believe https://trac.ffmpeg.org/wiki/Encode/VP9 has some obsolete or wrong information. For example, their CPU usage formula (100*(16-cpu-used)/16)% gives lower values as cpu-used gets higher (e.g. 0% when cpu-used is 16). Also, it says that the range of cpu-used is 1...16, but when you use anything higher than 8, ffmpeg fails with this error (tested on 16.04 and 20.04):

Value 16.000000 for parameter 'cpu-used' out of range [-8 - 8]
Error setting option cpu-used to value 16.
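
For reference, a small sketch that evaluates the wiki formula quoted above for a few values. It only restates the arithmetic and makes no claim about how libvpx actually schedules work; the method name is hypothetical.

```ruby
# Evaluates the quoted wiki formula 100 * (16 - cpu_used) / 16 to show
# that the result falls as cpu-used rises. The method name is hypothetical.
def wiki_cpu_usage_percent(cpu_used)
  100.0 * (16 - cpu_used) / 16
end

[0, 4, 8, 16].each do |value|
  puts format('cpu-used %2d -> %5.1f%%', value, wiki_cpu_usage_percent(value))
end
```

The value for 8 is the 50% figure raised in the review comment, and the value for 16 is the 0% case mentioned above.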

Regarding -threads 4, we need to keep in mind that 1 thread doesn't mean 1 CPU busy at 100%. In fact, the vp9 encoder doesn't scale very well with the number of threads. When you have 4 threads, most of them are working at around 40%, so you're using less than 2 cores. Here is an example I ran on that same file:

- 8 threads (actually limited by libvpx to 4 due to input resolution and tile-rows/columns)
339.43user 13.18system 3:09.33elapsed 186%CPU (0avgtext+0avgdata 197688maxresident)k
- 4 threads
346.91user 14.32system 3:17.90elapsed 182%CPU (0avgtext+0avgdata 197816maxresident)k
- 3 threads
segmentation fault
- 2 threads
319.64user 7.71system 3:31.64elapsed 154%CPU (0avgtext+0avgdata 190108maxresident)k
- 1 thread
212.43user 1.87system 3:06.47elapsed 114%CPU (0avgtext+0avgdata 186460maxresident)k
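
The %CPU column in these `time` results is CPU time (user plus system) divided by elapsed wall-clock time, so it can be read as the number of busy cores times 100. The sketch below recomputes it from the figures above; the helper names are hypothetical, and truncating to a whole percent is an assumption that happens to match these results.

```ruby
# Recomputes the %CPU column of the `time` output above from user,
# system and elapsed time. Helper names are hypothetical.

# Converts an elapsed value such as "3:09.33" (minutes:seconds) to seconds.
def elapsed_seconds(text)
  minutes, seconds = text.split(':')
  minutes.to_i * 60 + seconds.to_f
end

# CPU time divided by wall-clock time; 100 means one fully busy core.
# Assumption: the percentage is truncated to a whole number.
def cpu_percent(user, sys, elapsed)
  (100.0 * (user + sys) / elapsed_seconds(elapsed)).floor
end

puts cpu_percent(339.43, 13.18, '3:09.33') # 8 threads requested
puts cpu_percent(346.91, 14.32, '3:17.90') # 4 threads
puts cpu_percent(319.64, 7.71, '3:31.64')  # 2 threads
puts cpu_percent(212.43, 1.87, '3:06.47')  # 1 thread
```

Read this way, even the 8-thread run keeps fewer than two cores busy on average, which is the point made about vp9 thread scaling.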

However, your comment made me think of something else: the settings for x264 are not limited in threads and, unlike vp9, x264 will happily use as much CPU as possible:

- without threads (current pull)
324.41user 21.46system 1:08.27elapsed 506%CPU (0avgtext+0avgdata 273496maxresident)k
- with threads 4
296.78user 11.51system 1:38.49elapsed 312%CPU (0avgtext+0avgdata 156868maxresident)k

So maybe we should add -threads 4 to FFMPEG_WF_ARGS.

@kepstin I don't consider deadline a real issue, as long as the encoder can do more than 1x (which means 5fps for deskshare or 15fps for webcams). My VMs do around 60fps (or 11.4x for deskshare, 3.8x for webcams). My guess is that anyone with a vp9 encoding speed lower than 1.5x (in realtime mode) will decide not to use vp9 at all, as the "good" encoding mode takes far too long.

%w[-c:v libvpx-vp9 -b:v 1024K -minrate 512K -maxrate 1485K -crf 32 -quality good -speed 2 -g 240 -tile-columns 2 -threads 2
-c:a libopus -b:a 48K
# We use force_key_frames instead of -g to set the GOP size independent of the frame rate
%w[-c:v libvpx-vp9 -crf 32 -deadline realtime -cpu-used 8 -force_key_frames expr:gte(t,n_forced*10) -tile-columns 2 -tile-rows 2 -threads 4
Collaborator

see above regarding threads count

@basisbit
Collaborator

Related PR that can probably be closed now: #9796

Development

Successfully merging this pull request may close these issues.

Improve speed of bbb-record
5 participants