
Split html5-server in multiple processes for larger meetings #10349

Merged
merged 1 commit on Mar 2, 2021

Conversation

amguirado73

What does this PR do?

This PR allows splitting the html5-server into multiple processes. It is inspired by PR # 8788 but takes a different approach. The main idea is to run multiple html5-server processes, bypassing the current limitation imposed by the fact that Node.js executes essentially on a single thread. This way, you can get more users per meeting.

Motivation

Currently, there is a limitation whereby a meeting can have between 100-200 users, depending on the restrictions applied to users. I would like to get more users per meeting.

More

Using an environment variable METEOR_ROLE=[backend|frontend], multiple processes can be started. There should be only one backend process, which handles all events related to MongoDB. The frontend processes, which listen only to the frontend-redis-channel Redis channel, process only certain messages. As many frontend processes as desired can be started. NGINX is used to balance the users' Meteor connections.
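
For illustration, the role split at startup can be pictured like this (a minimal sketch, assuming a node-redis v3 style client; the channel names follow the description above, but handleMessage is a hypothetical stand-in for the actual BBB handlers):

const redis = require('redis');

const ROLE = process.env.METEOR_ROLE || 'backend';
const sub = redis.createClient();

// The single backend consumes the akka-apps events and is the only
// writer to MongoDB; every frontend listens only to the fan-out channel.
const channels = ROLE === 'backend'
  ? ['from-akka-apps-redis-channel']
  : ['frontend-redis-channel'];

channels.forEach((channel) => sub.subscribe(channel));
sub.on('message', (channel, message) => {
  handleMessage(ROLE, channel, JSON.parse(message)); // hypothetical dispatcher
});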

Examples of use in the file /usr/share/meteor/bundle/systemd_start.sh:

# For 2 FRONTEND processes and 1 BACKEND
/usr/bin/npx concurrently -n 'backend,frontend1,frontend2' "env METEOR_ROLE=backend PORT=3000 /usr/share/$NODE_VERSION/bin/node main.js" "env METEOR_ROLE=frontend PORT=3001 /usr/share/$NODE_VERSION/bin/node main.js" "env METEOR_ROLE=frontend PORT=3002 /usr/share/$NODE_VERSION/bin/node main.js"

# For 3 FRONTEND processes and 1 BACKEND
/usr/bin/npx concurrently -n 'backend,frontend1,frontend2,frontend3' "env METEOR_ROLE=backend PORT=3000 /usr/share/$NODE_VERSION/bin/node main.js" "env METEOR_ROLE=frontend PORT=3001 /usr/share/$NODE_VERSION/bin/node main.js" "env METEOR_ROLE=frontend PORT=3002 /usr/share/$NODE_VERSION/bin/node main.js" "env METEOR_ROLE=frontend PORT=3003 /usr/share/$NODE_VERSION/bin/node main.js"

The configuration to apply in NGINX is:

File: /etc/nginx/nginx.conf

upstream poolhtml5servers {
            zone poolhtml5servers 32k;
            hash $remote_addr;
            server 127.0.0.1:3001 fail_timeout=5s max_fails=3;
            server 127.0.0.1:3002 fail_timeout=5s max_fails=3;
    }

File: /etc/bigbluebutton/nginx/bbb-html5.nginx

location /html5client {
  proxy_pass http://poolhtml5servers;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "Upgrade";
}

The hash $remote_addr directive is used to ensure that each user is always routed to the same html5-server process.

As a side effect, a problem was observed with external videos when the presenter starts/stops or repositions the video. To address this, the external-video code has been modified so that these events generate messages on Redis channels and can thus be received by all running frontend processes. I have reused PR#7484 for this.
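
The fan-out works because every frontend subscribes to the same Redis channel, so each process can re-emit the event to the clients it serves. A minimal sketch, assuming redisPub/redisSub are connected node-redis v3 clients; the message name and the externalVideoStreamer helper are illustrative, not the exact identifiers in the PR:

// The presenter's frontend publishes the player event for all processes.
redisPub.publish('frontend-redis-channel', JSON.stringify({
  name: 'UpdateExternalVideoEvtMsg',
  meetingId,
  body: { status: 'play', rate: 1, time: 123.4 },
}));

// Every frontend (including the publisher) receives the event and
// re-emits it to its own connected clients.
redisSub.on('message', (channel, raw) => {
  const msg = JSON.parse(raw);
  if (msg.name === 'UpdateExternalVideoEvtMsg') {
    externalVideoStreamer(msg.meetingId).emit('playerUpdate', msg.body);
  }
});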

Also, the banned-users logic has been modified to use MongoDB instead of a Set local to each process.
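
The reason is that a per-process Set is invisible to the other frontends, so a ban applied by one process would not take effect on the others; a shared MongoDB collection fixes that. A minimal sketch in Meteor server code (the collection and helper names are illustrative, not the actual BBB ones):

import { Mongo } from 'meteor/mongo';

// Every process connects to the same MongoDB, so the ban list is shared.
const BannedUsers = new Mongo.Collection('users-banned');

export function ban(meetingId, externalUserId) {
  BannedUsers.upsert({ meetingId, externalUserId }, { $set: { meetingId, externalUserId } });
}

export function isBanned(meetingId, externalUserId) {
  return !!BannedUsers.findOne({ meetingId, externalUserId });
}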

 Changes to be committed:
	new file:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core/apps/externalvideo/ExternalVideoApp2x.scala
	new file:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core/apps/externalvideo/StartExternalVideoPubMsgHdlr.scala
	new file:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core/apps/externalvideo/StopExternalVideoPubMsgHdlr.scala
	new file:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core/apps/externalvideo/UpdateExternalVideoPubMsgHdlr.scala
	modified:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core/pubsub/senders/ReceivedJsonMsgHandlerActor.scala
	modified:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core/running/MeetingActor.scala
	modified:   akka-bbb-apps/src/main/scala/org/bigbluebutton/core2/FromAkkaAppsMsgSenderActor.scala
	new file:   bbb-common-message/src/main/scala/org/bigbluebutton/common2/msgs/ExternalVideoMsgs.scala
	new file:   bigbluebutton-html5/imports/api/external-videos/server/eventHandlers.js
	new file:   bigbluebutton-html5/imports/api/external-videos/server/handlers/startExternalVideo.js
	new file:   bigbluebutton-html5/imports/api/external-videos/server/handlers/stopExternalVideo.js
	new file:   bigbluebutton-html5/imports/api/external-videos/server/handlers/updateExternalVideo.js
	modified:   bigbluebutton-html5/imports/api/external-videos/server/index.js
	modified:   bigbluebutton-html5/imports/api/external-videos/server/methods.js
	modified:   bigbluebutton-html5/imports/api/external-videos/server/methods/emitExternalVideoEvent.js
	modified:   bigbluebutton-html5/imports/api/external-videos/server/methods/startWatchingExternalVideo.js
	modified:   bigbluebutton-html5/imports/api/external-videos/server/methods/stopWatchingExternalVideo.js
	new file:   bigbluebutton-html5/imports/api/external-videos/server/streamer.js
	modified:   bigbluebutton-html5/imports/api/meetings/server/handlers/meetingDestruction.js
	modified:   bigbluebutton-html5/imports/api/meetings/server/modifiers/addMeeting.js
	modified:   bigbluebutton-html5/imports/api/meetings/server/modifiers/meetingHasEnded.js
	modified:   bigbluebutton-html5/imports/api/users/server/handlers/validateAuthToken.js
	modified:   bigbluebutton-html5/imports/api/users/server/store/bannedUsers.js
	modified:   bigbluebutton-html5/imports/startup/server/index.js
	modified:   bigbluebutton-html5/imports/startup/server/redis.js
	modified:   bigbluebutton-html5/imports/ui/components/external-video-player/service.js
	modified:   bigbluebutton-html5/private/config/settings.yml
@amguirado73 changed the title from "Committer: Antonio Guirado <amguirado73@gmail.com>" to "Split html5-server in multiple processes for larger meetings" on Aug 28, 2020
@antobinary added this to the Release 2.3 milestone on Aug 28, 2020
@antobinary
Member

Hi @amguirado73! Thank you for your contribution! Could you please confirm you have filled out a CLA? https://docs.bigbluebutton.org/support/faq.html#why-do-i-need-to-sign-a-contributor-license-agreement-to-contribute-source-code

@amguirado73
Author

amguirado73 commented Aug 29, 2020 via email

@ffdixon
Member

ffdixon commented Sep 12, 2020

Contributor agreement received -- thanks!

@jibon57
Contributor

jibon57 commented Sep 20, 2020

@amguirado73 thanks a lot for introducing this new PR, and for the email conversation. We've tested your PR on release builds 2.2.23 & 2.2.25, with 2,000 real users, using the following machines:

2 X E5-2680 v2 @ 2.80GHz (total 40 cores)
62GB RAM
1Gbps uplink
&
2 X E5-2650 @ 2.00GHz (total 32 cores)
62GB RAM
1Gbps uplink

We've configured a pool of 5 html5 processes & tested for over 1 week, averaging 30~70 users in each session. We're really happy as all tests went well & we made the following observations:

  1. Each server was able to hold 800+ users without any problem, in some cases 1100+ users, & we didn't notice much of a problem. During this time there were very few webcams. Most of the CPU was used by FreeSWITCH (1200%); Node.js averaged 40%.

  2. Each server was able to hold 480+ users & 110+ webcams. Beyond that, users in new meetings had problems sharing webcams, getting Error: 2003. During testing we split Kurento into 3 parts, which was introduced in v2.2.24. We also decreased the bitrate to 50 kbit/s. Pagination was on, with Moderator: 10 & Attendee: 5. Users didn't have problems joining with audio, only with webcams. At this point Kurento CPU usage reached 700%.

Our target was to hold 600 users on each server & we're happy to find it working. I hope the BBB core team will have a look at this & merge/further improve it. Thanks again to @amguirado73 for all your help during the tests.

@ffdixon
Member

ffdixon commented Sep 20, 2020

Thanks @jibon57 for sharing your experience! This work (or a variation of it) will be merged into BigBlueButton 2.3-dev.

@iSamof

iSamof commented Sep 20, 2020

@jibon57 - thank you very much for this valuable live test and sharing the results. One question:

When you say "Both server was able to hold 800+ users ...", do you mean that both servers together had a total of 800+ users, or that each server managed to service 800+, for a total of 1600+ between the two servers?

Thank you again

@jibon57
Contributor

jibon57 commented Sep 21, 2020

@ffdixon thank you!
@iSamof each of the servers was able to hold 800+ users; we tested up to 1100+ on each.

@aguerson

Hi guys,
You are on the right track ;)

I have 4 questions:

  1. With your pool of 5 html5 processes, is one meeting spread across all 5, or does a meeting always stay on 1 of the 5?
  2. Could you raise the pool from 5 processes up to 10 or more?
  3. When you said you decreased the bitrate to 50 kbit/s, where did you do this?
  4. Maybe I missed something in the docs: when you said "pagination was on with Moderator: 10 & Attendee: 5", I activated pagination in 2.2.25; is this not automatic? Do you have to push a button during the meeting, or activate an option on the meeting?

Regards,
Aurélien.

@aguerson

And thank you again !

@amguirado73
Author

amguirado73 commented Sep 21, 2020 via email

@aguerson

@amguirado73
Very good news!
When will this PR be validated as usable with 2.2.25+? Do you have more documentation for configuring nginx? I saw the doc in the PR description; is that enough to play with it?

@ffdixon
Do you intend to cover this in the official BBB docs, or is it too fresh?

Without this PR and without the 3-KMS PR, I managed to accept 450 people with 2 cams in one meeting! (But at the end the chat didn't respond... and the meeting froze.)

With this server (OpenVZ 7 installed, CTs running Ubuntu 16.04 LTS):
Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz (128 procs)
187 GB RAM
10G uplink (but we didn't go above 1G)

I already updated to 2.2.25 to use the 3-KMS PR, and I hope that with this PR I can finally take more than 500 users in one session.

@aguerson

I managed to accept the 450 people on 2.2.23.

@mabras

mabras commented Sep 21, 2020

Thanks @amguirado73

I am just confused about:

Currently, there is a limitation whereby a meeting can have between 100-200 users, depending on the restrictions applied to users. I would like to get more users per meeting.

A possible sizing rule is 100-150 users per process, similar to the current limit.

The motivation was to get more; if we are still at the same limit, then what is the point? And if I am using Scalelite, does this add any value?

@amguirado73
Author

amguirado73 commented Sep 21, 2020 via email

@aguerson

aguerson commented Sep 21, 2020

How can you generate 100 bots? Do you have a script, and could you share it?

I want to use it to load my server with as many bot users as I can.

I can give you feedback afterwards.

@mabras

mabras commented Sep 21, 2020

About whether we are still at the same limit per meeting: I don't have enough resources to run tests with that number of users (I can generate only 100 bots). I would appreciate any kind of help with it.

Just yesterday I was using:
https://github.com/mconf/bigbluebot
I was able to generate 250 bots with it. Five servers, 50 bots each.

@amguirado73
Author

amguirado73 commented Sep 21, 2020 via email

@aguerson

aguerson commented Sep 22, 2020

I am still trying to get bigbluebot running. I opened an issue about using it:

mconf/bigbluebot#11

@iSamof

iSamof commented Sep 23, 2020

@jibon57 - Many thanks for the answers.

Thanks to everyone else for sharing the info.

@jibon57
Contributor

jibon57 commented Sep 24, 2020

Hello guys,

I was checking PM2 cluster mode from here: https://pm2.keymetrics.io/docs/usage/cluster-mode/
Since in this solution the backend & frontend are separated, using this tool may be more helpful than nginx load balancing; PM2 also has a better monitoring system. I did a simple test; I'm not sure it was the correct way to do it, but it was working. As you're testing with bots, you can give it a try & see the difference. I've installed pm2 globally with:
npm install pm2@latest -g

Now edit /usr/share/meteor/bundle/systemd_start.sh:

#PORT=3000 /usr/share/$NODE_VERSION/bin/node main.js
env METEOR_ROLE=backend PORT=3000 pm2 start main.js --restart-delay=3000 --name backend
env METEOR_ROLE=frontend PORT=3001 pm2 start main.js -i 5 --restart-delay=3000 --name frontend

Nginx config: /etc/bigbluebutton/nginx/bbb-html5.nginx:

location /html5client {
  proxy_pass http://127.0.0.1:3001;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "Upgrade";
}

Now start the processes:
bash /usr/share/meteor/bundle/systemd_start.sh
It should start the processes, which you can then monitor with:
pm2 monit
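
Equivalently, the two pm2 commands above can be captured in an ecosystem file (a sketch, assuming pm2's standard ecosystem format and the bundle path used in this thread):

// ecosystem.config.js (sketch); start with: pm2 start ecosystem.config.js
module.exports = {
  apps: [
    {
      name: 'backend',
      script: '/usr/share/meteor/bundle/main.js',
      env: { METEOR_ROLE: 'backend', PORT: 3000 },
      restart_delay: 3000,
    },
    {
      name: 'frontend',
      script: '/usr/share/meteor/bundle/main.js',
      instances: 5,
      exec_mode: 'cluster', // pm2 cluster mode shares PORT 3001 between instances
      env: { METEOR_ROLE: 'frontend', PORT: 3001 },
      restart_delay: 3000,
    },
  ],
};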

@aguerson

In my case, I am still trying to find enough resources to run the 500+ bots.
Then I will try the first solution.
@jibon57 In your setup, did you apply the original commit first and then your patch, or just your patch?

@ffdixon mentioned this pull request on Sep 26, 2020
@GhaziTriki
Member

Hi @jibon57
Thank you very much for your pull request. I have checked it, and it seems good if you want more concurrent users on the same server; however, it does not seem to solve the other problem, namely having more than 100 concurrent users per meeting.

Your servers have a lot of RAM; if you use PM2 to run a cluster, you will use a lot of RAM too, and I think that is what is happening on your servers. The same goes for CPU: you are using a lot of it.

@jibon57
Contributor

jibon57 commented Nov 5, 2020

Thanks @GhaziTriki. The PR is actually from @amguirado73. In my case I didn't need capacity for more than 100 users per room, so I didn't run that test, but @amguirado73 did test with over 320 users. So far this solution is working fine for my case.

@GhaziTriki
Member

@amguirado73 How much effort would it be to create a similar PR for 2.2.x? I am interested in testing it.

@netzwerk-azv

@amguirado73 How much effort would it be to create a similar PR for 2.2.x? I am interested in testing it.

@netzwerk-azv

Will this be included in one of the 2.2.xx releases, or only in 2.3? I've only read that someone managed to test this on 2.2.23 and 2.2.25, but I am sure this would help a lot of users who, right now, face problems getting enough users onto their servers and meetings within the limited resources available.
And we don't know when a stable 2.3 will be released.

I am currently on 2.2.29 (dev)

@schrd
Collaborator

schrd commented Nov 23, 2020

We have been running this patch, adapted for 2.2.28, in production on 12 servers for a week, with 2 frontend processes each. It works very well so far. We decided to give this patch a try because the scalelite balancing strategy is so stupid: we have a very unequal distribution of conference sizes, and two weeks ago scalelite decided to balance two conferences with 150 participants each onto a server that was already loaded with 100 users. Of course, the inevitable kicking of users happened. In a load test with 100 desktop computers running 5 bigbluebots each, this patch performed very well; we got 475 bots joined into 4 conferences.

However, today one nodejs frontend process crashed with:

Nov 23 13:56:16 bbb-server systemd_start_frontend.sh[1151]: terminate called after throwing an instance of 'std::bad_alloc'
Nov 23 13:56:16 bbb-server systemd_start_frontend.sh[1151]:   what():  std::bad_alloc
Nov 23 13:56:16 bbb-server systemd_start_frontend.sh[1151]: /usr/share/meteor/bundle/systemd_start_frontend.sh: line 66:  2066 Aborted                 PORT=$1 /usr/share/$NODE_VERSION/bin/node main.js

I assume this is a nodejs bug, not related to the patch; I believe I saw this message some time ago without the patch as well. Memory was not short on the system. Existing users on the server were immediately rebalanced to the other frontend process by nginx, and I doubt anyone noticed the error. However, when the process was restarted by systemd and nginx balanced connections to it, I discovered the following messages in the log:

Nov 23 13:56:42 bbb-server systemd_start_frontend.sh[8892]: error: Error while trying to send cursor streamer data for meeting xxxxxxx-yyy. TypeError: Cannot read property 'emit' of undefined

I could reproduce this on our test server by restarting one frontend process. The cause is that cursor positions and annotations in the presentations are distributed between the users by the meteor-streamer library. The meteor-streamer subscriptions are started by addMeeting events; the freshly started process, however, never received those events for the meetings already running. Distribution of annotations and cursor positions worked for fresh conferences.

I would like to contribute a patch that fixes this behaviour; however, I do not understand why this meteor-streamer mechanism works, and why the replication of annotation state between the two meteor processes works at all. To me, meteor-streamer looks like it should broadcast messages to all users in a conference, but only within the same nodejs process.

Can anybody give me a hint where to look, and how to improve the patch so that it continues to work when meteor is restarted? My current idea is to query mongodb at server startup for running meetings and register those meetings for streaming of cursors and annotations.
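
For reference, the proposed startup re-registration could look roughly like this (a sketch of the idea only, not a tested fix; addStreamerForMeeting is a hypothetical stand-in for whatever the addMeeting handler does today):

import { Meteor } from 'meteor/meteor';
import Meetings from '/imports/api/meetings';

Meteor.startup(() => {
  // Re-register cursor/annotation streamers for meetings that were
  // created before this frontend process started.
  Meetings.find({}).forEach((meeting) => {
    addStreamerForMeeting(meeting.meetingId); // hypothetical helper
  });
});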

@github-actions

github-actions bot commented Dec 4, 2020

This pull request has conflicts ☹
Please resolve those so we can review the pull request.
Thanks.

@ichdasich

With 2.3 taking its time, is there any chance to maybe get this into one of the coming 2.2 releases?

@ffdixon
Member

ffdixon commented Dec 9, 2020

There is work underway to get a variation of this approach into the next build of BigBlueButton 2.3-dev.

@ichdasich

So, not planned for 2.2? Guess I will have to wait for the 2.3 release then.

@ffdixon
Member

ffdixon commented Dec 9, 2020

It's easier for us to implement and test this in 2.3-dev. We're weaning off 2.2 soon, as we want to focus our efforts on the next release and get the product onto Ubuntu 18.04 asap.

@cod3r0k

cod3r0k commented Dec 10, 2020

@ffdixon Can we migrate from 2.2 to 2.3, or can't we? (I'm worried about my previous recordings.)

@ffdixon
Member

ffdixon commented Dec 10, 2020

We would recommend setting up 2.3-dev on a new 18.04 server, leaving your 16.04 server untouched, and then copying the recordings from the 2.2 server onto the 2.3-dev server. This way, nothing changes on the 2.2 server and you can test the 2.3-dev server independently.

@schrd
Collaborator

schrd commented Jan 4, 2021

(Quoting my comment from Nov 23 above, about running this patch on 12 production servers and the std::bad_alloc crash of one nodejs frontend process.)

We have now been running this patch for 7 weeks, with more than 32k meeting hours and more than 200k participant hours (a meeting with 10 users for one hour counts as 1 meeting hour and 10 participant hours). We ported it to 2.2.30 in the meantime. In total we had 3 nodejs crashes across all of our servers since then, so I consider this safe. At least I have never had to struggle with overloaded nodejs processes since then. It also reduces the latency of actions such as mute/unmute on servers with more than 200 concurrent users.

Compared to #11008, this approach is superior in my opinion:

  • You can have more participants in a meeting than a single nodejs instance can handle. This would allow running BBB on servers with many but slower cores (such as ARM servers).
  • No meeting-routing logic in bbb-web is required, which reduces complexity.
  • If one meteor instance crashes for whatever reason, nginx will rebalance the users to the other running frontend; the user will only notice the reconnection notification when meteor reestablishes its websocket. Contrast this with the current situation without the patch: if nodejs crashes, all meetings on that server are destroyed.
  • The frontend processes can be set up with the systemd options RestartSec=5s and Restart=on-failure (we run it this way; see the sketch after this list).
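
A minimal sketch of such a templated unit, with the port as the instance parameter (paths and option values are illustrative; the real unit is linked later in this thread):

# /etc/systemd/system/bbb-html5-frontend@.service (illustrative)
[Unit]
Description=BigBlueButton HTML5 frontend on port %i
After=network.target

[Service]
Environment=METEOR_ROLE=frontend
ExecStart=/usr/share/meteor/bundle/systemd_start_frontend.sh %i
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target

Instances would then be started as, e.g., systemctl start bbb-html5-frontend@3001 bbb-html5-frontend@3002.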

I set up the nginx load balancer to use the number of connections (least_conn) to decide which meteor process to route to. We currently run it with 2 workers.

@ichdasich

ichdasich commented Jan 4, 2021

Do you maybe have documentation for applying the patch to 2.2.30?

Also: Does applying the patch require repackaging BBB, and are there any changes in 2.2.30 which require adjustments to the patch?

@schrd
Collaborator

schrd commented Jan 5, 2021

Do you maybe have documentation for applying the patch to 2.2.30?

Also: Does applying the patch require repackaging BBB, and are there any changes in 2.2.30 which require adjustments to the patch?

You need to rebuild the bbb-html5 and bbb-apps-akka packages. In addition, a new systemd unit, bbb-html5-frontend@.service, is necessary to serve the user requests. Nginx needs a load-balancer config:

# /etc/nginx/conf.d/bbb-html5-loadbalancer.conf 
upstream poolhtml5servers {
  zone poolhtml5servers 32k;
  least_conn;
  server 127.0.0.1:3001 fail_timeout=5s max_fails=3;
  server 127.0.0.1:3002 fail_timeout=5s max_fails=3;
}
# /etc/bigbluebutton/nginx/bbb-html5.nginx
location /html5client {
  proxy_pass http://poolhtml5servers;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection "Upgrade";
}

location /_timesync {
  proxy_pass http://127.0.0.1:3000;
}

The patched source tree is here: https://gitlab.hrz.tu-chemnitz.de/bigbluebutton/bigbluebutton/-/tree/pr-10349-2.2.30
To create packages I created https://gitlab.hrz.tu-chemnitz.de/bigbluebutton/bigbluebutton-packaging
I did not yet patch bbb-conf because we start/stop services using our configuration management system.

The systemd unit is included in https://gitlab.hrz.tu-chemnitz.de/bigbluebutton/bigbluebutton-packaging/-/blob/master/bbb-html5/bbb-html5-frontend@.service

@antobinary
Member

I am looking at the nginx load-balancing strategies, trying to understand more about how frontend f1 is preferred over frontend f2 when a new user joins. I understand that the hash approach ensures that if a user refreshes, the same frontend is used as for the previous connection.
Could anyone comment on this?
Was the 'Least Connections' approach considered?

@schrd
Collaborator

schrd commented Feb 10, 2021

We are running it with least_conn as the balancing strategy. This way the load on the frontends stays almost equal.

@jibon57
Contributor

jibon57 commented Feb 11, 2021

Previously I always used hash $remote_addr; but for the last 2 days I tried least_conn. However, I didn't notice much difference.

I also tried pm2. Using pm2 I got a very good result too.

@antobinary
Member

2.3-alpha7 was just released, and it includes #11317, which is based on this work by @amguirado73 and @jfsiebel: https://github.com/bigbluebutton/bigbluebutton/releases/tag/v2.3-alpha-7

Please give it a try if you have the opportunity to put some load on a 2.3-alpha7 server (not in production).
