vidstreams count should not count receiveOnlyEndpoints #1364

nemani · 2020-07-31T15:45:06Z

jitsi-videobridge/jvb/src/main/java/org/jitsi/videobridge/stats/VideobridgeStatistics.java

Line 328 in edd358c

// Assume we're receiving a video stream from the endpoint

Is there a reason for assuming this in the stats? Wouldn't it be better to just check whether it's a receiveOnlyEndpoint, as done on line 277?

jitsi-videobridge/jvb/src/main/java/org/jitsi/videobridge/stats/VideobridgeStatistics.java

Line 277 in edd358c

if (!sendingAudio && !sendingVideo && !inactive)

?

bgrozev · 2020-07-31T22:54:21Z

Thanks for pointing this out. That's deprecated code which was written before we had a good way of checking whether we're receiving video. The "videostreams" metric is bad, and it's not used anywhere as far as I know, so I'll open a PR to remove it. Do you use it for anything?

nemani · 2020-07-31T23:01:26Z

We use it to correlate the load on the server. I think it's useful to know the videostream count, as it helps to understand the load on the JVB.

Can we also look at the other jvb stats? For me there is a lot of confusion when using OCTO with stats such as total_participants or total_conferences.

bgrozev · 2020-07-31T23:44:29Z

Currently we recommend using packet rate (packet_rate_download + packet_rate_upload) as a proxy for load. This is what jicofo uses, and it correlates with load much better than video streams count.

I agree there's confusion with stat names, especially with Octo. Hopefully we can clean that up in the future. Some of the stats are kept for backward compatibility. The ones that start with total_ are usually cumulative (e.g. total_conference counts how many conferences have been created since the start, not how many are currently active). If you have specific questions let us know.

nemani · 2020-08-22T13:41:52Z

@bgrozev can we have stress as a part of the colibri stats output?
I know stress is shown in jicofo when it decides which bridge to use.

On our Grafana dashboard we use the following formula for stress (copied from stats.ffmuc.net):
(mean("packet_rate_download")+mean("packet_rate_upload"))/50800

With this formula I can see the stress going above 1 while jicofo does not use spare bridges in the same region, which is concerning.
Any help would be appreciated!
Thanks

bgrozev · 2020-08-24T16:41:40Z

> @bgrozev can we have stress as a part of the colibri stats output?

This is something we've been talking about, but not implemented yet. I think the consensus is that it should be calculated on the bridge itself, so we'll have it eventually, but I don't have any timeline.

> I can see the stress using this formula going above 1 but jicofo not using spare bridges in the same region, which is concerning.

This doesn't seem right. Jicofo should allocate evenly to all available bridges, even before the stress reaches 1. At the threshold of 0.8 it will start to offload participants to other bridges using Octo.
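
A minimal sketch of the stress calculation discussed here, for illustration only. It assumes the 50800 divisor from the Grafana formula quoted above and the 0.8 threshold mentioned in this comment. The class and method names are hypothetical, whether the threshold is inclusive is an assumption, and this is not jicofo's actual code.

```java
public class StressSketch {
    // Divisor from the Grafana formula quoted in this thread; it depends on the machine.
    public static final double MAX_PACKET_RATE = 50800.0;
    // Threshold at which offloading to other bridges is said to start.
    public static final double OFFLOAD_THRESHOLD = 0.8;

    /** Stress as total packet rate (download + upload) divided by the assumed maximum. */
    public static double stress(double packetRateDownload, double packetRateUpload) {
        return (packetRateDownload + packetRateUpload) / MAX_PACKET_RATE;
    }

    /** True once stress reaches the threshold (inclusive comparison is an assumption). */
    public static boolean shouldOffload(double stress) {
        return stress >= OFFLOAD_THRESHOLD;
    }

    public static void main(String[] args) {
        double s = stress(20000, 25000);
        System.out.println("stress=" + s + " offload=" + shouldOffload(s));
    }
}
```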

nemani · 2020-08-24T16:45:53Z

I will see if I can send in a PR for it. We only had one running conference, so it will only split it when a new person joins and stress is > 0.8, right? We are considering modifying the source and changing the threshold to 0.7 (which IMO should be configurable in jicofo).


bgrozev · 2020-08-24T18:47:12Z

> We only had one running conference, so it will only split it when a new person joins and stress is > 0.8, right?

Yes.

> We are considering modifying the source and changing the threshold to 0.7 (which IMO should be configurable in jicofo)

This makes it configurable: jitsi/jicofo#584

Note that you can achieve the same with the current code by configuring the max packet rate.
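
To illustrate the arithmetic behind that note: if the threshold stays fixed at 0.8 and stress is packet rate divided by the configured maximum, then lowering the maximum by the ratio of the desired threshold to the fixed one makes offloading start at the same packet rate a 0.7 threshold would. The sketch below assumes the 50800 figure from the Grafana formula earlier in the thread. The class and method names are hypothetical.

```java
public class ThresholdScaling {
    /**
     * Max packet rate to configure so that a fixed threshold triggers at the
     * packet rate where the desired threshold would trigger with the current maximum.
     */
    public static double equivalentMaxPacketRate(double currentMax,
                                                 double fixedThreshold,
                                                 double desiredThreshold) {
        return currentMax * desiredThreshold / fixedThreshold;
    }

    public static void main(String[] args) {
        // 50800 * 0.7 / 0.8 is roughly 44450 packets per second.
        System.out.println(equivalentMaxPacketRate(50800.0, 0.8, 0.7));
    }
}
```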

nemani · 2020-08-24T18:52:24Z

Great! Thanks for the quick fix. Do you have any suggestions on how one can calculate the max packet rate a machine can handle? The current number just seems arbitrary. Thanks!


bgrozev · 2020-08-24T18:56:07Z

It is arbitrary: we set it to match the machines we currently use (the longer term plan is to have the bridges calculate that themselves). I suggest you experiment with the machines you use by gradually adding conferences until system load approaches the number of processors, then note the packet rate.
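
One way to turn such an experiment into a number, as a rough sketch: record (load average, packet rate) pairs while adding conferences, then take the highest packet rate seen while the load average was still at or below the processor count. The class name, method name, and sample values are hypothetical, and the cut-off rule is an assumption rather than an established procedure.

```java
import java.util.OptionalDouble;

public class MaxPacketRateCalibration {
    /**
     * Returns the highest packet rate observed while the load average was
     * at or below the processor count, or empty if no sample qualifies.
     */
    public static OptionalDouble estimateMaxPacketRate(double[] loadAverages,
                                                       double[] packetRates,
                                                       int processors) {
        if (loadAverages.length != packetRates.length) {
            throw new IllegalArgumentException("sample arrays must have equal length");
        }
        OptionalDouble best = OptionalDouble.empty();
        for (int i = 0; i < loadAverages.length; i++) {
            boolean withinCapacity = loadAverages[i] <= processors;
            if (withinCapacity && (!best.isPresent() || packetRates[i] > best.getAsDouble())) {
                best = OptionalDouble.of(packetRates[i]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Made-up samples from a hypothetical 8-processor machine.
        double[] loads = {2.0, 5.0, 7.9, 9.5};
        double[] rates = {10000, 30000, 48000, 60000};
        System.out.println(estimateMaxPacketRate(loads, rates, 8));
    }
}
```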

nemani · 2020-09-08T20:27:51Z

Is there a script I can look at? It would help a lot. I get the general gist, but how exactly does packet rate map to system load?

Also I noticed the UDP buffer sizes here
Did you guys experiment with larger buffers? Again the numbers seem to be arbitrary.

Thanks once again for taking the time to reply, and for working on making jitsi better for all.

bgrozev · 2020-09-08T21:54:32Z

You can use this script to generate load (using a selenium grid): https://github.com/jitsi/jitsi-meet-torture/blob/master/scripts/malleus.sh

The socket buffer size is set sufficiently large to reduce dropped packets on our machines (aws c5.4xlarge or similar) under heavy load. Unless you see packets dropped in the kernel (see the output of netstat -naus) you don't need to increase it.

nemani · 2020-09-24T14:50:10Z

Closing this issue as vidstreams has been removed and jvb now calculates stress.

nemani closed this as completed Sep 24, 2020
