vidstreams count should not count receiveOnlyEndpoints #1364

Closed
nemani opened this issue Jul 31, 2020 · 12 comments

Comments

@nemani

nemani commented Jul 31, 2020

// Assume we're receiving a video stream from the endpoint

Is there a reason for assuming this in the stats? Wouldn't it be better to just check whether it's a receiveOnlyEndpoint, as done on line 277?


@bgrozev
Member

bgrozev commented Jul 31, 2020

Thanks for pointing this out. That's deprecated code which was written before we had a good way of checking whether we're receiving video. The "videostreams" metric is bad, and it's not used anywhere as far as I know, so I'll open a PR to remove it. Do you use it for anything?

@nemani
Author

nemani commented Jul 31, 2020

We use it to correlate the load on the server. I think it's useful to know the videostream count, as it helps to understand the load on the JVB.

Can we also look at the other jvb stats? For me there is a lot of confusion when using OCTO with stats such as total_participants or total_conferences.

@bgrozev
Member

bgrozev commented Jul 31, 2020

Currently we recommend using packet rate (packet_rate_download + packet_rate_upload) as a proxy for load. This is what jicofo uses, and it correlates with load much better than video streams count.

I agree there's confusion with stat names, especially with Octo. Hopefully we can clean that up in the future. Some of the stats are kept for backward compatibility. The ones that start with total_ are usually cumulative (e.g. total_conference counts how many conferences have been created since startup, not how many are currently active). If you have specific questions, let us know.

@nemani
Author

nemani commented Aug 22, 2020

@bgrozev can we have stress as a part of the colibri stats output?
I know stress is shown in jicofo when it decides which bridge to use.

On our Grafana dashboard we use the following formula for stress (copied from stats.ffmuc.net):
(mean("packet_rate_download")+mean("packet_rate_upload"))/50800

With this formula I can see the stress going above 1 while jicofo does not use spare bridges in the same region, which is concerning.
Any help would be appreciated!
Thanks
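
The Grafana formula above can be sketched in a few lines of Python. This is only an illustration: the function name and the sample values are hypothetical, and 50800 is simply the divisor quoted in the formula, not a value that is right for every machine.

```python
# Divisor quoted in the Grafana formula above (packets per second).
# It is an assumption about the bridge hardware, not a universal constant.
MAX_PACKET_RATE = 50800


def estimate_stress(packet_rate_download, packet_rate_upload,
                    max_packet_rate=MAX_PACKET_RATE):
    """Return the total packet rate divided by the assumed maximum packet rate."""
    if max_packet_rate <= 0:
        raise ValueError("max_packet_rate must be positive")
    return (packet_rate_download + packet_rate_upload) / max_packet_rate


# Hypothetical sample of the two stats named in the formula.
stats = {"packet_rate_download": 20000, "packet_rate_upload": 25000}
stress = estimate_stress(stats["packet_rate_download"],
                         stats["packet_rate_upload"])
print(f"estimated stress: {stress:.2f}")
```

A result above 1 means the combined packet rate exceeds the assumed maximum.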

@bgrozev
Member

bgrozev commented Aug 24, 2020

> @bgrozev can we have stress as a part of the colibri stats output?

This is something we've been talking about, but have not implemented yet. I think the consensus is that it should be calculated on the bridge itself, so we'll have it eventually, but I don't have a timeline.

> I can see the stress using this formula going above 1 but jicofo not using spare bridges in the same region, which is concerning.

This doesn't seem right. Jicofo should allocate evenly to all available bridges, even before the stress reaches 1. At the threshold of 0.8 it will start to offload participants to other bridges using Octo.
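
The behavior described in this reply can be modeled roughly as follows. This is a simplified sketch, not Jicofo's actual selection code: `pick_bridge` and its arguments are hypothetical, and details such as regions are ignored. Only the 0.8 threshold comes from the reply above.

```python
# Threshold mentioned in the reply above; everything else here is hypothetical.
OFFLOAD_THRESHOLD = 0.8


def pick_bridge(stress_by_bridge, conference_bridges,
                threshold=OFFLOAD_THRESHOLD):
    """Pick a bridge for a participant joining a conference.

    stress_by_bridge: dict mapping bridge name -> current stress.
    conference_bridges: set of bridges already hosting this conference.
    """
    if not stress_by_bridge:
        raise ValueError("no bridges available")
    # Keep the conference on its current bridges while they are below the threshold.
    in_use = [b for b in conference_bridges
              if b in stress_by_bridge and stress_by_bridge[b] < threshold]
    # Otherwise (new conference, or all its bridges are at or above the
    # threshold) consider every available bridge.
    candidates = in_use if in_use else list(stress_by_bridge)
    # Least stressed bridge wins; the name breaks ties deterministically.
    return min(candidates, key=lambda b: (stress_by_bridge[b], b))


bridges = {"jvb1": 0.9, "jvb2": 0.1}
print(pick_bridge(bridges, {"jvb1"}))  # conference bridge is over the threshold
```

In this model a single running conference is only split across bridges when a new participant joins while its bridge is at or above the threshold.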

@nemani
Author

nemani commented Aug 24, 2020 via email

@bgrozev
Member

bgrozev commented Aug 24, 2020

> We only had one running conference, so it will only split it when a new person joins and stress is > 0.8, right?

Yes.

> We are considering modifying the source and changing the threshold to 0.7
> (which IMO should be configurable in jicofo)

This makes it configurable: jitsi/jicofo#584

Note that you can achieve the same with the current code by configuring the max packet rate.
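
A worked example of that equivalence, assuming stress is computed as packet rate divided by the configured max packet rate (as in the Grafana formula earlier in the thread). The function name is hypothetical, and 50800 is the divisor from that formula.

```python
def equivalent_max_packet_rate(current_max, fixed_threshold, desired_threshold):
    """Max packet rate that makes the fixed threshold trigger at the packet
    rate where the desired threshold would trigger with the current max."""
    if fixed_threshold <= 0:
        raise ValueError("fixed_threshold must be positive")
    # Offloading starts when rate / max > threshold, i.e. rate > threshold * max.
    trigger_rate = desired_threshold * current_max
    return trigger_rate / fixed_threshold


# Emulate a 0.7 threshold while the code still uses 0.8.
new_max = equivalent_max_packet_rate(50800, fixed_threshold=0.8,
                                     desired_threshold=0.7)
print(f"configure a max packet rate of about {new_max:.0f}")
```

With a lower max packet rate the same packet rate yields a higher stress value, so the unchanged 0.8 threshold is reached earlier.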

@nemani
Author

nemani commented Aug 24, 2020 via email

@bgrozev
Member

bgrozev commented Aug 24, 2020

It is arbitrary: we set it to match the machines we currently use (the longer-term plan is to have the bridges calculate that themselves). I suggest you experiment with the machines you use by gradually adding conferences until the system load approaches the number of processors, then note the packet rate.
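
One way to record that experiment, as a minimal sketch: sample the load average and the packet rate while adding conferences, then take the packet rate at the first sample where the load reaches the processor count. The function names and the sample data are hypothetical, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def max_packet_rate_from_samples(samples, cpu_count):
    """Return the packet rate of the first sample whose load average reaches
    cpu_count, or None if the load never got that high.

    samples: list of (load_average, packet_rate) tuples in measurement order.
    """
    for load, rate in samples:
        if load >= cpu_count:
            return rate
    return None


def current_load_per_cpu():
    """1-minute load average divided by the number of processors (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    return load_1min / (os.cpu_count() or 1)


# Hypothetical measurements taken while gradually adding conferences
# on a 16-processor machine.
samples = [(2.0, 10000), (8.5, 30000), (16.2, 52000), (20.0, 60000)]
print(max_packet_rate_from_samples(samples, cpu_count=16))
```

The value found this way is a candidate for the max packet rate setting on that machine type.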

@nemani
Author

nemani commented Sep 8, 2020

Is there a script I can look at? It would help a lot. I get the general gist, but how exactly does packet rate map to system load?

Also I noticed the UDP buffer sizes here
Did you guys experiment with larger buffers? Again the numbers seem to be arbitrary.

Thanks once again for taking the time to reply, and for working on making jitsi better for all.

@bgrozev
Member

bgrozev commented Sep 8, 2020

You can use this script to generate load (using a selenium grid): https://github.com/jitsi/jitsi-meet-torture/blob/master/scripts/malleus.sh

The socket buffer size is set sufficiently large to reduce dropped packets on our machines (AWS c5.4xlarge or similar) under heavy load. Unless you see packets dropped in the kernel (see the output of netstat -naus), you don't need to increase it.
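
For monitoring, the same check can be scripted. The sketch below assumes a Linux host, where the kernel exposes UDP counters in /proc/net/snmp (to my understanding these are the counters netstat summarizes). The function names and the sample text are hypothetical.

```python
def parse_udp_counters(snmp_text):
    """Parse the 'Udp:' header/value line pair from /proc/net/snmp-style text
    into a dict mapping counter name -> integer value."""
    rows = [line.split()[1:] for line in snmp_text.splitlines()
            if line.startswith("Udp:")]
    if len(rows) < 2:
        raise ValueError("no Udp counters found")
    names, values = rows[0], rows[1]
    return dict(zip(names, (int(v) for v in values)))


def read_udp_counters(path="/proc/net/snmp"):
    """Read and parse the UDP counters from the given file (Linux only)."""
    with open(path) as f:
        return parse_udp_counters(f.read())


# Hypothetical excerpt in the /proc/net/snmp layout.
sample = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "Udp: 100 2 5 90 5 0\n"
    "UdpLite: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "UdpLite: 0 0 0 0 0 0\n"
)
counters = parse_udp_counters(sample)
print(counters["RcvbufErrors"])
```

A RcvbufErrors value that keeps growing under load suggests the receive buffer is too small; if it stays flat, the current size is likely sufficient.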

@nemani
Author

nemani commented Sep 24, 2020

Closing this issue as vidstreams has been removed and jvb now calculates stress.

@nemani nemani closed this as completed Sep 24, 2020