Enable -maxSessions to be changed without requiring restart #1935

Closed
chrishobcroft opened this issue Jun 28, 2021 · 8 comments · Fixed by #2781

Comments

@chrishobcroft
Contributor

Is your feature request related to a problem? Please describe.

As the Livepeer network grows, active Orchestrators will likely wish to learn how to use -maxSessions to their own benefit, and to the benefit of the network.

Currently, it is not possible to change -maxSessions without restarting the software with a different value for the -maxSessions parameter in the startup command.

This presents a problem: a successful Orchestrator who is, for example, orchestrating 8 concurrent streams will need to restart their node in order to raise the limit.

This will result in a poor user experience for anyone watching the streams being transcoded by that Orchestrator when the process is restarted.

Describe the solution you'd like

First, to show the -maxSessions parameter value in the NODE STATS for an Orchestrator in livepeer_cli.

Secondly, to provide an option in the livepeer_cli for an Orchestrator to be able to change this without restarting their node.
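
To illustrate the two requests above, here is a minimal Go sketch of a session limit that can be read (for a stats display) and changed while the process keeps running. `SessionLimiter` and its methods are hypothetical names invented for this example; they are not go-livepeer's actual types, and the wiring into livepeer_cli is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAtCapacity is returned when no further sessions can be accepted.
var ErrAtCapacity = errors.New("session limit reached")

// SessionLimiter is a hypothetical type for illustration only.
type SessionLimiter struct {
	mu     sync.Mutex
	limit  int
	active int
}

func NewSessionLimiter(limit int) *SessionLimiter {
	return &SessionLimiter{limit: limit}
}

// SetLimit changes the limit at runtime. Lowering it below the current
// load does not drop running sessions; it only blocks new ones until
// enough of them have finished.
func (l *SessionLimiter) SetLimit(limit int) error {
	if limit < 0 {
		return errors.New("session limit must not be negative")
	}
	l.mu.Lock()
	defer l.mu.Unlock()
	l.limit = limit
	return nil
}

// Acquire reserves one session slot, or fails if the node is full.
func (l *SessionLimiter) Acquire() error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active >= l.limit {
		return ErrAtCapacity
	}
	l.active++
	return nil
}

// Release frees one session slot.
func (l *SessionLimiter) Release() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active > 0 {
		l.active--
	}
}

// Stats reports the current load and limit, e.g. for a node stats view.
func (l *SessionLimiter) Stats() (active, limit int) {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.active, l.limit
}

func main() {
	l := NewSessionLimiter(8)
	for i := 0; i < 8; i++ {
		_ = l.Acquire()
	}
	fmt.Println("ninth session:", l.Acquire())
	_ = l.SetLimit(16) // raised without a restart
	fmt.Println("after raising the limit:", l.Acquire())
	active, limit := l.Stats()
	fmt.Printf("active=%d limit=%d\n", active, limit)
}
```

The mutex keeps the limit and the active count consistent, so a change made through a CLI handler would take effect on the next `Acquire` call without interrupting streams already in flight.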

Describe alternatives you've considered

To continue with the process as is, and recommend Orchestrators wait until they are processing 0 streams before restarting their node to change this parameter. This doesn't feel like a very elegant solution though.

@Titan-Node

This would be really helpful for the scalability of Livepeer. As growth continues, Orchestrators will need to add resources and raise -maxSessions seamlessly. Jumping from 10 to 50 and again to 100 -maxSessions can happen quickly as the network gets busier.

This is also essential for testing. It would be really nice to dial -maxSessions up or down by a few in order to find the maximum bandwidth and encoding capability without having to shut down.

@chrishobcroft
Contributor Author

A further thought on this: allowing an O to "turn down the volume" would help avoid discontinuities in overall network services, in circumstances such as the one described in #1959.

@kyriediculous
Contributor

kyriediculous commented Sep 7, 2021

The session limit can be benchmarked and determined beforehand. It should be set to the maximum you want and should not be a dynamic value based on the load you expect. With 100 max sessions and no load, you use the same amount of resources as with 1 max session. Furthermore, this can lead to adverse effects if you mistakenly lower the session limit below the current load.

I think the orchestrator session limit should be removed anyhow and should be determined based on the connected transcoders
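
One way to picture this suggestion is a limit that is simply the sum of what the connected transcoders report. The sketch below is an assumption-laden illustration: `TranscoderPool`, the string IDs, and the per-transcoder capacity numbers are hypothetical and do not reflect go-livepeer's real data structures.

```go
package main

import (
	"fmt"
	"sync"
)

// TranscoderPool is a hypothetical registry of connected transcoders.
type TranscoderPool struct {
	mu       sync.Mutex
	capacity map[string]int // transcoder ID -> sessions it says it can handle
}

func NewTranscoderPool() *TranscoderPool {
	return &TranscoderPool{capacity: make(map[string]int)}
}

// Connect registers a transcoder, or updates the capacity of a known one.
func (p *TranscoderPool) Connect(id string, capacity int) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.capacity[id] = capacity
}

// Disconnect removes a transcoder and, with it, its share of the limit.
func (p *TranscoderPool) Disconnect(id string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	delete(p.capacity, id)
}

// SessionLimit derives the orchestrator-wide limit from the pool,
// so no fixed value has to be chosen at startup.
func (p *TranscoderPool) SessionLimit() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	total := 0
	for _, c := range p.capacity {
		total += c
	}
	return total
}

func main() {
	pool := NewTranscoderPool()
	pool.Connect("t1", 5)
	pool.Connect("t2", 3)
	fmt.Println("limit with two transcoders:", pool.SessionLimit())
	pool.Disconnect("t1")
	fmt.Println("limit after t1 leaves:", pool.SessionLimit())
}
```

With this shape, the limit follows transcoders as they join and leave, which is the behavior the comment asks for; whether capacities should be summed or weighted is a design choice the thread does not settle.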

@jailuthra
Contributor

jailuthra commented Sep 7, 2021

Another thing to note here is that the concept of a "transcoding session" doesn't actually capture the underlying resource utilization very well, and is more of a stop-gap measure.

Different users of the network and the .com gateway configure streams with different numbers of renditions, input/output bitrates, resolutions, and codec profiles (and soon optional AI tasks), none of which is captured by the current one-shot benchmarking process.

A much better solution would be for each O/T node to keep tabs on segment-wise stats dynamically. Whenever some threshold number of segments is being transcoded at just under or just above realtime, the node should hold off on accepting new streams until a few of them finish.

I haven't put much thought into the implementation details around this, but theoretically it makes sense, as each node has enough data (i.e. segment length and end-to-end transcode time on the node) to know how well it is performing on the fly.

@Titan-Node

> I think the orchestrator session limit should be removed anyhow and should be determined based on the connected transcoders

Can we please implement this?

This would fix a massive pool problem I am running into.

Transcoders come and go every day and I do not have the ability to adjust the -maxSessions limit on my Os. They are arbitrarily set to 2,000 just so I don't have to shut down the nodes when large amounts of work/Transcoders join.

@Titan-Node

Just to follow up on this issue: I've been running @eliteprox's implementation of the auto setting for -maxSessions. It has worked smoothly for over 3 months. Highly anticipated update. Thanks.

@leszko
Contributor

leszko commented Aug 30, 2023

> Just to follow up on this issue: I've been running @eliteprox's implementation of the auto setting for -maxSessions. It has worked smoothly for over 3 months. Highly anticipated update. Thanks.

I think I reviewed the PR, so I'm waiting for my comments to be addressed.

@eliteprox
Contributor

Updated the PR per the review comments, completed the tests, and reverted to the -maxSessions auto behavior with the original default session limit of 10.
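
As a rough illustration of what an "auto" setting alongside a numeric default could look like, here is a small Go sketch. It assumes the value is either the literal string `auto` or a positive integer, and that an empty value falls back to 10; `ParseMaxSessions` is a hypothetical helper, and the parsing in the actual PR may well differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// defaultMaxSessions mirrors the default of 10 mentioned in the thread.
const defaultMaxSessions = 10

// ParseMaxSessions is a hypothetical helper. It returns the numeric
// limit, whether automatic mode was requested, and any parse error.
// In automatic mode the returned limit is 0 and should be ignored.
func ParseMaxSessions(value string) (int, bool, error) {
	switch value {
	case "":
		return defaultMaxSessions, false, nil // assumed fallback
	case "auto":
		return 0, true, nil
	}
	n, err := strconv.Atoi(value)
	if err != nil {
		return 0, false, fmt.Errorf("invalid max sessions %q: %w", value, err)
	}
	if n <= 0 {
		return 0, false, fmt.Errorf("max sessions must be positive, got %d", n)
	}
	return n, false, nil
}

func main() {
	for _, v := range []string{"", "auto", "50", "abc"} {
		limit, auto, err := ParseMaxSessions(v)
		fmt.Printf("%q -> limit=%d auto=%v err=%v\n", v, limit, auto, err)
	}
}
```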
