Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace server Select loop with individual client threads. #19429

Merged
merged 4 commits into from
May 30, 2021

Conversation

pchote
Copy link
Member

@pchote pchote commented May 25, 2021

Closes #19334 and the other issues where clients dropping uncleanly can freeze the server.

This PR replaces #19338 with a full reimplementation that splits the centralized Socket management out to individual threads for each socket. The first three commits are moving things around to improve the encapsulation of Connection state with (hopefully) no real logic changes. The last commit contains the real changes and is best viewed with 10ccfc1?w=1.

@hz-ad has been running a slightly earlier version of this branch rebased onto release-20210321 for the last couple of days, and has reported that the performance issues from #19338 have been resolved and that players have not reported any new issues.

@teinarss
Copy link
Contributor

Looks good.

Copy link
Contributor

@reaperrr reaperrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good as far as I'm able to judge it, considering the approval and testing it already got: lgtm.

@reaperrr reaperrr merged commit 2a26ddc into OpenRA:bleed May 30, 2021
@jrb0001
Copy link
Contributor

jrb0001 commented May 30, 2021

@pchote I am not sure if this was a good idea. Now you replaced the DoS vulnerability of #19334 with another one: Each connection spawns a new thread so opening a few thousands connections (very cheap to do...) will crash the main thread (probably by hitting the nproc ulimit on linux). I don't know if windows has a similar protection or if it will crash the whole system / make it unstable.

@pchote
Copy link
Member Author

pchote commented May 30, 2021

That's an interesting point, but I do feel that fixing the sporadic non-exploit related server stalls makes this worthwhile.

We can follow this up with changes to limit the number of connections allowed to a server, and to prune clients that do not complete a handshake within a reasonable timeout.

@anvilvapre
Copy link
Contributor

Please also see the first line of #19338 (comment). I would like to propose to still also set Socket.SendTimeout and Socket.ReceiveTimeout

Back then i created this wrapper around a synchronous socket that uses async inside https://github.com/anvilvapre/OpenRA-1/blob/20210413_socket_timeout/OpenRA.Game/Server/SocketTimeoutable.cs so that minimal code changes were needed. Untested. As an example that no full code rewrites would be needed to use async.

On a side note. In general I think Server.cs could do with a full refactor, after the open pull requests either are merged or closed. Too much functionality in once file that maintains a state machine to do the right thing at the right time that also mixes in some threads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential server-side DoS vulnerability related to blocking sends
5 participants