Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve sending of immediate orders #16052

Merged
merged 2 commits into from Apr 22, 2019

Conversation

@jrb0001
Copy link
Contributor

commented Jan 11, 2019

If immediate orders are attached to a tick, they are completely ignored by the server. This is probably not the intended behaviour for server orders but I also can't see any reason to do that for any other type of immediate order.

The second commit makes sure that every immediate order ends up with its own framing. There is no reason to pack multiple immediate orders into a single framing and this change allows the server to process huge groups in pieces without reading everything at once.

@pchote

This comment has been minimized.

Copy link
Member

commented Mar 10, 2019

Can you please outline a testcase that demonstrates the issue and fix?

@pchote

This comment has been minimized.

Copy link
Member

commented Mar 10, 2019

From the IRC logs:

[20:29:00] | <jrb0001> pchote: but I found the real issue now. those pong orders can sometimes be attached to a frame and thus replace the client orders with a server order. this caused the freeze because my implementation doesn't forward server orders from client to client (why should it? it's called "server order" and not "client order" for a reason, I assumed).

[21:46:42] | <jrb0001> no, the server doesn't tell the clients anything like that except for the StartGame ServerOrder. the clients wait themselves until they have one order from each client before running a tick.
[21:48:12] | <pchote> this is the whole job of the server
[21:48:19] | <pchote> it collects orders from clients, and then forwards them on
[21:48:35] | <jrb0001> or if it does, then it isn't needed. my implementation doesn't forwards orders as soon as they arrive and because it is a distributed system, there isn't even a global ordering.

Can you please provide a link to your server source so we can understand this in more detail?

@jrb0001

This comment has been minimized.

Copy link
Contributor Author

commented Mar 10, 2019

The server sends PingOrders in the lobby but also while playing. But because the official server doesn't do anything if it doesn't receive the PongOrders while playing (it actually receives them but ignores them due to another bug/strange assumption), you will never notice it. My implementation detects server orders slightly differently which caused it to detect the server order PDU with client orders in it as a regular server orders and thus didn't forward anything to the clients. This caused the game to freeze usually at tick 9 (sometimes around tick 100).

There are two differences in between server orders and client orders besides the name (some string) and the actual payload. The type field contains 0xFE for a server order PDU and 0xFF for a client order PDU (0xBF would be a disconnect PDU and 0x65 would be a sync PDU). If the type field is missing (see length field), then it is an empty client order PDU. There is also a tick field which is needed for client orders, disconnect and sync PDUs. It is currently always set to 0 when sending a server order (not really relevant for chat messages and stuff like that).

If server and client orders can end up in the same PDU (like they do at the moment), then the type field is obviously meaningless. The first commit splits immediate orders and regular orders into separate PDUs and the second commit splits all immediate orders into separate PDUs. The former is needed to avoid attaching immediate orders to a tick. The latter should make it robust in case some new type of "immediate client order" gets introduced in the future and also removes a very rare corner case from the protocol completely so that servers don't have to handle it.

My git server isn't public but I will try to attach a dump of the current code. But be warned that it is overly complex because it is designed to scale across multiple servers (mostly to reduce latency) and also partly because I used it to learn a new framework. Due to the way how the master server protocol works, the scaling requires anycasting which would be fine for UDP and short TCP connections, but isn't that great for OpenRA. This caused issues for some of the players where their ISP most likely did some sort of ECMP which is why I will drop the scaling functionality in the next rewrite/overhaul. It has some additional checks exactly for this bug and disconnect you with a link to this bug if you run into it (usually tick 9 or around tick 100) so you can use it as a testcase, if you want.

ora-server.tar.gz

@pchote

This comment has been minimized.

Copy link
Member

commented Mar 22, 2019

Thanks for the description and the code. I haven't forgotten about this, I just haven't had the time to dig into both your and OpenRA's code to start to understand this.

@jrb0001

This comment has been minimized.

Copy link
Contributor Author

commented Mar 22, 2019

This has been in AS (and thus SP and RV) for two months now and I didn't get a single complaint. That should hopefully solve all regression testing needs unless you have some very specific cases in mind.

@jrb0001 jrb0001 force-pushed the jrb0001:immediate-fix branch 2 times, most recently from ecd21b0 to 7a6cafd Apr 15, 2019

@pchote
Copy link
Member

left a comment

LGTM the OrderManager changes are a clear fix/win, and the order framing isn't likely to hurt anything with the default server in practice.

@pchote pchote added the PR: Needs +2 label Apr 15, 2019

@abcdefg30
Copy link
Member

left a comment

Looks good to me otherwise.

foreach (var o in orders)
{
var ms = new MemoryStream();

This comment has been minimized.

Copy link
@abcdefg30

abcdefg30 Apr 18, 2019

Member

I just wondered why we're not disposing this. I suppose the GC will collect that sooner or later, but since we are creating multiple streams now, it might make sense to warp this in a using statement. (Or doesn't it?)

This comment has been minimized.

Copy link
@pchote

pchote Apr 19, 2019

Member

MemoryStream's dont use any unmanaged resources, so in practice this shouldn't matter either way.

This is just moving existing code inside a block, so IMO shouldn't be a blocker.

This comment has been minimized.

Copy link
@jrb0001

jrb0001 Apr 22, 2019

Author Contributor

I don't mind doing the change, but it should be consistent across the whole codebase. Either all 63 usages of MemoryStream should be wrapped in using or none in my opinion.

This comment has been minimized.

Copy link
@reaperrr

reaperrr Apr 22, 2019

Contributor

I agree, and that's a bit out of scope for this PR and could be done in a follow-up, if we decide to go for it.

@reaperrr

This comment has been minimized.

Copy link
Contributor

commented Apr 22, 2019

Needs a rebase.

@jrb0001 jrb0001 force-pushed the jrb0001:immediate-fix branch from 7a6cafd to bee20bd Apr 22, 2019

@jrb0001

This comment has been minimized.

Copy link
Contributor Author

commented Apr 22, 2019

Rebased.

@reaperrr reaperrr merged commit db487e1 into OpenRA:bleed Apr 22, 2019

2 checks passed

continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.