Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a crash with BlockingCollection in Connection #20057

Merged
merged 1 commit into from
Sep 4, 2022

Conversation

penev92
Copy link
Member

@penev92 penev92 commented Jun 4, 2022

The BlockingCollection would have IsAddingCompleted to true, but IsComplete to false, slipping through the cracks and causing an InvalidOperationException ("The collection has been marked as complete with regards to additions.") when trying to add to it.

System.InvalidOperationException
  HResult=0x80131509
  Message=The collection has been marked as complete with regards to additions.
  Source=System.Collections.Concurrent
  StackTrace:
   at System.Collections.Concurrent.BlockingCollection`1.TryAddWithNoTimeValidation(T item, Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Collections.Concurrent.BlockingCollection`1.Add(T item)
   at OpenRA.Server.Connection.SendReceiveLoop(Object s) in D:\Work.Personal\OpenRA\OpenRA\OpenRA.Game\Server\Connection.cs:line 158
   at System.Threading.Thread.StartCallback()

image

TO TEST:
Launch the game, open Skirmish, open the map browser, click OK without selecting a map and then click Start Game.
In 95% of times you get this dialog:
image
(which I guess is a separate bug that should be addressed)
But sometimes the game will crash with the above.
I suggest also adding

	if (sendQueue.IsAddingCompleted && lastPingSent.ElapsedMilliseconds > 1000)
	{
		var a = 1;
	}

with a breakpoint just below the changed if to catch when it would have crashed if you're feeling lucky.

@penev92 penev92 added this to the Next release milestone Jun 9, 2022
@anvilvapre
Copy link
Contributor

anvilvapre commented Jun 12, 2022

BlockingCollection documentation sais that after a call to CompleteAdding the producer will never add/produce more items.

IsAddingCompleted only then will be true. The consumer will then know there will never be any more items - does not have to wait on it.

Seems CompleteAdding is only called in Dispose.

If you add this check here you may also need to add in SendData all places where you add to sendQueue.

Doesn't seem like a clean fix. The loop needs to take into account when Dispose was called.

@anvilvapre
Copy link
Contributor

Also in the new netcode, data is copied so often byte by byte.

Consider replacing CreatePingFrame with AddPingFrame that will append a const byte array ping prefix, and after adds the run time.

BlockingCollection documentation sais that after a call to CompleteAdding the producer will never add/produce more items.

IsAddingCompleted only then will be true. The consumer will then know there will never be any more items - does not have to wait on it.

Seems CompleteAdding is only called in Dispose.

If you add this check here you may also need to add in SendData all places where you add to sendQueue.

Doesn't seem like a clean fix. The loop needs to take into account when Dispose was called.

Dispose should wait for the loop to first have exited before setting CompleteAdded and returning.

@penev92
Copy link
Member Author

penev92 commented Jun 19, 2022

I'm not sure I follow.
The problem here is we've called Connection.Dispose() and thus sendQueue.CompleteAdding() while the collection wasn't empty. According to the MSDN documentation that means we can read from it, but no longer add to it, indicated by IsAddingComplete being true. Once we read from it again it becomes empty (which is when IsCompleted is set to true) we can no longer do anything with it.

That said, there is probably little-to-no point in continuing to read from it and send out packets if things are being already disposed (or maybe we'd better send out what he have in the queue - I'm not sure and that's probably pretty situational).

There are only two places where we add to the collection - the one I fixed and the SendData() method below. That one's usages inside try/catch blocks, so those are safe, meaning currently this looks (to me, at least; correct me if I'm wrong!) like the only potentially volatile place. In the spirit of being thorough/proper/safe we probably should change this from sendQueue.Add(CreatePingFrame()); to SendData(CreatePingFrame()); though, and change SendData() to

		public void SendData(byte[] data)
		{
+			if (sendQueue.IsAddingCompleted)
+				return;
+
			sendQueue.Add(data);
		}

but if we go down that path that may want to change to bool TrySendData(byte[] data) instead and not throw exceptions at all, and then its 3 usages can/should be changed accordingly (which isn't a bad thing at all, just putting it out there for a potential discussion because it turns the one-line fix into a slightly bigger fix 🤷‍♂️

@abcdefg30
Copy link
Member

Imo it would be good to not throw these exceptions at all.

@penev92 penev92 force-pushed the fixBlockingCollectionCrash branch from ac18d04 to 2660af5 Compare June 20, 2022 07:52
@penev92
Copy link
Member Author

penev92 commented Jun 20, 2022

OK, went the TrySendData route.

@PunkPun
Copy link
Member

PunkPun commented Aug 22, 2022

Considering #20198 was merged which was the primary way (to my knowledge the only way) of triggering this bug I'm removing this from Next Release milestone

@PunkPun PunkPun removed this from the Next release milestone Aug 22, 2022
@penev92
Copy link
Member Author

penev92 commented Aug 22, 2022

We need to reevaluate if this is still relevant.

@dragunoff
Copy link
Contributor

I randomly got this crash, here's what I did:

  1. Start a local MP server.
  2. Join as a spectator with another client.
  3. Start game and try to kick the spectator (confirming by pressing "Kick" in the dialog).

@penev92 penev92 added this to the Next release milestone Sep 1, 2022
The BlockingCollection would have `IsAddingCompleted` to true, but `IsComplete` to false, slipping through the cracks and causing an InvalidOperationException ("The collection has been marked as complete with regards to additions.") when trying to add to it.
We now add a check on `(Try)SendData` to only try to add if we can. The collection is still viable for reading until empty/`IsComplete`.
@penev92
Copy link
Member Author

penev92 commented Sep 3, 2022

I verified that that does make it crash rather reliably (on bleed). Not on the first kick, but if you try kicking several times it eventually crashes.
I retested with this PR. I couldn't make it crash, so ... that's the best I've got 😄
I also incorporated @RoosterDragon's idea of a try/catch for good measure.

@PunkPun PunkPun merged commit 216029d into OpenRA:bleed Sep 4, 2022
@penev92 penev92 deleted the fixBlockingCollectionCrash branch September 4, 2022 10:07
@PunkPun
Copy link
Member

PunkPun commented Sep 4, 2022

Changelog

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants