Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Networking bug fix: Reset receive buffer on error. #1478

Merged
merged 1 commit into from Feb 23, 2016

Conversation

jason-bragg
Copy link
Contributor

Client networking was getting in a bad state in rare cases where there was a networking error after a partial message had been read into the receive buffer.

This change refactors some of the buffer grow logic to help reduce partial messages being left in the buffer, as well as properly resets the buffer when socket errors occur.

@sergeybykov
Copy link
Contributor

@jason-bragg I find it hard to review this change in the absence of a narrative explaining the exact failure case and the remedy, even though you already verbally explained it to me. I think a concise synopsis would help me wrap my head around it.

@gabikliot
Copy link
Contributor

Agree with @sergeybykov . Especially given that "even though you already verbally explained it to me" applies only to @sergeybykov , it makes it even harder for other mortals.

@jason-bragg
Copy link
Contributor Author

When performing a network read for a client, we read from a socket into a set of buffers (collectively called the receive buffer). We then process all of the messages out of the receive buffer. If the receive buffer is completely filled during a socket read, there will likely be a partial message left in the receive buffer after processing. The next socket read will read the rest of the message into the receive buffer so it can be processed. In this read loop, if there is a socket error, we reconnect and continue the loop.

If there was a partial message in the receive buffer when a socket error occured, the next socket read would read a new message from the new socket into the receive buffer after the partial message that was still there. This would lead to a corruption of the buffer, which could manifest itself in a couple different ways. Since this buffer was never being reset, under any circumstances, this would result in a purminate loss of communication, until the client was restarted.

To address this, we do two things.

  • Grow the read buffer (up to a max size) any time a network read completely fills it. This is an optimization and a mitigation of this issue, as we should only see partial messages when there is insufficient space in the buffer. (See GrowBuffer, and AdjustBuffer)
  • Reset the buffer on error. This is already being done on the silo networking, it was just missed on the client side. (see new buffer.Reset() calls in the error handling)

@@ -43,6 +39,12 @@ public IncomingMessageBuffer(TraceLogger logger, bool supportForwarding = false,
bodyLength = 0;
}

public List<ArraySegment<byte>> ReceiveBuffer()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping the same name as the property for the method is a little misleading, as this is not receiving a buffer. Maybe GetReceiveBuffer() or BuildReceiveBuffer()

@@ -58,13 +58,14 @@ protected void RunNonBatch()
}
catch (Exception ex)
{
buffer.Reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this usage, shouldn't the Reset() be returning all the buffers to the pool, in case it's not used ever again?

@jdom
Copy link
Member

jdom commented Feb 20, 2016

Code looks great and ready to merge.
Other than possibly returning all buffers to the pool on Reset (but making sure that the next read, if there is any, grows the buffer back). That could even be tackled as a separate PR, if we want, as you explained to me that these buffers won't be completely leaked anyways and be garbage collected (while the buffer is taken, the pool is not referencing it).

return true;
}

void AdjustBuffer()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would certainly benefit from some high-level explanation of what it is doing. Also add private modifier to method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also to increase readability, in line 153 (of the original version) have the branching logic to how to calculate backfillBytes, but have the call to readBuffer.AddRange only once. And update the comment to: "restore buffer up to DEFAULT_MAX_SUSTAINED_RECEIVE_BUFFER_SIZE" (instead of shrink, which is correct from the high level perspective of the method, but at this step in particular you are still expanding/restoring it)

jdom added a commit that referenced this pull request Feb 23, 2016
Networking bug fix:  Reset receive buffer on error.
@jdom jdom merged commit 7c8b2c7 into dotnet:master Feb 23, 2016
@github-actions github-actions bot locked and limited conversation to collaborators Dec 13, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants