Improve async performance #164

Closed
bgrainger opened this issue Jan 24, 2017 · 5 comments

@bgrainger
Member

bgrainger commented Jan 24, 2017

I have a (non-public) test case that executes two bulk read queries (in a batch): one that returns 31K rows of int,string,int,int and a second that returns 37K rows of int,int,int,int. The remote server is MySQL 5.6 on Linux, about 10ms away from my test box.

On full .NET 4.6.2 on Windows 10, the sync code path uses less than one core and completes in 4s; the async code path uses over 300% CPU and takes 20s wall time.

sync:
real   4.043s
total  1.922s
user   1.438s
sys    0.484s

async:
real   20.840s
total  68.422s
user   47.531s
sys    20.891s

With dotnet, the results are even worse:

sync:
real   4.336s
total  1.953s
user   1.469s
sys    0.484s

async:
real   24.468s
total  85.703s
user   59.219s
sys    26.484s
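
For reference, the "less than one core" and "over 300% CPU" figures follow from dividing CPU time (user + sys) by wall time. The snippet below is only a quick arithmetic check in Python; the dictionary labels and the `cores_used` helper are made up for illustration, and the numbers are copied from the timings above.

```python
# Timings copied from the measurements above: (real, user, sys) in seconds.
timings = {
    ".NET 4.6.2 sync": (4.043, 1.438, 0.484),
    ".NET 4.6.2 async": (20.840, 47.531, 20.891),
    "dotnet sync": (4.336, 1.469, 0.484),
    "dotnet async": (24.468, 59.219, 26.484),
}


def cores_used(real, user, system):
    """Average number of cores kept busy: CPU time divided by wall time."""
    return (user + system) / real


utilization = {name: cores_used(*t) for name, t in timings.items()}

for name, cores in utilization.items():
    print(f"{name}: {cores:.2f} cores")
```

Both sync runs average under half a core, while the async runs average roughly 3.3 and 3.5 cores for the same work.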

Some investigation with dotTrace and dotPeek on the full .NET Framework shows that Socket.ReceiveAsync (used in the async code path) takes 2s of CPU time while Socket.Receive (which blocks until the data is received) only takes 1.5s. Moreover, Socket.ReceiveAsync appears to only return false (i.e., receive completed synchronously) on failure; a successful result always completes asynchronously, meaning that we suffer the penalty of a callback being queued to the threadpool on top of the extra CPU time just to initiate the operation!

Additionally, network bandwidth (shown in Task Manager) is noticeably lower for the async case than the sync case.

We may need to find some way to get the benefits of asynchronous I/O without the overhead of Socket.ReceiveAsync, perhaps by #163 or by writing custom P/Invoke for the full .NET Framework on Windows.

@bgrainger
Member Author

bgrainger commented Jan 24, 2017

This should now be greatly improved for connections that use SocketByteHandler, i.e., non-SSL connections. A fix would have to be pushed to corefx to improve NetworkStream and StreamByteHandler.

@bgrainger
Member Author

bgrainger commented Jan 24, 2017

Fixed in 0.11.3.

@caleblloyd
Contributor

caleblloyd commented Jan 25, 2017

I've done some more thinking on this issue, and I don't think that the issue is necessarily .NET anymore. Look at how many times we are calling into the OS network buffer in SocketByteHandler.ReadBytesAsync for a simple SELECT of 10 blog posts:

SELECT `Id`, `Title`, `Content` FROM `BlogPost` ORDER BY `Id` DESC LIMIT @limit;
Write 81 bytes
Read 4 bytes
Read 1 bytes
Read 4 bytes
Read 53 bytes
Read 4 bytes
Read 59 bytes
Read 4 bytes
Read 63 bytes
Read 4 bytes
Read 5 bytes
Read 4 bytes
Read 16 bytes
Read 4 bytes
Read 16 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 15 bytes
Read 4 bytes
Read 5 bytes
Async: Read 10 records in 00:00:00.0056040

The entire result set is delivered in a single TCP Packet (select-10-blog-posts-pcap.zip), but we read it from the OS Socket Buffer column by column. I believe that every call to Read here is a syscall into Kernel Space to fetch the bytes.
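
To put numbers on the log above, here is a small tally in Python (an illustration only, not code from this project). It assumes each "Read 4 bytes" line is a MySQL packet header (3-byte length plus 1-byte sequence number) and the line after it is that packet's payload; the variable names are invented for the example.

```python
# Payload sizes following each 4-byte header read in the log above.
payload_reads = [1, 53, 59, 63, 5, 16, 16] + [15] * 8 + [5]
HEADER_SIZE = 4  # each "Read 4 bytes" line in the log

# One read for the header plus one read for the payload of every packet.
read_calls = 2 * len(payload_reads)
total_bytes = sum(HEADER_SIZE + n for n in payload_reads)

print(f"{read_calls} reads for {total_bytes} bytes")
```

That works out to 32 separate reads for 402 bytes, which is consistent with the whole response fitting in one TCP packet.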

The solution implemented uses a Synchronous syscall to read the socket if the data is ready, instead of an Async syscall that would have to wait for an interrupt before the data was delivered.
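
The "read synchronously if data is ready, otherwise wait asynchronously" idea can be sketched in a language-neutral way. The Python below is not the SocketByteHandler code; `read_available_or_wait` and `demo` are hypothetical names, and the local socket pair stands in for a real server connection.

```python
import asyncio
import socket


async def read_available_or_wait(sock, max_bytes):
    """Try a non-blocking read first; use the event loop only if no data is ready.

    Returns (data, completed_synchronously).
    """
    try:
        # Fast path: the OS already has bytes buffered for this socket.
        return sock.recv(max_bytes), True
    except BlockingIOError:
        # Slow path: nothing buffered yet, so wait without blocking a thread.
        loop = asyncio.get_running_loop()
        return await loop.sock_recv(sock, max_bytes), False


async def demo():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        b.send(b"ready")  # data is buffered before we read
        first = await read_available_or_wait(a, 16)

        # Nothing is buffered now; the peer sends 50 ms later.
        asyncio.get_running_loop().call_later(0.05, b.send, b"later")
        second = await read_available_or_wait(a, 16)
        return first, second
    finally:
        a.close()
        b.close()


result = asyncio.run(demo())
print(result)
```

The first read takes the fast path because the bytes are already waiting; only the second read pays for an asynchronous wait.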

I think that the right solution would be to attempt to buffer all of m_buffer.Length bytes from the OS Socket Buffer, and then return them to the caller piece-by-piece. This would keep us in Userspace a lot more and may even free up the OS Socket Buffer and mitigate retransmissions noticed in #117.
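
A minimal sketch of that buffering idea, written in Python for illustration rather than as a patch. The `BufferedReader` class here is hypothetical (it is not the project's BufferedByteReader), `read_some` stands in for the socket read, the 16384-byte buffer size is an assumption, and the demo replays the 16 packets from the log with zero-filled payloads.

```python
import io


class BufferedReader:
    """Serves small reads from a userspace buffer, refilling with one large read."""

    def __init__(self, read_some, buffer_size=16384):
        self._read_some = read_some    # hypothetical callable: read_some(n) -> up to n bytes
        self._buffer_size = buffer_size
        self._buffer = b""
        self._offset = 0
        self.underlying_reads = 0      # calls that would reach the OS socket buffer

    def read_exactly(self, count):
        out = bytearray()
        while len(out) < count:
            if self._offset == len(self._buffer):
                # Buffer exhausted: pull as much as is available in one call.
                self._buffer = self._read_some(self._buffer_size)
                self._offset = 0
                self.underlying_reads += 1
                if not self._buffer:
                    raise EOFError("connection closed mid-packet")
            take = min(count - len(out), len(self._buffer) - self._offset)
            out += self._buffer[self._offset:self._offset + take]
            self._offset += take
        return bytes(out)


def read_packets(reader, count):
    """Read `count` packets framed as a 3-byte little-endian length plus a sequence byte."""
    sizes = []
    for _ in range(count):
        header = reader.read_exactly(4)
        length = int.from_bytes(header[:3], "little")
        reader.read_exactly(length)
        sizes.append(length)
    return sizes


# Rebuild a byte stream shaped like the logged response (payload contents are dummy zeros).
payload_sizes = [1, 53, 59, 63, 5, 16, 16] + [15] * 8 + [5]
stream = b"".join(
    n.to_bytes(3, "little") + bytes([seq]) + bytes(n)
    for seq, n in enumerate(payload_sizes)
)

reader = BufferedReader(io.BytesIO(stream).read)
sizes = read_packets(reader, len(payload_sizes))
print(reader.underlying_reads, "underlying read(s) for", len(sizes), "packets")
```

In this simulation the 32 small reads are all served from a single underlying read, because the whole response fits in the buffer.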

caleblloyd reopened this Jan 25, 2017

@bgrainger
Member Author

bgrainger commented Jan 25, 2017

BufferedByteReader may be useful here; I'll plan to take a look later.

@caleblloyd
Contributor

caleblloyd commented Jan 25, 2017

I'm hacking on it as we speak; I'm using BufferedByteReader 😄
