windows: use SetFileCompletionNotificationModes for TCP sockets (#476) #520

steffengy · 2017-01-19T16:20:31Z

SetFileCompletionNotificationModes allows us to tell the kernel not to enqueue the completion message, if it returns `SUCCESS` immediately.

This removes the CPU-overhead of having to enter the event loop to handle this completion message.

Less memory overhead

You do not need as much internal (or external) buffering,
which can be quite significant, especially in a context where the event loop is only run after buffering the content in an external buffer (up to 3x less, 512 MB insteadof 1.5GB for my case).
In that regard it also behaves similar to the POSIX (epoll) implementation:
You should be able to repeatedly write to the underlying IO, which will still allow it to progress, without requiring any event loop interaction (similar as epoll, which returns eagain, miow returns true).

general issues, some of them already mentioned in the original issue:

The API is not supported on Windows XP (unsure it this needs additional attention?)
We might want to check for non-IFS LSPs , since if one of them is installed, SetFileCompletionNotificationModes will not work properly if such is installed.
Golang does perform this check, while libuv doesn't seem to.
(This is probably because non-IFS LSPs are deprecated for ages)

some issues or rather hurdles using it with UDP-sockets:

There seems to be a bug in the underlying WSARecv implementation (which according to this blogpost which is a summary of the issue and links to the original MSDN post, where Microsoft confirmed the issue in the comment section.
Golang disables UDP because of this, while LIBUV uses a (slightly limited) workaround.
.NET Core disables it for <= Windows 7.
Also the MIO api seems slightly different, which should not be any issue, but for the sake of completeness I still list it here.

This is the main reason this implements this for TCP sockets first.

nayato · 2017-01-19T20:25:47Z

@steffengy FYI, recently merged in .NET Core: https://github.com/dotnet/corefx/pull/15141/files#diff-6c4b087518282ebcba0a74cb9bd073c3R22 (search for CheckIfPlatformHasUdpIssue)
The check is being done for Windows version if socket type is UDP.

alexcrichton

Looks good to me, thanks! A few points:

Would it be possible to add a test for this? A windows-specific test is fine.
I think it's ok to not explicitly support XP just yet. I don't believe mio ever has, libstd only barely does, and it should be easy to add back in later.
I also think it's ok to skip the non-IFS LSP (not that I even know what it is) as it should be easy to add in later if need be and otherwise if it's deprecated let's just wait till we see it.
It's ok to skip UDP for now and focus on TCP.

alexcrichton · 2017-01-19T21:36:03Z

src/sys/windows/tcp.rs

@@ -397,6 +405,10 @@ impl StreamImp {
            self.inner.socket.write_overlapped(&buf[pos..], overlapped)
        };
        match err {
+            Ok(true) if me.instant_notify => {
+                me.write = State::Empty;
+                self.add_readiness(me, Ready::writable());


Here I think we should check/verify the number of bytes written, right?

yeah, that'll need modification to MIOW though, since we'll need to pass lpNumberOfBytesSend to WSASend in write_overlapped.

A pointer to the number, in bytes, sent by this call if the I/O operation completes immediately. Use NULL for this parameter if the lpOverlapped parameter is not NULL to avoid potentially erroneous results.

Golang and LIBUV seem to use this, which is surprising since the errorneous results doesn't sound good.

Or did you have something else in mind?

I think you should be able to get that through WSAGetOverlappedResult, right? That is currently exposed in miow as of recently.

Ok, seems to work.
Still feels slightly slower (might be worth to investigate it that's just a placebo or if doing
it like libuv/go is advantageous performance-wise.

What's the benchmark you're using? I wouldn't mind trying to poke around it in my spare time to see what's up

unfortunately the code I'm currently using is to slow for cargo bench to output
an average. (~6 seconds/iteration which writes 1GB).
Since this is implemented now, the next step for me would be to profile it and get it hopefully much faster.

What you might do is bench the test I just added (maybe with writing more data tough) since that should be sufficient to see a difference?

@alexcrichton
did some benchmarking to simply deterine the overhead of the .result call.
code: https://gist.github.com/steffengy/598350018d4827887b287ed0ea798850

mio/tcp.rs:

-self.inner.socket.result(overlapped).map(Some) +Ok(Some((buf.len() - pos, 3u32)))

results (with .result call):

1,943,652 ns/iter (+/- 525,906) 2,009,193 ns/iter (+/- 1,153,769) 1,943,501 ns/iter (+/- 565,776)

without the .result call:

1,753,306 ns/iter (+/- 1,052,692) 1,724,071 ns/iter (+/- 589,609) 1,792,201 ns/iter (+/- 537,950)

EDIT: miow patch steffengy/miow@a37c31d

steffengy · 2017-01-19T21:53:48Z

I'm also not quite sure how to add an explicit test for this, since by default
it should simply be used for everything in test_tcp?
(Maybe it would make sense to run with and without, but I don't quite see how you'd feasily implement that in those tests)

alexcrichton · 2017-01-19T21:55:09Z

Oh yeah not so much for correctness (as you're right existing tcp tests take care of that) but more a test of right after a write you can write again.

steffengy · 2017-01-19T22:32:34Z

@alexcrichton
Test case is added and verified to only work with these changes.

alexcrichton · 2017-01-19T22:38:25Z

src/sys/windows/tcp.rs

+                    self.add_readiness(me, Ready::writable());
+                    State::Empty
+                } else {
+                    State::Pending((buf, pos + transfered_bytes))


In theory if this happens (which I'm still not 100% sure that it actually can) I think we need to loop here, right? Otherwise nothing's scheduled to move this from the pending to the writing state?

I also have my doubts that this is actually a valid case but in theory yes.
I'll add the loop if we have some conclusive benchmark results for the wsagetoverlappedresult stuff (reducing the commit spam a bit by combining those changes, if the benchmarks show they make sense ).

alexcrichton · 2017-01-19T22:38:47Z

Test case is added and verified to only work with these changes.

Woo thanks!

alexcrichton · 2017-01-20T00:53:46Z

Figured we could continue the perf discussion out here rather than on the now-hidden thread.

So it looks like you passed a pointer to WSASend, but according to that documentation:

Use NULL for this parameter if the lpOverlapped parameter is not NULL to avoid potentially erroneous results. This parameter can be NULL only if the lpOverlapped parameter is not NULL.

So that makes me think that we've gotta call WSAGetOverlappedResult maybe? I'm not sure that that means "potentially erroneous results" though...

steffengy · 2017-01-20T05:52:46Z

@alexcrichton
That's what startled me too (I posted it in some comment before).
LibUv
Golang
.NET
.NET Core

So if even Microsoft uses it in it's .NET implementations.
The bottom part of the documentation is the important one:

If an overlapped operation completes immediately, WSASend returns a value of zero and the
lpNumberOfBytesSent parameter is updated with the number of bytes sent.  
If the overlapped operation is successfully initiated and will complete later, WSASend returns 
SOCKET_ERROR and indicates error code WSA_IO_PENDING. 
In this case, lpNumberOfBytesSent is not   updated.

so it's totally safe for that case. (It only means the value is invalid for WSA_IO_PENDING/not success)

steffengy · 2017-01-20T17:01:31Z

MIOW is refactored to return the size directly, when valid.
So we just saved some more overhead.

Related PR: yoshuawuyts/miow#8

the test case executes 16 writes (16KB), which return WouldBlock with the old case. so it basically shows the saved roundtrip through the event loop.

alexcrichton reviewed Jan 19, 2017

View reviewed changes

steffengy mentioned this pull request Jan 20, 2017

Refactor: return the amount of bytes if possible yoshuawuyts/miow#8

Merged

windows: use SetFileCompletionNotificationModes for TCP sockets (#476)

24b88f2

the test case executes 16 writes (16KB), which return WouldBlock with the old case. so it basically shows the saved roundtrip through the event loop.

alexcrichton merged commit 24b88f2 into tokio-rs:master Jan 20, 2017

Matthias247 mentioned this pull request Nov 8, 2020

Reduce Windows syscall overhead quinn-rs/quinn#882

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

windows: use SetFileCompletionNotificationModes for TCP sockets (#476) #520

windows: use SetFileCompletionNotificationModes for TCP sockets (#476) #520

steffengy commented Jan 19, 2017 •

edited

Loading

nayato commented Jan 19, 2017 •

edited

Loading

alexcrichton left a comment

alexcrichton Jan 19, 2017

steffengy Jan 19, 2017 •

edited

Loading

alexcrichton Jan 19, 2017

steffengy Jan 19, 2017

alexcrichton Jan 19, 2017

steffengy Jan 19, 2017

steffengy Jan 19, 2017 •

edited

Loading

steffengy commented Jan 19, 2017

alexcrichton commented Jan 19, 2017

steffengy commented Jan 19, 2017

alexcrichton Jan 19, 2017

steffengy Jan 19, 2017

alexcrichton commented Jan 19, 2017

alexcrichton commented Jan 20, 2017

steffengy commented Jan 20, 2017 •

edited

Loading

steffengy commented Jan 20, 2017

windows: use SetFileCompletionNotificationModes for TCP sockets (#476) #520

windows: use SetFileCompletionNotificationModes for TCP sockets (#476) #520

Conversation

steffengy commented Jan 19, 2017 • edited Loading

SetFileCompletionNotificationModes allows us to tell the kernel not to enqueue the completion message, if it returns SUCCESS immediately.

Less memory overhead

general issues, some of them already mentioned in the original issue:

some issues or rather hurdles using it with UDP-sockets:

nayato commented Jan 19, 2017 • edited Loading

alexcrichton left a comment

Choose a reason for hiding this comment

alexcrichton Jan 19, 2017

Choose a reason for hiding this comment

steffengy Jan 19, 2017 • edited Loading

Choose a reason for hiding this comment

alexcrichton Jan 19, 2017

Choose a reason for hiding this comment

steffengy Jan 19, 2017

Choose a reason for hiding this comment

alexcrichton Jan 19, 2017

Choose a reason for hiding this comment

steffengy Jan 19, 2017

Choose a reason for hiding this comment

steffengy Jan 19, 2017 • edited Loading

Choose a reason for hiding this comment

steffengy commented Jan 19, 2017

alexcrichton commented Jan 19, 2017

steffengy commented Jan 19, 2017

alexcrichton Jan 19, 2017

Choose a reason for hiding this comment

steffengy Jan 19, 2017

Choose a reason for hiding this comment

alexcrichton commented Jan 19, 2017

alexcrichton commented Jan 20, 2017

steffengy commented Jan 20, 2017 • edited Loading

steffengy commented Jan 20, 2017

steffengy commented Jan 19, 2017 •

edited

Loading

SetFileCompletionNotificationModes allows us to tell the kernel not to enqueue the completion message, if it returns `SUCCESS` immediately.

nayato commented Jan 19, 2017 •

edited

Loading

steffengy Jan 19, 2017 •

edited

Loading

steffengy Jan 19, 2017 •

edited

Loading

steffengy commented Jan 20, 2017 •

edited

Loading