
Windows TCP implementation hits unreachable code #921

Closed
rbtying opened this issue Mar 21, 2019 · 7 comments
Labels
windows Related to the Windows OS.
Milestone

Comments

@rbtying

rbtying commented Mar 21, 2019

We've noticed that sometimes (very, very rarely), we get crashes on Windows which are caused by this unreachable! statement here:

https://github.com/carllerche/mio/blob/master/src/sys/windows/tcp.rs#L571

It's non-obvious to me why we would ever end up there, but immediately prior to that we have some logs which reference WSAECONNRESET -- is there a possibility that, in the connection-reset case, the mio implementation will incorrectly enter write_done?

Edit (by @Thomasdezeeuw):
fixed link to the unreachable statement:

_ => unreachable!(),
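
For illustration only, here is a minimal sketch of the failure mode I'm guessing at. It is not mio's actual code: the State enum and Connection type below are made up, and only write_done and the catch-all arm mirror names from the report. The idea is that a reset noticed on one code path replaces the connection state, and a later completion callback that assumes the state can only be pending then hits the unreachable!() arm.

    #[derive(Debug)]
    enum State {
        Empty,
        Pending(Vec<u8>),
        Error(std::io::Error),
    }

    struct Connection {
        write_state: State,
    }

    impl Connection {
        // Called when an overlapped write completes. The catch-all arm
        // assumes the state can only ever be `Pending` at this point,
        // mirroring the `_ => unreachable!()` linked above.
        fn write_done(&mut self) {
            match std::mem::replace(&mut self.write_state, State::Empty) {
                State::Pending(_buf) => { /* advance the write state machine */ }
                // If some other path (e.g. one handling a WSAECONNRESET
                // notification) already swapped the state to `Error` or
                // `Empty`, we land here instead.
                _ => unreachable!(),
            }
        }
    }

    fn main() {
        let mut conn = Connection {
            write_state: State::Pending(b"hello".to_vec()),
        };
        conn.write_done(); // fine: the state was still Pending

        // Had write_state already been replaced with State::Error(..) by a
        // reset seen elsewhere, the same call would panic on unreachable!().
    }

Whether mio's Windows backend can actually get into such a state on WSAECONNRESET is exactly the open question here.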

@carllerche
Member

carllerche commented Apr 2, 2019 via email

@FXTi
Contributor

FXTi commented Apr 9, 2019

@rbtying Could you reproduce the crash or provide a test case?

More information is definitely needed to clarify the situation here, since the crashes are very rare.

@rbtying
Author

rbtying commented Apr 9, 2019

@FXTi @carllerche thanks for the responses! I'm trying to narrow down the crash conditions, but this is coming from a fairly complicated app and we have pretty limited telemetry. I'll let you know if I can get a repro!

yorickpeterse pushed a commit to inko-lang/inko that referenced this issue May 9, 2019
This adds support for performing non-blocking network operations, such
as reading and writing to/from a socket. The runtime API exposed is
similar to Erlang, allowing one to write code that uses non-blocking
APIs without having to resort to using callbacks. For example, in a
typical callback-based language you may write the following to read
from a socket:

    socket.create do (socket) {
      socket.read do (data) {

      }
    }

In Inko, you instead would (more or less) write the following:

    import std::net::socket::TcpStream

    let socket = try! TcpStream.new(ip: '192.0.2.0', port: 80)
    let message = try! socket.read_string(size: 4)

The VM then takes care of using the appropriate non-blocking operations,
and will reschedule processes whenever necessary.

This functionality is exposed through the following runtime modules:

* std::net::ip: used for parsing IPv4 and IPv6 addresses.
* std::net::socket: used for TCP and UDP sockets.
* std::net::unix: used for Unix domain sockets.

The VM uses the system's native polling mechanism to determine when a
file descriptor is available for a read or write. On Linux we use epoll,
while kqueue is used for the various BSDs and Mac OS. For Windows we use
wepoll (https://github.com/piscisaureus/wepoll). Wepoll exposes an API
that is compatible with the epoll API, but uses Windows IO completion
ports under the hood.

When a process attempts to perform a non-blocking operation, the process
is registered (combined with the file descriptor to poll) in a global
poller and suspended. When the file descriptor becomes available for a
read or write, the corresponding process is rescheduled. The polling
mechanism is set up in such a way that a process can not be rescheduled
multiple times at once.

We do not use MIO (https://github.com/tokio-rs/mio); instead we use
epoll, kqueue, and wepoll (using
https://crates.io/crates/wepoll-binding) directly. At the time of
writing, while MIO offers some form of support for Windows, it comes with
various issues:

1. tokio-rs/mio#921
2. tokio-rs/mio#919
3. tokio-rs/mio#776
4. tokio-rs/mio#913

It's not clear when these issues will be addressed, as the maintainers
of MIO appear not to have the experience and resources to resolve them
themselves. MIO is part of the Google Summer of Code 2019, with the goal
of improving Windows support. Unfortunately, this likely won't be done
before the end of 2019, and we don't want to wait that long.

Another issue with MIO is its implementation. Internally, MIO uses
various forms of synchronisation which can make it expensive to use a
single poller across multiple threads; it certainly is not a zero-cost
library. It also offers more than we need, such as being able to poll
arbitrary objects.

We are not the first to run into these issues. For example, the Amethyst
video game engine also ran into issues with MIO as detailed in
https://community.amethyst.rs/t/sorting-through-the-mio-mess/561.

With all of this in mind, I decided it was not worth the time to wait
for MIO to get fixed, and to instead spend time directly using epoll,
kqueue, and wepoll. This gives us total control over the code, and
allows us to implement what we need in the way we need it. Most
important of all: it works on Linux, BSD, Mac, and Windows.
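
(For readers skimming the commit message above: the readiness-based registration it describes boils down to something like the following Linux-only sketch. It uses the libc crate directly and a bare u64 token in place of the Inko VM's real process bookkeeping, so it is only an illustration of the idea, not Inko's code.)

    use std::net::TcpListener;
    use std::os::unix::io::AsRawFd;

    fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("127.0.0.1:0")?;
        let fd = listener.as_raw_fd();

        unsafe {
            let epfd = libc::epoll_create1(0);
            assert!(epfd >= 0);

            // Register the fd with EPOLLONESHOT so the "process" identified
            // by the token cannot be woken more than once per registration.
            let mut ev = libc::epoll_event {
                events: (libc::EPOLLIN | libc::EPOLLONESHOT) as u32,
                u64: 42, // token standing in for the suspended process
            };
            assert_eq!(libc::epoll_ctl(epfd, libc::EPOLL_CTL_ADD, fd, &mut ev), 0);

            // "Suspend" until the fd becomes readable (or 100 ms pass), then
            // "reschedule" whatever process the returned token identifies.
            let mut ready = [libc::epoll_event { events: 0, u64: 0 }; 1];
            let n = libc::epoll_wait(epfd, ready.as_mut_ptr(), 1, 100);
            println!("ready events: {}", n);

            libc::close(epfd);
        }
        Ok(())
    }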
@Thomasdezeeuw
Collaborator

@rbtying Any update on this?

@rbtying
Author

rbtying commented Jun 2, 2019 via email

@bgianfo

bgianfo commented Jun 4, 2019

@rbtying Do you have a crash dump? If you do, can you determine what the state is, if it's not pending? It seems like IO cancellation might be involved here.

You could try writing to a connection and closing it on the remote side in a loop; a rough sketch of that follows.
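
For example, an untested sketch of that repro (using std::net for brevity; a real repro against mio 0.6 would drive the writes through a mio::net::TcpStream registered with a Poll instead, and the exact timing needed to reach the unreachable path is unknown):

    use std::io::Write;
    use std::net::{TcpListener, TcpStream};
    use std::thread;

    fn main() -> std::io::Result<()> {
        let listener = TcpListener::bind("127.0.0.1:0")?;
        let addr = listener.local_addr()?;

        // Remote side: accept and immediately drop every connection.
        thread::spawn(move || {
            for conn in listener.incoming() {
                drop(conn);
            }
        });

        for i in 0..1_000 {
            let mut stream = TcpStream::connect(addr)?;
            // Keep writing until the peer's close turns into a reset. The
            // first write after the close usually succeeds; a later one
            // should fail with ConnectionReset / ConnectionAborted
            // (WSAECONNRESET on Windows) or BrokenPipe.
            loop {
                if let Err(e) = stream.write_all(&[0u8; 4096]) {
                    println!("iteration {}: write failed: {:?}", i, e.kind());
                    break;
                }
            }
        }
        Ok(())
    }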

@Thomasdezeeuw Thomasdezeeuw added the windows Related to the Windows OS. label Jun 17, 2019
@Thomasdezeeuw Thomasdezeeuw added this to the v0.7 milestone Jun 17, 2019
@Thomasdezeeuw
Collaborator

The Windows implementation has been rewritten, so I'm closing this. Too bad we didn't find the root cause.
