New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to use sockets from thread other than creating thread #165

Closed
hoelzro opened this Issue Dec 28, 2014 · 26 comments

Comments

Projects
None yet
6 participants
@hoelzro
Contributor

hoelzro commented Dec 28, 2014

If you try to use a socket from a thread other than the thread in which that socket was created, the read method will always return an empty buffer:

use v6;

# run this script with the following running in a separate termainal
# (using OpenBSD netcat):
#
#     nc -l localhost 9000 </dev/zero

my $sock = IO::Socket::INET.new(
    :host<localhost>,
    :port(9000),
);

my $p = start {
    # this read will always be empty unless the socket creation is
    # moved inside of the start block
    say $sock.read(8);
    42;
}

$p.result;

I should also note that in the program in which I found this behavior, read just blocks forever.

@hoelzro hoelzro added the bug label Dec 28, 2014

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 29, 2014

Contributor

I believe that the problem is that the uv_stream_t object embedded within an MVMIOSyncStreamData object has a reference to the loop belonging to the original thread context in which it was created, and that loop refers to it as well. MVM_io_syncstream_read_bytes runs a nested libuv loop on the current thread context's loop, so the uv_stream_t belonging to that MVMIOSyncStreamData won't be checked when that loop runs.

Also, regarding that "real world" program in which it blocks forever - it doesn't, I'm just not properly handling what looks to be EOF.

Contributor

hoelzro commented Dec 29, 2014

I believe that the problem is that the uv_stream_t object embedded within an MVMIOSyncStreamData object has a reference to the loop belonging to the original thread context in which it was created, and that loop refers to it as well. MVM_io_syncstream_read_bytes runs a nested libuv loop on the current thread context's loop, so the uv_stream_t belonging to that MVMIOSyncStreamData won't be checked when that loop runs.

Also, regarding that "real world" program in which it blocks forever - it doesn't, I'm just not properly handling what looks to be EOF.

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 29, 2014

Contributor

The following patch fixes the problem, but it pokes around in data->handle->loop which libuv documents as being read-only. As far as I can tell from a quick glance at uv.h, there is no way to migrate a handle to a separate event loop; I'll see what the libuv community has to say about this.

diff --git a/src/io/syncstream.c b/src/io/syncstream.c
index 183466e..ecf92ac 100644
--- a/src/io/syncstream.c
+++ b/src/io/syncstream.c
@@ -71,11 +71,16 @@ static MVMint32 read_to_buffer(MVMThreadContext *tc, MVMIOSyncStreamData *data,
         int r;
         data->handle->data = data;
         data->cur_tc = tc;
-        if ((r = uv_read_start(data->handle, on_alloc, on_read)) < 0)
+        uv_loop_t *old_loop = data->handle->loop;
+        data->handle->loop = tc->loop;
+        if ((r = uv_read_start(data->handle, on_alloc, on_read)) < 0) {
+            data->handle->loop = old_loop;
             MVM_exception_throw_adhoc(tc, "Reading from stream failed: %s",
                 uv_strerror(r));
+        }
         uv_ref((uv_handle_t *)data->handle);
         uv_run(tc->loop, UV_RUN_DEFAULT);
+        data->handle->loop = old_loop;
         return 1;
     }
     else {
Contributor

hoelzro commented Dec 29, 2014

The following patch fixes the problem, but it pokes around in data->handle->loop which libuv documents as being read-only. As far as I can tell from a quick glance at uv.h, there is no way to migrate a handle to a separate event loop; I'll see what the libuv community has to say about this.

diff --git a/src/io/syncstream.c b/src/io/syncstream.c
index 183466e..ecf92ac 100644
--- a/src/io/syncstream.c
+++ b/src/io/syncstream.c
@@ -71,11 +71,16 @@ static MVMint32 read_to_buffer(MVMThreadContext *tc, MVMIOSyncStreamData *data,
         int r;
         data->handle->data = data;
         data->cur_tc = tc;
-        if ((r = uv_read_start(data->handle, on_alloc, on_read)) < 0)
+        uv_loop_t *old_loop = data->handle->loop;
+        data->handle->loop = tc->loop;
+        if ((r = uv_read_start(data->handle, on_alloc, on_read)) < 0) {
+            data->handle->loop = old_loop;
             MVM_exception_throw_adhoc(tc, "Reading from stream failed: %s",
                 uv_strerror(r));
+        }
         uv_ref((uv_handle_t *)data->handle);
         uv_run(tc->loop, UV_RUN_DEFAULT);
+        data->handle->loop = old_loop;
         return 1;
     }
     else {
@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 30, 2014

Contributor

Here's a reply from the libuv community:

https://groups.google.com/forum/#!topic/libuv/bOOTAS2zV_U

Handles are tied to the event loop that created them.  You can clone
them by sending them over an IPC pipe to another event loop; see
test/benchmark-multi-accept.c for a (rather elaborate) example[0].

Functionality for migrating handles comes up from time to time.  It
would make a nice addition to the API but it's probably not all that
easy to implement, particularly on Windows.

[0] https://github.com/libuv/libuv/blob/v1.x/test/benchmark-multi-accept.c 

I've been working on a little test program that exchanges handles over an IPC pipe, but if I understand the docs correctly, this requires a named pipe/socket, which seems kind of messy in a VM.

Contributor

hoelzro commented Dec 30, 2014

Here's a reply from the libuv community:

https://groups.google.com/forum/#!topic/libuv/bOOTAS2zV_U

Handles are tied to the event loop that created them.  You can clone
them by sending them over an IPC pipe to another event loop; see
test/benchmark-multi-accept.c for a (rather elaborate) example[0].

Functionality for migrating handles comes up from time to time.  It
would make a nice addition to the API but it's probably not all that
easy to implement, particularly on Windows.

[0] https://github.com/libuv/libuv/blob/v1.x/test/benchmark-multi-accept.c 

I've been working on a little test program that exchanges handles over an IPC pipe, but if I understand the docs correctly, this requires a named pipe/socket, which seems kind of messy in a VM.

@jnthn

This comment has been minimized.

Show comment
Hide comment
@jnthn

jnthn Dec 30, 2014

Member

The event loop per thread approach was taken early on in the libuv adoption, in 89955c6. Probably that is the thing we should be questioning.

Sadly, the simplest possible thing, just dropping to a single global event loop and protecting it with a lock, is not going to work either; folks will rightly be disappointed if one thread calls $*IN.get and no others can do any other synchronous I/O.

Which leaves us with adopting a design like we have for async I/O today, where there is a thread whose job it is to run the event loop and process requests, and it works its way through a request queue. The difference is the thread making the request would need to block on its completion, rather than an async callback happening later. This may, however, have undesirable performance, which in turn would push us towards asking if we even should be building our sync I/O on top of an async I/O library. Of course, in single-threaded programs we can optimize away the whole I/O thread creation and round trip, and just have the one thread in existence directly do the work.

Member

jnthn commented Dec 30, 2014

The event loop per thread approach was taken early on in the libuv adoption, in 89955c6. Probably that is the thing we should be questioning.

Sadly, the simplest possible thing, just dropping to a single global event loop and protecting it with a lock, is not going to work either; folks will rightly be disappointed if one thread calls $*IN.get and no others can do any other synchronous I/O.

Which leaves us with adopting a design like we have for async I/O today, where there is a thread whose job it is to run the event loop and process requests, and it works its way through a request queue. The difference is the thread making the request would need to block on its completion, rather than an async callback happening later. This may, however, have undesirable performance, which in turn would push us towards asking if we even should be building our sync I/O on top of an async I/O library. Of course, in single-threaded programs we can optimize away the whole I/O thread creation and round trip, and just have the one thread in existence directly do the work.

@jnthn

This comment has been minimized.

Show comment
Hide comment
@jnthn

jnthn Dec 30, 2014

Member

Oh, another option if libuv event loops are not thread safe in the sense that you can't use them from many threads at once, but they don't have thread affinity: just store the thread context that created the handle, add an extra lock into the MVMThreadContext that protects the event loop, and then we can always use the correct event loop (and acquire the needed lock to protect it). This - provided it's workable - is the least invasive solution, I suspect.

Member

jnthn commented Dec 30, 2014

Oh, another option if libuv event loops are not thread safe in the sense that you can't use them from many threads at once, but they don't have thread affinity: just store the thread context that created the handle, add an extra lock into the MVMThreadContext that protects the event loop, and then we can always use the correct event loop (and acquire the needed lock to protect it). This - provided it's workable - is the least invasive solution, I suspect.

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 31, 2014

Contributor

Thanks for the input @jnthn. Regarding that last solution, doesn't that mean that every usage of tc->loop would have to acquire that lock? Sounds like a lot of code to change!

Contributor

hoelzro commented Dec 31, 2014

Thanks for the input @jnthn. Regarding that last solution, doesn't that mean that every usage of tc->loop would have to acquire that lock? Sounds like a lot of code to change!

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 31, 2014

Contributor

In the interim, should an exception/warning be thrown so that a user encountering this behavior isn't surprised?

Contributor

hoelzro commented Dec 31, 2014

In the interim, should an exception/warning be thrown so that a user encountering this behavior isn't surprised?

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 31, 2014

Contributor

So it turns out there actually aren't that many places in the code that refer to tc->loop, so I ended up trying that last suggestion, and it worked! If you want to look it over, the code is in the https://github.com/MoarVM/MoarVM/tree/issue165 branch.

Contributor

hoelzro commented Dec 31, 2014

So it turns out there actually aren't that many places in the code that refer to tc->loop, so I ended up trying that last suggestion, and it worked! If you want to look it over, the code is in the https://github.com/MoarVM/MoarVM/tree/issue165 branch.

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 31, 2014

Contributor

I think something didn't take properly with that branch; t/spec/S17-procasync/basic.t blocks forever.

Contributor

hoelzro commented Dec 31, 2014

I think something didn't take properly with that branch; t/spec/S17-procasync/basic.t blocks forever.

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Dec 31, 2014

Contributor

Another thing that occurs to me: I think all libuv calls would have to grab a mutex for a loop, because a libuv call that doesn't explicitly take a loop may reference one internally. For example, a uv_stream_t has a reference to the loop it should run on, so if I destroy that stream, it probably does something with that loop.

Contributor

hoelzro commented Dec 31, 2014

Another thing that occurs to me: I think all libuv calls would have to grab a mutex for a loop, because a libuv call that doesn't explicitly take a loop may reference one internally. For example, a uv_stream_t has a reference to the loop it should run on, so if I destroy that stream, it probably does something with that loop.

@tony-o

This comment has been minimized.

Show comment
Hide comment
@tony-o

tony-o May 27, 2015

Was this merged into moar?

tony-o commented May 27, 2015

Was this merged into moar?

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro May 27, 2015

Contributor

@tony-o I don't think so; it seems like there are outstanding issues with it.

Contributor

hoelzro commented May 27, 2015

@tony-o I don't think so; it seems like there are outstanding issues with it.

@tony-o

This comment has been minimized.

Show comment
Hide comment
@tony-o

tony-o May 27, 2015

Interesting. I wonder if this would affect pipes in general, there is a similar issue with opening a proc::asynchronous and that sub process not being able to read from stdin unless it's in the main thread

tony-o commented May 27, 2015

Interesting. I wonder if this would affect pipes in general, there is a similar issue with opening a proc::asynchronous and that sub process not being able to read from stdin unless it's in the main thread

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Jul 15, 2015

Contributor

I just encountered this again while writing a script that shells out to pandoc to render some text. @jnthn Any hopes for a fix?

Contributor

hoelzro commented Jul 15, 2015

I just encountered this again while writing a script that shells out to pandoc to render some text. @jnthn Any hopes for a fix?

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Jul 15, 2015

Contributor

When using files, the bug (for me, anyway) is not the same as the socket issue; it's that I have an exception that Supply.tap is ignoring. Here's some sample code:

my $fh = open('numbers', :w);
my $timer = Supply.interval(1);
$timer.tap(-> $n {
  $fh.write: ($n ~ "\n");
});
sleep 3;
close $fh;

numbers ends up empty, but it's because I didn't encode $n ~ "\n" when writing to the file. Doing that fixes the issue for files. @tony-o Is that the issue you were seeing? I can't seem to find the RT ticket for this =/

Contributor

hoelzro commented Jul 15, 2015

When using files, the bug (for me, anyway) is not the same as the socket issue; it's that I have an exception that Supply.tap is ignoring. Here's some sample code:

my $fh = open('numbers', :w);
my $timer = Supply.interval(1);
$timer.tap(-> $n {
  $fh.write: ($n ~ "\n");
});
sleep 3;
close $fh;

numbers ends up empty, but it's because I didn't encode $n ~ "\n" when writing to the file. Doing that fixes the issue for files. @tony-o Is that the issue you were seeing? I can't seem to find the RT ticket for this =/

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@tony-o

This comment has been minimized.

Show comment
Hide comment
@tony-o

tony-o Jul 16, 2015

@hoelzro nah, my problem isn't related to swallowing errors although I experienced that too with some other stuff and didn't think it was actually a bug.

tony-o commented Jul 16, 2015

@hoelzro nah, my problem isn't related to swallowing errors although I experienced that too with some other stuff and didn't think it was actually a bug.

@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro Jul 16, 2015

Contributor

The more I look at this and RT #125230 the more I think they are separate issues...

Contributor

hoelzro commented Jul 16, 2015

The more I look at this and RT #125230 the more I think they are separate issues...

hoelzro added a commit that referenced this issue Jul 21, 2015

Die when we try to accept() or read() outside an originating thread
Addresses MoarVM #165

libuv doesn't like it when you use a uv_stream_t outside of the
event loop that created it, and since MoarVM has an event loop
per thread, this causes problems.  The problem you see is that your
reads are always zero in length and your accepts never return.

This change is not ideal; I'd rather change MoarVM so that a user
can actually do this (there are some strategies discussed on
#165), but this will at least
prevent people from getting a headache trying to figure out why they
aren't getting any data from their sockets.

There are a few other invocations of uv_run in the source code that may
benefit from this as well.
@MadcapJake

This comment has been minimized.

Show comment
Hide comment
@MadcapJake

MadcapJake May 6, 2016

I've run into this problem myself when trying to use channels in a shakespeare character counter https://gist.github.com/MadcapJake/c29b76d7eccf5af86e973c5da7c7f7a3#file-workers-pl6

Not quite sure this is relevant to the issue but this is the error I get:

Tried to read() on a socket from outside its originating thread

I've run into this problem myself when trying to use channels in a shakespeare character counter https://gist.github.com/MadcapJake/c29b76d7eccf5af86e973c5da7c7f7a3#file-workers-pl6

Not quite sure this is relevant to the issue but this is the error I get:

Tried to read() on a socket from outside its originating thread
@hoelzro

This comment has been minimized.

Show comment
Hide comment
@hoelzro

hoelzro May 6, 2016

Contributor

@MadcapJake That's an error I added to MoarVM so that users would see a hard failure instead of scratching their heads for an hour trying to figure out why their reads were always short. =)

Contributor

hoelzro commented May 6, 2016

@MadcapJake That's an error I added to MoarVM so that users would see a hard failure instead of scratching their heads for an hour trying to figure out why their reads were always short. =)

@scovit

This comment has been minimized.

Show comment
Hide comment
@scovit

scovit Jun 10, 2016

I've been running into this error message when trying to do my $stdinput = $*IN.Supply; and then use it into an async socket connection. Workarounded by using something better:

my $stdinput = {my $insupplier=Supplier.new;use Readline;start {my $rl = Readline.new;while my $msg = $rl.readline("") {$insupplier.emit($msg);}}; $insupplier.Supply; }();

scovit commented Jun 10, 2016

I've been running into this error message when trying to do my $stdinput = $*IN.Supply; and then use it into an async socket connection. Workarounded by using something better:

my $stdinput = {my $insupplier=Supplier.new;use Readline;start {my $rl = Readline.new;while my $msg = $rl.readline("") {$insupplier.emit($msg);}}; $insupplier.Supply; }();
@zoffixznet

This comment has been minimized.

Show comment
Hide comment
@zoffixznet

zoffixznet Aug 18, 2016

Contributor

Here's another place this presents an issue. Trying to read from STDOUT from another thread:

$ perl6 -e 'await start get'
Tried to read() on a socket from outside its originating thread
  in block <unit> at -e line 1
Contributor

zoffixznet commented Aug 18, 2016

Here's another place this presents an issue. Trying to read from STDOUT from another thread:

$ perl6 -e 'await start get'
Tried to read() on a socket from outside its originating thread
  in block <unit> at -e line 1
@jnthn

This comment has been minimized.

Show comment
Hide comment
Member

jnthn commented Nov 2, 2016

@jnthn

This comment has been minimized.

Show comment
Hide comment
@jnthn

jnthn May 23, 2017

Member

This is now fixed for the case of sockets; file handles are still to come. The socket part of the fix is covered by an added spectest.

Member

jnthn commented May 23, 2017

This is now fixed for the case of sockets; file handles are still to come. The socket part of the fix is covered by an added spectest.

@jnthn

This comment has been minimized.

Show comment
Hide comment
@jnthn

jnthn Jun 6, 2017

Member

In MoarVM HEAD, it is now addressed for file handles too, and TTYs and pipes now use the code in syncfile.c.

Member

jnthn commented Jun 6, 2017

In MoarVM HEAD, it is now addressed for file handles too, and TTYs and pipes now use the code in syncfile.c.

@jnthn

This comment has been minimized.

Show comment
Hide comment
@jnthn

jnthn Jun 23, 2017

Member

All process handling in Rakudo is now done in terms of the async process API, which makes sense in that MoarVM was supporting two APIs that were both backed with the libuv async API anyway, but with one of them having the trouble discussed in this issue.

I've just removed the synchronous process bits from MoarVM, and therefore I believe this issue can be counted as fully resolved. We could with a little further effort also eliminate the tc->loop except for in the case of the event loop thread, but I don't believe any of the current remaining uses are problematic.

Member

jnthn commented Jun 23, 2017

All process handling in Rakudo is now done in terms of the async process API, which makes sense in that MoarVM was supporting two APIs that were both backed with the libuv async API anyway, but with one of them having the trouble discussed in this issue.

I've just removed the synchronous process bits from MoarVM, and therefore I believe this issue can be counted as fully resolved. We could with a little further effort also eliminate the tc->loop except for in the case of the event loop thread, but I don't believe any of the current remaining uses are problematic.

@jnthn jnthn closed this Jun 23, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment