Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IO.pipe with IO.select #313

Closed
snarfmason opened this issue Jun 5, 2013 · 18 comments

Comments

Projects
None yet
5 participants
@snarfmason
Copy link

commented Jun 5, 2013

pipe.rb: https://gist.github.com/snarfmason/5715476
pipe-select.rb: https://gist.github.com/snarfmason/5715484

pipe.rb is a simple threaded pipe example that works as expected in Maglev (the same as in ruby 1.8.7, ruby 1.9.3, rubinius and jruby)

pipe-select is adapted from the Ruby 1.9.3 docs at http://ruby-doc.org/core-1.9.3/IO.html#method-c-select

The output is a bit of a mess with my extra puts in there, but the idea is that it says ping pong 5 times (it actually ends with a ping that has no paired pong).

Maglev doesn't stop on IO.select and just returns nil.

@snarfmason

This comment has been minimized.

Copy link
Author

commented Jun 5, 2013

$ ruby1.9 pipe-select.rb

rs: []
ws: [#<IO:fd 6>]
rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
ping rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
pong
rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
ping rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
pong
rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
ping rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
pong
rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
ping rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
pong
rs: [#<IO:fd 5>]
ws: [#<IO:fd 6>]
ping 
@snarfmason

This comment has been minimized.

Copy link
Author

commented Jun 5, 2013

$ maglev-ruby pipe-select.rb

rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
rs: [nil]
ws: [nil]
@pbm

This comment has been minimized.

Copy link
Member

commented Jun 5, 2013

I believe the problem with your script is the fork() part. A MagLev VM wasn't designed to fork(). There are user login sessions and transaction contexts etc. that were not designed for forking action. We support the IO.popen, because it is a fork+exec and the VM can control it and shutdown appropriate connections to shared page cache and stone. If I understood your original need for IO.pipe, it was for interthread communication (i.e., in the same process). I'm not sure if that will work, but is worth a try. But the sample script's use of fork to spawn a new VM is not a use case that MagLev supports.

@snarfmason

This comment has been minimized.

Copy link
Author

commented Jun 5, 2013

I was under the impression that threads were okay. Ruby has a shortcut method on the thread class called fork, but it's just creating a new thread and starting it immediately.

It's not actually a fork in the unix sense, in that it's not creating a new process.

I think this is just about the IO.select not waiting on the IO.pipe ends to be ready and returning nil.

@pbm

This comment has been minimized.

Copy link
Member

commented Jun 5, 2013

Apparently my attention span is too short this monring. I saw "blahblah.FORK". I should have looked more closely to see it was "Thread.fork". Sorry about that. I'm not sure why it doesn't work, except to say that multithreaded coding and IO is not an area that has received much attention, so it is not too surprising to find a bug here.

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 26, 2014

I just ran into this same issue. It appears that IO.select hasn't received any attention for a while. I'm going to try to look into this, but if anyone actually knows anything about non-blocking IO stuff they are welcome to beat me to it :)

@timfel

This comment has been minimized.

Copy link
Member

commented Feb 26, 2014

I'll need to dig deeper, and don't have time this week, but here is a clue: src/kernel/post_prims/IO.rb uses FFI and File.__fopen to create pipes, and in the comment to the snalltalk code for Kernel.select, "selectRead: readIos write: writeIos error: errIos timeout: timeoutArr", it says: "If any of the IOs is a GsFile, that IO will have been immediately added to the ioSelectResult."

Maybe the type is the problem, need to check the primitive to confirm, or maybe @AllenOtis knows from memory?

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 26, 2014

I also talked to @johnnyt about this last night at a user group meeting and he had the same idea about digging into the primitive. I'm totally out of my depth as far as digging into FFI primitives, but luckily it sounds like @johnnyt is pretty set on finding a fix for this.

@timfel

This comment has been minimized.

Copy link
Member

commented Feb 26, 2014

I have a working (but hacked) solution in b6007cd.Let's see what Travis-CI thinks about those changes

@timfel

This comment has been minimized.

Copy link
Member

commented Feb 26, 2014

Also, yeah, I know I said I didn't have time, but writing papers gets really boring.

@timfel

This comment has been minimized.

Copy link
Member

commented Feb 26, 2014

Just as an explanation, the select primitive really only waits on Sockets. So I made Pipe endpoints be Sockets, and they have a file object on the same file descriptor, because you can't use the socket api to write to pipes.

My solution would need a more generic method to delegate to its @file ivar. The real solution would of course be to fix the VM primitive, but I don't have the expertise there.

@AllenOtis

This comment has been minimized.

Copy link
Contributor

commented Feb 26, 2014

This looks like a reasonable solution.

Allen

On Wed, Feb 26, 2014 at 8:29 AM, Tim Felgentreff
notifications@github.comwrote:

I have a working (but hacked) solution in b6007cdhttps://github.com/MagLev/maglev/commit/b6007cd5f0133745fdfef4db46487f27d067e960.Let's
see what Travis-CI thinks about those changes

Reply to this email directly or view it on GitHubhttps://github.com//issues/313#issuecomment-36144599
.

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 27, 2014

@timfel Thanks! I got puma working by pulling down your branch and building locally. Hopefully this can get merged into master soon. Looks like Travis CI had no complaints as well.

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 27, 2014

:'( sorry I had a PATH issue and puma was actually using mri, not working quite yet

@timfel

This comment has been minimized.

Copy link
Member

commented Feb 27, 2014

@hqmq is it the next problem, or still an issue with IO.select?

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 27, 2014

@timfel I'll have to do some more digging on it later, but initially what I get is connections to the server hanging. It doesn't crash, but it looks like the problem is probably related to writing back out to the sockets since the connection is made correctly. I'll debug a bit more this evening (after my normal work hours)

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 28, 2014

@timfel apparently the vmunit tests are actually not passing. If you scroll to the bottom of the last build (https://travis-ci.org/MagLev/maglev/jobs/19664314) it actually fails on Trac438.rb, but travis is somehow seeing an exit status of 0 so it doesn't know that it failed. Still digging into puma to try to figure out where it is hanging, but we probably need to get the tests actually running as well (perhaps in a separate issue/Pull Request)

@mmmries

This comment has been minimized.

Copy link
Contributor

commented Feb 28, 2014

I tracked down the puma issue. When a connection comes in puma tries to call TCPSocket#read_nonblock(16384) and the read nonblock is apparently blocking trying to wait for more data. So I think the select issue is fixed.

There is still a slight behavior difference between MRI and Maglev on IO.select

MRI

2.1.0 :001 > rs, ws = IO.pipe
 => [#<IO:fd 9>, #<IO:fd 10>]
2.1.0 :002 > IO.select([rs],[ws])
 => [[], [#<IO:fd 10>], []]

Maglev

irb(main):001:0> rs, ws = IO.pipe
=> #<IO::Pipe:0x6efa101 @_st_fileDescriptor=13 @_st_lineNumber=nil @_st_readWaiters=nil @_st_writeWaiters=nil @_st_readyEvents=0 @_st_pollArrayOfs=-1 @_st_selectWaiters=nil @_st_readBuffer=nil @_st_bufferOffset=nil @_st_isRubyBlocking=true @file=#<File:0x6efa301 @_st_fileDescriptor=13 @_st_lineNumber=nil @_st_isClient=false @_st_pathName=nil @_st_mode="w"> @apparant_class=Class>
irb(main):002:0> IO.select([rs],[ws])
=> [nil, [#<IO::Pipe:0x6efa101 @_st_fileDescriptor=13 @_st_lineNumber=nil @_st_readWaiters=nil @_st_writeWaiters=nil @_st_readyEvents=4 @_st_pollArrayOfs=-1 @_st_selectWaiters=[nil, 0] @_st_readBuffer=nil @_st_bufferOffset=nil @_st_isRubyBlocking=true @file=#<File:0x6efa301 @_st_fileDescriptor=13 @_st_lineNumber=nil @_st_isClient=false @_st_pathName=nil @_st_mode="w"> @apparant_class=Class>], nil]

Basically If none of ios are ready to read or ready to write in MRI it returns an empty list, but Maglev returns nil. I think the branch can be merged as-is, but eventually we will want to make that behavior consistent because a lot of library code will assume that it is getting back a list rather than a nil.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.