Reel fails on first 2-3 requests, and after cooling down. #33
Comments
Are you seeing any other traces to help debug this problem? That's not a lot to go on |
Trying to think of how to produce more diagnostic data. Just returned from 90+ minutes away from my development environment, and my first pageload died with: Celluloid::DeadTaskError: cannot resume a dead task (dead fiber called). Will troubleshoot further. |
Ok @tarcieri, I isolated this. I took my application out of the way, stopped using Rack, and invoked Reel directly. Here's a gist showing my sample server (which is your sample server from the Reel readme): https://gist.github.com/digitalextremist/5316190 On pageloads 1 and 2, I see "Hello, world!" Then, on pageload 3, this always happens: E, [2013-04-04T19:39:25.098000 #15443] ERROR -- : MyServer crashed! Then the server exits. I noticed this before in my large application, based on Rack. When I used Reel, it seemed every 3rd pageload died, which was sometimes an image, sometimes critical javascript, sometimes a stylesheet. This example is much simpler though. I can't imagine this is something going on in the wild for people. Death by a 3rd pageload seems a bit limited in uptime :) |
@digitalextremist I recently found a rather severe bug in Celluloid::IO. I'm going to do another release (0.13.1). If you can try to reproduce the same problem on that, that'd be great. |
Then, with this gist: https://gist.github.com/digitalextremist/5316264 Same thing, except once supervised: pageload 4 and 5 work, then 6 dies; 7 and 8 work, then 9 dies... etc. |
Ah sorry, missed your comment while I was gisting a second repro. |
Or anywhere I can dig in 0.13.0 to help? |
I'll be rolling the gem shortly, hold tight... |
I just released 0.13.1. This resolved a number of problems that @aniero had with one of his projects. Let me know if it fixes yours. |
I re-ran my dummy application, and it is affected... When I cancel the pageload in the browser and refresh, the hang continues. |
Thanks for your work on this branch of thinking in systems theory by the way. I appreciate your work. I feel a need out of just plain honesty in taking true advantage of what we're capable of to go down this route you've helped pave. |
Enabled invokedynamic in jRuby and the same issue continues. I saw you mentioning something about invokedynamic being encouraged. Not sure how to troubleshoot. If you can let me into your thinking on this I will help all I can. |
Let me take a look. If the gist you posted is still crashing, something is |
Yeah, the Hello, World test seems so simple. If I can help somewhere, clue me into where the cheese moved in the last update in your mind and I'll dig around also in my fork. |
With the code I have ( Reel 0.3.0 and Celluloid 0.13.0 and Celluloid::IO 0.13.1 ) running Reel as a Rack handler seems to allow pageloads to get past that hang, but the hang does occur. Everything seems generally slower too, after the C::IO change and the application cannot completely function. Before, when pageloads died, you knew something did not complete for the browser. Obviously them not dying is better, but a new issue comes with silent hangs. It will usually take more than three pageloads for the application to render one instance. One for the HTML, one for the CSS, one for the JS, one for a JS template and one for an AJAX call, with another AJAX call I am trying to make into a websocket sync connection. There is a pageload for each image also. So depending on the order of those requests, something is always left out, sometimes vital, sometimes aesthetic. I think this issue you are looking at is the crux of one of two issues keeping me from being able to shift to Reel, so thanks for digging into it. |
A lot of the problems I found were because of the socket reuse occurring in the browsers. If you try and reproduce with curl, what happens? |
Aside from not being able to get httperf working on my machine (wtf), I was able to hammer the Reel portion of ringleader pretty hard without any problems using ab, and I haven't seen any errors from the browser either. |
Er, to clarify: since ringleader doesn't use the rack handler, I'd look there first. |
@halorgium Same issue with curl... except there is a short warning: root@two:/mu/zero# jruby reel.rb So to clarify, there is that warning once, which is the start of an infinite hang. Curl sits at the command line waiting for a response, same as chrome and firefox did, until I stop the attempt ( Ctrl-C to Curl ). Then when I try to run curl again on the Reel endpoint, the hang continues but with no Log line for GET / and no warning. It's zombie. This is happening with gist. No rack whatsoever, and not subclassed: |
This does seem to work, for some strange reason: That is a test using Reel::App and Octarine. |
By "work" I mean in Chrome, I can refresh that endpoint over and over and it is always instantaneous and never hangs. |
Strangely, upon checking the previous dummy test again, that also works. I updated my bundle which is the only thing I could guess that would affect anything. I had already been using 0.13.0 of Celluloid, 0.13.1 of Celluloid::IO and 0.3.0 of Reel. I brought in Octarine. Also, previous to that had restarted my vagrant ubuntu virtual machine. It appears nothing is stopping me from fleshing out a Reel+Octarine test application to try and pull in my existing code without Rack+Sinatra. I'd like to keep this open until I get some more time to break this again. As a matter of fact I will try reusing the Rack handler version and see if that's reliable, in which case I'll try that in production first. |
Wait! On the standard demo @tarcieri posted on the Reel readme (no subclass) there is still the cool-down issue, but it doesn't seem to be nearly as bad now. After X time away, when you return and refresh an endpoint, there is: W, [2013-04-06T19:00:22.543000 #2889] WARN -- : reactor attempted to resume a dead task |
The hang issue returned, after a cool down period. Only way to get the Rack::Reel server back is Ctrl-C, then I re-ran the ultra simple examples and saw they still work beyond 3 pageloads. Certain pageloads error out intermittently, so an image here and there is dead. Will build lite version of my application without Rack and see if it's Rack causing it. |
Running my 01E.rb gist [ https://gist.github.com/digitalextremist/5328506 ] I am able to repeat the hang issue that started this thread, but unsure where the hangup is coming from. Once the hangup happens, the server never comes back. It's usually after this error: W, [2013-04-06T20:02:53.487000 #4103] WARN -- : reactor attempted to resume a dead task In this case, the hang occurred on the second pageload. Tested with Chrome, Safari, Firefox and then curl. All are hung. No console activity announcing a request either. Reel just goes off into a netherworld, unresponsive. |
Yeah, even just using Curl, after 2 or 3 pageloads, the 01E.rb example hangs. |
Thinking restarting the server reopened the ability for Reel to provide pageloads, I went to shutdown -h now and something hung up the shutdown, so I had to hard-powerdown the virtual. After returning with a fresh virtual, Reel can do many pageloads without hanging. Going to give the test server time to cool down and see if I can trigger the hanging error again that way. |
Caused hang again. Without giving time for cooldown, leaving Chrome ( and closing it) I used curl and after the first pageload succeeded, on the second pageload it gave the same WARN line and is now hungup. |
After Ctrl-C on hungup server, then rerunning server, using Curl to hit the test endpoint, the 3rd pageload hang returns, presumably unless I recycle the virtual. Recycled virtual, and this time it did not hang on shutdown. In fresh virtual, reran with Curl, and it hangs again on the third pageload. Ctrl-C on 01E.rb, then using Chrome, does over 20 pageloads no problem. Then ran with Firefox, reloaded 10 or so times no problem, then hit the endpoint with curl once, and no pageload, just the WARN line. So in this case there wasn't even a first success, it just immediately crashed for curl, then Chrome and Firefox were also blocked out. |
Not sure of the relationship to the issue linked. Read up on that and don't see a connection. Hang can be caused by using Chrome/Firefox for over 0-50 pageloads, which seems like it could keep going forever no problem. Point being, you can pageload forever until you use Curl. Then, make one Curl request. That first request will succeed, then all requests no matter what from will hang. Interesting nuance: If you do that Curl request, but only once, then use Chrome again, it will fire off a WARN for attempted to resume a dead task, then it will work for the following pageload, and then hang. |
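A repro like the one described above can be scripted so the failing request number is reported instead of waiting on a stuck client. This is only a sketch: `first_hung_request` and `fetch_root` are hypothetical helper names, and the host and port are assumptions about where the test server listens.

```ruby
require 'net/http'
require 'timeout'

# Hypothetical helper: runs the given block up to +attempts+ times and
# returns the 1-based number of the first call that does not finish within
# +limit+ seconds, or nil if every call completes in time.
def first_hung_request(attempts, limit)
  (1..attempts).each do |n|
    begin
      Timeout.timeout(limit) { yield n }
    rescue Timeout::Error
      return n
    end
  end
  nil
end

# Assumed endpoint: adjust host and port to match the server under test.
def fetch_root(host = '127.0.0.1', port = 3000)
  Net::HTTP.start(host, port, open_timeout: 2, read_timeout: 2) do |http|
    http.get('/').code
  end
end
```

With the server running, `first_hung_request(10, 3) { fetch_root }` would return the number of the first request that stalls, or `nil` if all ten complete.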
@digitalextremist are you able to jump on IRC? I'd love to get to the bottom of this. freenode #celluloid |
So far the ways to hang this are:
I'll see you there. Nickname: decentrality |
At prompting of @halorgium, added this after require 'reel': Celluloid.task_class = Celluloid::TaskThread Per Celluloid #169 regarding jRuby 1.7.3 That seems to fix it. All previous causes of hang are gone. |
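For anyone applying the same workaround, here is a minimal sketch that only switches task classes on the runtime named above. `fiber_tasks_unreliable?` and `AFFECTED_JRUBY_VERSIONS` are hypothetical names introduced for illustration, and the version list is an assumption taken from this thread rather than from Celluloid itself.

```ruby
# Hypothetical list for illustration: JRuby versions reported as affected
# in this thread.
AFFECTED_JRUBY_VERSIONS = ['1.7.3'].freeze

# Hypothetical helper (not part of Reel or Celluloid): true when the
# fiber-backed task class should be swapped for the thread-backed one.
def fiber_tasks_unreliable?(engine, jruby_version)
  engine == 'jruby' && AFFECTED_JRUBY_VERSIONS.include?(jruby_version)
end

begin
  require 'reel'
rescue LoadError
  # Reel is not installed here; the check below then does nothing.
end

current_jruby = defined?(JRUBY_VERSION) ? JRUBY_VERSION : nil

if defined?(Celluloid) && fiber_tasks_unreliable?(RUBY_ENGINE, current_jruby)
  # The workaround from the comment above: run tasks on threads, not fibers.
  Celluloid.task_class = Celluloid::TaskThread
end
```

Gating on the version keeps the default fiber-based tasks everywhere else, so the override can be dropped once the runtime is upgraded or downgraded.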
So far, what @halorgium suggested from the referenced @tarcieri post is so effective it has also stopped hangs and failed fibers when using Reel as a Rack handler. |
I am going to go ahead and close this for now, since it seems like a duplicate or at least cousin issue to Celluloid #169 |
Without any work around, reverting to jRuby 1.7.2 solves issue (again). |
My first 2-3 pageloads always fail, using the Rack adapter for Reel. Not always in the process of returning a response from my application, but more so from returning an essential public file, like a JS or CSS file.
Example stack trace:
Reel::Server crashed!
Celluloid::DeadTaskError: cannot resume a dead task (dead fiber called)
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/tasks/task_fiber.rb:51:in `resume'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/tasks/task_fiber.rb:47:in `resume'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/responses.rb:11:in `dispatch'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/actor.rb:329:in `handle_message'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/actor.rb:196:in `run'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/actor.rb:184:in `initialize'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/thread_handle.rb:17:in `initialize'
org/jruby/RubyProc.java:249:in `call'
/usr/local/rvm/gems/jruby-1.7.3/gems/celluloid-0.12.4/lib/celluloid/internal_pool.rb:48:in `create'
10.126.130.1 - - [13/Mar/2013 15:10:56] "GET /01E.min.css 1.1" 304 - 0.0380
If I reload the URL in the browser 2-3 times, the request works.
And after an undetermined cool down period, 2-3 requests will also fail, as it does when Reel first starts up to return Rack responses.
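For context on the error text in the trace: resuming a Ruby fiber that has already finished raises `FiberError`, and the parenthetical in the message above looks like that underlying error being wrapped. The sketch below uses plain Ruby only; `DemoDeadTaskError` and `resume_task` are hypothetical stand-ins, not Celluloid's actual classes or methods, and the exact `FiberError` message varies between Ruby versions.

```ruby
# Hypothetical stand-in for a library-level error; not Celluloid's class.
class DemoDeadTaskError < StandardError; end

# Resume a fiber, translating the low-level FiberError into a
# higher-level error that carries the original message in parentheses.
def resume_task(fiber)
  fiber.resume
rescue FiberError => ex
  raise DemoDeadTaskError, "cannot resume a dead task (#{ex.message})"
end

fiber = Fiber.new { :done }

first_result = resume_task(fiber)   # the block runs to completion here

begin
  resume_task(fiber)                # the fiber is finished, so this raises
rescue DemoDeadTaskError => e
  second_error = e
end
```

The point of the sketch is that anything holding a reference to a finished fiber and resuming it a second time fails this way, regardless of what the fiber was doing.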