Ruby 1.9.1 and god #6
Comments
The same error happens with god 0.7.22 as well. |
It looks like God is no longer using that "critical=" method of Thread which no longer exists in Ruby 1.9, so it might work now. |
god 0.8 has removed the usage of Thread.critical=. I believe there may be a couple other problems that need to be solved before 1.9 works, though. |
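For anyone porting similar code: a minimal sketch of the usual replacement for a `Thread.critical=` section, which is an explicit `Mutex`. This is not god's actual code; the `SafeCounter` class and its method names are hypothetical and only illustrate the pattern.

```ruby
require 'thread'

# Hypothetical example class (not part of god): a counter whose updates
# are guarded by a Mutex rather than by toggling Thread.critical=.
class SafeCounter
  def initialize
    @lock = Mutex.new
    @count = 0
  end

  # Only one thread at a time may run the block passed to synchronize.
  def increment
    @lock.synchronize { @count += 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

counter = SafeCounter.new
threads = 4.times.map do
  Thread.new { 1000.times { counter.increment } }
end
threads.each(&:join)
puts counter.count
```

Unlike `Thread.critical=`, a mutex only excludes threads that take the same lock, so every code path touching the shared state has to go through it.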
I'm getting this with 0.8.0 on Ruby 1.9.1
I've not had a chance to dig into it though, but I'll have a look over the next few days. |
Bump. I think this should be promoted to critical... |
igrigorik: Did the fix you tried end up working? |
Partially. I installed your branch, which introduces compat19.rb, and grabbed thread.rb from the latest head of ruby 1.9 to make it work. That solved the "driver loop" exception shown above, but there are other bugs still. Namely, god appears to hang indefinitely when it tries to fork a new process: I see the child that it's trying to spawn in the tree, and it appears to be stuck in that state. (It doesn't happen every time, and unfortunately I don't know the exact conditions under which it happens.) |
Could you attach gdb to the process and give the output of running "where"? That's odd that it would be hanging there — there isn't a lot that it does before running exec(). |
Attaching to the god process itself yields:

(gdb) where
#0 0x00002b9122f581ad in pthread_cond_destroy@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000051622e in terminate_atfork_i (key=, val=, current_th=) at thread_pthread.c:178
#2 0x00000000004b00b9 in st_foreach (table=0x7dc8f0, func=0x516120 , arg=23970512) at st.c:708
#3 0x0000000000513d88 in rb_thread_atfork () at thread.c:2649
#4 0x000000000046f60e in rb_f_fork (obj=) at process.c:2618
#5 0x00000000004fedef in vm_call_cfunc (th=0x16dc2d0, reg_cfp=0x2aaaae83dc48, num=0, recv=16886520, blockptr=0x2aaaae83dc70, flag=202, me=0x8bbdd0) at vm_insnhelper.c:385
#6 0x0000000000511b43 in vm_call_method (th=0x16dc2d0, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#7 0x00000000005045d6 in vm_exec_core (th=0x16dc2d0, initial=) at insns.def:994
#8 0x0000000000509a04 in vm_exec (th=0x16dd5b8) at vm.c:1099
#9 0x000000000050a0c3 in invoke_block_from_c (th=0x16dc2d0, block=0x2aaaae83df30, self=16897880, argc=0, argv=0x0, blockptr=0x0, cref=0x0) at vm.c:543
#10 0x000000000050a948 in loop_i () at vm.c:573
#11 0x0000000000416ae9 in rb_rescue2 (b_proc=0x50a910 , data1=0, r_proc=0, data2=0) at eval.c:574
#12 0x00000000004fe739 in rb_f_loop (self=16897880) at vm_eval.c:605
#13 0x00000000004fedef in vm_call_cfunc (th=0x16dc2d0, reg_cfp=0x2aaaae83df08, num=0, recv=16897880, blockptr=0x2aaaae83df30, flag=202, me=0x864210) at vm_insnhelper.c:385
#14 0x0000000000511b43 in vm_call_method (th=0x16dc2d0, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#15 0x00000000005045d6 in vm_exec_core (th=0x16dc2d0, initial=) at insns.def:994
#16 0x0000000000509a04 in vm_exec (th=0x16dd5b8) at vm.c:1099
#17 0x000000000050a0c3 in invoke_block_from_c (th=0x16dc2d0, block=0x142fd40, self=16897880, argc=0, argv=0x0, blockptr=0x0, cref=0x0) at vm.c:543
#18 0x000000000050a5d2 in rb_vm_invoke_proc (th=0x16dc2d0, proc=0x142fd40, self=16897880, argc=0, argv=0x101ac70, blockptr=0x0) at vm.c:590
#19 0x00000000005197d5 in thread_start_func_2 (th=0x16dc2d0, stack_start=0x40613120) at thread.c:387
#20 0x00000000005199ee in thread_start_func_1 (th_ptr=0x16dd5b8) at thread_pthread.c:351
#21 0x00002b9122f541b5 in start_thread () from /lib64/libpthread.so.0
#22 0x00002b9123afa36d in clone () from /lib64/libc.so.6
#23 0x0000000000000000 in ?? ()

Also, strace:

Process 10887 attached - interrupt to quit
futex(0x16dd5b8, FUTEX_WAIT, 2, NULL
Process 10887 detached

When it's in this state, the process is completely unresponsive: god log / status, etc., all return "server not available". |
I've committed another method to remove the need to grab thread.rb. It is available here: http://github.com/eric/god/tree/condition-variable-1.9-support

I'm still trying to figure out why it would be hanging here. |
Eric, I don't think that solved it. I'm trying to find the actual sequence that reproduces the error, but so far this is the pattern I'm seeing: god starts up fine and loads all the processes it should. Then I start randomly killing processes (kill -9) to get god to bring them back up. That works well for the most part, except every once in a while it gets stuck. In ps I can see that god is trying to fork a process, but it's stuck in that state. Below is the output for the god process itself and for the child that it's trying to fork.

God process itself:
-------------------
(gdb) where
#0 0x00002aec325a1376 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000516561 in native_sleep (th=0x7dc500, tv=0x0) at thread_pthread.c:618
#2 0x0000000000519edc in thread_join_sleep (arg=) at thread.c:770
#3 0x0000000000416714 in rb_ensure (b_proc=0x519e80 , data1=140737486727824, e_proc=0x513170 , data2=140737486727824) at eval.c:671
#4 0x0000000000514c7c in thread_join (target_th=0x1830c70, delay=1e+30) at thread.c:637
#5 0x0000000000514da2 in thread_join_m (argc=, argv=, self=) at thread.c:718
#6 0x00000000004fedef in vm_call_cfunc (th=0x7dc500, reg_cfp=0x2aec324a5a90, num=0, recv=15750920, blockptr=0x0, flag=0, me=0x8d9b60) at vm_insnhelper.c:385
#7 0x0000000000511b43 in vm_call_method (th=0x7dc500, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#8 0x00000000005045d6 in vm_exec_core (th=0x7dc500, initial=) at insns.def:994
#9 0x0000000000509a04 in vm_exec (th=0x7dc594) at vm.c:1099
#10 0x000000000050a0c3 in invoke_block_from_c (th=0x7dc500, block=0x1619f80, self=8631080, argc=0, argv=0x0, blockptr=0x0, cref=0x0) at vm.c:543
#11 0x000000000050a5d2 in rb_vm_invoke_proc (th=0x7dc500, proc=0x1619f80, self=8631080, argc=0, argv=0x10456a0, blockptr=0x0) at vm.c:590
#12 0x0000000000418ced in rb_exec_end_proc () at eval_jump.c:134
#13 0x0000000000418df4 in ruby_finalize_0 () at eval.c:107
#14 0x0000000000418ea9 in ruby_cleanup (ex=0) at eval.c:142
#15 0x0000000000419209 in ruby_stop (ex=8242580) at eval.c:217
#16 0x000000000046f65d in rb_f_fork (obj=) at process.c:2623
#17 0x00000000004fedef in vm_call_cfunc (th=0x7dc500, reg_cfp=0x2aec324a5c48, num=0, recv=16884920, blockptr=0x2aec324a5c70, flag=0, me=0x8bbdd0) at vm_insnhelper.c:385
#18 0x0000000000511b43 in vm_call_method (th=0x7dc500, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#19 0x00000000005045d6 in vm_exec_core (th=0x7dc500, initial=) at insns.def:994
#20 0x0000000000509a04 in vm_exec (th=0x7dc594) at vm.c:1099
#21 0x000000000050be7b in vm_call0 (th=0x7dc500, recv=16884920, id=448, argc=1, argv=0x2aec323a60b0, me=0x1217660) at vm_eval.c:62
#22 0x00000000005105cd in rb_funcall2 (recv=16884920, mid=448, argc=1, argv=0x2aec323a60b0) at vm_eval.c:266
#23 0x0000000000449402 in rb_class_new_instance (argc=1, argv=0x2aec323a60b0, klass=) at object.c:1494
#24 0x00000000004fedef in vm_call_cfunc (th=0x7dc500, reg_cfp=0x2aec324a5e00, num=1, recv=16885080, blockptr=0x0, flag=0, me=0x84e330) at vm_insnhelper.c:385
#25 0x0000000000511b43 in vm_call_method (th=0x7dc500, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#26 0x00000000005045d6 in vm_exec_core (th=0x7dc500, initial=) at insns.def:994
#27 0x0000000000509a04 in vm_exec (th=0x7dc594) at vm.c:1099
#28 0x0000000000509d71 in rb_iseq_eval (iseqval=16382880) at vm.c:1296
#29 0x000000000054e875 in rb_load_internal (fname=16492120, wrap=) at load.c:293
#30 0x000000000054e9cc in rb_f_load (argc=, argv=) at load.c:366
#31 0x00000000004fedef in vm_call_cfunc (th=0x7dc500, reg_cfp=0x2aec324a5f08, num=1, recv=8631080, blockptr=0x0, flag=0, me=0x8c61f0) at vm_insnhelper.c:385
#32 0x0000000000511b43 in vm_call_method (th=0x7dc500, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#33 0x00000000005045d6 in vm_exec_core (th=0x7dc500, initial=) at insns.def:994

Child process:
--------------
(gdb) where
#0 0x00002aec325a11ad in pthread_cond_destroy@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x000000000051622e in terminate_atfork_i (key=, val=, current_th=) at thread_pthread.c:178
#2 0x00000000004b00b9 in st_foreach (table=0x7dc8f0, func=0x516120 , arg=23356528) at st.c:708
#3 0x0000000000513d88 in rb_thread_atfork () at thread.c:2649
#4 0x000000000046f60e in rb_f_fork (obj=) at process.c:2618
#5 0x00000000004fedef in vm_call_cfunc (th=0x1646470, reg_cfp=0x2aaaae63bb98, num=0, recv=17068080, blockptr=0x2aaaae63bbc0, flag=202, me=0x8bbdd0) at vm_insnhelper.c:385
#6 0x0000000000511b43 in vm_call_method (th=0x1646470, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#7 0x00000000005045d6 in vm_exec_core (th=0x1646470, initial=) at insns.def:994
#8 0x0000000000509a04 in vm_exec (th=0x1573918) at vm.c:1099
#9 0x000000000050a0c3 in invoke_block_from_c (th=0x1646470, block=0x2aaaae63bf30, self=17092960, argc=0, argv=0x0, blockptr=0x0, cref=0x0) at vm.c:543
#10 0x000000000050a948 in loop_i () at vm.c:573
#11 0x0000000000416ae9 in rb_rescue2 (b_proc=0x50a910 , data1=0, r_proc=0, data2=0) at eval.c:574
#12 0x00000000004fe739 in rb_f_loop (self=17092960) at vm_eval.c:605
#13 0x00000000004fedef in vm_call_cfunc (th=0x1646470, reg_cfp=0x2aaaae63bf08, num=0, recv=17092960, blockptr=0x2aaaae63bf30, flag=202, me=0x864210) at vm_insnhelper.c:385
#14 0x0000000000511b43 in vm_call_method (th=0x1646470, cfp=dwarf2_read_address: Corrupted DWARF expression. ) at vm_insnhelper.c:510
#15 0x00000000005045d6 in vm_exec_core (th=0x1646470, initial=) at insns.def:994
#16 0x0000000000509a04 in vm_exec (th=0x1573918) at vm.c:1099
#17 0x000000000050a0c3 in invoke_block_from_c (th=0x1646470, block=0x1815bb0, self=17092960, argc=0, argv=0x0, blockptr=0x0, cref=0x0) at vm.c:543
#18 0x000000000050a5d2 in rb_vm_invoke_proc (th=0x1646470, proc=0x1815bb0, self=17092960, argc=0, argv=0x10470b8, blockptr=0x0) at vm.c:590
#19 0x00000000005197d5 in thread_start_func_2 (th=0x1646470, stack_start=0x40511120) at thread.c:387
#20 0x00000000005199ee in thread_start_func_1 (th_ptr=0x1573918) at thread_pthread.c:351
#21 0x00002aec3259d1b5 in start_thread () from /lib64/libpthread.so.0
#22 0x00002aec3314336d in clone () from /lib64/libc.so.6
#23 0x0000000000000000 in ?? () |
A few more observations: if I kill the forked process, god responds to commands again, but it doesn't bring up the process on which it blocked in the first place. However, if I issue an explicit "god restart proc", the process does seem to come back. I'm still trying to figure out the actual pattern. |
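In case it helps with narrowing this down, here is a rough sketch of a standalone stress script for the pattern in the traces above: fork from a non-main thread while other threads sit in a condition-variable wait. It is an assumption that this is close enough to what god does to trigger the hang; the helper name `fork_with_deadline`, the thread count, and the 5-second deadline are all made up for illustration.

```ruby
require 'timeout'

lock = Mutex.new
cond = ConditionVariable.new
done = false

# Background threads parked on a condition variable, so the child has
# thread state to tear down right after fork.
waiters = 3.times.map do
  Thread.new do
    lock.synchronize { cond.wait(lock) until done }
  end
end

# Hypothetical helper: fork a child that exits immediately and report
# whether it finished within the deadline.
def fork_with_deadline(seconds)
  pid = fork { exit!(0) }
  begin
    Timeout.timeout(seconds) { Process.wait(pid) }
    $?.success? ? :ok : :failed
  rescue Timeout::Error
    # The child never exited: kill it so it does not linger, then reap it.
    Process.kill('KILL', pid)
    Process.wait(pid)
    :hung
  end
end

# Fork from a secondary thread, as the backtraces show the fork
# happening under thread_start_func_1.
results = Thread.new { 5.times.map { fork_with_deadline(5) } }.value

# Release the waiters so the script exits cleanly.
lock.synchronize do
  done = true
  cond.broadcast
end
waiters.each(&:join)

puts results.inspect
```

Any `:hung` entry would mean a child got stuck before reaching `exit!`. Since the hang is intermittent, a handful of iterations may well not be enough to see it on an affected build.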
This is the same issue as #8. |
The Ruby 1.9 mutex patches are now in God 0.9.0. If anyone has additional feedback on how well it works now, I'd much appreciate it! |
It works fine for a while, but eventually:
|
Bump. Same issue, still there on 1.9.2 p0, another trace: |
Good news: we have a Ruby patch which addresses this. Full details: (unfortunately the ruby-lang Redmine is down atm) |
Is this fix included in the new 1.9.2 release (p180)? |
Unfortunately, I don't see it in the changelog @ http://svn.ruby-lang.org/repos/ruby/tags/v1_9_2_180/ChangeLog. For reference: http://redmine.ruby-lang.org/repositories/revision/ruby-19?rev=30272 |
Attempting to run god with Ruby 1.9.1 results in the following error: