Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resque workers start hanging with activated ddtrace #466

Closed
NobodysNightmare opened this issue Jun 18, 2018 · 21 comments
Closed

Resque workers start hanging with activated ddtrace #466

NobodysNightmare opened this issue Jun 18, 2018 · 21 comments
Assignees
Labels
bug Involves a bug community Was opened by a community member
Milestone

Comments

@NobodysNightmare
Copy link

NobodysNightmare commented Jun 18, 2018

We are currently in the process of rolling out datadog over our applications. However, we encountered a quite severe issue which we believe we tracked down to the ddtrace gem.

Observed behaviour

Resque workers will eventually end in a state where they are deadlocked. This means the worker is alive, but the processing of the current job never finishes.
Subjectively this happens faster for workers that process lots of jobs (always have a full queue) than for workers that have not as much to do.

strace tells us, that the deadlocked processes are waiting for a user space mutex. One example:

$ strace -p 8806
Process 8806 attached - interrupt to quit
futex(0x7fc970d543c0, FUTEX_WAIT_PRIVATE, 2, NULL

When we remove the ddtrace gem and its configuration from our project, we don't get any deadlocking resque workers.

So far we observed this behaviour only on resque workers. We got no timeouts whatsoever for our web servers.

Environment

These are the versions of a few gems in that project:

ddtrace (0.12.1)
rails (3.2.22.5)
redis (3.3.3)
resque (1.27.4)

ruby 2.2.4

These are the specs for the application where we observed this issue, I will try to reproduce it in another environment with more recent ruby/rails versions.

edit: Reproduced the error on more environments by now

ddtrace (0.12.1)
ruby 2.2.4
rails (4.2.7.1)
redis (3.3.5)
resque (1.27.4)
ddtrace (0.13.0)
ruby 2.5.0
rails (5.1.6)
redis (4.0.1)
resque (1.27.4)

This is our datadog initializer:

yml_config = YAML.safe_load(File.read("#{Rails.root}/config/datadog.yml"))
                 .fetch(Rails.env)
                 .symbolize_keys

service = ->(name) { "#{yml_config[:service_name]}-#{name}" }

Datadog.configure do |c|
  c.tracer enabled: yml_config[:enabled]
  c.use :rails, service_name: yml_config[:service_name]
  c.use :graphql, service_name: service.call('graphql'),
                  schemas: [KaeuferportalSchema]
  c.use :http, service_name: service.call('external')
  c.use :redis, service_name: service.call('redis')

  # Note: The error also occurs with the line below being deleted
  c.use :resque, service_name: service.call('resque'), workers: [ApplicationJob]
end

Bisecting

We updated ddtrace from 0.12.0 to 0.12.1 when encountering the issue, so both versions are affected for us.

Before the problems started we already had ddtrace running on the affected application, but in version 0.11.4 and without explicitly activating any integration (thus just rails + whatever that loads automatically were active). In that state we have not had any problems.

Thanks for your support, tell me if you need any further information.

@delner
Copy link
Contributor

delner commented Jun 18, 2018

@NobodysNightmare Thanks for the report... I'm bummed to hear this is happening. I'll take a look into this, see if we can figure out what's going on. Also the detail you provided is really helpful! We'll follow up here with more questions as we dig in deeper.

@delner delner added bug Involves a bug community Was opened by a community member labels Jun 18, 2018
@delner
Copy link
Contributor

delner commented Jun 18, 2018

Could you tell me more about what these jobs are roughly doing? How long do they run? What integrations are they expected to trace? If you were able to estimate how many spans they generate, what would that number be?

Anything that can help us replicate the bug will make this immensely easier and quicker to fix.

@NobodysNightmare
Copy link
Author

NobodysNightmare commented Jun 19, 2018

It really is a mixed bag of jobs that blocks. For example calculation jobs or mail delivery jobs. All are interacting with a SQL database in one way or another.

The expected integrations for those jobs:

  • resque (it's resque jobs)
  • mysql
  • http for some jobs
    • edit: I actually can't find spans for http in traces where I would have expected them... not sure if that points towards or away from this integration as a suspect, though...
  • redis might be possible for some jobs... but rather unlikely to affect many jobs

Span count

For one job class I clicked through the traces and it uses either 14 or 22 spans. So nothing extraordinary large. To clarify: The jobs run fine until they get blocked, this is why I can look up their span counts.

Runtime

Regarding runtime: They are usually well below one second (this is the job class I clicked through earlier):
image

Though above the 99th percentile (1.8 seconds) the times go up to a maximum of a minute. The hightest trace sample I could get was at 1.7 seconds and had the "usual" 22 spans, so nothing extra ordinary there either :(

Note: This specific class of jobs is enqueued once per day and then we have a few ten thousand jobs queued up.

More bisecting

We are currently evaluating whether ddtrace 0.11.4 with a larger configuration exhibits the same behaviour or whether it runs fine without any blocking.

I will also try to reproduce the error in a different application.

@frsantos
Copy link

@NobodysNightmare Maybe rbspy can help you identify what your workers are doing when they hang.

@marcusmalmberg
Copy link

We've also experienced problems with resque. Web requests and spans worked as expected.

We tried enabling this integration when installing the 0.12 beta gem. Since it wasn't something we had enabled before I just disabled it and moved on until I could take another look at it, as it was only a "nice to have" feature then.

Jobs that normally takes a couple of ms to complete took forever to finish (but I think they actually finished eventually and wasn't timed out). Sometimes the jobs completes as expected and sometimes not. Since it was a few months ago I don't have any numbers or traces available anymore.

As far as I can remember there's no typical kind job that fails. Some jobs do http requests, some connect to redis, some to mysql and some sending emails etc.

Our config at the time:

ddtrace (0.12.0.beta2)
ruby 2.5.0p0
rails (5.1.4)
redis (4.0.1)
resque (1.27.4)
Datadog.configure do |c|
  c.tracer env: Rails.env, enabled: (Rails.env.production? || Rails.env.staging?)
  c.use :resque
  c.use :grape
  c.use :rails
  c.use :graphql, schemas: [HittamaklareSchema]
end

@NobodysNightmare
Copy link
Author

NobodysNightmare commented Jun 19, 2018

@delner So far it looks like 0.11.4 is running fine. I'll update you if I see it fail.

edit: I could provoke lockups with 0.11.4 now too. The only difference that's left for me is which integrations we had enabled before... I am now going to try and gradually disable/enable integrations. Stay tuned...

@NobodysNightmare
Copy link
Author

We were still able to reproduce the issue with

Datadog.configure do |c|
  c.tracer enabled: yml_config[:enabled]
  c.use :rails, service_name: yml_config[:service_name]
  c.use :resque, service_name: service.call('resque'), workers: [ApplicationJob]
end

and so far it did not occur again using

Datadog.configure do |c|
  c.tracer enabled: yml_config[:enabled]
  c.use :rails, service_name: yml_config[:service_name]
end

I would give it a few more hours to validate the error is gone and then probably try re-enabling other integrations again. I am still not sure whether this is about a specific integration or more about having multiple of them active at the same time.

We did not yet test another application, hopefully that will start tomorrow...

@pawelchcki
Copy link
Contributor

Thanks for such detailed report and for continued debugging @NobodysNightmare!
I've been working on reproducing this issue locally, but so far my resque workers refuse to hang.

If you would be able to provide us with a backtrace of locked thread in a hung worker that would be of great help.

I think this gem could be of some assistance - https://github.com/frsyuki/sigdump - it traps a signal, and outputs all threads backtraces to a file. With resque however there is one additional step where you'd need to change the dump from SIGCONT to e.g. SIGTSTP, as described in Gem's Readme.
If the Ruby VM itself is not locked then it should be possible to extract useful information this way.

However if the Ruby VM is locked. Then in such case its still possible to extract some information using gdb command thread apply all bt.

@NobodysNightmare
Copy link
Author

To give some update for today:

  • I tried reproducing the problem locally and of course it only happens in production
  • deploying sigdump did not help, I wasn't able to obtain a dump once a worker was locked up (which is probably already a bad sign)
    • however, I will confirm with our operations engineers that we had no other problem, such as missing write permissions. since signdump rescues all errors I wouldn't have noticed
  • I will discuss with the other engineers if we are able to obtain a gdb dump
  • I also deployed ddtrace to another application that has a high throughput of resque jobs... let's see if we find anything there...

@pawelchcki
Copy link
Contributor

Thanks for checking it @NobodysNightmare! If sigdump is not working then that is a useful info in itself. I'll dig into what kind of locks we use beside GVL when running a worker.

@NobodysNightmare
Copy link
Author

Hey there,

we did some more experimental rollouts. While we could not see the bug on every project we deployed, we could reproduce it (in production) on the following environments:

ddtrace (0.12.1)
ruby 2.2.4
rails (4.2.7.1)
redis (3.3.5)
resque (1.27.4)
ddtrace (0.13.0)
ruby 2.5.0
rails (5.1.6)
redis (4.0.1)
resque (1.27.4)

@pawelchcki
Copy link
Contributor

@NobodysNightmare thanks for continuing working on this!

While I still wasn't able to reliably cause worker to hang locally. After looking at the code closely I have a strong suspicion that this could be caused directly or indirectly by our special trace writer we only use in Resque - and this problem seems to only show up in Resque integration.

Its the only special component (SyncWriter) in Resque integration that allows us to flush the traces faster, ensuring that they are delivered to the agent before Resque worker shuts down.

I've created this PR #474 that disables that special writer and instead flushes all data at the job exit. The difference is that its uses less resources (in some cases - a lot less) than the special SyncWriter.

If you would be able to try out the changes from that PR and see if the problem still persists, it would allow us to eliminate one possible source.

@NobodysNightmare
Copy link
Author

@pawelchcki I am going to include that branch in our Gemfile and see if the error can still be reproduced.

Please make sure not to remove the branch and the tip commit without letting us know, since our Gemfile (and therefore bundler) will now depend on that one temporarily.

@pawelchcki
Copy link
Contributor

Will do @NobodysNightmare.

@NobodysNightmare
Copy link
Author

I have bad news:

$ strace -p 12344
Process 12344 attached - interrupt to quit
futex(0x7f82734e33c0, FUTEX_WAIT_PRIVATE, 2, NULL

Whatever the cause is, apparently we did not yet find it. Thank you very much for your continued support.

@NobodysNightmare
Copy link
Author

We now obtained a dump using gdb. I really hope that you can make more use of it than I can. The dump was obtained from the stale child worker process.

Thread 2 (Thread 0x7f8279a0a700 (LWP 12346)):
#0  0x00007f8279bdf363 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f827ab57b0f in timer_thread_sleep (gvl=0xed3f28) at thread_pthread.c:1442
#2  thread_timer (p=0xed3f28) at thread_pthread.c:1549
#3  0x00007f827a764b50 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4  0x00007f8279be9fbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f827b043700 (LWP 12344)):
#0  0x00007f827a76bd5c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f827a7673a9 in _L_lock_926 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f827a7671cb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007f82732dd4a4 in _nss_files_gethostbyname2_r () from /lib/x86_64-linux-gnu/libnss_files.so.2
#4  0x00007f8279c00e75 in gethostbyname2_r () from /lib/x86_64-linux-gnu/libc.so.6
#5  0x00007f8279bc61a2 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#6  0x00007f8279bc86c2 in getaddrinfo () from /lib/x86_64-linux-gnu/libc.so.6
#7  0x00007f8278bdb8b8 in nogvl_getaddrinfo (arg=<optimized out>) at raddrinfo.c:162
#8  0x00007f827ab5deb2 in call_without_gvl (data2=<optimized out>, ubf=<optimized out>, data1=0x7ffe00815690, 
    func=0x7f8278bdb8a0 <nogvl_getaddrinfo>, fail_if_interrupted=<optimized out>) at thread.c:1274
#9  rb_thread_call_without_gvl (func=func@entry=0x7f8278bdb8a0 <nogvl_getaddrinfo>, data1=data1@entry=0x7ffe00815690, 
    ubf=ubf@entry=0xffffffffffffffff, data2=data2@entry=0x0) at thread.c:1384
#10 0x00007f8278bdc805 in rb_getaddrinfo (node=node@entry=0x7ffe00815740 "raynor-b.kaeuferportal.eu", service=service@entry=0x0, 
    hints=hints@entry=0x7ffe00815c00, res=res@entry=0x7ffe00815700) at raddrinfo.c:280
#11 0x00007f8278bdd6be in rsock_getaddrinfo (host=<optimized out>, port=<optimized out>, hints=hints@entry=0x7ffe00815c00, 
    socktype_hack=socktype_hack@entry=0) at raddrinfo.c:481
#12 0x00007f8278bd3234 in sock_s_getaddrinfo (argc=<optimized out>, argv=<optimized out>) at socket.c:1316
#13 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b089f0, th=0xed45f0) at vm_insnhelper.c:1380
#14 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b089f0, ci=ci@entry=0x5df1110) at vm_insnhelper.c:1473
#15 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b089f0, ci=0x5df1110) at vm_insnhelper.c:1689
#16 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#17 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#18 0x00007f827ab46064 in vm_call0_body (th=0xed45f0, ci=ci@entry=0x7ffe00816210, argv=argv@entry=0x7f8279a0bff8) at vm_eval.c:180
#19 0x00007f827ab46367 in vm_call0 (defined_class=<optimized out>, me=0x5cce280, argv=0x7f8279a0bff8, argc=3, id=3041, recv=187435160, 
    th=<optimized out>) at vm_eval.c:59
#20 rb_call0 (recv=187435160, mid=3041, argc=3, argv=0x7f8279a0bff8, scope=scope@entry=CALL_FCALL, self=62451440) at vm_eval.c:349
#21 0x00007f827ab4818a in rb_call (scope=CALL_FCALL, argv=<optimized out>, argc=<optimized out>, mid=<optimized out>, recv=<optimized out>)
    at vm_eval.c:830
#22 rb_funcallv (recv=<optimized out>, mid=<optimized out>, argc=<optimized out>, argv=<optimized out>) at vm_eval.c:831
#23 0x00007f827aa5d160 in rb_class_new_instance (argc=3, argv=0x7f8279a0bff8, klass=<optimized out>) at object.c:1862
#24 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b08a90, th=0xed45f0) at vm_insnhelper.c:1380
#25 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b08a90, ci=ci@entry=0x5939db0) at vm_insnhelper.c:1473
#26 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b08a90, ci=0x5939db0) at vm_insnhelper.c:1689
#27 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#28 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#29 0x00007f827ab4af9b in invoke_block_from_c (splattable=1, defined_class=<optimized out>, cref=0x0, blockptr=0x0, argv=<optimized out>, 
    argc=<optimized out>, self=121660720, block=0x7f8279b08ba0, th=0xed45f0) at vm.c:813
#30 vm_yield (th=0xed45f0, argc=<optimized out>, argv=<optimized out>) at vm.c:853
#31 0x00007f827ab4b0e0 in rb_yield_0 (argc=<optimized out>, argv=<optimized out>) at vm_eval.c:993
#32 rb_yield (val=val@entry=52) at vm_eval.c:1000
#33 0x00007f827a9fc000 in rb_ensure (b_proc=0x7f827ab4b050 <rb_yield>, data1=52, e_proc=<optimized out>, data2=<optimized out>) at eval.c:907
#34 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b08b80, th=0xed45f0) at vm_insnhelper.c:1380
#35 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b08b80, ci=ci@entry=0x5ddf1c0) at vm_insnhelper.c:1473
#36 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b08b80, ci=0x5ddf1c0) at vm_insnhelper.c:1689
#37 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#38 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#39 0x00007f827ab4abf4 in invoke_block_from_c (splattable=1, defined_class=63429280, cref=0x0, blockptr=0x0, argv=0x0, argc=0, self=121660720, 
    block=<optimized out>, th=0xed45f0) at vm.c:813
#40 vm_yield (argv=0x0, argc=0, th=0xed45f0) at vm.c:853
#41 rb_yield_0 (argc=<optimized out>, argv=<optimized out>) at vm_eval.c:993
#42 loop_i () at vm_eval.c:1064
#43 0x00007f827a9fba92 in rb_rescue2 (b_proc=b_proc@entry=0x7f827ab4a850 <loop_i>, data1=data1@entry=0, r_proc=r_proc@entry=0, data2=data2@entry=0)
    at eval.c:808
#44 0x00007f827ab3634e in rb_f_loop (self=121660720) at vm_eval.c:1098
#45 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b08c20, th=0xed45f0) at vm_insnhelper.c:1380
#46 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b08c20, ci=ci@entry=0x5bd3c50) at vm_insnhelper.c:1473
#47 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b08c20, ci=0x5bd3c50) at vm_insnhelper.c:1689
#48 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#49 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#50 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0xd1057e0, self=121838760, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=33108720, splattable=0) at vm.c:813
#51 0x00007f827ab47167 in vm_invoke_proc (th=0xed45f0, proc=proc@entry=0xd1057e0, self=121838760, defined_class=33108720, argc=argc@entry=0, 
    argv=argv@entry=0x7f8279a0bef0, blockptr=0x0) at vm.c:878
#52 0x00007f827ab4724a in rb_vm_invoke_proc (th=<optimized out>, proc=proc@entry=0xd1057e0, argc=argc@entry=0, argv=argv@entry=0x7f8279a0bef0, 
    blockptr=<optimized out>) at vm.c:897
#53 0x00007f827aa031ad in proc_call (argc=0, argv=0x7f8279a0bef0, procval=187435400) at proc.c:731
#54 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b08d60, th=0xed45f0) at vm_insnhelper.c:1380
#55 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b08d60, ci=ci@entry=0x5c60028) at vm_insnhelper.c:1473
#56 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b08d60, ci=0x5c60028) at vm_insnhelper.c:1689
#57 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#58 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#59 0x00007f827ab46064 in vm_call0_body (th=0xed45f0, ci=ci@entry=0x7ffe00817fb0, argv=argv@entry=0x0) at vm_eval.c:180
#60 0x00007f827ab46367 in vm_call0 (defined_class=<optimized out>, me=0x5cc3280, argv=0x0, argc=0, id=2945, recv=187436880, th=<optimized out>)
    at vm_eval.c:59
#61 rb_call0 (recv=187436880, mid=2945, argc=0, argv=0x0, scope=<optimized out>, self=187436880) at vm_eval.c:349
#62 0x00007f827ab3a5f9 in rb_iterate (it_proc=it_proc@entry=0x7f827ab48b40 <iterate_method>, data1=data1@entry=140728906908128, 
    bl_proc=bl_proc@entry=0x7f827a9e8920 <collect_i>, data2=data2@entry=187436600) at vm_eval.c:1136
#63 0x00007f827ab3a7bb in rb_block_call (obj=obj@entry=187436880, mid=mid@entry=2945, argc=argc@entry=0, argv=argv@entry=0x0, 
    bl_proc=bl_proc@entry=0x7f827a9e8920 <collect_i>, data2=data2@entry=187436600) at vm_eval.c:1205
#64 0x00007f827a9e937d in enum_collect (obj=187436880) at enum.c:433
#65 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b090d0, th=0xed45f0) at vm_insnhelper.c:1380
#66 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b090d0, ci=ci@entry=0x39ebd38) at vm_insnhelper.c:1473
#67 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b090d0, ci=0x39ebd38) at vm_insnhelper.c:1689
#68 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#69 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#70 0x00007f827ab46064 in vm_call0_body (th=0xed45f0, ci=ci@entry=0x7ffe008187d0, argv=argv@entry=0x7f8279a0bcd0) at vm_eval.c:180
#71 0x00007f827ab46367 in vm_call0 (defined_class=<optimized out>, me=0x621d6a0, argv=0x7f8279a0bcd0, argc=3, id=3041, recv=187437240, 
    th=<optimized out>) at vm_eval.c:59
#72 rb_call0 (recv=187437240, mid=3041, argc=3, argv=0x7f8279a0bcd0, scope=scope@entry=CALL_FCALL, self=102778600) at vm_eval.c:349
#73 0x00007f827ab4818a in rb_call (scope=CALL_FCALL, argv=<optimized out>, argc=<optimized out>, mid=<optimized out>, recv=<optimized out>)
    at vm_eval.c:830
#74 rb_funcallv (recv=<optimized out>, mid=<optimized out>, argc=<optimized out>, argv=<optimized out>) at vm_eval.c:831
#75 0x00007f827aa5d160 in rb_class_new_instance (argc=3, argv=0x7f8279a0bcd0, klass=<optimized out>) at object.c:1862
#76 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b09210, th=0xed45f0) at vm_insnhelper.c:1380
#77 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b09210, ci=ci@entry=0x61e22c0) at vm_insnhelper.c:1473
#78 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b09210, ci=0x61e22c0) at vm_insnhelper.c:1689
#79 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#80 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#81 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x7f8279b09780, self=187486400, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=108319640, splattable=1) at vm.c:813
#82 0x00007f827ab4b0c1 in vm_yield (argv=0x7ffe00818e98, argc=1, th=<optimized out>) at vm.c:853
#83 rb_yield_0 (argv=0x7ffe00818e98, argc=1) at vm_eval.c:993
#84 rb_yield (val=140198256980928, val@entry=187498240) at vm_eval.c:1003
#85 0x00007f827aa5a979 in rb_obj_tap (obj=187498240) at object.c:687
#86 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b09760, th=0xed45f0) at vm_insnhelper.c:1380
#87 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b09760, ci=ci@entry=0x67286c8) at vm_insnhelper.c:1473
#88 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b09760, ci=0x67286c8) at vm_insnhelper.c:1689
#89 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#90 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#91 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x7f8279b09910, self=61390440, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=41796480, splattable=1) at vm.c:813
#92 0x00007f827ab4b0c1 in vm_yield (argv=0x7ffe008194b8, argc=1, th=<optimized out>) at vm.c:853
#93 rb_yield_0 (argv=0x7ffe008194b8, argc=1) at vm_eval.c:993
#94 rb_yield (val=140198256980928, val@entry=187498240) at vm_eval.c:1003
#95 0x00007f827aa5a979 in rb_obj_tap (obj=187498240) at object.c:687
#96 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b098f0, th=0xed45f0) at vm_insnhelper.c:1380
#97 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b098f0, ci=ci@entry=0x66e0e68) at vm_insnhelper.c:1473
#98 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b098f0, ci=0x66e0e68) at vm_insnhelper.c:1689
#99 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#100 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#101 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0xae93400, self=172014960, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=69295520, splattable=0) at vm.c:813
#102 0x00007f827ab47167 in vm_invoke_proc (th=0xed45f0, proc=proc@entry=0xae93400, self=172014960, defined_class=69295520, argc=argc@entry=0, 
    argv=argv@entry=0x7f8279a0b830, blockptr=0x0) at vm.c:878
#103 0x00007f827ab4724a in rb_vm_invoke_proc (th=<optimized out>, proc=proc@entry=0xae93400, argc=argc@entry=0, argv=argv@entry=0x7f8279a0b830, 
    blockptr=<optimized out>) at vm.c:897
#104 0x00007f827aa031ad in proc_call (argc=0, argv=0x7f8279a0b830, procval=171626120) at proc.c:731
#105 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b09c60, th=0xed45f0) at vm_insnhelper.c:1380
#106 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b09c60, ci=ci@entry=0x6a2bdf0) at vm_insnhelper.c:1473
#107 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b09c60, ci=0x6a2bdf0) at vm_insnhelper.c:1689
#108 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#109 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#110 0x00007f827ab4af9b in invoke_block_from_c (splattable=1, defined_class=<optimized out>, cref=0x0, blockptr=0x0, argv=<optimized out>, 
    argc=<optimized out>, self=104338160, block=0x7f8279b09dc0, th=0xed45f0) at vm.c:813
#111 vm_yield (th=0xed45f0, argc=<optimized out>, argv=<optimized out>) at vm.c:853
#112 0x00007f827ab4b0e0 in rb_yield_0 (argc=<optimized out>, argv=<optimized out>) at vm_eval.c:993
#113 rb_yield (val=val@entry=52) at vm_eval.c:1000
#114 0x00007f827a9fbd9c in rb_protect (proc=0x7f827ab4b050 <rb_yield>, data=data@entry=52, state=state@entry=0x7ffe0081a52c) at eval.c:876
#115 0x00007f827aa8361a in rb_f_fork (obj=<optimized out>) at process.c:3687
#116 rb_f_fork (obj=<optimized out>) at process.c:3675
#117 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b09da0, th=0xed45f0) at vm_insnhelper.c:1380
#118 vm_call_cfunc (th=0xed45f0, reg_cfp=0x7f8279b09da0, ci=<optimized out>) at vm_insnhelper.c:1473
#119 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#120 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#121 0x00007f827ab4abf4 in invoke_block_from_c (splattable=1, defined_class=112066200, cref=0x0, blockptr=0x0, argv=0x0, argc=0, self=104338160, 
    block=<optimized out>, th=0xed45f0) at vm.c:813
#122 vm_yield (argv=0x0, argc=0, th=0xed45f0) at vm.c:853
#123 rb_yield_0 (argc=<optimized out>, argv=<optimized out>) at vm_eval.c:993
#124 loop_i () at vm_eval.c:1064
#125 0x00007f827a9fba92 in rb_rescue2 (b_proc=b_proc@entry=0x7f827ab4a850 <loop_i>, data1=data1@entry=0, r_proc=r_proc@entry=0, data2=data2@entry=0)
    at eval.c:808
#126 0x00007f827ab3634e in rb_f_loop (self=104338160) at vm_eval.c:1098
#127 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b09ee0, th=0xed45f0) at vm_insnhelper.c:1380
#128 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b09ee0, ci=ci@entry=0x6a5a290) at vm_insnhelper.c:1473
#129 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b09ee0, ci=0x6a5a290) at vm_insnhelper.c:1689
#130 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#131 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#132 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x82dc190, self=15965320, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=15969400, splattable=0) at vm.c:813
#133 0x00007f827ab47167 in vm_invoke_proc (th=0xed45f0, proc=proc@entry=0x82dc190, self=15965320, defined_class=15969400, argc=argc@entry=2, 
    argv=argv@entry=0x7f8279a0b6b8, blockptr=0x0) at vm.c:878
#134 0x00007f827ab4724a in rb_vm_invoke_proc (th=<optimized out>, proc=proc@entry=0x82dc190, argc=argc@entry=2, argv=argv@entry=0x7f8279a0b6b8, 
    blockptr=<optimized out>) at vm.c:897
#135 0x00007f827aa031ad in proc_call (argc=2, argv=0x7f8279a0b6b8, procval=118501480) at proc.c:731
#136 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b09fd0, th=0xed45f0) at vm_insnhelper.c:1380
#137 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b09fd0, ci=ci@entry=0x220a1c0) at vm_insnhelper.c:1473
#138 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b09fd0, ci=0x220a1c0) at vm_insnhelper.c:1689
#139 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#140 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#141 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x7f8279b0a090, self=118500520, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=31311120, splattable=1) at vm.c:813
#142 0x00007f827ab4b0c1 in vm_yield (argv=0x7ffe0081b9f8, argc=1, th=<optimized out>) at vm.c:853
#143 rb_yield_0 (argv=0x7ffe0081b9f8, argc=1) at vm_eval.c:993
#144 rb_yield (val=140198256980928) at vm_eval.c:1003
#145 0x00007f827a9ac032 in rb_ary_each (array=118500240) at array.c:1814
#146 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0a070, th=0xed45f0) at vm_insnhelper.c:1380
#147 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0a070, ci=ci@entry=0x220a8b0) at vm_insnhelper.c:1473
#148 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0a070, ci=0x220a8b0) at vm_insnhelper.c:1689
#149 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#150 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#151 0x00007f827ab4af9b in invoke_block_from_c (splattable=1, defined_class=<optimized out>, cref=0x0, blockptr=0x0, argv=<optimized out>, 
    argc=<optimized out>, self=183035560, block=0x7f8279b0a2c0, th=0xed45f0) at vm.c:813
#152 vm_yield (th=0xed45f0, argc=<optimized out>, argv=<optimized out>) at vm.c:853
#153 0x00007f827ab4b0e0 in rb_yield_0 (argc=<optimized out>, argv=<optimized out>) at vm_eval.c:993
#154 rb_yield (val=val@entry=52) at vm_eval.c:1000
#155 0x00007f827a9fbd9c in rb_protect (proc=0x7f827ab4b050 <rb_yield>, data=data@entry=52, state=state@entry=0x7ffe0081c2ac) at eval.c:876
#156 0x00007f827aa8361a in rb_f_fork (obj=<optimized out>) at process.c:3687
#157 rb_f_fork (obj=<optimized out>) at process.c:3675
#158 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0a2a0, th=0xed45f0) at vm_insnhelper.c:1380
#159 vm_call_cfunc (th=0xed45f0, reg_cfp=0x7f8279b0a2a0, ci=<optimized out>) at vm_insnhelper.c:1473
#160 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#161 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#162 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x7f8279b0a360, self=183035560, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=115612960, splattable=1) at vm.c:813
#163 0x00007f827ab4b0c1 in vm_yield (argv=0x7ffe0081c7e8, argc=1, th=<optimized out>) at vm.c:853
#164 rb_yield_0 (argv=0x7ffe0081c7e8, argc=1) at vm_eval.c:993
#165 rb_yield (val=140198256980928, val@entry=5) at vm_eval.c:1003
#166 0x00007f827aa506ce in int_dotimes (num=9) at numeric.c:3868
#167 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0a340, th=0xed45f0) at vm_insnhelper.c:1380
#168 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0a340, ci=ci@entry=0x6e2d240) at vm_insnhelper.c:1473
#169 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0a340, ci=0x6e2d240) at vm_insnhelper.c:1689
#170 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#171 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#172 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x83aa5b0, self=15965320, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=15969400, splattable=0) at vm.c:813
#173 0x00007f827ab47167 in vm_invoke_proc (th=0xed45f0, proc=proc@entry=0x83aa5b0, self=15965320, defined_class=15969400, argc=argc@entry=2, 
    argv=argv@entry=0x7f8279a0b518, blockptr=0x0) at vm.c:878
#174 0x00007f827ab4724a in rb_vm_invoke_proc (th=<optimized out>, proc=proc@entry=0x83aa5b0, argc=argc@entry=2, argv=argv@entry=0x7f8279a0b518, 
    blockptr=<optimized out>) at vm.c:897
#175 0x00007f827aa031ad in proc_call (argc=2, argv=0x7f8279a0b518, procval=118470080) at proc.c:731
#176 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0a480, th=0xed45f0) at vm_insnhelper.c:1380
#177 vm_call_cfunc (th=0xed45f0, reg_cfp=0x7f8279b0a480, ci=<optimized out>) at vm_insnhelper.c:1473
#178 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#179 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#180 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x7f8279b0a540, self=118469200, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=31311120, splattable=1) at vm.c:813
#181 0x00007f827ab4b0c1 in vm_yield (argv=0x7ffe0081d518, argc=1, th=<optimized out>) at vm.c:853
#182 rb_yield_0 (argv=0x7ffe0081d518, argc=1) at vm_eval.c:993
#183 rb_yield (val=140198256980928) at vm_eval.c:1003
#184 0x00007f827a9ac032 in rb_ary_each (array=118469120) at array.c:1814
#185 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0a520, th=0xed45f0) at vm_insnhelper.c:1380
#186 vm_call_cfunc (th=0xed45f0, reg_cfp=0x7f8279b0a520, ci=<optimized out>) at vm_insnhelper.c:1473
#187 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#188 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#189 0x00007f827ab46ecc in invoke_block_from_c (th=0xed45f0, block=0x7f8279b0a7c0, self=23353480, argc=<optimized out>, argv=<optimized out>, 
    blockptr=<optimized out>, cref=0x0, defined_class=31762320, splattable=1) at vm.c:813
#190 0x00007f827ab4b0c1 in vm_yield (argv=0x7ffe0081da78, argc=1, th=<optimized out>) at vm.c:853
#191 rb_yield_0 (argv=0x7ffe0081da78, argc=1) at vm_eval.c:993
#192 rb_yield (val=140198256980928) at vm_eval.c:1003
#193 0x00007f827a9ac032 in rb_ary_each (array=17777640) at array.c:1814
#194 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0a7a0, th=0xed45f0) at vm_insnhelper.c:1380
#195 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0a7a0, ci=ci@entry=0x2151d68) at vm_insnhelper.c:1473
#196 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0a7a0, ci=0x2151d68) at vm_insnhelper.c:1689
#197 0x00007f827ab40948 in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1040
#198 0x00007f827ab45354 in vm_exec (th=0xed45f0) at vm.c:1435
#199 0x00007f827ab4958e in rb_iseq_eval (iseqval=22304240) at vm.c:1667
#200 0x00007f827a9fe6ab in rb_load_internal0 (th=th@entry=0xed45f0, fname=22320560, wrap=<optimized out>) at load.c:612
#201 0x00007f827a9ff3a6 in rb_load_internal (wrap=<optimized out>, fname=<optimized out>) at load.c:642
#202 rb_f_load (argc=<optimized out>, argv=<optimized out>) at load.c:714
#203 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0aa20, th=0xed45f0) at vm_insnhelper.c:1380
#204 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0aa20, ci=ci@entry=0x1e76478) at vm_insnhelper.c:1473
#205 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0aa20, ci=0x1e76478) at vm_insnhelper.c:1689
#206 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#207 0x00007f827ab45354 in vm_exec (th=0xed45f0) at vm.c:1435
#208 0x00007f827ab4958e in rb_iseq_eval (iseqval=22430040) at vm.c:1667
#209 0x00007f827a9fe6ab in rb_load_internal0 (th=th@entry=0xed45f0, fname=22444640, wrap=<optimized out>) at load.c:612
#210 0x00007f827a9ff3a6 in rb_load_internal (wrap=<optimized out>, fname=<optimized out>) at load.c:642
#211 rb_f_load (argc=<optimized out>, argv=<optimized out>) at load.c:714
#212 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0aac0, th=0xed45f0) at vm_insnhelper.c:1380
#213 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0aac0, ci=ci@entry=0x1c31ae0) at vm_insnhelper.c:1473
#214 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0aac0, ci=0x1c31ae0) at vm_insnhelper.c:1689
#215 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#216 0x00007f827ab45354 in vm_exec (th=0xed45f0) at vm.c:1435
#217 0x00007f827ab4958e in rb_iseq_eval (iseqval=17880680) at vm.c:1667
#218 0x00007f827a9fe6ab in rb_load_internal0 (th=th@entry=0xed45f0, fname=17928080, wrap=<optimized out>) at load.c:612
#219 0x00007f827a9ff3a6 in rb_load_internal (wrap=<optimized out>, fname=<optimized out>) at load.c:642
#220 rb_f_load (argc=<optimized out>, argv=<optimized out>) at load.c:714
#221 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0aed0, th=0xed45f0) at vm_insnhelper.c:1380
#222 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0aed0, ci=ci@entry=0x135ff38) at vm_insnhelper.c:1473
#223 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0aed0, ci=0x135ff38) at vm_insnhelper.c:1689
#224 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#225 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#226 0x00007f827ab49a39 in eval_string_with_cref (self=self@entry=15965320, src=17958320, scope=17958280, cref_arg=cref_arg@entry=0x0, file=19743600, 
    line=1) at vm_eval.c:1335
#227 0x00007f827ab4a7a7 in eval_string (line=<optimized out>, file=<optimized out>, scope=<optimized out>, src=<optimized out>, self=15965320)
    at vm_eval.c:1374
#228 rb_f_eval (argc=3, argv=<optimized out>, self=15965320) at vm_eval.c:1413
#229 0x00007f827ab38482 in vm_call_cfunc_with_frame (ci=<optimized out>, reg_cfp=0x7f8279b0af70, th=0xed45f0) at vm_insnhelper.c:1380
#230 vm_call_cfunc (th=th@entry=0xed45f0, reg_cfp=reg_cfp@entry=0x7f8279b0af70, ci=ci@entry=0x1308ef0) at vm_insnhelper.c:1473
#231 0x00007f827ab4e955 in vm_call_method (th=0xed45f0, cfp=0x7f8279b0af70, ci=0x1308ef0) at vm_insnhelper.c:1689
#232 0x00007f827ab409ed in vm_exec_core (th=th@entry=0xed45f0, initial=initial@entry=0) at insns.def:1070
#233 0x00007f827ab45354 in vm_exec (th=th@entry=0xed45f0) at vm.c:1435
#234 0x00007f827ab45bbe in rb_iseq_eval_main (iseqval=iseqval@entry=19720640) at vm.c:1680
#235 0x00007f827a9f980f in ruby_exec_internal (n=0x12ce9c0) at eval.c:252
#236 0x00007f827a9fb4bd in ruby_exec_node (n=n@entry=0x12ce9c0) at eval.c:317
#237 0x00007f827a9fdb8c in ruby_run_node (n=0x12ce9c0) at eval.c:309
#238 0x000000000040090b in main (argc=6, argv=0x7ffe0081fce8) at main.c:36

@pawelchcki
Copy link
Contributor

@NobodysNightmare Thanks!
Looks like the backtrace of the second thread confirms lingering suspicion I had that the workers might block on resolving DNS hostname.

#0  0x00007f827a76bd5c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f827a7673a9 in _L_lock_926 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f827a7671cb in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007f82732dd4a4 in _nss_files_gethostbyname2_r () from /lib/x86_64-linux-gnu/libnss_files.so.2
#4  0x00007f8279c00e75 in gethostbyname2_r () from /lib/x86_64-linux-gnu/libc.so.6

Its in locked state as per line 3 (pthread_mutex_lock) and its caused by calling gethostbyname2_r which in turn was executed by Ruby trying to resolve the hostname.

In general that lock probably happens quite rarely but if every freshly spawned worker tries to resolve the agent hostname, then it increases the probability of this bug showing up.

By default we try to resolve 'localhost' as agent hostaname. It can be overriden by using tracer options If you set hostname to 127.0.0.1 then ruby will not need to resolve hostnames.

@NobodysNightmare
Copy link
Author

NobodysNightmare commented Jun 26, 2018 via email

@NobodysNightmare
Copy link
Author

NobodysNightmare commented Jun 27, 2018

I configured the hostname of the tracer to be 127.0.0.1 and ran our test load, which so far always locked up.
This time nothing locked up, so we seem to have found the cause :)

We will still observe our natural load closely and see if anything unexpected happens. (Update: a day later we still do not see any issues :))

So how do we proceed here? Should we use 127.0.0.1 and call it a day? Is there an underlying problem that should be fixed? While this is now certainly workable for us, I still feel we are not done yet, are we?

@pawelchcki
Copy link
Contributor

Great that its working stable now!

As for next steps, I'd suggest keeping using 127.0.0.1. We might make that the default in next releases.

As for addressing underlying issue, upgrading libc to newer version might resolve it.

I don't think the bug is something we can fix on our side, however if you would provide info which libc version you are using and what OS distribution and version it comes from.
We could still try reproducing it locally and if need be, it would help the development of other workarounds.

@pawelchcki
Copy link
Contributor

@NobodysNightmare since the underlying cause was identified and we've got a working workaround for systems which exhibit this bug, I'm going to close this issue.

Thanks for great help with diagnosing this!

ivoanjo added a commit that referenced this issue Aug 8, 2023
**What does this PR do?**:

This PR adds the "Resque workers hang on exit" issue discussed in
#3015 to the
"Known issues and suggested configurations" section of our docs.

**Motivation**:

This issue seems to come up a few times, e.g. in:
* #466
* #2379
* #3015

...so I've decided to document the issue and the best-known workaround
so that other customers than unfortunately run into it may find our
suggested solution.

**Additional Notes**:

N/A

**How to test the change?**:

Docs-only change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Involves a bug community Was opened by a community member
Projects
None yet
Development

No branches or pull requests

5 participants