Rewrite connection reaper test with timeout #51038
After the second attempt (#50037), I was finally able to debug the flaky test that gets cancelled after the 30-minute timeout! See https://buildkite.com/rails/rails/builds/104764#018d9373-57da-4bb8-a019-e1e2a9c12f41/1156-1587
We fork a process in rails/activerecord/test/cases/reaper_test.rb Line 131 in 0f9aaa5
rails/activerecord/test/cases/reaper_test.rb Line 145 in 0f9aaa5
We can wrap that
cc @byroot Do you have any ideas?
If the parent process is blocked on It likely ran into some sort of deadlock.
But nothing is printed in the child from https://github.com/rails/rails/pull/51038/files#diff-db41b8a333b0b2a49f0b5cd5361a65426419bea8746069f52f07d0ab80f359f8R149. What could be the reason?
This test is about restarting the reaper thread in the child, which we do from an rails/activesupport/lib/active_support/fork_tracker.rb Lines 19 to 25 in a8d6d47
@byroot Do you have an idea why that might be and how to solve it? This test is so annoying ... If this is not easily solvable or debuggable, maybe we can wrap it in some timeout to reduce the damage until it gets solved?
No, I'd need to dig into it. I barely had a look.
It's probably a good idea in general when forking to then invoke code. Might be worth extracting some sort of test helper. The proper way to time out a forked process is to use a pipe, allowing to do
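For readers following along, the pipe-based timeout mentioned above could look roughly like the sketch below. It is not the code from this PR: `fork_with_timeout` is a hypothetical helper name, and the sketch assumes the parent only needs to know whether the child exited in time. The child inherits the write end of a pipe; when the child exits, that end is closed and the parent's `IO.select` sees the read end become readable (EOF). If `IO.select` returns `nil`, the timeout elapsed first.

```ruby
# Hypothetical helper (not part of Rails): runs the block in a forked child and
# reports whether the child finished within `timeout` seconds.
def fork_with_timeout(timeout)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    yield
    # exit! skips at_exit handlers (e.g. a test framework's autorun).
    # Exiting closes the child's copy of `writer`, which the parent sees as EOF.
    exit!(0)
  end

  # The parent must close its own write end, or the pipe never reaches EOF.
  writer.close

  # IO.select returns nil when nothing became readable before the timeout.
  completed = !IO.select([reader], nil, nil, timeout).nil?

  # Kill a stuck child, then always wait so it is reaped either way.
  Process.kill("KILL", pid) unless completed
  _, status = Process.wait2(pid)

  [completed, status]
ensure
  reader&.close
end
```

A test could then assert on both values, for example `completed, status = fork_with_timeout(5) { ... }` followed by checks that `completed` is true and `status.success?` holds.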
@byroot Rewrote that test with your suggestion. Please take a look. CI is red, but seems unrelated.
  Process.waitpid(pid)
  assert_predicate $?, :success?
else
  Process.kill("KILL", pid)
You should call Process.wait even after a SIGKILL, otherwise the child will linger around as a zombie.
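As a small illustration of that point (a standalone sketch, not code from this PR): a killed child remains in the process table until the parent collects its status. Here `fork { sleep }` is only a stand-in for a stuck child process.

```ruby
# Stand-in for a child that is stuck and never exits on its own.
pid = fork { sleep }

Process.kill("KILL", pid)

# Reap the child: without this call it would linger as a zombie
# until the parent itself exits.
_, status = Process.wait2(pid)

killed_by_signal = status.signaled?                      # terminated by a signal
killed_by_sigkill = status.termsig == Signal.list["KILL"] # specifically SIGKILL
```

Once the child has been reaped, a further `Process.wait(pid)` raises `Errno::ECHILD`, which is one way to confirm nothing is left behind.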
Also, maybe excessive, but rather than SIGKILL, you could send SIGABRT and cause the child process to print a Ruby crash report, which would include its main thread backtrace, giving more information on where it's stuck.
if completed
  Process.waitpid(pid)
  assert_predicate $?, :success?
else
  Process.kill("KILL", pid)
  flunk("Process timed out")
- if completed
-   Process.waitpid(pid)
-   assert_predicate $?, :success?
- else
-   Process.kill("KILL", pid)
-   flunk("Process timed out")
+ unless completed
+   Process.kill("KILL", pid)
+ end
+ _, status = Process.wait2(pid)
+ assert_predicate status, :success?
Thanks! Applied all the suggestions.
Trying to debug reaper_test.rb.