Rarely failing nil assertion in signal_spec #7243

Closed
straight-shoota opened this issue Dec 31, 2018 · 2 comments · Fixed by #7409
Comments

@straight-shoota
Member

Nil assertion failed (Exception)
  from src/class.cr:148:0 in 'not_nil!'
  from spec/std/signal_spec.cr:0:23 in '->'
  from src/signal.cr:255:3 in '->'
  from src/signal.cr:255:3 in 'process'
  from src/signal.cr:198:9 in '->'
  from src/fiber.cr:255:3 in 'run'
  from src/fiber.cr:29:34 in '->'
  from ???
FATAL: uncaught exception while processing handler for CHLD, exiting

It seems to be a race condition in the Signal::CHLD handler and happens very rarely.

I might have seen this error more often, but here are at least two instances where it failed in unrelated CI runs:

@straight-shoota
Member Author

Another occurrence: https://circleci.com/gh/crystal-lang/crystal/18298?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link

@asterite
Member

asterite commented Feb 10, 2019

I GOT IT!!! 💡

This is the spec:

it "CHLD.trap is called after default Crystal child handler" do
  called = false
  child = nil

  Signal::CHLD.trap do
    called = true
    Process.exists?(child.not_nil!.pid).should be_false
  end

  child = Process.new("true", shell: true)
  child.not_nil!.wait # doesn't block forever
  called.should be_true
end

TIL about SIGCHLD: it's delivered to the parent when a child process ends.

The spec fails because of a not_nil! (the trace isn't entirely correct with the filenames and line numbers). But my guess was that child.not_nil! is raising. How could that happen?

Well, if Process.new("true", shell: true) runs and ends before it gets assigned to child, which is super rare, then the handler will be invoked and child will still be nil ⚠️

This race condition can be reproduced if you change the child assignment to:

child = begin
  p = Process.new("true", shell: true)
  sleep 1
  p
end

Now to fix it. I guess if child is nil I'll just Fiber.yield until child gets a non-nil value. It's a bit hard to fix because the handler can run before the result of the expression gets assigned to the variable 🤔
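
For comparison, here is a rough sketch of another way to close the same race, using channels instead of the Fiber.yield loop described above. This is only an assumption about how the spec could be restructured, not necessarily the fix that was merged. The helper name chld_handler_sees_child? and both channel variable names are hypothetical. The handler blocks until the main fiber hands over the Process, so it never touches a nil, and the assertion runs in the spec's own fiber rather than inside the signal handler.

```crystal
require "spec"

# Hypothetical helper (not part of Crystal's stdlib or its spec suite).
# Returns what Process.exists? reported for the child from inside the
# CHLD handler.
def chld_handler_sees_child? : Bool
  # Buffered channels, so neither send blocks its sender.
  child_channel = Channel(Process).new(1)
  exists_channel = Channel(Bool).new(1)

  Signal::CHLD.trap do
    # Blocks until the main fiber has a Process to hand over, so it does
    # not matter whether the child exits before or after the assignment.
    child = child_channel.receive
    exists_channel.send(Process.exists?(child.pid))
  end

  begin
    child = Process.new("true", shell: true)
    child_channel.send(child)
    child.wait # doesn't block forever
    exists_channel.receive
  ensure
    # Restore the default CHLD handling for later specs.
    Signal::CHLD.reset
  end
end

it "CHLD.trap is called after default Crystal child handler" do
  chld_handler_sees_child?.should be_false
end
```

One trade-off in this sketch: if the handler were never invoked, exists_channel.receive would block forever instead of failing on called.should be_true, so a timeout may be worth adding.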
