-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nix-daemon should not core dump on SIGINT (i.e. CTRL-C) #1692
Comments
Hmm, well a number of commands such as What invocation/command is used when this happens? (probably could guess by exploring them, but easier to ask 😇) |
$ nix-shell -p some-new-package
downloading...
Hit Ctrl-C |
Hmm, having problems reproducing (using the exact same nix you're using, The coredump seems to be a symptom of the unhandled exception. As I understand it, it's a big case of "YMMV" anytime exceptions are thrown across DSO boundaries, which appears to be the case here (libnixstore -> libnixutil/nix-shell, not sure).... It might also be something in the voodoo that is the code in Nix that catches interrupts and makes them do reasonable things re:exceptions and re:threads. Perhaps someone who understands it can comment 😁 Anyway if this is just another problem like #1678, perhaps a pass should be made that ensures all of our exception classes are anchored somewhere like libnixutil.so? |
Switched to nixUnstable entirely, and I can now reproduce! Yay? :). For anyone else poking at this, the client looks fine (reports "error: interrupted by the user"), but nix-daemon logs (journalctl) show the problem. |
sorry, my error report was not precise enough. I updated it. |
No worries, this looks to be trickier than I realized at first. AFAICT the problem is the way in which nix-daemon dies (throwing an uncaught "Interrupted" exception, apparently?) when the client is interrupted with control-C. (Control-C'ing the nix-daemon itself seems to work as expected) Haha the commit history for the interrupt stuff at least acknowledges how much magic it involves 😁. |
Poked at this a bit more, here's what I've found: In my own debugging, as well as the stack trace reported, the problem is the result of throwing an exception from the destructors. This is badness and results in program termination if this happens while running already unwinding from an exception. Reconstructed Timeline
[*] in some places code has checks to avoid re-throwing Interrupt while unwinding, but either this doesn't use that code or it's not working. Possibly complicated by threads, not sure. Possible SolutionsIgnore Exceptions in DownloadItem destructor~DownloadItem() already ignores exceptions for part of what it's doing, perhaps this should be extended to cover the curl shutdown/cleanup code? This might be bad if it causes us to miss important download errors, but I'm not sure if that can happen. Don't throw from Callbacks, particularly progress reporting callbackAlso possibly overkill, but without this we're a bit reliant on what callbacks curl decides to invoke while shutdown occurs. Specifically ignore interrupted exceptions in progress reporting callbackNarrower in scope but should be safe.
Since this callback uses "isInterrupted" to indicate to curl that it should abort, ignoring any Interrupted exceptions encountered (which cause _isInterrupted to be set) seems reasonable. In my testing this fixes the reported issue. Specifically ignoring interrupted exception in destructorThis should (untested) fix the problem, but since the curl code performing cleanup encountered an exception any cleanup it normally would do after the callback may not be executed. SummaryWhat a doozy. The code above fixes the problem in my testing, I'll submit it as a PR shortly. Hope this helps, thoughts/comments welcome-- there's a lot of details to be accounted for here :). |
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
Context/discusson: NixOS#1692 (comment)
On signal SIGINT in nix-shell, nix-daemon will throw
nix::Interrupted
:and crash with the following error:
that will lead to nix beeing core dumped:
I guess a high-level catch block would avoid this.
example stacktrace:
The text was updated successfully, but these errors were encountered: