Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MonitorFdHup: replace pthread_cancel trick with a notification pipe #12714

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

picnoir
Copy link
Member

@picnoir picnoir commented Mar 21, 2025

Motivation

On #8946, we faced a surprising
behaviour wrt. exception when using pthread_cancel. In a nutshell when
a thread is inside a catch block and it's getting pthread_cancel by
another one, then the original exception is bubbled up and crashes the
process.

We now poll on the notification pipe from the thread and exit when the
main thread closes its end. This solution does not exhibit surprising
behaviour wrt. exceptions.

Context

#8946


Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@picnoir picnoir requested a review from edolstra as a code owner March 21, 2025 16:05
@Mic92 Mic92 force-pushed the pic/monitorhup-fix-pthread-cancellation branch from 9e495a0 to e35b2c9 Compare March 21, 2025 16:10
On NixOS#8946, we faced a surprising
behaviour wrt. exception when using pthread_cancel. In a nutshell when
a thread is inside a catch block and it's getting pthread_cancel by
another one, then the original exception is bubbled up and crashes the
process.

We now poll on the notification pipe from the thread and exit when the
main thread closes its end. This solution does not exhibit surprising
behaviour wrt. exceptions.

Co-authored-by: Mic92 <joerg@thalheim.io>

fixes NixOS#8946
@Mic92 Mic92 force-pushed the pic/monitorhup-fix-pthread-cancellation branch from e35b2c9 to 0d8a35f Compare March 21, 2025 16:11
@Mic92 Mic92 added infra Issue affecting the @NixOS/infra team backport 2.24-maintenance Automatically creates a PR against the branch backport 2.26-maintenance Automatically creates a PR against the branch backport 2.27-maintenance Automatically creates a PR against the branch labels Mar 21, 2025
@Mic92 Mic92 enabled auto-merge March 21, 2025 16:13
@Mic92
Copy link
Member

Mic92 commented Mar 21, 2025

on macOS:

       >  13/189 nix-functional-tests:main / multiple-outputs                        FAIL            5.78s   exit status 1
       >  47/189 nix-functional-tests:main / dependencies                            FAIL            4.57s   exit status 1
       >  97/189 nix-functional-tests:main / build-delete                            FAIL            4.62s   exit status 1
       >

Will check if I can reproduce this.

Syscalls can fail for many reasons and we don't want to loose the errno
and error context.
@Mic92 Mic92 force-pushed the pic/monitorhup-fix-pthread-cancellation branch from 0d8a35f to b39ad3a Compare March 21, 2025 17:36
@Mic92
Copy link
Member

Mic92 commented Mar 21, 2025

Can reproduce this locally. Normal macOS tests works but this is a nix-daemon-compat-tests-2.27.0pre20250321_b39ad3a-with-daemon-2.27.0pre20250321_b39ad3a. test

@lf-
Copy link
Member

lf- commented Mar 21, 2025

fyi this is https://gerrit.lix.systems/c/lix/+/1605

@Mic92
Copy link
Member

Mic92 commented Mar 22, 2025

Similar fix also this commit predates the commit in Lix by a year. However the race condition in macOS uncovered here seems to be not tested in Lix as there is no equivalent to nix-daemon-compat-tests, so unsure if it also happens there. I can reproduce it the issue without the commits here by running tests in isolation on macOS and they go away if I add sleeps before the garbage collection calls.

The issue is that there are still temproots shown as hold by the nix-daemon preventing the garbage collection temporarily.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.24-maintenance Automatically creates a PR against the branch backport 2.26-maintenance Automatically creates a PR against the branch backport 2.27-maintenance Automatically creates a PR against the branch infra Issue affecting the @NixOS/infra team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants