Getting "error: cannot kill processes for uid '1001': failed with exit code 1" when running "nix-build" on FreeBSD #3250

Closed
0mp opened this issue Nov 28, 2019 · 7 comments

Comments

Contributor

0mp commented Nov 28, 2019

I am attempting to port Nix to FreeBSD and I've encountered a problem with the nix-build command.

I am working on the check.sh test now so I'm executing the following command: nix-build -vvvvv check.nix -A nondeterministic --no-out-link --repeat 1.

As a result I am getting:

error: cannot kill processes for uid '1001': failed with exit code 1

I believe that it is related to the implementation of killUser():

nix/src/libutil/util.cc

Lines 907 to 908 in ba87b08

#else
if (kill(-1, SIGKILL) == 0) break;

I tried a quick patch using the following code (inspired by a similar-looking patch), but it didn't work. It may have been a bad idea from the start, but I wanted to try anyway:

#elif __FreeBSD__
            if (kill(-1, SIGKILL) == 0) break;
            else if (errno == EPERM && kill(-1, 0) == 0) break;

Have you got any idea what the issue could be?

Cheers!

PS: I posted more debug logs in an issue in the repository where I am working on the Nix port: 0mp/freebsd-ports-nix#2

AMDmi3 commented Nov 29, 2019

If the kill(2) manpage is correct, the existing code should work without modifications. Some kernel code reading and/or experiments are needed to determine the cause of EPERM here.

AMDmi3 commented Nov 29, 2019

The related kernel code (sys_kill and killpg1 in kern_sig.c) does indeed look like it does what the manpage says.

Contributor Author

0mp commented Nov 29, 2019

> If the kill(2) manpage is correct, the existing code should work without modifications. Some kernel code reading and/or experiments are needed to determine the cause of EPERM here.

I wonder if that might be related to the fact that I'm running Nix in a jail (spawned by Poudriere).

AMDmi3 commented Nov 29, 2019

Unlikely. In fact, poudriere itself does kill -9 -1 in the jail to stop the build.

AMDmi3 commented Nov 29, 2019

This reproduces with this simple program even without a jail.

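#include <err.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    // Must be started as root so that setuid() can drop to another uid.
    if (setuid(66) == -1)  // uucp, just for the test
        err(1, "setuid");

    // Signal 0 only performs the permission check; nothing is delivered.
    int res = kill(-1, 0);
    fprintf(stderr, "kill(-1, 0) result=%d, errno=%s\n", res,
            res == -1 ? strerror(errno) : "(not set)");

    return 0;
}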

AMDmi3 commented Nov 29, 2019

I've sorted out the kernel code. Yes, that is how FreeBSD behaves: it sends signals to all processes instead of only those belonging to the current uid, and records the first failure, which is inevitable because there are always processes belonging to root. This is likely incorrect. It should be checked against POSIX and fixed; meanwhile, you will have to ignore the kill(2) failure here on FreeBSD.

NB: In fact, EPERM is only issued when there are no processes that can be signalled. If any process can be signalled, 0 is returned. And yes, it's going to be fixed in FreeBSD.
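
For illustration, the workaround of ignoring this particular kill(2) failure could be sketched roughly as below. This is only a sketch, not the code Nix uses: the names `kill_outcome`, `classify_kill_all`, `EPERM_MEANS_EMPTY`, and `self_check` are hypothetical, and treating ESRCH as "nothing left to signal" is an assumption. The decision is kept in a pure function so it can be checked without actually sending SIGKILL to anything.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of this is taken from Nix. */
enum kill_outcome {
    KILL_DONE,   /* nothing (more) to signal; the caller can stop */
    KILL_RETRY,  /* the call was interrupted; try again */
    KILL_FAILED  /* unexpected error; report it */
};

/* On unpatched FreeBSD kernels, EPERM from kill(-1, ...) means that no
 * process could be signalled (per the discussion above). Elsewhere we
 * keep treating EPERM as a real error. */
#ifdef __FreeBSD__
#define EPERM_MEANS_EMPTY true
#else
#define EPERM_MEANS_EMPTY false
#endif

/* Classify the result of kill(-1, SIGKILL) issued by an unprivileged
 * process. `res` is the return value of kill(), `err` is errno sampled
 * immediately after the call. */
enum kill_outcome classify_kill_all(int res, int err, bool eperm_means_empty)
{
    if (res == 0)
        return KILL_DONE;
    if (err == ESRCH)                        /* assumption: no process matched */
        return KILL_DONE;
    if (err == EPERM && eperm_means_empty)   /* the FreeBSD quirk */
        return KILL_DONE;
    if (err == EINTR)
        return KILL_RETRY;
    return KILL_FAILED;
}

/* Small sanity checks of the classification; safe to run anywhere. */
void self_check(void)
{
    assert(classify_kill_all(0, 0, false) == KILL_DONE);
    assert(classify_kill_all(-1, EPERM, true) == KILL_DONE);
    assert(classify_kill_all(-1, EPERM, false) == KILL_FAILED);
    assert(classify_kill_all(-1, EINTR, true) == KILL_RETRY);
}
```

A caller would loop on `kill(-1, SIGKILL)`, pass the result and `errno` to `classify_kill_all(res, errno, EPERM_MEANS_EMPTY)`, and stop on `KILL_DONE`. On a kernel that already has the fix, the EPERM branch should simply never be taken.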

Contributor Author

0mp commented Sep 22, 2020

The patches landed in FreeBSD 13.0-CURRENT and FreeBSD 12-STABLE. I am not sure whether 12.1-RELEASE already has those patches, but 12.2-RELEASE will have them for sure. 11-STABLE didn't get the MFC.

References:

0mp closed this as completed Sep 22, 2020