NetBSD 10.0 fails test 'Did not need kill_kill' #175

Open
nmisch opened this issue Apr 29, 2024 · 1 comment

nmisch commented Apr 29, 2024

Example pass w/ NetBSD 9.3: https://www.cpantesters.org/report/5dd40274-f12f-11ee-be45-8b0dec80b09a
Example fail w/ NetBSD 10.0: https://www.cpantesters.org/report/efbcc1aa-f5f1-11ee-a642-ee3ed50263fb

I've reproduced this via GitHub Actions (using BSD workflow fixes that I still need to polish for inclusion):
https://github.com/nmisch/IPC-Run/actions/runs/8873497758/job/24359422461

I plan to investigate a fix like this:

-            'sleep while 1',
+            '$SIG{TERM}="DEFAULT";$|=1;print "running\n";sleep while 1',

nmisch self-assigned this Apr 29, 2024

nmisch commented May 20, 2024

That did not succeed. The SIGTERM is sometimes lost if it arrives between the
start and end of the child's execve(). kdump excerpt:

2024-05-19T23:31:23.1388650Z   7858   7858 perl     CALL  execve(0x7315a6eec320,0x7315a6eec340,0x7315a8aae000)
2024-05-19T23:31:23.1388795Z   7858   7858 perl     NAMI  "/usr/pkg/bin/perl"
2024-05-19T23:31:23.1388944Z   7858   7858 perl     NAMI  "/usr/libexec/ld.elf_so"
2024-05-19T23:31:23.1389080Z   8000   8000 perl     GIO   fd 4 read 0 bytes
...
2024-05-19T23:31:23.1462538Z   8000   8000 perl     CALL  kill(0x1eb2, SIGTERM)
2024-05-19T23:31:23.1462650Z   8000   8000 perl     RET   kill 0
2024-05-19T23:31:23.1462760Z   7858   7858 perl     EMUL  "netbsd"
2024-05-19T23:31:23.1462893Z   7858   7858 perl     RET   execve JUSTRETURN

The "fd 4 read 0 bytes" line comes from the following code in the parent
(pid 8000), which observes the child (pid 7858) completing its FD_CLOEXEC
processing:

    ## Wait for kid to get to its exec() and see if it fails.
    _close $self->{SYNC_WRITER_FD};
    my $sync_pulse = _read $sync_reader_fd;
    _close $sync_reader_fd;
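
The handshake above relies on a pipe whose write end is close-on-exec: the parent's read returns zero bytes once the child's exec has closed that end. Below is a minimal Python sketch of the same idea, not IPC::Run's actual implementation. The name spawn_with_exec_sync is hypothetical, and having the child write an error message when exec fails is an assumption about how such a pipe is commonly used. As the kdump excerpt shows, the zero-byte read only says that close-on-exec processing happened; it can arrive before execve() has returned in the child.

```python
import os


def spawn_with_exec_sync(argv):
    """Fork and exec argv; return (pid, sync_pulse).

    sync_pulse is b"" when the child's exec closed the pipe (success),
    or the error text the child wrote when its exec failed.
    """
    # Python creates both pipe ends non-inheritable, i.e. close-on-exec.
    reader_fd, writer_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: keep only the write end; a successful exec closes it.
        os.close(reader_fd)
        try:
            os.execvp(argv[0], argv)
        except OSError as exc:
            # Assumption: report exec failure to the parent over the pipe.
            os.write(writer_fd, str(exc).encode())
        os._exit(127)
    # Parent: drop our copy of the write end, then block until the child
    # either execs (EOF, zero bytes) or reports an error.
    os.close(writer_fd)
    sync_pulse = os.read(reader_fd, 4096)
    os.close(reader_fd)
    return pid, sync_pulse
```

A zero-length sync_pulse corresponds to the "read 0 bytes" line in the trace; anything else means the exec itself failed.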

Testing in C, I found that NetBSD's rules for pending signals at execve()
differ from those of the other kernels I tested. I've reported this at
https://gnats.netbsd.org/58268. I can modify the test case to accept both
behaviors, since the point of the test isn't to check kernel behavior.
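
To make "accept both behaviors" concrete, here is an illustrative Python probe. It is not the C test mentioned above (which this thread does not show) and not the IPC::Run test; the function name, the outcome strings, and the 2-second grace period are all assumptions. It sends SIGTERM as soon as the exec sync pipe reports EOF, then records whether the child died from that signal or had to be killed.

```python
import os
import signal
import time


def probe_sigterm_after_exec_sync(argv, grace=2.0):
    """Return "delivered", "lost", or "exited" for one SIGTERM sent at exec time."""
    reader_fd, writer_fd = os.pipe()  # close-on-exec by default in Python
    pid = os.fork()
    if pid == 0:
        os.close(reader_fd)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if exec failed
    os.close(writer_fd)
    os.read(reader_fd, 1)  # EOF once the child's exec closed its end
    os.close(reader_fd)
    # This signal may land while the child's execve() is still in progress.
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGTERM:
                return "delivered"
            return "exited"  # child ended for some other reason
        time.sleep(0.01)
    # SIGTERM had no visible effect within the grace period: clean up.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return "lost"


if __name__ == "__main__":
    outcome = probe_sigterm_after_exec_sync(["sleep", "30"])
    # A kernel-agnostic check accepts either outcome.
    print(outcome)
```

A test written in this spirit would treat both "delivered" and "lost" as acceptable and only verify what the library itself does afterwards.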

IPC::Run could partially hide the kernel-specific behavior by retrying SIGTERM.
I'm not inclined to do that, since there's no guarantee that an arbitrary child
process will treat two copies of SIGTERM the same way as one copy. Other
opinions?
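
For discussion, "retrying SIGTERM" could look roughly like the Python sketch below. The helper term_with_retry and its attempts/interval parameters are hypothetical and nothing here is IPC::Run code. It also shows where the concern comes from: every pass through the outer loop sends another copy of the signal, and a child with its own SIGTERM handler would see each one.

```python
import os
import signal
import time


def term_with_retry(pid, attempts=3, interval=0.5):
    """Send SIGTERM up to `attempts` times to a child of this process.

    Returns (signals_sent, wait_status), or (attempts, None) if the
    child was still running after the last interval.
    """
    for sent in range(1, attempts + 1):
        os.kill(pid, signal.SIGTERM)
        deadline = time.monotonic() + interval
        while time.monotonic() < deadline:
            done_pid, status = os.waitpid(pid, os.WNOHANG)
            if done_pid == pid:
                return sent, status
            time.sleep(0.01)
        # Child still alive: the loop sends another copy of SIGTERM,
        # which a child with a handler may treat differently from the first.
    return attempts, None
```

If the first signal is lost, the second attempt hides the loss; if it was merely slow to act, the child receives SIGTERM twice.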
