Assertion failure when SIGSYS is blocked and SECCOMP_RET_TRAP happens #2413

Closed
jld opened this issue Dec 3, 2019 · 5 comments

Comments

@jld

jld commented Dec 3, 2019

If the recorded process is blocking SIGSYS, has a handler installed for it, and executes a system call for which the seccomp-bpf policy returns SECCOMP_RET_TRAP, the kernel will both unblock the signal and reset its disposition to the default. This is annoying for sandbox developers, and it also breaks rr:

[FATAL /home/jld/src/rr/src/RecordSession.cc:1264:inject_handled_signal()] 
 (task 111084 (rec:111084) at time 207)
 -> Assertion `t->stop_sig() == SIGTRAP' failed to hold. Got unexpected status 0x1f7f (STOP-SIGSYS)
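For anyone decoding the raw status in that message: 0x1f7f has 0x7f in its low byte, which marks a stop, and 0x1f (31) in the next byte, which is the stop signal. A minimal sketch using the standard `<sys/wait.h>` macros follows. `describe_status` is a hypothetical helper written for illustration, not rr code, and reading 31 as SIGSYS assumes x86 or ARM Linux signal numbering.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper (not part of rr): render a raw waitpid() status
   in a form similar to the "STOP-SIGSYS" text in the log above. */
static void describe_status(int status, char *buf, size_t len) {
	assert(buf != NULL && len > 0);
	if (WIFSTOPPED(status))
		snprintf(buf, len, "STOP-%d", WSTOPSIG(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "SIGNAL-%d", WTERMSIG(status));
	else if (WIFEXITED(status))
		snprintf(buf, len, "EXIT-%d", WEXITSTATUS(status));
	else
		snprintf(buf, len, "UNKNOWN");
}
```

Calling `describe_status(0x1f7f, buf, sizeof buf)` fills `buf` with a stop description whose signal number compares equal to SIGSYS on those architectures, whereas the assertion expected SIGTRAP.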

I wrote a simple test case:

#include <linux/prctl.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

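/* Writes to the address numerically equal to the signal number, which is
   expected to fault, so a crash here would show that the handler ran. */
static void handler(int sig) {
	*(int*)(uintptr_t)sig = sig;
}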

int main() {
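	/* Trap read(2) with SECCOMP_RET_TRAP; allow every other syscall. */
	struct sock_filter filter[] = {
	    /* Load the syscall number. */
	    { BPF_LD | BPF_W | BPF_ABS,		0, 0, offsetof(struct seccomp_data, nr) },
	    /* If it is read, fall through to the trap; otherwise skip to the allow. */
	    { BPF_JMP | BPF_JEQ | BPF_K,	0, 1, __NR_read },
	    { BPF_RET | BPF_K,			0, 0, SECCOMP_RET_TRAP },
	    { BPF_RET | BPF_K,			0, 0, SECCOMP_RET_ALLOW }
	};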
	struct sock_fprog fprog = { 4, filter };
	sigset_t sigs;

	/* Install a SIGSYS handler, then block SIGSYS: the combination at issue. */
	signal(SIGSYS, handler);

	sigemptyset(&sigs);
	sigaddset(&sigs, SIGSYS);
	sigprocmask(SIG_BLOCK, &sigs, NULL);

	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) {
		perror("PR_SET_NO_NEW_PRIVS");
		return 1;
	}
	if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, (uintptr_t)&fprog, 0, 0) != 0) {
		perror("PR_SET_SECCOMP");
		return 1;
	}
	syscall(__NR_read, 0, 0, 0);
	return 0;
}

The original failing case is Firefox with a recent development version of glibc; see also bug 1600574 comment #12.

@rocallahan
Collaborator

In rr this is complicated by the fact that we synthesize the SIGSYS.

@jld
Author

jld commented Dec 3, 2019

Something else I found, though I don't know whether it should be a separate issue: if no SIGSYS handler is set, rr record hangs instead:

#0  0x00007f6bef86ae6e in __waitpid (pid=-1, stat_loc=0x7ffc1f52aca4, options=1073741826) at ../sysdeps/unix/sysv/linux/waitpid.c:30
#1  0x0000562576fa48c3 in rr::Scheduler::reschedule (this=0x562577684f00, switchable=rr::ALLOW_SWITCH) at /home/jld/src/rr/src/Scheduler.cc:559
#2  0x0000562576efe775 in rr::RecordSession::record_step (this=0x562577684d60) at /home/jld/src/rr/src/RecordSession.cc:1969

@rocallahan
Collaborator

I'm not going to have time to fix this this week, and it's a tricky one.

When generating our synthetic seccomp SIGSYS, we need to either manually unblock and reset the disposition of SIGSYS, or we need to fully understand what the kernel does when we inject SIGSYS via ptrace and it is blocked with a signal handler, and take that into account.

The former is difficult because when we synthesize the SIGSYS we are in a seccomp ptrace trap, where we can't safely execute remote syscalls in the tracee. We can change the sigmask using PTRACE_SETSIGMASK, but AFAIK we can't reset the signal handler to default in a straightforward way. We'd need to defer that to later, which gets complicated.

@jld
Author

jld commented Dec 4, 2019

This isn't urgent for Firefox; we know what's going on in the bug I mentioned.

But in general it is a use case where rr is otherwise a good tool for the job: one process in a multiprocess application mysteriously exits without logging anything.

@rocallahan
Collaborator

I have this partially fixed here: https://github.com/mozilla/rr/tree/sigsys

One problem I ran into is that when running the 32-bit version of the test, we'd hit a nasty case where doing the AutoRemoteSyscall to reset the signal handler to default would receive a SIGPWR every time we tried to enter the syscall. I think this is because we're inside the syscallbuf and the desched signal is armed. In the version of the patch on the branch, I tried to avoid that by disabling all signals during AutoRemoteSyscalls, but this breaks a lot of tests. I don't have time to look into this further right now.

bkin pushed a commit to bkin/rr that referenced this issue May 20, 2021