Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support syscalls necessary for Go runtime #1540

Closed
cohosh opened this issue Jul 26, 2021 · 17 comments
Closed

Support syscalls necessary for Go runtime #1540

cohosh opened this issue Jul 26, 2021 · 17 comments
Labels
Type: Bug Error or flaw producing unexpected results

Comments

@cohosh
Copy link
Contributor

cohosh commented Jul 26, 2021

Describe the issue
I'm working on getting Snowflake simulations running in Shadow, and for all of the Go binaries I'm running up against the same unsupported system calls. Here's an example from the broker:

00:00:00.079100 [worker-2] 00:00:01.000000000 [ERROR] [broker:11.0.0.1] [syscall_handler.c:506] [syscallhandler_make_syscall] Returning error 38 (ENOSYS) for unsupported syscall 204, which may result in unusual behavior
00:00:00.130658 [worker-2] 00:00:01.000000000 [ERROR] [broker:11.0.0.1] [syscall_handler.c:506] [syscallhandler_make_syscall] Returning error 38 (ENOSYS) for unsupported syscall 281, which may result in unusual behavior
00:00:00.131060 [worker-2] 00:00:01.000000000 [ERROR] [broker:11.0.0.1] [syscall_handler.c:506] [syscallhandler_make_syscall] Returning error 38 (ENOSYS) for unsupported syscall 24, which may result in unusual behavior

The unsupported system calls are:

  • sched_getaffinity (204)
  • epoll_pwait (281)

It's also throwing an ENOSYS error for sched_yield (24) which is strange because AFAICT from looking at the source code, it's implemented.

Since these are being called from the Go runtime library and I've tried setting GOMAXPROCS=1 with no success, I suspect they are necessary for all Go programs (or at least all that have any goroutines).

To Reproduce
Try running any Go program with goroutines

Shadow (please complete the following information):

  • Version: Shadow 2.0.0-pre.4
@cohosh cohosh added the Type: Bug Error or flaw producing unexpected results label Jul 26, 2021
@cohosh
Copy link
Contributor Author

cohosh commented Jul 26, 2021

Also thank you so much for all this awesome Shadow work, I'm really excited about this project!

@cohosh
Copy link
Contributor Author

cohosh commented Jul 26, 2021

Also, fwiw it might be that epoll_pwait is the only one that's strictly necessary as it was almost certainly the cause of this go panic:

runtime: epollwait on fd 5 failed with 38
fatal error: runtime: netpoll failed

runtime stack:
runtime.throw(0x98829b, 0x17)
        /usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.netpoll(0x4e94914f0000, 0x1)
        /usr/local/go/src/runtime/netpoll_epoll.go:130 +0x368
runtime.findrunnable(0xc000030000, 0x0)
        /usr/local/go/src/runtime/proc.go:2448 +0x785
runtime.schedule()
        /usr/local/go/src/runtime/proc.go:2683 +0x2d7
runtime.park_m(0xc000001800)
        /usr/local/go/src/runtime/proc.go:2851 +0x9d
runtime.mcall(0x800000)
        /usr/local/go/src/runtime/asm_amd64.s:318 +0x5b

goroutine 1 [running]:
        goroutine running on other thread; stack unavailable

@robgjansen
Copy link
Member

The place to check for syscall support in Shadow would be in the syscall handler switch statement:

switch (args->number) {
HANDLE(accept);
HANDLE(accept4);
HANDLE(bind);
HANDLE(brk);
HANDLE(clock_gettime);
HANDLE(clone);
HANDLE_RUST(close);
HANDLE(connect);
HANDLE(creat);
HANDLE_RUST(dup);
HANDLE(epoll_create);
HANDLE(epoll_create1);
HANDLE(epoll_ctl);
HANDLE(epoll_wait);
HANDLE(eventfd);
HANDLE(eventfd2);
HANDLE(execve);
HANDLE(exit_group);
HANDLE(faccessat);
HANDLE(fadvise64);
HANDLE(fallocate);
HANDLE(fchmod);
HANDLE(fchmodat);
HANDLE(fchown);
HANDLE(fchownat);
HANDLE(fcntl);
#ifdef SYS_fcntl64
HANDLE(fcntl64);
#endif
HANDLE(fdatasync);
HANDLE(fgetxattr);
HANDLE(flistxattr);
HANDLE(flock);
HANDLE(fremovexattr);
HANDLE(fsetxattr);
HANDLE(fstat);
HANDLE(fstatfs);
HANDLE(fsync);
HANDLE(ftruncate);
HANDLE(futex);
HANDLE(futimesat);
HANDLE(getdents);
HANDLE(getdents64);
HANDLE(getpeername);
HANDLE(getpid);
HANDLE(getppid);
HANDLE(gettid);
HANDLE(getrandom);
HANDLE(get_robust_list);
HANDLE(getsockname);
HANDLE(getsockopt);
HANDLE(gettimeofday);
HANDLE(ioctl);
HANDLE(kill);
HANDLE(linkat);
HANDLE(listen);
HANDLE(lseek);
HANDLE(mkdirat);
HANDLE(mknodat);
HANDLE(mmap);
#ifdef SYS_mmap2
HANDLE(mmap2);
#endif
HANDLE(mprotect);
HANDLE(mremap);
HANDLE(munmap);
HANDLE(nanosleep);
HANDLE(newfstatat);
HANDLE(open);
HANDLE(openat);
HANDLE_RUST(pipe);
HANDLE_RUST(pipe2);
HANDLE(poll);
HANDLE(ppoll);
HANDLE(prctl);
HANDLE_RUST(pread64);
HANDLE(preadv);
#ifdef SYS_preadv2
HANDLE(preadv2);
#endif
#ifdef SYS_prlimit
HANDLE(prlimit);
#endif
#ifdef SYS_prlimit64
HANDLE(prlimit64);
#endif
HANDLE_RUST(pwrite64);
HANDLE(pwritev);
#ifdef SYS_pwritev2
HANDLE(pwritev2);
#endif
HANDLE_RUST(read);
HANDLE(readahead);
HANDLE(readlinkat);
HANDLE(readv);
HANDLE(recvfrom);
HANDLE(renameat);
HANDLE(renameat2);
HANDLE(shadow_set_ptrace_allow_native_syscalls);
HANDLE(shadow_get_ipc_blk);
HANDLE(shadow_get_shm_blk);
HANDLE(shadow_hostname_to_addr_ipv4);
HANDLE(sendto);
HANDLE(setsockopt);
#ifdef SYS_sigaction
// Superseded by rt_sigaction in Linux 2.2
UNSUPPORTED(sigaction);
#endif
HANDLE(rt_sigaction);
#ifdef SYS_signal
// Superseded by sigaction in glibc 2.0
UNSUPPORTED(signal);
#endif
#ifdef SYS_sigprocmask
// Superseded by rt_sigprocmask in Linux 2.2
UNSUPPORTED(sigprocmask);
#endif
HANDLE(rt_sigprocmask);
HANDLE(set_robust_list);
HANDLE(set_tid_address);
HANDLE(shutdown);
HANDLE(socket);
HANDLE(socketpair);
#ifdef SYS_statx
HANDLE(statx);
#endif
HANDLE(symlinkat);
HANDLE(sync_file_range);
HANDLE(syncfs);
HANDLE(sysinfo);
HANDLE(tgkill);
HANDLE(time);
HANDLE(timerfd_create);
HANDLE(timerfd_gettime);
HANDLE(timerfd_settime);
HANDLE(tkill);
HANDLE(uname);
HANDLE(unlinkat);
HANDLE(utimensat);
HANDLE_RUST(write);
HANDLE(writev);
// **************************************
// Not handled (yet):
// **************************************
// NATIVE(chdir);
// NATIVE(fchdir);
// NATIVE(io_getevents);
// NATIVE(waitid);
// NATIVE(msync);
//// operations on pids (shadow overrides pids)
// NATIVE(sched_getaffinity);
// NATIVE(sched_setaffinity);
//// operations on file descriptors
// NATIVE(dup2);
// NATIVE(dup3);
// NATIVE(select);
// NATIVE(pselect6);
//// copying data between various types of fds
// NATIVE(copy_file_range);
// NATIVE(sendfile);
// NATIVE(splice);
// NATIVE(vmsplice);
// NATIVE(tee);
//// additional socket io
// NATIVE(recvmsg);
// NATIVE(sendmsg);
// NATIVE(recvmmsg);
// NATIVE(sendmmsg);
// ***************************************
// We think we don't need to handle these
// (because the plugin can natively):
// ***************************************
NATIVE(access);
NATIVE(arch_prctl);
NATIVE(chmod);
NATIVE(chown);
NATIVE(exit);
NATIVE(getcwd);
NATIVE(geteuid);
NATIVE(getegid);
NATIVE(getgid);
NATIVE(getresgid);
NATIVE(getresuid);
NATIVE(getrlimit);
NATIVE(getuid);
NATIVE(getxattr);
NATIVE(lchown);
NATIVE(lgetxattr);
NATIVE(link);
NATIVE(listxattr);
NATIVE(llistxattr);
NATIVE(lremovexattr);
NATIVE(lsetxattr);
NATIVE(lstat);
NATIVE(madvise);
NATIVE(mkdir);
NATIVE(mknod);
NATIVE(readlink);
NATIVE(removexattr);
NATIVE(rename);
NATIVE(rmdir);
NATIVE(rt_sigreturn);
NATIVE(setfsgid);
NATIVE(setfsuid);
NATIVE(setgid);
NATIVE(setregid);
NATIVE(setresgid);
NATIVE(setresuid);
NATIVE(setreuid);
NATIVE(setrlimit);
NATIVE(setuid);
NATIVE(setxattr);
NATIVE(stat);
#ifdef SYS_stat64
NATIVE(stat64);
#endif
NATIVE(statfs);
NATIVE(sigaltstack);
NATIVE(symlink);
NATIVE(truncate);
NATIVE(unlink);
NATIVE(utime);
NATIVE(utimes);
default:
warning(
"Detected unsupported syscall %ld called from thread %i in process %s on host %s",
args->number, thread_getID(sys->thread), process_getName(sys->process),
host_getName(sys->host));
error("Returning error %i (ENOSYS) for unsupported syscall %li, which may result in "
"unusual behavior",
ENOSYS, args->number);
scr = (SysCallReturn){.state = SYSCALL_DONE, .retval.as_i64 = -ENOSYS};
break;
}

The HANDLE cases are handled in some form by Shadow, and the NATIVE cases we allow the managed process to make the syscall directly without Shadow intervention. It does indeed look like sched_* and epoll_pwait are missing, which is why you're seeing ENOSYS for these.

@robgjansen
Copy link
Member

For epoll_pwait, Shadow does not yet deal with signals very well. Right now, we could try stubbing out a epoll_pwait handler that basically just ignores the signal field, and that might be enough of a kick to get things working. But it would probably result in fragility and unexpected behavior until we more thoroughly handle signals.

I just pushed robgjansen@1a1c7ea, an untested, quick-and-dirty example of adding a epoll_pwait handler that you might use at your own risk ;)

@cohosh
Copy link
Contributor Author

cohosh commented Jul 26, 2021

Thanks! I just tried it out and it was enough to get things working. I'll try running some experiments with this and see what happens :)

@cohosh
Copy link
Contributor Author

cohosh commented Jul 26, 2021

I'm getting stuck again. I made a minimal Go server example and I'm seeing the same behaviour there: https://github.com/cohosh/go-shadow

This is the output I get with Rob's patch for epoll_pwait, and then it stalls out without finishing the simulation:
shadow.log

It looks like it's either the result of the unsupported sched_* system calls, undersupported epoll_pwait, or unsupported levels and options for setsockopt. Not sure which one is causing the issue yet.

@robgjansen
Copy link
Member

Is a CPU core taxed at 100% during the stall, indicating a spin-loop? Otherwise, my guess is that the signaling part of epoll_pwait that we just took out is actually important and needs to be handled correctly.

@cohosh
Copy link
Contributor Author

cohosh commented Jul 27, 2021

Yup it's spinning at 100%.

@robgjansen
Copy link
Member

Hmm, I don't like spin loops :( I'm guessing it's making that sched_yield call inside the spin loop, which we don't handle right now.

There are some ways we could handle infinite loops, none of them are very nice from a design standpoint. In this case, we would probably need to hook the sched_yield call and advance time a bit to hopefully (eventually) cause the managed process to break out of the loop. But I'm not sure if this approach works well in general.

@stevenengler
Copy link
Contributor

stevenengler commented Jul 28, 2021

Here's part of an strace of the server after one curl client, then stopped with SIGINT. I don't see anything related to sched_yield here, but there's a lot of IPv6 and signal stuff that we don't support.

...
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f05f47000
arch_prctl(ARCH_SET_FS, 0x7f0f05f47740) = 0
mprotect(0x7f0f06132000, 12288, PROT_READ) = 0
mprotect(0x7f0f06159000, 4096, PROT_READ) = 0
mprotect(0x7f0f061a2000, 4096, PROT_READ) = 0
munmap(0x7f0f06161000, 80753)           = 0
set_tid_address(0x7f0f05f47a10)         = 3278712
set_robust_list(0x7f0f05f47a20, 24)     = 0
rt_sigaction(SIGRTMIN, {sa_handler=0x7f0f06143bf0, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0x7f0f06143c90, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
brk(NULL)                               = 0x1bf2000
brk(0x1c13000)                          = 0x1c13000
sched_getaffinity(0, 8192, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]) = 8
openat(AT_FDCWD, "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", O_RDONLY) = 3
read(3, "2097152\n", 20)                = 8
close(3)                                = 0
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f05f07000
mmap(NULL, 131072, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f05ee7000
mmap(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f05de7000
mmap(NULL, 8388608, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f055e7000
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f015e7000
mmap(NULL, 536870912, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0ee15e7000
mmap(0xc000000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xc000000000
mmap(0xc000000000, 67108864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xc000000000
mmap(NULL, 33554432, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0edf5e7000
mmap(NULL, 2165776, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0edf3d6000
mmap(0x7f0f05ee7000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0f05ee7000
mmap(0x7f0f05e67000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0f05e67000
mmap(0x7f0f059ed000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0f059ed000
mmap(0x7f0f03617000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0f03617000
mmap(0x7f0ef1767000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f0ef1767000
mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0edf2d6000
mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f06165000
mmap(NULL, 65536, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0edf2c6000
rt_sigprocmask(SIG_SETMASK, NULL, [], 8) = 0
sigaltstack(NULL, {ss_sp=NULL, ss_flags=SS_DISABLE, ss_size=0}) = 0
sigaltstack({ss_sp=0xc000002000, ss_flags=0, ss_size=32768}, NULL) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
gettid()                                = 3278712
rt_sigaction(SIGHUP, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGHUP, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGILL, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGILL, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGTRAP, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGTRAP, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
...
rt_sigaction(SIGRT_27, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGRT_28, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_28, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGRT_29, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_29, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGRT_30, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_30, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGRT_31, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_31, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigaction(SIGRT_32, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGRT_32, {sa_handler=0x46e360, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_ONSTACK|SA_RESTART|SA_SIGINFO, sa_restorer=0x7f0f061513c0}, NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f0edeac5000
mprotect(0x7f0edeac6000, 8388608, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7f0edf2c4fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[3278713], tls=0x7f0edf2c5700, child_tidptr=0x7f0edf2c59d0) = 3278713
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f0ede2c4000
mprotect(0x7f0ede2c5000, 8388608, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7f0edeac3fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[3278714], tls=0x7f0edeac4700, child_tidptr=0x7f0edeac49d0) = 3278714
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGURG {si_signo=SIGURG, si_code=SI_TKILL, si_pid=3278712, si_uid=1000} ---
rt_sigreturn({mask=[]})                 = 0
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f0edd282000
mprotect(0x7f0edd283000, 8388608, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7f0edda81fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[3278716], tls=0x7f0edda82700, child_tidptr=0x7f0edda829d0) = 3278716
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
futex(0xc00004ed50, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xc000080150, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f0edca81000
mprotect(0x7f0edca82000, 8388608, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7f0edd280fb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[3278717], tls=0x7f0edd281700, child_tidptr=0x7f0edd2819d0) = 3278717
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
readlinkat(AT_FDCWD, "/proc/self/exe", "/path/to/go-shadow/go-se"..., 128) = 41
fcntl(0, F_GETFL)                       = 0x2 (flags O_RDWR)
futex(0xc000080150, FUTEX_WAKE_PRIVATE, 1) = 1
fcntl(1, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(2, F_GETFL)                       = 0x2 (flags O_RDWR)
socket(AF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_TCP) = 3
close(3)                                = 0
socket(AF_INET6, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_TCP) = 3
setsockopt(3, SOL_IPV6, IPV6_V6ONLY, [1], 4) = 0
bind(3, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = 0
socket(AF_INET6, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_TCP) = 4
setsockopt(4, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
bind(4, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_scope_id=0}, 28) = 0
close(4)                                = 0
close(3)                                = 0
socket(AF_INET6, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 3
setsockopt(3, SOL_IPV6, IPV6_V6ONLY, [0], 4) = 0
openat(AT_FDCWD, "/proc/sys/net/core/somaxconn", O_RDONLY|O_CLOEXEC) = 4
epoll_create1(EPOLL_CLOEXEC)            = 5
pipe2([6, 7], O_NONBLOCK|O_CLOEXEC)     = 0
epoll_ctl(5, EPOLL_CTL_ADD, 6, {EPOLLIN, {u32=8836008, u64=8836008}}) = 0
epoll_ctl(5, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3727216312, u64=139701833486008}}) = 0
fcntl(4, F_GETFL)                       = 0x8000 (flags O_RDONLY|O_LARGEFILE)
fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
read(4, "4096\n", 65536)                = 5
read(4, "", 65531)                      = 0
epoll_ctl(5, EPOLL_CTL_DEL, 4, 0xc0000c19c4) = 0
close(4)                                = 0
setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(3, {sa_family=AF_INET6, sin6_port=htons(8080), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_scope_id=0}, 28) = 0
listen(3, 4096)                         = 0
epoll_ctl(5, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3727216312, u64=139701833486008}}) = 0
getsockname(3, {sa_family=AF_INET6, sin6_port=htons(8080), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_scope_id=0}, [112->28]) = 0
accept4(3, 0xc0000c1ac8, [112], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
epoll_pwait(5, [], 128, 0, NULL, 8402280) = 0
epoll_pwait(5, [{EPOLLIN, {u32=3727216312, u64=139701833486008}}], 128, -1, NULL, 11) = 1
futex(0x83f7b8, FUTEX_WAKE_PRIVATE, 1)  = 1
futex(0x83f6b8, FUTEX_WAKE_PRIVATE, 1)  = 1
accept4(3, {sa_family=AF_INET6, sin6_port=htons(42484), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_scope_id=0}, [112->28], SOCK_CLOEXEC|SOCK_NONBLOCK) = 4
epoll_ctl(5, EPOLL_CTL_ADD, 4, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3727216080, u64=139701833485776}}) = 0
getsockname(4, {sa_family=AF_INET6, sin6_port=htons(8080), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::ffff:127.0.0.1", &sin6_addr), sin6_scope_id=0}, [112->28]) = 0
setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
setsockopt(4, SOL_TCP, TCP_KEEPINTVL, [15], 4) = 0
setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [15], 4) = 0
futex(0xc00004ed50, FUTEX_WAKE_PRIVATE, 1) = 1
accept4(3, 0xc0000c1ac8, [112], SOCK_CLOEXEC|SOCK_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
epoll_pwait(5, [], 128, 0, NULL, 8402280) = 0
futex(0x83fc50, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0x83fc50, FUTEX_WAIT_PRIVATE, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)

I think we can get rid of the ipv6-related errors using:

server := &http.Server{Handler: handler}
l, err := net.Listen("tcp4", ":80")
if err != nil {
    log.Fatal(err)
}
err = server.Serve(l)

When I run this in shadow, I don't get the same spinning issue. But server.main.1000.stderr says:

Oops ignore me, was running with the unpatched version by accident.

Edit: When running with rob's patched version, shadow gets stuck, and a backtace of the server process says the following (blocking on SYS_epoll_pwait):

(gdb) bt
#0  runtime.epollwait () at /usr/local/go/src/runtime/sys_linux_amd64.s:725
#1  0x0000000000433592 in runtime.netpoll (delay=-1, ~r1=...) at /usr/local/go/src/runtime/netpoll_epoll.go:126
#2  0x000000000043f5ee in runtime.findrunnable (gp=0xc000024000, inheritTime=false) at /usr/local/go/src/runtime/proc.go:2923
#3  0x0000000000440c77 in runtime.schedule () at /usr/local/go/src/runtime/proc.go:3169
#4  0x00000000004411fd in runtime.park_m (gp=0xc000001800) at /usr/local/go/src/runtime/proc.go:3318
#5  0x000000000046a99b in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:327
#6  0x000000000046a894 in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:226
#7  0x0000000000000000 in ?? ()

@stevenengler
Copy link
Contributor

stevenengler commented Jul 28, 2021

@robgjansen: The issue seems to be a problem with shadow's epoll. Shadow is getting stuck at:

00:00:01.065033 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065040 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065056 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065063 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065077 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065084 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065099 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065106 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065120 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065128 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065145 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065152 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065166 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065173 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065189 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065196 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065210 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065217 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065233 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065240 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4.
00:00:01.065254 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:171] [syscallhandler_epoll_wait] Epoll 4 says 1 events are ready.
00:00:01.065261 [worker-0] 00:00:05.012000000 [DEBUG] [server:11.0.0.2] [epoll.c:214] [syscallhandler_epoll_wait] Found 0 ready events on epoll 4

Time doesn't seem to advance, and it seems that there are items in the ready table that aren't actually ready.

I replaced epoll_getNumReadyEvents() with the following as a hack:

guint epoll_getNumReadyEvents(Epoll* epoll) {
    MAGIC_ASSERT(epoll);

    GHashTableIter iter;
    g_hash_table_iter_init(&iter, epoll->ready);

    gpointer key;
    gpointer val;

    while (g_hash_table_iter_next(&iter, (gpointer) &key, (gpointer) &val)) {
        EpollWatch* watch = val;
        if (!_epollwatch_isReady(watch)) {
            g_hash_table_iter_remove(&iter);
        }
    }

    return g_hash_table_size(epoll->ready);
}

And shadow no longer gets stuck, and we successfully see Hi, I'm a simple web server in the curl stdout, which means the go server I think is working correctly.

@stevenengler
Copy link
Contributor

@cohosh If you try the Go server in Shadow it should work now. There are still unsupported syscalls (mainly IPv6 and signal related), but the curl client is able to successfully make requests to the server. If you run into any issues running other Go programs, please let us know.

@cohosh
Copy link
Contributor Author

cohosh commented Aug 3, 2021

Ah thank you! I'll give it a try today.

@cohosh
Copy link
Contributor Author

cohosh commented Aug 3, 2021

I just tried running the go minimal example (with your modifications to prevent the IPv6 errors), it's not terminating, and I'm not getting any successful stdout or stderr for either the client or server, and the shadow.log is getting spammed with these messages:

00:00:16.581396 [worker-0] 00:00:05.012000000 [ERROR] [server:11.0.0.2] [epoll.c:660] [epoll_getEvents] epoll descriptor 6 ready list has items that aren't ready
00:00:16.581456 [worker-0] 00:00:05.012000000 [ERROR] [server:11.0.0.2] [epoll.c:660] [epoll_getEvents] epoll descriptor 6 ready list has items that aren't ready
00:00:16.581506 [worker-0] 00:00:05.012000000 [ERROR] [server:11.0.0.2] [epoll.c:660] [epoll_getEvents] epoll descriptor 6 ready list has items that aren't ready

@stevenengler
Copy link
Contributor

Oops sorry, I should have tested in release mode as well. I have a fix at #1553, which should be merged later today.

@cohosh
Copy link
Contributor Author

cohosh commented Aug 3, 2021

Okay awesome, the simple Go server is working for me now! I'll try the snowflake experiments again :)

@robgjansen
Copy link
Member

Closing based on this comment in #1558 from @cohosh:

Will do! I think we can close #1540 at this point since AFACT the fixes already merged for the system calls are enough for most Go networking programs. I can open more issues/MRs if new issues come up :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Bug Error or flaw producing unexpected results
Projects
None yet
Development

No branches or pull requests

3 participants