Investigate why the setrlimit call in PAL on Alpine breaks GDB #7487

Closed
janvorli opened this issue Oct 4, 2016 · 2 comments

@janvorli
Member

commented Oct 4, 2016

While enabling the CoreCLR build on Alpine, I found that INIT_IncreaseDescriptorLimit somehow breaks GDB. As soon as I step over the setrlimit call in there, GDB reports an unknown signal and can no longer do things like continue execution. Even just running a CoreCLR app under GDB without any breakpoints causes the same issue at that point.

This is what happens:

999         result = getrlimit(RLIMIT_NOFILE, &rlp);
(gdb) n
1000        if (result != 0)
(gdb)
1006        rlp.rlim_cur = rlp.rlim_max;
(gdb) p/x rlp
$1 = {rlim_cur = 0x80000, rlim_max = 0x100000}
(gdb) n
1007        result = setrlimit(RLIMIT_NOFILE, &rlp);
(gdb)

Thread 3 "corerun" received signal ?, Unknown signal.
[Switching to LWP 28373]
0x00007ffff7da8087 in syscall () from /lib/ld-musl-x86_64.so.1

If I run it outside of GDB, I don't notice any problem.
I've even tried to lower the rlim_max by 1 before assigning it to the rlim_cur, but it didn't help.
It is interesting that LLDB doesn't have this problem.

What's even more interesting, a simple test app that contains just the body of INIT_IncreaseDescriptorLimit in main does not show this issue when I step through it in GDB.
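
For anyone who wants to try this outside CoreCLR, here is a minimal sketch of such a test, with one extra thread alive while setrlimit runs. The helper names (increase_descriptor_limit, waiter, run_repro) are made up for illustration and are not the PAL code; the limit-raising steps simply mirror the lines visible in the GDB session above. The extra thread is an assumption about what differs from the single-threaded test app, and I have not confirmed that this triggers the GDB problem.

```c
#include <pthread.h>
#include <sys/resource.h>

/* Hypothetical helper, not the PAL source: repeats the steps shown in the
   GDB session (getrlimit, raise the soft limit to the hard limit, setrlimit).
   Returns 0 on success and -1 on failure. */
static int increase_descriptor_limit(void)
{
    struct rlimit rlp;

    if (getrlimit(RLIMIT_NOFILE, &rlp) != 0)
        return -1;

    rlp.rlim_cur = rlp.rlim_max;
    return setrlimit(RLIMIT_NOFILE, &rlp);
}

/* Held by run_repro so the second thread stays alive during setrlimit. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&gate);   /* blocks until run_repro unlocks the gate */
    pthread_mutex_unlock(&gate);
    return NULL;
}

/* Starts one extra thread, raises the descriptor limit while that thread
   exists, then lets the thread finish. Returns the setrlimit result. */
int run_repro(void)
{
    pthread_t thread;
    int result;

    pthread_mutex_lock(&gate);
    if (pthread_create(&thread, NULL, waiter, NULL) != 0) {
        pthread_mutex_unlock(&gate);
        return -1;
    }

    result = increase_descriptor_limit();

    pthread_mutex_unlock(&gate);
    pthread_join(thread, NULL);
    return result;
}
```

Calling run_repro() from main, building with -g -pthread, and stepping over the setrlimit line in GDB on Alpine would be the experiment to compare against the single-threaded case.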

@shaharv

commented Sep 3, 2018

I've run into a similar issue with OpenJDK on Alpine v3.8, so I hope I can shed some light on it.

The culprit is musl's setrlimit implementation (setrlimit.c):

int setrlimit(int resource, const struct rlimit *rlim)
{
	struct ctx c = { .res = resource, .rlim = rlim, .err = -1 };
	__synccall(do_setrlimit, &c);
	if (c.err) {
		if (c.err>0) errno = c.err;
		return -1;
	}
	return 0;
}

It is an AS-safe implementation that updates the limits of all active threads in a synchronized manner, which is done by __synccall (synccall.c). Once the threads are blocked and ready to be updated, __synccall sends each of them a SIGSYNCCALL signal:

r = -__syscall(SYS_tgkill, pid, tid, SIGSYNCCALL);

Trouble is, SIGSYNCCALL is internal to musl (defined in internal/pthread_impl.h) and not handled by gdb. It seems gdb handles all signal types explicitly, but SIGSYNCCALL is not among the handled signals (see gdb's signals.c). Therefore, when the signal is raised, gdb stops with the Unknown signal error.

The issue seems to be specific to gdb's implementation, which could explain why it is not encountered in lldb.

@janvorli

Member Author

commented Sep 3, 2018

@shaharv thank you so much for the details, that explains the issue perfectly!
cc: @sergiy-k

@janvorli janvorli closed this Sep 3, 2018
