iv_fd_poll: epoll_pwait2() returns EPERM on CentOS7 with unprivileged… #33

bazsi · 2024-05-08T12:01:24Z

On RHEL7, in an unprivileged container on a 3.10-based kernel, syslog-ng aborts because the automatic fallback to epoll_wait() does not kick in.

The fallback only triggers on ENOSYS. Because this is an unprivileged container, and because epoll_pwait2() is an unknown system call on 3.10, we get EPERM instead of ENOSYS.

See for example: axoflow/axosyslog#85 (comment)

… containers

Work this around by accepting EPERM as well.

Signed-off-by: Balazs Scheidler <balazs.scheidler@axoflow.com>

buytenh · 2024-05-09T19:26:39Z

Is this EPERM behavior a (mis)feature of podman? I would argue that it is a bug in the container technology to be returning EPERM for random system calls instead of ENOSYS -- what do you think?

MrAnno · 2024-05-09T20:07:51Z

Probably this one:
containers/podman#10337
containers/common#573

bazsi · 2024-05-09T21:00:41Z

I'd agree. This is on CentOS 7 with an ancient podman. It wasn't reported on CentOS 8, so it is probably fixed in later versions.

The issue is, this is never going to get fixed in that podman release.

The only workaround is to disable epoll completely. Without that, we abort at startup, which is kind of ugly.

I understand you don't want to merge this, but I'm not sure how to fix this. Don't want to use a local patch.

bazsi · 2024-05-15T13:59:38Z

This came up again. I tried running this on a 32-bit Raspberry Pi with Debian buster (old, I know) and the same thing happens. So it's not just RHEL7 that has this problem; there's probably a range of kernel/docker combinations where epoll_pwait2() returns EPERM rather than ENOSYS.

I am inclined to add a workaround somehow, as this generates pretty ugly SIGABRTs at startup. Maybe the probing for epoll_pwait2() needs to be a bit more complex and done in the auto-detection path? Would you be open to taking such a patch?
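For illustration, the kind of startup probe suggested above could look roughly like the sketch below. This is not ivykis code: `probe_epoll_pwait2()` and the `probe_result` enum are hypothetical names, and the sketch assumes that a single zero-timeout call on a throwaway epoll fd is enough to tell whether the system call is usable. It issues the raw system call so that it does not depend on the libc providing an epoll_pwait2() wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical result type for the probe. */
enum probe_result { PWAIT2_OK, PWAIT2_UNAVAILABLE, PWAIT2_ERROR };

/* Hypothetical probe: one zero-timeout call on a throwaway epoll fd. */
static enum probe_result probe_epoll_pwait2(void)
{
#ifdef SYS_epoll_pwait2
	/* Two 64-bit fields, matching the kernel's 64-bit timespec layout. */
	struct { long long tv_sec; long long tv_nsec; } zero = { 0, 0 };
	struct epoll_event ev;
	int epfd, ret, err;

	epfd = epoll_create1(EPOLL_CLOEXEC);
	if (epfd < 0)
		return PWAIT2_ERROR;

	ret = syscall(SYS_epoll_pwait2, epfd, &ev, 1, &zero, NULL, (size_t)0);
	err = errno;
	close(epfd);

	/* Nothing is registered on epfd, so no events can be reported. */
	assert(ret <= 0);

	if (ret == 0)
		return PWAIT2_OK;
	/* ENOSYS: kernel lacks the call. EPERM: blocked by a container filter. */
	if (err == ENOSYS || err == EPERM)
		return PWAIT2_UNAVAILABLE;
	return PWAIT2_ERROR;
#else
	/* Built against headers that do not know the system call at all. */
	return PWAIT2_UNAVAILABLE;
#endif
}
```

A caller could run this once during poll method auto-detection and skip epoll_pwait2() entirely when the result is `PWAIT2_UNAVAILABLE`, so the main loop never sees the unexpected EPERM.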

@mstopa-splunk

Commit 491daf4 ("iv_fd_epoll: Add support for epoll_pwait2().") added support for epoll_pwait2(), with a fallback to epoll_wait() in case epoll_pwait2() is not supported by the kernel we are running on, which would be indicated by epoll_pwait2() returning -ENOSYS.

Some reports (e.g. axoflow/axosyslog#85, #33 (comment)) suggest that some container technologies can cause -EPERM to be returned for epoll_pwait2(), independently of whether or not epoll_pwait2() is actually supported by the kernel we are running on, and this trips us up because we don't currently handle -EPERM gracefully, as we did not expect that we would have to do so.

Making system calls return -EPERM to indicate that they were filtered out by a security policy framework seems somewhat dubious, especially when considering the amount of application and user confusion generated by system calls that are not documented as being able to fail with -EPERM now suddenly being able to fail with -EPERM, but there is not much we can do about this.

I would be against adding EPERM-as-ENOSYS fallbacks for every current or future case where we handle ENOSYS, but:

1. it seems that this is the only case where this triggers;

2. upstream seems to agree that this EPERM behavior is a bug (see e.g. these links dug up by László Várady: containers/common#573, containers/podman#10337, opencontainers/runtime-spec#1087), so there will hopefully be no new cases of this in the future;

3. there's at least one container technology release (podman on CentOS 7) where this bug triggers and where the platform is sufficiently old to no longer be receiving updates, as pointed out by Balazs Scheidler, so this issue can't be fixed by users updating their container software.

Under these circumstances, adding a workaround on our end seems reasonable, and this commit does so.

This issue was originally reported by @mstopa-splunk on GitHub. Workaround originally by Balazs Scheidler.

Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
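The workaround described in this commit message can be sketched as follows. This is a minimal illustration, not the code from the commit: `wait_with_fallback()`, `means_unavailable()` and `pwait2_usable` are hypothetical names, and the sketch assumes a relative timeout and the raw system call rather than a libc wrapper.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/epoll.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Cleared once epoll_pwait2() turns out to be unusable (hypothetical flag). */
static int pwait2_usable = 1;

/*
 * ENOSYS means the kernel lacks the call. EPERM is accepted as well,
 * because some container filters report a blocked call that way.
 */
static int means_unavailable(int err)
{
	return err == ENOSYS || err == EPERM;
}

/* Hypothetical wrapper: wait up to `timeout`, preferring epoll_pwait2(). */
static int wait_with_fallback(int epfd, struct epoll_event *ev, int maxevents,
			      const struct timespec *timeout)
{
	long ms;

	assert(maxevents > 0);

#ifdef SYS_epoll_pwait2
	if (pwait2_usable) {
		/* Two 64-bit fields, matching the kernel's 64-bit timespec. */
		struct { long long tv_sec; long long tv_nsec; } ts = {
			timeout->tv_sec, timeout->tv_nsec
		};
		int ret = syscall(SYS_epoll_pwait2, epfd, ev, maxevents,
				  &ts, NULL, (size_t)0);

		if (ret >= 0 || !means_unavailable(errno))
			return ret;

		/* Remember the result so we do not retry on every wait. */
		pwait2_usable = 0;
	}
#endif

	/* Millisecond fallback, rounded up so we never wake early. */
	ms = timeout->tv_sec * 1000L + (timeout->tv_nsec + 999999L) / 1000000L;
	return epoll_wait(epfd, ev, maxevents, (int)ms);
}
```

Any other errno, such as EINTR, is still returned to the caller unchanged, which keeps the EPERM special case confined to this one system call.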

buytenh · 2024-05-16T06:07:20Z

I wrote a commit message rationalising adding this workaround, and added a comment to the relevant code:

a98ca24

Does this look OK? If so, I can release this as v0.43.1.

ikheifets-splunk · 2024-05-27T18:20:26Z

@buytenh do you have plans to release this fix? We also depend on your lib.

buytenh · 2024-05-27T18:29:46Z

> @buytenh do you have plans to release this fix? We also depend on your lib.

I was hoping that @bazsi could test and ACK this, given that he originally opened this PR.

bazsi · 2024-05-28T07:29:32Z

This indeed looks good to me. I'll test this later today. Thanks a lot Lennert.

bazsi · 2024-06-01T06:45:11Z

I got around to testing this, and I can confirm it fixes the issue for syslog-ng on my aging 32-bit Raspberry Pi. Where it previously crashed at startup, it now starts up properly.

buytenh · 2024-06-01T07:30:30Z

Fix from #33 (comment) merged into master.

iv_fd_poll: epoll_pwait2() returns EPERM on CentOS7 with unprivileged…

6cae9fd

… containers Work this around by accepting EPERM as well. Signed-off-by: Balazs Scheidler <balazs.scheidler@axoflow.com>

bazsi closed this May 28, 2024

bazsi reopened this May 28, 2024

Repository owner deleted a comment from bazsi May 28, 2024

buytenh closed this Jun 1, 2024
