Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: unable to call seccomp_load() more than once with v2.5.0 #273

Closed
stoeckmann opened this issue Jul 23, 2020 · 28 comments · Fixed by #280
Closed

BUG: unable to call seccomp_load() more than once with v2.5.0 #273

stoeckmann opened this issue Jul 23, 2020 · 28 comments · Fixed by #280
Assignees
Milestone

Comments

@stoeckmann
Copy link

Hi,

a user of xwallpaper noticed an issue with libseccomp 2.5.0, see here: stoeckmann/xwallpaper#23

The tool xwallpaper calls seccomp_load twice during runtime to further tighten its sandbox during runtime. An example for "stage 1" sandbox would be opening files and only in "stage 2" these files would be parsed, not allowing further open calls etc.

I compared libseccomp 2.4.3 and 2.5.0 and figured out that 2.5.0 would add SECCOMP_FILTER_FLAG_NEW_LISTENER to seccomp call. When setting up the "stage 1" sandbox, everything is fine. The seccomp system call returns a file descriptor for the listener to use. But if "stage 2" is set up, SECCOMP_FILTER_FLAG_NEW_LISTENER is added again and seccomp returns -1 with EBUSY.

Forcefully denying SECCOMP_FILTER_FLAG_NEW_LISTENER by patching the code made xwallpaper functional again.

Since I am not implying that you should use a fullblown tool like xwallpaper to debug this issue, you can modify libseccomp-2.5.0/tests/52-basic-load.c to call seccomp_load twice, which makes it fail.

My understand is that seccomp_load is allowed to be called multiple times for exactly this kind of use case, i.e. adding more filters. Please let me know if I am mistaken here or if there is another way to add filter rules.

Thanks in advance!

@stoeckmann
Copy link
Author

The seccomp code in question is rather separated from other xwallpaper code parts, so you can see xwallpaper's libseccomp usage here: https://github.com/stoeckmann/xwallpaper/blob/master/seccomp.c

@pcmoore pcmoore added this to the v2.5.1 milestone Jul 23, 2020
@pcmoore pcmoore changed the title Unable to call seccomp_load twice with 2.5.0 BUG: unable to call seccomp_load() more than once with v2.5.0 Jul 23, 2020
@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

Hi @stoeckmann, thanks for the bug report! Yes, you are correct that we do allow seccomp_load() to be called multiples times.

We'll need to think about how we want to resolve this.

@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

@stoeckmann @drakenclimber, we still need to look at how the kernel handles multiple notification fd requests to make sure this is correct, but my initial thinking is that we may need to keep a global flag (yuck), which tracks if we've already requested a notification fd and if we have one, we don't request another.

Essentially we would be moving notify_fd from the filter context to a global value in "src/system.c".

@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

Building on my previous comment, we might want to allow a caller to request another notification fd on the odd chance that they did a fork() or a clone() after requesting a notification fd. We really need to understand what the kernel does in these cases with respect to the notification fd.

If we need to handle this, we could always keep the notify_fd in the filter context while also keeping a global value. On seccomp_init() the context would inherit the fd from the global value, but a seccomp_reset() would still reset it to -1 while leaving the global value intact. This would require a caller to do an init/reset for each filter after a fork where they wanted to request a new notification fd, but that seems like a reasonable compromise for such a niche use case.

@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

... or, maybe we allow a caller to call seccomp_reset(NULL, ...) instead of returning -EINVAL? In this particular case, the NULL context, we would reset any global state, e.g. the saved notification fd.

@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

To close the loop on the kernel portion of this, yes, it looks like if any of the filters loaded into the current task have requested a notifier fd then the load operation fails with -EBUSY, see kernel/seccomp.c:init_listener().

@pcmoore pcmoore self-assigned this Jul 23, 2020
@brauner
Copy link

brauner commented Jul 23, 2020

To close the loop on the kernel portion of this, yes, it looks like if any of the filters loaded into the current task have requested a notifier fd then the load operation fails with -EBUSY, see kernel/seccomp.c:init_listener().

The seccomp notifier currently has the restriction that only a single fd can be requested. This prevents seccomp notify fd filters to override each other's decisions.

@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

@brauner yeah.

Do you anticipate the kernel's behavior changing in this regard anytime soon? We need to figure out a way to support this in libseccomp and it would be helpful to know about any future or queued work so we don't end up painting ourselves into a corner API wise.

@brauner
Copy link

brauner commented Jul 23, 2020

@brauner yeah.

Do you anticipate the kernel's behavior changing in this regard anytime soon? We need to figure out a way to support this in libseccomp and it would be helpful to know about any future or queued work so we don't end up painting ourselves into a corner API wise.

It has been brought up recently but no-one has actually made an argument why it's needed. So no, I don't think so. I think once we enable this we might need to revisit the "override" behavior.

@pcmoore
Copy link
Member

pcmoore commented Jul 23, 2020

Okay, let me think on this for a bit, as well as get @drakenclimber's thoughts on the API, but my gut feeling is we probably want to solve this by stashing the notification fd in a global and providing the ability to reset/clear it via seccomp_reset(NULL, ...).

ahesford added a commit to ahesford/void-packages that referenced this issue Jul 24, 2020
This reverts commit aa21e5a.

According to seccomp/libseccomp#273 libseccomp
v2.5.0 unintentionally refuses to allow multiple calls to seccomp_load,
which is a regression.

As a result, xwallpaper is inoperable:
stoeckmann/xwallpaper#23
ahesford added a commit to void-linux/void-packages that referenced this issue Jul 24, 2020
This reverts commit aa21e5a.

According to seccomp/libseccomp#273 libseccomp
v2.5.0 unintentionally refuses to allow multiple calls to seccomp_load,
which is a regression.

As a result, xwallpaper is inoperable:
stoeckmann/xwallpaper#23
pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Jul 26, 2020
XXX - work in progress

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
@pcmoore
Copy link
Member

pcmoore commented Jul 26, 2020

@drakenclimber what do think of the approach in the commit below?

@stoeckmann the commit above should fix the problem you are seeing, if you could give it a test and report back that would be appreciated.

@stoeckmann
Copy link
Author

Hi @pcmoore,

just tested the commit. xwallpaper works with it again.

pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Jul 27, 2020
XXX - work in progress

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
@pcmoore
Copy link
Member

pcmoore commented Jul 27, 2020

Thanks for verifying this @stoeckmann; if the approach looks okay to @drakenclimber I'll submit a proper PR.

I did just do a force-push to address a problem with one of the regressions tests (test 11 was still expecting seccomp_reset(NULL, ...) to return an error).

@drakenclimber
Copy link
Member

Thanks for verifying this @stoeckmann; if the approach looks okay to @drakenclimber I'll submit a proper PR.

I briefly looked through the proposed patch, and it seems like a reasonable solution to the problem. Thanks for the fix. I'll give it a closer look once you put out the PR.

pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Aug 2, 2020
It turns out that requesting the seccomp userspace notifcation fd
more than once is a bad thing which causes the kernel to complain
(rightfully so for a variety of reasons).  Unfortunately as we were
always requesting the notification fd whenever possible this results
in problems at filter load time.

Our solution is to move the notification fd out of the filter context
and into the global task context, using a newly created task_state
structure.  This allows us to store, and retrieve the notification
outside the scope of an individual filter context.  It also provides
some implementation improvements by giving us a convenient place to
stash all of the API level related support variables.  We also extend
the seccomp_reset() API call to reset this internal global state when
passed a NULL filter context.

There is one potential case which we don't currently handle well:
threads.  At the moment libseccomp is thread ignorant, and that works
well as the only global state up to this point was the currently
supported API level information which was common to all threads in a
process.  Unfortunately, it appears that the notification fd need not
be common to all threads in a process, yet this patch treats it as if
it is common.  I suspect this is a very unusual use case so I decided
to keep this patch simple and ignore this case, but in the future if
we need to support this properly we should be able to do so without
API changes by keeping an internal list of notification fds indexed
by gettid(2).

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
@pcmoore pcmoore linked a pull request Aug 2, 2020 that will close this issue
@pcmoore
Copy link
Member

pcmoore commented Aug 2, 2020

Proper PR time, see PR #280.

@poettering
Copy link

(For context: systemd uses libseccomp heavily, most of its daemons come with a syscall filter enforced using libseccomp. All relevant filters ended up being disabled in stable f32 due to this bug, because we install the filters for each arch separately, as well as for each of our user-selectable configuration options separately, which means in effect on x86-64 only a tiny subset of the filters and only for the i386 syscall arch got installed, so that everything effectively is wide open. This bug is quite a biggie to us.)

pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Aug 4, 2020
It turns out that requesting the seccomp userspace notifcation fd
more than once is a bad thing which causes the kernel to complain
(rightfully so for a variety of reasons).  Unfortunately as we were
always requesting the notification fd whenever possible this results
in problems at filter load time.

Our solution is to move the notification fd out of the filter context
and into the global task context, using a newly created task_state
structure.  This allows us to store, and retrieve the notification
outside the scope of an individual filter context.  It also provides
some implementation improvements by giving us a convenient place to
stash all of the API level related support variables.  We also extend
the seccomp_reset() API call to reset this internal global state when
passed a NULL filter context.

There is one potential case which we don't currently handle well:
threads.  At the moment libseccomp is thread ignorant, and that works
well as the only global state up to this point was the currently
supported API level information which was common to all threads in a
process.  Unfortunately, it appears that the notification fd need not
be common to all threads in a process, yet this patch treats it as if
it is common.  I suspect this is a very unusual use case so I decided
to keep this patch simple and ignore this case, but in the future if
we need to support this properly we should be able to do so without
API changes by keeping an internal list of notification fds indexed
by gettid(2).

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Signed-off-by: Paul Moore <paul@paul-moore.com>
pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Aug 4, 2020
It turns out that requesting the seccomp userspace notifcation fd
more than once is a bad thing which causes the kernel to complain
(rightfully so for a variety of reasons).  Unfortunately as we were
always requesting the notification fd whenever possible this results
in problems at filter load time.

Our solution is to move the notification fd out of the filter context
and into the global task context, using a newly created task_state
structure.  This allows us to store, and retrieve the notification
outside the scope of an individual filter context.  It also provides
some implementation improvements by giving us a convenient place to
stash all of the API level related support variables.  We also extend
the seccomp_reset() API call to reset this internal global state when
passed a NULL filter context.

There is one potential case which we don't currently handle well:
threads.  At the moment libseccomp is thread ignorant, and that works
well as the only global state up to this point was the currently
supported API level information which was common to all threads in a
process.  Unfortunately, it appears that the notification fd need not
be common to all threads in a process, yet this patch treats it as if
it is common.  I suspect this is a very unusual use case so I decided
to keep this patch simple and ignore this case, but in the future if
we need to support this properly we should be able to do so without
API changes by keeping an internal list of notification fds indexed
by gettid(2).

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Acked-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
@pcmoore
Copy link
Member

pcmoore commented Aug 4, 2020

@poettering if this bug is a "biggie" for you on F32 stable, you might consider sticking with the v2.4.x release stream until the next v2.5.x release. The libseccomp v2.4.x release is still supported and I believe @drakenclimber is planning on a new v2.4.x release soon.

pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Aug 4, 2020
It turns out that requesting the seccomp userspace notifcation fd
more than once is a bad thing which causes the kernel to complain
(rightfully so for a variety of reasons).  Unfortunately as we were
always requesting the notification fd whenever possible this results
in problems at filter load time.

Our solution is to move the notification fd out of the filter context
and into the global task context, using a newly created task_state
structure.  This allows us to store, and retrieve the notification
outside the scope of an individual filter context.  It also provides
some implementation improvements by giving us a convenient place to
stash all of the API level related support variables.  We also extend
the seccomp_reset() API call to reset this internal global state when
passed a NULL filter context.

There is one potential case which we don't currently handle well:
threads.  At the moment libseccomp is thread ignorant, and that works
well as the only global state up to this point was the currently
supported API level information which was common to all threads in a
process.  Unfortunately, it appears that the notification fd need not
be common to all threads in a process, yet this patch treats it as if
it is common.  I suspect this is a very unusual use case so I decided
to keep this patch simple and ignore this case, but in the future if
we need to support this properly we should be able to do so without
API changes by keeping an internal list of notification fds indexed
by gettid(2).

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Acked-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
@Conan-Kudo
Copy link

@pcmoore I do not intend to roll back the libseccomp update. Please do the right thing and fix this in 2.5.x.

@pcmoore
Copy link
Member

pcmoore commented Aug 4, 2020

@Conan-Kudo please read the PR and see we are.

@Conan-Kudo
Copy link

@pcmoore Yup, sorry, I saw that later. Unfortunately this new GitHub UI makes it less obvious that there's a PR linked to an issue now...

@pcmoore
Copy link
Member

pcmoore commented Aug 4, 2020

@Conan-Kudo well to be frank I was more concerned about the attitude that came across with those comments than the fact that you had missed the PR in progress. This was a significant libseccomp release (v2.4.x -> v2.5.0) and there are bound to be bugs that aren't found until the release hits the major distros, if you rushed it from Rawhide into a Fedora stable release without sufficient testing with systemd, don't blame the upstream project.

@Conan-Kudo
Copy link

Conan-Kudo commented Aug 4, 2020

I've had a pretty bad day of badgering from various people all day over this and other things.

As for "rushing" it, I did not do that. Since it was ABI compatible, I considered it safe to submit and didn't see anything obviously wrong when I tried it locally. I pushed it into Bodhi and let it sit for testing like everything else. It was given karma by other folks and made it to stable accordingly last week. This bug only showed up in the last day or so.

@pcmoore
Copy link
Member

pcmoore commented Aug 5, 2020

I've had a pretty bad day of badgering from various people all day over this and other things.

Then you should understand how your "I do not intend to roll back the libseccomp update. Please do the right thing and fix this in 2.5.x." comment sounds.

@Conan-Kudo
Copy link

Yeah, yeah... I know. 😩

Anyway, I've cherry-picked the PR into Fedora: https://src.fedoraproject.org/rpms/libseccomp/c/b7df1ffe6c217ba2fb9251b5b9c67f12040a68b5

@pcmoore
Copy link
Member

pcmoore commented Aug 5, 2020

To elaborate a bit more, it's the "please do the right thing" that is especially grating. It implies that we are intentionally trying to do the wrong thing, which is maddening when you consider the time and energy that goes into a release.

Listen, everybody hates it when stuff breaks, it sucks; it sucks doubly hard when you're overworked, burned out, and likely going through other stuff with this global pandemic. Just assume we're upset it's broken too and we're trying to come up with a fix that doesn't screw things up worse - okay?

@Conan-Kudo
Copy link

Yep, fine with me. I'll try to keep it in mind for the next time...

@pcmoore
Copy link
Member

pcmoore commented Aug 5, 2020

Anyway, I've cherry-picked the PR into Fedora: https://src.fedoraproject.org/rpms/libseccomp/c/b7df1ffe6c217ba2fb9251b5b9c67f12040a68b5

Okay, just buyer-beware that hasn't been merged yet, although there are no ABI changes so it should be safe'ish ... ? Regardless, let us know how that goes; feedback on if it solves the problem for the Fedora crowd would be good.

@Conan-Kudo
Copy link

Yeah, I figured that the severity of the unintended breakage would warrant pulling it in early and getting it evaluated to fix the issue we've hit.

pcmoore added a commit to pcmoore/misc-libseccomp that referenced this issue Aug 18, 2020
It turns out that requesting the seccomp userspace notifcation fd
more than once is a bad thing which causes the kernel to complain
(rightfully so for a variety of reasons).  Unfortunately as we were
always requesting the notification fd whenever possible this results
in problems at filter load time.

Our solution is to move the notification fd out of the filter context
and into the global task context, using a newly created task_state
structure.  This allows us to store, and retrieve the notification
outside the scope of an individual filter context.  It also provides
some implementation improvements by giving us a convenient place to
stash all of the API level related support variables.  We also extend
the seccomp_reset() API call to reset this internal global state when
passed a NULL filter context.

There is one potential case which we don't currently handle well:
threads.  At the moment libseccomp is thread ignorant, and that works
well as the only global state up to this point was the currently
supported API level information which was common to all threads in a
process.  Unfortunately, it appears that the notification fd need not
be common to all threads in a process, yet this patch treats it as if
it is common.  I suspect this is a very unusual use case so I decided
to keep this patch simple and ignore this case, but in the future if
we need to support this properly we should be able to do so without
API changes by keeping an internal list of notification fds indexed
by gettid(2).

This fixes the GitHub issue below:
* seccomp#273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Acked-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
pcmoore added a commit that referenced this issue Aug 18, 2020
It turns out that requesting the seccomp userspace notifcation fd
more than once is a bad thing which causes the kernel to complain
(rightfully so for a variety of reasons).  Unfortunately as we were
always requesting the notification fd whenever possible this results
in problems at filter load time.

Our solution is to move the notification fd out of the filter context
and into the global task context, using a newly created task_state
structure.  This allows us to store, and retrieve the notification
outside the scope of an individual filter context.  It also provides
some implementation improvements by giving us a convenient place to
stash all of the API level related support variables.  We also extend
the seccomp_reset() API call to reset this internal global state when
passed a NULL filter context.

There is one potential case which we don't currently handle well:
threads.  At the moment libseccomp is thread ignorant, and that works
well as the only global state up to this point was the currently
supported API level information which was common to all threads in a
process.  Unfortunately, it appears that the notification fd need not
be common to all threads in a process, yet this patch treats it as if
it is common.  I suspect this is a very unusual use case so I decided
to keep this patch simple and ignore this case, but in the future if
we need to support this properly we should be able to do so without
API changes by keeping an internal list of notification fds indexed
by gettid(2).

This fixes the GitHub issue below:
* #273

Reported-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Acked-by: Tom Hromatka <tom.hromatka@oracle.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
(imported from commit ce314fe)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants