BUG: inconsistent handling of ipc syscalls on s390 and s390x #193
Comments
Hi @cpaelzer, thanks for the detailed problem report, but I'm afraid to report that this is the expected behavior, let me try to explain ... At the heart of the problem is the use of the
In other words, Why is this important here? It is important because you are attempting to add a rule for the Look at the revised example:
Does this make sense? However, that said, it does appear that we do have some inconsistency with s390 and/or s390x. For both ABIs the @drakenclimber see the last paragraph above. This combined with the other recent fixes makes me think we need to do a v2.4.3 release. |
Thanks a lot @pcmoore for explaining! You said
But the example given shows only As I said this was triggered by systemd tests and it is of the backend implementing MemoryDenyWriteExecute so we'd want to fix not only the test but the code to work as intended.
OR if it has to be non-exact version as there is no way out with_exact on x86/s390x
|
Oh, and a hint to a commit or so why this changed between 2.4.1 and 2.4.2 would be great as well for when I try to reference things in a systemd discussion. |
Further clarification about s390x in this scope: you said there is some inconsistency. With 2.4.3 and a fix for that in place, what should be expected? Will it be "_exact for shmat will work on s390x, but on s390 it will be multiplexed and needs the non-exact", or will it be different than that? Currently I need the non- |
And - I hope finally - for clarification is there anything exposed by seccomp to keep that domain-knowledge what is multiplexed in libseccomp and make use of that.
Maybe with an optional ARCH argument (=local-arch if not given). |
Since libseccomp 2.4.2 multiplexed system calls fail to be added due to seccomp_rule_add_exact failing on them since they'd need to add multiple rules. See the discussion at seccomp/libseccomp#193 For those cases we need to fall back to the non '_exact' version of the call. Since there is no seccomp_is_multiplexed() call this uses the existing switch/case to differentiate between architectures and adds tracking of calls that will multiplex (for now just shmat). add_seccomp_syscall_filter gets extended to register such multiplexed call with the non-exact seccomp_rule_add which gets the 31/32 bit arches of s390 and x86 to work again. Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
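The fallback described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the systemd patch itself: the `sketch_arch` enum, `sketch_is_multiplexed()` and `sketch_pick_rule_api()` are hypothetical names invented here, and the list of affected architectures is taken only from what is reported in this thread (i386, s390, s390x), so treat it as an assumption rather than an authoritative table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical architecture tags for this sketch (not libseccomp's SCMP_ARCH_* values). */
enum sketch_arch {
    SKETCH_ARCH_X86,
    SKETCH_ARCH_X86_64,
    SKETCH_ARCH_S390,
    SKETCH_ARCH_S390X,
    SKETCH_ARCH_ARM,
    SKETCH_ARCH_AARCH64
};

/* Which rule-adding flavour the caller should use. */
enum sketch_rule_api {
    SKETCH_RULE_ADD_EXACT, /* stands for seccomp_rule_add_exact() */
    SKETCH_RULE_ADD        /* stands for seccomp_rule_add() */
};

/* Track calls that are multiplexed on a given architecture.
 * Assumption: only shmat, and only on the arches named in this thread. */
bool sketch_is_multiplexed(enum sketch_arch arch, const char *syscall_name)
{
    assert(syscall_name != NULL);
    if (strcmp(syscall_name, "shmat") != 0)
        return false;
    switch (arch) {
    case SKETCH_ARCH_X86:
    case SKETCH_ARCH_S390:
    case SKETCH_ARCH_S390X:
        return true;
    default:
        return false;
    }
}

/* Fall back to the non-exact API only where the exact one cannot succeed. */
enum sketch_rule_api sketch_pick_rule_api(enum sketch_arch arch, const char *syscall_name)
{
    return sketch_is_multiplexed(arch, syscall_name) ? SKETCH_RULE_ADD
                                                     : SKETCH_RULE_ADD_EXACT;
}
```

A caller would consult `sketch_pick_rule_api()` before registering each rule and invoke the matching libseccomp function, keeping the strict `_exact` behavior everywhere it still works.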
FYI: I opened a - still experimental - MR on systemd about this. I'd be happy if you can here answer my questions I had since the initial answer. |
Ooops, sorry about that! That was a typo on my part, it should have read: "a similar call to I updated my original comment to fix this.
I would suggest using |
I would need to dig into the code to see why v2.4.1 behaved as it did, but the v2.4.2 behavior (with |
We would need to look into what the Linux kernel supports for s390 and s390x to make sure that libseccomp is doing the right thing. I haven't done that yet, but if you want to get involved in libseccomp we would love the help! |
No, and I don't think we will ever support something like that in libseccomp. One of the main goals in libseccomp is to abstract away all the ABI specifics. We want you to just be able to call |
Thanks for all the answers and clarifications @pcmoore those helped me a lot! Almost as I feared there now is a clash of you
vs @poettering in that discussion
@pcmoore - could I "ask you over" to participate in the discussion there so that we can jointly identify a solution that works, but also is acceptable to systemd? |
Yes, Lennart does what Lennart wants to do, and I suspect my involvement will have little benefit other than intruding on the holidays here in the US (Thanksgiving). I looked quickly at the systemd thread and it appears that there are some possible paths towards a solution. Good luck! |
Since libseccomp 2.4.2 more architectures have shmat handled as multiplexed call. Those will fail to be added due to seccomp_rule_add_exact failing on them since they'd need to add multiple rules [1]. See the discussion at seccomp/libseccomp#193 After discussions about the options rejected [2][3] the initial thought of a fallback to the non '_exact' version of the seccomp rule adding the next option is to handle those now affected (i386, s390, s390x) the same way as ppc which ignores and does not block shmat. [1]: seccomp/libseccomp#193 [2]: systemd#14167 (comment) [3]: systemd@469830d1
Since libseccomp 2.4.2 more architectures have shmat handled as multiplexed call. Those will fail to be added due to seccomp_rule_add_exact failing on them since they'd need to add multiple rules [1]. See the discussion at seccomp/libseccomp#193 After discussions about the options rejected [2][3] the initial thought of a fallback to the non '_exact' version of the seccomp rule adding the next option is to handle those now affected (i386, s390, s390x) the same way as ppc which ignores and does not block shmat. [1]: seccomp/libseccomp#193 [2]: systemd/systemd#14167 (comment) [3]: systemd/systemd@469830d1 (cherry picked from commit bed4668)
Hi @cpaelzer, any chance you could test the fix @drakenclimber made in #206? |
@pcmoore sorry, I was on PTO last week and I missed the ping 27 days ago by @drakenclimber. I revived my test of the initial report (see above) using a local build of current libseccomp master as of today, but the result is still the same:
So it didn't get worse, but the rule add with exact still didn't work - should it have worked? |
No problem @cpaelzer, thanks for checking back and giving it a test. Unfortunately it would appear that more work is needed so I'm going to reopen this issue. @drakenclimber are you okay to keep working on this with @cpaelzer? |
Sure. We can figure this out :) |
@drakenclimber AFAIK at the price of some personal data you can get an s390x VM here |
Thanks! I'll give it a shot with my Oracle email. We'll see what IBM thinks of that ;) |
I was able to reserve an s390x machine. Thanks, @cpaelzer.
Currently I am unable to reproduce the behavior you have reported. Here is what I see when running libseccomp v2.4.1. (I added some extra printfs in the library itself to verify I was running the correct library.)
And for completeness - I am seeing the exact same thing on the head of master.
|
Also, this is the line that is causing the failure. I would like to hear @pcmoore's thoughts on this section of code. It appears that s390 and x86 are designed to always reject multiplexed-socket syscall strict rules that have an argument. I'm not yet sure what changed from v2.4.1 to v2.4.2 for @cpaelzer to see different behavior.
|
When adding a "strict" rule (in other words, using the

The reason we fail multiplexed socket syscalls when an argument check is supplied is that we can not perform the argument check due to the nature of how the syscall is multiplexed. In these cases the individual syscall is passed to the multiplexed syscall as arg0 and the individual syscall's arguments are combined into a structure which is passed, via a pointer, to the multiplexed syscall. It is because of this argument structure passing that we are unable to inspect the argument and do the argument filtering.

... and before anyone asks, this is a kernel limitation, not something we can fix in libseccomp.

The solution is simply to use the non-exact rule APIs and let libseccomp handle it for you. It will do what you want[*], you just have to give it the license to do so ;)

[*] Including generating both the argument comparison filter for the direct wired syscall as well as the slightly munged filter for the multiplexed syscall so that if you run on a newer kernel/libc the argument comparison will be made, but it will still do the basic syscall checking on an older kernel/libc. I'm not aware of any other seccomp library/mechanism that does this for you. |
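To make the limitation described above concrete, here is a small self-contained model of what a filter can and cannot see. It is only a sketch under stated assumptions: the syscall numbers, `SUBCALL_ID`, `FLAG_EXEC`, and all function names are hypothetical and chosen for illustration, and the code models the idea in plain C rather than using the kernel's seccomp/BPF machinery or the libseccomp API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration only; not real syscall numbers. */
enum { NR_DIRECT = 1000, NR_MUX = 1001, SUBCALL_ID = 21 };
#define FLAG_EXEC UINT64_C(0x8000)

/* All a filter sees: the syscall number and the raw register values. */
struct observed_syscall {
    int nr;
    uint64_t args[6];
};

/* Direct-wired entry point: the flags arrive in a register (args[2]). */
struct observed_syscall observe_direct(uint64_t id, uint64_t addr, uint64_t flags)
{
    struct observed_syscall s = { .nr = NR_DIRECT, .args = { id, addr, flags } };
    return s;
}

/* Multiplexed entry point: arg0 selects the sub-call, and the real
 * arguments stay in caller memory; only their address reaches the filter. */
struct observed_syscall observe_mux(const uint64_t *packed_args)
{
    assert(packed_args != NULL);
    struct observed_syscall s = {
        .nr = NR_MUX,
        .args = { SUBCALL_ID, (uint64_t)(uintptr_t)packed_args },
    };
    return s;
}

/* An argument check is possible here: a masked compare on a register value. */
bool direct_rule_matches(const struct observed_syscall *s)
{
    return s->nr == NR_DIRECT && (s->args[2] & FLAG_EXEC) == FLAG_EXEC;
}

/* Here the filter can match the sub-call id but cannot follow args[1],
 * so it cannot tell whether FLAG_EXEC is set in the packed arguments. */
bool mux_rule_matches(const struct observed_syscall *s)
{
    return s->nr == NR_MUX && s->args[0] == SUBCALL_ID;
}
```

In this model the direct rule distinguishes calls with and without `FLAG_EXEC`, while the multiplexed rule matches both, which mirrors why a strict rule with an argument check cannot be honored on the multiplexed path.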
Thanks! That's how I was reading the situation, but it was nice to hear the official explanation. Much appreciated. I guess I don't understand how this worked for @cpaelzer in v2.4.1 and then broke in v2.4.2. This code didn't change between v2.4.1 and v2.4.2. (There is one commit that went into v2.4.0 that moved |
@drakenclimber IIRC this was dependent on kernel and libseccomp. At the time I think I had a 5.3 kernel and on that the switch of 2.4.1->2.4.2 broke it. I got the same helpful explanation about We never tracked down the root cause changing it in 2.4.2 (see here ) but since the fail was the expected behavior we decided to fix it in systemd. |
Thanks for the help, @cpaelzer. I think we've done what we can - both in systemd and here in libseccomp. I'll close this one out for now. Feel free to reach out if you have any further issues. |
Hi,
I was trying to push 2.4.2 to Ubuntu and after the few cleanups we had in 2.4.1->2.4.2 that looked pretty good at first. But when the extended checks kicked in I reliably found the systemd self-tests breaking on `i386` and `s390x`. For what it is worth, they worked reliably on `x86-64`, `armhf`, `arm64` and `ppc64el`. I know that there were related systemd fixups but those are not strictly required since our fix triggered by my chrony tests.
After further debugging of the testcase I found that, of the many things that systemd runs, all kinds of activity around `shmat` seem to have changed. The number resolves to 397 with old and new seccomp, but fails with version 2.4.2
I was going back and forth and isolated the test more and more to now have reached a minimal test program that on `i386` as well as `s390x` reliably works/fails with `2.4.1`/`2.4.2`.
Then build with:
Tests on i386 with the test I pasted above:
s390x looks identical to the i386 output
Note: rebuilding on new libseccomp2 2.4.2 does not change this behavior
Now I'm wondering, is that a yet undiscovered issue in libseccomp 2.4.2 and if so what could it be?
Or if you can clearly state that this is right and should never have worked could you outline why it affects just i386 and s390x and how systemd (see the related systemd test / code here) would need to adapt so that I can start a discussion there?
P.S. The kernel used is Ubuntu's 5.3.0-18-generic in all tests