New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
selinux: handle getcon_raw producing a NULL pointer, despite returning 0 #16571
Conversation
Previously, we assumed that success meant we definitely got a valid pointer. There is at least one edge case where this is not true (i.e., we can get both a 0 return value, and *also* a NULL pointer): https://github.com/SELinuxProject/selinux/blob/4246bb550dee5246c8567804325b7da206cd76cf/libselinux/src/procattr.c#L175 When this case occurrs, if we don't check the pointer we SIGSEGV in early initialization.
Pfffffffffffffff. LGTM. |
semaphoreci:
Unrelated. |
FWIW, this should really never happen, |
Interesting. I have a kernel upon which it does happen (this was my motivation for the patch), but admittedly it's a weird kernel, in that it has a lot of Google patches so it diverges significantly from upstream. It's totally possible it's just misbehaving in some way. Thanks for the pointer, I'll investigate and see if I can make this kernel a bit less weird. :) |
What is getcon_raw() returning 0 with NULL supposed to mean though? Can a process have no label on Selinux? this sounds fishy to me, like paving over some bug elsewhere, and possibly something better caught with an assert() |
I don't think we should assert on data coming from an external library that is subject to user configuration, especially one as fickle as selinux. |
hmm, well, the way i saw it so far is that getcon_raw() is a thin wrapper around a syscall, and assert()ing on impossible errors of kernel syscalls i think is OK... but anyway, my main point is: this looks fishy, and nothing to just tape over here... |
i simply don't see why we should assume selinux has been initialized if getcon_raw() returns NULL. Because that's what this patch effectively does... |
It's not a syscall exactly. getcon_raw is returning NULL because it open()-s /proc//attr/, and then when it tries to read() it we just get EOF immediately (source). Ignoring my patch, we would have assumed the context hadn't been initialized in any number of cases ... if asprintf() to get the /proc path fails to allocate, or if the open() fails, or etc. Actually the asprintf() case seems really weird to me, but that's a separate issue. I don't think the thread @cgzones linked to tells the whole story. It's true that these functions share code, but my reading of the code is that getprocattrcon_raw (specifically the one which is actually returning 0 + NULL) is only for this particular attribute, it isn't shared among other kinds of attributes. It's not exactly clear to me that it's a "kernel bug" for the file to exist, but be empty - I'll investigate that a bit more today and see why exactly that's happening on this particular kernel. The fact that the libselinux code, which exists specifically for getcon_raw as far as I can tell, already checks for early EOF, makes me think that perhaps my kernel is not so weird after all though. Clearly whoever wrote that thought EOF was a possibility that needed to be handled? |
Ah, I guess this is the re-use they were talking about on the selinux list. These macros sure make it hard to grep for where things are used. :/ I looked more closely at the selinux code in the kernel, and I can see a bunch of code paths where we might get a zero length. Maybe these "should never happen", but the code at least allows for them. The procfs code accounts for a zero length here. This can happen for example if this function returns an error here. There are several -ENOMEM conditions in which this might happen. There are also several -EINVAL exit codes, if the state is not what we expected. See this function. There are enough different code paths where this can happen that I can't tell why this is happening in my specific case purely by inspection. If you feel strongly, let me know and I can track down the specific code path I'm hitting. FWIW, I also looked a bit more closely at what weird patches this Google kernel might have. At least with |
This userspace library function might only return 0 (success) and set
This is not considered a failure (e.g. setting Stephen (@stephensmalley) mentioned on the SELinux mailing list the kernel guarantees that
In case of a kernel failure providing
I do not think the commit does this: in case systemd/src/core/selinux-setup.c Lines 51 to 57 in b1d666a
is skipped. p.s.: I am not opposed to this commit, as under normal conditions it's just an unnecessary always-true NULL-check |
Actually, does the LSM API require this guarantee in general? It looks like what's happening here is that on my kernel we're actually calling a getprocattr function not from SELinux, but from some other LSM (some out-of-tree thing which is quite similar to PaX's mprotect). I reproduced this with CONFIG_SECURITY_SELINUX unset. It looks like this is basically because other security modules reuse /proc/self/attr/current, and without this commit: torvalds/linux@6d9c939 (which my kernel doesn't have, being that it's 4.15) we don't do anything to check which LSM is actually getting called. |
I see now. |
Maybe we could improve the comment to something like:
Also with a possible stacking of LSM's in the future and not to rely on SELinux policy writers using the string "kernel" in the kernel initial sid, we may want in the near future to change the already-initialized logic from /proc/self/attr/current-contains-string-kernel to checking if the SELinux filesystem (/sys/fs/selinux or old style /selinux) exists. |
This code was changed in this pull request: systemd#16571 After some discussion and more investigation, we better understand what's going on. So, update the comment, so things are more clear to future readers.
This code was changed in this pull request: #16571 After some discussion and more investigation, we better understand what's going on. So, update the comment, so things are more clear to future readers.
Currently the check consist of checking whether our current process context, retrieved via getcon_raw(3), does not contain the string "kernel". If the process context exists, i.e. is non-empty, a LSM is already initialized but not necessarily SELinux. Currently this does not matter, cause stacking major LSM's is not supported but might be in the future, see https://www.redhat.com/archives/linux-audit/2020-April/msg00022.html Also the string "kernel" is an implementation detail of the individual SELinux policy. (The Reference Policy and Fedora/Red Hat are using "system_u:system_r:kernel_t" as initial kernel process security identifier.) Since 68d3aca ("core: let selinux_setup() load policy more than once") this check decides whether to fail on selinux_init_load_policy(3) failure with desired enforcing mode 'enforce'. Alter to check whether the SELinux filesystem /sys/fs/selinux has been mounted already. Follow-up of systemd#16571 and systemd#16654
Currently the check consist of checking whether our current process context, retrieved via getcon_raw(3), does not contain the string "kernel". The string "kernel" is an implementation detail of the individual SELinux policy. The Reference Policy and Fedora/Red Hat are using "system_u:system_r:kernel_t" as initial kernel process security identifier. Since 68d3aca ("core: let selinux_setup() load policy more than once") this check decides whether to fail on selinux_init_load_policy(3) failure with desired enforcing mode 'enforce'; prior it had skipped that policy load. Also the transitioning to the new context (mac_selinux_get_create_label_from_exe + setcon_raw) is safe whether in the kernel or init context: $ compute_create "system_u:system_r:kernel_t:s0" "system_u:object_r:systemd_exec_t:s0" process system_u:system_r:systemd_t:s0 $ compute_create "system_u:system_r:systemd_t:s0" "system_u:object_r:systemd_exec_t:s0" process system_u:system_r:systemd_t:s0 Just check if the current porcess context is a valid SELinux context. Follow-up of systemd#16571 and systemd#16654
Previously, we assumed that success meant we definitely got a valid
pointer. There is at least one edge case where this is not true (i.e.,
we can get both a 0 return value, and also a NULL pointer):
https://github.com/SELinuxProject/selinux/blob/4246bb550dee5246c8567804325b7da206cd76cf/libselinux/src/procattr.c#L175
When this case occurrs, if we don't check the pointer we SIGSEGV in
early initialization.