selinux: handle getcon_raw producing a NULL pointer, despite returning 0 #16571

CmdrMoozy · 2020-07-23T18:04:25Z

Previously, we assumed that success meant we definitely got a valid
pointer. There is at least one edge case where this is not true (i.e.,
we can get both a 0 return value, and also a NULL pointer):
https://github.com/SELinuxProject/selinux/blob/4246bb550dee5246c8567804325b7da206cd76cf/libselinux/src/procattr.c#L175

When this case occurrs, if we don't check the pointer we SIGSEGV in
early initialization.

Previously, we assumed that success meant we definitely got a valid pointer. There is at least one edge case where this is not true (i.e., we can get both a 0 return value, and *also* a NULL pointer): https://github.com/SELinuxProject/selinux/blob/4246bb550dee5246c8567804325b7da206cd76cf/libselinux/src/procattr.c#L175 When this case occurrs, if we don't check the pointer we SIGSEGV in early initialization.

keszybz · 2020-07-23T19:36:45Z

Pfffffffffffffff. LGTM.

keszybz · 2020-07-23T19:38:12Z

semaphoreci:

+ sudo lxc-create -n buster-amd64 -t download -- -d debian -r buster -a amd64 --keyserver hkp://keyserver.ubuntu.com:80
Setting up the GPG keyring
Downloading the image index
ERROR: Invalid signature for /tmp/tmp.TpoXRXrmoY/index.asc
lxc-create: buster-amd64: lxccontainer.c: create_run_template: 1617 Failed to create container from template
lxc-create: buster-amd64: tools/lxc_create.c: main: 327 Failed to create container buster-amd64

Unrelated.

cgzones · 2020-07-24T15:16:33Z

FWIW, this should really never happen,
see https://lore.kernel.org/selinux/CAEjxPJ5nExmrkbgMfD2B9cs-iOdSAVnNfqcqbfns10vOa31=5Q@mail.gmail.com/T/#r33960dc91492b5013b1d376fc980014b568918a3

CmdrMoozy · 2020-07-24T15:43:01Z

Interesting. I have a kernel upon which it does happen (this was my motivation for the patch), but admittedly it's a weird kernel, in that it has a lot of Google patches so it diverges significantly from upstream. It's totally possible it's just misbehaving in some way. Thanks for the pointer, I'll investigate and see if I can make this kernel a bit less weird. :)

poettering · 2020-07-28T06:41:25Z

What is getcon_raw() returning 0 with NULL supposed to mean though? Can a process have no label on Selinux? this sounds fishy to me, like paving over some bug elsewhere, and possibly something better caught with an assert()

keszybz · 2020-07-28T07:50:21Z

I don't think we should assert on data coming from an external library that is subject to user configuration, especially one as fickle as selinux.

poettering · 2020-07-28T09:56:56Z

hmm, well, the way i saw it so far is that getcon_raw() is a thin wrapper around a syscall, and assert()ing on impossible errors of kernel syscalls i think is OK... but anyway, my main point is: this looks fishy, and nothing to just tape over here...

poettering · 2020-07-28T09:57:38Z

i simply don't see why we should assume selinux has been initialized if getcon_raw() returns NULL. Because that's what this patch effectively does...

CmdrMoozy · 2020-07-28T15:49:40Z

It's not a syscall exactly. getcon_raw is returning NULL because it open()-s /proc//attr/, and then when it tries to read() it we just get EOF immediately (source).

Ignoring my patch, we would have assumed the context hadn't been initialized in any number of cases ... if asprintf() to get the /proc path fails to allocate, or if the open() fails, or etc. Actually the asprintf() case seems really weird to me, but that's a separate issue.

I don't think the thread @cgzones linked to tells the whole story. It's true that these functions share code, but my reading of the code is that getprocattrcon_raw (specifically the one which is actually returning 0 + NULL) is only for this particular attribute, it isn't shared among other kinds of attributes.

It's not exactly clear to me that it's a "kernel bug" for the file to exist, but be empty - I'll investigate that a bit more today and see why exactly that's happening on this particular kernel.

The fact that the libselinux code, which exists specifically for getcon_raw as far as I can tell, already checks for early EOF, makes me think that perhaps my kernel is not so weird after all though. Clearly whoever wrote that thought EOF was a possibility that needed to be handled?

CmdrMoozy · 2020-07-28T17:20:07Z

Ah, I guess this is the re-use they were talking about on the selinux list. These macros sure make it hard to grep for where things are used. :/

I looked more closely at the selinux code in the kernel, and I can see a bunch of code paths where we might get a zero length. Maybe these "should never happen", but the code at least allows for them.

The procfs code accounts for a zero length here.

This can happen for example if this function returns an error here.

There are several -ENOMEM conditions in which this might happen. There are also several -EINVAL exit codes, if the state is not what we expected. See this function.

There are enough different code paths where this can happen that I can't tell why this is happening in my specific case purely by inspection. If you feel strongly, let me know and I can track down the specific code path I'm hitting.

FWIW, I also looked a bit more closely at what weird patches this Google kernel might have. At least with git log --author='@google\.com' --perl-regexp security/selinux/ I don't see any out-of-tree patches. There are many backports though; this kernel is based on 4.15.

cgzones · 2020-07-29T13:52:49Z

getcon() (and getcon_raw(), which skips context translation, see man:setrans.conf(8)) is a wrapper around reading /proc/self/attr/current.

This userspace library function might only return 0 (success) and set *context to NULL if the read syscall returns 0. (mentioned in the commit message)

There is at least one edge case where this is not true (i.e.,
we can get both a 0 return value, and also a NULL pointer):
https://github.com/SELinuxProject/selinux/blob/4246bb550dee5246c8567804325b7da206cd76cf/libselinux/src/procattr.c#L175

This is not considered a failure (e.g. setting errno to ENODATA and ret to -1), because the function getprocattrcon_raw() is a template for different functions, e.g. set-/get-fscreatecon() (wrapper for /proc/self/attr/fscreate, see man:setfscreatecon(8)).
fscreate determines the SELinux security context used for creating a new file system object, and is valid to be empty (an empty string), which is the default and uses the default policy behaviour (see the usages of setfscreatecon_raw(NULL) in systemd).

Stephen (@stephensmalley) mentioned on the SELinux mailing list the kernel guarantees that /proc/self/attr/current contains a non-zero length string (on success), so the read syscall will never return 0 in getprocattrcon_raw().

I looked more closely at the selinux code in the kernel, and I can see a bunch of code paths where we might get a zero length. Maybe these "should never happen", but the code at least allows for them.

The procfs code accounts for a zero length here.

This can happen for example if this function returns an error here.

There are several -ENOMEM conditions in which this might happen. There are also several -EINVAL exit codes, if the state is not what we expected. See this function.

In case of a kernel failure providing /proc/self/attr/current the read syscall should propagate the failure, and also not return 0.

i simply don't see why we should assume selinux has been initialized if getcon_raw() returns NULL. Because that's what this patch effectively does...

I do not think the commit does this: in case getcon_raw() sets the context to NULL, the initialization check

systemd/src/core/selinux-setup.c

Lines 51 to 57 in b1d666a

    
           /* Already initialized by somebody else? */ 
        
           r = getcon_raw(&con); 
        
           /* getcon_raw can return 0, and still give us a NULL pointer. */ 
        
           if (r == 0 && con) { 
        
                   initialized = !streq(con, "kernel"); 
        
                   freecon(con); 
        
           }

is skipped.

p.s.: I am not opposed to this commit, as under normal conditions it's just an unnecessary always-true NULL-check

CmdrMoozy · 2020-07-29T17:44:32Z

Actually, does the LSM API require this guarantee in general? It looks like what's happening here is that on my kernel we're actually calling a getprocattr function not from SELinux, but from some other LSM (some out-of-tree thing which is quite similar to PaX's mprotect). I reproduced this with CONFIG_SECURITY_SELINUX unset.

It looks like this is basically because other security modules reuse /proc/self/attr/current, and without this commit: torvalds/linux@6d9c939 (which my kernel doesn't have, being that it's 4.15) we don't do anything to check which LSM is actually getting called.

cgzones · 2020-07-29T18:19:58Z

I see now.
So we (may) read /proc/self/attr/current before SELinux is enabled (so the SELinux guarantee is void).

cgzones · 2020-07-30T12:47:22Z

Maybe we could improve the comment to something like:

/* getcon_raw can return 0, and still give us a NULL pointer if
 * SELinux is not yet initialized and another LSM provides /proc/self/attr/current. 
 */

Also with a possible stacking of LSM's in the future and not to rely on SELinux policy writers using the string "kernel" in the kernel initial sid, we may want in the near future to change the already-initialized logic from /proc/self/attr/current-contains-string-kernel to checking if the SELinux filesystem (/sys/fs/selinux or old style /selinux) exists.

This code was changed in this pull request: systemd#16571 After some discussion and more investigation, we better understand what's going on. So, update the comment, so things are more clear to future readers.

This code was changed in this pull request: #16571 After some discussion and more investigation, we better understand what's going on. So, update the comment, so things are more clear to future readers.

Currently the check consist of checking whether our current process context, retrieved via getcon_raw(3), does not contain the string "kernel". If the process context exists, i.e. is non-empty, a LSM is already initialized but not necessarily SELinux. Currently this does not matter, cause stacking major LSM's is not supported but might be in the future, see https://www.redhat.com/archives/linux-audit/2020-April/msg00022.html Also the string "kernel" is an implementation detail of the individual SELinux policy. (The Reference Policy and Fedora/Red Hat are using "system_u:system_r:kernel_t" as initial kernel process security identifier.) Since 68d3aca ("core: let selinux_setup() load policy more than once") this check decides whether to fail on selinux_init_load_policy(3) failure with desired enforcing mode 'enforce'. Alter to check whether the SELinux filesystem /sys/fs/selinux has been mounted already. Follow-up of systemd#16571 and systemd#16654

Currently the check consist of checking whether our current process context, retrieved via getcon_raw(3), does not contain the string "kernel". The string "kernel" is an implementation detail of the individual SELinux policy. The Reference Policy and Fedora/Red Hat are using "system_u:system_r:kernel_t" as initial kernel process security identifier. Since 68d3aca ("core: let selinux_setup() load policy more than once") this check decides whether to fail on selinux_init_load_policy(3) failure with desired enforcing mode 'enforce'; prior it had skipped that policy load. Also the transitioning to the new context (mac_selinux_get_create_label_from_exe + setcon_raw) is safe whether in the kernel or init context: $ compute_create "system_u:system_r:kernel_t:s0" "system_u:object_r:systemd_exec_t:s0" process system_u:system_r:systemd_t:s0 $ compute_create "system_u:system_r:systemd_t:s0" "system_u:object_r:systemd_exec_t:s0" process system_u:system_r:systemd_t:s0 Just check if the current porcess context is a valid SELinux context. Follow-up of systemd#16571 and systemd#16654

keszybz added this to the v246 milestone Jul 23, 2020

keszybz added the pid1 label Jul 23, 2020

keszybz added the good-to-merge/waiting-for-ci 👍 PR is good to merge, but CI hasn't passed at time of review. Please merge if you see CI has passed label Jul 23, 2020

yuwata merged commit 199a892 into systemd:master Jul 24, 2020

CmdrMoozy mentioned this pull request Aug 3, 2020

selinux: improve comment about getcon_raw semantics #16654

Merged

cgzones mentioned this pull request Aug 17, 2020

selinux: rework boot time check if already initialized #16763

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

selinux: handle getcon_raw producing a NULL pointer, despite returning 0 #16571

selinux: handle getcon_raw producing a NULL pointer, despite returning 0 #16571

CmdrMoozy commented Jul 23, 2020

keszybz commented Jul 23, 2020

keszybz commented Jul 23, 2020

cgzones commented Jul 24, 2020

CmdrMoozy commented Jul 24, 2020

poettering commented Jul 28, 2020

keszybz commented Jul 28, 2020

poettering commented Jul 28, 2020

poettering commented Jul 28, 2020

CmdrMoozy commented Jul 28, 2020

CmdrMoozy commented Jul 28, 2020

cgzones commented Jul 29, 2020 •

edited

CmdrMoozy commented Jul 29, 2020

cgzones commented Jul 29, 2020

cgzones commented Jul 30, 2020

selinux: handle getcon_raw producing a NULL pointer, despite returning 0 #16571

selinux: handle getcon_raw producing a NULL pointer, despite returning 0 #16571

Conversation

CmdrMoozy commented Jul 23, 2020

keszybz commented Jul 23, 2020

keszybz commented Jul 23, 2020

cgzones commented Jul 24, 2020

CmdrMoozy commented Jul 24, 2020

poettering commented Jul 28, 2020

keszybz commented Jul 28, 2020

poettering commented Jul 28, 2020

poettering commented Jul 28, 2020

CmdrMoozy commented Jul 28, 2020

CmdrMoozy commented Jul 28, 2020

cgzones commented Jul 29, 2020 • edited

CmdrMoozy commented Jul 29, 2020

cgzones commented Jul 29, 2020

cgzones commented Jul 30, 2020

cgzones commented Jul 29, 2020 •

edited