Consider Pulling in LKRG #408

Closed
sempervictus opened this issue Dec 15, 2020 · 14 comments
Labels
enhancement New feature or request

Comments

@sempervictus

I created lkrg-org/lkrg#31 to assist with a kernel hardening effort over at VyOS. It dawns on me that making it in-tree might also be of use here (additional integrity and control flow coverage, an added measure of exploit detection, etc.), though it may have incompatible Kconfig dependencies or require being built as a module (I haven't tested linking it into the kernel yet). Might be worth keeping an eye on it for potential future adoption.

@thestinger added the enhancement (New feature or request) label Dec 15, 2020
@thestinger
Member

Since we have proper CFI, the only use seems to be the integrity validation, which could be easily bypassed by an attacker aware of it, or never triggered at all by an attacker who avoids modifying the core kernel code. There are vendors like Samsung with support for a hypervisor enforcing a read-only kernel, where changes to the page tables, etc. require calls into the hypervisor and must be approved by it. That seems like a much better approach. I don't really feel that we should deploy something which can be inherently bypassed and compares poorly to the status quo.

Qualcomm also had a more standard implementation of an enforced read-only kernel via a hypervisor for older kernel versions. It rotted away due to the kernel's increasing usage of self-modifying code and dynamic code generation though. It seems quite hopeless at this point. It would essentially need an eBPF implementation.

If the goal is simply breaking exploits targeting more standard kernels by changing the code, it could be done in a more direct and effective way. I don't really think it makes sense as an approach though.

As is, I don't really think we want to deploy it.

We're definitely interested in having more meaningful kernel hardening but we're essentially holding off on a lot of that work until the adoption of generic base kernels for Android. At that point, we hope that it becomes a lot easier to collaborate with others.

@madaidan

LKRG also requires enabling a lot of kernel debugging attack surface like kprobes and ftrace in order to hook kernel functions which may not be worth it. Although, I'm not sure if GrapheneOS would care much about that.

@sempervictus
Author

@thestinger: point taken. ROMEM is a big one - it kind of underpins the majority of the grsec defense model too. I keep meaning to set up a CI for my phone to build images with a grsec kernel (I'll take RAP+friends over LLVM KCFI, and they recently split out the FS protections completely so there's no overhead for RBAC anymore), but I'd have to wipe the device to use my own sig keys and that's almost as annoying to deal with as the CI setup. Will definitely try to wire that up for my next device before I start using it.

@thestinger
Member

What I mean by enforcing a read-only kernel is that they have a hypervisor controlling access to page tables, so that the kernel itself cannot violate the memory protection rules.
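
A minimal sketch of that idea, as a toy model only: the kernel cannot edit page permissions itself and has to ask a more privileged component, which applies fixed rules. The class name, the three rules, and the page numbers are all hypothetical and chosen for illustration; they do not describe Samsung's or Qualcomm's actual implementations.

```python
from dataclasses import dataclass, field

# Toy model only: real implementations run at a higher privilege level
# and operate on hardware page tables, not Python dicts.


@dataclass
class ToyHypervisor:
    kernel_text: frozenset  # page numbers holding kernel code (hypothetical)
    page_table: dict = field(default_factory=dict)  # page -> permission string

    def request_mapping(self, page: int, perms: str) -> bool:
        """The kernel asks to map `page` with `perms` (a subset of 'rwx')."""
        # Rule 1: kernel code pages may never become writable.
        if page in self.kernel_text and "w" in perms:
            return False
        # Rule 2: no page may be writable and executable at once.
        if "w" in perms and "x" in perms:
            return False
        # Rule 3: only known kernel code pages may be executable.
        if "x" in perms and page not in self.kernel_text:
            return False
        self.page_table[page] = perms
        return True


hv = ToyHypervisor(kernel_text=frozenset({0, 1}))
results = {
    "map code r-x": hv.request_mapping(0, "rx"),
    "map data rw-": hv.request_mapping(7, "rw"),
    "make code writable": hv.request_mapping(0, "rwx"),
    "make data executable": hv.request_mapping(7, "rx"),
}
for name, ok in results.items():
    print(f"{name}: {'approved' if ok else 'denied'}")
```

In this model, Rule 1 is the one that conflicts with self-modifying kernel code and dynamic code generation, which is the maintenance problem mentioned earlier in the thread.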

@solardiz

@Adam-pi3 brought this discussion to my attention. While we have no objections to your project choosing not to use LKRG, we feel that some of what you guys wrote above is a misunderstanding, and want to correct it for others lurking in here.

As we write on the LKRG homepage, while it "is bypassable by design, such bypasses tend to require more complicated and/or less reliable exploits." Specifically, exploits that take advantage of something LKRG doesn't yet protect or that win races. Thus, the expectation is that in many cases attacks will have to become probabilistic rather than reliable. There's value in that. LKRG's pCFI plays a role in achieving this property, and isn't meant to "compete" with true CFI on its own.

LKRG's integrity checking of the kernel code (as opposed to process credentials and other critical dynamic data) plays a lesser role there, and frankly personally I see less value in it than in the above.

Moving LKRG to a higher privilege than the kernel's (hypervisor, ARM TrustZone) was something Adam wanted to implement from the very beginning and it's claimed to have been done in Aurora (blog post in Russian, and we haven't seen the code). However, there are issues with that. Some are mentioned on the slides, but in addition to those: this is only "easy" and obviously beneficial for read-only parts of the kernel, not for the moving parts.

For the moving parts (including e.g. process credentials), where I actually see more value from LKRG, we'd have the same limitations as we do now: we have to synchronize with the kernel and then either trust the kernel's own computation of the new values or duplicate that computation. LKRG's current approach is "trust the kernel's own computation", but only when and where such computation and update (e.g., of process credentials) is expected (LKRG then updates its shadow copy of the data). This replaces reliable exploits with racy exploits (and we try to shorten those race windows), but is not fixing the vulnerabilities being exploited fully. If we move that logic to a higher privilege level, these properties will stay the same (no improvement), but it'd become harder (and maybe less reliable) for us to synchronize with the kernel. Duplicating the computation would theoretically be better, and thus (if we were to do it) would be a higher priority task than moving anything to a higher privilege level, but it's really invasive and tricky and harder to maintain (and thus less reliable) across kernel versions.
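
The "trust the kernel's own computation, but only at expected update points" approach described above can be illustrated with a toy model. This is a sketch under stated assumptions, not LKRG's code: `ToyGuard`, its method names, and the pid/uid values are hypothetical, and a plain dict stands in for the kernel's credential data.

```python
class ToyGuard:
    """Toy model of a shadow copy that is synced only at expected update points."""

    def __init__(self):
        self.shadow = {}  # pid -> uid last seen at a legitimate update point

    def on_expected_update(self, pid, new_uid):
        # Called from a hooked, legitimate code path: the kernel's newly
        # computed value is trusted and copied into the shadow.
        self.shadow[pid] = new_uid

    def validate(self, pid, live_uid):
        # Called at check points: any divergence from the shadow is flagged.
        return self.shadow.get(pid) == live_uid


live = {100: 1000}  # stands in for the kernel's own credential data
guard = ToyGuard()
guard.on_expected_update(100, live[100])

# Legitimate change: the kernel updates the value and the hook syncs the shadow.
live[100] = 1001
guard.on_expected_update(100, live[100])
legit_ok = guard.validate(100, live[100])

# Illegitimate change: an exploit overwrites the value with no hook firing.
live[100] = 0
tamper_ok = guard.validate(100, live[100])

print(legit_ok, tamper_ok)  # True False
```

The race described above is visible in the model: a tampered value stays live until the next `validate` call, and a value overwritten just before an expected update would be copied into the shadow as if it were legitimate.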

@madaidan wrote:

LKRG also requires enabling a lot of kernel debugging attack surface like kprobes and ftrace in order to hook kernel functions which may not be worth it.

We discussed this confusion around "attack surface" in mid-December, yet you repeat it here two weeks later. :-(

LKRG doesn't require ftrace (but will benefit from kprobes optimization to ftrace if supported in the kernel). It does require kprobes. These aren't expected to increase the kernel's attack surface for non-root (except for possible kernel bugs in related privilege checks - e.g., wrong sysfs pseudo-file permissions), and attack surface for "attacks" by root is mostly a misnomer. (I am somewhat simplifying here, but so did you. I agree a smaller kernel with fewer features is generally better. I just wouldn't say "attack surface" where there isn't expected to be any.)

Anyway, personally I only use LKRG as poor man's partial defense on systems where I expect to keep kernels with known unfixed vulnerabilities, not intending to spend time on their otherwise-proper maintenance (timely updates and reboots). This maximizes LKRG's benefit vs. risk. I understand that hardened distros would hopefully receive updates in time, making LKRG's benefit/risk balance less obvious for those, and thus LKRG a worse fit for such projects than it is for casual/lazy use.

@sempervictus
Author

Thanks @solardiz - appreciate the disambiguation. In terms of this project, there are consistent patches and updates backported from AOSP, with improvements made by @thestinger and collaborators inline. However, OOB vectors such as attacks against the BT stack or other kernel or kernel-adjacent functions provide rapidly wormable, contact-based pathways for attackers. Phones in some ways have more exposure than normal systems - a larger exposed attack surface (going off the definition in the original discussion). As a result, under the right conditions, proliferation may occur much faster than nominal patch cycles can address (@thestinger does address things out-of-band on occasion if they're bad enough, but it would be great to take some pressure off of Daniel with intrinsic mechanisms). On a related note, given the draconian SELinux implementation here, as an attacker I want real root - which means smacking down the LSM context around my process space somehow. The more protection and validation we have around the LSM inside the kernel, the less likely it is that an attacker can get real root in their foothold and violate the intended boundaries around it.

@madaidan

@solardiz wrote:

These aren't expected to increase the kernel's attack surface for non-root (except for possible kernel bugs in related privilege checks - e.g., wrong sysfs pseudo-file permissions), and attack surface for "attacks" by root is mostly a misnomer. (I am somewhat simplifying here, but so did you. I agree a smaller kernel with fewer features is generally better. I just wouldn't say "attack surface" where there isn't expected to be any.)

Android generally doesn't trust the root user, hence why I called it "attack surface". I don't believe it's a misnomer; certain threat models aim to restrict the root user and thus, kprobes would be considered attack surface in that case.

@solardiz

@madaidan Fair enough. A further detail, though, is that kprobes are mostly an in-kernel interface, not a (root) user-to-kernel one. If a SELinux policy reasonably restricts root, that should include root's ability to load kernel modules and mess with /dev/mem and such, and this will automatically cover most access to kprobes too. (Conversely, if those things are not restricted, then having access to kprobes as well doesn't matter much.) The only user-to-kernel part is a few pseudo-files under /sys and /proc - I guess access to those (and perhaps many others, besides the few kprobes-related ones) should be restricted by the policy too, but yes this is something to have in mind and (double-)check when adding LKRG to such systems.

@solardiz

FWIW, regarding some of the concerns raised here, from Adam's latest testing of LKRG 0.9.0:

I can confirm that LKRG works fine while:

CONFIG_KPROBES=y
CONFIG_KPROBE_EVENTS=n
CONFIG_UPROBES=n
CONFIG_UPROBE_EVENTS=n
CONFIG_FTRACE=n
CONFIG_EVENT_TRACING=n

There is no attack surface exposed to user-mode ;-)
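
One way to check a kernel config against the list above is a short script like the following sketch. The expected values are copied from that list; the function names and the sample config text are hypothetical. It assumes the usual `.config` convention in which a disabled option appears as `# CONFIG_X is not set` rather than `CONFIG_X=n`, and it treats an absent option as disabled.

```python
import re

# Expected values copied from the list quoted above.
EXPECTED = {
    "CONFIG_KPROBES": "y",
    "CONFIG_KPROBE_EVENTS": "n",
    "CONFIG_UPROBES": "n",
    "CONFIG_UPROBE_EVENTS": "n",
    "CONFIG_FTRACE": "n",
    "CONFIG_EVENT_TRACING": "n",
}


def parse_kconfig(text):
    """Return {option: value}; '# CONFIG_X is not set' is recorded as 'n'."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            values[unset.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            values[name] = value
    return values


def mismatches(text, expected=EXPECTED):
    """Return options whose value differs from `expected` (absent counts as 'n')."""
    found = parse_kconfig(text)
    return {
        name: found.get(name, "n")
        for name, want in expected.items()
        if found.get(name, "n") != want
    }


# Hypothetical config fragment for illustration.
sample = """\
CONFIG_KPROBES=y
# CONFIG_KPROBE_EVENTS is not set
CONFIG_UPROBES=y
# CONFIG_FTRACE is not set
"""
print(mismatches(sample))  # {'CONFIG_UPROBES': 'y'}
```

In practice the text would come from the build tree's `.config` file; an empty result means the config matches the tested combination.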

@sempervictus
Author

Thank you @solardiz - will update our internal kconfigs and the one pushed to VyOS as needed.
@thestinger - we've had LKRG running for a while in prod now, in client envs comfortable with not having a distro kernel but not yet at the point of getting a grsec sub. So far it's been a surprisingly easy implementation despite its rather intrusive nature. Adam and Alex have been great in responding to issues as they are discovered and providing feedback or remedial commits. It tends to catch naive attempts at namespace escapes and other "low hanging fruit" quite well - the kspace injection surface in GrapheneOS is relatively tiny compared to general distro kernels, but unless the overhead is considered excessive on mobiles it may well be worth including for the active checks/coverage.

@thestinger
Member

It does look useful. I'm not entirely sure how I feel about some of the stuff it's doing or the approach being taken. It would probably be useful to include it if it turned out to be low maintenance and didn't cause stability issues. It's a much lower priority than many other things though.

The kernel situation has gotten a lot better with the Pixel 4a (5G) / Pixel 5 but we still need to support the older devices. We're not doing all of the things that we've historically done due to a lot of stuff that has been going on distracting from development work and maintenance.

@thestinger
Member

We have a bunch of funding available and I need to find developers to work on different things. I want to get the linux-hardened project revived and try to get some real work happening there. My idea on how things should be done somewhat conflicts with what people with a different focus want though. For example, I don't really want to have anything that can be done via SELinux policy since that's redundant for us and I see the immense amount of stuff done with that as quite essential. So, for example, with SELinux you can do ioctl filtering and you don't need hard-wired toggles to disable specific ioctls. It's one thing to enable a redundant feature with minimal attack surface that's already upstream vs. implementing a bunch of them.

@thestinger
Member

Also, only interested in kernels built with Clang and linked with LLD, as another example. I want to focus on one way of doing things and that's the way I think is most forward looking by far. I know grsecurity heavily uses GCC plugins, but their expertise on those is quite unique to them and isn't where resources are being invested elsewhere. I don't see much future for GCC plugins that are upstream but not understood or properly maintained by them. Makes sense to share as much as possible with non-Linux projects too.

@sempervictus
Author

@thestinger: at a high level, that sounds like a Linux parallel of what HardenedBSD is doing to replicate some of the functions in grsecurity's self protection (and some of the userspace stuff like W^X). Might be useful to ping them for some tail-wind and potentially leads for devs. That "have budget, need good devs" problem is becoming more common nowadays, and I might be able to hook you up with a like-minded team who appreciate the finer points. Will circle up offline on that front.
I also think that it would be great to have your input on LKRG specifics and proposed "better ways to do stuff" - not a lot of public expertise in this arena, and having yours involved would be a boon IMO. The more that can be done in hardened the better, but there's no reason that common functionality can't be exported to them for DKMS builds as modules where possible for people who can't full-out use a custom kernel.
