Implement basic kernel hardening and defenses #209

Open
wants to merge 2 commits into
base: master

sempervictus

Network equipment is critical infrastructure with long uptimes and
significant throughput/processing, especially in undercloud fabric.
The OS kernel is responsible for managing raw system resources and
the enforcement of security (privilege/access) boundaries. This
set of responsibilities, and a number of technical reasons such as
long-running memory layouts, and physical page table access, make
the kernel a high-value target for attackers. Rebooting the system
for upgrades can be problematic, and patches providing correct
solutions for ring0 concerns may take some time to reach a
stable release, leaving gaps in the security posture of systems.

In order to reduce exposure during these gaps, and the impact or
feasibility of 0-day attacks, this high-value target needs to be
better protected with probabilistic, deterministic, and semantic
defenses. While this effort is by no means a replacement for the
professional-grade mitigations in Grsecurity/PaX, it does start
down the path of elevated defensive posture by introducing the
Linux Hardened kernel patchset from GrapheneOS, and the Linux
Kernel Runtime Guard (LKRG) from OpenWall by Adam Zabrocki.

The hardening patchset implements a number of C-level fixes, higher
entropy ASLR, namespace protections, FS access restrictions to
sensitive targets like /dev/mem, and syscall restrictions. Atop the
basics, it adds GCC plugins or improves upon the upstream ones to
randomize struct layouts, initify and initialize variables at
compile-time, and provides a PRNG from the jitterentropy source.
More info at https://www.whonix.org/wiki/Hardened-kernel as well as
in the source repo https://github.com/anthraxx/linux-hardened.
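
A minimal sketch of how a sysadmin might spot-check a few related runtime knobs after booting such a kernel. The knob list is an assumption chosen for illustration and is not taken from the patchset in this PR: vm.mmap_rnd_bits, kernel.kptr_restrict and kernel.dmesg_restrict are upstream sysctls, while kernel.unprivileged_userns_clone is an out-of-tree knob that is absent on many kernels. The helper name read_knobs is hypothetical.

```python
from pathlib import Path

# Assumed selection of knobs for illustration; not derived from this PR's patches.
KNOBS = (
    "vm/mmap_rnd_bits",                  # mmap ASLR entropy, in bits
    "kernel/kptr_restrict",              # hide kernel pointers from userspace
    "kernel/dmesg_restrict",             # restrict access to the kernel log
    "kernel/unprivileged_userns_clone",  # out-of-tree knob; absent on many kernels
)


def read_knobs(root="/proc/sys", knobs=KNOBS):
    """Return {knob: integer value, or None if the knob is missing or unreadable}."""
    report = {}
    for knob in knobs:
        try:
            report[knob] = int((Path(root) / knob).read_text().split()[0])
        except (OSError, ValueError, IndexError):
            report[knob] = None
    return report
```

Calling read_knobs() on a running system returns a dictionary that can be compared against whatever baseline the operator expects; a value of None means the knob does not exist on that kernel, which is itself useful to know.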

LKRG provides additional tiers of mitigation by actively hashing
and validating kernel memory regions, further restricting access to
common LPE and escape vectors, as well as mechanisms for modifying
the running kernel commonly used to bypass LSMs. LKRG can be built
directly into the kernel to provide enforcement from early-boot.
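
The "hash and validate" idea can be illustrated with a toy userspace sketch. This is not LKRG's code, hash function, or data model: the region names are hypothetical, SHA-256 is simply a convenient stand-in, and a real runtime guard works on live kernel memory rather than byte strings.

```python
import hashlib


def snapshot(regions):
    """Record a SHA-256 digest for each named region (name -> bytes)."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in regions.items()}


def validate(baseline, regions):
    """Return the sorted names of regions whose contents no longer match
    the baseline, including regions that have disappeared."""
    current = snapshot(regions)
    return sorted(name for name, digest in baseline.items()
                  if current.get(name) != digest)


# Hypothetical regions standing in for protected memory ranges.
regions = {"text": b"\x90" * 16, "syscall_table": bytes(range(8))}
baseline = snapshot(regions)

# Simulate an in-memory modification of one region.
tampered = dict(regions, syscall_table=b"\x00" * 8)
modified = validate(baseline, tampered)
```

The design point is that the baseline is taken once from a known-good state and every later check is a cheap comparison, so an unexpected modification shows up as a mismatch regardless of how it was made.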

Notes:
While not in the scope of this pull request, the kernel-tier
mechanisms provided here should be complemented by Daniel Micay's
hardened-malloc to guard against userspace memory corruption, UAF,
and other malfeasance.
This effort parallels a similar pull request for VyOS - #132.
The added functionality provided there in regards to LVS, XTables,
and other patches can be backported here on request.

Testing:
None on this branch; we maintain 5.4 and 5.10 branches in-house.

@ghost

ghost commented Apr 19, 2021

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

❌ sempervictus sign now

@sempervictus
Author

I have intellectual property interests which may be adversely impacted by signing a CLA. If that's enforced for the PR, contributing to this effort would require legal review and cost money. It's a non sequitur here anyway: this is all GPL code, given that it's patches for Linux.

@paulmenzel
Contributor

(I am an outside contributor, and agree about the CLA.)

Regarding the patches, please add commit messages to each patch.

In the pull/merge request, please also elaborate on the upstreaming efforts.

Lastly, on what devices did you test your internal 5.4 and 5.10 ports?

@sempervictus
Author

@paulmenzel: thanks for the instructions, will rework the commits.
The linux-hardened kernel patches run on everything from mobile devices (they started life years ago in CopperheadOS by Daniel Micay) to laptops, routers/firewalls, and servers. They're pretty low-grade changes (we are not applying per-CPU PGDs and nuclear user/kspace separation enforcement here). GCC- and LLVM-built kernels work fine with them. They have also been widely used for years, survived many major kernel versions in a well-maintained state, and have been rather problem-free. Several distributions use them as release kernels AFAIK. So long as the architecture is x86_64/arm64, they should work just fine. I don't have a spare SONiC-compatible chassis hanging around right now, but at some point by Q3 I should have another one and some free time. I'm sort of hoping the buildbots and CI/CD would do some of the gruntwork there and at least perform initial QA.

LKRG is a bit newer, less used, and more actively developed, but being an active defense it needs runtime testing on a wider array of equipment/kconfigs/execution environments. We've already found some weird races which needed logic changes for early-init, and missing functionality under certain test conditions (it doesn't deal well with RANDSTRUCT, for instance, but given what that plugin does, the coverage for targeting selinux structs is still better than upstream). The developers are fast to respond, but it needs wider adoption and testing, and the default right now is to build it as a module and leave loading it and setting its tunables up to the sysadmin. It hasn't eaten our systems or pets yet or hurt performance, and it does raise the bar a bit to give defenders standoff for response.

Given the tooling available to offsec people (see my GH history for reference), seems only fair to try and empower blue team a bit by spreading proven defensive mechanisms to places where they can make the most difference. This isn't a defense in-depth play, but the idea is to put "enough electrons in the outer shells to make getting to the nucleus a rather bothersome procedure," especially for autonomous attackers (a few more bits of entropy change the math on time to probable collision enough to make the bot spend the rest of its uptime guessing).
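
The entropy arithmetic can be made concrete with a back-of-the-envelope model. This is a sketch under stated assumptions: each guess is independent and uniform over 2^n possible layouts (for example because a failed guess crashes the target and the layout is re-randomized), and the attempt rate used in the loop is a hypothetical figure, not a measurement. The function name is illustrative.

```python
import math


def guesses_for_success(entropy_bits, target_probability=0.5):
    """Independent uniform guesses needed to hit one of 2**entropy_bits
    layouts with at least target_probability of success."""
    p_hit = 2.0 ** -entropy_bits
    # Solve 1 - (1 - p_hit)**k >= target_probability for the smallest integer k.
    return math.ceil(math.log1p(-target_probability) / math.log1p(-p_hit))


# Hypothetical attacker rate, chosen only to turn guesses into wall-clock time.
ATTEMPTS_PER_SECOND = 1000

for bits in (8, 16, 24, 28, 32):
    guesses = guesses_for_success(bits)
    days = guesses / ATTEMPTS_PER_SECOND / 86400
    print(f"{bits:2d} bits: {guesses:>13,d} guesses, about {days:.4f} days")
```

Under this model every extra bit roughly doubles the number of guesses needed for an even chance of success, so four extra bits cost an automated attacker about sixteen times as long.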

Lastly, in regards to how we've tested this: our internal branches consist of a 5.4 and a 5.10 for the Linux Hardened codebase, which is what we use for "public distribution" kernels. These things have a LOT of other changes in them (compiler optimizations, schedulers for CPU and IO, ZFS, tons of networking code, SCST built-in, some other hardening stuff which I wouldn't try to push here yet, EoIP driver, NAC and LACP trickery, PF_RING, etc.). Those are built for generic ivy/nehalem targets (which work on most Atoms and relevant AMDs too) and specially tailored per-CPU targets as needed. We've not seen any issues with these, and at this point they're on at least a few hundred systems across the client base and our own stuff. Uptimes of >1y are pretty common, especially given that we build thin Arch Linux base systems and then wrap runtimes in nspawn/lxc/libvirt as needed atop the hardened base, ZFS storage, and enhanced networking functionality. If anything, these kernels have helped find bugs which could otherwise cause undefined behavior; and we all love hunting for that when there's a pile of forward-looking work to be done.

@paulmenzel
Contributor

Thank you very much for the detailed answer. It’s a great effort. If these patches have proven themselves, why aren’t they in the upstream Linux kernel yet, or submitted for inclusion? That way a lot of projects would benefit automatically.

@sempervictus
Author

Looks like I need to rewrite the diffs off whatever upstream they're using here:

commit b3b3365cf3d6fea8f25456fa902859ab82b2203a (v4.19.152-hardened-sonic)
Author: RageLtMan <rageltman [at] sempervictus>
Date:   Mon Apr 19 00:01:35 2021 -0400

    LKRG in-tree @ b913995b

commit d1f3205dfd45c60e8555519de664661c33faf319
Author: RageLtMan <rageltman [at] sempervictus>
Date:   Sun Apr 18 23:59:14 2021 -0400

    Linux hardened 4.19

commit ad326970d25cc85128cd22d62398751ad072efff (tag: v4.19.152)
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Sat Oct 17 10:12:58 2020 +0200

    Linux 4.19.152

... should apply cleanly on top of an upstream tree, but if they're "debianized" ahead of time or what-not, I will need to adapt for that.

@sempervictus
Author

@paulmenzel: a lot of them are. KSPP pulled a lot of this stuff in over the years, and some of it is finally landing these days. Unfortunately the core Linux maintainers are not known for being even notionally positive about security, or considerate of it unless there is reputational damage in plain sight. Up until a few years ago, you could download public grsec patches which are blueprints (well, more effective than the results) for the large-scale implementations around SMEP/SMAP/ASLR and possibly NX, to name a few hardware derivatives. The powers that be had over a decade to make nice with their team, adopt the mindset and techniques, and literally improve the security posture of the entire world. Instead they were dismissive and derisive of technology they didn't understand for use cases they didn't want to think about. Now we have to try to push security at the distribution level or in specific implementations until we get another bite at the apple with Redox or other next-gen OSes. Thankfully, those people are starting with security in mind.

Network equipment is critical infrastructure with long uptimes and
significant throughput/processing, especially in undercloud fabric.
The OS kernel is responsible for managing raw system resources and
the enforcement of security (privilege/access) boundaries. This
set of responsibilities, and a number of technical reasons such as
long-running memory layouts, and physical page table access, make
the kernel a high-value target for attackers. Rebooting the system
for upgrades can be problematic, and patches providing correct
solutions for ring0 concerns may take some time to reach a
stable release, leaving gaps in the security posture of systems.

In order to reduce exposure during these gaps, and the impact or
feasibility of 0-day attacks, this high-value target needs to be
better protected with probabilistic, deterministic, and semantic
defenses. While this effort is by no means a replacement for the
professional-grade mitigations in Grsecurity/PaX, it does start
down the path of elevated defensive posture by introducing the
Linux Hardened kernel patchset from GrapheneOS by Daniel Micay and
others.

The hardening patchset implements a number of C-level fixes, higher
entropy ASLR, namespace protections, FS access restrictions to
sensitive targets like /dev/mem, and syscall restrictions. Atop the
basics, it adds GCC plugins or improves upon the upstream ones to
randomize struct layouts, initify and initialize variables at
compile-time, and provides a PRNG from the jitterentropy source.
More info at https://www.whonix.org/wiki/Hardened-kernel as well as
in the source repo https://github.com/anthraxx/linux-hardened.

Notes:
  While not in the scope of this pull request, the kernel-tier
mechanisms provided here should be complemented by Daniel Micay's
hardened-malloc to guard against userspace memory corruption, UAF,
and other malfeasance.
  This effort parallels a similar pull request for VyOS - sonic-net#132.
The added functionality provided there in regards to LVS, XTables,
and other patches can be backported here on request.

Testing:
  None on this branch; we maintain 5.4 and 5.10 branches in-house.
@sempervictus
Author

Broke out the commits, rebased the hardened patch on the Debian-patched kernel used here. Fun note about "upstream adoption" - they already had some of the userns and BPF restrictions in their patchset, so this stuff is very much being used, just piecemeal by different entities instead of as a core function of Linux like security should be.

Import the Linux Kernel Runtime Guard (LKRG) from OpenWall by Adam
Zabrocki and Alex Peslyak.

LKRG provides additional tiers of mitigation by actively hashing
and validating kernel memory regions, further restricting access
to common LPE and escape vectors, as well as mechanisms for
modifying the running kernel commonly used to bypass LSMs. LKRG
can be built directly into the kernel to provide enforcement from
early-boot, but should be deployed as a module initially while
tunables and operational stability are ironed out and validated on
this platform. More information is available at the project's
homepage: https://www.openwall.com/lkrg/ and in their source repo:
https://github.com/openwall/lkrg
@sempervictus
Author

Cool, it builds. Now we need someone with a testbed to boot it.

@jarias-lfx

/easycla

@linux-foundation-easycla

CLA Missing ID CLA Not Signed

@sempervictus
Author

/easycla

I've mentioned this in other efforts before, but on the advice of corporate counsel we cannot sign these. We hold intellectual property rights of our own and are not going to spend the sort of money it costs to review each CLA just to have our uncompensated work accepted in-tree (a review is needed to ensure there is no predatory language which would strip us of said rights, or claim that the terms can change unilaterally at some later date to disadvantage us in some other way). If Microsoft would like to compensate the cost of said review, I'm happy to discuss.

Given the various predatory modalities of legal interactions in FOSS these days, it is rather uncouth of a large corporation to demand any concession of rights from potentially under-resourced developers who usually do not retain counsel; my situation is not common, but there are more people with patent rights out there than defensively-minded legal advisors protecting those interests. This particular effort is something Microsoft should have taken on internally without prompting through a GH issue. We're talking about runtime security of network equipment here (the same equipment in use under a certain public cloud which ends up in BleepingComputer/Register/etc. articles for security failings seemingly weekly these days), so it rather adds insult to injury to demand a CLA from people trying to address a lapse in security consideration, much less implementation, by one of the biggest companies on earth.

Please reconsider the modality of Microsoft's interaction with FOSS contributors. They are not employees or subordinates, they are often paying customers of your GitHub arm and other products or services, and as private talent they negotiate terms on a per-contract basis when retained in work-for-hire arrangements. A bot command dropping a demand to sign away rights is not how anyone remotely polite would engage with someone in person; why is it acceptable in circumstances where they are handing over digital work product on a platform for which they pay Microsoft money?
