
Stack smashing protection et al #294

Merged: 19 commits, merged into Solo5:master on Dec 7, 2018

Conversation

mato commented Nov 22, 2018

Work in progress, fixes #293 and fixes #296. Supersedes/integrates #297 and #300. See commit logs for details.

This is currently untested with any downstreams (OCaml) and mysteriously failing test_fpu on hvt (perhaps there's a stack corruption bug in the test code, need to investigate).

mato commented Nov 22, 2018

@ehmry What would you like to do for Genode here? Something needs to provide __stack_chk_fail() and initialise __stack_chk_guard. See the references in #293 for details.
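A minimal sketch of the two symbols in question (hypothetical code, not Genode's actual implementation): the guard value the compiler-generated prologue/epilogue code references, and the handler invoked on a canary mismatch.

```c
#include <stdint.h>
#include <stdlib.h>

/* Global canary value; compiler-generated prologues copy this into each
 * protected stack frame and epilogues compare against it. It must be
 * initialised to an unpredictable value early during startup. */
uintptr_t __stack_chk_guard;

/* Called by compiler-generated code when the canary has been clobbered.
 * The stack is not trustworthy at this point, so do as little as
 * possible and never return. */
void __attribute__((noreturn)) __stack_chk_fail(void)
{
    abort();
}
```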

mato commented Nov 22, 2018

@hannesm The clang failure here is due to lack of support for -mstack-protector-guard=global. Would you mind checking for me on 12-RELEASE and -CURRENT if clang has decided to implement this by any chance? If not, then we'll have to go down the TLS route which is a bit annoying.

hannesm commented Nov 22, 2018

@mato There's still no such option in clang (6.0.1 on FreeBSD-CURRENT). From the cc --help output I can't find anything similar to GCC's -mstack-protector-guard=global.

ehmry commented Nov 23, 2018

@mato This is not a problem; I have an ssp branch that implements this, see afb3fea and 5fae39b.

mato commented Nov 23, 2018

@ehmry Okay, that might work, with some caveats which I'll comment on in your commits. However, since @hannesm confirmed clang's lack of support for a global __stack_chk_guard, I need to re-work this using a TLS slot anyway.

ehmry commented Nov 23, 2018

Yes, well I would be surprised if I've really done enough here.

mato commented Nov 23, 2018

(Note to self: Replace &= ~(uintptr_t)0xff with &= ~(uintptr_t)0xff00, as per a suggestion from the internets, so that the \0 does not sit on the canary boundary.)
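For illustration, the two maskings differ only in which byte of the canary is forced to zero (a sketch, not the actual bindings code):

```c
#include <stdint.h>

/* Forcing a zero byte into the canary stops unterminated string
 * operations from copying past it. Masking with ~0xff zeroes the
 * lowest byte; the note above prefers ~0xff00, which zeroes byte 1
 * instead, so the \0 does not sit on the canary boundary. */
static uintptr_t mask_low_byte(uintptr_t raw) { return raw & ~(uintptr_t)0xff; }
static uintptr_t mask_byte_one(uintptr_t raw) { return raw & ~(uintptr_t)0xff00; }
```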

mato commented Nov 27, 2018

Depends on #296.

mato commented Nov 29, 2018

Rebased to include commits from #297 in its current state, and @ehmry's Genode changes from mato#3. See commit logs and comments in configure.sh for details.

mato and others added 12 commits December 7, 2018 11:46
This change unmaps low memory (notably at least the zero page) and
disables accidental use of TLS by any code (bindings or application).

On hvt: We unmap the low 64kB, including the guest page tables. The
remainder of the first 1MB (up to X86_GUEST_MIN_BASE) is mapped
read-only, as this is used for early boot input data passed to the guest
from the tender, and the rest is considered "reserved".

For TLS: we add early "C runtime" initialisation code in
bindings/crt_init.h ("called" first thing from _start()) which ensures
that CPU registers related to TLS addressing are zeroed.

Adds the following test cases:

- test_zeropage: verifies receipt of #PF on access to zero page.
- test_notls: verifies that a low TLS slot access results in #PF.

On aarch64 this change only implements the test cases, which have been
verified manually to be correct and are currently expected to *fail* in
tests.bats. Actually unmapping low memory on aarch64 will be done in a
separate change, at which point these test cases can be expected to
succeed.
Initialise the __stack_chk_guard value on x86_64 and aarch64 from
crt_init_early(), inlined into _start(). Given that we don't have a
source of randomness, derive a weak pseudo-random value using CPU ticks.
This is a reasonable trade-off for now without diving into the extra
complexity and ABI changes getting randomness from the host would
entail.
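The derivation above can be sketched as follows (illustrative code with stand-in names, not the bindings source; on aarch64 the real code would read a counter register such as CNTVCT_EL0, for which a portable fallback is substituted here):

```c
#include <stdint.h>
#include <time.h>

/* Seed the canary from a tick counter, per the commit message above:
 * not cryptographic randomness, merely a value an attacker cannot
 * trivially predict ahead of time. */
static uint64_t cpu_ticks(void)
{
#if defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* portable stand-in for a CPU tick counter in this sketch */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
#endif
}

uintptr_t stack_guard;          /* stands in for __stack_chk_guard */

void init_stack_guard(void)
{
    stack_guard = (uintptr_t)cpu_ticks();
}
```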

__stack_chk_fail() takes care to abort as quickly as possible as the
state of the stack is unknown at this point.

A test_ssp has been added which smashes the stack intentionally, thus
verifying that SSP is working as expected.

<rant>
The ABI that the compiler expects for SSP is *totally* undocumented, not
standardised anywhere, and varies entirely depending on the host OS,
toolchain (GCC/LLVM), and CPU architecture. This change includes
support only for some of the arch/host/toolchain combinations we care
about, based on reading through both the GCC and LLVM source code and
inspecting code generated by the compiler. Notably, SSP on OpenBSD
toolchains is not supported at this time.
</rant>
Trusty GCC is too old for -mstack-protector-guard.
c[] is only 64 bits (two elements at sizeof (float) == 4), so the x86_64
variant was happily loading and storing double that, thus silently
corrupting the stack (caught by SSP!). Extend it to a 4-element vector,
and test that the vector multiplication operated on all 4 elements.

For aarch64 the code was fine, but extend it anyway to use a 4-element
vector so that we can compare against the same results.
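The fixed test shape looks roughly like this (a sketch, not the actual test_fpu source): four floats fill a 128-bit vector exactly, where two floats would leave a 128-bit store writing 64 bits past the array, and the check confirms all four lanes were multiplied.

```c
/* Element-wise multiply of two 4-element float vectors. */
static void vec_mul4(const float *a, const float *b, float *c)
{
    for (int i = 0; i < 4; i++)
        c[i] = a[i] * b[i];
}

/* Returns 1 if the multiplication touched all four lanes correctly. */
int vec_mul4_ok(void)
{
    float a[4] = {1.0f, 2.0f, 3.0f, 4.0f};
    float b[4] = {5.0f, 6.0f, 7.0f, 8.0f};
    float c[4];

    vec_mul4(a, b, c);
    return c[0] == 5.0f && c[1] == 12.0f && c[2] == 21.0f && c[3] == 32.0f;
}
```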
On host/arch combinations where we need it, check that GCC supports
-mstack-protector-guard=. Advise the user of the minimum GCC version
required if support is not present (verified by checking GCC commit logs
for first appearance of this option in a release).
We made a mistake with the initial SCTLR_EL1 value: we had OR'd the
page_table address into the SCTLR_EL1 initial flags, which is simply
wrong. Correct this.

Signed-off-by: Wei Chen <wei.chen@arm.com>
We have added an L3 table to manage the first 2MB in 4KB pages,
so add these two macros for the L3 table.

Signed-off-by: Wei Chen <wei.chen@arm.com>
We are using 4KB pages, not 64KB pages; the description was mistaken.
Fix it.

Signed-off-by: Wei Chen <wei.chen@arm.com>
We will create an L2 table and an L3 table to unmap the first page,
so we require two pages for these tables.

Signed-off-by: Wei Chen <wei.chen@arm.com>
We are getting the address of the page table, not its page number,
so rename it to PGT_PAGE_ADDR.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Weichen81 and others added 5 commits December 7, 2018 11:47
We can unify the data type of page table entries; the definitions
of these structures are unnecessary.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Rename va_addr to mem, to make it obvious that this is hvt->mem.

Signed-off-by: Wei Chen <wei.chen@arm.com>
We use a similar low memory layout to x86_64; this helps enforce the
ABI (with respect to the low 1MB of memory being "reserved") and
prevents the guest from stomping over its own page tables.
The difference is that Arm64 currently supports at most 4GB of RAM, so
we increase the PMD table (L2 table) size to 16KB to cover the whole
RAM area.

Signed-off-by: Wei Chen <wei.chen@arm.com>
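The arithmetic behind the 16KB L2 (PMD) table size can be checked directly (a sketch; the 2MB block size and 8-byte descriptor size are the usual AArch64 4KB-granule figures, stated here as assumptions rather than constants taken from the source tree):

```c
#include <stdint.h>

/* Mapping RAM with 2MB L2 block entries: 4GB / 2MB = 2048 entries,
 * at 8 bytes per descriptor = 16KB, i.e. four 4KB pages. */
uint64_t pmd_table_bytes(uint64_t ram_bytes)
{
    uint64_t entries = ram_bytes / (2ULL << 20); /* one entry per 2MB block */
    return entries * 8;                          /* 8-byte descriptors */
}
```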
We have reserved the low 1MB as unmapped, including the zero page
and the guest's page tables. This provides NULL pointer protection
and prevents the guest from stomping over its own page tables.

Signed-off-by: Wei Chen <wei.chen@arm.com>
mato commented Dec 7, 2018

Force-pushed: Rebased and merged with #300. Additional tweaks since last rebase:

  • moved __stack_chk_guard definition into its own module (crt.o)
  • __stack_chk_fail() now makes minimal use of stack when printing a message
  • cosmetics

Commit 53f3a6a re-worked the page table
setup for aarch64 with the intent of supporting up to 4GB of memory for
the guest but neglected to actually link all the PMD tables to PUD
entries. Correct this.

/cc @Weichen81
Asking for 1MB of guest memory resulted in the memory size being rounded
down to zero, which caused an obscure error further on. Check for this
case and adjust to the minimum size.

Additionally, correct an err() call to errx() to squash a confusing
"Success" and replace error messages in the internal
aarch64_setup_memory_mapping() with assertions.
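The rounding guard described above amounts to something like this (a sketch; the 2MB alignment is an assumption for illustration, not the tender's actual constant):

```c
#include <stdint.h>

#define GUEST_MEM_ALIGN (2ULL << 20)  /* assumed rounding granularity */

/* Guest memory is rounded down to the alignment, so a request smaller
 * than one granule (e.g. 1MB) would round to zero and fail obscurely
 * later; clamp such requests to the minimum size instead. */
uint64_t adjust_mem_size(uint64_t requested)
{
    uint64_t rounded = requested & ~(GUEST_MEM_ALIGN - 1);
    if (rounded == 0)
        rounded = GUEST_MEM_ALIGN;    /* enforce the minimum size */
    return rounded;
}
```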
mato commented Dec 7, 2018

Some more fixes for corner cases discovered, tested manually. CI is all green so merging.

Note that downstream MirageOS users of master after this will need to use the gmp-freestanding and zarith-freestanding packages from https://github.com/Solo5/opam-solo5 until mirage/ocaml-solo5#47 is resolved.

Also note that our SSP implementation requires at least GCC 4.9.0 on Linux, so Solo5 will no longer build in Travis CI environments using dist: trusty; downstream .travis.yml files will need to be updated to switch to dist: xenial.

@mato mato changed the title DO NOT MERGE: Stack smashing protection Stack smashing protection et al Dec 7, 2018
@mato mato merged commit d5fb5a0 into Solo5:master Dec 7, 2018
mato added a commit to mato/opam-repository that referenced this pull request Jan 17, 2019
The SSP changes in Solo5/solo5#294 break gmp-freestanding and
zarith-freestanding due to issues with the various cross-compiling
hacks used to build these packages.

Publish updated versions, whose availability will be "gated" on an
upcoming ocaml-freestanding 0.5.0 release, which itself will depend on
Solo5 0.5.0+ bindings packages.

For details, see mirage/ocaml-solo5#47.
@mato mato deleted the ssp branch March 22, 2019 12:09
Successfully merging this pull request may close these issues:

  • Unmap zero page; forcibly disable TLS
  • Enable Stack Protector