Default seccomp profile blocks personality(...|ADDR_NO_RANDOMIZE) #43011

justinsteven · 2021-11-11T01:26:44Z

The default seccomp profile is blocking personality(PER_LINUX|ADDR_NO_RANDOMIZE):

% sudo -g docker docker run --rm -ti debian bash

root@4c48e40eadb3:/# apt update && apt install -y strace
[... SNIP ...]

root@4c48e40eadb3:/# strace -e personality setarch `uname -m` -R /bin/bash
personality(PER_LINUX|ADDR_NO_RANDOMIZE) = -1 EPERM (Operation not permitted)
personality(PER_LINUX|ADDR_NO_RANDOMIZE) = -1 EPERM (Operation not permitted)
setarch: failed to set personality to x86_64: Operation not permitted
+++ exited with 1 +++

(As an aside, it's also blocking personality(PER_LINUX32|ADDR_NO_RANDOMIZE) which is the equivalent for x86 processes)

This is preventing the use of gdb with its ASLR-disabling behaviour (which allows for more deterministic debugging):

root@4c48e40eadb3:/# apt install -y gdb
[... SNIP ...]

root@4c48e40eadb3:/# gdb $(which cat)
[... SNIP ...]

(gdb) r
Starting program: /bin/cat
warning: Error disabling address space randomization: Operation not permitted

[... SNIP - exit out of gdb ...]

root@4c48e40eadb3:/# strace -e personality gdb $(which cat)
[... SNIP ...]
(gdb) r
Starting program: /bin/cat
personality(0xffffffff)                 = 0 (PER_LINUX)
personality(PER_LINUX|ADDR_NO_RANDOMIZE) = -1 EPERM (Operation not permitted)
warning: Error disabling address space randomization: Operation not permitted
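
The two personality() calls in the strace output above (a query with 0xffffffff, then a set with ADDR_NO_RANDOMIZE OR'd in) can be approximated with a short Python sketch. This is only an illustration of the syscall sequence, not gdb's actual implementation; the helper names are made up for this example, and it assumes Linux with a libc that exports personality().

```python
import ctypes
import errno
import os

ADDR_NO_RANDOMIZE = 0x0040000  # value from <linux/personality.h>
QUERY_ONLY = 0xFFFFFFFF        # personality(0xffffffff) reads without changing


def personality(persona: int) -> int:
    """Thin wrapper around personality(2); raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)  # assumes Linux
    libc.personality.argtypes = [ctypes.c_ulong]
    libc.personality.restype = ctypes.c_int
    previous = libc.personality(persona)
    if previous == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return previous


def with_aslr_disabled(current: int) -> int:
    """Return the persona value with ADDR_NO_RANDOMIZE set (hypothetical helper)."""
    return current | ADDR_NO_RANDOMIZE


def try_disable_aslr() -> bool:
    """Query the persona, then try to set ADDR_NO_RANDOMIZE.

    Returns False if the kernel or a seccomp filter answers EPERM.
    """
    current = personality(QUERY_ONLY)
    try:
        personality(with_aslr_disabled(current))
    except OSError as exc:
        if exc.errno == errno.EPERM:
            return False
        raise
    return True


if __name__ == "__main__":
    if try_disable_aslr():
        print("ADDR_NO_RANDOMIZE set for this process and its children")
    else:
        print("personality() was refused with EPERM")
```

Run inside a container with the default profile, the second call would be expected to fail the same way as in the transcript.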

This was discussed in #22801, where the same block broke the emacs build. The resolution was to wait until emacs could be built with ASLR enabled, and the issue was closed.

On that issue, it was said that:

It is using personality(0x40008) which is ADDR_NO_RANDOMIZE | PER_LINUX32 which disables ASLR (and forces 32 bit). I am not sure about allowing this though, it means anyone can just disable ASLR, which significantly reduces security.
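
For reference, the value quoted above can be split into its base persona and flag bits. The sketch below uses constants from include/uapi/linux/personality.h; the decode() helper is hypothetical and only covers the few flags mentioned in this thread.

```python
PER_MASK = 0x00FF  # low byte selects the base persona

# Base personas and flag bits, values from <linux/personality.h>.
PERSONAS = {0x0000: "PER_LINUX", 0x0008: "PER_LINUX32"}
FLAGS = {
    0x0020000: "UNAME26",
    0x0040000: "ADDR_NO_RANDOMIZE",
    0x0400000: "READ_IMPLIES_EXEC",
}


def decode(persona: int) -> str:
    """Render a personality value as 'BASE|FLAG|...' (illustrative helper)."""
    base = persona & PER_MASK
    parts = [PERSONAS.get(base, hex(base))]
    rest = persona & ~PER_MASK
    for bit, name in FLAGS.items():
        if rest & bit:
            parts.append(name)
            rest &= ~bit
    if rest:
        # Bits this sketch does not know about are shown as raw hex.
        parts.append(hex(rest))
    return "|".join(parts)


if __name__ == "__main__":
    print(decode(0x40008))
```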

I'm not sure this is the full story, and I think it should be revisited.

My understanding is that the "anyone" who can disable ASLR is someone who is already within the container and calls personality(). Doing so disables ASLR only for the calling process and its children. It doesn't affect anything outside the container.

Furthermore, it seems as though the flag is cleared on exec of a suid/sgid binary, so the effect is reversed (https://github.com/torvalds/linux/blob/5147da902e0dd162c6254a61e4c57f21b60a9b1c/include/uapi/linux/personality.h#L27-L34)

% setarch `uname -m` -R /bin/bash

$ cat /proc/self/maps | grep cat | sha1sum
25947c3fa9e641aab4bd0e2c35e56d3d80bdb8f1  -

$ cat /proc/self/maps | grep cat | sha1sum
25947c3fa9e641aab4bd0e2c35e56d3d80bdb8f1  -

(ASLR is disabled when cat is run ordinarily)

vs.

% setarch `uname -m` -R /bin/bash

$ sudo cat /proc/self/maps | grep cat | sha1sum
92a1f335686a52a9026ffe26faf3cc98c316e935  -

$ sudo cat /proc/self/maps | grep cat | sha1sum
01155bf3f1cadc25d9709d4654d057f967256512  -

(ASLR is enabled when cat is run under sudo)
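
The comparison above (hash the matching lines of /proc/self/maps on two runs and see whether they agree) could be scripted roughly as follows. This is a sketch with made-up helper names; it assumes Linux with cat on the PATH, and its digests are not expected to match the sha1sum values shown above because the input is joined slightly differently.

```python
import hashlib
import subprocess


def maps_digest(maps_text: str, needle: str) -> str:
    """SHA-1 over the lines of a maps listing that contain `needle`."""
    lines = [line for line in maps_text.splitlines() if needle in line]
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()


def layout_is_stable(command=("cat", "/proc/self/maps"), needle="cat", runs=2) -> bool:
    """Run `command` several times; True if the filtered layout never changes."""
    digests = set()
    for _ in range(runs):
        out = subprocess.run(
            list(command), capture_output=True, text=True, check=True
        ).stdout
        digests.add(maps_digest(out, needle))
    return len(digests) == 1


if __name__ == "__main__":
    print("stable layout:", layout_is_stable())
```

Going by the observations above, a stable layout would be expected inside a shell started with setarch -R, and a changing one when the command is prefixed with sudo.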

If an attacker is already in the container and can call personality(), it doesn't win them anything. I don't see how it gets them any closer to a privesc within the container (as above, the effects are reversed when doing suid/sgid) and I don't see how it gets them any closer to a container escape.

A program that does personality() might do so for good reason (e.g. in the case of gdb trying to disable ASLR on the process being run, to allow for more deterministic debugging). It might also do so for bad reason, and make security worse for itself with respect to remote attackers who aren't already in the container. In such a bad case, it's a bug in the software that calls personality(), and I'm not convinced that the default seccomp profile should be saying "I'm going to stop you doing that for your own good".

If there is indeed a security risk to the host in allowing a process to disable its (and its non-privileged children's) ASLR, then personality(...|ADDR_NO_RANDOMIZE) should stay blocked. However, if there's no risk to the host in allowing it, I think it should be permitted by the default seccomp profile.


thaJeztah · 2021-11-12T09:37:10Z

@justincormack ptal

justincormack · 2021-11-12T11:58:30Z

ASLR is a security protection, you can customise or remove the default seccomp policy if you don't want to enforce it.

justinsteven · 2021-11-15T01:17:49Z

Thanks @justincormack

ASLR is a security protection

I understand. But is it protecting the host from a container? If it is I'd really like to understand more. If not, why should an application within a container not be able to disable its own ASLR? At the point that an attacker could call personality() (Edit: to be clear I'm talking about, say, personality(PER_LINUX|ADDR_NO_RANDOMIZE)) it doesn't win them anything. Is it just because "we know best"?

you can customise

I've done so, but I thought it'd be worth discussing changing the default.

or remove the default seccomp policy if you don't want to enforce it.

Having a seccomp policy is a security protection. Suggesting that it be disabled just to have an application within a container be able to disable its own ASLR is dangerous.

suihkulokki · 2021-11-25T12:29:58Z

This breaks running reprotest ( https://salsa.debian.org/reproducible-builds/reprotest ) in Docker containers. reprotest tests that repeated builds result in the same binary, and ASLR needs to be disabled at build time because it sometimes causes differences in the resulting binaries.

sporksmith · 2022-04-20T23:51:55Z

This also breaks determinism in the Shadow system simulator. shadow/shadow#2070 (comment)

sporksmith · 2022-04-21T15:24:15Z

ASLR is a security protection, you can customise or remove the default seccomp policy if you don't want to enforce it.

@justincormack Defaults matter. Users affected by this issue are often not the ones who have the final say about what seccomp policy is deployed in their Docker containers. It's a lot to require such users to convince the corresponding sysadmins that such a change is not a security risk and that it's worth the burden of maintaining a custom seccomp policy. Likewise, the sysadmins and users are typically not as well placed as the Docker engineers to evaluate whether such a change is safe.

If there is indeed a security risk to the host in allowing a process to disable its (and its non-privileged children's) ASLR, then personality(...|ADDR_NO_RANDOMIZE) should stay blocked. However, if there's no risk to the host in allowing it, I think it should be permitted by the default seccomp profile.

+1. I'll add that if there's a risk, it should also be documented why it's a risk, so that individuals don't fork the seccomp policy to disable it without understanding that risk.

dmartin · 2024-01-18T16:33:37Z

To add another concrete use case: we build and run CTF challenges inside Docker containers. When teaching introductory binary exploitation concepts, it's often helpful to disable ASLR.

We currently use a forked version of the default seccomp profile, but it's a time-consuming, manual process to keep track of upstream changes.

I'm inclined to agree with @justinsteven's comment above:

If an attacker is already in the container and can call personality(), it doesn't win them anything. I don't see how it gets them any closer to a privesc within the container (as above, the effects are reversed when doing suid/sgid) and I don't see how it gets them any closer to a container escape.

It's difficult to see how this behavior increases security, and it would be quite helpful to us to have this syscall permitted.

ConnorNelson · 2024-01-18T17:59:10Z

While I agree being able to disable ASLR is useful, so is being able to make the stack executable (with the personality flag READ_IMPLIES_EXEC). It's not clear how far this goes, and what should be permitted by default. Though I do agree, it's not immediately obvious what the security gains are here.

@dmartin This is the strategy we take for our CTF challenges in Docker containers: https://github.com/pwncollege/dojo/blob/5300bb2dea6a9254a5c72c3c8e16b1655fd3abe0/dojo_plugin/config.py#L34

This allows us to easily layer our changes on top of upstream, and has worked very well for us. We have further changes that we want beyond the defaults.
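
As a rough illustration of layering a change on top of the upstream profile (not the code at the link above), the sketch below takes an already-loaded copy of Docker's default seccomp JSON and appends rules allowing personality() with ADDR_NO_RANDOMIZE. The allow_aslr_disable() name and the file paths are assumptions made for this example; the rule shape follows the existing personality entries in the default profile.

```python
import copy
import json
import sys

ADDR_NO_RANDOMIZE = 0x0040000
PER_LINUX = 0x0000
PER_LINUX32 = 0x0008


def allow_aslr_disable(profile: dict) -> dict:
    """Return a copy of `profile` that also allows
    personality(PER_LINUX|ADDR_NO_RANDOMIZE) and
    personality(PER_LINUX32|ADDR_NO_RANDOMIZE)."""
    patched = copy.deepcopy(profile)
    rules = patched.setdefault("syscalls", [])
    for persona in (PER_LINUX, PER_LINUX32):
        rules.append({
            "names": ["personality"],
            "action": "SCMP_ACT_ALLOW",
            # Exact match on the first argument only; other flag
            # combinations (e.g. READ_IMPLIES_EXEC) stay blocked.
            "args": [{
                "index": 0,
                "value": persona | ADDR_NO_RANDOMIZE,
                "op": "SCMP_CMP_EQ",
            }],
        })
    return patched


if __name__ == "__main__":
    # Usage (paths are placeholders): python patch.py default.json patched.json
    with open(sys.argv[1]) as src:
        upstream = json.load(src)
    with open(sys.argv[2], "w") as dst:
        json.dump(allow_aslr_disable(upstream), dst, indent=2)
```

The resulting file can then be passed to docker run with --security-opt seccomp=patched.json. Regenerating it from a fresh upstream copy avoids hand-merging upstream changes into a fork.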

AkihiroSuda added the area/security/seccomp label Nov 11, 2021

NHellFire mentioned this issue Sep 20, 2023

Update documentation to clarify REPL usage and security implications apple/swift-docker#9

