Generate frame tables in data section to improve support for DLLs and PIEs #1323

Merged
merged 4 commits into from Sep 9, 2017

xavierleroy
Contributor

The ocamlopt compiler produces tables that describe the structure of stack frames. These are called the frame tables, and they contain pointers (absolute addresses) into the machine code.

Currently, the frame table data is put in the read-only data section ".rodata" for the PowerPC and System-Z / s390x target architectures, and in the read-write data section ".data" for the other ports.

For statically-linked executables, the read-only data section is a good choice because the frame tables do not change during execution. What happens is that the static linker puts the code and the read-only data together in a read-only "text" segment, which can be shared between several concurrent runs of the program.

For the same reason (read-only data ends up in the shared "text" segment), putting the frame tables in the read-only data section is a poor choice for shared libraries (DLLs), because the code pointers contained in the frame table need to be relocated by the dynamic loader. So, the dynamic loader needs write access to the "text" segment, which can no longer be shared between several users of the DLL, making dynamic loading slower and possibly reducing security.
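
To make the relocation issue concrete, here is a toy OCaml model of a frame table. It is only a sketch: the record fields and the names `frame_descr`, `relocate` and `find_frame` are hypothetical and do not reflect the runtime's actual data structures. It shows that loading the image at a different base address forces a rewrite of every code pointer stored in the table, which is the write access the dynamic loader needs.

```ocaml
(* Toy model of a frame table entry. Field names are illustrative only. *)
type frame_descr = {
  retaddr : int;          (* absolute code address of a return point *)
  frame_size : int;       (* size of the stack frame, in bytes *)
  live_slots : int list;  (* stack offsets holding live values *)
}

(* Model of what the dynamic loader must do when the image is loaded at
   [base] rather than at address 0: every absolute code pointer stored in
   the table is rewritten. This write is why the table cannot live in a
   shared, read-only segment of a DLL or PIE. *)
let relocate ~base table =
  List.map (fun d -> { d with retaddr = d.retaddr + base }) table

(* Look up the descriptor for a given return address, if any. *)
let find_frame table addr =
  List.find_opt (fun d -> d.retaddr = addr) table
```

After `relocate ~base:0x400000`, an entry recorded at `0x1000` is found at `0x401000` and no longer at `0x1000`.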

The same concerns arise for position-independent executables (PIEs). These are statically-linked executables that are loaded and run at randomly-chosen code and data addresses, as part of ASLR (address space layout randomization), a popular security measure. Like DLLs, PIEs undergo relocation of symbols, and if the text segment contains relocations, it can no longer be shared, the program takes longer to start, and security could be weakened. PIEs are already the default, or are becoming so, on several systems, e.g. macOS, OpenBSD, and Ubuntu 17.04.

For those reasons, relocations in the text segment (code sections and readonly data sections) should be avoided. They are flagged as warnings or errors by tools such as Debian's lintian package checker and the --warn-shared-textrel option of GNU ld.

Consequently, ocamlopt should stop emitting its frame tables in the readonly data section and should consistently emit them in the data section. This is exactly what this PR does. There is also a small improvement to the System-Z port to make the libasmrun_shared.so shared library completely relocatable.

With this PR, the four 64-bit ports of OCaml (x86-64, ARM64, PowerPC64 and System-Z) produce "proper" shared libraries and executables that can be "properly" linked as PIEs (e.g. via ocamlopt -ccopt -pie), where "proper" means "with no relocations in the text segment".

The three 32-bit ports of OCaml (x86-32, ARM, PowerPC) have bigger issues with PIEs that will probably be discussed in another PR.

The frametable contains absolute pointers into the code, which require relocation in shared libraries and also in position-independent executables (PIE).

Before this commit, the frametable was put in the readonly data section (rodata), which is part of the text segment.  In shared libraries and PIEs, relocations in the text segment are undesirable (they make the text segment writable, at least temporarily) and are flagged as warnings or errors by various tools (Debian's lintian package checker; the --warn-shared-textrel option of GNU ld; etc).

This commit puts the frametable in the (read-write) data section (.data), like in the AMD64 port for example.  In PowerPC 64-bit mode, this is enough to produce .so files and PIE executables that contain no relocations in the text segment.

In PowerPC 32-bit mode there remain relocations in the text segment, but that was expected because the code we generate is not position-independent (PIC).

This is the System-Z analogue of commit 24980d3 for PowerPC.

With this commit ocamlopt produces .so shared libraries and PIE relocatable executables that contain no relocations in the text segment.

With this change, the generated libasmrun_shared.so is a "pure" shared library without relocations in the text segment.
@mshinwell
Contributor

This looks correct to me. I checked the s390x instructions in the new macros and that there weren't any accidental clobberings of %r1.

OK to be merged after a Changes entry has been added.

@xavierleroy
Contributor Author

Thanks for the prompt review! Changes entry added.

@cfcs

cfcs commented Sep 9, 2017

So this puts the relocations in .text.rel.ro like you do for C code, and marks it read-only after updating the relocations to avoid hackers overwriting this data? (so we get the same effect as the "hardening" flags everyone advises you to use for C applications - -z relro,bindnow ?)

@xavierleroy
Contributor Author

@cfcs : I don't know what .text.rel.ro is, but probably you meant .data.rel.ro. Indeed, GCC puts const data that needs relocation inside .data.rel.ro, a section that starts read-write, is relocated by the dynamic loader, then becomes read-only. I considered doing the same thing for the frame tables, but I don't know how portable this feature is (beyond recent Linux systems).

I don't think we lose in security by having frame tables read-write instead of read-only: any malicious code that can write to the other read-write data of an OCaml process already has full control of the process; having write access to the frame table gives the attacker no additional possibilities, as far as I can see. The question here is more a question of efficiency, trying to reduce the amount of dynamic loading work and increase the opportunities for sharing the text segment between several instances of the process or DLL.

@xavierleroy xavierleroy merged commit bcae691 into trunk Sep 9, 2017
@xavierleroy xavierleroy deleted the improve-PIC-and-PIE branch September 9, 2017 16:38
@cfcs

cfcs commented Sep 9, 2017

@xavierleroy: Yes, .data.rel.ro, my bad. Re: portability:

  • From an OS/arch point of view it should work on any system that lets you change page permissions from read-write to read-only. I am not aware of any architectures or operating systems where this is not possible, but I would be happy to help you research if you can link me to a list of target operating systems and architectures that we want it to work on. This post mentions the option as used on various BSDs as well.
  • From a tooling perspective it depends on what the frame tables are used for (which is unclear to me - would be happy to familiarize myself with this feature if you can explain it here, or link me to code or other resources that explains it).
  • From a security perspective I would be happy to see OCaml move towards a "read-only by default, writable when strictly necessary" policy. As a rule of thumb, an attacker benefits from any mapping that is writable. Code gadgets useful to an attacker for reading data may write to an address before doing the read, which will fail when the mapping is read-only, and so on. I can't say anything about the specific problem at hand (frame tables) since I do not know how the writable data is used, but again, as a general rule, if something is read by the runtime, it is usually bad if an attacker can overwrite it, and it seems to me that the overhead of marking the pages read-only after performing the relocation (at startup) is relatively small (one system call on most systems).

This second half of my comment is a bit off-topic, but I'd like to share an example from an application I'm working on now: A ".native" application compiled with the opam switch 4.04.2+fPIC on my debian system (x86_64), with ASLR enabled.
As you can see this is a PIE executable:

user@machine:~/ocaml/ocaml-openpgp$ file _build/app/opgp.native 
_build/app/opgp.native: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=5258bb4f9deb8dc738aa0bc42f9d5c48c84d05bd, with debug_info, not stripped
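
For context, `file` says "shared object" here because a PIE carries ELF type ET_DYN (value 3) in its header, just like a .so, whereas a fixed-address executable carries ET_EXEC (value 2). A minimal OCaml sketch of that check follows, reading only the first 18 bytes of the file. The type and function names (`elf_kind`, `classify_header`, `classify_file`) are my own, and the check cannot tell a PIE from an ordinary shared library.

```ocaml
type elf_kind = Fixed_executable | Pie_or_shared_object | Other_elf | Not_elf

(* Classify a file from the first 18 bytes of its contents. *)
let classify_header (h : string) : elf_kind =
  if String.length h < 18 || String.sub h 0 4 <> "\x7fELF" then Not_elf
  else
    let b i = Char.code h.[i] in
    (* Byte 5 (EI_DATA) gives the byte order: 1 = little, 2 = big endian.
       The 16-bit e_type field sits at offsets 16 and 17. *)
    let e_type =
      if b 5 = 2 then (b 16 lsl 8) lor b 17 else (b 17 lsl 8) lor b 16
    in
    match e_type with
    | 2 -> Fixed_executable        (* ET_EXEC *)
    | 3 -> Pie_or_shared_object    (* ET_DYN *)
    | _ -> Other_elf

(* Same check on a file path; unreadable files are reported as Not_elf. *)
let classify_file path =
  match open_in_bin path with
  | exception Sys_error _ -> Not_elf
  | ic ->
    let n = min 18 (in_channel_length ic) in
    let h = really_input_string ic n in
    close_in ic;
    classify_header h
```

On the binary above, `classify_file "_build/app/opgp.native"` would be expected to yield `Pie_or_shared_object`, in line with what `file` reports.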

By default it has these writable+executable mappings:

user@machine:~/ocaml/ocaml-openpgp$ cat /proc/9199/maps | fgrep wx
599d896bf000-599d89797000 rwxp 00252000 ca:10 203988                     /home/user/ocaml/ocaml-openpgp/_build/app/opgp.native
599d89797000-599d897ad000 rwxp 00000000 00:00 0 
599d8a710000-599d8a75e000 rwxp 00000000 00:00 0                          [heap]
70eef0236000-70eef07fa000 rwxp 00000000 00:00 0 
70eef0a12000-70eef0a13000 rwxp 00018000 fd:00 393710                     /lib/x86_64-linux-gnu/libpthread-2.24.so
70eef0a13000-70eef0a17000 rwxp 00000000 00:00 0 
70eef0dae000-70eef0db0000 rwxp 00197000 fd:00 393427                     /lib/x86_64-linux-gnu/libc-2.24.so
70eef0db0000-70eef0db4000 rwxp 00000000 00:00 0 
70eef0fb7000-70eef0fb8000 rwxp 00003000 fd:00 393462                     /lib/x86_64-linux-gnu/libdl-2.24.so
70eef12bb000-70eef12bc000 rwxp 00103000 fd:00 393471                     /lib/x86_64-linux-gnu/libm-2.24.so
70eef14c3000-70eef14c4000 rwxp 00007000 fd:00 393718                     /lib/x86_64-linux-gnu/librt-2.24.so
70eef1746000-70eef1747000 rwxp 00082000 fd:00 525660                     /usr/lib/x86_64-linux-gnu/libgmp.so.10.3.2
70eef17c8000-70eef1951000 rwxp 00000000 00:00 0 
70eef1967000-70eef196a000 rwxp 00000000 00:00 0 
70eef196b000-70eef196c000 rwxp 00024000 fd:00 393407                     /lib/x86_64-linux-gnu/ld-2.24.so
70eef196c000-70eef196d000 rwxp 00000000 00:00 0 
7ffea1008000-7ffea1029000 rwxp 00000000 00:00 0                          [stack]
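
The same filter can be done from inside an OCaml program using only the standard library. This is a sketch under assumptions: it is Linux-specific, the function names `is_wx`, `read_lines` and `wx_mappings` are hypothetical, and it inspects the permission column rather than grepping for the substring "wx".

```ocaml
(* [is_wx line] is true when the permission field (second column) of a
   /proc/<pid>/maps line contains both 'w' and 'x'. *)
let is_wx line =
  match String.split_on_char ' ' line with
  | _range :: perms :: _ ->
    String.contains perms 'w' && String.contains perms 'x'
  | _ -> false

(* Read all lines of a file; returns [] if the file cannot be opened
   (for instance on a system without /proc). *)
let read_lines path =
  match open_in path with
  | exception Sys_error _ -> []
  | ic ->
    let rec loop acc =
      match input_line ic with
      | line -> loop (line :: acc)
      | exception End_of_file -> close_in ic; List.rev acc
    in
    loop []

(* Writable and executable mappings listed in the given maps file. *)
let wx_mappings path = List.filter is_wx (read_lines path)

(* Report on the current process. *)
let () =
  match wx_mappings "/proc/self/maps" with
  | [] -> print_endline "no w+x regions found"
  | l -> List.iter print_endline l
```

Reading `/proc/self/maps` avoids both the `Unix` dependency and the extra `cat` process, so the report covers the OCaml process itself.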

Having the [heap] and [stack] be executable means that an attacker who is able to place a sequence of bytes in memory (either on the heap or stack) needs only to leak the location of one of these mappings to successfully exploit a stack overflow (by overwriting the return address) in native code. Furthermore it means that bugs that allow partially overwriting the instruction pointer are extremely easy to exploit. One could think of cases where it may not be easy to leak an address to the attacker, but it is possible to redirect execution by overwriting the instruction pointer with - say - the value of a heap pointer. I don't have any concrete examples of vulnerabilities that permit exploitation of these facts, but CVE-2015-8869 comes to mind.

I have not looked into the other w+x mappings, but ideally (from a security point of view) there would be no such mappings, so anything that can be done to bring down the number of these would be good. Do you happen to know if there is any functionality of the runtime that relies strongly on w+x and would be hard to change?

I would love to see an OCaml ecosystem that is more resistant to these attacks as I believe the OCaml language has the potential to be used for developing secure software, and I would like to contribute towards this end. I come from a background of IT security / penetration testing and have experience with exploiting memory corruption flaws like these, but I have very little experience with the ocaml compiler itself. If I can be of any assistance, either in the form of writing patches or advising people with more in-depth knowledge of the runtime/compiler who are better suited at that than me, please let me know.

@xavierleroy
Contributor Author

I agree w+x sections are a big security risk and should never occur. In this PR we were discussing r+w versus r, which is a lot less of a risk.

Your example is a big surprise to me because OCaml doesn't need an executable heap (and does nothing to make the heap executable), nor an executable stack (and does the .section .note.GNU-stack dance to say that the stack need not be executable). I suspect the problems come from C or assembly code that you've linked with your OCaml code. Could you please run the following minimalistic code on your machine?

(* Print this process's memory mappings (Linux); link with the unix library. *)
let _ =
  ignore (Sys.command (Printf.sprintf "cat /proc/%d/maps" (Unix.getpid ())))

On mine (Ubuntu 16.04 x86-64) it reports no w+x regions.

If you still see w+x regions with this minimal program, please submit a bug report at https://caml.inria.fr/mantis/ and I'll attend to it with high priority. Adding data to the present PR is less useful.

@cfcs

cfcs commented Sep 14, 2017

@xavierleroy that gave me no w+x regions, too. In my project I depend on a number of libraries that have C stubs. I am currently investigating whether something like this could be the reason: ocaml/Zarith@96a8242

@xavierleroy
Contributor Author

Yes, hand-written assembly files easily lead to executable stacks, just because they lack the magic "note" that tells the linker to keep the stack nonexecutable. I have no idea what could cause an executable heap, however.
