Rust Building PIE without relocations #110270

Open
WorksButNotTested opened this issue Apr 13, 2023 · 25 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@WorksButNotTested

I am trying to build a Rust executable without any relocations. The documentation for codegen options seems to say that relocation-model controls the generation of position-independent code and that it should be set to pic for most targets. However, when I build a simple test application, I can see it still contains relocations. Is it possible to prevent this?

$ rustc --version
rustc 1.68.2 (9eb3afe9e 2023-03-27)
$ cargo new pic-test
     Created binary (application) `pic-test` package
$ cd pic-test/
$ cargo build
   Compiling pic-test v0.1.0 (/home/jon/git/undead2/pic-test)
    Finished dev [unoptimized + debuginfo] target(s) in 0.22s
$ file target/debug/pic-test
target/debug/pic-test: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=861c0712da0c40487ffb21f0893244237aded5f2, for GNU/Linux 3.2.0, with debug_info, not stripped
$ readelf -rW target/debug/pic-test

Relocation section '.rela.dyn' at offset 0xee8 contains 713 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
000000000004d328  0000000000000008 R_X86_64_RELATIVE                         22970
000000000004d330  0000000000000008 R_X86_64_RELATIVE                         8750
000000000004d338  0000000000000008 R_X86_64_RELATIVE                         8710
000000000004d340  0000000000000008 R_X86_64_RELATIVE                         3f000
...
@jyn514 jyn514 added A-linkage Area: linking into static, shared libraries and binaries T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Apr 13, 2023
@bjorn3
Member

bjorn3 commented Apr 14, 2023

Try the unstable -Zbuild-std cargo option. Without this a precompiled standard library will be used which doesn't get compiled with -Crelocation-model=pic.

@WorksButNotTested
Author

Still no luck I'm afraid:

$ rustup default nightly
$ rustc --version
rustc 1.71.0-nightly (d0f204e4d 2023-04-16)
$ rustup component add rust-src --toolchain nightly-x86_64-unknown-linux-gnu
$ cargo clean
$ cargo build -Zbuild-std --target x86_64-unknown-linux-gnu
$ readelf -rW target/x86_64-unknown-linux-gnu/debug/pic-test | head

Relocation section '.rela.dyn' at offset 0xf80 contains 6170 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
00000000001ac910  0000000000000008 R_X86_64_RELATIVE                         3a4c0
00000000001ac918  0000000000000008 R_X86_64_RELATIVE                         26170
00000000001ac920  0000000000000008 R_X86_64_RELATIVE                         26130
00000000001ac928  0000000000000008 R_X86_64_RELATIVE                         159000
00000000001ac938  0000000000000008 R_X86_64_RELATIVE                         26240
00000000001ac950  0000000000000008 R_X86_64_RELATIVE                         261d0
00000000001ac958  0000000000000008 R_X86_64_RELATIVE                         262d0

@bjorn3
Member

bjorn3 commented Apr 17, 2023

Right, if the relocation model is set to PIC or PIE, rustc will tell the linker to produce PIE executables. As I understand it PIE executables use relative relocations when it isn't known at link time that the symbol will always come from the executable itself as opposed to a dynamic library. You can do objdump -D target/x86_64-unknown-linux-gnu/debug/pic-test | grep -A14 1ac910 to see which symbols are the targets of the R_X86_64_RELATIVE relocations.

@WorksButNotTested
Author

WorksButNotTested commented Apr 18, 2023

Even with the helloworld example, there are tens of such relocations including the following types:

  • R_X86_64_64
  • R_X86_64_GLOB_DAT
  • R_X86_64_JUMP_SLOT
  • R_X86_64_RELATIVE

I can't see any real pattern in them to explain why they are relocated. So I figured I'd try to build a static library so that I can then manually call the linker to build a fully statically linked binary (e.g. no DT_NEEDED) so the linker will know exactly where every symbol is at link time (since I don't think rust/cargo supports building a static binary?).

Anyway, I've discovered that even if I build a staticlib, if I extract the .a file then the objects within it still contain relocations. If I understand correctly, since static libraries are just a collection of .o files, then they haven't even been linked at that point? And the linker won't be able to substitute these relocations at link time with PIC code since the assembly of the functions has already been generated at this point. Is there something else I have missed?

Thanks for your continued help with this, it's really baffling!

@bjorn3
Member

bjorn3 commented Apr 18, 2023

R_X86_64_64 exists in data sections as those can't be position independent. R_X86_64_GLOB_DAT is used to set GOT entries at runtime. R_X86_64_JUMP_SLOT creates a PLT entry.

> Anyway, I've discovered that even if I build a staticlib, if I extract the .a file then the objects within it still contain relocations. If I understand correctly, since static libraries are just a collection of .o files, then they haven't even been linked at that point?

Indeed.

> And the linker won't be able to substitute these relocations at link time with PIC code since the assembly of the functions has already been generated at this point. Is there something else I have missed?

All relative relocations within the same executable will be resolved by the linker, but absolute relocations can only be resolved at runtime once the final memory location of the executable is known and GOT relocations can only be resolved once the target of the respective symbols has been determined and (if it is from a dynamic library) loaded into memory.

By the way why do you want to avoid all relocations?
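
To make the point about absolute relocations concrete, here is a minimal Rust sketch of what a dynamic loader (or equivalent startup code) does for each R_X86_64_RELATIVE entry: it writes "load base + addend" into the 8-byte slot at the given offset. The struct and function names are hypothetical, the "image" is just a byte buffer, and the two addends are borrowed from the readelf output earlier in the thread. It is an illustration under those assumptions, not a real ELF loader.

```rust
/// One relocation entry, reduced to the two fields a relative relocation
/// needs (hypothetical simplified struct, not a full Elf64_Rela).
#[derive(Clone, Copy)]
pub struct RelativeReloc {
    /// Offset of the 8-byte slot to patch, relative to the image start.
    pub offset: usize,
    /// Link-time address of the target, relative to the image start.
    pub addend: u64,
}

/// Patch every slot with `load_base + addend`, which is the computation
/// defined for R_X86_64_RELATIVE once the load address is known.
pub fn apply_relative_relocs(image: &mut [u8], load_base: u64, relocs: &[RelativeReloc]) {
    for r in relocs {
        let value = load_base.wrapping_add(r.addend);
        image[r.offset..r.offset + 8].copy_from_slice(&value.to_le_bytes());
    }
}

/// Read a patched slot back (little-endian, as on x86-64).
pub fn read_slot(image: &[u8], offset: usize) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&image[offset..offset + 8]);
    u64::from_le_bytes(buf)
}
```

Because the stored value depends on `load_base`, such a slot cannot hold its final contents until the load address is known, which is why these entries survive linking in a PIE.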

@WorksButNotTested
Author

I am trying to build a PowerPC no_std raw binary (which has no dependencies on libc or any other DSOs either), which can be loaded and executed within an existing process. I am building my code into a static library and then using cargo make to directly invoke the linker to generate an ELF, from which I then extract the relevant sections using objcopy.

However, if I perform the link using -pie, (but without -static), then the resulting binary has numerous R_PPC_RELATIVE relocations. If I build with both -pie and -static, then the binary no longer has any relocations, but looking at the assembly, I can see it is using absolute addressing. Can you suggest anything?

@WorksButNotTested
Author

Also, if I build with -Wl,--oformat=binary, then I get a load of linker errors of the form "dangerous relocation: generic linker can't handle R_PPC_PLTREL24". I have also manually set rustflags = ["-Crelocation-model=pic"] in .cargo/config, to ensure the right relocation model is in use.

@bjorn3
Member

bjorn3 commented Apr 18, 2023

When creating a statically linked position independent executable, Scrt.o is linked in. This object file defines the _start entrypoint and before it calls your main function it performs all relocations that a dynamic linker would need to perform if the executable was dynamically linked.

Note that unless you use -Ctarget-feature=+crt-static, the executable will be dynamically linked even if you use #![no_std]. You will need to use -Ctarget-feature=+crt-static and unless you put a lot of restrictions on the code you write (like no data with relocations, -Zplt=yes to prevent GOT calls, ...), you will need an Scrt.o equivalent. I would also suggest switching to the musl target as musl libc is much smaller than glibc such that you can statically link it in. (Only the parts you actually use will be linked, not the entirety of musl.)

Scrt.o depends on the exact stack layout that an executable has at startup, so you either need to write an assembly stub which sets up this layout or write your own Scrt.o replacement. https://fasterthanli.me/series/making-our-own-executable-packer might be useful. Be aware that the articles are very long form, so you may want to skip parts you already know.

@WorksButNotTested
Author

Sorry, a little more detail.

I'm not using cargo/rust to perform my linking: I use it only to build my crate as a static lib, as I wanted total control of my linker step. I am passing my crate (.a) as an input to the linker along with -nostartfiles, -nostdlib, -nodefaultlibs and have used my own assembly stubs to allow syscalls (so I shouldn't have or need any trace of libc whatsoever). This results in a static binary (EXEC). My linker script places all the code into a single RWX segment, with the entry function in its own section at the start of the segment to ensure it appears first in the binary. I then use objcopy to extract all the sections in the segment into a raw binary.

As a test harness, I am loading the binary using a simple Linux loader process which mmaps a memory region, copies the flat payload into it and calls it. This works fine if the memory region is mapped at the fixed base of the binary, but not if the memory region is elsewhere. The debugger shows that the fault is caused by the target using absolute addressing.

@bjorn3
Member

bjorn3 commented Apr 18, 2023

> This results in a static binary (EXEC).

If you build an ET_EXEC binary, the linker will resolve all absolute relocations using a fixed base address. You need an ET_DYN position independent executable to keep the absolute relocations in metadata for Scrt.o to resolve at runtime. To produce a position independent executable which doesn't need to perform any relocations at runtime, you need to make sure that the data section doesn't contain any relocations, that you compile with -Zplt=yes such that rustc uses PLT calls (which the dynamic linker can turn into direct relative calls) rather than GOT calls (which the dynamic linker can't turn into direct relative calls), and possibly some more things.
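
A quick way to confirm which kind of ELF the link step actually produced is to look at the e_type field in the header. The sketch below reads it from raw bytes and honours the byte order recorded in e_ident[EI_DATA], which matters for big-endian PowerPC. The function name `elf_type` is hypothetical; the field offsets and constants are the ones defined by the ELF specification.

```rust
/// Classify an ELF image by its `e_type` header field.
/// Returns `None` if the bytes do not look like an ELF header.
pub fn elf_type(bytes: &[u8]) -> Option<&'static str> {
    // e_ident occupies bytes 0..16: magic (0..4), EI_CLASS (4), EI_DATA (5), ...
    // e_type is the 16-bit field that follows, at bytes 16..18.
    if bytes.len() < 18 || !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    let raw = [bytes[16], bytes[17]];
    let e_type = match bytes[5] {
        1 => u16::from_le_bytes(raw), // ELFDATA2LSB, e.g. x86-64
        2 => u16::from_be_bytes(raw), // ELFDATA2MSB, e.g. big-endian PowerPC
        _ => return None,
    };
    Some(match e_type {
        1 => "ET_REL",  // relocatable object (.o)
        2 => "ET_EXEC", // fixed-address executable
        3 => "ET_DYN",  // shared object or PIE
        4 => "ET_CORE",
        _ => "other",
    })
}
```

The `file` output at the top of the thread ("ELF 64-bit LSB shared object") corresponds to ET_DYN, whereas the statically linked EXEC described above would report ET_EXEC.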

@WorksButNotTested
Author

Even with -Zplt=yes set in rustflags, I am still seeing approx 1500 R_PPC_RELATIVE in the .rela.dyn section?

@bjorn3
Member

bjorn3 commented Apr 18, 2023

Is that after linking?

@WorksButNotTested
Author

Yes. It looks like the relocations are applied to the .data.rel.ro and .got2 sections.

@bjorn3
Member

bjorn3 commented Apr 18, 2023

If there is a .data.rel.ro section, you haven't made sure that the data section doesn't contain any relocations. You aren't allowed to store any pointers in statics or promoted temporaries if you want to avoid all relocations. As for .got2, can you check which symbols are referenced by the relocations in this section?
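
As a rough illustration of the "no pointers in statics" restriction, the sketch below contrasts a static that stores an address with two pointer-free layouts. All names are hypothetical. It only addresses relocations in data: whether the code that references these statics still goes through a GOT entry depends on the target and is not something this sketch can settle.

```rust
/// Stores a pointer: the static holds the absolute address (and length) of
/// the string bytes, so a PIE needs a relocation to fill the address in.
pub static GREETING_PTR: &[u8] = b"HELLO\n";

/// Pointer-free alternative: the bytes live inline in the static itself,
/// so the static's contents contain no address.
pub static GREETING_INLINE: [u8; 6] = *b"HELLO\n";

/// Pointer-free table: one inline blob plus (offset, length) pairs
/// instead of an array of `&[u8]`.
pub static BLOB: [u8; 11] = *b"HELLO\nWORLD";
pub static TABLE: [(u16, u16); 2] = [(0, 6), (6, 5)];

/// Resolve a table entry to a slice at run time, relative to `BLOB`.
pub fn entry(i: usize) -> &'static [u8] {
    let (off, len) = TABLE[i];
    let start = off as usize;
    &BLOB[start..start + len as usize]
}
```

The design choice is to store offsets rather than addresses and to compute the address at the point of use, which keeps the data itself independent of the load address.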

@WorksButNotTested
Author

Looking at the link map, it seems the vast majority of the .data.rel.ro comes from the core library (objects in the static library starting with the prefix core-)?

Looking at the .data.rel.ro section of my own objects in the static library, I can see entries relating to static strings: both those used in println! calls and also strings containing source file paths (these have been put in by the compiler?).

One of the sources of entries in the .got2 is string data. e.g. println!("panic: {panic_info:#?}"); results in the string panic: appearing in the binary, with a corresponding .got2 entry.

@bjorn3
Member

bjorn3 commented Apr 19, 2023

Are you passing --gc-sections to the linker to allow the linker to remove unused parts of libcore? And try compiling with cargo build -Zbuild-std=core -Zbuild-std-features=panic_immediate_abort --release. This will replace all panics with aborting instructions and thus avoid all panic location references.

@WorksButNotTested
Author

WorksButNotTested commented Apr 28, 2023

Sorry for the delay. I got distracted with a few other things. I decided to create a minimal reproducer here to help demonstrate things. Just running cargo build --release in the rust sub-folder should create the binary. The parent folder contains a makefile and a simple C loader to demonstrate what I'm hoping to achieve once the Rust payload is built.

It's essentially just a no_std helloworld, but even after all the config in rust-toolchain.toml, .cargo/config.toml and Cargo.toml, I am still left with two relocations in the .got2 section relating to the "strings", e.g. the string "HELLO" in the following snippet.

#[no_mangle]
extern "C" fn unsafe_main(_argc: i32, _argv: *const *const i8) -> i32 {
    unsafe {
        let bytes = b"HELLO\n\0";
        write(STDOUT_FILENO, bytes.as_ptr() as *const c_void, bytes.len());
        // libc::_exit(0);
        0xdeadface_u32 as i32
    }
}

Grateful if you have any more suggestions.

@bjorn3
Member

bjorn3 commented Apr 29, 2023

I tried compiling the minimal reproducer for x86_64 and I didn't find any GOT references. Based on https://maskray.me/blog/2023-02-26-linker-notes-on-power-isa .got2 seems to be a PowerPC specific construct to handle the lack of relative addressing in PowerPC. It seems that you have to load r30 with the address of the .got2 section for the current translation unit (object file). I'm not sure it is possible to avoid it without writing your own compiler. But I'm not familiar with how ELF works on PowerPC, so I may be wrong about that.

@WorksButNotTested
Author

Should this be referred to someone on the Rust compiler team? Is it an inherent limitation of the architecture? Or perhaps of the LLVM backend for PPC? I appreciate it may not be the most common problem people face and hence may not be prioritised for a fix, but it would be good at least to determine a definitive cause and confirm there isn't a simple workaround if possible.

@bjorn3
Member

bjorn3 commented May 3, 2023

You could try asking someone who works on the PPC backend of LLVM.

@Manouchehri

@WorksButNotTested Do you still have a copy of https://github.com/WorksButNotTested/pie/ somewhere?

@WorksButNotTested
Author

I'm afraid not. I abandoned the approach in the end if I recall. The rust portion used a linker script to merge all the sections into a single executable segment (being sure to put the entry point in its own section at the start). This was then extracted using objcopy to give a flat binary. Then a C loader would just mmap the file and jump at it.

@Manouchehri

> being sure to put the entry point in its own section at the start

How did you do that sanely?

@WorksButNotTested
Author

Use an __attribute__((section(".section_name"))) to decorate the function definition. Then it's down to the order of the sections in the linker script. See here for how to control which sections go in which segment.
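
For an entry point written in Rust rather than C, the corresponding mechanism is the link_section attribute. The sketch below is an assumption-laden illustration, not the code from the lost repository: the section name ".text.entry" and the function name are hypothetical, the name format assumes an ELF target, and the linker script still has to list that section first. The `unsafe(...)` attribute form needs Rust 1.82 or later; older toolchains write the same attributes without the wrapper.

```rust
/// Entry function placed in its own section so that a linker script can put
/// it at the very start of the output segment.
/// ".text.entry" is a hypothetical name chosen for this sketch.
#[unsafe(link_section = ".text.entry")]
#[unsafe(no_mangle)]
pub extern "C" fn payload_entry() -> i32 {
    // Placeholder body; a real payload would do its work here.
    0
}
```

In the linker script, the output section would then name `.text.entry` before the remaining `.text*` input sections, so the entry code lands at offset 0 of the flat binary extracted by objcopy.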
