core::fmt::Write::write_fmt
causes UB when being relocated at runtime even when code-model is pic
#113207
Comments
Could you manually verify that the relocations inside the Rust executable are messed up? Could you also include some sort of reproduction so that people can test it themselves? |
TL;DR: I'm sure there is a miscompilation/a bug in the compiler. I do have a repository where people can test this out: https://github.com/phip1611/relocatable-multiboot2-chain-loader/tree/poc-rust-issue-113207 If you have nix installed, running everything should be as easy as:
This starts my loader (=the rust binary I build) in a QEMU VM. GRUB2 chainloads the loader via multiboot2. The Rust binary is supposed to have only position-independent code and is linked at 2M. GRUB relocates the binary to 6M at runtime in physical memory. To check if you have the right binary after running
Hint: I'd like to mention that GRUB relocates my binary in a "dumb"/simple way: there is no global offset table magic involved. The binary is simply moved in physical memory. I investigated the problem further with QEMU and I know where the bug happens, but not why. I suspect a miscompilation. Until the IP reaches address
This belongs to the compiled code of At link address During runtime after relocation, there are a bunch of no-ops at If I remove
let x = format_args!("hello");
let _ = Printer.write_fmt(x);
from I've appended a screencast of a GDB session right before the bug happens. The code here is executing at link address
Screencast.from.2023-07-03.09-00-08.webm |
Position independent code still requires you to apply ELF relocations in the loader. Normally, the compiler can just use RIP-relative addressing (this is why the rest of the binary works). But when a pointer to static memory is loaded from static memory (as is the case for |
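To illustrate the point about pointers stored in static memory, here is a minimal sketch for a hosted Rust program (not the bare-metal loader from this thread). The names `TARGET`, `POINTER`, `stored_address`, and `actual_address` are hypothetical. The code that reads `POINTER` can use relative addressing, but the bytes of `POINTER` themselves hold an absolute address, and that address is only correct if the loader has applied the matching relocation.

```rust
// Hypothetical example; all names are made up for illustration.

// A plain value placed in a data section.
static TARGET: u32 = 42;

// A static that stores the *address* of another static.
// In a PIE, this stored address is data and needs a relocation entry.
static POINTER: &u32 = &TARGET;

/// The address as stored inside `POINTER` (read from static memory).
fn stored_address() -> usize {
    POINTER as *const u32 as usize
}

/// The address of `TARGET` as computed by code at runtime.
fn actual_address() -> usize {
    &TARGET as *const u32 as usize
}

fn main() {
    // On a hosted system the loader applies relocations, so both agree.
    println!("stored: {:#x}", stored_address());
    println!("actual: {:#x}", actual_address());
    println!("value:  {}", *POINTER);
}
```

If an image is simply moved in memory without processing relocations, the computed address follows the move while the stored one still refers to the link address.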
It seems like no relocation information is emitted for my binary. Maybe I'm missing something, as I have never worked with ELF relocations so far. I made the following modification to my linker script:
diff --git a/chainloader/bin/link.ld b/chainloader/bin/link.ld
index ecd8e03..e9d490c 100644
--- a/chainloader/bin/link.ld
+++ b/chainloader/bin/link.ld
@@ -15,6 +15,7 @@ PHDRS
*/
loader PT_LOAD FLAGS(7); /* 0b111 - read + write + execute */
+ dynamic PT_DYNAMIC;
}
SECTIONS {
@@ -22,6 +23,8 @@ SECTIONS {
/* Binary is relocatable. Note: We want a 2 MiB alignment for huge pages. */
.loader 2M : AT(2M) ALIGN(4K)
{
+ *(.got .got.*)
+
KEEP(*(.multiboot2_header));
*(.text .text.*)
@@ -35,12 +38,42 @@ SECTIONS {
*(.data .data.*)
} : loader
+ /* I used `ld --verbose` to get these output sections. */
+ .rela.dyn :
+ {
+ *(.rela.init)
+ *(.rela.text .rela.text.* .rela.gnu.linkonce.t.*)
+ *(.rela.fini)
+ *(.rela.rodata .rela.rodata.* .rela.gnu.linkonce.r.*)
+ *(.rela.data .rela.data.* .rela.gnu.linkonce.d.*)
+ *(.rela.tdata .rela.tdata.* .rela.gnu.linkonce.td.*)
+ *(.rela.tbss .rela.tbss.* .rela.gnu.linkonce.tb.*)
+ *(.rela.ctors)
+ *(.rela.dtors)
+ *(.rela.got)
+ *(.rela.bss .rela.bss.* .rela.gnu.linkonce.b.*)
+ *(.rela.ldata .rela.ldata.* .rela.gnu.linkonce.l.*)
+ *(.rela.lbss .rela.lbss.* .rela.gnu.linkonce.lb.*)
+ *(.rela.lrodata .rela.lrodata.* .rela.gnu.linkonce.lr.*)
+ *(.rela.ifunc)
+ } : loader
+ .rela.plt :
+ {
+ *(.rela.plt)
+ PROVIDE_HIDDEN (__rela_iplt_start = .);
+ *(.rela.iplt)
+ PROVIDE_HIDDEN (__rela_iplt_end = .);
+ } : loader
+ .relr.dyn : { *(.relr.dyn) } : loader
+ .plt : { *(.plt) *(.iplt) } : loader
+ .plt.got : { *(.plt.got) } : loader
+ .plt.sec : { *(.plt.sec) } : loader
+ .dynamic : { *(.dynamic) } :data :dynamic
+
/DISCARD/ :
{
*(.note.*)
*(.comment .comment.*)
*(.eh_frame*)
- *(.got .got.*)
}
}
The difference of
3,4c3,4
< Entry point 0x200098
< There is 1 program header, starting at offset 52
---
> Entry point 0x2000a8
> There are 2 program headers, starting at offset 52
8c8,10
< LOAD 0x001000 0x00200000 0x00200000 0x05188 0x05188 RWE 0x1000
---
> LOAD 0x001000 0x00200000 0x00200000 0x05198 0x05198 RWE 0x1000
> DYNAMIC 0x000000 0x00000000 0x00000000 0x00000 0x00000 R 0
> readelf: Error: no .dynamic section in the dynamic segment
12a15
> 01
By the way: in any way, |
By switching from Here's an excerpt from the memory map during linkage of GNU ld when I use my own linker script:
When I use lld's default linker script, the |
By adding the following patch to my project, I could indeed create an executable with relocation information. (However, my planned project then doesn't work, as GRUB2 doesn't support (chain)loading binaries with relocations. So this is less of a bug and more of a "works as intended"? It's a bit frustrating. I was already very motivated, as some of the compiled code was indeed fully position independent without further relocation information.) I'd prefer a compiler mode where rustc either guarantees:
diff --git a/chainloader/bin/link.ld b/chainloader/bin/link.ld
index ad93a8c..5394b8f 100644
--- a/chainloader/bin/link.ld
+++ b/chainloader/bin/link.ld
@@ -22,6 +22,7 @@ PHDRS
*/
loader PT_LOAD FLAGS(7); /* 0b111 - read + write + execute */
+ dynamic PT_DYNAMIC FLAGS(6);
}
SECTIONS {
@@ -42,12 +43,21 @@ SECTIONS {
*(.data .data.*)
} : loader
+ .got : { *(.got .got.*) } : loader
+
+ .dynamic : { *(.dynamic .dynamic.*) } : loader
+
+ .interp : { *(.interp .interp.*) } : loader
+ .dynsym : { *(.dynsym .dynsym.*) } : loader
+ .dynstr : { *(.dynstr .dynstr.*) } : loader
+ .hash : { *(.hash .hash.*) } : loader
+ .gnu.hash : { *(.gnu.hash .gnu.hash.*) } : loader
+
/DISCARD/ :
{
*(.note.*)
*(.comment .comment.*)
*(.eh_frame*)
- *(.got .got.*)
}
}
diff --git a/chainloader/x86-unknown-none.json b/chainloader/x86-unknown-none.json
index d3e81d6..ef5a044 100644
--- a/chainloader/x86-unknown-none.json
+++ b/chainloader/x86-unknown-none.json
@@ -7,8 +7,10 @@
"target-c-int-width": "32",
"os": "none",
"executables": true,
- "linker-flavor": "ld.lld",
- "linker": "rust-lld",
+ "linker-flavor": "ld",
+ "linker": "ld",
+ "position-independent-executables": true,
+ "static-position-independent-executables": true,
"panic-strategy": "abort",
"disable-redzone": true,
"features": "+soft-float,-x87,-mmx,-sse,-sse2,-sse3,-ssse3,-sse4.1,-sse4.2,-avx,-avx2,-fma,-3dnow,-3dnowa" |
This would (almost) always produce a build error, as every program links in libcore, which has several statics that contain references. And even in a PIC/PIE executable, data sections that contain pointers need relocations to be applied at runtime. PIC/PIE only applies to code, not to data. For code, relative addressing is possible if the target is in the same DSO, but data requires absolute pointers. |
Text relocations require that the code segment be temporarily made writable to perform relocation processing. This violates W^X, so modern toolchains do not emit them and Android forbids them outright.
Data needs relocations applied, but code does not. That said, I would not be surprised if missing data relocations could cause the problem @phip1611 observed. In short, GRUB does not support loading binaries that have been compiled as PIC/PIE, and instead requires that all code be linked at the base address it will be run from. This allows all relocation processing to be done by the linker before execution. Systems software that needs to be relocated will need to relocate itself after GRUB has loaded it. |
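As a rough illustration of what "relocating itself" involves, here is a minimal sketch that applies `R_386_RELATIVE` relocations to an image held in a byte buffer. It is a simulation under stated assumptions: the `Rel` struct mirrors an ELF32 `Elf32_Rel` entry, while `read_u32`, `write_u32`, and `apply_relative_relocs` are hypothetical helper names. A real loader would locate the entries through the `DT_REL` and `DT_RELSZ` tags of the `.dynamic` section and patch memory in place.

```rust
/// Mirrors an ELF32 `Elf32_Rel` entry: where to patch and how.
#[derive(Clone, Copy)]
struct Rel {
    r_offset: u32, // link-time address of the word to patch
    r_info: u32,   // low 8 bits hold the relocation type
}

/// Relocation type "add the load bias to the word stored at r_offset".
const R_386_RELATIVE: u32 = 8;

/// Reads a little-endian u32 at byte offset `at`.
fn read_u32(image: &[u8], at: usize) -> u32 {
    let mut word = [0u8; 4];
    word.copy_from_slice(&image[at..at + 4]);
    u32::from_le_bytes(word)
}

/// Writes a little-endian u32 at byte offset `at`.
fn write_u32(image: &mut [u8], at: usize, value: u32) {
    image[at..at + 4].copy_from_slice(&value.to_le_bytes());
}

/// Applies every R_386_RELATIVE entry; other types are skipped in this sketch.
fn apply_relative_relocs(image: &mut [u8], rels: &[Rel], link_base: u32, load_base: u32) {
    let bias = load_base.wrapping_sub(link_base);
    for rel in rels {
        if rel.r_info & 0xff != R_386_RELATIVE {
            continue;
        }
        let at = (rel.r_offset - link_base) as usize;
        let old = read_u32(image, at);
        write_u32(image, at, old.wrapping_add(bias));
    }
}

fn main() {
    // Linked at 2 MiB, moved to 6 MiB, as in this thread.
    let (link_base, load_base) = (0x20_0000u32, 0x60_0000u32);
    let mut image = vec![0u8; 16];
    // A pointer slot at offset 8 that refers to link address 0x200004.
    write_u32(&mut image, 8, 0x20_0004);
    let rels = [Rel { r_offset: 0x20_0008, r_info: R_386_RELATIVE }];
    apply_relative_relocs(&mut image, &rels, link_base, load_base);
    println!("{:#x}", read_u32(&image, 8)); // prints 0x600004
}
```

The self-relocation routine itself must not depend on any relocated data, which is why it is usually kept small and written with care.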
I see. My expectation was that with a single RWX LOAD segment, it would be possible to have 100% position-independent code and data accesses without additional relocation information in the ELF. To sum this up, I ended up in an unfortunate and uncommon toolchain situation where things only worked as expected to a certain degree. Would it be sensible to add a compiler warning such as the following?
if relocation_model == "pie" && !target.static_position_independent_executables {
    eprintln!("Binary will not have relocation information. A relocation during runtime might cause undefined behaviour.");
}
|
@phip1611 To me there are two problems here:
The correct ways to fix this are to either use a better loader that can handle relocations, or to build your executable as position-dependent code with an appropriate base address. If you choose the latter, you should add a step to your build that checks for ELF relocations in the output and fails if any are present. This prevents KASLR, however, so I recommend either:
|
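The suggested build step that checks for ELF relocations could look roughly like the following sketch. It assumes a little-endian ELF32 file, as produced for the x86 target in this thread, and counts non-empty `SHT_REL`/`SHT_RELA` sections. The function names `u16_at`, `u32_at`, and `count_reloc_sections` and the tool name in the usage message are hypothetical. A binary with stripped section headers would pass this check unnoticed, so inspecting the `PT_DYNAMIC` segment may be a more robust alternative.

```rust
const SHT_RELA: u32 = 4; // relocation entries with explicit addends
const SHT_REL: u32 = 9;  // relocation entries without addends

fn u16_at(b: &[u8], off: usize) -> Option<u16> {
    let s = b.get(off..off + 2)?;
    Some(u16::from_le_bytes([s[0], s[1]]))
}

fn u32_at(b: &[u8], off: usize) -> Option<u32> {
    let s = b.get(off..off + 4)?;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]))
}

/// Counts non-empty relocation sections in a little-endian ELF32 file.
/// Returns `None` if the input is not such a file or is truncated.
fn count_reloc_sections(elf: &[u8]) -> Option<usize> {
    // Magic, then EI_CLASS = ELFCLASS32 (1) and EI_DATA = ELFDATA2LSB (1).
    if !elf.starts_with(&[0x7f, b'E', b'L', b'F', 1, 1]) {
        return None;
    }
    let shoff = u32_at(elf, 32)? as usize;     // e_shoff
    let shentsize = u16_at(elf, 46)? as usize; // e_shentsize
    let shnum = u16_at(elf, 48)? as usize;     // e_shnum
    let mut count = 0;
    for i in 0..shnum {
        let hdr = shoff + i * shentsize;
        let sh_type = u32_at(elf, hdr + 4)?;  // Elf32_Shdr.sh_type
        let sh_size = u32_at(elf, hdr + 20)?; // Elf32_Shdr.sh_size
        if (sh_type == SHT_REL || sh_type == SHT_RELA) && sh_size > 0 {
            count += 1;
        }
    }
    Some(count)
}

fn main() {
    let path = match std::env::args().nth(1) {
        Some(p) => p,
        None => {
            eprintln!("usage: check-relocs <elf-file>");
            return;
        }
    };
    let bytes = std::fs::read(&path).expect("cannot read file");
    match count_reloc_sections(&bytes) {
        Some(0) => println!("{}: no relocation sections", path),
        Some(n) => {
            eprintln!("{}: {} relocation section(s) present", path, n);
            std::process::exit(1);
        }
        None => {
            eprintln!("{}: not a little-endian ELF32 file", path);
            std::process::exit(2);
        }
    }
}
```

Run after linking, a non-zero exit code from such a tool would fail the build.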
I was just exploring new ways to be relocatable when being loaded as static ELF by GRUB. So, unfortunately, having fully relocatable Rust code without relocation information only works partially, as described above. Thanks! From your answers, I assume that the Rust compiler should not or cannot provide additional warnings and that I in fact just ended up in an unfortunate toolchain + runtime edge case. For those of you who are interested: there is a solution to the problem; I was just trying to explore new ways. One way to be relocatable in a static ELF without relocations is to have a small position-independent assembly routine that sets up paging. This is how NOVA, Hedron, or my learning project do it. I had hoped that I could get rid of most of the assembly, but apparently that does not work, and I have to stick with the old approach (where EFI is not an option and I have to use GRUB2 + Multiboot2). Thanks everyone! |
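The paging approach mentioned above can be sketched as follows. This is an assumption-laden illustration, not code from NOVA, Hedron, or the linked project: it assumes 2 MiB pages (PAE or long mode), and the names `pd_index` and `huge_page_entry` are hypothetical. The idea is to map the virtual link address onto whatever physical address the bootloader chose, so that absolute pointers in the image remain valid.

```rust
const PRESENT: u64 = 1 << 0;
const WRITABLE: u64 = 1 << 1;
const HUGE_PAGE: u64 = 1 << 7; // PS bit: the entry maps a 2 MiB page directly
const HUGE_PAGE_SIZE: u64 = 2 * 1024 * 1024;

/// Index into a page directory for a virtual address (address bits 21..=29).
fn pd_index(vaddr: u64) -> usize {
    ((vaddr >> 21) & 0x1ff) as usize
}

/// Builds a page-directory entry that maps one 2 MiB page at `paddr`.
fn huge_page_entry(paddr: u64) -> u64 {
    assert_eq!(paddr % HUGE_PAGE_SIZE, 0, "physical address must be 2 MiB aligned");
    paddr | PRESENT | WRITABLE | HUGE_PAGE
}

fn main() {
    let link_addr: u64 = 2 * 1024 * 1024; // where the image was linked
    let load_addr: u64 = 6 * 1024 * 1024; // where the bootloader placed it

    // One page directory with 512 entries, each covering 2 MiB.
    let mut page_directory = [0u64; 512];
    page_directory[pd_index(link_addr)] = huge_page_entry(load_addr);

    println!(
        "PD[{}] = {:#x}",
        pd_index(link_addr),
        page_directory[pd_index(link_addr)]
    ); // prints PD[1] = 0x600083
}
```

The routine that builds these tables and enables paging still has to be position independent itself, which is why it is typically a small piece of assembly.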
TL;DR: In this thread, I show how machine code emitted for core::fmt::write always jumps to the link address of a certain instruction although the code should be position independent (code-model is pie).
I'm working on a small x86 multiboot2 chainloader with position-independent code. The ELF consists of a small entry written in assembly (that sets up the stack) and Rust code for more logic that is compiled/linked with relocation-model=pie. (I've also tried =pie but there was no difference.)
The ELF file is a static ELF executable with a single LOAD segment. The Multiboot2 bootloader (GRUB) loads this file with a forced relocation (for example: pushed from 2 MiB to 6 MiB) into physical memory when the CPU is in Multiboot2's I386 machine state.
The assembly routine finds the relocation offset, sets the stack, and jumps to the high-level Rust code. While some Rust functionality works properly (such as calling functions, doing calculations, and printing text via an x86 I/O port) in the relocated binary, I noticed that
causes undefined behavior and weird effects. A weird loop is started where the code above that line is executed again and again.
Interestingly, Printer.write_str("foo") works perfectly fine.
If I disable the relocation during runtime entirely, core::fmt::Write::write_fmt also works as expected and the string is printed to the screen. I had a look with an experienced colleague of mine and we are confident that my stack setup (even in the relocated case) is valid. Hence, the issue must be something else. From looking at the objdump, I'm certain that the code indeed uses relative addresses and calls.
I expected to see that the code is in fact relocatable in the physical address space.
Instead, much of the functionality works, but core::fmt::Write::write_fmt fails when the static ELF binary is relocated during runtime.
Meta
I'm using Rust in version nightly-2023-06-28. I'm using the following compiler spec JSON: