Revisit ordering of segments in the program header table on AArch32 #52
Comments
We have determined that the program headers are in the first
|
My first plan of attack is to just put |
With the code on that branch, I was able to maintain the index of
|
@travitch and I now think that the problem is that the program is reading this address from the aux vector, and it's incorrect: https://github.com/torvalds/linux/blob/242b23319809e05170b3cc0d44d3b4bd202bb073/fs/binfmt_elf.c#L258
The relevant call is here:
The variable
This calculation happens the first time we hit a
In the refurbished binary, the new text segment (with the new program headers at the beginning) is the first |
@travitch I was mistaken above when I said:
It's actually a
So after coming back and re-reading, I think I've generated a set of unsatisfiable constraints for segments in the binary :) Maybe we ruled one of these out and I just don't remember.
So to accomplish (5) (so that the aux vector value comes out right), we would have to make the
Is one of those two options what you were thinking? Or am I missing something? |
Making the
So maybe we're back to the drawing board again...? |
I think that our last debugging session revealed a single simpler statement of the constraint: the
Note that the kernel code computing the
The experiment I was suggesting was to force our new text segment on ARM to come after the original one (which would put us back into the x86 case where the glibc TLS setup code sees the wrong program header table, but at least one structured correctly). If that works, we can always:
Note that if we do that, we need to be a bit careful about computing what the start address of the code will be (i.e., we need to account for the program header table sitting in front of it). |
@travitch I think that this is not true:
based on this output from
It looks like the first |
Then we still need to satisfy |
Where "satisfy" means that the address of the program header table in that |
nit: Based on https://github.com/torvalds/linux/blob/dd0d718152e4c65b173070d48ea9dfc06894c3e5/fs/binfmt_elf.c#L258, I think we need this address to be the address of the program header table:
load_addr + exec->e_phoff == (elf_ppnt->p_vaddr - elf_ppnt->p_offset) + exec->e_phoff
where |
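For reference, the arithmetic above can be reproduced outside the kernel. The following is a minimal sketch, assuming a little-endian ELF32 image of type `ET_EXEC` (so no load bias is applied); the function name `kernel_at_phdr` is made up for illustration and is not part of renovate or the kernel.

```python
import struct

PT_LOAD = 1  # program header type for loadable segments


def kernel_at_phdr(elf: bytes) -> int:
    """Mirror the load_addr + e_phoff arithmetic for a little-endian ELF32 image.

    Assumption: the binary is ET_EXEC, so no load bias is added.
    """
    # ELF32 header: e_phoff is a 4-byte field at offset 28;
    # e_phentsize and e_phnum are 2-byte fields at offsets 42 and 44.
    (e_phoff,) = struct.unpack_from("<I", elf, 28)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 42)
    for i in range(e_phnum):
        # Elf32_Phdr starts with p_type, p_offset, p_vaddr (4 bytes each).
        p_type, p_offset, p_vaddr = struct.unpack_from(
            "<III", elf, e_phoff + i * e_phentsize
        )
        if p_type == PT_LOAD:
            # load_addr is taken from the first PT_LOAD in table order only.
            load_addr = p_vaddr - p_offset
            return load_addr + e_phoff
    raise ValueError("no PT_LOAD segment found")
```

Calling it on the bytes of a binary (for example `kernel_at_phdr(open(path, "rb").read())`, with `path` being whichever binary is under test) gives the value the formula predicts for that file.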
To confirm this theory, I tried out this calculation for the original and
Original binary
meaning:
so
This makes sense, because the first
Refurbished binary
meaning:
resulting in
but this should be right! This really is the virtual address of the program |
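One way to sanity-check a claim like this is to map the candidate virtual address back to a file offset through the `PT_LOAD` segments and compare it with `e_phoff`. This is only a sketch: the `Load` tuple and both helper names are hypothetical, and the segment values would have to be filled in from the binary's own program headers.

```python
from typing import NamedTuple, Optional, Sequence


class Load(NamedTuple):
    """The three PT_LOAD fields needed to map a vaddr to a file offset."""
    p_offset: int
    p_vaddr: int
    p_filesz: int


def vaddr_to_file_offset(loads: Sequence[Load], vaddr: int) -> Optional[int]:
    """Return the file offset backing vaddr, or None if no segment covers it."""
    for seg in loads:
        if seg.p_vaddr <= vaddr < seg.p_vaddr + seg.p_filesz:
            return seg.p_offset + (vaddr - seg.p_vaddr)
    return None


def points_at_phdr_table(loads: Sequence[Load], candidate: int, e_phoff: int) -> bool:
    """True if candidate is mapped from the file bytes starting at e_phoff."""
    return vaddr_to_file_offset(loads, candidate) == e_phoff
```

If this returns `True` for the computed address, the address really is where the program header table is mapped, and any mismatch has to come from whoever populates the aux vector.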
To confirm my calculations, I ran both binaries through GDB with this script:
In summary, it looks like my calculation was right for the original binary, but somehow off for the refurbished binary (by Note that and we're looking at the aux vector, which is an array of these: https://github.com/lattera/glibc/blob/895ef79e04a953cac1493863bcae29ad85657ee1/elf/elf.h#L1122
Original
Refurbished
It's worth noting that this address keeps this value over the lifetime of the program; it isn't modified during execution. |
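For anyone who wants to inspect the aux vector without GDB, here is a rough sketch of decoding a raw dump of it. It assumes the dump is a flat array of `(a_type, a_val)` word pairs terminated by `AT_NULL`, with 4-byte words for a 32-bit process; `parse_auxv` is a made-up helper name, while the `AT_*` constants are the standard values from `elf.h`.

```python
import struct

# Standard aux vector tags from elf.h.
AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM = 0, 3, 4, 5


def parse_auxv(raw: bytes, word_size: int = 4, little_endian: bool = True) -> dict:
    """Decode a raw aux vector dump into a {a_type: a_val} dictionary."""
    fmt = ("<" if little_endian else ">") + ("II" if word_size == 4 else "QQ")
    entry_size = struct.calcsize(fmt)
    # iter_unpack needs a whole number of entries, so drop any trailing bytes.
    usable = raw[: len(raw) - len(raw) % entry_size]
    entries = {}
    for a_type, a_val in struct.iter_unpack(fmt, usable):
        if a_type == AT_NULL:  # AT_NULL terminates the vector
            break
        entries[a_type] = a_val
    return entries
```

On a native Linux process the input could come from `/proc/self/auxv` (with `word_size` matching the process), after which `hex(parse_auxv(raw)[AT_PHDR])` can be compared against the kernel-side value.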
I also ran both programs through a Linux kernel modified with this patch. What we see in the output is that the kernel is setting a correct value for
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 9fe3b51c1..2025c9b49 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -252,6 +252,8 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
*/
ARCH_DLINFO;
#endif
+ printk("Setting AT_PHDR=0x%lx (load_addr=0x%lx, exec->e_phoff=0x%lx)\n", load_addr + exec->e_phoff, load_addr, exec->e_phoff);
+ printk("Setting AT_PHNUM=%d\n", exec->e_phnum);
NEW_AUX_ENT(AT_HWCAP, ELF_HWCAP);
NEW_AUX_ENT(AT_PAGESZ, ELF_EXEC_PAGESIZE);
NEW_AUX_ENT(AT_CLKTCK, CLOCKS_PER_SEC);
@@ -574,6 +576,7 @@ static unsigned long load_elf_interp(struct elfhdr *interp_elf_ex,
unsigned long error = ~0UL;
unsigned long total_size;
int i;
+ printk("Unexpected: calling load_elf_interp\n");
/* First of all, some simple consistency checks */
if (interp_elf_ex->e_type != ET_EXEC &&
@@ -920,6 +923,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
/* Some simple consistency checks for the interpreter */
if (interpreter) {
+ printk("Unexpected: found an ELF interpreter\n");
retval = -ELIBBAD;
/* Not an ELF interpreter */
if (memcmp(interp_elf_ex->e_ident, ELFMAG, SELFMAG) != 0)
@@ -945,6 +949,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
break;
case PT_LOPROC ... PT_HIPROC:
+ printk("Passing a header of type %d to arch-specific handling code\n", elf_ppnt->p_type);
retval = arch_elf_pt_proc(interp_elf_ex,
elf_ppnt, interpreter,
true, &arch_state);
@@ -1118,6 +1123,7 @@ static int load_elf_binary(struct linux_binprm *bprm)
if (!load_addr_set) {
load_addr_set = 1;
load_addr = (elf_ppnt->p_vaddr - elf_ppnt->p_offset);
+ printk("load_addr=0x%lx (with elf_ppnt->p_vaddr=0x%lx and elf_ppnt->p_offset=0x%lx)\n", load_addr, elf_ppnt->p_vaddr, elf_ppnt->p_offset);
if (elf_ex->e_type == ET_DYN) {
load_bias += error -
ELF_PAGESTART(load_bias + vaddr);
Original
Refurbished
|
To confirm that we're looking at the same entry in the aux vector from the kernel and binary side, I wrote this script to print out the aux vector before we reach
Once again confirming that this really is the right aux vector entry, and really does have the wrong value, as early as |
Here's the entire aux vector from the kernel:
Notice that a bunch of the entries seem different, for example |
Are you sure that code is relevant here? My impression was that that code is for qemu user mode where you give qemu a binary and it emulates it by dynamic translation to the host architecture. In the system emulation mode (where you have a kernel running from a disk image), I don't think that this code runs. In the system emulator, qemu never touches your binary directly. |
After a short discussion on Mattermost we confirmed that the QEMU user-mode code might be the point of failure, because what we're seeing is that Linux is calculating the correct value under full-system emulation, but the binary is getting the wrong value under user-mode emulation. A little more digging is needed, then we'll have to see if we can write some code that produces program headers that can run both in QEMU user-mode and in full-system mode, or if we'll be reduced to always testing via full-system emulation. |
To confirm that this really is caused by a discrepancy between QEMU user-mode and system-mode, I ran a refurbished binary with an infinite loop at the end of
These all seem reasonably high-effort, but 2a has the potential of being the lowest effort, so I'll start there. |
Here's a QEMU bug report, just in case some kind soul comes along and fixes the issue quickly: https://bugs.launchpad.net/qemu/+bug/1885332 |
So it looks like QEMU is calculating
where
This yields
as expected. If we set this equal to the kernel's calculation, we can generate the constraint we would have to satisfy to work around this bug:
In English: The virtual address of the first |
To fix the issues in #48, @travitch suggests we revisit the segment ordering in https://github.com/GaloisInc/renovate/blob/master/renovate/src/Renovate/BinaryFormat/ELF.hs. Specifically: