support dynamic linkers #21

Closed
10 of 11 tasks
lunixbochs opened this issue Sep 3, 2015 · 59 comments

@lunixbochs
Owner

lunixbochs commented Sep 3, 2015

  • if type == EXEC, it's a normal executable
  • if type == DYN, we mmap it somewhere
  • also PT_INTERP gets vm_mmap()'d after elf_map runs and maps the whole elf
  • so for exec'd linker, I need to start with DYN
  • for INTERP ld-linux, I need to map the elf, then load and map the interpreter
  • be careful to not map linker after the data segment so we can brk()

  • support linux auxv
  • support linux vdso
  • support osx auxv equivalent (not sure 100% how the ABI plays out yet but dyld source is short)
  • support osx commpage
  • branch protect master when this is done and switch dev to an unstable branch
@lunixbochs
Owner Author

probably needs more syscalls

needs reading the dynamic linker section

dyld is WEIRD

x86_64 linux test currently fails to load symbols - it jumps to 0 from PLT

@lunixbochs
Owner Author

currently, statically and dynamically linked glibc binaries all crash on load for various reasons. I haven't tried any other libc implementations, and osx dynamic linking isn't supported yet either.

@lunixbochs
Owner Author

fixed glibc crashes (was bad interp entry point)

@lunixbochs
Owner Author

fixed incorrect auxv for musl and glibc.

musl crashes in decode_vec
glibc is jumping to a random unmapped address

@lunixbochs
Owner Author

glibc is crashing at elf_get_dynamic_info()

oh, I need to point to the program header inside the loaded section, not copy my own phdr in...

@Noiled

Noiled commented Oct 28, 2015

@lunixbochs can you share your code to build x86.darwin.macho?

@lunixbochs
Owner Author

It's just the test binary from https://github.com/lunixbochs/lib43 - if you want to build the 32-bit version specifically, do cmake -DCPU=i386

I'm assuming it works because lib43 doesn't care about macho auxv :)

@lunixbochs
Owner Author

osx auxv equivalent is supported on x86_64 now. the primary blocker is now unimplemented mach syscalls

@MagaTailor
Contributor

This is what happens on i386 after trying to run a dynamically linked binary:

Inconsistency detected by ld.so: rtld.c: 1141: dl_main: Assertion `_rtld_local._dl_rtld_map.l_libname' failed!

@lunixbochs
Owner Author

That's a regression. I'm not sure why it's happening yet. Wouldn't be surprised if it's a problem in auxv.

If you want to crack open the glibc source and take a look, narrowing this down would be helpful.

@MagaTailor
Contributor

@lunixbochs
Owner Author

lunixbochs commented Nov 20, 2015 via email

@MagaTailor
Contributor

Would any of usercorn's tracing options be of any help here?

@lunixbochs
Owner Author

Yeah. -trace should enable all of the useful tracing options. I don't recommend posting entire traces as they can be quite large (and I see the same error when I run one)

@MagaTailor
Contributor

No worries - I would've attached a file if you needed a trace. :)

@MagaTailor
Contributor

I can offer this backtrace from gdb:

(gdb) catch syscall exit
(gdb) catch syscall exit_group
(gdb) run
Starting program: /home/petevine/unpacked/usercorn-unstable/usercorn /usr/bin/ls
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/libthread_db.so.1".
Inconsistency detected by ld.so: rtld.c: 1141: dl_main: Assertion `_rtld_local._dl_rtld_map.l_libname' failed!
[New LWP 2387]
[New Thread 0xb317fb40 (LWP 2386)]
[New Thread 0xb39bfb40 (LWP 2385)]
[Switching to Thread 0xb1608b40 (LWP 2387)]

Catchpoint 4 (call to syscall exit), 0xb3c9c729 in start_thread ()
   from /lib/libpthread.so.0
(gdb) bt
#0  0xb3c9c729 in start_thread () from /lib/libpthread.so.0
#1  0xb3bf758e in clone () from /lib/libc.so.6
(gdb) c
Continuing.
[Thread 0xb1608b40 (LWP 2387) exited]
[Switching to Thread 0xb3ab0700 (LWP 2384)]

Catchpoint 5 (call to syscall exit_group), 0xb7fd9b30 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7fd9b30 in __kernel_vsyscall ()
#1  0x0811701f in syscall.Syscall ()
    at /home/petevine/go/src/syscall/asm_linux_386.s:23
#2  0x08114d49 in syscall.Exit (code=127)
    at /home/petevine/go/src/syscall/zsyscall_linux_386.go:376
#3  0x080b5c6f in os.Exit (code=127) at /home/petevine/go/src/os/proc.go:54
#4  0x0804b8db in main.main ()
    at /home/petevine/go/src/github.com/lunixbochs/usercorn/go/usercorn/cli.go:85

@lunixbochs
Owner Author

gdb isn't interesting here because it's attached to Usercorn (which I know very well) and not the target.

Once I have #10 we'll be able to attach GDB to the target, but that shouldn't be necessary here (as Usercorn should be able to provide a symbolicated traceback if the binary has symbols). It might not work on dynamic symbols? (#79)

@lunixbochs
Owner Author

okay, looks like phnum is zero for some reason, which makes phdr impossible to parse.
edit: was a regression, is fixed in 089b6fa - back to fixing TLS and weirdness

@MagaTailor
Contributor

Could you explain what's going on here? (I chose wc cause it seems to be working in busybox)

./usercorn /bin64/ldd /bin64/wc --help

panic: Could not identify file magic.

goroutine 1 [running]:
main.main()
        /home/petevine/go/src/github.com/lunixbochs/usercorn/go/usercorn/cli.go:55 +0x636

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /home/petevine/go/src/runtime/asm_386.s:1662 +0x1

or
./usercorn /bin64/wc --help

invalid read: @0x8, 0x8 = 0x0
Registers:
   fs 0x0000000000000000   rbx 0x000000000012a000   rsi 0x0000000000000000   r11 0x00000000607fe620 
   gs 0x0000000000000000   rcx 0x000000000040108d    r8 0x000000000012a000   r12 0x000000000012a000 
  rax 0x0000000000000000   rdi 0x00000000607fed5d    r9 0x0000000000000001   r13 0x0000000000000000 
  rbp 0x00000000607fe770   rdx 0x000000000d696913   r10 0x000000000101cdc1   r14 0x000000000d696913 
                                                                             r15 0x000000000040108d 

Thx!

@lunixbochs
Owner Author

You can't run ldd under usercorn because it's a shell script that basically does LD_TRACE_LOADED_LIBRARIES=1 /bin64/wc --help

In the other one, that's the bug this issue is about. Look at the stacktrace after -etrace to see where it's breaking.
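
For context on the "Could not identify file magic" panic: a shell script such as ldd starts with `#!` rather than an ELF or Mach-O magic number. A minimal sketch of that kind of check follows; `identify` is a hypothetical function and not usercorn's actual loader detection.

```go
package main

import (
	"bytes"
	"fmt"
)

// identify classifies a file by its first bytes.
func identify(head []byte) string {
	switch {
	case bytes.HasPrefix(head, []byte("\x7fELF")):
		return "elf"
	case bytes.HasPrefix(head, []byte{0xce, 0xfa, 0xed, 0xfe}), // 32-bit Mach-O, little-endian
		bytes.HasPrefix(head, []byte{0xcf, 0xfa, 0xed, 0xfe}): // 64-bit Mach-O, little-endian
		return "macho"
	case bytes.HasPrefix(head, []byte{0xca, 0xfe, 0xba, 0xbe}):
		return "macho-fat" // universal binary header (stored big-endian)
	case bytes.HasPrefix(head, []byte("#!")):
		return "script" // needs an interpreter, not a binary loader
	}
	return "unknown"
}

func main() {
	fmt.Println(identify([]byte("#!/bin/bash\n")))
	fmt.Println(identify([]byte("\x7fELF\x02\x01\x01")))
}
```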

@MagaTailor
Contributor

Oh, the --help was a silly paste mistake from trying to run the binary. If the bug weren't there, should running LD_TRACE_LOADED_LIBRARIES=1 /bin64/wc under usercorn work at present?
I'll be adding the trace later.

@lunixbochs
Owner Author

Oh, it's actually LD_TRACE_LOADED_OBJECTS, and that doesn't seem to work because it's currently breaking before link finishes. LD_DEBUG=all works for me.

This is my stacktrace:

0x100fd10 match_symbol (dl-version.c:76)
0x10100f7 _dl_check_map_versions+0x27 (dl-version.c:191)
0x10100d0 _dl_check_map_versions (dl-version.c:174)
0x1010562 _dl_check_all_versions+0x12 (dl-version.c:384)
0x1010550 _dl_check_all_versions (dl-version.c:380)
0x1004790 version_check_doit
0x100eb10 _dl_receive_error (dl-error.c:205)
0x100173a dl_main+0xca
0x1001670 dl_main
0x1015c42 _dl_sysdep_start+0x42 (dl-sysdep.c:111)
0x1015c00 _dl_sysdep_start (dl-sysdep.c:86)
0x1004825 _dl_start+0x55
0x10047d0 _dl_start
0x1001180 _start

@MagaTailor
Contributor

Yeah, that's the trace. How do you pass vars of the guest environment to usercorn?

@lunixbochs
Owner Author

Environment is inherited from host. It ends up on the stack above argv.
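
As a rough picture of that layout, here is a sketch of the word order a Linux process finds at its initial stack pointer: argc, the argv pointers, NULL, the envp pointers, NULL, then auxv pairs ending in AT_NULL. `initialStack` and `auxEntry` are hypothetical names, the pointer values are made up, and the strings themselves (which live higher on the stack) are left out.

```go
package main

import "fmt"

// auxEntry is one (type, value) pair of the auxiliary vector.
type auxEntry struct {
	Type, Val uint64
}

// A few auxv types from the Linux ABI.
const (
	atNull  = 0
	atPhdr  = 3
	atPhnum = 5
	atEntry = 9
)

// initialStack returns the words starting at the initial stack pointer,
// lowest address first.
func initialStack(argv, envp []uint64, auxv []auxEntry) []uint64 {
	w := []uint64{uint64(len(argv))} // argc
	w = append(w, argv...)
	w = append(w, 0) // argv terminator
	w = append(w, envp...)
	w = append(w, 0) // envp terminator
	for _, a := range auxv {
		w = append(w, a.Type, a.Val)
	}
	return append(w, atNull, 0) // auxv terminator
}

func main() {
	words := initialStack(
		[]uint64{0x7fff0010, 0x7fff0020},
		[]uint64{0x7fff0030},
		[]auxEntry{{atPhdr, 0x400040}, {atPhnum, 9}, {atEntry, 0x401000}},
	)
	for i, w := range words {
		fmt.Printf("sp+%#x: %#x\n", i*8, w)
	}
}
```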

@MagaTailor
Contributor

I'd emailed you about this thinking I broke something with a little hack but it wasn't my fault after all:

./usercorn /bin64/wc --help
hi 0
Inconsistency detected by ld.so: ../sysdeps/x86_64/dl-machine.h: 497: elf_machine_rela_relative: Assertion `((reloc->r_info) & 0xffffffff) == 8' failed!

I'm on 32-bit i386 linux.

@lunixbochs
Owner Author

What's hi 0? I'm pretty sure the error we're getting on glibc is actually relocation related across the board, so I'm not sure your binary is anything special (besides the fact it prints an assert). If you do think something changed, can you try git bisecting?

@MagaTailor
Contributor

No idea about hi 0 (probably wc is confused and trying to count something in spite of being asked not to).

I don't have that binary any longer and rebuilt from the latest unstable source. That's what you meant, right?

@lunixbochs
Owner Author

Can you git bisect to before this message was being printed, and see which usercorn -trace is longer? (fyi, make usercorn is way faster than make because it doesn't try to download deps)

@MagaTailor
Contributor

I can't do that now but bisecting from memory it would have to be this one:

d3f09a2

I'll try to provide an actual trace comparison from before that later.

@lunixbochs
Owner Author

Alright, if that's the case can you get me a trace from before and after (well, before and current unstable) when you get a chance?

@MagaTailor
Contributor

Ok, I tried building @ 95806ec but it failed with:

# github.com/lunixbochs/usercorn/go/usercorn
go/usercorn/cli.go:53: cannot use absPrefix (type string) as type *usercorn.Config in argument to usercorn.NewUsercorn
go/usercorn/cli.go:57: corn.Verbose undefined (type *usercorn.Usercorn has no field or method Verbose)
go/usercorn/cli.go:58: corn.TraceSys undefined (type *usercorn.Usercorn has no field or method TraceSys)
go/usercorn/cli.go:59: corn.TraceMem undefined (type *usercorn.Usercorn has no field or method TraceMem)
go/usercorn/cli.go:60: corn.TraceMemBatch undefined (type *usercorn.Usercorn has no field or method TraceMemBatch)
go/usercorn/cli.go:61: corn.TraceReg undefined (type *usercorn.Usercorn has no field or method TraceReg)
go/usercorn/cli.go:62: corn.TraceExec undefined (type *usercorn.Usercorn has no field or method TraceExec)
go/usercorn/cli.go:67: corn.TraceMatchDepth undefined (type *usercorn.Usercorn has no field or method TraceMatchDepth)
go/usercorn/cli.go:69: corn.TraceMatchDepth undefined (type *usercorn.Usercorn has no field or method TraceMatchDepth)
go/usercorn/cli.go:72: corn.TraceMatch undefined (type *usercorn.Usercorn has no field or method TraceMatch)
go/usercorn/cli.go:72: too many errors
make: *** [usercorn] Error 2

Any use going back further?

@MagaTailor
Contributor

For now, just the latest binary's trace:

newer.txt

And an older one's @ 2b80f6a which just reads:

panic: Could not identify file magic.

goroutine 1 [running]:
main.main()
        /home/petevine/go/src/github.com/lunixbochs/usercorn/go/usercorn/cli.go:55 +0x636

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /home/petevine/go/src/runtime/asm_386.s:1662 +0x1

which, going in the opposite direction, was still the case @ d3f09a2

and the latest 3976cd6 brings back the same error.

@lunixbochs
Owner Author

woah, my x86_64 dynamic musl-libc test works on the latest master, and my static glibc binary stopped working, so maybe it's fixed and I broke something else

bisect shows this broke in 089b6fa

@MagaTailor
Contributor

It seems the error above was caused by inadvertent truncation of the wc binary - the trace is the same invalid read as in the old one.

@MagaTailor
Contributor

Even though every trace shows:

Inconsistency detected by ld.so: do-rel.h: 116: elf_dynamic_do_Rela: Assertion `map->l_info[(34 + 0 + (0x6fffffff - (0x6ffffff0)))] != ((void *)0)' failed!

some binaries get stuff like this:

$ ./usercorn /bin64/ls
/bin64/ls: error while loading shared libraries: tls/x86_64/libcap.so.2: file too short

Is it about symlink handling?

@lunixbochs
Owner Author

Symlinks are handled. I think there's actually a linux-specific bug. It's way more reliable on OS X.

@MagaTailor
Contributor

You probably mean darwin host; anyway could you point me to a minimal darwin lib/bin package for client testing? I have never tried the -prefix switch and a different kernel yet.

@lunixbochs
Owner Author

maybe puredarwin http://www.puredarwin.org/downloads/

@MagaTailor
Contributor

I tried mounting the downloaded vmdk image with qemu-nbd, but all I got eventually was an hfsplus: invalid secondary volume header error. Maybe you could email me just a few essential libs and a test binary?

@lunixbochs
Owner Author

I'm not sending you OS X binaries, and I don't have any other xnu/darwin setup right now.

@lunixbochs
Owner Author

Here's a tarball of the HFS rootfs from PureDarwinNano_20091226.tar.xz: https://bochs.info/~aegis/PureDarwin.tar.gz

@MagaTailor
Contributor

I'd already tried that too - the mounted iso image contains just a few files for booting from cd. How's the actual data supposed to be stored?

@lunixbochs
Owner Author

Huh? This is a working root filesystem, with dynamic binaries, dyld, libraries, etc. Should be more than sufficient for basic dynamic link tests.

@MagaTailor
Contributor

Yeah, sorry, lack of concentration! Unfortunately, I already had a file named PureDarwinNano_20091226.tar.xz, which you mentioned above :)

Anyway, thanks a lot and getting back on topic:

$ ./usercorn -prefix PureDarwin/ PureDarwin/bin/ls -l


+ block @0x8fe1fbb8
                        0x8fe1fbb8:       5a pop edx
                        0x8fe1fbb9:     89e1 mov ecx, esp         edx 0x8fe1d33a 
                        0x8fe1fbbb:     0f34 sysenter             ecx 0x607fee2c 

                    + block @0x8fe1fbbd
                      0x8fe1fbbd:   0f1f00 nop dword ptr [eax]
                      0x8fe1fbc0: b806000000 mov eax, 6
                      0x8fe1fbc5:     cd82 int 0x82             eax 0x00000006 

panic: Syscall missing: 50331654

goroutine 17 [running, locked to thread]:
github.com/lunixbochs/usercorn/go.(*Usercorn).Syscall(0x106ca2a0, 0x3000006, 0x0, 0x0, 0x10856010, 0x0, 0x106122c0, 0x0, 0x0)
        /home/odroid/go/src/github.com/lunixbochs/usercorn/go/usercorn.go:695 +0x130
github.com/lunixbochs/usercorn/go/arch/x86.DarwinSyscall(0xab29e740, 0x106ca2a0, 0x3)
        /home/odroid/go/src/github.com/lunixbochs/usercorn/go/arch/x86/darwin.go:41 +0xfc
github.com/lunixbochs/usercorn/go/arch/x86.DarwinInterrupt(0xab29e740, 0x106ca2a0, 0x82)
        /home/odroid/go/src/github.com/lunixbochs/usercorn/go/arch/x86/darwin.go:53 +0x80
github.com/lunixbochs/usercorn/go.(*Usercorn).addHooks.func5(0xab29e6b0, 0x106e04d0, 0x82)
        /home/odroid/go/src/github.com/lunixbochs/usercorn/go/usercorn.go:553 +0x84
github.com/unicorn-engine/unicorn/bindings/go/unicorn.hookInterrupt(0x4c0148, 0x82, 0x108460b0)
        /home/odroid/go/src/github.com/unicorn-engine/unicorn/bindings/go/unicorn/hook.go:45 +0x80

And it seems it's still just i386 which might explain the problem.

@lunixbochs
Owner Author

// osfmk/i386/machdep_call.c
// DDD: the last two are BSD_CALL instead of CALL...
//#define __NR_thread_get_cthread_self      VG_DARWIN_SYSCALL_CONSTRUCT_MDEP(0)
//#define __NR_thread_set_cthread_self      VG_DARWIN_SYSCALL_CONSTRUCT_MDEP(1)
// 2 is invalid
#define __NR_thread_fast_set_cthread_self VG_DARWIN_SYSCALL_CONSTRUCT_MDEP(3)
//#define __NR_thread_set_user_ldt          VG_DARWIN_SYSCALL_CONSTRUCT_MDEP(4)
//#define __NR_i386_set_ldt                 VG_DARWIN_SYSCALL_CONSTRUCT_MDEP(5)
//#define __NR_i386_get_ldt                 VG_DARWIN_SYSCALL_CONSTRUCT_MDEP(6)

So this is i386_get_ldt. I'll need to update the syscall table in ghostrace with these (I don't have any machine-dependent syscalls in there atm)
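
The number in the panic decodes like this: 50331654 is 0x3000006, so the top byte (class 3, machine-dependent) and the low 24 bits (call 6) match the table above. A small sketch of the split; `splitSyscall` and `classNames` are hypothetical names, and the class values are the ones I recall from xnu's syscall_sw.h, so treat them as an assumption.

```go
package main

import "fmt"

const (
	classShift = 24
	numberMask = 1<<classShift - 1
)

// classNames maps the Darwin syscall class (top byte) to a readable name.
var classNames = map[uint32]string{
	0: "none",
	1: "mach",
	2: "unix",
	3: "machdep",
	4: "diag",
}

// splitSyscall separates a combined Darwin syscall number into class and index.
func splitSyscall(n uint32) (class, num uint32) {
	return n >> classShift, n & numberMask
}

func main() {
	class, num := splitSyscall(50331654)
	fmt.Printf("class=%d (%s) num=%d\n", class, classNames[class], num)
}
```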

LDT/GDT support on x86_32 isn't done yet. If you want a 64-bit darwin root instead, I think you need to build it yourself or find a mac.

@MagaTailor
Contributor

Ok, I'll explore the former option :)

@lunixbochs
Owner Author

http://web.mit.edu/darwin/src/modules/xnu/osfmk/man/i386_get_ldt.html
http://web.mit.edu/darwin/src/modules/xnu/osfmk/man/i386_set_ldt.html

There's a Unicorn register LDTR which points at the LDT memory address. I'm actually not sure what the LDT does in kernel mode (as I'm not doing a usermode transition). A workaround for now would be to actually set a GDT.

There's experimental code in x86/linux.go:gdtWrite() to write GDT entries, but I don't remember if the upstream GDT problems were fixed.

To read the userspace segment descriptors, you can do s := u.StrucAt(addr), then s.Unpack(&descriptor), where descriptor is a struct type like...

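type Descriptor struct {
    Addr uint32
    Size uint32
}

// u is the Usercorn instance; addr and count come from the syscall arguments.
func DoStuff(addr uint64, count int) uint64 {
    var desc Descriptor
    s := u.StrucAt(addr)
    for i := 0; i < count; i++ {
        // each Unpack reads the next descriptor from guest memory
        if err := s.Unpack(&desc); err != nil {
            return UINT64_MAX
        }
    }
    return 0
}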

You can do the opposite with s.Pack(&desc) to write descriptors back to guest memory.

@lunixbochs
Owner Author

gdtWrite is actually working now, so maybe PureDarwin could be supported with this.

@MagaTailor
Contributor

MagaTailor commented Sep 2, 2017

Thanks, I've just confirmed busybox-i686 is fully functional.

@lunixbochs
Owner Author

Closing this. Will break out a couple of remaining tasks.

This was referenced Sep 2, 2017