PLT accesses and symbol imports in ELF binaries #247
Comments
I started to work on this in order to support dynamically linked ELF files. My idea is to add a "symbol table" to the disassembler that tells the static analysis where the (symbolic) addresses of certain functions are in memory (e.g. the PLT). Then, we can recover that using an abstract interpretation like Bounded Address Tracking. As this is an interprocedural analysis (across functions), I have to implement a lot of infrastructure. Sadly, I got sidetracked by fixing the GUI issues. You can take a look at the code here: https://github.com/flanfly/panopticon/tree/bounded-addr-tracking |
Iiuc, I'm also not sure how exactly bounded address tracking will help panopticon know a jmp slot is actually a jump slot to an import. Also, every jmp slot's value is immediately computable: it is simply the next instruction's address + the offset = GOT entry = inside the dynamic array, which means we know its name (loading that symbol's code and matching it with the symbolic import name might require the above). So I was really only talking about identifying imported functions (and not having their actual contents resolved). This will also be useful to me for a couple of other unrelated reasons, in particular for a future tool which would need this information. Anyway, does that make any sense? Perhaps I'm missing something, or perhaps you're referring to both getting the imported functions (their names), and then also resolving their code bodies? |
Bounded address tracking simply replaces every value in the assembly code listing with a symbolic base plus offset. For example I thought it would be cool to resolve this using static analysis. My idea is that the ELF loader tells Panopticon where the GOT is located and what symbols it points to. The static analysis will recognize entry Thankfully, PLT resolution is easier. This should definitely be added. I'll try to get a simple version of bounded address tracking merged next week. Then we only need a simple pass that renames all functions that jump to a symbolic address 1: https://github.com/lattera/glibc/blob/master/sysdeps/i386/start.S#L60 |
Hm, it's still not precisely clear to me how my example above gets resolved in the manner I'm thinking. (getting But to illustrate, my use case is as follows:
What I mean by 2 is that it returns a list of every call; but some of those calls are "plt jump stubs" (but they're valid function calls nonetheless). It looks something like:
Consequently, even if bounded address tracking were to collapse the plt jump stub to:
I still think Unless you're saying that bounded address tracking when run on function targets would:
? So ideally, I'm thinking for my simple use case, calling panopticon on
(also the initial call site in the function for the import will also be important, e.g. at what address it occurs in the function, but I suspect that will be possible once it's implemented) |
I want to use the dynamic relocation information to skip the lazy initialization completely and pretend it has already been run. So in in
0000000000000690 <bar@plt>:
jmp QWORD PTR [rip+0x20098a] # 201020 <_GLOBAL_OFFSET_TABLE_+0x20>
push 0x1
jmp 670 <_init+0x20>
and dynamic relocation info:
The ELF loader will give us a |
Ah 👌, I see now. Yea, this is a good idea, and it was the second pass I was worried about. It should be fairly simple to add a pltgot + GOT hashmap directly in the Elf struct. Iirc this should also be directly constructable from the information already present in the struct. I can work on updating it with this information, so you'll have the map available when needed. Incidentally, once this gets implemented, it might be nice for the function's "called functions" method to return a list of local and imported functions, or to have the function enum contain this distinction. |
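The pltgot + GOT map described here could look roughly like the following. This is only a sketch: `JumpSlot`, `build_got_map`, `plt_name`, and the field names are hypothetical and do not reflect the actual Elf struct. It assumes the loader has already paired each jump-slot relocation's GOT address with its symbol name.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of one PLT relocation: the GOT slot the dynamic
/// linker patches and the name of the imported symbol it resolves.
pub struct JumpSlot {
    pub got_addr: u64,
    pub symbol: String,
}

/// Build a lookup table from GOT slot address to imported symbol name.
pub fn build_got_map(relocs: &[JumpSlot]) -> HashMap<u64, String> {
    relocs
        .iter()
        .map(|r| (r.got_addr, r.symbol.clone()))
        .collect()
}

/// Name a stub that jumps through `got_slot`, e.g. `bar@plt`.
/// Returns `None` if the slot is not a known import.
pub fn plt_name(got_map: &HashMap<u64, String>, got_slot: u64) -> Option<String> {
    got_map.get(&got_slot).map(|sym| format!("{}@plt", sym))
}
```

Keying the map by GOT slot address means a stub only has to be reduced to the slot it jumps through. No section headers or symbol table for the stub itself are needed.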
I'd like to nominate this as a crucial next step, and make this a priority. I keep bumping up against this issue all over the place, and I think it needs to get solved asap. Basically, aside from correctly determining which functions are called, almost all shared libraries will route local function calls through the PLT; glibc avoids this by some hackery I don't understand, but it involves messing with linker flags and function visibility annotations, which most libraries and gcc won't do by default. For example, from
That function
Consequently, panopticon only sees the callq to the plt stub. We need to fix this and teach panopticon to:
@flanfly How do you think we should do this? I remember you were working on something maybe, but I'm not sure. I'd like to see this implemented sooner rather than later, as it's a blocker on another project which is trying to use panopticon for analysis. EDIT: I would have implemented this already, since the import symbol infrastructure is essentially in place, but there is a blocker: I cannot get the value from the PLT stub, because it's RIP-relative and I don't know how to access the value:
You'll note in the screenshot that the value after That is however just the mnemonics; I think I need to access and interpret the statement vector for the PLT stub, but I'd need some pointers on how to do this |
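For a stub of the `bar@plt` form quoted earlier in this thread, the slot address can also be computed from the raw bytes, without interpreting the statement vector. A minimal Rust sketch, assuming the stub starts with the 6-byte x86-64 encoding `ff 25 disp32` (`jmp QWORD PTR [rip+disp32]`); the helper name `got_slot_of_plt_stub` is hypothetical and not part of panopticon:

```rust
/// Length in bytes of `jmp QWORD PTR [rip+disp32]` (opcode `ff 25` plus a 4-byte displacement).
const JMP_RIP_REL_LEN: u64 = 6;

/// Hypothetical helper: given the address and raw bytes of a PLT stub, return the
/// address of the GOT slot that its first instruction jumps through.
/// Returns `None` if the bytes do not start with `ff 25`.
pub fn got_slot_of_plt_stub(stub_addr: u64, bytes: &[u8]) -> Option<u64> {
    if bytes.len() < JMP_RIP_REL_LEN as usize || bytes[0] != 0xff || bytes[1] != 0x25 {
        return None;
    }
    // The displacement is a signed little-endian 32-bit value.
    let disp = i32::from_le_bytes([bytes[2], bytes[3], bytes[4], bytes[5]]);
    // RIP-relative operands are relative to the address of the *next* instruction.
    let next_insn = stub_addr.wrapping_add(JMP_RIP_REL_LEN);
    // Sign-extend the displacement before adding it.
    Some(next_insn.wrapping_add(disp as i64 as u64))
}
```

With the `bar@plt` stub at `0x690` and displacement `0x20098a`, this gives `0x696 + 0x20098a = 0x201020`, which matches the address in the disassembly comment.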
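Minimal example demonstrating the problem:
// Rust side: exported with an unmangled name and the C ABI so the C caller below can link against it
#[no_mangle]
pub extern "C" fn _print_rust() {
    println!("deadbeef: {:#x}", 0xdeadbeefu64);
}

/* C side: the call in foo() is routed through the PLT */
extern void _print_rust(void);
void foo () {
    _print_rust();
}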
As far as I know, we cannot even use symbol visibility and alias hackery to fix this, as that requires the alias to be in the same compilation unit, which it is not here. |
1. Adds function::Kind, now a field of functions
2. Adds update_plt to program, called in analysis, to rewrite candidate function stubs as their plt name, name@plt, with kind updated
3. Fixes a bug in the elf loader when the pltrela symbol type was NO_TYPE (but is actually a function, because it's in the plt and it's called like a function, see `pthread_mutexattr_destroy`)
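The renaming pass in item 2 can be pictured as follows. This is a hedged sketch, not the code from the commit: the `Kind` variants, the `Function` fields, and the `update_plt` signature are assumptions made for illustration, and each function is assumed to already carry the GOT slot its entry jumps through, if any.

```rust
use std::collections::HashMap;

/// Hypothetical distinction between ordinary functions and PLT stubs.
#[derive(Debug, Clone, PartialEq)]
pub enum Kind {
    Regular,
    Stub { import: String },
}

/// Hypothetical, pared-down function record.
#[derive(Debug, Clone, PartialEq)]
pub struct Function {
    pub name: String,
    pub kind: Kind,
    /// GOT slot read by an indirect jump at the function entry, if any.
    pub got_slot: Option<u64>,
}

/// Rename every function whose entry jumps through a known import slot to
/// `name@plt` and mark it as a stub. Returns how many functions were rewritten.
pub fn update_plt(functions: &mut [Function], imports: &HashMap<u64, String>) -> usize {
    let mut rewritten = 0;
    for func in functions.iter_mut() {
        // Skip functions that do not jump through a slot we know the name of.
        let import = match func.got_slot.and_then(|slot| imports.get(&slot)) {
            Some(name) => name.clone(),
            None => continue,
        };
        func.name = format!("{}@plt", import);
        func.kind = Kind::Stub { import };
        rewritten += 1;
    }
    rewritten
}
```

Keeping the import name inside the stub variant lets a later "called functions" query tell local and imported callees apart without parsing the `@plt` suffix.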
Oh whoops this got implemented 😎 Also it works for mach too :) |
I'm wondering what we should do about PLT accesses (e.g., calling an imported symbol from another library), in particular, how to:
An example will be clearer. Consider the following C file compiled with `gcc -shared foo.c -o libfoo.so`:
Currently, panopticon correctly discovers every PLT jump stub (more on that in a bit), and the local unexported (e.g., `internal1` above, and `func_7bb` below) function.
The PLT jump stubs are `func_650` to `func_690`, and all have the same form (but reference different PLT entries):
If we examine the function `ifoobar` in panopticon we see the following:
Which are two PLT accesses to `func_670` and `func_680` (the PLT jump stubs for `foobar` and `printf` respectively).
For the record, the PLT jump stubs are little entries created by the linker for imported symbols, which effectively implement lazy import of symbols as follows:
`.data`" loadable area of the binary; on the program's first run, it is just the resolver function for the dynamic linker, which, after resolving the function's address (hopefully), places this value in the GOT, so the next call jumps directly to the address of the imported function instead of the resolver (thus implementing lazy loading of imported symbols)
So, the first question is: what do we want to show here? I am for maximal explicitness, especially for beginners learning to look at disassembled code, but it would also be nice to:
`printf@plt`, or something similar)
`gdb` performs something similar when we disassemble:
(to be fair, gdb is using the section headers to locate the PLT and symbols, and likely the symbol table for foobar, all of which is strippable)
The second question is, how should we do it? Is this the responsibility of goblin? Iirc, the actual PLT jump stubs aren't required to be located anywhere in the binary information (since the dynamic linker doesn't even need to know about the jump slots, only the PLT location of where to place the resolved addresses), so it might be better/more sensible to: