-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ks_asm() Address to symbol offset. #45
Conversation
@@ -658,17 +658,19 @@ void MCAssembler::layout(MCAsmLayout &Layout) { | |||
uint64_t FixedValue; | |||
bool IsPCRel; | |||
std::tie(FixedValue, IsPCRel) = handleFixup(Layout, *F, Fixup); | |||
if (!IsPCRel) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be sure, you can check for X86 with:
if (getBackend().getArch() == KS_ARCH_X86)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought about that (or to move it further down to the X86 specific
version of handleFixup), but I'm not sure whether the problem happens on
other architectures as well.
There's another problem on ARM which prevents me from testing it there
(I'll open an issue about that later, ldr r0, =label
doesn't do what it
should do) and I'm not experienced enough with other archs for a quick
test.
In fact, I'm not even sure this is the right approach, it might be possible to add the address to the symbol itself which means the fixup will automatically have the correct absolute address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes pass it further down to X86 handler is another solution.
this issue looks arch-specific to me, so i am not sure how you add "address to symbol" in generic way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just tested this with PPC code generation, it has the same issue without this address fixup so I'm going to assume this applies to all platforms. Not yet convinced this is the proper solution though, setting the fragment's offset to Address is probably a better idea.
By adding BaseAddress to MCContext, passing it from ks_asm and using it to initialise the address of the first fragment we make sure that symbols get the right absolute offset.
Much cleaner version. By setting the address of the first MCFragment to the address passed to ks_asm via the context, the label addresses get offset automatically. |
how can we be sure that this works for all other non-x86 architectures? |
From a theoretical point of view: keystone uses the ELF object (.o) streamer, ELF object files are relocatable as the final address isn't known until link time. The relocations are recorded in the .o file and the linker stitches it all together: All addresses in the binary code are relative to the section start and the linker offsets these so they become correct. This patch effectively patches the section start so all addresses become relative to the base address. From a technical point of view: The assembler and emitter have no idea where the code will live (well, you added it to the instruction class, but it's not generally used), addresses can only be relative to the current fragment because there is no other point of reference. From an (incomplete) empirical point of view: I've tested this on X86 and PPC and it happens on both those architectures. On ARM this is currently not testable because labels are interpreted as strings. |
yes basically we have to implement a simple "linker" that original assembler of LLVM does not do. merged, thanks! |
actually this causes a segfault on Ubuntu 14.04 64bit, on the testcase |
This passes the Address provided to ks_asm down to MCAssembler::layout where it's added to the fixed up value and applied to the encoded output.
Not sure what the impact is on other architectures and whether it's the right approach, but it seems to work for x86/x86_64.