"Relocation table synthesizer" Analysis False Negative, Symbol Name Confusion, and Comments #6

@widberg

Following our conversation in #1, I have put together a small test case. The archive delinker_issue.zip contains an executable with a switch statement in main, compiled and linked with the Visual Studio 2005 toolchain, along with the source code for the main function and the delinker artifacts. The "Relocation table synthesizer" misidentifies an absolute operand as not needing a relocation. Granted, I am judging this against my own model of unlinking, which treats functions as atomic units; you may disagree about what should and should not be relocated.

False negative: The MOVZX at 00401029 has an absolute address operand that should be relocated, but no relocation is emitted for it. The delinker's output says 00401029> No relocation emitted for instruction with interesting primary reference. so the instruction was at least considered. The target of this operand is the switch statement value table starting at 004010d8.
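To make the false negative concrete, here is a minimal sketch of the check I have in mind: read the 32-bit displacement out of the instruction and see whether it lands inside the image. The instruction bytes, the displacement offset, and the image bounds are assumptions chosen to match the addresses above, not values taken from the delinker or the archive.

```python
import struct

# Assumed image bounds for illustration only.
IMAGE_BASE = 0x00400000
IMAGE_END = 0x00410000

def absolute_operand_target(insn: bytes, disp_offset: int):
    """Return the 32-bit displacement if it points into the image, else None."""
    (value,) = struct.unpack_from("<I", insn, disp_offset)
    return value if IMAGE_BASE <= value < IMAGE_END else None

# Hypothetical encoding of `movzx eax, byte ptr [eax + 0x004010d8]`:
# opcode 0F B6, ModRM 80, then a little-endian disp32.
insn = bytes.fromhex("0fb680d8104000")
target = absolute_operand_target(insn, 3)
```

Under these assumptions `target` is the value table address, so the four displacement bytes at instruction offset 3 would need an absolute 32-bit relocation.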

True positives: The indirect JMP at 00401030 and all of the addresses in the jump table starting at 004010a4 are correctly identified as needing a relocation!

Comment on intra-function unlinking: In my opinion, the relative jumps at 00401008 and 00401027 should not need relocations, since they are relative jumps landing within the same function. Branch-target labels inside a function will not move relative to the function start or to each other, so it seems odd to me to treat them as if they could. However, it is not wrong to treat them as relocatable, as long as the operand width is considered when generating the relocations. Also, without reassembly, if the symbol ends up defined at an address too far away for the displacement to fit in the operand, the relocation cannot be applied correctly. Of course, relative jumps between functions still need to be relocated, and I didn't find any problems with those. I don't think you need to change any behavior here; I just wanted to comment on it because it stood out to me.
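The operand-width concern can be sketched as follows. This assumes the x86 convention that a relative displacement is measured from the end of the instruction; the instruction length and the far target are hypothetical values for illustration.

```python
def rel_displacement(insn_addr: int, insn_len: int, target: int) -> int:
    """Displacement of a relative branch, measured from the next instruction."""
    return target - (insn_addr + insn_len)

def fits_signed(disp: int, width_bytes: int) -> bool:
    """True if disp is representable as a signed integer of the given width."""
    bits = 8 * width_bytes
    return -(1 << (bits - 1)) <= disp < (1 << (bits - 1))

# Assumed: a 2-byte short jump at 00401008 targeting 00401027.
near = rel_displacement(0x00401008, 2, 0x00401027)
# Hypothetical: the same jump if its target symbol were placed at 00402000.
far = rel_displacement(0x00401008, 2, 0x00402000)

near_ok = fits_signed(near, 1)      # fits in a rel8 operand
far_ok = fits_signed(far, 1)        # does not fit in rel8
far_ok_rel32 = fits_signed(far, 4)  # would fit only after widening to rel32
```

A relocation against a one-byte operand is only safe while the target stays within that signed 8-bit range, which is why the width has to be part of the relocation that gets generated.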

More granular exports: In some use cases you may not want the entire binary to be exported into a single object file. Exporting individual functions or data elements can help eliminate dead code when relinking, and there will be a lot of dead code if you only want a small portion of the program. Also, I have some rough compiler-specific heuristics for identifying original object file boundaries, which can help identify which functions came from the same source file. So it would be nice to export based on the boundaries I have identified.

A note about switch statement jump and value tables: I consider these tables to belong to the function that contains the switch statement referring to them. If you allow unlinking individual functions or groups of functions into individual object files, then these tables should appear in the same object file as the function, rather than being treated as standalone data elements. I say this because neither Ghidra nor IDA considers these tables part of the function's range. So if you were to use only the function's range to determine which bytes get put in the object file, you would miss the tables.
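As a rough sketch of what I mean, the bytes exported for a function could be computed as the union of its body and the tables its switch statements reference. The table start addresses come from the test case; the function bounds and the end of the value table are hypothetical, and all ranges are half-open `(start, end)` pairs.

```python
def export_ranges(body, tables):
    """Merge a function body with its switch tables into sorted, disjoint ranges."""
    ranges = sorted([body] + list(tables))
    merged = [ranges[0]]
    for start, end in ranges[1:]:
        last_start, last_end = merged[-1]
        if start <= last_end:
            # Adjacent or overlapping: extend the previous range.
            merged[-1] = (last_start, max(last_end, end))
        else:
            merged.append((start, end))
    return merged

body = (0x00401000, 0x004010A4)         # hypothetical function bounds
jump_table = (0x004010A4, 0x004010D8)   # starts at 004010a4
value_table = (0x004010D8, 0x004010E8)  # starts at 004010d8, end assumed

ranges = export_ranges(body, [jump_table, value_table])
```

With these assumed bounds the three pieces merge into one contiguous range. If the tables were placed elsewhere, the result would be several ranges that still belong in the same object file.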

Sorry for writing so much, unlinking is just really fun. I'd love to hear your thoughts on everything, there aren't many people to talk with about this kind of thing.
