Problem
When mwcc compiles C code that references BSS/data symbols defined in the same translation unit, it generates section-relative relocations (e.g., ...bss.0+0x1234) instead of named symbol relocations (e.g., hsd_804CF810). The target object file (produced by DTK splitting the original binary) uses named symbol relocations for the same addresses.
objdiff counts these as relocation mismatches, penalizing the match percentage even though the instruction bytes are completely identical.
Concrete Example
In the SSBM decomp (particle.c), all 57 non-100% functions have byte-identical instructions but are capped below 100% due to relocation metadata differences.
Example: hsd_80392E80 (92.85% match, 0% instruction difference)
Target relocations:
+000e R_PPC_ADDR16_HA ...data.0
+001a R_PPC_ADDR16_LO ...data.0
+0022 R_PPC_ADDR16_HA ...bss.0
+0026 R_PPC_ADDR16_LO ...bss.0
+004e R_PPC_EMB_SDA21 hsd_804D7898
Compiled relocations (same instructions):
+000e R_PPC_ADDR16_HA lbl_8040A540
+0022 R_PPC_ADDR16_LO lbl_8040A540
+001a R_PPC_ADDR16_HA hsd_804CE728
+002a R_PPC_ADDR16_LO hsd_804CE728
+004c R_PPC_EMB_SDA21 hsd_804D7898
The target uses ...data.0 and ...bss.0 (section-relative), while the compiled output uses named symbols (lbl_8040A540, hsd_804CE728). Both resolve to the same addresses at link time.
Similarly, anonymous local symbols (@551, @667, etc.) in the target don't match named symbols in the compiled output, even when they reference the same data.
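To make the "same address" claim concrete, here is a minimal Python sketch (not objdiff code). The Reloc type and the resolve and ha_lo helpers are hypothetical. The section start 0x8040A000 and the 0x540 addend are made-up values for illustration, and lbl_8040A540 is assumed to sit at 0x8040A540, as its name suggests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reloc:
    offset: int      # offset of the relocated field within the function
    kind: str        # relocation type, e.g. "R_PPC_ADDR16_HA"
    symbol: str      # symbol the relocation is expressed against
    addend: int = 0  # constant added to the symbol's address


def resolve(reloc, symbol_addresses):
    """Absolute address the relocation points at: symbol address + addend."""
    return symbol_addresses[reloc.symbol] + reloc.addend


def ha_lo(address):
    """The two 16-bit halves that ADDR16_HA / ADDR16_LO patch into the code."""
    ha = ((address + 0x8000) >> 16) & 0xFFFF  # high half, adjusted for signed low half
    lo = address & 0xFFFF
    return ha, lo


# Made-up addresses, for illustration only.
symbols = {
    "...data.0": 0x8040A000,     # hypothetical start of this unit's .data
    "lbl_8040A540": 0x8040A540,  # assumed from the symbol name
}

section_relative = Reloc(0x0E, "R_PPC_ADDR16_HA", "...data.0", addend=0x540)
named = Reloc(0x0E, "R_PPC_ADDR16_HA", "lbl_8040A540")

addr_a = resolve(section_relative, symbols)
addr_b = resolve(named, symbols)
same_target = addr_a == addr_b and ha_lo(addr_a) == ha_lo(addr_b)
```

Under these assumptions both forms resolve to 0x8040A540, so the linker would patch identical immediates into the instruction even though the symbol names differ.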
Impact
In particle.c alone:
- 57 functions are capped below 100% despite perfect instruction matches
- .text section shows 91.45% instead of the true ~100%
- .data section shows 1.46% due to massive section size mismatch (target: 15568 bytes vs compiled: 4328 bytes)
Possible Solutions
- Relocation equivalence: When comparing relocations, resolve section-relative references (...data.0+offset) to their corresponding named symbols and compare by resolved address rather than symbol name.
- Fuzzy relocation matching: Score relocations as matching when they have the same type and resolve to the same final address, regardless of symbol naming.
- Instruction-only scoring mode: Provide an option to score only on instruction bytes, ignoring relocation metadata entirely (for cases where the linker produces correct output regardless of relocation format).
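A rough Python sketch of how these options could behave, to illustrate the request. This is not objdiff's actual scoring formula or API: the mode names ("strict", "resolved", "instructions-only"), the Reloc and Insn types, and the per-instruction percentage are all assumptions, and the addresses are made up.

```python
from typing import NamedTuple, Optional


class Reloc(NamedTuple):
    kind: str        # relocation type, e.g. "R_PPC_ADDR16_HA"
    symbol: str
    addend: int = 0


class Insn(NamedTuple):
    raw: bytes                     # encoded instruction, relocated field zeroed
    reloc: Optional[Reloc] = None


def resolved(reloc, addresses):
    """Final address of a relocation: symbol address + addend."""
    return addresses[reloc.symbol] + reloc.addend


def insn_matches(left, right, addresses, mode):
    if left.raw != right.raw:
        return False
    if mode == "instructions-only":
        return True                # ignore relocation metadata entirely
    if left.reloc is None or right.reloc is None:
        return left.reloc is None and right.reloc is None
    if mode == "strict":
        return left.reloc == right.reloc  # names and addends must be identical
    # mode == "resolved": same relocation type and same final address
    return (left.reloc.kind == right.reloc.kind
            and resolved(left.reloc, addresses) == resolved(right.reloc, addresses))


def match_percent(left, right, addresses, mode):
    if len(left) != len(right):
        return 0.0                 # simplification for this sketch
    if not left:
        return 100.0
    hits = sum(insn_matches(l, r, addresses, mode) for l, r in zip(left, right))
    return 100.0 * hits / len(left)


# Made-up addresses; "...data.0" start and the 0x540 addend are hypothetical.
addresses = {"...data.0": 0x8040A000, "lbl_8040A540": 0x8040A540}

LIS, ADDI, BLR = bytes.fromhex("3c600000"), bytes.fromhex("38630000"), bytes.fromhex("4e800020")

side_a = [
    Insn(LIS, Reloc("R_PPC_ADDR16_HA", "...data.0", 0x540)),
    Insn(ADDI, Reloc("R_PPC_ADDR16_LO", "...data.0", 0x540)),
    Insn(BLR),
]
side_b = [
    Insn(LIS, Reloc("R_PPC_ADDR16_HA", "lbl_8040A540")),
    Insn(ADDI, Reloc("R_PPC_ADDR16_LO", "lbl_8040A540")),
    Insn(BLR),
]

strict_score = match_percent(side_a, side_b, addresses, "strict")
resolved_score = match_percent(side_a, side_b, addresses, "resolved")
insn_only_score = match_percent(side_a, side_b, addresses, "instructions-only")
```

In this toy case the strict comparison penalizes the two relocated instructions, while the resolved-address and instructions-only modes both report a full match. The resolved mode is the stricter of the two, since it would still flag a relocation that points at the wrong address.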
Environment
- objdiff-cli v3.0.0
- decomp-toolkit v1.6.2
- mwcc (Metrowerks CodeWarrior for GC/Wii)
- Project: SSBM Decomp