Section-relative relocations (...bss.0, ...data.0) penalize match scoring despite identical instructions #333

@malvarezcastillo

Problem

When mwcc compiles C code that references BSS/data symbols defined in the same translation unit, it emits section-relative relocations (e.g., ...bss.0+0x1234) instead of named-symbol relocations (e.g., hsd_804CF810). The target object file, produced by decomp-toolkit (DTK) when it splits the original binary, uses named-symbol relocations for the same addresses.

objdiff counts these as relocation mismatches, penalizing the match percentage even though the instruction bytes are completely identical.

Concrete Example

In the SSBM decomp (particle.c), all 57 non-100% functions have byte-identical instructions but are capped below 100% due to relocation metadata differences.

Example: hsd_80392E80 (92.85% match, 0% instruction difference)

Compiled relocations:

+000e R_PPC_ADDR16_HA   ...data.0
+001a R_PPC_ADDR16_LO   ...data.0
+0022 R_PPC_ADDR16_HA   ...bss.0
+0026 R_PPC_ADDR16_LO   ...bss.0
+004e R_PPC_EMB_SDA21   hsd_804D7898

Target relocations (same instructions):

+000e R_PPC_ADDR16_HA   lbl_8040A540
+0022 R_PPC_ADDR16_LO   lbl_8040A540
+001a R_PPC_ADDR16_HA   hsd_804CE728
+002a R_PPC_ADDR16_LO   hsd_804CE728
+004c R_PPC_EMB_SDA21   hsd_804D7898

The compiled output uses ...data.0 and ...bss.0 (section-relative), while the target uses named symbols (lbl_8040A540, hsd_804CE728). Both resolve to the same addresses at link time.

Similarly, anonymous local symbols (@551, @667, etc.) in the target don't match named symbols in the compiled output, even when they reference the same data.
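
To illustrate how a section-relative reference and a named-symbol reference can describe the same address, here is a minimal sketch of the standard PowerPC high/low split that R_PPC_ADDR16_HA and R_PPC_ADDR16_LO encode. The section base address and the addend are made-up values chosen for the example, and reading 0x8040A540 out of the name lbl_8040A540 is an assumption about the naming convention. None of these numbers are taken from particle.c.

```python
def ha16(addr: int) -> int:
    """High half for R_PPC_ADDR16_HA, adjusted so that
    (ha << 16) + sign_extend(lo) reproduces the address."""
    return ((addr + 0x8000) >> 16) & 0xFFFF


def lo16(addr: int) -> int:
    """Low half for R_PPC_ADDR16_LO."""
    return addr & 0xFFFF


def rebuild(ha: int, lo: int) -> int:
    """Recombine the two halves the way a lis/addi pair would."""
    signed_lo = lo - 0x10000 if lo & 0x8000 else lo
    return ((ha << 16) + signed_lo) & 0xFFFFFFFF


# Hypothetical layout, for illustration only.
DATA_BASE = 0x80400000    # assumed link address of ...data.0
ADDEND = 0xA540           # assumed addend on the section-relative relocation
NAMED_ADDR = 0x8040A540   # assumed address of lbl_8040A540 (read from its name)

section_relative_addr = DATA_BASE + ADDEND

# Both forms yield the same address, so the linker patches the same halves.
print(hex(section_relative_addr), hex(NAMED_ADDR))
print(hex(ha16(NAMED_ADDR)), hex(lo16(NAMED_ADDR)))
```

Under these assumptions the symbol name and addend differ between the two objects, but the value the linker writes into each instruction does not.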

Impact

In particle.c alone:

  • 57 functions are capped below 100% despite perfect instruction matches
  • .text section shows 91.45% instead of the true ~100%
  • .data section shows 1.46% due to massive section size mismatch (target: 15568 bytes vs compiled: 4328 bytes)

Possible Solutions

  1. Relocation equivalence: When comparing relocations, resolve section-relative references (...data.0+offset) to their corresponding named symbols and compare by resolved address rather than symbol name.

  2. Fuzzy relocation matching: Score relocations as matching when they have the same type and resolve to the same final address, regardless of symbol naming.

  3. Instruction-only scoring mode: Provide an option to score only on instruction bytes, ignoring relocation metadata entirely (for cases where the linker produces correct output regardless of relocation format).
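
Options 1 and 2 could be sketched roughly as follows. This is not objdiff's actual data model or API: the Reloc class, the symbol-address tables, and both comparison functions are hypothetical names invented for the example, and the addresses and addends are illustrative values rather than data from particle.c.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Reloc:
    """Hypothetical relocation record (not objdiff's real type)."""
    offset: int      # offset of the relocated field within the function
    kind: str        # relocation type, e.g. "R_PPC_ADDR16_HA"
    symbol: str      # symbol the relocation refers to
    addend: int = 0  # offset added to the symbol's address


def equal_by_name(left: List[Reloc], right: List[Reloc]) -> bool:
    """Strict comparison: symbol names and addends must match exactly."""
    def key(r: Reloc):
        return (r.offset, r.kind, r.symbol, r.addend)
    return sorted(map(key, left)) == sorted(map(key, right))


def equivalent_by_address(left: List[Reloc], left_addrs: Dict[str, int],
                          right: List[Reloc], right_addrs: Dict[str, int]) -> bool:
    """Relaxed comparison: same offset, same type, same resolved address."""
    def keys(relocs: List[Reloc], addrs: Dict[str, int]):
        return sorted((r.offset, r.kind, addrs[r.symbol] + r.addend) for r in relocs)
    return keys(left, left_addrs) == keys(right, right_addrs)


# Illustrative data only: one object refers to the section start plus an
# addend, the other refers to a named symbol at the same assumed address.
section_relative_side = [
    Reloc(0x0E, "R_PPC_ADDR16_HA", "...data.0", 0xA540),
    Reloc(0x1A, "R_PPC_ADDR16_LO", "...data.0", 0xA540),
]
named_side = [
    Reloc(0x0E, "R_PPC_ADDR16_HA", "lbl_8040A540"),
    Reloc(0x1A, "R_PPC_ADDR16_LO", "lbl_8040A540"),
]
section_addrs = {"...data.0": 0x80400000}   # assumed section base
named_addrs = {"lbl_8040A540": 0x8040A540}  # assumed symbol address

print(equal_by_name(section_relative_side, named_side))
print(equivalent_by_address(section_relative_side, section_addrs,
                            named_side, named_addrs))
```

The address-based comparison still reports a mismatch when an addend or relocation type genuinely differs, so it is stricter than option 3, which would ignore relocations altogether. It does require that both objects' symbols can be resolved to addresses, which this sketch simply assumes.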

Environment

  • objdiff-cli v3.0.0
  • decomp-toolkit v1.6.2
  • mwcc (Metrowerks CodeWarrior for GC/Wii)
  • Project: SSBM Decomp
