[x86] Linear sweep creates functions starting within ASCII literals and zero padding #8142

@bdash

Version and Platform (required):

  • Binary Ninja Version: 5.4.9560-dev Ultimate, 96a1058c
  • OS: macos
  • OS Version: 26.3.1
  • CPU Architecture: arm64

Bug Description:
When opening an x86 firmware ROM, linear sweep creates functions that start within ASCII string literals. In this ROM, functions are often separated by ASCII literals, followed by some padding (either 00 bytes or one of several nop patterns), followed by the actual function start. Linear sweep consistently places the function start too early, inside the literal or the padding. It also creates functions inside runs of ASCII string literals where no function exists.

Steps To Reproduce:

  1. Load pin2000_50069_0140_game.rom from blessed jewel excels lightly with a base address of 0x100000. It is 32-bit x86 code.
  2. Look at the following addresses:
    • 0x00102481: should be DeffAttrInsrtCnRun\x00 followed by push ebp, but the ASCII is interpreted as instructions.
    • 0x001abf8b: should be midway through pdb\x00 followed by several other string literals, but b\x00 and the subsequent string literals are interpreted as instructions.
    • 0x002c59a2: should be \x00\x00 padding between functions, but is interpreted as a function starting with add byte [eax], al.
    • 0x002ed5bc: should be a run of \x00\x00\x00\x00 bytes, but is interpreted as a function starting with multiple add byte [eax], al instructions.

I noticed that most of the misdetected functions end up with the regparm calling convention applied to them, so the following Python snippet is a good way to find many other instances of this problem: [f for f in bv.functions if f.calling_convention.name == 'regparm']. There are ~1,200 occurrences in this binary.
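
A slightly more defensive version of that query, as a minimal sketch. It assumes only that `bv.functions` yields objects with a `calling_convention` attribute that may be `None` for some functions; the helper names are made up for illustration. The histogram helps confirm that `regparm` really is the outlier before using it as a filter.

```python
from collections import Counter


def regparm_functions(bv):
    """Return functions whose calling convention is named 'regparm'.

    Assumption: calling_convention can be None, so guard before reading .name.
    """
    return [
        f for f in bv.functions
        if f.calling_convention is not None
        and f.calling_convention.name == "regparm"
    ]


def calling_convention_histogram(bv):
    """Count functions per calling-convention name ('<none>' when unset)."""
    counts = Counter()
    for f in bv.functions:
        cc = f.calling_convention
        counts[cc.name if cc is not None else "<none>"] += 1
    return counts
```

In the Binary Ninja Python console, `len(regparm_functions(bv))` gives the count and `calling_convention_histogram(bv).most_common()` shows how it compares to the other conventions.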

Additional Information:
To work around this, I wrote some Python scripts that look for instruction patterns at the start of functions that resemble ASCII or nops.
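
A rough sketch of that kind of heuristic, not the actual scripts. It works on raw bytes rather than decoded instructions. The 8-byte window, the set of "text" bytes, and the single-byte 0x90 nop check are all assumptions chosen for illustration, and the ROM's other nop patterns are not covered. `classify_prefix` and `find_suspects` are hypothetical names; the only Binary Ninja calls used are `bv.functions`, `f.start`, and `bv.read(addr, length)`.

```python
# Assumption: bytes that plausibly belong to C string literals
# (printable ASCII, NUL terminators, and common whitespace).
TEXT_BYTES = frozenset(bytes(range(0x20, 0x7F)) + b"\x00\t\n\r")


def classify_prefix(data, window=8):
    """Guess why the first bytes of a 'function' are probably not code.

    Returns 'zero-padding', 'nop-padding', 'ascii', or None.
    """
    head = data[:window]
    if len(head) < 2:
        return None
    # 00 00 decodes as `add byte [eax], al`, the pattern seen in the padding.
    if head[:2] == b"\x00\x00":
        return "zero-padding"
    # Assumption: only the single-byte nop (0x90) is checked here.
    if head[0] == 0x90:
        return "nop-padding"
    # A full window of text-like bytes is unlikely to be a real prologue.
    if len(head) == window and all(b in TEXT_BYTES for b in head):
        return "ascii"
    return None


def find_suspects(bv, window=8):
    """Return (start_address, reason) for functions with suspicious prefixes."""
    suspects = []
    for f in bv.functions:
        reason = classify_prefix(bv.read(f.start, window), window)
        if reason is not None:
            suspects.append((f.start, reason))
    return suspects
```

This will produce false positives (for example, a real function that begins with a nop) and false negatives, so the results are best reviewed before removing any functions.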

    Labels

      • Arch: x86 (Issues with the x86/x64 architecture plugin)
      • Component: Core (Issue needs changes to the core)
      • Impact: Medium (Issue is impactful with a bad, or no, workaround)
