Pcode emulation fails if address space names are not lower case #4233

Open
Whatang opened this issue Oct 18, 2023 · 0 comments · May be fixed by #4234
Labels
bug Something is broken

Comments

Whatang commented Oct 18, 2023

Description

When emulating with the Pcode engine, multiple functions in angr.engines.pcode.emulate check address space names against fixed strings "ram", "mem", "register", "unique", etc. These space names are set in the .sinc/.slaspec files for the relevant architecture in pypcode.

For an architecture like ARM, this is fine since the same strings are used to name the address spaces: for example in https://github.com/angr/pypcode/blob/master/pypcode/processors/ARM/data/languages/ARM.sinc we have

define space ram type=ram_space size=4 default;
define space register type=register_space size=4;

However, some architectures use different names for their address spaces. For example, for the MSP430 in https://github.com/angr/pypcode/blob/master/pypcode/processors/TI_MSP430/data/languages/TI430Common.sinc we have

define space RAM type=ram_space size=$(REG_SIZE) default;
define space register type=register_space size=2;

The case difference ("RAM" rather than "ram") causes problems in angr.engines.pcode.emulate, in _set_value, _get_value, _execute_load, and _execute_store: each of these checks the name of the target address space against the fixed strings, and since "RAM" does not match "ram", an error is incorrectly raised. This can be fixed by lowercasing the space names before doing these checks.
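
To illustrate the failure and the proposed fix, here is a minimal sketch. It does not reproduce angr's code: the lookup table and both function names are hypothetical stand-ins for the fixed-string checks described above.

```python
# Hypothetical stand-in for the name checks described above; the table and
# function names are illustrative and are not angr's actual code.
SPACE_KINDS = {
    "ram": "memory",
    "mem": "memory",
    "register": "register",
    "unique": "temporary",
}


def space_kind_case_sensitive(space_name: str) -> str:
    """Exact-match lookup: fails for names such as "RAM"."""
    try:
        return SPACE_KINDS[space_name]
    except KeyError:
        raise ValueError(f"Unsupported address space: {space_name!r}") from None


def space_kind(space_name: str) -> str:
    """Lower-case the name first, so "RAM" and "ram" are treated alike."""
    return space_kind_case_sensitive(space_name.lower())
```

With this sketch, space_kind("RAM") resolves to the same kind as space_kind("ram"), while space_kind_case_sensitive("RAM") raises ValueError, which mirrors the behaviour seen with the MSP430 definition.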

More generally, is this checking the wrong thing? Should the check be made against the address space type rather than its name? As far as I can tell, pypcode does not currently expose the address space type to Python, though.
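
As a rough sketch of what a type-based check could look like: since the type is not available from pypcode, this example recovers it by parsing the "define space" lines of a .sinc file directly. The regular expression and helper names are assumptions made for illustration, and the parser ignores SLEIGH preprocessor directives and includes, so it is a diagnostic aid rather than a proposed implementation.

```python
import re

# Matches lines such as:
#   define space RAM type=ram_space size=$(REG_SIZE) default;
# Only the name and type are captured; size and other attributes are ignored.
DEFINE_SPACE = re.compile(
    r"^\s*define\s+space\s+(?P<name>\w+)\s+type=(?P<type>\w+)",
    re.MULTILINE,
)


def space_types(sinc_text: str) -> dict[str, str]:
    """Map each declared address space name to its declared type."""
    return {m["name"]: m["type"] for m in DEFINE_SPACE.finditer(sinc_text)}


def is_ram_space(space_name: str, types: dict[str, str]) -> bool:
    """Decide by declared type, so the spelling of the name does not matter."""
    return types.get(space_name) == "ram_space"


# The two MSP430 declarations quoted in this issue.
MSP430_SINC = """\
define space RAM type=ram_space size=$(REG_SIZE) default;
define space register type=register_space size=2;
"""

msp430_types = space_types(MSP430_SINC)
```

Here is_ram_space("RAM", msp430_types) is true even though the name is upper case, because the decision is based on type=ram_space rather than on the name.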

Steps to reproduce the bug

Here's a script I've been using to load Microcorruption MSP430 images and try out angr's pcode emulation. (I know there's a better way of doing this with angr-platforms, but I'm using this to teach myself how to use the pcode engine for architectures that angr doesn't support but for which Ghidra has a SLEIGH definition.)

import angr
import pypcode
import claripy
import archinfo
import cle
import io

# Define a backend which can read the image

class MCBlobBackend(cle.Backend):
    """CLE backend for Microcorruption MSP430 memory images."""

    def __init__(self, binary, binary_stream, *args, **kwargs):
        if "arch" not in kwargs:
            kwargs["arch"] = "msp430"
        super().__init__(binary, binary_stream, *args, entry_point=0x4400, base_address=0, **kwargs)
        self._min_addr = 0
        self._max_addr = 0xFFFF

        # Read the first 0x10000 bytes of the image.
        binary_stream.seek(0)
        all_data: bytes = binary_stream.read(0x10000)

        # Walk down from 0xffe0 in 16-byte rows until the row just below
        # flash_rom_start is all zeros.
        flash_rom_start = 0xffe0
        while all_data[flash_rom_start - 16:flash_rom_start] != b"\x00" * 16:
            flash_rom_start -= 16

        # Take the data from 0x200 up to flash_rom_start, drop trailing zeros,
        # then pad with zeros so the region ends on a 0x1000 boundary.
        ram = all_data[0x200:flash_rom_start].rstrip(b"\x00")
        if (0x200 + len(ram)) % 0x1000 > 0:
            ram += b"\x00" * (0x1000 - ((0x200 + len(ram)) % 0x1000))

        # Back each region with its bytes and record a matching segment.
        self.memory.add_backer(0, all_data[:0x200])
        self.segments.append(cle.backends.Segment(0, 0, 0x200, 0x200))
        self.memory.add_backer(0x200, ram)
        self.segments.append(cle.backends.Segment(0x200, 0x200, len(ram), len(ram)))
        self.memory.add_backer(flash_rom_start, all_data[flash_rom_start:0xffe0])
        self.segments.append(cle.backends.Segment(flash_rom_start, flash_rom_start, 0xffe0 - flash_rom_start, 0xffe0 - flash_rom_start))
        self.memory.add_backer(0xffe0, all_data[0xffe0:])
        self.segments.append(cle.backends.Segment(0xffe0, 0xffe0, 0x20, 0x20))
        binary_stream.seek(0)

    @staticmethod
    def is_compatible(stream: io.BufferedReader):
        stream.seek(0, io.SEEK_END)
        stream_size = stream.tell()
        if stream_size < 0x10000:
            return False
        stream.seek(16)
        interrupt_bytes = stream.read(2)
        return interrupt_bytes == b"\x30\x41"

cle.register_backend("microcorruption_bin", MCBlobBackend)

# Find the right MSP430 pcode language in pypcode

msp430_lang = None
for arch in pypcode.Arch.enumerate():
    for lang in arch.languages:
        if lang.id == "TI_MSP430:LE:16:default":
            msp430_lang = lang
            break
    if msp430_lang is not None:
        break

# Create an archinfo.Arch object for that pcode language
pcode_arch = archinfo.ArchPcode(msp430_lang)

# Create a project
p = angr.Project(r"memory.bin", arch=pcode_arch, main_opts={"backend":"microcorruption_bin"}, load_options = {"rebase_granularity": 0x1000})

# Create a sim manager and try to step the entry state
sm = p.factory.simgr(p.factory.entry_state())
sm.step()

memory.zip - ZIP file containing memory.bin used by the above script
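
As an aside, the nested loop in the script that locates the MSP430 language can be condensed into a single lookup. This is a minimal sketch: find_language is a hypothetical helper, not part of pypcode, and it relies only on the .languages and .id attributes that the script already uses. The stand-in objects at the end exist purely so the example runs without pypcode installed.

```python
from types import SimpleNamespace


def find_language(arches, lang_id):
    """Return the first language whose id equals lang_id, or None."""
    return next(
        (lang for arch in arches for lang in arch.languages if lang.id == lang_id),
        None,
    )


# Stand-in data shaped like the objects the script iterates over.
demo_arches = [
    SimpleNamespace(languages=[SimpleNamespace(id="ARM:LE:32:v8")]),
    SimpleNamespace(languages=[SimpleNamespace(id="TI_MSP430:LE:16:default")]),
]
demo_lang = find_language(demo_arches, "TI_MSP430:LE:16:default")
```

In the script this would be called as find_language(pypcode.Arch.enumerate(), "TI_MSP430:LE:16:default"), with the result passed to archinfo.ArchPcode as before.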

Environment

(microcorruption_310) C:\Users\mike\Documents\projects\angr\microcorruption>python -m angr.misc.bug_report
C:\Users\mike\venvs\microcorruption_310\lib\site-packages\angr\misc\bug_report.py:1: DeprecationWarning: the imp module is deprecated in favour of importlib and slated for removal in Python 3.12; see the module's documentation for alternative uses
import imp
angr environment report

Date: 2023-10-18 15:25:19.380351
Running in virtual environment at C:\Users\mike\venvs\microcorruption_310
C:\Users\mike\venvs\microcorruption_310\lib\site-packages\angr\misc\bug_report.py:88: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources # pylint:disable=import-outside-toplevel
Platform: win-amd64
Python version: 3.10.4 (tags/v3.10.4:9d38120, Mar 23 2022, 23:13:41) [MSC v.1929 64 bit (AMD64)]
######## angr #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\angr
Pip version angr 9.2.73
Couldn't find git info
######## ailment #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\ailment
Pip version ailment 9.2.73
Couldn't find git info
######## cle #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\cle
Pip version cle 9.2.73
Couldn't find git info
######## pyvex #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\pyvex
Pip version pyvex 9.2.73
Couldn't find git info
######## claripy #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\claripy
Pip version claripy 9.2.73
Couldn't find git info
######## archinfo #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\archinfo
Pip version archinfo 9.2.73
Couldn't find git info
######## z3 #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\z3
Pip version z3-solver 4.10.2.0
Couldn't find git info
######## unicorn #########
Python found it in C:\Users\mike\venvs\microcorruption_310\lib\site-packages\unicorn
Pip version unicorn 2.0.1.post1
Couldn't find git info
######### Native Module Info ##########
angr: <CDLL 'C:\Users\mike\venvs\microcorruption_310\lib\site-packages\angr\lib\angr_native.dll', handle 7ff878150000 at 0x20d7dbe7130>
unicorn: <CDLL 'C:\Users\mike\venvs\microcorruption_310\lib\site-packages\unicorn\lib\unicorn.dll', handle 7ff87cad0000 at 0x20d7b9ca8c0>
pyvex: <cffi.api._make_ffi_library.<locals>.FFILibrary object at 0x0000020D7B727880>
z3: <CDLL 'C:\Users\mike\venvs\microcorruption_310\Lib\site-packages\z3\lib\libz3.dll', handle 7ff87dbe0000 at 0x20d7a9d1720>

Additional context

No response

Whatang added the bug and needs-triage labels Oct 18, 2023
Whatang linked a pull request Oct 18, 2023 that will close this issue
ltfish removed the needs-triage label Oct 27, 2023