Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Python binding for disasm_iter #2136

Merged
merged 4 commits into from
Aug 9, 2023

Conversation

fzakaria
Copy link
Contributor

@fzakaria fzakaria commented Aug 6, 2023

Add support for disasm_iter in the Cs class.

Note: I am clear on the performance of this implementation. While the C-implementation
makes claims how it's faster due to less memory allocations, my unfamiliarity with Python
does not give me the same strong assurance.

This provides merely the implementation and hopefully others can do benchmarking.

Here you can see it producing the same values between iter and lite:

❯ python test_lite.py | tail
0x100b:	pshs	cc, b, x, u
0x100d:	lda	, x++
0x100f:	sta	32767, x
0x1013:	lda	[$2017, pcr]
0x1017:	sta	[, x++]
0x1019:	lda	[$1000]
0x101d:	cmps	[4096, x]
0x1022:	rts	
0x1023:

❯ python test_iter.py | tail
0x100b:	pshs	cc, b, x, u
0x100d:	lda	, x++
0x100f:	sta	32767, x
0x1013:	lda	[$2017, pcr]
0x1017:	sta	[, x++]
0x1019:	lda	[$1000]
0x101d:	cmps	[4096, x]
0x1022:	rts	
0x1023:

fixes #2131

@fzakaria
Copy link
Contributor Author

fzakaria commented Aug 6, 2023

CC @kabeor

@@ -72,6 +72,7 @@ TESTS = test_basic.py test_detail.py test_arm.py test_arm64.py test_m68k.py test
TESTS += test_ppc.py test_sparc.py test_systemz.py test_x86.py test_xcore.py test_tms320c64x.py
TESTS += test_m680x.py test_skipdata.py test_mos65xx.py test_bpf.py test_riscv.py
TESTS += test_evm.py
TESTS += test_lite.py test_iter.py
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually noticed that test_lite.py was missing also.

@@ -511,6 +511,8 @@ def _setup_prototype(lib, fname, restype, *argtypes):
_setup_prototype(_cs, "cs_open", ctypes.c_int, ctypes.c_uint, ctypes.c_uint, ctypes.POINTER(ctypes.c_size_t))
_setup_prototype(_cs, "cs_disasm", ctypes.c_size_t, ctypes.c_size_t, ctypes.POINTER(ctypes.c_char), ctypes.c_size_t, \
ctypes.c_uint64, ctypes.c_size_t, ctypes.POINTER(ctypes.POINTER(_cs_insn)))
_setup_prototype(_cs, "cs_disasm_iter", ctypes.c_bool, ctypes.c_size_t, ctypes.POINTER(ctypes.POINTER(ctypes.c_char)), ctypes.POINTER(ctypes.c_size_t), \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to copy ctypes.c_char for the argtype like cs_disasm but the C API uses uint8_t which should map to c_ubyte.

Comment on lines +1240 to +1243
if view.readonly:
code = (ctypes.c_char * len(view)).from_buffer_copy(view)
else:
code = ctypes.pointer(ctypes.c_char.from_buffer(view))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly different here than the disasm_lite implementation.
I think it's the same effect of "copy if I can't write"

Comment on lines +8 to +29

X86_CODE16 = b"\x8d\x4c\x32\x08\x01\xd8\x81\xc6\x34\x12\x00\x00"
X86_CODE32 = b"\xba\xcd\xab\x00\x00\x8d\x4c\x32\x08\x01\xd8\x81\xc6\x34\x12\x00\x00"
X86_CODE64 = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"
ARM_CODE = b"\xED\xFF\xFF\xEB\x04\xe0\x2d\xe5\x00\x00\x00\x00\xe0\x83\x22\xe5\xf1\x02\x03\x0e\x00\x00\xa0\xe3\x02\x30\xc1\xe7\x00\x00\x53\xe3"
ARM_CODE2 = b"\x10\xf1\x10\xe7\x11\xf2\x31\xe7\xdc\xa1\x2e\xf3\xe8\x4e\x62\xf3"
THUMB_CODE = b"\x70\x47\xeb\x46\x83\xb0\xc9\x68"
THUMB_CODE2 = b"\x4f\xf0\x00\x01\xbd\xe8\x00\x88\xd1\xe8\x00\xf0"
THUMB_MCLASS = b"\xef\xf3\x02\x80"
ARMV8 = b"\xe0\x3b\xb2\xee\x42\x00\x01\xe1\x51\xf0\x7f\xf5"
MIPS_CODE = b"\x0C\x10\x00\x97\x00\x00\x00\x00\x24\x02\x00\x0c\x8f\xa2\x00\x00\x34\x21\x34\x56"
MIPS_CODE2 = b"\x56\x34\x21\x34\xc2\x17\x01\x00"
MIPS_32R6M = b"\x00\x07\x00\x07\x00\x11\x93\x7c\x01\x8c\x8b\x7c\x00\xc7\x48\xd0"
MIPS_32R6 = b"\xec\x80\x00\x19\x7c\x43\x22\xa0"
ARM64_CODE = b"\x21\x7c\x02\x9b\x21\x7c\x00\x53\x00\x40\x21\x4b\xe1\x0b\x40\xb9"
PPC_CODE = b"\x80\x20\x00\x00\x80\x3f\x00\x00\x10\x43\x23\x0e\xd0\x44\x00\x80\x4c\x43\x22\x02\x2d\x03\x00\x80\x7c\x43\x20\x14\x7c\x43\x20\x93\x4f\x20\x00\x21\x4c\xc8\x00\x21"
PPC_CODE2 = b"\x10\x60\x2a\x10\x10\x64\x28\x88\x7c\x4a\x5d\x0f"
SPARC_CODE = b"\x80\xa0\x40\x02\x85\xc2\x60\x08\x85\xe8\x20\x01\x81\xe8\x00\x00\x90\x10\x20\x01\xd5\xf6\x10\x16\x21\x00\x00\x0a\x86\x00\x40\x02\x01\x00\x00\x00\x12\xbf\xff\xff\x10\xbf\xff\xff\xa0\x02\x00\x09\x0d\xbf\xff\xff\xd4\x20\x60\x00\xd4\x4e\x00\x16\x2a\xc2\x80\x03"
SPARCV9_CODE = b"\x81\xa8\x0a\x24\x89\xa0\x10\x20\x89\xa0\x1a\x60\x89\xa0\x00\xe0"
SYSZ_CODE = b"\xed\x00\x00\x00\x00\x1a\x5a\x0f\x1f\xff\xc2\x09\x80\x00\x00\x00\x07\xf7\xeb\x2a\xff\xff\x7f\x57\xe3\x01\xff\xff\x7f\x57\xeb\x00\xf0\x00\x00\x24\xb2\x4f\x00\x78"
XCORE_CODE = b"\xfe\x0f\xfe\x17\x13\x17\xc6\xfe\xec\x17\x97\xf8\xec\x4f\x1f\xfd\xec\x37\x07\xf2\x45\x5b\xf9\xfa\x02\x06\x1b\x10"
M68K_CODE = b"\xd4\x40\x87\x5a\x4e\x71\x02\xb4\xc0\xde\xc0\xde\x5c\x00\x1d\x80\x71\x12\x01\x23\xf2\x3c\x44\x22\x40\x49\x0e\x56\x54\xc5\xf2\x3c\x44\x00\x44\x7a\x00\x00\xf2\x00\x0a\x28\x4E\xB9\x00\x00\x00\x12\x4E\x75"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we captured the expected disassembly here :)

Maybe a follow up issue or PR could be to assert the values are correct and adopt pytest?

@kabeor
Copy link
Member

kabeor commented Aug 9, 2023

LGTM, Thanks! Merged.

@kabeor kabeor merged commit 2c3af48 into capstone-engine:next Aug 9, 2023
11 checks passed
@fzakaria fzakaria changed the title Add support for Python binding for diasm_iter Add support for Python binding for disasm_iter Aug 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support cs_disasm_iter in Python API binding
3 participants