Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new symbolic memory model based on the QF_ABV SMT logic #1185

Closed
JonathanSalwan opened this issue Sep 25, 2022 · 2 comments
Closed

A new symbolic memory model based on the QF_ABV SMT logic #1185

JonathanSalwan opened this issue Sep 25, 2022 · 2 comments

Comments

@JonathanSalwan
Copy link
Owner

JonathanSalwan commented Sep 25, 2022

Last edit: Oct 05 2022.

Description

The choices are a matter of tradeoff and in order to scale the dynamic symbolic execution on million instructions we made the choice (7 years ago) to only rely on the QF_BV logic. Which means that our symbolic expressions only contain bitvector operators and no array. Thus, our constraints sent to the SMT solver are easier to solve. However, in some cases it could be great to reason on symbolic pointers. So here we are and this contribution provides 3 new modes:

  • MEMORY_ARRAY: Enables the symbolic pointers reasoning (QF_ABV logic). When this mode is not enabled, which is the case by default, the QF_BV memory model is applied. So, this contribution will not impact your analysis tools if you do not use this mode.
  • SYMBOLIZE_LOAD: Keeps symbolic expressions on load indexes (concretize them otherwise).
  • SYMBOLIZE_STORE: Keeps symbolic expressions on store indexes (concretize them otherwise).

Important note

This mode looks to work but still experimental. I can already tell you that it will complexify a lot constraints and increase the RAM consumption a lot. I'm still working on this mode and I will try to add optimizations so that we can free memory expressions and thus keep a descent RAM consumption. However, even with those cons, this mode has the merit of existing 🙂

Short example

In this example we store the constant 0xdead to a fixed memory address 0x1032. Then, we symbolize the rsi register which is used as an index by the mov rcx, [rsi] instruction after a xor computation. Then, we constraint rcx to be equal to 0xdead.

from triton import *

ctx = TritonContext(ARCH.X86_64)
ctx.setMode(MODE.MEMORY_ARRAY, True)     # Enable the array logic
ctx.setMode(MODE.SYMBOLIZE_LOAD, True)   # Symbolize load indexing
ctx.setMode(MODE.SYMBOLIZE_STORE, True)  # Symbolize store indexing

ctx.symbolizeRegister(ctx.registers.rsi, 's_rsi')

code = [
    b"\x48\xc7\xc0\x00\x10\x00\x00", # mov rax, 0x1000
    b"\x48\xc7\xc3\x32\x00\x00\x00", # mov rbx, 0x32
    b"\xc7\x04\x18\xad\xde\x00\x00", # mov dword ptr [rax + rbx], 0xdead
    b"\x48\x81\xf6\xef\xbe\x00\x00", # xor rsi, 0xbeef
    b"\x48\x8b\x0e", # mov rcx, [rsi]
    b"\x48\x81\xf9\xad\xde\x00\x00", # cmp rcx, 0xdead
]

for op in code:
    i = Instruction(op)
    ctx.processing(i)
    print(i)
    for se in i.getSymbolicExpressions():
        print(se)
    print()

zf = ctx.getRegisterAst(ctx.registers.zf)
m = ctx.getModel(zf == 1)
print(m)

output

$ python array.py
0x0: mov rax, 0x1000
(define-fun ref!1 () (_ BitVec 64) (_ bv4096 64)) ; MOV operation
(define-fun ref!2 () (_ BitVec 64) (_ bv7 64)) ; Program Counter

0x7: mov rbx, 0x32
(define-fun ref!3 () (_ BitVec 64) (_ bv50 64)) ; MOV operation
(define-fun ref!4 () (_ BitVec 64) (_ bv14 64)) ; Program Counter

0xe: mov dword ptr [rax + rbx], 0xdead
(declare-fun ref!5 () (Array (_ BitVec 64) (_ BitVec 8)))
(define-fun ref!6 () (Array (_ BitVec 64) (_ BitVec 8)) (store ref!5 (bvadd (bvadd (bvadd ref!1 (bvmul ref!3 (_ bv1 64))) (_ bv0 64)) (_ bv3 64)) ((_ extract 31 24) (_ bv57005 32)))) ; Byte reference - MOV operation
(define-fun ref!7 () (Array (_ BitVec 64) (_ BitVec 8)) (store ref!6 (bvadd (bvadd (bvadd ref!1 (bvmul ref!3 (_ bv1 64))) (_ bv0 64)) (_ bv2 64)) ((_ extract 23 16) (_ bv57005 32)))) ; Byte reference - MOV operation
(define-fun ref!8 () (Array (_ BitVec 64) (_ BitVec 8)) (store ref!7 (bvadd (bvadd (bvadd ref!1 (bvmul ref!3 (_ bv1 64))) (_ bv0 64)) (_ bv1 64)) ((_ extract 15 8) (_ bv57005 32)))) ; Byte reference - MOV operation
(define-fun ref!9 () (Array (_ BitVec 64) (_ BitVec 8)) (store ref!8 (bvadd (bvadd (bvadd ref!1 (bvmul ref!3 (_ bv1 64))) (_ bv0 64)) (_ bv0 64)) ((_ extract 7 0) (_ bv57005 32)))) ; Byte reference - MOV operation
(define-fun ref!10 () (_ BitVec 32) (_ bv57005 32)) ; Original memory access - MOV operation
(define-fun ref!11 () (_ BitVec 64) (_ bv21 64)) ; Program Counter

0x15: xor rsi, 0xbeef
(define-fun ref!12 () (_ BitVec 64) (bvxor ref!0 (_ bv48879 64))) ; XOR operation
(define-fun ref!13 () (_ BitVec 1) (_ bv0 1)) ; Clears carry flag
(define-fun ref!14 () (_ BitVec 1) (_ bv0 1)) ; Clears overflow flag
(define-fun ref!15 () (_ BitVec 1) (bvxor (bvxor (bvxor (bvxor (bvxor (bvxor (bvxor (bvxor (_ bv1 1) ((_ extract 0 0) ref!12)) ((_ extract 1 1) ref!12)) ((_ extract 2 2) ref!12)) ((_ extract 3 3) ref!12)) ((_ extract 4 4) ref!12)) ((_ extract 5 5) ref!12)) ((_ extract 6 6) ref!12)) ((_ extract 7 7) ref!12))) ; Parity flag
(define-fun ref!16 () (_ BitVec 1) ((_ extract 63 63) ref!12)) ; Sign flag
(define-fun ref!17 () (_ BitVec 1) (ite (= ref!12 (_ bv0 64)) (_ bv1 1) (_ bv0 1))) ; Zero flag
(define-fun ref!18 () (_ BitVec 64) (_ bv28 64)) ; Program Counter

0x1c: mov rcx, qword ptr [rsi]
(define-fun ref!19 () (_ BitVec 64) (concat (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv7 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv6 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv5 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv4 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv3 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv2 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv1 64))) (select ref!9 (bvadd (bvadd (bvadd ref!12 (bvmul (_ bv0 64) (_ bv1 64))) (_ bv0 64)) (_ bv0 64))))) ; MOV operation
(define-fun ref!20 () (_ BitVec 64) (_ bv31 64)) ; Program Counter

0x1f: cmp rcx, 0xdead
(define-fun ref!21 () (_ BitVec 64) (bvsub ref!19 (_ bv57005 64))) ; CMP operation
(define-fun ref!22 () (_ BitVec 1) (ite (= (_ bv16 64) (bvand (_ bv16 64) (bvxor ref!21 (bvxor ref!19 (_ bv57005 64))))) (_ bv1 1) (_ bv0 1))) ; Adjust flag
(define-fun ref!23 () (_ BitVec 1) ((_ extract 63 63) (bvxor (bvxor ref!19 (bvxor (_ bv57005 64) ref!21)) (bvand (bvxor ref!19 ref!21) (bvxor ref!19 (_ bv57005 64)))))) ; Carry flag
(define-fun ref!24 () (_ BitVec 1) ((_ extract 63 63) (bvand (bvxor ref!19 (_ bv57005 64)) (bvxor ref!19 ref!21)))) ; Overflow flag
(define-fun ref!25 () (_ BitVec 1) (bvxor (bvxor (bvxor (bvxor (bvxor (bvxor (bvxor (bvxor (_ bv1 1) ((_ extract 0 0) ref!21)) ((_ extract 1 1) ref!21)) ((_ extract 2 2) ref!21)) ((_ extract 3 3) ref!21)) ((_ extract 4 4) ref!21)) ((_ extract 5 5) ref!21)) ((_ extract 6 6) ref!21)) ((_ extract 7 7) ref!21))) ; Parity flag
(define-fun ref!26 () (_ BitVec 1) ((_ extract 63 63) ref!21)) ; Sign flag
(define-fun ref!27 () (_ BitVec 1) (ite (= ref!21 (_ bv0 64)) (_ bv1 1) (_ bv0 1))) ; Zero flag
(define-fun ref!28 () (_ BitVec 64) (_ bv38 64)) ; Program Counter

{0: s_rsi:64 = 0xaedd}
@JonathanSalwan JonathanSalwan self-assigned this Sep 25, 2022
JonathanSalwan added a commit that referenced this issue Sep 25, 2022
JonathanSalwan added a commit that referenced this issue Sep 25, 2022
JonathanSalwan added a commit that referenced this issue Sep 26, 2022
JonathanSalwan added a commit that referenced this issue Sep 27, 2022
@JonathanSalwan
Copy link
Owner Author

JonathanSalwan commented Sep 27, 2022

Still in progress but it smells good. We are now able to solve our first crackme using the ABV logic which is, I remind, a new memory model. The only difference between the old solve.py and the new one is that we just enable the MEMORY_ARRAY mode. I've also added other modes like CONSTANT_FOLDING and AST_OPTIMIZATIONS to stress the new memory model and everything looks ok.

--- src/examples/python/ctf-writeups/defcamp-2015-r100/solve.py	2022-09-27 19:04:26.043125028 +0200
+++ src/examples/python/ctf-writeups/defcamp-2015-r100/solve-with-abv-logic.py	2022-09-27 19:03:01.033130502 +0200
@@ -122,8 +88,11 @@
     ctx = TritonContext(ARCH.X86_64)
 
     # Define symbolic optimizations
-    ctx.setMode(MODE.ALIGNED_MEMORY, True)
-    ctx.setMode(MODE.ONLY_ON_SYMBOLIZED, True)
+    ctx.setMode(MODE.CONSTANT_FOLDING, True)
+    ctx.setMode(MODE.AST_OPTIMIZATIONS, True)
+    ctx.setMode(MODE.MEMORY_ARRAY, True)
+    ctx.setMode(MODE.SYMBOLIZE_LOAD, True)
+    ctx.setMode(MODE.SYMBOLIZE_STORE, True)
 
     # Load the binary
     loadBinary(os.path.join(os.path.dirname(__file__), 'r100.bin'))

@JonathanSalwan JonathanSalwan changed the title Symbolic memory array A new symbolic memory model based on the QF_ABV SMT logic Sep 27, 2022
JonathanSalwan added a commit that referenced this issue Sep 29, 2022
JonathanSalwan added a commit that referenced this issue Sep 29, 2022
JonathanSalwan added a commit that referenced this issue Sep 29, 2022
JonathanSalwan added a commit that referenced this issue Sep 29, 2022
JonathanSalwan added a commit that referenced this issue Sep 29, 2022
JonathanSalwan added a commit that referenced this issue Sep 29, 2022
JonathanSalwan added a commit that referenced this issue Oct 3, 2022
JonathanSalwan added a commit that referenced this issue Oct 3, 2022
JonathanSalwan added a commit that referenced this issue Oct 3, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
JonathanSalwan added a commit that referenced this issue Oct 4, 2022
@JonathanSalwan
Copy link
Owner Author

JonathanSalwan commented Oct 4, 2022

Done, it works. We now we have to think about an optimization to deal with the memory state to avoid RAM consumption.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant