Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SMT Array #806

Closed
JonathanSalwan opened this issue Oct 2, 2019 · 6 comments
Closed

SMT Array #806

JonathanSalwan opened this issue Oct 2, 2019 · 6 comments

Comments

@JonathanSalwan
Copy link
Owner

#723

@JonathanSalwan
Copy link
Owner Author

JonathanSalwan commented Oct 13, 2019

Note for myself. I still have to do:

  • Synchro between both memory models.
  • Memory consumption related to SSA on Array.
  • Concrete evaluation of a SELECT node.

Below a poc:

## Result:
## {0: SymVar_0:64 = 0x100}

from triton import *

ctx = TritonContext()
ctx.setArchitecture(ARCH.X86_64)
ctx.enableMode(MODE.SYMBOLIC_LOAD, True)
ctx.enableMode(MODE.SYMBOLIC_STORE, True)
ctx.enableMode(MODE.AST_OPTIMIZATIONS, True)

code = [
    b"\x48\xc7\xc7\x00\x01\x00\x00", # mov rdi, 0x100
    b"\x48\x89\xf1", # mov rcx, rsi
    b"\x48\xc7\xc0\x03\x00\x00\x00", # mov rax, 3
    b"\x48\x89\x07", # mov [rdi], rax
    b"\x48\x8b\x11", # mov rdx,  [rcx]
]

ctx.convertRegisterToSymbolicVariable(ctx.registers.rsi)

for op in code:
    inst = Instruction(op)
    ctx.processing(inst)
    print(inst)
    for se in inst.getSymbolicExpressions():
        print('\t %s' %(str(se)))
    print()

ast = ctx.getAstContext()
rdx = ctx.getSymbolicRegister(ctx.registers.rdx).getAst()
m = ctx.getModel(rdx == 0x3)
print(m)

@JonathanSalwan
Copy link
Owner Author

Could be easy to add concretization policies on a static IR (#473)

@illera88
Copy link
Contributor

illera88 commented Sep 2, 2020

Hey Jonathan!

Any update on this issue?

Thanks!

@JonathanSalwan JonathanSalwan modified the milestones: v0.9, v0.x Feb 7, 2022
JonathanSalwan added a commit that referenced this issue Feb 24, 2022
JonathanSalwan added a commit that referenced this issue Feb 24, 2022
JonathanSalwan added a commit that referenced this issue Feb 25, 2022
@JonathanSalwan JonathanSalwan modified the milestones: v1.x, v1.0 Feb 25, 2022
JonathanSalwan added a commit that referenced this issue Feb 25, 2022
JonathanSalwan added a commit that referenced this issue Feb 25, 2022
JonathanSalwan added a commit that referenced this issue Feb 25, 2022
@SweetVishnya
Copy link
Contributor

@JonathanSalwan, could you write some points that you discovered during brainstorming? Will SMT Array affect a memory model? SMT Array are useful for reversing and verification stuff (I personally verified ROP gadgets with them). However, I consider them too slow for DSE.

JonathanSalwan added a commit that referenced this issue Feb 25, 2022
@JonathanSalwan
Copy link
Owner Author

JonathanSalwan commented Feb 25, 2022

Will SMT Array affect a memory model?

No.

SMT Array are useful for reversing and verification stuff (I personally verified ROP gadgets with them). However, I consider them too slow for DSE.

100% agree with that.

could you write some points that you discovered during brainstorming?

For the moment, I'm only adding ARRAY, SELECT and STORE operators in our representation. It means that we can craft SMT expressions using the ABV logic if we want. I'm also adding the support of those operators in components like lifting engines (dot, llvm, z3 and bitwuzla). For the moment I'm not planning to modify how Triton deals with memory access regarding its DSE engine. For two main reasons:

  • A DSE (in general) using array not scale at all.
  • A DSE (with triton) using array will consume a lot of RAM (due to the SSA form of the store and select nodes).

So, I will not modify SymbolicEngine and <Arch>Semantics. I'm only adding those nodes in AstContext. Okay, I'm writing everything we need to support (one day) a DSE using array (1) but it's not my first goal. My goal is to provide to the user the possibility to craft its own expressions using the ABV logic if he want. As Triton aims to be a library that provides different kinds of analysis, it may be useful to the user to have such operators. For example, since today, we can do this now:

>>> from triton import *

>>> ctx = TritonContext(ARCH.X86_64)
>>> ast = ctx.getAstContext()
>>> var = ast.variable(ctx.newSymbolicVariable(64, 'index'))
>>> mem = ast.array(64) # 64 is the indexing size of the array

>>> # Storing 0xcc at a symbolic index + 1
>>> mem = ast.store(mem, var + 1, ast.bv(0xcc, 8))

>>> # Selecting a node at a constant address 0xdead
>>> node = ast.select(mem, 0xdead)

>>> # Asking a model to get the 0xcc constant.
>>> model = ctx.getModel(node == 0xcc)
>>> print(model)
{0: index:64 = 0xdeac}

We can also lift store and select nodes to LLVM:

>>> print(ctx.liftToLLVM(node))
; ModuleID = 'tritonModule'
source_filename = "tritonModule"

define i8 @__triton(i64 %SymVar_0) {
entry:
  %0 = add i64 %SymVar_0, 1
  %1 = inttoptr i64 %0 to i8*
  store i8 -52, i8* %1, align 1
  %2 = load i8, i8* inttoptr (i64 57005 to i8*), align 1
  ret i8 %2
}

Or lifting them to z3 and back:

>>> z3n = ast.tritonToZ3(node)
>>> print(z3n)
Store(K(BitVec(64), 0), SymVar_0 + 1, 204)[57005]

>>> ast.z3ToTriton(z3n)
(select (store Memory (bvadd index (_ bv1 64)) (_ bv204 8)) (_ bv57005 64))

That's all for the moment.

(1): Once the support of the ABV logic will be reliable (as axiomatic nodes) we can start brainstorming all together about how we can support it during a DSE. For example, if we add options to symbolize LOAD, to symbolize STORE or both. How we will declare arrays, do we symbolize stack, symbolize heap, all memory? Its cells size, 8, 16, 32 or 64 bits? etc.

@JonathanSalwan
Copy link
Owner Author

It's merged into dev-v0.1 with fc3bc08. We can now start thinking about what we can do with this. Btw, this commit may allow us to fix #473 without modifying our DSE memory model but may impact performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants