Simplification passes for ExprAssign #1004

Open
mrphrazer opened this issue Mar 13, 2019 · 7 comments

Comments

@mrphrazer
Contributor

Hi!

I'm preparing a PR. For it, I need to apply simplification rules to ExprAssign, and these rules have to perform different transformations on src and dst.

ira_cfg = ira.new_ircfg_from_asmcfg(asm_cfg)
ira_cfg.simplify(expr_simp_high_to_explicit)
expr_simp = ExpressionSimplifier()
expr_simp.enable_passes({ExprAssign: [my_simp]})
ira_cfg.simplify(expr_simp)

AssignBlock.simplify applies the same simplifier to both src and dst:

for dst, src in viewitems(self):
    if dst == src:
        continue
    new_src = simplifier(src)
    new_dst = simplifier(dst)

I patched it as follows:

for dst, src in viewitems(self):
    if my_simp_flag:
        e = self.dst2ExprAssign(dst)
        rewritten = simplifier(e)
        new_src = rewritten.src
        new_dst = rewritten.dst
    else:
        if dst == src:
            continue
        new_src = simplifier(src)
        new_dst = simplifier(dst)

Obviously, this is not how it should be done. What do you think would be a good way to apply this?

@commial
Contributor

commial commented Mar 15, 2019

Hello,

Hmm, for now I would rather explicitly call the simplifier in your script instead of modifying AssignBlock.simplify in Miasm.
We have this kind of code in several places in Miasm, for example here: https://github.com/cea-sec/miasm/blob/master/miasm/analysis/outofssa.py#L383

@mrphrazer
Contributor Author

mrphrazer commented Mar 29, 2019

Hi!

I could do this, but in that case I think I would also have to modify or recreate all Assign/IR blocks in ira_cfg manually if I want to perform further analysis?

I am currently looking for a clean solution since I'm planning to introduce a new PR that requires performing simplifications of ExprAssign on the graph level in a first step before applying SSA.

@serpilliere
Contributor

Ok. Just some remarks:
For us, ExprAssign is a weird case in Miasm, as it is more a statement than a right/left value.
But as it belongs to the Expr class for now, maybe we can consider that the expression simplifier can deal with it.
The patch for this may be a small one: the ExprAssign case has to be added to the expression simplifier cases.

But it may trigger some new behavior:

  • For now, replace_expr is coded as a visitor. If we run replace_expr on an ExprAssign, both the left value and the right value will be modified. The problem is that we may want to replace only the sources of the expressions. Take constant propagation, for example, in:
@32[EAX] = @32[EAX] + 1

and let's say a previous analysis concluded that @32[EAX] can be replaced by 0x1337BEEF. Here we clearly want replace_expr on the ExprAssign to give:

@32[EAX] = 0x1337BEEF + 1

and not:

0x1337BEEF = 0x1337BEEF + 1

So the conclusion may be that we have to either

  • modify replace_expr
  • or add a new API, something like replace_right_values and replace_left_values, which takes this problem into account.

I think we have to take this problem into account, and maybe the second solution is the good one. Today we use replace_expr and try to twist its behavior to match our goal, but the real solution should be explicit and clear APIs for this. It will also make clearer what a right/left value is in Miasm, which seems a good point to me 😄
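To make the difference concrete, here is a minimal sketch on toy expression types (Id, Int, Mem and Op are hypothetical stand-ins, not Miasm's Expr classes). The function name replace_right_values follows the proposal above, but its signature and behavior are assumptions. In particular, the sketch assumes that the address inside a memory destination counts as a right value, while the destination itself is never replaced.

```python
from dataclasses import dataclass

# Toy expression types: hypothetical stand-ins, not Miasm's Expr classes.
@dataclass(frozen=True)
class Id:
    name: str

@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Mem:
    ptr: object
    size: int

@dataclass(frozen=True)
class Op:
    op: str
    args: tuple

def replace_expr(expr, table):
    """Visitor-style replacement: rewrite every matching sub-expression."""
    if expr in table:
        return table[expr]
    if isinstance(expr, Mem):
        return Mem(replace_expr(expr.ptr, table), expr.size)
    if isinstance(expr, Op):
        return Op(expr.op, tuple(replace_expr(a, table) for a in expr.args))
    return expr

def replace_right_values(dst, src, table):
    """Replace in src and in the address of a memory dst, never dst itself.

    Treating the address of a memory dst as a right value is an
    assumption of this sketch.
    """
    new_src = replace_expr(src, table)
    if isinstance(dst, Mem):
        dst = Mem(replace_expr(dst.ptr, table), dst.size)
    return dst, new_src

# @32[EAX] = @32[EAX] + 1, with @32[EAX] known to be 0x1337BEEF
mem = Mem(Id("EAX"), 32)
table = {mem: Int(0x1337BEEF)}
new_dst, new_src = replace_right_values(mem, Op("+", (mem, Int(1))), table)
```

Running plain replace_expr on both sides would turn the destination into the constant, which is exactly the unwanted result shown above.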

What do you think about this?

@mrphrazer
Contributor Author

Hi!

I think both approaches have advantages and disadvantages. In the short term, introducing replace_right_values and replace_left_values certainly seems far more feasible (perhaps simplify_lhs and simplify_rhs are better names?). In the long term, however, it is not the ideal solution in terms of clean code and unnecessary computation.

Let's take for instance the following:

ira_cfg.simplify_lhs(expr_simp_lhs)
ira_cfg.simplify_rhs(expr_simp_rhs)

Let's assume simplify_lhs and simplify_rhs look as follows:

    def simplify_lhs(self, simplifier):
        """
        Return a new AssignBlock with expression simplified
        @simplifier: ExpressionSimplifier instance
        """
        new_assignblk = {}
        for dst, src in viewitems(self):
            new_dst = simplifier(dst)
            new_assignblk[new_dst] = src
        return AssignBlock(irs=new_assignblk, instr=self.instr)

In this case, we iterate over all IR instructions and generate all AssignBlocks twice. If the expression simplifier were able to handle an ExprAssign (where we can define custom passes for the left and the right side), this would not be the case. However, far more code would have to be changed.
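For comparison, a single-pass variant could take one callable per side. The following is only a sketch: ToyAssignBlock is a hypothetical dict-based stand-in for AssignBlock, simplify_sides is an invented name, and plain strings play the role of expressions.

```python
class ToyAssignBlock(dict):
    """Hypothetical stand-in for an AssignBlock: a dst -> src mapping."""

    def simplify_sides(self, dst_simplifier, src_simplifier):
        """Return a new block, applying one callable per side in one pass."""
        new_assignblk = {}
        for dst, src in self.items():
            new_assignblk[dst_simplifier(dst)] = src_simplifier(src)
        return ToyAssignBlock(new_assignblk)

def simp_dst(expr):
    # Placeholder left-side rule: normalise the destination name.
    return expr.upper()

def simp_src(expr):
    # Placeholder right-side rule: drop a trailing "+ 0".
    return expr.replace(" + 0", "")

block = ToyAssignBlock({"a": "x + 0", "b": "y"})
result = block.simplify_sides(simp_dst, simp_src)
```

Each assignment is visited once and only one new block is built, at the cost of a second parameter on the method.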

@serpilliere
Contributor

Hi @mrphrazer ,

In fact I was not talking about simplification rules, but about replace_expr. I agree with you about the double creation of assignment blocks. But maybe we can have something like:

replace_expr(left_tokens_replacement, right_tokens_replacement)

In this function we could manage left and right simultaneously, which would involve creating the block only once.

But I am curious about one thing: do you have examples of reduction rules which may be applied to the right side of an expression and which should not be applied to the left one? (OK, let's say replacement is a case apart.)

@mrphrazer
Contributor Author

mrphrazer commented Apr 1, 2019

Hi! Perhaps it is better to submit the PR first and discuss the details afterwards. Otherwise it might (or will) not make any sense to you.

Give me a few days, then I will take up the discussion here again.

@mrphrazer
Contributor Author

This was quicker than intended:

See PR #1021 for more details.

The simplification pass for ExprAssign is required to rewrite memory expressions as follows:

# ebx = @32[eax]
ebx = mem_read(M, eax, 32)

# @32[eax] = ebx
M = mem_write(M, eax, ebx, 32)

Do you have any suggestions on how we could implement this in a clean manner?
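As an illustration of why this rewrite needs the whole assignment rather than each side on its own, here is a small sketch on toy types. Id, Mem and Call are hypothetical stand-ins, not Miasm classes, and the helper names explicit_reads and rewrite_assign are invented. The argument order of mem_read and mem_write follows the example above.

```python
from dataclasses import dataclass

# Toy expression types: hypothetical stand-ins, not Miasm's Expr classes.
@dataclass(frozen=True)
class Id:
    name: str

@dataclass(frozen=True)
class Mem:
    ptr: object
    size: int

@dataclass(frozen=True)
class Call:
    name: str
    args: tuple

M = Id("M")  # symbolic variable standing for the whole memory state

def explicit_reads(expr):
    """Rewrite every memory read in expr to mem_read(M, addr, size)."""
    if isinstance(expr, Mem):
        return Call("mem_read", (M, explicit_reads(expr.ptr), expr.size))
    if isinstance(expr, Call):
        return Call(expr.name, tuple(explicit_reads(a) for a in expr.args))
    return expr

def rewrite_assign(dst, src):
    """Rewrite one assignment; a memory dst becomes M = mem_write(...)."""
    new_src = explicit_reads(src)
    if isinstance(dst, Mem):
        addr = explicit_reads(dst.ptr)
        return M, Call("mem_write", (M, addr, new_src, dst.size))
    return dst, new_src

eax, ebx = Id("eax"), Id("ebx")
read_case = rewrite_assign(ebx, Mem(eax, 32))   # ebx = @32[eax]
write_case = rewrite_assign(Mem(eax, 32), ebx)  # @32[eax] = ebx
```

In the write case the new destination and the new source both depend on the original destination, so a per-side simplifier that only sees dst or only sees src cannot produce it.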
