Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add parser for textual ir #126

Merged
merged 13 commits into from Mar 16, 2018

Conversation

Projects
None yet
4 participants
@HMPerson1
Copy link
Collaborator

HMPerson1 commented Mar 3, 2018

Currently, it can roundtrip all the information that was already saved by ir_writer. There's probably still some information that is lost in the process (register state?) but I'm not sure what.

Will fix #117 when done.

@HMPerson1 HMPerson1 force-pushed the HMPerson1:master branch from bf62233 to 019df57 Mar 3, 2018

@HMPerson1

This comment has been minimized.

Copy link
Collaborator Author

HMPerson1 commented Mar 3, 2018

There seems to be a weird bug where occasionally an SSA value is just not emitted. It can be seen in when running minidec on ex-bins/key in the function sym._init:

define-fun sym._init(unknown) -> unknown {
    registers: $r15,$r14,$r13,$r12,$rbp,$rbx,$r11,$r10,$r9,$r8,$rax,$rcx,$rdx,$rsi,$rdi,$rip,$cs,$cf,$pf,$af,$zf,$sf,$tf,$if,$df,$of,$rsp,$ss,$fs_base,$gs_base,$ds,$es,$fs,$gs;
    bb_0x400438.0000():
        [@0x400438.0001] %1: $Unknown64(*?) = {rsp} - #x8;
        [@0x40043C.0003] %2: $Unknown64(*?) = Load({mem}, #x600ff8);
        [@0x400443.0000] %3: $Unknown64(*?) = %2 & %2;
        [@0x400443.0002] %4: $Unknown64 = %3 - #x0;
        [@0x400443.0004] %5: $Unknown64(*?) = %4 & #xffffffffffffffff;
        [@0x400443.0006] %6: $Unknown64(*?) = #x1 ^ %5;
        [@0x400443.0008] %7: $Unknown1(*?) = Narrow1(%6);
        [@0x400443.000A] %8: $Unknown64(*?) = %4 & #xff;
        [@0x400443.000C] %9: $Unknown64 = #x101010101010101 * %8;
        [@0x400443.000E] %10: $Unknown64(*?) = %9 & #x8040201008040201;
        [@0x400443.0010] %11: $Unknown64 = %10 % #x1ff;
        [@0x400443.0012] %12: $Unknown64(*?) = %11 & #x1;
        [@0x400443.0014] %13: $Unknown1(*?) = Narrow1(%12);
        [@0x400443.0018] %14: $Unknown64(*?) = %4 >> #x3f;
        [@0x400443.001A] %15: $Unknown1(*?) = Narrow1(%14);
        JMP IF %7 0x40044D.0000
    bb_0x400448.0000():
        [@0x400448.0024] CALL #x400490($r15={r15}, $r14={r14}, $r13={r13}, $r12={r12}, $rbp={rbp}, $rbx={rbx}, $r11={r11}, $r10={r10}, $r9={r9}, $r8={r8}, $rax=%2, $rcx={rcx}, $rdx={rdx}, $rsi={rsi}, $rdi={rdi}, $rip={rip}, $cs={cs}, $cf=%16, $pf=%13, $af={af}, $zf=%7, $sf=%15, $tf={tf}, $if={if}, $df={df}, $of=%17, $rsp=%1, $ss={ss}, $fs_base={fs_base}, $gs_base={gs_base}, $ds={ds}, $es={es}, $fs={fs}, $gs={gs});
        ...

In the last call operation, $cf is set to the value %16, but %16 was never defined.

After some debugging, I found that ir_writer never emitted that value because SSAStorage.inorder_walk skipped that node because it was never added into a block.

If we want to round-trip IR through text, we have two options to fix this:

  1. Rewrite ir_writer so that it doesn't depend on values being in blocks.
  2. Say that it is invalid to have values that aren't in a block and check this in verifier::verify before it gets to ir_writer.

Option 1 is certainly more robust, but since values not being in blocks kinda breaks inorder_walk in a sense (and also bfs_walk), it might be better to go with option 2.

@sushant94

This comment has been minimized.

Copy link
Collaborator

sushant94 commented Mar 5, 2018

@HMPerson1 This looks great! I'll be trying this out today and merging it 👍

For register state, I'd like an additional line at the beginning of every function to "define" registers, much like declaring variables in source languages. This can be parsed to generate the register state later on.

One thing I'd like to see are a few testcases, nothing too complicated -- just something that takes handwritten IR, generates the ssa structure and then assertions to check the lifting is correct.

@HMPerson1 HMPerson1 force-pushed the HMPerson1:master branch from f5e4ebe to 4e4f41b Mar 7, 2018

@HMPerson1

This comment has been minimized.

Copy link
Collaborator Author

HMPerson1 commented Mar 7, 2018

I've added some test cases.

For register state, I already changed ir_writer to emit a line to "define" the registers and the register state at the very end of the function is roundtripped, but what I was confused about was the register state for all the other basicblocks. It seems like the register state for every basic block is recorded in SSAStorage (and can be retrieved with registers_in) but only that of the exit node is emited in ir_writer.

Also, could you comment on this problem?

@sushant94

This comment has been minimized.

Copy link
Collaborator

sushant94 commented Mar 7, 2018

@HMPerson1 The question is, why is %16 not a part of the block? Does it belong to some other block? Did you look into this? If not, that needs some investigation.

@HMPerson1

This comment has been minimized.

Copy link
Collaborator Author

HMPerson1 commented Mar 10, 2018

It looks like it happens in constant propagation, in emit_ssa. When creating a new node for a OpNarrowed constant, it only copies over the address of the old node but not which block the old node was in:

let new_node = if w == 64 {
    const_node
} else {
    let address = self.g.address(*k).unwrap_or_else(..);
    let opcode = MOpcode::OpNarrow(w as u16);
    let new_node = self.g.insert_op(opcode, ndata.vt, None).unwrap_or_else(..);
    self.g.set_address(new_node, address);
    self.g.op_use(new_node, 0, const_node);
    new_node
};
self.g.replace_value(*k, new_node);

If I replace set_address with insert_into_block then it works again.

@@ -363,86 +373,59 @@ impl IRWriter {
let line = match ssa.g[node] {
NodeData::Op(ref opcode, vt) => {
if let MOpcode::OpConst(_) = *opcode {
radeco_err!("found const");

This comment has been minimized.

@XVilka

XVilka Mar 12, 2018

Member

why error?

This comment has been minimized.

@HMPerson1

HMPerson1 Mar 12, 2018

Author Collaborator

There shouldn't be any const nodes in inorder_walk since all the const nodes should be in a separate table and not part of any particular block.

Do you think it might be better as radeco_warn instead?

This comment has been minimized.

@XVilka

XVilka Mar 15, 2018

Member

No, then is ok. Sorry, oversaw that.

HMPerson1 added some commits Mar 3, 2018

Add `ir_reader`
Also made some changes to `ir_writer` so that the SSA can be round-tripped.
`minidec` also now checks that `ir_reader` can correctly read back whatever
`ir_writer` wrote.
Fix overzealous selector deletion
`remove_value` will unconditionally remove all edges adjacent to the given node
and then remove it, when really we just want to remove the selector edge.

@HMPerson1 HMPerson1 force-pushed the HMPerson1:master branch 2 times, most recently from b27cdb8 to 64a16e4 Mar 12, 2018

@sushant94

This comment has been minimized.

Copy link
Collaborator

sushant94 commented Mar 15, 2018

@HMPerson1 How close is this to be merged? Most of the PR lgtm, if this can be merged, I'll give it another pass and merge it in.

@HMPerson1

This comment has been minimized.

Copy link
Collaborator Author

HMPerson1 commented Mar 15, 2018

I think it's done now. Would you prefer a merge to fix the conflicts or a rebase?

HMPerson1 added some commits Mar 15, 2018

@sushant94

This comment has been minimized.

Copy link
Collaborator

sushant94 commented Mar 15, 2018

Either one is fine, rebase could be preferable.

@HMPerson1 HMPerson1 force-pushed the HMPerson1:master branch from c247036 to bcd7ce0 Mar 15, 2018

@HMPerson1 HMPerson1 changed the title [WIP] Add parser for textual ir Add parser for textual ir Mar 16, 2018

@sushant94 sushant94 merged commit 533a269 into radareorg:master Mar 16, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@sushant94

This comment has been minimized.

Copy link
Collaborator

sushant94 commented Mar 16, 2018

Great! SGTM, merging this in :)

@radare

This comment has been minimized.

Copy link
Collaborator

radare commented Mar 16, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.