IRSB decode error for self-modifying code? #26

patrafter1999 · 2016-01-08T00:27:43Z

Hi Guys,

I am pretty new to angr. I think it's really cool. I wrote some basic code for testing a shellcode. The source is as follows:

import angr

bp = 0x401010

def check(path):
    if path.state.ip.args[0] == bp:
        return True
    else:
        return False

b = angr.Project('shellcode.exe')
state = b.factory.entry_state()
pg = b.factory.path_group(state)

pg.explore(find=check)
found = pg.found[0]

print len(pg.found)

The shellcode disassembly looks like this:

.text:00401000 start           proc near
.text:00401000                 jmp     short loc_401012
.text:00401000 start           endp
.text:00401000
.text:00401002
.text:00401002 ; =============== S U B R O U T I N E =======================================
.text:00401002
.text:00401002 ; Attributes: noreturn
.text:00401002
.text:00401002 sub_401002      proc near               ; CODE XREF: sub_401002:loc_401012↓p
.text:00401002                 pop     ebx
.text:00401003                 dec     ebx
.text:00401004                 xor     ecx, ecx
.text:00401006                 mov     cx, 296h
.text:0040100A
.text:0040100A loc_40100A:                             ; CODE XREF: sub_401002+C↓j
.text:0040100A                 xor     byte ptr [ebx+ecx], 9Ch
.text:0040100E                 loop    loc_40100A
.text:00401010                 jmp     short loc_401017
.text:00401012 ; ---------------------------------------------------------------------------
.text:00401012
.text:00401012 loc_401012:                             ; CODE XREF: start↑j
.text:00401012                 call    sub_401002
.text:00401017 ; ---------------------------------------------------------------------------
.text:00401017
.text:00401017 loc_401017:                             ; CODE XREF: sub_401002+E↑j
.text:00401017                 pop     ds
.text:00401018                 js      short loc_401086
.text:0040101A                 lodsd
.text:0040101B                 push    ebp
.text:0040101C
.text:0040101C loc_40101C:                             ; CODE XREF: sub_401002+5B↓j
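
For readers following the listing, the decoder stub at 0x401002 through 0x40100E can be modeled in a few lines of Python. This is only a sketch: it works on a plain `bytearray` indexed by offset rather than on real process memory, and the names `decode_in_place`, `XOR_KEY`, and `BLOCK_LEN` are made up for illustration. The constants 0x9C and 0x296 are taken from the listing.

```python
XOR_KEY = 0x9C     # immediate operand of the xor at 0x40100A
BLOCK_LEN = 0x296  # value loaded into cx at 0x401006

def decode_in_place(mem, ebx, length=BLOCK_LEN, key=XOR_KEY):
    """Mimic `xor byte ptr [ebx+ecx], key` followed by `loop`."""
    ecx = length
    while ecx != 0:
        mem[ebx + ecx] ^= key
        ecx -= 1  # `loop` decrements ecx and jumps back while it is nonzero

# Tiny demonstration on made-up data (not the real shellcode bytes).
plain = bytes(range(16))
buf = bytearray([0x00]) + bytearray(b ^ XOR_KEY for b in plain)
decode_in_place(buf, ebx=0, length=len(plain))
assert bytes(buf[1:]) == plain  # offsets 1..16 are decoded
assert buf[0] == 0x00           # offset ebx+0 is never touched
```

In the binary itself, `call sub_401002` pushes the return address 0x401017, so after `pop ebx` and `dec ebx` the register holds 0x401016 and the loop rewrites the bytes from 0x4012AC down to 0x401017.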

The shellcode XORs the obfuscated block of code starting at 0x401017. My test angr script should stop at 0x401010, right before the jump into the deobfuscated code, so that I can inspect the deobfuscated bytes. Instead I got the following errored paths.

>> pg.errored
[<Errored Path with 667 runs (at 0x4010f8, AngrExitError)>, <Errored Path with 667 runs (at 0x401098, AngrExitError)>]

Since there are only a couple of direct jumps before 0x401010, angr shouldn't need to parse the obfuscated block (which looks like gibberish before deobfuscation). But that appears to be what angr is doing. I might be wrong. See more error details below.

>> pg.errored[0].error
AngrExitError('IR decoding error at 0x4010f8. You can hook this instruction with a python replacement using project.hook(0x4010f8, your_function, length=length_of_instruction).',)
>> pg.errored[1].error
AngrExitError('Cannot create run following jumpkind Ijk_SigTRAP',)

Please find the shellcode in the zip (pw: infected). Any comment will be greatly appreciated.

shellcode.exe.zip


patrafter1999 · 2016-01-08T01:09:48Z

Also, do you guys have any plans to open a forum for sharing knowledge? I find it very difficult to follow the many different aspects of symbolic execution. It would also be great to share techniques among researchers.

Much appreciated,

rhelmot · 2016-01-08T02:12:18Z

If you want angr to parse self-modifying code you need to initialize the project with support_selfmodifying_code=True.
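
A minimal sketch of that suggestion, assuming the same `shellcode.exe` as in the original post. The wrapper name `load_project` is hypothetical, and the keyword is spelled exactly as in the comment above; other angr releases may spell it differently, so check the `angr.Project` signature of the installed version.

```python
def load_project(path):
    """Load `path` with self-modifying-code support turned on."""
    # Imported lazily so the helper can be defined without angr installed.
    import angr
    return angr.Project(path, support_selfmodifying_code=True)

# Usage (needs angr and the sample on disk):
# b = load_project('shellcode.exe')
# state = b.factory.entry_state()
```

The likely reason this matters here: without the option, angr appears to lift instructions from the bytes of the binary as loaded rather than from the memory the decoder loop has just rewritten, which would be consistent with the decode error at 0x4010f8.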

zardus · 2016-01-08T02:15:09Z

On top of that, due to how angr works internally, your "check" function will only be called at the beginning of a basic block. The address you're looking for, 0x401010, isn't at the start of a basic block (according to VEX). You can see this by doing:

In [12]: project.factory.block(0x40100a).vex.pp()
IRSB {
   t0:Ity_I8 t1:Ity_I8 t2:Ity_I8 t3:Ity_I32 t4:Ity_I32 t5:Ity_I32 t6:Ity_I32 t7:Ity_I32 t8:Ity_I32 t9:Ity_I32 t10:Ity_I32 t11:Ity_I1 t12:Ity_I32 t13:Ity_I32

   00 | ------ IMark(0x40100a, 4, 0) ------
   01 | t6 = GET:I32(ecx)
   02 | t7 = GET:I32(ebx)
   03 | t4 = Add32(t7,t6)
   04 | t2 = LDle:I8(t4)
   05 | t0 = Xor8(t2,0x9c)
   06 | STle(t4) = t0
   07 | PUT(cc_op) = 0x0000000d
   08 | t8 = 8Uto32(t0)
   09 | PUT(cc_dep1) = t8
   10 | PUT(cc_dep2) = 0x00000000
   11 | PUT(cc_ndep) = 0x00000000
   12 | PUT(eip) = 0x0040100e
   13 | ------ IMark(0x40100e, 2, 0) ------
   14 | t9 = Sub32(t6,0x00000001)
   15 | PUT(ecx) = t9
   16 | t11 = CmpNE32(t9,0x00000000)
   17 | if (t11) { PUT(eip) = 0x40100a; Ijk_Boring }
   18 | ------ IMark(0x401010, 2, 0) ------
   NEXT: PUT(eip) = 0x00401017; Ijk_Boring
}

(if you want to learn more about VEX, check out https://github.com/angr/angr-doc/blob/master/ir.md)
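
To make the block boundary concrete, the three IMark lines in the dump can be read as (address, length) pairs. The sketch below is plain Python with no angr calls; `IMARKS`, `block_start`, and `is_inside_block` are illustrative names, not part of any library.

```python
# (address, length) pairs copied from the IMark lines of the dump above
IMARKS = [(0x40100a, 4), (0x40100e, 2), (0x401010, 2)]

def block_start(imarks):
    """A block is identified by the address of its first instruction."""
    return imarks[0][0]

def is_inside_block(imarks, addr):
    """True if addr is an instruction of the block other than its first one."""
    return addr in [a for a, _ in imarks[1:]]

assert block_start(IMARKS) == 0x40100a
assert is_inside_block(IMARKS, 0x401010)      # mid-block: "check" is never called here
assert not is_inside_block(IMARKS, 0x40100a)  # block start: "check" is called here
```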

There are two things you can do: break at 0x401017, which is the beginning of the basic block that it jumps to, or break at 0x40100a, which is the beginning of that basic block. Then the breakpoint, at least, should work.
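
As a rough sketch of the first option, the "check" function from the original script can be parameterized on the target address and pointed at 0x401017. `make_check` is a hypothetical helper name, and the `SimpleNamespace` object is only a stand-in for a path so that the predicate can be exercised without angr.

```python
from types import SimpleNamespace

def make_check(target):
    """Return a find callback that matches when a path's ip equals target."""
    def check(path):
        # Same comparison as in the original script.
        return path.state.ip.args[0] == target
    return check

check = make_check(0x401017)  # start of the block that the decoder jumps to

# Stand-in for a path object, just to try the predicate.
fake_path = SimpleNamespace(
    state=SimpleNamespace(ip=SimpleNamespace(args=[0x401017])))
assert check(fake_path)

# With a real path group, as in the original script:
# pg.explore(find=check)
```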

If you really need to break at that exact instruction, SimuVEX breakpoints are more granular, and let you break at specific instructions or whenever a given condition is met (e.g., a specific address being written to). You can read more about that at https://github.com/angr/angr-doc/blob/master/simuvex.md#breakpoints
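
A hedged sketch of what such a breakpoint could look like. It assumes the `state.inspect.b(...)` interface described in the linked document, with an `'instruction'` event filtered by an `instruction=` keyword; verify the exact event and keyword names against the docs for the installed version. `break_at_instruction` and `report` are hypothetical names.

```python
def report(state):
    # Called with the current state each time the breakpoint fires.
    print('breakpoint hit')

def break_at_instruction(state, addr, action=report):
    """Register an 'instruction' breakpoint on `state` that fires at `addr`."""
    # Assumption: 'instruction' event with an `instruction=` address filter.
    return state.inspect.b('instruction', instruction=addr, action=action)

# Usage with the entry state from the original script:
# state = b.factory.entry_state()
# break_at_instruction(state, 0x401010)
```

Passing an explicit `action` keeps the run non-interactive; the callback is the natural place to dump the freshly decoded bytes.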

zardus · 2016-01-08T02:15:54Z

As for the forum, are you on #angr on freenode.net? That's the closest thing that we have at the moment...

patrafter1999 · 2016-01-08T04:55:14Z

Thanks heaps. I'm on freenode.net now. I will ask questions there from now on. salls already helped me with a couple of things. Knowing that the find callback gets invoked at the basic-block level helps!

I'm trying to do some taint analysis aiming to identify the decryptor code and its associated encrypted block that gets decrypted. salls advised me to use 'TRACK_ACTION_HISTORY' for recording all taint info.

Thanks!

zardus closed this as completed Jun 4, 2016
