Snowman should use a more sophisticated disassembly technique than linear #10

jrmuizel · 2015-06-03T03:55:13Z

ARM decompilation currently seems to suffer quite a bit from confusing code with data, and a more sophisticated disassembly technique should help there.

There are lots of options for a better technique:

mcsema has something better, but I don't know much about it.
ByteWeight (http://security.ece.cmu.edu/byteweight/) seems to be what BAP is switching to, or is at least a good candidate.
Dagger uses MCObjectDisassembler (a recursive traversal disassembler) from LLVM, which made it upstream but was later removed. In my experience it did not work very well.
I haven't looked at what radare uses.

yegord · 2015-06-03T18:33:26Z

I agree. I would try something like:

Scan through the executable section, try to disassemble N instructions in a row (you can also use various hints, e.g., symbols and the executable's entry point, for the starting points).
If disassembly succeeds, run recursive traversal from the address of the first instruction.

I already implemented the traversal once, although I never got around to actually using it: 7f1e836. The traversal constructs the control-flow graph on the fly and uses DataflowAnalyzer to perform abstract interpretation, so it should be able to determine the jump destinations and, in particular, switches to and from THUMB mode.

One can take the above code as a starting point and do some experiments with it.
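
The two steps above (scan for N instructions that decode in a row, then run recursive traversal from the first one) can be sketched roughly as follows. This is only an illustration, not the code from 7f1e836: the four-opcode toy instruction set, the function names, and the `hints` parameter are all assumptions made up for the example, and it follows only direct jump targets instead of doing the abstract interpretation that DataflowAnalyzer would provide.

```python
# Toy instruction set, assumed for illustration only (not ARM):
# opcode byte -> (mnemonic, instruction length in bytes).
OPCODES = {
    0x01: ("nop", 1),
    0x02: ("jmp", 2),  # unconditional jump; operand byte is an absolute target
    0x03: ("jz", 2),   # conditional jump; operand byte is an absolute target
    0x04: ("ret", 1),
}


def decode(code, addr):
    """Return (mnemonic, length, target), or None if the bytes do not decode."""
    if not 0 <= addr < len(code) or code[addr] not in OPCODES:
        return None
    mnemonic, length = OPCODES[code[addr]]
    if addr + length > len(code):
        return None
    target = code[addr + 1] if length == 2 else None
    return mnemonic, length, target


def decodes_in_a_row(code, addr, n):
    """Step 1: check that n consecutive instructions decode starting at addr."""
    for _ in range(n):
        insn = decode(code, addr)
        if insn is None:
            return False
        addr += insn[1]
    return True


def recursive_traversal(code, entry, insns, covered):
    """Step 2: follow control flow from entry, recording what is reached.

    insns maps instruction address -> mnemonic; covered is the set of all
    byte addresses occupied by discovered instructions.
    """
    worklist = [entry]
    while worklist:
        addr = worklist.pop()
        while addr not in insns:
            insn = decode(code, addr)
            if insn is None:
                break
            mnemonic, length, target = insn
            insns[addr] = mnemonic
            covered.update(range(addr, addr + length))
            if mnemonic in ("jmp", "jz"):
                worklist.append(target)
            if mnemonic in ("jmp", "ret"):
                break  # no fall-through after these
            addr += length


def discover_code(code, n=3, hints=()):
    """Try the hints first, then every address not yet covered by an instruction."""
    insns, covered = {}, set()
    for start in list(hints) + list(range(len(code))):
        if start not in covered and decodes_in_a_row(code, start, n):
            recursive_traversal(code, start, insns, covered)
    return sorted(insns)


# jz 4; nop; ret; nop; jmp 2; followed by three bytes of data.
sample = bytes([0x03, 0x04, 0x01, 0x04, 0x01, 0x02, 0x02, 0xFF, 0x00, 0xFF])
print(discover_code(sample, n=3, hints=[0]))
```

Skipping start addresses that already lie inside a discovered instruction keeps operand bytes from being reinterpreted as opcodes, which is one of the ways a purely linear sweep ends up confusing code and data.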

nihilus · 2015-06-16T17:45:10Z

Well, couldn't this already piggyback on IDA? That said, a purely free ARM decompiler would be welcome.

yegord · 2015-06-16T21:17:51Z

IDA already knows the ranges of addresses belonging to a function. I am not sure, though, whether it includes data in these ranges. If it does not, the IDA plug-in should have no problems with interpreting data as code. If it does, maybe we need to find exactly where the instructions are (IDA has a getFlags() function for this) and update IdaFrontend::functionAddresses() to report only the address ranges containing executable code.

But this does not remove the need for better code discovery.
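
The range-narrowing idea for the IDA plug-in can be sketched like this. It is a minimal sketch under stated assumptions: `code_ranges` and the `is_code` predicate are hypothetical names, and in a real plug-in the predicate would presumably be backed by the flags IDA keeps for each address rather than by a hard-coded set.

```python
def code_ranges(begin, end, is_code):
    """Split [begin, end) into maximal sub-ranges whose addresses satisfy is_code.

    is_code is a hypothetical callable taking an address and returning True
    when that address holds an instruction byte.
    """
    ranges = []
    run_start = None
    for addr in range(begin, end):
        if is_code(addr):
            if run_start is None:
                run_start = addr  # a new run of code starts here
        elif run_start is not None:
            ranges.append((run_start, addr))  # run ended just before addr
            run_start = None
    if run_start is not None:
        ranges.append((run_start, end))
    return ranges


# Assumed example: a function spanning 0x1000..0x1010 with a 4-byte
# literal pool embedded at 0x1008..0x100b.
data_addrs = set(range(0x1008, 0x100C))
narrowed = code_ranges(0x1000, 0x1010, lambda a: a not in data_addrs)
print([(hex(b), hex(e)) for b, e in narrowed])
```

Reporting the two sub-ranges instead of the whole function range would keep the embedded data from ever reaching the instruction decoder.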

yegord · 2015-07-12T09:22:03Z

"seems to suffer quite a bit from confusing code"

Can you provide an example?

hlide · 2015-07-12T17:40:51Z

Are you talking about things like ROP gadgets?

yegord · 2015-07-12T23:34:34Z

Related: #14 (comment)

yegord · 2015-10-05T22:54:52Z

Related: #51

nihilus · 2015-10-05T23:07:35Z

Moreover #59 is an actual way to achieve this.

yegord · 2015-10-05T23:16:08Z

No, it's not.

yegord pushed a commit that referenced this issue Aug 9, 2015

Merge pull request #10 from yegord/master

76b604d

yegord added the enhancement label Oct 5, 2015

yegord mentioned this issue Oct 5, 2015

Improve IDA plugin to use IDA as disassembler. #59

Closed

yegord added the need info label Jan 3, 2016
