Control Flow Graph Scanning for Android
Switch branches/tags
Nothing to show
Clone or download
Latest commit 1ff3fc8 Jan 29, 2015
Permalink
Failed to load latest commit information.
src/main/java/org/talos/CFGScanDroid Updated graph output. Jan 29, 2015
LICENSE.txt Updated graph output. Jan 29, 2015
README.md Updated graph output. Jan 29, 2015
build.sh Public! Sep 28, 2014
pom.xml Updated graph output. Jan 29, 2015

README.md

CFGScanDroid

CFGScanDroid is a utility for comparing control flow graph (CFG) signatures to the control flow graphs of Android methods.

It was designed as a scanner for malicious applications.

Talk here: http://www.irongeek.com/i.php?page=videos/derbycon4/t420-control-flow-graph-based-virus-scanning-douglas-goddard

Building

If you do not have Maven installed:

sudo apt-get install maven

(If you're on a non-Debian OS, I believe in you and your ability to get Maven installed.)

If you have Maven, the build script should run the correct command:

./build.sh

This will create a file:

target/CFGScanDroid-0.1-jar-with-dependencies.jar

Running

java -jar target/CFGScanDroid-0.1-jar-with-dependencies.jar

USAGE:
	Must have one of (-d|-s|-l|-r) and you should probably specify some DEX files (-f) to use too

ESSENTIALS
	-f, -dex-files
		DEX file(s) to run
	-d, -dump-sigs
		Dump signature for each method of each DEX file
	-s, -sig-file
		A file containing signatures
	-r, -raw-signature
		Pass a signature in raw on the command line
	-l, -load-sigs-from-dex
		DEX file(s) whose methods to scan with

SCAN MODES
	-e, -exact-match
		Only match complete signature CFG to function CFG
	-p, -partial-match
		Find the signature graph within the function graph
	-o, -one-match
		Only match once on a method
	-b, -subgraph
		Tries to match signature depth 0 to each function vertex
	-i, -simple-match
		Match exact on vertex and edge count only (fp prone)
	-n, -normalize
		Normalize the control flow graph to have a single entry point

OUTPUT
	-j, -print-json
		Print JSON for matches
	-g, -graph
		Outputs a GraphML file of the matches
	-m, -print-matched
		Print when a match is found
	-u, -print-unmatched
		Print when no match is found
	-t, -print-statistics
		Print signature statistics after scan
	-h, -short-identifier
		Do not print full CFG identifier

Basic usage will be:

java -jar target/CFGScanDroid-0.1-jar-with-dependencies.jar -s path/to/sigfile.adb -f path/to/files/

If you don't have any signatures, you can dump the signatures for a file with -d:

java -jar target/CFGScanDroid-0.1-jar-with-dependencies.jar -d -f path/to/file/classes.dex

Files can be APKs or DEX files. Obviously DEX will be faster as it does not have to unzip to reach the code.

If you don't want to store your signature in a file, you can use -r "SOMESIGNATURE" to pass a signature in on the command line.

If the code is running to fast for you, there are some good ways to slow it down.

The -n option normalizes the control flow graph, removing any vertices not reachable from V0 (entry point). This is to get rid of catch blocks that have no branch to them.

The -p option enables partial graph matching. This means a signature will be compared to any function with >= the number of vertices and edges as it.

The -g option enables subgraph matching. This will normalize the graphs and try to match the signature's V0 with every node in a graph. This has not been optimized at all so it is considerably slow for a large number of samples and signatures.

If there is too stuff printing to the screen, there are some solutions to that.

Using -u false will hide unmatched samples, likewise -m false will hide matches.

Stats are printed at the end, if you don't like this you can use -t false to hide them.

If the class/path.functionName() is too long of an identifier for you, -h will shorten it to path.functionName(). Truncating anything before the final /.

If you don't need multiple alerts on the same method, you can use -o to stop scanning a method once it is identified with one signature.

Signature Format

The signatures are of format NAME;VERTEX_COUNT;ADJACENCY_LIST -

The adjacency list is of the format PARENT_BB:CHILD_BB[,CHILD_BB,...];

An example:

interceptSMS;4;0:1,2;1:3;2:3,0

The signature name is interceptSMS, the next fields indicates there are 4 vertices.

Breaking out the adjacency list:

0:1,2
1:3
2:3,0

Vertex 0 branches to V1 and V2. V1 branches to V3. V2 branches to V3 and V0.

Or if you prefer crappy diagrams:

    0 <----
  /   \   |
 1     2__|
  \   /
    3

Contact

Feel free to open an issue here for any problems or feature requests.

Thanks!