Permalink
Switch branches/tags
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
135 lines (80 sloc) 9.01 KB

Replication Package Instructions

These are instructions to reproduce the results in our paper "Mimic: Computing Models for Opaque Code" in FSE'15.

The virtual machine contains a clone of our GitHub repository in the directory ~/mimic, and the user should start a terminal and navigate to that folder using cd ~/mimic as the first step upon booting the virtual machine. We encourage reviewers to read our updated version of the paper, as it contains an important correction as well as an improved evaluation section.

Claims

This artifact was used to obtain all results in the paper, and can be used to replicate these results, and experiment with the prototype beyond what is discussed in the paper. Note that we cannot give guarantees about the applicability of mimic to other domains; there may be technical limitations like the experimental proxy support in node.js. Furthermore, our prototype has not been tuned for wide adoption and instead serves as a proof of concept.

The paper evaluates mimic on five research questions, and in this document we explain how to reproduce the answers to all of these questions. But first, we give a brief overview of the artifact and the usage of mimic.

Usage and Overview

mimic is open source and available at github.com/Samsung/mimic. The README.md file contains instructions on how to install and run the tool. The virtual machine we provide for the FSE Replication Packages Evaluation Committee has mimic installed and this guide should cover all necessary information. The artifact also contains the full source code.

The main way to invoke the tool is via the script mimic. It requires as input a function (for which a model should be computed) as well as some initial inputs. To start, lets consider a trivial example and synthesize a model for the identity function:

./mimic --function "return x" --argnames "x" --arguments "1" "2"

We provide the body of the function to be synthesized (return x here), as well as two sample inputs (1 and 2). mimic should be able to find a model for this trivial example in no time. Note that while we pass the code of the body to mimic, the tool does not look at the source and only uses it to execute it. Furthermore, the code might contain calls to native functions like those from the standard library.

In the paper we consider the array standard library, and we could synthesize models for those functions in the same way. To make things a bit easier, we have collected the functions from Table 1 in the paper in tests/array.json, and wrote a script to use that information to pass it to mimic. For instance, to synthesize code for Array.prototype.pop, we can simply run

scripts/example.py pop

This will actually show the command that is used to invoke mimic. The tool is now taking a bit more time. mimic can run highly parallel, and depending on the machine it is run on, synthesis can take more or less time. Hint: Use --threads <n> to control the number of threads that mimic should use (the number defaults to half the number of processors available).

To see how mimic works in more detail, we can pass the parameter --debug to mimic (or scripts/example.py, which just forwards parameters). If we run

scripts/example.py pop --debug

then mimic will only invoke one copy of mimic-core and show various additional information on what is going on. Since this just runs mimic-core once, the search may not converge (or fail), but if it is repeated often enough, eventually mimic will find a model. It works best to kill mimic-core (using Ctrl+C) after a while, and retry with a new random seed (which is automatically chosen). Of course, when we run mimic (rather than mimic-core via --debug), many runs of mimic-core are done automatically, and there is no need to manually restart.

Replication Instructions

Now that we have a basic understanding on how to run mimic, we can turn our attention to reproducing the results in the paper.

RQ1: Success Rate

We claim in Table 1 that mimic is able to find a model for 15 functions. This can be reproduced by running

scripts/example.py <function>

for all 15 different functions. Note that depending on your hardware, it may take significantly more time to find the models (see next section).

RQ2: Performance

Reproducing performance results requires the same hardware and likely mimic should be run natively (and not in a VM). We used an Intel Xeon CPU E5-2697 (which has 28 physical cores). On other machines the numbers may be different. If you would like to reproduce these numbers, we urge you to run mimic natively (and not in a VM). In particular, the VM likely has only one (virtual) CPU, which means that mimic will not be able to make use of any parallelism.

The performance numbers are obtained with 100 repetitions for all functions, and we actually additionally use two different fitness functions (to answer RQ5), which results in 15*100*2 = 3000 individual runs. On our hardware, this took just over 58 hours.

The experiment can be run by invoking make experiment, which runs the following command (be aware that this will take a long time to complete)

scripts/experiment.py --exp_name "main" -n 100 --metric "0,1"

It is possible to only reproduce some of the table (so that the experiment takes less time). For instance, to only do 10 repetitions, only use the default fitness function and only run the experiment for the function Array.prototype.pop, invoke:

scripts/experiment.py --exp_name "pop_only" -n 10 --metric "0" --filter "pop"

This should be significantly faster than the full run, but might still take anywhere from minutes to hours (depending on the hardware). Note that scripts/experiment.py only collects the data and stores it in an output folder in tests/out, but does not analyze it. For the analysis, we use scripts/process.py.

In our package we include the data gathered by our own experimental run, so the next step will work even if you have not (yet) run scripts/experiment.py yourself. To analyze this data, run

scripts/process.py --folder tests/out/2015-06-05_paper_data

If you have generated your own experimental data, change the folder above accordingly. This generates a table as well as some aggregate statistics on the console, and can also produce the LaTeX table in the paper (Table 1). The output also contains all other aggregate values found in the paper, like the average and maximum cleanup times reported in the paper.

RQ3: Usefulness

The mimic-generated models accepted into WALA can be seen here. Here is an example program whose analysis benefits from the mimic models for reduce, every, and some:

function a() {
  return false;
}

function b() {
  return a();
}


var x = [1,3,5];
console.log(x.some(b));

function c(i) {
  return i%2 === 1;
}

console.log(x.every(c));

function add(previousValue, currentValue, index, array) {
  return previousValue + currentValue;
}

var y = [0,1,3,5,2];
console.log(y.reduce(add));

Without the mimic models, when WALA constructed a call graph for the above code, it would erroneously conclude that functions a, b, c, and add were dead code. With the mimic models, WALA observed the callbacks from the library functions to these methods, and its call graph correctly identifies the methods as live. For further details on actually running WALA to generate call graphs, see the WALA Getting Started Guide.

RQ4: Obfuscation

For this experiment, we start with the model that mimic found (see models/array.js). For instance, for Array.prototype.pop, this code would be

function pop(arg0) {
  var n0 = arg0.length
  if (n0) {
    var n1 = arg0[n0-1]
    arg0.length = n0-1
    delete arg0[n0-1]
    return n1
  } else {
    arg0.length = 0
  }
}

We take the function body, and pass it through the obfuscator at javascriptobfuscator.com. This yields a new body like the following:

var _0x6339=["\x6C\x65\x6E\x67\x74\x68"];var n0=arg0[_0x6339[0]];if(n0){var n1=arg0[n0-1];arg0[_0x6339[0]]=n0-1;delete arg0[n0-1];return n1;}else {arg0[_0x6339[0]]=0};

We can run mimic directly by passing the string as the --function parameter:

./mimic --function 'var _0x6339=["\x6C\x65\x6E\x67\x74\x68"];var n0=arg0[_0x6339[0]];if(n0){var n1=arg0[n0-1];arg0[_0x6339[0]]=n0-1;delete arg0[n0-1];return n1;}else {arg0[_0x6339[0]]=0};' --argnames "arg0" --arguments "[1,2,3,4]"

mimic should successfully be able to recover a human-readable model. This can be done for all examples in a similar manner.

RQ5: Fitness Function

The steps to evaluate RQ2 actually also answer this question, as mimic is run with both fitness functions in the experiment. In particular, the output of scripts/process.py appears in Section 5.2 of the paper.