Commit
added the README for the replication package
Showing 2 changed files with 114 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
# ESEC/FSE 2017 Experimental Replication Package | ||
|
||
Our paper [Fairness Testing: Testing Software for | ||
Discrimination](http://people.cs.umass.edu/~brun/pubs/pubs/Galhotra17fse.pdf) | ||
was published in [ESEC/FSE 2017](http://esec-fse17.uni-paderborn.de/). | ||
This page contains the replication package to repeat the experiments | ||
described in the paper, and reproduce Figures 1 and 2. (Figure 3 is a | ||
theoretical result.) | ||
|
||
## Requirements | ||
|
||
The code has been tested with Python 2.7 and the associated libraries for that | ||
version of Python. This requirement applies both to our code and to the | ||
underlying subject systems our code uses in the evaluation. | ||
|
||
The `GenerateFigure1TestSuites.sh` and `GenerateFigure2TestSuites.sh` scripts | ||
require three Python libraries: `matplotlib`, `scikit-learn`, and `numpy`. | ||
There are many ways to install these. For example, if you use | ||
[MacPorts](https://www.macports.org/), you can make sure Python and the three | ||
packages are installed by running: | ||
|
||
``` | ||
port install python27 | ||
port install py27-matplotlib | ||
port install py27-scikit-learn | ||
port install py27-numpy | ||
``` | ||
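
Whichever installation route you use, it can help to confirm that the three libraries are importable before starting a long run. The following is a minimal sketch and not part of the replication package; the helper name `missing_modules` is hypothetical, and the only assumption is that `scikit-learn` is imported under its usual module name `sklearn`.

```python
import importlib

# Import names for the three required libraries. Note that scikit-learn
# is imported as `sklearn`.
REQUIRED_MODULES = ["matplotlib", "sklearn", "numpy"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing Python libraries: " + ", ".join(missing))
    else:
        print("All required Python libraries are importable.")
```

Run the snippet with the same Python interpreter that the shell scripts will use, since a library installed for a different interpreter will not be found.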
|
||
## The results | ||
|
||
This replication package reproduces the results of two experiments. | ||
|
||
Experiment 1 produces the data for Figure 1 in [Fairness Testing: Testing | ||
Software for | ||
Discrimination](http://people.cs.umass.edu/~brun/pubs/pubs/Galhotra17fse.pdf). | ||
This experiment computes the group and causal discrimination scores for a | ||
total of 20 instances of the eight subject software systems. | ||
|
||
Experiment 2 produces data for Figure 2 in [Fairness Testing: Testing | ||
Software for | ||
Discrimination](http://people.cs.umass.edu/~brun/pubs/pubs/Galhotra17fse.pdf). | ||
This experiment computes the sets of sensitive characteristics against which | ||
the 20 subject instances discriminate causally by at least 5%, and that | ||
contribute to subsets of characteristics that are discriminated against by at | ||
least 75%. | ||
|
||
## Reproducing the results | ||
|
||
Reproducing each figure consists of two steps: | ||
1. Using Themis to produce a test suite for each of the 20 instances of the | ||
eight subject systems (this process also executes the test suites). | ||
2. Post-processing the results. | ||
|
||
Step 1 takes a long time to execute for both figures, so the replication | ||
package ships with the produced test suites. It is thus possible to skip step | ||
1 and run step 2 straight away. Step 2 is very fast and produces the | ||
data you see in Figures 1 and 2. | ||
|
||
There are four scripts in the replication package (one for each step for each | ||
of the two figures): | ||
|
||
### `Figure1/GenerateFigure1TestSuites.sh` | ||
|
||
This script produces the necessary test suites for Figure 1. There are | ||
multiple test suites per subject system instance, and 20 subject system | ||
instances. Each instance has to be trained on training data before being | ||
executed. Thus, this script takes a very long time to execute. | ||
|
||
This script populates the `Figure1/Scripts` directory. (Recall that this | ||
directory is already pre-populated with the produced test suites, so it is | ||
possible to skip this script to save time.) | ||
|
||
### `Figure1/GenerateFigure1.sh` | ||
|
||
This script processes the data in the `Figure1/Scripts` directory (which | ||
either ships with the replication package or is generated by | ||
`Figure1/GenerateFigure1TestSuites.sh`) to produce the tabulated data in | ||
Figure 1. | ||
|
||
### `Figure2/GenerateFigure2TestSuites.sh` | ||
|
||
This script produces the necessary test suites for Figure 2. Again, there are | ||
multiple test suites per subject system instance, and 20 subject system | ||
instances, and each instance has to be trained on training data before being | ||
executed. Thus, this script takes a very long time to execute. | ||
|
||
This script populates the `Figure2/Scripts` directory. (Recall that this | ||
directory is already pre-populated with the produced test suites, so it is | ||
possible to skip this script to save time.) | ||
|
||
### `Figure2/GenerateFigure2.sh` | ||
|
||
This script processes the data in the `Figure2/Scripts` directory (which | ||
either ships with the replication package or is generated by | ||
`Figure2/GenerateFigure2TestSuites.sh`) to produce the tabulated data in | ||
Figure 2. | ||
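
Because the test suites ship with the package, the fast path is to run only the two post-processing scripts. The sketch below shows one way to drive them from Python. It is an illustration, not part of the package: the helper name `run_step2` is hypothetical, and it assumes that a `bash` interpreter is available and that the scripts can be invoked from the package root, neither of which the package documents.

```python
import os
import subprocess

# Step 2 scripts named in this README, as paths relative to the package root.
STEP2_SCRIPTS = ["Figure1/GenerateFigure1.sh", "Figure2/GenerateFigure2.sh"]


def run_step2(root="."):
    """Run each step 2 script found under `root`; return their exit codes."""
    exit_codes = {}
    for script in STEP2_SCRIPTS:
        if not os.path.isfile(os.path.join(root, script)):
            print("Skipping missing script: " + script)
            continue
        # Assumption: the scripts work when launched from the package root.
        exit_codes[script] = subprocess.call(["bash", script], cwd=root)
    return exit_codes
```

Calling `run_step2()` from the root of the replication package returns a dictionary that maps each script it ran to its exit code. If a script expects to be launched from inside its own figure directory, adjust the `cwd` argument accordingly.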
|
||
## Note on nondeterminism | ||
|
||
Our replication package goes to great lengths to eliminate sources of | ||
nondeterminism. While Themis uses randomness in its test suite generation, it | ||
uses a seed parameter to make the randomness deterministic. These seeds are | ||
encoded in the above scripts. However, the underlying subject systems also | ||
exhibit nondeterminism. We cannot control this nondeterminism, which is | ||
typical when using real-world, off-the-shelf software. Because Themis is | ||
adaptive and its test suite generation depends on the underlying system's | ||
outputs on the inputs Themis generates, the subject systems' nondeterminism | ||
affects the test suites. As such, running | ||
`Figure1/GenerateFigure1TestSuites.sh` and | ||
`Figure2/GenerateFigure2TestSuites.sh` will produce slightly different test | ||
suites each time. These differences may result in small differences in the | ||
final, processed data. |
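
The interaction between a seeded generator and a nondeterministic subject can be illustrated with a toy example. This sketch is not Themis's algorithm: `generate_suite` and both subject functions are hypothetical stand-ins that only mimic the feedback loop described above, in which each generated input depends on the subject's output for the previous one.

```python
import random


def generate_suite(seed, subject, size=5):
    """Toy adaptive generator: each next input depends on the subject's output.

    This is NOT Themis's algorithm; it only mimics the adaptive feedback loop.
    """
    rng = random.Random(seed)  # seeded, so the generator's own draws repeat
    suite = []
    value = rng.randint(0, 100)
    for _ in range(size):
        suite.append(value)
        step = rng.randint(1, 10)
        # Adaptivity: the subject's output decides the direction of the step.
        value = value + step if subject(value) else value - step
    return suite


def deterministic_subject(x):
    """Hypothetical subject whose output depends only on its input."""
    return x % 2 == 0


def nondeterministic_subject(x):
    """Hypothetical subject whose output varies from run to run."""
    return random.random() < 0.5
```

With `deterministic_subject`, repeated calls to `generate_suite` with the same seed return identical suites. With `nondeterministic_subject`, the same seed can yield different suites, because the subject's outputs steer which inputs are generated next even though the generator's own random draws are fixed.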