This project contains the source code for the benchmark presented in our paper "A Benchmark of Globally-Optimal Anonymization Methods for Biomedical Data" at the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS 2014).
The source code comprises our benchmarking environment, which is based upon ARX and SUBFRAME. The benchmark currently provides implementations of the following globally-optimal anonymization algorithms:
All 11 reasonable combinations of the following privacy criteria are evaluated in our benchmark:
For licensing reasons no data is contained in this repository. Please contact email@example.com for information on how to obtain the benchmark datasets.
The following figures show key parameters averaged over either the datasets or the privacy criteria. The number of checks indicates an algorithm's pruning power, the number of roll-ups indicates how well an algorithm can be optimized, and the execution times indicate an algorithm's overall performance within the ARX runtime environment.
On a desktop PC with a quad-core 3.1 GHz Intel Core i5 CPU running a 64-bit Linux 3.0.14 kernel and a 64-bit Sun JVM (1.7.0_21), the following results are produced (java -Xmx4G -XX:+UseConcMarkSweepGC -jar anonbench-0.2.jar):
Geometric mean of key parameters over all five benchmark datasets:
Geometric mean of key parameters over all eleven combinations of privacy criteria:
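For readers who want to aggregate their own measurements in the same way, the following is a minimal sketch of a geometric mean in Java. It is not taken from the anonbench sources: the class name GeometricMean, the method of, and the sample values are hypothetical, and the sketch assumes that all measurements are strictly positive. It sums logarithms instead of multiplying the raw values, so large counts do not overflow.

```java
// Hypothetical helper, not part of anonbench or ARX.
final class GeometricMean {

    private GeometricMean() {
        // Utility class, no instances.
    }

    /**
     * Returns the geometric mean of the given values, computed as
     * exp(mean(log(v))). Assumes every value is strictly positive.
     */
    static double of(double[] values) {
        if (values == null || values.length == 0) {
            throw new IllegalArgumentException("At least one value is required");
        }
        double logSum = 0.0;
        for (double v : values) {
            // Rejects zero, negative values and NaN.
            if (!(v > 0.0)) {
                throw new IllegalArgumentException("Values must be strictly positive: " + v);
            }
            logSum += Math.log(v);
        }
        return Math.exp(logSum / values.length);
    }

    public static void main(String[] args) {
        // Made-up numbers of checks for five datasets, for illustration only.
        double[] checksPerDataset = {1200, 450, 98000, 3100, 760};
        System.out.println("Geometric mean of checks: " + of(checksPerDataset));
    }
}
```

Compared with the arithmetic mean, the geometric mean is less dominated by a single very large dataset or a single expensive combination of privacy criteria, which is a common reason to prefer it when measurements span several orders of magnitude.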
Since the publication of the paper, we have updated anonbench 0.1, which was based on ARX 2.0.0, to anonbench 0.2, which is based on ARX 2.3.0. Due to bug fixes and various performance-related changes in ARX, the results of the benchmark have changed slightly in this process. We note, however, that all conclusions drawn from our original results are still valid, and we strongly recommend using the latest version of anonbench.