Comparing Different Membership Inference Attacks with a Comprehensive Benchmark
Membership inference (MI) attacks threaten user privacy by determining whether a given data example was used to train a target model. However, although new MI attacks have substantially enhanced researchers' understanding of how membership inference can be conducted, we found that a common type of conclusion drawn in these attack proposals can be misleading. We call this finding the Conflicting Comparison Result (CCR) Phenomenon. This paper presents our systematic analysis and interprets the conflicting comparison results it reveals. To conduct the analysis systematically, we developed MIBench, a new benchmark for membership inference. MIBench is unique in that it consists not only of evaluation metrics but also of evaluation scenarios (ESs). We design the ESs from four perspectives: the distance distribution of data samples in the target dataset, the distance between data samples in the target dataset, the differential distance between two datasets (i.e., the target dataset and a generated dataset containing only nonmembers), and the ratio of samples for which an MI attack makes no inference. The evaluation metrics comprise ten typical metrics. We have designed and implemented MIBench with 84 evaluation scenarios for each dataset. In total, we have used the benchmark to fairly and systematically compare 15 state-of-the-art MI attack algorithms across 588 evaluation scenarios, covering 7 widely used datasets and 7 representative types of models. To the best of our knowledge, this paper is the first work to report and analyze the CCR Phenomenon. Our experimental analysis reveals 83 conflicting comparison results and provides strong evidence that the CCR Phenomenon is widespread. All code and evaluations of MIBench are publicly available at this link.
MI attacks:
- NN_attack
- Loss-Threshold
- Label-only
- Top3-NN attack
- Top1-Threshold
- BlindMI-Diff-w
- BlindMI-Diff-w/o
- BlindMI-Diff-1CLASS
- Top2+True
- Privacy Risk Scores
- Shapley Values
- Positive Predictive Value
- Calibrated Score
- Distillation-based Thre.
- Likelihood ratio attack
Datasets: CIFAR100, CIFAR10, CH_MNIST, ImageNet, Location30, Purchase100, Texas100
Models: MLP, StandDNN, VGG16, VGG19, ResNet50, ResNet101, DenseNet121
Requirements: You can run the following script to configure the necessary environment:

```shell
sh ./sh/install.sh
```
Usage: Please first create a folder for records; by default, all experiment results are saved to the record folder. Also create a folder for data to hold the supported datasets. XXX XXX
Attack: This is a demo script for running NN_attack on CIFAR100:

```shell
python ./attack/NN_attack.py --yaml_path ../config/attack/NN/CIFAR100.yaml --dataset CIFAR100 --dataset_path ../data --save_folder_name CIFAR100_0_1
```
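For intuition, the simplest attack in the list above, Loss-Threshold, can be sketched in a few lines: a sample is predicted to be a member if its cross-entropy loss under the target model falls below a threshold. This is a generic illustration, not MIBench's implementation; the function names and the threshold value here are our own.

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Per-sample cross-entropy loss from predicted class probabilities."""
    return -np.log(probs[np.arange(len(labels)), labels] + eps)

def loss_threshold_attack(probs, labels, threshold):
    """Predict membership: 1 (member) if loss < threshold, else 0 (nonmember)."""
    return (cross_entropy(probs, labels) < threshold).astype(int)

# Toy example: two confident predictions (low loss) and one uncertain one.
probs = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.34, 0.33, 0.33]])
labels = np.array([0, 1, 2])
print(loss_threshold_attack(probs, labels, threshold=0.5))  # → [1 1 0]
```

The threshold itself is what distinguishes variants of this attack family; it is typically calibrated on shadow models rather than chosen by hand as above.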
Evaluation Framework:
MIBench is a comprehensive benchmark for comparing different MI attacks; it consists not only of an evaluation-metric module but also of an evaluation-scenario module.
- Part I: Evaluation Scenarios
In this work, we have designed and implemented MIBench with 84 evaluation scenarios for each dataset. In total, we have used the benchmark to fairly and systematically compare 15 state-of-the-art MI attack algorithms across 588 evaluation scenarios, covering 7 widely used datasets and 7 representative types of models.
(a) Evaluation Scenarios of CIFAR100.
(b) Evaluation Scenarios of CIFAR10.
(c) Evaluation Scenarios of CH_MNIST.
(d) Evaluation Scenarios of ImageNet.
(e) Evaluation Scenarios of Location30.
(f) Evaluation Scenarios of Purchase100.
(g) Evaluation Scenarios of Texas100.
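Each evaluation scenario is identified by the four design dimensions: distance distribution, distance between data samples, differential distance, and the ratio of samples given no inference. As a small illustration (the helper below is ours, not part of MIBench's API), a scenario descriptor in the style used in the Results section can be formatted as:

```python
def es_label(dataset, distribution, distance, diff_distance, ratio):
    """Format one evaluation-scenario descriptor in the style used in
    the Results section, e.g. 'CIFAR100_Normal + 2.893 + 0.085 + 20%'."""
    return f"{dataset}_{distribution} + {distance} + {diff_distance} + {ratio}"

print(es_label("CIFAR100", "Normal", 2.893, 0.085, "20%"))
# → CIFAR100_Normal + 2.893 + 0.085 + 20%
```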
- Part II: Evaluation Metrics
We mainly use attacker-side accuracy, precision, recall, f1-score, false positive rate (FPR), false negative rate (FNR), membership advantage (MA), the Area Under the Curve (AUC) of the attack Receiver Operating Characteristic (ROC) curve, TPR @ fixed (low) FPR, and the threshold at maximum MA as our evaluation metrics. The details of the evaluation metrics are as follows.
(a) accuracy: the percentage of data samples whose membership is correctly predicted by an MI attack;
(b) precision: the ratio of real members among all positive membership predictions made by an adversary;
(c) recall: the ratio of members correctly predicted by an adversary among all real members;
(d) f1-score: the harmonic mean of precision and recall;
(e) false positive rate (FPR): the ratio of nonmember samples that are erroneously predicted as members;
(f) false negative rate (FNR): one minus recall (i.e., FNR = 1 - recall);
(g) membership advantage (MA): the difference between the true positive rate and the false positive rate (i.e., MA = TPR - FPR);
(h) Area Under the Curve (AUC): the area under the attack's Receiver Operating Characteristic (ROC) curve;
(i) TPR @ fixed (low) FPR: an attack's true positive rate at a fixed low false positive rate;
(j) threshold at maximum MA: the threshold that achieves the maximum MA.
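Metrics (a)-(g) above can all be derived from the binary confusion matrix, while AUC and TPR @ fixed FPR are computed from the attack's continuous scores instead. The following is a generic NumPy sketch of the confusion-matrix metrics, not MIBench's own evaluation module:

```python
import numpy as np

def mi_metrics(y_true, y_pred):
    """Confusion-matrix metrics for binary membership predictions
    (1 = member, 0 = nonmember)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tpr = tp / (tp + fn)            # recall / true positive rate
    fpr = fp / (fp + tn)            # false positive rate
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": tpr,
        "f1": 2 * precision * tpr / (precision + tpr),
        "FPR": fpr,
        "FNR": 1 - tpr,             # FNR = 1 - recall
        "MA": tpr - fpr,            # membership advantage
    }

# Toy example: one TP, one FN, one FP, one TN.
m = mi_metrics([1, 1, 0, 0], [1, 0, 1, 0])
print(m["accuracy"], m["MA"])
```

For brevity the sketch omits the zero-division guards (e.g., when no positive predictions are made) that a production evaluation module would need.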
Results:
The results section consists of three parts: the results of the 84 evaluation scenarios (ES), the thresholds at maximum MA for the Risk Score and Shapley Values attacks, and the results of 4 research questions (RQ). In Parts I and III, we report the evaluation results of the 15 state-of-the-art MI attacks under ten evaluation metrics (i.e., attacker-side accuracy, precision, recall, f1-score, FPR, FNR, MA, AUC, TPR @ fixed (low) FPR (T@0.01%F and T@0.1%F), and threshold at maximum MA).
- Part I: The Results of 84 Evaluation Scenarios
1. Distillation-based:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
2. Calibrated Score:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Purchase100:
(6) Texas100:
3. Label-only:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
4. NN_attack:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
5. PPV:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
6. Risk score:
(1) CIFAR100:
(2) CH_MNIST:
(3) ImageNet:
(4) Location30:
(5) Purchase100:
(6) Texas100:
7. Shapley values:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
8. Top1_Threshold:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
9. BlindMI-1CLASS:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
10. Top3_NN:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
11. LiRA:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
12. Top2+True:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
13. BlindMI-w:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
14. BlindMI-without:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) Location30:
(5) Purchase100:
(6) Texas100:
15. Loss-Threshold:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
- Part II: The Thresholds at maximum MA
1. Risk score attacks:
2. Shapley values attacks:
- Part III: The Results of 4 Research Questions
(1) CIFAR100:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: CIFAR100_Normal + 2.893 + 0.085 + 20%
ES29: CIFAR100_Uniform + 2.893 + 0.085 + 20%
ES57: CIFAR100_Bernoulli + 2.893 + 0.085 + 20%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES02: CIFAR100_Normal + 2.893 + 0.085 + 40%
ES10: CIFAR100_Normal + 3.813 + 0.085 + 40%
ES22: CIFAR100_Normal + 4.325 + 0.085 + 40%
RQ3: Effect of Differential Distance between Two Datasets
ES03: CIFAR100_Normal + 2.893 + 0.085 + 45%
ES05: CIFAR100_Normal + 2.893 + 0.119 + 45%
ES07: CIFAR100_Normal + 2.893 + 0.157 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES37: CIFAR100_Uniform + 3.813 + 0.085 + 20%
ES38: CIFAR100_Uniform + 3.813 + 0.085 + 40%
ES39: CIFAR100_Uniform + 3.813 + 0.085 + 45%
ES40: CIFAR100_Uniform + 3.813 + 0.085 + 49%
(2) CIFAR10:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES13: CIFAR10_Normal + 2.501 + 0.213 + 20%
ES41: CIFAR10_Uniform + 2.501 + 0.213 + 20%
ES69: CIFAR10_Bernoulli + 2.501 + 0.213 + 20%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES02: CIFAR10_Normal + 1.908 + 0.155 + 40%
ES10: CIFAR10_Normal + 2.501 + 0.155 + 40%
ES22: CIFAR10_Normal + 3.472 + 0.155 + 40%
RQ3: Effect of Differential Distance between Two Datasets
ES51: CIFAR10_Uniform + 3.472 + 0.155 + 45%
ES53: CIFAR10_Uniform + 3.472 + 0.213 + 45%
ES55: CIFAR10_Uniform + 3.472 + 0.291 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES57: CIFAR10_Bernoulli + 1.908 + 0.155 + 20%
ES58: CIFAR10_Bernoulli + 1.908 + 0.155 + 40%
ES59: CIFAR10_Bernoulli + 1.908 + 0.155 + 45%
ES60: CIFAR10_Bernoulli + 1.908 + 0.155 + 49%
(3) CH_MNIST:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES21: CH_MNIST_Normal + 1.720 + 0.083 + 20%
ES49: CH_MNIST_Uniform + 1.720 + 0.083 + 20%
ES77: CH_MNIST_Bernoulli + 1.720 + 0.083 + 20%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES04: CH_MNIST_Uniform + 0.954 + 0.108 + 40%
ES14: CH_MNIST_Uniform + 1.355 + 0.108 + 40%
ES24: CH_MNIST_Uniform + 1.720 + 0.108 + 40%
RQ3: Effect of Differential Distance between Two Datasets
ES03: CH_MNIST_Normal + 0.954 + 0.083 + 45%
ES05: CH_MNIST_Normal + 0.954 + 0.108 + 45%
ES07: CH_MNIST_Normal + 0.954 + 0.133 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES73: CH_MNIST_Bernoulli + 1.355 + 0.133 + 20%
ES74: CH_MNIST_Bernoulli + 1.355 + 0.133 + 40%
ES75: CH_MNIST_Bernoulli + 1.355 + 0.133 + 45%
ES76: CH_MNIST_Bernoulli + 1.355 + 0.133 + 49%
(4) ImageNet:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES02: ImageNet_Normal + 0.934 + 0.046 + 40%
ES30: ImageNet_Uniform + 0.934 + 0.046 + 40%
ES58: ImageNet_Bernoulli + 0.934 + 0.046 + 40%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES34: ImageNet_Uniform + 0.934 + 0.080 + 49%
ES44: ImageNet_Uniform + 1.130 + 0.080 + 49%
ES54: ImageNet_Uniform + 1.388 + 0.080 + 49%
RQ3: Effect of Differential Distance between Two Datasets
ES79: ImageNet_Bernoulli + 1.388 + 0.046 + 45%
ES81: ImageNet_Bernoulli + 1.388 + 0.080 + 45%
ES83: ImageNet_Bernoulli + 1.388 + 0.145 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES13: ImageNet_Normal + 1.130 + 0.080 + 20%
ES14: ImageNet_Normal + 1.130 + 0.080 + 40%
ES15: ImageNet_Normal + 1.130 + 0.080 + 45%
ES16: ImageNet_Normal + 1.130 + 0.080 + 49%
(5) Location30:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: Location30_Normal + 0.570 + 0.041 + 4%
ES29: Location30_Uniform + 0.570 + 0.041 + 4%
ES57: Location30_Bernoulli + 0.570 + 0.041 + 4%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES32: Location30_Uniform + 0.570 + 0.076 + 8%
ES42: Location30_Uniform + 0.724 + 0.076 + 8%
ES52: Location30_Uniform + 0.801 + 0.076 + 8%
RQ3: Effect of Differential Distance between Two Datasets
ES23: Location30_Normal + 0.801 + 0.041 + 12%
ES25: Location30_Normal + 0.801 + 0.076 + 12%
ES27: Location30_Normal + 0.801 + 0.094 + 12%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES73: Location30_Bernoulli + 0.724 + 0.094 + 4%
ES74: Location30_Bernoulli + 0.724 + 0.094 + 8%
ES75: Location30_Bernoulli + 0.724 + 0.094 + 12%
ES76: Location30_Bernoulli + 0.724 + 0.094 + 16%
(6) Purchase100:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: Purchase100_Normal + 0.550 + 0.087 + 2%
ES29: Purchase100_Uniform + 0.550 + 0.087 + 2%
ES57: Purchase100_Bernoulli + 0.550 + 0.087 + 2%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES04: Purchase100_Normal + 0.550 + 0.110 + 4%
ES14: Purchase100_Normal + 0.625 + 0.110 + 4%
ES24: Purchase100_Normal + 0.729 + 0.110 + 4%
RQ3: Effect of Differential Distance between Two Datasets
ES51: Purchase100_Uniform + 0.729 + 0.087 + 10%
ES53: Purchase100_Uniform + 0.729 + 0.110 + 10%
ES55: Purchase100_Uniform + 0.729 + 0.156 + 10%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES65: Purchase100_Bernoulli + 0.625 + 0.087 + 2%
ES66: Purchase100_Bernoulli + 0.625 + 0.087 + 4%
ES67: Purchase100_Bernoulli + 0.625 + 0.087 + 10%
ES68: Purchase100_Bernoulli + 0.625 + 0.087 + 12%
(7) Texas100:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: Texas100_Normal + 0.530 + 0.038 + 2%
ES29: Texas100_Uniform + 0.530 + 0.038 + 2%
ES57: Texas100_Bernoulli + 0.530 + 0.038 + 2%
RQ2: Effect of Distance between Data Samples in the Target Dataset
ES02: Texas100_Normal + 0.530 + 0.038 + 4%
ES10: Texas100_Normal + 0.641 + 0.038 + 4%
ES22: Texas100_Normal + 0.734 + 0.038 + 4%
RQ3: Effect of Differential Distance between Two Datasets
ES51: Texas100_Uniform + 0.734 + 0.038 + 10%
ES53: Texas100_Uniform + 0.734 + 0.073 + 10%
ES55: Texas100_Uniform + 0.734 + 0.107 + 10%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES65: Texas100_Bernoulli + 0.641 + 0.038 + 2%
ES66: Texas100_Bernoulli + 0.641 + 0.038 + 4%
ES67: Texas100_Bernoulli + 0.641 + 0.038 + 10%
ES68: Texas100_Bernoulli + 0.641 + 0.038 + 12%
Additional Evaluation Results