MIBench.github.io

Comparing Different Membership Inference Attacks with a Comprehensive Benchmark

Membership inference (MI) attacks threaten user privacy by determining whether a given data example was used to train a target model. Although new MI attacks have substantially improved researchers' understanding of how membership inference can be conducted, we found that a common conclusion drawn in these attack proposals can be misleading. We call this finding the Conflicting Comparison Result (CCR) Phenomenon. This paper presents our systematic analysis and interprets the conflicting comparison results that it produced. To conduct the analysis systematically, we developed MIBench, a new benchmark for membership inference. MIBench is unique in that it consists not only of evaluation metrics but also of evaluation scenarios (ESs). We design the ESs from four perspectives: the distance distribution of data samples in the target dataset, the distance between data samples in the target dataset, the differential distance between two datasets (i.e., the target dataset and a generated dataset containing only nonmembers), and the ratio of samples for which an MI attack makes no inference. The evaluation metrics comprise ten typical metrics. We have designed and implemented MIBench with 84 evaluation scenarios for each dataset. In total, we used the benchmark to fairly and systematically compare 15 state-of-the-art MI attack algorithms across 588 evaluation scenarios, which cover 7 widely used datasets and 7 representative types of models. To the best of our knowledge, this is the first work to report and analyze the CCR Phenomenon. Our experimental analysis reveals 83 conflicting comparison results, and we have gained strong evidence that the CCR Phenomenon is widespread. All code and evaluations of MIBench are publicly available at this link.

MI attacks:

  • NN_attack
  • Loss-Threshold
  • Label-only
  • Top3-NN attack
  • Top1-Threshold
  • BlindMI-Diff-w
  • BlindMI-Diff-w/o
  • BlindMI-Diff-1CLASS
  • Top2+True
  • Privacy Risk Scores
  • Shapley Values
  • Positive Predictive Value
  • Calibrated Score
  • Distillation-based Thre.
  • Likelihood ratio attack

Datasets: CIFAR100, CIFAR10, CH_MNIST, ImageNet, Location30, Purchase100, Texas100

Models: MLP, StandDNN, VGG16, VGG19, ResNet50, ResNet101, DenseNet121

Requirements: Run the following script to configure the necessary environment:
sh ./sh/install.sh

Usage: First create a folder named record; by default, all experiment results are saved to it. Then create a folder named data to hold the supported datasets.
XXX XXX

Attack: The following demo command runs NN_attack on CIFAR100:
python ./attack/NN_attack.py --yaml_path ../config/attack/NN/CIFAR100.yaml --dataset CIFAR100 --dataset_path ../data --save_folder_name CIFAR100_0_1
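The setup and demo steps can also be scripted. The sketch below is only an illustration: the helper names `prepare_folders` and `demo_command` are hypothetical and not part of MIBench, and the flags are copied verbatim from the demo command in this README. It assumes the command is launched from the directory that the relative paths in the demo expect.

```python
from pathlib import Path


def prepare_folders(base="."):
    """Create the record/ and data/ folders described in the usage note.

    Hypothetical helper: the README only says that both folders must exist.
    """
    base = Path(base)
    record, data = base / "record", base / "data"
    for folder in (record, data):
        folder.mkdir(parents=True, exist_ok=True)
    return record, data


def demo_command():
    """Return the README's NN_attack demo command as an argument list."""
    return [
        "python", "./attack/NN_attack.py",
        "--yaml_path", "../config/attack/NN/CIFAR100.yaml",
        "--dataset", "CIFAR100",
        "--dataset_path", "../data",
        "--save_folder_name", "CIFAR100_0_1",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(demo_command()))
```

To actually launch the attack, the list can be passed to `subprocess.run(demo_command(), check=True)` after calling `prepare_folders()`.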

Selected attacks: [Figure: selected attacks]

Evaluation Framework:
MIBench is a comprehensive benchmark for comparing different MI attacks. It consists not only of an evaluation metric module but also of an evaluation scenario module.

  • Part I: Evaluation Scenarios

In this work, we have designed and implemented the MIBench benchmark with 84 evaluation scenarios for each dataset. In total, we have used our benchmark to fairly and systematically compare 15 state-of-the-art MI attack algorithms across 588 evaluation scenarios, and these evaluation scenarios cover 7 widely used datasets and 7 representative types of models.

(a) Evaluation Scenarios of CIFAR100. [Figure: CIFAR100]

(b) Evaluation Scenarios of CIFAR10. [Figure: CIFAR10]

(c) Evaluation Scenarios of CH_MNIST. [Figure: CH_MNIST]

(d) Evaluation Scenarios of ImageNet. [Figure: ImageNet]

(e) Evaluation Scenarios of Location30. [Figure: Location30]

(f) Evaluation Scenarios of Purchase100. [Figure: Purchase100]

(g) Evaluation Scenarios of Texas100. [Figure: Texas100]

  • Part II: Evaluation Metrics

    We mainly use the following evaluation metrics: attacker-side accuracy, precision, recall, f1-score, false positive rate (FPR), false negative rate (FNR), membership advantage (MA), the Area Under the Curve (AUC) of the attack Receiver Operating Characteristic (ROC) curve, TPR @ fixed (low) FPR, and threshold at maximum MA. The metrics are defined as follows.

(a) accuracy: the percentage of data samples whose membership is predicted correctly by an MI attack;
(b) precision: the fraction of true members among all samples that the adversary predicts to be members;
(c) recall: the fraction of true members that the adversary predicts to be members;
(d) f1-score: the harmonic mean of precision and recall;
(e) false positive rate (FPR): the fraction of nonmember samples that are erroneously predicted to be members;
(f) false negative rate (FNR): one minus recall (i.e., FNR = 1 - recall);
(g) membership advantage (MA): the difference between the true positive rate and the false positive rate (i.e., MA = TPR - FPR);
(h) Area Under the Curve (AUC): the area under the attack's Receiver Operating Characteristic (ROC) curve;
(i) TPR @ fixed (low) FPR: an attack's true-positive rate at a fixed, low false-positive rate;
(j) threshold at maximum MA: the decision threshold at which MA is maximized.
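As a minimal sketch of definitions (a) through (g), the following Python function computes these metrics from binary membership labels and binary attack predictions. The function name `mi_metrics`, the convention that 1 denotes a member, and the choice to return 0.0 when a denominator is zero are assumptions made for illustration; this is not the MIBench implementation.

```python
def mi_metrics(y_true, y_pred):
    """Compute metrics (a)-(g) from 0/1 labels and 0/1 predictions (1 = member)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # members predicted as members
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # nonmembers predicted as nonmembers
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # nonmembers predicted as members
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # members predicted as nonmembers

    def ratio(num, den):
        # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # recall equals the true positive rate (TPR)
    fpr = ratio(fp, fp + tn)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": fpr,
        "fnr": ratio(fn, fn + tp),  # equals 1 - recall when members exist
        "ma": recall - fpr,         # membership advantage: TPR - FPR
    }


if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    preds = [1, 1, 1, 0, 1, 0, 0, 0]
    for name, value in mi_metrics(labels, preds).items():
        print(f"{name}: {value:.3f}")
```

Metrics (h) through (j) depend on a continuous attack score rather than a single binary prediction, so they are not covered by this function.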

Results:

The results section consists of three parts: the results of the 84 evaluation scenarios (ESs), the thresholds at maximum MA for the Risk score and Shapley values attacks, and the results of the 4 research questions (RQs). In Part I and Part III, we report the evaluation results of the 15 state-of-the-art MI attacks under the ten evaluation metrics (i.e., attacker-side accuracy, precision, recall, f1-score, FPR, FNR, MA, AUC, TPR @ fixed (low) FPR (T@0.01%F and T@0.1%F), and threshold at maximum MA).
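The score-based metrics (AUC, TPR @ fixed FPR, and threshold at maximum MA) can be illustrated with a small pure-Python sketch. It assumes that each sample has a real-valued attack score and that a sample is predicted to be a member when its score is at or above the threshold. The function names are hypothetical, and taking the best TPR among thresholds whose FPR does not exceed the target is one possible convention, not necessarily the one MIBench uses.

```python
def roc_points(y_true, scores):
    """Return (threshold, fpr, tpr) for each distinct score, highest threshold first."""
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one member and one nonmember")
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= thr)
        points.append((thr, fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule, starting at (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, tpr
    return area


def tpr_at_fpr(points, max_fpr):
    """Best TPR among thresholds whose FPR does not exceed max_fpr."""
    return max((tpr for _, fpr, tpr in points if fpr <= max_fpr), default=0.0)


def threshold_at_max_ma(points):
    """Return (threshold, MA) where MA = TPR - FPR is largest."""
    thr, fpr, tpr = max(points, key=lambda p: p[2] - p[1])
    return thr, tpr - fpr


if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    pts = roc_points(labels, scores)
    print("AUC:", auc(pts))
    print("TPR @ 0.1% FPR:", tpr_at_fpr(pts, 0.001))
    print("threshold at max MA:", threshold_at_max_ma(pts))
```

With T@0.01%F and T@0.1%F, `max_fpr` would be 0.0001 and 0.001 respectively. On small evaluation sets such low FPR targets only admit thresholds with zero false positives.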

  • Part I: The Results of 84 Evaluation Scenarios

1. Distillation-based:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 05-06]
(3) CH_MNIST: [result figures 07-08]
(4) ImageNet: [result figures 09-10]
(5) Location30: [result figures 11-12]
(6) Purchase100: [result figures 13-14]
(7) Texas100: [result figures 15-16]

2. Calibrated Score:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Purchase100: [result figures 09-10]
(6) Texas100: [result figures 11-12]

3. Label-only:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

4. NN_attack:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

5. PPV:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

6. Risk score:

(1) CIFAR100: [result figures 01-02]
(2) CH_MNIST: [result figures 03-04]
(3) ImageNet: [result figures 05-06]
(4) Location30: [result figures 07-08]
(5) Purchase100: [result figures 09-10]
(6) Texas100: [result figures 11-12]

7. Shapley values:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

8. Top1_Threshold:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

9. BlindMI-1CLASS:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

10. Top3_NN:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

11. LiRA:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

12. Top2+True:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

13. BlindMI-w:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

14. BlindMI-without:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) Location30: [result figures 07-08]
(5) Purchase100: [result figures 09-10]
(6) Texas100: [result figures 11-12]

15. Loss-Threshold:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

  • Part II: The Thresholds at maximum MA

1. Risk score attacks:

(1) CIFAR100: [per-class threshold figure 01]

(2) CH_MNIST: [per-class threshold figures 02-03]

(3) ImageNet: [per-class threshold figure 04]

(4) Location30: [per-class threshold figure 05]

(5) Purchase100: [per-class threshold figure 06]

(6) Texas100: [per-class threshold figure 07]

2. Shapley values attacks:

(1) CIFAR100: [per-class threshold figure 01]

(2) CIFAR10: [per-class threshold figures 02-03]

(3) CH_MNIST: [per-class threshold figures 04-05]

(4) ImageNet: [per-class threshold figure 06]

(5) Location30: [per-class threshold figure 07]

(6) Purchase100: [per-class threshold figure 08]

(7) Texas100: [per-class threshold figure 09]

  • Part III: The Results of 4 Research Questions

(1) CIFAR100:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: CIFAR100_Normal + 2.893 + 0.085 + 20%
ES29: CIFAR100_Uniform + 2.893 + 0.085 + 20%
ES57: CIFAR100_Bernoulli + 2.893 + 0.085 + 20%
[Figure: CIFAR100_RQ1]

[Figures: CIFAR100_N_2.893_d1_20%, CIFAR100_U_2.893_d1_20%, CIFAR100_B_2.893_d1_20%]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES02: CIFAR100_Normal + 2.893 + 0.085 + 40%
ES10: CIFAR100_Normal + 3.813 + 0.085 + 40%
ES22: CIFAR100_Normal + 4.325 + 0.085 + 40%
[Figure: CIFAR100_RQ2]

RQ3: Effect of Differential Distance between Two Datasets

ES03: CIFAR100_Normal + 2.893 + 0.085 + 45%
ES05: CIFAR100_Normal + 2.893 + 0.119 + 45%
ES07: CIFAR100_Normal + 2.893 + 0.157 + 45%
[Figure: CIFAR100_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES37: CIFAR100_Uniform + 3.813 + 0.085 + 20%
ES38: CIFAR100_Uniform + 3.813 + 0.085 + 40%
ES39: CIFAR100_Uniform + 3.813 + 0.085 + 45%
ES40: CIFAR100_Uniform + 3.813 + 0.085 + 49%
[Figure: CIFAR100_RQ4]

(2) CIFAR10:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES13: CIFAR10_Normal + 2.501 + 0.213 + 20%
ES41: CIFAR10_Uniform + 2.501 + 0.213 + 20%
ES69: CIFAR10_Bernoulli + 2.501 + 0.213 + 20%
[Figure: CIFAR10_RQ1]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES02: CIFAR10_Normal + 1.908 + 0.155 + 40%
ES10: CIFAR10_Normal + 2.501 + 0.155 + 40%
ES22: CIFAR10_Normal + 3.472 + 0.155 + 40%
[Figure: CIFAR10_RQ2]

[Figures: CIFAR10_N_1.908_d1_40%, CIFAR10_N_2.501_d1_40%, CIFAR10_N_3.472_d1_40%]

RQ3: Effect of Differential Distance between Two Datasets

ES51: CIFAR10_Uniform + 3.472 + 0.155 + 45%
ES53: CIFAR10_Uniform + 3.472 + 0.213 + 45%
ES55: CIFAR10_Uniform + 3.472 + 0.291 + 45%
[Figure: CIFAR10_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES57: CIFAR10_Bernoulli + 1.908 + 0.155 + 20%
ES58: CIFAR10_Bernoulli + 1.908 + 0.155 + 40%
ES59: CIFAR10_Bernoulli + 1.908 + 0.155 + 45%
ES60: CIFAR10_Bernoulli + 1.908 + 0.155 + 49%
[Figure: CIFAR10_RQ4]

(3) CH_MNIST:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES21: CH_MNIST_Normal + 1.720 + 0.083 + 20%
ES49: CH_MNIST_Uniform + 1.720 + 0.083 + 20%
ES77: CH_MNIST_Bernoulli + 1.720 + 0.083 + 20%
[Figure: CH_MNIST_RQ1]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES04: CH_MNIST_Uniform + 0.954 + 0.108 + 40%
ES14: CH_MNIST_Uniform + 1.355 + 0.108 + 40%
ES24: CH_MNIST_Uniform + 1.720 + 0.108 + 40%
[Figure: CH_MNIST_RQ2]

RQ3: Effect of Differential Distance between Two Datasets

ES03: CH_MNIST_Normal + 0.954 + 0.083 + 45%
ES05: CH_MNIST_Normal + 0.954 + 0.108 + 45%
ES07: CH_MNIST_Normal + 0.954 + 0.133 + 45%
[Figure: CH_MNIST_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES73: CH_MNIST_Bernoulli + 1.355 + 0.133 + 20%
ES74: CH_MNIST_Bernoulli + 1.355 + 0.133 + 40%
ES75: CH_MNIST_Bernoulli + 1.355 + 0.133 + 45%
ES76: CH_MNIST_Bernoulli + 1.355 + 0.133 + 49%
[Figure: CH_MNIST_RQ4]

(4) ImageNet:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES02: ImageNet_Normal + 0.934 + 0.046 + 40%
ES30: ImageNet_Uniform + 0.934 + 0.046 + 40%
ES58: ImageNet_Bernoulli + 0.934 + 0.046 + 40%
[Figure: ImageNet_RQ1]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES34: ImageNet_Uniform + 0.934 + 0.080 + 49%
ES44: ImageNet_Uniform + 1.130 + 0.080 + 49%
ES54: ImageNet_Uniform + 1.388 + 0.080 + 49%
[Figure: ImageNet_RQ2]

RQ3: Effect of Differential Distance between Two Datasets

ES79: ImageNet_Bernoulli + 1.388 + 0.046 + 45%
ES81: ImageNet_Bernoulli + 1.388 + 0.080 + 45%
ES83: ImageNet_Bernoulli + 1.388 + 0.145 + 45%
[Figure: ImageNet_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES13: ImageNet_Normal + 1.130 + 0.080 + 20%
ES14: ImageNet_Normal + 1.130 + 0.080 + 40%
ES15: ImageNet_Normal + 1.130 + 0.080 + 45%
ES16: ImageNet_Normal + 1.130 + 0.080 + 49%
[Figure: ImageNet_RQ4]

(5) Location30:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: Location30_Normal + 0.570 + 0.041 + 4%
ES29: Location30_Uniform + 0.570 + 0.041 + 4%
ES57: Location30_Bernoulli + 0.570 + 0.041 + 4%
[Figure: Location30_RQ1]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES32: Location30_Uniform + 0.570 + 0.076 + 8%
ES42: Location30_Uniform + 0.724 + 0.076 + 8%
ES52: Location30_Uniform + 0.801 + 0.076 + 8%
[Figure: Location30_RQ2]

RQ3: Effect of Differential Distance between Two Datasets

ES23: Location30_Normal + 0.801 + 0.041 + 12%
ES25: Location30_Normal + 0.801 + 0.076 + 12%
ES27: Location30_Normal + 0.801 + 0.094 + 12%
[Figure: Location30_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES73: Location30_Bernoulli + 0.724 + 0.094 + 4%
ES74: Location30_Bernoulli + 0.724 + 0.094 + 8%
ES75: Location30_Bernoulli + 0.724 + 0.094 + 12%
ES76: Location30_Bernoulli + 0.724 + 0.094 + 16%
[Figure: Location30_RQ4]

(6) Purchase100:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: Purchase100_Normal + 0.550 + 0.087 + 2%
ES29: Purchase100_Uniform + 0.550 + 0.087 + 2%
ES57: Purchase100_Bernoulli + 0.550 + 0.087 + 2%
[Figure: Purchase100_RQ1]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES04: Purchase100_Normal + 0.550 + 0.110 + 4%
ES14: Purchase100_Normal + 0.625 + 0.110 + 4%
ES24: Purchase100_Normal + 0.729 + 0.110 + 4%
[Figure: Purchase100_RQ2]

RQ3: Effect of Differential Distance between Two Datasets

ES51: Purchase100_Uniform + 0.729 + 0.087 + 10%
ES53: Purchase100_Uniform + 0.729 + 0.110 + 10%
ES55: Purchase100_Uniform + 0.729 + 0.156 + 10%
[Figure: Purchase100_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES65: Purchase100_Bernoulli + 0.625 + 0.087 + 2%
ES66: Purchase100_Bernoulli + 0.625 + 0.087 + 4%
ES67: Purchase100_Bernoulli + 0.625 + 0.087 + 10%
ES68: Purchase100_Bernoulli + 0.625 + 0.087 + 12%
[Figure: Purchase100_RQ4]

(7) Texas100:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: Texas100_Normal + 0.530 + 0.038 + 2%
ES29: Texas100_Uniform + 0.530 + 0.038 + 2%
ES57: Texas100_Bernoulli + 0.530 + 0.038 + 2%
[Figure: Texas100_RQ1]

RQ2: Effect of Distance between Data Samples in the Target Dataset

ES02: Texas100_Normal + 0.530 + 0.038 + 4%
ES10: Texas100_Normal + 0.641 + 0.038 + 4%
ES22: Texas100_Normal + 0.734 + 0.038 + 4%
[Figure: Texas100_RQ2]

RQ3: Effect of Differential Distance between Two Datasets

ES51: Texas100_Uniform + 0.734 + 0.038 + 10%
ES53: Texas100_Uniform + 0.734 + 0.073 + 10%
ES55: Texas100_Uniform + 0.734 + 0.107 + 10%
[Figure: Texas100_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES65: Texas100_Bernoulli + 0.641 + 0.038 + 2%
ES66: Texas100_Bernoulli + 0.641 + 0.038 + 4%
ES67: Texas100_Bernoulli + 0.641 + 0.038 + 10%
ES68: Texas100_Bernoulli + 0.641 + 0.038 + 12%
[Figure: Texas100_RQ4]
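
Each scenario label in Part III appears to encode four settings that match the four RQs: the distance distribution (Normal, Uniform, or Bernoulli), the distance between data samples, the differential distance between the two datasets, and the ratio of samples for which no inference is made. The sketch below parses a label under that reading. The field names, the `Scenario` type, and the function `parse_scenario` are hypothetical and chosen for illustration only; the interpretation of the fields is inferred from the RQ headings rather than stated by the README.

```python
import re
from typing import NamedTuple


class Scenario(NamedTuple):
    es_id: int                    # scenario number, e.g. 1 for "ES01"
    dataset: str                  # e.g. "CIFAR100" or "CH_MNIST"
    distribution: str             # "Normal", "Uniform", or "Bernoulli"
    sample_distance: float        # distance between data samples (assumed meaning)
    differential_distance: float  # differential distance between datasets (assumed meaning)
    no_inference_ratio: float     # ratio of samples with no inference, as a fraction


# Tolerates the irregular spacing seen in some labels (e.g. "ES49 :" or "+0.083").
_PATTERN = re.compile(
    r"^ES(\d+)\s*:\s*(\w+)_(Normal|Uniform|Bernoulli)"
    r"\s*\+\s*([\d.]+)\s*\+\s*([\d.]+)\s*\+\s*([\d.]+)%$"
)


def parse_scenario(line):
    """Parse one 'ESnn: Dataset_Distribution + d + dd + r%' label."""
    match = _PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not a scenario label: {line!r}")
    es_id, dataset, dist, distance, diff, ratio = match.groups()
    return Scenario(int(es_id), dataset, dist,
                    float(distance), float(diff), float(ratio) / 100)


if __name__ == "__main__":
    print(parse_scenario("ES01: CIFAR100_Normal + 2.893 + 0.085 + 20%"))
```

Parsing the labels this way makes it easy to check which single field varies within each RQ group, for example that only `distribution` changes across the three RQ1 scenarios of a dataset.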

Additional Evaluation Results