MIBench.github.io

Comparing Different Membership Inference Attacks with a Comprehensive Benchmark

Membership inference attacks pose a significant threat to user privacy in machine learning systems. While numerous attack mechanisms have been proposed in the literature, the lack of standardized evaluation parameters and metrics has led to inconsistent and even conflicting comparison results. To address this issue and facilitate a systematic analysis of these disparate findings, we introduce MIBench, a comprehensive benchmark for membership inference attacks. MIBench includes a suite of carefully designed evaluation scenarios and evaluation metrics to provide a consistent framework for assessing the efficacy of various membership inference techniques. The evaluation scenarios are crafted to encompass four critical factors: intra-dataset distance distribution, inter-sample distance within the target dataset, differential distance analysis, and inference withholding ratio. In total, MIBench includes ten typical evaluation metrics and incorporates 84 distinct evaluation scenarios for each dataset. Using this framework, we conducted a thorough comparative analysis of 15 state-of-the-art membership inference attack algorithms across 588 evaluation scenarios, 7 widely adopted datasets, and 7 representative model architectures. Our analysis revealed 83 instances of Conflicting Comparison Results (CCR), providing substantial evidence for the CCR Phenomenon. We identified two CCR types: Type 1 (single-factor) and Type 2 (dual-factor). The distribution of CCR instances across the four critical factors was: inter-sample distance (40.96%), differential distance (37.35%), inference withholding ratio (19.28%), and intra-dataset distance (2.41%). All code and evaluation results of MIBench are publicly available at the following link (footnote 1).
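The four percentages above are consistent with whole-number counts out of the 83 CCR instances. The short check below recovers those counts; note that the counts themselves (34, 31, 16, 2) are inferred from the reported percentages and are not stated in this README, and `implied_counts` is a hypothetical helper written for this illustration.

```python
# Counts are inferred from the reported percentages; they are an assumption,
# not figures taken from the MIBench paper or repository.
TOTAL_CCR = 83
reported_pct = {
    "inter-sample distance": 40.96,
    "differential distance": 37.35,
    "inference withholding ratio": 19.28,
    "intra-dataset distance": 2.41,
}


def implied_counts(total, percentages):
    """Round each percentage of `total` to the nearest whole instance."""
    return {name: round(total * pct / 100) for name, pct in percentages.items()}


counts = implied_counts(TOTAL_CCR, reported_pct)
for name, n in counts.items():
    # Recompute the percentage from the whole-number count for comparison.
    print(f"{name}: {n} of {TOTAL_CCR} ({100 * n / TOTAL_CCR:.2f}%)")
print("sum of counts:", sum(counts.values()))
```

The recomputed percentages match the reported ones to two decimal places, and the inferred counts add up to 83.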

MI attacks:

  • NN_attack
  • Loss-Threshold
  • Label-only
  • Top3-NN attack
  • Top1-Threshold
  • BlindMI-Diff-w
  • BlindMI-Diff-w/o
  • BlindMI-Diff-1CLASS
  • Top2+True
  • Privacy Risk Scores
  • Shapley Values
  • Positive Predictive Value
  • Calibrated Score
  • Distillation-based Thre.
  • Likelihood ratio attack

Datasets: CIFAR100, CIFAR10, CH_MNIST, ImageNet, Location30, Purchase100, Texas100

Models: MLP, StandDNN, VGG16, VGG19, ResNet50, ResNet101, DenseNet121

Requirements: Run the following script to configure the necessary environment:

    sh ./sh/install.sh

Usage: First create a folder named record; by default, all experiment results are saved to the record folder. Then create a folder named data to hold the supported datasets. XXX XXX
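As a minimal sketch of this setup step, the two folders can also be created from Python. The folder names `record` and `data` come from the text above; their location (the directory passed as `root`) is an assumption, and `make_workspace` is a hypothetical helper, not part of MIBench.

```python
from pathlib import Path


def make_workspace(root="."):
    """Create the record/ and data/ folders under `root` if they are missing."""
    root = Path(root)
    created = []
    for name in ("record", "data"):
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing folder.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


if __name__ == "__main__":
    for folder in make_workspace():
        print(f"ready: {folder}")
```

Running it twice is harmless, since existing folders are left untouched.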

Attack: The following demo command runs NN_attack on CIFAR100:

    python ./attack/NN_attack.py --yaml_path ../config/attack/NN/CIFAR100.yaml --dataset CIFAR100 --dataset_path ../data --save_folder_name CIFAR100_0_1

Selected attacks: [figure: selected attacks]

Evaluation Framework:
MIBench is a comprehensive benchmark for comparing different MI attacks. It consists of two modules: an evaluation scenario module and an evaluation metric module.

  • Part I: Evaluation Scenarios

In this work, we designed and implemented the MIBench benchmark with 84 evaluation scenarios for each dataset. Across the 7 widely used datasets, this gives 84 × 7 = 588 evaluation scenarios, over which we fairly and systematically compare 15 state-of-the-art MI attack algorithms on 7 representative types of models.

(a) Evaluation Scenarios of CIFAR100. [figure]

(b) Evaluation Scenarios of CIFAR10. [figure]

(c) Evaluation Scenarios of CH_MNIST. [figure]

(d) Evaluation Scenarios of ImageNet. [figure]

(e) Evaluation Scenarios of Location30. [figure]

(f) Evaluation Scenarios of Purchase100. [figure]

(g) Evaluation Scenarios of Texas100. [figure]

  • Part II: Evaluation Metrics

    We use ten evaluation metrics: attacker-side accuracy, precision, recall, f1-score, false positive rate (FPR), false negative rate (FNR), membership advantage (MA), the Area Under the Curve (AUC) of the attack Receiver Operating Characteristic (ROC) curve, TPR @ fixed (low) FPR, and threshold at maximum MA. The metrics are defined as follows.

(a) accuracy: the percentage of data samples whose membership is predicted correctly by the MI attack;
(b) precision: the fraction of true members among all samples that the adversary predicts to be members;
(c) recall: the fraction of true members that the adversary correctly predicts to be members;
(d) f1-score: the harmonic mean of precision and recall;
(e) false positive rate (FPR): the fraction of nonmember samples that are erroneously predicted to be members;
(f) false negative rate (FNR): one minus recall (i.e., FNR = 1 - recall);
(g) membership advantage (MA): the difference between the true positive rate and the false positive rate (i.e., MA = TPR - FPR);
(h) Area Under the Curve (AUC): the area under the attack's Receiver Operating Characteristic (ROC) curve;
(i) TPR @ fixed (low) FPR: the attack's true positive rate at a fixed, low false positive rate;
(j) threshold at maximum MA: the decision threshold at which MA is maximized.
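The definitions above can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not the code MIBench uses: the function name `mi_attack_metrics`, its parameters, and the convention that a higher score means "more likely a member" are all assumptions, and both members and nonmembers must be present in the input.

```python
import numpy as np


def mi_attack_metrics(is_member, scores, threshold=0.5, fixed_fpr=0.001):
    """Compute the ten metrics (a)-(j) for one attack (illustrative sketch).

    is_member: ground truth, 1 for members and 0 for nonmembers.
    scores:    attack scores; assumed higher = more likely a member.
    """
    y = np.asarray(is_member).astype(bool)
    s = np.asarray(scores, dtype=float)
    pred = s >= threshold  # binary membership prediction

    tp = int(np.sum(pred & y))
    fp = int(np.sum(pred & ~y))
    fn = int(np.sum(~pred & y))
    tn = int(np.sum(~pred & ~y))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0

    # ROC curve: use every distinct score as a threshold, from high to low.
    thresholds = np.unique(s)[::-1]
    tpr_curve = np.array([np.mean(s[y] >= t) for t in thresholds])
    fpr_curve = np.array([np.mean(s[~y] >= t) for t in thresholds])
    roc_tpr = np.concatenate(([0.0], tpr_curve))  # start the curve at (0, 0)
    roc_fpr = np.concatenate(([0.0], fpr_curve))

    # Trapezoidal rule for the area under the ROC curve.
    auc = float(np.sum(np.diff(roc_fpr) * (roc_tpr[1:] + roc_tpr[:-1]) / 2))
    # Best TPR among operating points whose FPR does not exceed fixed_fpr.
    tpr_at_fpr = float(roc_tpr[roc_fpr <= fixed_fpr].max())
    # Threshold whose operating point maximizes MA = TPR - FPR.
    best = int(np.argmax(tpr_curve - fpr_curve))

    return {
        "accuracy": (tp + tn) / y.size,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "fpr": fpr,
        "fnr": 1.0 - recall,
        "ma": recall - fpr,
        "auc": auc,
        "tpr_at_fixed_fpr": tpr_at_fpr,
        "threshold_at_max_ma": float(thresholds[best]),
    }


if __name__ == "__main__":
    labels = [1, 1, 1, 1, 0, 0, 0, 0]
    attack_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
    for name, value in mi_attack_metrics(labels, attack_scores).items():
        print(f"{name}: {value:.4f}")
```

Metrics (a) to (g) depend on the chosen decision threshold, while (h) to (j) are computed from the full score distribution; `fixed_fpr=0.001` corresponds to the T@0.1%F setting.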

Results:

The results section consists of three parts: the results of the 84 evaluation scenarios (ES), the thresholds at maximum MA for the Risk score and Shapley values attacks, and the results for the 4 research questions (RQ). In Part I and Part III, we report the evaluation results of the 15 state-of-the-art MI attacks on the ten evaluation metrics (attacker-side accuracy, precision, recall, f1-score, FPR, FNR, MA, AUC, TPR @ fixed (low) FPR (T@0.01%F and T@0.1%F), and threshold at maximum MA).

  • Part I: The Results of 84 Evaluation Scenarios

1. Distillation-based:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 05-06]
(3) CH_MNIST: [result figures 07-08]
(4) ImageNet: [result figures 09-10]
(5) Location30: [result figures 11-12]
(6) Purchase100: [result figures 13-14]
(7) Texas100: [result figures 15-16]

2. Calibrated Score:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Purchase100: [result figures 09-10]
(6) Texas100: [result figures 11-12]

3. Label-only:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

4. NN_attack:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

5. PPV:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

6. Risk score:

(1) CIFAR100: [result figures 01-02]
(2) CH_MNIST: [result figures 03-04]
(3) ImageNet: [result figures 05-06]
(4) Location30: [result figures 07-08]
(5) Purchase100: [result figures 09-10]
(6) Texas100: [result figures 11-12]

7. Shapley values:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

8. Top1_Threshold:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

9. BlindMI-1CLASS:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

10. Top3_NN:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

11. LiRA:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

12. Top2+True:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

13. BlindMI-w:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

14. BlindMI-without:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) Location30: [result figures 07-08]
(5) Purchase100: [result figures 09-10]
(6) Texas100: [result figures 11-12]

15. Loss-Threshold:

(1) CIFAR100: [result figures 01-02]
(2) CIFAR10: [result figures 03-04]
(3) CH_MNIST: [result figures 05-06]
(4) ImageNet: [result figures 07-08]
(5) Location30: [result figures 09-10]
(6) Purchase100: [result figures 11-12]
(7) Texas100: [result figures 13-14]

  • Part II: The Thresholds at maximum MA

1. Risk score attacks:

(1) CIFAR100: [per-class threshold figure 01]

(2) CH_MNIST: [per-class threshold figures 02-03]

(3) ImageNet: [per-class threshold figure 04]

(4) Location30: [per-class threshold figure 05]

(5) Purchase100: [per-class threshold figure 06]

(6) Texas100: [per-class threshold figure 07]

2. Shapley values attacks:

(1) CIFAR100: [per-class threshold figure 01]

(2) CIFAR10: [per-class threshold figures 02-03]

(3) CH_MNIST: [per-class threshold figures 04-05]

(4) ImageNet: [per-class threshold figure 06]

(5) Location30: [per-class threshold figure 07]

(6) Purchase100: [per-class threshold figure 08]

(7) Texas100: [per-class threshold figure 09]

  • Part III: The Results of 4 Research Questions

(1) CIFAR100:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: CIFAR100_Normal + 2.893 + 0.085 + 20%
ES29: CIFAR100_Uniform + 2.893 + 0.085 + 20%
ES57: CIFAR100_Bernoulli + 2.893 + 0.085 + 20%
[figure: CIFAR100_RQ1]

[figures: CIFAR100_N_2.893_d1_20%, CIFAR100_U_2.893_d1_20%, CIFAR100_B_2.893_d1_20%]

RQ2: Effect of Distance between data samples of the Target Dataset

ES02: CIFAR100_Normal + 2.893 + 0.085 + 40%
ES10: CIFAR100_Normal + 3.813 + 0.085 + 40%
ES22: CIFAR100_Normal + 4.325 + 0.085 + 40%
[figure: CIFAR100_RQ2]

RQ3: Effect of Differential Distances between two datasets

ES03: CIFAR100_Normal + 2.893 + 0.085 + 45%
ES05: CIFAR100_Normal + 2.893 + 0.119 + 45%
ES07: CIFAR100_Normal + 2.893 + 0.157 + 45%
[figure: CIFAR100_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES37: CIFAR100_Uniform + 3.813 + 0.085 + 20%
ES38: CIFAR100_Uniform + 3.813 + 0.085 + 40%
ES39: CIFAR100_Uniform + 3.813 + 0.085 + 45%
ES40: CIFAR100_Uniform + 3.813 + 0.085 + 49%
[figure: CIFAR100_RQ4]

(2) CIFAR10:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES13: CIFAR10_Normal + 2.501 + 0.213 + 20%
ES41: CIFAR10_Uniform + 2.501 + 0.213 + 20%
ES69: CIFAR10_Bernoulli + 2.501 + 0.213 + 20%
[figure: CIFAR10_RQ1]

RQ2: Effect of Distance between data samples of the Target Dataset

ES02: CIFAR10_Normal + 1.908 + 0.155 + 40%
ES10: CIFAR10_Normal + 2.501 + 0.155 + 40%
ES22: CIFAR10_Normal + 3.472 + 0.155 + 40%
[figure: CIFAR10_RQ2]

[figures: CIFAR10_N_1.908_d1_40%, CIFAR10_N_2.501_d1_40%, CIFAR10_N_3.472_d1_40%]

RQ3: Effect of Differential Distances between two datasets

ES51: CIFAR10_Uniform + 3.472 + 0.155 + 45%
ES53: CIFAR10_Uniform + 3.472 + 0.213 + 45%
ES55: CIFAR10_Uniform + 3.472 + 0.291 + 45%
[figure: CIFAR10_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES57: CIFAR10_Bernoulli + 1.908 + 0.155 + 20%
ES58: CIFAR10_Bernoulli + 1.908 + 0.155 + 40%
ES59: CIFAR10_Bernoulli + 1.908 + 0.155 + 45%
ES60: CIFAR10_Bernoulli + 1.908 + 0.155 + 49%
[figure: CIFAR10_RQ4]

(3) CH_MNIST:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES21: CH_MNIST_Normal + 1.720 + 0.083 + 20%
ES49: CH_MNIST_Uniform + 1.720 + 0.083 + 20%
ES77: CH_MNIST_Bernoulli + 1.720 + 0.083 + 20%
[figure: CH_MNIST_RQ1]

RQ2: Effect of Distance between data samples of the Target Dataset

ES04: CH_MNIST_Uniform + 0.954 + 0.108 + 40%
ES14: CH_MNIST_Uniform + 1.355 + 0.108 + 40%
ES24: CH_MNIST_Uniform + 1.720 + 0.108 + 40%
[figure: CH_MNIST_RQ2]

RQ3: Effect of Differential Distances between two datasets

ES03: CH_MNIST_Normal + 0.954 + 0.083 + 45%
ES05: CH_MNIST_Normal + 0.954 + 0.108 + 45%
ES07: CH_MNIST_Normal + 0.954 + 0.133 + 45%
[figure: CH_MNIST_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES73: CH_MNIST_Bernoulli + 1.355 + 0.133 + 20%
ES74: CH_MNIST_Bernoulli + 1.355 + 0.133 + 40%
ES75: CH_MNIST_Bernoulli + 1.355 + 0.133 + 45%
ES76: CH_MNIST_Bernoulli + 1.355 + 0.133 + 49%
[figure: CH_MNIST_RQ4]

(4) ImageNet:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES02: ImageNet_Normal + 0.934 + 0.046 + 40%
ES30: ImageNet_Uniform + 0.934 + 0.046 + 40%
ES58: ImageNet_Bernoulli + 0.934 + 0.046 + 40%
[figure: ImageNet_RQ1]

RQ2: Effect of Distance between data samples of the Target Dataset

ES34: ImageNet_Uniform + 0.934 + 0.080 + 49%
ES44: ImageNet_Uniform + 1.130 + 0.080 + 49%
ES54: ImageNet_Uniform + 1.388 + 0.080 + 49%
[figure: ImageNet_RQ2]

RQ3: Effect of Differential Distances between two datasets

ES79: ImageNet_Bernoulli + 1.388 + 0.046 + 45%
ES81: ImageNet_Bernoulli + 1.388 + 0.080 + 45%
ES83: ImageNet_Bernoulli + 1.388 + 0.145 + 45%
[figure: ImageNet_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES13: ImageNet_Normal + 1.130 + 0.080 + 20%
ES14: ImageNet_Normal + 1.130 + 0.080 + 40%
ES15: ImageNet_Normal + 1.130 + 0.080 + 45%
ES16: ImageNet_Normal + 1.130 + 0.080 + 49%
[figure: ImageNet_RQ4]

(5) Location30:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: Location30_Normal + 0.570 + 0.041 + 4%
ES29: Location30_Uniform + 0.570 + 0.041 + 4%
ES57: Location30_Bernoulli + 0.570 + 0.041 + 4%
[figure: Location30_RQ1]

RQ2: Effect of Distance between data samples of the Target Dataset

ES32: Location30_Uniform + 0.570 + 0.076 + 8%
ES42: Location30_Uniform + 0.724 + 0.076 + 8%
ES52: Location30_Uniform + 0.801 + 0.076 + 8%
[figure: Location30_RQ2]

RQ3: Effect of Differential Distances between two datasets

ES23: Location30_Normal + 0.801 + 0.041 + 12%
ES25: Location30_Normal + 0.801 + 0.076 + 12%
ES27: Location30_Normal + 0.801 + 0.094 + 12%
[figure: Location30_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES73: Location30_Bernoulli + 0.724 + 0.094 + 4%
ES74: Location30_Bernoulli + 0.724 + 0.094 + 8%
ES75: Location30_Bernoulli + 0.724 + 0.094 + 12%
ES76: Location30_Bernoulli + 0.724 + 0.094 + 16%
[figure: Location30_RQ4]

(6) Purchase100:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: Purchase100_Normal + 0.550 + 0.087 + 2%
ES29: Purchase100_Uniform + 0.550 + 0.087 + 2%
ES57: Purchase100_Bernoulli + 0.550 + 0.087 + 2%
[figure: Purchase100_RQ1]

RQ2: Effect of Distance between data samples of the Target Dataset

ES04: Purchase100_Normal + 0.550 + 0.110 + 4%
ES14: Purchase100_Normal + 0.625 + 0.110 + 4%
ES24: Purchase100_Normal + 0.729 + 0.110 + 4%
[figure: Purchase100_RQ2]

RQ3: Effect of Differential Distances between two datasets

ES51: Purchase100_Uniform + 0.729 + 0.087 + 10%
ES53: Purchase100_Uniform + 0.729 + 0.110 + 10%
ES55: Purchase100_Uniform + 0.729 + 0.156 + 10%
[figure: Purchase100_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES65: Purchase100_Bernoulli + 0.625 + 0.087 + 2%
ES66: Purchase100_Bernoulli + 0.625 + 0.087 + 4%
ES67: Purchase100_Bernoulli + 0.625 + 0.087 + 10%
ES68: Purchase100_Bernoulli + 0.625 + 0.087 + 12%
[figure: Purchase100_RQ4]

(7) Texas100:

RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset

ES01: Texas100_Normal + 0.530 + 0.038 + 2%
ES29: Texas100_Uniform + 0.530 + 0.038 + 2%
ES57: Texas100_Bernoulli + 0.530 + 0.038 + 2%
[figure: Texas100_RQ1]

RQ2: Effect of Distance between data samples of the Target Dataset

ES02: Texas100_Normal + 0.530 + 0.038 + 4%
ES10: Texas100_Normal + 0.641 + 0.038 + 4%
ES22: Texas100_Normal + 0.734 + 0.038 + 4%
[figure: Texas100_RQ2]

RQ3: Effect of Differential Distances between two datasets

ES51: Texas100_Uniform + 0.734 + 0.038 + 10%
ES53: Texas100_Uniform + 0.734 + 0.073 + 10%
ES55: Texas100_Uniform + 0.734 + 0.107 + 10%
[figure: Texas100_RQ3]

RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference

ES65: Texas100_Bernoulli + 0.641 + 0.038 + 2%
ES66: Texas100_Bernoulli + 0.641 + 0.038 + 4%
ES67: Texas100_Bernoulli + 0.641 + 0.038 + 10%
ES68: Texas100_Bernoulli + 0.641 + 0.038 + 12%
[figure: Texas100_RQ4]
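The scenario labels in Part III all follow one pattern, `<dataset>_<distribution> + <value> + <value> + <percent>%`. The sketch below parses such a label into a structured record and reports which factor varies within a group of scenarios. The field names are assumptions inferred from which value each research question varies (RQ1 the distribution, RQ2 the second field, RQ3 the third, RQ4 the percentage); `Scenario`, `parse_scenario`, and `varying_fields` are hypothetical names, not part of MIBench.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    dataset: str
    distribution: str             # Normal, Uniform, or Bernoulli (varied in RQ1)
    sample_distance: float        # distance between target-dataset samples (RQ2)
    differential_distance: float  # differential distance between datasets (RQ3)
    withheld_ratio: float         # fraction of samples given no inference (RQ4)


# Tolerates uneven spacing around "+", as in "1.720 +0.083".
_LABEL = re.compile(
    r"^(?P<dataset>.+)_(?P<dist>Normal|Uniform|Bernoulli)"
    r"\s*\+\s*(?P<d>[\d.]+)\s*\+\s*(?P<dd>[\d.]+)\s*\+\s*(?P<ratio>[\d.]+)%$"
)


def parse_scenario(label):
    """Parse a label such as 'CIFAR100_Normal + 2.893 + 0.085 + 20%'."""
    m = _LABEL.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognized scenario label: {label!r}")
    return Scenario(
        dataset=m["dataset"],
        distribution=m["dist"],
        sample_distance=float(m["d"]),
        differential_distance=float(m["dd"]),
        withheld_ratio=float(m["ratio"]) / 100,
    )


def varying_fields(scenarios):
    """Return the names of the factors that differ across the given scenarios."""
    names = ("distribution", "sample_distance",
             "differential_distance", "withheld_ratio")
    return [n for n in names if len({getattr(s, n) for s in scenarios}) > 1]


if __name__ == "__main__":
    rq1 = [parse_scenario(text) for text in (
        "CIFAR100_Normal + 2.893 + 0.085 + 20%",
        "CIFAR100_Uniform + 2.893 + 0.085 + 20%",
        "CIFAR100_Bernoulli + 2.893 + 0.085 + 20%",
    )]
    print(varying_fields(rq1))
```

Applied to the three CIFAR100 RQ1 scenarios, only `distribution` varies, which matches the single-factor design of each research question.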

Additional Evaluation Results
