# Dataset Usage Cardinality Inference Demo
**How much of a given dataset was used to train a machine learning model?** As AI continues to advance, this question becomes increasingly critical. Under Section 107 of the U.S. Copyright Act, determining whether a use constitutes fair use or copyright infringement requires weighing factors that include the "_nature of the copyrighted work_" and the "_amount and substantiality of the portion used in relation to the copyrighted work as a whole_."

Dataset Usage Cardinality Inference (DUCI) enables data owners to estimate the proportion of a dataset that was used, so they can assess the risk of unauthorized usage and protect their rights. DUCI achieves this through a debiasing process that aggregates individual Membership Inference Attack (MIA) guesses into an accurate dataset-level estimate. More details are discussed in the [DUCI document](../documentation/duci.md).

## Problem Overview
The Dataset Usage Cardinality Inference (DUCI) algorithm acts as an agent for the dataset owner, who has full access to a target dataset. It aims to estimate the proportion of the target dataset used in training a victim model, given black-box access to the model and knowledge of the training algorithm (e.g., the population data and model architecture).

<img src="documentation/images/duci_problem.png" alt="Problem Illustration" title="Simple DUCI Pipeline" width="600">

## Method

To estimate the proportion of a target dataset being used, the Dataset Usage Cardinality Inference (DUCI) algorithm first debiases the membership predictions $\hat{m}_i$ provided by any Membership Inference Attack (MIA) method to obtain the probability of each data record being used, using the following formula:

$\hat{p}_i = \frac{\hat{m}_i - P(\hat{m}_i = 1 \mid m_i = 0)}{P(\hat{m}_i = 1 \mid m_i = 1) - P(\hat{m}_i = 1 \mid m_i = 0)}$.

Here $m_i \in \{0, 1\}$ is the true membership of record $i$, so $P(\hat{m}_i = 1 \mid m_i = 1)$ and $P(\hat{m}_i = 1 \mid m_i = 0)$ are the true-positive and false-positive rates of the MIA.

After debiasing, DUCI aggregates the unbiased probability estimators over the entire dataset to compute the overall proportion:

$\hat{p} = \frac{1}{|X|} \sum_{i=1}^{|X|} \hat{p}_i,$

where $|X|$ is the size of the target dataset.
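
The two formulas above can be sketched in a few lines of NumPy. This is a minimal illustration, not the library's implementation: the function name `debias_and_aggregate` and the toy TPR/FPR values are assumptions made for the example.

```python
import numpy as np

def debias_and_aggregate(m_hat, tpr, fpr):
    """Debias binary membership guesses and average them into a proportion estimate.

    Hypothetical helper for illustration; tpr and fpr stand for
    P(m_hat = 1 | m = 1) and P(m_hat = 1 | m = 0).
    """
    m_hat = np.asarray(m_hat, dtype=float)
    if tpr <= fpr:
        raise ValueError("TPR must exceed FPR for the debiasing to be well defined")
    p_i = (m_hat - fpr) / (tpr - fpr)  # per-record debiased estimate
    return float(p_i.mean())           # average over the target dataset

# Toy example: 10 records, 4 of them flagged as members by a hypothetical MIA
m_hat = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p_hat = debias_and_aggregate(m_hat, tpr=0.9, fpr=0.1)
print(p_hat)
```

Note that an individual $\hat{p}_i$ can fall outside $[0, 1]$ (here each one is either $-0.125$ or $1.125$); only the average is meant to be read as a proportion.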

## Set up the Colab environment

In [1]:
# # Clone the github repo
# !git clone https://github.com/privacytrustlab/ml_privacy_meter.git

# # Update the Colab environment
# !pip install datasets==2.21.0 transformers==4.44.2 torch==2.4.1 torchvision==0.19.1 torchaudio

In [2]:
# # Change the directory to the cloned repo
# import sys
# sys.path.append('/content/ml_privacy_meter')

# %cd ml_privacy_meter

In [3]:
# !pip install numpy torch  # Install NumPy and PyTorch if not already installed

In [4]:
from torch.utils.data import Subset
import logging
import numpy as np
import random
import time


# Set up the logger
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger()

In [5]:
from privacy_meter.dataset import get_dataset
from privacy_meter.models.utils import train_models, load_models
from privacy_meter.get_signals import get_model_signals
from privacy_meter.modules.mia import MIA
from privacy_meter.modules.duci import DUCI

2025-02-12 15:42:35,531 - INFO - PyTorch version 2.4.0 available.
2025-02-12 15:42:36,825 - DEBUG - matplotlib data path: /home/yao/.conda/envs/pytorch/lib/python3.8/site-packages/matplotlib/mpl-data
2025-02-12 15:42:36,829 - DEBUG - CONFIGDIR=/home/yao/.config/matplotlib
2025-02-12 15:42:36,830 - DEBUG - interactive is False
2025-02-12 15:42:36,831 - DEBUG - platform is linux
2025-02-12 15:42:37,021 - DEBUG - CACHEDIR=/home/yao/.cache/matplotlib
2025-02-12 15:42:37,024 - DEBUG - Using fontManager instance from /home/yao/.cache/matplotlib/fontlist-v330.json


## Prepare dataset
As the dataset owner, we have a target dataset $X$ and access to a population pool. For simplicity, assume the population pool is the CIFAR-10 dataset, and we sample a subset $X$ of size 500 from this pool.

In [6]:
# Set Configs
_dataset = 'cifar10' # cifar10 as the population pool
dataset_dir = '../data'
log_dir = 'demo_duci'
configs = {
    'run': {
        'random_seed': 12345,
        'log_dir': 'demo_duci',
        'time_log': True,
        'num_experiments': 1
    },
    'audit': {
        'privacy_game': 'privacy_loss_model',
        'algorithm': 'RMIA',
        'num_ref_models': 1,
        'device': 'cuda:0',
        'report_log': 'report_rmia',
        'batch_size': 5000
    },
    'train': {
        'model_name': 'wrn28-2',
        'device': 'cuda:0',
        'batch_size': 256,
        'optimizer': 'SGD',
        'learning_rate': 0.1,
        'weight_decay': 0,
        'epochs': 100
    },
    'data': {
        'dataset': 'cifar10',
        'data_dir': 'data'
    }
}



In [7]:
dataset, population = get_dataset(_dataset, dataset_dir, logger)

# Select 500 points from the dataset as the target dataset
all_indices = list(range(len(dataset)))
random.shuffle(all_indices)
target_indices = all_indices[:500]
remaining_indices = all_indices[500:]
TRAIN_SIZE = len(dataset) // 2

2025-02-12 15:42:37,267 - INFO - Data loaded from data/cifar10.pkl
2025-02-12 15:42:37,296 - INFO - Population data loaded from data/cifar10_population.pkl
2025-02-12 15:42:37,297 - INFO - The whole dataset size: 50000


## Set up the victim model
Suppose a victim model is trained on a randomly selected $p$ proportion of our dataset $X$. Our goal is to infer the value of $p$.

In [8]:
proportions = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
p = random.choice(proportions)

In [9]:
# Randomly select a proportion p of the target dataset
selected_indices = random.sample(target_indices, int(p * len(target_indices)))
remaining_size = TRAIN_SIZE - len(selected_indices)
selected_remaining_indices = random.sample(remaining_indices, remaining_size)
selected_victim_indices = selected_indices + selected_remaining_indices

# select all unselected indices in all_indices as the test indices
test_indices = list(set(all_indices) - set(selected_indices) - set(selected_remaining_indices))
target_data_split = {
    'train': selected_victim_indices,
    'test': test_indices
}
target_membership = np.zeros(len(dataset))
target_membership[selected_victim_indices] = 1

## Train reference models

In the **Privacy Meter** library, $2N$ reference models are trained by default, ensuring that each data point is included in one model's training set and excluded from another. We first explore dataset usage inference using two reference models before moving to the special case of single-reference models (by adapting the MIA implementations in the library).
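
A paired split, where every point is in exactly one of the two reference models' training sets, might look like the sketch below. The helper `paired_splits` is hypothetical and not part of Privacy Meter. The following cell keeps the complementary construction only as commented-out lines and instead draws each reference model's half of the target dataset independently.

```python
import random

def paired_splits(indices, seed=0):
    """Split indices into two complementary halves, one per paired reference model.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Every one of the 500 target points lands in exactly one of the two halves
in_first, in_second = paired_splits(range(500), seed=12345)
print(len(in_first), len(in_second))
```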

In [10]:
# Randomly select half of the target dataset
ref_selected_indices = random.sample(target_indices, int(0.5 * len(target_indices)))
ref_remaining_size = TRAIN_SIZE - len(ref_selected_indices)
ref_selected_remaining_indices = random.sample(remaining_indices, ref_remaining_size)
ref_selected_victim_indices = ref_selected_indices + ref_selected_remaining_indices

# select all unselected indices in all_indices as the test indices
ref_test_indices = list(set(all_indices) - set(ref_selected_indices) - set(ref_selected_remaining_indices))
ref_data_split = {
    'train': ref_selected_victim_indices,
    'test': ref_test_indices
}
ref_membership = np.zeros(len(dataset))
ref_membership[ref_selected_victim_indices] = 1

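# Get the second (paired) reference model.
# The commented-out lines would use the exact complement of the first reference split;
# the active lines below draw an independent random half instead.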
# ref_paired_selected_indices = list(set(target_indices)-set(ref_selected_indices))
# ref_paired_remaining_size = TRAIN_SIZE - len(ref_paired_selected_indices)
# ref_paired_selected_remaining_indices = random.sample(remaining_indices, ref_paired_remaining_size)
# ref_paired_selected_victim_indices = ref_paired_selected_indices + ref_paired_selected_remaining_indices
ref_paired_selected_indices = random.sample(target_indices, int(0.5 * len(target_indices)))
ref_paired_remaining_size = TRAIN_SIZE - len(ref_paired_selected_indices)
ref_paired_selected_remaining_indices = random.sample(remaining_indices, ref_paired_remaining_size)
ref_paired_selected_victim_indices = ref_paired_selected_indices + ref_paired_selected_remaining_indices

# select all unselected indices in all_indices as the test indices
ref_paired_test_indices = list(set(all_indices) - set(ref_paired_selected_indices) - set(ref_paired_selected_remaining_indices))
ref_paired_data_split = {
    'train': ref_paired_selected_victim_indices,
    'test': ref_paired_test_indices
}
ref_paired_membership = np.zeros(len(dataset))
ref_paired_membership[ref_paired_selected_victim_indices] = 1


In [11]:
data_splits = [target_data_split, ref_data_split, ref_paired_data_split]
memberships = np.array([target_membership, ref_membership, ref_paired_membership]) # size: 2N+1 * len(dataset)
models_list = train_models(
    log_dir, dataset, data_splits, memberships, configs, logger
)

2025-02-12 15:42:37,449 - INFO - Training 3 models
2025-02-12 15:42:37,452 - INFO - --------------------------------------------------
2025-02-12 15:42:37,453 - INFO - Training model 0: Train size 25000, Test size 25000


Using optimizer: SGD | Learning Rate: 0.1 | Weight Decay: 0
Epoch [1/100] | Train Loss: 2.2693 | Train Acc: 0.1566
Test Loss: 2.1526 | Test Acc: 0.2058
Epoch 1 took 5.23 seconds
Epoch [2/100] | Train Loss: 1.9806 | Train Acc: 0.2658
Test Loss: 1.8610 | Test Acc: 0.2988
Epoch 2 took 4.51 seconds
Epoch [3/100] | Train Loss: 1.7937 | Train Acc: 0.3288
Test Loss: 1.7606 | Test Acc: 0.3361
Epoch 3 took 4.50 seconds
Epoch [4/100] | Train Loss: 1.6903 | Train Acc: 0.3722
Test Loss: 1.6698 | Test Acc: 0.3763
Epoch 4 took 4.48 seconds
Epoch [5/100] | Train Loss: 1.6038 | Train Acc: 0.4046
Test Loss: 1.6139 | Test Acc: 0.3992
Epoch 5 took 4.50 seconds
Epoch [6/100] | Train Loss: 1.5172 | Train Acc: 0.4366
Test Loss: 1.5782 | Test Acc: 0.4183
Epoch 6 took 4.52 seconds
Epoch [7/100] | Train Loss: 1.4409 | Train Acc: 0.4716
Test Loss: 1.5245 | Test Acc: 0.4196
Epoch 7 took 4.48 seconds
Epoch [8/100] | Train Loss: 1.3672 | Train Acc: 0.5030
Test Loss: 1.4208 | Test Acc: 0.4804
Epoch 8 took 4.48 seco

2025-02-12 15:50:13,039 - INFO - Train accuracy 1.0, Train Loss 0.0030063338630015447
2025-02-12 15:50:13,041 - INFO - Test accuracy 0.61392, Test Loss 1.5615280221919625
2025-02-12 15:50:13,056 - INFO - Training model 0 took 455.60379457473755 seconds
2025-02-12 15:50:13,073 - INFO - --------------------------------------------------
2025-02-12 15:50:13,074 - INFO - Training model 1: Train size 25000, Test size 25000


Using optimizer: SGD | Learning Rate: 0.1 | Weight Decay: 0
Epoch [1/100] | Train Loss: 2.2376 | Train Acc: 0.1516
Test Loss: 2.0878 | Test Acc: 0.2406
Epoch 1 took 4.74 seconds
Epoch [2/100] | Train Loss: 1.9509 | Train Acc: 0.2836
Test Loss: 1.8576 | Test Acc: 0.3239
Epoch 2 took 4.52 seconds
Epoch [3/100] | Train Loss: 1.7912 | Train Acc: 0.3412
Test Loss: 1.7769 | Test Acc: 0.3447
Epoch 3 took 4.51 seconds
Epoch [4/100] | Train Loss: 1.6763 | Train Acc: 0.3779
Test Loss: 1.6298 | Test Acc: 0.4056
Epoch 4 took 4.51 seconds
Epoch [5/100] | Train Loss: 1.5652 | Train Acc: 0.4259
Test Loss: 1.5369 | Test Acc: 0.4376
Epoch 5 took 4.53 seconds
Epoch [6/100] | Train Loss: 1.4787 | Train Acc: 0.4556
Test Loss: 1.4494 | Test Acc: 0.4660
Epoch 6 took 4.51 seconds
Epoch [7/100] | Train Loss: 1.3972 | Train Acc: 0.4909
Test Loss: 1.4280 | Test Acc: 0.4776
Epoch 7 took 4.52 seconds
Epoch [8/100] | Train Loss: 1.3210 | Train Acc: 0.5192
Test Loss: 1.5315 | Test Acc: 0.4263
Epoch 8 took 4.55 seco

2025-02-12 15:57:47,811 - INFO - Train accuracy 1.0, Train Loss 0.003254153979562071
2025-02-12 15:57:47,812 - INFO - Test accuracy 0.62428, Test Loss 1.5147887894085474
2025-02-12 15:57:47,837 - INFO - Training model 1 took 454.7633502483368 seconds
2025-02-12 15:57:47,857 - INFO - --------------------------------------------------
2025-02-12 15:57:47,858 - INFO - Training model 2: Train size 25000, Test size 25000


Using optimizer: SGD | Learning Rate: 0.1 | Weight Decay: 0
Epoch [1/100] | Train Loss: 2.2637 | Train Acc: 0.1442
Test Loss: 2.1554 | Test Acc: 0.2340
Epoch 1 took 4.78 seconds
Epoch [2/100] | Train Loss: 2.0124 | Train Acc: 0.2791
Test Loss: 1.8895 | Test Acc: 0.3175
Epoch 2 took 4.53 seconds
Epoch [3/100] | Train Loss: 1.8021 | Train Acc: 0.3520
Test Loss: 1.7632 | Test Acc: 0.3630
Epoch 3 took 4.52 seconds
Epoch [4/100] | Train Loss: 1.6473 | Train Acc: 0.4060
Test Loss: 1.5997 | Test Acc: 0.4174
Epoch 4 took 4.51 seconds
Epoch [5/100] | Train Loss: 1.5365 | Train Acc: 0.4423
Test Loss: 1.6156 | Test Acc: 0.4161
Epoch 5 took 4.51 seconds
Epoch [6/100] | Train Loss: 1.4412 | Train Acc: 0.4810
Test Loss: 1.4512 | Test Acc: 0.4682
Epoch 6 took 4.53 seconds
Epoch [7/100] | Train Loss: 1.3641 | Train Acc: 0.5070
Test Loss: 1.5189 | Test Acc: 0.4428
Epoch 7 took 4.51 seconds
Epoch [8/100] | Train Loss: 1.2945 | Train Acc: 0.5332
Test Loss: 1.7329 | Test Acc: 0.4188
Epoch 8 took 4.51 seco

2025-02-12 16:05:22,631 - INFO - Train accuracy 1.0, Train Loss 0.004078457265027932
2025-02-12 16:05:22,632 - INFO - Test accuracy 0.62208, Test Loss 1.5799390868264802
2025-02-12 16:05:22,650 - INFO - Training model 2 took 454.7932333946228 seconds


In [12]:
models_list, memberships = load_models(log_dir, dataset, 3, configs, logger)
target_membership, ref_membership, ref_paired_membership = memberships

2025-02-12 16:05:22,775 - INFO - Loading model 0
  return torch.load(io.BytesIO(b))
2025-02-12 16:05:22,836 - INFO - Loading model 1
2025-02-12 16:05:22,877 - INFO - Loading model 2


## Inferring the proportion $p$ of the dataset used
We have imported the DUCI module using `from privacy_meter.modules.duci import DUCI`. In DUCI, the standard MIA is executed first, so we need to set up the MIA before proceeding.
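
To turn MIA scores into binary guesses, some threshold has to be chosen, and the logs later in this notebook report a "Best threshold" selected to maximize TPR - FPR. One way such a selection could be done on a reference model with known membership is sketched below. The function `best_threshold`, the `>=` decision rule, and the toy data are assumptions for illustration, not the library's code.

```python
import numpy as np

def best_threshold(scores, membership):
    """Return (threshold, tpr, fpr) maximizing TPR - FPR over the observed score values.

    Hypothetical helper; a sample is predicted as a member when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    membership = np.asarray(membership, dtype=bool)
    best = (None, 0.0, 0.0, -np.inf)
    for t in np.unique(scores):          # candidate thresholds in ascending order
        pred = scores >= t
        tpr = pred[membership].mean()    # fraction of true members flagged
        fpr = pred[~membership].mean()   # fraction of non-members flagged
        if tpr - fpr > best[3]:
            best = (float(t), float(tpr), float(fpr), tpr - fpr)
    return best[:3]

# Toy scores from a reference model whose training membership is known
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
membership = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
threshold, tpr, fpr = best_threshold(scores, membership)
print(threshold, tpr, fpr)
```

The resulting TPR and FPR are exactly the two conditional probabilities needed by the debiasing formula in the Method section.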

In [13]:
# Sample the population dataset used in MIA
population = Subset(
    population,
    np.random.choice(
        len(population),
        configs["audit"].get("population_size", len(population)),
        replace=False,
    ),
)

### Query the victim model and the reference model to generate signals (softmax outputs)

In [None]:
baseline_time = time.time()
auditing_dataset = Subset(dataset, target_indices)
auditing_membership = np.array([target_membership[target_indices], ref_membership[target_indices], ref_paired_membership[target_indices]]).astype(bool)  # shape before transpose: num_models (3) x len(auditing_dataset)
signals = get_model_signals(models_list, auditing_dataset, configs, logger) # num_samples * num_models
auditing_membership = auditing_membership.T
assert signals.shape == auditing_membership.shape, f"signals or auditing_membership has incorrect shape (num_samples * num_models): {signals.shape} vs {auditing_membership.shape}"
population_signals = get_model_signals(
    models_list, population, configs, logger, is_population=True
)
logger.info("Preparing signals took %0.5f seconds", time.time() - baseline_time)

2025-02-12 16:05:23,187 - INFO - Computing signals for all models.
Computing softmax: 100%|██████████| 1/1 [00:00<00:00, 20.68it/s]
Computing softmax: 100%|██████████| 1/1 [00:00<00:00, 52.46it/s]
Computing softmax: 100%|██████████| 1/1 [00:00<00:00, 52.20it/s]
2025-02-12 16:05:23,324 - INFO - Signals saved to disk.
2025-02-12 16:05:24,387 - INFO - Computing signals for all models.
Computing softmax: 100%|██████████| 2/2 [00:00<00:00,  5.55it/s]
Computing softmax: 100%|██████████| 2/2 [00:00<00:00,  5.55it/s]
Computing softmax: 100%|██████████| 2/2 [00:00<00:00,  5.55it/s]
2025-02-12 16:05:25,529 - INFO - Signals saved to disk.
2025-02-12 16:05:25,542 - INFO - Preparing signals took 2.60461 seconds


In [15]:
auditing_membership[:, 0].mean(), auditing_membership[:, 1].mean(), auditing_membership[:, 2].mean()

(0.2, 0.5, 0.5)

### Perform DUCI

In [16]:
baseline_time = time.time()
target_model_idx = 0
ref_model_indices = [1, 2]

logger.info(f"Initiate DUCI for target models: {target_model_idx}")

args = {
    "attack": "RMIA",
    "dataset": configs["data"]["dataset"], # TODO: have DUCI config
    "model": configs["train"]["model_name"],
    "offline_a": 0.2 # If set to None, an extra reference model is required to tune the offline_a
}
# Initialize MIA instance
MIA_instance = MIA(logger)
DUCI_instance = DUCI(MIA_instance, logger, args)

logger.info("Collecting membership prediction for each sample in the target dataset on target models and reference models.")
logger.info("Predicting the proportion of dataset usage on target models.")

duci_preds, true_proportions, errors = DUCI_instance.pred_proportions(
    [target_model_idx], 
    [ref_model_indices], 
    signals,
    population_signals,
    auditing_membership,
)

logger.info(
    "DUCI %0.1f seconds", time.time() - baseline_time
)
logger.info(f"Average prediction errors: {np.mean(errors)}")
logger.info(f"All prediction errors: {errors}")
logger.info(f"Prediction details: DUCI predictions: {duci_preds}, True proportions: {true_proportions}")



2025-02-12 16:05:25,573 - INFO - Initiate DUCI for target models: 0
2025-02-12 16:05:25,575 - INFO - Collecting membership prediction for each sample in the target dataset on target models and reference models.
2025-02-12 16:05:25,576 - INFO - Predicting the proportion of dataset usage on target models.
2025-02-12 16:05:25,577 - INFO - Args for MIA attack: {'attack': 'RMIA', 'dataset': 'cifar10', 'model': 'wrn28-2', 'offline_a': 0.2}
2025-02-12 16:05:25,579 - INFO - Running RMIA attack on target model 0 with offline_a=0.2
2025-02-12 16:05:25,599 - INFO - Collect membership prediction for target dataset on target model 0 costs 0.0 seconds
2025-02-12 16:05:25,600 - INFO - Args for MIA attack: {'attack': 'RMIA', 'dataset': 'cifar10', 'model': 'wrn28-2', 'offline_a': 0.2}
2025-02-12 16:05:25,601 - INFO - Running RMIA attack on target model 1 with offline_a=0.2
2025-02-12 16:05:25,614 - INFO - Args for MIA attack: {'attack': 'RMIA', 'dataset': 'cifar10', 'model': 'wrn28-2', 'offline_a': 0.2

### Check the prediction and $p$

The ground-truth proportion $p$ is:

In [17]:
p

0.2

Our prediction is:

In [18]:
duci_preds[0]

0.2419354838709677

## Inferring the Proportion $p$ of a Dataset Used with a *Single* Reference Model  

In our implementation using **Privacy Meter**, we follow the commonly used **half-half data split setting** from the MIA literature. Under this setting, using $N$ reference models for an offline attack means each target sample has $N$ reference models trained without it. However, due to the half-half split, a total of $2N$ models are trained. This implies that the **minimal number of trained models is 2**.  

In **dataset usage inference**, we aim to infer dataset usage for the **entire dataset** (with $|X|$ samples) while using only a **single reference model**, rather than training separate offline models for all data points. The key question is: **can we achieve dataset usage inference without ensuring every data point has a corresponding offline model?**  

### Adaptations to Achieve Single Reference Model Inference  

1. **Denominator Approximation Using In-Model Estimates**  
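   - In offline **RMIA**, the denominator is approximated from out-models through the linear relation $P(x|\theta_{\text{in}}) \approx aP(x|\theta_{\text{out}}) + 1 - a$.  
   - We extend this idea by inverting the relation, so the out-model term can be approximated **using in-model estimates**: $P(x|\theta_{\text{out}}) \approx (P(x|\theta_{\text{in}}) + a - 1)/a$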
   - With a half-half split, for samples included in the **single reference model** training, we estimate probabilities using the **in-world** model. For the remaining samples, we follow **RMIA** and approximate the denominator using **out-world** models.

2. **Eliminating the Denominator to Transition to a Population Attack**  
   - With a **single reference model**, the denominators in the RMIA scores (i.e., $P(x)$ and $P(z)$) **no longer approximate probabilities**, but instead act as a **linear transformation** of the numerator.  
   - Given this, we **drop the denominator** for both $x$ and $z$, transforming the method into a **population attack**.

This adaptation allows us to infer dataset usage **without requiring an extensive set of reference models**, while still leveraging the insights from RMIA.
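
The first adaptation can be checked numerically with a small sketch. It assumes the offline relation has the form $P(x|\theta_{\text{in}}) \approx aP(x|\theta_{\text{out}}) + 1 - a$, which is the form whose inverse reproduces the in-model expression $(P(x|\theta_{\text{in}}) + a - 1)/a$ from item 1. The function names and toy probabilities are hypothetical.

```python
import numpy as np

def p_in_from_out(p_out, a):
    """Approximate the in-model probability from the out-model probability (assumed linear form)."""
    return a * np.asarray(p_out, dtype=float) + (1.0 - a)

def p_out_from_in(p_in, a):
    """Invert the relation: approximate the out-model probability from the in-model one."""
    return (np.asarray(p_in, dtype=float) + a - 1.0) / a

a = 0.3                              # illustrative coefficient
p_out = np.array([0.2, 0.5, 0.9])    # toy out-model probabilities
p_in = p_in_from_out(p_out, a)
recovered = p_out_from_in(p_in, a)   # round trip should give back p_out
print(p_in, recovered)
```

Because the two maps are inverses, a sample seen by the single reference model can be treated like an out-sample after the transformation, which is what lets one model stand in for a pair.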

### Define MyMIA class to implement the updated version of RMIA

In [19]:
import numpy as np
from typing import Dict, Any, Tuple, Optional

class MyMIA(MIA):
    def __init__(self, logger: logging.Logger):
        super().__init__(logger)
    
    def run_mia(
        self,
        all_signals: np.ndarray,
        all_memberships: np.ndarray,
        target_model_idx: int,
        reference_model_indices: np.ndarray,
        logger: logging.Logger,
        args: Dict[str, Any],
        population_signals: Optional[np.ndarray] = None,
        reuse_offline_a: Optional[bool] = False,
    ) -> Tuple[np.ndarray, np.ndarray]:
        """
        Custom implementation of the MIA attack.
        """
        assert all_signals.shape == all_memberships.shape, (
            f"all_signals and all_memberships must have the same shape: {all_signals.shape} vs {all_memberships.shape}"
        )
        
        # Validate inputs before indexing into them
        assert population_signals is not None, "population_signals is required for RMIA attack"
        assert args.get("offline_a") is not None, "offline_a is required for single model RMIA attack"

        target_signals = all_signals[:, target_model_idx]
        target_memberships = all_memberships[:, target_model_idx]

        # Reference-model signals are extracted for completeness; this population-style attack does not use them
        ref_signals = all_signals[:, reference_model_indices]
        ref_memberships = all_memberships[:, reference_model_indices]

        z_target_signals = population_signals[:, target_model_idx]
        z_ref_signals = population_signals[:, reference_model_indices]

        logger.info(f"Args for MyMIA attack: {args}")
            
        offline_a = args["offline_a"]
            
        logger.info(f"Running MyRMIA attack on target model {target_model_idx} with offline_a={offline_a}")
        mia_scores = self.my_rmia(target_signals, z_target_signals, offline_a)
        
        return mia_scores, target_memberships
    
    def my_rmia(
        self,
        target_signals: np.ndarray,
        z_target_signals: np.ndarray,
        offline_a: float,
    ) -> np.ndarray:
        """
        Population-style variant of RMIA for the single-reference-model setting: the denominators
        are dropped, so each sample is compared against population samples on the target model only.

        Args:
            target_signals (np.ndarray): Softmax value of all samples in the target model.
            z_target_signals (np.ndarray): Softmax value of population samples in the target model.
            offline_a (float): Offline RMIA coefficient; accepted for interface compatibility but unused here.
        
        Returns:
            np.ndarray: MIA score for all samples (a larger score indicates higher chance of being member).
        """
        prob_ratio_x = target_signals.ravel()
        prob_ratio_z = z_target_signals.ravel()

        ratios = prob_ratio_x[:, np.newaxis] / prob_ratio_z
        counts = np.average(ratios > 1.0, axis=1)

        return counts


In [20]:
baseline_time = time.time()
target_model_idx = 0
ref_model_indices = [1]

logger.info(f"Initiate DUCI for target models: {target_model_idx}")

args = {
    "attack": "RMIA",
    "dataset": configs["data"]["dataset"], # TODO: have DUCI config
    "model": configs["train"]["model_name"],
    "offline_a": 0.3 # If set to None, an extra reference model is required to tune the offline_a
}
# Initialize MIA instance
MIA_instance = MyMIA(logger)
DUCI_instance = DUCI(MIA_instance, logger, args)

logger.info("Collecting membership prediction for each sample in the target dataset on target models and reference models.")
logger.info("Predicting the proportion of dataset usage on target models.")

duci_preds, true_proportions, errors = DUCI_instance.pred_proportions(
    [target_model_idx], 
    [ref_model_indices], 
    signals,
    population_signals,
    auditing_membership,
)

logger.info(
    "DUCI %0.1f seconds", time.time() - baseline_time
)
logger.info(f"Average prediction errors: {np.mean(errors)}")
logger.info(f"All prediction errors: {errors}")
logger.info(f"Prediction details: DUCI predictions: {duci_preds}, True proportions: {true_proportions}")

2025-02-12 16:05:25,706 - INFO - Initiate DUCI for target models: 0
2025-02-12 16:05:25,708 - INFO - Collecting membership prediction for each sample in the target dataset on target models and reference models.
2025-02-12 16:05:25,709 - INFO - Predicting the proportion of dataset usage on target models.
2025-02-12 16:05:25,711 - INFO - Args for MyMIA attack: {'attack': 'RMIA', 'dataset': 'cifar10', 'model': 'wrn28-2', 'offline_a': 0.3}
2025-02-12 16:05:25,712 - INFO - Running MyRMIA attack on target model 0 with offline_a=0.3
2025-02-12 16:05:25,727 - INFO - Collect membership prediction for target dataset on target model 0 costs 0.0 seconds
2025-02-12 16:05:25,728 - INFO - Args for MyMIA attack: {'attack': 'RMIA', 'dataset': 'cifar10', 'model': 'wrn28-2', 'offline_a': 0.3}
2025-02-12 16:05:25,729 - INFO - Running MyRMIA attack on target model 1 with offline_a=0.3
2025-02-12 16:05:25,742 - INFO - Best threshold = 0.6838 (Maximize TPR - FPR) = 0.98 - 0.364
2025-02-12 16:05:25,743 - INFO

### Check the prediction and $p$

In [21]:
print(f"The ground-truth proportions: {p} | Our DUCI predictions: {duci_preds[0]}")

The ground-truth proportions: 0.2 | Our DUCI predictions: 0.13961038961038966
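
As a sanity check, the logged numbers can be related through the debiasing formula from the Method section. The sketch below assumes DUCI applies that formula with the TPR and FPR from the "Best threshold" log line. Under that assumption, inverting the formula gives the raw fraction of target records flagged as members, a value that is not printed in the logs and is derived here only for illustration.

```python
tpr, fpr = 0.98, 0.364         # from the "Best threshold" log line above
p_hat = 0.13961038961038966    # DUCI prediction printed above

# Invert p_hat = (raw_rate - fpr) / (tpr - fpr) to get the implied raw positive rate
raw_rate = p_hat * (tpr - fpr) + fpr

# Forward direction: debias the raw rate again as a consistency check
p_check = (raw_rate - fpr) / (tpr - fpr)
print(raw_rate, p_check)
```

The implied raw rate is about 0.45, well above the ground truth of 0.2, which shows how much of the naive count is attributable to false positives before debiasing.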


## Debiasing the most naive loss attack

In [None]:
import numpy as np
from typing import Dict, Any, Tuple, Optional

class MyMIA(MIA):
    def __init__(self, logger: logging.Logger):
        super().__init__(logger)
    
    def run_mia(
        self,
        all_signals: np.ndarray,
        all_memberships: np.ndarray,
        target_model_idx: int,
        reference_model_indices: np.ndarray,
        logger: logging.Logger,
        args: Dict[str, Any],
        population_signals: Optional[np.ndarray] = None,
        reuse_offline_a: Optional[bool] = False,
    ) -> Tuple[np.ndarray, np.ndarray]:
        """
        Custom implementation of the MIA attack.
        """
        assert all_signals.shape == all_memberships.shape, (
            f"all_signals and all_memberships must have the same shape: {all_signals.shape} vs {all_memberships.shape}"
        )
        
        # Validate inputs before indexing into them
        assert population_signals is not None, "population_signals is required for RMIA attack"
        assert args.get("offline_a") is not None, "offline_a is required for single model RMIA attack"

        target_signals = all_signals[:, target_model_idx]
        target_memberships = all_memberships[:, target_model_idx]

        # Reference and population signals are extracted for completeness; the naive attack does not use them
        ref_signals = all_signals[:, reference_model_indices]
        ref_memberships = all_memberships[:, reference_model_indices]

        z_target_signals = population_signals[:, target_model_idx]
        z_ref_signals = population_signals[:, reference_model_indices]

        logger.info(f"Args for MyMIA attack: {args}")
            
        offline_a = args["offline_a"]
            
        logger.info(f"Running MyRMIA attack on target model {target_model_idx} with offline_a={offline_a}")
        # The naive attack only needs the target model's signals
        mia_scores = self.my_rmia(target_signals)
        
        return mia_scores, target_memberships
    
    def my_rmia(
        self,
        target_signals: np.ndarray,
    ) -> np.ndarray:
        """
        Naive loss-style attack: score each sample by its signal on the target model alone,
        without calibrating against reference models or population samples.

        Args:
            target_signals (np.ndarray): Softmax value of all samples in the target model.
        
        Returns:
            np.ndarray: MIA score for all samples (a larger score indicates higher chance of being member).
        """

        return target_signals.ravel()


In [None]:
baseline_time = time.time()
target_model_idx = 0
ref_model_indices = [1]

logger.info(f"Initiate DUCI for target models: {target_model_idx}")

args = {
    "attack": "RMIA",
    "dataset": configs["data"]["dataset"], # TODO: have DUCI config
    "model": configs["train"]["model_name"],
    "offline_a": 0.3 # If set to None, an extra reference model is required to tune the offline_a
}
# Initialize MIA instance
MIA_instance = MyMIA(logger)
DUCI_instance = DUCI(MIA_instance, logger, args)

logger.info("Collecting membership prediction for each sample in the target dataset on target models and reference models.")
logger.info("Predicting the proportion of dataset usage on target models.")

duci_preds, true_proportions, errors = DUCI_instance.pred_proportions(
    [target_model_idx], 
    [ref_model_indices], 
    signals,
    population_signals,
    auditing_membership,
)

logger.info(
    "DUCI %0.1f seconds", time.time() - baseline_time
)
logger.info(f"Average prediction errors: {np.mean(errors)}")
logger.info(f"All prediction errors: {errors}")
logger.info(f"Prediction details: DUCI predictions: {duci_preds}, True proportions: {true_proportions}")