Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks

This is the official PyTorch implementation for the paper "Assessing Risk of Stealing Proprietary Models for Medical Imaging Tasks".

Abstract

The success of deep learning in medical imaging applications has led several companies to deploy proprietary models in diagnostic workflows, offering monetized services. Even though model weights are hidden to protect the intellectual property of the service provider, these models are exposed to model stealing (MS) attacks, where adversaries can clone the model's functionality by querying it with a proxy dataset and training a thief model on the acquired predictions. While extensively studied on general vision tasks, the susceptibility of medical imaging models to MS attacks remains inadequately explored. This paper investigates the vulnerability of black-box medical imaging models to MS attacks under realistic conditions where the adversary lacks access to the victim model's training data and operates with limited query budgets. We demonstrate that adversaries can effectively execute MS attacks by using publicly available datasets. To further enhance MS capabilities with limited query budgets, we propose a two-step model stealing approach termed QueryWise. This method capitalizes on unlabeled data obtained from a proxy distribution to train the thief model without incurring additional queries. Evaluation on two medical imaging models for Gallbladder Cancer and COVID-19 classification substantiates the effectiveness of the proposed attack.

Installation

Clone the QueryWise repo.

git clone https://github.com/rajankita/QueryWise.git
cd QueryWise

Our code is tested on Python 3.8 and PyTorch 1.11.0. Install the dependencies with

pip install -r requirements.txt

To run the QueryWise project, you also need to install an external model, RadFormer.

git clone https://github.com/authorname/RadFormer.git
cd RadFormer

Modify local paths in RadFormer

In RadFormer/models/resnet.py set LOCAL_WEIGHT_PATH = "./victim_models/init_resnet/resnet50.pth".
In RadFormer/models/bagnet.py modify lines 18-22 to

model_local_paths = {
    "BagNet33": "./victim_models/init_bagnet/bagnet33.pth",
    "BagNet17": "./victim_models/init_bagnet/bagnet17.pth",
    "BagNet9": "./victim_models/init_bagnet/bagnet9.pth",
}
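Because these paths are hard-coded, a missing file usually only surfaces once model construction fails. The following is a minimal sketch, not part of the repository, that checks whether the weight files named above exist. The helper name find_missing_weights and the root argument are hypothetical, and the script assumes the relative paths are resolved from the directory you run it in.

```python
from pathlib import Path

# Paths copied from the instructions above. The dictionary keys are labels
# chosen for this sketch only.
EXPECTED_WEIGHTS = {
    "ResNet50": "./victim_models/init_resnet/resnet50.pth",
    "BagNet33": "./victim_models/init_bagnet/bagnet33.pth",
    "BagNet17": "./victim_models/init_bagnet/bagnet17.pth",
    "BagNet9": "./victim_models/init_bagnet/bagnet9.pth",
}


def find_missing_weights(paths, root="."):
    """Return the labels whose weight file does not exist under root."""
    root = Path(root)
    return [name for name, rel in paths.items() if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = find_missing_weights(EXPECTED_WEIGHTS)
    if missing:
        print("Missing weight files for:", ", ".join(missing))
    else:
        print("All expected weight files were found.")
```

Run it from the directory that contains victim_models, or pass a different root if your layout differs.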

We provide support for stealing medical imaging models for two use-cases: Gall Bladder Cancer (GBC) identification, and COVID-19 classification.

Download Model weights

Download the following victim models.

RadFormer - Download model weights from the official RadFormer repo (this link). Unzip the model weights and keep them in the victim_models directory.
POCUS-ResNet18 - Download model weights from this link. (Upload weights and provide link here), and keep them in the victim_models directory.

In addition, download ImageNet-pretrained model weights from [here](upload and provide link) and keep them in the ckpts directory.

Prepare Datasets

Gall Bladder Cancer Ultrasound (GBCU) - This is the GBC victim dataset. To obtain the dataset, follow the instructions here. Keep the dataset in the data_msa_medical/GBCU-Shared directory.
Gall Bladder Ultrasound Video (GBUSV) - This is the GBC thief dataset. To get the dataset, follow the instructions here. Keep the dataset in the data_msa_medical/GBUSV-Shared directory.
POCUS - This is the COVID-19 victim dataset. Follow the instructions in the USCL repo to get the 5-fold cross-validation POCUS dataset. Keep it in data_msa_medical/covid_5_fold.
COVIDx-US - This is the COVID-19 thief dataset. Follow the instructions in the COVIDx-US repo to generate the dataset, and keep it in data_msa_medical/covidx_us.
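Before launching a run, it can help to confirm that all four dataset directories are in place. The sketch below is an illustration rather than repository code: it reports, for each directory listed above, whether it exists and how many files it contains. The function name summarize_datasets is hypothetical, and the location of data_msa_medical relative to root is an assumption, so adjust root to match your setup.

```python
from pathlib import Path

# Directory names copied from the list above; the labels are for display only.
DATASET_DIRS = {
    "GBCU (GBC victim)": "data_msa_medical/GBCU-Shared",
    "GBUSV (GBC thief)": "data_msa_medical/GBUSV-Shared",
    "POCUS (COVID-19 victim)": "data_msa_medical/covid_5_fold",
    "COVIDx-US (COVID-19 thief)": "data_msa_medical/covidx_us",
}


def summarize_datasets(dirs, root="."):
    """Map each label to its recursive file count, or None if the directory is absent."""
    summary = {}
    for name, rel in dirs.items():
        path = Path(root) / rel
        if path.is_dir():
            summary[name] = sum(1 for p in path.rglob("*") if p.is_file())
        else:
            summary[name] = None
    return summary


if __name__ == "__main__":
    for name, count in summarize_datasets(DATASET_DIRS).items():
        status = "missing" if count is None else f"{count} files"
        print(f"{name}: {status}")
```

A count of zero for an existing directory usually means the download or generation step did not finish.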

Run model stealing baselines

Use the script activethief/train.py to train a thief model. Config files are named as <victim_arch>_<thief_arch>_<thief_dataset>.yaml.

cd activethief
python activethief/train.py --c activethief/configs/gbc/radformer_resnet50_gbusv.yaml
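The naming convention makes it possible to recover the experiment setup from a config path alone. Here is a small sketch of such a parser. The names ConfigName and parse_config_name are hypothetical and not part of the repository, and the parser assumes that the victim and thief architecture names contain no underscores (the dataset name may, as in covidx_us).

```python
from pathlib import Path
from typing import NamedTuple


class ConfigName(NamedTuple):
    victim_arch: str
    thief_arch: str
    thief_dataset: str


def parse_config_name(path):
    """Split '<victim_arch>_<thief_arch>_<thief_dataset>.yaml' into its three fields."""
    stem = Path(path).stem
    # Split on the first two underscores only, so the dataset part keeps any underscores.
    parts = stem.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"config name does not follow the convention: {path}")
    return ConfigName(*parts)


if __name__ == "__main__":
    print(parse_config_name("activethief/configs/gbc/radformer_resnet50_gbusv.yaml"))
```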

The default values in the config files support Random sample selection from KnockoffNets. To run k-Center or Entropy, edit the config to change ACTIVE.METHOD to 'kcenter' or 'entropy' respectively, and ACTIVE.CYCLES to 5.
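For readers unfamiliar with the two active selection strategies, the sketch below shows generic textbook versions of entropy-based selection and greedy k-Center selection in plain Python. It is not the code in this repository. The function names, the use of class-probability vectors (for instance, a thief model's softmax outputs on the unlabeled pool), and the use of Euclidean distance on feature vectors are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy of one probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_by_entropy(prob_rows, k):
    """Return indices of the k samples whose predictions are most uncertain."""
    scores = [entropy(row) for row in prob_rows]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def select_k_center(features, labeled_idx, k):
    """Greedy k-Center: repeatedly pick the point farthest from all current centers.

    Assumes labeled_idx is non-empty and k does not exceed the number of
    unlabeled points.
    """
    # Distance from every point to its nearest already-labeled center.
    min_d = [min(math.dist(f, features[c]) for c in labeled_idx) for f in features]
    chosen = []
    for _ in range(k):
        i = max(range(len(features)), key=lambda j: min_d[j])
        chosen.append(i)
        # The new center may now be the nearest one for some points.
        min_d = [min(d, math.dist(f, features[i])) for d, f in zip(min_d, features)]
    return chosen


if __name__ == "__main__":
    probs = [[0.9, 0.1], [0.5, 0.5], [0.99, 0.01]]
    print(select_by_entropy(probs, 2))
    feats = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
    print(select_k_center(feats, labeled_idx=[0], k=2))
```

Entropy selection favors samples the current model is unsure about, while k-Center favors samples that are far from everything already queried. Both are typically applied over several rounds, which is consistent with raising ACTIVE.CYCLES above its default.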

Run proposed method

Note that for the proposed method, you must first run the baseline method to train the anchor model. Use the script train_proposed.py to train a thief model using SSL. Edit the config file to set the appropriate data paths, the path to the anchor model, and the path to the labeled set queried from the victim while training the anchor model.

cd ssl
python train_proposed.py --c configs/gbc/querywise_resnet50.yaml

Acknowledgements

Parts of this codebase are built upon

Thanks to the authors of these papers for making their code available for public usage.
