Official repository for the MICCAI 2023 SASHIMI Workshop paper: How Good Are Synthetic Medical Images? An Empirical Study with Lung Ultrasound
MICCAI 2023 SASHIMI | Conference Paper | Arxiv Paper
This repository contains the full pipeline: generating synthetic images with a DC-GAN, producing evaluation/validation plots, and using the synthetic images to train and test a downstream task. We provide a built-in Docker image based on nvcr.io/nvidia/tensorflow:21.12-tf1-py3. (Requirements: see base image details.) This Docker image installs all the packages needed to run the scripts.
- Install docker-compose version 1.29.2
sudo curl -L "https://github.com/docker/compose/releases/download/1.29.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo mv /usr/local/bin/docker-compose /usr/bin/docker-compose
sudo chmod 755 /usr/bin/docker-compose
- Test if docker-compose is successfully installed
docker-compose version
#### Expected output #####
docker-compose version 1.29.2, build 5becea4c
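If you prefer to check the installed version from a script instead of reading it by eye, the following is a minimal sketch. The helper name `parse_compose_version` is hypothetical and not part of this repository; it only assumes the output format shown in the expected output above.

```python
import re


def parse_compose_version(output: str):
    """Return (major, minor, patch) parsed from `docker-compose version` output, or None."""
    match = re.search(r"docker-compose version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# The sample string is the expected output quoted in this README.
sample = "docker-compose version 1.29.2, build 5becea4c"
print(parse_compose_version(sample) == (1, 29, 2))
```

To check a real installation, pass the text returned by `subprocess.run(["docker-compose", "version"], capture_output=True, text=True).stdout` to the helper.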
git clone https://github.com/Global-Health-Labs/US-DCGAN.git
cd US-DCGAN
docker-compose up
docker attach us-dcgan_test_1
cd code
python3 GAN/train_dcgan.py --dataroot <path_to_data> --niter 25 --cuda --nc 1 --loggerName training.log --workers 16 --ngpu 1
For <path_to_data>, the data directory must follow the structure defined by the torchvision ImageFolder class: http://pytorch.org/vision/main/generated/torchvision.datasets.ImageFolder.html
Example: root/dog/xxx.png
DCGAN does not require label information or multiple classes, but the ImageFolder class requires the folder structure above to build the dataset.
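If your images currently sit in one flat folder, they can be arranged into the layout above with a few lines of Python. This is a sketch under assumptions: the helper name `make_single_class_root`, the class folder name `ultrasound`, and the list of image suffixes are all hypothetical choices, not something this repository defines.

```python
import shutil
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def make_single_class_root(flat_dir, root_dir, class_name="ultrasound"):
    """Copy every image in flat_dir into root_dir/class_name/ and return the count."""
    flat_dir = Path(flat_dir)
    class_dir = Path(root_dir) / class_name
    class_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for path in sorted(flat_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            shutil.copy2(path, class_dir / path.name)
            copied += 1
    return copied


# Demo on throwaway files: only the .png file is copied.
with tempfile.TemporaryDirectory() as tmp:
    flat = Path(tmp) / "flat"
    flat.mkdir()
    (flat / "frame_0.png").touch()
    (flat / "notes.txt").touch()
    count = make_single_class_root(flat, Path(tmp) / "root")
    print(count, sorted(p.name for p in (Path(tmp) / "root" / "ultrasound").iterdir()))
    # prints: 1 ['frame_0.png']
```

The resulting `root` directory is what you would pass as `--dataroot`. A single class folder is enough, since the label is not used by the GAN.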
The outputs, including model checkpoints, example generated images, etc. are saved under <output_dir>/<model_name>/.
For qualitative evaluation: sample images generated by the GAN at the latest epoch are saved under <output_dir>/<model_name>/fake/0. In addition, a batch of images generated from the same latent vector is saved at each epoch.
For quantitative evaluation: see the following Jupyter notebook, which generates plots of MMD, 1-NN accuracy, and feature confidence scores from the folder of the trained GAN.
evalutation_plot.ipynb
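For readers unfamiliar with two of the metrics named above, here is a minimal pure-Python sketch of squared MMD with a Gaussian kernel and of leave-one-out 1-NN accuracy. It is an illustration only, not the notebook's implementation: the function names, the biased MMD estimator, the kernel bandwidth `sigma`, and the use of plain feature vectors as input are all assumptions.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mmd2_rbf(real, fake, sigma=1.0):
    """Biased estimate of squared MMD between two samples, using a Gaussian kernel."""
    def mean_kernel(xs, ys):
        total = sum(math.exp(-sq_dist(x, y) / (2 * sigma ** 2)) for x in xs for y in ys)
        return total / (len(xs) * len(ys))
    return mean_kernel(real, real) + mean_kernel(fake, fake) - 2 * mean_kernel(real, fake)


def one_nn_accuracy(real, fake):
    """Leave-one-out 1-NN accuracy for telling real (label 1) from fake (label 0)."""
    points = [(p, 1) for p in real] + [(p, 0) for p in fake]
    correct = 0
    for i, (p, label) in enumerate(points):
        # Nearest neighbour among all other points.
        nearest = min((j for j in range(len(points)) if j != i),
                      key=lambda j: sq_dist(p, points[j][0]))
        correct += points[nearest][1] == label
    return correct / len(points)


# Toy feature vectors: two well-separated clusters.
real_feats = [[0.0, 0.0], [0.0, 1.0]]
fake_feats = [[10.0, 10.0], [10.0, 11.0]]
print(mmd2_rbf(real_feats, fake_feats) > 0)     # True
print(one_nn_accuracy(real_feats, fake_feats))  # 1.0
```

A smaller MMD indicates that the two feature distributions are closer. A 1-NN accuracy close to 1.0 means real and generated samples are easy to tell apart, while a value near 0.5 means they are hard to distinguish.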
python3 generate.py --model_path <path_to_GAN_model> --size 2000 --save_path <path to save generated images>
size: the number of images that will be generated.
This classifier is optimized for the ultrasound dataset and currently supports only 1-channel input (3-channel images are read as black and white).
python3 CNN/classify_tunning.py --train_pos_dir <path_to_positive_train_data> --train_neg_dir <path_to_negative_train_data> --val_pos_dir <path_to_positive_validation_data> --val_neg_dir <path_to_negative_validation_data>
The output is saved under <root_folder>/logs by default. To save to a different location, specify it with the --save_dir flag.
See the following Jupyter notebook, which provides evaluation metrics such as ROC AUC, accuracy, and F1.
evaluate_classifier.ipynb
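As a reference for how the metrics listed above are defined, the following is a small pure-Python sketch. It is not the notebook's code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration.

```python
def accuracy_and_f1(labels, scores, threshold=0.5):
    """Threshold the scores, then return (accuracy, F1) for the positive class."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = positive, 0 = negative.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(accuracy_and_f1(labels, scores))  # (0.5, 0.5)
print(roc_auc(labels, scores))          # 0.75
```

Unlike accuracy and F1, ROC AUC does not depend on the threshold, which makes it useful when comparing classifiers trained on real versus synthetic data.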
Placeholder, will release information after publication
@inproceedings{yu2023good,
title={How Good Are Synthetic Medical Images? An Empirical Study with Lung Ultrasound},
author={Yu, Menghan and Kulhare, Sourabh and Mehanian, Courosh and Delahunt, Charles B and Shea, Daniel E and Laverriere, Zohreh and Shah, Ishan and Horning, Matthew P},
booktitle={Simulation and Synthesis in Medical Imaging: 8th International Workshop, SASHIMI 2023, Held in Conjunction with MICCAI 2023, Vancouver, BC, Canada, October 8, 2023, Proceedings},
volume={14288},
pages={75},
year={2023},
organization={Springer Nature}
}
If you have any questions or would like to contact us, please email menghan.yu@ghlabs.org or sourabh.kulhare@ghlabs.org.