Throughout all stages of the experiment, slices are stored as JPEG for data and PNG for labels, for the convenience of training the GAN-based translation model.
-
trained on ACDC, 10% supervised (axial slices, no cropping):
https://drive.google.com/file/d/1dS-9s_nLW4QqX-Fl0zIZV85RQUFu_wfJ/view?usp=drive_link
-
trained on PDDCA, one-shot (coronal slices, head-centered cropping):
https://drive.google.com/file/d/1Y3oLN70NDp7Zdwkx77R1tiJeIQo4wBvv/view?usp=sharing
If you don't want to perform the data synthesis, or to train the GAN for image translation and the segmentation model for pseudo-labels, please skip ahead to step 2.
Please use the MATLAB scripts in the "synthetic_data_create" folder to create synthetic data via random free-form deformation (FFD) based on B-splines.
Set the reference-data path variable in the scripts to match your local environment.
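For intuition, a random B-spline FFD upsamples a coarse random displacement grid into a smooth dense displacement field and warps the volume with it. The following is only a minimal Python illustration of that idea (it is not the MATLAB code; function and parameter names are our own, and cubic B-spline upsampling is done via SciPy's `zoom` with `order=3`):

```python
import numpy as np
from scipy.ndimage import zoom, map_coordinates

def random_ffd(volume, grid=(4, 4, 4), max_disp=5.0, seed=0):
    """Warp a 3D volume with a random B-spline free-form deformation.

    A coarse random displacement grid is upsampled with cubic B-spline
    interpolation (zoom, order=3) to a smooth dense displacement field,
    which then resamples the volume.
    """
    rng = np.random.default_rng(seed)
    coords = np.indices(volume.shape).astype(np.float64)
    for axis in range(3):
        coarse = rng.uniform(-max_disp, max_disp, size=grid)
        factors = [s / g for s, g in zip(volume.shape, grid)]
        dense = zoom(coarse, factors, order=3)  # cubic B-spline upsampling
        coords[axis] += dense                   # displace sampling coordinates
    return map_coordinates(volume, coords, order=1, mode="nearest")

# Example: deform a small random volume
vol = np.random.default_rng(1).random((32, 32, 32))
warped = random_ffd(vol, grid=(4, 4, 4), max_disp=2.0)
print(warped.shape)  # (32, 32, 32)
```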
This code accepts data in MHA format, and its output is volumetric. However, format conversion and volumetric slicing are easy to perform with packages such as SimpleITK, nibabel, and Pillow.
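A minimal sketch of that slicing step (the MHA loading line is commented out so the snippet is self-contained with a synthetic volume; file names are illustrative):

```python
import os
import tempfile
import numpy as np
from PIL import Image

# In practice, load the MHA volume with SimpleITK:
#   import SimpleITK as sitk
#   vol = sitk.GetArrayFromImage(sitk.ReadImage("case01.mha"))  # (z, y, x)
# Here a synthetic uint8 volume stands in for a loaded MHA file.
vol = (np.random.default_rng(0).random((8, 64, 64)) * 255).astype(np.uint8)
lab = (vol > 128).astype(np.uint8) * 255  # stand-in label volume

out_dir = tempfile.mkdtemp()
for z in range(vol.shape[0]):
    # data slices as JPEG, label slices as PNG (lossless, preserves class ids)
    Image.fromarray(vol[z]).save(os.path.join(out_dir, f"case01_{z:03d}.jpg"))
    Image.fromarray(lab[z]).save(os.path.join(out_dir, f"case01_{z:03d}_label.png"))

print(len(os.listdir(out_dir)))  # 16 files: 8 JPEG data + 8 PNG labels
```

Saving labels as PNG rather than JPEG matters: JPEG compression would blur label boundaries and corrupt class ids.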
You may also use our synthetic data slices from https://drive.google.com/file/d/1AQMvOedlVlW8WJr1Lt3wluzlXOybtcLc/view?usp=sharing, but only for academic use.
With the labeled data synthesized from one (PDDCA) or a few (ACDC) reference volumes, please use the Python code in the "trans-unet" folder to pre-train the pseudo-label prediction model. The code in this folder largely follows the official TransUNet implementation, except for the dataset class for our slice images.
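The dataset class essentially pairs each JPEG data slice with its PNG label slice. A minimal sketch of that loading logic (plain NumPy/Pillow, mirroring what a PyTorch `Dataset.__getitem__` would do; the function name and target size are our own):

```python
import os
import tempfile
import numpy as np
from PIL import Image

def load_slice_pair(img_path, label_path, size=(224, 224)):
    """Load one JPEG data slice and its PNG label slice as numpy arrays.

    Labels are resized with nearest-neighbour interpolation so that
    class ids are not blended into invalid intermediate values.
    """
    img = Image.open(img_path).convert("L").resize(size, Image.BILINEAR)
    lab = Image.open(label_path).resize(size, Image.NEAREST)
    return np.asarray(img, dtype=np.float32) / 255.0, np.asarray(lab, dtype=np.int64)

# Minimal demo with a synthetic slice pair
tmp = tempfile.mkdtemp()
Image.fromarray(np.zeros((64, 64), dtype=np.uint8)).save(os.path.join(tmp, "s0.jpg"))
Image.fromarray(np.ones((64, 64), dtype=np.uint8)).save(os.path.join(tmp, "s0.png"))
x, y = load_slice_pair(os.path.join(tmp, "s0.jpg"), os.path.join(tmp, "s0.png"))
print(x.shape, y.shape)  # (224, 224) (224, 224)
```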
First, put the segmentation model pre-trained above into the "translation" folder; the GAN-based translation model uses the pseudo-labels it generates to guide the sampling of patches. Then put the clinical and synthetic slice pictures into one folder under the "datasets" directory (clinical slices in subfolder trainA, synthetic slices in subfolder trainB). Finally, use one of the commands below to train the translation model.
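For reference, the expected dataset layout might look like this (file names are illustrative; the `_A`/`_B` suffixes follow the clinical/synthetic naming convention used in our released slices):

```
datasets/
└── real2syn_pddca/          # matches --dataroot in the command below
    ├── trainA/              # clinical slice pictures
    │   ├── case01_000_A.jpg
    │   └── ...
    └── trainB/              # synthetic slice pictures
        ├── syn01_000_B.jpg
        └── ...
```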
# for pddca one-shot
CUDA_VISIBLE_DEVICES=0 python ./train.py --name real2syn_pddca --dataroot ./datasets/real2syn_pddca --CUT_mode CUT --dce_idt --lambda_HDCE 0.1 --lambda_SRC 0.05 --lambda_SRC_pl 0.01 --use_curriculum --HDCE_gamma 50 --HDCE_gamma_min 10 --batch_size 4 --n_epochs 200 --n_epochs_decay 200 --save_epoch_freq 1 --gpu_ids 0 --preprocess resize --load_size 256 --TU pddca_epoch_149.pth
# for acdc few-shot
CUDA_VISIBLE_DEVICES=0 python ./train.py --name real2syn_acdc --dataroot ./datasets/acdc_7070 --CUT_mode CUT --dce_idt --lambda_HDCE 0.1 --lambda_SRC 0.05 --lambda_SRC_pl 0.01 --use_curriculum --HDCE_gamma 50 --HDCE_gamma_min 10 --batch_size 4 --n_epochs 100 --n_epochs_decay 100 --save_epoch_freq 1 --gpu_ids 0 --preprocess resize --load_size 256 --num_patches 1024 --TU ACDC10ref_final.pth
After training, we select the GAN model whose translation results yield the largest improvement on the validation set for the downstream task.
If you want to accelerate training of the final segmentation model, you can generate offline pseudo-labels with the pretrained translation and segmentation models for the warm-up stage, during which these pseudo-labels are not updated. The corresponding code is at: SACL/generate_pl.py
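Conceptually, the offline pseudo-label pass translates each clinical slice into the synthetic domain and segments the result, saving the masks to disk. The sketch below only illustrates that pipeline shape; the two stand-in functions are placeholders (in `SACL/generate_pl.py` they are the actual GAN translator and TransUNet segmenter loaded from checkpoints):

```python
import os
import tempfile
import numpy as np
from PIL import Image

# Placeholder stand-ins for the pretrained models.
def translate_to_synthetic(slice_img):
    return slice_img  # placeholder: identity instead of the GAN translator

def segment(slice_img):
    return (slice_img > 0.5).astype(np.uint8)  # placeholder: thresholding

out_dir = tempfile.mkdtemp()
clinical_slices = [np.random.default_rng(i).random((64, 64)) for i in range(3)]
for i, s in enumerate(clinical_slices):
    pl = segment(translate_to_synthetic(s))  # offline pseudo-label
    Image.fromarray(pl).save(os.path.join(out_dir, f"slice_{i:03d}_A_label.png"))

print(len(os.listdir(out_dir)))  # 3 pseudo-label PNGs
```

During the warm-up stage the training loop then reads these fixed PNG masks instead of re-running both models every iteration.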
Please run train_pddca_v1.py and train_acdc_10sup.py in the SACL folder; the commands we use are in the comments above the "main" line. Models are saved to the "checkpoints" folder.
Please run test_otherpeople_acdc.py and test_otherpeople_pddca.py first to save the prediction masks, then run the other test scripts to compute the metrics.
If you don't want to start from the data synthesis, please download our pretrained models and data slices from: https://drive.google.com/drive/folders/12Kra8FHc2giKMhpsMAZ1iFsvwi1OpZWF?usp=sharing. Note that slice pictures with the suffix "_A" belong to the clinical domain, and those with "_B" are synthetic. The label slices for the "_A" images are pseudo-labels generated by the pretrained models.