The usual installation steps are the commands below, which should set up the correct CUDA version and all the required Python packages:
conda create -n Siamese-Diffusion python=3.10
conda activate Siamese-Diffusion
conda install pytorch==2.4.0 torchvision==0.19.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118
pip install deepspeed

We evaluated our method on three public datasets: Polyps (as provided by the PraNet project), ISIC2016, and ISIC2018.
--data
  --images
  --masks
  --prompt.json

💡 Note: All improvements have been integrated into cldm.py, and the DHI module is implemented in dhi.py. Both are located within the cldm folder.
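Before training, it can help to confirm that a dataset matches the data layout listed above. The following is a minimal sketch, not part of the repository: the helper name `check_dataset_layout` is hypothetical, and it assumes that each image and its mask share the same file name (ignoring the extension), which the original text does not state. It does not inspect the contents of prompt.json.

```python
from pathlib import Path


def check_dataset_layout(root):
    """Return a list of problems found under `root` (empty if none).

    Assumed layout: root/images, root/masks, root/prompt.json.
    Assumption: an image and its mask share the same file stem.
    """
    root = Path(root)
    images_dir = root / "images"
    masks_dir = root / "masks"
    problems = []

    # Report any missing top-level entries first.
    for folder in (images_dir, masks_dir):
        if not folder.is_dir():
            problems.append(f"missing folder: {folder.name}")
    if not (root / "prompt.json").is_file():
        problems.append("missing file: prompt.json")

    # Compare file stems only when both folders exist.
    if images_dir.is_dir() and masks_dir.is_dir():
        image_stems = {p.stem for p in images_dir.iterdir() if p.is_file()}
        mask_stems = {p.stem for p in masks_dir.iterdir() if p.is_file()}
        for stem in sorted(image_stems - mask_stems):
            problems.append(f"image without mask: {stem}")
        for stem in sorted(mask_stems - image_stems):
            problems.append(f"mask without image: {stem}")

    return problems


if __name__ == "__main__":
    # "data" is the folder name from the layout above.
    for problem in check_dataset_layout("data"):
        print(problem)
```

Running the script from the repository root prints one line per problem and nothing when the layout looks consistent under the stated assumption.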
🔥 Recommendation
The DHI module is a plug-and-play enhancement recommended for all ControlNet-based setups.
It significantly accelerates convergence for datasets with large domain gaps from pretrained data, such as:
- Medical segmentation images
- Anomaly detection images
- ...
It is especially effective when jointly fine-tuning the Stable Diffusion UNet decoder.
Here are example commands for training:
# Initialize ControlNet with the pretrained UNet encoder weights from Stable Diffusion,
# then merge them with Stable Diffusion weights and save as: control_sd15.ckpt
python tool_add_control.py
# For multi-GPU setups, ZeRO-2 can be used to train Siamese-Diffusion
# to reduce memory consumption.
python tutorial_train.py

Here are example commands for sampling:
# ZeRO-2 distributed weights are saved under the folder:
# lightning_logs/version_#/checkpoints/epoch/
# Run the following commands to merge the weights:
python zero_to_fp32.py . pytorch_model.bin
python tool_merge_control.py
# Sampling
python tutorial_inference.py

This code is developed based on ControlNet and incorporates several segmentation models, including SANet, Polyp-PVT, and CTNet.
If you find our work useful in your research or if you use parts of this code, please consider citing our paper:
@article{qiu2025noise,
title={Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation},
author={Qiu, Kunpeng and Gao, Zhiqiang and Zhou, Zhiying and Sun, Mingjie and Guo, Yongxin},
journal={arXiv preprint arXiv:2505.06068},
year={2025}
}
@article{qiu2025adaptively,
title={Adaptively Distilled ControlNet: Accelerated Training and Superior Sampling for Medical Image Synthesis},
author={Qiu, Kunpeng and Zhou, Zhiying and Guo, Yongxin},
journal={arXiv preprint arXiv:2507.23652},
year={2025}
}





