Official implementation of "Connecting Domains and Contrasting Samples: A Ladder for Domain Generalization" (KDD 2025).
DCCL (Domain-Connecting Contrastive Learning) is a novel approach for domain generalization that combines multiple complementary loss components to learn representations that remain robust across domains. The algorithm integrates:
- Cross-entropy loss for standard classification
- Contrastive loss between aggressively augmented views for invariant representation learning (sketched below)
- Layer-wise contrastive loss for aligning intermediate features with those of pre-trained models
- Generative alignment regularization to generatively align features with pre-trained knowledge
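The view-level contrastive term follows the InfoNCE/NT-Xent pattern used in contrastive learning. Below is a minimal, hypothetical sketch of a SimCLR-style version (not the exact code in `algorithms.py`); the function name `info_nce_loss` and the batch layout are assumptions for illustration:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    """NT-Xent / InfoNCE loss between two batches of augmented views.

    z1, z2: (N, D) projections of two augmentations of the same N images.
    Positive pairs are (z1[i], z2[i]); all other samples act as negatives.
    """
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)           # (2N, D)
    sim = z @ z.t() / temperature            # (2N, 2N) cosine similarities
    n = z1.size(0)
    # Mask self-similarities so a sample is never its own negative.
    sim.fill_diagonal_(float('-inf'))
    # The positive for index i is its counterpart in the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)

# Example: random projections for a batch of 8 images
loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128), temperature=0.1)
```

The default temperature of 0.1 matches the `--t` setting documented below.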
```
data/
DCCL/
├── train_all.py            # Main training script
├── config.yaml             # Configuration file
├── domainbed/
│   ├── algorithms/
│   │   └── algorithms.py   # 🔥 Core DCCL algorithm implementation
│   ├── lib/
│   │   └── cl_hparams.py   # 🔥 Core hyperparameter settings
│   ├── datasets/           # Dataset loaders
│   ├── networks.py         # Network architectures
│   ├── trainer.py          # Training loop
│   └── ...
└── ...
```
- `DCCL/domainbed/algorithms/algorithms.py`: Contains the main DCCL algorithm with detailed comments explaining each loss component
- `DCCL/domainbed/lib/cl_hparams.py`: Core hyperparameter configurations for different datasets
The DCCL algorithm follows this training procedure:
- Data Preparation: Load original and augmented image pairs
- Feature Extraction: Extract features using trainable and frozen pre-trained networks
- Multi-Loss Computation:
  - Classification loss (always active)
  - Contrastive loss (controlled by `--l`)
  - Domain alignment loss (controlled by `--l_d`)
  - Layer-wise contrastive loss (controlled by `--l_layer`)
- Optimization: Multi-component loss backpropagation with different learning rates (a schematic training step is sketched below)
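Putting these steps together, here is a schematic sketch of one update. All names are illustrative rather than taken from the codebase: `batch` is assumed to yield an (original, augmented, label, domain) tuple, `hp` holds the weights from `cl_hparams.py`, and the three loss callables stand in for the components implemented in `algorithms.py`:

```python
import torch
import torch.nn.functional as F

def dccl_step(batch, model, frozen_model, classifier, optimizer, hp,
              contrastive_loss, domain_align_loss, layer_contrastive_loss):
    """One schematic DCCL update (hypothetical sketch, not the repo code)."""
    x, x_aug, y, d = batch

    feat = model(x)                          # trainable backbone
    feat_aug = model(x_aug)                  # aggressively augmented view
    with torch.no_grad():
        feat_pre = frozen_model(x)           # frozen pre-trained features

    loss = F.cross_entropy(classifier(feat), y)                           # always active
    loss = loss + hp['l'] * contrastive_loss(feat, feat_aug)              # --l
    loss = loss + hp['l_d'] * domain_align_loss(feat, d)                  # --l_d
    loss = loss + hp['l_layer'] * layer_contrastive_loss(feat, feat_pre)  # --l_layer

    # The optimizer may use separate parameter groups so the backbone and
    # the heads train with different learning rates.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```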
The main tuning parameters are located in DCCL/domainbed/lib/cl_hparams.py:
- `--l`: Weight for contrastive loss (default: 1.0)
- `--l_d`: Weight for domain alignment loss (default: 0.05)
- `--l_layer`: Weight for layer-wise contrastive loss (default: 1.0)
- `--t`: Temperature for contrastive loss (default: 0.1)
- `--t_pre`: Temperature for pre-trained feature loss (default: 0.2)
- `--n_layer`: Number of layers in projection head (default: 1)
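For reference, these documented defaults collected in one place could look like the following dictionary; this is a hypothetical consolidation, and the actual structure inside `cl_hparams.py` may differ per dataset:

```python
# Hypothetical consolidation of the documented defaults.
DEFAULT_HPARAMS = {
    'l': 1.0,        # weight for the contrastive loss
    'l_d': 0.05,     # weight for the domain alignment loss
    'l_layer': 1.0,  # weight for the layer-wise contrastive loss
    't': 0.1,        # temperature for the contrastive loss
    't_pre': 0.2,    # temperature for the pre-trained feature loss
    'n_layer': 1,    # number of layers in the projection head
}
```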
- Python: 3.6+
- PyTorch: latest
- Torchvision: 0.10.0
- CUDA: 10.2
- CUDNN: 7605
- NumPy: 1.19.5
- PIL: 7.2.0
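To compare your installed versions against this list, you can run:

```python
# Print installed versions to compare against the list above.
import torch
import torchvision
import numpy
import PIL

print("PyTorch:", torch.__version__)
print("Torchvision:", torchvision.__version__)
print("CUDA:", torch.version.cuda)
print("cuDNN:", torch.backends.cudnn.version())
print("NumPy:", numpy.__version__)
print("PIL:", PIL.__version__)
```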
You can set up the code and install the dependencies as follows:
```
git clone <this-repo>
cd DCCL/
pip install -r requirements.txt
```
Each dataset is available from its official source; for example, the VLCS dataset can be obtained from its official repository.
To download all datasets automatically:
```
python download.py --data_dir data
```
Navigate to the DCCL directory and run the training script. By default, `--algorithm` is set to `DCCL`.
```
cd DCCL/
python train_all.py DCCL_OH_0 --dataset OfficeHome --deterministic --trial_seed 0 --checkpoint_freq 100 --data_dir ../data
python train_all.py DCCL_OH_1 --dataset OfficeHome --deterministic --trial_seed 1 --checkpoint_freq 100 --data_dir ../data
python train_all.py DCCL_OH_2 --dataset OfficeHome --deterministic --trial_seed 2 --checkpoint_freq 100 --data_dir ../data
```
Different Backbone Models:
```
# CLIP ViT-B/16
python train_all.py DCCL_OH_vit --dataset OfficeHome --deterministic --trial_seed 2 --checkpoint_freq 100 --data_dir ../data --model clip_vit-b16

# RegNet
python train_all.py DCCL_OH_reg --dataset OfficeHome --deterministic --trial_seed 2 --checkpoint_freq 100 --data_dir ../data --model regnet
```
Limited Labeled Data:
```
# 10% labeled data
python train_all.py DCCL_OH_res50_0.1 --dataset OfficeHome --deterministic --trial_seed 2 --checkpoint_freq 100 --data_dir ../data --label_ratio 0.1
```
Different Datasets:
```
# PACS
python train_all.py DCCL_PACS_0 --dataset PACS --deterministic --trial_seed 0 --checkpoint_freq 100 --data_dir ../data

# VLCS
python train_all.py DCCL_VLCS_0 --dataset VLCS --deterministic --trial_seed 0 --checkpoint_freq 100 --data_dir ../data

# TerraIncognita
python train_all.py DCCL_TI_0 --dataset TerraIncognita --deterministic --trial_seed 0 --checkpoint_freq 100 --data_dir ../data
```
Training outputs are saved in `DCCL/train_output/[DATASET]/[EXPERIMENT_NAME]/`, containing:
- Training logs
- Evaluation results
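The run directories can be enumerated programmatically; this is a small convenience sketch based only on the output layout described above, not part of the codebase:

```python
from pathlib import Path

# List run directories under the documented output layout,
# e.g. DCCL/train_output/OfficeHome/DCCL_OH_0
for run_dir in sorted(Path("DCCL/train_output").glob("*/*")):
    print(run_dir)
```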
This codebase builds heavily upon the excellent SWAD framework. We gratefully acknowledge their foundational work in domain generalization research.
If you find this work helpful, please kindly cite:
@inproceedings{wei2025connecting,
title={Connecting domains and contrasting samples: A ladder for domain generalization},
author={Wei, Tianxin and Chen, Yifan and He, Xinrui and Bao, Wenxuan and He, Jingrui},
booktitle={Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 1},
pages={1563--1574},
year={2025}
}