Official code for "Improved Fine-Tuning by Better Leveraging Pre-Training Data", NeurIPS 2022
PyTorch >= 1.9.0
We use eight datasets, prepared following the official instructions; see the appendix of our paper for details.
Use UOT_selection/UOT_select_class_unbalanced.py in this repository for UOT data selection. Set the arguments in the bash script (e.g., the data path, hyperparameters, and result file path) before running. The script produces a file of image paths selected by the UOT algorithm, for example OT_unnorm_cos_imagenet_OT_select_100_classes_train_samples.txt.
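For a concrete picture of this step, below is a minimal sketch of UOT-based selection using POT's unbalanced Sinkhorn solver. It is not the exact logic of UOT_select_class_unbalanced.py: the select_pretraining_images helper, the feature inputs, the selection budget, and the regularization values are illustrative placeholders.

```python
# Minimal sketch of UOT-based selection with POT (not the exact logic of
# UOT_select_class_unbalanced.py). Features, paths, and the budget are placeholders.
import numpy as np
import ot  # POT: Python Optimal Transport

def select_pretraining_images(pretrain_feats, downstream_feats, pretrain_paths,
                              budget=50000, reg=0.05, reg_m=1.0):
    """Pick pre-training images that receive the most transported mass
    from the downstream distribution under unbalanced OT."""
    # Uniform marginals over downstream (a) and pre-training (b) samples.
    a = np.ones(len(downstream_feats)) / len(downstream_feats)
    b = np.ones(len(pretrain_feats)) / len(pretrain_feats)
    # Cosine cost between downstream and pre-training features.
    M = ot.dist(downstream_feats, pretrain_feats, metric='cosine')
    # Unbalanced Sinkhorn: reg is the entropic term, reg_m relaxes the marginals.
    plan = ot.unbalanced.sinkhorn_unbalanced(a, b, M, reg, reg_m)
    # Mass each pre-training image receives; larger means more relevant.
    mass = plan.sum(axis=0)
    top = np.argsort(-mass)[:budget]
    return [pretrain_paths[i] for i in top]

# Writing the selected paths, in the same spirit as the generated .txt file:
# selected = select_pretraining_images(pt_feats, ds_feats, pt_paths)
# with open('selected_pretraining_images.txt', 'w') as f:
#     f.write('\n'.join(selected))
```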
With the selected images produced by the UOT algorithm, we can fine-tune supervised or self-supervised pre-trained models on a downstream task using the code in Supervised_PT or Self_Supervised_PT. We use the MoCo-v2 model trained for 800 epochs as the self-supervised model. Run the fine-tuning with
bash fine_tune_script.sh
after setting the correct paths for the data and the image-selection file.
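As a rough illustration of how the selection file can be consumed on the fine-tuning side, here is a minimal PyTorch dataset that reads the image paths from the generated txt file. The SelectedPretrainImages class, transforms, and file name are placeholders; the actual loading logic lives in the code under Supervised_PT and Self_Supervised_PT.

```python
# Minimal sketch of consuming the UOT selection file during fine-tuning;
# the real pipeline is driven by fine_tune_script.sh. Transforms are placeholders.
from PIL import Image
from torch.utils.data import Dataset
import torchvision.transforms as T

class SelectedPretrainImages(Dataset):
    """Dataset over the image paths written by the UOT selection step."""
    def __init__(self, selection_file, transform=None):
        with open(selection_file) as f:
            self.paths = [line.strip() for line in f if line.strip()]
        self.transform = transform or T.Compose([
            T.RandomResizedCrop(224), T.RandomHorizontalFlip(), T.ToTensor()])

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert('RGB')
        return self.transform(img)

# Example: loaded alongside the downstream dataset before fine-tuning.
# selected = SelectedPretrainImages(
#     'OT_unnorm_cos_imagenet_OT_select_100_classes_train_samples.txt')
```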
If you use our code in your research, please cite:
@inproceedings{liu2022improved,
  title     = {Improved Fine-Tuning by Better Leveraging Pre-Training Data},
  author    = {Ziquan Liu and Yi Xu and Yuanhong Xu and Qi Qian and Hao Li and Xiangyang Ji and Antoni B. Chan and Rong Jin},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2022}
}
We use the POT (Python Optimal Transport) package for the unbalanced optimal transport computation.