This repository is the official implementation of Deep Compositional Metric Learning.
The majority of this codebase is built upon the research and implementation provided in:
Paper: https://arxiv.org/abs/2002.08473
Repo: https://github.com/Confusezius/Revisiting_Deep_Metric_Learning_PyTorch
Architecturally, our DCML framework is as follows:
CUB200-2011: Download from (http://www.vision.caltech.edu/visipedia/CUB-200-2011.html).
CARS196: Download from (http://ai.stanford.edu/~jkrause/cars/car_dataset.html).
Online Products: Download from (http://cvgl.stanford.edu/projects/lifted_struct/).
Organize the dataset as follows:
CUB200-2011
cub200
└───images
| └───001.Black_footed_Albatross
| │ Black_Footed_Albatross_0001_796111
| │ ...
| ...
CARS196
cars196
└───images
| └───Acura Integra Type R 2001
| │ 00128.jpg
| │ ...
| ...
Online Products
online_products
└───images
| └───bicycle_final
| │ 111085122871_0.jpg
| ...
|
└───Info_Files
| │ bicycle.txt
| │ ...
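Before training, it can help to confirm that a dataset folder follows the layout above. The following is a minimal sketch, not part of this repository: the function name `summarize_dataset` is hypothetical, and it assumes only what the trees show, namely an `images` folder holding one subfolder per class.

```python
from pathlib import Path


def summarize_dataset(root):
    """Count the files in each class folder under <root>/images.

    Assumes the layout shown above: one subfolder per class inside
    `images`, with the image files directly inside each class folder.
    """
    images_dir = Path(root) / "images"
    if not images_dir.is_dir():
        raise FileNotFoundError(f"expected an 'images' folder at {images_dir}")

    # Map each class folder name to the number of files it contains.
    counts = {}
    for class_dir in sorted(images_dir.iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for f in class_dir.iterdir() if f.is_file())
    return counts
```

Calling `summarize_dataset("/path/to/cub200")` (the path is a placeholder) returns a dictionary from class folder name to file count, so an empty or missing class folder is easy to spot.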
- PyTorch 1.2.0+ & Faiss-GPU
- Python 3.7
- pretrainedmodels 0.7.4
- torchvision 0.5.0
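To see which of these packages are present in the current environment, a small check along the following lines may be useful. It is only a sketch: the distribution names in `REQUIRED` are assumptions derived from the list above (Faiss in particular may be installed under a different name), and `importlib.metadata` requires Python 3.8 or newer; on Python 3.7 the `importlib_metadata` backport offers the same interface.

```python
from importlib import metadata  # standard library since Python 3.8

# Assumed distribution names; adjust them to match your installation.
REQUIRED = ("torch", "torchvision", "pretrainedmodels", "faiss-gpu")


def installed_versions(packages=REQUIRED):
    """Return a dict mapping each package to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions
```

Printing `installed_versions()` lists each package with its installed version, with `None` marking anything that could not be found.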
Training is done using main.py, with settings available in parameters.py. The main parameters that control our compositors and ensembles are --compos_num and --ensemble_num.
For example, we initialize --compos_num to 4 and --ensemble_num to 4. Other combinations are also worth trying.
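To try several settings of --compos_num and --ensemble_num, the training commands can be generated programmatically. The sketch below is not part of this repository: the helper `build_commands` is hypothetical, it only assembles command strings from flags that appear in this README, and it assumes all other options can be left at their defaults in parameters.py.

```python
import itertools
import shlex


def build_commands(compos_nums, ensemble_nums, dataset="cub200", loss="margin"):
    """Build one main.py command string per (compos_num, ensemble_num) pair."""
    commands = []
    for compos, ensemble in itertools.product(compos_nums, ensemble_nums):
        args = [
            "python", "main.py",
            "--dataset", dataset,
            "--loss", loss,
            "--compos_num", str(compos),
            "--ensemble_num", str(ensemble),
        ]
        # Quote each argument so the string is safe to paste into a shell.
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands
```

For instance, `build_commands([2, 4], [4, 8])` yields four commands covering every pair; the values 2 and 8 are arbitrary illustrations, not recommended settings.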
To train the DCML model with margin loss on CUB200, run this command:
python main.py --dataset cub200 --tau 55 --gamma 0.2 --gpu 0 --seed 0 --compos_num 4 --ensemble_num 4 --embed_dim 128 --bs 100 \
--n_epochs 300 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize
Our architecture can also be used within the DiVA framework (http://arxiv.org/abs/2004.13458) with this command:
python main.py --dataset cub200 --tau 55 --gamma 0.2 --gpu 0 --seed 0 --compos_num 4 --ensemble_num 4 --embed_dim 128 --bs 100 \
--n_epochs 300 --samples_per_class 2 --loss margin --batch_mining distance --arch resnet50_frozen_normalize \
--diva_ssl fast_moco --lr 0.000015 --evaltypes all --diva_rho_decorrelation 1500 1500 1500 --diva_features discriminative selfsimilarity shared intra \
--diva_moco_temperature 0.01 --diva_moco_n_key_batches 30 --diva_alpha_ssl 0.5 --diva_alpha_shared 0.3 --diva_alpha_intra 0.3
We tested our code on a Linux machine with an Nvidia RTX 2080 Ti GPU. We recommend using a GPU with more than 11 GB of memory.
Revisiting_Deep_Metric_Learning_PyTorch (https://github.com/Confusezius/Revisiting_Deep_Metric_Learning_PyTorch) by Confusezius.