stegmuel/CrOC

CrOC

[webpage] [arXiv]

This repo contains the PyTorch implementation of our CVPR 2023 paper:

CrOC: Cross-View Online Clustering for Dense Visual Representation Learning

Thomas Stegmüller*, Tim Lebailly*, Behzad Bozorgtabar, Tinne Tuytelaars, and Jean-Philippe Thiran.

Dependencies

Our code has only a few dependencies. First, install PyTorch for your machine following https://pytorch.org/get-started/locally/. Then install the remaining dependency:

pip install einops

Pretraining

Single GPU pretraining

Run the main_croc.py file. Command line args are defined in parser.py.

python main_croc.py --args1 val1

Make sure to use the arguments listed for the corresponding model in the table under "Pretrained models" below!

Single node pretraining

python -m torch.distributed.launch --nproc_per_node=8 main_croc.py --args1 val1
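The launcher used above starts one process per GPU and tells each process who it is through environment variables (`RANK`, `WORLD_SIZE`, `LOCAL_RANK`). The snippet below is a minimal sketch of how a script can read them. It is not taken from this repository, and `read_dist_env` is a hypothetical helper name used only for illustration.

```python
import os


def read_dist_env(env=None):
    """Return (rank, world_size, local_rank) as set by the PyTorch launcher.

    Hypothetical helper, not part of this repo. Falls back to the
    single-process values (0, 1, 0) when the variables are absent,
    e.g. when the script is started with plain `python`.
    """
    env = os.environ if env is None else env
    rank = int(env.get("RANK", 0))              # global index of this process
    world_size = int(env.get("WORLD_SIZE", 1))  # total number of processes
    local_rank = int(env.get("LOCAL_RANK", 0))  # GPU index on this node
    return rank, world_size, local_rank


if __name__ == "__main__":
    rank, world_size, local_rank = read_dist_env()
    print(f"process {rank} of {world_size}, local GPU index {local_rank}")
```

With `--nproc_per_node=8` on one node, each of the 8 processes would see a different `RANK` and `LOCAL_RANK` and the same `WORLD_SIZE`.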

Citation

If you find our work useful, please consider citing:

@inproceedings{stegmuller2023croc,
  title={CrOC: Cross-view online clustering for dense visual representation learning},
  author={Stegm{\"u}ller, Thomas and Lebailly, Tim and Bozorgtabar, Behzad and Tuytelaars, Tinne and Thiran, Jean-Philippe},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7000--7009},
  year={2023}
}

Pretrained models

You can download the full checkpoint, which contains the backbone and projection head weights for both the student and teacher networks. We also provide the detailed arguments needed to reproduce our results. Note that the COCO and COCO+ results here are slightly higher than those reported in the paper: we realized that the runs behind the paper's numbers had not finished training for the full 300 epochs.

| pretraining dataset | arch | params | batchsize | LC PVOC12 | LC COCO things | LC COCO stuff | download |
|---|---|---|---|---|---|---|---|
| COCO | ViT-S/16 | 21M | 256 | 54.9% | 55.7% | 49.9% | full ckpt, args |
| COCO+ | ViT-S/16 | 21M | 256 | 61.6% | 64.4% | 52.2% | full ckpt, args |
| ImageNet-1k | ViT-S/16 | 21M | 1024 | 70.6% | 66.1% | 52.6% | full ckpt, args |
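Since the full checkpoint bundles student and teacher weights for both backbone and projection head, downstream use typically needs only one of these pieces. The sketch below shows one way to pick it out. It assumes a DINO-style layout (top-level `"student"` / `"teacher"` entries, keys optionally prefixed with `module.` and then `backbone.` or `head.`), which has not been verified against the released files. `select_submodule` is a hypothetical helper, not part of this repo.

```python
def select_submodule(checkpoint, network="teacher", submodule="backbone"):
    """Extract one submodule's weights from a full checkpoint dict.

    Assumes (unverified) that checkpoint[network] is a state dict whose keys
    look like "module.backbone.<param>" or "backbone.<param>".
    """
    state = checkpoint[network]
    selected = {}
    for key, value in state.items():
        # Drop the wrapper prefix added by DistributedDataParallel, if present.
        if key.startswith("module."):
            key = key[len("module."):]
        # Keep only the requested submodule and strip its prefix.
        if key.startswith(submodule + "."):
            selected[key[len(submodule) + 1:]] = value
    return selected


# Possible usage (the path is a placeholder):
# import torch
# checkpoint = torch.load("path/to/full_checkpoint.pth", map_location="cpu")
# backbone_weights = select_submodule(checkpoint, network="teacher")
# print(sorted(backbone_weights)[:5])  # inspect the keys before loading them
```

Printing `checkpoint.keys()` first is a quick way to confirm whether the assumed layout matches the downloaded file.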

Acknowledgments

This code is adapted from DINO.