PyTorch implementation of Refine and Represent: Region-to-Object Representation Learning.
Apex must be installed to enable DDP (DistributedDataParallel) training.
To log metrics to wandb, set enable_wandb: True in train_imagenet_300.yaml.
python>=3.9
pytorch>=1.10.0
torchvision>=0.11.0
joblib
scikit-image
matplotlib
opencv-python
tqdm
tensorflow
pyyaml
tensorboardx
wandb
pycocotools
classy_vision
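A quick way to confirm that the requirements above are importable is sketched below. The import names are assumptions mapped from the listed package names (for example, opencv-python is assumed to import as cv2), and `missing_modules` is a hypothetical helper that is not part of this repo. It does not check the minimum versions.

```python
import importlib.util

# Assumed mapping from the package names listed above to their import names.
REQUIRED_MODULES = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "joblib": "joblib",
    "scikit-image": "skimage",
    "matplotlib": "matplotlib",
    "opencv-python": "cv2",
    "tqdm": "tqdm",
    "tensorflow": "tensorflow",
    "pyyaml": "yaml",
    "tensorboardx": "tensorboardX",
    "wandb": "wandb",
    "pycocotools": "pycocotools",
    "classy_vision": "classy_vision",
}


def missing_modules(modules=REQUIRED_MODULES):
    """Return the package names whose top-level module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_modules()
    print("missing packages:", ", ".join(missing) if missing else "none")
```

Running the script before pretraining prints any packages that still need to be installed.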
This repo uses torch.distributed.launch for pretraining:
python -m torch.distributed.launch --nproc_per_node=4 --nnodes=32 --node_rank=0 --master_addr="" --master_port=12345 r2o_main.py --cfg={CONFIG_FILENAME}
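Because --node_rank differs on every node, one option is to generate the command above rather than edit it by hand. The following is a sketch only: `build_launch_command` is a hypothetical helper, and the master address and config file name in the usage are placeholders that you must supply yourself.

```python
import shlex
import sys


def build_launch_command(node_rank, master_addr, cfg,
                         nproc_per_node=4, nnodes=32, master_port=12345):
    """Assemble the torch.distributed.launch argument list for one node."""
    if not 0 <= node_rank < nnodes:
        raise ValueError(f"node_rank must be in [0, {nnodes}), got {node_rank}")
    return [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        f"--nnodes={nnodes}",
        f"--node_rank={node_rank}",
        f"--master_addr={master_addr}",
        f"--master_port={master_port}",
        "r2o_main.py",
        f"--cfg={cfg}",
    ]


if __name__ == "__main__":
    # Placeholder values: replace with your master node address and config file.
    cmd = build_launch_command(0, "MASTER_ADDR", "CONFIG_FILENAME")
    # Print the command; pass cmd to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```

The defaults mirror the values in the command above (4 processes per node, 32 nodes, port 12345); adjust them to match your cluster.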
imagenet
├── images
│ ├── train
│ │ ├── n01440764
│ │ ├── ...
│ │ ├── n15075141
│ ├── val
│ │ ├── n01440764
│ │ ├── ...
│ │ ├── n15075141
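Before starting a long pretraining run, it can help to check that the dataset matches the layout above. This is a minimal sketch, assuming only the directory structure shown; `check_imagenet_layout` is a hypothetical helper, not part of this repo, and it does not inspect the image files themselves.

```python
from pathlib import Path


def check_imagenet_layout(root):
    """Return a list of problems found under root/images/{train,val}."""
    problems = []
    for split in ("train", "val"):
        split_dir = Path(root) / "images" / split
        if not split_dir.is_dir():
            problems.append(f"missing directory: {split_dir}")
            continue
        # Each class is expected to be a sub-directory such as n01440764.
        class_dirs = [p for p in split_dir.iterdir() if p.is_dir()]
        if not class_dirs:
            problems.append(f"no class folders in: {split_dir}")
    return problems


if __name__ == "__main__":
    # "imagenet" is a placeholder path; point it at your dataset root.
    for problem in check_imagenet_layout("imagenet"):
        print(problem)
```

An empty result means both splits exist and each contains at least one class folder.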
We release weights pretrained on ImageNet-1k for 300 epochs in original, torchvision, and d2 formats.
Original [Download]
Converted: Torchvision (MMSegmentation) [Download] D2 (Detectron2) [Download]
The evaluation baselines are as follows:
Metric | Value |
---|---|
PASCAL VOC mIoU | 77.3 |
Cityscapes mIoU | 76.6 |
MS COCO | 41.7 |
MS COCO | 38.3 |
@misc{gokul2022refine,
title = {Refine and Represent: Region-to-Object Representation Learning},
author = {Gokul, Akash and Kallidromitis, Konstantinos and Li, Shufan and Kato, Yusuke and Kozuka, Kazuki and Darrell, Trevor and Reed, Colorado J},
journal={arXiv preprint arXiv:2208.11821},
year = {2022}
}
We use MMSegmentation for PASCAL VOC and Cityscapes semantic segmentation, and detectron2 for MS COCO object detection and instance segmentation. The corresponding configs can be found in the evaluation folder.
This repo is based on the BYOL implementation from Yao (https://github.com/yaox12/BYOL-PyTorch) and the K-Means implementation from Ali Hassani (https://github.com/alihassanijr/TorchKMeans).