
Self-Supervised Learning from Non-Object Centric Images with a Geometric Transformation Sensitive Architecture

bok3948/GTSA


GTSA: A PyTorch Implementation

This is the official PyTorch implementation of the paper "Self-Supervised Learning from Non-Object Centric Images with a Geometric Transformation Sensitive Architecture".

Requirements

  • PyTorch: 1.13.1
  • CUDA: 11.6
  • timm: 0.6.13
  • kornia: 0.6.11
  • mmsegmentation: v0.30.0
  • mmdetection: v2.28.1
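
Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and the pip distribution names (`torch`, `mmdet`, and so on) are assumptions about how each package is registered in your environment.

```python
from importlib import metadata

# Versions copied from the requirements list above. The distribution names
# (dictionary keys) are assumptions about how each package is named in pip.
REQUIRED = {
    "torch": "1.13.1",
    "timm": "0.6.13",
    "kornia": "0.6.11",
    "mmsegmentation": "0.30.0",
    "mmdet": "2.28.1",
}


def installed_version(name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(required, lookup=installed_version):
    """Map each package name to a tuple (expected, found, ok)."""
    report = {}
    for name, expected in required.items():
        found = lookup(name)
        # Ignore a local build tag such as "+cu116" when comparing.
        ok = found is not None and found.split("+")[0] == expected
        report[name] = (expected, found, ok)
    return report


if __name__ == "__main__":
    for name, (expected, found, ok) in check_versions(REQUIRED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:16s} expected {expected:8s} found {found} [{status}]")
```

The CUDA version is not covered by this check; if PyTorch is installed, `torch.version.cuda` reports the CUDA version it was built against.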

Pre-Training GTSA with Non-Object Centric Images


To pre-train ViT-Small (the recommended default) with single-node distributed training, run the following on 1 node with 8 GPUs. The default number of pre-training epochs is 100. Replace /data_path with the path to your COCO or ADE20K dataset.

python -m torch.distributed.launch --nnodes 1 --nproc_per_node 8 main_pretrain.py --data /data_path --batch_size 64 --model gtsa_small
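
If you launch pre-training on several datasets, it may be convenient to assemble the command programmatically. The sketch below only rebuilds the command shown above from its parts; `build_pretrain_cmd` is a hypothetical helper, not something provided by this repository, and it uses no flags beyond those in the command above.

```python
import shlex


def build_pretrain_cmd(data_path, nproc_per_node=8, batch_size=64,
                       model="gtsa_small"):
    """Return the single-node pre-training command as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        "--nnodes", "1",
        "--nproc_per_node", str(nproc_per_node),
        "main_pretrain.py",
        "--data", data_path,
        "--batch_size", str(batch_size),
        "--model", model,
    ]


# "/data_path" is a placeholder for the COCO or ADE20K directory.
cmd = build_pretrain_cmd("/data_path")
print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.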

The following table provides the pre-trained checkpoints used in the paper.

| Model | Pretraining Data | Pretrain Epochs | Checkpoint |
| --- | --- | --- | --- |
| GTSA (ours) | COCO train2017 | 100 | Download |
| GTSA (ours) | ADE20K (2016) train | 100 | Download |
| DINO | COCO train2017 | 100 | Download |
| DINO | ADE20K (2016) train | 100 | Download |

Fine-tuning with pre-trained checkpoints


1. Classification

We evaluated our models on the iNaturalist 2019 classification benchmark.

To fine-tune ViT-Small on the iNat19 dataset, first go to the ./downstream/classification directory, then run the following on 1 node with 8 GPUs.

python -m torch.distributed.launch --nproc_per_node=8 --nnodes 1 main_finetune.py --accum_iter 1 --batch_size 128 --model vit_small --finetune /your_checkpoint --epochs 300 --blr 5e-4 --layer_decay 0.65 --weight_decay 0.05 --drop_path 0.1 --mixup 0.8 --cutmix 1.0 --reprob 0.25 --dist_eval
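
The flags `--blr`, `--batch_size`, and `--accum_iter` interact. The sketch below works through the arithmetic under two assumptions that this README does not confirm: that `--batch_size` is the per-GPU batch size, and that the script converts the base learning rate to an absolute one with the linear scaling rule `lr = blr * effective_batch_size / 256`, as MAE-style fine-tuning scripts commonly do. Check `main_finetune.py` before relying on these numbers.

```python
def effective_batch_size(batch_size, accum_iter, world_size):
    """Samples contributing to one optimizer step across all GPUs."""
    return batch_size * accum_iter * world_size


def absolute_lr(blr, eff_batch_size, base=256):
    """Linear scaling rule (assumed): scale blr by eff_batch_size / base."""
    return blr * eff_batch_size / base


# Values taken from the command above: 8 GPUs, batch size 128, no accumulation.
eff = effective_batch_size(batch_size=128, accum_iter=1, world_size=8)
lr = absolute_lr(blr=5e-4, eff_batch_size=eff)
print(f"effective batch size: {eff}, absolute lr: {lr:.4g}")
```

Under these assumptions, running on fewer GPUs would require raising `--accum_iter` proportionally to keep the same effective batch size.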

The following table provides the fine-tuning logs.

| Model | Pretraining Data | Pretrain Epochs | Fine-tuning Data | Log |
| --- | --- | --- | --- | --- |
| GTSA (ours) | COCO train2017 | 100 | iNaturalist 2019 | Download |
| DINO | COCO train2017 | 100 | iNaturalist 2019 | Download |

The results should be:

| Method | Top-1 Acc | Top-5 Acc |
| --- | --- | --- |
| DINO | 54.8 | 82.9 |
| GTSA (ours) | 59.7 | 85.7 |


2. Detection & Instance Segmentation

We evaluated our models on the COCO 2017 detection and instance segmentation benchmark with a Mask R-CNN model.

To fine-tune Mask R-CNN on the COCO dataset, first download mmdetection, then use our configs and model (in /downstream/mmdet). Run the following command from the mmdetection directory.

 tools/dist_train.sh /your_path/GTSA/downstream/mmdet/my_configs/CoCo_GTSA_mask_rcnn_vit_small_12_p16_1x_coco.py 8 --work-dir ./save

The following table provides the fine-tuning logs.

| Model | Pretraining Data | Pretrain Epochs | Fine-tuning Data | Log |
| --- | --- | --- | --- | --- |
| GTSA (ours) | COCO train2017 | 100 | COCO 2017 | Download |
| DINO | COCO train2017 | 100 | COCO 2017 | Download |

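The results should be (AP^b columns are detection, AP^m columns are instance segmentation):

| Method | AP^b | AP^b_50 | AP^b_75 | AP^m | AP^m_50 | AP^m_75 |
| --- | --- | --- | --- | --- | --- | --- |
| DINO | 32.4 | 54.2 | 33.8 | 30.8 | 51.1 | 32.2 |
| GTSA (ours) | 35.8 | 57.8 | 38.5 | 33.5 | 54.7 | 35.3 |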
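
The 50 and 75 columns in the results above refer to average precision at IoU thresholds of 0.5 and 0.75, following the usual COCO convention. As a reminder of what that threshold measures, here is a minimal sketch of box IoU for axis-aligned boxes in `(x1, y1, x2, y2)` format. The function is illustrative only; the reported numbers come from the mmdetection evaluation, not from this code.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match at this threshold if IoU >= threshold."""
    return box_iou(pred_box, gt_box) >= threshold
```

For mask AP the same idea applies, with the overlap computed over pixels of the predicted and ground-truth masks instead of box areas.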


3. Semantic Segmentation

We evaluated our models on the ADE20K semantic segmentation benchmark.

To fine-tune Semantic FPN on the ADE20K dataset, first download mmsegmentation. Second, convert the checkpoint to the mmsegmentation ViT style with the following command.

python tools/model_converters/vit2mmseg.py /your_checkpoint ./new_checkpoint_name
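
Conceptually, a conversion of this kind renames the parameter keys in the checkpoint so that they match the names the target backbone expects. The sketch below illustrates the idea on a plain dictionary. The rename rules and the `head.` prefix are hypothetical examples chosen for illustration, not the actual mapping used by `vit2mmseg.py`; use the script above for real conversions.

```python
from collections import OrderedDict

# Illustrative (old substring, new substring) pairs. These are assumptions
# for demonstration and are NOT the converter's real mapping.
RENAME_RULES = [
    ("blocks.", "layers."),
    ("patch_embed.proj.", "patch_embed.projection."),
]


def rename_keys(state_dict, rules=RENAME_RULES, drop_prefixes=("head.",)):
    """Return a new ordered dict with renamed keys; values are untouched."""
    converted = OrderedDict()
    for key, value in state_dict.items():
        # Skip parameters the target model does not use (e.g. a classifier head).
        if key.startswith(drop_prefixes):
            continue
        new_key = key
        for old, new in rules:
            new_key = new_key.replace(old, new)
        converted[new_key] = value
    return converted


# Toy checkpoint with placeholder values instead of tensors.
toy = OrderedDict([
    ("patch_embed.proj.weight", 0),
    ("blocks.0.attn.qkv.weight", 1),
    ("head.weight", 2),
])
print(list(rename_keys(toy)))
```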

Finally, use our configs (in /downstream/mmseg). Run the following command from the mmsegmentation directory.

 tools/dist_train.sh /your_path/GTSA/downstream/mmseg/my_configs/ADE20K_GTSA_pretrained_semfpn_vit-s16_512_512_40k_ade20k.py  8 --work-dir ./save --seed 0 --deterministic

The following table provides the fine-tuning logs.

| Model | Pretraining Data | Pretrain Epochs | Fine-tuning Data | Log |
| --- | --- | --- | --- | --- |
| GTSA (ours) | COCO train2017 | 100 | ADE20K | Download |
| GTSA (ours) | ADE20K (2016) train | 100 | ADE20K | Download |
| DINO | COCO train2017 | 100 | ADE20K | Download |
| DINO | ADE20K (2016) train | 100 | ADE20K | Download |

The results should be:

| Method | aAcc | mIoU | mAcc |
| --- | --- | --- | --- |
| DINO | 74.7 | 27.3 | 35.9 |
| GTSA (ours) | 76.4 | 30.6 | 40.0 |
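
For readers unfamiliar with the three metrics, the following sketch shows how they are commonly derived from a confusion matrix: aAcc is overall pixel accuracy, mAcc is the mean of per-class accuracies, and mIoU is the mean of per-class IoUs. It is a simplified pure-Python illustration that skips classes with no pixels, and it is not the mmsegmentation evaluation code that produced the numbers above. It returns fractions, while the results above are percentages.

```python
def segmentation_metrics(confusion):
    """Compute aAcc, mIoU, and mAcc from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    accs, ious = [], []
    for i in range(n):
        tp = confusion[i][i]
        gt = sum(confusion[i])                          # pixels labeled i
        pred = sum(confusion[j][i] for j in range(n))   # pixels predicted i
        if gt > 0:
            accs.append(tp / gt)
        union = gt + pred - tp
        if union > 0:
            ious.append(tp / union)

    return {
        "aAcc": correct / total,
        "mIoU": sum(ious) / len(ious),
        "mAcc": sum(accs) / len(accs),
    }


# Toy two-class example: rows are ground truth, columns are predictions.
metrics = segmentation_metrics([[3, 1], [1, 5]])
print({name: round(value, 4) for name, value in metrics.items()})
```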
