GOCA

Guided Online Cluster Assignment for Self-Supervised Video Representation Learning.
Official PyTorch implementation of the ECCV 2022 paper. Feel free to contact hcoskun-at-snap.com if you have any questions.

Overview

We propose a principled way to combine the two views of a video (RGB and optical flow). Specifically, we propose a novel clustering strategy in which the initial cluster assignment of each modality is used as a prior to guide the final cluster assignment of the other modality. This enforces similar cluster structures for both modalities, and the resulting clusters are semantically abstract and robust to noisy inputs from each individual modality.
You can find the implementation of this idea in sinkhorn_withprior.
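To make the guided assignment concrete, here is a minimal sketch of a Sinkhorn-Knopp iteration biased by the other modality's assignment, in the spirit of sinkhorn_withprior (the function name comes from this repo, but the shapes, the epsilon value, and the exact way the prior enters are our assumptions; see the repository code for the actual implementation):

import torch

@torch.no_grad()
def sinkhorn_with_prior(scores, prior, n_iters=3, eps=0.05):
    # scores: (B, K) similarities of B clips to K prototypes for one modality.
    # prior:  (B, K) soft assignment from the other modality, biasing the
    #         transport before the balancing iterations (assumed form).
    Q = torch.exp(scores / eps) * prior
    Q = (Q / Q.sum()).T                  # (K, B), normalized joint distribution
    K, B = Q.shape
    for _ in range(n_iters):
        Q *= (1.0 / K) / Q.sum(dim=1, keepdim=True)  # each prototype gets 1/K of the mass
        Q *= (1.0 / B) / Q.sum(dim=0, keepdim=True)  # each clip carries 1/B of the mass
    return (Q * B).T                     # (B, K), rows sum to 1: soft assignments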

Link: [arXiv]

Prerequisites

  • Python 3.7
  • PyTorch 1.4.0, torchvision 0.5.0
  • CUDA 10.2
  • Apex with CUDA extension (see also: this issue)

Preparing Dataset

  1. Download the dataset: sh datasets/ds_prep/kinetics-400/download.sh
  2. Extract the rar files: sh datasets/ds_prep/kinetics-400/extract.sh
  3. We use the TVL1 algorithm to compute optical flow. We modified the MemDPC code for efficient GPU utilization.
    1. Run this script: python datasets/ds_prep/efficent_optical_flow_with_GPU.py
    2. If you have more than one GPU to dedicate to optical-flow computation, you can run this script once per GPU.
    3. Unfortunately, we couldn't find a way to compute optical flow batch-wise with OpenCV; if you manage it, please let us know. A per-pair GPU sketch follows this list.
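For reference, here is a minimal per-frame-pair sketch of GPU TVL1 flow using OpenCV's CUDA bindings (this is not the repository's script; it assumes an OpenCV build with CUDA support, and the clipping/quantization constants are common conventions, not necessarily the ones used here):

import cv2
import numpy as np

# Requires OpenCV built with CUDA; cv2.cuda_OpticalFlowDual_TVL1 is the
# CUDA TVL1 implementation exposed by OpenCV's Python bindings.
tvl1 = cv2.cuda_OpticalFlowDual_TVL1.create()

def flow_pair(prev_bgr, next_bgr):
    g_prev, g_next = cv2.cuda_GpuMat(), cv2.cuda_GpuMat()
    g_prev.upload(cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY))
    g_next.upload(cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY))
    flow = tvl1.calc(g_prev, g_next, None).download()  # (H, W, 2) float32
    # Common convention: truncate large displacements and quantize to uint8
    # for compact storage (assumed values, not necessarily the repo's).
    flow = np.clip(flow, -20, 20)
    return ((flow + 20) * (255.0 / 40.0)).astype(np.uint8)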

Pretraining Instructions

  1. Generate prototypes: python prots/prototypes.py

    1. This saves the prototypes to "prot/fls/", and the model loads them from there. If you save them to another location, please update "helper/opt_aug.py".
    2. Please make sure use_precomp_prot is set to true; otherwise the model will use randomly generated prototypes.
    3. Trained prototypes should look like Figure 3 (on the right) in the paper. A rough sketch of such pre-computation follows this list.
  2. Run pre-training: sh scripts/pretrain_on_cluster.sh

    1. The above script is for multi-node Slurm training; however, the code can be used for single-node training as well.
    2. Please set your dataset location in the ".sh" file or in "helper/opt_aug.py".
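To illustrate what pre-computed prototypes might look like, the sketch below spreads unit-norm prototype vectors apart on the hypersphere by penalizing their largest pairwise cosine similarity (the counts, dimensions, objective, and file name are all our assumptions for illustration; prots/prototypes.py is the authoritative version):

import torch
import torch.nn.functional as F

def make_prototypes(num_prots=3000, dim=128, steps=1000, lr=0.1):
    # Start from random vectors and push them apart on the unit sphere.
    prots = torch.randn(num_prots, dim, requires_grad=True)
    opt = torch.optim.SGD([prots], lr=lr)
    eye = torch.eye(num_prots)
    for _ in range(steps):
        p = F.normalize(prots, dim=1)
        sim = p @ p.T - 2.0 * eye            # mask out self-similarity
        loss = sim.max(dim=1).values.mean()  # worst-case neighbor similarity
        opt.zero_grad()
        loss.backward()
        opt.step()
    return F.normalize(prots.detach(), dim=1)

# "prot/fls/" is the load path mentioned above; the file name is assumed.
torch.save(make_prototypes(), "prot/fls/prototypes.pth")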

Nearest-Neighbour Retrieval Instructions

You can use the following script for evaluation. Make sure the "root_dir" argument is set correctly.

 sh scripts/knn_on_cluster.sh

Please update "root_dir" to point to your computed features. The model generates the features during the evaluation stage. You can set where to save them in the
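For context, nearest-neighbour retrieval is typically scored as below: for each test clip, retrieve the k nearest training clips by cosine similarity and count a hit if any neighbor shares the query's class (a generic R@k sketch under the common protocol, not the repository's script; all names are illustrative):

import torch
import torch.nn.functional as F

def recall_at_k(test_feats, test_labels, train_feats, train_labels, ks=(1, 5, 10, 20)):
    q = F.normalize(test_feats, dim=1)
    g = F.normalize(train_feats, dim=1)
    sim = q @ g.T                               # (num_test, num_train) cosine similarities
    nn_idx = sim.topk(max(ks), dim=1).indices   # indices of nearest training clips
    nn_labels = train_labels[nn_idx]            # (num_test, max_k) neighbor classes
    hits = nn_labels == test_labels.unsqueeze(1)
    return {k: hits[:, :k].any(dim=1).float().mean().item() for k in ks}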

Acknowledgements

We used code from SeLaVi, SwAV, ViCC, and CoCLR.

Notes On Code

  • We are still cleaning the code, so you might see some unused methods; please ignore them.
  • We experimented with float-16 training; however, we observed a significant drop in accuracy and couldn't resolve it. Ideally, accuracy shouldn't change that much. If anyone is interested in this code, we can provide it as well.
  • Please be careful with the OpenCV implementation of optical flow. We observed that computed optical flows can differ significantly.
    • We extract optical flow at an image size of 256.
    • You should use the same parameters to extract optical flow for all datasets.
  • We did our best to follow common evaluation strategies; however, earlier works differ. We mostly follow: A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning. Most works follow it as well, but we observed the following differences:
    • CoCLR and ViCC use different learning-rate schedulers during fine-tuning.
    • Fine-tuning durations differ between works.
    • SeLaVi evaluates with different features than the others (extracted from a different layer, with an embedding size of 4096). In all our experiments we use 2048; we did not see a significant difference with 4096.
    • The number of projection layers also varies significantly across earlier works.
    • We also observed significant differences in optimizers and learning-rate schedulers during pre-training.

Citation

@inproceedings{goca,
  title={GOCA: Guided Online Cluster Assignment for Self Supervised Video Representation Learning},
  author={Coskun, Huseyin and Zareian, Alireza and Moore, Joshua L and Tombari, Federico and Wang, Chen},
  booktitle={ECCV},
  year={2022}
}
