Learning the What and How of Annotation in Video Object Segmentation

Created by Thanos Delatolas, Vicky Kalogeiton, Dim P. Papadopoulos

[Paper (WACV 2024)] [Project page] [Extended Abstract (ICCV-W 2023)]

Installation

conda env create -f environment.yaml

Data

Download the data with python download_data.py. The data should be arranged with the following layout:

data
├── DAVIS_17
│   ├── Annotations
│   ├── ImageSets
│   └── JPEGImages
└── MOSE
    ├── Annotations
    ├── ImageSets
    └── JPEGImages

The script download_data.py also creates the train/val/test splits of MOSE, as discussed in the paper. If gdown denies access to the MOSE dataset, you can download MOSE manually from here and place it in the directory ./data/MOSE/
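After a download (scripted or manual), it can help to confirm that the directories match the layout above before starting any training. The following is a minimal sketch, not part of the repository: the helper name missing_data_dirs is hypothetical, and it only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Expected layout, copied from the directory tree shown above.
EXPECTED_SUBDIRS = {
    "DAVIS_17": ("Annotations", "ImageSets", "JPEGImages"),
    "MOSE": ("Annotations", "ImageSets", "JPEGImages"),
}


def missing_data_dirs(root="data"):
    """Return the expected dataset directories that do not exist under root.

    Hypothetical helper for illustration; it is not provided by the repository.
    """
    root = Path(root)
    missing = []
    for dataset, subdirs in EXPECTED_SUBDIRS.items():
        for sub in subdirs:
            path = root / dataset / sub
            if not path.is_dir():
                missing.append(path)
    return missing


# Report anything that is absent; prints nothing when the layout is complete.
for path in missing_data_dirs("data"):
    print(f"missing: {path}")
```

An empty result only means the folders are present; whether the MOSE splits were created is still up to download_data.py.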

Download weights

Download the model weights with python download_weights.py. The weights should be arranged with the following layout:

model_weights
├── mivos
│   ├── stcn.pth
│   └── fusion.pth
├── qnet
│   └── qnet.pth
├── rl_agent
│   └── model.pth
└── sam
    └── sam.pth

We provide the weights of MiVOS trained only on YouTube-VOS. To replicate the MiVOS training process, please refer to the original MiVOS repository.

Training

1. Generate the frame quality dataset: python generate_fq_dataset.py --imset subset_train_4 and python generate_fq_dataset.py --imset val
2. Train QNet: python train_qnet.py
3. Generate the annotation type dataset: python generate_annotation_dataset.py --imset subset_train_4
4. Train the RL agent: python train_rl_agent.py
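The training steps above are listed in sequence, and each appears to build on the output of the one before it, so they can be chained in a small driver. The sketch below is an illustration under that assumption, not a script shipped with the repository: the names TRAINING_STEPS and run_steps are hypothetical, while the commands themselves are copied from the steps above. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Commands copied from the training steps above, in the order listed.
TRAINING_STEPS = [
    ["python", "generate_fq_dataset.py", "--imset", "subset_train_4"],
    ["python", "generate_fq_dataset.py", "--imset", "val"],
    ["python", "train_qnet.py"],
    ["python", "generate_annotation_dataset.py", "--imset", "subset_train_4"],
    ["python", "train_rl_agent.py"],
]


def run_steps(steps=TRAINING_STEPS, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in order.

    Hypothetical helper for illustration. Execution stops at the first
    failing step because check=True raises CalledProcessError.
    """
    shown = []
    for cmd in steps:
        line = " ".join(cmd)
        print(line)
        shown.append(line)
        if not dry_run:
            subprocess.run(cmd, check=True)
    return shown


# Dry run: lists the commands without executing anything.
run_steps(dry_run=True)
```

Calling run_steps(dry_run=False) from the repository root would execute the steps for real; stopping at the first failure avoids training on a dataset that was only partially generated.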

Experiments

The script eval_annotation_method.py runs all annotation methods, and scripts/eval.sh runs all the experiments. The scripts vis/frame_selection.py and vis/full_pipeline.py plot the results of those experiments. To speed up the process, we recommend running the experiments in parallel on multiple GPUs.
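One common way to spread independent experiments over several GPUs is to assign jobs round-robin and pin each worker to a device through the CUDA_VISIBLE_DEVICES environment variable. The sketch below illustrates that idea only. The helpers assign_round_robin, run_on_gpu, and launch are hypothetical, and the job names are placeholders: the command-line arguments of eval_annotation_method.py are not documented in this README, so none are assumed here.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def assign_round_robin(jobs, gpu_ids):
    """Map each GPU id to the jobs it should run, in round-robin order."""
    if not gpu_ids:
        raise ValueError("need at least one GPU id")
    plan = {gpu: [] for gpu in gpu_ids}
    for i, job in enumerate(jobs):
        plan[gpu_ids[i % len(gpu_ids)]].append(job)
    return plan


def run_on_gpu(gpu, commands):
    """Run the commands one after another with only the given GPU visible."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu))
    for cmd in commands:
        subprocess.run(cmd, env=env, check=True)


def launch(plan):
    """Start one worker thread per GPU; each command must be an argument list."""
    with ThreadPoolExecutor(max_workers=len(plan)) as pool:
        futures = [pool.submit(run_on_gpu, gpu, cmds) for gpu, cmds in plan.items()]
        for future in futures:
            future.result()  # re-raises the first failure, if any


# Placeholder job names; in practice each job would be a full command list.
plan = assign_round_robin(["job_a", "job_b", "job_c"], gpu_ids=[0, 1])
print(plan)
```

Here launch is defined but not called, since it needs real command lists. Running the jobs of one GPU sequentially keeps each device to a single process at a time, which avoids competing for GPU memory.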

Citation

@inproceedings{delatolas2024learning,
  title={Learning the What and How of Annotation in Video Object Segmentation},
  author={Thanos Delatolas and Vicky Kalogeiton and Dim P. Papadopoulos},
  year={2024},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)}
}
