Object-centric Representation Benchmark

This repository contains the code, data and benchmark leaderboard from the paper *Benchmarking Unsupervised Object Representations for Video Sequences* by M.A. Weis, K. Chitta, Y. Sharma, W. Brendel, M. Bethge, A. Geiger and A.S. Ecker (2021).

Code for training OP3, TBA and SCALOR was adapted from the original OP3, TBA and SCALOR codebases.

Installation

```
python3 setup.py install
```

Datasets

Download data from OSF to ocrb/data/datasets.

Available datasets:

  • Video Multi-dSprites (VMDS)
  • Sprites-MOT (SpMOT)
  • Video Object Room (VOR)
  • Textured Video Multi-dSprites (texVMDS)


Extract Data

Extract data from hdf5 files:

```
python3 ocrb/data/extract_data.py --path='ocrb/data/datasets/' --dataset='vmds'
```
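Before extracting, it can help to check what a downloaded HDF5 file contains. The sketch below is illustrative only: the actual group and dataset names are defined by the extraction script, and the filename in the usage comment is a hypothetical example.

```python
# Minimal sketch: recursively print the layout of an HDF5 dataset file.
import h5py


def print_structure(path):
    """Print every dataset in an HDF5 file with its shape and dtype."""
    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
    with h5py.File(path, "r") as f:
        f.visititems(visit)


# print_structure("ocrb/data/datasets/vmds_train.hdf5")  # hypothetical filename
```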

Training

Training ViMON

To run ViMON training:

```
python3 ocrb/vimon/main.py --config='ocrb/vimon/config.json'
```

where hyperparameters are specified in the config file.
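A minimal sketch of how such a JSON config of hyperparameters is typically consumed. The keys shown ("batch_size", "lr") are hypothetical placeholders, not the actual ViMON config schema.

```python
# Load a dict of hyperparameters from a JSON config file.
import json


def load_config(path):
    """Return the hyperparameter dict stored in a JSON config file."""
    with open(path) as f:
        return json.load(f)


# Hypothetical usage:
# cfg = load_config("ocrb/vimon/config.json")
# lr = cfg["lr"]
```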

Training OP3

To run OP3 training:

```
python3 ocrb/op3/main.py --va vmds
```

where the --va flag selects the dataset: vmds, vor or spmot. Hyperparameters for each dataset can be found in the corresponding file. For details, see the original OP3 repository.

Training TBA

For TBA training, the input datasets need to be pre-processed into batches, for which we provide a function:

```
python3 ocrb/tba/data/create_batches.py --batch_size=64 --dataset='vmds' --mode='train'
```
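The actual batching is done by create_batches.py; the sketch below only illustrates the underlying idea of grouping video indices into fixed-size batches, with a possibly smaller final batch.

```python
# Illustrative sketch: group consecutive video indices into batches.
def make_batches(num_videos, batch_size):
    """Return lists of video indices, each of length batch_size (last may be shorter)."""
    indices = list(range(num_videos))
    return [indices[i:i + batch_size] for i in range(0, num_videos, batch_size)]


# e.g. make_batches(10, 4) -> [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```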

To run training:

```
python3 ocrb/tba/run.py --task vmds
```

The --task flag can be set to vmds, spmot or vor. For details regarding other training flags, see the original TBA repository.

Evaluation

Generating ViMON annotation file

To generate the annotation file with mask and object ID predictions per frame for each video in the test set, run:

```
python3 ocrb/vimon/generate_pred_json.py --config='ocrb/vimon/config.json' --ckpt_file='ocrb/vimon/ckpts/pretrained/ckpt_vimon_vmds.pt' --out_path='ocrb/vimon/ckpts/pretrained/vmds_pred_list.json'
```

where hyperparameters, including the dataset, are specified in the ocrb/vimon/config.json file and --ckpt_file gives the path to the trained model weights.
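A hedged sketch of consuming such an annotation file. The exact JSON schema is defined by generate_pred_json.py; here we only assume the top level is a list with one entry per video and one sub-entry per frame.

```python
# Illustrative sketch: count the annotated frames per video in a prediction file.
import json


def frames_per_video(pred_path):
    """Return the number of annotated frames for each video in the file."""
    with open(pred_path) as f:
        videos = json.load(f)  # assumed: one list entry per test video
    return [len(frames) for frames in videos]  # assumed: one sub-entry per frame
```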

Generating OP3 annotation file

To generate the annotation file with mask and object ID predictions per frame for each video in the test set, run:

```
python3 ocrb/op3/generate_pred_json.py --va vmds --ckpt_file='ocrb/op3/ckpts/vmds_params.pkl' --out_path='ocrb/op3/ckpts/vmds_pred_list.json'
```

where hyperparameters can be found in the corresponding file and --ckpt_file gives the path to the trained model weights. For details, see the original OP3 repository.

Generating TBA annotation file

To generate the annotation file for TBA, run:

```
python3 ocrb/tba/run.py --task vmds --metric 1 --v 2 --init_model sp_latest.pt
```

The annotation file is generated in the folder ocrb/tba/pic. For details regarding other evaluation flags see the original TBA repository.

Evaluating MOT metrics

To compute MOT metrics, run:

```
python3 ocrb/eval/eval_mot.py --gt_file='ocrb/data/gt_jsons/vmds_test.json' --pred_file='ocrb/vimon/ckpts/pretrained/vmds_pred_list.json' --results_path='ocrb/vimon/ckpts/pretrained/vmds_results.json' --exclude_bg
```

where --gt_file specifies the path to the ground truth annotation file, --pred_file specifies the path to the annotation file containing the model predictions and --results_path gives the path where to save the result dictionary. Set --exclude_bg to exclude background segmentation masks from the evaluation.
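The MOTA values reported in the leaderboard follow the standard CLEAR MOT definition: MOTA = 1 − (misses + false positives + ID switches) / ground-truth objects. Since the tables report Miss, FPs and ID switches as percentages of ground-truth objects, MOTA in percent reduces to a simple subtraction, which you can verify against any leaderboard row:

```python
# CLEAR MOT accuracy from per-error-type percentages.
def mota_percent(miss, fps, id_switches):
    """MOTA in percent, given Miss, FPs and ID switches as percentages."""
    return 100.0 - (miss + fps + id_switches)


# e.g. SCALOR on SpMOT: Miss=2.4, FPs=1.0, ID S.=1.7 -> MOTA = 94.9
```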

Leaderboard

Analysis of SOTA object-centric representation learning models for MOT. Results are shown as mean ± standard deviation of three runs with different random training seeds. Models are ranked according to MOTA for each dataset. If you want to add your own method and results on any of the datasets, please open a pull request adding the results to the tables below.

SpMOT

| Rank | Model | Reference | MOTA ↑ | MOTP ↑ | MD ↑ | MT ↑ | Match ↑ | Miss ↓ | ID S. ↓ | FPs ↓ | MSE ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SCALOR | Jiang et al. 2020 | 94.9 ± 0.5 | 80.2 ± 0.1 | 96.4 ± 0.1 | 93.2 ± 0.7 | 95.9 ± 0.4 | 2.4 ± 0.0 | 1.7 ± 0.4 | 1.0 ± 0.1 | 3.4 ± 0.1 |
| 2 | ViMON | Weis et al. 2020 | 92.9 ± 0.2 | 91.8 ± 0.2 | 87.7 ± 0.8 | 87.2 ± 0.8 | 95.0 ± 0.2 | 4.8 ± 0.2 | 0.2 ± 0.0 | 2.1 ± 0.1 | 11.1 ± 0.6 |
| 3 | OP3 | Veerapaneni et al. 2019 | 89.1 ± 5.1 | 78.4 ± 2.4 | 92.4 ± 4.0 | 91.8 ± 3.8 | 95.9 ± 2.2 | 3.7 ± 2.2 | 0.4 ± 0.0 | 6.8 ± 2.9 | 13.3 ± 11.9 |
| 4 | TBA | He et al. 2019 | 79.7 ± 15.0 | 71.2 ± 0.3 | 83.4 ± 9.7 | 80.0 ± 13.6 | 87.8 ± 9.0 | 9.6 ± 6.0 | 2.6 ± 3.0 | 8.1 ± 6.0 | 11.9 ± 1.9 |
| 5 | MONet | Burgess et al. 2019 | 70.2 ± 0.8 | 89.6 ± 1.0 | 92.4 ± 0.6 | 50.4 ± 2.4 | 75.3 ± 1.3 | 4.4 ± 0.4 | 20.3 ± 1.6 | 5.1 ± 0.5 | 13.0 ± 2.0 |

VMDS

| Rank | Model | Reference | MOTA ↑ | MOTP ↑ | MD ↑ | MT ↑ | Match ↑ | Miss ↓ | ID S. ↓ | FPs ↓ | MSE ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | OP3 | Veerapaneni et al. 2019 | 91.7 ± 1.7 | 93.6 ± 0.4 | 96.8 ± 0.5 | 96.3 ± 0.4 | 97.8 ± 0.1 | 2.0 ± 0.1 | 0.2 ± 0.0 | 6.1 ± 1.5 | 4.3 ± 0.2 |
| 2 | ViMON | Weis et al. 2020 | 86.8 ± 0.3 | 86.8 ± 0.0 | 86.2 ± 0.3 | 85.0 ± 0.3 | 92.3 ± 0.2 | 7.0 ± 0.2 | 0.7 ± 0.0 | 5.5 ± 0.1 | 10.7 ± 0.1 |
| 3 | SCALOR | Jiang et al. 2020 | 74.1 ± 1.2 | 87.6 ± 0.4 | 67.9 ± 1.1 | 66.7 ± 1.1 | 78.4 ± 1.0 | 20.7 ± 1.0 | 0.8 ± 0.0 | 4.4 ± 0.4 | 14.0 ± 0.1 |
| 4 | TBA | He et al. 2019 | 54.5 ± 12.1 | 75.0 ± 0.9 | 62.9 ± 5.9 | 58.3 ± 6.1 | 75.9 ± 4.3 | 21.0 ± 4.2 | 3.2 ± 0.3 | 21.4 ± 7.8 | 28.1 ± 2.0 |
| 5 | MONet | Burgess et al. 2019 | 49.4 ± 3.6 | 78.6 ± 1.8 | 74.2 ± 1.7 | 35.7 ± 0.8 | 66.7 ± 0.7 | 13.6 ± 1.0 | 19.7 ± 0.6 | 17.2 ± 3.1 | 22.2 ± 2.2 |

VOR

| Rank | Model | Reference | MOTA ↑ | MOTP ↑ | MD ↑ | MT ↑ | Match ↑ | Miss ↓ | ID S. ↓ | FPs ↓ | MSE ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ViMON | Weis et al. 2020 | 89.0 ± 0.0 | 89.5 ± 0.5 | 90.4 ± 0.5 | 90.0 ± 0.4 | 93.2 ± 0.4 | 6.5 ± 0.4 | 0.3 ± 0.0 | 4.2 ± 0.4 | 6.4 ± 0.6 |
| 2 | SCALOR | Jiang et al. 2020 | 74.6 ± 0.4 | 86.0 ± 0.2 | 76.0 ± 0.4 | 75.9 ± 0.4 | 77.9 ± 0.4 | 22.1 ± 0.4 | 0.0 ± 0.0 | 3.3 ± 0.2 | 6.4 ± 0.1 |
| 3 | OP3 | Veerapaneni et al. 2019 | 65.4 ± 0.6 | 89.0 ± 0.6 | 88.0 ± 0.6 | 85.4 ± 0.5 | 90.7 ± 0.3 | 8.2 ± 0.4 | 1.1 ± 0.2 | 25.3 ± 0.6 | 3.0 ± 0.1 |
| 4 | MONet | Burgess et al. 2019 | 37.0 ± 6.8 | 81.7 ± 0.5 | 76.9 ± 2.2 | 37.3 ± 7.8 | 64.4 ± 5.0 | 15.8 ± 1.6 | 19.8 ± 3.5 | 27.4 ± 2.3 | 12.2 ± 1.4 |

texVMDS

| Rank | Model | Reference | MOTA ↑ | MOTP ↑ | MD ↑ | MT ↑ | Match ↑ | Miss ↓ | ID S. ↓ | FPs ↓ | MSE ↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | MONet | Burgess et al. 2019 | -73.3 ± 5.5 | 67.7 ± 1.1 | 16.0 ± 3.4 | 12.3 ± 3.1 | 24.7 ± 4.7 | 73.1 ± 5.1 | 2.2 ± 0.8 | 98.0 ± 1.7 | 200.5 ± 5.7 |
| 2 | ViMON | Weis et al. 2020 | -85.5 ± 2.8 | 69.0 ± 0.6 | 24.2 ± 1.3 | 23.8 ± 1.4 | 34.7 ± 1.7 | 65.0 ± 1.7 | 0.3 ± 0.0 | 120.2 ± 2.5 | 171.4 ± 3.3 |
| 3 | SCALOR | Jiang et al. 2020 | -99.2 ± 11.7 | 74.0 ± 0.5 | 6.5 ± 0.6 | 6.3 ± 0.6 | 12.3 ± 0.4 | 87.5 ± 0.4 | 0.2 ± 0.0 | 111.5 ± 11.4 | 133.7 ± 11.1 |
| 4 | OP3 | Veerapaneni et al. 2019 | -110.4 ± 4.3 | 70.6 ± 0.6 | 16.5 ± 5.1 | 16.2 ± 5.0 | 22.9 ± 6.6 | 76.9 ± 6.7 | 0.2 ± 0.1 | 133.4 ± 2.9 | 132.8 ± 16.2 |

Citation

If you use this repository in your research, please cite:

@article{Weis2021,
  author  = {Marissa A. Weis and Kashyap Chitta and Yash Sharma and Wieland Brendel and Matthias Bethge and Andreas Geiger and Alexander S. Ecker},
  title   = {Benchmarking Unsupervised Object Representations for Video Sequences},
  journal = {Journal of Machine Learning Research},
  year    = {2021},
  volume  = {22},
  number  = {183},
  pages   = {1-61},
  url     = {http://jmlr.org/papers/v22/21-0199.html}
}

About

Code, data and benchmark from the paper "Unmasking the Inductive Biases of Unsupervised Object Representations for Video Sequences".
