This repository contains a reimplementation of the NeurIPS 2022 paper "Associating Objects and Their Effects in Video Through Coordination Games."
This is not an officially supported Google product.
- Linux
- JAX [0.4.1]
- Haiku [0.0.9]
- NVIDIA GPU + CUDA CuDNN
This code has been tested with JAX 0.4.1, Haiku 0.0.9 and Python 3.10.8.
Install dependencies using Conda:
conda env create -f environment.yml
conda activate omnimatte-sp
Download and extract the datasets used in our paper:
./scripts/dldata.sh
For an example of pretraining on synthetic data and fine-tuning on a synthetic test video, run:
./scripts/train-synth.sh
For an example reproducing a real video from the paper, see
scripts/train-real.sh
To view Tensorboard visualizations, run:
tensorboard --logdir=checkpoints/vis --port=8097
and visit http://localhost:8097 in your browser.
Models and final results are saved to checkpoints/.
Download the pretrained weights:
./scripts/dlweights.sh
Weights will be saved to pretrained_weights/.
For an example of running inference using the pretrained weights, see scripts/inference.sh.
For an example of running the evaluation code, see scripts/eval.sh.
To fine-tune on a custom video, follow these preprocessing steps:
- Stabilize the video.
- Resize the video to 224x128 and save the frames to <my_video>/rgb/*.png.
- Place object masks in <my_video>/mask/01/*.png, <my_video>/mask/02/*.png, etc.
- Estimate the background and save it as <my_video>/bg_est.png.
- [Optional] Specify a per-frame compositing order for the mask layers at <my_video>/order.txt. Otherwise, layers will be composited back-to-front, starting with mask/01.
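To illustrate the default back-to-front behavior, here is a minimal NumPy sketch of compositing RGBA layers with the standard "over" operator. This is not the repository's implementation; the function name `composite_layers` and the layer-id convention (1 for mask/01, 2 for mask/02, ...) are assumptions for illustration only.

```python
import numpy as np

def composite_layers(layers, order):
    """Composite RGBA layers back-to-front with the 'over' operator.

    layers: dict mapping a layer id (e.g. 1 for mask/01) to an
            (H, W, 4) float array with straight (non-premultiplied) alpha.
    order: list of layer ids, back-most first (default: 1, 2, ...).
    Returns an (H, W, 3) RGB composite over a black background.
    """
    h, w, _ = next(iter(layers.values())).shape
    out = np.zeros((h, w, 3))
    for layer_id in order:
        rgba = layers[layer_id]
        rgb, alpha = rgba[..., :3], rgba[..., 3:4]
        # 'over' operator: the front layer covers the accumulated result
        # in proportion to its alpha.
        out = alpha * rgb + (1.0 - alpha) * out
    return out
```

A per-frame order.txt entry would simply change the `order` argument for that frame; with no order file, the default is ascending layer ids.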
If you use this code for your research, please cite the following paper:
@inproceedings{lu2022,
title={Associating Objects and Their Effects in Video Through Coordination Games},
author={Lu, Erika and Cole, Forrester and Dekel, Tali and Xie, Weidi and Zisserman, Andrew and Freeman, William T and Rubinstein, Michael},
booktitle={NeurIPS},
year={2022}
}
