Code for "Compositional Video Synthesis with Action Graphs", Bar & Herzig et al., ICML 2021

roeiherz/AG2Video

Compositional Video Synthesis with Action Graphs (ICML 2021)

Back to Project Page.

Release

  • CATER training code and eval - DONE
  • Something-Something V2 training code and eval - TODO
  • Pretrained models - TODO

Installation

We recommend using Anaconda to create a conda environment:

conda create -n ag2vid python=3.7 pip

Then, activate the environment:

conda activate ag2vid

Finally, install the dependencies:

conda install pytorch==1.4.0 torchvision==0.5.0 -c pytorch
pip install -r requirements.txt

Data

CATER

Download and extract CATER data:

cd <project_root>/data/CATER/max2action
wget https://cmu.box.com/shared/static/jgbch9enrcfvxtwkrqsdbitwvuwnopl0.zip && unzip jgbch9enrcfvxtwkrqsdbitwvuwnopl0.zip
wget https://cmu.box.com/shared/static/922x4qs3feynstjj42muecrlch1o7pmv.zip && unzip 922x4qs3feynstjj42muecrlch1o7pmv.zip
wget https://cmu.box.com/shared/static/7svgta3kqat1jhe9kp0zuptt3vrvarzw.zip && unzip 7svgta3kqat1jhe9kp0zuptt3vrvarzw.zip
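
Before unzipping, it can help to confirm that all three archives were actually downloaded. The following is a minimal sketch, not part of the repository: the helper names `archive_name` and `missing_archives` are hypothetical, and the directory passed at the bottom assumes the script is run from the project root.

```python
from pathlib import Path
from urllib.parse import urlparse

# URLs copied from the download commands above.
CATER_URLS = [
    "https://cmu.box.com/shared/static/jgbch9enrcfvxtwkrqsdbitwvuwnopl0.zip",
    "https://cmu.box.com/shared/static/922x4qs3feynstjj42muecrlch1o7pmv.zip",
    "https://cmu.box.com/shared/static/7svgta3kqat1jhe9kp0zuptt3vrvarzw.zip",
]


def archive_name(url):
    """Return the last path component of the URL, i.e. the name passed to unzip above."""
    return Path(urlparse(url).path).name


def missing_archives(data_dir):
    """List the archives from CATER_URLS that are not present in data_dir."""
    data_dir = Path(data_dir)
    names = [archive_name(url) for url in CATER_URLS]
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumed location, relative to the project root.
    for name in missing_archives("data/CATER/max2action"):
        print("missing:", name)
```

The check only looks for the files by name; it does not verify that a download completed or that an archive is intact.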

Training

CATER

python -m scripts.train --checkpoint_every=5000 --batch_size=2 --dataset=cater --frames_per_action=4 --run_name=train_cater --image_size=256,256 --include_dummies=1 --gpu_ids=0

Note: during the first training epoch, images are cached in the CATER dataset folder. Training should take around a week on a single V100 GPU. On smaller GPUs, try reducing the batch size and image resolution (e.g., --image_size=128,128).
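
A reduced-memory run could be assembled as in the sketch below. It uses only the flags shown in the training command above; the helper `build_train_command`, the batch size of 1, and the run name `train_cater_128` are illustrative assumptions, not recommended settings from the authors.

```python
def build_train_command(batch_size=2, image_size="256,256",
                        run_name="train_cater", gpu_ids="0"):
    """Assemble the CATER training command from the flags shown in the README."""
    return [
        "python", "-m", "scripts.train",
        "--checkpoint_every=5000",
        f"--batch_size={batch_size}",
        "--dataset=cater",
        "--frames_per_action=4",
        f"--run_name={run_name}",
        f"--image_size={image_size}",
        "--include_dummies=1",
        f"--gpu_ids={gpu_ids}",
    ]


if __name__ == "__main__":
    # Hypothetical reduced setting for a smaller GPU.
    cmd = build_train_command(batch_size=1, image_size="128,128",
                              run_name="train_cater_128")
    # Print the command so it can be pasted into a shell from the project root.
    print(" ".join(cmd))
```

Printing the command rather than executing it keeps the sketch free of side effects; the list can also be passed to `subprocess.run(cmd, check=True)` from the project root.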

Eval

A model with example validation outputs is saved every 5k iterations (--checkpoint_every=5000 in the training command above) in the <code_root>/output/timestamp_<run_name> folder.

To evaluate a specific checkpoint:

python -m scripts.test --checkpoint <path/to/checkpoint.pt> --output_dir <save_dir> --save_actions 1

Note: this script assumes that the directory containing the checkpoint file also contains the run_args.json file, which stores training configuration such as the dataset.
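
If a checkpoint has been moved or copied, a quick check that run_args.json still sits beside it can save a failed run. This is a minimal sketch and not the repository's own loading code: `load_run_args` is a hypothetical helper, and it assumes only that the file is valid JSON.

```python
import json
from pathlib import Path


def load_run_args(checkpoint_path):
    """Load run_args.json from the directory that contains the checkpoint."""
    args_file = Path(checkpoint_path).parent / "run_args.json"
    if not args_file.is_file():
        raise FileNotFoundError(f"expected {args_file} next to the checkpoint")
    with args_file.open() as f:
        return json.load(f)
```

Calling `load_run_args("<path/to/checkpoint.pt>")` before running scripts.test either returns the stored configuration or fails with a message naming the missing file.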

Citation

@article{bar2020compositional,
  title={Compositional video synthesis with action graphs},
  author={Bar, Amir and Herzig, Roei and Wang, Xiaolong and Chechik, Gal and Darrell, Trevor and Globerson, Amir},
  journal={arXiv preprint arXiv:2006.15327},
  year={2020}
}

Related Works

If you liked this work, here are a few related works you might be interested in: Compositional Video Prediction (ICCV 2019), HOI-GAN (ECCV 2020), Semantic video prediction (preprint).

Acknowledgments

Our work builds on other works such as SPADE, Vid2Vid, sg2im, and CanonicalSg2IM.
