An official implementation of the paper "Unsupervised Keypoint Learning for Guiding Class-conditional Video Prediction", NeurIPS 2019

Unsupervised Keypoint Learning
for Guiding Class-Conditional Video Prediction

An official implementation of the paper "Unsupervised Keypoint Learning for Guiding Class-Conditional Video Prediction", NeurIPS, 2019. [paper] [supp]

I. Requirements

  • Linux
  • NVIDIA GeForce GTX 1080Ti
  • Tensorflow 1.12.0
  • Python3 (>= 3.5.2)


You can install the required packages by running pip install -r requirements.txt.
Alternatively, you can pull our prebuilt Docker image by running docker pull join16/python3-cuda:3.5-cuda9.0-nips2019.
You can also build the Docker image manually by running docker build -t {image_name} .
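
Because the requirements pin specific versions, it can help to verify the environment before training. The following is a minimal sketch, not part of this repository: the helper names parse_version and check_environment are hypothetical, and it assumes that TensorFlow exposes its version as tf.__version__.

```python
import sys

REQUIRED_PYTHON = (3, 5, 2)  # minimum Python version from the requirements list
REQUIRED_TF = "1.12.0"       # TensorFlow version from the requirements list


def parse_version(text):
    """Turn '1.12.0' into (1, 12, 0); a non-numeric suffix such as 'rc1' is ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(python_version, tf_version):
    """Return a list of problems found; an empty list means the versions match."""
    problems = []
    if tuple(python_version[:3]) < REQUIRED_PYTHON:
        problems.append("Python {} is older than the required {}".format(
            ".".join(str(p) for p in python_version[:3]),
            ".".join(str(p) for p in REQUIRED_PYTHON)))
    if tf_version is None:
        problems.append("TensorFlow is not installed")
    elif parse_version(tf_version) != parse_version(REQUIRED_TF):
        problems.append("TensorFlow {} differs from the tested {}".format(
            tf_version, REQUIRED_TF))
    return problems


if __name__ == "__main__":
    try:
        import tensorflow as tf
        installed_tf = tf.__version__
    except ImportError:
        installed_tf = None
    for problem in check_environment(sys.version_info, installed_tf):
        print("WARNING:", problem)
```

Running the script prints one warning per mismatch and nothing when the versions match the list above.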

※ Dataset

This code is for the Penn Action dataset. The dataset can be downloaded here. After downloading PennAction.tar.gz, extract it and then run the following command to prepare the dataset.

./ {unzipped_original_dataset_dir}

※ Pretrained VGG-Net

Training requires a pretrained VGG19 network. It can be downloaded here.

II. Train

※※※ Please adjust the paths for inputs and outputs in the configuration file. ※※※
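
A quick way to catch a wrong path early is to check the configured input locations before launching a long training run. The sketch below is illustrative only: the function find_missing_inputs and the key names in the usage example are hypothetical and are not taken from configs/penn.yaml. It assumes the configuration has already been loaded into a flat dictionary.

```python
import os


def find_missing_inputs(config, input_keys):
    """Return the keys whose value is absent, empty, or not an existing path."""
    missing = []
    for key in input_keys:
        path = config.get(key)
        if not path or not os.path.exists(path):
            missing.append(key)
    return missing


if __name__ == "__main__":
    # Hypothetical key names; replace them with the ones used in your config file.
    example_config = {
        "data_dir": os.getcwd(),
        "vgg_path": "/path/that/does/not/exist",
    }
    for key in find_missing_inputs(example_config, ["data_dir", "vgg_path"]):
        print("Missing or invalid input path for:", key)
```

Output directories are deliberately not checked here, since they may legitimately not exist before the first run.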

1. Train the keypoints detector & image translator

python --mode detector_translator --config configs/penn.yaml

2. Make pseudo-keypoint labels

python --config configs/penn.yaml --checkpoint {path/to/detector_translator/checkpoint}

3. Train the motion generator

python --mode motion_generator --config configs/penn.yaml
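
The three steps depend on each other: step 2 needs the checkpoint produced by step 1, and step 3 trains on the labels produced by step 2. The ordering can be sketched as a small helper that assembles the three command lines. This is only an illustration under stated assumptions: the script names are not given above, so they are passed in as parameters (train_script and label_script are hypothetical placeholders), and only the flags shown in the steps above are used.

```python
def build_training_commands(train_script, label_script, config,
                            stage1_checkpoint, python="python"):
    """Return the three training commands, in order, as argument lists."""
    return [
        # 1. Train the keypoints detector and image translator.
        [python, train_script, "--mode", "detector_translator",
         "--config", config],
        # 2. Make pseudo-keypoint labels using the stage 1 checkpoint.
        [python, label_script, "--config", config,
         "--checkpoint", stage1_checkpoint],
        # 3. Train the motion generator.
        [python, train_script, "--mode", "motion_generator",
         "--config", config],
    ]


if __name__ == "__main__":
    # The script names and checkpoint path below are placeholders.
    commands = build_training_commands(
        "TRAIN_SCRIPT.py", "LABEL_SCRIPT.py",
        "configs/penn.yaml", "path/to/detector_translator/checkpoint")
    for command in commands:
        print(" ".join(command))
```

Each argument list can be passed to subprocess.run(command, check=True) so that a failure in one stage stops the later stages from running.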

III. Test

python --config configs/penn.yaml \
    --checkpoint_stage1 {path/to/detector_translator/checkpoint} \
    --checkpoint_stage2 {path/to/motion_generator/checkpoint} \
    --save_dir {path/to/save/results}

Pretrained model

  1. Keypoints Detector & Image Translator
  2. Motion Generator

IV. Results

※※※ All videos were generated from a single input image. ※※※

Penn Action



※※※ Qualitative comparison of the results. ※※※


V. Related Works

Learning to Generate Long-term Future via Hierarchical Prediction, Villegas et al., ICML, 2017. [code]
Hierarchical Long-term Video Prediction without Supervision, Wichers et al., ICML, 2018. [code]
Flow-Grounded Spatial-Temporal Video Prediction from Still Images, Li et al., ECCV, 2018. [code]

※ Citation

Please cite our paper when you use this code.

  title={Unsupervised Keypoint Learning for Guiding Class-Conditional Video Prediction},
  author={Kim, Yunji and Nam, Seonghyeon and Cho, In and Kim, Seon Joo},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},