See our project website for the paper and details. Published in ICCV 2021.
@inproceedings{vpd_iccv21,
author={Hong, James and Fisher, Matthew and Gharbi, Micha\"{e}l and Fatahalian, Kayvon},
title={{V}ideo {P}ose {D}istillation for {F}ew-{S}hot, {F}ine-{G}rained {S}ports {A}ction {R}ecognition},
booktitle={ICCV},
year={2021}
}
For code in this repository, see LICENSE.
This repository contains code for VPD and VIPE*, as described in our paper.
To apply the VIPE* model:
./apply_vipe_model.py <pose_dir> <model_dir> -o <out_dir>
- pose_dir : the directory containing the 2D poses for each video
- model_dir : path to trained model
- out_dir : path to save features to
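Example (all paths below are placeholders): ./apply_vipe_model.py fs/pose vipe_model -o out/vipe_features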
To train a VIPE* model, see train_vipe_model.py.
Example: ./train_vipe_model.py --dataset 3d --save_dir <model_dir>
Preprocessed 3D pose data for training is available here: VIPE-data.zip.
This archive includes ground truth 3D pose and 2D pose from different camera views.
Extract the archive to data/vipe or update the paths in vipe_dataset_paths.py.
For details on preprocessing, see preprocess_3d_pose.py.
A pre-trained VIPE model is available: VIPE-model.zip.
To prepare the sports datasets, there are several steps:
- Fetching the videos
- Pose detection / tracking
- Extracting crops (see extract_square_crops.py)
- Computing optical flow (see raft/README.md)
Our pose and tracking annotations can be found here: URL
For the source videos:
- Diving48 : see original authors' website
- Floor exercise : obtain from FineGym authors, recut using recut_finegym_video.py. If using our pose annotations, make sure the frame rates match for each video or adapt accordingly.
- Figure skating : see fs-videos.csv and recut_fs_video.py
- Tennis : see tennis-videos.csv
It is recommended to unzip the files to the paths defined in video_dataset_paths.py
or to update those paths to where the pose files are stored. For example:
diving48
|---pose
|---crops
\---videos
fs
|---pose
|---crops
\---videos
...
To train a VPD model:
./train_vpd_model.py <dataset> --save_dir <model_dir> --emb_dir <teacher_dir> --flow_img <flow_name> --motion
- dataset : the sports dataset to specialize to (e.g., fs)
- model_dir : path to save models to
- flow_name : the name of the flow images for the crops, which have names <frame_no>.<flow_name>.png
- teacher_dir : path to the teacher's features
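Example (placeholder paths; the flow name matches the 0.flow.png crop naming shown below): ./train_vpd_model.py fs --save_dir vpd_model_fs --emb_dir out/vipe_features --flow_img flow --motion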
To apply a trained VPD model:
./apply_vpd_model.py <model_dir> -d <dataset> -o <out_dir> --flow_img <flow_name>
- model_dir : path to the trained model
- out_dir : path to save features to
- flow_name : should be the same as used for training
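Example (placeholder paths): ./apply_vpd_model.py vpd_model_fs -d fs -o out/vpd_features --flow_img flow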
The student maintains the same output file formats as the teacher.
For action recognition:
./recognize.py -d <dataset> <feature_dir>
- dataset : the sports dataset
- feature_dir : the directory containing the pose features
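Example (placeholder feature directory): ./recognize.py -d fs out/vpd_features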
See options such as --retrieve for the retrieval task.
For detection, see detect.py.
Pre-trained VPD and VIPE* features/embeddings are available at URL.
To use the Diving48 and FineGym (Floor Exercise) datasets, you need to download the labels per the READMEs in the diving48/data and finegym/data subdirectories.
For Diving48 and FineGym, we maintain the original authors' video naming scheme.
For figure skating, videos (routines) are named <video>_<number>_<start_frame>_<end_frame>.mp4.
For tennis, videos (points) are named <video>_<start_frame>_<end_frame>.mp4. Pose for each tennis video is prefixed by front__ or back__ to denote the player.
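As an illustration, here is a minimal Python sketch that splits a figure skating video name back into its components (tennis is analogous, without the <number> field); the function name is ours, not part of the repository:

import os

def parse_fs_name(video_file):
    # <video>_<number>_<start_frame>_<end_frame>.mp4 (figure skating routines)
    base = os.path.splitext(os.path.basename(video_file))[0]
    *video_parts, number, start_frame, end_frame = base.split('_')
    return '_'.join(video_parts), int(number), int(start_frame), int(end_frame)

print(parse_fs_name('men_olympic_short_program_2010_01_00011475_00015700.mp4'))
# ('men_olympic_short_program_2010', 1, 11475, 15700)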
Pose for each video is organized as follows:
men_olympic_short_program_2010_01_00011475_00015700
|---boxes.json
|---coco_keypoints.json.gz
|---mask.json.gz
\---meta.json
The format for boxes.json is:
[
  [frame_num, [x, y, w, h]], ...
]
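A minimal Python sketch for reading boxes.json (the directory below reuses the example video above and is a placeholder):

import json

pose_dir = 'fs/pose/men_olympic_short_program_2010_01_00011475_00015700'

with open(pose_dir + '/boxes.json') as f:
    boxes = json.load(f)  # [[frame_num, [x, y, w, h]], ...]

# Index the tracked box by frame number for random access.
box_by_frame = {frame_num: xywh for frame_num, xywh in boxes}
print(len(box_by_frame), 'frames with a tracked box')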
The format of coco_keypoints.json.gz is:
[
  [
    frame_num, [[score, [x, y, w, h], [[x, y, score] * 17]], ...]
  ],
  ...
]
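A corresponding sketch for the gzipped keypoints file (the path is again a placeholder; each detection carries 17 COCO joints):

import gzip
import json

pose_dir = 'fs/pose/men_olympic_short_program_2010_01_00011475_00015700'

with gzip.open(pose_dir + '/coco_keypoints.json.gz', 'rt') as f:
    keypoints = json.load(f)

for frame_num, detections in keypoints[:5]:
    for score, box, joints in detections:
        # joints is a list of 17 [x, y, score] entries
        print(frame_num, round(score, 3), len(joints))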
The format of mask.json.gz is:
[
  [
    frame_num, [[score, [x, y, w, h], base64_encoded_png], ...]
  ],
  ...
]
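A sketch for decoding one mask into an image (assumes Pillow is installed; the path and the choice of the first frame and detection are illustrative):

import base64
import gzip
import io
import json
from PIL import Image

pose_dir = 'fs/pose/men_olympic_short_program_2010_01_00011475_00015700'

with gzip.open(pose_dir + '/mask.json.gz', 'rt') as f:
    masks = json.load(f)

frame_num, detections = masks[0]
score, box, png_b64 = detections[0]
mask = Image.open(io.BytesIO(base64.b64decode(png_b64)))
print(frame_num, mask.size, mask.mode)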
Crops around the athlete, for training VPD, are extracted per video (see extract_square_crops.py):
men_olympic_short_program_2010_01_00011475_00015700
|---0.png // <frame_num>.png
|---0.prev.png
|---0.flow.png
|---0.mask.png
|---1.png
|---1.prev.png
...
For tennis, the format is slightly different:
usopen_2015_mens_final_federer_djokovic
|---back
| |---0.png // <frame_num>.png
| |---0.prev.png
| |---0.flow.png
| |---0.mask.png
| ...
|
\---front
|---0.png
|---0.prev.png
|---0.flow.png
|---0.mask.png
...
Embeddings are stored as pickle files, one per video. The format for each video is:
[
  (frame_num, ndarray, {metadata dict}), ...
]
The ndarray may be 1D or 2D, depending on data augmentation (e.g., flip).
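For example, a per-video feature file can be loaded as follows (the .pkl path is a placeholder, and treating a 2D array as one row per augmented view is our reading of the format above):

import pickle

import numpy as np

with open('out/vpd_features/my_video.pkl', 'rb') as f:
    embs = pickle.load(f)  # [(frame_num, ndarray, {metadata dict}), ...]

frame_nums = [frame_num for frame_num, _, _ in embs]
# If an ndarray is 2D (augmented views, e.g., flipped), keep the first row.
feats = np.stack([e[0] if e.ndim == 2 else e for _, e, _ in embs])
print(len(frame_nums), feats.shape)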