# Pose prediction code

*(Figure: a grid of frames from videos of people screwing legs into Ikea tables.)*

This repository contains the code for "Human Pose Prediction via Deep Markov Models" (Toyer et al., DICTA'17). The instructions below explain how to preprocess our Ikea Furniture Assembly dataset and train both an LSTM baseline and a deep Markov model to predict poses on that dataset.

## Installing dependencies

This repository has several dependencies on PyPI, as listed in `requirements.txt`. If you have the virtualenv and virtualenvwrapper package management tools installed, then you can install those dependencies in an isolated development environment with the following commands in a shell:

```bash
mkvirtualenv -p "$(which python2)" pose-prediction-env
pip install -r requirements.txt
```

(unless stated otherwise, it's assumed that all commands, including the ones above, are run from the root of this repository)

The installed dependencies will only be usable from within the virtual environment. Whenever you open a new shell & want to use this code, you'll need to remember to execute `workon pose-prediction-env` to re-enter the environment. That will prepend the name of the environment to your shell prompt, like this:

```
$ workon pose-prediction-env
(pose-prediction-env)$ python -c 'print("now we can run Python, etc., with correct deps")'
```

As a final setup step, we'll make a directory to place all of our results in:

```bash
mkdir ikea-fa-results
```

## Obtaining the data

These instructions assume that you want to train a model to predict poses on the Ikea Furniture Assembly dataset. If you don't have the dataset already, then you can download it using the following commands:

```bash
wget 'http://users.cecs.anu.edu.au/~u5568237/ikea/ikea-fa-release-data.zip'
unzip -j ikea-fa-release-data.zip ikea-fa-release-data/processed-python-data/ikea_action_data.h5 -d .
```

This will create a file named `ikea_action_data.h5` in the current directory (its md5sum is `92b334d368b798d5456401014aaf21c6`). Note that most of the data in `ikea-fa-release-data.zip` is video, which you will probably want later, but which you don't actually need just to train some prediction models. If you just want the `.h5` file for the purpose of following these instructions, then you can get it from this link. You can also download a tiny sample file (700KB vs. 20MB for the original) that has only a subset of the pose data, and which will require much less memory at train time; this option is recommended if you don't have a server with a lot of memory.
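If you want to be sure the download is intact before training on it, here's a quick sanity check against the md5sum quoted above (a hypothetical helper, not part of this repo):

```python
# Verify the md5sum of the downloaded pose data (checksum from the text above).
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Hex MD5 digest of a file, read in 1MB chunks to keep memory use flat."""
    digest = hashlib.md5()
    with open(path, 'rb') as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

assert md5_of('ikea_action_data.h5') == '92b334d368b798d5456401014aaf21c6'
print('checksum OK')
```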

### Datasets other than Ikea FA

Most other datasets supported by this code, including Human3.6M (H3.6M), the NTU RGB+D dataset, and the Penn Action dataset, require an additional conversion step before you can use them. Specifically, you need to use one of the `convert_<dataset>.py` scripts to convert from the original dataset format to a uniform HDF5-based format that makes it easy for other tools in this repository to get at pose data. Refer to the inbuilt help for each script to see how to do this (e.g. `python convert_ntu.py --help`).
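If you want to see what a converted file actually contains, something like the sketch below works, assuming `h5py` is available. The exact group/dataset layout depends on which `convert_<dataset>.py` script produced the file, so this just walks the whole tree rather than relying on any particular key names:

```python
# Sketch: list every dataset in a converted pose file, with shapes & dtypes.
import h5py

def describe(name, obj):
    # visititems() calls this once for every group & dataset in the file
    if isinstance(obj, h5py.Dataset):
        print('%s: shape=%s dtype=%s' % (name, obj.shape, obj.dtype))

with h5py.File('ikea_action_data.h5', 'r') as fp:
    fp.visititems(describe)
```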

## Training & evaluating an LSTM baseline

In our paper, we compared against several sequence regression baselines. You can train and test one such baseline as follows:

```bash
# choose lstm, lstm3lr, or erd
BASELINE=lstm
# this will train the model; once it has trained for a while (probably a few
# hours, or until the displayed loss stops going down), you can interrupt it
# with Ctrl+C to stop (there's no early-stopping IIRC)
python basic_lstm_baselines.py "$BASELINE" ikea_action_data.h5 ikea-fa-results/baselines/
# re-running the same command will evaluate the trained model & write a results file
python basic_lstm_baselines.py "$BASELINE" ikea_action_data.h5 ikea-fa-results/baselines/
```

`ikea-fa-results/baselines/results_${BASELINE}.h5` contains both ground-truth and predicted poses for the Ikea FA test set. You can calculate statistics for such a file using `stats_calculator.py`, e.g.

```bash
python stats_calculator.py --output_dir ./ikea-fa-results/csv/ \
  "./ikea-fa-results/baselines/results_${BASELINE}.h5"
```

This will write some PCK statistics to CSV files in `ikea-fa-results/csv`.
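For context, PCK ("Percentage of Correct Keypoints") at a given threshold is just the fraction of predicted joints that land within that distance of the ground truth. A minimal sketch of the metric, with hypothetical array layouts (the real results files and `stats_calculator.py` may organise things differently):

```python
import numpy as np

def pck(pred, true, thresh):
    """PCK: fraction of predicted joints within `thresh` of the ground truth.

    `pred` and `true` are assumed to be float arrays of shape (T, J, 2):
    T frames, J joints, 2D coordinates. This layout is illustrative only.
    """
    dists = np.linalg.norm(pred - true, axis=-1)  # per-joint Euclidean error, (T, J)
    return float(np.mean(dists <= thresh))

# e.g. pck(predicted_poses, true_poses, 0.1) with normalised coordinates
```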

## Training & evaluating a deep Markov model

Code for the deep Markov models is in the `structuredinference` directory. A DMM can be trained using the following commands:

```bash
# if using another dataset, replace "ikeadb" with the dataset name (e.g.
# ntu-2d, h36m-2d, etc.)
cd structuredinference/expt-ikeadb
cp ../../ikea_action_data.h5 ./
bash runme-no-actions.sh
```

If you've just followed the instructions up to this point, then the above command will train on the CPU using Theano. To make Theano train the network on a GPU, you'll have to [install libgpuarray](http://deeplearning.net/software/libgpuarray/installation.html). In either case, the network will require quite a bit of memory to train, since it keeps all data in memory at once; the code was developed on a GPU server with 128GB of memory, which should be more than adequate for training. If you run into memory errors, then you may want to try again with the tiny sample dataset linked in the "Obtaining the data" section above.

As the network trains, the command executed above will periodically output a series of update messages like this:

```
<<Bnum: 0, Batch Bound: 1.3835, |w|: 157.8771, |dw|: 1.0000, |w_opt|: 0.0000>>
<<-veCLL:28223.2433, KL:10996.1426, anneal:0.0100, l1:0.0000>>
<<Bnum: 10, Batch Bound: 0.9274, |w|: 157.9926, |dw|: 1.0000, |w_opt|: 0.1953>>
<<-veCLL:18792.7172, KL:10044.8623, anneal:0.0200, l1:0.0000>>
<<Bnum: 20, Batch Bound: 1.0508, |w|: 158.1243, |dw|: 1.0000, |w_opt|: 0.1990>>
<<-veCLL:21316.7489, KL:6796.2031, anneal:0.0300, l1:0.0000>>
…
<<-veCLL:7437.7972, KL:2064.8850, anneal:1.0000, l1:0.0000>>
<<Bnum: 1310, Batch Bound: 0.8889, |w|: 168.7077, |dw|: 1.0000, |w_opt|: 0.2590>>
<<-veCLL:16589.0754, KL:1615.3231, anneal:1.0000, l1:0.0000>>
<<(Ep 0) Bound: 1.0867 [Took 2697.6768 seconds] >>
<<Saving at epoch 0>>
<<Saved model (./chkpt-ikeadb/DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-EP0-params) 
      opt (./chkpt-ikeadb/DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-EP0-optParams) weights>>
```

The final message above signifies the end of an epoch. After this message, the model will be evaluated & the results printed to stdout as well.

Once the validation loss printed out at the end of an epoch stops going down, you should interrupt the training script with Ctrl+C & evaluate the model. The relevant model files are in `chkpt-ikeadb`; the directory contents will look something like this:

```
(pose-prediction-env)$ ls -l chkpt-ikeadb/
total 69156
-rw-r--r-- 1 user user     1464 Jan 24 09:47 DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-config.pkl
-rw-r--r-- 1 user user 47199544 Jan 24 10:32 DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-EP0-optParams.npz
-rw-r--r-- 1 user user 23599946 Jan 24 10:32 DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-EP0-params.npz
```

The `*-config.pkl` file contains information related to the structure of the network. The `*-EP<N>-optParams.npz` and `*-EP<N>-params.npz` files contain actual weights from epoch `N`. To get results, we'll have to take the model configuration and the `-params.npz` file for the most recent epoch & do the following:

```bash
# I assume you are running this from the expt-ikeadb directory
python ../common_pp/make_eval_results.py \
  ./runme-no-actions.sh \
  "chkpt-ikeadb/<MODEL>-config.pkl" \
  "chkpt-ikeadb/<MODEL>-EP<N>-params.npz" \
  results_dkf.h5
```

After replacing `<MODEL>` and `<N>` with the values shown above, my command was:

```bash
python ../common_pp/make_eval_results.py \
  ./runme-no-actions.sh \
  "chkpt-ikeadb/DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-config.pkl" \
  "chkpt-ikeadb/DKF_lr-8_0000e-04-vm-LR-inf-structured-dh-50-ds-50-nl-relu-bs-20-ep-2000-rs-600-ttype-simple_gated-etype-mlp-previnp-False-ar-1_0000e+03-rv-5_0000e-02-nade-False-nt-5000-cond-False-ikeadb-no-acts-EP0-params.npz" \
  results_dkf.h5
```

That will give you a `results_dkf.h5` file (in the current directory) that you can calculate statistics for in the same way as the `results_${BASELINE}.h5` files described above. Concretely, from the `expt-ikeadb` directory, you can do this:

```bash
python ../../stats_calculator.py \
  --output_dir ../../ikea-fa-results/csv/ \
  results_dkf.h5
```

I usually put all my `results_*.h5` files in the same directory and run `stats_calculator.py` on all of them in one go.
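If `chkpt-ikeadb` has accumulated checkpoints from many epochs, a tiny helper can pick out the newest `-params.npz` file instead of eyeballing `ls`. This is a hypothetical convenience script (not part of the repo) that relies only on the `-EP<N>-params.npz` naming shown above:

```python
# Find the -params.npz checkpoint with the highest epoch number.
import glob
import re

def latest_params(chkpt_dir='chkpt-ikeadb'):
    best_epoch, best_path = -1, None
    for path in glob.glob(chkpt_dir + '/*-EP*-params.npz'):
        match = re.search(r'-EP(\d+)-params\.npz$', path)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path

print(latest_params())
```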


## Other notes from my personal wiki

(these notes are included for posterity; they may help with producing new plots)

**How to make videos from DKF predictions:** at the moment, the code for this is in the `structuredinference` directory. You can run it with the following sequence of commands:

```bash
cd ~/repos/structuredinference/expt-ikeadb
source activate pose-prediction  # or `workon` or whatever
./make_eval_videos.py results_dkf.h5 --vid-dir some-dest-dir
```

It will select a random sequence each time, so re-run a few times to plot a good range.

**Making statistics and plotting PCK:** it works something like this:

```bash
cd /path/to/pose-prediction/

# repeat this call as necessary for all your baselines
# you may need --max-thresh to pick a threshold for (normalised) comparisons
# e.g. ikea works well with --max-thresh 0.1, NTU works well with --max-thresh 1
./stats_calculator.py --output_dir ikea_baselines/ ikea_baselines/_zero_velocity.h5

# the next two make the actual plots that appear in the paper
./plot_pck.py --stats-dir ~/etc/pp-baselines/2017-05-02/stats/ --methods dkf srnn erd lstm lstm3lr zero_velocity --method-names DKF SRNN ERD LSTM LSTM3LR "Zero-velocity" --parts elbows shoulders wrists --save plot.pdf --fps 16 --times 1 10 25 50 --no-thresh-px --dims 6 8 && mv plot.pdf plot-xtype-thresh.pdf
# maybe this one has time on the x-axis, and the above has a threshold?
./plot_pck.py --stats-dir ~/etc/pp-baselines/2017-05-02/stats/ --methods dkf srnn erd lstm lstm3lr zero_velocity --method-names DKF SRNN ERD LSTM LSTM3LR "Zero-velocity" --parts elbows shoulders wrists --save plot.pdf --legend-below --xtype time --fps 16 && mv plot.pdf plot-xtype-time.pdf
```