# Adaptive Temporal Encoding Network for Video Instance-level Human Parsing

By Qixian Zhou, Xiaodan Liang, Ke Gong, and Liang Lin (ACM MM 2018). A complete video demo is available.

## Requirements

Python 3, TensorFlow 1.3+, Keras 2.0.8+
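As a quick sanity check before installation, the version strings reported by the installed packages can be compared against these minimums. The helper below is a small sketch, not part of this repo; in practice you would pass it `tensorflow.__version__` and `keras.__version__`:

```python
# Sketch (not part of this repo): check installed versions against the
# README's minimums, TensorFlow 1.3+ and Keras 2.0.8+.

def parse_version(version):
    """Convert a dotted version string such as '2.0.8' into an int tuple."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def meets_minimum(installed, required):
    """True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)

# Example: TF 1.4.0 satisfies the 1.3+ requirement.
print(meets_minimum("1.4.0", "1.3"))  # True
```

Tuple comparison makes `1.10` correctly rank above `1.3`, which a plain string comparison would get wrong.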

## Dataset

The model is trained and evaluated on our proposed VIP dataset for video instance-level human parsing. Please refer to the dataset page for more details.

## Models

Trained models are released on OneDrive and Baidu Drive:

- Parsing-RCNN (frame-level) weights: `parsing_rcnn.h5`
- ATEN (p=2, l=3) weights: `aten_p2l3.h5`

## Installation

1. Clone this repository.
2. Install Keras with the ConvGRU2D extension:

   ```bash
   cd keras_convGRU
   python setup.py install
   ```

3. Compile the `flow_warp` ops (optional). A prebuilt `flow_warp.so` is included (built on Ubuntu 14.04, gcc 4.8.4, Python 3.6, TensorFlow 1.4). To compile the ops yourself, execute:

   ```bash
   cd ops
   make
   ```
4. Dataset setup. Download the VIP dataset (both VIP_Fine and VIP_Sequence) and decompress them. The directory structure of VIP should be as follows:

   ```
   VIP
   ----Images
   --------videos1
   --------...
   --------videos404
   ----adjacent_frames
   --------videos1
   --------...
   --------videos404
   ----front_frame_list
   ----Category_ids
   ----Human_ids
   ----Instance_ids
   ----lists
   ........
   ```

5. Model setup. Download the released weights and place them in the `models` folder.
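The dataset and model setup steps above can be sanity-checked before training. The helper below is a sketch, not part of the repo; it assumes the `VIP` dataset root and `models` weights folder named in the steps above:

```python
import os

# Sketch (not part of this repo): verify the VIP dataset layout and the
# downloaded weight files before launching training or inference.
VIP_DIRS = [
    "Images", "adjacent_frames", "front_frame_list",
    "Category_ids", "Human_ids", "Instance_ids", "lists",
]
WEIGHTS = ["parsing_rcnn.h5", "aten_p2l3.h5"]

def missing_paths(vip_root="VIP", models_dir="models"):
    """Return every required path that is absent on disk."""
    missing = [os.path.join(vip_root, d) for d in VIP_DIRS
               if not os.path.isdir(os.path.join(vip_root, d))]
    missing += [os.path.join(models_dir, w) for w in WEIGHTS
                if not os.path.isfile(os.path.join(models_dir, w))]
    return missing

if __name__ == "__main__":
    for path in missing_paths():
        print("missing:", path)
```

An empty result means the layout matches the structure listed above.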

## Training

```bash
# ATEN training on VIP
python scripts/vip/train_aten.py

# Parsing-RCNN (frame-level) training on VIP
python scripts/vip/train_parsingrcnn.py
```

## Inference

```bash
# ATEN inference on VIP
python scripts/vip/test_aten.py

# Parsing-RCNN (frame-level) inference on VIP
python scripts/vip/test_parsingrcnn.py
```

The results are stored in `./vis`.

## Evaluate

1. Modify the paths in `evaluate/*.py`.
2. Run the scripts below to evaluate the results generated by `visualize.py`:

```bash
# human parsing
python evaluate/test_parsing.py

# instance segmentation
python evaluate/test_ap.py

# instance-level human parsing
python evaluate/test_inst_part_ap.py
```

## Reference

```
@inproceedings{zhou2018,
    Author = {Zhou, Qixian and Liang, Xiaodan and Gong, Ke and Lin, Liang},
    Title = {Adaptive Temporal Encoding Network for Video Instance-level Human Parsing},
    Booktitle = {Proc. of ACM International Conference on Multimedia (ACM MM)},
    Year = {2018}
}
```

## Acknowledgements

This code builds on the following open-source projects:

1. [matterport/Mask_RCNN](https://github.com/matterport/Mask_RCNN): an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow.
2. [KingMV/ConvGRU](https://github.com/KingMV/ConvGRU): an implementation of ConvGRU2D in Keras.