By Qixian Zhou, Xiaodan Liang, Ke Gong, Liang Lin (ACM MM18)
Python3, TensorFlow 1.3+, Keras 2.0.8+
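The minimum versions above can be checked before installing anything else. The following is a minimal sketch, not part of the released code: the helper names `parse_version`, `meets_minimum` and `check_environment` are hypothetical, and the check only enforces the lower bounds listed here. It does not tell you whether newer major releases of TensorFlow or Keras still work with this code.

```python
import sys


def parse_version(text):
    """Turn a string such as '2.0.8' or '1.4.0rc1' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # stop at the first component with no leading digits
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a = parse_version(installed)
    b = parse_version(minimum)
    # Pad the shorter tuple with zeros so '1.3' compares like '1.3.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Return a list of problems; an empty list means the minimums are met."""
    problems = []
    if sys.version_info[0] < 3:
        problems.append("Python 3 is required")
    # Minimum versions taken from the requirements line of this README.
    for module_name, minimum in (("tensorflow", "1.3"), ("keras", "2.0.8")):
        try:
            module = __import__(module_name)
        except ImportError:
            problems.append(module_name + " is not installed")
            continue
        if not meets_minimum(module.__version__, minimum):
            problems.append(module_name + " " + module.__version__
                            + " is older than " + minimum)
    return problems
```

Calling `check_environment()` in the target environment and printing the returned list is enough to see what is missing.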
The model is trained and evaluated on our proposed VIP dataset for video instance-level human parsing. Please check it for more dataset details.
Models are released on Google Drive and Baidu Drive:
- Parsing-RCNN(frame-level) weights(parsing_rcnn.h5).
- ATEN(p=2,l=3) weights(aten_p2l3.h5).
- Clone this repository
- Keras with convGRU2D installation.
cd keras_convGRU
python setup.py install
- Compile flow_warp ops(optional). The flow_warp.so has already been generated(Ubuntu14.04, gcc4.8.4, python3.6, tf1.4). To compile the flow_warp ops yourself, execute the following:
cd ops
make
- Dataset setup. Download the VIP dataset(both VIP_Fine and VIP_Sequence) and decompress them. The directory structure of VIP should be as follows:
VIP
----Images
--------videos1
--------...
--------videos404
----adjacent_frames
--------videos1
--------...
--------videos404
----front_frame_list
----Category_ids
----Human_ids
----Instance_ids
----lists
........
- Model setup. Download the released weights and place them in the models folder.
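Before training or testing, it can help to confirm that the dataset and weights are where the scripts expect them. The following is a minimal sketch under stated assumptions: the helper names `missing_vip_entries` and `missing_weights` are hypothetical, the entry list covers only the top-level names shown in the directory structure above (the real dataset contains more), and the paths you pass in depend on where you decompressed VIP and cloned the repository.

```python
import os

# Top-level entries of the VIP directory as listed in this README.
# The listing in the README is not exhaustive, so neither is this list.
EXPECTED_VIP_ENTRIES = [
    "Images",
    "adjacent_frames",
    "front_frame_list",
    "Category_ids",
    "Human_ids",
    "Instance_ids",
    "lists",
]

# Weight file names as given in the pre-trained models list.
EXPECTED_WEIGHTS = ["parsing_rcnn.h5", "aten_p2l3.h5"]


def missing_vip_entries(vip_root):
    """Return the expected top-level entries that are absent under vip_root."""
    return [name for name in EXPECTED_VIP_ENTRIES
            if not os.path.exists(os.path.join(vip_root, name))]


def missing_weights(models_dir):
    """Return the expected weight files that are absent under models_dir."""
    return [name for name in EXPECTED_WEIGHTS
            if not os.path.isfile(os.path.join(models_dir, name))]
```

For example, `missing_vip_entries("/path/to/VIP")` and `missing_weights("models")` should both return empty lists once setup is complete; both paths here are placeholders.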
# ATEN training on VIP
python scripts/vip/train_aten.py
# Parsing-RCNN(frame-level) training on VIP
python scripts/vip/train_parsingrcnn.py
# ATEN inference on VIP
python scripts/vip/test_aten.py
# Parsing-RCNN(frame-level) inference on VIP
python scripts/vip/test_parsingrcnn.py
The results are stored in ./vis.
This code is based on other source code on GitHub:
- matterport/Mask_RCNN(https://github.com/matterport/Mask_RCNN), an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow.
- KingMV/ConvGRU(https://github.com/KingMV/ConvGRU), an implementation of ConvGRU2D on Keras.