This repository contains the code for STAN+PGCN from the paper "Exploiting Informative Video Segments for Temporal Action Localization".
- 30/11/2020: Initial version of the code released.
- 05/01/2021: Trained models uploaded.
# Configure Environments

```shell
pip install -r requirements.txt
```
# Data Preparation
1. Download the I3D features and record their paths ("p_train" and "p_test").
2. Download the proposal lists and put them in `./data/`.
3. (Optional) Download the pre-trained models into `./save_model/YOUR_RECORD_MODEL` for testing: Baidu Cloud (code: nfor).
4. (Optional) Download the localization results into `./results` for testing: Baidu Cloud (code: uz08).
# Training

Use the recorded "p_train" and "p_test" paths to set `train_ft_path` and `test_ft_path` in `./data/dataset_cfg.yaml`, then start training:

```shell
python pgcn_train.py thumos14 --snapshot_pre ./save_model/
```
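The two feature-path entries in `./data/dataset_cfg.yaml` would then look something like this (the paths below are placeholders; keep whatever surrounding structure the file already has):

```yaml
# Placeholder paths: replace with your recorded "p_train" and "p_test".
train_ft_path: /path/to/i3d_features/train
test_ft_path: /path/to/i3d_features/test
```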
# Testing

```shell
sh test.sh ./save_model/YOUR_RECORD_MODEL
```
After generating the two-stream results in `./results/`, fuse them:

```shell
sh test_two_stream.sh
```
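The README does not spell out how `test_two_stream.sh` combines the streams; as an illustration only, a weighted late fusion of per-proposal scores (a common choice for two-stream models, with a hypothetical RGB weight `alpha`) could look like:

```python
import numpy as np

def fuse_two_stream(rgb_scores, flow_scores, alpha=0.5):
    """Weighted late fusion of per-proposal classification scores.

    rgb_scores, flow_scores: arrays of shape (num_proposals, num_classes).
    alpha: hypothetical fusion weight for the RGB stream (not from the paper).
    """
    rgb = np.asarray(rgb_scores, dtype=np.float64)
    flow = np.asarray(flow_scores, dtype=np.float64)
    return alpha * rgb + (1.0 - alpha) * flow

# Toy example: 2 proposals, 3 action classes.
rgb = [[0.2, 0.7, 0.1], [0.6, 0.3, 0.1]]
flow = [[0.1, 0.8, 0.1], [0.5, 0.4, 0.1]]
fused = fuse_two_stream(rgb, flow)
```

This is only a sketch of the general technique; the actual fusion weights and score format are defined by the scripts in this repository.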
| mAP@0.5 IoU (%) | RGB | Flow | RGB+Flow |
|---|---|---|---|
| STAN+PGCN (I3D) | 40.89 | 49.87 | 52.65 |
My implementation borrows ideas from:

- BSN: Boundary Sensitive Network for Temporal Action Proposal Generation
- Graph Convolutional Networks for Temporal Action Localization