简体中文 | English

ASRF : Video Action Segmentation Model


Introduction

ASRF is an improvement on the video action segmentation model MS-TCN and was published at WACV 2021. We reproduced the official PyTorch implementation in PaddleVideo and obtained comparable results.


MS-TCN Overview

Data

ASRF can be trained on the 50salads, breakfast, or gtea datasets. For download and preparation instructions, refer to the Video Action Segmentation dataset document.

Unlike MS-TCN, ASRF requires an additional data preparation step. Run the following script:

python data/50salads/prepare_asrf_data.py --dataset_dir data/

Train

After preparing the dataset, run the following script to start training.

# gtea dataset
export CUDA_VISIBLE_DEVICES=3
python3.7 main.py  --validate -c configs/segmentation/asrf/asrf_gtea.yaml
  • Start training with the command above. No pretrained model is needed. Video action segmentation models are usually fully convolutional networks. Because videos differ in length, DATASET.batch_size is usually set to 1, meaning samples are not batched. At present, only single-sample training is supported.
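The point about variable video lengths can be illustrated with a toy example. The sketch below is a pure-Python dilated 1D convolution with zero padding, written for illustration only; it is not ASRF's or PaddleVideo's layer, and the function name `dilated_conv1d` is hypothetical. It shows that a convolution of this kind returns one output per input frame for any sequence length, which is why each video can be processed on its own at its native length.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Length-preserving 1D convolution (cross-correlation) with zero padding.

    Assumes an odd kernel size so that the padding is symmetric.
    """
    k = len(kernel)
    pad = dilation * (k - 1) // 2  # zero padding added on each side
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[t + j * dilation] for j in range(k))
        for t in range(len(signal))
    ]


# Two "videos" with different numbers of frames (toy 1-channel features).
short_video = [1.0] * 7
long_video = [1.0] * 12

for video in (short_video, long_video):
    out = dilated_conv1d(video, kernel=[1.0, 1.0, 1.0], dilation=2)
    # The output always has one value per input frame.
    print(len(video), len(out))
```

Because the two sequences have different lengths, they cannot be stacked into one batch tensor without padding or cropping, which is consistent with the batch size of 1 described above.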

Test

Test ASRF on the dataset with the following script:

python main.py  --test -c configs/segmentation/asrf/asrf_gtea.yaml --weights=./output/ASRF/ASRF_split_1.pdparams
  • The Acc, Edit, and F1 metrics are computed following the test script eval.py provided by the authors of MS-TCN.

The PyTorch results were reproduced with the official code base.

  • Evaluation uses the k-fold cross-validation protocol from the MS-TCN paper, with the same fold splits as in that paper.
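To make the Acc, Edit, and F1@k metrics concrete, here is a minimal pure-Python sketch of how such scores are commonly computed from frame-wise labels. It is not the MS-TCN or PaddleVideo evaluation code: the function names are hypothetical, background classes are not filtered out, and the overlap is measured as intersection over the span covered by both segments, which is an assumption about the convention.

```python
def segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    segs, start = [], 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segs.append((labels[start], start, i))
            start = i
    return segs


def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label equals the ground truth."""
    return 100.0 * sum(p == g for p, g in zip(pred, gt)) / len(gt)


def edit_score(pred, gt):
    """100 * (1 - normalized Levenshtein distance) between segment label sequences."""
    p = [s[0] for s in segments(pred)]
    g = [s[0] for s in segments(gt)]
    m, n = len(p), len(g)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return 100.0 * (1.0 - d[m][n] / max(m, n))


def f1_at_overlap(pred, gt, threshold):
    """Segmental F1: a predicted segment is a true positive if its overlap with an
    unmatched ground-truth segment of the same label reaches the threshold."""
    p_segs, g_segs = segments(pred), segments(gt)
    matched = [False] * len(g_segs)
    tp = fp = 0
    for label, ps, pe in p_segs:
        best_iou, best_idx = 0.0, -1
        for idx, (g_label, gs, ge) in enumerate(g_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            span = max(pe, ge) - min(ps, gs)
            if inter / span > best_iou:
                best_iou, best_idx = inter / span, idx
        if best_idx >= 0 and best_iou >= threshold and not matched[best_idx]:
            tp += 1
            matched[best_idx] = True
        else:
            fp += 1
    fn = matched.count(False)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: one boundary is predicted one frame too early.
gt = ["a"] * 4 + ["b"] * 4 + ["c"] * 2
pred = ["a"] * 3 + ["b"] * 5 + ["c"] * 2
print(frame_accuracy(pred, gt), edit_score(pred, gt), f1_at_overlap(pred, gt, 0.5))
```

In this toy case the frame accuracy drops because one frame is mislabeled, while the Edit score and F1@0.5 are unaffected: the segment order is correct and every predicted segment overlaps its ground-truth counterpart by more than half.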

Results on the Breakfast dataset (4-fold cross-validation):

| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| --- | --- | --- | --- | --- | --- |
| paper | 67.6% | 72.4% | 74.3% | 68.9% | 56.1% |
| pytorch | 65.8% | 71.0% | 72.3% | 66.5% | 54.9% |
| paddle | 66.1% | 71.9% | 73.3% | 67.9% | 55.7% |

Results on the 50salads dataset (5-fold cross-validation):

| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| --- | --- | --- | --- | --- | --- |
| paper | 84.5% | 79.3% | 82.9% | 83.5% | 77.3% |
| pytorch | 81.4% | 75.6% | 82.7% | 81.2% | 77.2% |
| paddle | 81.6% | 75.8% | 83.0% | 81.5% | 74.8% |

Results on the gtea dataset (4-fold cross-validation):

| Model | Acc | Edit | F1@0.1 | F1@0.25 | F1@0.5 |
| --- | --- | --- | --- | --- | --- |
| paper | 77.3% | 83.7% | 89.4% | 87.8% | 79.8% |
| pytorch | 76.3% | 79.6% | 87.3% | 85.8% | 74.9% |
| paddle | 77.1% | 83.3% | 88.9% | 87.5% | 79.1% |

Model weights for gtea:

| Test_Data | F1@0.5 | checkpoints |
| --- | --- | --- |
| gtea_split1 | 72.4409 | ASRF_gtea_split_1.pdparams |
| gtea_split2 | 76.6666 | ASRF_gtea_split_2.pdparams |
| gtea_split3 | 84.5528 | ASRF_gtea_split_3.pdparams |
| gtea_split4 | 82.6771 | ASRF_gtea_split_4.pdparams |
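As a consistency check, the per-split F1@0.5 values listed above can be averaged and compared with the gtea result reported for paddle (79.1%). The sketch below assumes the reported figure is the plain mean over the four splits, which the source does not state explicitly; the variable names are illustrative.

```python
# Per-split F1@0.5 values copied from the gtea model weight table.
split_f1 = {
    "gtea_split1": 72.4409,
    "gtea_split2": 76.6666,
    "gtea_split3": 84.5528,
    "gtea_split4": 82.6771,
}

# Assumption: the cross-validation score is the unweighted mean over splits.
mean_f1 = sum(split_f1.values()) / len(split_f1)
print(f"mean F1@0.5 over {len(split_f1)} splits: {mean_f1:.1f}%")
```

The mean rounds to 79.1, which matches the paddle F1@0.5 entry in the gtea results table.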

Infer

Export inference model

python3.7 tools/export_model.py -c configs/segmentation/asrf/asrf_gtea.yaml \
                                -p data/ASRF_gtea_split_1.pdparams \
                                -o inference/ASRF

The command above generates the model architecture file ASRF.pdmodel and the parameters file ASRF.pdiparams.

Infer

The input file is a list of the files to run inference on, for example:

S1_Cheese_C1.npy
S1_CofHoney_C1.npy
S1_Coffee_C1.npy
S1_Hotdog_C1.npy
...
python3.7 tools/predict.py --input_file data/gtea/splits/test.split1.bundle \
                           --config configs/segmentation/asrf/asrf_gtea.yaml \
                           --model_file inference/ASRF/ASRF.pdmodel \
                           --params_file inference/ASRF/ASRF.pdiparams \
                           --use_gpu=True \
                           --use_tensorrt=False

Example log output:

result write in : ./inference/infer_results/S1_Cheese_C1.txt
result write in : ./inference/infer_results/S1_CofHoney_C1.txt
result write in : ./inference/infer_results/S1_Coffee_C1.txt
result write in : ./inference/infer_results/S1_Hotdog_C1.txt
result write in : ./inference/infer_results/S1_Pealate_C1.txt
result write in : ./inference/infer_results/S1_Peanut_C1.txt
result write in : ./inference/infer_results/S1_Tea_C1.txt

Reference