TAP

TAP: Temporally-Aggregative Pretraining with Transformers for Temporal Action Detection

Given the long duration of untrimmed videos and the difficulty of end-to-end training, contemporary temporal action detection (TAD) methods rely heavily on pre-computed video feature sequences for subsequent analysis. However, clip features extracted directly from video encoders trained for trimmed action classification commonly lack temporal sensitivity. To overcome this limitation, we propose temporally-aggregative pretraining (TAP), a framework built on the principle of extracting temporally sensitive TAD features that are more discriminative than those obtained from trimmed action classification. TAP consists of two core modules, a feature encoding module and a temporal aggregation module, which exploit both local and global features during pretraining. The feature encoding module employs a multiscale vision transformer as the video encoder, combining the essential idea of a multiscale feature hierarchy with transformers to extract effective features from video clips. The temporal aggregation module introduces a temporal pyramid pooling layer that captures temporal-contextual semantic information from video feature sequences, producing more discriminative global video representations. Extensive experiments on two commonly used datasets validate the significantly improved discriminative power of our pretrained features.
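This scrape of the repository does not include source code, so the sketch below is only a rough illustration of how a temporal pyramid pooling layer over a clip-feature sequence could look in PyTorch. The pyramid levels, tensor shapes, and class name are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a temporal pyramid pooling (TPP) layer.
# Pyramid levels and feature sizes are assumed, not taken from TAP's code.
import torch
import torch.nn as nn


class TemporalPyramidPooling(nn.Module):
    """Pools a clip-feature sequence at several temporal scales and
    concatenates the results into a fixed-size global video descriptor."""

    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        # One adaptive average pool per pyramid level; level k splits the
        # temporal axis into k bins and averages the features in each bin.
        self.pools = nn.ModuleList(nn.AdaptiveAvgPool1d(k) for k in levels)

    def forward(self, x):
        # x: (batch, channels, time) -- a sequence of clip features.
        pooled = [pool(x).flatten(start_dim=1) for pool in self.pools]
        # Concatenate all levels: (batch, channels * sum(levels)).
        return torch.cat(pooled, dim=1)


if __name__ == "__main__":
    # 8 videos, 768-d clip features, 32 clips per video (assumed sizes).
    feats = torch.randn(8, 768, 32)
    tpp = TemporalPyramidPooling(levels=(1, 2, 4))
    global_repr = tpp(feats)
    print(global_repr.shape)  # torch.Size([8, 5376]) = 768 * (1 + 2 + 4)
```

Because the output size is fixed regardless of video length, a pretraining head can supervise this global representation directly, encouraging the encoder to produce features that carry temporal context.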
