CLEAR

Cross-Modal Semantic Alignment for Moment Localization via Nature Language

We propose an end-to-end Coarse-to-fine cross-modaL sEmantic Alignment netwoRk, dubbed CLEAR, to efﬁciently localization target moments within the given video via diverse natural language queries.
Concretely, we first design a dual-path neural network, comprising two independent modules: the video encoding network (VEN) and the query encoding network (QEN).
Thereinto, the VEN applies our proposed hierarchical semantic strategy to the input video for generating the corresponding moment candidates and modeling their semantic relevance. The QEN adopts the word embedding based bi-directional LSTM network (Bi-LSTM) to understand the corresponding semantics of the diverse given queries.
Afterwards, we develop a multi-granularity interaction network (MIN) to achieve high-quality moment localization in an effective coarse-to-fine manner. To be more specific, it first utilizes efficient coarse-grained semantic pruning to filter out corresponding semantic ranges and ignore irrelevant parts, and then performs fine-grained semantic fusing for accurate moments localization.
We conduct extensive experiments on two benchmark datasets ActivityNet Captions and TACoS. The experimental results show that our proposed model is more effective, efﬁcient than the state-of-the-art models.
The introduction of CLEAR in details will be given in the form of an authorized patent and a published paper later.
An illustration of the framework of CLEAR is shown in the following figure.

Dateset

TACoS: http://www.coli.uni-saarland.de/projects/smile/page.php?id=tacos
ActivityNet: http://activity-net.org/challenges/2016/download.html#imshuffle
ActivityNet Captions: https://cs.stanford.edu/people/ranjaykrishna/densevid/

How to run

Please place the data files to the appropriate path and set it in tacos.py and activitynet_captions.py.

python tacos.py

or

python activitynet_captions.py

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
CSUN_A		CSUN_A
CSUN_T		CSUN_T
README.md		README.md
initmodel.png		initmodel.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CSUN_A

CSUN_A

CSUN_T

CSUN_T

README.md

README.md

initmodel.png

initmodel.png

Repository files navigation

CLEAR

Cross-Modal Semantic Alignment for Moment Localization via Nature Language

Dateset

How to run

About

Releases

Packages

Languages

Huyp777/CSUN

Folders and files

Latest commit

History

Repository files navigation

CLEAR

Cross-Modal Semantic Alignment for Moment Localization via Nature Language

Dateset

How to run

About

Resources

Stars

Watchers

Forks

Languages