Awesome-Video-Grounding

A reading list of papers about Video Grounding.



Table of Contents

  - Temporal Video Grounding Papers
  - Spatial-Temporal Video Grounding Papers

Temporal Video Grounding Papers

Datasets

  1. Charades-STA [2017][ICCV] TALL: Temporal Activity Localization via Language Query.[paper][dataset][Charades]
  2. ActivityNet Captions [2017][ICCV] Dense-Captioning Events in Videos.[paper][dataset]
  3. DiDeMo [2017][ICCV] Localizing Moments in Video with Natural Language.[paper][dataset]
  4. TACoS [2013][ACL] Grounding Action Descriptions in Videos.[paper][dataset]
  5. CD [2021][arXiv] A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics.[paper][dataset]
  6. CG [2022][CVPR] Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning.[paper][dataset]
  7. MAD [2022][CVPR] MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions.[paper][dataset]
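Most of the datasets above are evaluated with the "R@n, IoU=m" protocol popularized by TALL: a predicted segment counts as correct if its temporal IoU with the ground-truth segment reaches the threshold m. A minimal sketch of that metric, assuming segments are (start, end) pairs in seconds (function names here are illustrative, not taken from any listed codebase):

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0

def recall_at_iou(preds, gts, threshold=0.5):
    """R@1, IoU=threshold: fraction of queries whose top prediction
    overlaps the ground truth with IoU >= threshold."""
    hits = sum(temporal_iou(p, g) >= threshold for p, g in zip(preds, gts))
    return hits / len(gts)
```

For example, a prediction of (0, 10) against a ground truth of (5, 15) overlaps for 5 seconds over a 15-second union, giving IoU = 1/3, which passes IoU=0.3 but fails the stricter IoU=0.5 setting commonly reported on Charades-STA and ActivityNet Captions.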

2022 Papers

  1. [2022][AAAI] Explore Inter-Contrast Between Videos via Composition for Weakly Supervised Temporal Sentence Grounding.[paper]
  2. [2022][AAAI] Exploring Motion and Appearance Information for Temporal Sentence Grounding.[paper]
  3. [2022][AAAI] Memory-Guided Semantic Learning Network for Temporal Sentence Grounding.[paper]
  4. [2022][AAAI] Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding.[paper]
  5. [2022][AAAI] Unsupervised Temporal Video Grounding with Deep Semantic Clustering.[paper]
  6. [2022][CVPR] Compositional Temporal Grounding with Structured Variational Cross-Graph Correspondence Learning.[paper][code]
  7. [2022][CVPR] MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions.[paper][code]
  8. [2022][IJCV] Weakly Supervised Moment Localization with Decoupled Consistent Concept Prediction.[paper]
  9. [2022][TIP] Video Moment Retrieval with Cross-Modal Neural Architecture Search.[paper]
  10. [2022][TIP] Exploring Language Hierarchy for Video Grounding.[paper]
  11. [2022][TMM] Cross-modal Dynamic Networks for Video Moment Retrieval with Text Query.[paper]

2021 Papers

  1. [2021][ACL] Parallel Attention Network with Sequence Matching for Video Grounding.[paper]
  2. [2021][ACMMM] AsyNCE: Disentangling False-Positives for Weakly-Supervised Video Grounding.[paper]
  3. [2021][CVPR] Cascaded Prediction Network via Segment Tree for Temporal Video Grounding.[paper]
  4. [2021][CVPR] Context-aware Biaffine Localizing Network for Temporal Sentence Grounding.[paper][code]
  5. [2021][CVPR] Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding.[paper]
  6. [2021][CVPR] Interventional Video Grounding with Dual Contrastive Learning.[paper][code]
  7. [2021][CVPR] Multi-stage Aggregated Transformer Network for Temporal Language Localization in Videos.[paper]
  8. [2021][CVPR] Structured Multi-Level Interaction Network for Video Moment Localization via Language Query.[paper]
  9. [2021][ICCV] Zero-shot Natural Language Video Localization.[paper]
  10. [2021][ICCV] Boundary-sensitive Pre-training for Temporal Localization in Videos.[paper]
  11. [2021][ICCV] Support-Set Based Cross-Supervision for Video Grounding.[paper]
  12. [2021][ICCV] VLG-Net: Video-Language Graph Matching Network for Video Grounding.[paper][code]
  13. [2021][TMM] Weakly Supervised Temporal Adjacent Network for Language Grounding.[paper]
  14. [2021][arXiv] A Closer Look at Temporal Sentence Grounding in Videos: Datasets and Metrics.[paper][code]

2020 Papers

  1. [2020][AAAI] Weakly-Supervised Video Moment Retrieval via Semantic Completion Network.[paper]
  2. [2020][AAAI] Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video.[paper][code]
  3. [2020][AAAI] Temporally Grounding Language Queries in Videos by Contextual Boundary-Aware Prediction.[paper]
  4. [2020][AAAI] Learning 2D Temporal Adjacent Networks for Moment Localization with Natural Language.[paper]
  5. [2020][ACMMM] Fine-grained Iterative Attention Network for Temporal Language Localization in Videos.[paper]
  6. [2020][ECCV] Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos.[paper]
  7. [2020][CVPR] Local-Global Video-Text Interactions for Temporal Grounding.[paper][code]
  8. [2020][CVPR] Dense Regression Network for Video Grounding.[paper]

2019 Papers

  1. [2019][AAAI] Localizing Natural Language in Videos.[paper]
  2. [2019][AAAI] Multilevel Language and Vision Integration for Text-to-Clip Retrieval.[paper]
  3. [2019][AAAI] Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos.[paper]
  4. [2019][AAAI] Semantic Proposal for Activity Localization in Videos via Sentence Query.[paper]
  5. [2019][AAAI] To Find Where You Talk: Temporal Sentence Localization in Video with Attention Based Location Regression.[paper]
  6. [2019][CVPR] Language-driven Temporal Activity Localization: A Semantic Matching Reinforcement Learning Model.[paper]
  7. [2019][CVPR] MAN: Moment Alignment Network for Natural Language Moment Retrieval via Iterative Graph Adjustment.[paper]
  8. [2019][CVPR] Weakly Supervised Video Moment Retrieval From Text Queries.[paper]
  9. [2019][EMNLP] WSLLN: Weakly Supervised Natural Language Localization Networks.[paper]
  10. [2019][NeurIPS] Semantic Conditioned Dynamic Modulation for Temporal Sentence Grounding in Videos.[paper]
  11. [2019][WACV] MAC: Mining Activity Concepts for Language-based Temporal Localization.[paper]

2018 Papers

  1. [2018][EMNLP] Localizing Moments in Video with Temporal Language.[paper]
  2. [2018][EMNLP] Temporally Grounding Natural Sentence in Video.[paper]
  3. [2018][SIGIR] Attentive Moment Retrieval in Videos.[paper]

2017 Papers

  1. [2017][ICCV] TALL: Temporal Activity Localization via Language Query.[paper]
  2. [2017][ICCV] Dense-Captioning Events in Videos.[paper]
  3. [2017][ICCV] Localizing Moments in Video with Natural Language.[paper]



Spatial-Temporal Video Grounding Papers

Datasets

TODO

2022 Papers

  1. [2022][AAAI] End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding.[paper]

2021 Papers

  1. [2021][CVPR] Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos.[paper]
  2. [2021][ICCV] STVGBert: A Visual-linguistic Transformer based Framework for Spatio-temporal Video Grounding.[paper]

2020 Papers

TODO

2019 Papers

TODO

2018 Papers

TODO

2017 Papers

TODO
