soumyasj/NewsVideoQA

NewsVideoQA

This repository provides code implementations for baselines and links to the proposed dataset mentioned in the paper "Watching the News: Towards VideoQA Models that can Read" (WACV-2023).

arXiv: 2211.05588 | Webpage: CVIT | Video: YouTube | Dataset: RRC

Task

Video Question Answering (VideoQA) methods focus on commonsense reasoning and visual cognition of objects or persons and their interactions over time. Current VideoQA approaches ignore the textual information present in the video. We introduce the "NewsVideoQA" dataset, which comprises more than 8,600 QA pairs on 3,000+ news videos obtained from diverse news channels from around the world.

Samples from the Dataset

[Figure: samples from the dataset]

Baselines

  1. BERT: baselines/BERT
  2. M4C: baselines/M4C
  3. SINGULARITY: baselines/SINGULARITY

Citation

If you find our dataset/code useful, feel free to leave a star and please cite our paper as follows:

@inproceedings{DBLP:conf/wacv/JahagirdarMKJ23,
  author       = {Soumya Jahagirdar and
                  Minesh Mathew and
                  Dimosthenis Karatzas and
                  C. V. Jawahar},
  title        = {Watching the News: Towards VideoQA Models that can Read},
  booktitle    = {{IEEE/CVF} Winter Conference on Applications of Computer Vision, {WACV}
                  2023, Waikoloa, HI, USA, January 2-7, 2023},
  pages        = {4430--4439},
  publisher    = {{IEEE}},
  year         = {2023},
}

Contact

For any clarifications, comments, or suggestions, please create an issue or contact Soumya Shamarao Jahagirdar.
