Video Question Answering with HCRN

Final project on video question answering for the Master MVA Recvis 2020/2021 class. Given a video, the goal is to automatically answer questions about what is happening in it.

This project is based on the paper "Hierarchical Conditional Relation Networks for Video Question Answering" (Le et al., 2020).

We build upon it by replacing the GloVe text encoding with a BERT text encoder and by adapting the multimodal setup to include the subtitle modality of the TVQA dataset. The following visuals were taken from the paper by Le et al. They explain how the HCRN model works.
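As a rough illustration of the CRN unit that HCRN stacks hierarchically, the sketch below follows the general recipe described by Le et al.: for each subset size k, sample a few subsets of the input features, aggregate each subset, condition the result on a context vector (for example the question embedding), and pool over the sampled subsets. This is a simplified plain-Python sketch, not the code in this repository. The names `crn_unit`, `mean_pool`, and `additive_fuse` are hypothetical, and `additive_fuse` is only a stand-in for the learned conditioning sub-network used in the real model.

```python
import itertools
import random


def mean_pool(vectors):
    """Average a list of equal-length vectors element by element."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def additive_fuse(feature, condition):
    """Hypothetical stand-in for the learned conditioning sub-network."""
    return [f + c for f, c in zip(feature, condition)]


def crn_unit(inputs, condition, k_max, num_subsets, fuse=additive_fuse, seed=0):
    """Simplified CRN unit: returns one vector per subset size k = 2..k_max."""
    rng = random.Random(seed)
    outputs = []
    for k in range(2, k_max + 1):
        # Enumerate all size-k subsets and sample at most num_subsets of them.
        all_subsets = list(itertools.combinations(inputs, k))
        chosen = rng.sample(all_subsets, min(num_subsets, len(all_subsets)))
        # Aggregate each subset, then condition it on the context vector.
        conditioned = [fuse(mean_pool(list(s)), condition) for s in chosen]
        # Pool over the sampled subsets to get the size-k relation feature.
        outputs.append(mean_pool(conditioned))
    return outputs


# Toy example: four 2-d "clip" features conditioned on a 2-d "question" vector.
clips = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
question = [0.5, -0.5]
relations = crn_unit(clips, question, k_max=3, num_subsets=2)
print(len(relations), len(relations[0]))  # 2 2
```

Because the output is again a list of vectors, it can be fed to another `crn_unit` call with a different conditioning vector, which is the sense in which the units are stacked into a hierarchy.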

nicolas-dufour/video-question-answering
