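This list mainly covers papers that report results on the VQA2.0 dataset. Papers evaluated on CLEVR are not the focus, because scores on that dataset are already very high.
Introduction to the mainstream dataset VQA2.0
CLEVR dataset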
Bibtex
CLEVR processing code: n2nmn. VQA: VQA2, COCO, TDIUC
The basis of many networks
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Bibtex
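Relation-aware Graph Attention Network for Visual Question Answering: the highest results I have seen so far, and the code is public. Graph neural networks still seem to give the best results, hmm...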
Bibtex
Generating Natural Language Explanations for Visual Question Answering Using Scene Graphs and Visual Attention: reports no evaluation metrics, but it does VQA on top of scene graphs
Bibtex
Learning to Count Objects in Natural Images for Visual Question Answering
Bibtex
Learning Conditioned Graph Structures for Interpretable Visual Question Answering
Bibtex
Multi-modal Learning with Prior Visual Relation Reasoning
Bibtex
[Detailed notes](./detail/Multi-modal Learning with Prior Visual Relation Reasoning.md)
Bilinear Attention Networks
Bibtex
Visual Entailment: A Novel Task for Fine-Grained Image Understanding (uses self-attention)
Bibtex
MUREL: Multimodal Relational Reasoning for Visual Question Answering
Bibtex
Deep Compositional Question Answering with Neural Module Networks: the pioneer of this direction. It builds a different network depending on the question. I have not found the code yet.
[Bibtex](https://dblp.uni-trier.de/rec/bibtex/journals/corr/AndreasRDK15)
Learning to Reason: End-to-End Module Networks for Visual Question Answering
[Bibtex](https://dblp.uni-trier.de/rec/bibtex/journals/corr/HuARDS17)
Inferring and Executing Programs for Visual Reasoning
Bibtex
FiLM: Visual Reasoning with a General Conditioning Layer
Bibtex
Question Guided Modular Routing Networks for Visual Question Answering: builds a routing network
Bibtex
Answer Them All! Toward Universal Visual Question Answering Models
Bibtex
Probabilistic Neural-symbolic Models for Interpretable Visual Question Answering
Bibtex
BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection
Bibtex
Compositional Attention Networks for Machine Reasoning
Bibtex