CCYChongyanChen/VQA_AlgorithmDatasets


Table Category

📄Collection of Papers

📗 Tutorials

📈 Leaderboard

Algorithm | Accuracy
Renaissance | 79.34
UNIMO Ensemble | -
VinVL (MSR + MS Cog Svcs., X10 models) (paper, code) | 76.60
GridFeat+MoVie | 76.36
DL-61 (BGN) | 76.08
VILLA (adversarial training), based on UNITER (paper, code) | 75.9
Ensemble of LXMERT, ViLBERT, VisualBERT | 75.15
Pixel-BERT x152 | 74.45
Oscar (paper, code) | 73.82
UNITER (+grid feature) (paper, code1, code2) | 73.82
SOHO | 73.47
LXMERT (paper, code) | 72.54
VL-BERT | 72.22
Pixel-BERT r50 | 71.35
ViLT | 71.32
MCAN | 70.93
VisualBERT | 71.00
ViLBERT | 70.92
BUTD | 65.67
MUTAN | 60.17

Algorithm | Accuracy
GIT | 67.53
HSSLab | 66.72
Alibaba | 61.81
LXMERT | 55.4
Pythia | 54.72
Gridfeature+MCAN | 54.17
ViLBERT | 52
SAN | 47.3

Algorithm | Accuracy
Mia | 73.67
SunLan | 65.86
Summer | 59.16
Microsoft | 54.71
TAG | 53.69
ST-VQA | 45.66
M4C | 39.01
RUArt-M4C | 33.54
LoRRA | 27.63

💾 Dataset

  • VQA Dataset
    • General VQA

      • COCO
      • VQAv1, VQAv2
      • VQA Dialog
    • Text-VQA

      • TextVQA
      • Scene Text VQA
      • OCR-VQA (toy-sized dataset containing book and poster covers)
    • Doc-VQA

    • Rephrase VQA question

      • Inverse Visual QA (iVQA)
      • VQA-Rephrasings
      • VQA-LOL
      • VQA-Introspect
      • Rephrase ambiguous questions | 2022 paper
    • Replace VQA images

      • VQAv2
      • VQA-CP
    • VQA reasoning

      • VCR (11/2018)
      • Visual Entailment(2019)
      • GQA
      • CLEVR
      • Referring Expression
      • NLVR2 (2018)
    • VQA with External Knowledge

      • OK-VQA
      • FVQA
      • KBVQA
      • KVQA (2019)
    • Explainable/Grounding Image Captioning/VQA

      • Grounding for image captioning (referring expression)
        • Flickr30K entities
        • Visual Genome
        • RefCLEF
        • RefCOCO
        • CLEVR-Ref+
        • Google Referring expression
        • PhraseCut
      • grounding for VQA
        • Visual7W (2016)
        • Visual Genome (2016) | paper | website
        • VQA-HAT(2016)
        • VQS (2017) | paper
        • VQA-X(2018)
        • VQA-E(2018)
        • TextVQA-X
        • GQA
        • CLEVR-Ans
        • VizWiz-VQA-Grounding (2022) | paper
    • Multilingual

      • Multilingual VQA
      • Image captioning
        • Crossmodal-3600

✏️ Algorithm

  • Image Feature preparation

    • Show, Attend and Tell (2015/5)
    • SAN (2015/11)
    • BUTD (2017/7) | paper
    • Grid Feature (2020/1)
    • Pixel-BERT (2020/4)
    • SOHO(2021/4)
    • VinVL(2021/4)
  • Enhanced multimodal fusion

    • Bilinear pooling: how to fuse two vectors into one

      • MCB (2016/6)
      • MLB (2016/10)
      • MUTAN (2017/5)
      • MFB&MFH (2017/8)
      • BLOCK (2019/1)
    • FiLM: Feature-wise Linear Modulation

      • FiLM
    • cross-modal attention

      • SAN (2015/11)
      • HierCoAttn (2016/5)
      • DAN (2016/11)
      • DCN (2018/4)
      • BAN (2018/5)
    • pretraining:

      • UNITER
      • ViLBERT
      • LXMERT
      • B2T2
      • VisualBERT
      • Unicoder-VL
      • VL-BERT
      • ERNIE-ViL (AAAI, 2021): Scene Graph Prediction
      • Oscar
      • UNIMO (ACL, 2021)
    • End-to-End pretraining:

      • SOHO (CVPR, 2021/4)
      • ViLT (ICML, 2021)
    • graph attention/graph Convolutional Network

      • Graph-Structured (2016/9)
      • Relation Network (2017/6)
      • Graph Learner (2018/6)
      • MuRel (2019/2)
      • ReGAT (2019/3)
      • LCGN (2019/5)
    • Cross-modal+intra-modal

      • MCAN, 2019: Deep Modular Co-Attention Network
    • Multi-step reasoning

      • MAC: Memory, Attention and Composition
    • Neural module networks

      • NMN (2015/11)
      • N2NMN (2017/4)
      • PG+EE (2017/5)
      • TbD (2018/3)
      • stackNMN (2018/7)
      • NS-VQA (2018/10)
      • Prob-NMN (2019/2)
      • MMN (2019/10)
  • External Knowledge Algorithm
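
The "Bilinear pooling: how to fuse two vectors into one" and "FiLM: Feature-wise Linear Modulation" entries above can be illustrated with a minimal, dependency-free sketch. This is not the code of any listed paper: the function names, the weight matrices `U`, `V`, `P`, and the toy values are all hypothetical, and nonlinearities, dropout, and learned parameters are omitted. It only shows the general shape of a low-rank (Hadamard-product) bilinear fusion and of a feature-wise affine modulation.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def low_rank_bilinear_fuse(question_vec, image_vec, U, V, P):
    """Fuse two vectors into one: z = P ((U q) * (V v)).

    U and V project both inputs into a shared space, the element-wise
    product models their pairwise interaction, and P maps the joint
    vector to the output size. All weights here are illustrative
    assumptions, not trained parameters.
    """
    projected_q = matvec(U, question_vec)
    projected_v = matvec(V, image_vec)
    joint = [a * b for a, b in zip(projected_q, projected_v)]
    return matvec(P, joint)


def film(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift each feature.

    In practice gamma and beta would be predicted from the conditioning
    input (for VQA, the question); here they are passed in directly.
    """
    return [g * x + b for x, g, b in zip(features, gamma, beta)]


if __name__ == "__main__":
    # Toy weights chosen by hand so the arithmetic is easy to follow.
    U = [[1, 0], [0, 1]]          # 2-d question -> 2-d shared space
    V = [[1, 0, 0], [0, 1, 0]]    # 3-d image feature -> 2-d shared space
    P = [[1, 1], [1, -1]]         # shared space -> 2-d fused output
    print(low_rank_bilinear_fuse([1, 2], [3, 4, 5], U, V, P))
    print(film([1.0, 2.0, 3.0], [2.0, 0.0, 1.0], [0.5, 1.0, -1.0]))
```

The two ideas differ in what they return: the bilinear fusion produces a new joint vector from both modalities, while FiLM keeps the visual feature shape and only rescales and shifts each channel according to the conditioning signal.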
