Gary-code/KICNLE

KICNLE

This repository contains the official PyTorch implementation of the paper "Knowledge-Augmented Visual Question Answering with Natural Language Explanation", published in IEEE Transactions on Image Processing (TIP), 2024.

Overview

The KICNLE model answers visual questions iteratively: each answer is refined using the explanation produced in the previous iteration. A knowledge retrieval module supplies relevant, accurate information to this process. The resulting answers and explanations are consistent with each other and closely tied to the visual content.
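The iterative scheme described above can be sketched roughly as follows. This is only an illustration of the control flow, not the repository's implementation: `iterative_vqa`, `retrieve`, `generate`, `max_iters`, and the stopping rule are all hypothetical names and choices introduced here, and the toy functions stand in for the real retrieval module and model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Step:
    answer: str
    explanation: str


def iterative_vqa(
    image: object,
    question: str,
    retrieve: Callable[[object, str], List[str]],
    generate: Callable[[object, str, List[str], str], Tuple[str, str]],
    max_iters: int = 3,
) -> List[Step]:
    """Refine the answer using the explanation from the previous iteration."""
    knowledge = retrieve(image, question)  # hypothetical knowledge retrieval call
    prev_explanation = ""                  # nothing to condition on in the first pass
    history: List[Step] = []
    for _ in range(max_iters):
        answer, explanation = generate(image, question, knowledge, prev_explanation)
        history.append(Step(answer, explanation))
        # Assumed stopping rule: stop once the explanation repeats.
        if explanation == prev_explanation:
            break
        prev_explanation = explanation
    return history


# Toy stand-ins so the sketch runs without any model or dataset.
def toy_retrieve(image, question):
    return ["umbrellas are used in rain"]


def toy_generate(image, question, knowledge, prev_explanation):
    if not prev_explanation:
        return "sunny", "people are outside"
    return "rainy", "people are holding umbrellas"


steps = iterative_vqa(None, "What is the weather?", toy_retrieve, toy_generate)
final = steps[-1]
```

Here the first pass has no explanation to condition on, and later passes reuse the previous explanation; the actual model may combine these signals differently.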

[Figure: KICNLE model]

Installation

  • Install the Anaconda or Miniconda distribution with Python 3.8
  • Main packages: PyTorch 1.12, transformers 4.30
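To confirm that an environment matches the versions listed above, a small check like the following may help. It is a sketch that uses only the standard library; the distribution names `torch` and `transformers` are assumed to be the usual PyPI names, and the helper names `matches` and `check_environment` are made up for this example.

```python
import sys
from importlib import metadata  # in the standard library since Python 3.8

# Version prefixes from the list above; distribution names are assumptions.
EXPECTED = {"torch": "1.12", "transformers": "4.30"}


def matches(installed: str, prefix: str) -> bool:
    """True if `installed` equals `prefix` or extends it, e.g. 1.12.1+cu113 for 1.12."""
    return installed == prefix or installed.startswith(prefix + ".")


def check_environment() -> dict:
    """Map each requirement to True (satisfied) or False (mismatch or missing)."""
    report = {"python": sys.version_info[:2] == (3, 8)}
    for name, prefix in EXPECTED.items():
        try:
            report[name] = matches(metadata.version(name), prefix)
        except metadata.PackageNotFoundError:
            report[name] = False
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'mismatch or missing'}")
```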

Pre-trained Model

  • CLIP ViT-based model
pip install git+https://github.com/openai/CLIP.git
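Once the package is installed, loading a checkpoint is a quick sanity check. The sketch below uses the `clip.available_models` and `clip.load` functions from the OpenAI CLIP package; the variant `ViT-B/16` is only an assumption, since this README does not say which ViT checkpoint the training scripts use, and `load_clip` is a helper name invented here.

```python
DEFAULT_VARIANT = "ViT-B/16"  # assumption: the README only says "ViT-based"


def load_clip(variant: str = DEFAULT_VARIANT, device: str = "cpu"):
    """Return (model, preprocess) for the requested CLIP checkpoint."""
    import clip  # installed by the pip command above

    if variant not in clip.available_models():
        raise ValueError(f"unknown CLIP variant: {variant}")
    # The first call downloads the weights to the local cache.
    return clip.load(variant, device=device)
```

Calling `model, preprocess = load_clip()` should return the network and its image preprocessing transform; pass `device="cuda"` if a GPU is available.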

Training & Evaluation

  • For VQA-X dataset
python vqaX.py
  • For A-OKVQA dataset
python a_okvqa.py
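If both datasets are run regularly, a thin wrapper around the two commands above can be convenient. This is a hypothetical helper, not part of the repository: the dataset keys and function names are invented here, and it assumes the scripts are launched from the repository root without arguments, as in the commands above.

```python
import subprocess
import sys

# Script names come from the commands above; the dictionary keys are made up.
SCRIPTS = {"vqa-x": "vqaX.py", "a-okvqa": "a_okvqa.py"}


def build_command(dataset: str) -> list:
    """Return the argv list that launches the script for `dataset`."""
    return [sys.executable, SCRIPTS[dataset.lower()]]


def run(dataset: str) -> int:
    """Run the script and raise CalledProcessError if it exits with an error."""
    return subprocess.run(build_command(dataset), check=True).returncode
```

For example, `run("vqa-x")` launches `vqaX.py` with the current Python interpreter.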
