dino-chiio/blip-vqa-finetune

Visual Question Answering using BLIP pre-trained model!

This implementation fine-tunes the BLIP pre-trained model to solve a visual question answering task in the icon domain.

[Example image] Q: How many dots are there? A: 36
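As a rough sketch, inference with the pre-trained BLIP VQA model can look like the following. The checkpoint name `Salesforce/blip-vqa-base` and the `normalize_answer` helper are illustrative assumptions, not code from this repository:

```python
def normalize_answer(text):
    """Post-process a generated answer string: trim whitespace, lowercase."""
    return text.strip().lower()


def answer_question(image_path, question, model_name="Salesforce/blip-vqa-base"):
    """Run the pre-trained BLIP VQA model on one image/question pair (sketch)."""
    # Heavy dependencies are imported lazily so the helper above
    # stays usable without transformers/PIL installed.
    from PIL import Image
    from transformers import BlipProcessor, BlipForQuestionAnswering

    processor = BlipProcessor.from_pretrained(model_name)
    model = BlipForQuestionAnswering.from_pretrained(model_name)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(image, question, return_tensors="pt")
    output_ids = model.generate(**inputs)
    return normalize_answer(processor.decode(output_ids[0], skip_special_tokens=True))
```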

Description

**Note:** The test dataset has no labels; I evaluated the model through a Kaggle competition and reached 96% accuracy. Alternatively, you can hold out a partition of the training set as a test set.
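Holding out a partition of the training set can be done with a simple seeded shuffle; the `eval_fraction=0.1` default below is just an illustrative choice:

```python
import random


def split_train_eval(samples, eval_fraction=0.1, seed=0):
    """Shuffle the samples with a fixed seed and hold out a fraction for evaluation."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_eval = max(1, int(len(samples) * eval_fraction))
    eval_idx = set(indices[:n_eval])
    train = [s for i, s in enumerate(samples) if i not in eval_idx]
    held_out = [s for i, s in enumerate(samples) if i in eval_idx]
    return train, held_out
```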

Create the data folder

Copy all data following the example format. You can download the data here
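Since the download link and exact layout are not shown here, the sketch below assumes a hypothetical `annotations.csv` (columns `image`, `question`, `answer`) next to an `images/` subfolder inside the data folder; adapt it to the actual files you download:

```python
import csv
from pathlib import Path


def load_vqa_samples(data_dir):
    """Read (image_path, question, answer) triples from the data folder.

    The annotations.csv layout is an assumption for illustration;
    the repository's real data format may differ.
    """
    data_dir = Path(data_dir)
    samples = []
    with open(data_dir / "annotations.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            samples.append(
                {
                    "image_path": data_dir / "images" / row["image"],
                    "question": row["question"],
                    "answer": row["answer"],
                }
            )
    return samples
```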

Install requirements.txt

pip install -r requirements.txt

Run finetuning code

python finetuning.py
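One plausible shape of the fine-tuning loop is sketched below, assuming the Hugging Face `BlipForQuestionAnswering` model and an AdamW optimizer with linear warmup; the actual `finetuning.py` in this repository may be organized differently:

```python
def linear_warmup_lr(step, warmup_steps, base_lr):
    """Linearly ramp the learning rate during the first warmup_steps updates."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


def finetune(dataloader, model_name="Salesforce/blip-vqa-base", epochs=3, base_lr=5e-5):
    """Illustrative fine-tuning loop; not the repository's exact implementation."""
    # Imported lazily so the LR helper above stays usable without torch installed.
    import torch
    from transformers import BlipForQuestionAnswering

    model = BlipForQuestionAnswering.from_pretrained(model_name)
    model.train()
    optimizer = torch.optim.AdamW(model.parameters(), lr=base_lr)

    step = 0
    for _ in range(epochs):
        for batch in dataloader:
            # batch: processor output (pixel_values, input_ids, ...) plus
            # tokenized answers passed as labels so the model returns a loss.
            outputs = model(**batch)
            outputs.loss.backward()
            for group in optimizer.param_groups:
                group["lr"] = linear_warmup_lr(step, 100, base_lr)
            optimizer.step()
            optimizer.zero_grad()
            step += 1
    return model
```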

Run prediction

python predicting.py

References:

Nguyen Van Tuan (2023). JAIST_Advanced Machine Learning_Visual_Question_Answering
