GitHub - yangdsh/VQA-BUTD-demo: An easy-to-use tool for real-time Visual Question Answering

Demo for Visual Question Answering with BUTD

This is a user-friendly demo for visual question answering. It is essentially a pipeline that combines an image feature extraction tool and a fast attention implementation. Together, these two repos implement the BUTD system described in "Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering" (https://arxiv.org/abs/1707.07998) and "Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge" (https://arxiv.org/abs/1708.02711).

We modified the BUTD code from both of the above repos so that it can be applied to any new image on the web. In addition, we redraw the image to show where the attention falls. We further improved the above models in the following ways:

Make use of position information in the attention model
Add a layer to the attention model, which improves the accuracy
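
The two changes above can be illustrated with a small, dependency-free sketch. It is not the code from this repo: the README does not say how the position information is encoded or where the extra layer sits, so the normalized box encoding in `box_position`, the layer sizes, and all function names below are assumptions made for illustration. Each region's feature vector is concatenated with its box position and the question vector, passed through a small stack of layers, and the resulting scores are turned into attention weights with a softmax.

```python
import math
import random


def linear(x, weight, bias):
    """Compute y = W x + b for a single vector x (weight is a list of rows)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def box_position(box, img_w, img_h):
    """Assumed position encoding: normalized corners plus relative area."""
    x1, y1, x2, y2 = box
    nx1, ny1, nx2, ny2 = x1 / img_w, y1 / img_h, x2 / img_w, y2 / img_h
    return [nx1, ny1, nx2, ny2, (nx2 - nx1) * (ny2 - ny1)]


def init_layer(rng, n_out, n_in):
    """Random weights stand in for trained parameters in this sketch."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out


def attend(region_feats, boxes, question_vec, img_w, img_h, layers):
    """Return one attention weight per image region."""
    scores = []
    for feat, box in zip(region_feats, boxes):
        # Position information enters as extra input dimensions.
        h = feat + box_position(box, img_w, img_h) + question_vec
        for weight, bias in layers[:-1]:      # hidden layers
            h = relu(linear(h, weight, bias))
        weight, bias = layers[-1]             # final layer yields a scalar score
        scores.append(linear(h, weight, bias)[0])
    return softmax(scores)


# Toy example: 3 regions, 4-d region features, 2-d question vector.
rng = random.Random(0)
feats = [[rng.random() for _ in range(4)] for _ in range(3)]
boxes = [(0, 0, 50, 50), (25, 25, 100, 100), (10, 60, 90, 100)]
question = [rng.random() for _ in range(2)]
in_dim = 4 + 5 + 2  # feature + position + question
layers = [
    init_layer(rng, 8, in_dim),
    init_layer(rng, 8, 8),   # the additional hidden layer (assumed placement)
    init_layer(rng, 1, 8),
]
weights = attend(feats, boxes, question, 100, 100, layers)
```

Removing the middle entry of `layers` gives a one-hidden-layer variant with the same interface, which makes it easy to compare the two depths.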

We include the pre-trained attention model as a tar.gz archive in this repo; it is the last piece of the pre-trained models that was missing from those two repos. Users need to decompress it. Users also need to follow the installation instructions in the two sub-folders and download the pre-trained models and dictionaries.
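
As a minimal sketch of the decompression step, the archive can be unpacked with Python's standard `tarfile` module. The function name `extract_archive` is hypothetical, and the README does not give the archive's file name or the directory the code expects the model in, so both are left as parameters here.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir="."):
    """Unpack a .tar.gz archive into dest_dir and return the member names."""
    dest = Path(dest_dir).resolve()
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getmembers()
        for member in members:
            # Refuse entries that would be written outside dest_dir.
            target = (dest / member.name).resolve()
            if target != dest and dest not in target.parents:
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest)
    return [member.name for member in members]
```

Calling `extract_archive(path_to_archive, target_dir)` with the archive shipped in this repo returns the list of extracted paths, which can be checked against what the sub-folder's instructions expect. On the command line, `tar -xzf` on the same file does the equivalent job.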

Folders and files

bottom-up-attention-vqa
bottom-up-attention
.gitignore
LICENSE
README.md
demo.py
