Visual Question Generation in Tensorflow
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.
assets Add files via upload Sep 27, 2016 Update Jul 23, 2018 add codes Sep 26, 2016 Update Sep 27, 2016 Update Jan 18, 2017 add codes Sep 26, 2016

Visual Question Generation in Tensorflow

It's simple question generator based on visual content written in Tensorflow. The model is quite similar to GRNN in Generating Natural Questions About an Image but I use LSTM instead of GRU. It's quite similar to Google's new AI assistant Allo which can ask question based on image content. Since Mostafazadeh et al. does not released VQG dataset yet, we will use VQA dataset temporarily.



We will use VQA dataset which contains over 760K questions. We simply follow the steps in original repo to download the data and do some preprocessing. After running their code you should acquire three files: data_prepro.h5, data_prepro.json and data_img.h5, put them in the root directory.


Train the VQG model:

python --model_path=[where_to_save]

Demo VQG with single image: (you need to download pre-trained VGG19 here)

python --is_train=False --test_image_path=[path_to_image] --test_model_path=[path_to_model]

Experiment Result

Model: How many zebras are in the picture ?

Model: Where is the chair ?

Problem: Since we use VQA dataset which is designed for challenge so its question must be relevant to image content. No wonder the model train from VQA can not ask natural questions like human. We will adapt VQG dataset once it release to ask more meaningful question.

Allo: Google AI Assistant

We also let Allo reply to these images. Here's the result.


Apply VQG dataset instead of VQA to ask more useful question.