Answer selection task based on the WikiQA dataset.
You can find the dataset at WikiQA

I removed the question-answer groups that have no correct answer from the test data.
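As a rough illustration of that filtering step, here is a minimal sketch. It is not the code used in this repo; the function name and the `(question_id, sentence, label)` row layout are assumptions made for the example.

```python
def drop_groups_without_answers(rows):
    """Keep only rows whose question has at least one candidate labeled 1.

    rows: iterable of (question_id, sentence, label) tuples.
    This layout is an assumption for the sketch, not the repo's format.
    """
    rows = list(rows)
    # Collect the ids of questions that have at least one correct candidate.
    answered = {qid for qid, _sentence, label in rows if int(label) == 1}
    # Preserve the original row order for the questions that remain.
    return [row for row in rows if row[0] in answered]


example = [("Q1", "a", 0), ("Q1", "b", 1), ("Q2", "c", 0)]
filtered = drop_groups_without_answers(example)
```

In this toy input, Q2 has no correct candidate, so only the two Q1 rows are kept.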

The raw data I use is in the data/raw/ folder.

For the CNN model, I use pre-trained GloVe word embeddings. Download them and put them in the data/embeddings/ folder.
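GloVe text files store one word per line, followed by its vector components separated by spaces. A minimal sketch of reading that format could look like the following; `load_glove` is a hypothetical helper written for this example, not a function from this repo.

```python
def load_glove(lines):
    """Parse GloVe text lines ("word v1 v2 ...") into a dict of float lists.

    lines: any iterable of strings, for example an open file handle.
    """
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(v) for v in values]
    return embeddings


# Tiny in-memory sample standing in for a real embeddings file.
sample_lines = ["the 0.1 0.2 0.3", "cat -0.5 0.0 1.5"]
vectors = load_glove(sample_lines)
```

To read a real file, pass an open handle for the embeddings file you placed in data/embeddings/, opened with `encoding="utf-8"`.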


Given a question and a group of answer candidates, you should rank the candidates by the likelihood that each one is the correct answer.

The quality of the ranking is evaluated by MRR and MAP, as implemented in
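For reference, MRR (mean reciprocal rank) and MAP (mean average precision) can be computed as in the sketch below. It uses the standard definitions and is not the evaluation script from this repo; the function names are assumptions. Each group is the list of 0/1 labels for one question, ordered by the model's ranking from best to worst.

```python
def reciprocal_rank(labels):
    """Return 1 / position of the first correct answer, or 0.0 if none."""
    for position, label in enumerate(labels, start=1):
        if label == 1:
            return 1.0 / position
    return 0.0


def average_precision(labels):
    """Average the precision measured at each correct answer's position."""
    hits = 0
    precision_sum = 0.0
    for position, label in enumerate(labels, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / position
    return precision_sum / hits if hits else 0.0


def mrr_and_map(ranked_groups):
    """Average both metrics over all question groups."""
    n = len(ranked_groups)
    mrr = sum(reciprocal_rank(g) for g in ranked_groups) / n
    mean_ap = sum(average_precision(g) for g in ranked_groups) / n
    return mrr, mean_ap


# Two toy questions: labels listed in ranked order.
groups = [[0, 1, 0], [1, 0, 1]]
mrr, map_score = mrr_and_map(groups)
```

Groups with no correct answer would score 0 on both metrics, which is why they are removed from the test data before evaluation.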


  1. Use to lemmatize the corpus and generate transformed train data.

  2. Test two baseline methods with

    • Word matching:

      python3 --word_matching && python3 data/output/WikiQA-dev.rank data/raw/WikiQA-dev.tsv

    • Do nothing:

      python3 --nothing && python3 data/output/WikiQA-dev.rank data/raw/WikiQA-dev.tsv

  3. Try CNN models using

    • Prepare data_helper:

      python3 --prepare

    • Train the CNN model:

      python3 --train

    • Generate the final rank for test data:

      python3 --test

    • Visualize the training loss and graph:

      python3 -m tensorflow.tensorboard --logdir data/model/summary/
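To give an idea of what a word-matching baseline like the one in step 2 might do, here is a minimal sketch that ranks candidates by the number of distinct words they share with the question. This is an assumption about the general technique, not the repo's implementation; both function names are hypothetical, and tokenization is a plain lowercase whitespace split.

```python
def word_overlap_score(question, answer):
    """Count the distinct lowercase words shared by question and answer."""
    question_words = set(question.lower().split())
    answer_words = set(answer.lower().split())
    return len(question_words & answer_words)


def rank_candidates(question, candidates):
    """Sort candidates by overlap with the question, highest first.

    sorted() is stable, so ties keep their original relative order.
    """
    return sorted(
        candidates,
        key=lambda candidate: word_overlap_score(question, candidate),
        reverse=True,
    )


question = "how are glacier caves formed"
candidates = [
    "The weather was cold",
    "A glacier cave is a cave formed within the ice of a glacier",
    "Ice formations are common in winter",
]
ranked = rank_candidates(question, candidates)
```

A real baseline would likely work on the lemmatized corpus from step 1, so that forms like "cave" and "caves" count as a match.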