CNN-QA: CNN for Multiple Choice Question Answering

This repository contains code for the paper:

CNN for Text-Based Multiple Choice Question Answering. Akshay Chaturvedi, Onkar Pandit and Utpal Garain. ACL 2018

Dependencies

Keras v2.0.8 with Theano (v 0.9.0) backend
PyLucene 6.5.0 http://lucene.apache.org/pylucene/
Pickle, NLTK, numpy, json, gensim

PyLucene is needed for query-expansion-based paragraph selection.
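A quick way to confirm the environment before training is to select the Theano backend and check that each dependency can be found. The sketch below is only an illustration: the helper name `missing_modules` and the list `REQUIRED_MODULES` are hypothetical and not part of this repository. It assumes PyLucene is importable as `lucene` and relies on Keras reading the `KERAS_BACKEND` environment variable at import time.

```python
import importlib.util
import os

# Keras reads KERAS_BACKEND when it is first imported,
# so this must be set before any `import keras`.
os.environ["KERAS_BACKEND"] = "theano"

# Import names of the third-party dependencies listed above
# (PyLucene is assumed to be importable as `lucene`).
REQUIRED_MODULES = ["keras", "theano", "lucene", "nltk", "numpy", "gensim"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on this system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

This only checks that the modules are present. It does not verify the pinned versions (Keras v2.0.8, Theano v0.9.0, PyLucene 6.5.0).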

Before training

The TQA dataset can be downloaded from http://data.allenai.org/tqa/ and the SciQ dataset from http://data.allenai.org/sciq/. Once downloaded, extract the folders into the data subdirectories of TQA and SciQ.

Word vectors can be downloaded from https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit. Place the tgz file in the word2vec folder.
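The layout described above can be checked with a short script before starting a long training run. This is a minimal sketch under assumptions: the relative paths `TQA/data`, `SciQ/data` and `word2vec` are inferred from the instructions above, and the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical rather than part of this repository.

```python
from pathlib import Path

# Directories the setup steps above are assumed to produce,
# relative to the repository root.
EXPECTED_DIRS = ["TQA/data", "SciQ/data", "word2vec"]


def missing_dirs(repo_root):
    """Return the expected directories that do not exist under `repo_root`."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

The check only confirms that the directories exist, not that the extracted datasets or the word-vector archive inside them are complete.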

Training

Run tqa_system.py or sciq_system.py to start training on the TQA or SciQ dataset, respectively.

Evaluation

For the TQA dataset, once the model is trained, edit tqa_system.py to select the split you want to evaluate, then run result.py. Evaluate the trained model on the training set first, using different thresholds. Once the threshold is fixed, evaluate the model on the other splits.

For the SciQ dataset, the model can be evaluated simply by modifying and running sciq_system.py, since the dataset has no forbidden questions.
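The threshold selection for TQA follows the usual pattern of tuning on the training split and then freezing the value. This README does not describe how result.py applies the threshold, so the decision rule below is an assumption made for illustration: when the best option's score falls below the threshold, a designated fallback option is predicted instead. All names (`predict`, `accuracy`, `pick_threshold`) and the example format are hypothetical.

```python
def predict(scores, threshold, fallback):
    """Pick the highest-scoring option, or `fallback` when confidence is low.

    `fallback` is the index of a designated option, or None if the
    question has no such option (assumed rule, not taken from result.py).
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    if fallback is not None and scores[best] < threshold:
        return fallback
    return best


def accuracy(examples, threshold):
    """Fraction of (scores, gold_index, fallback) examples answered correctly."""
    correct = sum(
        predict(scores, threshold, fallback) == gold
        for scores, gold, fallback in examples
    )
    return correct / len(examples)


def pick_threshold(train_examples, candidates):
    """Return the candidate with the best training accuracy (first one on ties)."""
    return max(candidates, key=lambda t: accuracy(train_examples, t))


if __name__ == "__main__":
    # Toy scores for illustration only; these are not results from the paper.
    train = [
        ([0.9, 0.1, 0.0], 0, 2),
        ([0.4, 0.3, 0.3], 2, 2),
        ([0.8, 0.2, 0.0], 0, None),
    ]
    threshold = pick_threshold(train, [0.0, 0.5, 1.0])
    print("chosen threshold:", threshold)
```

Once chosen on the training split, the same threshold would be passed unchanged to `accuracy` for the other splits, so that those splits play no part in tuning.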
