reading_comprehension

CS 224n project on Reading Comprehension / Question-Answering using the Microsoft MARCO dataset

Required folder structure

Let's call the main project folder "rc_project" (this is where the files in this repo go).

Download GloVe from "http://nlp.stanford.edu/data/glove.6B.zip" and put it in "rc_project/download/dwr" (create this folder)
Download the training and dev datasets for MARCO and put them in "rc_project/download/marco" (create this folder)
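
The folder layout above can be created with a short script. This is a minimal sketch rather than part of the repo: the helper name ensure_layout and the constant REQUIRED_DIRS are hypothetical, and only the two folder paths come from the instructions above.

```python
from pathlib import Path

# Sub-folders named in the instructions above, relative to the project root.
REQUIRED_DIRS = ("download/dwr", "download/marco")


def ensure_layout(project_root):
    """Create the download folders under project_root and return their paths.

    Hypothetical helper: it only creates empty folders. GloVe and the
    MARCO files still have to be downloaded and placed there by hand.
    """
    root = Path(project_root)
    created = []
    for rel in REQUIRED_DIRS:
        target = root / rel
        # parents=True also creates "download"; exist_ok=True makes reruns safe.
        target.mkdir(parents=True, exist_ok=True)
        created.append(target)
    return created
```

Calling ensure_layout("rc_project") from the directory that contains the project folder creates both folders if they are missing and leaves existing ones untouched.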

Setup:

pip install tqdm
Go to Keras and install it separately
Go to Recurrentshop and install it separately
Go to Seq2seq and install it
pip install h5py (for saving models)
Make sure the file structure includes data and download as described on GitHub
For the MARCO script: sudo python -m spacy.en.download --force all

Preprocessing steps:

Run marco_preprocess.py first, then marco_preprocess_second.py

Simple Model:

This simple model is the baseline for the project. It is a basic seq2seq model in which the question and passages are combined by simple concatenation. Each row is then padded to a length of 1000 word embeddings before training.
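
The concatenate-and-pad step can be sketched in plain Python. This is an illustration under stated assumptions, not the code in this repo: the function name build_input_row is hypothetical, and putting the question first, padding with zero vectors, and truncating rows longer than 1000 are all assumptions. Only the concatenation and the length of 1000 come from the description above.

```python
MAX_LEN = 1000  # padded row length stated in the description above


def build_input_row(question_vecs, passages_vecs, max_len=MAX_LEN):
    """Concatenate question and passage embeddings, then pad to max_len.

    question_vecs: list of word vectors (each a list of floats).
    passages_vecs: list of passages, each a list of word vectors.
    Assumptions (not confirmed by the repo): the question comes first,
    padding uses all-zero vectors, and longer rows are truncated.
    """
    row = list(question_vecs)
    for passage in passages_vecs:
        row.extend(passage)
    if not row:
        raise ValueError("need at least one word vector to infer the dimension")
    dim = len(row[0])
    row = row[:max_len]  # truncate rows that are too long
    # Append zero vectors until the row has exactly max_len entries.
    row.extend([0.0] * dim for _ in range(max_len - len(row)))
    return row
```

For example, a 5-word question and two passages of 7 and 3 words produce a row of 1000 vectors in which the first 15 are real embeddings and the rest are zero padding.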

Data And Models:

The data and models are too large to store on GitHub. They can be found in the Dropbox folder we have set up: https://www.dropbox.com/sh/g9bb5ralrmlj0lz/AACdz6TGCxqW_4acMD1cnmAza?dl=0

