Option Comparison Network for Multi-choice Reading Comprehension

This is the implementation of Option Comparison Network for Multi-choice Reading Comprehension, and the code is mainly based on the PyTorch implementation of BERT.

Introduction

Multiple-choice reading comprehension (MCRC) is the task of selecting the correct answer from several candidate options given a question and an article. Existing MCRC models typically either read each option independently or compute a fixed-length representation for each option before comparing them. Humans, however, usually compare the options at multiple levels of granularity before reading the article in detail, which makes their reasoning more efficient.

Mimicking this behavior, we propose an option comparison network (OCN) for MCRC that compares options at the word level to better identify their correlations and aid reasoning. Specifically, each option is encoded into a vector sequence by a skimmer to retain as much fine-grained information as possible. An attention mechanism then compares these sequences vector by vector to identify subtle correlations between options, which is potentially valuable for reasoning.
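
As an illustration of the comparison step, the sketch below computes word-level attention between each option and the concatenation of the other options, then fuses the attended summary with the original tokens. This is a simplified, hypothetical version written with PyTorch; the actual OCN layer in this repository may differ in details such as the fusion function.

import torch
import torch.nn.functional as F

def compare_options(options):
    """options: tensor of shape (num_options, option_len, hidden).
    Returns option representations augmented with attention-based
    comparison features against the other options."""
    num_opt, opt_len, hidden = options.size()
    compared = []
    for i in range(num_opt):
        opt_i = options[i]                                   # (opt_len, hidden)
        # Concatenate all other options into one comparison sequence.
        others = torch.cat([options[j] for j in range(num_opt) if j != i], dim=0)
        # Word-by-word attention: each token of option i attends to
        # every token of the other options.
        scores = opt_i @ others.t() / hidden ** 0.5          # (opt_len, others_len)
        attended = F.softmax(scores, dim=-1) @ others        # (opt_len, hidden)
        # Simple fusion: original tokens, attended summary, and their difference.
        compared.append(torch.cat([opt_i, attended, opt_i - attended], dim=-1))
    return torch.stack(compared)                             # (num_options, opt_len, 3 * hidden)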

Prerequisites

To use this source code, you need Python 3.6+, a few Python packages, the RACE dataset and pre-trained BERT models. The Python dependencies can be installed as follows:

pip install -r requirements.txt

Usage

Training

To train the model, use one of the commands below. Here, [RACE_DIR] is the directory of the original RACE data you downloaded, whose structure is as follows:

RACE
|--train
   |--high
      |--file1.txt
      |  ...
   |--middle
      |--file2.txt
      |  ...
|--dev
   |--high
      |--file3.txt
      |  ...
   |--middle
      |--file4.txt
      |  ...
|--test
   |--high
      |--file5.txt
      |  ...
   |--middle
      |--file6.txt
      |  ...
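
To make the layout above concrete, here is a minimal loading sketch, assuming each .txt file is a JSON object with "article", "questions", "options" and "answers" fields (the format of the standard RACE release). Note that run.py uses its own data reader; this is only illustrative, and the function name iter_race_examples is hypothetical.

import json
import os

def iter_race_examples(race_dir, split="train"):
    """Yield (article, question, options, answer) tuples from one RACE split."""
    for level in ("high", "middle"):
        split_dir = os.path.join(race_dir, split, level)
        for name in sorted(os.listdir(split_dir)):
            with open(os.path.join(split_dir, name), encoding="utf-8") as f:
                sample = json.load(f)
            for question, options, answer in zip(
                    sample["questions"], sample["options"], sample["answers"]):
                yield sample["article"], question, options, answer

# Example: count the training questions.
# n = sum(1 for _ in iter_race_examples("[RACE_DIR]", "train"))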

[MODEL_DIR] is the directory of model files, including the BERT parameters, the tokenizer vocabulary and the model configuration file, and [OUTPUT_DIR] is the directory where you want to save the checkpoint. [GRAD_ACCUM_NUM] is an integer by which the batch at each step is divided: gradients are accumulated over [GRAD_ACCUM_NUM] sub-batches before each parameter update. When choosing [GRAD_ACCUM_NUM], consider the batch size, the number of GPUs and the memory available on each GPU. With 4 NVIDIA Tesla P40 GPUs (24GB memory each), we set [GRAD_ACCUM_NUM] to 1 for BERT-Base and 6 for BERT-Large.

# when using BERT-Base
python run.py \
  --do_train \
  --do_eval \
  --do_lower_case \
  --race_dir [RACE_DIR] \
  --model_dir [MODEL_DIR] \
  --max_doc_len 400 \
  --max_query_len 30 \
  --max_option_len 16 \
  --train_batch_size 12 \
  --eval_batch_size 24 \
  --learning_rate 3e-5 \
  --num_train_epochs 3 \
  --gradient_accumulation_steps [GRAD_ACCUM_NUM] \
  --output_dir [OUTPUT_DIR]

# when using BERT-Large
python run.py \
  --do_train \
  --do_eval \
  --do_lower_case \
  --race_dir [RACE_DIR] \
  --model_dir [MODEL_DIR] \
  --max_doc_len 400 \
  --max_query_len 30 \
  --max_option_len 16 \
  --train_batch_size 24 \
  --eval_batch_size 24 \
  --learning_rate 1.5e-5 \
  --num_train_epochs 5 \
  --gradient_accumulation_steps [GRAD_ACCUM_NUM] \
  --output_dir [OUTPUT_DIR]
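
To clarify how --gradient_accumulation_steps interacts with --train_batch_size, here is a minimal sketch of a generic PyTorch-style accumulation loop (the names train_epoch and dataloader are hypothetical and not part of run.py). Each parameter update still corresponds to the full training batch; every forward/backward pass only processes train_batch_size / [GRAD_ACCUM_NUM] examples.

def train_epoch(model, dataloader, optimizer, grad_accum_num):
    """model, dataloader and optimizer are assumed to be PyTorch objects,
    and the model is assumed to return the loss directly."""
    model.train()
    optimizer.zero_grad()
    for step, batch in enumerate(dataloader):    # micro-batches of size train_batch_size / grad_accum_num
        loss = model(**batch)
        (loss / grad_accum_num).backward()       # scale so the accumulated gradient averages correctly
        if (step + 1) % grad_accum_num == 0:
            optimizer.step()                     # one update per grad_accum_num micro-batches
            optimizer.zero_grad()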

Evaluation

To evaluate the model, use the following command. Here, [MODEL_DIR] is the directory of the checkpoint saved during training.

python run.py \
  --do_eval \
  --do_lower_case \
  --race_dir [RACE_DIR] \
  --model_dir [MODEL_DIR] \
  --max_doc_len 400 \
  --max_query_len 30 \
  --max_option_len 16 \
  --eval_batch_size 24
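
For reference, multiple-choice accuracy is simply the fraction of questions whose highest-scoring option matches the gold answer. Below is a minimal sketch, assuming the model outputs one logit per option; run.py computes and reports its own metrics.

import torch

def accuracy(logits, labels):
    """logits: (batch, num_options); labels: (batch,) indices of the gold options."""
    predictions = logits.argmax(dim=-1)
    return (predictions == labels).float().mean().item()

# Example: 3 questions with 4 options each.
# accuracy(torch.randn(3, 4), torch.tensor([0, 2, 1]))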
