Motivations

  • Replicate results for the text summarization task on Gigaword (see 'About')
  • Get started with text summarization using OpenNMT (src)
  • Get started with ROUGE scoring using files2rouge (src)

About

Setup

git clone https://github.com/OpenNMT/OpenNMT.git opennmt
git clone --recursive https://github.com/pltrdy/files2rouge.git files2rouge

Download data from here and extract (tar -xzf summary.tar.gz) to ./data.

We assume the following directory layout:

./   
  opennmt/   
  data/   
  files2rouge/   
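
Before going further, it can help to confirm that the three directories are in place. The following is a minimal sketch: `check_layout` is a hypothetical helper (not part of OpenNMT or files2rouge), and the directory names are taken from the clone commands and the extraction step above.

```shell
# check_layout is a hypothetical helper, not part of OpenNMT or files2rouge.
# It reports which of the expected directories are missing under a root
# directory (default: the current directory) and returns non-zero if any is.
check_layout() {
  root="${1:-.}"
  missing=0
  for d in opennmt data files2rouge; do
    if [ ! -d "$root/$d" ]; then
      echo "missing: $root/$d" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Run the check from the directory that contains the three folders.
if check_layout .; then
  echo "layout OK"
else
  echo "layout incomplete, see messages above"
fi
```

If a directory is reported missing, re-run the corresponding clone or extraction step before continuing.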

Building model

The steps below follow the OpenNMT guide.

# First, move to OpenNMT dir
cd opennmt

1) Preprocess

th preprocess.lua -train_src ../data/train/train.article.txt -train_tgt ../data/train/train.title.txt -valid_src ../data/train/valid.article.filter.txt -valid_tgt ../data/train/valid.title.filter.txt -save_data ../data/train/textsum

2) Train

th train.lua -data ../data/train/textsum-train.t7 -save_model textsum

Or, using a GPU:

th train.lua -data ../data/train/textsum-train.t7 -save_model textsum -gpuid 1

3) Generate summary

th translate.lua -model textsum_final.t7 -src ../data/Giga/inputs.txt

Add -gpuid 1 if you trained the model on a GPU.
The output is written to pred.txt.
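
ROUGE scoring compares summaries and references line by line, so it is worth checking that pred.txt has one line per input before scoring. The sketch below assumes the default translation setting of one prediction per source line; `same_line_count` is a hypothetical helper, not part of OpenNMT or files2rouge.

```shell
# same_line_count is a hypothetical helper, not part of OpenNMT or files2rouge.
# Returns 0 if both files have the same number of lines, 1 if they differ,
# and 2 if either file cannot be read.
same_line_count() {
  if [ ! -r "$1" ] || [ ! -r "$2" ]; then
    return 2
  fi
  # tr strips the padding that some wc implementations add around the count.
  a=$(wc -l < "$1" | tr -d '[:space:]')
  b=$(wc -l < "$2" | tr -d '[:space:]')
  [ "$a" -eq "$b" ]
}

# Run from the opennmt directory, after translate.lua has finished.
if [ -f pred.txt ] && [ -f ../data/Giga/inputs.txt ]; then
  if same_line_count pred.txt ../data/Giga/inputs.txt; then
    echo "pred.txt is line-aligned with the inputs"
  else
    echo "pred.txt and inputs.txt differ in line count" >&2
  fi
fi
```

The same helper can be pointed at the reference file (../data/Giga/task1_ref0.txt) to confirm that references and predictions line up.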

ROUGE Scoring using files2rouge

cd ../files2rouge
./files2rouge --ref ../data/Giga/task1_ref0.txt --summ ../opennmt/pred.txt

Results

| ROUGE-1 | ROUGE-2 | ROUGE-L |
|---------|---------|---------|
| 34.2    | 16.2    | 31.9    |