Code base for "Contextualized Rewriting for Text Summarization"

baoguangsheng/ctx-rewriter-for-summ

ContextRewriter

This repository contains the code for the AAAI 2021 paper "Contextualized Rewriting for Text Summarization".

Python Version: Python 3.6

Package Requirements: torch==1.1.0 pytorch_transformers tensorboardX multiprocess pyrouge

Some code is borrowed from ONMT and PreSumm.

Results

Results of the contextualized rewriter applied to various extractive summarizers on CNN/DailyMail (30/9/2020). Values in parentheses are gains over the extractive baseline in the row above:

| Models | ROUGE-1 | ROUGE-2 | ROUGE-L | Words |
|---|---|---|---|---|
| Oracle of BERT-Ext | 46.77 | 26.78 | 43.32 | 112 |
| + ContextRewriter | 52.57 (+5.80) | 29.71 (+2.93) | 49.69 (+6.37) | 63 |
| LEAD-3 | 40.34 | 17.70 | 36.57 | 85 |
| + ContextRewriter | 41.09 (+0.75) | 18.19 (+0.49) | 38.06 (+1.49) | 55 |
| BERTSUMEXT w/o Tri-Bloc | 42.50 | 19.88 | 38.91 | 80 |
| + ContextRewriter | 43.31 (+0.81) | 20.44 (+0.56) | 40.33 (+1.42) | 54 |
| BERT-Ext (ours) | 41.04 | 19.56 | 37.66 | 105 |
| + ContextRewriter | 43.52 (+2.48) | 20.57 (+1.01) | 40.56 (+2.90) | 66 |
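
The parenthesized gains can be checked directly from the scores. The sketch below is only an illustration and is not part of the repository: it copies the ROUGE values from the table above and recomputes each gain as the rewritten score minus the baseline score. The names `RESULTS`, `rouge_gains`, and `GAINS` are made up for this example.

```python
# ROUGE scores (R-1, R-2, R-L) copied from the results table above.
# Each entry maps an extractive baseline to (baseline, rewritten) scores.
RESULTS = {
    "Oracle of BERT-Ext": ((46.77, 26.78, 43.32), (52.57, 29.71, 49.69)),
    "LEAD-3": ((40.34, 17.70, 36.57), (41.09, 18.19, 38.06)),
    "BERTSUMEXT w/o Tri-Bloc": ((42.50, 19.88, 38.91), (43.31, 20.44, 40.33)),
    "BERT-Ext (ours)": ((41.04, 19.56, 37.66), (43.52, 20.57, 40.56)),
}


def rouge_gains(baseline, rewritten):
    """Return per-metric gains, rounded to two decimals as in the table."""
    return tuple(round(r - b, 2) for b, r in zip(baseline, rewritten))


# Gains keyed by baseline name, in the order (R-1, R-2, R-L).
GAINS = {name: rouge_gains(base, rew) for name, (base, rew) in RESULTS.items()}

for name, gains in GAINS.items():
    print(f"{name}: R-1 {gains[0]:+.2f}, R-2 {gains[1]:+.2f}, R-L {gains[2]:+.2f}")
```

The recomputed values match the figures in parentheses. The table also shows that the rewritten summaries are shorter than the extractive ones in every row (for example, 55 words versus 85 for LEAD-3).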

Model Evaluation

The contextualized rewriter can be evaluated with the experiment script below. The LEAD-3, BERTSUMEXT, and BERT-Ext extractive summarizers are included. All parameters and settings are hard-coded in the .py file.

    python src/exp_varext_guidabs.py 

The rewriter can also be applied to other extractive summarizers using the following code. The full example can be found in context_rewriter.py.

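    # Load the trained rewriter from a model file.
    rewriter = ContextRewriter(args.model_file)

    # doc_lines: all sentences of the source document.
    doc_lines = ["georgia high school ...", "less than 24 hours ...", ...]
    # ext_lines: the sentences selected by the extractive summarizer.
    ext_lines = ["georgia high school ...", "less than 24 hours ..."]
    # Rewrite the extracted sentences using the document as context.
    res_lines = rewriter.rewrite(doc_lines, ext_lines)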
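
As a rough sketch of how another extractor might be wired in, the example below pairs a simple LEAD-k extractor with any object that exposes the `rewrite(doc_lines, ext_lines)` call shown above. The names `lead_k`, `summarize`, and `EchoRewriter` are hypothetical and do not exist in this repository. `EchoRewriter` is a stand-in that only makes the example runnable without a trained model; in practice a `ContextRewriter` instance would be passed instead. The sketch also assumes that `rewrite` returns a list of strings.

```python
def lead_k(doc_lines, k=3):
    """Hypothetical extractor: pick the first k sentences (LEAD-k)."""
    return doc_lines[:k]


def summarize(rewriter, doc_lines, extractor=lead_k):
    """Run an extractor, then rewrite its output in document context.

    `rewriter` is any object exposing rewrite(doc_lines, ext_lines),
    the call shown in the snippet above.
    """
    ext_lines = extractor(doc_lines)
    if not ext_lines:
        return []  # nothing was extracted, so there is nothing to rewrite
    return rewriter.rewrite(doc_lines, ext_lines)


class EchoRewriter:
    """Stand-in used only to exercise the wiring: it returns the extracted
    sentences unchanged instead of calling the trained model."""

    def rewrite(self, doc_lines, ext_lines):
        return list(ext_lines)


doc = ["first sentence .", "second sentence .", "third sentence .", "fourth sentence ."]
summary = summarize(EchoRewriter(), doc)
print(summary)
```

Keeping the extractor as a plain function argument means the rewriting step stays the same whichever extractive summarizer supplies `ext_lines`.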

Model Training

The contextualized rewriter can be trained with the following script. All settings are hard-coded in the .py file.

    python src/exp_guidabs.py

By default, the input data path is ./bert_data, and the output model path is ./exp_guidabs.
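
Because the paths are fixed in the script, it can help to confirm that they are usable before starting a long training run. The sketch below is an optional pre-flight check and is not part of the repository: `check_training_paths` is a hypothetical helper, and the only facts taken from this README are the two default paths. It makes no assumption about the file names or formats inside ./bert_data.

```python
from pathlib import Path


def check_training_paths(data_dir="./bert_data", model_dir="./exp_guidabs"):
    """Return a list of problems found with the given paths (empty if none)."""
    problems = []
    data_path, model_path = Path(data_dir), Path(model_dir)
    if not data_path.is_dir():
        problems.append(f"input data directory not found: {data_path}")
    elif not any(data_path.iterdir()):
        problems.append(f"input data directory is empty: {data_path}")
    # The output directory may not exist yet, but it must not be a regular file.
    if model_path.exists() and not model_path.is_dir():
        problems.append(f"output model path is not a directory: {model_path}")
    return problems


if __name__ == "__main__":
    for problem in check_training_paths():
        print("WARNING:", problem)
```

Run it from the repository root, the same directory from which `python src/exp_guidabs.py` is launched, so that the relative paths resolve the same way.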
