author:   
Zhaojie Chen (zc153)  
TJ Tang (tt238)   
Elena Wang (xnw3)   
Chihui Shao (cs662)    
Qin He (qh58)   
Mingxuan Wang (mw446) 

In [1]:
import method
import evaluation 
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

[nltk_data] Downloading package punkt to /Users/student/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [2]:
# methods we are considering
methods_list = ["Luhn","KLSum","LSA","textRank"]

### Evaluation Metrics (ROUGE metric)  
source: https://github.com/Diego999/py-rouge#readme  
https://stackoverflow.com/questions/9879276/how-do-i-evaluate-a-text-summarization-tool    
https://en.wikipedia.org/wiki/ROUGE_(metric)  
https://www.ccs.neu.edu/home/vip/teach/DMcourse/5_topicmodel_summ/notes_slides/What-is-ROUGE.pdf

ROUGE (Recall-Oriented Understudy for Gisting Evaluation) is a set of metrics that compare an automatically produced summary against one or more human-written reference summaries. Two of its metrics are used here:
- ROUGE-N (N = 1, 2, ...): overlap of N-grams between the system and reference summaries. For example, ROUGE-1 measures the overlap of unigrams (single words).
- ROUGE-L: statistics based on the Longest Common Subsequence (LCS). The LCS captures sentence-level structure naturally, because it rewards words that appear in the same order in both summaries without requiring them to be adjacent, and it needs no predefined n-gram length.
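The two kinds of overlap described above can be illustrated with a small, self-contained sketch. The helper names (`ngrams`, `ngram_overlap`, `lcs_length`) and the toy sentences are made up for illustration; this is not the code behind `evaluation.rouge_eval` or py-rouge, which may tokenize and stem the text differently.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_overlap(system_tokens, reference_tokens, n):
    """Count n-grams shared by both summaries; repeated n-grams are clipped
    to the smaller of the two counts."""
    sys_counts = Counter(ngrams(system_tokens, n))
    ref_counts = Counter(ngrams(reference_tokens, n))
    return sum((sys_counts & ref_counts).values())

def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists
    (row-by-row dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

# Toy sentences, whitespace-tokenized (an assumption made for simplicity).
system = "the cat sat on the mat".split()
reference = "the cat lay on the mat".split()

unigram_overlap = ngram_overlap(system, reference, 1)  # basis of ROUGE-1
bigram_overlap = ngram_overlap(system, reference, 2)   # basis of ROUGE-2
lcs = lcs_length(system, reference)                    # basis of ROUGE-L

print(unigram_overlap, bigram_overlap, lcs)  # 5 3 5
```

The single differing word costs two bigrams, while the LCS still matches every other word in order. This is why ROUGE-L is more forgiving of small insertions than ROUGE-2.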

ROUGE reports precision (P), recall (R), and F-score (F). Precision measures how much of the automatically produced summary is relevant: the fraction of words in the machine-generated summary that also appear in the reference summary. Recall measures how much of the reference summary the automatically produced summary recovers: the fraction of words in the reference summary that also appear in the machine-generated summary. The F-score combines the two as their harmonic mean, $F = \frac{2PR}{P+R}$.
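As a rough sketch of these definitions, the snippet below computes ROUGE-1 precision, recall, and F-score for a toy pair of sentences. The function names `f_score` and `rouge_1` are hypothetical, and lowercasing plus whitespace splitting is an assumed tokenization. The scores returned by `evaluation.rouge_eval` may differ because the underlying library can apply its own preprocessing.

```python
from collections import Counter

def f_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def rouge_1(system_summary, reference_summary):
    """Return (precision, recall, F) for unigram overlap, as fractions."""
    sys_counts = Counter(system_summary.lower().split())
    ref_counts = Counter(reference_summary.lower().split())
    # Overlapping words, with repeated words clipped to the smaller count.
    overlap = sum((sys_counts & ref_counts).values())
    sys_total = sum(sys_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / sys_total if sys_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    return precision, recall, f_score(precision, recall)

p, r, f = rouge_1("the cat sat on the mat today", "the cat sat on the mat")
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")  # P=0.857 R=1.000 F=0.923

# The same formula applied to percentages, using the Luhn ROUGE-1 values
# reported for the first news article in this notebook (P=28.57, R=50.00).
reported_f = f_score(28.57, 50.00)
print(round(reported_f, 2))  # 36.36
```

The extra word "today" lowers precision but not recall, which shows why a longer summary tends to score higher on recall and lower on precision.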

In [3]:
# compare by ROUGE
def compare_methods_rouge(origin_file, ref_file, sentence_count=5):
    # machine summaries, in the same order as methods_list
    me = method.Method(sentence_count, file=origin_file)
    hypotheses = [me.luhn(), me.klsum(), me.lsa(), me.textRank()]
    # read the human-written reference summary
    with open(ref_file, 'r') as file:
        reference = file.read().replace('\n', '')
    # one ROUGE score table per method
    scores = [pd.DataFrame(evaluation.rouge_eval(h, reference)) for h in hypotheses]
    # place the tables side by side under a method-name column header
    compare = pd.concat(scores, keys=methods_list, axis=1)
    compare.insert(0, column="Metric", value=["ROUGE-1", "ROUGE-L"])
    return compare

In [4]:
# compare by summary length (number of characters)
def compare_methods_charcount(file, sentence_count=5):
    # machine summaries
    me = method.Method(sentence_count, file=file)
    return {
        "luhn": len(me.luhn()),
        "klsum": len(me.klsum()),
        "lsa": len(me.lsa()),
        "textRank": len(me.textRank()),
    }

### Test Cases

#### News 
source: https://www.kaggle.com/datasets/sunnysai12345/news-summary?resource=download

In [5]:
# https://stackoverflow.com/questions/48067514/utf-8-codec-cant-decode-byte-0xa0-in-position-4276-invalid-start-byte
news = pd.read_csv("data/news_summary.csv",encoding='windows-1252')
news.head(1)

Unnamed: 0,author,date,headlines,read_more,text,ctext,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 272,Unnamed: 273,Unnamed: 274,Unnamed: 275,Unnamed: 276,Unnamed: 277,Unnamed: 278,Unnamed: 279,Unnamed: 280,Unnamed: 281
0,Chhavi Tyagi,"03 Aug 2017,Thursday",Daman & Diu revokes mandatory Rakshabandhan in...,http://www.hindustantimes.com/india-news/raksh...,The Administration of Union Territory Daman an...,The Daman and Diu administration on Wednesday ...,,,,,...,,,,,,,,,,
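The `encoding='windows-1252'` argument above follows the linked Stack Overflow answer about a UTF-8 decode error on byte `0xa0`. The short sketch below shows that behaviour on a made-up byte string (not taken from the dataset): the byte is invalid as the start of a UTF-8 sequence, but it maps to a non-breaking space in Windows-1252.

```python
# Hypothetical sample bytes containing 0xa0; not read from news_summary.csv.
raw = b"Daman\xa0& Diu"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0xa0 is a UTF-8 continuation byte and cannot start a character.
    utf8_ok = False

# Windows-1252 assigns a character to 0xa0 (U+00A0, non-breaking space).
decoded = raw.decode("windows-1252")

print(utf8_ok)        # False
print(repr(decoded))  # 'Daman\xa0& Diu'
```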


In [6]:
for i in range(5):
    news_text = news["ctext"][i]  # full article
    news_ref = news["text"][i]    # human-written summary
    me = method.Method(5, text=news_text)
    summaries = [me.luhn(), me.klsum(), me.lsa(), me.textRank()]
    # one ROUGE score table per method, placed side by side
    news_compare = pd.concat(
        [pd.DataFrame(evaluation.rouge_eval(s, news_ref)) for s in summaries],
        keys=methods_list,
        axis=1
    )
    news_compare.insert(0, column="Metric", value=["ROUGE-1", "ROUGE-L"])
    print(news_compare)

    Metric   Luhn                KLSum                  LSA                \
                P      R      F      P      R      F      P      R      F   
0  ROUGE-1  28.57  50.00  36.36  50.96  88.33  64.63  50.96  88.33  64.63   
1  ROUGE-L  17.54  27.96  21.55  32.50  51.41  39.83  32.50  51.41  39.83   

  textRank                
         P      R      F  
0    46.67  81.67  59.39  
1    32.25  51.41  39.63  
    Metric   Luhn                KLSum                  LSA                \
                P      R      F      P      R      F      P      R      F   
0  ROUGE-1  17.65  28.57  21.82  14.02  23.81  17.65  19.23  31.75  23.95   
1  ROUGE-L  14.44  21.57  17.30  11.52  17.91  14.02  13.01  19.76  15.69   

  textRank                
         P      R      F  
0    18.45  30.16  22.89  
1    16.67  25.11  20.04  
    Metric   Luhn                KLSum                  LSA                \
                P      R      F      P      R      F      P      R      F   
0  ROUGE-1  

#### Essays/Blog Articles

source: https://blog.hypeinnovation.com/innovation-management-10-most-popular-articles-2016  
https://blog.hypeinnovation.com/the-single-most-important-kpi-for-building-innovation-muscle?hsCtaTracking=c5fc5fb8-3611-4e97-ac7b-5f347e517f48%7Cbb4a45cd-6c6a-4478-ade4-0f7fd322dbe0

In [7]:
compare_methods_rouge("data/innovation.txt","data/innovation_summary.txt")

| Metric | Luhn P | Luhn R | Luhn F | KLSum P | KLSum R | KLSum F | LSA P | LSA R | LSA F | textRank P | textRank R | textRank F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ROUGE-1 | 14.29 | 27.78 | 18.87 | 24.04 | 46.3 | 31.65 | 18.75 | 33.33 | 24.0 | 19.1 | 31.48 | 23.78 |
| ROUGE-L | 12.91 | 22.47 | 16.4 | 26.36 | 45.52 | 33.39 | 15.19 | 24.53 | 18.76 | 14.82 | 22.47 | 17.86 |


#### Novel (A Tale of Two Cities)

source: https://www.online-literature.com/dickens/twocities/1/  

SparkNotes editors. "A Tale of Two Cities Summary: Chapter 1: The Period" SparkNotes.com, SparkNotes LLC, 2005,  
https://www.sparknotes.com/lit/a-tale-of-two-cities/section2/.

In [8]:
# A Tale of Two Cities Chapter 1
compare_methods_rouge("data/a-tale-of-two-cities_c1.txt","data/summary_c1.txt")

| Metric | Luhn P | Luhn R | Luhn F | KLSum P | KLSum R | KLSum F | LSA P | LSA R | LSA F | textRank P | textRank R | textRank F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ROUGE-1 | 29.0 | 28.16 | 28.57 | 25.96 | 26.21 | 26.09 | 25.96 | 26.21 | 26.09 | 31.68 | 31.07 | 31.37 |
| ROUGE-L | 22.84 | 22.28 | 22.56 | 23.18 | 23.37 | 23.28 | 19.92 | 20.08 | 20.0 | 29.14 | 28.67 | 28.9 |


In [9]:
compare_methods_charcount("data/a-tale-of-two-cities_c1.txt")

{'luhn': 3588, 'klsum': 2200, 'lsa': 1047, 'textRank': 1241}

#### Novel (The Five Orange Pips)

source: https://sherlock-holm.es/ascii/  
https://en.wikipedia.org/wiki/The_Five_Orange_Pips

In [10]:
# The Five Orange Pips
compare_methods_rouge("data/orange.txt","data/summary_orange.txt")

| Metric | Luhn P | Luhn R | Luhn F | KLSum P | KLSum R | KLSum F | LSA P | LSA R | LSA F | textRank P | textRank R | textRank F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ROUGE-1 | 29.7 | 29.41 | 29.56 | 15.69 | 15.69 | 15.69 | 22.11 | 20.59 | 21.32 | 28.71 | 28.43 | 28.57 |
| ROUGE-L | 19.27 | 19.11 | 19.19 | 15.63 | 15.63 | 15.63 | 17.83 | 16.81 | 17.3 | 19.27 | 19.11 | 19.19 |


In [11]:
compare_methods_charcount("data/orange.txt")

{'luhn': 1584, 'klsum': 1615, 'lsa': 503, 'textRank': 1121}

#### Novel (1984)

source:  
https://www.george-orwell.org/1984/0.html  
SparkNotes editors. “1984 Summary: Chapter 1.” SparkNotes.com, SparkNotes LLC, 2005, https://www.sparknotes.com/lit/1984/section1/.

In [12]:
# 1984
compare_methods_rouge("data/1984.txt","data/1984_summary.txt")

| Metric | Luhn P | Luhn R | Luhn F | KLSum P | KLSum R | KLSum F | LSA P | LSA R | LSA F | textRank P | textRank R | textRank F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ROUGE-1 | 22.77 | 22.55 | 22.66 | 27.72 | 27.45 | 27.59 | 31.43 | 32.35 | 31.88 | 26.92 | 27.45 | 27.18 |
| ROUGE-L | 19.27 | 19.11 | 19.19 | 21.54 | 21.36 | 21.45 | 19.76 | 20.24 | 20.0 | 19.92 | 20.24 | 20.08 |


In [13]:
compare_methods_charcount("data/1984.txt")

{'luhn': 2503, 'klsum': 2118, 'lsa': 802, 'textRank': 899}