In [1]:
import sys
sys.path.append('../src')

import numpy as np
import pandas as pd
import data
import utils
from main import extract_summary, report_rouge_scores

%load_ext autoreload
%autoreload 2

In [2]:
# Get list of titles, reference summaries, and body text
outlook_titles, outlook_refs, outlook_text = data.get_outlook_data()
total = len(outlook_text)
print(total)

38


### Summarization

- Summarization algorithms include:
    - SMRS (TF-IDF matrix)
    - Frank-Wolfe (TF-IDF matrix)
    - Frank-Wolfe (sentence embedding matrix)
- *MATLAB* and the *MATLAB Engine API for Python* are required to run the SMRS method. Remove `'SMRS'` from the `methods` list below if MATLAB is not installed.

- Main function: `extract_summary()`

```python
# Arguments:
#     - doc: string; article body text
#     - ref: string; reference summary
#     - title: string; title of the article
#     - k: number of extracted exemplars
#     - print_summary: print summary text for each algorithm
#     - report_rouge: report rouge score (requires the ref argument)
#     - print_rouge: print the rouge scores for each algorithm
#     - rouge_embed: use word embedding to calculate rouge score
#     - vectorize_scores: return scores in np.ndarray instead of in a dictionary
#     - methods: summarization algorithms to be used
# Return:
#     - summary: dictionary; extracted summary sentences using each algorithm
#     - word_count: dictionary; number of words in the extracted summary
#     - runtime: computation time of each algorithm
#     - scores: rouge score of each algorithm
        
summary, word_count, runtime, scores = extract_summary(doc, ref=None, title=None, k=5, print_summary=False, 
                                                       report_rouge=False, print_rouge=True, rouge_embed=False, 
                                                       vectorize_scores=False, methods=['random', 'SMRS', 'tfidf', 'embed']);

```
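
The note above about dropping `'SMRS'` when MATLAB is unavailable can be automated. The following is a minimal sketch, not part of the project: `available_methods` is a hypothetical helper, and it only checks that the `matlab` Python package (installed by the MATLAB Engine API for Python) can be found, not that MATLAB itself starts.

```python
import importlib.util


def available_methods(methods, matlab_module="matlab"):
    """Return `methods` without 'SMRS' if the MATLAB engine package is missing.

    `available_methods` and `matlab_module` are illustrative names, not part
    of the project's API.
    """
    # find_spec returns None when a top-level package cannot be located.
    has_matlab = importlib.util.find_spec(matlab_module) is not None
    return [m for m in methods if m != "SMRS" or has_matlab]


methods = available_methods(['first-k', 'SMRS', 'TextRank', 'tfidf', 'embed'])
print(methods)
```

The resulting list can be passed as the `methods` argument of `extract_summary()`.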

In [6]:
# 9
doc_idx = 0
doc = outlook_text[doc_idx]
ref = outlook_refs[doc_idx]
title = outlook_titles[doc_idx]
print(ref)

from technology and usd stability to emerging markets rebalancing, we review six key market drivers and risks in 2019. keeping inflation under control. us dollar stability. china’s resilience. calmer european politics. emerging markets rebalancing. tech and healthcare innovations. the best of all worlds is a fairly stable usd.


In [7]:
# k=5
# ratio=0.2
methods = ['first-k', 'SMRS', 'TextRank', 'tfidf', 'embed']
extract_summary(doc, ref, title, report_rouge=False, methods=methods, print_summary=True);

Soruce Text: 30 sentences, 255 distinct vocab
# of selected sentences: 7

Title: drivers likely to extend the cycle 

from technology and usd stability to emerging markets rebalancing, we review six key market drivers and risks in 2019. keeping inflation under control. us dollar stability. china’s resilience. calmer european politics. emerging markets rebalancing. tech and healthcare innovations. the best of all worlds is a fairly stable usd.
-----
Word count:48
[3.6249864 4.4029675 5.358628  4.278138  5.0041075 5.015027  3.5632715]
Similarity score: 143.55397

Growth momentum in advanced economies seems strong enough to extend the cycle into 2019 and beyond.
The more important question for markets is whether inflation will remain as benign as it has been.
If inflation rises significantly more than markets (and we) currently expect, the us federal reserve (fed) will be seen as being behind the curve.
Bond yields would further increase significantly, while equities and other risk assets

### ROUGE Score

In [8]:
%%time
# k=5
# ratio=0.3
methods = ['first-k', 'SMRS', 'TextRank', 'tfidf', 'embed']
extract_summary(doc, ref, title, report_rouge=True, rouge_embed=False, 
                methods=methods, print_summary=False, print_rouge=True);



first-k
Overlap 1-gram 			F1: 0.191
Overlap 1-gram 			Precision: 0.138
Overlap 1-gram 			Recall: 0.310
Overlap bi-gram 		F1: 0.000
Overlap bi-gram 		Precision: 0.000
Overlap bi-gram 		Recall: 0.000
Longest Common Subsequence 	F1: 0.141
Longest Common Subsequence 	Precision: 0.128
Longest Common Subsequence 	Recall: 0.286

SMRS
Overlap 1-gram 			F1: 0.132
Overlap 1-gram 			Precision: 0.091
Overlap 1-gram 			Recall: 0.238
Overlap bi-gram 		F1: 0.000
Overlap bi-gram 		Precision: 0.000
Overlap bi-gram 		Recall: 0.000
Longest Common Subsequence 	F1: 0.089
Longest Common Subsequence 	Precision: 0.082
Longest Common Subsequence 	Recall: 0.214

TextRank
Overlap 1-gram 			F1: 0.181
Overlap 1-gram 			Precision: 0.124
Overlap 1-gram 			Recall: 0.333
Overlap bi-gram 		F1: 0.000
Overlap bi-gram 		Precision: 0.000
Overlap bi-gram 		Recall: 0.000
Longest Common Subsequence 	F1: 0.115
Longest Common Subsequence 	Precision: 0.106
Longest Common Subsequence 	Recall: 0.286

tfidf
Overlap 1-gram 			F1: 

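The three score groups printed above appear to correspond to the standard ROUGE-1, ROUGE-2, and ROUGE-L definitions. As a rough illustration of how such precision, recall, and F1 values are computed, here is a minimal sketch using whitespace tokenization. The function names are hypothetical, and the project's own scorer may tokenize, stem, or weight differently.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def prf(overlap, cand_total, ref_total):
    """Turn an overlap count into (precision, recall, F1)."""
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def rouge_n(candidate, reference, n=1):
    """Clipped n-gram overlap between a candidate and a reference summary."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # multiset intersection clips counts
    return prf(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """Longest-common-subsequence based precision, recall, and F1."""
    c, r = candidate.split(), reference.split()
    return prf(lcs_length(c, r), len(c), len(r))


print(rouge_n("the cat sat on the mat", "the cat is on the mat", n=2))
```

Precision divides the overlap by the length of the extracted summary and recall divides it by the length of the reference, so a long extract can have high recall and low precision.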
### Word Embedding ROUGE Score

In [9]:
%%time
_ = extract_summary(doc, ref, title, report_rouge=True, rouge_embed=True, 
                    methods=methods, print_summary=False, print_rouge=True);



first-k
Overlap 1-gram 			F1: 0.453
Overlap 1-gram 			Precision: 0.426
Overlap 1-gram 			Recall: 0.484
Overlap bi-gram 		F1: 0.606
Overlap bi-gram 		Precision: 0.569
Overlap bi-gram 		Recall: 0.649
Longest Common Subsequence 	F1: 0.238
Longest Common Subsequence 	Precision: 0.216
Longest Common Subsequence 	Recall: 0.484

SMRS
Overlap 1-gram 			F1: 0.407
Overlap 1-gram 			Precision: 0.373
Overlap 1-gram 			Recall: 0.448
Overlap bi-gram 		F1: 0.533
Overlap bi-gram 		Precision: 0.485
Overlap bi-gram 		Recall: 0.593
Longest Common Subsequence 	F1: 0.186
Longest Common Subsequence 	Precision: 0.171
Longest Common Subsequence 	Recall: 0.448

TextRank
Overlap 1-gram 			F1: 0.451
Overlap 1-gram 			Precision: 0.419
Overlap 1-gram 			Recall: 0.488
Overlap bi-gram 		F1: 0.585
Overlap bi-gram 		Precision: 0.539
Overlap bi-gram 		Recall: 0.640
Longest Common Subsequence 	F1: 0.197
Longest Common Subsequence 	Precision: 0.182
Longest Common Subsequence 	Recall: 0.488

tfidf
Overlap 1-gram 			F1: 

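The notebook does not show how `rouge_embed=True` scores overlap. One common way to soften exact-match overlap with word embeddings is greedy cosine matching, sketched below with toy vectors. This is an assumption for illustration only: `soft_unigram_scores` is a hypothetical name and may not match the project's method.

```python
import numpy as np


def soft_unigram_scores(cand_vecs, ref_vecs):
    """Greedy soft unigram matching (illustrative, not the project's scorer).

    cand_vecs: (n_cand, dim) embeddings of the candidate summary's words.
    ref_vecs:  (n_ref, dim) embeddings of the reference summary's words.
    """
    cand = cand_vecs / np.linalg.norm(cand_vecs, axis=1, keepdims=True)
    ref = ref_vecs / np.linalg.norm(ref_vecs, axis=1, keepdims=True)
    sim = cand @ ref.T  # cosine similarity of every candidate/reference pair
    precision = float(sim.max(axis=1).mean())  # best match per candidate word
    recall = float(sim.max(axis=0).mean())     # best match per reference word
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


# Toy 2-d embeddings; real word vectors would come from a trained model.
cand = np.array([[1.0, 0.0], [0.0, 1.0]])
ref = np.array([[1.0, 0.0]])
print(soft_unigram_scores(cand, ref))
```

Because near-synonyms receive partial credit under a scheme like this, soft scores are generally higher than exact-match scores for the same pair of texts.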
### ROUGE Score Across Documents

In [3]:
start = 0
num_articles = total
articles = outlook_text[start : start + num_articles]
references = outlook_refs[start : start + num_articles]
titles = outlook_titles[start : start + num_articles]

In [4]:
%%time
methods = ['first-k', 'SMRS', 'TextRank', 'tfidf', 'embed']
rouge_mean, rouge_median, rouge_std = report_rouge_scores(articles, references, titles, methods=methods)

index =  ['1-gram F1', '1-gram Precision', '1-gram Recall', 'bi-gram F1', 'bi-gram Precision', 'bi-gram Recall', 
          'longest common F1', 'longest common Precision', 'longest common Recall', 'runtime', 'word count']

print('=' * 22 + ' Mean ' + '=' * 22)
rouge_mean.index = index
display(rouge_mean)

print('=' * 21 + ' Median ' + '=' * 21)
rouge_median.index = index
display(rouge_median)

print('=' * 15 + ' Standard Deviation ' + '=' * 15)
rouge_std.index = index
display(rouge_std)



| Mean | first-k | SMRS | TextRank | tfidf | embed |
|---|---|---|---|---|---|
| 1-gram F1 | 0.232502 | 0.149931 | 0.199017 | 0.184991 | 0.172717 |
| 1-gram Precision | 0.19577 | 0.126404 | 0.148699 | 0.124568 | 0.155612 |
| 1-gram Recall | 0.322923 | 0.210277 | 0.327309 | 0.397492 | 0.211179 |
| bi-gram F1 | 0.089026 | 0.029302 | 0.042815 | 0.052754 | 0.037985 |
| bi-gram Precision | 0.076471 | 0.028004 | 0.030624 | 0.034422 | 0.035617 |
| bi-gram Recall | 0.122522 | 0.037495 | 0.077674 | 0.12882 | 0.045623 |
| longest common F1 | 0.191703 | 0.120191 | 0.143767 | 0.126039 | 0.15024 |
| longest common Precision | 0.182406 | 0.118251 | 0.131606 | 0.117464 | 0.147717 |
| longest common Recall | 0.302875 | 0.196088 | 0.294994 | 0.377048 | 0.199876 |
| runtime | 3e-06 | 0.651033 | 0.016098 | 1.523646 | 1.657964 |




| Median | first-k | SMRS | TextRank | tfidf | embed |
|---|---|---|---|---|---|
| 1-gram F1 | 0.186649 | 0.1467 | 0.186349 | 0.160031 | 0.159003 |
| 1-gram Precision | 0.147214 | 0.108686 | 0.13077 | 0.106428 | 0.143182 |
| 1-gram Recall | 0.273864 | 0.218085 | 0.321157 | 0.370241 | 0.198309 |
| bi-gram F1 | 0.025995 | 0.0 | 0.015331 | 0.025989 | 0.007399 |
| bi-gram Precision | 0.020762 | 0.0 | 0.010652 | 0.015995 | 0.005723 |
| bi-gram Recall | 0.03576 | 0.0 | 0.027047 | 0.075499 | 0.008893 |
| longest common F1 | 0.148432 | 0.103088 | 0.124335 | 0.113506 | 0.127341 |
| longest common Precision | 0.13438 | 0.099177 | 0.117127 | 0.104356 | 0.134322 |
| longest common Recall | 0.251506 | 0.195745 | 0.269697 | 0.355575 | 0.190064 |
| runtime | 2e-06 | 0.500333 | 0.011558 | 1.055535 | 1.131383 |




| Standard deviation | first-k | SMRS | TextRank | tfidf | embed |
|---|---|---|---|---|---|
| 1-gram F1 | 0.166804 | 0.081618 | 0.09097 | 0.077202 | 0.092767 |
| 1-gram Precision | 0.168905 | 0.094093 | 0.078649 | 0.060811 | 0.099622 |
| 1-gram Recall | 0.17987 | 0.09774 | 0.12163 | 0.121196 | 0.105304 |
| bi-gram F1 | 0.177629 | 0.071052 | 0.066968 | 0.063413 | 0.075105 |
| bi-gram Precision | 0.17148 | 0.082679 | 0.048787 | 0.043653 | 0.077862 |
| bi-gram Recall | 0.198261 | 0.073696 | 0.117147 | 0.141643 | 0.083254 |
| longest common F1 | 0.16989 | 0.075666 | 0.081325 | 0.064593 | 0.086909 |
| longest common Precision | 0.167599 | 0.09357 | 0.073005 | 0.05775 | 0.096616 |
| longest common Recall | 0.183587 | 0.093853 | 0.124269 | 0.12358 | 0.10098 |
| runtime | 2e-06 | 0.398444 | 0.010995 | 1.030224 | 1.154715 |


CPU times: user 7min 29s, sys: 24.1 s, total: 7min 54s
Wall time: 10min 15s

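The three tables above summarize per-document scores as a mean, a median, and a standard deviation for each metric and method. The sketch below shows one way such an aggregation could be done with NumPy and pandas. `aggregate_scores` and the toy numbers are hypothetical, and the internals of `report_rouge_scores` are not shown in this notebook, so details such as the standard deviation's degrees of freedom may differ.

```python
import numpy as np
import pandas as pd


def aggregate_scores(per_doc_scores, index, columns):
    """Aggregate per-document score matrices into mean, median, and std tables.

    per_doc_scores: list of (n_metrics, n_methods) arrays, one per document.
    Illustrative helper only; not the project's implementation.
    """
    stacked = np.stack(per_doc_scores)  # shape: (n_docs, n_metrics, n_methods)

    def frame(values):
        return pd.DataFrame(values, index=index, columns=columns)

    return (
        frame(stacked.mean(axis=0)),
        frame(np.median(stacked, axis=0)),
        frame(stacked.std(axis=0)),  # population std (ddof=0), an assumption
    )


# Toy scores for two documents, two metrics, and two methods.
docs = [
    np.array([[0.1, 0.3], [1.0, 2.0]]),
    np.array([[0.3, 0.5], [3.0, 2.0]]),
]
mean, median, std = aggregate_scores(docs, ['1-gram F1', 'runtime'], ['first-k', 'tfidf'])
print(mean)
```

Comparing the mean with the median is useful here: where the two diverge, as in the bi-gram rows above, a few documents with unusually high overlap are pulling the mean up.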

### Word Embedding ROUGE Score Across Documents

In [5]:
%%time
methods = ['first-k', 'SMRS', 'TextRank', 'tfidf', 'embed']
rouge_mean_embed, rouge_median_embed, rouge_std_embed = report_rouge_scores(articles, references, titles, 
                                                                            rouge_embed=True, methods=methods)

index =  ['1-gram F1', '1-gram Precision', '1-gram Recall', 'bi-gram F1', 'bi-gram Precision', 'bi-gram Recall', 
          'longest common F1', 'longest common Precision', 'longest common Recall', 'runtime', 'word count']

print('=' * 22 + ' Mean ' + '=' * 22)
rouge_mean_embed.index = index
display(rouge_mean_embed)

print('=' * 21 + ' Median ' + '=' * 21)
rouge_median_embed.index = index
display(rouge_median_embed)

print('=' * 15 + ' Standard Deviation ' + '=' * 15)
rouge_std_embed.index = index
display(rouge_std_embed)



| Mean | first-k | SMRS | TextRank | tfidf | embed |
|---|---|---|---|---|---|
| 1-gram F1 | 0.481999 | 0.433341 | 0.479281 | 0.499118 | 0.437908 |
| 1-gram Precision | 0.457178 | 0.415907 | 0.443101 | 0.450556 | 0.418294 |
| 1-gram Recall | 0.513401 | 0.460778 | 0.526352 | 0.564357 | 0.466481 |
| bi-gram F1 | 0.621983 | 0.558296 | 0.613397 | 0.617496 | 0.561111 |
| bi-gram Precision | 0.589588 | 0.53027 | 0.568917 | 0.552695 | 0.533321 |
| bi-gram Recall | 0.662353 | 0.595274 | 0.668938 | 0.702454 | 0.598023 |
| longest common F1 | 0.31672 | 0.286637 | 0.257364 | 0.186777 | 0.351993 |
| longest common Precision | 0.301086 | 0.337702 | 0.235658 | 0.174202 | 0.402951 |
| longest common Recall | 0.513401 | 0.460778 | 0.526352 | 0.564357 | 0.466481 |
| runtime | 3e-06 | 0.651631 | 0.014941 | 1.510742 | 1.661917 |




| Median | first-k | SMRS | TextRank | tfidf | embed |
|---|---|---|---|---|---|
| 1-gram F1 | 0.484466 | 0.437796 | 0.478227 | 0.500027 | 0.435741 |
| 1-gram Precision | 0.462095 | 0.415514 | 0.426479 | 0.450967 | 0.420579 |
| 1-gram Recall | 0.528486 | 0.464912 | 0.539623 | 0.580349 | 0.471025 |
| bi-gram F1 | 0.615191 | 0.563448 | 0.601763 | 0.616251 | 0.542669 |
| bi-gram Precision | 0.573377 | 0.522826 | 0.55628 | 0.554532 | 0.516982 |
| bi-gram Recall | 0.655332 | 0.613431 | 0.67496 | 0.703219 | 0.605674 |
| longest common F1 | 0.309494 | 0.272578 | 0.236309 | 0.179876 | 0.367091 |
| longest common Precision | 0.278465 | 0.252493 | 0.219183 | 0.169473 | 0.335678 |
| longest common Recall | 0.528486 | 0.464912 | 0.539624 | 0.580349 | 0.471025 |
| runtime | 3e-06 | 0.528548 | 0.010568 | 1.064373 | 1.139114 |




| Standard deviation | first-k | SMRS | TextRank | tfidf | embed |
|---|---|---|---|---|---|
| 1-gram F1 | 0.07187467 | 0.059663 | 0.064598 | 0.063715 | 0.071156 |
| 1-gram Precision | 0.06985617 | 0.062838 | 0.057793 | 0.05499 | 0.070948 |
| 1-gram Recall | 0.08387914 | 0.073862 | 0.086373 | 0.089786 | 0.083054 |
| bi-gram F1 | 0.08533493 | 0.053955 | 0.053213 | 0.051586 | 0.068628 |
| bi-gram Precision | 0.08936941 | 0.055664 | 0.048164 | 0.04875 | 0.073327 |
| bi-gram Recall | 0.09372008 | 0.073056 | 0.075157 | 0.069321 | 0.081407 |
| longest common F1 | 0.1229452 | 0.08307 | 0.090519 | 0.067733 | 0.105399 |
| longest common Precision | 0.1284381 | 0.3614 | 0.080629 | 0.059175 | 0.356245 |
| longest common Recall | 0.08387914 | 0.073862 | 0.086373 | 0.089786 | 0.083054 |
| runtime | 9.636934e-07 | 0.344737 | 0.010218 | 0.978585 | 1.161512 |


CPU times: user 1h 31min 17s, sys: 4min 37s, total: 1h 35min 55s
Wall time: 1h 38min 29s
