## ONLY if running on Colaboratory, run this cell first (once)

In [None]:
!git clone https://github.com/pie3636/newsjam.git
!mv newsjam/* .

## Install missing modules if needed (only run once)

In [22]:
!python -m pip install -r requirements.txt
!python -m spacy download fr_core_news_sm
# Note: You'll have to restart the kernel/runtime after running this cell

Defaulting to user installation because normal site-packages is not writeable


Defaulting to user installation because normal site-packages is not writeable
Collecting fr-core-news-sm==3.1.0
  Using cached https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.1.0/fr_core_news_sm-3.1.0-py3-none-any.whl (17.1 MB)
[+] Download and installation successful
You can now load the package via spacy.load('fr_core_news_sm')
^C


## Imports (only run once)

In [1]:
# MLSUM Corpus
from datasets import load_dataset

# Loading article data
import json

# Our packages
from eval.rouge_l import RougeLEval
from eval.bert_eval import BERT_Eval
from summ.lsa import LSASummarizer
from summ.bert_embed import BertEmbeddingsSummarizer

from tqdm import tqdm

dataset = load_dataset('mlsum', 'fr')

rouge_l = RougeLEval()
bert = BERT_Eval()
lsa_summ = LSASummarizer()
flaubert_summ = BertEmbeddingsSummarizer('flaubert/flaubert_large_cased')
camembert_summ = BertEmbeddingsSummarizer('camembert/camembert-large')

Reusing dataset mlsum (C:\Users\maxim\.cache\huggingface\datasets\mlsum\fr\1.0.0\77f23eb185781f439927ac2569ab1da1083195d8b2dab2b2f6bbe52feb600688)


  0%|          | 0/3 [00:00<?, ?it/s]

Some weights of the model checkpoint at flaubert/flaubert_large_cased were not used when initializing FlaubertModel: ['pred_layer.proj.bias', 'pred_layer.proj.weight']
- This IS expected if you are initializing FlaubertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing FlaubertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of the model checkpoint at camembert/camembert-large were not used when initializing CamembertModel: ['lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.decoder.weight', 'lm_head.bias']
- This IS expected if you are initializing CamembertModel from the checkpoint of a model trained on another task

## Summarize a single article

In [None]:
# Pick an article and its reference summary
# (index the row first so the whole column is not materialized)
sample = dataset['test'][54]
article = sample['text']
ref_summ = sample['summary']

# Compute the summaries and their ROUGE-L scores
gen_summ = flaubert_summ.get_summary(article)
scores1, scores2 = rouge_l.evaluate_one(ref_summ, gen_summ)
print(gen_summ[0])
print()
print(gen_summ[1])
print()
print(ref_summ)
print()
print(scores1)
print(scores2)
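
For readers unfamiliar with the metric, here is a minimal sketch of how a ROUGE-L style score can be computed from the longest common subsequence (LCS) of two token lists. It is not the code behind `RougeLEval`: the helper names `lcs_length` and `rouge_l_sketch` are hypothetical, and the whitespace tokenization and lowercasing are simplifying assumptions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_sketch(reference, candidate):
    """Return (precision, recall, f1) based on the LCS of lowercased tokens.

    Assumption: plain whitespace tokenization, no stemming or stop-word removal.
    """
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0, 0.0, 0.0
    lcs = lcs_length(ref_tokens, cand_tokens)
    precision = lcs / len(cand_tokens)  # share of the candidate found in the reference
    recall = lcs / len(ref_tokens)      # share of the reference covered by the candidate
    f1 = 0.0 if lcs == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f = rouge_l_sketch("le chat dort sur le canapé", "le chat noir dort")
print(p, r, f)
```

In the cell above, `scores1` and `scores2` presumably hold this kind of precision/recall/F1 triple for the two generated summaries (`gen_summ[0]` and `gen_summ[1]`).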

## Summarize a series of articles

In [19]:
texts = dataset['test']['text']
ref_summs = dataset['test']['summary']

# Summarize only the first 5 test articles (each one takes about a minute here)
gen_summs = []
for text in tqdm(texts[:5]):
    gen_summs.append(flaubert_summ.get_summary(text))

scores1, scores2 = rouge_l.evaluate_many(ref_summs, gen_summs, 5)
results = rouge_l.get_results(scores1, scores2)

for k, v in results.items():
    print(k.ljust(25), round(v*100, 3), '%')

100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [06:12<00:00, 74.59s/it]
100%|████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 10.02it/s]

Long precision avg        10.401 %
Long recall avg           9.753 %
Long F1-score avg         10.041 %
Keyword precision avg     5.288 %
Keyword recall avg        4.038 %
Keyword F1-score avg      4.538 %
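
A note on reading these averages: the reported F1 average (10.041 %) is not the harmonic mean of the averaged precision and recall (which would be about 10.07 %). That is consistent with F1 being computed per article and then averaged, although whether `get_results` does exactly this is an assumption. The sketch below uses made-up per-article scores (not taken from this run) to show that the two ways of aggregating can differ.

```python
from statistics import mean


def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical per-article (precision, recall) pairs, for illustration only
per_article = [(0.20, 0.05), (0.02, 0.16)]

avg_p = mean(p for p, _ in per_article)
avg_r = mean(r for _, r in per_article)

mean_of_f1 = mean(f1(p, r) for p, r in per_article)  # average the per-article F1 scores
f1_of_means = f1(avg_p, avg_r)                       # F1 of the averaged P and R

print(f"mean of per-article F1: {mean_of_f1:.4f}")
print(f"F1 of averaged P and R: {f1_of_means:.4f}")
```

When precision and recall are unbalanced in opposite directions across articles, the mean of per-article F1 scores comes out lower than the F1 of the averages, as it does in the numbers above.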

#### Optional: Save generated summaries to file

In [None]:
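# Write as UTF-8 explicitly: the platform default encoding (e.g. on Windows)
# may not handle accented French characters
with open('generated.txt', 'w', encoding='utf-8') as f: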
    for summ1, summ2 in tqdm(gen_summs):
        f.write(summ1)
        f.write('\n\n')
        f.write(summ2)
        f.write('\n\n')

## Summarize a series of scraped articles

In [14]:
with open('data/actu_preliminary.json', 'r', encoding='utf-8') as jsonfile:
    data = json.load(jsonfile)

texts = [article['text'] for article in data]
ref_summs = [article['summary'] for article in data]

gen_summs = []
for text in tqdm(texts):
    gen_summs.append(flaubert_summ.get_summary(text))

scores1, scores2 = rouge_l.evaluate_many(ref_summs, gen_summs)
results = rouge_l.get_results(scores1, scores2)

for k, v in results.items():
    print(k.ljust(25), round(v*100, 3), '%')

100%|██████████████████████████████████████████████████████████████████████████████████| 47/47 [28:02<00:00, 35.80s/it]
100%|██████████████████████████████████████████████████████████████████████████████████| 47/47 [00:00<00:00, 74.57it/s]

Long precision avg        23.711 %
Long recall avg           24.248 %
Long F1-score avg         23.631 %
Keyword precision avg     21.481 %
Keyword recall avg        23.798 %
Keyword F1-score avg      22.118 %
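
The loading cell above reads `text` and `summary` from every entry, so it expects the JSON file to be a list of objects with at least those two keys. The sketch below, with a hypothetical helper `load_articles` and made-up sample data, shows one way to skip incomplete entries before summarizing; the real file may contain additional fields.

```python
import json


def load_articles(json_text):
    """Parse a JSON array and keep entries with non-empty 'text' and 'summary'.

    Returns the kept entries and the number of entries that were skipped.
    """
    data = json.loads(json_text)
    kept = [
        entry for entry in data
        if isinstance(entry, dict) and entry.get('text') and entry.get('summary')
    ]
    return kept, len(data) - len(kept)


# Made-up sample mimicking the assumed structure of the scraped data
sample = json.dumps([
    {"text": "Premier article.", "summary": "Résumé."},
    {"text": "Article sans résumé.", "summary": ""},
], ensure_ascii=False)

articles, skipped = load_articles(sample)
print(len(articles), "usable article(s),", skipped, "skipped")
```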

## Implementation of BERTScore

In [15]:
long_summs, short_summs, ref_summs, key_ref_sums = bert.split_summs(gen_summs, ref_summs)

In [16]:
bert.bert_score(long_summs, short_summs, ref_summs, key_ref_sums)

Some weights of the model checkpoint at bert-base-multilingual-cased were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.decoder.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


calculating scores...
computing bert embedding.


  0%|          | 0/2 [00:00<?, ?it/s]

computing greedy matching.


  0%|          | 0/1 [00:00<?, ?it/s]

done in 11.67 seconds, 4.03 sentences/sec


{'Long precision avg': tensor(0.1715),
 'Long recall avg': tensor(0.2140),
 'Long F1-score avg': tensor(0.1925)}
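
The "computing greedy matching" step in the log refers to how BERTScore pairs tokens: each candidate token is matched with its most similar reference token (precision), and each reference token with its most similar candidate token (recall), using cosine similarity between contextual embeddings. Below is a minimal pure-Python sketch of that idea on toy 2-D vectors. The function names are hypothetical, the vectors stand in for real BERT embeddings, and the optional IDF weighting and baseline rescaling of the `bert_score` library are left out.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def greedy_match_scores(cand_vecs, ref_vecs):
    """Return (precision, recall, f1) from greedy best-match similarities."""
    sim = [[cosine(c, r) for r in ref_vecs] for c in cand_vecs]
    # Precision: best reference match for every candidate token
    precision = sum(max(row) for row in sim) / len(cand_vecs)
    # Recall: best candidate match for every reference token
    recall = sum(
        max(sim[i][j] for i in range(len(cand_vecs)))
        for j in range(len(ref_vecs))
    ) / len(ref_vecs)
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy "embeddings": two candidate tokens, one reference token
candidate = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0]]
print(greedy_match_scores(candidate, reference))
```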

In [None]:
bert.get_matrix(long_summs, ref_summs, 4)