
# **Validation of summary results**

**Created by** Cinthia M. Souza

**Created on** Tue Nov 26 12:29:16 2020

This notebook is a post-prediction validation tool. Given an .xml file in the format shown below, it reads the file, loads the information for each summary, and computes the selected metrics. At the end, it generates a new .xml file with the same format as the input, updated with the results of the new metrics.

To recalculate all metrics from scratch, indicate that you have no pre-calculated metrics. In that case, only the input text, the reference summary, and the candidate summary are loaded for each example.

So far, the supported metrics are ROUGE, NUBIA, and BLEURT. The papers that propose each metric are listed in the last cell of the notebook.
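To give an intuition for what a ROUGE score measures, here is a minimal sketch of a ROUGE-1 F1 computation based on unigram overlap. It is an illustration only: the function name `rouge_1_f1` is hypothetical, the whitespace tokenization is an assumption, and the scores stored in the .xml files may have been produced by a different ROUGE implementation with its own tokenization and stemming.

```python
from collections import Counter


def rouge_1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: unigram overlap between reference and candidate.

    Assumption: lowercase + whitespace tokenization, no stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge_1_f1("the cat sat on the mat", "the cat")
print(score)
```

In this example the candidate has perfect precision (both of its words appear in the reference) but covers only two of the six reference words, so the F1 score lands between the two values.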

.xml format:

```
<?xml version="1.0" ?>
<ZakSum BLEURT="0.0" NUBIA="0.0" rouge_1="0.0" rouge_2="0.0" rouge_L="0.0">
  <!--Generated by Amr Zaki-->
  <example>
    <article>Here is the input text </article>
    <reference>Here is the reference summary</reference>
    <summary>Here is the candidate summary</summary>
    <eval>
      <ROUGE_1 score="0.0"/>
      <ROUGE_2 score="0.0"/>
      <ROUGE_l score="0.0"/>
      <NUBIA score="0.0"/>
      <BLEURT score="-0.0"/>
    </eval>
  </example>
</ZakSum>
```
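As a rough illustration of how a file in this format can be loaded, the sketch below parses the examples with the standard library's `xml.etree.ElementTree`. The function name `load_examples`, the `keep_metrics` flag, and the `SAMPLE` string are hypothetical and are not part of the notebook; the flag mirrors the choice between reusing pre-calculated metrics and loading only the texts.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the format described above.
SAMPLE = """<ZakSum rouge_1="0.5" rouge_2="0.25" rouge_L="0.4">
  <example>
    <article>Here is the input text </article>
    <reference>Here is the reference summary</reference>
    <summary>Here is the candidate summary</summary>
    <eval>
      <ROUGE_1 score="0.5"/>
      <ROUGE_2 score="0.25"/>
      <ROUGE_l score="0.4"/>
    </eval>
  </example>
</ZakSum>"""


def load_examples(xml_text, keep_metrics=True):
    """Return one dict per <example>, optionally with its stored scores."""
    root = ET.fromstring(xml_text)
    examples = []
    for ex in root.iter("example"):
        record = {
            "article": ex.findtext("article", default=""),
            "reference": ex.findtext("reference", default=""),
            "summary": ex.findtext("summary", default=""),
        }
        if keep_metrics:
            eval_node = ex.find("eval")
            # Each child of <eval> is a metric tag with a `score` attribute.
            record["metrics"] = (
                {m.tag: float(m.get("score")) for m in eval_node}
                if eval_node is not None
                else {}
            )
        examples.append(record)
    return examples


examples = load_examples(SAMPLE)
print(examples[0]["reference"], examples[0]["metrics"])
```

Calling `load_examples(SAMPLE, keep_metrics=False)` returns only the three text fields, which corresponds to recalculating every metric from scratch.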

In [1]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [2]:
cd '/content/drive/My Drive/Colab Notebooks'

/content/drive/My Drive/Colab Notebooks


In [3]:
import os
os.chdir('nubia')
!pip install -r requirements.txt
from nubia import Nubia
nubia = Nubia()



loading archive file pretrained/roBERTa_MNLI
| dictionary: 50264 types




In [4]:
cd '/content/drive/My Drive/Colab Notebooks/SCIM/30_resultados/'

/content/drive/My Drive/Colab Notebooks/SCIM/30_resultados


In [5]:
from bs4 import BeautifulSoup
from xml.etree import ElementTree
from xml.dom import minidom
from functools import reduce
from xml.etree.ElementTree import Element, SubElement, Comment
import numpy as np

path = '/content/drive/My Drive/Colab Notebooks/SCIM/30_resultados/'
number_files = 30


rouge_1_arr  = []
rouge_2_arr  = []
rouge_L_arr  = []
bleurt_arr = []
NUBIA_arr = []

def prettify(elem):
    """Return a pretty-printed XML string for the Element."""
    rough_string = ElementTree.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")
  

top = Element('ZakSum')

comment = Comment('Generated by Amr Zaki')
top.append(comment)
 
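i = 28  # index of the model whose results are validated
# Read the input file and close it as soon as its contents are loaded
with open(path + "result_model_" + str(i) + ".xml", "r") as infile:
    contents = infile.read()
soup = BeautifulSoup(contents, 'xml')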

print("Model: {}".format(i))


for r in soup.find_all('example'):

    article = r.find('article').get_text()
    reference = r.find('reference').get_text()
    candidate = r.find('summary').get_text()

    example = SubElement(top, 'example')
    article_element   = SubElement(example, 'article')
    article_element.text = article
  
    reference_element = SubElement(example, 'reference')
    reference_element.text = reference
  
    # Use the same tag name as the input file so the output format matches
    summary_element   = SubElement(example, 'summary')
    summary_element.text = candidate

    # Read the pre-calculated ROUGE scores directly from the `score` attribute
    rouge_1 = float(r.find('ROUGE_1')['score'])
    rouge_2 = float(r.find('ROUGE_2')['score'])
    rouge_l = float(r.find('ROUGE_l')['score'])

    # An empty candidate summary is not scored with NUBIA; it gets 0
    if candidate != "":
        nubia_score = nubia.score(reference, candidate)
    else:
        nubia_score = 0

    eval_element = SubElement(example, 'eval')
    ROUGE_1_element  = SubElement(eval_element, 'ROUGE_1' , {'score':str(rouge_1)})
    ROUGE_2_element  = SubElement(eval_element, 'ROUGE_2' , {'score':str(rouge_2)})
    ROUGE_L_element  = SubElement(eval_element, 'ROUGE_l' , {'score':str(rouge_l)})
    NUBIA_element =  SubElement(eval_element,'NUBIA', {'score':str(nubia_score)})

    rouge_1_arr.append(rouge_1) 
    rouge_2_arr.append(rouge_2) 
    rouge_L_arr.append(rouge_l)
    NUBIA_arr.append(nubia_score)

top.set('rouge_1', str(np.mean(rouge_1_arr)))
top.set('rouge_2', str(np.mean(rouge_2_arr)))
top.set('rouge_L', str(np.mean(rouge_L_arr)))
top.set('NUBIA', str(np.mean(NUBIA_arr)))

with open("/content/drive/My Drive/Colab Notebooks/SCIM/NUBIA/model_" + str(i) + ".xml", "w") as f:
    print(prettify(top), file=f)

Model: 28
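The cell above handles a single model (`i = 28`). One way to make the aggregation step reusable for each result file is to factor it into a function, sketched below under stated assumptions: `build_report`, `ROOT_ATTR`, and the record layout (a dict with `article`, `reference`, `summary`, and a `metrics` dict) are hypothetical names introduced here for illustration, not part of the notebook.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Per-example metric tag -> name of the averaged attribute on <ZakSum>,
# following the .xml format shown at the top of the notebook.
ROOT_ATTR = {
    "ROUGE_1": "rouge_1",
    "ROUGE_2": "rouge_2",
    "ROUGE_l": "rouge_L",
    "NUBIA": "NUBIA",
    "BLEURT": "BLEURT",
}


def build_report(records):
    """Build a <ZakSum> tree with per-example scores and their averages."""
    root = ET.Element("ZakSum")
    for rec in records:
        example = ET.SubElement(root, "example")
        for tag in ("article", "reference", "summary"):
            ET.SubElement(example, tag).text = rec[tag]
        eval_node = ET.SubElement(example, "eval")
        for name, score in rec["metrics"].items():
            ET.SubElement(eval_node, name, {"score": str(score)})

    # Average each metric over the examples that actually report it.
    for tag, attr in ROOT_ATTR.items():
        scores = [r["metrics"][tag] for r in records if tag in r["metrics"]]
        if scores:
            root.set(attr, str(mean(scores)))
    return root


records = [
    {"article": "a1", "reference": "r1", "summary": "s1",
     "metrics": {"ROUGE_1": 0.25, "NUBIA": 0.5}},
    {"article": "a2", "reference": "r2", "summary": "s2",
     "metrics": {"ROUGE_1": 0.75, "NUBIA": 1.0}},
]
report = build_report(records)
print(ET.tostring(report, encoding="unicode"))
```

Metrics that no example reports (BLEURT in this sample) are simply left off the root element rather than written as a placeholder value.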


In [6]:
# Send a message to Slack when the validation finishes
import re
import requests
import json

web_hook_url = 'https://hooks.slack.com/services/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'
slack_msg = {'text': 'Validation was finished'}
requests.post(web_hook_url,data = json.dumps(slack_msg))

<Response [200]>

# **REFERENCES**


KANE, Hassan et al. NUBIA: NeUral Based Interchangeability Assessor for Text Generation. arXiv preprint arXiv:2004.14667, 2020.

LIN, Chin-Yew. ROUGE: A package for automatic evaluation of summaries. In: Text Summarization Branches Out. 2004. p. 74-81.

SELLAM, Thibault; DAS, Dipanjan; PARIKH, Ankur P. BLEURT: Learning Robust Metrics for Text Generation. arXiv preprint arXiv:2004.04696, 2020.