# **WEBNLG 16-domains Concept-based**

Authors: *Dario Della Mura - David Doci*

*INSID&S Lab*

*Department of Computer Science, Systems and Communication - 
University of Milano-Bicocca*


# Dataset Presentation

*The WebNLG dataset consists of 35426 (data, text) pairs and 13211 distinct data units. The data units are sets of RDF triples extracted from DBpedia, and the texts are sequences of one or more sentences verbalising these data units. This notebook uses the original WebNLG version 3.0 dataset, pre-processed with the WebNLG Challenge 2017 baseline.*
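
To make the data format concrete, the sketch below builds one (data, text) pair and applies a simplified concept-based delexicalisation to it. The `delexicalise` helper, the example triple and the category mapping are illustrative assumptions; the baseline scripts used later in this notebook handle many more cases.

```python
def delexicalise(triples, text, categories):
    """Replace entities with placeholders in both the data and the text.

    `categories` maps an entity string to a placeholder such as "AIRPORT".
    Objects without a category fall back to the upper-cased property name
    (a simplifying assumption that mirrors the placeholders seen in the data).
    """
    data_tokens = []
    for subject, prop, obj in triples:
        subj_tag = categories.get(subject, subject)
        obj_tag = categories.get(obj, prop.upper())
        data_tokens += [subj_tag, prop, obj_tag]
        text = text.replace(subject, subj_tag).replace(obj, obj_tag)
    return " ".join(data_tokens), text


# Illustrative data unit with a single triple and one verbalisation.
triples = [("Aarhus Airport", "location", "Tirstrup")]
text = "Aarhus Airport is located in Tirstrup ."
data, delex_text = delexicalise(triples, text, {"Aarhus Airport": "AIRPORT"})
print(data)        # AIRPORT location LOCATION
print(delex_text)  # AIRPORT is located in LOCATION .
```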

## Setting Environment

In [None]:
from google.colab import drive

drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# change with your paths

# train set path for model seen
train_path = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/webnlg-train.csv'

# validation set path for model seen
val_path = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/webnlg-val.csv'

# train set path for model unseen
train_path_u = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/UNSEEN/Concept-based/webnlg-train.csv'

#input-splitted df for metric
input_splitted_path = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/input-splitted.csv'

# words-3 df for metric 
words_3_path = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/words_3.csv'

# words_45 df for metric 
words_45_path = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/words_45.csv'

# words_345 final df for metric 
words_345_path = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/words_345.csv'

# validation set path for model unseen
val_path_u = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/UNSEEN/Concept-based/webnlg-val.csv'

# meteor metric path
meteor_path = '/content/drive/MyDrive/rdf-to-text/meteor-1.5'

# ter metric path
ter_path = '/content/drive/MyDrive/rdf-to-text/tercom-0.7.25'

# model seen
model_lstm = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/lstm_model.pt'

#model seen
model_transformer = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/SEEN/Concept-based/transformer_model.pt'

# model unseen
model_lstm_u = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/UNSEEN/Concept-based/lstm_model.pt'

# model unseen
model_transformer_u = '/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/UNSEEN/Concept-based/transformer_model.pt'

In [None]:
%%capture
# clone OpenNMT-py and set up its environment
!git clone https://github.com/OpenNMT/OpenNMT-py.git
%cd OpenNMT-py
!pip install -e .

# install openNMT requirements
!pip install -r requirements.opt.txt

%cd /content/

In [None]:
from gensim.parsing.preprocessing import remove_stopwords
from string import punctuation
from nltk.corpus import stopwords
from google.colab import data_table

import pandas as pd
import numpy as np
import nltk
import string
import shutil
import re
import os

# improve visualisation of data
data_table.enable_dataframe_formatter()

nltk.download('stopwords')
stop = stopwords.words('english')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


# SEEN

## Dataset Creation 

In [None]:
# import webnlg baseline from the official gitlab repository

!git clone 'https://gitlab.com/webnlg/webnlg-baseline.git'

Cloning into 'webnlg-baseline'...
remote: Enumerating objects: 23, done.
remote: Counting objects: 100% (23/23), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 23 (delta 7), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (23/23), done.


In [None]:
# import webnlg dataset from the official gitlab repository

!git clone https://gitlab.com/shimorina/webnlg-dataset.git

Cloning into 'webnlg-dataset'...
remote: Enumerating objects: 5112, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 5112 (delta 2), reused 0 (delta 0), pack-reused 5106
Receiving objects: 100% (5112/5112), 26.09 MiB | 9.13 MiB/s, done.
Resolving deltas: 100% (4010/4010), done.
Checking out files: 100% (1425/1425), done.


In [None]:
# create data-directory folder and move needed files

shutil.copytree('/content/webnlg-dataset/release_v3.0/en/dev','/content/data-directory/dev')
shutil.copytree('/content/webnlg-dataset/release_v3.0/en/train', '/content/data-directory/train')
shutil.copytree('/content/webnlg-dataset/release_v3.0/en/test' , '/content/data-directory/test')

'/content/data-directory/test'

In [None]:
# add new field in /content/webnlg-baseline/delex_dict.json for company category

'''
"Company": [
    "AmeriGas",
    "Chinabank",
    "GMA_New_Media",
    "Hypermarcas",
    "Trane"
  ]
'''
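
The manual edit described above could also be scripted. The sketch below assumes that `delex_dict.json` is a flat JSON object mapping each category name to a list of entity names, as the snippet suggests; `add_category` is a hypothetical helper, not part of the baseline repository.

```python
import json


def add_category(dict_path, category, entities):
    """Add `entities` under `category` in a JSON file mapping category -> entity list."""
    with open(dict_path, encoding="utf-8") as fh:
        delex_dict = json.load(fh)
    current = delex_dict.setdefault(category, [])
    for entity in entities:
        if entity not in current:  # keep the list free of duplicates
            current.append(entity)
    with open(dict_path, "w", encoding="utf-8") as fh:
        json.dump(delex_dict, fh, indent=2, ensure_ascii=False)
    return delex_dict


company_entities = ["AmeriGas", "Chinabank", "GMA_New_Media", "Hypermarcas", "Trane"]
# In this notebook the call would be:
# add_category('/content/webnlg-baseline/delex_dict.json', "Company", company_entities)
```

Running the helper twice leaves the file unchanged, so the cell can be re-executed safely.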

In [None]:
# move to the webnlg-baseline folder and pre-process the data with webnlg_baseline_input.py

%cd '/content/webnlg-baseline'
!python webnlg_baseline_input.py -i '/content/data-directory/'

/content/webnlg-baseline
Input directory is  /content/data-directory/
Total of 88 files processed in train with all-delex mode
Total of 88 files processed in train with all-notdelex mode
Total of 88 files processed in dev with all-delex mode
Total of 88 files processed in dev with all-notdelex mode
Files necessary for training/evaluating are written on disc.


In [None]:
# move the pre-processing output files into the data folder

%cd '/content/'
%ls
!mkdir 'data'
lista_files = ['train-webnlg-all-delex.triple', 'train-webnlg-all-delex.lex', 'dev-webnlg-all-delex.triple', 'dev-webnlg-all-delex.lex']

os.chdir('/content/webnlg-baseline')
dst_dir = "/content/data/"
for f in lista_files:
  shutil.copy(f, dst_dir)

/content
data-directory/  OpenNMT-py/   webnlg-baseline/
drive/           sample_data/  webnlg-dataset/


In [None]:
# create a df for the train reference texts

with open('/content/data/train-webnlg-all-delex.lex', 'r') as file1:
  Lines = file1.readlines()
lex = pd.DataFrame(Lines, columns=['ref_text'])

In [None]:
# create a df for the train rdf triples

with open('/content/data/train-webnlg-all-delex.triple', 'r') as file1:
  Lines = file1.readlines()
triple = pd.DataFrame(Lines, columns=['rdf_triple'])

In [None]:
# creation of training set

train = pd.concat([triple, lex], axis=1)
# strip line breaks from the reference texts
train['ref_text'] = train['ref_text'].replace(r'[\n\r]', '', regex=True)

len(train)

35426

In [None]:
# save as csv file (train_path is defined in the environment cell)
train.to_csv(train_path)

In [None]:
# create a df for the val reference texts

with open('/content/data/dev-webnlg-all-delex.lex', 'r') as file1:
  Lines = file1.readlines()
lex = pd.DataFrame(Lines, columns=['ref_text'])

In [None]:
# create a df for the val rdf triples

with open('/content/data/dev-webnlg-all-delex.triple', 'r') as file1:
  Lines = file1.readlines()
triple = pd.DataFrame(Lines, columns=['rdf_triple'])

In [None]:
# creation of val set

val = pd.concat([triple, lex], axis=1)
# strip line breaks from the reference texts
val['ref_text'] = val['ref_text'].replace(r'[\n\r]', '', regex=True)
len(val)

4464

In [None]:
# save as csv file (val_path is defined in the environment cell)
val.to_csv(val_path)

## Import Dataset

In [None]:
# import train and val set

train_raw = pd.read_csv(train_path)
val_raw = pd.read_csv(val_path)

train_raw.drop(columns=['Unnamed: 0'], inplace=True)
val_raw.drop(columns=['Unnamed: 0'], inplace=True)
train_raw.head(10)

Unnamed: 0,rdf_triple,ref_text
0,AIRPORT location LOCATION\n,AIRPORT is located in the LOCATION .
1,United States capital CAPITAL\n,CAPITAL is the capital of the U . S .
2,AIRPORT runwayLength RUNWAYLENGTH AIRPORT loca...,AIRPORT can be found in LOCATION and is operat...
3,FOOD country COUNTRY COUNTRY demonym DEMONYM C...,"FOOD is a dish from COUNTRY , where the leader..."
4,CITY isPartOf ISPARTOF ISPARTOF countySeat COU...,Arlington in the COUNTRY is part of Tarrant Co...
5,LOCATION capital CAPITAL LOCATION leaderTitle ...,LEADER and Joe Biden are leaders in the LOCATI...
6,Clyde F . C . ground GROUND\n,Clyde F . C . ' s football ground is called GR...
7,COUNTRY leader LEADER UNIVERSITY country COUNT...,The leader of COUNTRY is Lars Lokke Rasmussen ...
8,SPORTSTEAM league LEAGUE GROUND leader LEADER ...,CHAMPIONS are champions of LEAGUE . in which A...
9,SITE location LOCATION SITE headquarter HEADQU...,"The SITE , where the LAUNCHSITE is located and..."


## Input Metrics

### Descriptive Statistics

In [None]:
# the other descriptive metrics are missing because we are still working on them

In [None]:
df = train_raw.copy()

In [None]:
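'''
# To isolate the triples in every input line, run this code. However, it does not capture all triples.
from nltk.tokenize.treebank import TreebankWordDetokenizer
from tqdm import tqdm
import time
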
j=0
for j in tqdm(range(len(df))):
  time.sleep(.1)
  sent =df['input_text'][j]
  sent_splitted = sent.split()
  i=0
  for i in range (len(sent_splitted)): 
    if sent_splitted[i]== sent_splitted[i].upper() and sent_splitted[i-1].lower()== sent_splitted[i].lower() and i !=0 and i!=len(sent_splitted)-1:
      sent_splitted[i] = sent_splitted[i] + " &&"
      i=i+1
    else:
      i=i+1
  df['input_text'][j] =TreebankWordDetokenizer().detokenize(sent_splitted)
  input = df[['input_text']]
  input = input.assign(input_text=input['input_text'].str.split('&&')).explode('input_text')
  input.input_text = input.input_text.str.lstrip()
  input.input_text = input.input_text.str.rstrip()
  j=j+1

input = input.reset_index()
input.to_csv('input-splitted.csv')

splitted_df = pd.read_csv('/content/input-splitted.csv')
splitted_df.drop(columns=['Unnamed: 0'], inplace=True)
splitted_df.drop(columns=['index'], inplace=True)
splitted_df

# count tokens in input_text for each row
splitted_df['count_words'] = splitted_df['input_text'].apply(lambda x: len(str(x).split(' ')))

# select only rows that count_words = 3
words_3 = splitted_df[splitted_df['count_words']==3]
words_3.reset_index(inplace=True)

# compute sub-pred-obj for each row that has 3 tokens in input_text. It is not 100% accurate
from tqdm import tqdm
import time
words_3['sub']=''
words_3['pred']=''
words_3['obj']=''

j=0
for j in tqdm(range(len(words_3))):
  time.sleep(.1)
  sent =words_3['input_text'][j]
  sent_splitted = sent.split()
  i=0
  for i in range (len(sent_splitted)): 
    words_3['sub'][j] = sent_splitted[len(sent_splitted)-3]
    words_3['pred'][j] = sent_splitted[len(sent_splitted)-2]
    words_3['obj'][j] = sent_splitted[len(sent_splitted)-1]
  j = j+1

words_3.to_csv('words_3.csv')

# select only rows that count_words = 4 or 5
words_4_5  = splitted_df.loc[(splitted_df['count_words']==4) | (splitted_df['count_words']==5)]
words_4_5.reset_index(inplace=True)

# compute sub-pred-obj for each row that has 4 or 5 tokens in input_text. It is not 100% accurate
from tqdm import tqdm
words_4_5['sub'] = ''
words_4_5['pred'] = ''
words_4_5['obj'] = ''

j=0
for j in tqdm(range(len(words_4_5))):
  time.sleep(.1)
  sent =words_4_5['input_text'][j]
  sent_splitted = sent.split()
  i=0
  for i in range (len(sent_splitted)): 
    if sent_splitted[len(sent_splitted)-1]== sent_splitted[len(sent_splitted)-1].upper() and sent_splitted[len(sent_splitted)-1].lower()== sent_splitted[len(sent_splitted)-2].lower() :
      words_4_5['obj'][j] = sent_splitted[len(sent_splitted)-1]
      words_4_5['pred'][j] = sent_splitted[len(sent_splitted)-2]
      words_4_5['sub'][j] = sent_splitted[0:len(sent_splitted)-2]
    else:
       words_4_5['obj'][j]=sent_splitted[len(sent_splitted)-1]
       words_4_5['pred'][j]= np.NaN
       words_4_5['sub'][j]=np.NaN

  words_4_5['input_text'][j] =TreebankWordDetokenizer().detokenize(sent_splitted)
  j=j+1


words_45 = words_4_5.copy()
words_45.to_csv('sop_45-words.csv')

# join words_3 df with words_45 df
words_45['sub'] =  words_45['sub'].astype(str)
words_345 = pd.concat([words_3, words_45], ignore_index=True)
words_345.drop(columns=['index', 'count_words'], inplace=True)
words_345.to_csv('words_345.csv')
'''
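
The splitting heuristic used in the disabled cell above can be sketched on its own. It assumes that a triple ends at an upper-case token repeating the property name before it (for example `location LOCATION`); `split_triples` is an illustrative helper rather than the exact code that produced the CSV files. Triples whose object is not such a placeholder are not separated, which is why the heuristic misses some triples.

```python
def split_triples(line):
    """Split a delexicalised input line into triples.

    A triple is assumed to end at an upper-case token that repeats the
    preceding property name, e.g. `location LOCATION`.
    """
    tokens = line.split()
    triples, current = [], []
    for i, token in enumerate(tokens):
        current.append(token)
        ends_triple = (
            i > 0
            and token == token.upper()
            and token.lower() == tokens[i - 1].lower()
        )
        if ends_triple:
            triples.append(" ".join(current))
            current = []
    if current:  # leftover tokens that never reached a closing placeholder
        triples.append(" ".join(current))
    return triples


print(split_triples("AIRPORT runwayLength RUNWAYLENGTH AIRPORT location LOCATION"))
# ['AIRPORT runwayLength RUNWAYLENGTH', 'AIRPORT location LOCATION']

# Two triples stay merged because "Duncan Rouleau" is not a placeholder.
print(split_triples("COMICSCHARACTER creator Duncan Rouleau COMICSCHARACTER series SERIES"))
```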

In [None]:
words_345 = pd.read_csv(words_345_path)

In [None]:
def compute_metrics_web_base(df_):
  # pair-level statistics come from the global `df` (one data-text pair per row);
  # `df_` holds one split triple per row (columns: input_text, sub, pred, obj)
  data_text_pairs = len(df)
  distinct_inputs = len(df.rdf_triple.unique())

  triples = df_.copy()
  triples['sub'] = triples['sub'].astype(str)
  triples.input_text = triples.input_text.str.strip()

  n_triples = len(triples)
  dupl = triples.input_text.duplicated().sum()
  perc_duplicated = dupl / n_triples * 100
  dist_sub = triples['sub'].nunique()
  dist_obj = triples.obj.nunique()
  dist_pred = triples.pred.nunique()
  dist_sub_pred = len(triples.drop_duplicates(subset=['sub', 'pred']))
  dist_sub_obj = len(triples.drop_duplicates(subset=['sub', 'obj']))
  dist_obj_pred = len(triples.drop_duplicates(subset=['obj', 'pred']))
  avg_triple_for_sentence = n_triples / data_text_pairs
  avg_text_for_triple = len(df.ref_text.unique()) / distinct_inputs

  print("Number of data-text-pairs: ", data_text_pairs)
  print("Number of distinct inputs: ", distinct_inputs)
  print("Number of triples: ", n_triples)
  print("Number of duplicated triples: ", dupl)
  print("Perc of duplicated triples: ", perc_duplicated)
  print("Number of distinct properties: ", dist_pred)
  print("Number of distinct subjects: ", dist_sub)
  print("Number of distinct objects: ", dist_obj)
  print("Number of distinct subject and predicate: ", dist_sub_pred)
  print("Number of distinct object and predicate: ", dist_obj_pred)
  print("Number of distinct subject and object: ", dist_sub_obj)
  print("Average triple for one sentence: ", avg_triple_for_sentence)
  print("Average sentence for one triple: ", avg_text_for_triple)

compute_metrics_web_base(words_345)

Number of data-text-pairs:  35426
Number of distinct inputs:  7792
Number of triples:  85545
Number of duplicated triples:  84028
Perc of duplicated triples:  98.22666432871588
Number of distinct properties:  373
Number of distinct subjects:  509
Number of distinct objects:  390
Number of distinct subject and predicate:  1438
Number of distinct object and predicate:  412
Number of distinct subject and object:  1473
Average triple for one sentence:  2.414751877152374
Average sentence for one triple:  4.330852156057495


### Lexical Richness

Lexical richness is the percentage of unique words among all words in the reference texts, computed after removing stopwords and punctuation.
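
As a worked illustration of this definition, the sketch below computes the score on two toy sentences. The tiny `STOPWORDS` set and the `lexical_richness` helper are assumptions made for the example; the notebook itself relies on gensim's `remove_stopwords`.

```python
import string

# Tiny illustrative stopword list (assumption); the notebook uses gensim's list.
STOPWORDS = {"is", "the", "in", "of", "a", "and"}


def lexical_richness(texts):
    """Percentage of distinct words among all words, ignoring stopwords and punctuation."""
    words = []
    for text in texts:
        cleaned = text.translate(str.maketrans("", "", string.punctuation))
        words += [w for w in cleaned.split() if w.lower() not in STOPWORDS]
    return 100 * len(set(words)) / len(words) if words else 0.0


texts = ["AIRPORT is located in the LOCATION .", "The AIRPORT is in LOCATION ."]
# Remaining words: AIRPORT, located, LOCATION, AIRPORT, LOCATION -> 3 distinct out of 5
print(lexical_richness(texts))  # 60.0
```

A low score, like the one reported for this dataset, indicates that a small vocabulary is reused across many reference texts.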

In [None]:
# lexical richness

text_to_clean = df['ref_text'] # column of your dataset
text_to_clean = text_to_clean.replace(r'[\n\r]', '', regex=True)

text = text_to_clean.str.cat(sep=' ')

def text_clean(text):
  # remove stopwords, then strip punctuation
  filtered_sentence = remove_stopwords(text)
  return filtered_sentence.translate(str.maketrans('', '', string.punctuation))

def unique_words(text):
  # number of distinct words left after cleaning
  words = text_clean(text).split()
  return len(set(words))

def richness_score(text):
  # percentage of distinct words among all remaining words
  score = unique_words(text) / len(text_clean(text).split())
  return score * 100

richness_score(text)

1.8627724986025713

### Occurrence Metric

The occurrence metric is the fraction of the words in each RDF triple input (lowercased, with stopwords removed) that also appear in the corresponding reference text, averaged over all data-text pairs.
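
A minimal sketch of the per-pair score on toy data is given below. `occurrence_score` is an illustrative helper; like the notebook code, it tests each word as a substring of the reference text rather than as a whole token.

```python
def occurrence_score(triple_words, reference_text):
    """Fraction of triple words that occur (as substrings) in the reference text."""
    if not triple_words:
        return 0.0
    hits = sum(word in reference_text for word in triple_words)
    return hits / len(triple_words)


# Lower-cased toy pair in the style of the delexicalised data.
full_match = occurrence_score(
    "airport location location".split(),
    "airport is located in the location .",
)
print(full_match)  # 1.0

# "runwaylength" never appears in the text, so 2 of the 3 words match.
partial_match = occurrence_score(
    "airport runwaylength location".split(),
    "airport is located in location .",
)
print(partial_match)
```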

In [None]:
def Occurence_Metric(df):
  # normalise both columns: lowercase everything, then strip non-word
  # characters and stopwords from the triples
  df['rdf_triple'] = df['rdf_triple'].str.lower()
  df['ref_text'] = df['ref_text'].str.lower()
  df['rdf_triple'] = df['rdf_triple'].str.replace(r'[^\w]|_', ' ', regex=True)
  df['rdf_triple'] = df['rdf_triple'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stop)]))

  def row_score(row):
    words = row['rdf_triple'].split()
    if not words:
      return np.nan  # nothing left after cleaning; ignored by mean()
    # substring match of each triple word against the reference text
    return sum(word in row['ref_text'] for word in words) / len(words)

  df['occurance_metric'] = df.apply(row_score, axis=1)
  return df['occurance_metric'].mean()



Occurence_Metric(df)

0.8050118807550063

### Bleu - Meteor - Rouge

In [None]:
%%capture
# import bleu metric
%cd /content/
!wget https://raw.githubusercontent.com/moses-smt/mosesdecoder/master/scripts/generic/multi-bleu.perl

In [None]:
%%capture
# import meteor metric
import shutil

source_dir = meteor_path
destination_dir = r"/content/meteor-1.5"
shutil.copytree(source_dir, destination_dir)

In [None]:
%%capture
# import and install rouge metric
%cd /content/
!git clone https://github.com/pltrdy/rouge.git
%cd rouge
!python setup.py install

In [None]:
# function to clean df and write the input files for the metrics

def clean_df(df):
  %cd /content/
  df['rdf_triple'] = df['rdf_triple'].str.lower()
  df['ref_text'] = df['ref_text'].str.lower()
  df['rdf_triple'] = df['rdf_triple'].str.replace(r'[^\w]|_', ' ', regex=True)
  df['rdf_triple'] = df['rdf_triple'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stop)]))
  df['ref_text'] = df['ref_text'].replace(r'[\n\r]', '', regex=True)
  # the cleaned triples are written as the reference, the texts as the hypothesis
  np.savetxt(r'ref.txt', df['rdf_triple'].values, fmt='%s', delimiter='\t')
  np.savetxt(r'hyp.txt', df['ref_text'].values, fmt='%s', delimiter='\t')

In [None]:
# function to compute bleu metric
def bleu():
  %cd /content/
  bleu = !perl multi-bleu.perl /content/ref.txt < /content/hyp.txt
  print(bleu[0])

# function to compute meteor metric
def meteor():
  %cd /content/meteor-1.5/
  meteor= !java -Xmx2G -jar meteor-1.5.jar /content/hyp.txt /content/ref.txt -l en -norm 
  print("Meteor", meteor[-1])

# function to compute rouge metric
def rouge():
  %cd /content/rouge/
  from rouge import FilesRouge
  hyp_path = '/content/hyp.txt'
  ref_path= '/content/ref.txt'
  files_rouge = FilesRouge()
  rouge = files_rouge.get_scores(hyp_path, ref_path, avg=True)
  return rouge

clean_df(df)

/content



In [None]:
bleu()

/content
BLEU = 1.04, 19.9/1.9/0.4/0.1 (BP=1.000, ratio=2.113, hyp_len=715189, ref_len=338525)
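
The summary line printed by `multi-bleu.perl` packs several values together: the overall BLEU score, the 1-gram to 4-gram precisions, the brevity penalty (`BP`), the hypothesis/reference length ratio, and the two lengths. The sketch below parses such a line into a dictionary; `parse_bleu` and the dictionary keys are names chosen for this example.

```python
import re

BLEU_PATTERN = re.compile(
    r"BLEU = (?P<bleu>[\d.]+), "
    r"(?P<p1>[\d.]+)/(?P<p2>[\d.]+)/(?P<p3>[\d.]+)/(?P<p4>[\d.]+) "
    r"\(BP=(?P<bp>[\d.]+), ratio=(?P<ratio>[\d.]+), "
    r"hyp_len=(?P<hyp_len>\d+), ref_len=(?P<ref_len>\d+)\)"
)


def parse_bleu(line):
    """Turn a multi-bleu.perl summary line into a dictionary of numbers."""
    match = BLEU_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a multi-bleu.perl summary line: {line!r}")
    fields = match.groupdict()
    return {
        "bleu": float(fields["bleu"]),
        "ngram_precisions": [float(fields[f"p{n}"]) for n in range(1, 5)],
        "brevity_penalty": float(fields["bp"]),
        "length_ratio": float(fields["ratio"]),
        "hyp_len": int(fields["hyp_len"]),
        "ref_len": int(fields["ref_len"]),
    }


line = "BLEU = 1.04, 19.9/1.9/0.4/0.1 (BP=1.000, ratio=2.113, hyp_len=715189, ref_len=338525)"
result = parse_bleu(line)
print(result["bleu"], result["ngram_precisions"])  # 1.04 [19.9, 1.9, 0.4, 0.1]
```

Here the length ratio above 2 shows that the reference texts are roughly twice as long as the cleaned triples they are compared with.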


In [None]:
meteor()

/content/meteor-1.5
Meteor Final score:            0.17048651688959673


In [None]:
rouge()

/content/rouge


{'rouge-1': {'f': 0.36971125309488984,
  'p': 0.24765951524965132,
  'r': 0.7789398032312577},
 'rouge-2': {'f': 0.03321186023177212,
  'p': 0.0255197839909145,
  'r': 0.04968113664603028},
 'rouge-l': {'f': 0.3206605979346582,
  'p': 0.21489950301116909,
  'r': 0.6757260302389392}}

### Bert Score

In [None]:
%%capture
# install bert score metric
%cd /content/
!pip install bert-score

In [None]:
def clean_df(df):
  df['rdf_triple'] = df['rdf_triple'].str.lower()
  df['ref_text'] = df['ref_text'].str.lower()
  df['rdf_triple'] = df['rdf_triple'].str.replace(r'[^\w]|_', ' ', regex=True)
  df['rdf_triple'] = df['rdf_triple'].apply(lambda x: ' '.join([word for word in x.split() if word not in (stop)]))
  lista_triple = df['rdf_triple'].tolist()
  lista_text = df['ref_text'].tolist()
  return lista_text, lista_triple

# function to compute bert score metric
def bert_score_(references, hypothesis, lng='en'):

    from bert_score import score
    # each entry may be one reference string or a list of reference strings;
    # wrap bare strings so they are not iterated character by character,
    # and drop empty references
    references = [
        [ref for ref in ([refs] if isinstance(refs, str) else refs) if ref.strip() != '']
        for refs in references
    ]
    try:
        P, R, F1 = score(hypothesis, references, lang=lng)
        # average the per-sentence scores
        P, R, F1 = float(P.mean()), float(R.mean()), float(F1.mean())
    except Exception as err:
        print('BERTScore could not be computed:', err)
        P, R, F1 = 0, 0, 0
    return P, R, F1

In [None]:
text, triple = clean_df(df)
bert_score_(references=text, hypothesis=triple, lng='en')

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


(0.7735552191734314, 0.999890148639679, 0.8715945482254028)

In [None]:
%cd /content/

/content


## Setting parameters and training LSTM Model

In [None]:
!mkdir data_lstm
!mkdir data_lstm/model
!mkdir data_lstm/loaded_model

In [None]:
import yaml
data = {
    # Where the samples will be written
    'save_data': '/content/data_lstm/model/',
    # Where the vocab(s) will be written (and read back for training)
    'src_vocab': '/content/data_lstm/example.vocab.src',
    'tgt_vocab': '/content/data_lstm/example.vocab.tgt',
    # Prevent overwriting existing files in the folder
    'overwrite': False,
    # Corpus opts:
    'data': {
        'corpus_1': {
            'path_src': '/content/data/train-webnlg-all-delex.triple',
            'path_tgt': '/content/data/train-webnlg-all-delex.lex',
        },
        'valid': {
            'path_src': '/content/data/dev-webnlg-all-delex.triple',
            'path_tgt': '/content/data/dev-webnlg-all-delex.lex',
        },
    },

    # Train on a single GPU
    'world_size': 1,
    'gpu_ranks': [0],

    # Where to save the checkpoints
    'save_model': '/content/data_lstm/model/',
    'save_checkpoint_steps': 5000,
    'train_steps': 35000,
    'valid_steps': 5000,
    'seed': 1234,
}

with open("/content/data_lstm/data.yaml", "w") as file:
  yaml.dump(data, file, default_flow_style=None)
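
Before building the vocabulary it can help to confirm that every corpus file named in the configuration exists. The sketch below assumes a dictionary with the same layout as the one above (`data` → corpus name → `path_src`/`path_tgt`); `missing_corpus_files` is a hypothetical helper and is not part of OpenNMT-py.

```python
import os


def missing_corpus_files(config):
    """Return (corpus, key, path) for every corpus file that is not on disk."""
    missing = []
    for corpus_name, corpus in config.get("data", {}).items():
        for key in ("path_src", "path_tgt"):
            path = corpus.get(key)
            if path is not None and not os.path.isfile(path):
                missing.append((corpus_name, key, path))
    return missing


# Hypothetical config fragment with deliberately wrong paths.
example = {
    "data": {
        "corpus_1": {
            "path_src": "/no/such/dir/train.triple",
            "path_tgt": "/no/such/dir/train.lex",
        },
    }
}
for corpus_name, key, path in missing_corpus_files(example):
    print(f"{corpus_name}.{key} not found: {path}")
```

Calling `missing_corpus_files(data)` on the dictionary above should return an empty list once the pre-processed files have been copied to `/content/data/`.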


In [None]:
# build vocab
!onmt_build_vocab -config /content/data_lstm/data.yaml -n_sample 10000

Corpus corpus_1's weight should be given. We default it to 1 for you.
[2022-05-30 09:53:29,359 INFO] Counter vocab from 10000 samples.
[2022-05-30 09:53:29,359 INFO] Build vocab on 10000 transformed examples/corpus.
[2022-05-30 09:53:29,369 INFO] corpus_1's transforms: TransformPipe()
[2022-05-30 09:53:29,570 INFO] Counters src:1928
[2022-05-30 09:53:29,570 INFO] Counters tgt:5199


In [None]:
# train the default OpenNMT model: an LSTM with 2 layers (500 units per layer). Execution time ~ 1 hour.
!onmt_train -config /content/data_lstm/data.yaml

[2022-02-21 14:41:47,065 INFO] Missing transforms field for corpus_1 data, set to default: [].
[2022-02-21 14:41:47,065 INFO] Missing transforms field for valid data, set to default: [].
[2022-02-21 14:41:47,065 INFO] Parsed 2 corpora from -data.
[2022-02-21 14:41:47,065 INFO] Get special vocabs from Transforms: {'src': set(), 'tgt': set()}.
[2022-02-21 14:41:47,065 INFO] Loading vocab from text file...
[2022-02-21 14:41:47,065 INFO] Loading src vocabulary from /content/data/example.vocab.src
[2022-02-21 14:41:47,071 INFO] Loaded src vocab has 1899 tokens.
[2022-02-21 14:41:47,072 INFO] Loading tgt vocabulary from /content/data/example.vocab.tgt
[2022-02-21 14:41:47,106 INFO] Loaded tgt vocab has 5081 tokens.
[2022-02-21 14:41:47,108 INFO] Building fields with vocab in counters...
[2022-02-21 14:41:47,115 INFO]  * tgt vocab size: 5085.
[2022-02-21 14:41:47,117 INFO]  * src vocab size: 1901.
[2022-02-21 14:41:47,117 INFO]  * src vocab size = 1901
[2022-02-21 14:41:47,118 INFO]  * tgt vo

In [None]:
# import saved model from /content/data_lstm or where you saved your model

shutil.copyfile(src = model_lstm, dst = '/content/data_lstm/loaded_model/lstm_model.pt' )

'/content/data_lstm/loaded_model/lstm_model.pt'

In [None]:
# make prediction file
!onmt_translate -model /content/data_lstm/loaded_model/lstm_model.pt -src /content/data/dev-webnlg-all-delex.triple -output /content/data_lstm/pred.txt -gpu 0 -verbose -replace_unk

Streaming output truncated to the last 5000 lines.
SENT 3465: ['MEANOFTRANSPORTATION', 'builder', 'BUILDER', 'BUILDER', 'leader', 'LEADER']
PRED 3465: BUILDER is led by LEADER and built the MEANOFTRANSPORTATION .
PRED SCORE: -3.6591

[2022-06-07 09:00:15,610 INFO] 
SENT 3466: ['Duncan', 'Rouleau', 'nationality', 'NATIONALITY', 'COMICSCHARACTER', 'creator', 'Duncan', 'Rouleau', 'COMICSCHARACTER', 'creator', 'CREATOR', 'COMICSCHARACTER', 'series', 'SERIES', 'SERIES', 'starring', 'STARRING']
PRED 3466: COMICSCHARACTER is a character in Big Hero 6 which stars STARRING and was created by Duncan Rouleau and CREATOR .
PRED SCORE: -2.9714

[2022-06-07 09:00:15,610 INFO] 
SENT 3467: ['MEANOFTRANSPORTATION', 'builder', 'BUILDER', 'MEANOFTRANSPORTATION', 'length', 'LENGTH', 'MEANOFTRANSPORTATION', 'engine', 'ENGINE', 'MEANOFTRANSPORTATION', 'buildDate', 'BUILDDATE']
PRED 3467: The MEANOFTRANSPORTATION was produced by the BUILDER between May 1950 and August 1956 . It is 17068 . 8 mill
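
The verbose log can be turned into structured records for inspection. The sketch below assumes the `SENT` / `PRED` / `PRED SCORE` layout shown in the output above and skips every other line; `parse_translate_log` is an illustrative helper, not an OpenNMT-py function.

```python
import ast
import re

SENT_RE = re.compile(r"^SENT (\d+): (\[.*\])$")
PRED_RE = re.compile(r"^PRED (\d+): (.*)$")
SCORE_RE = re.compile(r"^PRED SCORE: (-?\d+(?:\.\d+)?)$")


def parse_translate_log(lines):
    """Collect one record per complete SENT / PRED / PRED SCORE group."""
    records, current = [], {}
    for raw in lines:
        line = raw.strip()
        sent = SENT_RE.match(line)
        pred = PRED_RE.match(line)
        score = SCORE_RE.match(line)
        if sent:
            # the source tokens are printed as a Python list literal
            current = {"id": int(sent.group(1)), "source": ast.literal_eval(sent.group(2))}
        elif pred and current:
            current["prediction"] = pred.group(2)
        elif score and "prediction" in current:
            current["score"] = float(score.group(1))
            records.append(current)
            current = {}
    return records


log_lines = [
    "SENT 3465: ['MEANOFTRANSPORTATION', 'builder', 'BUILDER', 'BUILDER', 'leader', 'LEADER']",
    "PRED 3465: BUILDER is led by LEADER and built the MEANOFTRANSPORTATION .",
    "PRED SCORE: -3.6591",
    "",
    "[2022-06-07 09:00:15,610 INFO] ",
]
records = parse_translate_log(log_lines)
print(records[0]["id"], records[0]["score"])  # 3465 -3.6591
```

Sorting the records by `score` is one way to surface the predictions the model is least confident about.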

In [None]:
%cd /content/webnlg-baseline
%ls

/content/webnlg-baseline
all-notdelex-reference0.lex  dev-webnlg-all-notdelex.lex
all-notdelex-reference1.lex  dev-webnlg-all-notdelex.triple
all-notdelex-reference2.lex  LICENSE
all-notdelex-reference3.lex  metrics.py
all-notdelex-reference4.lex  multi-bleu.perl*
all-notdelex-reference5.lex  __pycache__/
all-notdelex-reference6.lex  README.md
all-notdelex-reference7.lex  train-webnlg-all-delex.lex
all-notdelex-source.triple   train-webnlg-all-delex.triple
benchmark_reader.py          train-webnlg-all-notdelex.lex
calculate_bleu_dev.sh*       train-webnlg-all-notdelex.triple
delex_dict.json              webnlg_baseline_input.py
dev-webnlg-all-delex.lex     webnlg_relexicalise.py
dev-webnlg-all-delex.triple


In [None]:
# relexicalisation

!python webnlg_relexicalise.py -i /content/data-directory/ -f /content/data_lstm/pred.txt


Input directory is /content/data-directory/
Path to the file is /content/data_lstm/pred.txt
Total of 88 files processed in train with all-delex mode
Total of 88 files processed in train with all-notdelex mode
Total of 88 files processed in dev with all-delex mode
Total of 88 files processed in dev with all-notdelex mode
Files necessary for training/evaluating are written on disc.


#### Evaluation Metrics: LSTM

##### Bleu

In [None]:
!chmod 755 /content/webnlg-baseline/calculate_bleu_dev.sh
!chmod 755 /content/webnlg-baseline/multi-bleu.perl

In [None]:
# bleu score
!./calculate_bleu_dev.sh

BLEU = 54.63, 82.4/62.3/48.1/37.1 (BP=0.993, ratio=0.993, hyp_len=37383, ref_len=37653)


In [None]:
# create file for meteor and ter

!python metrics.py

Input files for METEOR and TER generated successfully.


In [None]:
%cd /content/

/content


##### Meteor

In [None]:
# if you didn't import meteor metric before, please run this code
'''
%%capture 
import shutil

source_dir = meteor_path
destination_dir = r"/content/meteor-1.5"
shutil.copytree(source_dir, destination_dir)
'''

'/content/meteor-1.5'

In [None]:
%cd /content/meteor-1.5/

/content/meteor-1.5


In [None]:
!java -Xmx2G -jar meteor-1.5.jar /content/webnlg-baseline/relexicalised_predictions.txt /content/webnlg-baseline/all-notdelex-refs-meteor.txt -l en -norm -r 8

Meteor version: 1.5

Eval ID:        meteor-1.5-wo-en-norm-0.85_0.2_0.6_0.75-ex_st_sy_pa-1.0_0.6_0.8_0.6

Language:       English
Format:         plaintext
Task:           Ranking
Modules:        exact stem synonym paraphrase
Weights:        1.0 0.6 0.8 0.6
Parameters:     0.85 0.2 0.6 0.75

Segment 1 score:	0.3894510279539969
Segment 2 score:	0.32276902106095
Segment 3 score:	0.5770968732790582
Segment 4 score:	0.46243253431537573
Segment 5 score:	0.301052307997264
Segment 6 score:	0.5664056015665114
Segment 7 score:	1.0
Segment 8 score:	0.37981723532045264
Segment 9 score:	0.3649028741157757
Segment 10 score:	0.33906921119213385
Segment 11 score:	0.3363167146327523
Segment 12 score:	0.4633628440271209
Segment 13 score:	0.3476303609487016
Segment 14 score:	0.35102246517159824
Segment 15 score:	0.30340889838505847
Segment 16 score:	0.4462184468417052
Segment 17 score:	1.0
Segment 18 score:	0.38653456526462887
Segment 19 score:	0.28842471747769144
Segment 20 score:	0.3167250708072301
Se

##### Ter

In [None]:
# import Ter metric

import shutil
source_dir = ter_path
destination_dir = r"/content/tercom-0.7.25"
shutil.copytree(source_dir, destination_dir)

'/content/tercom-0.7.25'

In [None]:
%cd /content/tercom-0.7.25/

/content/tercom-0.7.25


In [None]:
!java -jar tercom.7.25.jar -h /content/webnlg-baseline/relexicalised_predictions-ter.txt -r /content/webnlg-baseline/all-notdelex-refs-ter.txt

"/content/webnlg-baseline/relexicalised_predictions-ter.txt" was successfully parsed as Trans text
"/content/webnlg-baseline/all-notdelex-refs-ter.txt" was successfully parsed as Trans text
Processing id0:1
Processing id1:1
Processing id2:1
Processing id3:1
Processing id4:1
Processing id5:1
Processing id6:1
Processing id7:1
Processing id8:1
Processing id9:1
Processing id10:1
Processing id11:1
Processing id12:1
Processing id13:1
Processing id14:1
Processing id15:1
Processing id16:1
Processing id17:1
Processing id18:1
Processing id19:1
Processing id20:1
Processing id21:1
Processing id22:1
Processing id23:1
Processing id24:1
Processing id25:1
Processing id26:1
Processing id27:1
Processing id28:1
Processing id29:1
Processing id30:1
Processing id31:1
Processing id32:1
Processing id33:1
Processing id34:1
Processing id35:1
Processing id36:1
Processing id37:1
Processing id38:1
Processing id39:1
Processing id40:1
Processing id41:1
Processing id42:1
Processing id43:1
Processing id44:1
Processing

##### Rouge

In [None]:
%cd ..

/content


In [None]:
# if you didn't import and install rouge metric before, please run this code
'''
# import and install rouge metric

%cd /content/
!git clone https://github.com/pltrdy/rouge.git
%cd rouge
!python setup.py install
'''

In [None]:
%cd /content/rouge
from rouge import FilesRouge

hyp_path = r'/content/webnlg-baseline/relexicalised_predictions-ter.txt'

ref_path= r'/content/webnlg-baseline/all-notdelex-oneref-ter.txt'


files_rouge = FilesRouge()
scores = files_rouge.get_scores(hyp_path, ref_path, avg=True)
scores


/content/rouge


{'rouge-1': {'f': 0.7207483150810837,
  'p': 0.743598754153185,
  'r': 0.709357107753434},
 'rouge-2': {'f': 0.44443559248944997,
  'p': 0.457958326715399,
  'r': 0.43856438941383846},
 'rouge-l': {'f': 0.6424542947898118,
  'p': 0.6628374372301197,
  'r': 0.6322149328218885}}

##### Bert Score

In [None]:
# if you didn't install bert score before, please run this code

#!pip install bert-score

In [None]:
# read the single-reference file, one reference per line
with open("/content/webnlg-baseline/all-notdelex-oneref-ter.txt", "r") as a_file:
  ref = [line.strip() for line in a_file]

In [None]:
# read the relexicalised predictions, one hypothesis per line
with open("/content/webnlg-baseline/relexicalised_predictions-ter.txt", "r") as a_file:
  hyp = [line.strip() for line in a_file]

In [None]:
from bert_score import score
def bert_score_(references, hypothesis, lng='en'):
    # each entry may be one reference string or a list of reference strings;
    # wrap bare strings so they are not iterated character by character,
    # and drop empty references
    references = [
        [ref for ref in ([refs] if isinstance(refs, str) else refs) if ref.strip() != '']
        for refs in references
    ]
    try:
        P, R, F1 = score(hypothesis, references, lang=lng)
        # average the per-sentence scores
        P, R, F1 = float(P.mean()), float(R.mean()), float(F1.mean())
    except Exception as err:
        print('BERTScore could not be computed:', err)
        P, R, F1 = 0, 0, 0
    return P, R, F1
 
bert_score_(references=ref,hypothesis=hyp, lng='en' )

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


(0.7840254902839661, 0.9997782111167908, 0.8786812424659729)
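
`score` from `bert_score` accepts either one reference string per candidate or a list of reference strings per candidate. A Python string is itself iterable and yields characters, so any code that filters out empty references has to tell the two cases apart. A minimal sketch of that normalisation step; `as_reference_lists` is a hypothetical helper name, not part of `bert_score`.

```python
def as_reference_lists(references):
    """Return one list of non-empty reference strings per hypothesis."""
    normalised = []
    for refs in references:
        # A bare string is a single reference: wrap it instead of iterating it.
        if isinstance(refs, str):
            refs = [refs]
        normalised.append([r.strip() for r in refs if r.strip()])
    return normalised

single = as_reference_lists(["The manager of SPORTSTEAM is MANAGER ."])
multi = as_reference_lists([["first reference", "  ", "second reference"]])
print(single)
print(multi)
```

The result always has the list-of-lists shape, whether the input holds one reference per hypothesis or several.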

#### Models results

In [None]:
df_pred_lstm = pd.read_fwf('/content/data_lstm/pred.txt', header=None)
df_pred_lstm= df_pred_lstm.rename(columns={0:'text'})
df_pred_lstm = df_pred_lstm[['text']]
df_pred_lstm = df_pred_lstm.head(10)
df_pred_lstm

Unnamed: 0,text
0,The runway name of AIRPORT is RUNWAYNAME .
1,MANAGER has been the manager of 1 FC Magdeburg .
2,BIRTHPLACE born POLITICIAN studied at ALMAMATE...
3,FOOD comes from the REGION region of the COUNT...
4,"ASTRONAUT graduated from the ALMAMATER , which..."
5,The OPERATINGORGANISATION who are the operatin...
6,The College of William and Mary is the owner o...
7,"FOOD , a dessert , can utilize cottage cheese ..."
8,AIRPORT serves the city of CITYSERVED in COUNT...
9,Philippe of COUNTRY and Charles Michel are the...


In [None]:
val_sample = val_raw.copy()
val_sample = val_sample.head(10)
val_sample

Unnamed: 0,rdf_triple,ref_text
0,AIRPORT runwayName RUNWAYNAME\n,The runway name of AIRPORT is RUNWAYNAME .
1,SPORTSTEAM manager MANAGER\n,The manager of SPORTSTEAM is MANAGER .
2,POLITICIAN almaMater ALMAMATER POLITICIAN part...,"POLITICIAN was born in the BIRTHPLACE , belong..."
3,FOOD ingredient INGREDIENT FOOD mainIngredient...,"FOOD is a traditional dish from REGION , COUNT..."
4,ASTRONAUT almaMater ALMAMATER ALMAMATER affili...,ASTRONAUT attended the University of Texas in ...
5,AIRPORT operatingOrganisation OPERATINGORGANIS...,"The US Air Force , veteran of the Korean war a..."
6,BUILDING owner OWNER BUILDING location LOCATIO...,"The BUILDING , owned by The College of William..."
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,The dessert FOOD is found in REGION and LEADER...
8,COUNTRY leader Philippe of COUNTRY AIRPORT cit...,AIRPORT serves the city of CITYSERVED in COUNT...
9,COUNTRY leader Charles Michel COUNTRY leader P...,AIRPORT serves the city of CITYSERVED which is...


In [None]:
prediction_df_lstm = pd.DataFrame(columns=['rdf_triple', 'prediction_text'] )
prediction_df_lstm.rdf_triple = val_sample.rdf_triple.values
prediction_df_lstm.prediction_text = df_pred_lstm.text.values
prediction_df_lstm

Unnamed: 0,rdf_triple,prediction_text
0,AIRPORT runwayName RUNWAYNAME\n,The runway name of AIRPORT is RUNWAYNAME .
1,SPORTSTEAM manager MANAGER\n,MANAGER has been the manager of 1 FC Magdeburg .
2,POLITICIAN almaMater ALMAMATER POLITICIAN part...,BIRTHPLACE born POLITICIAN studied at ALMAMATE...
3,FOOD ingredient INGREDIENT FOOD mainIngredient...,FOOD comes from the REGION region of the COUNT...
4,ASTRONAUT almaMater ALMAMATER ALMAMATER affili...,"ASTRONAUT graduated from the ALMAMATER , which..."
5,AIRPORT operatingOrganisation OPERATINGORGANIS...,The OPERATINGORGANISATION who are the operatin...
6,BUILDING owner OWNER BUILDING location LOCATIO...,The College of William and Mary is the owner o...
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,"FOOD , a dessert , can utilize cottage cheese ..."
8,COUNTRY leader Philippe of COUNTRY AIRPORT cit...,AIRPORT serves the city of CITYSERVED in COUNT...
9,COUNTRY leader Charles Michel COUNTRY leader P...,Philippe of COUNTRY and Charles Michel are the...


In [None]:
text_comparation_lstm = pd.DataFrame(columns=['ref_text', 'prediction_text'] )
text_comparation_lstm.ref_text = val_sample.ref_text.values
text_comparation_lstm.prediction_text = df_pred_lstm.text.values
text_comparation_lstm

Unnamed: 0,ref_text,prediction_text
0,The runway name of AIRPORT is RUNWAYNAME .,The runway name of AIRPORT is RUNWAYNAME .
1,The manager of SPORTSTEAM is MANAGER .,MANAGER has been the manager of 1 FC Magdeburg .
2,"POLITICIAN was born in the BIRTHPLACE , belong...",BIRTHPLACE born POLITICIAN studied at ALMAMATE...
3,"FOOD is a traditional dish from REGION , COUNT...",FOOD comes from the REGION region of the COUNT...
4,ASTRONAUT attended the University of Texas in ...,"ASTRONAUT graduated from the ALMAMATER , which..."
5,"The US Air Force , veteran of the Korean war a...",The OPERATINGORGANISATION who are the operatin...
6,"The BUILDING , owned by The College of William...",The College of William and Mary is the owner o...
7,The dessert FOOD is found in REGION and LEADER...,"FOOD , a dessert , can utilize cottage cheese ..."
8,AIRPORT serves the city of CITYSERVED in COUNT...,AIRPORT serves the city of CITYSERVED in COUNT...
9,AIRPORT serves the city of CITYSERVED which is...,Philippe of COUNTRY and Charles Michel are the...


## Setting parameters and training Transformer Model

In [None]:
%cd /content/
!mkdir data_transf
!mkdir data_transf/model
!mkdir data_transf/loaded_model

/content


In [None]:

import yaml

data = {
    # Where the samples will be written
    'save_data': '/content/data_transf/model/',
    # Where the vocab(s) will be written
    'src_vocab': '/content/data_transf/example.vocab.src',
    'tgt_vocab': '/content/data_transf/example.vocab.tgt',
    # Prevent overwriting existing files in the folder
    'overwrite': False,
    # Corpus opts:
    'data': {
        'corpus_1': {
            'path_src': '/content/data/train-webnlg-all-delex.triple',
            'path_tgt': '/content/data/train-webnlg-all-delex.lex',
        },
        'valid': {
            'path_src': '/content/data/dev-webnlg-all-delex.triple',
            'path_tgt': '/content/data/dev-webnlg-all-delex.lex',
        },
    },

    # Train on a single GPU
    'world_size': 1,
    'gpu_ranks': [0],

    # Where to save the checkpoints
    'save_model': '/content/data_transf/model/',
    'save_checkpoint_steps': 5000,
    'train_steps': 35000,
    'valid_steps': 5000,

    # Transformer encoder-decoder settings
    'decoder_type': 'transformer',
    'encoder_type': 'transformer',
    'word_vec_size': 512,
    'rnn_size': 512,
    'layers': 2,
    'transformer_ff': 2048,
    'heads': 4,
    'batch_size': 64,
    'batch_type': 'sents',
    'normalization': 'tokens',
    'dropout': 0.3,
    'label_smoothing': 0.1,
    'seed': 1234,
}

with open("/content/data_transf/data.yaml", "w") as file:
    yaml.dump(data, file, default_flow_style=None)
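
Multi-head attention splits the model dimension evenly across heads, so the sizes in the configuration above have to be consistent. The sketch below checks this before training starts. `check_transformer_dims` is a hypothetical helper, and the requirement that `word_vec_size` equals `rnn_size` is stated here as an assumption about how the Transformer encoder consumes embeddings.

```python
def check_transformer_dims(cfg: dict) -> int:
    """Return the per-head dimension, or raise if the sizes are inconsistent."""
    model_dim = cfg["rnn_size"]
    # Assumption: embeddings feed the Transformer layers directly,
    # so the embedding size must equal the model size.
    if cfg["word_vec_size"] != model_dim:
        raise ValueError("word_vec_size and rnn_size should match for a Transformer")
    # Each attention head works on an equal slice of the model dimension.
    if model_dim % cfg["heads"] != 0:
        raise ValueError("rnn_size must be divisible by heads")
    return model_dim // cfg["heads"]

head_dim = check_transformer_dims({"word_vec_size": 512, "rnn_size": 512, "heads": 4})
print(head_dim)
```

With the values used in this notebook (512 units, 4 heads) each head works on 128 dimensions.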


In [None]:
# build vocab
!onmt_build_vocab -config /content/data_transf/data.yaml -n_sample 10000

Corpus corpus_1's weight should be given. We default it to 1 for you.
[2022-06-07 09:04:54,855 INFO] Counter vocab from 10000 samples.
[2022-06-07 09:04:54,855 INFO] Build vocab on 10000 transformed examples/corpus.
[2022-06-07 09:04:54,868 INFO] corpus_1's transforms: TransformPipe()
[2022-06-07 09:04:55,070 INFO] Counters src:1920
[2022-06-07 09:04:55,070 INFO] Counters tgt:5138


In [None]:
# training with transformer openNMT model
!onmt_train -config /content/data_transf/data.yaml

[2022-04-15 19:19:28,677 INFO] Missing transforms field for corpus_1 data, set to default: [].
[2022-04-15 19:19:28,677 INFO] Missing transforms field for valid data, set to default: [].
[2022-04-15 19:19:28,678 INFO] Parsed 2 corpora from -data.
[2022-04-15 19:19:28,678 INFO] Get special vocabs from Transforms: {'src': set(), 'tgt': set()}.
[2022-04-15 19:19:28,678 INFO] Loading vocab from text file...
[2022-04-15 19:19:28,678 INFO] Loading src vocabulary from /content/data/example.vocab.src
[2022-04-15 19:19:28,682 INFO] Loaded src vocab has 1896 tokens.
[2022-04-15 19:19:28,683 INFO] Loading tgt vocabulary from /content/data/example.vocab.tgt
[2022-04-15 19:19:28,727 INFO] Loaded tgt vocab has 5113 tokens.
[2022-04-15 19:19:28,730 INFO] Building fields with vocab in counters...
[2022-04-15 19:19:28,737 INFO]  * tgt vocab size: 5117.
[2022-04-15 19:19:28,739 INFO]  * src vocab size: 1898.
[2022-04-15 19:19:28,739 INFO]  * src vocab size = 1898
[2022-04-15 19:19:28,739 INFO]  * tgt vo

In [None]:
# import saved model from /content/data_transf or where you saved your model

shutil.copyfile(src = model_transformer, dst = '/content/data_transf/loaded_model/transformer_model.pt' )

'/content/data_transf/loaded_model/transformer_model.pt'

In [None]:
# make prediction file
!onmt_translate -model /content/data_transf/loaded_model/transformer_model.pt -src /content/data/dev-webnlg-all-delex.triple -output /content/data_transf/pred.txt -gpu 0 -verbose -replace_unk

Streaming output truncated to the last 5000 lines.
SENT 3465: ['MEANOFTRANSPORTATION', 'builder', 'BUILDER', 'BUILDER', 'leader', 'LEADER']
PRED 3465: BUILDER built the MEANOFTRANSPORTATION which is led by LEADER .
PRED SCORE: -4.1146

[2022-06-07 09:05:42,723 INFO] 
SENT 3466: ['Duncan', 'Rouleau', 'nationality', 'NATIONALITY', 'COMICSCHARACTER', 'creator', 'Duncan', 'Rouleau', 'COMICSCHARACTER', 'creator', 'CREATOR', 'COMICSCHARACTER', 'series', 'SERIES', 'SERIES', 'starring', 'STARRING']
PRED 3466: COMICSCHARACTER , a character in Big Hero 6 , was created by Steven T Seagle and the American , Duncan Rouleau .
PRED SCORE: -7.4460

[2022-06-07 09:05:42,723 INFO] 
SENT 3467: ['MEANOFTRANSPORTATION', 'builder', 'BUILDER', 'MEANOFTRANSPORTATION', 'length', 'LENGTH', 'MEANOFTRANSPORTATION', 'engine', 'ENGINE', 'MEANOFTRANSPORTATION', 'buildDate', 'BUILDDATE']
PRED 3467: The MEANOFTRANSPORTATION was built by BUILDER between May 1950 and August 1956 . It has a four - stroke eng
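
The `-verbose` output above follows a regular `SENT` / `PRED` / `PRED SCORE` pattern, so it can be turned into records for later inspection. A minimal sketch, assuming the log has been captured as a list of lines; `parse_translate_log` is a hypothetical helper and the regular expressions are based only on the lines shown above.

```python
import re

PRED_RE = re.compile(r"^PRED (\d+): (.*)$")
SCORE_RE = re.compile(r"^PRED SCORE: (-?\d+(?:\.\d+)?)$")

def parse_translate_log(lines):
    """Collect (sentence id, prediction, score) tuples from verbose output."""
    records, current = [], None
    for line in lines:
        line = line.strip()
        pred = PRED_RE.match(line)
        score = SCORE_RE.match(line)
        if pred:
            # Remember the prediction until its score line arrives.
            current = (int(pred.group(1)), pred.group(2))
        elif score and current is not None:
            records.append((*current, float(score.group(1))))
            current = None
    return records

log = [
    "SENT 3465: ['MEANOFTRANSPORTATION', 'builder', 'BUILDER', 'BUILDER', 'leader', 'LEADER']",
    "PRED 3465: BUILDER built the MEANOFTRANSPORTATION which is led by LEADER .",
    "PRED SCORE: -4.1146",
]
records = parse_translate_log(log)
print(records)
```

Predictions without a matching score line, such as a truncated final entry, are skipped.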

In [None]:
%cd /content/webnlg-baseline
%ls

/content/webnlg-baseline
all-notdelex-oneref-ter.txt   dev-webnlg-all-delex.triple
all-notdelex-reference0.lex   dev-webnlg-all-notdelex.lex
all-notdelex-reference1.lex   dev-webnlg-all-notdelex.triple
all-notdelex-reference2.lex   LICENSE
all-notdelex-reference3.lex   metrics.py
all-notdelex-reference4.lex   multi-bleu.perl*
all-notdelex-reference5.lex   __pycache__/
all-notdelex-reference6.lex   README.md
all-notdelex-reference7.lex   relexicalised_predictions-ter.txt
all-notdelex-refs-meteor.txt  relexicalised_predictions.txt
all-notdelex-refs-ter.txt     train-webnlg-all-delex.lex
all-notdelex-source.triple    train-webnlg-all-delex.triple
benchmark_reader.py           train-webnlg-all-notdelex.lex
calculate_bleu_dev.sh*        train-webnlg-all-notdelex.triple
delex_dict.json               webnlg_baseline_input.py
dev-webnlg-all-delex.lex      webnlg_relexicalise.py


In [None]:
# relexicalisation

!python webnlg_relexicalise.py -i /content/data-directory/ -f /content/data_transf/pred.txt


Input directory is /content/data-directory/
Path to the file is /content/data_transf/pred.txt
Total of 88 files processed in train with all-delex mode
Total of 88 files processed in train with all-notdelex mode
Total of 88 files processed in dev with all-delex mode
Total of 88 files processed in dev with all-notdelex mode
Files necessary for training/evaluating are written on disc.
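
Relexicalisation puts the original entity strings back in place of placeholders such as `AIRPORT` or `RUNWAYNAME`. The sketch below illustrates only the idea and is not the logic of `webnlg_relexicalise.py`. The `relexicalise` helper and the entity values are hypothetical.

```python
import re

def relexicalise(text: str, entities: dict) -> str:
    """Replace placeholder tokens with their entity strings."""
    if not entities:
        return text
    # Word boundaries keep a key from matching inside a longer token.
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, entities)) + r")\b")
    return pattern.sub(lambda m: entities[m.group(1)], text)

# Hypothetical entity values, for illustration only.
sentence = relexicalise(
    "The runway name of AIRPORT is RUNWAYNAME .",
    {"AIRPORT": "Example Airport", "RUNWAYNAME": "09/27"},
)
print(sentence)
```

Tokens that are not keys of the mapping are left untouched, so literal words produced by the model pass through unchanged.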


#### Evaluation Metrics: TRANSFORMER

##### Bleu

In [None]:
!chmod 755 /content/webnlg-baseline/calculate_bleu_dev.sh
!chmod 755 /content/webnlg-baseline/multi-bleu.perl

In [None]:
# bleu score
!./calculate_bleu_dev.sh

BLEU = 54.59, 81.6/62.2/47.9/36.8 (BP=0.998, ratio=0.998, hyp_len=37470, ref_len=37545)
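
The line printed by `multi-bleu.perl` contains everything needed to recompute the score: the four n-gram precisions, the hypothesis length and the reference length. The sketch below combines them with the standard brevity penalty and geometric mean. `bleu_from_stats` is a hypothetical helper, and because the printed precisions are rounded, the result only approximates the reported 54.59.

```python
import math

def bleu_from_stats(precisions_pct, hyp_len, ref_len):
    """Combine n-gram precisions (in percent) with the brevity penalty."""
    # The penalty only applies when the hypothesis is shorter than the reference.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    # Geometric mean of the precisions, computed in log space.
    log_mean = sum(math.log(p / 100) for p in precisions_pct) / len(precisions_pct)
    return bp, 100 * bp * math.exp(log_mean)

bp, bleu = bleu_from_stats([81.6, 62.2, 47.9, 36.8], hyp_len=37470, ref_len=37545)
print(f"BP={bp:.3f}, BLEU={bleu:.2f}")
```

A brevity penalty this close to 1 means the predictions are almost exactly as long as the references, so the score is driven by the n-gram precisions.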


In [None]:
# create file for meteor and ter

!python metrics.py

Input files for METEOR and TER generated successfully.


In [None]:
%cd /content/

/content


##### Meteor

In [None]:
# if you didn't import meteor metric before, please run this code
'''
%%capture 
import shutil

source_dir = meteor_path
destination_dir = r"/content/meteor-1.5"
shutil.copytree(source_dir, destination_dir)
'''

'/content/meteor-1.5'

In [None]:
%cd /content/meteor-1.5/

/content/meteor-1.5


In [None]:
!java -Xmx2G -jar meteor-1.5.jar /content/webnlg-baseline/relexicalised_predictions.txt /content/webnlg-baseline/all-notdelex-refs-meteor.txt -l en -norm -r 8

Meteor version: 1.5

Eval ID:        meteor-1.5-wo-en-norm-0.85_0.2_0.6_0.75-ex_st_sy_pa-1.0_0.6_0.8_0.6

Language:       English
Format:         plaintext
Task:           Ranking
Modules:        exact stem synonym paraphrase
Weights:        1.0 0.6 0.8 0.6
Parameters:     0.85 0.2 0.6 0.75

Segment 1 score:	0.354165985763581
Segment 2 score:	0.34736540498115
Segment 3 score:	0.5770968732790582
Segment 4 score:	0.5531633058316747
Segment 5 score:	0.42204388352950833
Segment 6 score:	0.5773220550092653
Segment 7 score:	0.392189672506831
Segment 8 score:	0.37981723532045264
Segment 9 score:	0.3663881072868922
Segment 10 score:	0.3461782946877047
Segment 11 score:	0.37213866253589345
Segment 12 score:	0.4691185362344342
Segment 13 score:	0.2952455329132055
Segment 14 score:	0.26841634986012447
Segment 15 score:	0.41147691349624244
Segment 16 score:	0.4462184468417052
Segment 17 score:	1.0
Segment 18 score:	0.41754011410958797
Segment 19 score:	0.43221975151635805
Segment 20 score:	0.17363
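
The per-segment lines printed by METEOR can be collected for further analysis, for example to find the weakest predictions. A minimal sketch, assuming the output has been captured as a list of lines; `segment_scores` is a hypothetical helper. The mean of the segment scores is not necessarily the final system score reported by METEOR.

```python
import re

SEGMENT_RE = re.compile(r"^Segment (\d+) score:\s+([0-9.]+)$")

def segment_scores(lines):
    """Map each segment number to its METEOR score."""
    scores = {}
    for line in lines:
        match = SEGMENT_RE.match(line.strip())
        if match:
            scores[int(match.group(1))] = float(match.group(2))
    return scores

sample = [
    "Segment 1 score:\t0.354165985763581",
    "Segment 2 score:\t0.34736540498115",
    "Segment 17 score:\t1.0",
]
scores = segment_scores(sample)
mean_score = sum(scores.values()) / len(scores)
worst = min(scores, key=scores.get)
print(mean_score, worst)
```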

##### Ter

In [None]:
# if you didn't import ter metric before, please run this code
'''
import shutil
source_dir = ter_path
destination_dir = r"/content/tercom-0.7.25"
shutil.copytree(source_dir, destination_dir)
'''

'/content/tercom-0.7.25'

In [None]:
%cd /content/tercom-0.7.25/

/content/tercom-0.7.25


In [None]:
!java -jar tercom.7.25.jar -h /content/webnlg-baseline/relexicalised_predictions-ter.txt -r /content/webnlg-baseline/all-notdelex-refs-ter.txt

"/content/webnlg-baseline/relexicalised_predictions-ter.txt" was successfully parsed as Trans text
"/content/webnlg-baseline/all-notdelex-refs-ter.txt" was successfully parsed as Trans text
Processing id0:1
Processing id1:1
Processing id2:1
Processing id3:1
Processing id4:1
Processing id5:1
Processing id6:1
Processing id7:1
Processing id8:1
Processing id9:1
Processing id10:1
Processing id11:1
Processing id12:1
Processing id13:1
Processing id14:1
Processing id15:1
Processing id16:1
Processing id17:1
Processing id18:1
Processing id19:1
Processing id20:1
Processing id21:1
Processing id22:1
Processing id23:1
Processing id24:1
Processing id25:1
Processing id26:1
Processing id27:1
Processing id28:1
Processing id29:1
Processing id30:1
Processing id31:1
Processing id32:1
Processing id33:1
Processing id34:1
Processing id35:1
Processing id36:1
Processing id37:1
Processing id38:1
Processing id39:1
Processing id40:1
Processing id41:1
Processing id42:1
Processing id43:1
Processing id44:1
Processing

##### Rouge

In [None]:
%cd ..

/content


In [None]:
# if you didn't import and install rouge metric before, please run this code
'''
# import and install rouge metric

%cd /content/
!git clone https://github.com/pltrdy/rouge.git
%cd rouge
!python setup.py install
'''

In [None]:
%cd rouge
from rouge import FilesRouge

hyp_path = r'/content/webnlg-baseline/relexicalised_predictions-ter.txt'

ref_path= r'/content/webnlg-baseline/all-notdelex-oneref-ter.txt'


files_rouge = FilesRouge()
scores = files_rouge.get_scores(hyp_path, ref_path, avg=True)
scores


/content/rouge


{'rouge-1': {'f': 0.7264081470695984,
  'p': 0.7603568402088295,
  'r': 0.7069531420337213},
 'rouge-2': {'f': 0.4517492138093311,
  'p': 0.47413805639644835,
  'r': 0.44026630026021385},
 'rouge-l': {'f': 0.6543201281458986,
  'p': 0.6848798231370204,
  'r': 0.6368138468706959}}

##### Bert Score

In [None]:
# if you didn't install bert score before, please run this code

#!pip install bert-score

In [None]:
# read one reference sentence per line
with open("/content/webnlg-baseline/all-notdelex-oneref-ter.txt", "r") as a_file:
  ref = [line.strip() for line in a_file]

In [None]:
# read one predicted sentence per line
with open("/content/webnlg-baseline/relexicalised_predictions-ter.txt", "r") as a_file:
  hyp = [line.strip() for line in a_file]

In [None]:
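from bert_score import score

def bert_score_(references, hypothesis, lng='en'):
    # Each entry of `references` may be one string or a list of strings.
    # Wrap bare strings first: iterating over a str yields single characters,
    # which would then be scored as separate one-character references.
    references = [[refs] if isinstance(refs, str) else list(refs) for refs in references]
    references = [[ref for ref in refs if ref.strip() != ''] for refs in references]
    try:
        P, R, F1 = score(hypothesis, references, lang=lng)
        # average the per-sentence scores
        P, R, F1 = float(P.mean()), float(R.mean()), float(F1.mean())
    except Exception as err:
        # keep the notebook running, but report why scoring failed
        print(f'BERTScore failed: {err}')
        P, R, F1 = 0, 0, 0
    return P, R, F1

bert_score_(references=ref, hypothesis=hyp, lng='en')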

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


(0.7811049222946167, 0.999767005443573, 0.8768384456634521)

#### Models results

In [None]:
df_pred_tr = pd.read_fwf('/content/data_transf/pred.txt', header=None)
df_pred_tr = df_pred_tr.rename(columns={0:'text'})
df_pred_tr = df_pred_tr[['text']]
df_pred_tr = df_pred_tr.head(10)
df_pred_tr 

Unnamed: 0,text
0,RUNWAYNAME is the runway name of AIRPORT .
1,MANAGER manages SPORTSTEAM .
2,BIRTHPLACE born POLITICIAN studied at ALMAMATE...
3,"FOOD is from REGION , COUNTRY . The main ingre..."
4,ASTRONAUT graduated from NWC with an M . A . i...
5,The OPERATINGORGANISATION is the operating org...
6,Alan B Miller Hall ( located in LOCATION ) is ...
7,"FOOD is a dessert from REGION , COUNTRY . The ..."
8,"AIRPORT serves the city of CITYSERVED , a popu..."
9,AIRPORT serves the city of CITYSERVED in COUNT...


In [None]:
val_sample

Unnamed: 0,rdf_triple,ref_text
0,AIRPORT runwayName RUNWAYNAME\n,The runway name of AIRPORT is RUNWAYNAME .
1,SPORTSTEAM manager MANAGER\n,The manager of SPORTSTEAM is MANAGER .
2,POLITICIAN almaMater ALMAMATER POLITICIAN part...,"POLITICIAN was born in the BIRTHPLACE , belong..."
3,FOOD ingredient INGREDIENT FOOD mainIngredient...,"FOOD is a traditional dish from REGION , COUNT..."
4,ASTRONAUT almaMater ALMAMATER ALMAMATER affili...,ASTRONAUT attended the University of Texas in ...
5,AIRPORT operatingOrganisation OPERATINGORGANIS...,"The US Air Force , veteran of the Korean war a..."
6,BUILDING owner OWNER BUILDING location LOCATIO...,"The BUILDING , owned by The College of William..."
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,The dessert FOOD is found in REGION and LEADER...
8,COUNTRY leader Philippe of COUNTRY AIRPORT cit...,AIRPORT serves the city of CITYSERVED in COUNT...
9,COUNTRY leader Charles Michel COUNTRY leader P...,AIRPORT serves the city of CITYSERVED which is...


In [None]:
prediction_df_tr  = pd.DataFrame(columns=['rdf_triple', 'prediction_text'] )
prediction_df_tr.rdf_triple = val_sample.rdf_triple.values
prediction_df_tr.prediction_text = df_pred_tr.text.values
prediction_df_tr

Unnamed: 0,rdf_triple,prediction_text
0,AIRPORT runwayName RUNWAYNAME\n,RUNWAYNAME is the runway name of AIRPORT .
1,SPORTSTEAM manager MANAGER\n,MANAGER manages SPORTSTEAM .
2,POLITICIAN almaMater ALMAMATER POLITICIAN part...,BIRTHPLACE born POLITICIAN studied at ALMAMATE...
3,FOOD ingredient INGREDIENT FOOD mainIngredient...,"FOOD is from REGION , COUNTRY . The main ingre..."
4,ASTRONAUT almaMater ALMAMATER ALMAMATER affili...,ASTRONAUT graduated from NWC with an M . A . i...
5,AIRPORT operatingOrganisation OPERATINGORGANIS...,The OPERATINGORGANISATION is the operating org...
6,BUILDING owner OWNER BUILDING location LOCATIO...,Alan B Miller Hall ( located in LOCATION ) is ...
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,"FOOD is a dessert from REGION , COUNTRY . The ..."
8,COUNTRY leader Philippe of COUNTRY AIRPORT cit...,"AIRPORT serves the city of CITYSERVED , a popu..."
9,COUNTRY leader Charles Michel COUNTRY leader P...,AIRPORT serves the city of CITYSERVED in COUNT...


In [None]:
text_comparation_tr = pd.DataFrame(columns=['ref_text', 'prediction_text'] )
text_comparation_tr.ref_text = val_sample.ref_text.values
text_comparation_tr.prediction_text = df_pred_tr.text.values
text_comparation_tr

Unnamed: 0,ref_text,prediction_text
0,The runway name of AIRPORT is RUNWAYNAME .,RUNWAYNAME is the runway name of AIRPORT .
1,The manager of SPORTSTEAM is MANAGER .,MANAGER manages SPORTSTEAM .
2,"POLITICIAN was born in the BIRTHPLACE , belong...",BIRTHPLACE born POLITICIAN studied at ALMAMATE...
3,"FOOD is a traditional dish from REGION , COUNT...","FOOD is from REGION , COUNTRY . The main ingre..."
4,ASTRONAUT attended the University of Texas in ...,ASTRONAUT graduated from NWC with an M . A . i...
5,"The US Air Force , veteran of the Korean war a...",The OPERATINGORGANISATION is the operating org...
6,"The BUILDING , owned by The College of William...",Alan B Miller Hall ( located in LOCATION ) is ...
7,The dessert FOOD is found in REGION and LEADER...,"FOOD is a dessert from REGION , COUNTRY . The ..."
8,AIRPORT serves the city of CITYSERVED in COUNT...,"AIRPORT serves the city of CITYSERVED , a popu..."
9,AIRPORT serves the city of CITYSERVED which is...,AIRPORT serves the city of CITYSERVED in COUNT...


# UNSEEN

## Dataset Creation 

In [None]:
'''shutil.rmtree('/content/data-directory_u')
shutil.rmtree('/content/webnlg-baseline_u')
shutil.rmtree('/content/webnlg-dataset_u')'''

In [None]:
# import webnlg pipeline repository
!git clone 'https://gitlab.com/webnlg/webnlg-baseline.git' '/content/webnlg-baseline_u'

Cloning into '/content/webnlg-baseline_u'...
remote: Enumerating objects: 23, done.
remote: Counting objects: 100% (23/23), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 23 (delta 7), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (23/23), done.


In [None]:
# import data repository, move all folders in /content/webnlg-dataset/webnlg_challenge_2017 to data-directory
!git clone 'https://gitlab.com/shimorina/webnlg-dataset.git' '/content/webnlg-dataset_u'

Cloning into '/content/webnlg-dataset_u'...
remote: Enumerating objects: 5112, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 5112 (delta 2), reused 0 (delta 0), pack-reused 5106
Receiving objects: 100% (5112/5112), 26.09 MiB | 18.55 MiB/s, done.
Resolving deltas: 100% (4010/4010), done.
Checking out files: 100% (1425/1425), done.


In [None]:
# add new field in /content/webnlg-baseline_u/delex_dict.json for company category

'''
"Company": [
    "AmeriGas",
    "Chinabank",
    "GMA_New_Media",
    "Hypermarcas",
    "Trane"
  ]
'''

In [None]:
shutil.copytree('/content/webnlg-dataset_u/release_v3.0/en/train', '/content/data-directory_u/train')

'/content/data-directory_u/train'

In [None]:
!mkdir /content/data-directory_u/dev/
!mkdir /content/data-directory_u/dev/1triples/
!mkdir /content/data-directory_u/dev/2triples/
!mkdir /content/data-directory_u/dev/3triples/
!mkdir /content/data-directory_u/dev/4triples/
!mkdir /content/data-directory_u/dev/5triples/
!mkdir /content/data-directory_u/dev/6triples/
!mkdir /content/data-directory_u/dev/7triples/

!mv  '/content/data-directory_u/train/1triples/Food_allSolutions.xml' '/content/data-directory_u/dev/1triples/'
!mv  '/content/data-directory_u/train/2triples/Food.xml' '/content/data-directory_u/dev/2triples/'
!mv  '/content/data-directory_u/train/3triples/Food.xml' '/content/data-directory_u/dev/3triples/'
!mv '/content/data-directory_u/train/4triples/Food.xml' '/content/data-directory_u/dev/4triples/'
!mv '/content/data-directory_u/train/5triples/Food.xml' '/content/data-directory_u/dev/5triples/'
%cp -av  '/content/webnlg-dataset_u/release_v3.0/en/dev/5triples/Food.xml' '/content/data-directory_u/dev/6triples/'
%cp -av  '/content/webnlg-dataset_u/release_v3.0/en/dev/4triples/Food.xml' '/content/data-directory_u/dev/7triples/'

'/content/webnlg-dataset_u/release_v3.0/en/dev/5triples/Food.xml' -> '/content/data-directory_u/dev/6triples/Food.xml'
'/content/webnlg-dataset_u/release_v3.0/en/dev/4triples/Food.xml' -> '/content/data-directory_u/dev/7triples/Food.xml'
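
The `mkdir` and `mv` commands above hold out the Food domain by moving its XML files from the training folders to the dev folders. The same step can be sketched in Python, which makes it easier to repeat for another domain. `hold_out_domain` is a hypothetical helper; it covers only the move from `train` to `dev` for 1 to 5 triples, not the two `%cp` commands.

```python
import shutil
import tempfile
from pathlib import Path

def hold_out_domain(train_dir, dev_dir, domain="Food", max_triples=5):
    """Move <domain>*.xml from train/<n>triples to dev/<n>triples."""
    moved = []
    for n in range(1, max_triples + 1):
        src_dir = Path(train_dir) / f"{n}triples"
        dst_dir = Path(dev_dir) / f"{n}triples"
        dst_dir.mkdir(parents=True, exist_ok=True)
        if not src_dir.is_dir():
            continue
        # Matches both Food.xml and Food_allSolutions.xml.
        for src in sorted(src_dir.glob(f"{domain}*.xml")):
            shutil.move(str(src), str(dst_dir / src.name))
            moved.append(dst_dir / src.name)
    return moved

# Tiny self-contained demo on a temporary directory.
with tempfile.TemporaryDirectory() as root:
    train = Path(root) / "train" / "1triples"
    train.mkdir(parents=True)
    (train / "Food_allSolutions.xml").write_text("<benchmark/>")
    (train / "Airport.xml").write_text("<benchmark/>")
    moved = hold_out_domain(Path(root) / "train", Path(root) / "dev")
    moved_names = [p.name for p in moved]
    airport_kept = (train / "Airport.xml").exists()
print(moved_names, airport_kept)
```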


In [None]:
%cd '/content/webnlg-baseline_u'
%ls

/content/webnlg-baseline_u
benchmark_reader.py     LICENSE           README.md
calculate_bleu_dev.sh*  metrics.py        webnlg_baseline_input.py
delex_dict.json         multi-bleu.perl*  webnlg_relexicalise.py


In [None]:
# pre processing data with webnlg baseline 
!python webnlg_baseline_input.py -i '/content/data-directory_u/'

Input directory is  /content/data-directory_u/
Total of 83 files processed in train with all-delex mode
Total of 83 files processed in train with all-notdelex mode
Total of 7 files processed in dev with all-delex mode
Total of 7 files processed in dev with all-notdelex mode
Files necessary for training/evaluating are written on disc.


In [None]:
%cd '/content/'
%ls

/content
data/              drive/           ref.txt           webnlg-baseline_u/
data-directory/    hyp.txt          rouge/            webnlg-dataset/
data-directory_u/  meteor-1.5/      sample_data/      webnlg-dataset_u/
data_lstm/         multi-bleu.perl  tercom-0.7.25/
data_transf/       OpenNMT-py/      webnlg-baseline/


In [None]:
# copy the preprocessing output files into the data folder
!mkdir 'data_u'
lista_files = ['train-webnlg-all-delex.triple', 'train-webnlg-all-delex.lex', 'dev-webnlg-all-delex.triple', 'dev-webnlg-all-delex.lex']

os.chdir('/content/webnlg-baseline_u')
dst_dir = "/content/data_u/"
for f in lista_files:
  shutil.copy(f, dst_dir)

In [None]:
with open('/content/data_u/train-webnlg-all-delex.lex', 'r') as file1:
  Lines = file1.readlines()
lex = pd.DataFrame(Lines, columns=['ref_text'])

In [None]:
with open('/content/data_u/train-webnlg-all-delex.triple', 'r') as file1:
  Lines = file1.readlines()
triple = pd.DataFrame(Lines, columns=['rdf_triple'])

In [None]:
webnlg_train_u = pd.concat([triple, lex], axis=1)

In [None]:
# strip line endings from the reference texts (assignment avoids chained in-place edits)
webnlg_train_u['ref_text'] = webnlg_train_u['ref_text'].replace(r'[\r\n]', '', regex=True)

In [None]:
webnlg_train_u.to_csv('/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/UNSEEN/Concept-based/webnlg-train.csv')

In [None]:
with open('/content/data_u/dev-webnlg-all-delex.lex', 'r') as file1:
  Lines = file1.readlines()
lex = pd.DataFrame(Lines, columns=['ref_text'])

In [None]:
with open('/content/data_u/dev-webnlg-all-delex.triple', 'r') as file1:
  Lines = file1.readlines()
triple = pd.DataFrame(Lines, columns=['rdf_triple'])

In [None]:
webnlg_val_u = pd.concat([triple, lex], axis=1)

In [None]:
# strip line endings from the reference texts (assignment avoids chained in-place edits)
webnlg_val_u['ref_text'] = webnlg_val_u['ref_text'].replace(r'[\r\n]', '', regex=True)

In [None]:
len(webnlg_val_u)

3798

In [None]:
webnlg_val_u.to_csv('/content/drive/MyDrive/rdf-to-text/dataset/Notebook/WebNLG/16-domains/UNSEEN/Concept-based/webnlg-val.csv')
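
The cells above pair the `.triple` and `.lex` files line by line, which only works if both files have the same number of lines. A minimal standard-library sketch that makes this check explicit; `read_lines` and `read_parallel` are hypothetical helpers. Unlike the notebook cells, the sketch strips line endings from both columns.

```python
import tempfile
from pathlib import Path

def read_lines(path):
    """Read a text file as a list of lines without trailing line endings."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\r\n") for line in handle]

def read_parallel(triple_path, lex_path):
    """Pair each delexicalised triple line with its reference text."""
    triples, texts = read_lines(triple_path), read_lines(lex_path)
    if len(triples) != len(texts):
        raise ValueError(f"{len(triples)} triple lines vs {len(texts)} text lines")
    return [{"rdf_triple": t, "ref_text": x} for t, x in zip(triples, texts)]

# Tiny self-contained demo on temporary files.
with tempfile.TemporaryDirectory() as tmp:
    triple_file = Path(tmp) / "demo.triple"
    lex_file = Path(tmp) / "demo.lex"
    triple_file.write_text("CITY areaCode AREACODE\n", encoding="utf-8")
    lex_file.write_text("The area code for CITY is AREACODE .\n", encoding="utf-8")
    rows = read_parallel(triple_file, lex_file)
print(rows)
```

The resulting list of dictionaries can be passed directly to `pd.DataFrame(rows)`.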

## Import Dataset

In [None]:
# import train and val sets for the unseen model

train_raw_u = pd.read_csv(train_path_u)
val_raw_u = pd.read_csv(val_path_u)

train_raw_u.drop(columns=['Unnamed: 0'], inplace=True)
val_raw_u.drop(columns=['Unnamed: 0'], inplace=True)
train_raw_u.head(10)

Unnamed: 0,rdf_triple,ref_text
0,England capital CAPITAL\n,The capital of England is CAPITAL .
1,ASTRONAUT deathPlace DEATHPLACE ASTRONAUT occu...,ASTRONAUT was a national of the NATIONALITY . ...
2,CITY areaCode AREACODE\n,The area code for CITY is AREACODE .
3,ATHLETE club Hull City A . F . C . ATHLETE clu...,Abel Hernandez played for the Central Espanol ...
4,ARTIST genre GENRE GENRE musicSubgenre MUSICSU...,"The musical genre of ARTIST is hip hop music ,..."
5,SPORTSTEAM numberOfMembers NUMBEROFMEMBERS SPO...,AE Dimitra Efxeinoupolis has NUMBEROFMEMBERS m...
6,CITY utcOffset UTCOFFSET\n,"Anaheim , CA has a UTC offset of minus 7 ."
7,COMPANY product PRODUCT COMPANY keyPerson KEYP...,"The KEYPERSON is the head of the drugmaker , C..."
8,POLITICIAN birthPlace BIRTHPLACE\n,BIRTHPLACE was the birthplace of POLITICIAN .
9,LOCATION leader LEADER MONUMENT designer DESIG...,CAPITAL is the capital of LOCATION where the c...


## Setting parameters and training UNSEEN LSTM Model

In [None]:

%cd /content/
!mkdir data_lstm_u
!mkdir data_lstm_u/model
!mkdir data_lstm_u/loaded_model

/content


In [None]:
import yaml

data = {
    # Where the samples will be written
    'save_data': '/content/data_lstm_u/model/',
    # Where the vocab(s) will be written
    'src_vocab': '/content/data_lstm_u/example.vocab.src',
    'tgt_vocab': '/content/data_lstm_u/example.vocab.tgt',
    # Prevent overwriting existing files in the folder
    'overwrite': False,
    # Corpus opts:
    'data': {
        'corpus_1': {
            'path_src': '/content/data_u/train-webnlg-all-delex.triple',
            'path_tgt': '/content/data_u/train-webnlg-all-delex.lex',
        },
        'valid': {
            'path_src': '/content/data_u/dev-webnlg-all-delex.triple',
            'path_tgt': '/content/data_u/dev-webnlg-all-delex.lex',
        },
    },

    # Train on a single GPU
    'world_size': 1,
    'gpu_ranks': [0],

    # Where to save the checkpoints
    'save_model': '/content/data_lstm_u/model/',
    'save_checkpoint_steps': 5000,
    'train_steps': 35000,
    'valid_steps': 5000,
    'seed': 1234,
}

with open("/content/data_lstm_u/data.yaml", "w") as file:
    yaml.dump(data, file, default_flow_style=None)
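
The SEEN and UNSEEN configurations differ only in their working directory, their data directory and the model options. A small factory function can keep them in sync. This is a sketch under that assumption; `build_onmt_config` is a hypothetical helper and its keys are copied from the configuration cells in this notebook.

```python
def build_onmt_config(work_dir, data_dir, **model_opts):
    """Return a config dict with the keys used in this notebook."""
    cfg = {
        "save_data": f"{work_dir}/model/",
        "src_vocab": f"{work_dir}/example.vocab.src",
        "tgt_vocab": f"{work_dir}/example.vocab.tgt",
        "overwrite": False,
        "data": {
            "corpus_1": {
                "path_src": f"{data_dir}/train-webnlg-all-delex.triple",
                "path_tgt": f"{data_dir}/train-webnlg-all-delex.lex",
            },
            "valid": {
                "path_src": f"{data_dir}/dev-webnlg-all-delex.triple",
                "path_tgt": f"{data_dir}/dev-webnlg-all-delex.lex",
            },
        },
        "world_size": 1,
        "gpu_ranks": [0],
        "save_model": f"{work_dir}/model/",
        "save_checkpoint_steps": 5000,
        "train_steps": 35000,
        "valid_steps": 5000,
        "seed": 1234,
    }
    # Extra keyword arguments override or extend the defaults,
    # e.g. encoder_type="transformer", heads=4.
    cfg.update(model_opts)
    return cfg

lstm_u = build_onmt_config("/content/data_lstm_u", "/content/data_u")
transf = build_onmt_config("/content/data_transf", "/content/data",
                           encoder_type="transformer", decoder_type="transformer")
print(lstm_u["save_model"], transf["encoder_type"])
```

The returned dictionary can be written with `yaml.dump` exactly as in the cells above.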


In [None]:
# build vocab
!onmt_build_vocab -config /content/data_lstm_u/data.yaml -n_sample 10000

Corpus corpus_1's weight should be given. We default it to 1 for you.
[2022-06-07 09:13:11,414 INFO] Counter vocab from 10000 samples.
[2022-06-07 09:13:11,415 INFO] Build vocab on 10000 transformed examples/corpus.
[2022-06-07 09:13:11,422 INFO] corpus_1's transforms: TransformPipe()
[2022-06-07 09:13:11,623 INFO] Counters src:1807
[2022-06-07 09:13:11,623 INFO] Counters tgt:4858


In [None]:
# train the default OpenNMT model: an LSTM with 2 layers (500 units per layer). Execution time ~ 1 hour.
!onmt_train -config /content/data_lstm_u/data.yaml

[2022-03-14 10:12:05,846 INFO] Missing transforms field for corpus_1 data, set to default: [].
[2022-03-14 10:12:05,846 INFO] Missing transforms field for valid data, set to default: [].
[2022-03-14 10:12:05,846 INFO] Parsed 2 corpora from -data.
[2022-03-14 10:12:05,846 INFO] Get special vocabs from Transforms: {'src': set(), 'tgt': set()}.
[2022-03-14 10:12:05,847 INFO] Loading vocab from text file...
[2022-03-14 10:12:05,847 INFO] Loading src vocabulary from /content/data/example.vocab.src
[2022-03-14 10:12:05,851 INFO] Loaded src vocab has 1789 tokens.
[2022-03-14 10:12:05,852 INFO] Loading tgt vocabulary from /content/data/example.vocab.tgt
[2022-03-14 10:12:05,899 INFO] Loaded tgt vocab has 4827 tokens.
[2022-03-14 10:12:05,901 INFO] Building fields with vocab in counters...
[2022-03-14 10:12:05,908 INFO]  * tgt vocab size: 4831.
[2022-03-14 10:12:05,910 INFO]  * src vocab size: 1791.
[2022-03-14 10:12:05,911 INFO]  * src vocab size = 1791
[2022-03-14 10:12:05,911 INFO]  * tgt vo

In [None]:
# copy the saved model (path stored in model_lstm_u) into /content/data_lstm_u/loaded_model
import shutil

shutil.copyfile(src=model_lstm_u, dst='/content/data_lstm_u/loaded_model/lstm_model.pt')

'/content/data_lstm_u/loaded_model/lstm_model.pt'

In [None]:
%cd /content/

/content


In [None]:
# make prediction file
!onmt_translate -model /content/data_lstm_u/loaded_model/lstm_model.pt -src /content/data_u/dev-webnlg-all-delex.triple -output /content/data_lstm_u/pred.txt -gpu 0 -verbose -replace_unk

Streaming output truncated to the last 5000 lines.
SENT 2799: ['FOOD', 'region', 'REGION', 'COUNTRY', 'ethnicGroup', 'ETHNICGROUP', 'FOOD', 'country', 'COUNTRY']
PRED 2799: The COUNTRY is the location of Andrews County and includes the ethnic group of ETHNICGROUP .
PRED SCORE: -4.3342

[2022-06-07 09:13:37,819 INFO] 
SENT 2800: ['FOOD', 'country', 'COUNTRY', 'FOOD', 'ingredient', 'INGREDIENT', 'FOOD', 'region', 'REGION', 'FOOD', 'course', 'COURSE', 'FOOD', 'mainIngredient', 'MAININGREDIENT']
PRED 2800: The FOOD saint of REGION FOOD is FOOD .
PRED SCORE: -2.1210

[2022-06-07 09:13:37,819 INFO] 
SENT 2801: ['FOOD', 'region', 'REGION', 'REGION', 'leader', 'LEADER']
PRED 2801: The leader of REGION is LEADER .
PRED SCORE: -1.0941

[2022-06-07 09:13:37,819 INFO] 
SENT 2802: ['FOOD', 'country', 'COUNTRY', 'FOOD', 'region', 'REGION', 'COURSE', 'dishVariation', 'DISHVARIATION', 'COUNTRY', 'leader', 'LEADER', 'FOOD', 'course', 'COURSE']
PRED 2802: LEADER is the leader of COUNTRY whe
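The `-verbose` log above interleaves source tokens, predictions and scores. As a rough sketch (not part of the original notebook), the log could be parsed into records like this; `parse_verbose_log` is a hypothetical helper, and the line patterns are inferred only from the output shown above.

```python
import re

PRED_RE = re.compile(r'^PRED (\d+): (.*)$')
SCORE_RE = re.compile(r'^PRED SCORE: (-?\d+(?:\.\d+)?)$')

def parse_verbose_log(lines):
    """Collect one record per 'PRED n:' line and attach the following 'PRED SCORE:' value."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        match = PRED_RE.match(line)
        if match:
            current = {'id': int(match.group(1)), 'pred': match.group(2), 'score': None}
            records.append(current)
            continue
        match = SCORE_RE.match(line)
        if match and current is not None:
            current['score'] = float(match.group(1))
    return records

# Two entries copied from the LSTM output above.
sample_log = [
    "SENT 2799: ['FOOD', 'region', 'REGION', 'COUNTRY', 'ethnicGroup', 'ETHNICGROUP', 'FOOD', 'country', 'COUNTRY']",
    "PRED 2799: The COUNTRY is the location of Andrews County and includes the ethnic group of ETHNICGROUP .",
    "PRED SCORE: -4.3342",
    "",
    "[2022-06-07 09:13:37,819 INFO] ",
    "SENT 2801: ['FOOD', 'region', 'REGION', 'REGION', 'leader', 'LEADER']",
    "PRED 2801: The leader of REGION is LEADER .",
    "PRED SCORE: -1.0941",
]

parsed = parse_verbose_log(sample_log)
for record in parsed:
    print(record['id'], record['score'], record['pred'])
```

Sorting such records by score is one way to surface the least confident predictions for manual inspection.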

In [None]:
%cd /content/webnlg-baseline_u
%ls

/content/webnlg-baseline_u
all-notdelex-reference0.lex  dev-webnlg-all-notdelex.triple
all-notdelex-reference1.lex  LICENSE
all-notdelex-reference2.lex  metrics.py
all-notdelex-reference3.lex  multi-bleu.perl*
all-notdelex-reference4.lex  __pycache__/
all-notdelex-reference5.lex  README.md
all-notdelex-source.triple   train-webnlg-all-delex.lex
benchmark_reader.py          train-webnlg-all-delex.triple
calculate_bleu_dev.sh*       train-webnlg-all-notdelex.lex
delex_dict.json              train-webnlg-all-notdelex.triple
dev-webnlg-all-delex.lex     webnlg_baseline_input.py
dev-webnlg-all-delex.triple  webnlg_relexicalise.py
dev-webnlg-all-notdelex.lex


In [None]:
# relexicalisation

!python webnlg_relexicalise.py -i /content/data-directory_u/ -f /content/data_lstm_u/pred.txt


Input directory is /content/data-directory_u/
Path to the file is /content/data_lstm_u/pred.txt
Total of 83 files processed in train with all-delex mode
Total of 83 files processed in train with all-notdelex mode
Total of 7 files processed in dev with all-delex mode
Total of 7 files processed in dev with all-notdelex mode
Files necessary for training/evaluating are written on disc.


#### Evaluation Metrics: LSTM

##### Bleu

In [None]:
!chmod 755 /content/webnlg-baseline_u/calculate_bleu_dev.sh
!chmod 755 /content/webnlg-baseline_u/multi-bleu.perl

In [None]:
# bleu score
!./calculate_bleu_dev.sh

BLEU = 21.20, 60.7/28.6/14.1/8.3 (BP=1.000, ratio=1.029, hyp_len=24735, ref_len=24046)
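The summary line printed by `multi-bleu.perl` packs several numbers together. A small sketch for pulling them apart, assuming the line keeps the format shown above (`parse_multi_bleu` is a hypothetical helper, not part of the baseline scripts):

```python
import re

BLEU_RE = re.compile(
    r'BLEU = (?P<bleu>[\d.]+), '
    r'(?P<p1>[\d.]+)/(?P<p2>[\d.]+)/(?P<p3>[\d.]+)/(?P<p4>[\d.]+) '
    r'\(BP=(?P<bp>[\d.]+), ratio=(?P<ratio>[\d.]+), '
    r'hyp_len=(?P<hyp_len>\d+), ref_len=(?P<ref_len>\d+)\)'
)

def parse_multi_bleu(line):
    """Split a multi-bleu.perl summary line into its numeric fields."""
    match = BLEU_RE.search(line)
    if match is None:
        raise ValueError(f"not a multi-bleu summary line: {line!r}")
    return {
        'bleu': float(match['bleu']),
        'precisions': [float(match[f'p{n}']) for n in range(1, 5)],  # 1- to 4-gram precision
        'bp': float(match['bp']),            # brevity penalty
        'ratio': float(match['ratio']),      # hyp_len / ref_len
        'hyp_len': int(match['hyp_len']),
        'ref_len': int(match['ref_len']),
    }

lstm_bleu = parse_multi_bleu(
    "BLEU = 21.20, 60.7/28.6/14.1/8.3 (BP=1.000, ratio=1.029, hyp_len=24735, ref_len=24046)"
)
print(lstm_bleu)
```

The four slash-separated values are the 1-gram to 4-gram precisions, and `ratio` is the hypothesis length divided by the reference length.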


In [None]:
# create the input files for METEOR and TER

!python metrics.py

Input files for METEOR and TER generated successfully.


In [None]:
%cd /content/

/content


##### Meteor

In [None]:
# if you didn't import meteor metric before, please run this code
'''
%%capture 
import shutil

source_dir = meteor_path
destination_dir = r"/content/meteor-1.5"
shutil.copytree(source_dir, destination_dir)
'''

In [None]:
%cd /content/meteor-1.5/

/content/meteor-1.5


In [None]:
!java -Xmx2G -jar meteor-1.5.jar /content/webnlg-baseline_u/relexicalised_predictions.txt /content/webnlg-baseline_u/all-notdelex-refs-meteor.txt -l en -norm -r 8

Meteor version: 1.5

Eval ID:        meteor-1.5-wo-en-norm-0.85_0.2_0.6_0.75-ex_st_sy_pa-1.0_0.6_0.8_0.6

Language:       English
Format:         plaintext
Task:           Ranking
Modules:        exact stem synonym paraphrase
Weights:        1.0 0.6 0.8 0.6
Parameters:     0.85 0.2 0.6 0.75

Segment 1 score:	0.40437643345805335
Segment 2 score:	0.1653584379442454
Segment 3 score:	0.14768041131685702
Segment 4 score:	0.1835898300456014
Segment 5 score:	0.23703703703703705
Segment 6 score:	0.29854353876395345
Segment 7 score:	0.275
Segment 8 score:	0.13197842538011018
Segment 9 score:	0.1449759174960861
Segment 10 score:	0.14673018307922248
Segment 11 score:	0.14511707730028214
Segment 12 score:	0.11419842338874782
Segment 13 score:	0.0898876404494382
Segment 14 score:	0.10648029413012809
Segment 15 score:	0.10201912858661
Segment 16 score:	0.1664822373976546
Segment 17 score:	0.11872802749251371
Segment 18 score:	0.12852969088544877
Segment 19 score:	0.10282776349614396
Segment 20 score
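METEOR prints one score per segment before its final summary. As a hedged sketch (not part of the original notebook), the per-segment lines can be collected and averaged as below; `segment_scores` is a hypothetical helper. Note that this plain mean over the listed segments is only a quick diagnostic and is not the same quantity as the final score METEOR reports, which is computed from aggregate statistics.

```python
import re

SEGMENT_RE = re.compile(r'^Segment (\d+) score:\s+([\d.]+)$')

def segment_scores(lines):
    """Return the per-segment METEOR scores found in the given output lines."""
    scores = []
    for raw in lines:
        match = SEGMENT_RE.match(raw.strip())
        if match:
            scores.append(float(match.group(2)))
    return scores

# First five segments copied from the output above, plus a truncated line that is skipped.
sample_output = [
    "Segment 1 score:\t0.40437643345805335",
    "Segment 2 score:\t0.1653584379442454",
    "Segment 3 score:\t0.14768041131685702",
    "Segment 4 score:\t0.1835898300456014",
    "Segment 5 score:\t0.23703703703703705",
    "Segment 20 score",
]

meteor_segments = segment_scores(sample_output)
meteor_mean = sum(meteor_segments) / len(meteor_segments)
print(len(meteor_segments), round(meteor_mean, 4))
```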

##### Ter

In [None]:
'''
# import Ter metric

import shutil
source_dir = ter_path
destination_dir = r"/content/tercom-0.7.25"
shutil.copytree(source_dir, destination_dir)
'''

'/content/tercom-0.7.25'

In [None]:
%cd /content/tercom-0.7.25/

/content/tercom-0.7.25


In [None]:
!java -jar tercom.7.25.jar -h /content/webnlg-baseline_u/relexicalised_predictions-ter.txt -r /content/webnlg-baseline_u/all-notdelex-refs-ter.txt

"/content/webnlg-baseline_u/relexicalised_predictions-ter.txt" was successfully parsed as Trans text
"/content/webnlg-baseline_u/all-notdelex-refs-ter.txt" was successfully parsed as Trans text
Processing id0:1
Processing id1:1
Processing id2:1
Processing id3:1
Processing id4:1
Processing id5:1
Processing id6:1
Processing id7:1
Processing id8:1
Processing id9:1
Processing id10:1
Processing id11:1
Processing id12:1
Processing id13:1
Processing id14:1
Processing id15:1
Processing id16:1
Processing id17:1
Processing id18:1
Processing id19:1
Processing id20:1
Processing id21:1
Processing id22:1
Processing id23:1
Processing id24:1
Processing id25:1
Processing id26:1
Processing id27:1
Processing id28:1
Processing id29:1
Processing id30:1
Processing id31:1
Processing id32:1
Processing id33:1
Processing id34:1
Processing id35:1
Processing id36:1
Processing id37:1
Processing id38:1
Processing id39:1
Processing id40:1
Processing id41:1
Processing id42:1
Processing id43:1
Processing id44:1
Proces

##### Rouge

In [None]:
%cd ..

/content


In [None]:
# if you didn't import and install rouge metric before, please run this code
'''
# import and install rouge metric

%cd /content/
!git clone https://github.com/pltrdy/rouge.git
%cd rouge
!python setup.py install
'''

In [None]:
%cd /content/rouge 
from rouge import FilesRouge

hyp_path = r'/content/webnlg-baseline_u/relexicalised_predictions-ter.txt'

ref_path= r'/content/webnlg-baseline_u/all-notdelex-oneref-ter.txt'


files_rouge = FilesRouge()
scores = files_rouge.get_scores(hyp_path, ref_path, avg=True)
scores


/content/rouge


{'rouge-1': {'f': 0.45898170319040116,
  'p': 0.5895128507839773,
  'r': 0.3987637537788297},
 'rouge-2': {'f': 0.15525238985313913,
  'p': 0.18668975465084595,
  'r': 0.14149098024423795},
 'rouge-l': {'f': 0.38336470525165267,
  'p': 0.49765954823484687,
  'r': 0.33140889464696244}}
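The nested dictionary returned by `FilesRouge.get_scores(..., avg=True)` is easier to compare across models once flattened. A minimal sketch, using the values printed above; `rouge_rows` and `lstm_rouge` are hypothetical names introduced for illustration.

```python
# Scores copied from the output above (LSTM, unseen setting).
lstm_rouge = {
    'rouge-1': {'f': 0.45898170319040116, 'p': 0.5895128507839773, 'r': 0.3987637537788297},
    'rouge-2': {'f': 0.15525238985313913, 'p': 0.18668975465084595, 'r': 0.14149098024423795},
    'rouge-l': {'f': 0.38336470525165267, 'p': 0.49765954823484687, 'r': 0.33140889464696244},
}

def rouge_rows(scores, digits=4):
    """Flatten the nested ROUGE dict into (metric, precision, recall, f-score) tuples."""
    rows = []
    for metric in sorted(scores):
        values = scores[metric]
        rows.append((
            metric,
            round(values['p'], digits),
            round(values['r'], digits),
            round(values['f'], digits),
        ))
    return rows

lstm_rouge_rows = rouge_rows(lstm_rouge)
for metric, precision, recall, f_score in lstm_rouge_rows:
    print(f"{metric:<8} P={precision:.4f} R={recall:.4f} F={f_score:.4f}")
```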

##### Bert Score

In [None]:
# if you didn't install bert score before, please run this code

#!pip install bert-score

In [None]:
# read the references: one per line, surrounding whitespace removed
with open("/content/webnlg-baseline_u/all-notdelex-oneref-ter.txt", "r") as a_file:
    ref = [line.strip() for line in a_file]

In [None]:
# read the hypotheses: one per line, surrounding whitespace removed
with open("/content/webnlg-baseline_u/relexicalised_predictions-ter.txt", "r") as a_file:
    hyp = [line.strip() for line in a_file]

In [None]:
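from bert_score import score

def bert_score_(references, hypothesis, lng='en'):
    # Expects `references` as a list of lists of strings (one list of
    # references per hypothesis); empty references are dropped below.
    # Caution: given a flat list of strings, the comprehension iterates over
    # the characters of each string instead.
    for i, refs in enumerate(references):
        references[i] = [ref for ref in refs if ref.strip() != '']
    try:
        P, R, F1 = score(hypothesis, references, lang=lng)
        # average the per-segment scores into corpus-level floats
        P, R, F1 = list(P), list(R), list(F1)
        F1 = float(sum(F1) / len(F1))
        P = float(sum(P) / len(P))
        R = float(sum(R) / len(R))
    except Exception:
        # fall back to zeros if scoring fails
        P, R, F1 = 0, 0, 0
    return P, R, F1

bert_score_(references=ref, hypothesis=hyp, lng='en')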

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


(0.7952078580856323, 0.9997720718383789, 0.8857350945472717)
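BERTScore compares the i-th hypothesis with the i-th reference, so the two files must be line-aligned. The sketch below (not part of the original notebook) loads both files and fails early if the line counts differ; `read_lines` and `load_aligned` are hypothetical helpers, and the demonstration sentences are illustrative only.

```python
import os
import tempfile

def read_lines(path):
    """Read a text file and return its lines with surrounding whitespace removed."""
    with open(path, 'r', encoding='utf-8') as handle:
        return [line.strip() for line in handle]

def load_aligned(hyp_path, ref_path):
    """Load hypotheses and references, raising if they are not line-aligned."""
    hyps = read_lines(hyp_path)
    refs = read_lines(ref_path)
    if len(hyps) != len(refs):
        raise ValueError(f"{len(hyps)} hypotheses but {len(refs)} references")
    return hyps, refs

# Self-contained demonstration with temporary files.
with tempfile.TemporaryDirectory() as tmp:
    demo_hyp = os.path.join(tmp, 'hyp.txt')
    demo_ref = os.path.join(tmp, 'ref.txt')
    with open(demo_hyp, 'w', encoding='utf-8') as handle:
        handle.write("The leader of REGION is LEADER .\nFOOD is from COUNTRY .\n")
    with open(demo_ref, 'w', encoding='utf-8') as handle:
        handle.write("LEADER is a leader of REGION .\nFOOD comes from COUNTRY .\n")
    demo_hyps, demo_refs = load_aligned(demo_hyp, demo_ref)

print(list(zip(demo_hyps, demo_refs)))
```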

#### Models results

In [None]:
df_pred_lstm_u = pd.read_fwf('/content/data_lstm_u/pred.txt', header=None)
df_pred_lstm_u= df_pred_lstm_u.rename(columns={0:'text'})
df_pred_lstm_u = df_pred_lstm_u[['text']]
df_pred_lstm_u = df_pred_lstm_u.head(10)
df_pred_lstm_u

Unnamed: 0,text
0,The predecessor of James ingredient Watson is ...
1,Dessert have won Dessert .
2,LEADER is the leader of COUNTRY where the capi...
3,LEADER is the leader of the COUNTRY where the ...
4,LEADER is the leader of COUNTRY where the capi...
5,"The AIRPORT is located in COUNTRY , the ALTERN..."
6,FOOD have won FOOD .
7,The name of the leader of COUNTRY FOOD is LEAD...
8,"COUNTRY ' s capital is CAPITAL , and the count..."
9,"region is located in COUNTRY , where the leade..."


In [None]:
val_sample_u = val_raw_u.copy()
val_sample_u = val_sample_u.head(10)
val_sample_u

Unnamed: 0,rdf_triple,ref_text
0,FOOD ingredient INGREDIENT FOOD mainIngredient...,The ingredients of binignit include taro and s...
1,Dessert dishVariation DISHVARIATION\n,DISHVARIATION is a type of dessert .
2,FOOD country COUNTRY COUNTRY leader Pietro Gra...,FOOD originates from COUNTRY whose capital cit...
3,COUNTRY leader LEADER FOOD country COUNTRY REG...,FOOD is a dish from REGION and COUNTRY . REGIO...
4,REGION capital CAPITAL REGION language LANGUAG...,FOOD is a Chinese style dish from REGION ( cap...
5,FOOD country COUNTRY FOOD alternativeName ALTE...,FOOD ( or ALTERNATIVENAME ) contains the ingre...
6,FOOD mainIngredient MAININGREDIENT\n,MAININGREDIENT is a main ingredient in FOOD .
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,FOOD is a dessert found in COUNTRY . The leade...
8,FOOD country COUNTRY COUNTRY capital CAPITAL\n,FOOD comes from the COUNTRY where the capital ...
9,COUNTRY leader Felipe VI of COUNTRY FOOD regio...,"FOOD is from the REGION region in COUNTRY , th..."


In [None]:
prediction_df_lstm_u = pd.DataFrame(columns=['rdf_triple', 'prediction_text'] )
prediction_df_lstm_u.rdf_triple = val_sample_u.rdf_triple.values
prediction_df_lstm_u.prediction_text = df_pred_lstm_u.text.values
prediction_df_lstm_u

Unnamed: 0,rdf_triple,prediction_text
0,FOOD ingredient INGREDIENT FOOD mainIngredient...,The predecessor of James ingredient Watson is ...
1,Dessert dishVariation DISHVARIATION\n,Dessert have won Dessert .
2,FOOD country COUNTRY COUNTRY leader Pietro Gra...,LEADER is the leader of COUNTRY where the capi...
3,COUNTRY leader LEADER FOOD country COUNTRY REG...,LEADER is the leader of the COUNTRY where the ...
4,REGION capital CAPITAL REGION language LANGUAG...,LEADER is the leader of COUNTRY where the capi...
5,FOOD country COUNTRY FOOD alternativeName ALTE...,"The AIRPORT is located in COUNTRY , the ALTERN..."
6,FOOD mainIngredient MAININGREDIENT\n,FOOD have won FOOD .
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,The name of the leader of COUNTRY FOOD is LEAD...
8,FOOD country COUNTRY COUNTRY capital CAPITAL\n,"COUNTRY ' s capital is CAPITAL , and the count..."
9,COUNTRY leader Felipe VI of COUNTRY FOOD regio...,"region is located in COUNTRY , where the leade..."


In [None]:
text_comparation_lstm_u = pd.DataFrame(columns=['ref_text', 'prediction_text'] )
text_comparation_lstm_u.ref_text = val_sample_u.ref_text.values
text_comparation_lstm_u.prediction_text = df_pred_lstm_u.text.values
text_comparation_lstm_u

Unnamed: 0,ref_text,prediction_text
0,The ingredients of binignit include taro and s...,The predecessor of James ingredient Watson is ...
1,DISHVARIATION is a type of dessert .,Dessert have won Dessert .
2,FOOD originates from COUNTRY whose capital cit...,LEADER is the leader of COUNTRY where the capi...
3,FOOD is a dish from REGION and COUNTRY . REGIO...,LEADER is the leader of the COUNTRY where the ...
4,FOOD is a Chinese style dish from REGION ( cap...,LEADER is the leader of COUNTRY where the capi...
5,FOOD ( or ALTERNATIVENAME ) contains the ingre...,"The AIRPORT is located in COUNTRY , the ALTERN..."
6,MAININGREDIENT is a main ingredient in FOOD .,FOOD have won FOOD .
7,FOOD is a dessert found in COUNTRY . The leade...,The name of the leader of COUNTRY FOOD is LEAD...
8,FOOD comes from the COUNTRY where the capital ...,"COUNTRY ' s capital is CAPITAL , and the count..."
9,"FOOD is from the REGION region in COUNTRY , th...","region is located in COUNTRY , where the leade..."


## Setting parameters and training UNSEEN Transformer Model

In [None]:
%cd /content/
!mkdir data_transf_u
!mkdir data_transf_u/model
!mkdir data_transf_u/loaded_model

/content


In [None]:

import yaml
data = {
# Where the samples will be written
'save_data': '/content/data_transf_u/model/',
# Where the vocab(s) will be written
'src_vocab': '/content/data_transf_u/example.vocab.src',
'tgt_vocab': '/content/data_transf_u/example.vocab.tgt',
# Prevent overwriting existing files in the folder
'overwrite': False,
# Corpus opts:
'data': ({
    'corpus_1':({
            'path_src': '/content/data_u/train-webnlg-all-delex.triple',
            'path_tgt': '/content/data_u/train-webnlg-all-delex.lex',
        }),

    'valid':({
            'path_src': '/content/data_u/dev-webnlg-all-delex.triple',
            'path_tgt': '/content/data_u/dev-webnlg-all-delex.lex',
        }),

}) ,

# Vocabulary files that were just created
'src_vocab': '/content/data_transf_u/example.vocab.src',
'tgt_vocab': '/content/data_transf_u/example.vocab.tgt',

# Train on a single GPU
'world_size': 1,
'gpu_ranks': [0],

# Where to save the checkpoints
'save_model': '/content/data_transf_u/model/',
'save_checkpoint_steps': 5000,
'train_steps': 35000,
'valid_steps': 5000,
'decoder_type': 'transformer',
'encoder_type': 'transformer',
'word_vec_size': 512,
'rnn_size': 512,
'layers': 2,
'transformer_ff': 2048,
'heads': 4,
'batch_size': 64,
'batch_type': 'sents',
'normalization': 'tokens',
'dropout': 0.3,
'label_smoothing': 0.1,
'seed':1234
}

# write the configuration to disk; the context manager closes the file
with open("/content/data_transf_u/data.yaml", "w") as file:
    yaml.dump(data, file, default_flow_style=None)
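Multi-head attention splits the model dimension evenly across heads, so the hidden size must be divisible by the number of heads. The sketch below (not part of the original notebook) checks that for the values used above. `check_transformer_dims` is a hypothetical helper; the second check, that the embedding size equals the hidden size, is an assumption about this setup rather than a documented rule.

```python
def check_transformer_dims(config):
    """Return a list of human-readable problems with the Transformer dimensions."""
    problems = []
    if config['rnn_size'] % config['heads'] != 0:
        problems.append(
            f"rnn_size={config['rnn_size']} is not divisible by heads={config['heads']}"
        )
    # Assumption: embeddings feed the encoder directly, so the sizes should match.
    if config['word_vec_size'] != config['rnn_size']:
        problems.append(
            f"word_vec_size={config['word_vec_size']} differs from rnn_size={config['rnn_size']}"
        )
    return problems

# Values copied from the configuration above.
transformer_opts = {
    'word_vec_size': 512,
    'rnn_size': 512,
    'layers': 2,
    'transformer_ff': 2048,
    'heads': 4,
}

dim_problems = check_transformer_dims(transformer_opts)
per_head_size = transformer_opts['rnn_size'] // transformer_opts['heads']
print(dim_problems, per_head_size)
```

With 512 hidden units and 4 heads, each head works on a 128-dimensional slice.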


In [None]:
# build vocab
!onmt_build_vocab -config /content/data_transf_u/data.yaml -n_sample 10000

Corpus corpus_1's weight should be given. We default it to 1 for you.
[2022-06-07 09:17:11,515 INFO] Counter vocab from 10000 samples.
[2022-06-07 09:17:11,515 INFO] Build vocab on 10000 transformed examples/corpus.
[2022-06-07 09:17:11,528 INFO] corpus_1's transforms: TransformPipe()
[2022-06-07 09:17:11,729 INFO] Counters src:1807
[2022-06-07 09:17:11,730 INFO] Counters tgt:4858


In [None]:
# train the Transformer OpenNMT model
!onmt_train -config /content/data_transf_u/data.yaml

[2022-03-15 14:15:44,140 INFO] Missing transforms field for corpus_1 data, set to default: [].
[2022-03-15 14:15:44,140 INFO] Missing transforms field for valid data, set to default: [].
[2022-03-15 14:15:44,140 INFO] Parsed 2 corpora from -data.
[2022-03-15 14:15:44,140 INFO] Get special vocabs from Transforms: {'src': set(), 'tgt': set()}.
[2022-03-15 14:15:44,140 INFO] Loading vocab from text file...
[2022-03-15 14:15:44,140 INFO] Loading src vocabulary from /content/example.vocab.src
[2022-03-15 14:15:44,172 INFO] Loaded src vocab has 3356 tokens.
[2022-03-15 14:15:44,174 INFO] Loading tgt vocabulary from /content/example.vocab.tgt
[2022-03-15 14:15:44,196 INFO] Loaded tgt vocab has 10034 tokens.
[2022-03-15 14:15:44,201 INFO] Building fields with vocab in counters...
[2022-03-15 14:15:44,214 INFO]  * tgt vocab size: 10038.
[2022-03-15 14:15:44,218 INFO]  * src vocab size: 3358.
[2022-03-15 14:15:44,218 INFO]  * src vocab size = 3358
[2022-03-15 14:15:44,218 INFO]  * tgt vocab size

In [None]:
# copy the saved model (path stored in model_transformer_u) into /content/data_transf_u/loaded_model
import shutil

shutil.copyfile(src=model_transformer_u, dst='/content/data_transf_u/loaded_model/transformer_model.pt')

'/content/data_transf_u/loaded_model/transformer_model.pt'

In [None]:
# make prediction file
!onmt_translate -model /content/data_transf_u/loaded_model/transformer_model.pt -src /content/data_u/dev-webnlg-all-delex.triple -output /content/data_transf_u/pred.txt -gpu 0 -verbose -replace_unk

Streaming output truncated to the last 5000 lines.
SENT 2799: ['FOOD', 'region', 'REGION', 'COUNTRY', 'ethnicGroup', 'ETHNICGROUP', 'FOOD', 'country', 'COUNTRY']
PRED 2799: The COUNTRY includes the ethnic group of ETHNICGROUP and is the location of REGION .
PRED SCORE: -7.9885

[2022-06-07 09:18:21,047 INFO] 
SENT 2800: ['FOOD', 'country', 'COUNTRY', 'FOOD', 'ingredient', 'INGREDIENT', 'FOOD', 'region', 'REGION', 'FOOD', 'course', 'COURSE', 'FOOD', 'mainIngredient', 'MAININGREDIENT']
PRED 2800: Luanda is located in the country of COUNTRY .
PRED SCORE: -1.8636

[2022-06-07 09:18:21,047 INFO] 
SENT 2801: ['FOOD', 'region', 'REGION', 'REGION', 'leader', 'LEADER']
PRED 2801: LEADER is a leader of REGION .
PRED SCORE: -2.5342

[2022-06-07 09:18:21,047 INFO] 
SENT 2802: ['FOOD', 'country', 'COUNTRY', 'FOOD', 'region', 'REGION', 'COURSE', 'dishVariation', 'DISHVARIATION', 'COUNTRY', 'leader', 'LEADER', 'FOOD', 'course', 'COURSE']
PRED 2802: Luanda is located in COUNTRY .
PRED SCO

In [None]:
%cd /content/webnlg-baseline_u
%ls

/content/webnlg-baseline_u
all-notdelex-oneref-ter.txt   dev-webnlg-all-notdelex.lex
all-notdelex-reference0.lex   dev-webnlg-all-notdelex.triple
all-notdelex-reference1.lex   LICENSE
all-notdelex-reference2.lex   metrics.py
all-notdelex-reference3.lex   multi-bleu.perl*
all-notdelex-reference4.lex   __pycache__/
all-notdelex-reference5.lex   README.md
all-notdelex-refs-meteor.txt  relexicalised_predictions-ter.txt
all-notdelex-refs-ter.txt     relexicalised_predictions.txt
all-notdelex-source.triple    train-webnlg-all-delex.lex
benchmark_reader.py           train-webnlg-all-delex.triple
calculate_bleu_dev.sh*        train-webnlg-all-notdelex.lex
delex_dict.json               train-webnlg-all-notdelex.triple
dev-webnlg-all-delex.lex      webnlg_baseline_input.py
dev-webnlg-all-delex.triple   webnlg_relexicalise.py


In [None]:
# relexicalisation

!python webnlg_relexicalise.py -i /content/data-directory_u/ -f /content/data_transf_u/pred.txt


Input directory is /content/data-directory_u/
Path to the file is /content/data_transf_u/pred.txt
Total of 83 files processed in train with all-delex mode
Total of 83 files processed in train with all-notdelex mode
Total of 7 files processed in dev with all-delex mode
Total of 7 files processed in dev with all-notdelex mode
Files necessary for training/evaluating are written on disc.


#### Evaluation Metrics: TRANSFORMER

##### Bleu

In [None]:
!chmod 755 /content/webnlg-baseline_u/calculate_bleu_dev.sh
!chmod 755 /content/webnlg-baseline_u/multi-bleu.perl

In [None]:
# bleu score
!./calculate_bleu_dev.sh

BLEU = 25.50, 65.1/32.3/17.9/11.2 (BP=1.000, ratio=1.137, hyp_len=19607, ref_len=17251)
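The reported BLEU can be cross-checked from the other numbers on the same line: BLEU is the brevity penalty times the geometric mean of the four n-gram precisions. A short sketch, not part of the original notebook, with hypothetical helper names; because the printed precisions are rounded to one decimal, the recomputed value should land close to, but not exactly on, the reported 25.50.

```python
import math

def brevity_penalty(hyp_len, ref_len):
    """1.0 when the hypothesis is at least as long as the reference, else exp(1 - ref/hyp)."""
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)

def bleu_from_precisions(precisions, bp):
    """Brevity penalty times the geometric mean of the n-gram precisions (in percent)."""
    if min(precisions) <= 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return bp * math.exp(log_mean)

# Values copied from the Transformer output line above.
transformer_bp = brevity_penalty(hyp_len=19607, ref_len=17251)
transformer_bleu = bleu_from_precisions([65.1, 32.3, 17.9, 11.2], transformer_bp)
print(transformer_bp, round(transformer_bleu, 2))
```

Since the hypotheses are longer than the references here (ratio above 1), the brevity penalty is 1 and BLEU reduces to the geometric mean alone.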


In [None]:
# create the input files for METEOR and TER

!python metrics.py

Input files for METEOR and TER generated successfully.


In [None]:
%cd /content/

/content


##### Meteor

In [None]:
# if you didn't import meteor metric before, please run this code
'''
%%capture 
import shutil

source_dir = meteor_path
destination_dir = r"/content/meteor-1.5"
shutil.copytree(source_dir, destination_dir)
'''

'/content/meteor-1.5'

In [None]:
%cd /content/meteor-1.5/

/content/meteor-1.5


In [None]:
!java -Xmx2G -jar meteor-1.5.jar /content/webnlg-baseline_u/relexicalised_predictions.txt /content/webnlg-baseline_u/all-notdelex-refs-meteor.txt -l en -norm -r 8

Meteor version: 1.5

Eval ID:        meteor-1.5-wo-en-norm-0.85_0.2_0.6_0.75-ex_st_sy_pa-1.0_0.6_0.8_0.6

Language:       English
Format:         plaintext
Task:           Ranking
Modules:        exact stem synonym paraphrase
Weights:        1.0 0.6 0.8 0.6
Parameters:     0.85 0.2 0.6 0.75

Segment 1 score:	0.23636785878886957
Segment 2 score:	0.242176694090124
Segment 3 score:	0.19226095232028023
Segment 4 score:	0.20949350090368915
Segment 5 score:	0.11367309239464718
Segment 6 score:	0.1128540112511216
Segment 7 score:	0.1450439133477154
Segment 8 score:	0.16596790612052242
Segment 9 score:	0.1570306944446592
Segment 10 score:	0.13745700095721952
Segment 11 score:	0.16643819760272982
Segment 12 score:	0.06697410141935543
Segment 13 score:	0.067495916309544
Segment 14 score:	0.06346516359401035
Segment 15 score:	0.09634679529475691
Segment 16 score:	0.13199612009117992
Segment 17 score:	0.22188773511105456
Segment 18 score:	0.17694957741694511
Segment 19 score:	0.15372242109356496
S

##### Ter

In [None]:
# if you didn't import ter metric before, please run this code
'''
import shutil
source_dir = ter_path
destination_dir = r"/content/tercom-0.7.25"
shutil.copytree(source_dir, destination_dir)
'''

'/content/tercom-0.7.25'

In [None]:
%cd /content/tercom-0.7.25/

/content/tercom-0.7.25


In [None]:
!java -jar tercom.7.25.jar -h /content/webnlg-baseline_u/relexicalised_predictions-ter.txt -r /content/webnlg-baseline_u/all-notdelex-refs-ter.txt

"/content/webnlg-baseline_u/relexicalised_predictions-ter.txt" was successfully parsed as Trans text
"/content/webnlg-baseline_u/all-notdelex-refs-ter.txt" was successfully parsed as Trans text
Processing id0:1
Processing id1:1
Processing id2:1
Processing id3:1
Processing id4:1
Processing id5:1
Processing id6:1
Processing id7:1
Processing id8:1
Processing id9:1
Processing id10:1
Processing id11:1
Processing id12:1
Processing id13:1
Processing id14:1
Processing id15:1
Processing id16:1
Processing id17:1
Processing id18:1
Processing id19:1
Processing id20:1
Processing id21:1
Processing id22:1
Processing id23:1
Processing id24:1
Processing id25:1
Processing id26:1
Processing id27:1
Processing id28:1
Processing id29:1
Processing id30:1
Processing id31:1
Processing id32:1
Processing id33:1
Processing id34:1
Processing id35:1
Processing id36:1
Processing id37:1
Processing id38:1
Processing id39:1
Processing id40:1
Processing id41:1
Processing id42:1
Processing id43:1
Processing id44:1
Proces

##### Rouge

In [None]:
# if you didn't import and install rouge metric before, please run this code
'''
# import and install rouge metric

%cd /content/
!git clone https://github.com/pltrdy/rouge.git
%cd rouge
!python setup.py install
'''

In [None]:
%cd /content/rouge
from rouge import FilesRouge

hyp_path = r'/content/webnlg-baseline_u/relexicalised_predictions-ter.txt'

ref_path= r'/content/webnlg-baseline_u/all-notdelex-oneref-ter.txt'


files_rouge = FilesRouge()
scores = files_rouge.get_scores(hyp_path, ref_path, avg=True)
scores


/content/rouge


{'rouge-1': {'f': 0.4311607428788771,
  'p': 0.6033267689334919,
  'r': 0.35850062766381807},
 'rouge-2': {'f': 0.13995982441875673,
  'p': 0.18577696053130083,
  'r': 0.12187111188562148},
 'rouge-l': {'f': 0.3755101783420323,
  'p': 0.5328832250544943,
  'r': 0.3098683834951574}}

##### Bert Score

In [None]:
# if you didn't install bert score before, please run this code

#!pip install bert-score

In [None]:
# read the references: one per line, surrounding whitespace removed
with open("/content/webnlg-baseline_u/all-notdelex-oneref-ter.txt", "r") as a_file:
    ref = [line.strip() for line in a_file]

In [None]:
# read the hypotheses: one per line, surrounding whitespace removed
with open("/content/webnlg-baseline_u/relexicalised_predictions-ter.txt", "r") as a_file:
    hyp = [line.strip() for line in a_file]

In [None]:
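from bert_score import score

def bert_score_(references, hypothesis, lng='en'):
    # Expects `references` as a list of lists of strings (one list of
    # references per hypothesis); empty references are dropped below.
    # Caution: given a flat list of strings, the comprehension iterates over
    # the characters of each string instead.
    for i, refs in enumerate(references):
        references[i] = [ref for ref in refs if ref.strip() != '']
    try:
        P, R, F1 = score(hypothesis, references, lang=lng)
        # average the per-segment scores into corpus-level floats
        P, R, F1 = list(P), list(R), list(F1)
        F1 = float(sum(F1) / len(F1))
        P = float(sum(P) / len(P))
        R = float(sum(R) / len(R))
    except Exception:
        # fall back to zeros if scoring fails
        P, R, F1 = 0, 0, 0
    return P, R, F1

bert_score_(references=ref, hypothesis=hyp, lng='en')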

Some weights of the model checkpoint at roberta-large were not used when initializing RobertaModel: ['lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.weight']
- This IS expected if you are initializing RobertaModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


(0.798651397228241, 0.9997825026512146, 0.8878664374351501)

#### Models results

In [None]:
df_pred_tr_u = pd.read_fwf('/content/data_transf_u/pred.txt', header=None)
df_pred_tr_u = df_pred_tr_u.rename(columns={0:'text'})
df_pred_tr_u = df_pred_tr_u[['text']]
df_pred_tr_u = df_pred_tr_u.head(10)
df_pred_tr_u 

Unnamed: 0,text
0,The surface of the order ORDER is made of ORDER .
1,Dessert Dessert Dessert is in Dessert .
2,Pietro Grasso and LEADER are leaders in the co...
3,"One of the ethnic groups in COUNTRY , where th..."
4,CAPITAL is the capital of the COUNTRY . This c...
5,"country is also known as ALTERNATIVENAME , COU..."
6,FOOD FOOD FOOD is in FOOD .
7,COUNTRY ' s leader is LEADER .
8,CAPITAL is the capital of the COUNTRY .
9,Luanda is located in COUNTRY .


In [None]:
val_sample_u

Unnamed: 0,rdf_triple,ref_text
0,FOOD ingredient INGREDIENT FOOD mainIngredient...,The ingredients of binignit include taro and s...
1,Dessert dishVariation DISHVARIATION\n,DISHVARIATION is a type of dessert .
2,FOOD country COUNTRY COUNTRY leader Pietro Gra...,FOOD originates from COUNTRY whose capital cit...
3,COUNTRY leader LEADER FOOD country COUNTRY REG...,FOOD is a dish from REGION and COUNTRY . REGIO...
4,REGION capital CAPITAL REGION language LANGUAG...,FOOD is a Chinese style dish from REGION ( cap...
5,FOOD country COUNTRY FOOD alternativeName ALTE...,FOOD ( or ALTERNATIVENAME ) contains the ingre...
6,FOOD mainIngredient MAININGREDIENT\n,MAININGREDIENT is a main ingredient in FOOD .
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,FOOD is a dessert found in COUNTRY . The leade...
8,FOOD country COUNTRY COUNTRY capital CAPITAL\n,FOOD comes from the COUNTRY where the capital ...
9,COUNTRY leader Felipe VI of COUNTRY FOOD regio...,"FOOD is from the REGION region in COUNTRY , th..."


In [None]:
prediction_df_tr_u  = pd.DataFrame(columns=['rdf_triple', 'prediction_text'] )
prediction_df_tr_u.rdf_triple = val_sample_u.rdf_triple.values
prediction_df_tr_u.prediction_text = df_pred_tr_u.text.values
prediction_df_tr_u

Unnamed: 0,rdf_triple,prediction_text
0,FOOD ingredient INGREDIENT FOOD mainIngredient...,The surface of the order ORDER is made of ORDER .
1,Dessert dishVariation DISHVARIATION\n,Dessert Dessert Dessert is in Dessert .
2,FOOD country COUNTRY COUNTRY leader Pietro Gra...,Pietro Grasso and LEADER are leaders in the co...
3,COUNTRY leader LEADER FOOD country COUNTRY REG...,"One of the ethnic groups in COUNTRY , where th..."
4,REGION capital CAPITAL REGION language LANGUAG...,CAPITAL is the capital of the COUNTRY . This c...
5,FOOD country COUNTRY FOOD alternativeName ALTE...,"country is also known as ALTERNATIVENAME , COU..."
6,FOOD mainIngredient MAININGREDIENT\n,FOOD FOOD FOOD is in FOOD .
7,FOOD country COUNTRY COUNTRY leader LEADER FOO...,COUNTRY ' s leader is LEADER .
8,FOOD country COUNTRY COUNTRY capital CAPITAL\n,CAPITAL is the capital of the COUNTRY .
9,COUNTRY leader Felipe VI of COUNTRY FOOD regio...,Luanda is located in COUNTRY .


In [None]:
text_comparation_tr_u = pd.DataFrame(columns=['ref_text', 'prediction_text'] )
text_comparation_tr_u.ref_text = val_sample_u.ref_text.values
text_comparation_tr_u.prediction_text = df_pred_tr_u.text.values
text_comparation_tr_u

Unnamed: 0,ref_text,prediction_text
0,The ingredients of binignit include taro and s...,The surface of the order ORDER is made of ORDER .
1,DISHVARIATION is a type of dessert .,Dessert Dessert Dessert is in Dessert .
2,FOOD originates from COUNTRY whose capital cit...,Pietro Grasso and LEADER are leaders in the co...
3,FOOD is a dish from REGION and COUNTRY . REGIO...,"One of the ethnic groups in COUNTRY , where th..."
4,FOOD is a Chinese style dish from REGION ( cap...,CAPITAL is the capital of the COUNTRY . This c...
5,FOOD ( or ALTERNATIVENAME ) contains the ingre...,"country is also known as ALTERNATIVENAME , COU..."
6,MAININGREDIENT is a main ingredient in FOOD .,FOOD FOOD FOOD is in FOOD .
7,FOOD is a dessert found in COUNTRY . The leade...,COUNTRY ' s leader is LEADER .
8,FOOD comes from the COUNTRY where the capital ...,CAPITAL is the capital of the COUNTRY .
9,"FOOD is from the REGION region in COUNTRY , th...",Luanda is located in COUNTRY .
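To compare the two UNSEEN models at a glance, the headline numbers reported in this section can be gathered in one place. This is only a sketch: `unseen_results` and `best_model` are hypothetical names, and the values are copied from the outputs above and rounded to four decimals.

```python
# Values copied from the cell outputs in this section (higher is better for all of them).
unseen_results = {
    'lstm': {
        'bleu': 21.20, 'rouge_1_f': 0.4590, 'rouge_l_f': 0.3834, 'bertscore_f1': 0.8857,
    },
    'transformer': {
        'bleu': 25.50, 'rouge_1_f': 0.4312, 'rouge_l_f': 0.3755, 'bertscore_f1': 0.8879,
    },
}

def best_model(results, metric):
    """Return the name of the model with the highest value for the given metric."""
    return max(results, key=lambda name: results[name][metric])

for metric in ('bleu', 'rouge_1_f', 'rouge_l_f', 'bertscore_f1'):
    values = ", ".join(f"{name}={scores[metric]}" for name, scores in unseen_results.items())
    print(f"{metric:<13} {values}  -> best: {best_model(unseen_results, metric)}")
```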
