<a href="https://colab.research.google.com/github/RiccardoCozzi96/DeepComedy/blob/main/Results_comparison.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Results Comparison

We provide this notebook for:
* Inspecting a single canto by specifying model and temperature
* Reading the results of all the generations of all the models
* Finding the best performance for each metric (hendecasyllables, rhymes, etc.)
* Finding the overall best models



## Setup

In [13]:
import pandas as pd
import numpy as np
import json

# the generated cantos, the dataset and the results are stored in our GitHub repository
import sys
!git clone "https://github.com/RiccardoCozzi96/DeepComedy"

fatal: destination path 'DeepComedy' already exists and is not an empty directory.


In [14]:
path = "DeepComedy/generated cantos/"
comedy_filename = "DeepComedy/datasets/commedia.txt"
results_filename = "DeepComedy/results.csv"

### Some useful functions

In [15]:
def print_model(model_string, temp=None):
  # model names encode the hyperparameters, separated by underscores
  m = model_string.split("_")
  print("\nencoders\t{}\ndecoders\t{}\ndff     \t{}\nd_model \t{}\nheads   \t{}\nprod epochs\t{}\ncomedy epochs\t{}"
        .format(m[0], m[1], m[2], m[3], m[4], m[5], m[6]))
  if temp is not None:
    print("temperature\t{}".format(temp))

def get_text(model, temp):
  # open the generation log of the model and return the canto generated at the given temperature
  with open(f"{path}{model}/LOG_{model}.json") as log_file:
    return json.load(log_file)["generations"]["temp_" + temp]
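
Since every model name packs its hyperparameters into one underscore-separated string, it can be handy to turn that string into a dictionary. The sketch below is a hypothetical helper (`parse_model_name` and `FIELDS` are not part of the notebook); it assumes the field order printed by `print_model` and that every field is an integer.

```python
# Hypothetical helper, not part of the original notebook.
# Field order is assumed to match the one printed by print_model.
FIELDS = ("encoders", "decoders", "dff", "d_model", "heads",
          "prod_epochs", "comedy_epochs")

def parse_model_name(model_string):
    """Split an underscore-separated model name into a dict of integers."""
    values = model_string.split("_")
    if len(values) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(values)}: {model_string!r}"
        )
    return dict(zip(FIELDS, map(int, values)))

config = parse_model_name("5_5_256_512_4_0_150")
print(config["encoders"], config["comedy_epochs"])
```

A dictionary like this would make it easier to group or filter results by a single hyperparameter, for example the number of decoders.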

## Load data

In [16]:
data = pd.read_csv(results_filename, index_col=["id"]) 

### Ignore unhelpful scores
Some of the metrics we tried to implement are not useful or not accurate enough, so we ignore them for the moment.

In [17]:
# TEMP: rename "hendec" to "hendec_ratio"
data["hendec_ratio"] = data["hendec"]
data = data.drop(columns=["hendec"])

# TEMP: the average-syllable score is not used for now
# data["hendec_correctness"] = [1 - score for score in data["avg_syl"].values]
data = data.drop(columns=["avg_syl"])

# TEMP: turn the incorrectness score into a correctness score (1 + incorr)
data["word_correctness"] = 1 + data["incorr"]
data = data.drop(columns=["incorr"])

# TEMP: metrics to be ignored
data = data.drop(columns=["plagiar", "n_vers"])

data

id,model,temperature,struct,rhymes,hendec_ratio,word_correctness
0,1_1_256_512_4_0_150,0.5,1.000,0.891,0.87,0.56
1,1_1_256_512_4_0_150,0.6,1.000,0.891,0.90,0.57
2,1_1_256_512_4_0_150,0.7,1.000,0.875,0.90,0.55
3,1_1_256_512_4_0_150,0.8,1.000,0.828,0.92,0.68
4,1_1_256_512_4_0_150,0.9,1.000,0.891,0.90,0.49
...,...,...,...,...,...,...
237,7_7_256_512_4_0_150,1.1,1.000,0.000,0.89,0.72
238,7_7_256_512_4_0_150,1.2,1.000,0.047,0.85,0.68
239,7_7_256_512_4_0_150,1.3,1.000,0.062,0.85,0.62
240,7_7_256_512_4_0_150,1.4,0.917,0.031,0.85,0.65


## Inspect a generated canto 

In [18]:
model = "1_7_256_512_4_0_150"
temp = "0.5"

# show model info and generated text
print_model(model, temp)
print(get_text(model, temp))

# print scores of the selected canto
data[(data["model"] == model) & (data["temperature"] == float(temp))]



encoders	1
decoders	7
dff     	256
d_model 	512
heads   	4
prod epochs	0
comedy epochs	150
temperature	0.5

àhi quànto a dir qual èra è còsa?.
ed elli a me: li spirti ché vien canta;
perche' qui si può dir più óltre riposa.

io t'ho vegnate, ché da se' mandanta,
nón l'aspetto se' tu de la tua ritesta;
ma perche' se' fàtta ha più ti dicanta.

rispuos'io lui: ciò ché tu se' dimostrasta
nón m'è il piànto di cotal ché qui méco
còl sàngue sùo e de la ménte più grasta.

per men vo di là dal móndo se cieco
dél càrro è più di sópra ché si puote
ciò ché 'l móndo nón salir nón regeco.

ma se dal móndo sùo voler vedete,
la càrca mìa memòria la vìsta spenta,
da tutta presa, e pòi conosce piete.

al quàle mi grazia, ché sì s'intenta,
che, se nón fossi, al sapere, scésa
per li raper esser più dov'è aventata.

qui si rivolge, e qui si riscésa,
sòn li tuoi dover, da quella selvaggia
parèa dél tùo parlar sarebbe risposa.

nón lascia mìa dònna cón a più caggia
sóvra pigliàre occhi, e nón aver latesta,


id,model,temperature,struct,rhymes,hendec_ratio,word_correctness
33,1_7_256_512_4_0_150,0.5,1.0,0.938,0.95,0.52
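
The lookup above compares the temperature column with a float parsed from a string. Exact equality works for the values stored here, but a tolerance-based comparison is more robust if temperatures are ever computed rather than typed. A minimal sketch on a toy table follows; `select_canto` and `toy` are hypothetical names, and the three rows are copied from the outputs of this notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the results table; values copied from rows shown in this notebook.
toy = pd.DataFrame(
    {
        "model": ["1_7_256_512_4_0_150", "1_7_256_512_4_0_150", "1_1_256_512_4_0_150"],
        "temperature": [0.5, 0.8, 0.5],
        "rhymes": [0.938, 0.984, 0.891],
    }
)

def select_canto(frame, model, temp):
    """Return the rows of one model at one temperature, tolerating float round-off."""
    mask = (frame["model"] == model) & np.isclose(frame["temperature"], float(temp))
    return frame[mask]

selected = select_canto(toy, "1_7_256_512_4_0_150", "0.5")
print(selected)
```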


## Selecting the best model for each metric


In [19]:
limit = 5

# metric columns: everything after "model" and "temperature"
attributes = data.columns[2:]
best_ids = []

# extract the top `limit` generations for each metric
bests = {}
for attribute in attributes:
  bests[attribute] = data.sort_values(by=attribute, ascending=False)[:limit]
  best_ids.extend(bests[attribute].index)

for attribute in bests:
  print("\n\ntop of {}:\n{}\n".format(attribute, "="*80))
  print(bests[attribute][["model", "temperature", attribute]])




top of struct:

                     model  temperature  struct
id                                             
0      1_1_256_512_4_0_150          0.5     1.0
149     5_5_256_512_4_0_70          1.1     1.0
138  5_3_256_512_4_150_150          1.1     1.0
139  5_3_256_512_4_150_150          1.2     1.0
140  5_3_256_512_4_150_150          1.3     1.0


top of rhymes:

                   model  temperature  rhymes
id                                           
36   1_7_256_512_4_0_150          0.8   0.984
143   5_5_256_512_4_0_70          0.5   0.969
25   1_5_256_512_4_0_150          0.8   0.969
114  5_3_256_512_4_0_150          0.9   0.969
169  5_5_256_512_4_70_70          0.9   0.969


top of hendec_ratio:

                     model  temperature  hendec_ratio
id                                                   
44     3_1_256_512_4_0_150          0.5          1.00
83     3_5_256_512_4_0_150          1.1          0.97
68   3_3_256_512_4_150_150          0.7          0.97
37     1_7_2
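
The same per-metric ranking can also be written with `DataFrame.nlargest`, which avoids sorting the whole table. This is only a sketch on a toy subset: `top_k` and `toy` are hypothetical names, and the five rows are copied from the winners table further down in this notebook. Rows with tied scores may come out in a different order than with `sort_values`.

```python
import pandas as pd

# Toy subset of the results table; rows copied from the outputs of this notebook.
toy = pd.DataFrame(
    {
        "model": [
            "5_5_256_512_4_0_150",
            "5_5_256_512_4_0_70",
            "5_5_256_512_4_0_150",
            "5_5_256_512_4_70_70",
            "1_5_256_512_4_0_150",
        ],
        "temperature": [0.5, 0.5, 0.6, 0.9, 0.8],
        "rhymes": [0.922, 0.969, 0.922, 0.969, 0.969],
        "hendec_ratio": [0.96, 0.90, 0.96, 0.93, 0.93],
    },
    index=pd.Index([154, 143, 155, 169, 25], name="id"),
)

def top_k(frame, metric, k=5):
    """Return the k rows with the highest value of `metric`."""
    return frame.nlargest(k, metric)[["model", "temperature", metric]]

for metric in ["rhymes", "hendec_ratio"]:
    print("\ntop of {}:".format(metric))
    print(top_k(toy, metric, k=2))
```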

## Find the best generated text by summing all the scores


In [20]:
# sum the four metric scores of each generation (all metrics weigh the same)
score_columns = ["struct", "hendec_ratio", "rhymes", "word_correctness"]

# reset_index keeps a positional index, which the following cells pass to data.iloc
winners = data[["model", "temperature"]].reset_index(drop=True)
winners["final_score"] = data[score_columns].sum(axis=1).values

winners = winners.sort_values(by="final_score", ascending=False)[:limit]
winners

Unnamed: 0,model,temperature,final_score
154,5_5_256_512_4_0_150,0.5,3.532
143,5_5_256_512_4_0_70,0.5,3.529
155,5_5_256_512_4_0_150,0.6,3.512
169,5_5_256_512_4_70_70,0.9,3.509
25,1_5_256_512_4_0_150,0.8,3.479
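
As a sanity check, the final score is simply the unweighted sum of the four metrics. The sketch below recomputes it for the two top rows; the `scores` table is a toy stand-in, with the per-metric values copied from the rows with ids 154 and 143 in this notebook.

```python
import pandas as pd

# Per-metric values of the two top generations, copied from this notebook.
scores = pd.DataFrame(
    {
        "struct": [1.0, 1.0],
        "rhymes": [0.922, 0.969],
        "hendec_ratio": [0.96, 0.90],
        "word_correctness": [0.65, 0.66],
    },
    index=pd.Index([154, 143], name="id"),
)

# Row-wise sum of all metric columns, rounded to hide float round-off.
final_score = scores.sum(axis=1).round(3)
print(final_score)
```

The rounded sums match the `final_score` values reported for these two rows (3.532 and 3.529).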


## And the winners are...


In [21]:
data.iloc[winners.index.values]

id,model,temperature,struct,rhymes,hendec_ratio,word_correctness
154,5_5_256_512_4_0_150,0.5,1.0,0.922,0.96,0.65
143,5_5_256_512_4_0_70,0.5,1.0,0.969,0.9,0.66
155,5_5_256_512_4_0_150,0.6,1.0,0.922,0.96,0.63
169,5_5_256_512_4_70_70,0.9,1.0,0.969,0.93,0.61
25,1_5_256_512_4_0_150,0.8,1.0,0.969,0.93,0.58


In [22]:
# `position` is the positional index of the generation in data (avoids shadowing the builtin `id`)
for position, winner in winners.iterrows():
  model = winner["model"]
  temp = str(winner["temperature"])
  score = winner["final_score"]

  print("\n{}\nSCORE: {}\n".format("="*80, score))
  print_model(model, temp)
  print("\n{}\n{}\n{}\n".format("-"*40, data.iloc[position], "-"*40))
  print(get_text(model, temp))


SCORE: 3.532


encoders	5
decoders	5
dff     	256
d_model 	512
heads   	4
prod epochs	0
comedy epochs	150
temperature	0.5

----------------------------------------
model               5_5_256_512_4_0_150
temperature                         0.5
struct                                1
rhymes                            0.922
hendec_ratio                       0.96
word_correctness                   0.65
Name: 154, dtype: object
----------------------------------------


àhi quànto a dir qual èra è còsa dùra
ésta prote, e tal lòde óltre pensa,
ché 'ntrati e vecchie, da l'altra richiusi alsura

tòsto ché ne la mìa vòglia richiusa,
ancor ti tolse, e cón le man beatrice
quàndo vegnon posto, ancor mi fìa musa.

i' dìco dopo i nostri mìlle passi mìlle
quàndo la maria, e nón mólto mandura,
mólto sarà di dio, quàndo le spalle!.

o buòn maestro mio, buòn ti madura,
se tu vuo' saperr, di ciò ch'io disiri
per far sùo lòco óve tèmpo s'attura;

sì che, per dilazion dél tùo disiri
la fàccia mìa mente,