# Probability Calibration in KG Embedding
This experiment investigates which calibration technique is most suitable for a given dataset and KG embedding model.

In this experiment, we compare the performance of 4 typical calibration techniques for 4 KGE models on 3 datasets:
- Calibration techniques
  - Platt Scaling
  - Isotonic Regression
  - Histogram Binning
  - Beta Calibration
- KG Embedding models
  - TransE
  - ComplEx
  - DistMult
  - HolE
- Datasets
  - FB13k
  - WN11
  - YAGO39

The code below additionally includes an uncalibrated baseline, temperature scaling, and four smaller datasets (DBpedia50, Nations, Kinship, UMLS).
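
As a rough illustration of what two of the listed techniques do, the sketch below fits Platt scaling and histogram binning on toy scores using only the standard library. It is not the `probcalkge` implementation: the function names (`fit_platt`, `fit_histogram_binning`), the toy data, and the hyperparameters are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fit_platt(scores, labels, lr=0.1, n_iter=2000):
    """Fit p = sigmoid(a * score + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(n_iter):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def fit_histogram_binning(scores, labels, n_bins=5):
    """Equal-width bins over the score range; each bin predicts its positive rate."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width if all scores are equal
    pos = [0] * n_bins
    tot = [0] * n_bins
    for s, y in zip(scores, labels):
        i = min(int((s - lo) / width), n_bins - 1)
        pos[i] += y
        tot[i] += 1
    # Empty bins fall back to 0.5 (an arbitrary choice for this sketch).
    rates = [p / t if t else 0.5 for p, t in zip(pos, tot)]

    def predict(s):
        i = min(max(int((s - lo) / width), 0), n_bins - 1)
        return rates[i]

    return predict


# Made-up triple scores with binary labels (1 = true triple, 0 = corrupted triple).
scores = [-3.0, -2.0, -1.5, -0.5, 0.0, 0.5, 1.0, 2.0, 2.5, 3.0]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(scores, labels)
platt_probs = [sigmoid(a * s + b) for s in scores]

hist = fit_histogram_binning(scores, labels)
hist_probs = [hist(s) for s in scores]
```

Platt scaling produces a smooth, monotone mapping with two parameters, whereas histogram binning is piecewise constant and need not be monotone. In practice both would be fitted on a held-out validation split rather than on the data they are evaluated on.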

In [1]:
import sys
# enable importing the modules from probcalkge
sys.path.append('../')
sys.path.append('../probcalkge')

In [2]:
import importlib
from pprint import pprint
import numpy as np
import pandas as pd

In [3]:
from ampligraph.latent_features import RandomBaseline, TransE
import probcalkge
importlib.reload(probcalkge)
from probcalkge import Experiment
from probcalkge import get_calibrators
from probcalkge import get_datasets, get_fb13, get_wn11, get_kgemodels, get_yago39
from probcalkge import brier_score, negative_log_loss, ks_error

In [4]:
ds = get_datasets()
cals = get_calibrators()
kges = get_kgemodels()




In [5]:
exp = Experiment(
    cals=[cals.uncal, cals.platt, cals.isot, cals.histbin, cals.beta, cals.temperature], 
    datasets=[ds.fb13, ds.wn18, ds.yago39, ds.dp50, ds.nations, ds.kinship, ds.umls], 
    kges=[kges.transE, kges.complEx, kges.distMult, kges.hoLE], 
    metrics=[brier_score, negative_log_loss, ks_error]
    )
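
For reference, the three metrics passed to `Experiment` can be sketched in plain Python as follows. This is an assumption about what they compute, not the `probcalkge` source: `brier` and `nll` follow the standard definitions, and `ks_calibration_error` assumes the Kolmogorov-Smirnov style error (the largest gap between cumulative predicted probability and cumulative observed positives after sorting by prediction). The function names and toy inputs are hypothetical.

```python
import math


def brier(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def nll(y_true, y_prob, eps=1e-15):
    """Average negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def ks_calibration_error(y_true, y_prob):
    """Max gap between cumulative predictions and cumulative labels (sorted by prediction)."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    cum_p = cum_y = 0.0
    worst = 0.0
    for p, y in pairs:
        cum_p += p / n
        cum_y += y / n
        worst = max(worst, abs(cum_p - cum_y))
    return worst


# Toy labels and predicted probabilities (made up).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.6, 0.9]

scores_by_metric = {
    "brier": brier(y_true, y_prob),
    "nll": nll(y_true, y_prob),
    "ks": ks_calibration_error(y_true, y_prob),
}
```

All three are lower-is-better, which is how the values in the result table should be read.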

In [7]:
# exp.load_trained_kges('../saved_models/')
exp.train_kges()

training TransE on FB13k ...


Average TransE Loss:   1.088411: 100%|██████████| 100/100 [19:01<00:00, 11.42s/epoch]


training TransE on WN11 ...


Average TransE Loss:   0.960336: 100%|██████████| 100/100 [08:26<00:00,  5.07s/epoch]


training TransE on YAGO39 ...


Average TransE Loss:   0.907999: 100%|██████████| 100/100 [13:49<00:00,  8.30s/epoch]


training TransE on DBpedia50 ...


Average TransE Loss:   1.025704: 100%|██████████| 100/100 [04:55<00:00,  2.95s/epoch]


training TransE on Nations ...


Average TransE Loss:   1.370893: 100%|██████████| 100/100 [00:07<00:00, 13.40epoch/s]


training TransE on Kinship ...


Average TransE Loss:   1.365408: 100%|██████████| 100/100 [00:11<00:00,  8.50epoch/s]


training TransE on UMLS ...


Average TransE Loss:   1.204408: 100%|██████████| 100/100 [00:10<00:00,  9.22epoch/s]


training ComplEx on FB13k ...


Average ComplEx Loss:   0.191410: 100%|██████████| 100/100 [44:20<00:00, 26.60s/epoch]


training ComplEx on WN11 ...


Average ComplEx Loss:   0.008652: 100%|██████████| 100/100 [18:57<00:00, 11.38s/epoch]


training ComplEx on YAGO39 ...


Average ComplEx Loss:   0.064476: 100%|██████████| 100/100 [35:39<00:00, 21.39s/epoch]


training ComplEx on DBpedia50 ...


Average ComplEx Loss:   0.088023: 100%|██████████| 100/100 [10:22<00:00,  6.22s/epoch]


training ComplEx on Nations ...


Average ComplEx Loss:   0.739790: 100%|██████████| 100/100 [00:11<00:00,  8.43epoch/s]


training ComplEx on Kinship ...


Average ComplEx Loss:   0.312642: 100%|██████████| 100/100 [00:28<00:00,  3.53epoch/s]


training ComplEx on UMLS ...


Average ComplEx Loss:   0.423243: 100%|██████████| 100/100 [00:24<00:00,  4.09epoch/s]


training DistMult on FB13k ...


Average DistMult Loss:   0.209913: 100%|██████████| 100/100 [18:47<00:00, 11.27s/epoch]


training DistMult on WN11 ...


Average DistMult Loss:   0.021578: 100%|██████████| 100/100 [08:40<00:00,  5.21s/epoch]


training DistMult on YAGO39 ...


Average DistMult Loss:   0.106434: 100%|██████████| 100/100 [13:53<00:00,  8.33s/epoch]


training DistMult on DBpedia50 ...


Average DistMult Loss:   0.281309: 100%|██████████| 100/100 [05:01<00:00,  3.01s/epoch]


training DistMult on Nations ...


Average DistMult Loss:   0.955975: 100%|██████████| 100/100 [00:09<00:00, 10.87epoch/s]


training DistMult on Kinship ...


Average DistMult Loss:   0.467793: 100%|██████████| 100/100 [00:16<00:00,  5.96epoch/s]


training DistMult on UMLS ...


Average DistMult Loss:   0.485556: 100%|██████████| 100/100 [00:14<00:00,  6.97epoch/s]


training HolE on FB13k ...


Average HolE Loss:   0.722732: 100%|██████████| 100/100 [44:29<00:00, 26.69s/epoch]


training HolE on WN11 ...


Average HolE Loss:   0.723679: 100%|██████████| 100/100 [19:11<00:00, 11.52s/epoch]


training HolE on YAGO39 ...


Average HolE Loss:   0.326513: 100%|██████████| 100/100 [43:54<00:00, 26.35s/epoch]


training HolE on DBpedia50 ...


Average HolE Loss:   0.693485: 100%|██████████| 100/100 [10:34<00:00,  6.34s/epoch]


training HolE on Nations ...


Average HolE Loss:   0.952201: 100%|██████████| 100/100 [00:15<00:00,  6.41epoch/s]


training HolE on Kinship ...


Average HolE Loss:   0.476121: 100%|██████████| 100/100 [00:46<00:00,  2.16epoch/s]


training HolE on UMLS ...


Average HolE Loss:   0.731894: 100%|██████████| 100/100 [00:37<00:00,  2.68epoch/s]


In [11]:
exp.cals.append(cals.enir)
exp_res = exp.run_with_trained_kges()
# exp.save_trained_kges('../saved_models/new/')

training various calibrators for TransE on FB13k ...
True
training various calibrators for ComplEx on FB13k ...
True


ValueError: y_prob contains values greater than 1.

In [10]:
exp_res.to_frame()

dataset,kge,cal,metric,ExpRes
FB13k,TransE,UncalCalibrator,brier_score,0.241895
FB13k,TransE,UncalCalibrator,negative_log_loss,0.676001
FB13k,TransE,UncalCalibrator,ks_error,0.099063
FB13k,TransE,PlattCalibrator,brier_score,0.211441
FB13k,TransE,PlattCalibrator,negative_log_loss,0.614908
...,...,...,...,...
UMLS,HolE,BetaCalibrator,negative_log_loss,0.293783
UMLS,HolE,BetaCalibrator,ks_error,0.013610
UMLS,HolE,TemperatureCalibrator,brier_score,0.101561
UMLS,HolE,TemperatureCalibrator,negative_log_loss,0.351584


In [None]:
exp._train_cal_and_eval(exp.trained_kges['YAGO39']['ComplEx'], ds.yago39)

training various calibrators for ComplEx on YAGO39 ...


RuntimeError: all elements of input should be between 0 and 1
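
Both the `ValueError` and the `RuntimeError` above complain about values outside [0, 1], which suggests that at least one calibrator returns scores that are not valid probabilities. A minimal sketch of a guard follows, assuming the calibrated outputs are available as a flat sequence of floats; `count_out_of_range` and `clip_probs` are hypothetical helpers, not part of `probcalkge`.

```python
def count_out_of_range(probs):
    """Number of values that are not valid probabilities."""
    return sum(1 for p in probs if p < 0.0 or p > 1.0)


def clip_probs(probs, eps=1e-7):
    """Clamp every value into [eps, 1 - eps] so log-based metrics stay finite."""
    return [min(max(float(p), eps), 1.0 - eps) for p in probs]


# Made-up calibrator outputs with small numerical overshoot and one clear outlier.
raw = [-0.02, 0.3, 0.97, 1.0000001, 1.4]

n_bad = count_out_of_range(raw)
safe = clip_probs(raw)
```

Clipping only hides the symptom. If `n_bad` is large, or the values are far outside the range (like the `1.4` in the toy input), it is probably worth checking which calibrator produces them before trusting the resulting metrics.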