# TransE

### Knowledge Graph

Let $KG = (V, E, L; \vdash)$ be a knowledge graph with a set of vertices $V$, a set of edges $E \subseteq V \times V$, a label function $L: V \cup E \to Lab$ that assigns labels from a set of labels $Lab$ to vertices and edges, and an inference relation $\vdash$.

**A knowledge graph embedding is a function** $f_\eta : L(V) \cup L(E) \to \mathbb{R}^n$. That is, the function takes elements of the set $L(V) \cup L(E) \subseteq Lab$ and maps them to vectors in $\mathbb{R}^n$, where $n$ is the _embedding size_.
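
One way to picture $f_\eta$ is as a lookup table from labels to vectors, where the parameters $\eta$ are the table entries. The sketch below is only an illustration in plain Python: the labels, the embedding size `n = 4`, and the helper name `make_embedding` are hypothetical, and the vectors are random rather than learned.

```python
import random

def make_embedding(labels, n, seed=0):
    """Return a function f: label -> vector in R^n backed by a lookup table."""
    rng = random.Random(seed)
    # One randomly initialised n-dimensional vector per label (the parameters eta).
    table = {label: [rng.uniform(-1.0, 1.0) for _ in range(n)] for label in labels}

    def f(label):
        return table[label]

    return f

# Hypothetical vertex and edge labels, i.e. elements of L(V) and L(E).
vertex_labels = ["protein_A", "protein_B"]
edge_labels = ["interacts_with"]

f = make_embedding(vertex_labels + edge_labels, n=4)
print(len(f("protein_A")))       # length of an entity vector
print(len(f("interacts_with")))  # relation labels live in the same space
```

Entities and relations share the same embedding size here, which is what allows them to be added together in TransE.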

### TransE idea
TransE models multi-relational data by representing relationships as **translations** in the embedding space.

Consider an edge in the graph of the form $(h, \ell, t)$, where $h$ is the head of the edge, $\ell$ is the relation type, and $t$ is the tail of the edge. Denote the corresponding embeddings by $\boldsymbol{h}$, $\boldsymbol{\ell}$ and $\boldsymbol{t}$. TransE learns the embeddings such that:
$$\boldsymbol{h} + \boldsymbol{\ell} \approx \boldsymbol{t}$$
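
To make the translation idea concrete, the sketch below checks $\boldsymbol{h} + \boldsymbol{\ell} \approx \boldsymbol{t}$ on hand-picked 2-dimensional vectors. The vectors and the helper names `translate` and `l2_distance` are made up for illustration, and the Euclidean distance is assumed as the measure of closeness.

```python
import math

def translate(h, l):
    """Translate the head embedding h by the relation embedding l."""
    return [hi + li for hi, li in zip(h, l)]

def l2_distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))

# Toy embeddings chosen by hand (not learned).
h = [1.0, 2.0]        # head
l = [0.5, -1.0]       # relation
t_good = [1.5, 1.0]   # tail that fits the translation exactly
t_bad = [-3.0, 5.0]   # tail that does not fit

print(l2_distance(translate(h, l), t_good))  # distance for the fitting tail
print(l2_distance(translate(h, l), t_bad))   # distance for the non-fitting tail
```

A trained model should yield a small distance for edges that hold in the graph and a large distance for edges that do not.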

### Objective function
TransE minimizes the following objective function:

$$
\mathcal{L}=\sum_{(h, \ell, t) \in S} \sum_{\left(h^{\prime}, \ell, t^{\prime}\right) \in S_{(h, \ell, t)}^{\prime}}\left[\gamma+d(\boldsymbol{h}+\boldsymbol{\ell}, \boldsymbol{t})-d\left(\boldsymbol{h}^{\prime}+\boldsymbol{\ell}, \boldsymbol{t}^{\prime}\right)\right]_{+}
$$

Here $S$ is the set of positive (training) triples, $[x]_{+} = \max(0, x)$ denotes the positive part of $x$, and $d(\boldsymbol{h}+\boldsymbol{\ell}, \boldsymbol{t})$ is the _dissimilarity_ score of a positive edge. Likewise, $d\left(\boldsymbol{h}^{\prime}+\boldsymbol{\ell}, \boldsymbol{t}^{\prime}\right)$ is the _dissimilarity_ score of a negative triple from $S_{(h, \ell, t)}^{\prime}$, the set of triples obtained by corrupting either the head or the tail (but not both) of the positive triple $(h, \ell, t)$. In this way, TransE favors low scores for positive edges and high scores for negative edges.
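
The corruption step can be sketched as follows. This is a minimal illustration, assuming the replacement entity is sampled uniformly; the function name `corrupt`, the entity names, and the choice to exclude the original entity from the candidates are assumptions, not the procedure of any particular library.

```python
import random

def corrupt(triple, entities, rng):
    """Return a negative triple by replacing either the head or the tail."""
    h, l, t = triple
    if rng.random() < 0.5:
        # Corrupt the head: pick any entity different from the original head.
        new_h = rng.choice([e for e in entities if e != h])
        return (new_h, l, t)
    # Corrupt the tail: pick any entity different from the original tail.
    new_t = rng.choice([e for e in entities if e != t])
    return (h, l, new_t)

entities = ["A", "B", "C", "D"]          # hypothetical entity names
positive = ("A", "interacts_with", "B")  # hypothetical positive triple
rng = random.Random(42)

negative = corrupt(positive, entities, rng)
print(negative)
```

The relation is never changed, and exactly one of the two entities is replaced. A triple produced this way may still happen to be a true edge of the graph; this sketch does not filter out such cases.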

The margin $\gamma > 0$ enforces that the score of a positive edge is lower than the score of each of its negative edges by at least $\gamma$. A positive/negative pair that already satisfies this contributes zero to the loss.
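
The effect of $\gamma$ is easiest to see on a single positive/negative pair. The sketch below evaluates one term $[\gamma + d_{pos} - d_{neg}]_{+}$ of the objective; the score values and the function name `margin_term` are illustrative assumptions.

```python
def margin_term(d_pos, d_neg, gamma):
    """One term of the TransE objective: max(0, gamma + d_pos - d_neg)."""
    return max(0.0, gamma + d_pos - d_neg)

gamma = 1.0

# The negative scores at least gamma higher than the positive: no loss.
satisfied = margin_term(d_pos=0.5, d_neg=2.0, gamma=gamma)

# The negative scores only slightly higher than the positive: margin violated.
violated = margin_term(d_pos=0.5, d_neg=1.0, gamma=gamma)

# The negative scores lower than the positive: larger penalty.
inverted = margin_term(d_pos=2.0, d_neg=0.5, gamma=gamma)

print(satisfied, violated, inverted)
```

Only pairs that violate the margin produce a non-zero term, so only those pairs drive updates to the embeddings.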

In [1]:
import sys
sys.path.append("../../../")

import torch as th
import logging

from mowl.datasets.ppi_yeast import PPIYeastSlimDataset

from mowl.embeddings.graph_based.translational.model import TranslationalOnt

ModuleNotFoundError: No module named 'mowl.embeddings.graph_based.translational'

In [4]:
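def main():
    # Load the yeast protein-protein interaction dataset.
    dataset = PPIYeastSlimDataset()

    # Build the TransE model; the ontology is projected into a graph
    # with the "taxonomy_rels" parsing method.
    model = OntTransE(dataset, parsing_method="taxonomy_rels")

    # Learn the embeddings, then compute rank-based metrics on the test triples.
    model.train()
    model.evaluate()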

In [6]:
main()

INFO: Number of ontology classes: 11020
INFO: Number of ontology classes: 3610


DEBUG:root:Traininig entities: 11020, relations 9. Testing entities: 3610, relations 1.
DEBUG:root:LEN OF TRAIN TRIPLES: 249064


Training epochs on cuda:   0%|          | 0/5 [00:00<?, ?epoch/s]

Training batches on cuda:   0%|          | 0/7784 [00:00<?, ?batch/s]


INFO: Number of ontology classes: 3610


DEBUG:root:LEN OF TEST TRIPLES: 12040
given. This means you probably forgot to pass (at least) the training triples. Try:

    additional_filter_triples=[dataset.training.mapped_triples]

Or if you want to use the Bordes et al. (2013) approach to filtering, do:

    additional_filter_triples=[
        dataset.training.mapped_triples,
        dataset.validation.mapped_triples,
    ]



Evaluating on cuda:   0%|          | 0.00/12.0k [00:00<?, ?triple/s]

INFO:pykeen.evaluation.evaluator:Evaluation took 1.17s seconds


RankBasedMetricResults(arithmetic_mean_rank={'tail': {'realistic': 275.99547342192693, 'optimistic': 275.9953488372093, 'pessimistic': 275.9955980066445}, 'head': {'realistic': 279.6311877076412, 'optimistic': 279.6310631229236, 'pessimistic': 279.6313122923588}, 'both': {'realistic': 277.81333056478405, 'optimistic': 277.81320598006647, 'pessimistic': 277.81345514950164}}, geometric_mean_rank={'tail': {'realistic': 79.81777770898599, 'optimistic': 79.81777163382525, 'pessimistic': 79.81778378155316}, 'head': {'realistic': 78.97132022208281, 'optimistic': 78.97118571502506, 'pessimistic': 78.97145220471425}, 'both': {'realistic': 79.39342090419937, 'optimistic': 79.39335026974412, 'pessimistic': 79.39349026835346}}, median_rank={'tail': {'realistic': 84.0, 'optimistic': 84.0, 'pessimistic': 84.0}, 'head': {'realistic': 86.0, 'optimistic': 86.0, 'pessimistic': 86.0}, 'both': {'realistic': 85.0, 'optimistic': 85.0, 'pessimistic': 85.0}}, harmonic_mean_rank={'tail': {'realistic': 27.35001