Novel Link Prediction
=====================
After training, the interaction model (e.g., TransE, ConvE, RotatE) can assign a score to an arbitrary triple,
whether or not it appeared in the training or testing data. In PyKEEN, each model is implemented such that the
higher the score (or the less negative the score), the more likely a triple is to be true.

However, for most models, these scores do not have obvious statistical interpretations. This has two main consequences:

1. The score for a triple from one model cannot be compared to the score for that triple from another model.
2. There is no *a priori* minimum score for a triple to be labeled as true, so predictions must be given as
   a prioritization by sorting a set of triples by their respective scores.
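
The two consequences above can be illustrated with a minimal sketch in plain Python. The scores below are invented for illustration and do not come from trained models; ``prioritize`` is a hypothetical helper, not part of PyKEEN.

```python
# Hypothetical scores for the same candidate triples from two different models.
# The numbers are made up for illustration; they are not outputs of trained models.
scores_model_a = {
    ("brazil", "intergovorgs", "uk"): -1.2,
    ("brazil", "intergovorgs", "usa"): -0.4,
    ("brazil", "intergovorgs", "china"): -3.1,
}
scores_model_b = {
    ("brazil", "intergovorgs", "uk"): 7.5,
    ("brazil", "intergovorgs", "usa"): 12.0,
    ("brazil", "intergovorgs", "china"): 2.3,
}


def prioritize(scores):
    """Return the triples sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


ranking_a = prioritize(scores_model_a)
ranking_b = prioritize(scores_model_b)

# The raw scores live on different scales and cannot be compared directly,
# but the orderings they induce can be.
print(ranking_a == ranking_b)
```

Neither ranking involves a cutoff: how many of the top-ranked triples to follow up on is left to the user.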

After training a model, there are three high-level interfaces for making predictions:

1. :func:`pykeen.models.Model.predict_tails` for a given head/relation pair
2. :func:`pykeen.models.Model.predict_heads` for a given relation/tail pair
3. :func:`pykeen.models.Model.predict_top_k_triples` for prioritizing links

Scientifically, :func:`pykeen.models.Model.predict_top_k_triples` is the most interesting in a scenario where
predictions could be tested and validated experimentally.
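
To make the idea of prioritizing links concrete, the following is a rough sketch of what a top-k search over candidate triples could look like. It is not PyKEEN's implementation: the embeddings are hand-picked toy values, ``score`` is a simple TransE-style function, and ``top_k_novel_triples`` is a hypothetical helper written for this illustration.

```python
import heapq
import itertools
import math

# Toy 2-dimensional embeddings; hand-picked values, not learned by any model.
entity_emb = {
    "brazil": (0.0, 0.0),
    "uk": (1.0, 0.0),
    "usa": (1.0, 1.0),
}
relation_emb = {
    "intergovorgs": (1.0, 0.0),
    "conferences": (0.0, 1.0),
}
# Triples assumed to be already known (e.g., seen during training).
known_triples = {("brazil", "intergovorgs", "uk")}


def score(h, r, t):
    """TransE-style score: negative Euclidean distance between h + r and t."""
    hv, rv, tv = entity_emb[h], relation_emb[r], entity_emb[t]
    return -math.dist([a + b for a, b in zip(hv, rv)], tv)


def top_k_novel_triples(k):
    """Return the k highest-scoring triples that are not already known."""
    candidates = (
        triple
        for triple in itertools.product(entity_emb, relation_emb, entity_emb)
        if triple not in known_triples
    )
    return heapq.nlargest(k, candidates, key=lambda triple: score(*triple))


for triple in top_k_novel_triples(3):
    print(triple, score(*triple))
```

Enumerating every head/relation/tail combination grows quickly with the number of entities, so a sketch like this is only practical for very small graphs.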

.. code-block:: python

    from pykeen.pipeline import pipeline

    # Train a RotatE model on the Nations dataset
    results = pipeline(dataset='Nations', model='RotatE')
    model = results.model

    # Predict tails for the head 'brazil' and the relation 'intergovorgs'
    predicted_tails_df = model.predict_tails('brazil', 'intergovorgs')

    # Predict heads for the relation 'conferences' and the tail 'brazil'
    predicted_heads_df = model.predict_heads('conferences', 'brazil')

Potential Caveats
-----------------
The model is trained on its ability to predict the appropriate tail for a given head/relation pair as well as its
ability to predict the appropriate head for a given relation/tail pair. This means that while the model can
technically predict relations between a given head/tail pair, it must be done with the caveat that it was not
trained for this task.
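
As a sketch of why relation prediction is technically possible, any function that scores a full triple can be evaluated once per relation for a fixed head/tail pair. The example below uses hand-picked toy embeddings and a TransE-style score; ``rank_relations`` is a hypothetical helper and not a PyKEEN function.

```python
import math

# Toy 2-dimensional embeddings; hand-picked values, not learned by any model.
entity_emb = {"brazil": (0.0, 0.0), "uk": (1.0, 0.0)}
relation_emb = {"intergovorgs": (1.0, 0.0), "conferences": (0.0, 1.0)}


def score(h, r, t):
    """TransE-style score: negative Euclidean distance between h + r and t."""
    hv, rv, tv = entity_emb[h], relation_emb[r], entity_emb[t]
    return -math.dist([a + b for a, b in zip(hv, rv)], tv)


def rank_relations(h, t):
    """Rank every relation for a fixed head/tail pair, highest score first."""
    return sorted(relation_emb, key=lambda r: score(h, r, t), reverse=True)


print(rank_relations("brazil", "uk"))
```

The ranking is well defined, but because training only optimized head and tail prediction, there is no guarantee that the ordering of relations is meaningful.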