PyKEEN

PyKEEN (Python KnowlEdge EmbeddiNgs) is a Python package designed to train and evaluate knowledge graph embedding models (incorporating multi-modal information).

Installation • Quickstart • Datasets (37) • Inductive Datasets (5) • Models (40) • Support • Citation

Installation

The latest stable version of PyKEEN requires Python 3.9+. It can be downloaded and installed from PyPI with:

pip install pykeen

The latest version of PyKEEN can be installed directly from the source code on GitHub with:

pip install git+https://github.com/pykeen/pykeen.git

More information about installation (e.g., development mode, Windows installation, Colab, Kaggle, extras) can be found in the installation documentation.

Quickstart

This example shows how to train a model on a dataset and evaluate it on that dataset's held-out test split.

The fastest way to get up and running is to use the pipeline function. It provides a high-level entry into the extensible functionality of this package. The following example shows how to train and evaluate the TransE model on the Nations dataset. By default, the training loop uses the stochastic local closed world assumption (sLCWA) training approach and evaluates with rank-based evaluation.

from pykeen.pipeline import pipeline

result = pipeline(
    model='TransE',
    dataset='nations',
)

The results are returned in an instance of the PipelineResult dataclass that has attributes for the trained model, the training loop, the evaluation, and more. See the tutorials on using your own dataset, understanding the evaluation, and making novel link predictions.
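
As a rough illustration of working with the returned object, the sketch below collects the pipeline arguments in a dictionary, runs the pipeline, and reads one metric. It assumes that `pipeline` accepts `training_kwargs` and `random_seed`, and that `PipelineResult` offers `get_metric` and `save_to_directory`, as described in the PyKEEN documentation. The helper names `make_pipeline_kwargs` and `train_and_report` and the output directory are made up for this example.

```python
from typing import Any


def make_pipeline_kwargs(
    model: str,
    dataset: str,
    num_epochs: int = 5,
    seed: int = 42,
) -> dict[str, Any]:
    """Collect keyword arguments for pykeen.pipeline.pipeline in one place.

    Hypothetical helper: it only builds a dictionary and does not touch PyKEEN.
    """
    return {
        "model": model,
        "dataset": dataset,
        "training_kwargs": {"num_epochs": num_epochs},
        "random_seed": seed,
    }


def train_and_report(kwargs: dict[str, Any], output_dir: str) -> float:
    """Run the pipeline, store its artifacts, and return hits@10 (hypothetical helper)."""
    # Imported here so that building the configuration does not require PyKEEN.
    from pykeen.pipeline import pipeline

    result = pipeline(**kwargs)
    # Assumed to write the trained model and the results to output_dir.
    result.save_to_directory(output_dir)
    return result.get_metric("hits@10")
```

Calling `train_and_report(make_pipeline_kwargs("TransE", "nations"), "nations_transe")` would then train for five epochs and write the artifacts to the `nations_transe` directory.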

PyKEEN is extensible such that:

Each model has the same API, so anything from pykeen.models can be dropped in
Each training loop has the same API, so pykeen.training.LCWATrainingLoop can be dropped in
Triples factories can be generated by the user with pykeen.triples.TriplesFactory
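
The three points above might be combined roughly as follows. This is a sketch, assuming that `TriplesFactory.from_labeled_triples` accepts an array of string triples, that `TriplesFactory.split` accepts a list of ratios, and that `pipeline` accepts `training`, `testing`, and `training_loop`. The function names and the tab-separated input format are choices made for this example.

```python
def parse_tsv_triples(text: str) -> list[tuple[str, str, str]]:
    """Parse tab-separated lines of head, relation, tail into label triples."""
    triples = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        parts = line.split("\t")
        if len(parts) != 3:
            raise ValueError(f"line {line_number}: expected 3 columns, got {len(parts)}")
        triples.append((parts[0], parts[1], parts[2]))
    return triples


def train_on_custom_triples(text: str, model: str = "RotatE"):
    """Train the named model on user-supplied triples (hypothetical helper)."""
    # Imported lazily so that the parser can be used without PyKEEN installed.
    import numpy as np
    from pykeen.pipeline import pipeline
    from pykeen.triples import TriplesFactory

    labeled = np.array(parse_tsv_triples(text), dtype=str)
    factory = TriplesFactory.from_labeled_triples(labeled)
    # An 80/20 split; the ratio and the seed are arbitrary choices.
    training, testing = factory.split([0.8, 0.2], random_state=0)
    return pipeline(
        training=training,
        testing=testing,
        model=model,           # any model name from pykeen.models
        training_loop="LCWA",  # instead of the default sLCWA
    )
```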

The full documentation can be found at https://pykeen.readthedocs.io.

Implementation

Below are the models, datasets, training modes, evaluators, and metrics implemented in pykeen.

Datasets

The following 37 datasets are built in to PyKEEN. The citation for each dataset corresponds to either the paper describing the dataset, the first paper published using the dataset with knowledge graph embedding models, or the URL for the dataset if neither of the first two is available. If you want to use a custom dataset, see the Bring Your Own Dataset tutorial. If you have a suggestion for another dataset to include in PyKEEN, please let us know here.

Name	Documentation	Citation	Entities	Relations	Triples
Aristo-v4	`pykeen.datasets.AristoV4`	Chen et al., 2021	42016	1593	279425
BioKG	`pykeen.datasets.BioKG`	Walsh et al., 2019	105524	17	2067997
Clinical Knowledge Graph	`pykeen.datasets.CKG`	Santos et al., 2020	7617419	11	26691525
CN3l Family	`pykeen.datasets.CN3l`	Chen et al., 2017	3206	42	21777
CoDEx (large)	`pykeen.datasets.CoDExLarge`	Safavi et al., 2020	77951	69	612437
CoDEx (medium)	`pykeen.datasets.CoDExMedium`	Safavi et al., 2020	17050	51	206205
CoDEx (small)	`pykeen.datasets.CoDExSmall`	Safavi et al., 2020	2034	42	36543
ConceptNet	`pykeen.datasets.ConceptNet`	Speer et al., 2017	28370083	50	34074917
Countries	`pykeen.datasets.Countries`	Bouchard et al., 2015	271	2	1158
Commonsense Knowledge Graph	`pykeen.datasets.CSKG`	Ilievski et al., 2020	2087833	58	4598728
DB100K	`pykeen.datasets.DB100K`	Ding et al., 2018	99604	470	697479
DBpedia50	`pykeen.datasets.DBpedia50`	Shi et al., 2017	24624	351	34421
Drug Repositioning Knowledge Graph	`pykeen.datasets.DRKG`	`gnn4dr/DRKG`	97238	107	5874257
FB15k	`pykeen.datasets.FB15k`	Bordes et al., 2013	14951	1345	592213
FB15k-237	`pykeen.datasets.FB15k237`	Toutanova et al., 2015	14505	237	310079
Global Biotic Interactions	`pykeen.datasets.Globi`	Poelen et al., 2014	404207	39	1966385
Hetionet	`pykeen.datasets.Hetionet`	Himmelstein et al., 2017	45158	24	2250197
Kinships	`pykeen.datasets.Kinships`	Kemp et al., 2006	104	25	10686
Nations	`pykeen.datasets.Nations`	`ZhenfengLei/KGDatasets`	14	55	1992
NationsL	`pykeen.datasets.NationsLiteral`	`pykeen/pykeen`	14	55	1992
OGB BioKG	`pykeen.datasets.OGBBioKG`	Hu et al., 2020	93773	51	5088434
OGB WikiKG2	`pykeen.datasets.OGBWikiKG2`	Hu et al., 2020	2500604	535	17137181
OpenBioLink	`pykeen.datasets.OpenBioLink`	Breit et al., 2020	180992	28	4563407
OpenBioLink LQ	`pykeen.datasets.OpenBioLinkLQ`	Breit et al., 2020	480876	32	27320889
OpenEA Family	`pykeen.datasets.OpenEA`	Sun et al., 2020	15000	248	38265
PharMeBINet	`pykeen.datasets.PharMeBINet`	Königs et al., 2022	2869407	208	15883653
PharmKG	`pykeen.datasets.PharmKG`	Zheng et al., 2020	188296	39	1093236
PharmKG8k	`pykeen.datasets.PharmKG8k`	Zheng et al., 2020	7247	28	485787
PrimeKG	`pykeen.datasets.PrimeKG`	Chandak et al., 2022	129375	30	8100498
Unified Medical Language System	`pykeen.datasets.UMLS`	`ZhenfengLei/KGDatasets`	135	46	6529
WD50K (triples)	`pykeen.datasets.WD50KT`	Galkin et al., 2020	40107	473	232344
Wikidata5M	`pykeen.datasets.Wikidata5M`	Wang et al., 2019	4594149	822	20624239
WK3l-120k Family	`pykeen.datasets.WK3l120k`	Chen et al., 2017	119748	3109	1375406
WK3l-15k Family	`pykeen.datasets.WK3l15k`	Chen et al., 2017	15126	1841	209041
WordNet-18	`pykeen.datasets.WN18`	Bordes et al., 2014	40943	18	151442
WordNet-18 (RR)	`pykeen.datasets.WN18RR`	Toutanova et al., 2015	40559	11	92583
YAGO3-10	`pykeen.datasets.YAGO310`	Mahdisoltani et al., 2015	123143	37	1089000

Inductive Datasets

The following 5 inductive datasets are built in to PyKEEN.

Name	Documentation	Citation
ILPC2022 Large	`pykeen.datasets.ILPC2022Large`	Galkin et al., 2022
ILPC2022 Small	`pykeen.datasets.ILPC2022Small`	Galkin et al., 2022
FB15k-237	`pykeen.datasets.InductiveFB15k237`	Teru et al., 2020
NELL	`pykeen.datasets.InductiveNELL`	Teru et al., 2020
WordNet-18 (RR)	`pykeen.datasets.InductiveWN18RR`	Teru et al., 2020

Representations

The following 22 representations are implemented by PyKEEN.

Name	Reference
Backfill	`pykeen.nn.BackfillRepresentation`
Text Encoding	`pykeen.nn.BiomedicalCURIERepresentation`
Combined	`pykeen.nn.CombinedRepresentation`
Embedding	`pykeen.nn.Embedding`
EmbeddingBag	`pykeen.nn.EmbeddingBagRepresentation`
Featurized Message Passing	`pykeen.nn.FeaturizedMessagePassingRepresentation`
Low Rank Embedding	`pykeen.nn.LowRankRepresentation`
Multi-Backfill	`pykeen.nn.MultiBackfillRepresentation`
NodePiece	`pykeen.nn.NodePieceRepresentation`
Partition	`pykeen.nn.PartitionRepresentation`
R-GCN	`pykeen.nn.RGCNRepresentation`
Simple Message Passing	`pykeen.nn.SimpleMessagePassingRepresentation`
CompGCN	`pykeen.nn.SingleCompGCNRepresentation`
Subset Representation	`pykeen.nn.SubsetRepresentation`
Tensor-Train	`pykeen.nn.TensorTrainRepresentation`
Text Encoding	`pykeen.nn.TextRepresentation`
Tokenization	`pykeen.nn.TokenizationRepresentation`
Transformed	`pykeen.nn.TransformedRepresentation`
Typed Message Passing	`pykeen.nn.TypedMessagePassingRepresentation`
Visual	`pykeen.nn.VisualRepresentation`
Wikidata Text Encoding	`pykeen.nn.WikidataTextRepresentation`
Wikidata Visual	`pykeen.nn.WikidataVisualRepresentation`

Interactions

The following 34 interactions are implemented by PyKEEN.

Name	Reference	Citation
AutoSF	`pykeen.nn.AutoSFInteraction`	Zhang et al., 2020
BoxE	`pykeen.nn.BoxEInteraction`	Abboud et al., 2020
ComplEx	`pykeen.nn.ComplExInteraction`	Trouillon et al., 2016
ConvE	`pykeen.nn.ConvEInteraction`	Dettmers et al., 2018
ConvKB	`pykeen.nn.ConvKBInteraction`	Nguyen et al., 2018
Canonical Tensor Decomposition	`pykeen.nn.CPInteraction`	Lacroix et al., 2018
CrossE	`pykeen.nn.CrossEInteraction`	Zhang et al., 2019
DistMA	`pykeen.nn.DistMAInteraction`	Shi et al., 2019
DistMult	`pykeen.nn.DistMultInteraction`	Yang et al., 2014
ER-MLP	`pykeen.nn.ERMLPInteraction`	Dong et al., 2014
ER-MLP (E)	`pykeen.nn.ERMLPEInteraction`	Sharifzadeh et al., 2019
HolE	`pykeen.nn.HolEInteraction`	Nickel et al., 2016
KG2E	`pykeen.nn.KG2EInteraction`	He et al., 2015
LineaRE	`pykeen.nn.LineaREInteraction`	Peng et al., 2020
MultiLinearTucker	`pykeen.nn.MultiLinearTuckerInteraction`	Tucker et al., 1966
MuRE	`pykeen.nn.MuREInteraction`	Balažević et al., 2019
NTN	`pykeen.nn.NTNInteraction`	Socher et al., 2013
PairRE	`pykeen.nn.PairREInteraction`	Chao et al., 2020
ProjE	`pykeen.nn.ProjEInteraction`	Shi et al., 2017
QuatE	`pykeen.nn.QuatEInteraction`	Zhang et al., 2019
RESCAL	`pykeen.nn.RESCALInteraction`	Nickel et al., 2011
RotatE	`pykeen.nn.RotatEInteraction`	Sun et al., 2019
Structured Embedding	`pykeen.nn.SEInteraction`	Bordes et al., 2011
SimplE	`pykeen.nn.SimplEInteraction`	Kazemi et al., 2018
TorusE	`pykeen.nn.TorusEInteraction`	Ebisu et al., 2018
TransD	`pykeen.nn.TransDInteraction`	Ji et al., 2015
TransE	`pykeen.nn.TransEInteraction`	Bordes et al., 2013
TransF	`pykeen.nn.TransFInteraction`	Feng et al., 2016
Transformer	`pykeen.nn.TransformerInteraction`	Galkin et al., 2020
TransH	`pykeen.nn.TransHInteraction`	Wang et al., 2014
TransR	`pykeen.nn.TransRInteraction`	Lin et al., 2015
TripleRE	`pykeen.nn.TripleREInteraction`	Yu et al., 2021
TuckER	`pykeen.nn.TuckERInteraction`	Balažević et al., 2019
Unstructured Model	`pykeen.nn.UMInteraction`	Bordes et al., 2014
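
To give a sense of what an interaction computes, here is a plain-Python sketch of two well-known scoring functions, TransE (negative distance between h + r and t) and DistMult (trilinear product). It is an illustration of the published formulas on lists of floats, not PyKEEN's tensor-based implementation, and the function names are made up for this example.

```python
def transe_score(h, r, t, p=2):
    """TransE: negative L_p distance between h + r and t (higher is more plausible)."""
    distance = sum(abs(hi + ri - ti) ** p for hi, ri, ti in zip(h, r, t)) ** (1.0 / p)
    return -distance


def distmult_score(h, r, t):
    """DistMult: sum over dimensions of the element-wise product h * r * t."""
    return sum(hi * ri * ti for hi, ri, ti in zip(h, r, t))


# A triple whose tail equals head + relation gets the best possible TransE score.
perfect = transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
worse = transe_score([0.0, 0.0], [0.0, 0.0], [3.0, 4.0])
```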

Models

The following 40 models are implemented by PyKEEN.

Name	Model	Citation
AutoSF	`pykeen.models.AutoSF`	Zhang et al., 2020
BoxE	`pykeen.models.BoxE`	Abboud et al., 2020
Canonical Tensor Decomposition	`pykeen.models.CP`	Lacroix et al., 2018
CompGCN	`pykeen.models.CompGCN`	Vashishth et al., 2020
ComplEx	`pykeen.models.ComplEx`	Trouillon et al., 2016
ComplEx Literal	`pykeen.models.ComplExLiteral`	Kristiadi et al., 2018
ConvE	`pykeen.models.ConvE`	Dettmers et al., 2018
ConvKB	`pykeen.models.ConvKB`	Nguyen et al., 2018
CooccurrenceFiltered	`pykeen.models.CooccurrenceFilteredModel`	Berrendorf et al., 2022
CrossE	`pykeen.models.CrossE`	Zhang et al., 2019
DistMA	`pykeen.models.DistMA`	Shi et al., 2019
DistMult	`pykeen.models.DistMult`	Yang et al., 2014
DistMult Literal	`pykeen.models.DistMultLiteral`	Kristiadi et al., 2018
DistMult Literal (Gated)	`pykeen.models.DistMultLiteralGated`	Kristiadi et al., 2018
ER-MLP	`pykeen.models.ERMLP`	Dong et al., 2014
ER-MLP (E)	`pykeen.models.ERMLPE`	Sharifzadeh et al., 2019
Fixed Model	`pykeen.models.FixedModel`	Berrendorf et al., 2021
HolE	`pykeen.models.HolE`	Nickel et al., 2016
InductiveNodePiece	`pykeen.models.InductiveNodePiece`	Galkin et al., 2021
InductiveNodePieceGNN	`pykeen.models.InductiveNodePieceGNN`	Galkin et al., 2021
KG2E	`pykeen.models.KG2E`	He et al., 2015
MuRE	`pykeen.models.MuRE`	Balažević et al., 2019
NTN	`pykeen.models.NTN`	Socher et al., 2013
NodePiece	`pykeen.models.NodePiece`	Galkin et al., 2021
PairRE	`pykeen.models.PairRE`	Chao et al., 2020
ProjE	`pykeen.models.ProjE`	Shi et al., 2017
QuatE	`pykeen.models.QuatE`	Zhang et al., 2019
R-GCN	`pykeen.models.RGCN`	Schlichtkrull et al., 2018
RESCAL	`pykeen.models.RESCAL`	Nickel et al., 2011
RotatE	`pykeen.models.RotatE`	Sun et al., 2019
SimplE	`pykeen.models.SimplE`	Kazemi et al., 2018
Structured Embedding	`pykeen.models.SE`	Bordes et al., 2011
TorusE	`pykeen.models.TorusE`	Ebisu et al., 2018
TransD	`pykeen.models.TransD`	Ji et al., 2015
TransE	`pykeen.models.TransE`	Bordes et al., 2013
TransF	`pykeen.models.TransF`	Feng et al., 2016
TransH	`pykeen.models.TransH`	Wang et al., 2014
TransR	`pykeen.models.TransR`	Lin et al., 2015
TuckER	`pykeen.models.TuckER`	Balažević et al., 2019
Unstructured Model	`pykeen.models.UM`	Bordes et al., 2014

Losses

The following 15 losses are implemented by PyKEEN.

Name	Reference	Description
Adversarially weighted binary cross entropy (with logits)	`pykeen.losses.AdversarialBCEWithLogitsLoss`	An adversarially weighted BCE loss.
Binary cross entropy (after sigmoid)	`pykeen.losses.BCEAfterSigmoidLoss`	The numerically unstable version of explicit Sigmoid + BCE loss.
Binary cross entropy (with logits)	`pykeen.losses.BCEWithLogitsLoss`	The binary cross entropy loss.
Cross entropy	`pykeen.losses.CrossEntropyLoss`	The cross entropy loss that evaluates the cross entropy after softmax output.
Double Margin	`pykeen.losses.DoubleMarginLoss`	A limit-based scoring loss, with separate margins for positive and negative elements from [sun2018]_.
Focal	`pykeen.losses.FocalLoss`	The focal loss proposed by [lin2018]_.
InfoNCE loss with additive margin	`pykeen.losses.InfoNCELoss`	The InfoNCE loss with additive margin proposed by [wang2022]_.
Margin ranking	`pykeen.losses.MarginRankingLoss`	The pairwise hinge loss (i.e., margin ranking loss).
Mean squared error	`pykeen.losses.MSELoss`	The mean squared error loss.
Self-adversarial negative sampling	`pykeen.losses.NSSALoss`	The self-adversarial negative sampling loss function proposed by [sun2019]_.
Pairwise logistic	`pykeen.losses.PairwiseLogisticLoss`	The pairwise logistic loss.
Pointwise Hinge	`pykeen.losses.PointwiseHingeLoss`	The pointwise hinge loss.
Soft margin ranking	`pykeen.losses.SoftMarginRankingLoss`	The soft pairwise hinge loss (i.e., soft margin ranking loss).
Softplus	`pykeen.losses.SoftplusLoss`	The pointwise logistic loss (i.e., softplus loss).
Soft Pointwise Hinge	`pykeen.losses.SoftPointwiseHingeLoss`	The soft pointwise hinge loss.
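
The pairwise losses in this table compare the score of a positive triple with the score of a negative one. The following sketch shows the margin ranking loss and its soft variant for a single pair, assuming the convention that a higher score means a more plausible triple. The function names and the default margin are chosen for this example and are not taken from PyKEEN.

```python
import math


def margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Pairwise hinge loss: zero once the positive beats the negative by the margin."""
    return max(0.0, margin + neg_score - pos_score)


def soft_margin_ranking_loss(pos_score, neg_score, margin=1.0):
    """Soft pairwise hinge loss: softplus replaces the hard max(0, x)."""
    x = margin + neg_score - pos_score
    # Numerically stable softplus, log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

The soft variant never reaches exactly zero, so it keeps a small gradient even for pairs that already satisfy the margin.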

Regularizers

The following 6 regularizers are implemented by PyKEEN.

Name	Reference	Description
combined	`pykeen.regularizers.CombinedRegularizer`	A convex combination of regularizers.
lp	`pykeen.regularizers.LpRegularizer`	A simple L_p norm based regularizer.
no	`pykeen.regularizers.NoRegularizer`	A regularizer which does not perform any regularization.
normlimit	`pykeen.regularizers.NormLimitRegularizer`	A regularizer which formulates a soft constraint on a maximum norm.
orthogonality	`pykeen.regularizers.OrthogonalityRegularizer`	A regularizer for the soft orthogonality constraints from [wang2014]_.
powersum	`pykeen.regularizers.PowerSumRegularizer`	A simple x^p based regularizer.
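
The difference between the norm-based and the power-sum penalties can be seen in a few lines. This is a generic sketch on a flat list of weights. It does not reproduce PyKEEN's normalization or weighting options, and the function names are made up for this example.

```python
def lp_penalty(weights, p=2.0):
    """L_p norm of the weights."""
    return sum(abs(w) ** p for w in weights) ** (1.0 / p)


def power_sum_penalty(weights, p=2.0):
    """Sum of |w|^p, i.e. the L_p norm without the final root."""
    return sum(abs(w) ** p for w in weights)


def norm_limit_penalty(weights, max_norm=1.0, p=2.0):
    """Soft constraint: penalize only the amount by which the norm exceeds max_norm."""
    return max(0.0, lp_penalty(weights, p) - max_norm)
```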

Training Loops

The following 3 training loops are implemented in PyKEEN.

Name	Reference	Description
lcwa	`pykeen.training.LCWATrainingLoop`	A training loop that is based upon the local closed world assumption (LCWA).
slcwa	`pykeen.training.SLCWATrainingLoop`	A training loop that uses the stochastic local closed world assumption training approach.
symmetriclcwa	`pykeen.training.SymmetricLCWATrainingLoop`	A "symmetric" LCWA training loop that scores heads and tails at once.
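
Under the local closed world assumption, every tail entity that was not observed together with a given head and relation is treated as a negative. A minimal sketch of how such training targets could be built from integer-encoded triples is shown below. It illustrates the idea only, and the function name and data layout are assumptions of this example rather than PyKEEN internals.

```python
from collections import defaultdict


def lcwa_targets(triples, num_entities):
    """Group (head, relation, tail) ID triples by (head, relation).

    Returns one multi-hot vector over all entities per (head, relation) pair:
    1.0 for observed tails, 0.0 for everything else.
    """
    observed_tails = defaultdict(set)
    for head, relation, tail in triples:
        observed_tails[(head, relation)].add(tail)
    return {
        pair: [1.0 if entity in tails else 0.0 for entity in range(num_entities)]
        for pair, tails in observed_tails.items()
    }


targets = lcwa_targets([(0, 0, 1), (0, 0, 2), (1, 0, 2)], num_entities=3)
```

The stochastic variant (sLCWA) avoids scoring every entity by sampling a few negatives per positive triple instead.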

Negative Samplers

The following 3 negative samplers are implemented in PyKEEN.

Name	Reference	Description
basic	`pykeen.sampling.BasicNegativeSampler`	A basic negative sampler.
bernoulli	`pykeen.sampling.BernoulliNegativeSampler`	An implementation of the Bernoulli negative sampling approach proposed by [wang2014]_.
pseudotyped	`pykeen.sampling.PseudoTypedNegativeSampler`	A sampler that accounts for which entities co-occur with a relation.
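
Negative sampling by uniform corruption can be sketched in a few lines: replace either the head or the tail of a positive triple with a randomly drawn entity. The optional filtering of known positives, the retry limit, and the function name are choices made for this example. This is not the code of `BasicNegativeSampler`.

```python
import random


def corrupt_uniformly(triple, num_entities, rng, known=frozenset(), max_tries=100):
    """Return a corrupted copy of an integer-encoded (head, relation, tail) triple.

    Either the head or the tail is replaced by a uniformly drawn entity ID.
    Candidates equal to the input or contained in `known` are rejected.
    """
    head, relation, tail = triple
    for _ in range(max_tries):
        replacement = rng.randrange(num_entities)
        if rng.random() < 0.5:
            candidate = (replacement, relation, tail)  # corrupt the head
        else:
            candidate = (head, relation, replacement)  # corrupt the tail
        if candidate != triple and candidate not in known:
            return candidate
    raise RuntimeError("could not find a negative triple; is the graph nearly complete?")


rng = random.Random(0)  # fixed seed for reproducibility
negative = corrupt_uniformly((0, 0, 1), num_entities=5, rng=rng)
```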

Stoppers

The following 2 stoppers are implemented in PyKEEN.

Name	Reference	Description
early	`pykeen.stoppers.EarlyStopper`	A harness for early stopping.
nop	`pykeen.stoppers.NopStopper`	A stopper that does nothing.
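
The logic behind early stopping is simple enough to sketch: training stops once a validation metric has failed to improve by a relative margin for a given number of consecutive evaluations. The class below is a generic illustration for a positive metric where larger is better. Its name, parameters, and defaults are assumptions of this example and do not mirror the `EarlyStopper` signature.

```python
class PatienceStopper:
    """Stop after `patience` consecutive evaluations without sufficient improvement."""

    def __init__(self, patience=2, relative_delta=0.01):
        self.patience = patience
        self.relative_delta = relative_delta
        self.best = None
        self.remaining = patience

    def should_stop(self, value):
        """Record a new metric value and report whether training should stop."""
        improved = self.best is None or value > self.best * (1.0 + self.relative_delta)
        if improved:
            self.best = value
            self.remaining = self.patience
            return False
        self.remaining -= 1
        return self.remaining <= 0


stopper = PatienceStopper(patience=2, relative_delta=0.01)
decisions = [stopper.should_stop(v) for v in [0.50, 0.60, 0.601, 0.602]]
```

In this run the last two values improve on 0.60 by less than one percent, so the second of them triggers the stop.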

Evaluators

The following 5 evaluators are implemented in PyKEEN.

Name	Reference	Description
classification	`pykeen.evaluation.ClassificationEvaluator`	An evaluator that uses classification metrics.
macrorankbased	`pykeen.evaluation.MacroRankBasedEvaluator`	Macro-average rank-based evaluation.
ogb	`pykeen.evaluation.OGBEvaluator`	A sampled, rank-based evaluator that applies a custom OGB evaluation.
rankbased	`pykeen.evaluation.RankBasedEvaluator`	A rank-based evaluator for KGE models.
sampledrankbased	`pykeen.evaluation.SampledRankBasedEvaluator`	A rank-based evaluator using sampled negatives instead of all negatives.
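
Most classification metrics are simple ratios of the four confusion-matrix counts. The sketch below computes several of them using their standard definitions. It assumes that all denominators are non-zero, and the function and key names are chosen for this example rather than taken from PyKEEN.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Derive common classification metrics from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # true positive rate (recall)
    tnr = tn / (tn + fp)  # true negative rate (specificity)
    fpr = fp / (fp + tn)  # false positive rate
    fnr = fn / (fn + tp)  # false negative rate
    return {
        "true_positive_rate": tpr,
        "true_negative_rate": tnr,
        "false_positive_rate": fpr,
        "false_negative_rate": fnr,
        "positive_predictive_value": tp / (tp + fp),
        "false_discovery_rate": fp / (tp + fp),
        "negative_predictive_value": tn / (tn + fn),
        "false_omission_rate": fn / (tn + fn),
        "positive_likelihood_ratio": tpr / fpr,
        "negative_likelihood_ratio": fnr / tnr,
        "threat_score": tp / (tp + fn + fp),
        "f1_score": 2 * tp / (2 * tp + fp + fn),
    }


metrics = confusion_metrics(tp=30, fp=20, tn=40, fn=10)
```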

Metrics

The following 44 metrics are implemented in PyKEEN.

Name	Interval	Direction	Description	Type
Accuracy	$[0, 1]$	📈	The ratio of the number of correct classifications to the total number.	Classification
Area Under The Receiver Operating Characteristic Curve	$[0, 1]$	📈	The area under the receiver operating characteristic curve.	Classification
Average Precision Score	$[0, 1]$	📈	The average precision across different thresholds.	Classification
Balanced Accuracy Score	$[0, 1]$	📈	The average of recall obtained on each class.	Classification
Diagnostic Odds Ratio	$[0, ∞)$	📈	The ratio of the positive likelihood ratio to the negative likelihood ratio.	Classification
F1 Score	$[0, 1]$	📈	The harmonic mean of precision and recall.	Classification
False Discovery Rate	$[0, 1]$	📉	The proportion of predicted positives which are true negatives.	Classification
False Negative Rate	$[0, 1]$	📉	The probability that a truly positive triple is predicted negative.	Classification
False Omission Rate	$[0, 1]$	📉	The proportion of predicted negatives which are true positives.	Classification
False Positive Rate	$[0, 1]$	📉	The probability that a truly negative triple is predicted positive.	Classification
Fowlkes Mallows Index	$[0, 1]$	📈	The Fowlkes Mallows index.	Classification
Informedness	$[-1, 1]$	📈	The informedness metric.	Classification
Matthews Correlation Coefficient	$[-1, 1]$	📈	The Matthews Correlation Coefficient (MCC).	Classification
Negative Likelihood Ratio	$[0, ∞)$	📉	The ratio of false negative rate to true negative rate.	Classification
Negative Predictive Value	$[0, 1]$	📈	The proportion of predicted negatives which are true negatives.	Classification
Number of Scores	$[0, ∞)$	📈	The number of scores.	Classification
Positive Likelihood Ratio	$[0, ∞)$	📈	The ratio of true positive rate to false positive rate.	Classification
Positive Predictive Value	$[0, 1]$	📈	The proportion of predicted positives which are true positive.	Classification
Prevalence Threshold	$[0, ∞)$	📉	The prevalence threshold.	Classification
Threat Score	$[0, 1]$	📈	The ratio of true positives to the sum of true positives, false negatives, and false positives.	Classification
True Negative Rate	$[0, 1]$	📈	The probability that a truly negative triple is predicted negative.	Classification
True Positive Rate	$[0, 1]$	📈	The probability that a truly positive triple is predicted positive.	Classification
Adjusted Arithmetic Mean Rank (AAMR)	$[0, 2)$	📉	The mean over all ranks divided by its expected value.	Ranking
Adjusted Arithmetic Mean Rank Index (AAMRI)	$[-1, 1]$	📈	The re-indexed adjusted mean rank (AAMR)	Ranking
Adjusted Geometric Mean Rank Index (AGMRI)	$(\frac{-E[f]}{1-E[f]}, 1]$	📈	The re-indexed adjusted geometric mean rank (AGMRI)	Ranking
Adjusted Hits at K	$(\frac{-E[f]}{1-E[f]}, 1]$	📈	The re-indexed adjusted hits at K	Ranking
Adjusted Inverse Harmonic Mean Rank	$(\frac{-E[f]}{1-E[f]}, 1]$	📈	The re-indexed adjusted MRR	Ranking
Geometric Mean Rank (GMR)	$[1, ∞)$	📉	The geometric mean over all ranks.	Ranking
Harmonic Mean Rank (HMR)	$[1, ∞)$	📉	The harmonic mean over all ranks.	Ranking
Hits @ K	$[0, 1]$	📈	The relative frequency of ranks not larger than a given k.	Ranking
Inverse Arithmetic Mean Rank (IAMR)	$(0, 1]$	📈	The inverse of the arithmetic mean over all ranks.	Ranking
Inverse Geometric Mean Rank (IGMR)	$(0, 1]$	📈	The inverse of the geometric mean over all ranks.	Ranking
Inverse Median Rank	$(0, 1]$	📈	The inverse of the median over all ranks.	Ranking
Mean Rank (MR)	$[1, ∞)$	📉	The arithmetic mean over all ranks.	Ranking
Mean Reciprocal Rank (MRR)	$(0, 1]$	📈	The inverse of the harmonic mean over all ranks.	Ranking
Median Rank	$[1, ∞)$	📉	The median over all ranks.	Ranking
z-Geometric Mean Rank (zGMR)	$(-∞, ∞)$	📈	The z-scored geometric mean rank	Ranking
z-Hits at K	$(-∞, ∞)$	📈	The z-scored hits at K	Ranking
z-Mean Rank (zMR)	$(-∞, ∞)$	📈	The z-scored mean rank	Ranking
z-Mean Reciprocal Rank (zMRR)	$(-∞, ∞)$	📈	The z-scored mean reciprocal rank	Ranking
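
Several of the ranking metrics above follow directly from the list of ranks assigned to the true entities. The sketch below computes them in plain Python. The adjusted variants assume that, under random scoring, the expected rank among c candidates is (c + 1) / 2. Function and key names are chosen for this example and are not PyKEEN's metric keys.

```python
import math
from statistics import mean, median


def rank_metrics(ranks, k=10):
    """Compute basic rank-based metrics from a list of ranks (1 is best)."""
    n = len(ranks)
    reciprocal_sum = sum(1.0 / r for r in ranks)
    return {
        "mean_rank": mean(ranks),
        "median_rank": median(ranks),
        "geometric_mean_rank": math.exp(mean(math.log(r) for r in ranks)),
        "harmonic_mean_rank": n / reciprocal_sum,
        "mean_reciprocal_rank": reciprocal_sum / n,
        f"hits_at_{k}": sum(r <= k for r in ranks) / n,
    }


def adjusted_mean_rank(ranks, num_candidates):
    """Mean rank divided by its expected value under random scoring."""
    expected = mean((c + 1) / 2 for c in num_candidates)
    return mean(ranks) / expected


def adjusted_mean_rank_index(ranks, num_candidates):
    """Re-indexed adjusted mean rank: 1 is optimal, 0 matches random scoring."""
    expected = mean((c + 1) / 2 for c in num_candidates)
    return 1.0 - (mean(ranks) - 1.0) / (expected - 1.0)


results = rank_metrics([1, 2, 4, 8], k=3)
```

Note that the mean reciprocal rank is exactly the inverse of the harmonic mean rank, which is why the table describes it that way.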

Trackers

The following 8 trackers are implemented in PyKEEN.

Name	Reference	Description
console	`pykeen.trackers.ConsoleResultTracker`	A class that directly prints to console.
csv	`pykeen.trackers.CSVResultTracker`	Tracking results to a CSV file.
json	`pykeen.trackers.JSONResultTracker`	Tracking results to a JSON lines file.
mlflow	`pykeen.trackers.MLFlowResultTracker`	A tracker for MLflow.
neptune	`pykeen.trackers.NeptuneResultTracker`	A tracker for Neptune.ai.
python	`pykeen.trackers.PythonResultTracker`	A tracker which stores everything in Python dictionaries.
tensorboard	`pykeen.trackers.TensorBoardResultTracker`	A tracker for TensorBoard.
wandb	`pykeen.trackers.WANDBResultTracker`	A tracker for Weights and Biases.

Experimentation

Reproduction

PyKEEN includes a set of curated experimental settings for reproducing past landmark experiments. They can be accessed and run like:

pykeen experiments reproduce tucker balazevic2019 fb15k

The three arguments are the model name, the reference, and the dataset. The output directory can optionally be set with -d.

Ablation

PyKEEN includes the ability to specify ablation studies using the hyper-parameter optimization module. They can be run like:

pykeen experiments ablation ~/path/to/config.json

Large-scale Reproducibility and Benchmarking Study

We used PyKEEN to perform a large-scale reproducibility and benchmarking study, which is described in our article:

@article{ali2020benchmarking,
  author={Ali, Mehdi and Berrendorf, Max and Hoyt, Charles Tapley and Vermue, Laurent and Galkin, Mikhail and Sharifzadeh, Sahand and Fischer, Asja and Tresp, Volker and Lehmann, Jens},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models under a Unified Framework},
  year={2021},
  pages={1-1},
  doi={10.1109/TPAMI.2021.3124805}
}

We have made all code, experimental configurations, results, and analyses that led to our interpretations available at https://github.com/pykeen/benchmarking.

Contributing

Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.md for more information on getting involved.

If you have questions, please use the GitHub discussions feature at https://github.com/pykeen/pykeen/discussions/new.

Acknowledgements

Supporters

This project has been supported by several organizations (in alphabetical order):

Funding

The development of PyKEEN has been funded by the following grants:

Funding Body	Program	Grant
DARPA	Young Faculty Award (PI: Benjamin Gyori)	W911NF2010255
DARPA	Automating Scientific Knowledge Extraction (ASKE)	HR00111990009
German Federal Ministry of Education and Research (BMBF)	Maschinelles Lernen mit Wissensgraphen (MLWin)	01IS18050D
German Federal Ministry of Education and Research (BMBF)	Munich Center for Machine Learning (MCML)	01IS18036A
Innovation Fund Denmark (Innovationsfonden)	Danish Center for Big Data Analytics driven Innovation (DABAI)	Grand Solutions

Logo

The PyKEEN logo was designed by Carina Steinborn.

Citation

If you have found PyKEEN useful in your work, please consider citing our article:

@article{ali2021pykeen,
    author = {Ali, Mehdi and Berrendorf, Max and Hoyt, Charles Tapley and Vermue, Laurent and Sharifzadeh, Sahand and Tresp, Volker and Lehmann, Jens},
    journal = {Journal of Machine Learning Research},
    number = {82},
    pages = {1--6},
    title = {{PyKEEN 1.0: A Python Library for Training and Evaluating Knowledge Graph Embeddings}},
    url = {http://jmlr.org/papers/v22/20-825.html},
    volume = {22},
    year = {2021}
}
