Better explain where models' defaults come from (#181)
Co-authored-by: PyKEEN_bot <pykeen2019@gmail.com>
Co-authored-by: mali <ali-mehdi@live.de>
Co-authored-by: Max Berrendorf <berrendorf@dbs.ifi.lmu.de>
4 people committed Dec 3, 2020
1 parent 0d22383 commit 5b78cec
Showing 5 changed files with 37 additions and 8 deletions.
4 changes: 4 additions & 0 deletions README.md
@@ -243,6 +243,10 @@ in ``pykeen``.
| random | [`optuna.samplers.RandomSampler`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.samplers.RandomSampler.html) | Sampler using random sampling. |
| tpe | [`optuna.samplers.TPESampler`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.samplers.TPESampler.html) | Sampler using TPE (Tree-structured Parzen Estimator) algorithm. |

Any sampler class extending [`optuna.samplers.BaseSampler`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.samplers.BaseSampler.html#optuna.samplers.BaseSampler),
such as Optuna's sampler implementing the [CMA-ES](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.samplers.CmaEsSampler.html#optuna.samplers.CmaEsSampler)
algorithm, can also be used.
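
As a rough sketch of how such a sampler could be plugged in (this assumes `hpo_pipeline`
accepts a `sampler` argument, and the dataset/model choices are only placeholders):

```python
from optuna.samplers import CmaEsSampler

from pykeen.hpo import hpo_pipeline

# Any subclass of optuna.samplers.BaseSampler can be supplied in place of
# the built-in 'random' or 'tpe' choices (assumption: a class is accepted).
hpo_pipeline_result = hpo_pipeline(
    dataset='Nations',
    model='TransE',
    sampler=CmaEsSampler,
    n_trials=30,
)
```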

## Experimentation

### Reproduction
3 changes: 3 additions & 0 deletions docs/source/references.rst
@@ -3,5 +3,8 @@ References
.. automodule:: pykeen.models.unimodal
.. automodule:: pykeen.models.multimodal

.. [ali2020a] Ali, M., *et al.* (2020). `Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge
Graph Embedding Models Under a Unified Framework <http://arxiv.org/abs/2006.13365>`_. *arXiv*, 2006.13365.
.. [safavi2020] Safavi, T. & Koutra, D. (2020). `CoDEx: A Comprehensive Knowledge Graph
Completion Benchmark <http://arxiv.org/abs/2009.07810>`_. *arXiv*, 2009.07810.
30 changes: 23 additions & 7 deletions src/pykeen/hpo/__init__.py
@@ -28,21 +28,37 @@
... model='TransE',
... )
-Every model in PyKEEN not only has default hyper-parameters, but default
-strategies for optimizing these hyper-parameters. While the default values can
-be found in the ``__init__()`` function of each model, the ranges/scales can be
+Every model in PyKEEN has default values for its hyper-parameters, chosen from the best-reported values in each model's
+original paper unless otherwise stated on the model's reference page. When hyper-parameters for a model on a
+specific dataset were not available, we chose them based on the findings of our
+large-scale benchmarking [ali2020a]_.
+
+In addition to reasonable default hyper-parameters, every model in PyKEEN has
+default "strategies" for optimizing these hyper-parameters, which constitute either
+ranges for integer/floating-point numbers or enumerations for categorical variables
+and booleans.
+
+While the default hyper-parameter values are encoded as Python default arguments
+in each model's ``__init__()`` function, the ranges/scales can be
found in the class variable :py:attr:`pykeen.models.Model.hpo_default`. For
example, the range for TransE's embedding dimension is set to optimize
between 50 and 350 at increments of 25 in :py:attr:`pykeen.models.TransE.hpo_default`.
TransE also has a scoring function norm that will be optimized by a categorical
selection of {1, 2} by default.
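For instance, the default strategy can be inspected directly. The snippet below is
only a sketch: the keys ``embedding_dim`` and ``scoring_fct_norm`` and the exact layout
of the stored dictionaries are assumptions based on the description above.

>>> from pykeen.models import TransE
>>> TransE.hpo_default['embedding_dim']     # range from 50 to 350 in steps of 25
>>> TransE.hpo_default['scoring_fct_norm']  # categorical choice between 1 and 2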
-All hyper-parameters defined in the ``hpo_default`` of your chosen Model will be
+.. note::
+    These hyper-parameter ranges were chosen as reasonable defaults for the benchmark
+    datasets FB15k-237 / WN18RR. When using different datasets, the ranges might be suboptimal.
+
+All hyper-parameters defined in the ``hpo_default`` of your chosen model will be
optimized by default. If you already have a value that you're happy with for
one of them, you can specify it with the ``model_kwargs`` attribute. In the
following example, the ``embedding_dim`` for a TransE model is fixed at 200,
-while the rest of the parameters will be optimized. For TransE, that means that
-the scoring function norm will be optimized between 1 and 2.
+while the rest of the parameters will be optimized using the pre-defined HPO strategies in
+the model. For TransE, that means that the scoring function norm will be selected
+categorically from {1, 2}.
>>> from pykeen.hpo import hpo_pipeline
>>> hpo_pipeline_result = hpo_pipeline(
@@ -65,7 +81,7 @@
... dataset='Nations',
... model='TransE',
... model_kwargs_ranges=dict(
-... embedding_dim=dict(type=int, low=100, high=400, q=100),
+... embedding_dim=dict(type=int, low=100, high=500, q=100),
... ),
... )
4 changes: 3 additions & 1 deletion src/pykeen/pipeline.py
@@ -154,7 +154,9 @@
The entries in ``model_kwargs`` correspond to the arguments given to :func:`pykeen.models.TransE.__init__`. For a
complete listing of models, see :mod:`pykeen.models`, where there are links to the reference for each
-model that explain what kwargs are possible.
+model that explain what kwargs are possible. Each model's default hyper-parameters were chosen based on the
+best-reported values from the paper that originally published the model, unless otherwise noted on the model's
+reference page.
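
As a quick illustration of how such kwargs are forwarded (the ``embedding_dim``
value here is an arbitrary example, not a recommended setting):

>>> from pykeen.pipeline import pipeline
>>> pipeline_result = pipeline(
...     dataset='Nations',
...     model='TransE',
...     model_kwargs=dict(embedding_dim=128),  # passed through to TransE.__init__
... )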
Because the pipeline takes care of looking up classes and instantiating them,
there are several other parameters to :func:`pykeen.pipeline.pipeline` that
4 changes: 4 additions & 0 deletions src/pykeen/templates/README.md
@@ -151,6 +151,10 @@ in ``pykeen``.

{{ hpo_samplers }}

Any sampler class extending [`optuna.samplers.BaseSampler`](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.samplers.BaseSampler.html#optuna.samplers.BaseSampler),
such as Optuna's sampler implementing the [CMA-ES](https://optuna.readthedocs.io/en/stable/reference/generated/optuna.samplers.CmaEsSampler.html#optuna.samplers.CmaEsSampler)
algorithm, can also be used.

## Experimentation

### Reproduction
