Add neptune.ai tracker (#183)

Closes #182

This PR does the following:

- reorganizes pykeen.trackers to be a subpackage since it was getting pretty full for a single file
- adds a tracker for neptune and an accompanying tutorial
- enables mypy on a limited basis and makes a small change to the utils to pass
- updates the README

Co-authored-by: PyKEEN_bot <pykeen2019@gmail.com>
Co-authored-by: Laurent Vermue <lvermue@users.noreply.github.com>
pykeen · Dec 2, 2020 · 0d22383
1 parent 5f3d79a

Showing 17 changed files with 420 additions and 279 deletions.
diff --git a/.appveyor.yml b/.appveyor.yml
diff --git a/.appveyor_on_request.yml b/.appveyor_on_request.yml
diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml
@@ -30,10 +30,9 @@ jobs:
 
       - name: Check package metadata with Pyroma
         run: tox -e pyroma
-      # - name: Check static typing with MyPy
-      #  run: tox -e mypy
-      #  # Allow failure, see https://github.community/t/continue-on-error-allow-failure-ui-indication/16773
-      #  if: succeeded() || failed()
+
+      - name: Check static typing with MyPy
+        run: tox -e mypy
   docs:
     if: "contains(github.event.head_commit.message, 'Trigger CI')"
     name: Documentation

diff --git a/.github/workflows/tests_master.yml b/.github/workflows/tests_master.yml
@@ -30,10 +30,9 @@ jobs:
 
       - name: Check package metadata with Pyroma
         run: tox -e pyroma
-      # - name: Check static typing with MyPy
-      #  run: tox -e mypy
-      #  # Allow failure, see https://github.community/t/continue-on-error-allow-failure-ui-indication/16773
-      #  if: succeeded() || failed()
+
+      - name: Check static typing with MyPy
+        run: tox -e mypy
   docs:
     if: "!contains(github.event.head_commit.message, 'skip ci')"
     name: Documentation

diff --git a/README.md b/README.md
@@ -225,12 +225,13 @@ in ``pykeen``.
 | Mean Reciprocal Rank    | The mean over all reciprocal ranks: mean_i (1/r_i). Higher is better.                                              | rankbased   | `pykeen.evaluation.RankBasedMetricResults` |
 | Roc Auc Score           | The area under the ROC curve between [0.0, 1.0]. Higher is better.                                                 | sklearn     | `pykeen.evaluation.SklearnMetricResults`   |
 
-### Trackers (2)
+### Trackers (3)
 
-| Name   | Reference                                                                                                                     | Description                       |
-|--------|-------------------------------------------------------------------------------------------------------------------------------|-----------------------------------|
-| mlflow | [`pykeen.trackers.MLFlowResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.MLFlowResultTracker.html) | A tracker for MLFlow.             |
-| wandb  | [`pykeen.trackers.WANDBResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.WANDBResultTracker.html)   | A tracker for Weights and Biases. |
+| Name    | Reference                                                                                                                       | Description                       |
+|---------|---------------------------------------------------------------------------------------------------------------------------------|-----------------------------------|
+| mlflow  | [`pykeen.trackers.MLFlowResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.MLFlowResultTracker.html)   | A tracker for MLflow.             |
+| neptune | [`pykeen.trackers.NeptuneResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.NeptuneResultTracker.html) | A tracker for Neptune.ai.         |
+| wandb   | [`pykeen.trackers.WANDBResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.WANDBResultTracker.html)     | A tracker for Weights and Biases. |
 
 ## Hyper-parameter Optimization
 

diff --git a/docs/source/tutorial/trackers/index.rst b/docs/source/tutorial/trackers/index.rst
@@ -5,4 +5,5 @@ Trackers
    :name: trackers
 
    using_mlflow
+   using_neptune
    using_wandb
diff --git a/docs/source/tutorial/trackers/using_mlflow.rst b/docs/source/tutorial/trackers/using_mlflow.rst
@@ -11,8 +11,8 @@ by default.
 Pipeline Example
 ----------------
 This example shows using MLflow with the :func:`pykeen.pipeline.pipeline` function.
-Minimally, the `tracking_uri` and `experiment_name` are required in the
-`result_tracker_kwargs`.
+Minimally, the ``tracking_uri`` and ``experiment_name`` are required in the
+``result_tracker_kwargs``.
 
 .. code-block:: python
 

diff --git a/docs/source/tutorial/trackers/using_neptune.rst b/docs/source/tutorial/trackers/using_neptune.rst
@@ -0,0 +1,96 @@
+Using Neptune.ai
+================
+`Neptune <https://neptune.ai>`_ is a graphical tool for tracking the results of machine learning experiments.
+PyKEEN integrates Neptune into the pipeline and HPO pipeline.
+
+Preparation
+-----------
+1. To use it, you'll first have to install Neptune's client with ``pip install neptune-client`` or
+   install PyKEEN with the ``neptune`` extra with ``pip install pykeen[neptune]``.
+2. Create an account at `Neptune <https://neptune.ai>`_.
+
+   - Get an API token following `this tutorial <https://docs.neptune.ai/security-and-privacy/api-tokens/how-to-find-and-set-neptune-api-token.html>`_.
+   - [Optional] Set the ``NEPTUNE_API_TOKEN`` environment variable to your API token.
+
+3. [Optional] Create a new project by following `this tutorial for project and user
+   management <https://docs.neptune.ai/workspace-project-and-user-management/projects/create-project.html>`_.
+   Neptune automatically creates a project called ``sandbox`` for all new users, which you can use directly.
+
+Pipeline Example
+----------------
+This example shows using Neptune with the :func:`pykeen.pipeline.pipeline` function.
+Minimally, the ``project_qualified_name`` and ``experiment_name`` must be set.
+
+.. code-block:: python
+
+    from pykeen.pipeline import pipeline
+
+    pipeline_result = pipeline(
+        model='RotatE',
+        dataset='Kinships',
+        result_tracker='neptune',
+        result_tracker_kwargs=dict(
+            project_qualified_name='cthoyt/sandbox',
+            experiment_name='Tutorial Training of RotatE on Kinships',
+        ),
+    )
+
+.. warning::
+
+    If you haven't set the ``NEPTUNE_API_TOKEN`` environment variable, ``api_token`` becomes
+    a mandatory key in ``result_tracker_kwargs``.
+
+Reusing Experiments
+-------------------
+In the Neptune web application, you'll see that experiments are assigned an ID. This means you can re-use the same
+ID to group different sub-experiments together using the ``experiment_id`` keyword argument instead of
+``experiment_name``.
+
+.. code-block:: python
+
+    from pykeen.pipeline import pipeline
+
+    experiment_id = 4  # must already exist in the project, otherwise an error is thrown!
+    pipeline_result = pipeline(
+        model='RotatE',
+        dataset='Kinships',
+        result_tracker='neptune',
+        result_tracker_kwargs=dict(
+            project_qualified_name='cthoyt/sandbox',
+            experiment_id=experiment_id,
+        ),
+    )
+
+Don't worry - you can keep using the ``experiment_name`` argument and the experiment's identifier will
+be automatically looked up each time.
+
+Adding Tags
+-----------
+Tags are additional information that you might want to add to the experiment
+and store in Neptune. Note this is different from MLflow, which considers tags
+as key/value pairs.
+
+For example, if you're using custom input, you might want to add some labels
+describing whether the experiment is cool or not.
+
+.. code-block:: python
+
+    from pykeen.pipeline import pipeline
+
+    data_version = ...
+
+    pipeline_result = pipeline(
+        model='RotatE',
+        training=...,
+        testing=...,
+        validation=...,
+        result_tracker='neptune',
+        result_tracker_kwargs=dict(
+            project_qualified_name='cthoyt/sandbox',
+            experiment_name='Tutorial Training of RotatE on Kinships',
+            tags={'cool', 'doggo'},
+        ),
+    )
+
+Additional documentation of the valid keyword arguments can be found
+under :class:`pykeen.trackers.NeptuneResultTracker`.
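
Note on the tutorial above: the rules it states for ``result_tracker_kwargs`` (a project name, an experiment name or ID, optional tags, and an ``api_token`` that becomes mandatory when ``NEPTUNE_API_TOKEN`` is unset) can be checked before the pipeline is started. The following is only a minimal sketch and is not part of this PR or of PyKEEN: ``build_neptune_kwargs`` and its ``env`` parameter are hypothetical names introduced for illustration, and preferring ``experiment_id`` when both identifiers are given is an assumption. The dictionary keys are the ones named in the tutorial.

```python
import os
from typing import Any, Dict, Iterable, Mapping, Optional


def build_neptune_kwargs(
    project_qualified_name: str,
    experiment_name: Optional[str] = None,
    experiment_id: Optional[int] = None,
    tags: Optional[Iterable[str]] = None,
    api_token: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Dict[str, Any]:
    """Assemble a ``result_tracker_kwargs`` dict (hypothetical helper, not a PyKEEN API)."""
    # Fall back to the real process environment unless a mapping is injected.
    env = os.environ if env is None else env

    if experiment_name is None and experiment_id is None:
        raise ValueError("Set either experiment_name or experiment_id.")

    kwargs: Dict[str, Any] = {"project_qualified_name": project_qualified_name}

    # Assumption: an explicit ID takes precedence over a name.
    if experiment_id is not None:
        kwargs["experiment_id"] = experiment_id
    else:
        kwargs["experiment_name"] = experiment_name

    # Tags are plain labels here, not key/value pairs.
    if tags is not None:
        kwargs["tags"] = set(tags)

    # The token is only mandatory when the environment variable is missing.
    if api_token is not None:
        kwargs["api_token"] = api_token
    elif "NEPTUNE_API_TOKEN" not in env:
        raise ValueError(
            "Pass api_token or set the NEPTUNE_API_TOKEN environment variable."
        )

    return kwargs


# Example with an injected environment so the snippet does not depend on the shell.
example = build_neptune_kwargs(
    "cthoyt/sandbox",
    experiment_name="Tutorial Training of RotatE on Kinships",
    tags=["cool", "doggo"],
    env={"NEPTUNE_API_TOKEN": "dummy-token"},
)
print(sorted(example))
```

The resulting dictionary could then be passed as ``result_tracker_kwargs`` alongside ``result_tracker='neptune'``, so that a missing token is reported before any training starts.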
diff --git a/setup.cfg b/setup.cfg
@@ -83,6 +83,8 @@ mlflow =
     mlflow>=1.8.0
 wandb =
     wandb
+neptune =
+    neptune-client
 docs =
     sphinx
     sphinx-rtd-theme