Add neptune.ai tracker (#183)
Closes #182

This PR does the following:

- reorganizes pykeen.trackers to be a subpackage since it was getting pretty full for a single file
- adds a tracker for neptune and an accompanying tutorial
- enables mypy on a limited basis and makes a small change to the utils to pass
- updates the README

Co-authored-by: PyKEEN_bot <pykeen2019@gmail.com>
Co-authored-by: Laurent Vermue <lvermue@users.noreply.github.com>
3 people committed Dec 2, 2020
1 parent 5f3d79a commit 0d22383
Showing 17 changed files with 420 additions and 279 deletions.
47 changes: 0 additions & 47 deletions .appveyor.yml

This file was deleted.

46 changes: 0 additions & 46 deletions .appveyor_on_request.yml

This file was deleted.

7 changes: 3 additions & 4 deletions .github/workflows/tests.yml
@@ -30,10 +30,9 @@ jobs:

- name: Check package metadata with Pyroma
run: tox -e pyroma
# - name: Check static typing with MyPy
# run: tox -e mypy
# # Allow failure, see https://github.community/t/continue-on-error-allow-failure-ui-indication/16773
# if: succeeded() || failed()

- name: Check static typing with MyPy
run: tox -e mypy
docs:
if: "contains(github.event.head_commit.message, 'Trigger CI')"
name: Documentation
7 changes: 3 additions & 4 deletions .github/workflows/tests_master.yml
@@ -30,10 +30,9 @@ jobs:

- name: Check package metadata with Pyroma
run: tox -e pyroma
# - name: Check static typing with MyPy
# run: tox -e mypy
# # Allow failure, see https://github.community/t/continue-on-error-allow-failure-ui-indication/16773
# if: succeeded() || failed()

- name: Check static typing with MyPy
run: tox -e mypy
docs:
if: "!contains(github.event.head_commit.message, 'skip ci')"
name: Documentation
11 changes: 6 additions & 5 deletions README.md
@@ -225,12 +225,13 @@ in ``pykeen``.
| Mean Reciprocal Rank | The mean over all reciprocal ranks: mean_i (1/r_i). Higher is better. | rankbased | `pykeen.evaluation.RankBasedMetricResults` |
| Roc Auc Score | The area under the ROC curve between [0.0, 1.0]. Higher is better. | sklearn | `pykeen.evaluation.SklearnMetricResults` |

### Trackers (2)
### Trackers (3)

| Name | Reference | Description |
|--------|-------------------------------------------------------------------------------------------------------------------------------|-----------------------------------|
| mlflow | [`pykeen.trackers.MLFlowResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.MLFlowResultTracker.html) | A tracker for MLFlow. |
| wandb | [`pykeen.trackers.WANDBResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.WANDBResultTracker.html) | A tracker for Weights and Biases. |
| Name | Reference | Description |
|---------|---------------------------------------------------------------------------------------------------------------------------------|-----------------------------------|
| mlflow | [`pykeen.trackers.MLFlowResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.MLFlowResultTracker.html) | A tracker for MLflow. |
| neptune | [`pykeen.trackers.NeptuneResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.NeptuneResultTracker.html) | A tracker for Neptune.ai. |
| wandb | [`pykeen.trackers.WANDBResultTracker`](https://pykeen.readthedocs.io/en/latest/api/pykeen.trackers.WANDBResultTracker.html) | A tracker for Weights and Biases. |

## Hyper-parameter Optimization

1 change: 1 addition & 0 deletions docs/source/tutorial/trackers/index.rst
@@ -5,4 +5,5 @@ Trackers
:name: trackers

using_mlflow
using_neptune
using_wandb
4 changes: 2 additions & 2 deletions docs/source/tutorial/trackers/using_mlflow.rst
@@ -11,8 +11,8 @@ by default.
Pipeline Example
----------------
This example shows using MLflow with the :func:`pykeen.pipeline.pipeline` function.
Minimally, the `tracking_uri` and `experiment_name` are required in the
`result_tracker_kwargs`.
Minimally, the ``tracking_uri`` and ``experiment_name`` are required in the
``result_tracker_kwargs``.

.. code-block:: python
96 changes: 96 additions & 0 deletions docs/source/tutorial/trackers/using_neptune.rst
@@ -0,0 +1,96 @@
Using Neptune.ai
================
`Neptune <https://neptune.ai>`_ is a graphical tool for tracking the results of machine learning. PyKEEN integrates
Neptune into the pipeline and HPO pipeline.

Preparation
-----------
1. To use it, you'll first have to install Neptune's client with ``pip install neptune-client`` or
   install PyKEEN with the ``neptune`` extra via ``pip install pykeen[neptune]``.
2. Create an account at `Neptune <https://neptune.ai>`_.

   - Get an API token following `this tutorial <https://docs.neptune.ai/security-and-privacy/api-tokens/how-to-find-and-set-neptune-api-token.html>`_.
   - [Optional] Set the ``NEPTUNE_API_TOKEN`` environment variable to your API token.
3. [Optional] Create a new project by following `this tutorial for project and user
   management <https://docs.neptune.ai/workspace-project-and-user-management/projects/create-project.html>`_.
   Neptune automatically creates a project called ``sandbox`` for every new user, which you
   can use directly.
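
Before running the pipeline, it can help to confirm that the client from step 1 is
actually importable. The following is a minimal sketch, not part of PyKEEN: the helper
name ``is_module_available`` is hypothetical, and it assumes the ``neptune-client``
distribution is imported under the module name ``neptune``.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the ``neptune-client`` distribution is imported as ``neptune``.
if not is_module_available("neptune"):
    print("Neptune client not found; try `pip install pykeen[neptune]`")
```

The check only looks the module up without importing it, so it has no side effects.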

Pipeline Example
----------------
This example shows using Neptune with the :func:`pykeen.pipeline.pipeline` function.
Minimally, the ``project_qualified_name`` and ``experiment_name`` must be set.

.. code-block:: python

    from pykeen.pipeline import pipeline

    pipeline_result = pipeline(
        model='RotatE',
        dataset='Kinships',
        result_tracker='neptune',
        result_tracker_kwargs=dict(
            project_qualified_name='cthoyt/sandbox',
            experiment_name='Tutorial Training of RotatE on Kinships',
        ),
    )

.. warning::

    If you haven't set the ``NEPTUNE_API_TOKEN`` environment variable, the ``api_token`` becomes
    a mandatory key.
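
The rule in the warning can be captured in a small helper that builds
``result_tracker_kwargs``. This is only a sketch under stated assumptions: the function
``neptune_tracker_kwargs`` is hypothetical (not part of PyKEEN), and raising a
``ValueError`` early is a design choice made here, not necessarily what the tracker
itself does.

```python
import os
from typing import Any, Dict, Optional


def neptune_tracker_kwargs(
    project_qualified_name: str,
    experiment_name: str,
    api_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Build ``result_tracker_kwargs``, adding ``api_token`` only when given."""
    kwargs: Dict[str, Any] = dict(
        project_qualified_name=project_qualified_name,
        experiment_name=experiment_name,
    )
    if api_token is not None:
        # An explicit token is passed along as the ``api_token`` key.
        kwargs["api_token"] = api_token
    elif not os.environ.get("NEPTUNE_API_TOKEN"):
        # Neither an explicit token nor the environment variable is available.
        raise ValueError("Set NEPTUNE_API_TOKEN or pass api_token explicitly")
    return kwargs
```

The returned dictionary can then be passed as ``result_tracker_kwargs`` to
:func:`pykeen.pipeline.pipeline`.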

Reusing Experiments
-------------------
In the Neptune web application, you'll see that experiments are assigned an ID. This means you can re-use the same
ID to group different sub-experiments together using the ``experiment_id`` keyword argument instead of
``experiment_name``.

.. code-block:: python

    from pykeen.pipeline import pipeline

    experiment_id = 4  # raises an error if this experiment doesn't already exist!
    pipeline_result = pipeline(
        model='RotatE',
        dataset='Kinships',
        result_tracker='neptune',
        result_tracker_kwargs=dict(
            project_qualified_name='cthoyt/sandbox',
            experiment_id=experiment_id,
        ),
    )

Don't worry - you can keep using the ``experiment_name`` argument and the experiment's identifier will
be automatically looked up each time.
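
The choice between the two keys can be made explicit with a small helper. This is a
sketch only: ``experiment_kwargs`` is a hypothetical name, and treating
``experiment_name`` and ``experiment_id`` as mutually exclusive is an assumption based on
the "instead of" wording above, not a documented constraint of the tracker.

```python
from typing import Any, Dict, Optional


def experiment_kwargs(
    project_qualified_name: str,
    experiment_name: Optional[str] = None,
    experiment_id: Optional[int] = None,
) -> Dict[str, Any]:
    """Build tracker kwargs from exactly one of a name or an existing ID."""
    # Assumption: exactly one of the two identifiers should be supplied.
    if (experiment_name is None) == (experiment_id is None):
        raise ValueError("Give exactly one of experiment_name or experiment_id")
    kwargs: Dict[str, Any] = dict(project_qualified_name=project_qualified_name)
    if experiment_id is not None:
        kwargs["experiment_id"] = experiment_id
    else:
        kwargs["experiment_name"] = experiment_name
    return kwargs
```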

Adding Tags
-----------
Tags are additional information that you might want to add to the experiment
and store in Neptune. In Neptune, tags are plain labels. Note this is different
from MLflow, which considers tags as key/value pairs.

For example, if you're using custom input, you might want to add some labels
indicating whether the experiment is cool or not.

.. code-block:: python

    from pykeen.pipeline import pipeline

    data_version = ...
    pipeline_result = pipeline(
        model='RotatE',
        training=...,
        testing=...,
        validation=...,
        result_tracker='neptune',
        result_tracker_kwargs=dict(
            project_qualified_name='cthoyt/sandbox',
            experiment_name='Tutorial Training of RotatE on Kinships',
            tags={'cool', 'doggo'},
        ),
    )

Additional documentation of the valid keyword arguments can be found
under :class:`pykeen.trackers.NeptuneResultTracker`.
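
If you already keep MLflow-style key/value tags, one possible way to reuse them with
Neptune is to flatten each pair into a single label. This is a sketch under stated
assumptions: ``flatten_tags`` is a hypothetical helper, and the ``key=value`` label
format is an arbitrary convention chosen here, not something PyKEEN or Neptune prescribes.

```python
from typing import Mapping, Set


def flatten_tags(key_value_tags: Mapping[str, object]) -> Set[str]:
    """Turn key/value tags into a set of flat ``key=value`` labels."""
    return {f"{key}={value}" for key, value in key_value_tags.items()}


# The resulting set could be passed as ``tags`` in ``result_tracker_kwargs``.
tags = flatten_tags({"data_version": "v1", "cool": True})
```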
2 changes: 2 additions & 0 deletions setup.cfg
@@ -83,6 +83,8 @@ mlflow =
mlflow>=1.8.0
wandb =
wandb
neptune =
neptune-client
docs =
sphinx
sphinx-rtd-theme
