
Pytorch Autologging Description Added To Tracking.rst #3636

Merged
merged 10 commits on Nov 9, 2020
38 changes: 38 additions & 0 deletions docs/source/tracking.rst
@@ -429,6 +429,44 @@ Autologging captures the following information:
| | | `OneCycleScheduler`_ callbacks | | |
+-----------+------------------------+----------------------------------------------------------+---------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------+

PyTorch (experimental)
----------------------

Call :py:func:`mlflow.pytorch.autolog` before your PyTorch Lightning training code to enable automatic logging of metrics, parameters, and models. See example usages `here <https://github.com/chauhang/mlflow/tree/master/examples/pytorch/MNIST>`__. Note
that PyTorch autologging currently supports only models trained with PyTorch Lightning.
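
A minimal, self-contained sketch of the workflow (the ``TinyModel`` class and the synthetic data below are illustrative placeholders, not part of MLflow or the linked examples):

.. code-block:: python

    import mlflow.pytorch
    import pytorch_lightning as pl
    import torch
    from torch.utils.data import DataLoader, TensorDataset


    class TinyModel(pl.LightningModule):
        """A minimal LightningModule used only to illustrate autologging."""

        def __init__(self):
            super().__init__()
            self.layer = torch.nn.Linear(4, 1)

        def forward(self, x):
            return self.layer(x)

        def training_step(self, batch, batch_idx):
            x, y = batch
            loss = torch.nn.functional.mse_loss(self(x), y)
            self.log("train_loss", loss)  # reported to Lightning; captured as a user-defined metric
            return loss

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)


    # Enable autologging before any training code runs
    mlflow.pytorch.autolog()

    data = TensorDataset(torch.randn(64, 4), torch.randn(64, 1))
    train_loader = DataLoader(data, batch_size=16)

    trainer = pl.Trainer(max_epochs=2)
    trainer.fit(TinyModel(), train_loader)  # metrics, params, and the model are logged here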

Autologging is triggered on calls to ``pytorch_lightning.trainer.Trainer.fit`` and captures the following information:

+-----------------------------------------------+--------------------------------------------------------+---------------------------------------------------------------+------+--------------------------------------------------------------+
| Framework/module                              | Metrics                                                | Parameters                                                    | Tags | Artifacts                                                    |
+-----------------------------------------------+--------------------------------------------------------+---------------------------------------------------------------+------+--------------------------------------------------------------+
| ``pytorch_lightning.trainer.Trainer``         | Training loss; validation loss; average_test_accuracy; | ``fit()`` parameters; optimizer name; learning rate; epsilon. | --   | Model summary on training start;                             |
|                                               | user-defined metrics.                                  |                                                               |      | `MLflow Model <https://mlflow.org/docs/latest/models.html>`_ |
|                                               |                                                        |                                                               |      | (PyTorch model) on training end.                             |
+-----------------------------------------------+--------------------------------------------------------+---------------------------------------------------------------+------+--------------------------------------------------------------+
| ``pytorch_lightning.callbacks.EarlyStopping`` | Training loss; validation loss; average_test_accuracy; | ``fit()`` parameters; optimizer name; learning rate; epsilon; | --   | Model summary on training start;                             |
|                                               | user-defined metrics;                                  | parameters from the ``EarlyStopping`` callbacks,              |      | `MLflow Model <https://mlflow.org/docs/latest/models.html>`_ |
|                                               | metrics from the ``EarlyStopping`` callbacks,          | for example, ``min_delta``, ``patience``, ``baseline``,       |      | (PyTorch model) on training end;                             |
|                                               | for example, ``stopped_epoch``, ``restored_epoch``,    | ``restore_best_weights``, etc.                                |      | best PyTorch model checkpoint, if training stops due to      |
|                                               | ``restore_best_weight``, etc.                          |                                                               |      | early stopping callback.                                     |
+-----------------------------------------------+--------------------------------------------------------+---------------------------------------------------------------+------+--------------------------------------------------------------+

If no active run exists when ``autolog()`` captures data, MLflow will automatically create a run to log information, ending the run once
the call to ``pytorch_lightning.trainer.Trainer.fit()`` completes.

If a run already exists when ``autolog()`` captures data, MLflow will log to that run but not automatically end that run after training.
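
For example, a sketch that reuses the ``TinyModel`` and ``train_loader`` from the example above and adds a manually logged parameter to the same run (the parameter name is arbitrary):

.. code-block:: python

    import mlflow
    import mlflow.pytorch
    import pytorch_lightning as pl

    mlflow.pytorch.autolog()

    with mlflow.start_run():
        trainer = pl.Trainer(max_epochs=2)
        trainer.fit(TinyModel(), train_loader)  # autologged into the active run
        mlflow.log_param("data_version", "v1")  # manual logging to the same run
    # the run ends here because of the context manager, not because of autologging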

.. note::
- Parameters that are not explicitly passed by the user (i.e. parameters that keep their default values) in ``pytorch_lightning.trainer.Trainer.fit()`` are not currently logged automatically
- When multiple optimizers are used (for example, in an autoencoder), only the parameters of the first optimizer are logged
- This feature is experimental - the API and format of the logged data are subject to change


.. _organizing_runs_in_experiments:

Organizing Runs in Experiments
11 changes: 6 additions & 5 deletions mlflow/pytorch/__init__.py
@@ -614,20 +614,21 @@ def predict(self, data, device="cpu"):

def autolog(log_every_n_epoch=1):
"""
- Wrapper for `mlflow.pytorch._pytorch_autolog.autolog` method.
Automatically log metrics, params, and models from `PyTorch Lightning
<https://pytorch-lightning.readthedocs.io/en/latest>`_ model training.
- Autologging is performed when you call the `fit` method of `pytorch_lightning.Trainer()
+ Autologging is performed when you call the `fit` method of
+ `pytorch_lightning.Trainer() \
<https://pytorch-lightning.readthedocs.io/en/latest/trainer.html#>`_.

**Note**: Autologging is only supported for PyTorch Lightning models,
- i.e. models that subclass `pytorch_lightning.LightningModule
+ i.e. models that subclass
+ `pytorch_lightning.LightningModule \
<https://pytorch-lightning.readthedocs.io/en/latest/lightning_module.html>`_.
In particular, autologging support for vanilla Pytorch models that only subclass
- `torch.nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`
+ `torch.nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_
is not yet available.

- :param log_every_n_epoch: parameter to log metrics once in `n` epoch. By default, metrics
+ :param log_every_n_epoch: If specified, logs metrics once every `n` epochs. By default, metrics
are logged after every epoch.
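
A minimal usage sketch (``Net`` and ``train_loader`` are user-provided placeholders,
not part of MLflow):

.. code-block:: python

    import mlflow.pytorch
    import pytorch_lightning as pl

    # log metrics every 2 epochs instead of after every epoch
    mlflow.pytorch.autolog(log_every_n_epoch=2)

    trainer = pl.Trainer(max_epochs=10)
    trainer.fit(Net(), train_loader)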
"""
from mlflow.pytorch._pytorch_autolog import _autolog