Adding LLM Tracking docs #8234

Merged · 2 commits · Apr 18, 2023
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -26,6 +26,7 @@ Get started using the :ref:`quickstart` or by reading about the :ref:`key concep
tutorials-and-examples/index
concepts
tracking
llm-tracking
projects
models
model-registry
74 changes: 74 additions & 0 deletions docs/source/llm-tracking.rst
@@ -0,0 +1,74 @@
.. _llm-tracking:

=====================
MLflow LLM Tracking
=====================

The MLflow LLM Tracking component consists of two elements for logging and viewing the behavior of LLMs.
The first is a set of APIs for logging the inputs, outputs, and prompts submitted to and returned
from LLMs. Accompanying these APIs is a UI component that provides a simplified means of viewing the
experimental submissions (prompts and inputs) alongside their results (LLM outputs).

.. contents:: Table of Contents
:local:
:depth: 2

.. _llm-tracking-concepts:

Concepts
==========

MLflow LLM Tracking is organized around the concept of *runs*, which are executions of some piece of
data science code. Each run records the following information (a combined sketch follows the list):

Parameters
    Key-value input parameters of your choice. Both keys and values are strings. These could be LLM
    parameters such as ``top_k``, ``temperature``, and so on.

Metrics
    Key-value metrics, where the value is numeric. Each metric can be updated throughout the
    course of the run (for example, to track how your model's loss function is converging), and
    MLflow records and lets you visualize the metric's full history.

Predictions
    For offline evaluation, you can log the predictions returned from your model by passing in the
    prompts and inputs along with the outputs the model returned.
    These predictions are logged in ``csv`` format as an MLflow artifact.

Artifacts
    Output files in any format. For example, you can record images (e.g., PNGs), models
    (e.g., a pickled ``openai`` model), and data files (e.g., a
    `Parquet <https://parquet.apache.org/>`_ file) as artifacts.
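
A minimal sketch of how these concepts fit together in a single run; the parameter, metric,
and file names below are illustrative, not part of the MLflow API:

.. code-block:: python

    import mlflow

    with mlflow.start_run():
        # Parameters: string key-value pairs describing the LLM configuration
        mlflow.log_params({"temperature": "0.7", "top_k": "40"})

        # Metrics: numeric values whose full history MLflow records
        mlflow.log_metric("mean_response_length", 42.0)

        # Predictions: prompts, inputs, and outputs, stored as a CSV artifact
        mlflow.llm.log_predictions(
            inputs=["What is MLflow?"],
            outputs=["MLflow is an open source MLOps platform."],
            prompts=["Answer the question: {question}"],
        )

        # Artifacts: any local file, e.g. a config used for this submission
        # ("config.yaml" is a hypothetical local file)
        mlflow.log_artifact("config.yaml")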

You can optionally organize runs into *experiments*, which group and compare together runs for a
specific task. You can create an experiment using the ``mlflow experiments`` CLI, with
:py:func:`mlflow.create_experiment`, or using the corresponding REST parameters. To interact
with experiments, the MLflow API and UI let you create and search for experiments.
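
A short sketch of grouping runs under an experiment; ``llm-experiments`` is an arbitrary name:

.. code-block:: python

    import mlflow

    # Create the experiment once; equivalently, from the CLI:
    #   mlflow experiments create --experiment-name llm-experiments
    experiment_id = mlflow.create_experiment("llm-experiments")

    # Record runs under it so they can be compared later
    with mlflow.start_run(experiment_id=experiment_id):
        mlflow.log_param("temperature", "0.7")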

Once your runs have been recorded, you can query them and compare predictions using the :ref:`tracking_ui`.

.. _how_llm_predictions_recorded:

How LLM Tracking Information Is Recorded
=========================================

Parameters: :py:func:`mlflow.log_param` logs a single key-value param in the currently active run. The key and
value are both strings. Use :py:func:`mlflow.log_params` to log multiple params at once.
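
A brief sketch; the parameter names are illustrative:

.. code-block:: python

    import mlflow

    with mlflow.start_run():
        # One parameter at a time ...
        mlflow.log_param("temperature", "0.9")
        # ... or several in a single call
        mlflow.log_params({"top_k": "50", "max_tokens": "256"})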

Metrics: :py:func:`mlflow.log_metric` logs a single key-value metric. The value must always be a number.
MLflow remembers the history of values for each metric. Use :py:func:`mlflow.log_metrics` to log
multiple metrics at once.
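
For example (``loss`` and ``accuracy`` are illustrative metric names):

.. code-block:: python

    import mlflow

    with mlflow.start_run():
        # Repeated calls append to the metric's history; `step` orders the values
        for step, loss in enumerate([0.9, 0.5, 0.3]):
            mlflow.log_metric("loss", loss, step=step)

        # Several metrics in a single call
        mlflow.log_metrics({"accuracy": 0.87, "f1": 0.81})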

Predictions: :py:func:`mlflow.llm.log_predictions` logs inputs, outputs, and prompts. Inputs and prompts
can each be either a list of strings or a list of dicts, whereas outputs must be a list of strings.
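
For example, dict-valued inputs can carry structured fields alongside the raw text; the field
names below are illustrative:

.. code-block:: python

    import mlflow

    inputs = [{"question": "What is MLflow?", "context": "MLflow documentation"}]
    outputs = ["MLflow is an open source MLOps platform."]
    prompts = ["Answer the question using the given context."]

    with mlflow.start_run():
        # Stored as a CSV artifact under the active run
        mlflow.llm.log_predictions(inputs, outputs, prompts)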

Artifacts: :py:func:`mlflow.log_artifact` logs a local file or directory as an artifact, optionally taking an
``artifact_path`` that places it within the run's artifact URI. Run artifacts can be organized into
directories, so logged inputs and predictions can be stored under whatever layout suits your workflow.
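
A sketch; ``predictions.csv`` is a hypothetical local file:

.. code-block:: python

    import mlflow

    with mlflow.start_run():
        # `artifact_path` places the file under an `eval/` directory
        # within the run's artifact URI
        mlflow.log_artifact("predictions.csv", artifact_path="eval")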

.. _where_llm_tracking_information_are_recorded:

Where LLM Tracking Information Is Recorded
=============================================

All tracking information is recorded as part of an MLflow Experiment run.
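
A sketch of reading that information back from a finished run:

.. code-block:: python

    import mlflow

    with mlflow.start_run() as run:
        mlflow.log_param("temperature", "0.7")

    # Everything logged above lives on the run, inside its experiment
    finished = mlflow.get_run(run.info.run_id)
    print(finished.info.experiment_id)
    print(finished.data.params)        # {'temperature': '0.7'}
    print(finished.info.artifact_uri)  # where artifacts such as predictions land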


6 changes: 6 additions & 0 deletions docs/source/python_api/mlflow.llm.rst
@@ -0,0 +1,6 @@
mlflow.llm
============

.. automodule:: mlflow.llm
:members:
:undoc-members:
1 change: 1 addition & 0 deletions mlflow/__init__.py
@@ -100,6 +100,7 @@
"diviner",
"transformers",
"langchain",
"llm",
]
except ImportError as e:
# We are conditional loading these commands since the skinny client does
2 changes: 1 addition & 1 deletion mlflow/llm.py
@@ -1,5 +1,5 @@
"""
-The ``mlflow.llm`` module provides a utility for Large Language Models (LLMs).
+The ``mlflow.llm`` module provides utilities for Large Language Models (LLMs).
"""

from mlflow.tracking.llm_utils import log_predictions
2 changes: 1 addition & 1 deletion mlflow/tracking/llm_utils.py
@@ -47,8 +47,8 @@ def log_predictions(
]


-    # Log llm predictions
     with mlflow.start_run():
+        # Log llm predictions
         mlflow.llm.log_predictions(inputs, outputs, prompts)
"""
if len(inputs) <= 0 or len(inputs) != len(outputs) or len(inputs) != len(prompts):