From cd8a03360eaa0e863cf0f40d50a4674571806cbf Mon Sep 17 00:00:00 2001
From: Daniel Lok
Date: Mon, 22 Jan 2024 12:33:55 +0800
Subject: [PATCH] Prompt template docs (#10836)
Signed-off-by: Daniel Lok
---
docs/source/llms/index.rst | 2 +-
docs/source/llms/transformers/guide/index.rst | 67 ++++
docs/source/llms/transformers/index.rst | 14 +
.../prompt-templating/prompt-templating.ipynb | 349 ++++++++++++++++++
4 files changed, 431 insertions(+), 1 deletion(-)
create mode 100644 docs/source/llms/transformers/tutorials/prompt-templating/prompt-templating.ipynb
diff --git a/docs/source/llms/index.rst b/docs/source/llms/index.rst
index adc731f227b19..c4b379bb738a2 100644
--- a/docs/source/llms/index.rst
+++ b/docs/source/llms/index.rst
@@ -318,7 +318,7 @@ Interested in learning how to leverage MLflow for your LLM projects?
Look in the tutorials and guides below to learn more about interesting use cases that could help to make your journey into leveraging LLMs a bit easier!
-Note that there are additional tutorials within the `Native Integration Guides and Tutorials section above <#native-integration-guides-and-tutorials>`_, so be sure to check those out as well!
+Note that there are additional tutorials within the `"Explore the Native LLM Flavors" section above <#explore-the-native-llm-flavors>`_, so be sure to check those out as well!
.. toctree::
:maxdepth: 1
diff --git a/docs/source/llms/transformers/guide/index.rst b/docs/source/llms/transformers/guide/index.rst
index d643890e5cf9c..d4a9af716c333 100644
--- a/docs/source/llms/transformers/guide/index.rst
+++ b/docs/source/llms/transformers/guide/index.rst
@@ -88,6 +88,73 @@ avoid failed inference requests.
\***** If using `pyfunc` in MLflow Model Serving for real-time inference, the raw audio in bytes format must be base64 encoded prior to submission to the endpoint. String inputs will be interpreted as URI locations.
+
+Saving Prompt Templates with Transformer Pipelines
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. note::
+
+ This feature is only available in MLflow 2.10.0 and above.
+
+MLflow supports specifying prompt templates for certain pipeline types:
+
+- `feature-extraction `_
+- `fill-mask `_
+- `summarization `_
+- `text2text-generation `_
+- `text-generation `_
+
+Prompt templates are strings used to format user inputs prior to ``pyfunc`` inference. To specify a prompt template,
+use the ``prompt_template`` argument when calling :py:func:`mlflow.transformers.save_model()` or :py:func:`mlflow.transformers.log_model()`.
+The prompt template must be a string with a single format placeholder, ``{prompt}``.
+
+For example:
+
+.. code-block:: python
+
+ import mlflow
+ from transformers import pipeline
+
+    # Initialize a pipeline; `distilgpt2` maps to the "text-generation" task
+ generator = pipeline(model="distilgpt2")
+
+ # Define a prompt template
+ prompt_template = "Answer the following question: {prompt}"
+
+ # Save the model
+ mlflow.transformers.save_model(
+ transformers_model=generator,
+ path="path/to/model",
+ prompt_template=prompt_template,
+ )
+
+When the model is then loaded with :py:func:`mlflow.pyfunc.load_model()`, the prompt
+template will be used to format user inputs before passing them into the pipeline:
+
+.. code-block:: python
+
+ import mlflow
+
+ # Load the model with pyfunc
+ model = mlflow.pyfunc.load_model("path/to/model")
+
+ # The prompt template will be used to format this input, so the
+ # string that is passed to the text-generation pipeline will be:
+ # "Answer the following question: What is MLflow?"
+ model.predict("What is MLflow?")
+
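Under the hood, applying the template is a plain Python string substitution. The following is a minimal sketch (not MLflow's actual implementation) of how a ``{prompt}`` template could be validated and applied; the helper names here are hypothetical:

```python
import string


def validate_prompt_template(template: str) -> None:
    """Check that the template contains exactly one ``{prompt}`` placeholder."""
    # string.Formatter().parse yields (literal_text, field_name, format_spec, conversion)
    fields = [f for _, f, _, _ in string.Formatter().parse(template) if f is not None]
    if fields != ["prompt"]:
        raise ValueError(
            f"Prompt template must contain a single {{prompt}} placeholder, got: {fields}"
        )


def apply_prompt_template(template: str, user_input: str) -> str:
    """Format the user input into the template before pipeline inference."""
    validate_prompt_template(template)
    return template.format(prompt=user_input)


formatted = apply_prompt_template(
    "Answer the following question: {prompt}", "What is MLflow?"
)
# formatted == "Answer the following question: What is MLflow?"
```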
+.. note::
+
+    ``text-generation`` pipelines with a prompt template will have the `return_full_text pipeline argument `_
+    set to ``False`` by default. This prevents the template from being shown to users,
+    which could cause confusion since it was not part of their original input. To
+    override this behavior, either set ``return_full_text`` to ``True`` via ``params``, or include
+    it in a ``model_config`` dict in ``log_model()``. See `this section <#using-model-config-and-model-signature-params-for-transformers-inference>`_
+    for more details on how to do this.
+
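To illustrate why ``return_full_text=False`` is the default: a text-generation pipeline normally echoes its full input, template included, ahead of the generated continuation. A rough sketch of the stripping behavior (not the actual MLflow or ``transformers`` code, and ``strip_template_prefix`` is a hypothetical name):

```python
def strip_template_prefix(generated_text: str, formatted_prompt: str) -> str:
    """Mimic return_full_text=False: drop the echoed prompt from the pipeline output."""
    if generated_text.startswith(formatted_prompt):
        return generated_text[len(formatted_prompt):].lstrip()
    return generated_text


# A text-generation model typically echoes its input, then appends the continuation
formatted_prompt = "Answer the following question: What is MLflow?"
full_output = formatted_prompt + " MLflow is an open source platform..."

# With return_full_text=False, only the continuation is returned to the user
print(strip_template_prefix(full_output, formatted_prompt))
```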
+For a more in-depth guide, check out the `Prompt Templating notebook <../tutorials/prompt-templating/prompt-templating.ipynb>`_!
+
+
Using model_config and model signature params for `transformers` inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
diff --git a/docs/source/llms/transformers/index.rst b/docs/source/llms/transformers/index.rst
index 48d792accb544..9a8eff0222e96 100644
--- a/docs/source/llms/transformers/index.rst
+++ b/docs/source/llms/transformers/index.rst
@@ -49,6 +49,7 @@ MLflow supports the use of the Transformers package by providing:
- **Fine-tuning of Foundational Models**: Users can `fine-tune transformers models `_ on custom datasets while tracking metrics and parameters.
- **Experiment Tracking**: Log experiments, including all relevant details and artifacts, for easy comparison and reproducibility.
- **Simplified Model Deployment**: Deploy models with `minimal configuration requirements `_.
+- **Prompt Management**: `Save prompt templates `_ with transformers pipelines to optimize inference with less boilerplate.
**Example Use Case:**
@@ -156,6 +157,16 @@ These more advanced tutorials are designed to showcase different applications of
+