Prompt template docs (#10836)
Signed-off-by: Daniel Lok <daniel.lok@databricks.com>
daniellok-db committed Jan 22, 2024
1 parent fb2109e commit cd8a033
Showing 4 changed files with 431 additions and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/llms/index.rst
Interested in learning how to leverage MLflow for your LLM projects?

Look in the tutorials and guides below to learn more about interesting use cases that could help to make your journey into leveraging LLMs a bit easier!

Note that there are additional tutorials within the `"Explore the Native LLM Flavors" section above <#explore-the-native-llm-flavors>`_, so be sure to check those out as well!

.. toctree::
:maxdepth: 1
67 changes: 67 additions & 0 deletions docs/source/llms/transformers/guide/index.rst

\***** If using ``pyfunc`` in MLflow Model Serving for real-time inference, the raw audio in bytes format must be base64 encoded prior to submitting to the endpoint. String inputs will be interpreted as URI locations.
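The encoding step described above can be sketched in plain Python. This is a hedged sketch: the payload shape and field names are illustrative, not the exact serving schema.

```python
import base64

# A sketch of preparing raw audio bytes for a serving endpoint: the
# bytes are base64 encoded first, since plain string inputs would be
# treated as URI locations. The payload structure here is illustrative.
audio_bytes = b"\x00\x01fake-audio"
encoded = base64.b64encode(audio_bytes).decode("ascii")
payload = {"inputs": [encoded]}
```

On the receiving side, the original bytes are recovered with ``base64.b64decode``.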


Saving Prompt Templates with Transformer Pipelines
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. note::

This feature is only available in MLflow 2.10.0 and above.

MLflow supports specifying prompt templates for certain pipeline types:

- `feature-extraction <https://huggingface.co/transformers/main_classes/pipelines.html#transformers.FeatureExtractionPipeline>`_
- `fill-mask <https://huggingface.co/transformers/main_classes/pipelines.html#transformers.FillMaskPipeline>`_
- `summarization <https://huggingface.co/transformers/main_classes/pipelines.html#transformers.SummarizationPipeline>`_
- `text2text-generation <https://huggingface.co/transformers/main_classes/pipelines.html#transformers.Text2TextGenerationPipeline>`_
- `text-generation <https://huggingface.co/transformers/main_classes/pipelines.html#transformers.TextGenerationPipeline>`_

Prompt templates are strings that are used to format user inputs prior to ``pyfunc`` inference. To specify a prompt template,
use the ``prompt_template`` argument when calling :py:func:`mlflow.transformers.save_model()` or :py:func:`mlflow.transformers.log_model()`.
The prompt template must be a string with a single format placeholder, ``{prompt}``.
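Conceptually, the template is applied with ordinary Python string formatting. The following is a plain-Python sketch, independent of MLflow, of what happens to a user input at inference time:

```python
# Plain-Python sketch of how a prompt template is applied: the template
# is an ordinary format string with a single {prompt} placeholder that
# is filled with the user's input before it reaches the pipeline.
prompt_template = "Answer the following question: {prompt}"
user_input = "What is MLflow?"
formatted = prompt_template.format(prompt=user_input)
# formatted == "Answer the following question: What is MLflow?"
```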

For example:

.. code-block:: python

    import mlflow
    from transformers import pipeline

    # Initialize a pipeline. `distilgpt2` uses a "text-generation" pipeline
    generator = pipeline(model="distilgpt2")

    # Define a prompt template
    prompt_template = "Answer the following question: {prompt}"

    # Save the model
    mlflow.transformers.save_model(
        transformers_model=generator,
        path="path/to/model",
        prompt_template=prompt_template,
    )

When the model is then loaded with :py:func:`mlflow.pyfunc.load_model()`, the prompt
template will be used to format user inputs before passing them into the pipeline:

.. code-block:: python

    import mlflow

    # Load the model with pyfunc
    model = mlflow.pyfunc.load_model("path/to/model")

    # The prompt template will be used to format this input, so the
    # string that is passed to the text-generation pipeline will be:
    # "Answer the following question: What is MLflow?"
    model.predict("What is MLflow?")

.. note::

    ``text-generation`` pipelines with a prompt template will have the `return_full_text pipeline argument <https://huggingface.co/docs/huggingface_hub/main/en/package_reference/inference_client#huggingface_hub.inference._text_generation.TextGenerationParameters.return_full_text>`_
    set to ``False`` by default. This prevents the template from being shown to users,
    which could cause confusion because it was not part of their original input. To
    override this behaviour, either set ``return_full_text`` to ``True`` via ``params``, or
    include it in a ``model_config`` dict passed to ``log_model()``. See `this section <#using-model-config-and-model-signature-params-for-transformers-inference>`_
    for more details on how to do this.
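The two override points described in the note can be sketched as follows. This is a hedged sketch: the model path and pipeline variable are illustrative, so the MLflow calls themselves are shown commented out.

```python
# Sketch of the two ways to override ``return_full_text``; the model
# path and ``generator`` pipeline are illustrative placeholders.

# Option 1: override at inference time via the ``params`` argument:
params = {"return_full_text": True}
# model = mlflow.pyfunc.load_model("path/to/model")
# model.predict("What is MLflow?", params=params)

# Option 2: set it at logging time in ``model_config``:
model_config = {"return_full_text": True}
# mlflow.transformers.log_model(
#     transformers_model=generator,
#     artifact_path="model",
#     model_config=model_config,
# )
```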

For a more in-depth guide, check out the `Prompt Templating notebook <../tutorials/prompt-templating/prompt-templating.ipynb>`_!


Using model_config and model signature params for `transformers` inference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

14 changes: 14 additions & 0 deletions docs/source/llms/transformers/index.rst
MLflow supports the use of the Transformers package by providing:
- **Fine-tuning of Foundational Models**: Users can `fine-tune transformers models <tutorials/fine-tuning/transformers-fine-tuning.html>`_ on custom datasets while tracking metrics and parameters.
- **Experiment Tracking**: Log experiments, including all relevant details and artifacts, for easy comparison and reproducibility.
- **Simplified Model Deployment**: Deploy models with `minimal configuration requirements <guide/index.html#scalability-for-inference>`_.
- **Prompt Management**: `Save prompt templates <guide/index.html#saving-prompt-templates-with-transformer-pipelines>`_ with transformers pipelines to optimize inference with less boilerplate.

**Example Use Case:**

</p>
</a>
</div>
<div class="simple-card">
<a href="tutorials/prompt-templating/prompt-templating.html">
<div class="header">
Prompt templating with Transformers Pipelines
</div>
<p>
Learn how to set prompt templates on Transformers Pipelines to optimize your LLM's outputs, and simplify the end-user experience.
</p>
</a>
</div>
<div class="simple-card">
<a href="../custom-pyfunc-for-llms/notebooks/custom-pyfunc-advanced-llm.html">
<div class="header">
To download the transformers tutorial notebooks to run in your environment, click on the links below:
<a href="https://raw.githubusercontent.com/mlflow/mlflow/master/docs/source/llms/transformers/tutorials/translation/component-translation.ipynb" class="notebook-download-btn">Download the Translation Notebook</a><br>
<a href="https://raw.githubusercontent.com/mlflow/mlflow/master/docs/source/llms/transformers/tutorials/conversational/conversational-model.ipynb" class="notebook-download-btn">Download the Chat Conversational Notebook</a><br>
<a href="https://raw.githubusercontent.com/mlflow/mlflow/master/docs/source/llms/transformers/tutorials/fine-tuning/transformers-fine-tuning.ipynb" class="notebook-download-btn">Download the Fine Tuning Notebook</a><br>
<a href="https://raw.githubusercontent.com/mlflow/mlflow/master/docs/source/llms/transformers/tutorials/prompt-templating/prompt-templating.ipynb" class="notebook-download-btn">Download the Prompt Templating Notebook</a><br>
<a href="https://raw.githubusercontent.com/mlflow/mlflow/master/docs/source/llms/custom-pyfunc-for-llms/notebooks/custom-pyfunc-advanced-llm.ipynb" class="notebook-download-btn">Download the Custom PyFunc transformers Notebook</a><br>

.. toctree::
tutorials/translation/component-translation.ipynb
tutorials/conversational/conversational-model.ipynb
tutorials/fine-tuning/transformers-fine-tuning.ipynb
tutorials/prompt-templating/prompt-templating.ipynb


Options for Logging Transformers Models - Pipelines vs. Component logging
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When working with the transformers flavor in MLflow, there are several important considerations:
- **Input and Output Types**: The input and output types for the python_function implementation may differ from those expected from the native pipeline. Users need to ensure compatibility with their data processing workflows.
- **Model Configuration**: When saving or logging models, ``model_config`` can be used to set certain parameters. However, if both ``model_config`` and a ``ModelSignature`` with parameters are saved, the default parameters in the ``ModelSignature`` will override those in ``model_config``.
- **Audio and Vision Models**: Audio and text-based large language models are supported for use with pyfunc, while other types like computer vision and multi-modal models are only supported for native type loading.
- **Prompt Templates**: Prompt templating is currently supported for a few pipeline types. For a full list of supported pipelines, and more information about the feature, see `this link <guide/index.html#saving-prompt-templates-with-transformer-pipelines>`_.
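The ``model_config`` vs. ``ModelSignature`` precedence noted above can be illustrated with a small dictionary-merge sketch. This is plain Python, not the actual MLflow internals, and the parameter names are illustrative:

```python
# Illustrative only (not the MLflow internals): default parameters from
# a ModelSignature take precedence over the same keys in model_config.
model_config = {"max_new_tokens": 50, "temperature": 0.7}
signature_defaults = {"max_new_tokens": 100}

# In a dict merge, the later mapping wins for duplicate keys, mirroring
# the documented precedence rule.
effective_params = {**model_config, **signature_defaults}
# effective_params == {"max_new_tokens": 100, "temperature": 0.7}
```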

The currently supported pipeline types for Pyfunc can be seen `here <guide/index.html#supported-transformers-pipeline-types-for-pyfunc>`_.

