Refactor deployment docs and tutorials #10726
Conversation
Documentation preview for 32c1d40 will be available here when this CircleCI job completes successfully.
@@ -124,24 +124,19 @@ MLflow offers support for a variety of deployment targets. For detailed informat
        </a>
      </div>
      <div class="simple-card">
-       <a href="deploy-model-to-ray-serve.html">
+       <a href="../plugins.html#deployment-plugins">
The Ray Serve plugin is not actively maintained. We will probably replace this with a deployment guide for LLMs, but for the time being, just link to the community plugins.
Please open the experiment named "wine-quality" on the left, then click the run named "default-params" in the table.
In this case, you should see parameters including ``alpha`` and ``l1_ratio`` and metrics like ``training_score`` and ``mean_absolute_error_X_test``.
Step 4: Running Hyperparameter Tuning
Is this step necessary?
I initially had the same thought for the original tutorial, but ended up not removing this and positioned this page as an end-to-end tutorial. The main reason is that the pure deployment steps are pretty much covered by the partner's docs, so the main audience I expect here is people who already know about k8s but not much about MLflow. Hence I thought it's not a bad idea to stretch a bit and demonstrate MLflow's capabilities in a realistic scenario - a single training run indeed doesn't look nice in the UI. For people who already have knowledge of MLflow, I added an info box telling them to skip to Step 6 :)
I think that giving the UI some runs is a good idea :)
Sounds good, let's keep this step
=================================
Using MLServer as the Inference Server
--------------------------------------
By default, MLflow deployment uses `Flask <https://flask.palletsprojects.com/en/1.1.x/>`_, a widely used WSGI web application framework for Python, |
It might be nice to provide either a link or a brief description of what WSGI is (it can help to inform why we're saying this about Flask :) ) since the reading audience might not be familiar with the differences between a standard Web Server Gateway Interface and something else (like ASGI or the built-in optimizations for inference serving that MLServer from Seldon has).
What do you think about some brief education for readers on these topics so that they know why we're talking about and supporting MLServer on k8s?
The comparison is briefly described here in the local deployment guide: https://output.circle-artifacts.com/output/job/0576e156-79b5-48d6-a81d-8ceafb03de8b/artifacts/0/docs/build/html/deployment/deploy-model-locally.html#serving-frameworks
But I didn't add low-level details on why, so I will write more specific details there and put a link to it here.
Perfect! :D
MLflow provides an easy-to-use interface for deploying models as a Flask-based inference server. You can deploy the same inference
server to a Kubernetes cluster by containerizing it using the ``mlflow models build-docker`` command. However, this approach may not be scalable
and could be unsuitable for production use cases. Flask is not designed for high performance, and manually managing multiple instances of
Can we say why Flask isn't ideally suited for ML inference (it's blocking), and how async gateways are far better optimized given the potentially long-running nature of inference (depending on the model architecture, size, and optimization of the underlying library) and the scalability issues of a synchronous, blocking web gateway?
Sure! Will add a brief explanation of why.
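For context, the containerization step the excerpt refers to would look roughly like the sketch below. This is a hedged sketch only: it assumes the ``mlflow.models.build_docker`` Python API mirrors the ``mlflow models build-docker`` CLI and exposes an ``enable_mlserver`` flag, and the image name is hypothetical.

import mlflow.models

# Build a Docker image that serves the model; the run ID placeholder matches
# the one used elsewhere in this guide.
mlflow.models.build_docker(
    model_uri="runs:/<run_id_for_your_best_run>/model",
    name="mlflow-wine-model",  # hypothetical image name
    enable_mlserver=True,  # package the MLServer-based server instead of the Flask one (assumed flag)
)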
warnings.filterwarnings("ignore")

alphas = [0.2, 0.5, 1.0]
Grid Search is terrible. Can we do a Random Search instead?
A fun paper to read on the topic: https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf
Yeah, no reason not to do it :)
.. code-block:: python

    from itertools import product
Can we use https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html instead? Autologging automatically creates child runs.
Sure (will use RandomizedSearchCV as per above comment)
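Roughly something like the following sketch. It is illustrative only: synthetic data from ``make_regression`` stands in for the wine-quality dataset used in the tutorial, and the experiment/run names are placeholders.

import mlflow
from scipy.stats import uniform
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in for the wine-quality data (11 feature columns).
X, y = make_regression(n_samples=200, n_features=11, noise=0.1, random_state=42)

# Autologging records parameters/metrics and creates child runs for the
# parameter-search candidates automatically.
mlflow.sklearn.autolog()

search = RandomizedSearchCV(
    ElasticNet(),
    param_distributions={"alpha": uniform(0.01, 1.0), "l1_ratio": uniform(0.0, 1.0)},
    n_iter=10,
    cv=5,
    random_state=42,
)

mlflow.set_experiment("wine-quality")
with mlflow.start_run(run_name="hyperparameter-tuning"):
    search.fit(X, y)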
Step 4: Running Hyperparameter Tuning
-------------------------------------
We might want to add a ref to https://www.mlflow.org/docs/latest/traditional-ml/hyperparameter-tuning-with-child-runs/notebooks/hyperparameter-tuning-with-child-runs.html
For people who want to see a more in-depth guide on hyperparameter tuning
Great idea, will add!
mlflow models serve -m runs:/<run_id_for_your_best_run>/model -p 1234 --enable-mlserver
This command starts a local server listening on port 1234. You can send a request to the server using ``curl`` command: |
Suggested change:
- This command starts a local server listening on port 1234. You can send a request to the server using ``curl`` command:
+ This command starts a local server listening on port 1234. You can send a request to the server using a ``curl`` command:
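For a quick smoke test from Python rather than curl, something like the sketch below could also work. This is a hedged sketch assuming the server started above listens on localhost:1234 and accepts the default ``/invocations`` scoring payload; the column names and values are placeholders, not the tutorial's real feature columns.

import requests

# Placeholder payload -- replace the columns/values with the feature columns
# your model was actually trained on.
payload = {
    "dataframe_split": {
        "columns": ["feature_1", "feature_2"],
        "data": [[0.5, 0.2]],
    }
}

response = requests.post("http://localhost:1234/invocations", json=payload, timeout=10)
print(response.status_code, response.json())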
For this tutorial, we'll push the image to `Docker Hub <https://hub.docker.com/>`_, but you can use any other Docker registry,
such as `Amazon ECR <https://aws.amazon.com/ecr/>`_ or a private registry.

If you don't have a Docker Hub account yet, create one at https://hub.docker.com/signup. |
link
By default, MLflow stores the model in the local file system, so you need to configure MLflow to store the model in remote storage.
Please refer to `Artifact Store <../../../tracking.html#artifact-stores>`_ for setup instructions.

After configuring the artifact store, repeat the model training steps. |
Or load the best model from the model URI and re-log it from its in-memory object?
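Something like this minimal sketch is what I have in mind: load the already-trained best model from its run and re-log it so the new copy lands in the remote artifact store, instead of re-running training. The run ID is a placeholder, and the run name "re-log-best-model" is illustrative only.

import mlflow

# Load the best model from its existing run.
model = mlflow.sklearn.load_model("runs:/<run_id_for_your_best_run>/model")

# Re-log it under a new run; the artifact now goes to the configured remote store.
with mlflow.start_run(run_name="re-log-best-model"):
    mlflow.sklearn.log_model(model, artifact_path="model")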
Fantastic work here @B-Step62 !! This is a HUGE improvement with a ton of great detail and a very easy to read step by step guide to something that is quite complex for most users.
After addressing the remaining comments, let's get this merged so we can push it out with the next site push (probably early next week when I get time :) )
LGTM!
Related Issues/PRs
#xxx

What changes are proposed in this pull request?
Follow-up on #10675. Refactoring the existing Kubernetes deployment guide to include concrete and runnable steps.