Improve the code of quickstart 2 - hyperparam search #10344
Conversation
Documentation preview for b22b208 will be available here when this CircleCI job completes successfully. More info
Looks good, just have some comments about cleaning up.
@@ -24,16 +27,18 @@ Set up
------

- Install MLflow. See the :ref:`introductory quickstart <quickstart-1>` for instructions
Do we want to display ``:ref:`` to the user?
Yeah, that's weird Sphinx stuff.
@@ -76,68 +79,79 @@ Now load the dataset and split it into training, validation, and test sets.
)
This isn't part of this change, but I was wondering if we should do this more cleanly:
- Split the data into inputs and labels first, before running the split, instead of in the middle
- Or shuffle and split manually based on predefined split percentages (60/15/25, etc.) instead of using two ``train_test_split`` calls, since with two calls the proportion of the dataset allocated to each set is not immediately apparent
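The second suggestion could be sketched roughly like this (a minimal, hypothetical example with stand-in arrays, not the quickstart's actual data loading; variable names follow the train/valid/test convention):

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 4))  # stand-in features
y = rng.normal(size=100)       # stand-in labels

# Shuffle once, then slice at explicit boundaries: 60% train, 15% val, 25% test.
# The proportions are immediately visible, unlike chained train_test_split calls.
n = len(X)
idx = rng.permutation(n)
train_end = int(0.60 * n)
val_end = int(0.75 * n)

train_x, train_y = X[idx[:train_end]], y[idx[:train_end]]
valid_x, valid_y = X[idx[train_end:val_end]], y[idx[train_end:val_end]]
test_x, test_y = X[idx[val_end:]], y[idx[val_end:]]
```

Each set's share of the data can then be read directly off the slice boundaries.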
Agreed, it would be clearer.
# Evaluate the model
predicted_qualities = model.predict(test_x)
rmse = np.sqrt(mean_squared_error(test_y, predicted_qualities))
eval_result = model.evaluate(test_x, test_y, batch_size=64)
I think for hyperparameter tuning we shouldn't encourage the user to optimize the objective function against the test dataset, but rather against the validation dataset. Optimizing on the test dataset in some sense pollutes the results of testing.
We can optionally use the test data at the end of optimization to see how well the 'best' model actually does.
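The pattern being suggested could look something like this sketch (using scikit-learn's ``Ridge`` with stand-in data as a hypothetical example, not the quickstart's Keras model): tune against the validation set, then touch the test set exactly once at the end.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

# Rows are already in random order, so plain slicing is a valid split here.
train_x, valid_x, test_x = X[:120], X[120:160], X[160:]
train_y, valid_y, test_y = y[:120], y[120:160], y[160:]

# Tune the hyperparameter against the *validation* set only.
best_alpha, best_rmse = None, float("inf")
for alpha in (0.01, 0.1, 1.0, 10.0):
    model = Ridge(alpha=alpha).fit(train_x, train_y)
    rmse = np.sqrt(mean_squared_error(valid_y, model.predict(valid_x)))
    if rmse < best_rmse:
        best_alpha, best_rmse = alpha, rmse

# Only the final, chosen model ever sees the test set.
final_model = Ridge(alpha=best_alpha).fit(train_x, train_y)
test_rmse = np.sqrt(mean_squared_error(test_y, final_model.predict(test_x)))
```

Because the test set never influences which hyperparameters are chosen, ``test_rmse`` remains an honest estimate of generalization.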
very good call, fixed
.. image:: ../../_static/images/quickstart_mlops/mlflow_registry_transitions.png
   :width: 800px
   :align: center
   :alt: Screenshot of MLflow tracking UI models page showing the registered model
I think this image is not rendering
nice catch
(Note that specifying the port as above will be necessary if you are running the tracking server on the
same machine at the default port of **5000**.)

You could also have used a ``runs:/<run_id>`` URI to serve a model, or any supported URI described in :ref:`artifact-stores`.
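For reference, serving from a run URI might look like this (a hedged sketch: the run ID is a placeholder, and the ``model`` artifact path assumes the model was logged under that name):

```shell
# Serve a logged model directly from a run's artifacts.
# <run_id> is a placeholder; "model" is the assumed artifact path.
mlflow models serve -m "runs:/<run_id>/model" --port 5002
```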
I think this ``:ref:`` is also rendered to the user; not sure if we want that.
On the website that will render correctly; the Markdown preview doesn't work 100% well with Sphinx.
Signed-off-by: chenmoneygithub <chen.qian@databricks.com>
Transition the model to **Staging** by choosing the **Stage** dropdown:

.. image:: ../../_static/images/quickstart_mlops/register_model_button.png
   :width: 800px
Can we set this to a percentage value? 70% at full screen will render roughly the same as 800px would, and fixed-width sizes have issues with screen resizing.
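Concretely, the reviewer's suggestion would amount to something like this (a sketch; Sphinx's image directive accepts percentage widths, which are taken relative to the enclosing element):

```rst
.. image:: ../../_static/images/quickstart_mlops/register_model_button.png
   :width: 70%
   :align: center
```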
Related Issues/PRs
#xxx

What changes are proposed in this pull request?
A few improvements, mostly centered around the code part: