Use REPL context attributes if available to avoid calling JVM methods #5132
Conversation
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
mlflow/utils/databricks_utils.py
Outdated
_env_var_prefix = "DATABRICKS_"


def _use_env_var_if_exists(env_var, *, if_exists=lambda x: os.environ[x]):
Introduced this decorator to make it easier to preserve the existing logic for older runtime versions.
""" | ||
|
||
def decorator(f): | ||
@functools.wraps(f) |
nice use of the decorator factory here. +1
really clever, elegant, and simplified solution.
mlflow/utils/databricks_utils.py
Outdated
@@ -50,6 +79,7 @@ def _get_context_tag(context_tag_key):
    return None


@_use_env_var_if_exists(_env_var_prefix + "ACL_PATH_OF_ACL_ROOT")
Can we prefix these environment variables with DATABRICKS?
We probably don't need ACL_PATH_OF_ACL_ROOT, since this is used for is_in_databricks_notebook / get_notebook_id. We can rely on DATABRICKS_NOTEBOOK_ID for those.
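A sketch of the fallback order suggested here (the helper name `_legacy_acl_path_check` is hypothetical, standing in for the old JVM-backed lookup):

```python
import os


def _legacy_acl_path_check():
    # Placeholder for the legacy JVM-backed ACL_PATH_OF_ACL_ROOT lookup;
    # always False off-cluster in this sketch.
    return False


def is_in_databricks_notebook():
    # Prefer the REPL-context-derived env var; fall back to the old check.
    if os.environ.get("DATABRICKS_NOTEBOOK_ID"):
        return True
    return _legacy_acl_path_check()
```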
@dbczumar Thanks for the comment! _env_var_prefix already adds DATABRICKS_, or am I missing something?
Doh. Sorry - missed that.
We probably don't need ACL_PATH_OF_ACL_ROOT, since this is used for is_in_databricks_notebook / get_notebook_id. We can rely on DATABRICKS_NOTEBOOK_ID for those.
Makes sense!
LGTM once #5132 (comment) is addressed. Thanks Haru!
@BenWilson2 @dbczumar Thanks for the review, I still need to update the code for dynamic metadata (e.g. command run id).
@@ -133,6 +166,7 @@ def get_notebook_id():
    return None


@_use_env_var_if_exists(_ENV_VAR_PREFIX + "NOTEBOOK_PATH")
def get_notebook_path():
    """Should only be called if is_in_databricks_notebook is true"""
    path = _get_property_from_spark_context("spark.databricks.notebook.path")
Does this work with ephemeral notebooks, both within and outside of jobs?
Signed-off-by: harupy <hkawamura0130@gmail.com>
re-LGTM! Thanks @harupy !
Force-pushed from 30473e2 to 5ac0475.
What changes are proposed in this pull request?
Use REPL context attributes if available to avoid calling JVM methods.
How is this patch tested?
Installed mlflow from this branch on Databricks and confirmed that mlflow code can run under multiprocessing.
Does this PR change the documentation?

- If so, check the ci/circleci: build_doc CI job. If it's successful, proceed to the next step, otherwise fix it.
- Click Details on the right to open the CircleCI job page, then open the Artifacts tab and preview docs/build/html/index.html.

Release Notes
Is this a user-facing change?
(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)
What component(s), interfaces, languages, and integrations does this PR affect?
Components
- area/artifacts: Artifact stores and artifact logging
- area/build: Build and test infrastructure for MLflow
- area/docs: MLflow documentation pages
- area/examples: Example code
- area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
- area/models: MLmodel format, model serialization/deserialization, flavors
- area/projects: MLproject format, project running backends
- area/scoring: MLflow Model server, model deployment tools, Spark UDFs
- area/server-infra: MLflow Tracking server backend
- area/tracking: Tracking Service, tracking client APIs, autologging

Interface
- area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
- area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
- area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
- area/windows: Windows support

Language
- language/r: R APIs and clients
- language/java: Java APIs and clients
- language/new: Proposals for new client languages

Integrations
- integrations/azure: Azure and Azure ML integrations
- integrations/sagemaker: SageMaker integrations
- integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:
- rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- rn/feature - A new user-facing feature worth mentioning in the release notes
- rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- rn/documentation - A user-facing documentation change worth mentioning in the release notes