Introduce get_parent_run fluent API #8493

annzhang-db · 2023-05-22T22:15:20Z

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Add get_parent_run to mlflow client API
Add get_parent_run fluent API

How is this patch tested?

Existing unit/integration tests
New unit/integration tests
Manual tests (describe details, including test results, below)

Does this PR change the documentation?

No. You can skip the rest of this section.
Yes. Make sure the changed pages / sections render correctly in the documentation preview.

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

Introduce mlflow.get_parent_run() fluent API

What component(s), interfaces, languages, and integrations does this PR affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

mlflow-automation · 2023-05-22T22:40:12Z

Documentation preview for 6a6ff89 will be available here when this CircleCI job completes successfully.

More info

Ignore this comment if this PR does not change the documentation.
It takes a few minutes for the preview to be available.
The preview is updated when a new commit is pushed to this PR.
This comment was created by https://github.com/mlflow/mlflow/actions/runs/5071443109.

prithvikannan

This looks great @annzhang-db. Can we also extend this to the mlflow client APIs (see mlflow/tracking/client.py)? Similar to the way we call MlflowClient().get_run(run_id), it would be helpful to call MlflowClient().get_parent_run(run_id).

sunishsheth2009

Looks good to me :) Awesome work

sunishsheth2009 · 2023-05-23T00:21:32Z

mlflow/tracking/fluent.py

@@ -518,6 +518,43 @@ def get_run(run_id: str) -> Run:
    return MlflowClient().get_run(run_id)


+def get_parent_run(run_id: str) -> Optional[Run]:
+    """
+    Gets the parent run for the given run id.


Gets the parent run for the given run id if one exists

harupy · 2023-05-23T00:32:50Z

mlflow/tracking/fluent.py

+    child_run = MlflowClient().get_run(run_id)
+    parent_run_id = child_run.data.tags.get(MLFLOW_PARENT_RUN_ID)
+    if parent_run_id is None:
+        return None
+    return MlflowClient().get_run(parent_run_id)


Suggested change

-    child_run = MlflowClient().get_run(run_id)
-    parent_run_id = child_run.data.tags.get(MLFLOW_PARENT_RUN_ID)
-    if parent_run_id is None:
-        return None
-    return MlflowClient().get_run(parent_run_id)
+    client = MlflowClient()
+    child_run = client.get_run(run_id)
+    parent_run_id = child_run.data.tags.get(MLFLOW_PARENT_RUN_ID)
+    if parent_run_id is None:
+        return None
+    return client.get_run(parent_run_id)

Can we reuse the client?
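
For illustration, here is a minimal sketch of the lookup with a single client instance. It runs without a tracking server: `CountingClient` is a hypothetical stand-in for `MlflowClient` (not part of MLflow), and the `client_cls` parameter exists only so the sketch can inject it. The tag key is the string behind `MLFLOW_PARENT_RUN_ID`.

```python
from types import SimpleNamespace
from typing import Optional

# String value of mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID
MLFLOW_PARENT_RUN_ID = "mlflow.parentRunId"


class CountingClient:
    """Hypothetical stand-in for MlflowClient that counts its constructions."""

    instances = 0
    # run_id -> tags; "child" is nested under "parent"
    _tags = {"parent": {}, "child": {MLFLOW_PARENT_RUN_ID: "parent"}}

    def __init__(self) -> None:
        CountingClient.instances += 1

    def get_run(self, run_id: str) -> SimpleNamespace:
        return SimpleNamespace(
            info=SimpleNamespace(run_id=run_id),
            data=SimpleNamespace(tags=self._tags[run_id]),
        )


def get_parent_run(run_id: str, client_cls=CountingClient) -> Optional[SimpleNamespace]:
    """Return the parent run of ``run_id``, or None if it has no parent."""
    client = client_cls()  # constructed once, reused for both lookups
    child_run = client.get_run(run_id)
    parent_run_id = child_run.data.tags.get(MLFLOW_PARENT_RUN_ID)
    if parent_run_id is None:
        return None
    return client.get_run(parent_run_id)
```

Calling `get_parent_run("child")` here constructs the client once, where the original hunk would have constructed it twice.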

harupy · 2023-05-23T00:33:49Z

tests/tracking/fluent/test_fluent.py

+def test_get_parent_run():
+    parent_run_id = mlflow.start_run().info.run_id
+    mlflow.log_param("a", 1)
+    mlflow.log_metric("b", 2.0)
+    child_run_id = mlflow.start_run(nested=True).info.run_id
+    mlflow.end_run()
+    mlflow.end_run()
+
+    parent_run = mlflow.get_parent_run(child_run_id)
+    assert parent_run.info.run_id == parent_run_id
+    assert parent_run.data.metrics == {"b": 2.0}
+    assert parent_run.data.params == {"a": "1"}


Can we also add a test that ensures get_parent_run returns None if the parent run doesn't exist?
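
A rough sketch of what such a test could check, written against an in-memory stand-in so it runs without a tracking backend. `InMemoryClient` and `make_run` are hypothetical helpers, not MLflow APIs; the test in the PR would presumably start a top-level run with `mlflow.start_run()` and call `mlflow.get_parent_run()` on its id instead.

```python
from types import SimpleNamespace

# String value of mlflow.utils.mlflow_tags.MLFLOW_PARENT_RUN_ID
MLFLOW_PARENT_RUN_ID = "mlflow.parentRunId"


def make_run(run_id, parent_run_id=None):
    """Build a minimal run-like object; nested runs carry the parent tag."""
    tags = {} if parent_run_id is None else {MLFLOW_PARENT_RUN_ID: parent_run_id}
    return SimpleNamespace(
        info=SimpleNamespace(run_id=run_id),
        data=SimpleNamespace(tags=tags),
    )


class InMemoryClient:
    """Hypothetical stand-in for MlflowClient backed by a dict of runs."""

    def __init__(self, runs):
        self._runs = {run.info.run_id: run for run in runs}

    def get_run(self, run_id):
        return self._runs[run_id]

    def get_parent_run(self, run_id):
        parent_run_id = self.get_run(run_id).data.tags.get(MLFLOW_PARENT_RUN_ID)
        return None if parent_run_id is None else self.get_run(parent_run_id)


def test_get_parent_run_returns_none_for_top_level_run():
    client = InMemoryClient([make_run("top")])
    assert client.get_parent_run("top") is None


def test_get_parent_run_returns_parent_for_nested_run():
    client = InMemoryClient([make_run("parent"), make_run("child", "parent")])
    assert client.get_parent_run("child").info.run_id == "parent"
```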

harupy

Left comments, otherwise LGTM!

harupy · 2023-05-23T02:57:24Z

tests/tracking/fluent/test_fluent.py

+    parent_run_id = mlflow.start_run().info.run_id
+    mlflow.log_param("a", 1)
+    mlflow.log_metric("b", 2.0)
+    child_run_id = mlflow.start_run(nested=True).info.run_id
+    mlflow.end_run()
+    mlflow.end_run()


Suggested change

-    parent_run_id = mlflow.start_run().info.run_id
-    mlflow.log_param("a", 1)
-    mlflow.log_metric("b", 2.0)
-    child_run_id = mlflow.start_run(nested=True).info.run_id
-    mlflow.end_run()
-    mlflow.end_run()
+    with mlflow.start_run() as parent:
+        mlflow.log_param("a", 1)
+        mlflow.log_metric("b", 2.0)
+        with mlflow.start_run(nested=True) as child:

Can we use with mlflow.start_run()?
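
One reason to prefer the `with` form is that each run is ended when its block exits, even if the body raises, so no paired `mlflow.end_run()` calls are needed. The sketch below illustrates that behavior with a hypothetical `start_run` stand-in built on `contextlib`; it is not MLflow's implementation, and the `active_runs`/`ended_runs` lists exist only for the demonstration.

```python
from contextlib import contextmanager

active_runs = []  # stack of runs that are currently open
ended_runs = []   # runs in the order they were closed


@contextmanager
def start_run(name):
    """Hypothetical stand-in for mlflow.start_run: push on entry, always pop on exit."""
    active_runs.append(name)
    try:
        yield name
    finally:
        # Runs on normal exit and on exceptions alike.
        ended_runs.append(active_runs.pop())


def run_nested(fail=False):
    """Open a parent run with a nested child, optionally failing inside the child."""
    with start_run("parent"):
        with start_run("child"):
            if fail:
                raise RuntimeError("boom")
```

After `run_nested(fail=True)` raises, `active_runs` is empty again: the child is closed first, then the parent.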

WeichenXu123

LGTM once @harupy's comments are addressed.

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

prithvikannan

Awesome work @annzhang-db! Just one small nit comment and a test.

prithvikannan · 2023-05-23T21:36:24Z

mlflow/tracking/client.py

+                with mlflow.start_run(nested=True) as child_run:
+                    child_run_id = child_run.info.run_id
+
+            parent_run = mlflow.get_parent_run(child_run_id)


Can we update this to demonstrate the MlflowClient approach? Something like:

parent_run = MlflowClient().get_parent_run

prithvikannan · 2023-05-23T21:36:43Z

mlflow/tracking/fluent.py

+        child_run_id: 7d175204675e40328e46d9a6a5a7ee6a
+        parent_run_id: 8979459433a24a52ab3be87a229a9cdf
+    """
+    return MlflowClient().get_parent_run(run_id)


mlflow/tracking/client.py

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

Introduce get_parent_run fluent API

c976cbe

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

annzhang-db requested a review from prithvikannan May 22, 2023 22:17

Fix code block

5487052

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

github-actions bot added the area/tracking (Tracking service, tracking client APIs, autologging) and rn/feature (Mention under Features in Changelogs) labels May 22, 2023

prithvikannan reviewed May 22, 2023

sunishsheth2009 approved these changes May 23, 2023

harupy reviewed May 23, 2023

harupy approved these changes May 23, 2023

harupy reviewed May 23, 2023

WeichenXu123 approved these changes May 23, 2023

annzhang-db added 2 commits May 23, 2023 12:57

Add get_parent_run to client API

db098e6

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

Modify unit tests

d974bde

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

annzhang-db requested a review from prithvikannan May 23, 2023 20:41

prithvikannan approved these changes May 23, 2023

annzhang-db added 3 commits May 23, 2023 17:24

modify code example for client API

2380c77

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

Modify import for get_parent_run client API

d6570ff

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

Merge branch 'master' into get_parent_run

6a6ff89

Signed-off-by: Ann Zhang <ann.zhang@databricks.com>

annzhang-db enabled auto-merge (squash) May 24, 2023 17:35

annzhang-db merged commit 9ec828d into mlflow:master May 24, 2023
26 checks passed
