Enable auto dependency inference in spark flavor #4759
@@ -13,6 +13,7 @@
 from mlflow.utils.file_utils import write_to
 from mlflow.pyfunc import MAIN
 from mlflow.models.model import MLMODEL_FILE_NAME, Model
+from mlflow.utils.databricks_utils import is_in_databricks_runtime


 def _get_top_level_module(full_module_name):

@@ -84,6 +85,11 @@ def main():
     cap_cm = _CaptureImportedModules()

+    if flavor == "spark" and is_in_databricks_runtime():
+        from dbruntime.spark_connection import initialize_spark_connection
+
+        initialize_spark_connection()
Review comments on the `initialize_spark_connection()` call:

- What about the case where we are not in the Databricks runtime?
- Reply: If not, a new Spark session is created in the following code: Lines 687 to 700 in c4b8e84.
- Shall we add a try/catch here and add fallback handling?
- Reply: Agree with @WeichenXu123, I think a try/catch is a good idea here.
     # If `model_path` refers to an MLflow model directory, load the model using
     # `mlflow.pyfunc.load_model`
     if os.path.isdir(model_path) and MLMODEL_FILE_NAME in os.listdir(model_path):
Review comments:

- This approach breaks when `initialize_spark_connection` is renamed or moved to a different module.
- Should we make the Spark initialization process modifiable via monkey-patching or an environment variable?
- Add an MLR test to prevent this from breaking.