AutoML support for MLflow Pipelines using FLAML #6959

Merged
merged 23 commits into mlflow:master on Oct 14, 2022

Conversation

@mshtelma (Contributor) commented Oct 3, 2022

Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>

Related Issues/PRs

What changes are proposed in this pull request?

This PR adds AutoML support to MLflow Pipelines using the FLAML library.

How is this patch tested?

  • I have written tests (not required for typo or doc fix) and confirmed the proposed feature/bug-fix/change works.

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly by following the steps below.
  1. Click the Details link on the Preview docs check.
  2. Find the changed pages / sections and make sure they render correctly.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Support training via AutoML using FLAML in MLflow Pipelines.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: mshtelma <mshtelma@gmail.com>
@dbczumar changed the title from "ML-25576 AutoML support for MLflow Pipelines using FLAML" to "AutoML support for MLflow Pipelines using FLAML" on Oct 3, 2022
Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
@github-actions bot added the area/recipes label (MLflow Recipes, Recipes APIs, Recipes configs, Recipe Templates) on Oct 3, 2022
@prithvikannan added the rn/feature label (Mention under Features in Changelogs) on Oct 3, 2022
prithvikannan and others added 5 commits October 3, 2022 13:17
Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
# Conflicts:
#	mlflow/pipelines/steps/train.py
@mlflow-automation (Collaborator) commented Oct 10, 2022

Documentation preview will be available here.

Notes
  • Ignore this comment if this PR does not change the documentation.
  • It takes a few minutes for the preview to be available.
  • The preview is updated on every commit to this PR.
  • Job URL: https://circleci.com/gh/mlflow/mlflow/33203
  • Updated at: 2022-10-14 21:49:00.554120

Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
@prithvikannan (Collaborator) left a comment

Thanks for building this out @mshtelma -- this is awesome! I've left a few small comments, mostly about code styling.

Comment on lines +49 to +50
# pylint: disable=keyword-arg-before-vararg
# pylint: disable=unused-argument
Collaborator

Why do we need to do this? Can we just remove labels, weight_val, and weight_train and these pylint lines?

Contributor Author

This is the signature that FLAML expects, so I believe we should follow it.
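For reference, a minimal sketch of the custom metric signature FLAML documents for AutoML.fit(metric=...); the body below is illustrative, but the parameter list (including the unused arguments) follows FLAML's calling convention, which is why the pylint suppressions are needed:

def custom_metric(
    X_val, y_val, estimator, labels,
    X_train, y_train, weight_val=None, weight_train=None,
    *args,
):
    # FLAML calls the metric with all of these arguments even if some are
    # unused, hence the keyword-arg-before-vararg and unused-argument
    # suppressions.
    from sklearn.metrics import mean_squared_error  # illustrative metric choice

    val_loss = mean_squared_error(y_val, estimator.predict(X_val))
    # FLAML expects a tuple of (loss_to_minimize, dict_of_metrics_to_log).
    return val_loss, {"val_mse": val_loss}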

Comment on lines 98 to 103
else:
_logger.warning(
f"There is no FLAML alternative or custom metric for {primary_metric} metric.\n"
f"Using 'auto' metric instead."
)
metric = "auto"
Collaborator

I think that we should throw an error in this case. We should require the user to specify a primary metric (which we do), and either respect that metric or error.

Contributor Author

done
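For illustration, a hedged sketch of the fail-fast behavior described above; the helper and mapping names are assumptions that mirror the renaming discussed later in this review:

from mlflow.exceptions import MlflowException
from mlflow.protos.databricks_pb2 import INVALID_PARAMETER_VALUE


def _resolve_flaml_metric(primary_metric, flaml_metric_map):
    # Respect the user-specified primary metric or error out; never silently
    # fall back to FLAML's "auto" metric.
    if primary_metric in flaml_metric_map:
        return flaml_metric_map[primary_metric]
    raise MlflowException(
        f"There is no FLAML alternative or custom metric for the "
        f"'{primary_metric}' metric.",
        error_code=INVALID_PARAMETER_VALUE,
    )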

automl_settings["metric"] = metric
automl_settings["task"] = "regression"

mlflow.autolog(disable=True)
Collaborator

Why do we have to disable autologging here? In microsoft/FLAML#747, we fixed the mlflow-FLAML integration so that the logging is in the form we want.

Contributor Author

If I comment out this line, I get a weird exception in our autologging code. By the way, it looks like an MLflow bug:

Traceback (most recent call last):
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/pipelines/steps/automl/flaml.py", line 114, in _create_model_automl
    automl.fit(X, y, **automl_settings)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/automl.py", line 2898, in fit
    self._search()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/automl.py", line 3462, in _search
    self._search_sequential()
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/automl.py", line 3280, in _search_sequential
    analysis = tune.run(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/tune/tune.py", line 514, in run
    result = evaluation_function(trial_to_run.config)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/automl.py", line 350, in _compute_with_config_base
    ) = compute_estimator(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/ml.py", line 610, in compute_estimator
    val_loss, metric_for_logging, train_time, pred_time = evaluate_model_CV(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/ml.py", line 523, in evaluate_model_CV
    val_loss_i, metric_i, train_time_i, pred_time_i = get_val_loss(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/ml.py", line 412, in get_val_loss
    estimator.fit(X_train, y_train, budget, **fit_kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/model.py", line 1133, in fit
    self._fit(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/flaml/model.py", line 197, in _fit
    model.fit(X_train, y_train, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 555, in safe_patch_function
    patch_function(call_original, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 254, in patch_with_managed_run
    result = patch_function(original, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/sklearn/__init__.py", line 1552, in patched_fit
    result = fit_impl(original, self, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/sklearn/__init__.py", line 1299, in fit_mlflow_xgboost_and_lightgbm
    fit_output = original(self, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 536, in call_original
    return call_original_fn_with_event_logging(_original_fn, og_args, og_kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 471, in call_original_fn_with_event_logging
    original_fn_result = original_fn(*og_args, **og_kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 533, in _original_fn
    original_result = original(*_og_args, **_og_kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/lightgbm/sklearn.py", line 895, in fit
    super().fit(X, y, sample_weight=sample_weight, init_score=init_score,
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 555, in safe_patch_function
    patch_function(call_original, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 254, in patch_with_managed_run
    result = patch_function(original, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/sklearn/__init__.py", line 1559, in patched_fit
    return original(self, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 536, in call_original
    return call_original_fn_with_event_logging(_original_fn, og_args, og_kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 471, in call_original_fn_with_event_logging
    original_fn_result = original_fn(*og_args, **og_kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 533, in _original_fn
    original_result = original(*_og_args, **_og_kwargs)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlflow-dev-env/lib/python3.8/site-packages/lightgbm/sklearn.py", line 748, in fit
    self._Booster = train(
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 555, in safe_patch_function
    patch_function(call_original, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/safety.py", line 254, in patch_with_managed_run
    result = patch_function(original, *args, **kwargs)
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/lightgbm.py", line 665, in train
    param_logging_operations.await_completion()
  File "/Users/michael.shtelma/Development/Projects/mlflow_msh/mlflow/utils/autologging_utils/client.py", line 69, in await_completion
    raise MlflowException(
mlflow.exceptions.MlflowException: The following failures occurred while performing one or more logging operations: [MlflowException('Failed to perform one or more operations on the run with ID 91aa473ed5cd417184e4de742889c273. Failed operations: [MlflowException(\'Changing param values is not allowed. Params were already logged=\\\'[{\\\'key\\\': \\\'boosting_type\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'gbdt\\\'}, {\\\'key\\\': \\\'max_depth\\\', \\\'old_value\\\': \\\'0\\\', \\\'new_value\\\': \\\'-1\\\'}, {\\\'key\\\': \\\'min_child_samples\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'20\\\'}, {\\\'key\\\': \\\'min_child_weight\\\', \\\'old_value\\\': \\\'0.9999999999999993\\\', \\\'new_value\\\': \\\'0.001\\\'}, {\\\'key\\\': \\\'min_split_gain\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'0.0\\\'}, {\\\'key\\\': \\\'num_leaves\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'4\\\'}, {\\\'key\\\': \\\'subsample_for_bin\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'200000\\\'}, {\\\'key\\\': \\\'subsample_freq\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'0\\\'}, {\\\'key\\\': \\\'max_bin\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'255\\\'}, {\\\'key\\\': \\\'verbose\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'-1\\\'}, {\\\'key\\\': \\\'objective\\\', \\\'old_value\\\': \\\'reg:squarederror\\\', \\\'new_value\\\': \\\'regression\\\'}, {\\\'key\\\': \\\'metric\\\', \\\'old_value\\\': None, \\\'new_value\\\': "[\\\'regression\\\']"}, {\\\'key\\\': \\\'feature_name\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'auto\\\'}, {\\\'key\\\': \\\'categorical_feature\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'auto\\\'}, {\\\'key\\\': \\\'evals_result\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'None\\\'}, {\\\'key\\\': \\\'verbose_eval\\\', \\\'old_value\\\': \\\'True\\\', \\\'new_value\\\': \\\'warn\\\'}, {\\\'key\\\': \\\'keep_training_booster\\\', \\\'old_value\\\': None, \\\'new_value\\\': \\\'False\\\'}]\\\' for run ID=\\\'91aa473ed5cd417184e4de742889c273\\\'.\')]')]

Collaborator

Mind adding a ticket to follow up on this issue?

Contributor Author

Comment on lines 118 to 121
_logger.warning(
f"Error has occurred during training of AutoML model using FLAML: {repr(e)}",
exc_info=True,
)
Collaborator

I don't think we need this logging since we will throw the exception with this text.

Contributor Author

done

Comment on lines 137 to 143
if "estimator_method" not in self.step_config and self.step_config["using"] in [
"estimator_spec"
]:
raise MlflowException(
"Missing 'estimator_method' configuration in the train step.",
error_code=INVALID_PARAMETER_VALUE,
)
Collaborator

Can we update the exception message to mention using estimator_spec?

Contributor Author

Done

return estimator

def _resolve_estimator_plugin(self, plugin_str, X_train, y_train):
plugin_str = plugin_str.replace("/", ".").replace("@", ".")
Collaborator

Why do we need this? Can we just make the user plugin str automl.flaml?

Contributor Author

The idea is to have some sort of plugin mechanism so that in the future, if we decide to add AutoGluon, for example, we can just implement autogluon.py, and "automl/autogluon" will work out of the box.
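For context, a hedged sketch of how such a plugin string could resolve to an implementation module; the normalization and package lookup mirror the snippet quoted in the next thread, while the helper name here is illustrative:

import importlib
import sys


def _resolve_plugin_module(plugin_str):
    # "automl/flaml" (or "automl@flaml") -> "<this package>.automl.flaml",
    # so adding e.g. an autogluon.py module makes "automl/autogluon" work
    # without further wiring.
    plugin_path = plugin_str.replace("/", ".").replace("@", ".")
    plugin_module_str = f"{sys.modules[__name__].__package__}.{plugin_path}"
    return importlib.import_module(plugin_module_str)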

Comment on lines 394 to 395
plugin_module_str = f"{sys.modules[__name__].__package__}.{plugin_str}"
estimator_fn = getattr(importlib.import_module(plugin_module_str), "get_estimator")
Collaborator

Can we follow the pattern on L373-374 instead?

}


def get_estimator(
Collaborator

Can we rename this to get_estimator_and_best_params?

Contributor Author

Done

Comment on lines 142 to 153
def _load_one_custom_metric_function(pipeline_root: str, metric: PipelineMetric) -> callable:
try:
sys.path.append(pipeline_root)
custom_metrics_mod = importlib.import_module("steps.custom_metrics")
return getattr(custom_metrics_mod, metric.custom_function)
except Exception as e:
raise MlflowException(
message="Failed to load custom metric functions",
error_code=BAD_REQUEST,
) from e


Collaborator

Can we reuse the existing _load_custom_metric_functions by just specifying a single metric in the metrics list?

Contributor Author

Great point. Done!
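The reuse amounts to something like the sketch below, assuming _load_custom_metric_functions(pipeline_root, metrics) returns the loaded callables in the same order as the metrics list (the exact signature is an assumption):

def _load_one_custom_metric_function(pipeline_root, metric):
    # Delegate to the existing multi-metric loader with a single-element list
    # instead of keeping a separate single-metric implementation.
    return _load_custom_metric_functions(pipeline_root, [metric])[0]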

@jinzhang21 (Collaborator) left a comment

Reviewed the first 2 files. Will take a look at the rest later. Thanks, @mshtelma !

Comment on lines 14 to 15
AUTOML_DEFAULT_TIME_BUDGET = 10
MLFLOW_TO_FLAML_METRICS = {
Collaborator

Add "_" in front since we don't want to expose these params externally.
_AUTOML_DEFAULT_TIME_BUDGET and _MLFLOW_TO_FLAML_METRICS

Contributor Author

done

"time_budget_secs", AUTOML_DEFAULT_TIME_BUDGET
)
automl_settings["metric"] = metric
automl_settings["task"] = "regression"
Collaborator

Can we pull automl_settings["task"] out as a function argument? We are building the classification pipeline right now, and it would make sense to support both problem types in the train step.

Contributor Author

Great point! Done!
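A hedged sketch of what pulling the task out as an argument might look like; the helper name matches the traceback earlier in this thread, but the exact signature and settings handling are illustrative:

from flaml import AutoML


def _create_model_automl(X, y, task, flaml_metric, automl_settings=None):
    # The task ("regression" or "classification") is supplied by the caller,
    # so the same helper can serve both pipeline templates.
    automl_settings = dict(automl_settings or {})
    automl_settings["metric"] = flaml_metric
    automl_settings["task"] = task

    automl = AutoML()
    automl.fit(X, y, **automl_settings)
    return automl.model.estimator, automl.best_config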

Comment on lines 118 to 121
_logger.warning(
f"Error has occurred during training of AutoML model using FLAML: {repr(e)}",
exc_info=True,
)
Collaborator

I don't think it's necessary to both log and raise. Simple raise should suffice.

Contributor Author

done

Comment on lines +112 to +115
mlflow.autolog(disable=True)
automl = AutoML()
automl.fit(X, y, **automl_settings)
mlflow.autolog(disable=False, log_models=False)
Collaborator

Can we add a comment on the expected autolog behavior change here?

Contributor Author

Done. It looks like a bug in the autologging functionality.
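For reference, a commented sketch of the pattern under discussion; X, y, and automl_settings are assumed to come from the surrounding train step, and the comment wording is illustrative:

import mlflow
from flaml import AutoML

# Disable autologging while FLAML searches: each trial re-fits estimators under
# the same run, and the sklearn/LightGBM autologgers then try to overwrite
# already-logged params, raising MlflowException ("Changing param values is not
# allowed"). Re-enable afterwards without model logging, since the train step
# logs the chosen model itself.
mlflow.autolog(disable=True)
automl = AutoML()
automl.fit(X, y, **automl_settings)
mlflow.autolog(disable=False, log_models=False)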

mshtelma and others added 4 commits October 12, 2022 21:43
# Conflicts:
#	mlflow/pipelines/steps/train.py
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
automl_settings["metric"] = metric
automl_settings["task"] = "regression"

mlflow.autolog(disable=True)
Collaborator

Mind adding a ticket to follow up on this issue?

@@ -135,14 +136,15 @@ def _validate_and_apply_step_config(self):
error_code=INVALID_PARAMETER_VALUE,
)
self.skip_data_profiling = self.step_config.get("skip_data_profiling", False)
if "estimator_method" not in self.step_config:
if "estimator_method" not in self.step_config and self.step_config["using"] in [
Collaborator

self.step_config["using"] == "estimator_spec"? The exception message below is specific to this condition.

Contributor Author

Yes, exactly.

Comment on lines 690 to 691
f"<pre>{param}: {value}</pre><br>"
for param, value in self.best_parameters.items()
Collaborator

Can the user simply copy these params into pipeline.yaml to create a baseline model?

Contributor Author

Yes, of course, this should work.

@@ -887,7 +946,7 @@ def objective(X_train, y_train, validation_df, hyperparameter_args, on_worker=Fa
def _log_estimator_to_mlflow(self, estimator, X_train_sampled, on_worker=False):
from mlflow.models.signature import infer_signature

if hasattr(estimator, "best_score_"):
if hasattr(estimator, "best_score_") and (type(estimator.best_score_) in [int, float]):
Collaborator

What other types could best_score_ be? None?

Contributor Author

The issue is that in some cases (LightGBM, as far as I remember) it can be a dict. It is not related to AutoML; I have seen this error before as well, but with AutoML it is just easier to reproduce.
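To make the guard concrete, a small hedged illustration; the dict shape shown is just an example of what LightGBM's sklearn wrapper can expose:

def _loggable_best_score(estimator):
    # best_score_ is usually a float, but e.g. LGBMRegressor exposes a dict
    # like {"valid_0": {"l2": 0.42}}, which cannot be logged as a single
    # numeric metric, so only plain numbers are passed through.
    score = getattr(estimator, "best_score_", None)
    return score if isinstance(score, (int, float)) else None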

@@ -60,7 +60,14 @@ def setup_train_dataset(pipeline_root: Path):


# Sets up the constructed TrainStep instance
def setup_train_step(pipeline_root: Path, use_tuning: bool, with_hardcoded_params: bool = True):
def setup_train_step(
Collaborator

Can we split this function into setup_train_step_with_tuning and setup_train_step_with_automl? They are not very related, and the function has become difficult to reason about with so many knobs.

Contributor Author

done!

@jinzhang21 (Collaborator) left a comment

Thanks, @mshtelma! LGTM overall with some minor comments. Please attach a screenshot of the step card output. Thanks.

mshtelma and others added 3 commits October 14, 2022 08:26
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
@mshtelma (Contributor Author)

Here is the screenshot of the train step card:

@prithvikannan (Collaborator) left a comment

LGTM, thanks @mshtelma for driving this effort!

@@ -61,8 +61,9 @@ def __init__(self, step_config, pipeline_root, pipeline_config=None):
self.pipeline_config = pipeline_config

def _validate_and_apply_step_config(self):
self.task = self.step_config.get("template_name", "regression/v1").rsplit("/", 1)[0]
Collaborator

nice - clever way to not hardcode the flaml task
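For example, assuming template names follow the <task>/v1 pattern, the derivation behaves like this:

# "regression/v1"     -> "regression"
# "classification/v1" -> "classification"
task = "classification/v1".rsplit("/", 1)[0]
assert task == "classification"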

@prithvikannan (Collaborator) left a comment

Sorry, one small comment about writing outputs to YAML.

Comment on lines 684 to 701
if self.step_config["using"].startswith("automl"):
automl_card_tab = card.add_tab(
"Best Estimator (FLAML)",
"{{ AUTOML }} ",
)
params_html = "".join(
[f"<pre>{param}: {value}</pre>" for param, value in self.best_parameters.items()]
)
automl_card_tab.add_html(
"AUTOML",
f"<b>Best estimator:</b><br>"
f"<pre>{self.best_estimator_name}</pre><br>"
f"<b>Best estimator class:</b><br>"
f"<pre>{self.best_estimator_class}</pre><br><br>"
f"<b>Best parameters:</b><br>"
f"{params_html}<br><br>",
)

Collaborator

Can we write the automl parameters output to a yaml file, and also log that file to mlflow? This is important for getting users from automl to estimator_spec training.

Contributor Author

Done!

Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
@prithvikannan (Collaborator) left a comment

Thanks for this change! Left a small comment about changing the best_parameters.yaml syntax, but we are almost there.

@@ -414,6 +414,7 @@ def _resolve_estimator_plugin(self, plugin_str, X_train, y_train):
f"{estimator.__class__.__module__}.{estimator.__class__.__name__}"
)
self.best_parameters = best_parameters
self._write_best_parameters_outputs(best_parameters, {}, output_directory)
Collaborator

I think we will need to change the behavior in _write_best_parameters_outputs.

Currently it will output in this form

# tuned hyperparameters
alpha: 0.1

# hardcoded parameters

but that doesn't make sense in the context of automl.

Contributor Author

Do we need to write an empty "# hardcoded parameters" bucket for tuning runs? I will check whether the respective dict has any values and write the header only in that case. Hope this is OK for you.

Contributor Author

Done!

Collaborator

Sorry, my comment was unclear. The issue is that automl parameters will be written under the heading "# tuned hyperparameters", but that is not accurate. Instead, we should extend _write_best_parameters_outputs to accept an automl parameters argument and then output those under the heading "# automl parameters".
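A hedged sketch of what that extension could look like; the function name comes from the diff, while the parameter split and the yaml layout here are illustrative:

import os
import yaml


def _write_best_parameters_outputs(
    output_directory, tuned_parameters=None, hardcoded_parameters=None,
    automl_parameters=None,
):
    # Group parameters by origin and skip empty buckets rather than writing
    # bare headings.
    sections = [
        ("# tuned hyperparameters", tuned_parameters),
        ("# hardcoded parameters", hardcoded_parameters),
        ("# automl parameters", automl_parameters),
    ]
    with open(os.path.join(output_directory, "best_parameters.yaml"), "w") as f:
        for heading, params in sections:
            if params:
                f.write(heading + "\n")
                yaml.safe_dump(params, f)
                f.write("\n")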

Collaborator

The empty dict for hardcoded params is not a problem, IMO.

Contributor Author

ok, pushed this one as well!

@@ -642,6 +681,25 @@ def render_schema(inputs, title):
f"<b>Best parameters:</b><br>" f"<pre>{best_parameters}</pre><br><br>",
)

# Tab 8.1: Best AutoML Parameters
if self.step_config["using"].startswith("automl"):
Collaborator

I think we can collapse this into the if os.path.exists(best_parameters_yaml): section above, now that we are also writing that file.

Contributor Author

ok, for AutoML, we also output the name and class of the estimator.

Contributor Author

Done!

…parameters yaml

Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
…parameters yaml

Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
@prithvikannan (Collaborator) left a comment

LGTM! Thanks for working through many iterations of this @mshtelma

…parameters yaml

Signed-off-by: Michael Shtelma <michael.shtelma@databricks.com>
@mshtelma merged commit f947f73 into mlflow:master on Oct 14, 2022
@prithvikannan mentioned this pull request on Nov 12, 2022
Labels
area/recipes MLflow Recipes, Recipes APIs, Recipes configs, Recipe Templates rn/feature Mention under Features in Changelogs.