Stabilizing methods in Python Models modules #1226

Merged May 10, 2019 (5 commits)

sueann (Contributor) commented May 8, 2019

What changes are proposed in this pull request?

To make the methods in Models modules stable in MLflow 1.0, we want to future-proof them against potential changes in the underlying frameworks (e.g. sklearn, spark, tensorflow). To do so, we propose making some methods accept keyword arguments only, by introducing a keyword_only decorator that enforces this.
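A minimal sketch of what such a decorator could look like, built around the four-line wrapper quoted later in this review. The `return func(**kwargs)` line and the `save_model` stand-in (with its two parameters) are assumptions for illustration, not the merged MLflow code:

```python
from functools import wraps


def keyword_only(func):
    """Reject any positional arguments passed to ``func``."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 0:
            raise TypeError("Method %s only accepts keyword arguments." % func.__name__)
        # Assumption: the wrapper simply forwards the keyword arguments.
        return func(**kwargs)

    return wrapper


# Hypothetical stand-in for a Models API; not the real mlflow signature.
@keyword_only
def save_model(path, conda_env=None):
    return (path, conda_env)


# A keyword call passes through to the wrapped function.
result = save_model(path="model_dir")
print(result)

# A positional call is rejected before the wrapped function runs.
try:
    save_model("model_dir")
except TypeError as exc:
    rejection = str(exc)
    print(rejection)
```

Because parameter order no longer matters to callers, parameters could later be added, reordered, or made optional without silently changing the meaning of existing calls.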

The change breaks any current usages of the APIs, so we recommend changing only the TensorFlow APIs, which may need to change in the near future:

def log_model(tf_saved_model_dir, tf_meta_graph_tags, tf_signature_def_key, artifact_path, conda_env=None)
def save_model(tf_saved_model_dir, tf_meta_graph_tags, tf_signature_def_key, path, mlflow_model=Model(), conda_env=None)

Alternatively, we could impose no restrictions and instead create new modules (e.g. tensorflow2) in the future as needed.

If we go with the keyword_only decorator route, the following also may merit the decorator if we think the requirement for a sample_df in mleap will change in the future:

mlflow.mleap.log_model(spark_model, sample_input, artifact_path)
mlflow.mleap.save_model(spark_model, sample_input, path, mlflow_model=Model())

but the current PR does not mark them as such.

How is this patch tested?

Existing & new unit tests

pytest --large tests/tensorflow/test_tensorflow_model_export.py

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

We now require the arguments of save_model and log_model methods in the tensorflow and mleap modules to be keyword arguments. This is to future-proof their usages against API changes involving the TensorFlow model specification.

What component(s) does this PR affect?

  • API
  • Models
  • Python

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section

@sueann added the rn/breaking-change label May 8, 2019

sueann (Contributor Author) commented May 8, 2019

[Screenshot: Screen Shot 2019-05-08 at 3 21 27 PM]

aarondav (Contributor) commented May 9, 2019

Cool!

aarondav (Contributor) commented May 9, 2019

Error from providing only a subset:

log_model() missing 3 required positional arguments: 'tf_saved_model_dir', 'tf_signature_def_key', and 'artifact_path'

seems fine!
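The message quoted above is Python's own missing-argument error: it comes from the wrapped function when the wrapper forwards an incomplete set of keyword arguments, not from the decorator itself. A sketch that reproduces this, assuming a decorator that forwards with `func(**kwargs)` and a stand-in `log_model` whose parameter names are copied from the PR description (the body is hypothetical):

```python
from functools import wraps


def keyword_only(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 0:
            raise TypeError("Method %s only accepts keyword arguments." % func.__name__)
        return func(**kwargs)  # assumption: plain forwarding

    return wrapper


# Stand-in with the parameter list from the PR description; body is hypothetical.
@keyword_only
def log_model(tf_saved_model_dir, tf_meta_graph_tags, tf_signature_def_key,
              artifact_path, conda_env=None):
    return artifact_path


try:
    # Only one of the four required arguments is supplied.
    log_model(tf_meta_graph_tags=["serve"])
except TypeError as exc:
    message = str(exc)
    print(message)
```

The wording "required positional arguments" can look odd for a keyword-only API, since the parameters are still ordinary positional-or-keyword parameters as far as the interpreter is concerned.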

sueann (Contributor Author) commented May 9, 2019

Yeah we could make them optional and put validation inside the methods for them if this turns out to be confusing, but I think it is okay for now.
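One way the alternative mentioned here might look, as a sketch only: every parameter defaults to `None` and the function checks the required ones itself, so the error can name them as keyword arguments. The error wording, the list of required names, and the function body are all assumptions, not what was merged:

```python
from functools import wraps


def keyword_only(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 0:
            raise TypeError("Method %s only accepts keyword arguments." % func.__name__)
        return func(**kwargs)  # assumption: plain forwarding

    return wrapper


@keyword_only
def log_model(tf_saved_model_dir=None, tf_meta_graph_tags=None,
              tf_signature_def_key=None, artifact_path=None, conda_env=None):
    # Hypothetical validation: everything except conda_env must be supplied.
    required = [
        ("tf_saved_model_dir", tf_saved_model_dir),
        ("tf_meta_graph_tags", tf_meta_graph_tags),
        ("tf_signature_def_key", tf_signature_def_key),
        ("artifact_path", artifact_path),
    ]
    missing = [name for name, value in required if value is None]
    if missing:
        raise TypeError(
            "log_model() requires keyword arguments: %s" % ", ".join(missing)
        )
    return artifact_path


try:
    log_model(tf_meta_graph_tags=["serve"])
except TypeError as exc:
    message = str(exc)
    print(message)
```

The trade-off is that the signature no longer shows which parameters are mandatory, which may be why the simpler behavior was judged acceptable for now.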

@sueann changed the title from "[RFC] Stabilizing methods in Python Models modules" to "Stabilizing methods in Python Models modules" May 9, 2019
@wraps(func)
def wrapper(*args, **kwargs):
    if len(args) > 0:
        raise TypeError("Method %s only accepts keyword arguments." % func.__name__)
Contributor Author

-> requires

sueann (Contributor Author) commented May 9, 2019

let's add keyword_only to mleap as well


FLAVOR_NAME = "mleap"


@keyword_only
Contributor Author

@dbczumar I think you mentioned these might be used in the sagemaker container? I didn't see them used anywhere, actually. Let me know if I'm missing something (hopefully the tests will tell me if I am, but just in case).

Collaborator

add_to_model is used in mlflow.spark.save_model

Contributor Author

right, I wasn't going to keyword_only add_to_model since the sample_df param is the last one. i guess we might as well... :-/

@aarondav added the LGTM label May 10, 2019
@sueann merged commit 3125972 into mlflow:master May 10, 2019
avflor pushed a commit to avflor/mlflow that referenced this pull request Aug 22, 2020
We apply the keyword_only decorator to `log_model`, `save_model` and `add_to_model` methods in the `tensorflow` and `mleap` modules.