Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iris Example with Model Signature #46

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

shrinath-suresh
Copy link
Collaborator

Signed-off-by: Shrinath Suresh shrinath@ideas2it.com

What changes are proposed in this pull request?

Logging model signature and validating model signature for iris classification example.

Reused the model signature enforcements from pyfunc.

How is this patch tested?

Existing Unit Tests

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: Local serving, model deployment tools, spark UDFs
  • area/server-infra: MLflow server, JavaScript dev server
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, JavaScript, plotting
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
Signed-off-by: Shrinath Suresh <shrinath@ideas2it.com>
@shrinath-suresh shrinath-suresh changed the title [WIP] Iris Example with Model Signature Iris Example with Model Signature Mar 25, 2021
Comment on lines +720 to +729
if self.mlmodel_file_path:
mlmodel = Model.load(self.mlmodel_file_path)
if not hasattr(mlmodel, "signature"):
raise Exception("Model Signature not found")

input_schema = mlmodel.get_input_schema()

from mlflow.pyfunc import _enforce_schema
_enforce_schema(data, input_schema)

Copy link

@harupy harupy Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shrinath-suresh
Do we really need this change? Have you tried using mlflow.pyfunc.load_model("<model_uri>")?

Copy link
Collaborator Author

@shrinath-suresh shrinath-suresh Apr 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shrinath-suresh
Do we really need this change? Have you tried using mlflow.pyfunc.load_model("<model_uri>")?

As of now, mlflow.pytorch library only allows to log the signature. But, it doesnt have the mechanism to validate the signature.

Is it the case, that pytorch users can log the model signature using mlflow.pytorch and validate it using mlflow.pyfunc ? If that is the case, we don't need this change.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that pytorch users can log the model signature using mlflow.pytorch and validate it using mlflow.pyfunc

I think yes because that's the only way to enforce the schema now.

import pandas as pd


class IrisClassification(pl.LightningModule):
Copy link

@harupy harupy Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a similar class in other pytorch examples. Can we reuse it?

Comment on lines +242 to +246
# Uncomment this block to check invalid data type enforcement
# for column in df.columns:
# df[column] = df[column].astype("str")
#
# print("Result with invalid datatype: ", model.predict(df))
Copy link

@harupy harupy Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we uncomment this block and wrap it with try-catch and print out the error message in the except clause?

try:
    model.predict(df)
except Exception as e:
    print(e)

mlflow.pytorch.save_model(trainer.get_model(), "model", signature=signature)

model = _load_pyfunc(path="model/data", validate_signature=True)
df = pd.read_json("sample.json")
Copy link

@harupy harupy Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need really sample.json? Can we just hard-code its content in pd.DataFrame?

df = pd.DataFrame({"sepal length (cm)": ...})

Comment on lines +223 to +231
input_schema = Schema(
[
ColSpec("double", "sepal length (cm)"),
ColSpec("double", "sepal width (cm)"),
ColSpec("double", "petal length (cm)"),
ColSpec("double", "petal width (cm)"),
]
)
output_schema = Schema([ColSpec("long")])
Copy link

@harupy harupy Apr 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we infer the schema from the iris dataframe?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants