New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for np array-based tensors for Keras, Pytorch and TF flavors #3808
Conversation
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
mlflow/keras.py
Outdated
if self._graph is not None: | ||
with self._graph.as_default(): | ||
with self._sess.as_default(): | ||
predicted = pd.DataFrame(self.keras_model.predict(dataframe.values)) | ||
predicted = pd.DataFrame(self.keras_model.predict(input_data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we only want to return DataFrame if the input was DataFrame. If the was one of the numpy array based data types we should return the result as is.
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, I left some comments.
if isinstance(data, dict): | ||
feed_dict = {k: tensorflow.constant(v) for k, v in data.items()} | ||
elif isinstance(data, pandas.DataFrame): | ||
for df_col_name in list(data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can probably simplify now that know data is DataFrame to:feed_dict = {k: tensorflow.constant(v.values) for k, v in list(data)}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I think the purpose of this implementation is explained in the comments below. We need to also check if data[df_col_name]
is also a dataframe (there are multiple columns with the same name) and convert that to a np array instead. If there's only one column, data[df_col_name]
will have a pandas.core.series.Series
type, which is fine to pass into tensorflow.constant
I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, interesting, I did not know! Thanks for clarifying this.
I guess this is another way of passing tensors in a Dataframe.
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
…type is not supported Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! I think we should add one minor test case, but otherwise lgtm.
if isinstance(data, dict): | ||
feed_dict = {k: tensorflow.constant(v) for k, v in data.items()} | ||
elif isinstance(data, pandas.DataFrame): | ||
for df_col_name in list(data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, interesting, I did not know! Thanks for clarifying this.
I guess this is another way of passing tensors in a Dataframe.
@@ -267,6 +267,49 @@ def predict(pdf): | |||
res = pyfunc_model.predict(pdf) | |||
assert res.dtypes.to_dict() == expected_types | |||
|
|||
# 8. np.ndarrays can be converted to dataframe but have no columns | |||
with pytest.raises(MlflowException) as ex: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also test that his works if the schema has no column names?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think this case is tested here under test_schema_enforcement_no_col_names
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
This reverts commit 2b825e9. Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
…ors (mlflow#3808) * Add support for np array-based tensors for Keras flavor Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for np array inputs for pytorch pyfunc model Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for dicts for tf pyfunc models Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
…ors (mlflow#3808) * Add support for np array-based tensors for Keras flavor Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for np array inputs for pytorch pyfunc model Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for dicts for tf pyfunc models Signed-off-by: Wendy Hu <wendy.hu@databricks.com>
…ors (mlflow#3808) * Add support for np array-based tensors for Keras flavor Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for np array inputs for pytorch pyfunc model Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for dicts for tf pyfunc models Signed-off-by: Wendy Hu <wendy.hu@databricks.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
…ors (mlflow#3808) * Add support for np array-based tensors for Keras flavor Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for np array inputs for pytorch pyfunc model Signed-off-by: Wendy Hu <wendy.hu@databricks.com> * Add support for dicts for tf pyfunc models Signed-off-by: Wendy Hu <wendy.hu@databricks.com> Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Wendy Hu wendy.hu@databricks.com
How is this patch tested?
Release Notes
Is this a user-facing change?
Added support for np array-based input tensors for Keras. PyTorch and TF flavors
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts
: Artifact stores and artifact loggingarea/build
: Build and test infrastructure for MLflowarea/docs
: MLflow documentation pagesarea/examples
: Example codearea/model-registry
: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models
: MLmodel format, model serialization/deserialization, flavorsarea/projects
: MLproject format, project running backendsarea/scoring
: Local serving, model deployment tools, spark UDFsarea/server-infra
: MLflow server, JavaScript dev serverarea/tracking
: Tracking Service, tracking client APIs, autologgingInterface
area/uiux
: Front-end, user experience, JavaScript, plottingarea/docker
: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy
: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows
: Windows supportLanguage
language/r
: R APIs and clientslanguage/java
: Java APIs and clientslanguage/new
: Proposals for new client languagesIntegrations
integrations/azure
: Azure and Azure ML integrationsintegrations/sagemaker
: SageMaker integrationsintegrations/databricks
: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change
- The PR will be mentioned in the "Breaking Changes" sectionrn/none
- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature
- A new user-facing feature worth mentioning in the release notesrn/bug-fix
- A user-facing bug fix worth mentioning in the release notesrn/documentation
- A user-facing documentation change worth mentioning in the release notes