Fix test_spark_udf_autofills_no_arguments #4914

Merged
merged 1 commit into from
Oct 21, 2021

Conversation

harupy
Member

@harupy harupy commented Oct 21, 2021

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

What changes are proposed in this pull request?

In PySpark 3.2, the error message for AnalysisException changed: the unresolved column name is no longer wrapped in backticks ('`a`' is now 'a'), which broke test_spark_udf_autofills_no_arguments.

https://github.com/mlflow/mlflow/pull/4912/checks?check_run_id=3958964853#step:6:356

            with pytest.raises(AnalysisException, match=r"cannot resolve '`a`' given input columns"):
>               bad_data.withColumn("res", udf())
E               AssertionError: Pattern 'cannot resolve '`a`' given input columns' not found in 'cannot resolve 'a' given input columns: [b, c, d, x];
E               'Project [x#271L, b#272L, c#273L, d#274L, predict('a, b#272L, c#273L) AS res#280]
E               +- LogicalRDD [x#271L, b#272L, c#273L, d#274L], false
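
One way to make such an assertion tolerant of both message formats is to treat the backticks as optional in the regular expression. The snippet below is only a sketch of that idea, not the change made in this PR: `PATTERN`, `matches`, and the two sample messages are hypothetical, and the pre-3.2 message text is assumed from the pattern the original test expected. It uses `re.search`, which is how `pytest.raises(match=...)` checks the pattern.

```python
import re

# Hypothetical pattern (not taken from the PR): the backticks around the
# column name are optional, so both message formats match.
PATTERN = r"cannot resolve '`?a`?' given input columns"

# Assumed pre-3.2 format, inferred from the pattern the original test used.
old_message = "cannot resolve '`a`' given input columns: [b, c, d, x];"
# Format reported in the failing CI run on PySpark 3.2.
new_message = "cannot resolve 'a' given input columns: [b, c, d, x];"


def matches(message: str) -> bool:
    """Return True if PATTERN is found anywhere in the message."""
    return re.search(PATTERN, message) is not None


print(matches(old_message), matches(new_message))
```

In a test, the same pattern could be passed as `match=PATTERN` to `pytest.raises(AnalysisException, ...)`, so the assertion would not depend on which PySpark version is installed.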

How is this patch tested?

(Details)

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
@github-actions github-actions bot added area/build Build and test infrastructure for MLflow rn/none List under Small Changes in Changelogs. labels Oct 21, 2021
Collaborator

@dbczumar dbczumar left a comment

LGTM!

@harupy harupy merged commit 276617e into mlflow:master Oct 21, 2021
adriangonz pushed a commit to adriangonz/mlflow that referenced this pull request Oct 21, 2021
Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>
tomasatdatabricks pushed a commit that referenced this pull request Oct 22, 2021
* Fix error message match (#4914)

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add initial functionality to run MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Run black on previous changes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Make _get_requires_recursive more robust towards recursive dep cycles

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Remove mlserver, since it's also contained within mlserver-mlflow

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Parametrize serve test to run on both scoring_server and mlserver

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Make args to get_cmd optional

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Refactor _serve_pyfunc to leverage get_cmd method

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Make args to mlserver.get_cmd optional

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Use MLServer or current scoring server depending on env var

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add mlserver to list of deps

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add port explicitly

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update nginx conf to expose v2 protocol

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add default model name

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add section on MLServer to models documentation

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix links

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add note on MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Re-structure bullet points

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Disable NGINX if MLServer is enabled

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update flag name in tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Clarify _prune_packages changes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Simplify docs note on MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix parametrize name

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Ensure MLServer test runs in Python 3.7

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Group env vars together

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Use Python 3.7 on MLServer image tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix linter

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix missing extra_args list

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update docs text

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Rename variable

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Undo changes on nginx.conf

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add example to deploy to Seldon Core and KServe

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Set Python version in example to 3.7

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Remove extra flag

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix kserve and sc manifests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Revert back logic to build recursive list of packages, keeping the set of visited nodes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Patch PYTHON_VERSION to 3.7 if MLServer is enabled

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Expose MLServer in port 8080 by default

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update port used in tutorial

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Skip assertions on error structure if MLServer is enabled

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Pass set of visited packages

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* If Python 3.7 is already in use, disable Conda to address issues with Windows tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add MLServer parameter to FlavorBackend

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Replace set of visited packages for top-level package name

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Skip MLServer test if running on Windows

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add constrain to MLServer version

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Bump MLServer version to 0.5.2

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Raise exception if MLServer is enabled in R backend

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
tomasatdatabricks pushed a commit that referenced this pull request Nov 1, 2021
* Fix error message match (#4914)

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add initial functionality to run MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Run black on previous changes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Make _get_requires_recursive more robust towards recursive dep cycles

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Remove mlserver, since it's also contained within mlserver-mlflow

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Parametrize serve test to run on both scoring_server and mlserver

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Make args to get_cmd optional

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Refactor _serve_pyfunc to leverage get_cmd method

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Make args to mlserver.get_cmd optional

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Use MLServer or current scoring server depending on env var

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add mlserver to list of deps

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add port explicitly

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update nginx conf to expose v2 protocol

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add default model name

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add section on MLServer to models documentation

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix links

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add note on MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Re-structure bullet points

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Disable NGINX if MLServer is enabled

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update flag name in tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Clarify _prune_packages changes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Simplify docs note on MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix parametrize name

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Ensure MLServer test runs in Python 3.7

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Group env vars together

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Use Python 3.7 on MLServer image tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix linter

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix missing extra_args list

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update docs text

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Rename variable

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Undo changes on nginx.conf

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add example to deploy to Seldon Core and KServe

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Set Python version in example to 3.7

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Remove extra flag

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Fix kserve and sc manifests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Revert back logic to build recursive list of packages, keeping the set of visited nodes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Patch PYTHON_VERSION to 3.7 if MLServer is enabled

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Expose MLServer in port 8080 by default

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Update port used in tutorial

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Skip assertions on error structure if MLServer is enabled

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Pass set of visited packages

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* If Python 3.7 is already in use, disable Conda to address issues with Windows tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add MLServer parameter to FlavorBackend

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Replace set of visited packages for top-level package name

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Skip MLServer test if running on Windows

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add constrain to MLServer version

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Bump MLServer version to 0.5.2

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Raise exception if MLServer is enabled in R backend

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Require minimum 0.5.3 for MLServer

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Shuffle imports to trigger pytorch tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Move MLServer dependency to extras

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Install mlserver in container

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Pass kwarg to _install_pyfunc_deps

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Add note to install mlserver[extras]

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Format line to ensure linter passes

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Shuffle imports to trigger sklearn tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Ensure MLServer is present in Conda environment used for tests

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

* Revert import shuffling

Signed-off-by: Adrian Gonzalez-Martin <agm@seldon.io>

Co-authored-by: Harutaka Kawamura <hkawamura0130@gmail.com>
Labels
area/build Build and test infrastructure for MLflow rn/none List under Small Changes in Changelogs.

2 participants