Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-35176][PYTHON] Standardize input validation error type #32368

Closed
wants to merge 5 commits into from

Conversation

Yikun
Copy link
Member

@Yikun Yikun commented Apr 27, 2021

What changes were proposed in this pull request?

This PR corrects some exception type when the function input params are failed to validate due to TypeError.
In order to convenient to review, there are 3 commits in this PR:

  • Standardize input validation error type on sql
  • Standardize input validation error type on ml
  • Standardize input validation error type on pandas

Why are the changes needed?

As suggestion from Python exception doc [1]: "Raised when an operation or function is applied to an object of inappropriate type.", but there are many Value error are raised in some pyspark code, this patch fix them.

[1] https://docs.python.org/3/library/exceptions.html#TypeError

Note that: this patch only addresses the exsiting some wrong raise type for input validation, the input validation decorator/framework which mentioned in SPARK-35176, would be submited in a speparated patch.

Does this PR introduce any user-facing change?

Yes, code can raise the right TypeError instead of ValueError.

How was this patch tested?

Existing test case and UT

@SparkQA
Copy link

SparkQA commented Apr 27, 2021

Test build #138000 has finished for PR 32368 at commit 5455d2b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 27, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42520/

@SparkQA
Copy link

SparkQA commented Apr 27, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42520/

@Yikun
Copy link
Member Author

Yikun commented Apr 27, 2021

cc @HyukjinKwon

@HyukjinKwon
Copy link
Member

Nice, thanks for working on this @Yikun. I am off this week so might be less active ..
cc @zero323 and @viirya FYI

@zero323
Copy link
Member

zero323 commented Apr 29, 2021

At first glance it looks good (I'll try to do more thorough scan if I have access to larger screen). We might have to document this as a change of behaviour though, as it might break some 3rd party code.

@HyukjinKwon
Copy link
Member

Oh yeah, that's a very good point.

@Yikun feel free to create a new page for Spark 3.1 -> 3.2 at https://github.com/apache/spark/tree/master/python/docs/source/migration_guide and document this one

@Yikun
Copy link
Member Author

Yikun commented Apr 29, 2021

@HyukjinKwon @zero323 Thanks for reminder, the migration doc has been added in 02080b0

@SparkQA
Copy link

SparkQA commented Apr 29, 2021

Test build #138060 has finished for PR 32368 at commit 02080b0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 29, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42579/

Copy link
Member

@viirya viirya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks okay as TypeError should be more accurate for the kind of exceptions.

@SparkQA
Copy link

SparkQA commented Apr 29, 2021

Test build #138069 has finished for PR 32368 at commit 63a31fe.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 29, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42588/

@SparkQA
Copy link

SparkQA commented Apr 29, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42588/

@HyukjinKwon
Copy link
Member

Merged to master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants