[SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd#37752
zhengruifeng wants to merge 1 commit into apache:master from
Yikun left a comment
Generally LGTM, but we might need to update the migration doc for the API changes?
      """
-     assert fraction >= 0.0, "Negative fraction value: %s" % fraction
+     if not fraction >= 0:
+         raise ValueError("Fraction must be nonnegative.")
Changing assert to ValueError is definitely right. I believe all asserts in main code (especially those used to validate parameters) should be fixed; asserts in test code are okay.
[1] https://mail.python.org/pipermail/python-list/2013-November/810940.html
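The concern with assert-based validation can be illustrated with a minimal sketch. The two function names below are hypothetical and are not part of pyspark; only the check expressions and messages are taken from the diff above. Assert statements are skipped entirely when Python runs with the -O flag, so the first form silently stops validating, while the second form always raises.

```python
def check_fraction_assert(fraction: float) -> float:
    # Hypothetical helper mirroring the pre-change check.
    # Under `python -O` this assert is compiled away and never fires.
    assert fraction >= 0.0, "Negative fraction value: %s" % fraction
    return fraction


def check_fraction_raise(fraction: float) -> float:
    # Hypothetical helper mirroring the post-change check.
    # `not fraction >= 0` is also true for NaN, so NaN is rejected too.
    if not fraction >= 0:
        raise ValueError("Fraction must be nonnegative.")
    return fraction


if __name__ == "__main__":
    for bad in (-0.5, float("nan")):
        try:
            check_fraction_raise(bad)
        except ValueError as exc:
            print(f"rejected {bad!r}: {exc}")
```

Running the same script with and without -O shows the difference: only check_fraction_raise keeps rejecting invalid input in both modes.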
AssertionError is replaced with a ValueError, but I think it may be trivial to add it to the migration doc
Thanks! Shall we mention that ValueError will be raised for invalid inputs in the
Sure, thanks for pointing it out.
Merged into master, thank you @xinrong-meng @Yikun for the reviews!
What changes were proposed in this pull request?
1. Compared with the Scala side, some parameter validations were missing in pyspark.rdd.
2. When checking fraction, rdd.sample will raise ValueError instead of AssertionError.

Why are the changes needed?
To add the missing parameter validations in pyspark.rdd.

Does this PR introduce any user-facing change?
Yes. When fraction is invalid, ValueError is raised instead of AssertionError.

How was this patch tested?
Existing test suites.
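
For callers affected by the user-facing change, the following is a rough sketch of code that tolerates both the old and the new exception type. The helper name sample_or_none is hypothetical and not part of pyspark; it only assumes the object passed in exposes sample(withReplacement, fraction, seed), as RDD.sample does.

```python
from typing import Any, Optional


def sample_or_none(rdd: Any, with_replacement: bool, fraction: float,
                   seed: Optional[int] = None) -> Any:
    """Hypothetical wrapper: return the sampled RDD, or None on an invalid fraction."""
    try:
        return rdd.sample(with_replacement, fraction, seed)
    except (ValueError, AssertionError):
        # ValueError: behavior after this change.
        # AssertionError: behavior before this change.
        return None
```

Catching both exception types is only useful for code that must run against versions from before and after this change; code that targets only the newer behavior can catch ValueError alone.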