Skip to content

Comments

[SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd#37752

Closed
zhengruifeng wants to merge 1 commit intoapache:masterfrom
zhengruifeng:py_rdd_check
Closed

[SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd#37752
zhengruifeng wants to merge 1 commit intoapache:masterfrom
zhengruifeng:py_rdd_check

Conversation

@zhengruifeng
Copy link
Contributor

@zhengruifeng zhengruifeng commented Sep 1, 2022

What changes were proposed in this pull request?

1,compared with the scala side, some parameter validations were missing in pyspark.rdd
2, rdd.sample checking fraction will raise ValueError instead of AssertionError

Why are the changes needed?

add missing parameter validations in pyspark.rdd

Does this PR introduce any user-facing change?

yes, when fraction is invalide, ValueError is raised instead of AssertionError

How was this patch tested?

existing testsutes

Copy link
Member

@Yikun Yikun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, LGTM but might need to doc migration-doc for API changes?

"""
assert fraction >= 0.0, "Negative fraction value: %s" % fraction
if not fraction >= 0:
raise ValueError("Fraction must be nonnegative.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change assert to ValueError is right definately, I believe all assert in main code (espacailly, use when validation params) should be fix (test mode is okay)

[1] https://mail.python.org/pipermail/python-list/2013-November/810940.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AssertionError is replaced with a ValueError, but I think it maybe trivial to add it to migration-doc

@xinrong-meng
Copy link
Member

Thanks!

Shall we mention that ValueError will be raised for invalid inputs in theDoes this PR introduce any user-facing change of PR description?

@zhengruifeng
Copy link
Contributor Author

Thanks!

Shall we mention that ValueError will be raised for invalid inputs in theDoes this PR introduce any user-facing change of PR description?

sure, thanks for pointing out it

@zhengruifeng
Copy link
Contributor Author

Merged into master, thank you @xinrong-meng @Yikun for reivews!

@zhengruifeng zhengruifeng deleted the py_rdd_check branch September 4, 2022 06:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants