[SPARK-20552][SQL][PYTHON] Add isNotDistinctFrom/isDistinctFrom for column APIs in Scala and Python #17827
https://spark.apache.org/docs/latest/sql-programming-guide.html#nan-semantics
.. versionadded:: 2.3.0
"""
_isNotDistinctFrom_doc = _eqNullSafe_doc.replace("eqNullSafe", "isNotDistinctFrom")
.. _NaN Semantics:
https://spark.apache.org/docs/latest/sql-programming-guide.html#nan-semantics
.. versionadded:: 2.3.0
"""
testData2.collect().toSeq.filter(r => r.getInt(0) == 1))
checkAnswer(
testData2.filter($"a" === $"b"),
The test below:
checkAnswer(
  testData2.filter($"a" === 1),
  testData2.collect().toSeq.filter(r => r.getInt(0) == 1))
checkAnswer(
  testData2.filter($"a" === $"b"),
  testData2.collect().toSeq.filter(r => r.getInt(0) == r.getInt(1)))
looked identical to the test for === above and did not test <=>, so I removed it as a duplicate.
testData2.filter($"a" === $"b"),
testData2.collect().toSeq.filter(r => r.getInt(0) == r.getInt(1)))
}
The =!= test appeared to actually be testing <=>. I switched it to <=> and added a separate test for =!= below.
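To illustrate why a test written for one operator does not cover the other, here is a minimal plain-Python sketch of how a SQL-style filter treats not-equal versus null-safe equality. It uses `None` to stand in for NULL; the helper names (`sql_ne`, `null_safe_eq`, `sql_filter`) and the sample rows are hypothetical and are not the `testData2` fixture or Spark's implementation.

```python
from typing import Callable, List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int]]


def sql_ne(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Model of a SQL-style not-equal: the result is NULL if either side is NULL."""
    if a is None or b is None:
        return None
    return a != b


def null_safe_eq(a: Optional[int], b: Optional[int]) -> bool:
    """Model of null-safe equality: never returns NULL."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def sql_filter(rows: List[Row], predicate: Callable) -> List[Row]:
    """Keep only rows whose predicate is exactly True; NULL results are dropped."""
    return [row for row in rows if predicate(*row) is True]


# Hypothetical sample rows, including NULLs (None).
rows: List[Row] = [(1, 1), (1, 2), (None, 1), (None, None)]

ne_rows = sql_filter(rows, sql_ne)        # rows with NULLs are filtered out
eq_rows = sql_filter(rows, null_safe_eq)  # the (None, None) row is kept
```

Under this model the two predicates differ exactly on rows containing NULL, which is where separate tests are most informative.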
cc @gatorsmile and @ptkool, could you take a look and see if it makes sense please?
Test build #76370 has finished for PR 17827 at commit
Test build #76374 has started for PR 17827 at commit
retest this please
/**
 * Equality test that is safe for null values.
 *
Like eqNullSafe, they are normally used for Java APIs.
Yea, that is what I initially thought. I am closing this.
Test build #76376 has finished for PR 17827 at commit


What changes were proposed in this pull request?
This PR proposes to add both isNotDistinctFrom and isDistinctFrom to both the Scala and Python column APIs.
The IS [NOT] DISTINCT FROM syntax is now supported following #17764. Adding a Python API was initially suggested in that PR, but that PR turned into a SQL syntax change only. Per #17764 (comment), I assume we want this.
How was this patch tested?
Doctests for Python and unit tests in ColumnExpressionSuite.
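As a rough illustration of the semantics behind IS [NOT] DISTINCT FROM, the following plain-Python sketch models NULL as `None`. The function names are hypothetical stand-ins, not the proposed column methods themselves, and the sketch does not model Spark's NaN handling.

```python
from typing import Optional


def sql_eq(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Ordinary SQL equality: the result is NULL if either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def is_not_distinct_from(a: Optional[int], b: Optional[int]) -> bool:
    """Null-safe equality: two NULLs compare equal, NULL vs. a value is False."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def is_distinct_from(a: Optional[int], b: Optional[int]) -> bool:
    """Negation of null-safe equality; like it, never returns NULL."""
    return not is_not_distinct_from(a, b)
```

For example, `sql_eq(None, None)` yields `None` in this model, while `is_not_distinct_from(None, None)` yields `True`.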