Change comparison operators to return null when one of the operands is null #33
…s null This is consistent with DataArrays, SQL and R, is more consistent with the null-propagating behavior used with other operations, and is safer in case null values were not expected (as an error will be thrown). isequal() and isless() still return true or false.
Codecov Report
@@           Coverage Diff            @@
##           master     #33    +/-    ##
==========================================
- Coverage     100%   98.36%   -1.64%
==========================================
  Files           1        1
  Lines          50       61     +11
==========================================
+ Hits           50       60     +10
- Misses          0        1      +1
The coverage drop seems spurious, since the offending line is clearly tested.
I imagine this will be fairly breaking for any packages currently relying on Nulls. We may want to get a patch release out with any non-breaking changes (if any exist) before merging this.
Yes, in particular we need to fix DataFrames and add upper bounds. On the positive side, it will be less breaking for people migrating from the current DataFrames release.
I only saw this now. I'm still of the same opinion that I was a year ago: I think this is not a good choice. In fact, one of the main reasons I created […]
I kind of agree. Still, as much as I dislike 3VL (three-valued logic), it's what's familiar to users of R and SQL, so having our own conventions could be more confusing to newcomers. Personally I prefer […]
Yes, I know you still don't like this. But we've discussed it again, based on how R, SQL and DataArrays work, and given the feedback we've had from Eric Lippert at JuliaLang/julia#19034 (comment). We also considered the fact that we generally do not drop/skip nulls silently (e.g. with […]
Actually, I take that back. One of the main reasons I didn't like this for […] I still don't like the SQL behavior of dropping rows for which a filter predicate returns […]
Great! Indeed that's exactly the advantage of […]
Yeah, I agree that silently dropping nulls is inconsistent with what we do elsewhere, and apparently it also confuses SQL users. I have a few ideas about how to handle this with arguments to […]
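The concern about silently dropped rows can be illustrated with a small sketch. It is written in Python rather than Julia so that it is self-contained, and it is not the API of any package mentioned in this thread: `greater`, `filter_sql_style` and `filter_strict` are hypothetical names, and `None` stands in for null.

```python
def greater(a, b):
    """Three-valued '>': returns None (standing in for null) if either operand is None."""
    if a is None or b is None:
        return None
    return a > b


def filter_sql_style(rows, pred):
    # Keep a row only when the predicate is exactly True. Rows for which
    # the predicate is null are dropped without any notice to the caller.
    return [r for r in rows if pred(r) is True]


def filter_strict(rows, pred):
    # Hypothetical alternative: refuse to guess when the predicate is null.
    kept = []
    for r in rows:
        result = pred(r)
        if result is None:
            raise ValueError(f"predicate returned null for row {r!r}")
        if result:
            kept.append(r)
    return kept


ages = [31, None, 17, 45]
adults = filter_sql_style(ages, lambda a: greater(a, 18))
```

With the SQL-style filter, the row holding the null age disappears from `adults` without warning; `filter_strict` raises on the same input, so the caller has to decide explicitly how nulls should be handled.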
This is consistent with DataArrays, SQL and R, is more consistent with the null-propagating behavior used with other operations, and is safer in case null values were not expected (as an error will be thrown). isequal() and isless() still return true or false.

After discussing this, it appears returning null is the safest and most standard behavior for data analysis. For reference, previous discussions happened at JuliaLang/julia#19034 (comment) (and following comments), JuliaStats/NullableArrays.jl#85 and JuliaData/DataFramesMeta.jl#58.
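The semantics described above can be sketched in a language-neutral way. The following Python model is only an illustration under stated assumptions, not the package's Julia implementation: the `Null` class and the `is_equal` and `is_less` helpers are hypothetical names, and the choice to order null after every other value in `is_less` is an assumption made for the sketch.

```python
class Null:
    """Hypothetical stand-in for a null value (not the package's actual type)."""

    def __repr__(self):
        return "null"

    def __bool__(self):
        # Using null where a boolean is required is an error, so an
        # unexpected null cannot be silently treated as false.
        raise TypeError("null used in a boolean context")

    def _propagate(self, other):
        # Every comparison operator involving null yields null.
        return self

    __eq__ = __ne__ = __lt__ = __le__ = __gt__ = __ge__ = _propagate


null = Null()


def is_equal(a, b):
    """Always returns a plain bool: null is equal only to null."""
    if isinstance(a, Null) or isinstance(b, Null):
        return isinstance(a, Null) and isinstance(b, Null)
    return a == b


def is_less(a, b):
    """Always returns a plain bool; assumes null sorts after all other values."""
    if isinstance(a, Null):
        return False
    if isinstance(b, Null):
        return True
    return a < b


propagated = (1 == null)            # null, not False
same = is_equal(null, null)         # True
ordered = is_less(1, null)          # True under the ordering assumption
```

Writing `if null == 1: ...` in this model raises `TypeError`, which corresponds to the safety argument in the description: a null that was not expected surfaces as an error instead of quietly selecting the false branch.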