Fix type annotations for fill_nan()
#4445
py-polars/tests/test_lazy.py
@@ -589,6 +589,7 @@ def test_fill_nan() -> None:
         .collect()["a"]
         .series_equal(pl.Series("a", [1.0, 2.0, 3.0]))
     )
+    assert tuple(df.lazy().fill_nan(None).collect()["a"]) == (1.0, None, 3.0)
If you don't convert to tuple, what is the mismatch? Is it a datatype problem?
The printed output is identical, but comparison with frame_assert_equal() reports the two as not identical, so I don't really understand. You can also copy and paste the example I posted in my initial PR description to see the problem, and maybe you'll find a hint...
Ah, I see. This is a problem with Series.series_equal.
In [25]: s = pl.Series("a", [1.0, None, 3.0])
In [26]: s.series_equal(s)
Out[26]: False
I will open a separate issue.
@lorenzwalthert Series.series_equal has a null_equal parameter that defaults to False. Can you change the assert statement to:
assert (
    df.lazy()
    .fill_nan(None)
    .collect()["a"]
    .series_equal(pl.Series("a", [1.0, None, 3.0]), null_equal=True)
)
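The effect of null_equal can be sketched in plain Python. This is a minimal stand-in, not polars' implementation: values_equal is a hypothetical helper that compares two lists, treats None as a null, and takes a null_equal flag mirroring the parameter mentioned in this comment.

```python
from typing import Optional, Sequence


def values_equal(
    left: Sequence[Optional[float]],
    right: Sequence[Optional[float]],
    null_equal: bool = False,
) -> bool:
    """Hypothetical helper: element-wise equality with configurable null handling."""
    if len(left) != len(right):
        return False
    for a, b in zip(left, right):
        if a is None or b is None:
            # A null only matches another null, and only when null_equal is set.
            if not (null_equal and a is None and b is None):
                return False
        elif a != b:
            return False
    return True


values = [1.0, None, 3.0]
strict = values_equal(values, values)  # default: nulls never compare equal
lenient = values_equal(values, values, null_equal=True)
print(strict, lenient)
```

Under this sketch, a sequence containing a null is not equal to itself by default, which is consistent with the Out[26]: False result reported earlier in the thread. The tuple comparison in the test passes for a different reason: Python evaluates None == None as true.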
Oops, of course. I did not know that, thanks.
So if null_equal defaults to True for data frames in frame_equal(), shouldn't it also have the same default value for polars.series?
Yes, it should. That's more consistent.
Thanks @lorenzwalthert 👍
I added the types as suggested in #3066 (comment) for series, data frame and expression. I tested it, but for some reason, when comparing series and expressions with their reference, I could not get them to be equal. I had to convert to a tuple to pass the equality check. E.g. check this:
They have the same printed output but they are not identical.
For eager, it works:
I'm not sure whether the comparison method needs to be adapted too, but I considered that out of scope for this PR.
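For readers who want to see what the annotation change is meant to permit, here is a plain-Python sketch rather than the polars code. The function fill_nan and the FillValue alias below are hypothetical names, and the sketch assumes that None should be an accepted fill value so that NaN entries become nulls, as the new test in this PR exercises.

```python
import math
from typing import List, Optional, Union

# Assumed annotation for illustration only: None is an allowed fill value.
FillValue = Optional[Union[int, float]]


def fill_nan(
    values: List[Optional[float]], fill_value: FillValue
) -> List[Optional[float]]:
    """Hypothetical stand-in: replace NaN entries with fill_value, keep nulls as-is."""
    return [
        fill_value if (v is not None and math.isnan(v)) else v
        for v in values
    ]


filled = fill_nan([1.0, float("nan"), 3.0], None)
# Comparing as a tuple uses Python equality, where None == None is true.
print(tuple(filled) == (1.0, None, 3.0))
```

The tuple comparison mirrors the workaround described above: it sidesteps any null-aware comparison logic because it relies only on Python's built-in equality.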