Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix type annotations for fill_nan() #4445

Merged
merged 1 commit into from
Aug 18, 2022

Conversation

lorenzwalthert
Copy link
Contributor

I added the types as suggested in #3066 (comment) for series, data frame and expression. tested it but for some reason, when comparing series and expressions with their reference, I could not get them to be equal. I had to convert to a tuple to pass the equality check. E.g. check this:

import polars as pl
lazy = pl.DataFrame({"a": [1.0, np.nan, 3.0]}).lazy().fill_nan(None).collect()["a"]
assert lazy.series_equal(pl.Series("a", [1.0, None, 3.0]))

They have the same printed output but they are not identical.
For eager, it works:

lazy = pl.DataFrame({"a": [1.0, np.nan, 3.0]}).fill_nan(None)
assert lazy.frame_equal(pl.DataFrame({"a": [1.0, None, 3.0]}))

Not sure the comparison method needs to be adapted too, but I considered that out of scope for this PR.

@github-actions github-actions bot added the python Related to Python Polars label Aug 16, 2022
@lorenzwalthert lorenzwalthert changed the title Fix type annotation for fill_nan() Fix type annotations for fill_nan() Aug 16, 2022
@@ -589,6 +589,7 @@ def test_fill_nan() -> None:
.collect()["a"]
.series_equal(pl.Series("a", [1.0, 2.0, 3.0]))
)
assert tuple(df.lazy().fill_nan(None).collect()["a"]) == (1.0, None, 3.0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't convert to tuple what is the mismatch? Is it a datatype problem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The print methods are identical, but comparision with frame_assert_equal() yields not identical. So I don't really understand. You can also c/p the example I posted in my initial PR description to see the problem and maybe you'll find a hint...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. This is a problem with Series.series_equal.

In [25]: s = pl.Series("a", [1.0, None, 3.0])

In [26]: s.series_equal(s)
Out[26]: False

I will open a separate issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lorenzwalthert Series.series_equal has a null_equal parameter that defaults to False. Can you change the assert statement to

assert (
        df.lazy()
        .fill_nan(None)
        .collect()["a"]
        .series_equal(pl.Series("a", [1.0, None, 3.0]), null_equal=True)
    )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ups of course. Did not know that, thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if null_equal defaults to True for data frames in frame_equal(), shouldn't it also have the same default value for polars.series?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should. That's more consistent.

@ritchie46
Copy link
Member

Thanks @lorenzwalthert 👍

@ritchie46 ritchie46 merged commit fca8433 into pola-rs:master Aug 18, 2022
@lorenzwalthert lorenzwalthert deleted the type-annotation-fill-nan branch August 18, 2022 07:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants