BUG: identity checking NA in map incorrect #58392

Draft
wants to merge 60 commits into
base: main

Conversation

droussea2001
Contributor

@droussea2001 droussea2001 commented Apr 23, 2024

In issue #57390, identity checking for NA in map is incorrect because pd.NA values are coerced to np.nan when a UDF is evaluated.

This PR proposes to fix this problem by:
A) Avoiding the cast to object for the map input
B) Avoiding the cast to object for the map output

That's why test_map and test_map_na_action_ignore were modified: the modified tests now expect pd.NA to be preserved after a map.
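To make the problem concrete, here is a minimal sketch of the kind of identity check discussed above. The helper name `is_pd_na` is only illustrative and is not taken from the issue or from this PR. The value passed to the UDF at the missing position depends on the pandas version, so no expected output is shown for the mapped Series.

```python
import pandas as pd


def is_pd_na(value):
    """Return True only when the UDF receives the pd.NA singleton itself."""
    return value is pd.NA


# Called directly, the identity check distinguishes pd.NA from a float NaN.
direct_na = is_pd_na(pd.NA)          # True: pd.NA is a singleton
direct_nan = is_pd_na(float("nan"))  # False: NaN is a float, not pd.NA

# Through Series.map on a nullable dtype, the value handed to the UDF at the
# missing position depends on the pandas version (pd.NA or a coerced NaN),
# so no expected output is shown for `mapped`.
ser = pd.Series([1, pd.NA, 3], dtype="Int64")
mapped = ser.map(is_pd_na)
```

If the missing value reaches the UDF as NaN, the second element of `mapped` would be False, which is the behavior the issue reports as incorrect.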

@droussea2001 droussea2001 changed the title BUG: identity checking na in map incorrect BUG: identity checking NA in map incorrect Apr 24, 2024
@droussea2001 droussea2001 marked this pull request as ready for review April 26, 2024 18:01
@droussea2001
Contributor Author

@mroeschke: Hello, I hope you're doing well. Would it be possible to merge this PR, or should I propose another approach?
Thanks in advance.

@droussea2001
Contributor Author

@mroeschke: For information, I moved this PR back to Draft because I am trying to avoid the cast to object before calling lib.map_infer in map_array.

@droussea2001
Contributor Author

pre-commit.ci autofix

@droussea2001
Contributor Author

droussea2001 commented May 27, 2024

For information, the idea in this new version of the PR is to let lib.map_infer handle:

  • BooleanArray, FloatingArray and IntegerArray values without first casting them to object
  • pd.NA as a valid value for a UDF

That's why I added a mask parameter to lib.map_infer.
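The mask idea can be sketched in pure Python as follows. This is only an illustration under stated assumptions: `map_with_mask`, `_NAType` and `NA` are hypothetical names, and the sketch does not reproduce the actual signature or Cython implementation of lib.map_infer. It only shows how a separate boolean mask lets the mapping loop pass the NA sentinel to the UDF without first converting the values to an object array.

```python
from typing import Any, Callable, List, Sequence


class _NAType:
    """Hypothetical stand-in for a missing-value singleton such as pd.NA."""

    def __repr__(self) -> str:
        return "<NA>"


NA = _NAType()


def map_with_mask(
    values: Sequence[Any],
    mask: Sequence[bool],
    func: Callable[[Any], Any],
    na_value: Any = NA,
) -> List[Any]:
    """Apply func element-wise, substituting na_value where mask is True."""
    if len(values) != len(mask):
        raise ValueError("values and mask must have the same length")
    result = []
    for value, is_missing in zip(values, mask):
        # The raw value at a masked position is a placeholder and is ignored;
        # the UDF receives the NA sentinel itself, so identity checks work.
        result.append(func(na_value if is_missing else value))
    return result


values = [1, 0, 3]            # 0 is only a placeholder at the masked position
mask = [False, True, False]   # True marks the missing element
flags = map_with_mask(values, mask, lambda x: x is NA)
# flags == [False, True, False]
```

Keeping the data and the mask separate mirrors how masked arrays store their values, which is why no cast to object is needed in this sketch.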

@droussea2001 droussea2001 marked this pull request as ready for review May 29, 2024 08:52
@droussea2001
Contributor Author

@WillAyd: Hello Will, I have a question: the check "Docstring validation, typing, and other manual pre-commit hooks" shows as "cancelled" in the CI right after the following line:

[90/152] Compiling C object pandas/_libs/join.cpython-310-x86_64-linux-gnu.so.p/meson-generated_pandas__libs_join.pyx.c.o
Error: The operation was canceled.

I don't see any error (I tried rebuilding it locally from scratch), and the other checks seem to be OK. Do you have any idea what is causing this problem?
Thanks in advance.

@WillAyd
Member

WillAyd commented May 29, 2024

I agree that looks strange - just restarted the job. Let's see what happens

@droussea2001
Contributor Author

The error is clearer now; I will investigate. :-) Thanks a lot!

@droussea2001 droussea2001 marked this pull request as draft July 14, 2024 16:04
Development

Successfully merging this pull request may close these issues.

BUG: Identity checking NA in map is incorrect
3 participants