Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: eval and query not working with ea dtypes #50764

Merged
merged 27 commits into from
Feb 9, 2023
Merged

Conversation

phofl
Copy link
Member

@phofl phofl commented Jan 15, 2023

This is really ugly, not sure if we want to do this. But could not find another way to make this work

@mroeschke mroeschke added expressions pd.eval, query NA - MaskedArrays Related to pd.NA and nullable extension arrays labels Jan 16, 2023
@phofl
Copy link
Member Author

phofl commented Feb 1, 2023

cc @mroeschke could you have a look? Remaining failures were only because pyarrow was not installed in the pipelines in question

non_ea_dtypes.append(arr_or_dtype)

if non_ea_dtypes:
np_dtype = np.result_type(*non_ea_dtypes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this hit the except ValueError: as above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not totally sure, we don't have a test hitting the ValueError above, but added it down here as well

for elem in parsed_expr.terms.operand_types
)
):
engine = "python"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean if a user is specifying engine="numexpr", this switches the engine to python?

If so, I think it would be better to explicitly raise a NotImplementedError and tell the user to switch the engine manually.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, would you be ok with a RuntimeWarning? Since numexpr is the default, raising seems a bit noisy maybe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the issues linked already raise an exception currently, so a NotImplementedError would be similar but more explicit. Unless there's a case I'm not thinking of.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am more concerned with people using our nullable option who are switching over from NumPy. Would be annoying if query/eval stop working then.

But you are correct, an appropriate NotImplementedError would be an improvement

Copy link
Member

@mroeschke mroeschke Feb 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair. Yeah a RuntimeWarning warning & encouraging users to switch to the python engine would be okay

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the warning

@phofl
Copy link
Member Author

phofl commented Feb 9, 2023

ci is green now

@mroeschke mroeschke added this to the 2.0 milestone Feb 9, 2023
@mroeschke mroeschke merged commit b9a4335 into pandas-dev:main Feb 9, 2023
@mroeschke
Copy link
Member

Thanks @phofl

@phofl phofl deleted the eval branch February 9, 2023 16:46
@lithomas1 lithomas1 mentioned this pull request Feb 15, 2023
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expressions pd.eval, query NA - MaskedArrays Related to pd.NA and nullable extension arrays
Projects
None yet
2 participants