Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DEPR: NDFrame.align broadcast_axis, fill_axis keywords #51856

Closed
jbrockmendel opened this issue Mar 9, 2023 · 4 comments · Fixed by #52130
Closed

DEPR: NDFrame.align broadcast_axis, fill_axis keywords #51856

jbrockmendel opened this issue Mar 9, 2023 · 4 comments · Fixed by #52130
Labels
Bug Deprecate Functionality to remove in pandas Needs Triage Issue that has not been reviewed by a pandas team member

Comments

@jbrockmendel
Copy link
Member

jbrockmendel commented Mar 9, 2023

broadcast_axis is only used internally in one places in NDFrame._where, and is ignored unless broadcast_axis == 1 and self.ndim != other.ndim. It also looks like the code path in which it is used is going to be buggy if there are non-unique columns. I doubt any users are using this. Let's deprecate it.

fill_axis has no non-test usages. It looks like it, along with method and limit, are used to do a fillna after the actual alignment.

We are also inconsistenet in what we do with fill_value. In _align_frame we pass fill_value to _reindex_with_indexers and dont pass it to that follow-up fillna. In align_series we do not pass it to reindex_indexer but do pass it to the follow-up fillna. My guess is that we could consistent pass it during the reindex step. Assuming that is the case, we should deprecate the fillna step and associated kwargs and tell users to do the fillna explicitly.

@jbrockmendel jbrockmendel added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Mar 9, 2023
@jbrockmendel jbrockmendel added the Deprecate Functionality to remove in pandas label Mar 11, 2023
@jorisvandenbossche
Copy link
Member

Assuming that is the case, we should deprecate the fillna step and associated kwargs and tell users to do the fillna explicitly.

+1 on deprecating/removing the "advanced" fill options in align, i.e. the backwards/forwards filling (interpolating) with a limit (so that would be the method, limit and fill_axis keywords).
I would keep the simple fill option with a scalar value (fill_value keyword), as that is something that can actually be done more efficiently while reindexing (if we then pass it consistently to the reindex operation, as you mention)


The broadcast_axis seems a somewhat useful option, and also doesn't add that much of complication in the code I would say. Not that I ever used it myself, but just to say that I think deprecating the fill related keywords is more useful (more keywords to get rid off, can easily be done with other existing APIs)

@jbrockmendel
Copy link
Member Author

Agrred method/limit/fill_axis are higher priorities, opend #51968 to deprecated those.

If we don't deprecate broadcast_axis, we should fix a) it looks like it will mess up with non-unique columns and b) the annotation is wrong (currently we're ignoring a non-spurious mypy complaint)

@jbrockmendel
Copy link
Member Author

xref #13194

@jbrockmendel
Copy link
Member Author

On today's call we discussed this and the mood was generally pro-deprecation for broadcast_axis since a) there is a simple "do X instead" and b) we can revert if people complain.

Implementing this I'm looking at the one non-test usage in NDFrame._where. The relevant cases have cond (the mask) having a different ndim than self. broadcast_axis was specifically implemented with the cond.ndim == 1 and self.ndim == 2 case in mind, and we have tests that hit that. We do not have tests for the opposite case, and it isn't even clear to me what that would mean. Any objection to just raising in that case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Deprecate Functionality to remove in pandas Needs Triage Issue that has not been reviewed by a pandas team member
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants