Add support for axis=None in reductions #16229

Draft
wants to merge 4 commits into
base: branch-24.12

Conversation

Matt711
Contributor

@Matt711 Matt711 commented Jul 9, 2024

Description

Closes #12335

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 added the feature request (New feature or request), Python (Affects Python cuDF API), and breaking (Breaking change) labels Jul 9, 2024
@Matt711 Matt711 self-assigned this Jul 9, 2024

copy-pr-bot bot commented Jul 10, 2024

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

    return getattr(concat_columns(source._data.columns), op)(
        **kwargs
    )
elif axis == 2 and op in {"std", "var"}:

Contributor Author

How should std and var reductions be treated? I think option 1 is right, but I wanted to check.

  1. Combine the columns, then compute the reduction.
  2. Compute the reduction over each column, then compute the reduction again over those results.

cc @mroeschke, @vyasr
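
The two options generally give different answers. The following is a minimal sketch in plain Python (standard library only, not cuDF code) that illustrates the gap. The column values and the helper names `sample_var`, `var_option_1`, and `var_option_2` are hypothetical, and sample variance (ddof=1) is assumed.

```python
# Hypothetical two-column frame, stored as a list of columns.
columns = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]


def sample_var(values):
    """Sample variance (ddof=1); assumes at least two values."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def var_option_1(cols):
    """Option 1: combine all columns, then reduce once."""
    flat = [v for col in cols for v in col]
    return sample_var(flat)


def var_option_2(cols):
    """Option 2: reduce each column, then reduce those results."""
    per_column = [sample_var(col) for col in cols]
    return sample_var(per_column)


opt1 = var_option_1(columns)  # variance of all six values
opt2 = var_option_2(columns)  # variance of the two per-column variances
print(opt1, opt2)
```

For this data, option 1 measures the spread of all six values, while option 2 measures how much the two per-column variances differ from each other, so the results are not interchangeable.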

Contributor

Regarding option 1, we do have to keep this comment in mind: #14930 (comment)

In terms of correctness, I forget whether std and var validate the minimum number of elements in the columns, but if they do, the logic for axis=None should be something like:

  1. if self.size < minimum_std_var_elements: Return NA
  2. if len(self) < minimum_std_var_elements and self.size > minimum_std_var_elements: Your option 1
  3. else: Your option 2
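
The three branches above can be sketched in plain Python as follows. This is only a transcription of the outline, not the PR's implementation. It assumes `minimum_std_var_elements` is 2 (what sample variance with ddof=1 needs), uses `math.nan` to stand in for NA, and models the frame as a hypothetical list of equal-length columns. The helper `sample_var` returns NaN for too-short input so that edge cases the outline does not spell out (for example, a single column reaching branch 3) do not raise.

```python
import math

MINIMUM_STD_VAR_ELEMENTS = 2  # assumption: ddof=1 needs at least two values


def sample_var(values, minimum=MINIMUM_STD_VAR_ELEMENTS):
    """Sample variance (ddof=1); NaN when there are too few values."""
    n = len(values)
    if n < minimum:
        return math.nan
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def var_axis_none(cols, minimum=MINIMUM_STD_VAR_ELEMENTS):
    """Variance over the whole frame, following the three-branch outline."""
    n_rows = len(cols[0]) if cols else 0  # stands in for len(self)
    size = n_rows * len(cols)             # stands in for self.size

    # Branch 1: not enough elements in the whole frame.
    if size < minimum:
        return math.nan

    # Branch 2: columns are too short individually, but the frame
    # as a whole has enough elements, so combine first (option 1).
    if n_rows < minimum and size > minimum:
        flat = [v for col in cols for v in col]
        return sample_var(flat, minimum)

    # Branch 3: reduce per column, then reduce those results (option 2).
    per_column = [sample_var(col, minimum) for col in cols]
    return sample_var(per_column, minimum)


too_small = var_axis_none([[5.0]])                    # branch 1
one_row = var_axis_none([[1.0], [2.0], [4.0]])        # branch 2
regular = var_axis_none([[1.0, 2.0, 3.0],
                         [10.0, 20.0, 30.0]])         # branch 3
```

Note that a frame with `self.size` exactly equal to the minimum and fewer rows than the minimum falls through to branch 3 under the conditions as written; the sketch keeps that behavior rather than guessing at the intent.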

@Matt711 Matt711 changed the base branch from branch-24.08 to branch-24.10 July 24, 2024 16:11
@Matt711
Contributor Author

Matt711 commented Sep 24, 2024

TODO: Check if this is still relevant

@Matt711 Matt711 changed the base branch from branch-24.10 to branch-24.12 September 24, 2024 03:18
Labels
breaking (Breaking change), feature request (New feature or request), Python (Affects Python cuDF API)
Projects
Status: In Progress
Development

Successfully merging this pull request may close these issues.

[FEA] Support axis=None in reductions
2 participants