Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: masked ops for reductions (min/max) #33261

Merged
merged 7 commits into from
Apr 6, 2020

Conversation

jorisvandenbossche
Copy link
Member

Follow-up on #30982, adding similar mask reduction but now for min/max.

@jorisvandenbossche jorisvandenbossche added NA - MaskedArrays Related to pd.NA and nullable extension arrays Performance Memory or execution speed performance labels Apr 3, 2020
@jorisvandenbossche jorisvandenbossche added this to the 1.1 milestone Apr 3, 2020


min = _minmax(np.min)
max = _minmax(np.max)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since mypy is a static checker, min and max will resolve to Any.

something to consider is that maybe could do something like

def min(values: np.ndarray, mask: np.ndarray, skipna: bool = True):
    return _reduction(np.min, values=values, mask=mask, skipna=skipna)


def max(values: np.ndarray, mask: np.ndarray, skipna: bool = True):
    return _reduction(np.max, values=values, mask=mask, skipna=skipna)

and remove _minmax and move func into reduction signature.

This issue occurs elsewhere in the codebase, so this is not a blocker, but may give more confidence in the refactors.

alternatively, would using functools.partial be clearer. (newer versions on mypy might support this better)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, adapted as you suggested (it gives an extra function call in runtime (instead of only at import time), but that should be negligible)

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. some comments by others.

@jreback jreback merged commit cdc4d97 into pandas-dev:master Apr 6, 2020
@jreback
Copy link
Contributor

jreback commented Apr 6, 2020

thanks @jorisvandenbossche

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NA - MaskedArrays Related to pd.NA and nullable extension arrays Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants