PERF: add ._simple_new method to masked arrays #53013

Merged

topper-123
Contributor

Add a ._simple_new method to BaseMaskedArray and its child classes so that instances can be created without validation, which gives better performance when the inputs are already known to be valid.

Example:

>>> import numpy as np
>>> import pandas as pd
>>> arr = pd.array(np.arange(2), dtype="Int32")
>>> %timeit arr.reshape(-1, 1)
1.4 µs ± 12 ns per loop  # main
544 ns ± 1.87 ns per loop  # this PR
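
The idea behind a validation-free constructor can be sketched as follows. This is a minimal illustration, not the code from this PR: `MaskedSketch` is a hypothetical stand-in class, and the attribute names `_data` and `_mask` and the specific checks in `__init__` are assumptions made for the example.

```python
import numpy as np


class MaskedSketch:
    """Hypothetical stand-in for a masked array; not the pandas class."""

    def __init__(self, values: np.ndarray, mask: np.ndarray) -> None:
        # Public constructor: validate the inputs before storing them.
        if not (isinstance(mask, np.ndarray) and mask.dtype == np.bool_):
            raise TypeError("mask must be a boolean numpy array")
        if values.shape != mask.shape:
            raise ValueError("values and mask must have the same shape")
        self._data = values
        self._mask = mask

    @classmethod
    def _simple_new(cls, values: np.ndarray, mask: np.ndarray) -> "MaskedSketch":
        # Fast path: create the object without calling __init__, so none
        # of the checks above run. The caller must pass valid inputs.
        result = object.__new__(cls)
        result._data = values
        result._mask = mask
        return result

    def reshape(self, *shape: int) -> "MaskedSketch":
        # Reshaping already-valid data and mask in the same way keeps them
        # consistent, so re-validating here would be wasted work.
        return self._simple_new(
            self._data.reshape(*shape), self._mask.reshape(*shape)
        )


arr = MaskedSketch(np.arange(2), np.zeros(2, dtype=bool))
reshaped = arr.reshape(-1, 1)
print(reshaped._data.shape, reshaped._mask.shape)  # (2, 1) (2, 1)
```

Internal methods such as `reshape` operate on data that has already been checked once, so routing them through the fast path skips repeated checks without weakening the public constructor.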

Motivated by performance considerations for #52836.

@mroeschke added the Performance (memory or execution speed performance) and NA - MaskedArrays (related to pd.NA and nullable extension arrays) labels May 1, 2023
@topper-123
Contributor Author

topper-123 commented May 6, 2023

Ping @jbrockmendel & @rhshadrach. This is currently contained in #52788 to fix a slowdown I saw when working on that PR, but it can be reviewed independently.

Member

@rhshadrach rhshadrach left a comment

lgtm

@rhshadrach
Member

@topper-123 - just a conflict to resolve

@topper-123 added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) label May 7, 2023
@topper-123 topper-123 merged commit fa29f09 into pandas-dev:main May 7, 2023
36 checks passed
@topper-123 topper-123 deleted the perf_masked_arrays_simple_instantiation branch May 7, 2023 15:44
@topper-123
Contributor Author

I've got some social obligations, so I'm merging this now so that #52788 can be rebased and ready when I'm back...

Rylie-W pushed a commit to Rylie-W/pandas that referenced this pull request May 19, 2023
* PERF: add _simple_new method to masked arrays
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* PERF: add _simple_new method to masked arrays
Labels
NA - MaskedArrays (related to pd.NA and nullable extension arrays), Performance (memory or execution speed performance), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
4 participants