@MarcoGorelli
Contributor

This is something I've noticed on the way towards #2622 (which may not be that far off!)

The builtin Python max has to iterate over the Series one element at a time in the interpreter, which is much slower than calling the native Series.max, where the reduction runs in a low-level language.

Example:

In [43]: s = pd.Series(rng.integers(0, 10, size=1_000_000))

In [44]: %timeit max(s) > 1
65.4 ms ± 5.59 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [45]: %timeit s.max() > 1
221 μs ± 16.5 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
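
To reproduce the comparison outside IPython, something like the following standalone sketch should work. The seed, the repeat count, and the names `builtin_time` and `native_time` are assumptions added for illustration and are not part of the session above; absolute timings will differ by machine.

```python
import timeit

import numpy as np
import pandas as pd

# Assumption: the original `rng` was a NumPy Generator; the seed here is arbitrary.
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 10, size=1_000_000))

# Both approaches should agree on the result; only the speed differs.
assert max(s) == s.max()

repeats = 5  # assumption: kept small so the script finishes quickly

# Builtin max: iterates over the Series element by element in Python.
builtin_time = timeit.timeit(lambda: max(s) > 1, number=repeats) / repeats

# Native Series.max: the reduction runs in compiled code.
native_time = timeit.timeit(lambda: s.max() > 1, number=repeats) / repeats

print(f"max(s):  {builtin_time * 1e3:.2f} ms per call")
print(f"s.max(): {native_time * 1e3:.2f} ms per call")
```

The equality check is there to make it explicit that swapping `max(s)` for `s.max()` does not change the result for this data, only how it is computed.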

@MarcoGorelli MarcoGorelli marked this pull request as ready for review December 2, 2024 18:28