Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: Add type-hints in tzconversion.pyx #55241

Merged
merged 2 commits into from
Sep 22, 2023

Conversation

rhshadrach
Copy link
Member

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Part of #55179. With this, I'm seeing time plummet in the benchmark

tslibs.tz_convert.TimeTZConvert.time_tz_localize_to_utc(1000000, tzfile('/usr/share/zoneinfo/Asia/Tokyo'))

from 159ms (Cython 0.29) and 271ms (Cython 3.0.2) to ~5-6ms on both.

@rhshadrach rhshadrach added Performance Memory or execution speed performance Timezones Timezone data dtype labels Sep 22, 2023
@rhshadrach
Copy link
Member Author

Any suggestions on how to write a whatsnew note here? Perf improvement when localizing time to UTC?

Copy link
Member

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow, nice

and yeah, listing a perf improvement in whatsnew sounds good

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find

@rhshadrach rhshadrach added this to the 2.2 milestone Sep 22, 2023
@rhshadrach rhshadrach merged commit 2d16863 into pandas-dev:main Sep 22, 2023
39 checks passed
@rhshadrach rhshadrach deleted the tzconversion_type_hints branch September 22, 2023 14:59
@jbrockmendel
Copy link
Member

This is great. One potential future issue to be aware of though: this locks us in to having only 1D ndarrays passed to _get_utc_bounds. This isn't a problem ATM since the calling function tz_localize_to_utc is already locked in. But if we ever wanted to make tz_localize_to_utc handle unknown-ndim (which is on my radar bc it would allow removing the potentially-copying @dtl.ravel_compat from DatetimeArray.tz_localize) this would become an issue. So no worries, just heads up.

@rhshadrach
Copy link
Member Author

In that case to keep perf we'd have to resort to fused types, is that right?

@jbrockmendel
Copy link
Member

AFAIK there isn't a nice solution to the arbitrary-ndim use case. In practice it is always either 1 or 2 (maybe 0 in some corner cases) so we could make a fused type for just those. This won't be an issue anytime soon so this is just a "be aware of". im kind of hoping cython adds support for dtype-declaring ndarrays without declaring the dimension

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Performance Memory or execution speed performance Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants