Fastpaths for Timestamp properties #18539

Merged
merged 3 commits into from
Nov 29, 2017

jbrockmendel
Member

Addresses a bunch of the TimestampProperties regressions in #18532. ASV results vs. 0.21.0:

asv continuous -f 1.1 -E virtualenv 81372093 HEAD -b TimestampProperties
[...]
       before           after         ratio
     [81372093]       [5fc79fb0]
+     5.73±0.02μs      10.5±0.03μs     1.84  timestamp.TimestampProperties.time_dayofyear(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
+     5.96±0.04μs      10.9±0.03μs     1.82  timestamp.TimestampProperties.time_is_month_end(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
+     5.80±0.01μs       10.5±0.1μs     1.82  timestamp.TimestampProperties.time_days_in_month(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
+     6.18±0.01μs       10.6±0.6μs     1.72  timestamp.TimestampProperties.time_week(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     5.82±0.01μs          337±4ns     0.06  timestamp.TimestampProperties.time_is_leap_year(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     5.99±0.09μs         326±10ns     0.05  timestamp.TimestampProperties.time_quarter(None)
-     5.99±0.01μs        324±0.6ns     0.05  timestamp.TimestampProperties.time_is_year_end(None)
-     5.90±0.01μs        319±0.5ns     0.05  timestamp.TimestampProperties.time_is_month_start(None)
-     6.02±0.02μs          321±1ns     0.05  timestamp.TimestampProperties.time_is_leap_year(None)
-     6.08±0.05μs          321±2ns     0.05  timestamp.TimestampProperties.time_is_year_end(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     6.06±0.01μs        316±0.8ns     0.05  timestamp.TimestampProperties.time_is_month_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     6.19±0.09μs        319±0.7ns     0.05  timestamp.TimestampProperties.time_is_quarter_start(None)
-      6.09±0.2μs        310±0.6ns     0.05  timestamp.TimestampProperties.time_quarter(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-      6.38±0.2μs        323±0.7ns     0.05  timestamp.TimestampProperties.time_is_year_start(None)
-      6.90±0.4μs          337±4ns     0.05  timestamp.TimestampProperties.time_is_quarter_end(None)
-      6.73±0.1μs          325±2ns     0.05  timestamp.TimestampProperties.time_is_year_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-      6.75±0.2μs          321±1ns     0.05  timestamp.TimestampProperties.time_is_quarter_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-      7.14±0.1μs        325±0.4ns     0.05  timestamp.TimestampProperties.time_is_quarter_end(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)

Timestamps with freqs still don't benefit from these fastpaths, but that's a relatively rare case.
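
For readers unfamiliar with the idea, a fastpath here means answering the property straight from the scalar's year, month and day. What follows is a minimal pure-Python sketch of that idea, not the pandas implementation: it uses `datetime.date` as a stand-in for `Timestamp`, assumes no freq is set, and every function name is illustrative.

```python
import calendar
from datetime import date


def quarter(d: date) -> int:
    """Quarter of the year, 1 through 4."""
    return (d.month - 1) // 3 + 1


def is_leap_year(d: date) -> bool:
    return calendar.isleap(d.year)


def days_in_month(d: date) -> int:
    # monthrange returns (weekday of the 1st, number of days in the month)
    return calendar.monthrange(d.year, d.month)[1]


def is_month_start(d: date) -> bool:
    return d.day == 1


def is_month_end(d: date) -> bool:
    return d.day == days_in_month(d)


def is_quarter_start(d: date) -> bool:
    # Calendar quarters start in January, April, July and October
    return d.day == 1 and d.month % 3 == 1


def is_quarter_end(d: date) -> bool:
    # Calendar quarters end in March, June, September and December
    return d.month % 3 == 0 and is_month_end(d)


def is_year_start(d: date) -> bool:
    return d.month == 1 and d.day == 1


def is_year_end(d: date) -> bool:
    return d.month == 12 and d.day == 31
```

None of these touch an array or a frequency object, which is why a version like this only makes sense when no freq is attached.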

@jbrockmendel
Member Author

Or just against master(ish):

asv continuous -f 1.1 -E virtualenv 34b036c6 HEAD -b TimestampProperties
[...]
       before           after         ratio
     [34b036c6]       [5fc79fb0]
-     6.16±0.01μs          325±1ns     0.05  timestamp.TimestampProperties.time_is_quarter_start(None)
-     6.16±0.07μs          318±2ns     0.05  timestamp.TimestampProperties.time_is_month_start(None)
-     6.39±0.02μs          325±2ns     0.05  timestamp.TimestampProperties.time_is_leap_year(None)
-      6.19±0.1μs          313±1ns     0.05  timestamp.TimestampProperties.time_quarter(None)
-     6.44±0.01μs          320±1ns     0.05  timestamp.TimestampProperties.time_is_year_end(None)
-     6.64±0.01μs          328±3ns     0.05  timestamp.TimestampProperties.time_is_year_start(None)
-      6.80±0.2μs          327±1ns     0.05  timestamp.TimestampProperties.time_is_quarter_end(None)
-     10.0±0.03μs          327±3ns     0.03  timestamp.TimestampProperties.time_quarter(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-      10.7±0.2μs          330±3ns     0.03  timestamp.TimestampProperties.time_is_year_end(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     11.0±0.08μs          325±2ns     0.03  timestamp.TimestampProperties.time_is_leap_year(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     11.3±0.08μs          329±2ns     0.03  timestamp.TimestampProperties.time_is_year_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     11.1±0.07μs          314±2ns     0.03  timestamp.TimestampProperties.time_is_month_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-     11.3±0.06μs          319±5ns     0.03  timestamp.TimestampProperties.time_is_quarter_start(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)
-      11.9±0.1μs          329±3ns     0.03  timestamp.TimestampProperties.time_is_quarter_end(<DstTzInfo 'Europe/Amsterdam' LMT+0:20:00 STD>)

@codecov

codecov bot commented Nov 28, 2017

Codecov Report

Merging #18539 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #18539      +/-   ##
==========================================
- Coverage   91.35%   91.33%   -0.02%     
==========================================
  Files         164      164              
  Lines       49801    49802       +1     
==========================================
- Hits        45494    45487       -7     
- Misses       4307     4315       +8

Flag                           Coverage Δ
#multiple                      89.13% <ø> (ø) ⬆️
#single                        40.81% <ø> (-0.07%) ⬇️

Impacted Files                 Coverage Δ
pandas/io/gbq.py               25% <0%> (-58.34%) ⬇️
pandas/core/frame.py           97.81% <0%> (-0.1%) ⬇️
pandas/core/base.py            96.55% <0%> (ø) ⬆️
pandas/core/groupby.py         92.04% <0%> (+0.01%) ⬆️
pandas/core/indexes/base.py    96.47% <0%> (+0.04%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 88ab693...9baeb6d.

@jreback
Contributor

jreback commented Nov 28, 2017

Can you confirm we have tests for all of these, both with and without freq, and asvs for the same?

In [13]: t = pd.Timestamp('20130101')

In [14]: %timeit t.is_month_start
11.2 µs ± 178 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [15]: t = pd.Timestamp('20130101', freq='B')

In [16]: %timeit t.is_month_start
12.6 µs ± 60.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
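
Outside IPython, a similar measurement can be approximated with the standard library's `timeit` module. The sketch below is only a stand-in: `time_property` is a hypothetical helper, and it times a plain `datetime` attribute check, not a pandas `Timestamp`, so its numbers are not comparable to the ones above.

```python
import statistics
import timeit
from datetime import datetime


def time_property(stmt, namespace, repeat=7, number=100_000):
    """Return (mean, stdev) in seconds per loop, similar to what %timeit reports."""
    # timeit.repeat returns one total time per run of `number` loops
    totals = timeit.repeat(stmt, globals=namespace, repeat=repeat, number=number)
    per_loop = [total / number for total in totals]
    return statistics.mean(per_loop), statistics.stdev(per_loop)


t = datetime(2013, 1, 1)
mean, std = time_property("t.day == 1", {"t": t})
print(f"{mean * 1e9:.0f} ns ± {std * 1e9:.1f} ns per loop")
```

Swapping the statement string and the object in `namespace` is enough to time a different property.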

@jreback added the Performance and Datetime labels Nov 28, 2017
@@ -304,10 +304,12 @@ cdef class _Timestamp(datetime):
         out = get_date_field(np.array([val], dtype=np.int64), field)
         return int(out[0])
 
-    cpdef _get_start_end_field(self, field):
+    cpdef bint _get_start_end_field(self, str field):
Contributor

So why is this in _Timestamp again (as opposed to Timestamp)? That is why it's cpdef; why not just cdef?

Member Author

Not sure why it's in _Timestamp instead of Timestamp (though my understanding is that putting it in _Timestamp is slightly more performant and has a smaller memory footprint); it was that way before I got here.

It is cpdef and not cdef because, if it were cdef, calling it from Timestamp would raise an AttributeError. That's why #18446 moved a bunch of properties up to _Timestamp after making it cdef.

@jbrockmendel
Member Author

> Can you confirm we have tests for all of these, both with and without freq

Just pushed with these added.
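
The added tests are not shown in this thread. As a rough illustration of the kind of check involved, the sketch below compares a direct calendar-arithmetic answer against an independent reference for every day of a few years. It covers only the no-freq case, uses `datetime.date` in place of `Timestamp`, and all function names are hypothetical.

```python
import calendar
from datetime import date, timedelta


def fast_is_month_end(d: date) -> bool:
    """Direct check: is this the last day of its month?"""
    return d.day == calendar.monthrange(d.year, d.month)[1]


def reference_is_month_end(d: date) -> bool:
    """Independent check: the following day falls in a different month."""
    return (d + timedelta(days=1)).month != d.month


def fast_is_quarter_start(d: date) -> bool:
    return d.day == 1 and d.month in (1, 4, 7, 10)


def reference_is_quarter_start(d: date) -> bool:
    """Independent check: the previous day belongs to a different quarter."""
    prev = d - timedelta(days=1)
    return (prev.month - 1) // 3 != (d.month - 1) // 3


def check_year(year: int) -> int:
    """Assert both versions agree on every day of `year`; return days checked."""
    d = date(year, 1, 1)
    checked = 0
    while d.year == year:
        assert fast_is_month_end(d) == reference_is_month_end(d), d
        assert fast_is_quarter_start(d) == reference_is_quarter_start(d), d
        d += timedelta(days=1)
        checked += 1
    return checked


# Include leap and non-leap years, plus the century edge cases 1900 and 2000
for year in (1900, 2000, 2016, 2017):
    check_year(year)
```

A with-freq counterpart would need the frequency-aware code path as the reference, which this stdlib-only sketch does not attempt.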

@jreback jreback added this to the 0.22.0 milestone Nov 29, 2017
@jreback jreback merged commit 32f562d into pandas-dev:master Nov 29, 2017
@jreback
Contributor

jreback commented Nov 29, 2017

thanks!

@jbrockmendel jbrockmendel deleted the regressions branch December 8, 2017 19:38