PERF: perf improvements in DataFrame construction with a non-daily datelike index (GH6479) #6481

Merged: 2 commits, Feb 27, 2014

jreback
Contributor

@jreback jreback commented Feb 26, 2014

closes #6479

@jreback jreback added this to the 0.14.0 milestone Feb 26, 2014
@qwhelan
Contributor

qwhelan commented Feb 26, 2014

This is going to move the issue from 'M' to 'D' since Day()._should_cache() != MonthEnd()._should_cache()

@jreback
Contributor Author

jreback commented Feb 26, 2014

I know. The problem is that all of the class definitions got changed (CachableOffset was always the 2nd class). I remember this as an issue, but nothing broke, and I recall it was just changed. So now I need to test cases and see what slows down, or I may just change the classes back.
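To illustrate why the position of a mixin in the base-class list can matter, here is a minimal sketch of Python's method resolution order. The class names (`Offset`, `CacheableMixin`, `MixinSecond`, `MixinFirst`) and the `_cacheable` attribute are hypothetical stand-ins chosen for this example, not the actual pandas definitions, and this may not be exactly what happened in the pandas code.

```python
# Hypothetical stand-ins for illustration only; these are not the pandas classes.
class Offset:
    _cacheable = False  # default: do not cache

    def should_cache(self):
        # Attribute lookup walks the MRO, so the first class that
        # defines _cacheable decides the answer.
        return self._cacheable


class CacheableMixin:
    _cacheable = True  # opt in to caching


class MixinSecond(Offset, CacheableMixin):
    """Mixin listed second: Offset._cacheable is found first."""


class MixinFirst(CacheableMixin, Offset):
    """Mixin listed first: CacheableMixin._cacheable is found first."""


# Show the lookup order and the resulting behavior for both layouts.
for cls in (MixinSecond, MixinFirst):
    order = [c.__name__ for c in cls.__mro__]
    print(cls.__name__, order, cls().should_cache())
```

Under these assumptions, swapping the order of the bases silently flips the caching behavior without raising any error, which fits the observation that "nothing broke" while timings changed.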

@jreback
Contributor Author

jreback commented Feb 26, 2014

So you are correct... I now have to fix daily.
I don't think much else broke though, so that's good.

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_ctor_dtindex_monthly                   |   1.2960 | 197.8787 |   0.0065 |
frame_assign_timeseries_index                |   0.6283 |   1.0956 |   0.5735 |
datetime_index_intersection                  |   0.3193 |   0.5510 |   0.5795 |
datetime_index_union                         |   0.1477 |   0.0737 |   2.0043 |
frame_ctor_dtindex_daily                     |   2.5813 |   0.9703 |   2.6603 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [efe176d] : PERF: perf improvements in DataFrame construction with a non-daily datelike index (GH6479)
Base   [ed20b1d] : Merge pull request #6477 from jreback/resample

BUG: Bug in resample with a timezone and certain offsets (GH6397)
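For reading the table above: the ratio column is head time divided by base time. The sketch below recomputes it for two rows copied from the table. The helper name `ratio` is made up for this example, and because the printed timings are rounded, a recomputed ratio may differ from the table in the last digits.

```python
def ratio(head_ms, base_ms):
    """Return head/base; values below 1.0 mean the target commit is faster."""
    return head_ms / base_ms


# (head[ms], base[ms]) pairs copied from the table above.
rows = {
    "frame_ctor_dtindex_monthly": (1.2960, 197.8787),
    "frame_ctor_dtindex_daily": (2.5813, 0.9703),
}

for name, (head, base) in rows.items():
    r = ratio(head, base)
    if r < 1.0:
        summary = "about %.0fx faster than base" % (1.0 / r)
    else:
        summary = "about %.1fx slower than base" % r
    print("%-30s ratio=%.4f  %s" % (name, r, summary))
```

Read this way, the monthly constructor improves by roughly two orders of magnitude, while the daily constructor regresses by a factor of about 2.7, which is the 'M' to 'D' shift discussed above.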

@qwhelan
Contributor

qwhelan commented Feb 26, 2014

I can dig into this in a few hours, if you'd like to move on to something else.

@jreback
Contributor Author

jreback commented Feb 26, 2014

This is versus 0.12. I just turned caching off anyhow (it was only on for the business* offsets). I think caching should be triggered by the user, since it attempts to cache a fairly large range (which is why the business-day offsets take so long in 0.12).

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_ctor_dtindex_BMonthEnd(1)              |   1.3034 | 201.6877 |   0.0065 |
frame_ctor_dtindex_BusinessDay(1)            |   1.2206 |  70.0500 |   0.0174 |
frame_ctor_dtindex_BDay(1)                   |   1.2197 |  69.9514 |   0.0174 |
frame_ctor_dtindex_BQuarterBegin(1)          |   1.3313 |   1.2209 |   1.0904 |
frame_ctor_dtindex_BQuarterEnd(1)            |   1.3280 |   1.1880 |   1.1179 |
frame_ctor_dtindex_QuarterEnd(1)             |   1.3134 |   1.1717 |   1.1209 |
frame_ctor_dtindex_QuarterBegin(1)           |   1.3123 |   1.1700 |   1.1217 |
frame_ctor_dtindex_BMonthBegin(1)            |   1.3143 |   1.1060 |   1.1883 |
frame_ctor_dtindex_BMonthBegin(2)            |   1.3139 |   1.1044 |   1.1898 |
frame_ctor_dtindex_MonthBegin(1)             |   1.2810 |   1.0583 |   1.2104 |
frame_ctor_dtindex_CDay(1)                   |   1.5373 |   1.2674 |   1.2130 |
frame_ctor_dtindex_MonthBegin(2)             |   1.2820 |   1.0526 |   1.2179 |
frame_ctor_dtindex_CustomBusinessDay(1)      |   1.5484 |   1.2643 |   1.2247 |
frame_ctor_dtindex_CDay(2)                   |   1.5473 |   1.2604 |   1.2277 |
frame_ctor_dtindex_CustomBusinessDay(2)      |   1.5490 |   1.2610 |   1.2284 |
frame_ctor_dtindex_BMonthEnd(2)              |   1.3043 |   1.0297 |   1.2667 |
frame_ctor_dtindex_MonthEnd(1)               |   1.2987 |   1.0246 |   1.2674 |
frame_ctor_dtindex_MonthEnd(2)               |   1.2963 |   1.0223 |   1.2680 |
frame_ctor_dtindex_BDay(2)                   |   1.2493 |   0.9780 |   1.2774 |
frame_ctor_dtindex_BusinessDay(2)            |   1.2507 |   0.9727 |   1.2858 |
frame_ctor_dtindex_Week(1)                   |   1.0883 |   0.8347 |   1.3038 |
frame_ctor_dtindex_Week(2)                   |   1.0943 |   0.8323 |   1.3147 |
frame_ctor_dtindex_Micro(1)                  |   0.9180 |   0.6970 |   1.3171 |
frame_ctor_dtindex_Day(2)                    |   0.9183 |   0.6950 |   1.3213 |
frame_ctor_dtindex_Day(1)                    |   0.9216 |   0.6947 |   1.3267 |
frame_ctor_dtindex_Hour(1)                   |   0.9259 |   0.6976 |   1.3273 |
frame_ctor_dtindex_Second(2)                 |   0.9234 |   0.6951 |   1.3285 |
frame_ctor_dtindex_Minute(2)                 |   0.9267 |   0.6973 |   1.3289 |
frame_ctor_dtindex_Micro(2)                  |   0.9274 |   0.6957 |   1.3330 |
frame_ctor_dtindex_Hour(2)                   |   0.9290 |   0.6950 |   1.3368 |
frame_ctor_dtindex_Milli(2)                  |   0.9280 |   0.6933 |   1.3385 |
frame_ctor_dtindex_Milli(1)                  |   0.9300 |   0.6947 |   1.3387 |
frame_ctor_dtindex_Second(1)                 |   0.9323 |   0.6944 |   1.3427 |
frame_ctor_dtindex_Minute(1)                 |   0.9433 |   0.6930 |   1.3611 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

@jreback
Contributor Author

jreback commented Feb 26, 2014

vs master

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_ctor_dtindex_BMonthEnd(1)              |   1.3040 | 227.7931 |   0.0057 |
frame_ctor_dtindex_BQuarterEnd(1)            |   1.3280 | 223.8190 |   0.0059 |
frame_ctor_dtindex_BQuarterBegin(1)          |   1.3313 | 211.0933 |   0.0063 |
frame_ctor_dtindex_BMonthBegin(1)            |   1.3133 | 203.5247 |   0.0065 |
frame_ctor_dtindex_MonthEnd(1)               |   1.3046 | 193.4933 |   0.0067 |
frame_ctor_dtindex_QuarterEnd(1)             |   1.3210 | 192.7724 |   0.0069 |
frame_ctor_dtindex_QuarterBegin(1)           |   1.3146 | 191.1183 |   0.0069 |
frame_ctor_dtindex_MonthBegin(1)             |   1.2854 | 182.2054 |   0.0071 |
frame_ctor_dtindex_BusinessDay(1)            |   1.2253 |  92.1127 |   0.0133 |
frame_ctor_dtindex_BDay(1)                   |   1.2267 |  92.1164 |   0.0133 |

@jreback
Contributor Author

jreback commented Feb 26, 2014

@qwhelan pls review when you have a chance

jreback added a commit that referenced this pull request Feb 27, 2014
PERF: perf improvements in DataFrame construction with a non-daily datelike index (GH6479)
@jreback jreback merged commit 8cd9819 into pandas-dev:master Feb 27, 2014
Labels: Frequency (DateOffsets), Performance (memory or execution speed)
Development

Successfully merging this pull request may close these issues.

Serious performance regression in DataFrame construction with monthly DatetimeIndex
2 participants