PERF: significantly improve performance of MultiIndex.shape #27384

Merged
merged 2 commits into pandas-dev:master from qwhelan:shape on Jul 18, 2019

Conversation

@qwhelan
Contributor

commented Jul 14, 2019

MultiIndex.shape is currently extremely slow because it triggers the creation of ._values, which can be quite expensive for datetime levels. The one mitigating factor is that the result is cached, which makes ._values.shape near-instant on subsequent calls but also makes the slowdown hard to catch in asv benchmarks. This commit adds a suite dedicated to measuring such cached properties on Index objects.

asv results show a ~400,000x speedup for a relatively straightforward case:

       before           after         ratio
     [269d3681]       [d205acf6]
     <master>       <shape>   
-      3.52±0.02s       8.33±0.2μs     0.00  index_cached_properties.MultiIndexCached.time_shape
  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry
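
To illustrate the idea in the description, here is a minimal sketch using a toy class rather than pandas itself. Everything in it (ToyMultiIndex, slow_shape, the materialized counter) is hypothetical and only stands in for the pattern: a shape derived from a cached but expensive ._values versus a shape derived from the length alone. It is not the code changed in this PR.

```python
from functools import cached_property


class ToyMultiIndex:
    """Hypothetical stand-in for a multi-level index; not pandas code."""

    def __init__(self, codes):
        # codes: one list of integer positions per level, all the same length
        self.codes = codes
        self.materialized = 0  # counts how often the expensive path runs

    def __len__(self):
        # Cheap: only looks at the length of the first level's codes
        return len(self.codes[0])

    @cached_property
    def _values(self):
        # Expensive: builds one tuple per row, then caches the result
        self.materialized += 1
        return list(zip(*self.codes))

    @property
    def slow_shape(self):
        # Forces _values to be built just to learn the length
        return (len(self._values),)

    @property
    def shape(self):
        # Same answer without touching _values
        return (len(self),)


idx = ToyMultiIndex([[0, 0, 1], [0, 1, 0]])
fast = idx.shape                         # does not build _values
built_after_fast = idx.materialized
slow = idx.slow_shape                    # builds _values once
slow_again = idx.slow_shape              # served from the cache
built_after_slow = idx.materialized
```

In this sketch both properties return the same tuple, but only slow_shape pays the one-time materialization cost, which is the kind of cost the first benchmark row above is measuring.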

@qwhelan qwhelan force-pushed the qwhelan:shape branch from e7b6c75 to af102aa Jul 14, 2019


@qwhelan qwhelan force-pushed the qwhelan:shape branch from af102aa to ca16e6c Jul 18, 2019

@qwhelan

Contributor Author

commented Jul 18, 2019

Updated asv results show that moving shape into Index also benefits a few other classes significantly:

       before           after         ratio
     [a4c19e7a]       [3c946017]
     <unsorted_cats~1>       <shape>   
-     2.59±0.07μs      2.16±0.06μs     0.83  index_cached_properties.IndexCache.time_shape('PeriodIndex')
-     2.74±0.09μs       2.26±0.1μs     0.83  index_cached_properties.IndexCache.time_shape('DatetimeIndex')
-      5.06±0.2μs       3.57±0.2μs     0.70  index_cached_properties.IndexCache.time_shape('UInt64Index')
-      5.80±0.4μs       3.70±0.3μs     0.64  index_cached_properties.IndexCache.time_shape('Float64Index')
-      6.40±0.4μs       4.08±0.3μs     0.64  index_cached_properties.IndexCache.time_shape('TimedeltaIndex')
-      6.80±0.3μs       3.88±0.2μs     0.57  index_cached_properties.IndexCache.time_shape('IntervalIndex')
-        65.2±1μs         903±20ns     0.01  index_cached_properties.IndexCache.time_shape('Int64Index')
-      65.1±0.9μs         892±10ns     0.01  index_cached_properties.IndexCache.time_shape('RangeIndex')
-         214±2ms       4.45±0.2μs     0.00  index_cached_properties.IndexCache.time_shape('MultiIndex')
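
Because the expensive result is cached, a timing loop that reuses one object mostly measures the cache hit. The following stdlib-only sketch shows one way to time the first access instead, by building a fresh object for every sample. The Expensive class and the time_cold / time_warm helpers are assumptions made for illustration; they are not part of asv or of the benchmark suite added in this PR.

```python
import time
from functools import cached_property


class Expensive:
    """Hypothetical object with a costly cached attribute."""

    @cached_property
    def values(self):
        # Stand-in for an expensive materialization step
        return sum(i * i for i in range(200_000))


def time_access(obj, attr):
    """Return the seconds taken by a single getattr(obj, attr)."""
    start = time.perf_counter()
    getattr(obj, attr)
    return time.perf_counter() - start


def time_cold(factory, attr, repeat=5):
    """Best-of-`repeat` timing where every sample uses a fresh object."""
    return min(time_access(factory(), attr) for _ in range(repeat))


def time_warm(factory, attr, repeat=5):
    """Best-of-`repeat` timing on one object whose cache is already filled."""
    obj = factory()
    getattr(obj, attr)  # fill the cache before measuring
    return min(time_access(obj, attr) for _ in range(repeat))


cold = time_cold(Expensive, "values")
warm = time_warm(Expensive, "values")
print(f"cold: {cold:.6f}s  warm: {warm:.6f}s")
```

The cold figure corresponds to what a user pays on the first call, while the warm figure is what a naive repeated benchmark would tend to report.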

@TomAugspurger TomAugspurger added this to the 0.25.0 milestone Jul 18, 2019

@WillAyd WillAyd referenced this pull request Jul 18, 2019

Closed

RLS: 0.25.0 #24950

@TomAugspurger TomAugspurger merged commit 44322d1 into pandas-dev:master Jul 18, 2019

15 checks passed

codecov/patch 100% of diff hit (target 50%)
codecov/project 92.93% (target 82%)
continuous-integration/travis-ci/pr The Travis CI build passed
pandas-dev.pandas Build #20190718.7 succeeded
pandas-dev.pandas (Checks) Checks succeeded
pandas-dev.pandas (Docs) Docs succeeded
pandas-dev.pandas (Linux py35_compat) Linux py35_compat succeeded
pandas-dev.pandas (Linux py36_32bit) Linux py36_32bit succeeded
pandas-dev.pandas (Linux py36_locale_slow) Linux py36_locale_slow succeeded
pandas-dev.pandas (Linux py36_locale_slow_old_np) Linux py36_locale_slow_old_np succeeded
pandas-dev.pandas (Linux py37_locale) Linux py37_locale succeeded
pandas-dev.pandas (Linux py37_np_dev) Linux py37_np_dev succeeded
pandas-dev.pandas (Windows py36_np15) Windows py36_np15 succeeded
pandas-dev.pandas (Windows py37_np141) Windows py37_np141 succeeded
pandas-dev.pandas (macOS py35_macos) macOS py35_macos succeeded
5 participants