CLN: Simplify logic in _format_labels function for cut/qcut #30768

jschendel · 2020-01-07T05:18:48Z

Small simplification: modify the breaks before creating an IntervalIndex, then create a single IntervalIndex from the modified breaks. The existing approach creates an IntervalIndex, modifies the first Interval, then creates a new IntervalIndex with the updated first Interval.
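For readers without the diff at hand, the two approaches described above can be sketched roughly as follows. This is an illustration built only on the public `IntervalIndex` API, not the actual body of `_format_labels`; the helper names `adjust`, `labels_old`, and `labels_new` and the sample breaks are assumptions made up for this example.

```python
import numpy as np
import pandas as pd


def adjust(x, precision=3):
    # Hypothetical stand-in: nudge the lowest edge down by one unit of precision
    # so that the lowest value falls inside a right-closed first interval.
    return x - 10 ** (-precision)


def labels_old(breaks, closed="right"):
    # Existing approach: build an IntervalIndex, rebuild the first Interval,
    # then stitch a second IntervalIndex together.
    labels = pd.IntervalIndex.from_breaks(breaks, closed=closed)
    first = pd.Interval(adjust(labels[0].left), labels[0].right, closed=closed)
    return pd.IntervalIndex([first]).append(labels[1:])


def labels_new(breaks, closed="right"):
    # Simplified approach: adjust the first break up front, then build
    # the IntervalIndex once.
    breaks = list(breaks)
    breaks[0] = adjust(breaks[0])
    return pd.IntervalIndex.from_breaks(breaks, closed=closed)


breaks = np.array([0.0, 2.5, 5.0, 7.5, 10.0])
old = labels_old(breaks)
new = labels_new(breaks)
```

Both functions should produce the same intervals; the second avoids constructing an intermediate IntervalIndex and a one-element IntervalIndex just to replace the first entry.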

This yields a slight performance improvement (roughly 14% in the timings below: 273 ms vs. 317 ms), which doesn't seem dramatic enough to warrant a whatsnew entry, though I can add one if desired.

On this branch:

In [1]: import numpy as np; import pandas as pd; pd.__version__
Out[1]: '0.26.0.dev0+1668.ga6c08fc02'

In [2]: a = np.arange(10**5)

In [3]: %timeit pd.qcut(a, 10**4)
273 ms ± 914 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)

On master:

In [1]: import numpy as np; import pandas as pd; pd.__version__
Out[1]: '0.26.0.dev0+1667.g40bff2fed'

In [2]: a = np.arange(10**5)

In [3]: %timeit pd.qcut(a, 10**4)
317 ms ± 1.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

TomAugspurger · 2020-01-07T12:16:05Z

Thanks!

jschendel added the Performance (Memory or execution speed performance), Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and Clean labels Jan 7, 2020

jschendel added this to the 1.0 milestone Jan 7, 2020

CLN: Simplify logic in _format_labels function for cut/qcut

edf46d6

jschendel force-pushed the cln-format-labels branch from a6c08fc to edf46d6 Compare January 7, 2020 07:24

TomAugspurger approved these changes Jan 7, 2020


TomAugspurger merged commit c5948d1 into pandas-dev:master Jan 7, 2020

jschendel deleted the cln-format-labels branch January 7, 2020 16:11
