Skip to content

Micro-optimization for random.choices(). Convert range() to repeat().#11889

Merged
rhettinger merged 1 commit intopython:masterfrom
rhettinger:choices_with_repeat
Feb 16, 2019
Merged

Micro-optimization for random.choices(). Convert range() to repeat().#11889
rhettinger merged 1 commit intopython:masterfrom
rhettinger:choices_with_repeat

Conversation

@rhettinger
Copy link
Copy Markdown
Contributor

Borrowing from the technique used in the timeit module, loop with repeat() instead of range() to avoid unnecessary creation and destruction of integer objects. Gives a modest speed-up (in the 5% range).

Patched:

$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=2)'
200000 loops, best of 11: 1.82 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=1000)'
1000 loops, best of 11: 388 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=1000)'
500 loops, best of 11: 390 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=2)'
200000 loops, best of 11: 1.43 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' 'choices(p, k=1000)'
1000 loops, best of 11: 255 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' 'choices(p, k=1000)'
1000 loops, best of 11: 255 usec per loop

Baseline:

$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=2)'
200000 loops, best of 11: 1.95 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=1000)'
500 loops, best of 11: 398 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=1000)'
500 loops, best of 11: 402 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=2)'
200000 loops, best of 11: 1.5 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=1000)'
1000 loops, best of 11: 271 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=1000)'
1000 loops, best of 11: 272 usec per loop

@rhettinger
Copy link
Copy Markdown
Contributor Author

The above timings were on Clang. The following were run with GCC-8 and show a consistent 5% to 6% improvement.

Patched:

$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=2)'
200000 loops, best of 11: 1.27 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=2)'
200000 loops, best of 11: 1.68 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=1000)'
1000 loops, best of 11: 211 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=1000)'
1000 loops, best of 11: 363 usec per loop

Baseline:

$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=2)'
200000 loops, best of 11: 1.34 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=2)'
200000 loops, best of 11: 1.79 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, k=1000)'
1000 loops, best of 11: 225 usec per loop
$ ./python.exe -m timeit -r11 -s 'from random import choices' -s'p=range(20)' -s'w=list(map(float, range(20)))' 'choices(p, cum_weights=w, k=1000)'
1000 loops, best of 11: 347 usec per loop

@tirkarthi
Copy link
Copy Markdown
Member

Is repeat(None, k) with a for loop the fastest way to call a function k times in Python?

@rhettinger
Copy link
Copy Markdown
Contributor Author

repeat(None, k) is the fastest way to create an iterator that loops k times.

Since the iterated action is doing more than just a function call, map() and list() won't be of any help in the loop body.

Copy link
Copy Markdown
Member

@pganssle pganssle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Given that itertools is already imported, it doesn't seem like there's an import-time cost, so I don't see any downsides.

@rhettinger rhettinger merged commit 5382203 into python:master Feb 16, 2019
@bedevere-bot
Copy link
Copy Markdown

@rhettinger: Please replace # with GH- in the commit message next time. Thanks!

@rhettinger rhettinger deleted the choices_with_repeat branch February 16, 2019 21:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Performance or resource usage skip issue skip news

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants