Optimizations for threadpool #1153

Merged
merged 5 commits into from Mar 24, 2018

jamadden commented Mar 23, 2018

These optimizations mostly benefit map. None of the pools need map to go through imap, since map has to wait for every result anyway and returns the results in order. For greenlets, the get() operations will still yield to the loop.
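As a rough illustration of the idea (not gevent's actual code), the sketch below uses the standard library's `ThreadPoolExecutor` as a stand-in for a pool. The helper name `map_direct` is hypothetical. It submits every task up front and then blocks on each result in submission order, so the output stays ordered without an intermediate lazy iterator.

```python
from concurrent.futures import ThreadPoolExecutor


def map_direct(executor, func, iterable):
    """Hypothetical helper: an ordered map without an imap-style iterator."""
    # Submit all tasks first so the workers can run them concurrently.
    futures = [executor.submit(func, item) for item in iterable]
    # Block on each future in submission order; this preserves input order.
    return [future.result() for future in futures]


def square(x):
    return x * x


# ThreadPoolExecutor is an assumption used for illustration only;
# the pull request itself changes gevent's own pool classes.
with ThreadPoolExecutor(max_workers=4) as executor:
    direct = map_direct(executor, square, range(10))

print(direct)
```

Because the caller needs the complete list anyway, waiting on each result directly avoids the per-item bookkeeping that a lazy iterator would add.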

| Benchmark          | 36_threadpool_master | 36_threadpool_opt_cond5     |
|--------------------|----------------------|-----------------------------|
| imap_unordered_seq | 1.15 ms              | 1.07 ms: 1.08x faster (-7%) |
| imap_unordered_par | 1.02 ms              | 950 us: 1.08x faster (-7%)  |
| imap_seq           | 1.17 ms              | 1.10 ms: 1.06x faster (-6%) |
| imap_par           | 1.07 ms              | 1000 us: 1.07x faster (-7%) |
| map_seq            | 1.16 ms              | 724 us: 1.60x faster (-37%) |
| map_par            | 1.07 ms              | 646 us: 1.66x faster (-40%) |
| apply              | 1.22 ms              | 1.14 ms: 1.07x faster (-7%) |
| spawn              | 1.21 ms              | 1.13 ms: 1.07x faster (-7%) |

jamadden added some commits Mar 23, 2018

Optimizations for threadpool
Especially for map. None of the pools really need map to go through
imap since they have to wait for everything anyway and they return
results ordered.

| Benchmark          | 36_threadpool_master | 36_threadpool_opt_cond5     |
|--------------------|----------------------|-----------------------------|
| imap_unordered_seq | 1.15 ms              | 1.07 ms: 1.08x faster (-7%) |
| imap_unordered_par | 1.02 ms              | 950 us: 1.08x faster (-7%)  |
| imap_seq           | 1.17 ms              | 1.10 ms: 1.06x faster (-6%) |
| imap_par           | 1.07 ms              | 1000 us: 1.07x faster (-7%) |
| map_seq            | 1.16 ms              | 724 us: 1.60x faster (-37%) |
| map_par            | 1.07 ms              | 646 us: 1.66x faster (-40%) |
| apply              | 1.22 ms              | 1.14 ms: 1.07x faster (-7%) |
| spawn              | 1.21 ms              | 1.13 ms: 1.07x faster (-7%) |
More optimizations and clarifying comments
Compared to the previous commit:

| Benchmark          | 36_threadpool_opt_PR | 36_threadpool_opt_cond10    |
|--------------------|----------------------|-----------------------------|
| imap_unordered_seq | 1.06 ms              | 1.02 ms: 1.04x faster (-4%) |
| imap_unordered_par | 965 us               | 928 us: 1.04x faster (-4%)  |
| imap_seq           | 1.08 ms              | 1.03 ms: 1.04x faster (-4%) |
| map_seq            | 785 us               | 870 us: 1.11x slower (+11%) |
| map_par            | 656 us               | 675 us: 1.03x slower (+3%)  |
| apply              | 1.14 ms              | 1.12 ms: 1.02x faster (-2%) |
Add change note.
Here's the improvement for the greenlet pools:

| Benchmark          | 36_pool_master | 36_pool_opts                |
|--------------------|----------------|-----------------------------|
| imap_unordered_seq | 803 us         | 686 us: 1.17x faster (-15%) |
| imap_unordered_par | 445 us         | 389 us: 1.14x faster (-13%) |
| imap_seq           | 793 us         | 729 us: 1.09x faster (-8%)  |
| imap_par           | 407 us         | 398 us: 1.02x faster (-2%)  |
| map_seq            | 715 us         | 293 us: 2.44x faster (-59%) |
| map_par            | 388 us         | 199 us: 1.96x faster (-49%) |

Not significant (2): apply; spawn

@jamadden jamadden force-pushed the threadpool-opts branch from 8f314b3 to c21db37 Mar 24, 2018


jamadden commented Mar 24, 2018

Final numbers for plain greenlet pools:

| Benchmark          | 36_pool_master | 36_pool_opts                |
|--------------------|----------------|-----------------------------|
| imap_unordered_seq | 803 us         | 686 us: 1.17x faster (-15%) |
| imap_unordered_par | 445 us         | 389 us: 1.14x faster (-13%) |
| imap_seq           | 793 us         | 729 us: 1.09x faster (-8%)  |
| imap_par           | 407 us         | 398 us: 1.02x faster (-2%)  |
| map_seq            | 715 us         | 293 us: 2.44x faster (-59%) |
| map_par            | 388 us         | 199 us: 1.96x faster (-49%) |

Not significant: apply, spawn

And the threadpool:

| Benchmark          | 36_threadpool_master | 36_threadpool_opt_cond10     |
|--------------------|----------------------|------------------------------|
| imap_unordered_seq | 1.15 ms              | 1.02 ms: 1.13x faster (-11%) |
| imap_unordered_par | 1.02 ms              | 928 us: 1.10x faster (-9%)   |
| imap_seq           | 1.17 ms              | 1.03 ms: 1.13x faster (-11%) |
| imap_par           | 1.07 ms              | 993 us: 1.08x faster (-7%)   |
| map_seq            | 1.16 ms              | 870 us: 1.33x faster (-25%)  |
| map_par            | 1.07 ms              | 675 us: 1.59x faster (-37%)  |
| apply              | 1.22 ms              | 1.12 ms: 1.10x faster (-9%)  |
| spawn              | 1.21 ms              | 1.09 ms: 1.11x faster (-10%) |
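
For readers unfamiliar with the "Nx faster (-P%)" notation in these tables, here is a small sketch of how the two figures relate. It assumes the conventional definitions (ratio is baseline time over new time, percent is the relative change in run time). The function name `compare` is hypothetical, and the inputs are the rounded values printed in the thread, so a recomputed figure can differ from the table in its last digit.

```python
def compare(baseline_us, changed_us):
    """Return (speedup ratio, percent change in run time)."""
    # Ratio above 1.0 means the changed build is faster.
    ratio = baseline_us / changed_us
    # Negative percent means the run time went down.
    percent = (changed_us - baseline_us) / baseline_us * 100
    return ratio, percent


# map_seq on the thread pool against the first optimized build:
# 1.16 ms (1160 us) down to 724 us, using the rounded values shown above.
ratio, percent = compare(1160, 724)
print(f"{ratio:.2f}x faster ({percent:+.1f}%)")
```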

@jamadden jamadden merged commit b4db40b into master Mar 24, 2018

5 checks passed

continuous-integration/appveyor/branch AppVeyor build succeeded
continuous-integration/appveyor/pr AppVeyor build succeeded
continuous-integration/travis-ci/pr The Travis CI build passed
continuous-integration/travis-ci/push The Travis CI build passed
coverage/coveralls Coverage increased (+0.03%) to 84.032%