Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize Windows contention performance #455

Closed
wants to merge 1 commit into from
Closed

Optimize Windows contention performance #455

wants to merge 1 commit into from

Conversation

adierking
Copy link
Contributor

The dispatch_cascade test is very slow on Windows right now (often taking 4-5
minutes to execute on my machine and sometimes locking it up entirely) because
contention is not being handled correctly.

_dispatch_contention_usleep() is expected to put the thread to sleep, but on
Windows it spins on QueryPerformanceCounter(). This is causing a huge amount
of starvation in the dispatch_cascade test. Implement it using Sleep(), and
accordingly adjust DISPATCH_CONTENTION_USLEEP_START to be 1ms on Windows.

Additionally, _dispatch_contention_spins() is currently implemented using the
rand_s() function. This is slow (and experimentally seems to be slower than
not randomizing the spin count at all!) because rand_s() guarantees
cryptographic security, which is unnecessary for dispatch's use case. Replace it
with a basic linear congruential generator (from K&R) since there isn't any
other rand_r() equivalent. Based on the average wall clock times reported by
bsdtestharness, this is around 35% faster on my PC (i7-8700K).

These changes bring the runtime of the dispatch_cascade test down to around 1-2s
at most for me. (It's even faster than this if stdout isn't a console window
because that slows down the histogram display.)

The dispatch_cascade test is very slow on Windows right now (often taking 4-5
minutes to execute on my machine and sometimes locking it up entirely) because
contention is not being handled correctly.

`_dispatch_contention_usleep()` is expected to put the thread to sleep, but on
Windows it spins on `QueryPerformanceCounter()`. This is causing a huge amount
of starvation in the dispatch_cascade test. Implement it using `Sleep()`, and
accordingly adjust `DISPATCH_CONTENTION_USLEEP_START` to be 1ms on Windows.

Additionally, `_dispatch_contention_spins()` is currently implemented using the
`rand_s()` function. This is slow (and experimentally seems to be slower than
not randomizing the spin count at all!) because `rand_s()` guarantees
cryptographic security, which is unnecessary for dispatch's use case. Replace it
with a basic linear congruential generator (from K&R) since there isn't any
other `rand_r()` equivalent. Based on the average wall clock times reported by
bsdtestharness, this is around 35% faster on my PC (i7-8700K).

These changes bring the runtime of the dispatch_cascade test down to around 1-2s
at most for me. (It's even faster than this if stdout isn't a console window
because that slows down the histogram display.)
@adierking
Copy link
Contributor Author

@compnerd
Copy link
Collaborator

compnerd commented Mar 7, 2019

Yeah, the problem is that the duration of Sleep is not precise enough to do the microsleep :-(

@compnerd
Copy link
Collaborator

compnerd commented Mar 7, 2019

@swift-ci please test

7 similar comments
@compnerd
Copy link
Collaborator

compnerd commented Mar 8, 2019

@swift-ci please test

@compnerd
Copy link
Collaborator

compnerd commented Mar 8, 2019

@swift-ci please test

@compnerd
Copy link
Collaborator

compnerd commented Mar 8, 2019

@swift-ci please test

@compnerd
Copy link
Collaborator

compnerd commented Mar 8, 2019

@swift-ci please test

@compnerd
Copy link
Collaborator

compnerd commented Mar 9, 2019

@swift-ci please test

@compnerd
Copy link
Collaborator

@swift-ci please test

@compnerd
Copy link
Collaborator

@swift-ci please test

@adierking
Copy link
Contributor Author

Closing because this is included in #453.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants