Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows improvements #453

Merged
merged 21 commits into from Mar 20, 2019
Merged

Conversation

compnerd
Copy link
Collaborator

@compnerd compnerd commented Mar 6, 2019

No description provided.

src/event/event_windows.c Outdated Show resolved Hide resolved
Copy link
Member

@MadCoder MadCoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea if the event_windows.c changes make sense, but overall, except for that semaphore thing, the change looks good from my POV

src/shims/lock.h Outdated Show resolved Hide resolved
src/shims/lock.h Outdated Show resolved Hide resolved
@compnerd
Copy link
Collaborator Author

compnerd commented Mar 7, 2019

CC: @MadCoder @ktopley-apple - okay, with this patch set, I can pass the XCTest test suite (modulo some unrelated changes).

Copy link
Member

@MadCoder MadCoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, @ktopley-apple should handle when it should be merged.

@compnerd
Copy link
Collaborator Author

compnerd commented Mar 7, 2019

@MadCoder thanks for having a look!
@ktopley-apple - how should we handle this? But, this definitely makes dispatch way better for Windows. Most of this is relevant even after the 1121 merge.

src/shims/time.h Outdated Show resolved Hide resolved
@compnerd
Copy link
Collaborator Author

compnerd commented Mar 8, 2019

@swift-ci please test

1 similar comment
@compnerd
Copy link
Collaborator Author

@swift-ci please test

@ktopley-apple
Copy link
Contributor

@MadCoder thanks for having a look!
@ktopley-apple - how should we handle this? But, this definitely makes dispatch way better for Windows. Most of this is relevant even after the 1121 merge.

Once 1121 is completed, the other PRs can then be merged on top of it (assuming they still apply cleanly). I will be continuing work on the 1121 merge this week.

@compnerd
Copy link
Collaborator Author

@ktopley-apple - okay, this definitely wont apply cleanly to 1121. I'm definitely looking forward to the 1121 merge, but this is a fairly huge improvement to libdispatch on Windows, and is needed to get XCTest building and passing tests on Windows, so it would be really helpful to get this in (I'm okay with getting 1121 merged and then re-doing this patchset for 1121, as that has a bunch of improvements that I am really looking forward to).

The `_dispatch_sema4_t` type is a pointer to the handle of the
semaphore, so we need to indirect through the pointer before using it.
This site was incorrectly using it directly.
@compnerd compnerd mentioned this pull request Mar 19, 2019
This sets up the event loop for Windows to run and schedule timers.
This allows us to start running some of the timer sources.
This cleans up the __LP64__ usage to allow libdispatch to work correctly
on LLP64 targets (like Windows x64).
Linux, FreeBSD already define DISPATCH_COCOA_COMPAT, and Windows was
defining that in a couple of places in an unstructured haphazard manner.
Define it similarly and clean up the other sites.
FlsAlloc returns FLS_OUT_OF_INDEXES in an error scenario, everything
else is a valid result.  Correct the assertion.
The NT time representation uses 1/1/1601 as epoch as opposed to Unix
which uses 1/1/1970.  Account for the offset in the calculation.  This
corrects the calculation of now.
This is needed for XCTest to run its test suite.
Use the `GetSystemTimePreciseAsFileTime` rather than
`GetSystemTimeAsFileTime` as we may otherwise get a cached value from
the last context switch.
Ideally we would do the lazy semaphore approach that Mach does.
Unfortunately, the queue wasn't getting drained as in that case.  A
follow up should investigate and fix the lazy allocation.
@compnerd
Copy link
Collaborator Author

@ktopley-apple - if there are reasons to do the changes differently, please let me know, and Ill update the diffs.

@ktopley-apple
Copy link
Contributor

@ktopley-apple - if there are reasons to do the changes differently, please let me know, and Ill update the diffs.

All looks good, thanks.

@compnerd
Copy link
Collaborator Author

@swift-ci please test

compnerd and others added 6 commits March 19, 2019 12:40
This was being re-implemented in CF by means of a macro.  Just expose
the function instead.
The dispatch_cascade test is very slow on Windows right now (often taking 4-5
minutes to execute on my machine and sometimes locking it up entirely) because
contention is not being handled correctly.

`_dispatch_contention_usleep()` is expected to put the thread to sleep, but on
Windows it spins on `QueryPerformanceCounter()`. This is causing a huge amount
of starvation in the dispatch_cascade test. Implement it using `Sleep()`, and
accordingly adjust `DISPATCH_CONTENTION_USLEEP_START` to be 1ms on Windows.

Additionally, `_dispatch_contention_spins()` is currently implemented using the
`rand_s()` function. This is slow (and experimentally seems to be slower than
not randomizing the spin count at all!) because `rand_s()` guarantees
cryptographic security, which is unnecessary for dispatch's use case. Replace it
with a basic linear congruential generator (from K&R) since there isn't any
other `rand_r()` equivalent. Based on the average wall clock times reported by
bsdtestharness, this is around 35% faster on my PC (i7-8700K).

These changes bring the runtime of the dispatch_cascade test down to around 1-2s
at most for me. (It's even faster than this if stdout isn't a console window
because that slows down the histogram display.)
The 1121 merge introduced additional usage of sys/queue.h.  Flesh out
the shims header further to repair the build on Windows.
The `dsema_value` field was marked as long which is incorrect for LLP64
environments.  Update the type to `intptr_t` which is the portable type
for this value.
This should use `QueryUnbiasedInterruptTime`  `QueryPerformanceCounter`
is roughly in units of of TSC frequency.  We would need to use
`QueryPerformanceFrequency` to convert that into nanoseconds.
Additionally, `QueryPerformanceCounter` includes time spent during
suspend, which is not what is desired.  Switch to
`QueryUnbiasedInterruptTime` to get something close to `CLOCK_MONOTONIC`
on Linux.
Windows does not support 128-bit division, disable the support for now.
One option is to statically link compiler-rt to provide `__udivti3`.
@compnerd
Copy link
Collaborator Author

@swift-ci please test

compnerd and others added 2 commits March 19, 2019 14:59
This addresses the updates that were not applied to the Windows port but
were applied to the Linux port and the macOS port.
PR apple#453 provides an initial implementation for timer support on Windows. Support
building and running the `DISPATCH_SOURCE_TYPE_TIMER` tests so we can easily
verify that timers work correctly.

These tests should pass on Windows with this and apple#453 applied:

- dispatch_drift
- dispatch_suspend_timer
- dispatch_timer
- dispatch_timer_bit31
- dispatch_timer_bit63
- dispatch_timer_set_time
- dispatch_timer_short
- dispatch_timer_timeout
@compnerd
Copy link
Collaborator Author

@swift-ci please test

@compnerd
Copy link
Collaborator Author

@ktopley-apple - okay, so, I've pulled in the changes from @adierking into this set to run the tests as well. There was a dropped change that caused the failure, but this now builds and passes the timer queue tests as well! With this set of changes, libdispatch 1121 should be as good as 931 on Windows (well, better than what was actually in tree). This should be ready to merge now :-)

@ktopley-apple
Copy link
Contributor

@ktopley-apple - okay, so, I've pulled in the changes from @adierking into this set to run the tests as well. There was a dropped change that caused the failure, but this now builds and passes the timer queue tests as well! With this set of changes, libdispatch 1121 should be as good as 931 on Windows (well, better than what was actually in tree). This should be ready to merge now :-)

Great! I'll merge it once the build is clean.

@compnerd
Copy link
Collaborator Author

@ktopley-apple - awesome, seems that this is through the build bot, and should greatly improve libdispatch on Windows! Thanks for the help with that

@ktopley-apple ktopley-apple merged commit 4659503 into apple:master Mar 20, 2019
@compnerd compnerd deleted the windows-improvements branch March 20, 2019 15:15
rokhinip pushed a commit that referenced this pull request Nov 5, 2021
Windows improvements

Signed-off-by: Kim Topley <ktopley@apple.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants