Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of std::atomic::wait #593

Merged
merged 321 commits into from Aug 2, 2020
Merged

Conversation

AlexGuteniev
Copy link
Contributor

@AlexGuteniev AlexGuteniev commented Mar 8, 2020

Description

Implementation of std::atomic::wait , part of #52

Partly based on https://github.com/ogiroux/atomic_wait (no code copying, but being familiar with that implementation)
(initially; currently it substantially diverges from that idea)

Design decisions:

  • Run-time selection between WaitOnAddress and fallback. Still when STL targets Windows 8+ (ARM), the decision to use WaitOnAddress is made at compile time
  • Fall back to Condition Variable + SRW Lock, since not targeting older systems, like Windows XP and older
  • Do not use a flag whether "Wake" call is needed. 'WakeByAddressSingle' / 'WakeByAddressAll' do this check by itself
  • Do not spin when target address is directly passed to WaitOnAddress, since WaitOnAddress spins by itself (and SleepConditionVariableSRW is expecteed to spin too, though did not check that).
  • Checking is done in header. Other operations are in atomic_wait.cpp. When DLL run-time library is used, it is in a satellite DLL

Known issues:

  • Does not yet cover atomic_shared_ptr. Another change maybe.
  • The 16 bytes lock-free case is not tested yet. It will be tested when atomic_ref is merged, as it is available only for atomic_ref

Checklist

Be sure you've read README.md and understand the scope of this repo.

If you're unsure about a box, leave it unchecked. A maintainer will help you.

  • Identifiers in product code changes are properly _Ugly as per
    https://eel.is/c++draft/lex.name#3.1 or there are no product code changes.
  • The STL builds successfully and all tests have passed (must be manually
    verified by an STL maintainer before automated testing is enabled on GitHub,
    leave this unchecked for initial submission).
  • These changes introduce no known ABI breaks (adding members, renaming
    members, adding virtual functions, changing whether a type is an aggregate
    or trivially copyable, etc.).
  • These changes were written from scratch using only this repository,
    the C++ Working Draft (including any cited standards), other WG21 papers
    (excluding reference implementations outside of proposed standard wording),
    and LWG issues as reference material. If they were derived from a project
    that's already listed in NOTICE.txt, that's fine, but please mention it.
    If they were derived from any other project (including Boost and libc++,
    which are not yet listed in NOTICE.txt), you must mention it here,
    so we can determine whether the license is compatible and what else needs
    to be done.

@AdamBucior
Copy link
Contributor

What happened to README.md?

@AlexGuteniev
Copy link
Contributor Author

What happened to README.md?

Sorry, I'm new to Github, it was deleted by mistake. Reverted.

@SuperWig
Copy link
Contributor

SuperWig commented Mar 8, 2020

Isn't the lack of a license (AFAICT) on https://github.com/ogiroux/atomic_wait problematic?

@AlexGuteniev
Copy link
Contributor Author

Isn't the lack of a license (AFAICT) on https://github.com/ogiroux/atomic_wait problematic?

I think that's fine. Every file there is under MIT license, see top of https://github.com/ogiroux/atomic_wait/blob/master/include/atomic_wait for example

@SuperWig
Copy link
Contributor

SuperWig commented Mar 8, 2020

Isn't the lack of a license (AFAICT) on https://github.com/ogiroux/atomic_wait problematic?

I think that's fine. Every file there is under MIT license, see top of https://github.com/ogiroux/atomic_wait/blob/master/include/atomic_wait for example

Forgot to check source files 🤦‍♂️.

And MIT isn't a compatible license #113 (review)

@AlexGuteniev
Copy link
Contributor Author

And MIT isn't a compatible license #113 (review)

I see...

Wouldn't it help that:

@SuperWig
Copy link
Contributor

SuperWig commented Mar 8, 2020

Not a lawyer or maintainer but for the second bullet point, no. STL pointed out in this comment #29 (comment) for another C++20 feature that the reference implementation couldn't be used due to its MIT license.

Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I asked Olivier to comment on the licensing issue over Twitter
  • This needs tests
  • The similar wait-on-address fallback in parallel_algorithms.cpp should probably be changed to call this, it's OK to file that as an issue if you don't want to do it here (after this merges)

stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
@BillyONeal
Copy link
Member

Thanks for your contribution!

@brycelelbach
Copy link

https://github.com/ogiroux/atomic_wait is an earlier version of the code; the newer version is https://github.com/ogiroux/freestanding, and the version incorporated in the CUDA toolkit as of CUDA 10.2. I'm the team lead at NVIDIA for that codebase, and can affirm that it is licensed under the Apache 2.0 with LLVM exception license.

@AlexGuteniev
Copy link
Contributor Author

  • This needs tests

@BillyONeal , regarding tests. I see there are several folders with test cases. Apparently this one should go into a new folder named "p1135r6_atomic_wait" under /tests/std/tests.

I have not figured out how to build and run those tests.
Is there some makefile, script, or .vcxproj ?

@StephanTLavavej
Copy link
Member

Since @AlexGutenev has confirmed that this PR's code was written from scratch (thank you for carefully disclosing the origins!), and the "inspiration" code's license is identical to ours, I don't believe that we'll need to update our cgmanifest.json (which records commit hashes of repos that we've directly taken code from) or NOTICE.txt (which collects the copyright/license banners of code that we've directly taken).

the newer version is https://github.com/ogiroux/freestanding, and the version incorporated in the CUDA toolkit as of CUDA 10.2. I'm the team lead at NVIDIA for that codebase, and can affirm that it is licensed under the Apache 2.0 with LLVM exception license.

Thanks, that's extremely helpful. @brycelelbach, can we ask you or @ogiroux to check in a LICENSE file to ogiroux/freestanding stating that? Like many companies, Microsoft has an internal process for scanning open-source code, and if anyone has any concerns later, I would like to be able to point to the ogiroux/freestanding repo as an authoritative source, instead of a GitHub PR comment.

(Aside: I observe that that repo currently bears license banners of a different nature:

https://github.com/ogiroux/freestanding/blob/ae7886730d39245880de1b68c0e839eb042a2622/include/details/__atomic_derived#L1-L23

It seems like these banners should be updated if the license is Apache 2.0 + LLVM.)

@AlexGuteniev
Copy link
Contributor Author

  • The similar wait-on-address fallback in parallel_algorithms.cpp should probably be changed to call this, it's OK to file that as an issue if you don't want to do it here (after this merges)

I like more the approach in parallel_algorithms.cpp, when decltype is used. So making parallel_algorithms.cpp using awint.h looks like a downgrade to me.

@BillyONeal , could awint.h be updated to use decltype?

@AlexGuteniev
Copy link
Contributor Author

  • The similar wait-on-address fallback in parallel_algorithms.cpp should probably be changed to call this, it's OK to file that as an issue if you don't want to do it here (after this merges)

Didn't get it from the beginning. You mean sharing fallback, not just address obtaining.

parallel_algorithms.cpp do not target XP and they use SRW Lock.

My intention was to target Windows XP, that's why I did not consider using SRW Lock.

But if there isn't such intention for atomic<T>::wait, then possibly reverse should be considered.

@miscco
Copy link
Contributor

miscco commented Mar 9, 2020

My intention was to target Windows XP, that's why I did not consider using SRW Lock.

Correct me if I am wrong but the Windows 7 the successor of the successor of XP reached end of Life earlier this year. So why would anybody ever again target XP with a modern library?

@AlexGuteniev
Copy link
Contributor Author

AlexGuteniev commented Mar 9, 2020

@miscco ,

Correct me if I am wrong but the Windows 7 the successor of the successor of XP reached end of Life earlier this year. So why would anybody ever again target XP with a modern library?

Yes, you are right. I thought VS 2019 still comes with XP support, but it is because it can install VS 2017 XP toolset, which obviously doesn't come with VS 2019 STL

@BillyONeal , any motivation in preferring my implementation based on Windows Semaphore over the one in parallel_algorithms.cpp ? There are obvious disadvantages. Most of them are related to Kernel object creation: need handles management, need timed backoff for low resources, handles may be occupied unnecessarily for most of the time. And another one is extra complexity.

I've created a branch with an implementation that does not care about XP https://github.com/AlexGutenev/STL/tree/vista

@BillyONeal
Copy link
Member

@BillyONeal , could awint.h be updated to use decltype?

That touches preprocessor metaprogramming I don't want to touch but I have no objection to you touching it :). Though in that case you probably want that in a different PR.

@BillyONeal
Copy link
Member

My intention was to target Windows XP, that's why I did not consider using SRW Lock.

But if there isn't such intention for atomic<T>::wait, then possibly reverse should be considered.

XP support is unnecessary for completely new features; I think for them Win7 is fine. We only need to keep existing programs that don't touch new features working on XP since they share the same redist, as you mentioned.

Does that mean you want to switch from the Semaphore fallback to something else?

stl/inc/atomic Outdated Show resolved Hide resolved
@@ -548,6 +564,7 @@ extern "C" PVOID __KERNEL32Functions[eMaxKernel32Function] = {0};

static int __cdecl initialize_pointers() {
HINSTANCE hKernel32 = GetModuleHandleW(L"kernel32.dll");
HINSTANCE hSynch = GetModuleHandleW(L"api-ms-win-core-synch-l1-2-0.dll");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetModuleHandleEx with GET_MODULE_HANDLE_EX_FLAG_PIN

That doesn't fix the problem that the module might not be loaded in the first place. We also can't _PIN as that prevents the standard library from being cleanly unloaded.

Like I said I would refrain from touching this until we have someone who actually understands apisets here to comment.

@AlexGuteniev
Copy link
Contributor Author

Does that mean you want to switch from the Semaphore fallback to something else?

Well, yes. To something that is very similar to what is already there for parallel_algorithms.cpp, which also looks like a simplified version of what I've seen in https://github.com/ogiroux/atomic_wait

(https://github.com/ogiroux/atomic_wait is a bit more complex, as it tries not to wake CV unnecessarily, but Windows CV takes care about that on its own)

@AlexGuteniev
Copy link
Contributor Author

Changed to use SRW Lock

stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/src/awint.h Outdated Show resolved Hide resolved
stl/src/winapisupp.cpp Outdated Show resolved Hide resolved
stl/src/winapisupp.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
@AlexGuteniev
Copy link
Contributor Author

  • The similar wait-on-address fallback in parallel_algorithms.cpp should probably be changed to call this, it's OK to file that as an issue if you don't want to do it here (after this merges)

I've created #598 for that.

stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/src/atomic.cpp Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
@StephanTLavavej StephanTLavavej moved this from Work In Progress to Final Review in Code Reviews Jul 30, 2020
Copy link
Member

@BillyONeal BillyONeal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does the loop on an absolute timeout so on spurious wake the timeout explodes. I'm working on fixing that now.

Code Reviews automation moved this from Final Review to Work In Progress Aug 1, 2020
@BillyONeal
Copy link
Member

@AlexGuteniev
Copy link
Contributor Author

Correct. Spurious wakes were handled by arriving to header, and now they are not. I guess just making indirect function take absolute timeout is fine

…n spurious wake by having the header compare again.

* Remove double indirect call trampoline for the non-lock free case.
@BillyONeal
Copy link
Member

Instead of passing absolute timeouts over the boundary which has caused us ABI problems before I added a special case for no timeout where the separately compiled bits can eat some of the spurious wakes.

stl/inc/atomic Outdated Show resolved Hide resolved
stl/inc/atomic Outdated Show resolved Hide resolved
stl/src/atomic_wait.cpp Outdated Show resolved Hide resolved
stl/src/atomic_wait.cpp Outdated Show resolved Hide resolved
Copy link
Member

@StephanTLavavej StephanTLavavej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion.

stl/inc/atomic Outdated
}

struct _Spinlock_guard {
long& _Spinlock;
_Spinlock_guard(long& _Spinlock_) noexcept : _Spinlock(_Spinlock_) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that explicit is most important for construction from common types like integers and pointers:

Suggested change
_Spinlock_guard(long& _Spinlock_) noexcept : _Spinlock(_Spinlock_) {
explicit _Spinlock_guard(long& _Spinlock_) noexcept : _Spinlock(_Spinlock_) {

@BillyONeal
Copy link
Member

This has passed DevDiv tests too. I'll apply STL and Casey's nitpicks and merge :)

@AlexGuteniev Thanks for putting up with us!

@BillyONeal BillyONeal moved this from Work In Progress to Ready To Merge in Code Reviews Aug 1, 2020
stl/inc/atomic Outdated Show resolved Hide resolved
@BillyONeal BillyONeal merged commit e4b75dc into microsoft:master Aug 2, 2020
Code Reviews automation moved this from Ready To Merge to Done Aug 2, 2020
@StephanTLavavej
Copy link
Member

Yay! I can't wait for this to ship in VS 2019 16.8 Preview 3! 😹

@CaseyCarter
Copy link
Member

Yay! I can't wait for this to ship in VS 2019 16.8 Preview 3! 😹

Will we ship it all at once, or piecemeal so some people might observe the feature partially shipping?

@BillyONeal
Copy link
Member

@CaseyCarter Gotta watch out for those spurious wakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cxx20 C++20 feature
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet