Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[runtime] Thread safety for weak references #1454

Merged
merged 2 commits into from May 4, 2016

Conversation

@glessard
Copy link
Contributor

glessard commented Feb 26, 2016

It has been fairly easy to cause the runtime to crash on multithreaded accesses to weak references (e.g. SR-192). Although weak references are value types, they can get elevated to the heap in multiple ways, such as closure capture and property use in a class object. In such cases, race conditions involving weak references could cause the runtime to perform to multiple decrement operations of the unowned reference count for a single increment; this eventually caused early deallocation, leading to use-after-free, modify-after-free and double-free errors.

This commit changes the weak reference operations to use atomic operations rather than a thread-local logic. In particular, when the weak reference needs to be nulled, it is not done unconditionally, but via an atomic_compare_exchange operation, with the release operation only performed on success in order to avoid duplicated decrement operations. The assign operations assume the destination may be visible to multiple threads; the init operations assume the destination is local to the current thread. In all cases it is assumed that the source may be visible to multiple threads.

With this change, the crasher discussed in SR-192 no longer encounters modify-after-free or double-free crashes.

@gribozavr
gribozavr reviewed Feb 26, 2016
View changes
test/Runtime/weak-reference.swift Outdated
let iterations = 200_000
for i in 1...iterations {
let box = WeakBox(Thing())
dispatch_async(q) {

This comment has been minimized.

Copy link
@gribozavr

gribozavr Feb 26, 2016

Collaborator

We have race test infrastructure (stdlib/private/StdlibUnittest/RaceTest.swift) that makes it possible to write such tests in a cross-platform way. Could you use that instead of dispatch?

@lattner
Copy link
Collaborator

lattner commented Feb 26, 2016

@glessard
Copy link
Contributor Author

glessard commented Feb 26, 2016

@rjmccall What are the restrictions on the uses of the Take calls? (The issues I observed seemed to bein swift_weakLoadStrong and swift_weakCopyAssign, and I expanded the logic to the rest of the swift_weak calls in HeapObject.cpp).

Why can you assume that assignment has exclusive access? Aren't those calls used to assign to a pre-existing var? One couldn't assume exclusive access of a var property in a class instance, for example.

@glessard
Copy link
Contributor Author

glessard commented Feb 26, 2016

The problem is due to not knowing the previous state of Value when storing to it. If two threads enter two of these calls simultaneously with a pointer to the same not-yet-nulled reference (to an object with no strong retains), they might both get past object->refCount.isDeallocating(), then both overwrite Value with null, and then they'll both call unownedRelease. One of them comes second and is making a wrong assumption. This is averted by using the exchange operation; the nulling operation is easy to catch both because of the consequence (modify-after-free), and also because more time passes between the read and the write in the original version. Other cases wouldn't lead to such obvious failures, but there could be memory leaks if Value changes between any paired read and write.

@glessard
Copy link
Contributor Author

glessard commented Feb 26, 2016

The only unknownWeak operations I can see are the non-ObjC stubs that directly forward to the HeapObject.cpp definitions from HeapObject.h. Is there another place to look? Thanks

@rjmccall
Copy link
Member

rjmccall commented Feb 26, 2016

In general, Swift does not guarantee correctness in the presence of read/write, write/write, or anything/destroy data races on a variable. It is one of the few undefined behavior rules that we embrace. This applies to weak references just as much as it applies to strong references.

Eventually, we want to provide a language concurrency model that makes races impossible outside of explicitly unsafe code. For now, it's the user's responsibility to ensure that these do not occur by properly synchronizing modifications of variables.

That's not to say that you aren't fixing a real bug. It's not supposed to be a race condition for two threads to simultaneously read from a weak reference, but right now it is because of the way that reads can clear the reference when they detect that the object is undergoing deallocation. That's something that needs to be fixed.

@rjmccall
Copy link
Member

rjmccall commented Feb 26, 2016

It's not easy to fix this particular race, however, because it isn't safe for other threads to access the referent at all if a thread is decrementing the unowned reference count, but other readers will naturally read the reference and try to increment the reference count of the referent. We basically need to have some way to tell other threads that there are reads in progress. One idea floated for fixing that is to use an activity count; please refer to the discussion thread with Mike Ash on swift-dev starting on 12/10. I'm not sure what the current status of his fix is, though.

@gribozavr
gribozavr reviewed Feb 26, 2016
View changes
test/Runtime/weak-reference.swift Outdated
_ = box.weak
}
dispatch_async(q) {
_ = box.weak

This comment has been minimized.

Copy link
@gribozavr

gribozavr Feb 26, 2016

Collaborator

You might also want to use _blackHole() from StdlibUnittest to ensure that the compiler does not eliminate the unused weak load.

@rjmccall
Copy link
Member

rjmccall commented Feb 26, 2016

The implementation of unknownWeak that I'm referring to is the non-trivial implementation in SwiftObject.mm. This is only an issue for platforms with ObjC compatibility requirements.

@glessard
Copy link
Contributor Author

glessard commented Feb 26, 2016

I see; I forgot to search the *.mm files. (I always do.)

@glessard
Copy link
Contributor Author

glessard commented Feb 26, 2016

I had thought about something like the activity count, but assumed the WeakReference struct's ABI was set in stone. Is it changeable? The CAS solution as I implemented is not perfect, but it's better, as the window of opportunity for badness is much smaller, without being too onerous.

@jckarter
Copy link
Member

jckarter commented Feb 26, 2016

The ABI is not set in stone yet.

@glessard
Copy link
Contributor Author

glessard commented Feb 26, 2016

It's probably worth doing, then.

@glessard glessard changed the title [runtime] Thread safety for weak references WIP [runtime] Thread safety for weak references Feb 29, 2016
@glessard
Copy link
Contributor Author

glessard commented Feb 29, 2016

Here's another attempt; it's a rather simple spinlock approach. On thinking about options that expand the WeakReference struct, I can't think of a way to implement it as a lock-free algorithm. If the read-read race is to be resolved with a read-write lock (which is what Mike Ash's activity count suggestion looks like to me), I'm not convinced the extra complexity over a spinlock is warranted (please advise).

This being said, the tests that I called "direct capture" now complete without crashing, but the "class instance property" ones do not. When this occurs the crashed thread has attempted to dereference the spinlock illegal pointer, but the call stack does not contain any of the modified functions; the shared weak reference must be getting read as thread-local by error, but I don't know where at the moment.

@jckarter
Copy link
Member

jckarter commented Feb 29, 2016

We need a lock that can yield to the scheduler, since iOS's scheduler doesn't guarantee preemption. If a background thread took the lock, a spinlock will deadlock any higher-priority threads that contend it.

@glessard
Copy link
Contributor Author

glessard commented Feb 29, 2016

Agreed; however, I'm unsure of the appropriate way to yield (I hope there is one.)
A lock-free method would be much better, but with this interplay of non-adjacent variables I'm not seeing what that would be.

@glessard
Copy link
Contributor Author

glessard commented Feb 29, 2016

If the strong and weak counts could both be CAS'd at once, refCount.isDeallocating() could be replaced by a call that tries to increment the weak count but fails if the deallocating flag is true (retainUnlessDeallocating?). This would solve the case where an unownedRetain loses a race with an unownedRelease. The other race is the one to null the weak reference and avoid a double-release; that one is easily solved with a CAS on WeakReference.Value.

I think this combination would successfully be lock-free.

@jckarter
Copy link
Member

jckarter commented Mar 1, 2016

We might be able to pull that off on 64-bit platforms, yeah. On 32-bit, the object header refcounts are only 32-bit aligned, and I don't think either i386 or armv7 can do an unaligned 64-bit atomic cmpxchg. We could align the refcounts, at the cost of four bytes per object, maybe.

@rjmccall
Copy link
Member

rjmccall commented Mar 1, 2016

I don't understand how a CAS on WeakReference.Value can eliminate the nulling-out race. Could you spell that one out? There's a tempting solution that looks like that, but it is not actually correct because simply having loaded the reference does not guarantee its validity. You basically need transaction memory for anything like that to work.

Other than this weak-reference-nulling issue, there is no way to validly get a race between an unowned-retain and the final unowned-release that triggers deallocation. You cannot perform an unowned retain without some reason to think that the referent is valid. For example, unowned reference variables maintain an invariant of owning an unowned-retain of their referent. As long as that invariant holds, it's safe to load/copy from the variable because the unowned reference count is always at least 1. Anything that would race with the load/copy would have to be a change to the value of the variable (or destroying it), or in other words, a store; and a load/store race is undefined behavior.

@glessard
Copy link
Contributor Author

glessard commented Mar 1, 2016

You're right, in both cases I forgot that the pointer copied from Value becomes untrustworthy on the very next clock cycle if there's a race. So the lock-free thoughts go out the window.

As for racing between a retain and a release,: I was thinking of a situation that starts with one weak reference, visible to 2 threads; strong count is 1. Thread A gets false from isDeallocating while attempting to copy the reference, then gets preempted; then thread B sets the deallocating flag on the strong count, then thread C (working with the same weak reference as A) gets true from isDeallocating. At that point an unownedRelease could happen before the unownedRetain. Depending on the length of the preemption on A, it could end up with a use-after-free. This being said, this story involves trusting the pointer from Value for too long.

@glessard glessard force-pushed the glessard:weakref-threadsafety branch 2 times, most recently Mar 11, 2016
@glessard glessard changed the title WIP [runtime] Thread safety for weak references [runtime] Thread safety for weak references Mar 11, 2016
@glessard
Copy link
Contributor Author

glessard commented Mar 11, 2016

New version: uses 2 bits from WeakReference.Value to avoid races. One allows the unknownWeak functions to call through without having to test the reference (which led to some crashes,) while the other acts as a spin lock for weakLoadStrong and weakCopyInit. sched_yield is invoked after some small amount of spinning in the event of contention.

@jckarter
Copy link
Member

jckarter commented Mar 11, 2016

@gparker42 Is sched_yield sufficient to prevent deadlocking on iOS?

@gparker42
Copy link
Contributor

gparker42 commented Mar 12, 2016

Maybe. I don't know what the kernel folks can promise, but I have been unable to make simple tests of QOS_CLASS_DEFAULT versus QOS_CLASS_BACKGROUND deadlock when there is a sched_yield in the loop (whereas similar tests do hang with a naive spinlock).

@rjmccall
rjmccall reviewed Apr 4, 2016
View changes
stdlib/public/runtime/HeapObject.cpp Outdated
auto ptr = __atomic_fetch_or(&ref->Value, WR_READING, __ATOMIC_RELAXED);
while (ptr & WR_READING) {
short c = 0;
while (ref->Value & WR_READING) {

This comment has been minimized.

Copy link
@rjmccall

rjmccall Apr 4, 2016

Member

This is not guaranteed to actually reload the data.

This comment has been minimized.

Copy link
@glessard

glessard Apr 4, 2016

Author Contributor

The intent is to not hit the cache line too often. See line 730: an atomic_fetch occurs after line 724 returned true. If it's preferable to just repeatedly call atomic_fetch, I can change it.

This comment has been minimized.

Copy link
@rjmccall

rjmccall Apr 4, 2016

Member

You need to at least do something to prevent the compiler from folding the repeated loads into a single one.

This comment has been minimized.

Copy link
@glessard

glessard Apr 4, 2016

Author Contributor

My (weak) circumstantial evidence was that the benchmarked spin time scales with the counter limit at line 725, though I haven't looked at the generated assembly. Given that this depends on the current abilities of the optimizer and we agree that it isn't a high contention case, would you think it's justified to simply do atomic_fetch repeatedly? Otherwise I'm not entirely sure what would ensure repeated loads.

This comment has been minimized.

Copy link
@rjmccall

rjmccall Apr 5, 2016

Member

Repeated atomic_fetch would be fine. You're doing a relaxed load, so this really should just compile down to an ordinary load; it's just that the compiler will not be able to combine the loads.

@rjmccall
rjmccall reviewed Apr 4, 2016
View changes
stdlib/public/runtime/HeapObject.cpp Outdated
while (ptr & WR_READING) {
short c = 0;
while (ref->Value & WR_READING) {
if ((++c & 0x3f) == 0) {

This comment has been minimized.

Copy link
@rjmccall

rjmccall Apr 4, 2016

Member

64 seems like a lot of spin iterations before yielding.

This comment has been minimized.

Copy link
@glessard

glessard Apr 4, 2016

Author Contributor

I benchmarked it on several machines, and 64 iterations takes about twice as long as the critical section on x86-64 and 64-bit ARM (<65ns); 32-bit ARM did worse than that, however. This shouldn't be a high-contention lock and the critical section is short, so it seems unfair to yield before waiting long enough to even get a chance to continue, but it certainly could be a different number.

This comment has been minimized.

Copy link
@rjmccall

rjmccall Apr 4, 2016

Member

If it works well, it's fine by me. I agree that it's unlikely to be high-contention.

@glessard glessard force-pushed the glessard:weakref-threadsafety branch Apr 6, 2016
@glessard glessard force-pushed the glessard:weakref-threadsafety branch Apr 8, 2016
@glessard glessard force-pushed the glessard:weakref-threadsafety branch Apr 15, 2016
@rjmccall
Copy link
Member

rjmccall commented May 2, 2016

@swift-ci Please test.

@rjmccall
Copy link
Member

rjmccall commented May 3, 2016

That test failure isn't this patch's fault.

@glessard
Copy link
Contributor Author

glessard commented May 3, 2016

should I squash the later changes?

@rjmccall
Copy link
Member

rjmccall commented May 3, 2016

If you wouldn't mind.


WR_NATIVEMASK = WR_NATIVE | swift::heap_object_abi::ObjCReservedBitsMask,
};

This comment has been minimized.

Copy link
@rjmccall

rjmccall May 3, 2016

Member

Please add a static_assert that WR_READING doesn't interfere with normal pointer bits, i.e. that WR_READING < alignof(void*).

@rjmccall
Copy link
Member

rjmccall commented May 3, 2016

Sorry about the long delay reviewing this, but with that static_assert and the squash, this should be good for 3.0.

glessard added 2 commits Mar 8, 2016
It has been fairly easy to cause the runtime to crash on multithreaded read-read access to weak references (e.g. https://bugs.swift.org/browse/SR-192). Although weak references are value types, they can get elevated to the heap in multiple ways, such as when captured by a closure or when used as a property in a class object instance. In such cases, race conditions involving weak references could cause the runtime to perform to multiple decrement operations of the unowned reference count for a single increment; this eventually causes early deallocation, leading to use-after-free, modify-after-free and double-free errors.

This commit changes the weak reference operations to use a spinlock rather than assuming thread-exclusive access, when appropriate.
With this change, the crasher discussed in SR-192 no longer encounters crashes due to modify-after-free or double-free errors.
Dispatch-based tests exist because (on OS X) they are more likely to encounter the race condition than `StdlibUnitTest`'s `runRaceTest()` is.
@glessard glessard force-pushed the glessard:weakref-threadsafety branch to c87309e May 4, 2016
@glessard
Copy link
Contributor Author

glessard commented May 4, 2016

Added the static_assert and squashed.
The test files are in a new directory, test/Runtime. I put them there because I couldn't find a spot; is there a better one?
I also tried to make it happen with an added field to the WeakReference struct, but wasn't successful at that. It would conceivably be less onerous in the native case, and it would support the far-fetched case of two compatibility modes for a given platform -- alas.

@rjmccall
Copy link
Member

rjmccall commented May 4, 2016

I do think we should bump weak references out to occupy two words for ABI stability, but we don't need to do that in this patch. Committing.

@rjmccall rjmccall merged commit 9d2dfc0 into apple:master May 4, 2016
@rjmccall
Copy link
Member

rjmccall commented May 4, 2016

Thanks for taking care of this!

@glessard
Copy link
Contributor Author

glessard commented May 4, 2016

Thanks!

@gribozavr

This comment has been minimized.

Copy link
Collaborator

gribozavr commented on test/Runtime/weak-reference-racetests-dispatch.swift in c87309e May 4, 2016

This workaround is not needed, could you remove it?

This comment has been minimized.

Copy link
Contributor Author

glessard replied May 4, 2016

I missed that it had become unnecessary. #2397

@jckarter
Copy link
Member

jckarter commented May 5, 2016

Thanks for taking this on @glessard, great to see this fixed!

@glessard
Copy link
Contributor Author

glessard commented May 5, 2016

@jckarter I got lots of help from you guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.