Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-6543] Data race when passing copies of values across threads. #49093

Open
swift-ci opened this issue Dec 6, 2017 · 13 comments
Open

[SR-6543] Data race when passing copies of values across threads. #49093

swift-ci opened this issue Dec 6, 2017 · 13 comments

Comments

@swift-ci
Copy link
Collaborator

swift-ci commented Dec 6, 2017

Previous ID SR-6543
Radar rdar://problem/34188955
Original Reporter romainj (JIRA User)
Type Bug

Attachment: Download

Environment

Xcode 9.2 (9C40b)

Swift 4.0.3 (swiftlang-900.0.74.1 clang-900.0.39.2)

Additional Detail from JIRA
Votes 3
Component/s Standard Library
Labels Bug
Assignee @yln
Priority Medium

md5: 48a81342e68c16da71cc9c10a944287b

relates to:

  • SR-6604 Are Array writes race free in case of copying an array?

Issue Description:

Consider the following code:

var array: [Int] = [1, 2, 3]

let iterations = 1_000_000

var copy1 = array
q1.async {
    for i in 0..<iterations {
        copy1.append(i)
    }
}

var copy2 = array
q2.async {
    for i in 0..<iterations {
        copy2.append(i) // Data race: Write of size 8 by thread 9
    }
}

var copy3 = array
q3.async {
    for i in 0..<iterations {
        copy3.append(i)
    }
}

for i in 0..<iterations {
    array.append(i) // Data race: Read of size 8 by thread 1
}

q1.sync {}
q2.sync {}
q3.sync {}
NSLog("done")

A data race is detected by the thread-sanitizer when the code is compiled with -O.

Please find attached some additional examples where the issue arises (SwiftDataRaceSample.swift).

This issue is very problematic when you want to expose a value type as a public property of a thread-safe class, as shown in the attached sample code.

Related discussion: https://lists.swift.org/pipermail/swift-users/Week-of-Mon-20171204/006693.html

@belkadan
Copy link
Contributor

belkadan commented Dec 6, 2017

@milseman, for _fixLifetime and a good stress test.

@milseman
Copy link
Mannequin

milseman mannequin commented Dec 6, 2017

CC Lance (JIRA User) @lorentey

@atrick
Copy link
Member

atrick commented Dec 6, 2017

These are great test cases.

These are the steps in the simple test case from the bug description
(let's ignore copy1 and copy2, which are redundant):

1. `array` refcount is incremented when `copy2` is captured.

2. `array` is captured by an escaping closure.

3. The last (unsynchronized) call to array.append is executed first.

3a. The append performs an unsynchronized read of the shared buffer to copy it into a new unique buffer.

4. `array` goes out of local scope, decrementing the refcount and becoming unique.

5. The synchronized closure runs.

5a. The original `array` buffer is now unique, so no copy is done.

5b. The closure writes to the original `array` buffer. This is an unsynchronized write-after-read.

I talked to Kuba, and it's almost certainly: the same thing as
<rdar://problem/34188955> TSan false positive on Swift COW arrays, missing synchronization between release() and isUniquelyReferenced()

I looked into this long enough to feel pretty sure I understand the
problem but I'm still a not sure why it only reproduces on 4.0.3 and
not master.

Also, incidentally, when I decompose compilation into separate
compile/link steps, both with -santize=thread, TSAN no longer reports
the race :/

@atrick
Copy link
Member

atrick commented Dec 6, 2017

Aha!!! Finally I see the "problem". In 4.0.3 this part of the array copy gets inlined into user code. On master, it is no longer being inlined:

%59 = load %52 : $*Builtin.BridgeObject // user: %60
%60 = unchecked_ref_cast %59 : $Builtin.BridgeObject to $_ContiguousArrayStorageBase // user: %61
%61 = ref_element_addr %60 : $_ContiguousArrayStorageBase, #_ContiguousArrayStorageBase.countAndCapacity // user: %62
%62 = struct_element_addr %61 : $*_ArrayBody, #_ArrayBody._storage // user: %63
%63 = struct_element_addr %62 : $*_SwiftArrayBodyStorage, #_SwiftArrayBodyStorage.count // user: %64
%64 = struct_element_addr %63 : $*Int, #Int._value // user: %65
%65 = load %64 : $*Builtin.Int64 // user: %66
%66 = builtin "assumeNonNegative_Int64"(%65 : $Builtin.Int64) : $Builtin.Int64 // user: %67

%67 = struct $Int (%66 : $Builtin.Int64) // user: %69
// function_ref specialized Array._copyToNewBuffer(oldCount🙂
%68 = function_ref @_T0Sa16_copyToNewBufferySi8oldCount_tFSi_Tg5 : $@convention(method) (Int, @inout Array<Int>) -> () // user: %69
%69 = apply %68(%67, %0) : $@convention(method) (Int, @inout Array<Int>) -> ()

@aschwaighofer
Copy link
Member

aschwaighofer commented Dec 13, 2017

There is a race even if the release is not executed: SR-6604

@atrick
Copy link
Member

atrick commented Dec 13, 2017

I disagree with aschwaighofer@apple.com (JIRA User)'s comment above about there being a race in the code.

@aschwaighofer
Copy link
Member

aschwaighofer commented Dec 13, 2017

Yes, Andrew is right there is no race assuming properly fenced library calls are used. If they weren't there is a language level race regardless of whether an Array or a plain value type is used that is my point in SR-6604.

@atrick
Copy link
Member

atrick commented Dec 15, 2017

aschwaighofer@apple.com (JIRA User), Minor addendum to the steps above.

2. `array` is captured by an escaping closure.

\[A full memory fence is present here. TSAN correctly recognizes that\]

...

3a. The append performs an unsynchronized read of the shared buffer to copy it into a new unique buffer.

4. `array` goes out of local scope, decrementing the refcount and becoming unique.

5. The synchronized closure runs.

[There is no memory fence here, at least not according to TSAN's semantics]

Hence the WAR race. TSAN needs to recognize the uniqueness-causing-release as a synchronization point (full fence)]

Note that in reality, the hardware is probably executing many fences, but TSAN only models semantic fences that the user can rely on.

@ole
Copy link
Contributor

ole commented Apr 10, 2019

@atrick aschwaighofer@apple.com (JIRA User) Can this issue be closed? I tested the code in the description with several recent Xcode releases (9.2/Swift 4.0, 9.4/Swift 4.1, 10.1/Swift 4.2, 10.2/Swift 5.0), and Xcode 9.2/Swift 4.0 seems to be the last release where the Thread Sanitizer finds an issue. This seems to match what @atrick wrote above: "In 4.0.3 this part of the array copy gets inlined into user code. On master, it is no longer being inlined"

I'm not sure I follow your discussion but it looks like you see this as a TSAN issue and not a Swift issue anyway.

@atrick
Copy link
Member

atrick commented Apr 10, 2019

[EDIT] If my original JIRA comments are correct, then TSAN actually needs to consider a "release" operation as synchronizing on the released object conditionally at runtime when its refcount reaches one.

@devincoughlin
Copy link
Member

devincoughlin commented Apr 10, 2019

Yes, let's have a TSan bug for this. Note that we don't recommend/support running TSan under -O because the diagnostic reports can be hard to understand. But we still shouldn't have false positives in this case.

@atrick
Copy link
Member

atrick commented Apr 10, 2019

To reproduce it on master, I think you'd need to use `@inline(__always)` or hack up your own CoW data type and make sure the important code path is inlined.

@ole
Copy link
Contributor

ole commented Apr 10, 2019

@atrick Thank you for clarifying this for me.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants