-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tsan_release edge on task creation #38074
Conversation
Make sure to synchronize on Job address (AsyncTasks are Jobs, but not all Jobs are AsyncTasks).
without this, we are getting false data races between when a task is created and immediately scheduled on a different thread. False positive for `Sanitizers/tsan/actor_counters.swift` test: ``` WARNING: ThreadSanitizer: data race (pid=81452) Read of size 8 at 0x7b2000000560 by thread T5: #0 Counter.next() <null>:2 (a.out:x86_64+0x1000047f8) #1 (1) suspend resume partial function for worker(identity:counters:numIterations:) <null>:2 (a.out:x86_64+0x100005961) #2 swift::runJobInEstablishedExecutorContext(swift::Job*) <null>:2 (libswift_Concurrency.dylib:x86_64+0x280ef) Previous write of size 8 at 0x7b2000000560 by main thread: #0 Counter.init(maxCount:) <null>:2 (a.out:x86_64+0x1000046af) #1 Counter.__allocating_init(maxCount:) <null>:2 (a.out:x86_64+0x100004619) #2 runTest(numCounters:numWorkers:numIterations:) <null>:2 (a.out:x86_64+0x100006d2e) #3 swift::runJobInEstablishedExecutorContext(swift::Job*) <null>:2 (libswift_Concurrency.dylib:x86_64+0x280ef) #4 main <null>:2 (a.out:x86_64+0x10000a175) ``` New edge with this change: ``` [4357150208] allocate task 0x7b3800000000, parent = 0x0 [4357150208] creating task 0x7b3800000000 with parent 0x0 [4357150208] tsan_release on 0x7b3800000000 <<< new release edge [139088221442048] tsan_acquire on 0x7b3800000000 [139088221442048] trying to switch from executor 0x0 to 0x7ff85e2d9a00 [139088221442048] switch failed, task 0x7b3800000000 enqueued on executor 0x7ff85e2d9a00 [139088221442048] enqueue job 0x7b3800000000 on executor 0x7ff85e2d9a00 [139088221442048] tsan_release on 0x7b3800000000 [139088221442048] tsan_release on 0x7b3800000000 [4357150208] tsan_acquire on 0x7b3800000000 counters: 1, workers: 1, iterations: 1 [4357150208] allocate task 0x7b3c00000000, parent = 0x0 [4357150208] creating task 0x7b3c00000000 with parent 0x0 [4357150208] tsan_release on 0x7b3c00000000 <<< new release edge [139088221442048] tsan_acquire on 0x7b3c00000000 [4357150208] task 0x7b3800000000 waiting on task 0x7b3c00000000, going to sleep [4357150208] tsan_release on 0x7b3800000000 [4357150208] tsan_release on 0x7b3800000000 [139088221442048] getting current executor 0x0 [139088221442048] tsan_release on 0x7b3c00000000 ... ``` rdar://78932849
stdlib/public/Concurrency/Task.cpp
Outdated
@@ -642,6 +642,8 @@ static AsyncTaskAndContext swift_task_create_commonImpl( | |||
swift_retain(task); | |||
swift_task_enqueue(task, executor); | |||
} | |||
|
|||
_swift_tsan_release(static_cast<Job *>(task)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the false positive went away again after 3727db2 because the call to swift_task_enqueue()
above adds the required release edge.
@DougGregor
Are all tasks that are created here, i.e., also the ones that are not immediately enqueued above (taskCreateFlags.enqueueJob() == false
) always guaranteed to be scheduled via swift_task_enqueue()
before they start executing on a different thread? If so, then we don't need this anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They will be scheduled via swift_task_enqueue
at some point, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know what changed here? Previously, not all tasks were going through the swift_task_enqueue()
funnel, which produced the false positive we were seeing.
Do you recommend adding this (superfluous) edge here to protect us from future changes, or relying on "all tasks go through swift_task_enqueue ()
even if not immediately scheduled"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doug pointed out that some places call swift_task_enqueueGlobal()
directly.
The most common (and currently covered) case seems to be swift_task_enqueue()
calling swift_task_enqueueGlobal()
, but if the latter (more specialized) version is getting called “by itself”, we should add an edge there (and actually not at task creation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the TSan release edge from swift_task_create_commonImpl()
to swift_task_enqueueGlobalImpl()
. Task creation itself is not an event that needs synchronization, but rather that task creation "happens before" execution of that task on another thread.
This edge is usually added when the task is scheduled via swift_task_enqueue()
(which then usually calls swift_task_enqueueGlobal()
). However, not all task scheduling goes through the swift_task_enqueue()
funnel: some places call the more specific swift_task_enqueueGlobal()
directly. So let's annotate this function (duplicate edges aren't harmful) to ensure we cover all schedule events, including newly-created tasks (our original problem).
@swift-ci Please test |
Build failed |
Unrelated tests:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense.
Move the TSan release edge from `swift_task_create_commonImpl()` to `swift_task_enqueueGlobalImpl()`. Task creation itself is not an event that needs synchronization, but rather that task creation "happens before" execution of that task on another thread. This edge is usually added when the task is scheduled via `swift_task_enqueue()` (which then usually calls `swift_task_enqueueGlobal()`). However, not all task scheduling goes through the `swift_task_enqueue()` funnel as some places call the more specific `swift_task_enqueueGlobal()` directly. So let's annotate this function (duplicate edges aren't harmful) to ensure we cover all schedule events, including newly-created tasks (our original problem here). rdar://78932849
@swift-ci Please test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, thinks!
We need to update the modeling of Swift Concurrency operations for TSan to avoid false positives.
For the original TSan annotations, I documented my understanding of the operations and reasoning for where to put the TSan edges in the commit messages: initial commit, task groups.
Since then the Swift Concurrency Runtime has evolved and we need to refine our TSan annotations. Relevant commits: 1, 2, ...
Fix: Add tsan_release edge on task creation
Without this, we are getting false data races when a task is created and immediately scheduled on a different thread.
False positive for
Sanitizers/tsan/actor_counters.swift
test:New edge with this change:
Note: TSan annotations should surround "synchronization operations" as follows:
rdar://78932849