This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

Draft: Document asynchronous threads #371

Open
wants to merge 2 commits into
base: main
8 changes: 8 additions & 0 deletions docs/extended_api/memory_model.md
@@ -12,6 +12,13 @@ It is low across threads within a block, but high across arbitrary threads in th

To account for non-uniform thread synchronization costs that are not always low, CUDA C++ extends the standard C++ memory model and concurrency facilities in the `cuda::` namespace with **thread scopes**, retaining the syntax and semantics of standard C++ by default.

## Asynchronous operations

[Asynchronous operations], like the copy operations performed by [`memcpy_async`], are performed _as-if_ by new _asynchronous threads_.

[Asynchronous operations]: extended_api/asynchronous_operations.md
[`memcpy_async`]: extended_api/asynchronous_operations/memcpy_async.md
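
As a minimal sketch of the statement above, the kernel below submits a block-wide copy with `cuda::memcpy_async` and waits for it on a block-scoped `cuda::barrier`. The copy is performed as-if by a new asynchronous thread, and the submitting threads read the staged data only after the barrier wait completes. The kernel name `double_through_shared`, the buffer sizes, and the launch configuration are illustrative assumptions, not part of this change. The sketch also assumes a device of compute capability 7.0 or newer.

```cuda
#include <cooperative_groups.h>
#include <cuda/barrier>
#include <cstdio>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical kernel: stages `n` ints in shared memory, then doubles them.
__global__ void double_through_shared(const int* in, int* out, int n) {
  extern __shared__ int tile[];
  // nvcc may warn about dynamic initialization of this __shared__ object;
  // the barrier is initialized explicitly by thread 0 below.
  __shared__ cuda::barrier<cuda::thread_scope_block> bar;

  auto block = cg::this_thread_block();
  if (block.thread_rank() == 0) {
    init(&bar, block.size());  // one expected arrival per thread in the block
  }
  block.sync();

  // Submit the copy. It is performed as-if by a new asynchronous thread,
  // and its completion is tracked by `bar`.
  cuda::memcpy_async(block, tile, in, sizeof(int) * n, bar);

  // Wait until every thread has arrived and the copy has completed.
  bar.arrive_and_wait();

  for (int i = block.thread_rank(); i < n; i += block.size()) {
    out[i] = 2 * tile[i];
  }
}

int main() {
  const int n = 256;
  std::vector<int> h_in(n), h_out(n);
  for (int i = 0; i < n; ++i) h_in[i] = i;

  int* d_in = nullptr;
  int* d_out = nullptr;
  cudaMalloc(&d_in, n * sizeof(int));
  cudaMalloc(&d_out, n * sizeof(int));
  cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

  // One block of 128 threads with n * sizeof(int) bytes of dynamic shared memory.
  double_through_shared<<<1, 128, n * sizeof(int)>>>(d_in, d_out, n);
  cudaMemcpy(h_out.data(), d_out, n * sizeof(int), cudaMemcpyDeviceToHost);

  bool ok = true;
  for (int i = 0; i < n; ++i) ok = ok && (h_out[i] == 2 * h_in[i]);
  std::printf("%s\n", ok ? "ok" : "mismatch");

  cudaFree(d_in);
  cudaFree(d_out);
  return ok ? 0 : 1;
}
```

Reading `tile` before `bar.arrive_and_wait()` returns would race with the asynchronous thread that performs the copy.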

## Thread Scopes

A _thread scope_ specifies the kind of threads that can synchronize with each other using a synchronization primitive such as [`atomic`] or [`barrier`].
@@ -39,6 +46,7 @@ Each program thread is related to each other program thread by one or more threa
- Each GPU thread is related to each other GPU thread in the same CUDA device by the *device* thread scope: `thread_scope_device`.
- Each GPU thread is related to each other GPU thread in the same CUDA thread block by the *block* thread scope: `thread_scope_block`.
- Each thread is related to itself by the `thread` thread scope: `thread_scope_thread`.
- Each thread is related to each asynchronous thread that it creates by all thread scopes.
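
The scope relations listed above can be sketched with scoped atomics. In the example below, threads of one block accumulate into a shared counter through a `thread_scope_block` atomic reference, and one thread per block then publishes the block's count through a `thread_scope_device` reference. The kernel name `count_even`, the input data, and the launch configuration are illustrative assumptions, and the sketch assumes a libcu++ version that provides `cuda::atomic_ref`.

```cuda
#include <cuda/atomic>
#include <cstdio>
#include <vector>

// Hypothetical kernel: counts the even values in `data`.
__global__ void count_even(const int* data, int n, int* total) {
  __shared__ int block_count;
  if (threadIdx.x == 0) block_count = 0;
  __syncthreads();

  // Only threads of this block touch `block_count`, so block scope suffices.
  cuda::atomic_ref<int, cuda::thread_scope_block> block_ref(block_count);
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n && data[i] % 2 == 0) {
    block_ref.fetch_add(1, cuda::std::memory_order_relaxed);
  }
  __syncthreads();

  // Threads from different blocks update `*total`, so device scope is needed.
  if (threadIdx.x == 0) {
    cuda::atomic_ref<int, cuda::thread_scope_device> total_ref(*total);
    total_ref.fetch_add(block_count, cuda::std::memory_order_relaxed);
  }
}

int main() {
  const int n = 1000;
  std::vector<int> h_data(n);
  for (int i = 0; i < n; ++i) h_data[i] = i;

  int* d_data = nullptr;
  int* d_total = nullptr;
  cudaMalloc(&d_data, n * sizeof(int));
  cudaMalloc(&d_total, sizeof(int));
  cudaMemcpy(d_data, h_data.data(), n * sizeof(int), cudaMemcpyHostToDevice);
  cudaMemset(d_total, 0, sizeof(int));

  const int threads = 256;
  const int blocks = (n + threads - 1) / threads;
  count_even<<<blocks, threads>>>(d_data, n, d_total);

  int h_total = 0;
  cudaMemcpy(&h_total, d_total, sizeof(int), cudaMemcpyDeviceToHost);
  std::printf("even values: %d\n", h_total);

  cudaFree(d_data);
  cudaFree(d_total);
  return h_total == n / 2 ? 0 : 1;
}
```

Using the narrowest scope that covers all participating threads is the design choice here: the per-block counter never needs device-wide synchronization, while the global total does.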

## Synchronization primitives
