[SYCL][doc] Add mention of cuda gpu arch for enabling cuda-arch specific features (#6978)

Some users were mistakenly thinking that native atomics are not supported in DPC++ for CUDA. The doc now mentions that you need to provide the correct arch flags when compiling if you wish to use native atomics as well as other features.
intel · Oct 6, 2022 · 4e5d276
1 parent 28d0cd3
Showing 1 changed file with 11 additions and 0 deletions.
diff --git a/sycl/doc/GetStartedGuide.md b/sycl/doc/GetStartedGuide.md
@@ -641,6 +641,17 @@ clang++ -fsycl -fsycl-targets=amdgcn-amd-amdhsa \
   simple-sycl-app.cpp -o simple-sycl-app-amd.exe
 ```
 
+The target architecture may also be specified for the CUDA backend, with 
+`-Xsycl-target-backend --cuda-gpu-arch=<arch>`. Specifying the architecture is 
+necessary if an application aims to use newer hardware features, such as
+native atomic operations or tensor core operations. 
+
+```bash
+clang++ -fsycl -fsycl-targets=nvptx64-nvidia-cuda \
+  simple-sycl-app.cpp -o simple-sycl-app-cuda.exe \
+  -Xsycl-target-backend --cuda-gpu-arch=sm_80
+```
+
 To build simple-sycl-app ahead of time for GPU, CPU or Accelerator devices,
 specify the target architecture.  The examples provided use a supported
 alias for the target, representing a full triple.  Additional details can