From a726bd30735fd50956edf9c8cfc59bb6ac398b02 Mon Sep 17 00:00:00 2001
From: Mehdi Goli
Date: Thu, 4 Apr 2024 18:05:18 +0100
Subject: [PATCH] Updating README-sycl.md to capture the 3.5 modifications (#16)

* Updating README-sycl.md to capture the 3.5 modifications

* Update README-sycl.md

Co-authored-by: aacostadiaz

* Remove the sgemm_nt_1_sycl PoC (#15)

* Remove sgemm_nt_1 PoC

* Fix build issues

* Fix code style format

* Remove ENABLE_NVPTX flag

* Update include/cute/util/debug.hpp

Co-authored-by: Mehdi Goli

* Cosmetic

---------

Co-authored-by: Mehdi Goli

* Applying the comments

---------

Co-authored-by: aacostadiaz
---
 README-sycl.md | 35 ++++++++++++++++++++++++++++-------
 1 file changed, 28 insertions(+), 7 deletions(-)

diff --git a/README-sycl.md b/README-sycl.md
index 36467e58f7..ea41236438 100644
--- a/README-sycl.md
+++ b/README-sycl.md
@@ -17,14 +17,24 @@ resources for GPUs.
 Currently, only one example works on NVIDIA SM 80.
-## Building with SYCL support
-
-To build CUTLASS SYCL support you need the latest version of DPC++ compiler,
-you can either use a recent [nighly build](https://github.com/intel/llvm/releases)
+## Requirements
+
+To build CUTLASS SYCL support you need the latest version of DPC++ compiler, you can either use a recent [nighly build](https://github.com/intel/llvm/releases)
 or build the compiler from source.
-In either case, make sure to enable the NVIDIA plugin so you can build applications
+For the latter, make sure to enable the NVIDIA plugin so you can build applications
 for NVIDIA GPUs.
+
+I see, in that case let's not call it plugins, which confuses with the Plugins available on the codeplay's website to people who are completely new to SYCL,
+
+we can phrase it as -
+
+Suggested change
+In either case, make sure to enable the NVIDIA plugin so you can build applications
+To build CUTLASS with SYCL support, install the latest DPC++ compiler with the CUDA backend enabled, either by building from source as described [here](https://github.com/intel/llvm/blob/sycl/sycl/doc/GetStartedGuide.md#build-dpc-toolchain-with-support-for-nvidia-cuda) , or by downloading the [nightly releases](https://github.com/intel/llvm/releases)
+
+
+## Building with SYCL support
 Once you have your compiler installed, you need to point the `CMAKE_CUDA_HOST_COMPILER` flag to the clang++ provided by it. This enables the compilation of SYCL sources without altering the current NVCC path.
@@ -44,18 +54,29 @@ make -G Ninja \
 # Running the example
+## CuTe
 Currently, you can build the CuTe Tutorial using the following command:
 ```
-ninja sgemm_nt_1_sycl
+ninja [EXAMPLE_NAME]_sycl
 ```
 You can run it like this from your build directory
 ```
-LD_LIBRARY_PATH=/path/to/sycl/install/lib ./examples/cute/tutorial/sgemm_nt_1_sycl
+LD_LIBRARY_PATH=/path/to/sycl/install/lib ./examples/cute/tutorial/[EXAMPLE_NAME]_sycl
 ```
+## CUTLASS Example
+ Currently, the example `14_amper_tf32_tensorop_gemm` has been implemented for SYCL on Nvidia Ampere architecture. You can build this from your build directory by running :
+ ```
+ ninja 14_ampere_tf32_tensorop_gemm_cute
+ ```
+ You can run it like this from your build directory
+ ```
+ NVIDIA_TF32_OVERRIDE=1 LD_LIBRARY_PATH=/path/to/sycl/install/lib ./examples/14_ampere_tf32_tensorop_gemm/14_ampere_tf32_tensorop_gemm_cute
+ ```
+
 # References
 [1] https://www.khronos.org/sycl/