[MLIR][NVVM][Docs] Explain memory spaces #168059
Conversation
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-llvm

Author: Guray Ozen (grypp)

Full diff: https://github.com/llvm/llvm-project/pull/168059.diff (1 file affected)
diff --git a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
index 1cc5b74a3cb67..5992abc8efcfd 100644
--- a/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+++ b/mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
@@ -79,6 +79,45 @@ def NVVM_Dialect : Dialect {
sequence must be expressed directly, NVVM provides an `nvvm.inline_ptx` op to
embed PTX inline as a last-resort escape hatch, with explicit operands and
results.
+
+
+ **Memory Spaces:** The NVVM dialect introduces the following memory spaces,
+ each with distinct scopes and lifetimes:
+
+ | Memory Space | Scope | Lifetime |
+ |-------------------|----------------------|-------------------|
+ | `generic` | All threads | Context-dependent |
+ | `global` | All threads (device) | Application |
+ | `shared` | Thread block (CTA) | Kernel execution |
+ | `constant` | All threads (RO) | Application |
+ | `local` | Single thread | Kernel execution |
+ | `tensor` | Thread block (CTA) | Kernel execution |
+ | `shared_cluster` | Thread block cluster | Kernel execution |
+
+ **Memory Space Details:**
+ - **generic**: Can point to any memory space; requires runtime resolution of
+ actual address space. Use when pointer origin is unknown at compile time.
+ Performance varies based on the underlying memory space.
+ - **global**: Accessible by all threads across all blocks; persists across
+ kernel launches. Highest latency (~400-800 cycles) but largest capacity
+ (device memory). Best for large data and inter-kernel communication.
+ - **shared**: Shared within a thread block (CTA); very fast on-chip memory
+ (~20-40 cycles) for cooperation between threads in the same block. Limited
+ capacity (48-164KB depending on architecture). Ideal for block-level
+ collaboration, caching, and reducing global memory traffic.
+ - **constant**: Read-only memory cached per SM; optimized for broadcast
+ patterns where all threads access the same location. Fast access when cached
+ (~20 cycles). Size typically limited to 64KB. Best for read-only data and
+ uniform values accessed by all threads.
+ - **local**: Private to each thread; used for stack frames and register spills.
+ Actually resides in global memory but cached in L1 (~100-200 cycles). Use for
+ per-thread private data and automatic variables that don't fit in registers.
+  - **tensor**: Special memory space for Tensor Memory Accelerator (TMA)
+    operations on SM 90+ architectures; used with async tensor operations and
+    wgmma instructions. Provides very fast access for matrix operations.
+ - **shared_cluster**: Shared across thread blocks within a cluster (SM 90+);
+ enables collaboration beyond single-block scope with distributed shared
+ memory. Fast access (~40-80 cycles) across cluster threads.
}];
let name = "nvvm";
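
As a hedged sketch (not part of the patch), the memory spaces documented above surface in MLIR as LLVM pointer address spaces, following the NVVM numbering convention (generic = 0, global = 1, shared = 3, constant = 4, local = 5). The function name and element type below are hypothetical illustrations:

```mlir
// Each thread copies one f32 element from global memory (addrspace 1)
// into shared memory (addrspace 3), then decays the shared pointer to
// the generic space (addrspace 0) for a consumer that does not know
// the pointer's origin at compile time.
llvm.func @copy_to_shared(%src : !llvm.ptr<1>, %dst : !llvm.ptr<3>) {
  %tid  = nvvm.read.ptx.sreg.tid.x : i32
  %from = llvm.getelementptr %src[%tid] : (!llvm.ptr<1>, i32) -> !llvm.ptr<1>, f32
  %to   = llvm.getelementptr %dst[%tid] : (!llvm.ptr<3>, i32) -> !llvm.ptr<3>, f32
  %val  = llvm.load %from : !llvm.ptr<1> -> f32
  llvm.store %val, %to : f32, !llvm.ptr<3>
  // Address-space cast from shared to generic; resolution of the actual
  // space then happens at runtime, as the `generic` entry above describes.
  %gen = llvm.addrspacecast %dst : !llvm.ptr<3> to !llvm.ptr
  // Block-level barrier so all threads see the populated shared tile.
  nvvm.barrier0
  llvm.return
}
```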
durga4github
left a comment
The latest revision LGTM
schwarzschild-radius
left a comment
LGTM, Thanks!
No description provided.