diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index 95b54548f4fa8..b72710b606b97 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1339,6 +1339,84 @@ arguments.
 
   %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
 
+'``amdgpu.no.remote.memory``' Metadata
+---------------------------------------------
+
+Asserts a memory operation does not access bytes in host memory, or
+remote connected peer device memory (the address must be device
+local). This is intended for use with :ref:`atomicrmw `
+and other atomic instructions. This is required to emit a native
+hardware instruction for some :ref:`system scope
+` atomic operations on some subtargets. For most
+integer atomic operations, this is a sufficient restriction to emit a
+native atomic instruction.
+
+An :ref:`atomicrmw ` without metadata will be treated
+conservatively as required to preserve the operation behavior in all
+cases. This will typically be used in conjunction with
+:ref:`\!amdgpu.no.fine.grained.memory`.
+
+
+.. code-block:: llvm
+
+  ; Indicates the atomic does not access fine-grained memory, or
+  ; remote device memory.
+  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory !0
+
+  ; Indicates the atomic does not access peer device memory.
+  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.remote.memory !0
+
+  !0 = !{}
+
+.. _amdgpu_no_fine_grained_memory:
+
+'``amdgpu.no.fine.grained.memory``' Metadata
+-------------------------------------------------
+
+Asserts a memory access does not access bytes allocated in
+fine-grained allocated memory. This is intended for use with
+:ref:`atomicrmw ` and other atomic instructions. This is
+required to emit a native hardware instruction for some :ref:`system
+scope ` atomic operations on some subtargets. An
+:ref:`atomicrmw ` without metadata will be treated
+conservatively as required to preserve the operation behavior in all
+cases. This will typically be used in conjunction with
+:ref:`\!amdgpu.no.remote.memory.access`.
+
+.. code-block:: llvm
+
+  ; Indicates the access does not access fine-grained memory, or
+  ; remote device memory.
+  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory.access !0
+
+  ; Indicates the access does not access fine-grained memory
+  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.fine.grained.memory !0
+
+  !0 = !{}
+
+.. _amdgpu_no_remote_memory_access:
+
+'``amdgpu.ignore.denormal.mode``' Metadata
+------------------------------------------
+
+For use with :ref:`atomicrmw ` floating-point
+operations. Indicates the handling of denormal inputs and results is
+insignificant and may be inconsistent with the expected floating-point
+mode. This is necessary to emit a native atomic instruction on some
+targets for some address spaces where float denormals are
+unconditionally flushed. This is typically used in conjunction with
+:ref:`\!amdgpu.no.remote.memory.access`
+and
+:ref:`\!amdgpu.no.fine.grained.memory`
+
+
+.. code-block:: llvm
+
+  %res0 = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4, !amdgpu.ignore.denormal.mode !0
+  %res1 = atomicrmw fadd ptr addrspace(1) %ptr, float %value seq_cst, align 4, !amdgpu.ignore.denormal.mode !0, !amdgpu.no.fine.grained.memory !0, !amdgpu.no.remote.memory.access !0
+
+  !0 = !{}
+
 LLVM IR Attributes
 ==================
 
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index c7c2c2825f58b..2e73420d04d4b 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -90,6 +90,8 @@ Changes to the AMDGPU Backend
 -----------------------------
 
 * Implemented the ``llvm.get.fpenv`` and ``llvm.set.fpenv`` intrinsics.
+* Added ``!amdgpu.no.fine.grained.memory`` and
+  ``!amdgpu.no.remote.memory`` metadata to control atomic behavior.
 * Implemented :ref:`llvm.get.rounding ` and :ref:`llvm.set.rounding `