[AMDGPU] llvm.prefetch documentation for gfx1250. NFC #157949
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-backend-amdgpu
Author: Stanislav Mekhanoshin (rampitec)
Changes
Full diff: https://github.com/llvm/llvm-project/pull/157949.diff
1 Files Affected:
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index c7d515aeb012f..37563203f2f83 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1455,7 +1455,6 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
Returns a pair for the swapped registers. The first element of the return corresponds
to the swapped element of the first argument.
-
llvm.amdgcn.permlane32.swap Provide direct access to `v_permlane32_swap_b32` instruction on supported targets.
Swaps the values across lanes of first 2 operands. Rows 2 and 3 of the first operand are
swapped with rows 0 and 1 of the second operand (one row is 16 lanes).
@@ -1476,6 +1475,25 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
- `v_mov_b32 <dest> <old>`
- `v_mov_b32 <dest> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`
+ :ref:`llvm.prefetch <int_prefetch>` Implemented on gfx1250, ignored on earlier targets.
+ First argument is flat, global, or constant address space pointer.
+ Any other address space is not supported.
+ On gfx125x generates flat_prefetch_b8 or global_prefetch_b8 and brings data to GL2.
+ Second argument is rw and currently ignored. Can be 0 or 1.
+ Third argument is locality, 0-3. Translates to memory scope:
+
+ * 0 - SCOPE_SYS
+ * 1 - SCOPE_DEV
+ * 2 - SCOPE_SE
+ * 3 - SCOPE_SE
+
+ Note that SCOPE_CU is not generated and not safe on an invalid address.
+ Fourth argument is cache type:
+
+ * 0 - Instruction cache, currently ignored and no code is generated.
+ * 1 - Data cache.
+
+ Instruction cache prefetches are unsafe on invalid address.
============================================== ==========================================================
.. TODO::
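For readers who want to see how the four arguments are usually reached from source code, here is a minimal sketch in C. It relies on the GCC/Clang builtin `__builtin_prefetch(addr, rw, locality)`, which Clang lowers to `llvm.prefetch` with the cache type set to 1 (data cache). The helper name `sum_with_prefetch` and the prefetch distance of 16 elements are hypothetical and chosen only for illustration. Whether a given device-side language accepts this builtin for gfx1250 is an assumption here, not something the patch states.

```c
#include <stddef.h>

/* Hypothetical prefetch distance, in elements; not taken from the patch. */
enum { PREFETCH_DISTANCE = 16 };

/* Sums an array while prefetching ahead of the current element.
 * rw = 0 requests a read prefetch (the patch says rw is currently ignored).
 * locality = 1 corresponds to SCOPE_DEV in the table added by the patch.
 * Both arguments must be compile-time constants. */
long sum_with_prefetch(const int *data, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Only prefetch addresses that lie inside the array. */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 1);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint, so the function returns the same sum whether or not the target generates an instruction for it. On targets earlier than gfx1250 the documentation above says the intrinsic is ignored.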
31696d1 to d43ddb5
Returns a pair for the swapped registers. The first element of the return corresponds
to the swapped element of the first argument.
hmm, why was this line removed?
Why would we need 2 blank lines here?
oh didn't notice there is one