Merged
20 changes: 19 additions & 1 deletion llvm/docs/AMDGPUUsage.rst
@@ -1455,7 +1455,6 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
Returns a pair for the swapped registers. The first element of the return corresponds
to the swapped element of the first argument.


Contributor

hmm, why was this line removed?

Collaborator Author

Why would we need 2 blank lines here?

Contributor

oh didn't notice there is one

llvm.amdgcn.permlane32.swap Provides direct access to the `v_permlane32_swap_b32` instruction on supported targets.
Swaps the values across lanes of the first two operands. Rows 2 and 3 of the first operand are
swapped with rows 0 and 1 of the second operand (one row is 16 lanes).
@@ -1476,6 +1475,25 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
- `v_mov_b32 <dest> <old>`
- `v_mov_b32 <dest> <src> <dpp_ctrl> <row_mask> <bank_mask> <bound_ctrl>`

:ref:`llvm.prefetch <int_prefetch>` Implemented on gfx1250, ignored on earlier targets.
The first argument is a pointer in the flat, global, or constant address space.
Other address spaces are not supported.
On gfx125x, generates flat_prefetch_b8 or global_prefetch_b8 and brings data into GL2.
The second argument is rw; it can be 0 or 1 and is currently ignored.
The third argument is locality, 0-3, which translates to a memory scope:

* 0 - SCOPE_SYS
* 1 - SCOPE_DEV
* 2 - SCOPE_SE
* 3 - SCOPE_SE

Note that SCOPE_CU is not generated and is not safe on an invalid address.
The fourth argument is the cache type:

* 0 - Instruction cache. Currently ignored; no code is generated.
* 1 - Data cache.

Instruction cache prefetches are unsafe on an invalid address.
============================================== ==========================================================
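
As a reading aid, the argument rules in the llvm.prefetch row above can be modeled as a small lookup. This is only a sketch of the mapping as documented, not the backend's implementation; the names `SCOPE_BY_LOCALITY` and `describe_prefetch` are hypothetical and do not exist in LLVM.

```python
# Illustrative model of the llvm.prefetch row above; this is not LLVM code.
# All names here are hypothetical.

# Locality argument (0-3) -> memory scope, as listed in the table.
SCOPE_BY_LOCALITY = {
    0: "SCOPE_SYS",
    1: "SCOPE_DEV",
    2: "SCOPE_SE",
    3: "SCOPE_SE",
}


def describe_prefetch(rw, locality, cache_type):
    """Return the memory scope for a data cache prefetch, or None when the
    table says no code is generated (instruction cache prefetch)."""
    if rw not in (0, 1):
        raise ValueError("rw must be 0 or 1")
    if locality not in SCOPE_BY_LOCALITY:
        raise ValueError("locality must be in the range 0-3")
    if cache_type not in (0, 1):
        raise ValueError("cache type must be 0 or 1")
    if cache_type == 0:
        # Instruction cache: currently ignored, no code is generated.
        return None
    # rw is currently ignored, so it does not influence the result.
    return SCOPE_BY_LOCALITY[locality]
```

Under this model, localities 2 and 3 both map to SCOPE_SE, and SCOPE_CU never appears as a result, matching the note in the table.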

.. TODO::
2 changes: 2 additions & 0 deletions llvm/docs/LangRef.rst
@@ -14631,6 +14631,8 @@ Semantics:
The return value type of :ref:`llvm.get.dynamic.area.offset <int_get_dynamic_area_offset>`
must match the target's :ref:`alloca address space <alloca_addrspace>` type.

.. _int_prefetch:

'``llvm.prefetch``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
