Merged
63 changes: 59 additions & 4 deletions llvm/docs/LangRef.rst
@@ -24430,7 +24430,7 @@ Examples:
.. _int_loop_dependence_war_mask:

'``llvm.loop.dependence.war.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
@@ -24469,11 +24469,12 @@ Semantics:
The intrinsic returns ``poison`` if the distance between ``%ptrA`` and ``%ptrB``
is smaller than ``VF * %elementSize`` and either ``%ptrA + VF * %elementSize``
or ``%ptrB + VF * %elementSize`` wraps.

An element of the result mask is active when loading from ``%ptrA`` and then
storing to ``%ptrB`` is safe for that lane and doesn't result in a
write-after-read hazard, meaning that:

* (ptrB - ptrA) <= 0 (guarantees that all lanes are loaded before any stores), or
* elementSize * lane < (ptrB - ptrA) (guarantees that this lane is loaded
  before the store to the same address)
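
The two conditions above can be illustrated with a scalar reference model. The
following is only a sketch, not LLVM code: the function name ``warMaskModel``
and the use of plain integers for addresses are assumptions made for
illustration, the strict form ``elementSize * lane < (ptrB - ptrA)`` is
assumed, and the ``poison`` case (pointer wrap) is not modelled.

.. code-block:: c++

    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Hypothetical scalar model of the WAR mask (illustration only).
    // ptrA is the load address and ptrB the store address, as integers.
    std::vector<bool> warMaskModel(std::int64_t ptrA, std::int64_t ptrB,
                                   std::int64_t elementSize, unsigned vf) {
      assert(elementSize > 0 && "element size must be positive");
      const std::int64_t diff = ptrB - ptrA;
      std::vector<bool> mask(vf);
      for (unsigned lane = 0; lane < vf; ++lane) {
        // Active if the store never runs ahead of the load (diff <= 0), or
        // if this lane is loaded before the store reaches its address.
        mask[lane] = diff <= 0 ||
                     elementSize * static_cast<std::int64_t>(lane) < diff;
      }
      return mask;
    }

Under these assumptions, ``warMaskModel(0, 8, 4, 4)`` activates only the first
two lanes, while a distance of ``16`` bytes or any non-positive distance
activates all four.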

Examples:
@@ -24486,10 +24487,37 @@ Examples:
[...]
call @llvm.masked.store.v4i32.p0v4i32(<4 x i32> %vecA, ptr align 4 %ptrB, <4 x i1> %loop.dependence.mask)

; For the above example, consider the following cases:
;
; 1. ptrA >= ptrB
;
; load = <0,1,2,3> ; uint32_t load = array[i+2];
; store = <0,1,2,3> ; array[i] = store;
;
; This results in an all-true mask, as the load always occurs before the
; store, so it does not depend on any values to be stored.
;
; 2. ptrB - ptrA = 2 * elementSize
;
; load = <0,1,2,3> ; uint32_t load = array[i];
; store = <0,1,2,3> ; array[i+2] = store;
;
; This results in a mask with the first two lanes active. This is because
; we can only read two lanes before we would read values that have yet to
; be written.
;
; 3. ptrB - ptrA = 4 * elementSize
;
; load = <0,1,2,3> ; uint32_t load = array[i];
; store = <0,1,2,3> ; array[i+4] = store;
;
; This results in an all-true mask, as the store is a full vector ahead
; of the load, so all values will be written before any lane is read.

.. _int_loop_dependence_raw_mask:

'``llvm.loop.dependence.raw.mask.*``' Intrinsics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""
@@ -24533,10 +24561,11 @@ Semantics:
The intrinsic returns ``poison`` if the distance between ``%ptrA`` and ``%ptrB``
is smaller than ``VF * %elementSize`` and either ``%ptrA + VF * %elementSize``
or ``%ptrB + VF * %elementSize`` wraps.

An element of the result mask is active when storing to ``%ptrA`` and then
loading from ``%ptrB`` is safe for that lane and doesn't result in aliasing,
meaning that:

* elementSize * lane < abs(ptrB - ptrA) (guarantees that the store of this lane
  occurs before loading from this address), or
* ptrA == ptrB (doesn't introduce any new hazards that weren't in the scalar
code)
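
As with the WAR variant, the conditions can be illustrated with a scalar
reference model. This is only a sketch, not LLVM code: the function name
``rawMaskModel`` and the use of plain integers for addresses are assumptions
made for illustration, the strict form ``elementSize * lane < abs(ptrB - ptrA)``
is assumed, and the ``poison`` case (pointer wrap) is not modelled.

.. code-block:: c++

    #include <cassert>
    #include <cstdint>
    #include <vector>

    // Hypothetical scalar model of the RAW mask (illustration only).
    // ptrA is the store address and ptrB the load address, as integers.
    std::vector<bool> rawMaskModel(std::int64_t ptrA, std::int64_t ptrB,
                                   std::int64_t elementSize, unsigned vf) {
      assert(elementSize > 0 && "element size must be positive");
      const std::int64_t diff = ptrB - ptrA;
      const std::int64_t absDiff = diff < 0 ? -diff : diff;
      std::vector<bool> mask(vf);
      for (unsigned lane = 0; lane < vf; ++lane) {
        // Active if both pointers are equal (no new hazard), or if this lane
        // lies strictly within the distance between the two pointers.
        mask[lane] = absDiff == 0 ||
                     elementSize * static_cast<std::int64_t>(lane) < absDiff;
      }
      return mask;
    }

Under these assumptions, a distance of ``8`` bytes in either direction with
4-byte elements activates only the first two of four lanes, and equal pointers
activate all lanes.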
@@ -24551,6 +24580,32 @@ Examples:
[...]
%vecB = call <4 x i32> @llvm.masked.load.v4i32.p0v4i32(ptr align 4 %ptrB, <4 x i1> %loop.dependence.mask, <4 x i32> poison)

; For the above example, consider the following cases:
;
; 1. ptrA == ptrB
;
; store = <0,1,2,3> ; array[i] = store;
; load = <0,1,2,3> ; uint32_t load = array[i];
;
; This results in an all-true mask. There is no conflict.
;
; 2. ptrB - ptrA = 2 * elementSize
;
; store = <0,1,2,3> ; array[i] = store;
; load = <0,1,2,3> ; uint32_t load = array[i+2];
;
; This results in a mask with the first two lanes active. In this case,
; only two lanes can be written without overwriting values yet to be read.
;
; 3. ptrB - ptrA = -2 * elementSize
;
; store = <0,1,2,3> ; array[i+2] = store;
; load = <0,1,2,3> ; uint32_t load = array[i];
;
; This also results in a mask with the first two lanes active. This is
; because if any more lanes were active the load would be dependent on the
; completion of the store.

.. _int_experimental_vp_splice:

'``llvm.experimental.vp.splice``' Intrinsic