[RFC] Introduce convergence control intrinsics
This is a reboot of the original design and implementation by
Nicolai Haehnle <nicolai.haehnle@amd.com>:
https://reviews.llvm.org/D85603

This change also obsoletes an earlier attempt at restarting the work on
convergence tokens:
https://reviews.llvm.org/D104504

Changes relative to D85603:

 1. Clean up the definition of a "convergent operation", a convergent
    call and convergent function.
 2. Clean up the relationship between dynamic instances, sets of threads and
    convergence tokens.
 3. Redistribute the formal rules into the definitions of the convergence
    intrinsics.
 4. Expand on the semantics of entering a function from outside LLVM,
    and the environment-defined outcome of the entry intrinsic.
 5. Replace the term "cycle" with "closed path". The static rules are defined
    in terms of closed paths, and then a relation is established with cycles.
 6. Specify that if a function contains a controlled convergent operation, then
    all convergent operations in that function must be controlled.
 7. Describe an optional procedure to infer tokens for uncontrolled convergent
    operations.
 8. Introduce controlled maximal convergence-before and controlled m-converged
    property as an update to the original properties in UniformityAnalysis.
 9. Additional constraint that a cycle heart can only occur in the header of a
    reducible cycle (natural loop).

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D147116
ssahasra committed Jul 12, 2023
1 parent e36dd3e commit da61c86
Showing 22 changed files with 2,302 additions and 109 deletions.
79 changes: 42 additions & 37 deletions llvm/docs/ConvergenceAndUniformity.rst
@@ -1,3 +1,5 @@
.. _convergence-and-uniformity:

==========================
Convergence And Uniformity
==========================
@@ -82,6 +84,8 @@ Diverged path
either reaches a join node of the branch or reaches the end of the
function without passing through any join node of the branch.

.. _convergence-dynamic-instances:

Threads and Dynamic Instances
=============================

@@ -135,7 +139,7 @@ instance*. Informally, two threads that produce converged dynamic
instances are said to be *converged*, and they are said to execute
that static instance *convergently*, at that point in the execution.

*Convergence order* is a strict partial order over dynamic instances
*Convergence-before* is a strict partial order over dynamic instances
that is defined as the transitive closure of:

1. If dynamic instance ``P`` is executed strictly before ``Q`` in the
@@ -171,40 +175,26 @@ The fact that *convergence-before* is a strict partial order is a
constraint on the *converged-with* relation. It is trivially satisfied
if different dynamic instances are never converged. It is also
trivially satisfied for all known implementations for which
convergence plays some role. Aside from the strict partial convergence
order, there are currently no additional constraints on the
*converged-with* relation imposed in LLVM IR.
convergence plays some role.

.. _convergence-note-convergence:

.. note::

1. The ``convergent`` attribute on convergent operations does
constrain changes to ``converged-with``, but it is expressed in
terms of control flow and does not explicitly deal with thread
convergence.

2. The convergence-before relation is not
1. The convergence-before relation is not
directly observable. Program transforms are in general free to
change the order of instructions, even though that obviously
changes the convergence-before relation.

3. Converged dynamic instances need not be executed at the same
2. Converged dynamic instances need not be executed at the same
time or even on the same resource. Converged dynamic instances
of a convergent operation may appear to do so but that is an
implementation detail. The fact that ``P`` is convergence-before
implementation detail.

3. The fact that ``P`` is convergence-before
``Q`` does not automatically imply that ``P`` happens-before
``Q`` in a memory model sense.

4. **Future work:** Providing convergence-related guarantees to
compiler frontends enables some powerful optimization techniques
that can be used by programmers or by high-level program
transforms. Constraints on the ``converged-with`` relation may
be added eventually as part of the definition of LLVM
IR, so that guarantees can be made that frontends can rely on.
For a proposal on how this might work, see `D85603
<https://reviews.llvm.org/D85603>`_.

.. _convergence-maximal:

Maximal Convergence
@@ -217,8 +207,11 @@ relation is reasonable for real targets and is compatible with
convergent operations.

The maximal converged-with relation is defined in terms of cycle
headers, which are not unique to a given CFG. Each cycle hierarchy for
the same CFG results in a different maximal converged-with relation.
headers, with the assumption that threads converge at the header on every
"iteration" of the cycle. Informally, two threads execute the same iteration of
a cycle if they both previously executed the cycle header the same number of
times after they entered that cycle. In general, this needs to account for the
iterations of parent cycles as well.
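
The informal notion of "same iteration" can be illustrated with a small Python sketch. It is not part of LLVM: the names ``header_visit_counts`` and ``same_iteration`` are hypothetical, each thread is modelled as a plain list of block names, and the sketch assumes a single cycle, so it deliberately ignores the iterations of parent cycles mentioned above.

```python
def header_visit_counts(trace, header, cycle_blocks):
    """For each step of one thread's trace, give the number of times the
    thread has executed `header` since it last entered the cycle.

    Steps outside the cycle are reported as None. Leaving the cycle resets
    the count, so a later re-entry starts counting from zero again.
    """
    counts = []
    visits = 0
    for block in trace:
        if block not in cycle_blocks:
            visits = 0              # the thread is outside the cycle
            counts.append(None)
            continue
        if block == header:
            visits += 1             # a new "iteration" begins at the header
        counts.append(visits)
    return counts


def same_iteration(trace_a, i, trace_b, j, header, cycle_blocks):
    """True if step i of thread A and step j of thread B are both inside the
    cycle and were reached after the same number of header visits."""
    a = header_visit_counts(trace_a, header, cycle_blocks)[i]
    b = header_visit_counts(trace_b, header, cycle_blocks)[j]
    return a is not None and a == b


# Hypothetical cycle {H, B} with header H; thread B runs one more iteration.
cycle = {"H", "B"}
thread_a = ["E", "H", "B", "H", "B", "X"]
thread_b = ["E", "H", "B", "H", "B", "H", "B", "X"]
```

Here step 4 of both threads is block ``B`` in the second iteration, so the two steps are candidates for being converged, while step 6 of ``thread_b`` belongs to a third iteration that ``thread_a`` never executes.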

**Maximal converged-with:**

@@ -235,6 +228,10 @@ the same CFG results in a different maximal converged-with relation.

.. note::

Cycle headers may not be unique to a given CFG if it is irreducible. Each
cycle hierarchy for the same CFG results in a different maximal
converged-with relation.

For brevity, the rest of the document restricts the term
*converged* to mean "related under the maximal converged-with
relation for the given cycle hierarchy".
@@ -269,7 +266,7 @@ Maximal convergence can now be demonstrated in the earlier example as follows:
Dependence on Cycles Headers
----------------------------

Contradictions in convergence order are possible only between two
Contradictions in *convergence-before* are possible only between two
nodes that are inside some cycle. The dynamic instances of such nodes
may be interleaved in the same thread, and this interleaving may be
different for different threads.
@@ -427,6 +424,8 @@ any use ``U`` outside the cycle receives a value from non-converged
dynamic instances of ``N``. An output of ``U`` may be divergent,
depending on the semantics of the instruction.

.. _uniformity-analysis:

Static Uniformity Analysis
==========================

@@ -458,20 +457,14 @@ hierarchy:


Each node ``X`` in a given CFG is reported to be m-converged if and
only if:

1. ``X`` is a :ref:`top-level<cycle-toplevel-block>` node, in which
case, there are no cycle headers to influence the convergence of
``X``.
only if every cycle that contains ``X`` satisfies the following necessary
conditions:

2. Otherwise, if ``X`` is inside a cycle, then every cycle that
contains ``X`` satisfies the following necessary conditions:

a. Every divergent branch inside the cycle satisfies the
:ref:`diverged entry criterion<convergence-diverged-entry>`, and,
b. There are no :ref:`diverged paths reaching the
cycle<convergence-diverged-outside>` from a divergent branch
outside it.
1. Every divergent branch inside the cycle satisfies the
:ref:`diverged entry criterion<convergence-diverged-entry>`, and,
2. There are no :ref:`diverged paths reaching the
cycle<convergence-diverged-outside>` from a divergent branch
outside it.
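
One way to read the condition "every cycle that contains ``X``" is as a universally quantified check, sketched below in Python. This is an illustration only, not the implementation in LLVM: ``CycleFacts`` and ``is_m_converged`` are hypothetical names, and the two boolean fields stand in for the diverged entry criterion and the diverged-path check, which are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class CycleFacts:
    """Hypothetical per-cycle summary; both facts are assumed precomputed."""
    name: str
    # Condition 1: every divergent branch inside the cycle satisfies the
    # diverged entry criterion.
    inner_branches_meet_entry_criterion: bool
    # Negation of condition 2: some diverged path reaches the cycle from a
    # divergent branch outside it.
    reached_by_outside_diverged_path: bool


def is_m_converged(cycles_containing_x):
    """Report X as m-converged iff every enclosing cycle passes both checks."""
    return all(
        c.inner_branches_meet_entry_criterion
        and not c.reached_by_outside_diverged_path
        for c in cycles_containing_x
    )
```

Under this reading, a top-level node has no enclosing cycles, so ``is_m_converged([])`` holds vacuously and no separate case for top-level nodes is needed.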

.. note::

@@ -700,3 +693,15 @@ Clearly, this can be determined only in a cycle hierarchy ``T`` where
in a different cycle hierarchy ``T'`` where ``C`` is part of a larger
cycle ``C'`` with the same header, but this does not contradict the
conclusion in ``T``.

Controlled Convergence
======================

:ref:`Convergence control tokens <dynamic_instances_and_convergence_tokens>`
provide an explicit semantics for determining which threads are converged at a
given point in the program. The impact of this is incorporated in a
:ref:`controlled maximal converged-with <controlled_maximal_converged_with>`
relation over dynamic instances and a :ref:`controlled m-converged
<controlled_m_converged>` property of static instances. The :ref:`uniformity
analysis <uniformity-analysis>` implemented in LLVM includes this for targets
that support convergence control tokens.
