|
| 1 | +================================ |
| 2 | +LLVM Block Frequency Terminology |
| 3 | +================================ |
| 4 | + |
| 5 | +.. contents:: |
| 6 | + :local: |
| 7 | + |
| 8 | +Introduction |
| 9 | +============ |
| 10 | + |
| 11 | +Block Frequency is a metric for estimating the relative frequency of different |
| 12 | +basic blocks. This document describes the terminology that the |
| 13 | +``BlockFrequencyInfo`` and ``MachineBlockFrequencyInfo`` analysis passes use. |
| 14 | + |
| 15 | +Branch Probability |
| 16 | +================== |
| 17 | + |
| 18 | +Blocks with multiple successors have probabilities associated with each |
| 19 | +outgoing edge. These are called branch probabilities. For a given block, the |
| 20 | +sum of its outgoing branch probabilities should be 1.0. |
| 21 | + |
| 22 | +Branch Weight |
| 23 | +============= |
| 24 | + |
| 25 | +Rather than storing fractions on each edge, we store an integer weight. |
| 26 | +Weights are relative to the other edges of a given predecessor block. The |
| 27 | +branch probability associated with a given edge is its own weight divided by |
| 28 | +the sum of the weights on the predecessor's outgoing edges. |
| 29 | + |
| 30 | +For example, consider this IR: |
| 31 | + |
| 32 | +.. code-block:: llvm |
| 33 | +
|
| 34 | + define void @foo() { |
| 35 | + ; ... |
| 36 | + A: |
| 37 | + br i1 %cond, label %B, label %C, !prof !0 |
| 38 | + ; ... |
| 39 | + } |
| 40 | + !0 = metadata !{metadata !"branch_weights", i32 7, i32 8} |
| 41 | +
|
| 42 | +and this simple graph representation:: |
| 43 | + |
| 44 | + A -> B (edge-weight: 7) |
| 45 | + A -> C (edge-weight: 8) |
| 46 | + |
| 47 | +The probability of branching from block A to block B is 7/15, and the |
| 48 | +probability of branching from block A to block C is 8/15. |
| 49 | + |
| 50 | +See :doc:`BranchWeightMetadata` for details about the branch weight IR |
| 51 | +representation. |
| 52 | + |
| 53 | +Block Frequency |
| 54 | +=============== |
| 55 | + |
| 56 | +Block frequency is a relative metric that represents the number of times a |
| 57 | +block executes. The ratio of a block frequency to the entry block frequency is |
| 58 | +the expected number of times the block will execute per entry to the function. |
| 59 | + |
| 60 | +Block frequency is the main output of the ``BlockFrequencyInfo`` and |
| 61 | +``MachineBlockFrequencyInfo`` analysis passes. |
| 62 | + |
| 63 | +Implementation: a series of DAGs |
| 64 | +================================ |
| 65 | + |
| 66 | +The implementation of the block frequency calculation analyses each loop, |
| 67 | +bottom-up, ignoring backedges; i.e., as a DAG. After each loop is processed, |
| 68 | +it's packaged up to act as a pseudo-node in its parent loop's (or the |
| 69 | +function's) DAG analysis. |
| 70 | + |
| 71 | +Block Mass |
| 72 | +========== |
| 73 | + |
| 74 | +For each DAG, the entry node is assigned a mass of ``UINT64_MAX`` and mass is |
| 75 | +distributed to successors according to branch weights. Block Mass uses a |
| 76 | +fixed-point representation where ``UINT64_MAX`` represents ``1.0`` and ``0`` |
| 77 | +represents a number just above ``0.0``. |
| 78 | + |
| 79 | +After mass is fully distributed, in any cut of the DAG that separates the exit |
| 80 | +nodes from the entry node, the sum of the block masses of the nodes succeeded |
| 81 | +by a cut edge should equal ``UINT64_MAX``. In other words, mass is conserved |
| 82 | +as it "falls" through the DAG. |
| 83 | + |
| 84 | +If a function's basic block graph is a DAG, then block masses are valid block |
| 85 | +frequencies. This works poorly in practise though, since downstream users rely |
| 86 | +on adding block frequencies together without hitting the maximum. |
| 87 | + |
| 88 | +Loop Scale |
| 89 | +========== |
| 90 | + |
| 91 | +Loop scale is a metric that indicates how many times a loop iterates per entry. |
| 92 | +As mass is distributed through the loop's DAG, the (otherwise ignored) backedge |
| 93 | +mass is collected. This backedge mass is used to compute the exit frequency, |
| 94 | +and thus the loop scale. |
| 95 | + |
| 96 | +Implementation: Getting from mass and scale to frequency |
| 97 | +======================================================== |
| 98 | + |
| 99 | +After analysing the complete series of DAGs, each block has a mass (local to |
| 100 | +its containing loop, if any), and each loop psuedo-node has a loop scale and |
| 101 | +its own mass (from its parent's DAG). |
| 102 | + |
| 103 | +We can get an initial frequency assignment (with entry frequency of 1.0) by |
| 104 | +multiplying these masses and loop scales together. A given block's frequency |
| 105 | +is the product of its mass, the mass of containing loops' pseudo nodes, and the |
| 106 | +containing loops' loop scales. |
| 107 | + |
| 108 | +Since downstream users need integers (not floating point), this initial |
| 109 | +frequency assignment is shifted as necessary into the range of ``uint64_t``. |
| 110 | + |
| 111 | +Block Bias |
| 112 | +========== |
| 113 | + |
| 114 | +Block bias is a proposed *absolute* metric to indicate a bias toward or away |
| 115 | +from a given block during a function's execution. The idea is that bias can be |
| 116 | +used in isolation to indicate whether a block is relatively hot or cold, or to |
| 117 | +compare two blocks to indicate whether one is hotter or colder than the other. |
| 118 | + |
| 119 | +The proposed calculation involves calculating a *reference* block frequency, |
| 120 | +where: |
| 121 | + |
| 122 | +* every branch weight is assumed to be 1 (i.e., every branch probability |
| 123 | + distribution is even) and |
| 124 | + |
| 125 | +* loop scales are ignored. |
| 126 | + |
| 127 | +This reference frequency represents what the block frequency would be in an |
| 128 | +unbiased graph. |
| 129 | + |
| 130 | +The bias is the ratio of the block frequency to this reference block frequency. |
0 commit comments