This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Commit d6201ca

Lsra Documentation
Update LinearScan section of ryujit-overview.md, and add lsra-detail.md
1 parent 12c188f commit d6201ca

2 files changed

+991
-8
lines changed

Documentation/botr/ryujit-overview.md

Lines changed: 31 additions & 8 deletions
Original file line number | Diff line number | Diff line change
@@ -379,27 +379,50 @@ The “@ 15” is the location number of the node. The “0=1” indicates that
379379

380380
## <a name="reg-alloc"/>Register allocation
381381

382-
The RyuJIT register allocator uses a Linear Scan algorithm, with an approach similar to [[2]](#[2]). In brief, it operates on two main data structures:
382+
The RyuJIT register allocator uses a Linear Scan algorithm, with an approach similar to [[2]](#[2]). In discussion it is referred to as either `LinearScan` (the name of the implementing class), or LSRA (Linear Scan Register Allocation). In brief, it operates on two main data structures:
383383

384384
* `Intervals` (representing live ranges of variables or tree expressions) and `RegRecords` (representing physical registers), both of which derive from `Referenceable`.
385385
* `RefPositions`, which represent uses or defs (or variants thereof, such as ExposedUses) of either `Intervals` or physical registers.
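
The relationship between these types can be pictured with a minimal sketch. Only the names `Interval`, `RegRecord`, `Referenceable` and `RefPosition` come from the text above; every member, the `RefType` enum and the `addRef` helper are hypothetical simplifications and do not reflect the actual JIT declarations.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified, hypothetical stand-ins for the LSRA data structures.
// Only the type names come from the text; every member is illustrative.

enum class RefType { Def, Use, Kill };

struct Referenceable;

// One reference to an Interval or a physical register at a given location.
struct RefPosition
{
    unsigned       location; // position in the linear node order
    RefType        type;
    Referenceable* referent; // filled in by addRef
};

// Common base for anything a RefPosition can refer to.
struct Referenceable
{
    std::vector<RefPosition*> refs; // kept in increasing location order
    virtual ~Referenceable() = default;
    virtual bool isPhysicalRegister() const = 0;
};

// Live range of a variable or tree expression.
struct Interval : Referenceable
{
    std::string name;
    explicit Interval(std::string n) : name(std::move(n)) {}
    bool isPhysicalRegister() const override { return false; }
};

// A physical register.
struct RegRecord : Referenceable
{
    int regNum;
    explicit RegRecord(int r) : regNum(r) {}
    bool isPhysicalRegister() const override { return true; }
};

// Attach a RefPosition to the thing it references (hypothetical helper).
void addRef(Referenceable& target, RefPosition& ref)
{
    assert(target.refs.empty() || target.refs.back()->location <= ref.location);
    ref.referent = &target;
    target.refs.push_back(&ref);
}
```

Because both `Interval` and `RegRecord` share a base, a def of a variable and a kill of a physical register can be recorded through the same `RefPosition` shape.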
386386

387-
Pre-conditions:
387+
### Notable features of RyuJIT LinearScan
388388

389-
* The `NodeInfo` is initialized for each tree node to indicate:
390-
* Number of registers consumed and produced by the node.
391-
* Number and type (int versus float) of internal registers required.
389+
Unlike most register allocators, LSRA performs register allocation on an IR (Intermediate Representation) that is not a direct representation of the target instructions. A given IR node may map to 0, 1, or multiple target instructions. Nodes that are "contained" are handled by code generation as part of their parent node and thus may map to 0 instructions. A simple node will have a 1-to-1 mapping to a target instruction, and a more complex node (e.g. `GT_STORE_BLK`) may map to multiple instructions.
390+
391+
### Pre-conditions:
392+
393+
It is the job of the `Lowering` phase to transform the IR such that:
394+
* The nodes are in `LIR` form (i.e. all expression trees have been linearized, and the execution order of the nodes within a BasicBlock is specified by the `gtNext` and `gtPrev` links)
395+
* All contained nodes are identified (`gtFlags` has the `GTF_CONTAINED` bit set)
396+
* All nodes for which a register is optional are identified (`gtLsraInfo.regOptional` is `true`)
397+
    * This is used for x86 and x64 on operands that can be directly consumed from memory if no register is allocated.
398+
* All unused values (nodes that produce a result that is not consumed) are identified (`gtLIRFlags` has the `LIR::Flags::UnusedValue` bit set)
399+
    * Since tree temps (the values produced by nodes and consumed by their parent) are expected to be single-def, single-use (SDSU), normally the live range can be determined to end at the use. If there is no use, the register allocator doesn't know where the live range ends.
400+
* Code can be generated without any context from the parent (consumer) of each node.
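
The SDSU property and the need for the unused-value marking can be illustrated with a small sketch. This is not the JIT's implementation: the `Node` type, its fields, and `computeRangeEnds` are hypothetical, and for simplicity every node is assumed to produce a value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node in a linearized block: operands are indices of
// earlier nodes whose values this node consumes.
struct Node
{
    std::vector<std::size_t> operands;
    bool                     unusedValue; // set when no node consumes the result
};

// For each node's tree temp, return the location at which its live range
// ends: the single use if there is one, otherwise the def itself.
// Also marks nodes whose value is never consumed.
std::vector<std::size_t> computeRangeEnds(std::vector<Node>& nodes)
{
    std::vector<std::size_t> end(nodes.size());
    std::vector<bool>        hasUse(nodes.size(), false);

    for (std::size_t i = 0; i < nodes.size(); i++)
    {
        end[i] = i; // until a use is seen, the range ends at the def
        for (std::size_t op : nodes[i].operands)
        {
            assert(op < i);      // operands are defined earlier in the order
            assert(!hasUse[op]); // single-use: at most one consumer
            hasUse[op] = true;
            end[op]    = i;
        }
    }

    for (std::size_t i = 0; i < nodes.size(); i++)
    {
        nodes[i].unusedValue = !hasUse[i];
    }
    return end;
}
```

Without the explicit marking, a consumer of this information could not distinguish a value whose use has simply not been reached yet from one that will never be used.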
401+
402+
After `Lowering` has completed, liveness analysis is performed:
403+
* It identifies which `lclVar`s should have their liveness computed.
404+
    * This is done after `Lowering` because `Lowering` can introduce new `lclVar`s.
405+
* It then does liveness analysis on those `lclVar`s, updating the `bbLiveIn` and `bbLiveOut` sets for each `BasicBlock`.
406+
    * This tells the register allocator which `lclVar`s are live at block boundaries.
407+
    * Note that "tree temps" cannot be live at block boundaries.
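
As a rough illustration of what computing live-in and live-out sets involves, the following sketch runs a textbook backward dataflow to a fixed point. It is an assumption-laden simplification, not the JIT's liveness code: the `Block` type, the `VarSet` bit mask, and the `use`/`def`/`succs` fields are hypothetical, and only the `bbLiveIn`/`bbLiveOut` names echo the text above.

```cpp
#include <cstdint>
#include <vector>

// Bit i set means tracked local i is in the set (hypothetical encoding).
using VarSet = std::uint64_t;

struct Block
{
    VarSet           use;       // locals read before any write in the block
    VarSet           def;       // locals written in the block
    std::vector<int> succs;     // indices of successor blocks
    VarSet           bbLiveIn;  // result: live on entry
    VarSet           bbLiveOut; // result: live on exit
};

// Classic backward liveness:
//   liveOut(B) = union of liveIn(S) over successors S
//   liveIn(B)  = use(B) | (liveOut(B) & ~def(B))
// iterated until nothing changes.
void computeLiveness(std::vector<Block>& blocks)
{
    bool changed = true;
    while (changed)
    {
        changed = false;
        for (int i = static_cast<int>(blocks.size()) - 1; i >= 0; i--)
        {
            Block& b   = blocks[i];
            VarSet out = 0;
            for (int s : b.succs)
            {
                out |= blocks[s].bbLiveIn;
            }
            VarSet in = b.use | (out & ~b.def);
            if (in != b.bbLiveIn || out != b.bbLiveOut)
            {
                b.bbLiveIn  = in;
                b.bbLiveOut = out;
                changed     = true;
            }
        }
    }
}
```

Visiting blocks in reverse order is only a convergence heuristic here; the fixed point is the same for any visiting order.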
408+
409+
### Allocation Overview
392410

393411
Allocation proceeds in 4 phases:
394412

395-
* Determine the order in which the `BasicBlocks` will be allocated, and which predecessor of each block will be used to determine the starting location for variables live-in to the `BasicBlock`.
396-
* Construct Intervals for each tracked lclVar, then walk the `BasicBlocks` in the determined order building `RefPositions` for each register use, def, or kill.
413+
* Preparation:
414+
    * Determine the order in which the `BasicBlocks` will be allocated, and which predecessor of each block will be used to determine the starting location for variables live-in to the `BasicBlock`.
415+
    * Construct an `Interval` for each `lclVar` that may be enregistered.
416+
    * Construct a `RegRecord` for each physical register.
417+
* Walk the `BasicBlocks` in the determined order building `RefPositions` for each register use, def, or kill.
418+
    * Just prior to building `RefPosition`s for the node, the `TreeNodeInfoInit()` method is called to determine its register requirements.
397419
* Allocate the registers by traversing the `RefPositions`.
398420
* Write back the register assignments, and perform any necessary moves at block boundaries where the allocations don’t match.
399421

400422
Post-conditions:
401423

402-
* The `gtRegNum` property of all `GenTree` nodes that require a register has been set to a valid register number.
424+
* The `gtRegNum` property of all `GenTree` nodes that require a register has been set to a valid register number.
425+
* For reg-optional nodes, the `GTF_NOREG_AT_USE` bit is set in `gtFlags` if a register was not allocated.
403426
* The `gtRsvdRegs` field (a set/mask of registers) has the requested number of registers specified for internal use.
404427
* All spilled values (lclVar or expression) are marked with `GTF_SPILL` at their definition. lclVars are additionally marked with `GTF_SPILLED` at each use where the value must be reloaded.
405428
* For all lclVars that are register candidates:
