This repository was archived by the owner on Jan 23, 2023. It is now read-only.
Documentation/botr/ryujit-overview.md
Lines changed: 31 additions & 8 deletions
Original file line number
Diff line number
Diff line change
@@ -379,27 +379,50 @@ The “@ 15” is the location number of the node. The “0=1” indicates that
379
379
380
380
## <a name="reg-alloc"/>Register allocation
381
381
382
-
The RyuJIT register allocator uses a Linear Scan algorithm, with an approach similar to [[2]](#[2]). In brief, it operates on two main data structures:
382
+
The RyuJIT register allocator uses a Linear Scan algorithm, with an approach similar to [[2]](#[2]). In discussion it is referred to as either `LinearScan` (the name of the implementing class), or LSRA (Linear Scan Register Allocation). In brief, it operates on two main data structures:
383
383
384
384
* `Intervals` (representing live ranges of variables or tree expressions) and `RegRecords` (representing physical registers), both of which derive from `Referenceable`.
385
385
* `RefPositions`, which represent uses or defs (or variants thereof, such as ExposedUses) of either `Intervals` or physical registers.
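
The relationship between these types can be pictured with a minimal sketch. The struct layouts, the `addRef` and `lastLocation` helpers, and the `RefType` enum below are hypothetical simplifications for illustration only; the real JIT types carry many more fields and are not reproduced here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the LSRA types named above.
enum class RefType { Use, Def, Kill };

// One reference to an Interval or a physical register.
struct RefPosition
{
    unsigned location; // node location number, e.g. the "@ 15" in a dump
    RefType  type;
};

// Common base: anything that RefPositions can refer to.
struct Referenceable
{
    std::vector<RefPosition> refPositions; // kept in increasing location order

    void addRef(unsigned location, RefType type)
    {
        // References are built in execution order, so locations never decrease.
        assert(refPositions.empty() || refPositions.back().location <= location);
        refPositions.push_back(RefPosition{location, type});
    }

    unsigned lastLocation() const
    {
        return refPositions.empty() ? 0 : refPositions.back().location;
    }
};

// Live range of a variable or tree expression.
struct Interval : Referenceable
{
    unsigned varNum;
    explicit Interval(unsigned v) : varNum(v) {}
};

// A physical register.
struct RegRecord : Referenceable
{
    unsigned regNum;
    explicit RegRecord(unsigned r) : regNum(r) {}
};
```

Sharing a base type lets the allocator walk one ordered list of references regardless of whether a given reference belongs to a live range or to a physical register (for example, a kill).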
386
386
387
-
Pre-conditions:
387
+
### Notable features of RyuJIT LinearScan
388
388
389
-
* The `NodeInfo` is initialized for each tree node to indicate:
390
-
* Number of registers consumed and produced by the node.
391
-
* Number and type (int versus float) of internal registers required.
389
+
Unlike most register allocators, LSRA performs register allocation on an IR (Intermediate Representation) that is not a direct representation of the target instructions. A given IR node may map to 0, 1 or multiple target instructions. Nodes that are "contained" are handled by code generation as part of their parent node and thus may map to 0 instructions. A simple node will have a 1-to-1 mapping to a target instruction, and a more complex node (e.g. `GT_STORE_BLK`) may map to multiple instructions.
390
+
391
+
### Pre-conditions:
392
+
393
+
It is the job of the `Lowering` phase to transform the IR such that:
394
+
* The nodes are in `LIR` form (i.e. all expression trees have been linearized, and the execution order of the nodes within a BasicBlock is specified by the `gtNext` and `gtPrev` links)
395
+
* All contained nodes are identified (`gtFlags` has the `GTF_CONTAINED` bit set)
396
+
* All nodes for which a register is optional are identified (`gtLsraInfo.regOptional` is `true`)
397
+
  * This is used for x86 and x64 on operands that can be directly consumed from memory if no register is allocated.
398
+
* All unused values (nodes that produce a result that is not consumed) are identified (`gtLIRFlags` has the `LIR::Flags::UnusedValue` bit set)
399
+
  * Since tree temps (the values produced by nodes and consumed by their parent) are expected to be single-def, single-use (SDSU), normally the live range can be determined to end at the use. If there is no use, the register allocator doesn't know where the live range ends.
400
+
* Code can be generated without any context from the parent (consumer) of each node.
401
+
402
+
After `Lowering` has completed, liveness analysis is performed:
403
+
* It identifies which `lclVar`s should have their liveness computed.
404
+
  * The reason this is done after `Lowering` is that `Lowering` can introduce new `lclVar`s.
405
+
* It then does liveness analysis on those `lclVar`s, updating the `bbLiveIn` and `bbLiveOut` sets for each `BasicBlock`.
406
+
  * This tells the register allocator which `lclVar`s are live at block boundaries.
407
+
  * Note that "tree temps" cannot be live at block boundaries.
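
The liveness step described above can be sketched as a standard backward dataflow computation. This is a textbook formulation, not the JIT's actual implementation: the `Block` struct, the `use`/`def` sets, the 64-bit `VarSet`, and `computeLiveness` are all assumptions made for illustration; only the `bbLiveIn` and `bbLiveOut` names come from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Bit i set means tracked lclVar i is in the set (hypothetical encoding).
using VarSet = std::uint64_t;

// Hypothetical, simplified stand-in for a BasicBlock.
struct Block
{
    VarSet           use = 0;       // vars read before any write in the block
    VarSet           def = 0;       // vars written in the block
    std::vector<int> succs;         // indices of successor blocks
    VarSet           bbLiveIn  = 0; // vars live on entry
    VarSet           bbLiveOut = 0; // vars live on exit
};

// Iterate to a fixed point:
//   liveOut(B) = union of liveIn(S) over successors S
//   liveIn(B)  = use(B) | (liveOut(B) & ~def(B))
inline void computeLiveness(std::vector<Block>& blocks)
{
    bool changed = true;
    while (changed)
    {
        changed = false;
        // Visiting blocks in reverse order tends to converge faster.
        for (int i = static_cast<int>(blocks.size()) - 1; i >= 0; --i)
        {
            Block& b   = blocks[i];
            VarSet out = 0;
            for (int s : b.succs)
            {
                assert(s >= 0 && s < static_cast<int>(blocks.size()));
                out |= blocks[s].bbLiveIn;
            }
            VarSet in = b.use | (out & ~b.def);
            if (in != b.bbLiveIn || out != b.bbLiveOut)
            {
                b.bbLiveIn  = in;
                b.bbLiveOut = out;
                changed     = true;
            }
        }
    }
}
```

For a three-block flow graph in which block 0 defines V0, block 1 uses V0, defines V1 and loops on itself, and block 2 uses V1, the sketch reports V0 as live out of block 0 and both V0 and V1 as live out of block 1.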
408
+
409
+
### Allocation Overview
392
410
393
411
Allocation proceeds in 4 phases:
394
412
395
-
* Determine the order in which the `BasicBlocks` will be allocated, and which predecessor of each block will be used to determine the starting location for variables live-in to the `BasicBlock`.
396
-
* Construct Intervals for each tracked lclVar, then walk the `BasicBlocks` in the determined order building `RefPositions` for each register use, def, or kill.
413
+
* Preparation:
414
+
* Determine the order in which the `BasicBlocks` will be allocated, and which predecessor of each block will be used to determine the starting location for variables live-in to the `BasicBlock`.
415
+
* Construct an `Interval` for each `lclVar` that may be enregistered.
416
+
* Construct a `RegRecord` for each physical register.
417
+
* Walk the `BasicBlocks` in the determined order building `RefPositions` for each register use, def, or kill.
418
+
* Just prior to building `RefPosition`s for the node, the `TreeNodeInfoInit()` method is called to determine its register requirements.
397
419
* Allocate the registers by traversing the `RefPositions`.
398
420
* Write back the register assignments, and perform any necessary moves at block boundaries where the allocations don’t match.
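
For readers unfamiliar with linear scan, the allocation phase can be pictured with the classic textbook algorithm over whole live intervals. This is a generic sketch and not RyuJIT's allocator, which works on `RefPositions` and is considerably more involved; `LiveInterval`, `linearScan`, and the convention that `reg == -1` means "spilled" are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical live interval: [start, end] in location numbers.
struct LiveInterval
{
    int start;
    int end;
    int reg; // assigned register, or -1 if spilled
};

// Classic linear scan: process intervals by increasing start, free the
// registers of expired intervals, and when no register is free spill
// whichever candidate ends furthest in the future.
inline void linearScan(std::vector<LiveInterval>& intervals, int numRegs)
{
    assert(numRegs > 0);
    std::sort(intervals.begin(), intervals.end(),
              [](const LiveInterval& a, const LiveInterval& b) { return a.start < b.start; });

    std::vector<int> freeRegs;
    for (int r = numRegs - 1; r >= 0; --r)
    {
        freeRegs.push_back(r); // back of the vector holds the lowest free register
    }

    std::vector<LiveInterval*> active; // intervals currently holding a register
    for (LiveInterval& cur : intervals)
    {
        cur.reg = -1;

        // Expire intervals that ended before the current one starts.
        for (auto it = active.begin(); it != active.end();)
        {
            if ((*it)->end < cur.start)
            {
                freeRegs.push_back((*it)->reg);
                it = active.erase(it);
            }
            else
            {
                ++it;
            }
        }

        if (!freeRegs.empty())
        {
            cur.reg = freeRegs.back();
            freeRegs.pop_back();
            active.push_back(&cur);
            continue;
        }

        // No free register: find the active interval that ends last.
        auto victim = std::max_element(active.begin(), active.end(),
                                       [](const LiveInterval* a, const LiveInterval* b) { return a->end < b->end; });
        if ((*victim)->end > cur.end)
        {
            cur.reg        = (*victim)->reg; // take over the victim's register
            (*victim)->reg = -1;             // the victim is spilled
            *victim        = &cur;
        }
        // Otherwise the current interval itself stays spilled.
    }
}
```

With two registers and intervals [0,10], [1,3], [2,8] and [4,6], the longest interval is spilled when the third one arrives, and the fourth reuses the register freed when the second expires.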
399
421
400
422
Post-conditions:
401
423
402
-
* The `gtRegNum` property of all `GenTree` nodes that require a register has been set to a valid register number.
424
+
* The `gtRegNum` property of all `GenTree` nodes that require a register has been set to a valid register number.
425
+
* For reg-optional nodes, the `GTF_NOREG_AT_USE` bit is set in `gtFlags` if a register was not allocated.
403
426
* The `gtRsvdRegs` field (a set/mask of registers) has the requested number of registers specified for internal use.
404
427
* All spilled values (lclVar or expression) are marked with `GTF_SPILL` at their definition. For lclVars, they are also marked with `GTF_SPILLED` at any use at which the value must be reloaded.