Skip to content

Latest commit

 

History

History
152 lines (131 loc) · 17.2 KB

GlRegDeps.md

File metadata and controls

152 lines (131 loc) · 17.2 KB

Global Register Dependency

GlRegDeps is an IL level construct that models code generator register dependencies. It associates virtual registers with real registers, or pairs virtual registers with real registers.

Register dependencies enforce constraints. They represent that a particular value (IL node) has to be in a particular real register at a particular point in the program. These particular points in the program are usually entry or exit corresponding to basic blocks or extended basic blocks where there is control flow. Fundamentally we can think of global register dependencies as a way of conveying information within real (hardware) registers across block boundaries.

Most common control flow opcodes are: if, goto, return, athrow, switch. If present, they must be the last tree before the BBEnd. They should have their own GlRegDeps for the branch target if required. The GlRegDeps under BBEnd is for the fall through case. It is important to make sure the right value is placed in the right real register at the point where the GlRegDeps is evaluated. The GlRegDeps with xRegLoads are created for BBStart that must be consistent with the GlRegDeps stated on all the predecessor blocks.

If a value that is expected to be in a register is not in the register, or the register is not preserved if it is a call, it leads to register shuffling. The local register allocator has to shuffle it to a different register to make sure it has the right value in GlRegDeps. Register shuffling does come at a cost. It should be avoided when possible.

Reference counts are a general concept that is used within the codegen for nodes and their associated virtual registers. Global Register Allocation (GRA) related opcodes such as GlRegDeps use reference counts to determine when a register is no longer needed. It must be set up correctly, otherwise it could result in registers being overwritten with some other values prematurely, or the value being kept in a register longer than necessary. In the following example, n190n is referenced 5 times but its reference count is set up incorrectly as 6 (rc=6).

n541n     treetop                                                                             [0x7f66870b98d0] bci=[1,29,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n190n       iand (X>=0 cannotOverflow )                                                       [0x7f66870b2b20] bci=[1,29,-] rc=6 vc=1095 vn=- li=- udi=- nc=2 flg=0x1100
n185n         iadd (X>=0 cannotOverflow )                                                     [0x7f66870b2990] bci=[1,21,-] rc=1 vc=1095 vn=- li=- udi=- nc=2 flg=0x1100
n183n           iload  n<auto slot 5>[#370  Auto] [flags 0x3 0x0 ] (X>=0 cannotOverflow )     [0x7f66870b28f0] bci=[1,18,-] rc=1 vc=1095 vn=- li=- udi=34 nc=0 flg=0x1100
n184n           iconst -1 (X!=0 X<=0 )                                                        [0x7f66870b2940] bci=[1,20,-] rc=1 vc=1095 vn=- li=- udi=- nc=0 flg=0x204
n505n         ==>iRegLoad
n542n     treetop                                                                             [0x7f66870b9920] bci=[1,17,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n506n       ==>aRegLoad
n561n     aRegStore ecx                                                                       [0x7f66870b9f10] bci=[-1,0,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n560n       aconst NULL                                                                       [0x7f66870b9ec0] bci=[-1,0,-] rc=1 vc=0 vn=- li=- udi=- nc=0
n568n     iRegStore edx                                                                       [0x7f66870ba140] bci=[1,29,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n190n       ==>iand
n584n     ificmpne --> block_57 BBStart at n563n ()                                           [0x7f66870ba640] bci=[1,30,-] rc=0 vc=0 vn=- li=- udi=- nc=3 flg=0x20
n582n       iand                                                                              [0x7f66870ba5a0] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=2
n580n         iloadi  <isClassFlags>[#287  Shadow +36] [flags 0x603 0x0 ]                     [0x7f66870ba500] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=1
n579n           aloadi  <componentClass>[#282  Shadow +88] [flags 0x10607 0x0 ]               [0x7f66870ba4b0] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=1
n578n             aloadi  <vft-symbol>[#288  Shadow] [flags 0x18607 0x0 ]                     [0x7f66870ba460] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=1
n506n               ==>aRegLoad
n581n         iconst 1024                                                                     [0x7f66870ba550] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=0
n583n       iconst 0                                                                          [0x7f66870ba5f0] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=0
n585n       GlRegDeps ()                                                                      [0x7f66870ba690] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
n506n         ==>aRegLoad
n505n         ==>iRegLoad
n586n         PassThrough rdx                                                                 [0x7f66870ba6e0] bci=[1,29,-] rc=1 vc=0 vn=- li=- udi=- nc=1
n190n           ==>iand
n574n     BBEnd </block_33>                                                                   [0x7f66870ba320] bci=[1,30,-] rc=0 vc=0 vn=- li=- udi=- nc=0
n573n     BBStart <block_58> (freq 2226) (extension of previous block)                        [0x7f66870ba2d0] bci=[1,30,-] rc=0 vc=0 vn=- li=- udi=- nc=0
n576n     NULLCHK on n506n [#32]                                                              [0x7f66870ba3c0] bci=[1,17,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n575n       arraylength (stride 8)                                                            [0x7f66870ba370] bci=[1,17,-] rc=2 vc=0 vn=- li=- udi=- nc=1
n506n         ==>aRegLoad
n577n     BNDCHK [#1]                                                                         [0x7f66870ba410] bci=[1,17,-] rc=0 vc=0 vn=- li=- udi=- nc=2
n575n       ==>arraylength
n190n       ==>iand
n550n     treetop                                                                             [0x7f66870b9ba0] bci=[1,30,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n549n       aloadi  <array-shadow>[#229  Shadow] [flags 0x80000607 0x0 ]                      [0x7f66870b9b50] bci=[1,30,-] rc=3 vc=0 vn=- li=- udi=- nc=1
n548n         aladd                                                                           [0x7f66870b9b00] bci=[1,17,-] rc=1 vc=0 vn=- li=- udi=- nc=2
n506n           ==>aRegLoad
n547n           ladd (X>=0 )                                                                  [0x7f66870b9ab0] bci=[1,29,-] rc=1 vc=0 vn=- li=- udi=- nc=2 flg=0x100
n545n             lshl                                                                        [0x7f66870b9a10] bci=[1,29,-] rc=1 vc=0 vn=- li=- udi=- nc=2
n543n               i2l                                                                       [0x7f66870b9970] bci=[1,29,-] rc=1 vc=0 vn=- li=- udi=- nc=1
n190n                 ==>iand
n544n               iconst 3                                                                  [0x7f66870b99c0] bci=[-1,0,-] rc=1 vc=0 vn=- li=- udi=- nc=0
n546n             lconst 24                                                                   [0x7f66870b9a60] bci=[-1,0,-] rc=1 vc=0 vn=- li=- udi=- nc=0
n572n     aRegStore ecx                                                                       [0x7f66870ba280] bci=[1,30,-] rc=0 vc=0 vn=- li=- udi=- nc=1
n549n       ==>aloadi
n564n     BBEnd </block_58> ===== ()                                                          [0x7f66870ba000] bci=[1,30,-] rc=0 vc=0 vn=- li=- udi=- nc=1 flg=0x20
n562n       GlRegDeps ()                                                                      [0x7f66870b9f60] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=3 flg=0x20
n506n         ==>aRegLoad
n505n         ==>iRegLoad
n587n         PassThrough rcx                                                                 [0x7f66870ba730] bci=[1,30,-] rc=1 vc=0 vn=- li=- udi=- nc=1
n549n           ==>aloadi

The children of the GlRegDeps node that represent new values and are assigned to new global registers in the extended block must have been evaluated before GlRegDeps. The job of GlRegDeps is to take values that are already in virtual registers and simply map them to real registers. GlRegDeps is only about generating the register dependency. It should not be about generating any other instructions.

The only other instruction that a GlRegDeps evaluation can result in is a register-register move when the same value (same IL node) goes out in two (or more) different global registers. In such a case the same virtual register cannot be mapped to two (or more) real registers in the codegen register dependency since this is a condition that cannot be satisfied. By adding a new virtual register and inserting a register-register copy instruction, the codegen register dependency can use different virtual/real registers. Note that a register-register copy instruction does not result in any arithmetic or setting of control registers. Therefore, it is a safe instruction to execute between the compare and the branch for an if opcode where the GlRegDeps is present.

The following example of GlRegDeps will cause issues. It does the aload operations (n111n and n106n) and the add operation (n110n) in the middle of laying down register dependencies. These actual tree evaluations that arise due to the register dependencies could end up interfering with branching.

n803n     ificmpne --> block_44 BBStart at n781n ()                                           [0x7f9d0f08eab0] bci=[-1,119,600] rc=0 vc=1295 vn=- li=-1 udi=- nc=3 flg=0x20
n801n       iand                                                                              [0x7f9d0f08ea10] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=2
n799n         iloadi  <isClassFlags>[#287  Shadow +36] [flags 0x603 0x0 ]                     [0x7f9d0f08e970] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n798n           aloadi  <componentClass>[#282  Shadow +88] [flags 0x10607 0x0 ]               [0x7f9d0f08e920] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n797n             aloadi  <vft-symbol>[#288  Shadow] [flags 0x18607 0x0 ]                     [0x7f9d0f08e8d0] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n106n               ==>aload
n800n         iconst 1024                                                                     [0x7f9d0f08e9c0] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=0
n802n       iconst 0                                                                          [0x7f9d0f08ea60] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=0
n804n       GlRegDeps ()                                                                      [0x7f9d0f08eb00] bci=[-1,111,600] rc=1 vc=1295 vn=- li=- udi=- nc=3 flg=0x20
n805n         PassThrough rcx                                                                 [0x7f9d0f08eb50] bci=[-1,117,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n111n           aload  tmp<auto slot 4>[#358  Auto] [flags 0x7 0x0 ]                          [0x7f9d0f081270] bci=[-1,117,600] rc=6 vc=1295 vn=- li=1 udi=70 nc=0
n806n         PassThrough r8                                                                  [0x7f9d0f08eba0] bci=[-1,116,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n110n           iadd (cannotOverflow )                                                        [0x7f9d0f081220] bci=[-1,116,600] rc=5 vc=1295 vn=- li=1 udi=- nc=2 flg=0x1000
n107n             ==>arraylength
n109n             iconst -1 (X!=0 X<=0 )                                                      [0x7f9d0f0811d0] bci=[-1,115,600] rc=1 vc=1295 vn=- li=- udi=- nc=0 flg=0x204
n807n         PassThrough rdi                                                                 [0x7f9d0f08ebf0] bci=[-1,111,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n106n           ==>aload
n796n     BBEnd </block_13>                                                                   [0x7f9d0f08e880] bci=[-1,119,600] rc=0 vc=1295 vn=- li=-1 udi=- nc=0

These nodes must be anchored under xRegStore earlier in the block. By anchoring these nodes earlier in the block, it ensures that they get evaluated earlier than GlRegDeps so that it does not interfere with the control flow. For example, n106n is now anchored under aRegStore edi. n111n is anchored under aRegStore ecx. n110n is anchored under iRegStore r8d.

n835n     aRegStore edi                                                                       [0x7fe8a708f4b0] bci=[-1,111,600] rc=0 vc=1295 vn=- li=-1 udi=- nc=1
n106n       ==>aload
n838n     iRegStore r8d                                                                       [0x7fe8a708f5a0] bci=[-1,116,600] rc=0 vc=1295 vn=- li=-1 udi=- nc=1
n110n       iadd (cannotOverflow )                                                            [0x7fe8a7081220] bci=[-1,116,600] rc=4 vc=1295 vn=- li=1 udi=- nc=2 flg=0x1000
n107n         ==>arraylength
n109n         iconst -1 (X!=0 X<=0 )                                                          [0x7fe8a70811d0] bci=[-1,115,600] rc=1 vc=1295 vn=- li=- udi=- nc=0 flg=0x204
n841n     aRegStore ecx                                                                       [0x7fe8a708f690] bci=[-1,117,600] rc=0 vc=1295 vn=- li=-1 udi=- nc=1
n111n       aload  tmp<auto slot 4>[#358  Auto] [flags 0x7 0x0 ]                              [0x7fe8a7081270] bci=[-1,117,600] rc=3 vc=1295 vn=- li=1 udi=70 nc=0
n866n     ificmpne --> block_48 BBStart at n832n ()                                           [0x7fe8a708fe60] bci=[-1,119,600] rc=0 vc=1295 vn=- li=-1 udi=- nc=3 flg=0x20
n864n       iand                                                                              [0x7fe8a708fdc0] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=2
n862n         iloadi  <isClassFlags>[#287  Shadow +36] [flags 0x603 0x0 ]                     [0x7fe8a708fd20] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n861n           aloadi  <componentClass>[#282  Shadow +88] [flags 0x10607 0x0 ]               [0x7fe8a708fcd0] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n860n             aloadi  <vft-symbol>[#288  Shadow] [flags 0x18607 0x0 ]                     [0x7fe8a708fc80] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n106n               ==>aload
n863n         iconst 1024                                                                     [0x7fe8a708fd70] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=0
n865n       iconst 0                                                                          [0x7fe8a708fe10] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=0
n867n       GlRegDeps ()                                                                      [0x7fe8a708feb0] bci=[-1,119,600] rc=1 vc=1295 vn=- li=- udi=- nc=3 flg=0x20
n868n         PassThrough rcx                                                                 [0x7fe8a708ff00] bci=[-1,117,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n111n           ==>aload
n869n         PassThrough r8                                                                  [0x7fe8a708ff50] bci=[-1,116,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n110n           ==>iadd
n870n         PassThrough rdi                                                                 [0x7fe8a708ffa0] bci=[-1,111,600] rc=1 vc=1295 vn=- li=- udi=- nc=1
n106n           ==>aload
n854n     BBEnd </block_13>

The xRegStore serves the purpose of anchoring those children that are honored and ensuring that they get evaluated before the GlRegDeps gets encountered.

However, the xRegStore does not mandate the value has to be in the right real register at the point of xRegStore. It is not responsible for putting the value into the register. It is not actually going to do anything at that point other than evaluates the child and associates it with a virtual register.

The actual move of that value into the right real register is dictated by GlRegDeps at the point of transitioning from one block to another. The local register allocator in the codegen happens backwards. It walks backwards through the instructions. It will see GlRegDeps first, or the result of GlRegDeps first. It associates the virtual register with the real register the first time.

Check IL Rules

The IL rules can be checked by enabling TR_EnableParanoidOptCheck, which does IL validation, tree verification, block and CFG verification at the end of the optimization to catch structural problems at the compile time.