diff --git a/docs/new-regalloc b/docs/new-regalloc
new file mode 100644
index 0000000000000..b687c2b50c6df
--- /dev/null
+++ b/docs/new-regalloc
@@ -0,0 +1,68 @@
+We need to switch to a new register allocator.
+The current one is split into a global and a local register allocator.
+The global one can assign only callee-saved registers and runs
+on the tree-based internal representation: it assigns local variables
+to hardware registers.
+The local one runs on the linear representation on a per-basic-block
+basis and assigns hard registers to virtual registers (which
+hold temporary values during expression evaluation); it also deals
+with the platform-specific issues (fixed registers, call conventions).
+
+Moving to a different register allocator will help solve some of the
+performance issues introduced by the above split, make the allocator more
+easily portable and solve some of the issues generated by dealing with trees.
+
+The general design ideas are below.
+
+The new allocator should have a global view of the whole method, so that it
+can also assign variables to some of the volatile registers where possible,
+even across basic blocks (this would improve performance).
+
+The allocator would be driven by per-arch declarative data, so porting
+should be easier: an architecture needs to specify register classes,
+calling convention and instruction requirements (similar to the gcc code).
+
+The allocator should operate on the linear representation: this way it is
+easier and faster to track usages correctly. We need to assign virtual
+registers on a per-method basis instead of per basic block. We can assign
+virtual registers to variables, too. Note that since we fix the stack offset
+of local vars only after this step (which happens after the burg rules are run),
+some of the burg rules that try to optimize the code won't apply anymore:
+the peephole code may need to be enhanced to do the optimizations instead.
+
+We need to handle floating point registers in the global allocator, too.
+
+The new allocator also needs to keep track precisely of which registers
+contain references or managed pointers to allow us to move to a precise GC.
+
+It may be worth using a single increasing set of integers for the virtual
+registers, with the class of the register stored separately (unlike the
+current local allocator, which keeps integer and fp registers separate).
+
+Since this is a large task, we need to do it in steps as much as possible.
+The first step is to run the register allocator _after_ the burg rules: this
+requires a rewrite of the liveness code, too, to use linear indexes instead
+of basic-block/tree number combinations. This can be done by:
+*) allocating virtual regs to all the locals that can be register allocated
+*) running the burg rules (some may require adjustments): the local virtual
+registers are assigned starting from global-virt-regs+1, instead of the current
+hardware-regs+1, so we can tell apart global and local virt regs.
+*) running the liveness/whatever code is needed to allocate the global registers
+*) allocating the rest of the local variables to stack slots
+*) continuing with the current local allocator
+
+This work could take 2-3 weeks.
+
+The next step is to define the kind of declarative data an architecture needs,
+assign virtual regs to all the registers and make the allocator
+assign from the volatile registers, too.
+Note that some of the code that is currently emitted in the arch-specific
+code will need to be emitted as instructions that the reg allocator
+can inspect: think of a method that returns its first argument, which is
+received in a register: the current code copies it to either a local slot or
+to a global reg in the prolog and copies it back to the return register
+in the basic block, but since neither the regallocator nor the peephole code
+knows about the prolog code, the first store cannot be optimized away.
+The gcc code has some examples of how to specify register classes in a
+declarative way.
+
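As a rough illustration of what per-arch declarative data could look like, here is a minimal sketch in C. Everything in it is an assumption made for the example: the names RegClassDesc, example_arch, volatile_regs and pick_reg, the 8-register layout and the bitmask encoding are all made up, and it copies neither the gcc format nor any existing code. The policy in pick_reg (values live across a call go to callee-saved registers, everything else prefers a volatile register) is only one possible choice.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register classes; names are illustrative only. */
typedef enum { REG_CLASS_INT, REG_CLASS_FP, REG_CLASS_COUNT } RegClass;

/* Declarative description of one register class.
 * Bit i of a mask stands for hardware register i of that class. */
typedef struct {
    const char *name;
    int         num_regs;      /* registers in the class (1..32) */
    uint32_t    callee_saved;  /* preserved across calls */
    uint32_t    fixed;         /* never allocatable (e.g. stack pointer) */
} RegClassDesc;

/* A made-up architecture: r3, r6, r7 are callee-saved, r4 and r5 are fixed. */
static const RegClassDesc example_arch[REG_CLASS_COUNT] = {
    [REG_CLASS_INT] = { "int", 8, 0xC8u, 0x30u },
    [REG_CLASS_FP]  = { "fp",  8, 0x00u, 0x00u },
};

/* Mask with one bit set for every register in the class. */
static uint32_t all_regs(const RegClassDesc *rc)
{
    assert(rc->num_regs > 0 && rc->num_regs <= 32);
    return rc->num_regs == 32 ? 0xFFFFFFFFu : ((1u << rc->num_regs) - 1u);
}

/* Volatile (caller-saved) registers: neither callee-saved nor fixed. */
static uint32_t volatile_regs(const RegClassDesc *rc)
{
    return all_regs(rc) & ~rc->callee_saved & ~rc->fixed;
}

/* Pick a free hardware register, or return -1 to request a spill.
 * in_use has a bit set for every register that is already taken. */
static int pick_reg(const RegClassDesc *rc, uint32_t in_use, int live_across_call)
{
    uint32_t saved = all_regs(rc) & rc->callee_saved & ~rc->fixed & ~in_use;
    uint32_t vol   = volatile_regs(rc) & ~in_use;
    uint32_t candidates = live_across_call ? saved : (vol ? vol : saved);

    for (int i = 0; i < rc->num_regs; i++)
        if (candidates & (1u << i))
            return i;
    return -1;
}
```

With a table like this, the allocator itself contains no arch-specific knowledge: a port would in principle only supply a different example_arch-style table, plus whatever calling-convention and instruction-constraint data turn out to be needed.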