Compressed instruction encodings #72

stoklund · 2017-04-04T17:55:21Z

It is common for instruction set architectures to provide alternative encodings for common instructions that use less code space. The compressed encodings have additional constraints on the range of immediate operands and sometimes tighter register operand constraints too.

RISC-V has the RVC extension which provides a 16-bit encoding for some instructions.
ARM's 32-bit Thumb-2 instruction set has 16-bit versions of some instructions.
Intel instructions often have 8-bit and 32-bit immediate operand variants.
64-bit Intel instructions can sometimes be encoded without a REX prefix if the register operands are right.
Intel iadd can be encoded as an lea or as a smaller add instruction which has tied register operands.

Representation

Cretonne represents encoding variants of the same instruction using separate encoding recipes and bits. This also means that each recipe maps to an exact instruction size in bytes. All of the valid encodings for an instruction appear in the encoding tables used by TargetIsa::encode(). The legalizer will always pick the first encoding with a matching predicate, but there can be more encodings after that.

We should add a TargetIsa method that makes it possible to enumerate all of the valid encodings for an instruction. Probably by returning an iterator.

Selecting the best encoding

Sometimes, the legalizer is able to select a compact encoding up front. This is the case with Intel 8-bit immediates when the immediate operand is known, for example. Often, the legalizer does not have enough information to determine if a compact encoding would be valid:

The immediate operand may encode a stack frame offset, and the location of stack slots has not yet been determined.
The immediate operand my be a branch displacement, and we don't know the exact position of EBB headers yet.
The compact encoding would constrain the register allocator too much. This goes for both RVC and Thumb-2 encodings as well as REX prefix omission for x86-64. It is better to start with the unconstrained encoding and then decay to the compact encoding after register allocation.

Instruction compression pass

The legalizer selects a compact encoding when it has enough information to do so, and it won't constrain the register allocator.

The instruction compression pass is a pure optimization pass — it can be skipped without violating correctness. It runs late when we have more information:

Registers have been assigned so the tighter register operand constraints of compact encodings can be checked.
The stack frame layout is complete, so load/store offsets are known for stack accesses.

Compressing instructions obviously changes code size, so the final offsets of EBB headers are not known, and so branch displacements are not yet known.

The text was updated successfully, but these errors were encountered:

Compute exact EBB header offsets and check that branches are in range. Not implemented yet: Relax branches that are not in range. Invoke the relax_branches() pass from the 'test binemit' file tests so they can verify the proper encoding of branch instructions too.

angusholder · 2017-09-15T20:23:01Z

@stoklund I'd like to get back into the project, could I start on the compressed RISC-V encodings?

stoklund · 2017-09-15T20:54:52Z

Sure, go for it!

stoklund · 2017-09-22T15:12:20Z

@angusholder has just landed compressed instruction support for the binemit test driver. The exact same mechanism should form the basis for the instruction compression pass.

sunfishcode · 2018-04-09T16:05:24Z

Another possible approach here would be to change the concept of a recipe such that recipes have a fixed maximum size, with their actual size being computed, considering their immediates and register assignments. That would mean we wouldn't need separate REX and non-REX versions of many x86-64 recipes, for example, and we wouldn't need a separate pass to update all the encodings after register allocation.

Having a separate pass has the advantage of making it easy to disable, for debugging, or for minimizing compile time in the case where we don't care about optimizing the encodings. However, having it integrated would make it easier for the register allocator to ask an instruction encoding what registers it prefers. And it might have slightly faster compile time for the case where we do want to optimize the encodings. So it's worth considering.

sunfishcode · 2018-04-23T23:22:37Z

1bb06bf implements the shrink-instructions pass.

I decided not to do the idea above, as it's simpler to have a separate pass for now, though we could revisit this in the future.

stoklund mentioned this issue Apr 4, 2017

Branch resolution and relaxation #73

Closed

stoklund mentioned this issue Jul 31, 2017

Test compressed encodings #128

Closed

stoklund mentioned this issue Nov 7, 2017

Native: rex prefix is on almost every instruction #186

Closed

sunfishcode added the goal:optimize-speed Focus area: the speed of the code produced by Cranelift. label Nov 8, 2017

sunfishcode closed this as completed Apr 23, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Compressed instruction encodings #72

Compressed instruction encodings #72

stoklund commented Apr 4, 2017

angusholder commented Sep 15, 2017

stoklund commented Sep 15, 2017

stoklund commented Sep 22, 2017

sunfishcode commented Apr 9, 2018

sunfishcode commented Apr 23, 2018

Compressed instruction encodings #72

Compressed instruction encodings #72

Comments

stoklund commented Apr 4, 2017

Representation

Selecting the best encoding

Instruction compression pass

angusholder commented Sep 15, 2017

stoklund commented Sep 15, 2017

stoklund commented Sep 22, 2017

sunfishcode commented Apr 9, 2018

sunfishcode commented Apr 23, 2018