Open
Description
On newer x86 cpus (amd and intel) 3 operand LEA instructions with base, index and offset have a higher latency and less throughput than 2 operand LEA instructions.
The compiler when emitting the instructions could rewrite slow leas into e.g. LEA + ADD instructions where possible (flag clobbering ok) similar how MOV $0 R is rewritten to XOR R R.
Intel® 64 and IA-32 Architectures Optimization Reference Manual
3.5.1.3 Using LEA
For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1:
— LEA that has all three source operands: base, index, and offset.
— LEA that uses base and index registers where the base is EBP, RBP, or R13.
...
relevant llvm optimization ticket: https://reviews.llvm.org/D32277