cmd/compile: avoid slow versions of LEA instructions on x86 #21735

Open
@martisch

On newer x86 CPUs (AMD and Intel), 3-operand LEA instructions with base, index, and offset have higher latency and lower throughput than 2-operand LEA instructions.

When emitting instructions, the compiler could rewrite slow LEAs into, e.g., LEA + ADD sequences where possible (that is, where clobbering the flags is OK), similar to how MOV $0 R is rewritten to XOR R R.
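A minimal sketch of the proposed rewrite, for illustration only. The `lea` type and the `lower` function are hypothetical and are not part of cmd/compile; the output strings merely imitate Go assembler syntax. The sketch splits a LEA with base, index, and nonzero offset into a 2-operand LEA plus an ADD, and keeps the original form whenever the flags are still live, since ADD clobbers them and LEA does not.

```go
package main

import "fmt"

// lea models dst = base + index*scale + offset.
// Hypothetical type for illustration; not the compiler's own representation.
type lea struct {
	dst, base, index string // register names; index == "" means no index register
	scale            int    // 1, 2, 4, or 8
	offset           int32
}

// lower returns assembly-like text for l. A LEA with base, index, and a
// nonzero offset is split into LEA + ADD, but only when flagsLive is false.
func lower(l lea, flagsLive bool) []string {
	if l.index == "" {
		// Base plus offset only: already a 2-operand LEA.
		return []string{fmt.Sprintf("LEAQ %d(%s), %s", l.offset, l.base, l.dst)}
	}
	twoOp := fmt.Sprintf("LEAQ (%s)(%s*%d), %s", l.base, l.index, l.scale, l.dst)
	if l.offset == 0 {
		return []string{twoOp}
	}
	if flagsLive {
		// ADD would clobber flags that a later instruction reads; keep the slow form.
		return []string{fmt.Sprintf("LEAQ %d(%s)(%s*%d), %s",
			l.offset, l.base, l.index, l.scale, l.dst)}
	}
	return []string{twoOp, fmt.Sprintf("ADDQ $%d, %s", l.offset, l.dst)}
}

func main() {
	l := lea{dst: "CX", base: "AX", index: "BX", scale: 4, offset: 8}
	for _, flagsLive := range []bool{false, true} {
		fmt.Println("flags live:", flagsLive)
		for _, ins := range lower(l, flagsLive) {
			fmt.Println("\t" + ins)
		}
	}
}
```

Whether the split is profitable on a given microarchitecture is a separate question that this sketch does not model.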

Intel® 64 and IA-32 Architectures Optimization Reference Manual
3.5.1.3 Using LEA

For LEA instructions with three source operands and some specific situations, instruction latency has increased to 3 cycles, and must dispatch via port 1:
— LEA that has all three source operands: base, index, and offset.
— LEA that uses base and index registers where the base is EBP, RBP, or R13.
...
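The two conditions quoted above can be sketched as a predicate. The `addr` type and `isSlowLEA` function are hypothetical names introduced here for illustration. Because the quoted list is truncated, the sketch covers only the two cases shown and should not be read as a complete classification.

```go
package main

import "fmt"

// addr describes a LEA source operand of the form base + index*scale + offset.
// Hypothetical type for illustration only.
type addr struct {
	base, index string // register names; empty means the component is absent
	hasOffset   bool
}

// isSlowLEA reports whether a matches one of the two cases quoted from the
// manual. Cases elided in the quote are not handled.
func isSlowLEA(a addr) bool {
	hasBase, hasIndex := a.base != "", a.index != ""
	if !hasBase || !hasIndex {
		return false
	}
	if a.hasOffset {
		return true // all three source operands: base, index, and offset
	}
	switch a.base {
	case "EBP", "RBP", "R13":
		return true // base and index, where the base is EBP, RBP, or R13
	}
	return false
}

func main() {
	examples := []addr{
		{base: "RAX", index: "RBX", hasOffset: true},
		{base: "RBP", index: "RBX"},
		{base: "RAX", index: "RBX"},
		{base: "RAX", hasOffset: true},
	}
	for _, a := range examples {
		fmt.Printf("%+v slow=%v\n", a, isSlowLEA(a))
	}
}
```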

Relevant LLVM optimization review: https://reviews.llvm.org/D32277

/cc @TocarIP @randall77 @josharian

Metadata

Labels

NeedsFix: The path to resolution is known, but the work has not been done.
Performance
early-in-cycle: A change that should be done early in the 3 month dev cycle.
