AArch64 has instructions for converting a float to an int with a variety of rounding modes (fcvtns/fcvtnu, fcvtas/fcvtau, fcvtps/fcvtpu, fcvtms/fcvtmu, etc.). When a call to a floating-point rounding intrinsic (floor, ceil, round, rint, trunc) is immediately followed by an fptoui or fptosi, the pair should be combined into a single rounding-conversion instruction.
From my testing, LLVM currently performs this optimization for scalar floor, ceil, round, and trunc; it is missing for rint and for vector operations (including autovectorized ones).
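For reference, here is a minimal scalar sketch of the pattern, along the lines of the demo below (the exact bodies and flags are my reconstruction, assuming clang -O2 for an aarch64 target; the linked demo may differ). The comments note the lowering described in this report:

```c
#include <math.h>
#include <stdint.h>

uint64_t floor_to_int(double x) { return (uint64_t)floor(x); } /* fcvtmu: combined */
uint64_t ceil_to_int(double x)  { return (uint64_t)ceil(x);  } /* fcvtpu: combined */
uint64_t round_to_int(double x) { return (uint64_t)round(x); } /* fcvtau: combined */
uint64_t trunc_to_int(double x) { return (uint64_t)trunc(x); } /* fcvtzu: combined */

/* Not combined: lowers to frintx + fcvtzu rather than a single fcvtnu. */
uint64_t round_to_int_ties_even(double x) { return (uint64_t)rint(x); }
```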
Here's a Compiler Explorer demo. The round_to_int, floor_to_int, ceil_to_int, and trunc_to_int functions each compile down to a single fcvt[mode]u. However, round_to_int_ties_even compiles down to frintx + fcvtzu instead of a single fcvtnu. All of the four-at-a-time functions are autovectorized, and likewise compile down to a vector frint[mode] + fcvtzu pair instead of a single vector fcvt[mode]u.
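For the vector case, a sketch of one of the four-at-a-time functions (the name and body here are my own illustration; the demo's versions may differ):

```c
#include <math.h>
#include <stdint.h>

/* Autovectorizes today to frintm v.4s followed by fcvtzu v.4s, rather
 * than fusing into a single fcvtmu v.4s. The restrict qualifiers let
 * the vectorizer assume the buffers don't alias. */
void floor_to_int_4x(const float *restrict in, uint32_t *restrict out) {
    for (int i = 0; i < 4; ++i)
        out[i] = (uint32_t)floorf(in[i]);
}
```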