AArch64 has instructions for converting a float to an int with a variety of rounding modes (fcvtns/fcvtnu, fcvtas/fcvtau, fcvtps/fcvtpu, fcvtms/fcvtmu, etc.). When a call to a floating-point rounding intrinsic (floor, ceil, round, rint, trunc) is immediately followed by an fptoui or fptosi, the pair should be combined into a single rounding-conversion instruction.
From my testing, LLVM currently performs this optimization for scalar floor, ceil, round, and trunc; it is missing for rint and for vector operations (including autovectorized ones).
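For reference, here is a minimal scalar sketch of the pattern, along the lines of the demo below (the exact bodies and flags are my reconstruction, assuming clang -O2 for an aarch64 target; the linked demo may differ). The comments note the lowering described in this report:

```c
#include <math.h>
#include <stdint.h>

uint64_t floor_to_int(double x) { return (uint64_t)floor(x); } /* fcvtmu: combined */
uint64_t ceil_to_int(double x)  { return (uint64_t)ceil(x);  } /* fcvtpu: combined */
uint64_t round_to_int(double x) { return (uint64_t)round(x); } /* fcvtau: combined */
uint64_t trunc_to_int(double x) { return (uint64_t)trunc(x); } /* fcvtzu: combined */

/* Not combined: lowers to frintx + fcvtzu rather than a single fcvtnu. */
uint64_t round_to_int_ties_even(double x) { return (uint64_t)rint(x); }
```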
Here's a Compiler Explorer demo. The round_to_int, floor_to_int, ceil_to_int, and trunc_to_int functions each compile down to a single fcvt[mode]u. However, round_to_int_ties_even compiles down to frintx + fcvtzu instead of a single fcvtnu. All of the four-at-a-time functions are autovectorized, and likewise compile down to a vector frint[mode] + fcvtzu pair instead of a single vector fcvt[mode]u.
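For the vector case, a sketch of one of the four-at-a-time functions (the name and body here are my own illustration; the demo's versions may differ):

```c
#include <math.h>
#include <stdint.h>

/* Autovectorizes today to frintm v.4s followed by fcvtzu v.4s, rather
 * than fusing into a single fcvtmu v.4s. The restrict qualifiers let
 * the vectorizer assume the buffers don't alias. */
void floor_to_int_4x(const float *restrict in, uint32_t *restrict out) {
    for (int i = 0; i < 4; ++i)
        out[i] = (uint32_t)floorf(in[i]);
}
```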