Some intrinsics (such as population count, FMA and rounding) are emitted by the compiler but are not present in the base ISA of the CPU the compiler is targeting. In these cases the compiler emits code to guard the execution of the instruction and fall back to a slower implementation if the required CPU feature is not present.
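As a source-level illustration of the guard-and-fallback shape described above, here is a minimal sketch. The flag `hasPOPCNT` and the helper `popCountSlow` are hypothetical names invented for this example; the compiler emits the equivalent check in its own internal representation rather than as Go source.

```go
package main

import (
	"fmt"
	"math/bits"
)

// hasPOPCNT is a hypothetical stand-in for a CPU feature flag that
// would be set once at startup. It is not a real runtime variable.
var hasPOPCNT = false

// popCountSlow is a portable fallback: it clears the lowest set bit
// on each iteration and counts how many iterations were needed.
func popCountSlow(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1
		n++
	}
	return n
}

// popCount mirrors the guarded pattern: the fast path may only run
// when the feature check has succeeded, otherwise the fallback runs.
func popCount(x uint64) int {
	if hasPOPCNT {
		// Stands in for the single hardware instruction.
		return bits.OnesCount64(x)
	}
	return popCountSlow(x)
}

func main() {
	fmt.Println(popCount(0xFF))
}
```

The concern in this issue is about the fast path being moved above the `if hasPOPCNT` check, at which point it would execute on machines that lack the instruction.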
This technique currently works, but I am concerned that it might not interact well with block-fusing optimizations that we might add in the future (such as those mentioned in #30645). That could lead to subtle bugs, especially since machines lacking some CPU features (e.g. SSE 4.1) are fairly rare these days, so such bugs would seldom be exercised. We already do some optimizations that move code and speculatively execute it in order to emit conditional select instructions.
I think we should consider marking these ops somehow, perhaps simply with the 'has side effects' flag. This would represent the possibility that these instructions could cause an illegal instruction exception and prevent the compiler from moving them.
The conditional select optimizations already special-case integer divide ops for a similar reason: they panic if the divisor is 0, so they must not be speculatively executed.
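To make the divide case concrete, here is a small sketch contrasting a branch that is safe to turn into a conditional select with one that is not. The function names `maxOf` and `divOr` are made up for this illustration, and whether the compiler actually emits a conditional select for `maxOf` depends on the target and compiler version.

```go
package main

import "fmt"

// maxOf has side-effect-free arms, so both values can be computed
// up front and the result chosen with a conditional select.
func maxOf(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// divOr must keep its branch: evaluating a / b before the b != 0
// check would panic when b is zero, so the divide cannot be
// speculatively executed.
func divOr(a, b, def int) int {
	if b != 0 {
		return a / b
	}
	return def
}

func main() {
	fmt.Println(maxOf(3, 7), divOr(6, 3, -1), divOr(1, 0, -1))
}
```

A guarded intrinsic is in the same position as the divide in `divOr`: hoisting it above its check changes behavior, here from a panic-free result to a fault.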
Currently there is ~zero risk of this transformation occurring because the fallback is generally an expensive function call with side effects. However, I think that is the only reason this code transformation wouldn't be applied, and that seems a bit fragile.
Note that this would probably need to be a dynamic property of a value for generic ops since whether or not a certain op is in the base ISA differs between architectures. For example, PopCount8 could be guarded on one architecture and unguarded on another.
We don't currently have any mechanism to prevent the *p from happening before the if. We're just not doing that lifting optimization yet, so it isn't a problem.
Instead of marking an instruction as immobile, it might be a bit more flexible to mark, when generating the SSA, values that must be scheduled in a block dominated by a particular block b. We often know what b is when generating SSA (it's one of the if branches).
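The dominance-based constraint could look roughly like the sketch below. All of the types and names here (`Block`, `Value`, `Guard`, `canPlace`) are hypothetical and simplified for illustration; they are not the compiler's actual SSA data structures.

```go
package main

import "fmt"

// Block is a simplified basic block that records only its immediate
// dominator. The entry block has a nil Idom.
type Block struct {
	Name string
	Idom *Block
}

// Value is a simplified SSA value. If Guard is non-nil, the value
// may only be scheduled in a block dominated by Guard.
type Value struct {
	Op    string
	Guard *Block
}

// dominates reports whether a dominates b by walking b's chain of
// immediate dominators. Every block dominates itself.
func dominates(a, b *Block) bool {
	for ; b != nil; b = b.Idom {
		if a == b {
			return true
		}
	}
	return false
}

// canPlace reports whether v may legally be moved into block b.
func canPlace(v *Value, b *Block) bool {
	return v.Guard == nil || dominates(v.Guard, b)
}

func main() {
	entry := &Block{Name: "entry"}
	then := &Block{Name: "then", Idom: entry}
	els := &Block{Name: "else", Idom: entry}

	// A guarded intrinsic tied to the branch where the check passed.
	popcnt := &Value{Op: "PopCount64", Guard: then}

	fmt.Println(canPlace(popcnt, then), canPlace(popcnt, entry), canPlace(popcnt, els))
}
```

Compared with a blanket "immobile" flag, a constraint like this would still allow the value to move within the region where the feature check is known to have succeeded.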