A small extension to the interp__builtin_ia32_shuffle_generic/evalShuffleGeneric callback from #164078 should allow us to handle the SSE41 _mm_insert_ps / __builtin_ia32_insertps128 intrinsics. The main new feature it will require is the ability to set result vector elements to zero based on the insertps zero mask.
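For reference, the register form of insertps decodes its 8-bit immediate as: bits [7:6] select the source element, bits [5:4] select the destination element, and bits [3:0] are the zero mask. The sketch below is a standalone scalar model of that behaviour, not clang code. `Vec4` and `insertps_ref` are hypothetical names used only for illustration, and it does not show how the existing callback would actually be extended.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a 4 x float vector; not a clang type.
using Vec4 = std::array<float, 4>;

// Hypothetical reference model of the register form of INSERTPS.
// imm[7:6] = source element index in b
// imm[5:4] = destination element index in the result
// imm[3:0] = zero mask, one bit per result element
Vec4 insertps_ref(const Vec4 &a, const Vec4 &b, std::uint8_t imm) {
  const unsigned srcIdx = (imm >> 6) & 0x3;
  const unsigned dstIdx = (imm >> 4) & 0x3;
  const unsigned zeroMask = imm & 0xF;

  // Start from a, then insert the selected element of b.
  Vec4 result = a;
  result[dstIdx] = b[srcIdx];

  // The zero mask is applied last, so it can also clear the
  // element that was just inserted.
  for (unsigned i = 0; i < 4; ++i)
    if (zeroMask & (1u << i))
      result[i] = 0.0f;

  return result;
}
```

With a = {1, 2, 3, 4}, b = {10, 20, 30, 40} and an immediate of 0x9C (source 2, destination 1, zero mask 0b1100), this model yields {1, 30, 0, 0}. The zeroing step is the part a plain shuffle mask cannot express, which is why the callback would need some way to mark a result element as zero.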