[X86] Poor AVX512 codegen with constant predicate

Noticed while reviewing constexpr handling of the predicated arithmetic:
```ll
define <16 x i32> @add(<16 x i32> %x, <16 x i32> %y) {
  %add = add <16 x i32> %y, %x
  %res = shufflevector <16 x i32> %add, <16 x i32> zeroinitializer, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
  ret <16 x i32> %res
}
```
```asm
add: # @add
  vpaddd %zmm0, %zmm1, %zmm0
  movw $255, %ax
  kmovd %eax, %k1
  vpexpandd %zmm0, %zmm0 {%k1} {z}
  retq
```
Several things go wrong here:
1. The shuffle is lowered as an expansion (`vpexpandd`) instead of a select, which would fold into a predicated instruction.
2. The 0xFF predicate mask is built with movw/kmovd instead of being rematerialized directly with kxnorb.
3. The upper 256 bits of the result are zeroed, so this could have just been done as `vpaddd %ymm0, %ymm1, %ymm0`, relying on the implicit zeroing of the upper bits.
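
For reference, the shuffle in `@add` keeps lanes 0-7 of `%add` and takes lanes 8-15 from the zero vector, so it is equivalent to a select against zero with the constant mask 0xFF. A minimal sketch of that select form follows; the function name `@add_select` is hypothetical, and this is not a claim about what the backend currently emits for it:
```ll
; Hypothetical equivalent of @add, written as a select instead of a shuffle.
; Lanes 0-7 take the add result, lanes 8-15 are zero (mask = 0x00FF).
define <16 x i32> @add_select(<16 x i32> %x, <16 x i32> %y) {
  %add = add <16 x i32> %y, %x
  %res = select <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false>, <16 x i32> %add, <16 x i32> zeroinitializer
  ret <16 x i32> %res
}
```
Because only the low 256 bits carry data, the same result could also come from the 256-bit add named in item 3, with no mask register at all.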
