[X86] Failure to keep BMI/BMI2/TBM style bit manipulations patterns on avx512 mask predicates #158649

@RKSimon

Noticed while triaging #158646

Many of the basic BMI bit operations could easily be performed purely on the predicate (mask) registers, but instead they currently do a round trip through the GPRs:

inline __mmask8 kblsmsk(__mmask8 x) {
    return x ^ (x - 1);
}
inline __mmask8 kblsr(__mmask8 x) {
    return x & (x - 1);
}
inline __mmask8 kblsi(__mmask8 x) {
    return x & -x;
}

(NOTE: The above are hacky implementations that rely on the mmask types just being integers in the intrinsics headers)

(x - 1) can be performed using kadd with an allones mask.
-x might be trickier, but it should be doable as not(x) + 1 (the 1 can be done as allones + kshift).
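
The two identities above can be checked in plain C before touching the backend. This is only a sketch: mask8_t is a stand-in for __mmask8, and kadd8, knot8, kand8, kxor8 and kshiftr8 are hypothetical helpers that model the 8-bit mask instructions (KADDB, KNOTB, KANDB, KXORB, KSHIFTRB) with ordinary integer arithmetic. They are not real intrinsics.

```c
#include <stdint.h>

/* Stand-in for __mmask8 (an assumption made so this builds without AVX-512). */
typedef uint8_t mask8_t;

/* Hypothetical models of the 8-bit mask instructions. */
static inline mask8_t kadd8(mask8_t a, mask8_t b) { return (mask8_t)(a + b); }
static inline mask8_t knot8(mask8_t a)            { return (mask8_t)~a; }
static inline mask8_t kand8(mask8_t a, mask8_t b) { return (mask8_t)(a & b); }
static inline mask8_t kxor8(mask8_t a, mask8_t b) { return (mask8_t)(a ^ b); }
/* Shift counts above 7 produce zero, as the byte-wide mask shift does. */
static inline mask8_t kshiftr8(mask8_t a, unsigned n) {
    return n > 7 ? 0 : (mask8_t)(a >> n);
}

#define K_ALLONES ((mask8_t)0xFF)

/* x ^ (x - 1), with (x - 1) formed as x + allones (mod 256). */
mask8_t kblsmsk_model(mask8_t x) { return kxor8(x, kadd8(x, K_ALLONES)); }

/* x & (x - 1), same trick for the decrement. */
mask8_t kblsr_model(mask8_t x) { return kand8(x, kadd8(x, K_ALLONES)); }

/* x & -x, with -x formed as not(x) + 1 and the 1 built as allones >> 7. */
mask8_t kblsi_model(mask8_t x) {
    mask8_t one = kshiftr8(K_ALLONES, 7);
    mask8_t neg = kadd8(knot8(x), one);
    return kand8(x, neg);
}
```

Each model agrees with the integer versions above for every 8-bit input, because the addition wraps modulo 256 exactly as the truncating return to __mmask8 does.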

The BMI2 BZHI op and many of the TBM patterns could be easy to implement as well.
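
For reference, a few of the TBM patterns written in the same hacky integer style as the BMI examples above. This is a sketch only: mask8_t stands in for __mmask8, and the function names are made up here. BZHI is left out because it takes a variable bit index.

```c
#include <stdint.h>

/* Stand-in for __mmask8 (an assumption made so this builds without AVX-512). */
typedef uint8_t mask8_t;

/* BLCFILL: clear all bits below the lowest clear bit. */
mask8_t kblcfill(mask8_t x) { return (mask8_t)(x & (x + 1)); }

/* BLCMSK: mask up to and including the lowest clear bit. */
mask8_t kblcmsk(mask8_t x) { return (mask8_t)(x ^ (x + 1)); }

/* BLCS: set the lowest clear bit. */
mask8_t kblcs(mask8_t x) { return (mask8_t)(x | (x + 1)); }

/* BLSFILL: set all bits below the lowest set bit. */
mask8_t kblsfill(mask8_t x) { return (mask8_t)(x | (x - 1)); }

/* TZMSK: mask of the trailing zero bits. */
mask8_t ktzmsk(mask8_t x) { return (mask8_t)(~x & (x - 1)); }
```

The (x + 1) forms would presumably need the same allones + kshift constant as the negate case, while the (x - 1) forms only need kadd with allones.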

https://clang.godbolt.org/z/f1v636frj

Something like the blsi case doesn't even manage to keep the AND on the predicate mask registers:

test_blsi(long long vector[8], long long vector[8], long long vector[8], long long vector[8]):
  vpcmpeqq %zmm1, %zmm0, %k0
  kmovd %k0, %eax
  movl %eax, %ecx
  negb %cl
  andb %al, %cl
  kmovd %ecx, %k1
  vpandq %zmm1, %zmm0, %zmm0 {%k1}
  retq

-->

test_blsi(long long vector[8], long long vector[8], long long vector[8], long long vector[8]):
  vpcmpeqq %zmm1, %zmm0, %k0
  kmovd %k0, %eax
  movl %eax, %ecx
  negb %cl
  kmovd %ecx, %k1
  kandb %k0, %k1, %k1
  vpandq %zmm1, %zmm0, %zmm0 {%k1}
  retq
