NaN boxing misoptimization: missing icmp combine? #59555

This code is extracted from a popular JavaScript engine, but has been reworked to demonstrate the essence of the problem:

```cpp
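#include <cstdint>

// Integers are tagged with 0x0001 in the top 16 bits.
bool isInteger(uint64_t payload) {
    return (payload & 0xffff000000000000) == 0x0001000000000000;
}

// Any nonzero value in the top 16 bits marks a number.
bool isNumber(uint64_t payload) {
    return (payload & 0xffff000000000000) != 0;
}

// A float is a number that is not an integer, i.e. top 16 bits >= 0x0002.
bool isFloat(uint64_t payload) {
    return isNumber(payload) && !isInteger(payload);
}

// Hand-written equivalent of isFloat: a single unsigned comparison.
bool isFloatExpected(uint64_t payload) {
    return payload >= 0x0002000000000000;
}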
```

Although `isFloat` and `isFloatExpected` have exactly the same semantics, LLVM complicates `isFloat` to the point that it becomes 11 instructions instead of 2 on x64. The crux of the problem seems to be what they get optimized into at the IR level:

```llvm
define i1 @isFloat(i64 %0) {
  %2 = icmp ugt i64 %0, 281474976710655
  %3 = and i64 %0, -281474976710656
  %4 = icmp ne i64 %3, 281474976710656
  %5 = and i1 %2, %4
  ret i1 %5
}

define i1 @isFloatExpected(i64 %0) {
  %2 = icmp ugt i64 %0, 562949953421311
  ret i1 %2
}
```
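The equivalence claimed above can be checked mechanically. Both predicates depend only on the top 16 bits of the payload, so it is enough to enumerate all 65536 tag values with a few representative low 48-bit patterns. The following is a minimal sketch; `countMismatches` is a hypothetical helper that is not part of the original code.

```cpp
#include <cstdint>

// Predicates copied from the report.
bool isInteger(uint64_t payload) {
    return (payload & 0xffff000000000000) == 0x0001000000000000;
}

bool isNumber(uint64_t payload) {
    return (payload & 0xffff000000000000) != 0;
}

bool isFloat(uint64_t payload) {
    return isNumber(payload) && !isInteger(payload);
}

bool isFloatExpected(uint64_t payload) {
    return payload >= 0x0002000000000000;
}

// Hypothetical helper (an assumption for illustration): counts the sampled
// payloads on which isFloat and isFloatExpected disagree.
uint64_t countMismatches() {
    const uint64_t lowMask = 0x0000ffffffffffff;  // all low 48 bits set
    const uint64_t lows[] = {0, 1, lowMask};      // representative low-bit patterns
    uint64_t mismatches = 0;
    for (uint64_t tag = 0; tag <= 0xffff; ++tag) {  // every possible 16-bit tag
        for (uint64_t low : lows) {
            const uint64_t payload = (tag << 48) | low;
            if (isFloat(payload) != isFloatExpected(payload)) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

If `countMismatches()` returns 0, the two predicates agree on every tag value, so the difference in generated code is purely a missed optimization rather than a semantic difference.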

It is worth noting that manually inlining `isInteger` and `isNumber` into `isFloat` produces the expected result, so LLVM seems to confuse itself when optimizing `isInteger` and `isNumber` separately and is unable to recover after inlining.
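For illustration, one possible shape of such a manually inlined version is sketched below. The name `isFloatManuallyInlined` and the exact form are assumptions, not the author's test case, and the resulting codegen has not been verified here and may depend on the compiler version.

```cpp
#include <cstdint>

// Hypothetical manually inlined variant of isFloat (name and shape assumed):
// the tag is extracted once and compared against the two non-float tags.
bool isFloatManuallyInlined(uint64_t payload) {
    const uint64_t tag = payload & 0xffff000000000000;
    return tag != 0 && tag != 0x0001000000000000;
}
```

Comparing the IR for this form against that of the original `isFloat` at the same optimization level is one way to see whether the range check is recovered.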

This is worth handling properly, as NaN boxing is a very common technique in interpreters for dynamic languages.
