Miscompile in x86 backend, incorrectly removed bit mask #69965

meheff · 2023-10-23T20:29:21Z

EDIT: Initial post was incorrect. See comment below #69965 (comment) for correct repro.

Input IR:

source_filename = "__module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"

define i16 @test(i8 %_in) local_unnamed_addr #0 {
  %_1 = and i8 %_in, 127
  %_2 = xor i8 %_1, 127
  %_3 = or i8 %_2, -128
  %_4 = zext i8 %_3 to i16
  %_6 = shl nuw i16 %_4, 8
  %_7 = shl nuw i8 %_2, 1
  %_8 = zext i8 %_7 to i16
  %_9 = or i16 %_6, %_8
  ret i16 %_9
}

Resulting assembly with llc (trunk):

test:                                   # @test
        not     dil
        mov     eax, edi
        or      al, -128
        movzx   ecx, al
        shl     ecx, 8
        add     dil, dil
        movzx   eax, dil
        or      eax, ecx
        ret

It looks like the initial mask and i8 %_in, 127 is being incorrectly optimized away. -O0 looks correct.

The text was updated successfully, but these errors were encountered:

hongted · 2023-10-23T21:13:53Z

Suspect 2984e35 as being the incorrect optimization.

llvmbot · 2023-10-23T21:18:39Z

@llvm/issue-subscribers-backend-x86

Author: None (meheff)

https://godbolt.org/z/h4cWeWqP6

Input IR:

source_filename = "__module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"

define i16 @<!-- -->test(i8 %_in) local_unnamed_addr #<!-- -->0 {
  %_1 = and i8 %_in, 127
  %_2 = xor i8 %_1, 127
  %_3 = or i8 %_2, -128
  %_4 = zext i8 %_3 to i16
  %_6 = shl nuw i16 %_4, 8
  %_7 = shl nuw i8 %_2, 1
  %_8 = zext i8 %_7 to i16
  %_9 = or i16 %_6, %_8
  ret i16 %_9
}

Resulting assembly with llc (trunk):

test:                                   # @<!-- -->test
        not     dil
        mov     eax, edi
        or      al, -128
        movzx   ecx, al
        shl     ecx, 8
        add     dil, dil
        movzx   eax, dil
        or      eax, ecx
        ret

It looks like the initial mask and i8 %_in, 127 is being incorrectly optimized away. -O0 looks correct.

RKSimon · 2023-10-23T21:36:11Z

Suspect 2984e35 as being the incorrect optimization.

I might be missing something, but how can this be when the same asm is generated back on 15.x? https://godbolt.org/z/37oshEnxK

hongted · 2023-10-23T22:25:31Z

Can't speak for 15.x and this may be a latent issue revealed by that change.

All I can say is that by backing out that change -- notably the line 2984e35#diff-b2d3b99d3a02afc2ac5e2777da4194f8281f7113e1ac580d1e9c89c988dd22e1R2648 that added the additional !Src->getFlags().hasNoUnsignedWrap() constraint -- I no longer see a similar miscompile.

RKSimon · 2023-10-24T10:24:44Z

I think I've found the (very old...) underlying problem - TargetLowering.SimplifyDemandedBits ISD::SHL handling doesn't adjust the upper demanded bits depending on NUW/NSW

Let me take a further look.

…sw/nuw ops Matches InstCombinerImpl::SimplifyDemandedUseBits Exposes an issue with AND(CTPOP(X),1) -> PARITY(X) fold which fails to correctly demand known zero upper bits Fixes llvm#69965

meheff · 2023-10-24T16:37:02Z

FWIW the bug is still there after removing nuw: https://godbolt.org/z/enPndGjaW

RKSimon · 2023-10-24T19:36:44Z

The xls original error was a miscompile affecting the final result. But the reduced ir looks just to be less optimal codegen, do you agree?

meheff · 2023-10-24T23:46:57Z

Whoops. Looks like I ended up stripping out the actual problem when I simplified things to post to the LLVM bug then I misread the assembly and convinced myself the bug was still there. Sorry for the confusion!

Here is a repro you can execute.

Instructions to compare O0 and O2 results:

# O2
$ llc test.ir -filetype=obj -o test.o
$ clang main.c test.o -o run_test
$ ./run_test
0x9c36

# O0
$ llc -O0 test.ir -filetype=obj -o test.o
$ clang main.c test.o -o run_test
$ ./run_test
0x9b36

main.c

#include <stdio.h>
#include <stdint.h>

uint64_t __sample__main(void**, void**, void*, void*, void*, int64_t);

int main (int argc, char** argv) {
  uint8_t input0 = 0x64;
  void* inputs[1] = {&input0};

  uint16_t output0;
  void* outputs[1] = {&output0};
  __sample__main(inputs, outputs, NULL, NULL, NULL, 0);

  printf("0x%x\n", output0);

  return 0;
}

test.ir

; ModuleID = '__module'
source_filename = "__module"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"

; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn memory(readwrite, inaccessiblemem: none)
define i64 @__sample__main(ptr nocapture readonly %input_ptrs, ptr nocapture readonly %output_ptrs, ptr nocapture readnone %tmp_buffer, ptr nocapture readnone %events, ptr nocapture readnone %user_data, ptr nocapture readnone %runtime, i64 %continuation_point) local_unnamed_addr #0 {
entry:
  %0 = load ptr, ptr %input_ptrs, align 8
  %.val.i = load i8, ptr %0, align 1
  %1 = and i8 %.val.i, 127
  %2 = xor i8 %1, 127
  %3 = or i8 %2, -128
  %4 = zext i8 %3 to i16
  %5 = load ptr, ptr %output_ptrs, align 8
  %6 = shl nuw i16 %4, 8
  %7 = shl nuw i8 %2, 1
  %8 = zext i8 %7 to i16
  %9 = or i16 %6, %8
  store i16 %9, ptr %5, align 2
  ret i64 0
}

The previous reduced test case just showed a minor codegen regression, this test now shows the actual miscompilation

…plify a SHL node's input We already do this for variable shifts, but we missed it for constant shifts Fixes llvm#69965

RKSimon · 2023-10-27T15:41:15Z

@meheff Please can you confirm if this has now fixed the original case?

meheff · 2023-10-27T16:13:09Z

The original issue is resolved. Thanks for the quick fix!

github-actions bot added the new issue label Oct 23, 2023

meheff mentioned this issue Oct 23, 2023

fuzzer error: SampleError: result miscompare 60a0 google/xls#1159

Closed

EugeneZelenko added backend:X86 miscompilation and removed new issue labels Oct 23, 2023

RKSimon self-assigned this Oct 23, 2023

RKSimon added llvm:SelectionDAG SelectionDAGISel as well and removed backend:X86 labels Oct 24, 2023

RKSimon added a commit that referenced this issue Oct 24, 2023

[DAG] Add test case for Issue #69965

f2eef3f

RKSimon mentioned this issue Oct 24, 2023

[DAG] SimplifyDemandedBits - ensure we demand the high bits for shl nsw/nuw ops #70041

Closed

RKSimon added a commit that referenced this issue Oct 26, 2023

[DAG] Update test case for Issue #69965

12dfcc0

The previous reduced test case just showed a minor codegen regression, this test now shows the actual miscompilation

RKSimon closed this as completed in 547dc46 Oct 26, 2023

zahiraam pushed a commit to zahiraam/llvm-project that referenced this issue Oct 26, 2023

[DAG] Update test case for Issue llvm#69965

19827d3

The previous reduced test case just showed a minor codegen regression, this test now shows the actual miscompilation

zahiraam pushed a commit to zahiraam/llvm-project that referenced this issue Oct 26, 2023

[DAG] SimplifyDemandedBits - ensure we drop NSW/NUW flags when we sim…

dc8742a

…plify a SHL node's input We already do this for variable shifts, but we missed it for constant shifts Fixes llvm#69965

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Miscompile in x86 backend, incorrectly removed bit mask #69965

Miscompile in x86 backend, incorrectly removed bit mask #69965

meheff commented Oct 23, 2023 •

edited

hongted commented Oct 23, 2023

llvmbot commented Oct 23, 2023

RKSimon commented Oct 23, 2023

hongted commented Oct 23, 2023 •

edited

RKSimon commented Oct 24, 2023

meheff commented Oct 24, 2023

RKSimon commented Oct 24, 2023

meheff commented Oct 24, 2023

RKSimon commented Oct 27, 2023

meheff commented Oct 27, 2023

Miscompile in x86 backend, incorrectly removed bit mask #69965

Miscompile in x86 backend, incorrectly removed bit mask #69965

Comments

meheff commented Oct 23, 2023 • edited

hongted commented Oct 23, 2023

llvmbot commented Oct 23, 2023

RKSimon commented Oct 23, 2023

hongted commented Oct 23, 2023 • edited

RKSimon commented Oct 24, 2023

meheff commented Oct 24, 2023

RKSimon commented Oct 24, 2023

meheff commented Oct 24, 2023

RKSimon commented Oct 27, 2023

meheff commented Oct 27, 2023

meheff commented Oct 23, 2023 •

edited

hongted commented Oct 23, 2023 •

edited