Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] bad combine for load inst caused by tbaa if two arguments ptr is same #68523

Closed
hstk30-hw opened this issue Oct 8, 2023 · 11 comments
Labels
invalid Resolved as invalid, i.e. not a bug undefined behaviour

Comments

@hstk30-hw
Copy link
Contributor

hstk30-hw commented Oct 8, 2023

inline void* operator new(__SIZE_TYPE__, void* __p) noexcept { return __p; }

long foo(char *c1, char *c2)
{
  long *p1 = new (c1) long;
  *p1 = 100;
  long long *p2 = new (c2) long long;
  *p2 = 200;
  long *p3 = new (c2) long;
  *p3 = 200;
  return *p1;
}
int main()
{
  union {
      char c;
      long l;
      long long ll;
  } c;
  // return 100 in arm64be ilp32
  if (foo(&c.c, &c.c) != 200)
    __builtin_abort();
}

The code run fail in arm64be ilp32 in O2 level,

https://godbolt.org/z/3rddjszhc

godbolt not support ilp32, I simpilify it use aarch64_be in IR (hope yours can get the point :-) ).

debug info like:

INSTCOMBINE ITERATION #1 on _Z3fooPcS_
 ADD:   ret i32 %0
 ADD:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 ADD:   store i32 200, ptr %c2, align 4, !tbaa !5
 ADD:   store i64 200, ptr %c2, align 8, !tbaa !9
 ADD:   store i32 100, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   store i32 100, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   store i64 200, ptr %c2, align 8, !tbaa !9
 IC: Visiting:   store i32 200, ptr %c2, align 4, !tbaa !5
 IC: Visiting:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   ret i32 %0
 Jump threading on function '_Z3fooPcS_'
 LVI Getting block end value ptr %c1 at 'entry'
 PUSH: ptr %c1 in entry
 POP ptr %c1 in entry = overdefined
   Result = overdefined
 LVI Getting block end value ptr %c2 at 'entry'
 PUSH: ptr %c2 in entry
 POP ptr %c2 in entry = overdefined
   Result = overdefined
 LVI Getting block end value ptr %c2 at 'entry'
   Result = overdefined
 LVI Getting block end value ptr %c1 at 'entry'
   Result = overdefined
 LVI Getting block end value   %0 = load i32, ptr %c1, align 4, !tbaa !5 at 'entry'
 PUSH:   %0 = load i32, ptr %c1, align 4, !tbaa !5 in entry
  compute BB 'entry' - unknown inst def found.
 POP   %0 = load i32, ptr %c1, align 4, !tbaa !5 in entry = overdefined
   Result = overdefined
     Looking for trivial roots
 Found a new trivial root: %entry
 Last visited node: %entry
     Looking for non-trivial roots
 Total: 1, Num: 2
 Discovered CFG nodes:
 0: nullptr
 1: nullptr
 2: %entry
 Found roots: %entry
 mark live:   store i32 100, ptr %c1, align 4, !tbaa !5
 mark block live: entry
 mark live:   store i64 200, ptr %c2, align 8, !tbaa !9
 mark live:   store i32 200, ptr %c2, align 4, !tbaa !5
 mark live:   ret i32 %0
 post-dom root child is a return: entry
 work live:   ret i32 %0
 mark live:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 work live:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 work live:   store i32 200, ptr %c2, align 4, !tbaa !5
 work live:   store i64 200, ptr %c2, align 8, !tbaa !9
 work live:   store i32 100, ptr %c1, align 4, !tbaa !5
 final dead terminator blocks:
 Trying to eliminate MemoryDefs killed by 1 = MemoryDef(liveOnEntry) (  store i32 100, ptr %c1, align 4, !tbaa !5)
   trying to get dominating access
    visiting 0 = MemoryDef(liveOnEntry)
    ...  found LiveOnEntryDef
   finished walk
 Trying to eliminate MemoryDefs killed by 2 = MemoryDef(1) (  store i64 200, ptr %c2, align 8, !tbaa !9)
   trying to get dominating access
    visiting 1 = MemoryDef(liveOnEntry) (  store i32 100, ptr %c1, align 4, !tbaa !5)
    visiting 0 = MemoryDef(liveOnEntry)
    ...  found LiveOnEntryDef
   finished walk
 Trying to eliminate MemoryDefs killed by 3 = MemoryDef(2) (  store i32 200, ptr %c2, align 4, !tbaa !5)
   trying to get dominating access
    visiting 2 = MemoryDef(1)->liveOnEntry MayAlias (  store i64 200, ptr %c2, align 8, !tbaa !9)
   Checking for reads of 2 = MemoryDef(1)->liveOnEntry MayAlias (  store i64 200, ptr %c2, align 8, !tbaa !9)
    3 = MemoryDef(2) (  store i32 200, ptr %c2, align 4, !tbaa !5)
     ... skipping killing def/dom access
  Checking if we can kill 2 = MemoryDef(1)->liveOnEntry MayAlias (  store i64 200, ptr %c2, align 8, !tbaa !9)
 DSE: Partial overwrite: DeadLoc [0, 8) KillingLoc [0, 4)
 DSE: Partial overwrite a dead load [0, 8) by a killing store [0, 4)
 DSE: Merge Stores:
   Dead:   store i64 200, ptr %c2, align 8, !tbaa !9
   Killing:   store i32 200, ptr %c2, align 4, !tbaa !5
   Merged Value: 858993459400
 Trying to eliminate MemoryDefs that write the already existing value
 Trying to eliminate MemoryDefs at the end of the function


 INSTCOMBINE ITERATION #1 on _Z3fooPcS_
 ADD:   ret i32 %0
 ADD:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 ADD:   store i64 858993459400, ptr %c2, align 8, !tbaa !9
 ADD:   store i32 100, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   store i32 100, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   store i64 858993459400, ptr %c2, align 8, !tbaa !9
 IC: Visiting:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 IC: Replacing   %0 = load i32, ptr %c1, align 4, !tbaa !5
     with i32 100
 IC: Mod =   %0 = load i32, ptr %c1, align 4, !tbaa !5
     New =   %0 = load i32, ptr %c1, align 4, !tbaa !5
 IC: ERASE   %0 = load i32, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   ret i32 100

DSE merge

store i32 200, ptr %c2, align 4, !tbaa !5
store i64 200, ptr %c2, align 8, !tbaa !9

to

store i64 858993459400, ptr %c2, align 8, !tbaa !9

Then, alias think store i64 858993459400, ptr %c2, align 8, !tbaa !9 is NoAlias to %c1 Loc (cause the size 4!=8 ???), so it combine

%0 = load i32, ptr %c1, align 4, !tbaa !5
ret i32 %0

to

ret i32 100

But, in fact, %c1 and %c2 is the same ptr.

@hstk30-hw
Copy link
Contributor Author

A more simple example https://godbolt.org/z/dWdcK3Ydh

define dso_local noundef i32 @_Z3fooPcS_(ptr noundef %c1, ptr noundef %c2) #0 {
 entry:
   store i32 100, ptr %c1, align 4, !tbaa !5
   store i64 200, ptr %c2, align 8, !tbaa !9
   store i32 200, ptr %c2, align 4, !tbaa !5
   %0 = load i32, ptr %c1, align 4, !tbaa !5
   ret i32 %0
 }

After DSE

define dso_local noundef i32 @foo(char*, char*)(ptr nocapture noundef %c1, ptr nocapture noundef writeonly %c2) local_unnamed_addr {
entry:
  store i32 100, ptr %c1, align 4, !tbaa !0
  store i64 200, ptr %c2, align 8, !tbaa !4
  %0 = load i32, ptr %c1, align 4, !tbaa !0
  ret i32 %0
}

Still, %c1 %c2 is MayBeAlias, but after DSE merge stores , %c1 and %c2 type is diff, so tbaa fail? Then lead to bad IC?

@nikic
Copy link
Contributor

nikic commented Oct 14, 2023

How is this not a strict aliasing violation?

@hstk30-hw
Copy link
Contributor Author

I'm not sure. I just assume based on the unexpected result.
Can you give any idea about this issue?

@nikic
Copy link
Contributor

nikic commented Oct 16, 2023

Tagging @AaronBallman to confirm whether this is a strict aliasing violation or not.

@hstk30-hw
Copy link
Contributor Author

 INSTCOMBINE ITERATION #1 on _Z3fooPcS_
 ADD:   ret i32 %0
 ADD:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 ADD:   store i64 858993459400, ptr %c2, align 8, !tbaa !9
 ADD:   store i32 100, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   store i32 100, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   store i64 858993459400, ptr %c2, align 8, !tbaa !9
 IC: Visiting:   %0 = load i32, ptr %c1, align 4, !tbaa !5
 Start ptr %c2 @ LocationSize::precise(8), ptr %c1 @ LocationSize::precise(4)
 End ptr %c2 @ LocationSize::precise(8), ptr %c1 @ LocationSize::precise(4) = NoAlias
 IC: Replacing   %0 = load i32, ptr %c1, align 4, !tbaa !5
     with i32 100
 IC: Mod =   %0 = load i32, ptr %c1, align 4, !tbaa !5
     New =   %0 = load i32, ptr %c1, align 4, !tbaa !5
 IC: ERASE   %0 = load i32, ptr %c1, align 4, !tbaa !5
 IC: Visiting:   ret i32 100

This is add -aa-trace option output

@AaronBallman
Copy link
Collaborator

Tagging @AaronBallman to confirm whether this is a strict aliasing violation or not.

Hrm, I think something is definitely wrong here. For example, with:

  long long *p2 = new (c2) long long;
  *p2 = 200;
  long *p3 = new (c2) long;

nothing ends the lifetime of the long long at c2 before trying to start the lifetime of a long at the same location.

I think it's also a strict aliasing violation because:

  long *p1 = new (c1) long;
  *p1 = 100;
  long long *p2 = new (c2) long long;

tries to start the lifetime of two different types at the same memory location without an intervening end lifetime.

@hstk30-hw
Copy link
Contributor Author

IMHO, check alias Value pair if they all isa<Argument> and all type is ptr, then they are MaybeAlias ?
Do you think it will works?

@nikic
Copy link
Contributor

nikic commented Oct 17, 2023

As explained above, your code exhibits undefined behavior. There is nothing to fix. If you want to opt-out of strict aliasing requirements, use the -fno-strict-aliasing option.

@nikic nikic closed this as not planned Won't fix, can't repro, duplicate, stale Oct 17, 2023
@nikic nikic added invalid Resolved as invalid, i.e. not a bug and removed llvm:instcombine labels Oct 17, 2023
@hstk30-hw
Copy link
Contributor Author

This code used C++ placement new https://en.cppreference.com/w/cpp/language/new . It allows apply diff type to same memory location. I don't know what's your point of undefined behavior. The undefined behavior is language rel or llvm compiler rel?

@AaronBallman
Copy link
Collaborator

The UB is in the language realm. Placement new starts the lifetime of an object at a particular memory location. You're starting the lifetime of multiple objects of different types at the same memory location.

What happens if you try using std::launder() like this: https://godbolt.org/z/bjvfGdbj4

@hstk30-hw
Copy link
Contributor Author

Thx your patient. Now I know what is strict aliasing violation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
invalid Resolved as invalid, i.e. not a bug undefined behaviour
Projects
None yet
Development

No branches or pull requests

4 participants