Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Misoptimization involving memset and store #35477

Closed
JohanEngelen opened this issue Jan 28, 2018 · 14 comments
Closed

Misoptimization involving memset and store #35477

JohanEngelen opened this issue Jan 28, 2018 · 14 comments
Labels
bugzilla Issues migrated from bugzilla

Comments

@JohanEngelen
Copy link
Member

Bugzilla Link 36129
Resolution FIXED
Resolved on Feb 02, 2018 08:51
Version 6.0
OS All
Blocks #35152
Attachments O3 output LLVM5.0, llvm6.ll output of LLVM6.0's opt on with llvm5.ll as input
CC @DMG862,@zmodem,@hfinkel,@RKSimon,@JonPsson,@rotateright

Extended Description

The llvm5.ll is the optimized (-O3) output of LLVM 5.0 opt of a simple testcase compiled by LDC. The testcase is setting %dt and %dt2 to the same value by different means, and asserts that they are equal indeed.
llvm6.ll is the output of opt 6.0.0 when optimizing llvm5.ll:
reproducer= opt -O3 llvm5.ll -S -o llvm6.ll.

LLVM6.0 does a misoptimization and the assert fails.
Zooming in:

%datum.DateTime = type { %datum.Date, %datum.TimeOfDay, [1 x i8] }
%datum.Date = type { i16, i8, [1 x i8] }
%datum.TimeOfDay = type { i8, i8, i8 }

define i32 @​main....
  %dt2 = alloca i64, align 8
  %tmpcast = bitcast i64* %dt2 to %datum.DateTime*
;...
  %5 = bitcast i64* %dt2 to i8*
  store i64 1, i64* %dt2, align 8
  %6 = getelementptr inbounds %datum.DateTime, %datum.DateTime* %tmpcast, i64 0, i32 1, i32 0
  call void @​llvm.memset.p0i8.i64(i8* %6, i8 0, i64 3, i32 4, i1 false) #​5
  %7 = getelementptr inbounds %datum.DateTime, %datum.DateTime* %tmpcast, i64 0, i32 1, i32 1
  store i8 30, i8* %7, align 1

is optimized to

define i32 @​main....
  %dt2 = alloca i64, align 8
  %tmpcast = bitcast i64* %dt2 to %datum.DateTime*
;...
  %5 = bitcast i64* %dt2 to i8*
  ; The following store is correct I think
  store i64 32985348833281, i64* %dt2, align 8
  %6 = getelementptr inbounds %datum.DateTime, %datum.DateTime* %tmpcast, i64 0, i32 1, i32 0
  ; This memset overwrites partly the correct value.
  ; I think if this memset would not be there, all would be fine.
  call void @​llvm.memset.p0i8.i64(i8* nonnull %6, i8 0, i64 3, i32 4, i1 false) #​4

[Remark: I can see that perhaps endianness is tricky here. But actually, the LLVM optimizer itself converted the type of %dt2 to i64* (instead of %datum.DateTime*).]

@DMG862
Copy link
Mannequin

DMG862 mannequin commented Jan 29, 2018

I believe this is a case of:
define void @​test38(i32* %P, i32* %Q) {
store i32 1, i32* %P
%P2 = bitcast i32* %P to i8*
store i32 2, i32* %Q
store i8 3, i8* %P2
ret void
}

store 3 cannot be merged into store 1, because the store 2 to Q mayalias P. The PartialStoreMerging we have doesn't take this into account I don't think.

@rotateright
Copy link
Contributor

Thanks for cc'ing me, Dave.

For reference, partial store merging was:
https://reviews.llvm.org/D30703
https://reviews.llvm.org/rL314206

I commandeered that patch, but I still don't know much about this pass. Let me see if I can take alias analysis into account, so we avoid this.

@DMG862
Copy link
Mannequin

DMG862 mannequin commented Jan 29, 2018

Maybe a check of memoryIsNotModifiedBetween is all we need? Same as noop stores. It may not be the most efficient, but I think checks what we want.

@rotateright
Copy link
Contributor

Maybe a check of memoryIsNotModifiedBetween is all we need? Same as noop
stores. It may not be the most efficient, but I think checks what we want.

Sounds good to me...patch coming up. Thanks!

@rotateright
Copy link
Contributor

@RKSimon
Copy link
Collaborator

RKSimon commented Jan 29, 2018

Should this be a 6.00 blocker?

@rotateright
Copy link
Contributor

Should this be a 6.00 blocker?

Yes, I think so - it's a recently introduced miscompile.

@JohanEngelen
Copy link
Member Author

Should this be a 6.00 blocker?

Yes, I think so - it's a recently introduced miscompile.

Thanks for elevating to a 6.0.0 blocker. LDC generates code that encounters this bug. The bug results in LDC failing its testsuite, so would be a blocker for releasing LDC linked to LLVM 6.0.0. (I haven't tried to make a testcase that would reproduce with Clang)

@rotateright
Copy link
Contributor

Fixed in trunk:
https://reviews.llvm.org/rL323759

Leaving the bug open to allow baking time there and - assuming success - merging in the branch.

@JonPsson
Copy link
Contributor

*** Bug llvm/llvm-bugzilla-archive#35923 has been marked as a duplicate of this bug. ***

@JohanEngelen
Copy link
Member Author

I can confirm that the original D testcase now passes with LDC+LLVMtrunk. Thanks for the fix!

(Sidenote: In the code, I read a TODO comment "Deal with [...] some mem intrinsics (if needed)". As you can see from the testcase IR, we generate a call to the memset intrinsic to set the padding bytes of structs to zero (padding bytes are put in arrays of i8). So for us, it'd be beneficial if the store merge code learns how to deal with that.)

@rotateright
Copy link
Contributor

I can confirm that the original D testcase now passes with LDC+LLVMtrunk.
Thanks for the fix!

(Sidenote: In the code, I read a TODO comment "Deal with [...] some mem
intrinsics (if needed)". As you can see from the testcase IR, we generate a
call to the memset intrinsic to set the padding bytes of structs to zero
(padding bytes are put in arrays of i8). So for us, it'd be beneficial if
the store merge code learns how to deal with that.)

Thanks for verifying, Johan.

Can you open a new bug report for the missed optimization? We don't want to lose track of that request when this report is closed.

@zmodem
Copy link
Collaborator

zmodem commented Feb 2, 2018

Fixed in trunk:
https://reviews.llvm.org/rL323759

Leaving the bug open to allow baking time there and - assuming success -
merging in the branch.

Merged in r324086.

I can confirm that the original D testcase now passes with LDC+LLVMtrunk.
Thanks for the fix!

(Sidenote: In the code, I read a TODO comment "Deal with [...] some mem
intrinsics (if needed)". As you can see from the testcase IR, we generate a
call to the memset intrinsic to set the padding bytes of structs to zero
(padding bytes are put in arrays of i8). So for us, it'd be beneficial if
the store merge code learns how to deal with that.)

Thanks for verifying, Johan.

Can you open a new bug report for the missed optimization? We don't want to
lose track of that request when this report is closed.

Was a new bug filed? I'd like to close this.

@rotateright
Copy link
Contributor

Was a new bug filed? I'd like to close this.

Closing - I've taken a shot at what I think the problem is with bug 36211.

@llvmbot llvmbot transferred this issue from llvm/llvm-bugzilla-archive Dec 10, 2021
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugzilla Issues migrated from bugzilla
Projects
None yet
Development

No branches or pull requests

5 participants