[AArch64] Don't expand memcmp in strict align mode.
7aecf23 fixed the bug where we would miscompile, but we still generate
a crazy amount of code. Turn off the expansion until someone implements
an appropriate heuristic.

Differential Revision: https://reviews.llvm.org/D77599
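
The control flow this change introduces can be sketched in isolation as below. This is a minimal illustration, not the LLVM source: `MockSubtarget`, the trimmed-down `MemCmpExpansionOptions`, and the `MaxExpandSize` parameter (standing in for `TLI->getMaxExpandSizeMemcmp(OptSize)`) are hypothetical stand-ins, and it assumes that default-constructed options with `MaxNumLoads == 0` mean "leave the libcall in place", which is what the updated `CHECKS: bl bcmp` lines in the test suggest.

```cpp
// Hypothetical, reduced stand-in for TTI::MemCmpExpansionOptions.
// Assumption: MaxNumLoads == 0 means the memcmp/bcmp call is not expanded.
struct MemCmpExpansionOptions {
  unsigned MaxNumLoads = 0;
  unsigned NumLoadsPerBlock = 1;
  bool AllowOverlappingLoads = false;
};

// Hypothetical stand-in for the AArch64 subtarget query used in the patch.
struct MockSubtarget {
  bool StrictAlign;
  bool requiresStrictAlign() const { return StrictAlign; }
};

// Mirrors the shape of the patched enableMemCmpExpansion: bail out early
// with default (disabled) options when strict alignment is required.
MemCmpExpansionOptions enableMemCmpExpansion(const MockSubtarget &ST,
                                             unsigned MaxExpandSize) {
  MemCmpExpansionOptions Options;
  if (ST.requiresStrictAlign())
    return Options; // no cost model yet, so do not expand at all
  Options.AllowOverlappingLoads = true;
  Options.MaxNumLoads = MaxExpandSize;
  Options.NumLoadsPerBlock = Options.MaxNumLoads;
  return Options;
}
```

Under these assumptions, a strict-align subtarget always gets `MaxNumLoads == 0`, while other subtargets keep the previous behaviour with overlapping loads allowed.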
efriedma-quic committed Apr 7, 2020
1 parent f596ab4 commit e9ac757
Showing 2 changed files with 11 additions and 12 deletions.
7 changes: 6 additions & 1 deletion llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -629,7 +629,12 @@ int AArch64TTIImpl::getCmpSelInstrCost(unsigned Opcode, Type *ValTy,
 AArch64TTIImpl::TTI::MemCmpExpansionOptions
 AArch64TTIImpl::enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const {
   TTI::MemCmpExpansionOptions Options;
-  Options.AllowOverlappingLoads = !ST->requiresStrictAlign();
+  if (ST->requiresStrictAlign()) {
+    // TODO: Add cost modeling for strict align. Misaligned loads expand to
+    // a bunch of instructions when strict align is enabled.
+    return Options;
+  }
+  Options.AllowOverlappingLoads = true;
   Options.MaxNumLoads = TLI->getMaxExpandSizeMemcmp(OptSize);
   Options.NumLoadsPerBlock = Options.MaxNumLoads;
   // TODO: Though vector loads usually perform well on AArch64, in some targets
16 changes: 5 additions & 11 deletions llvm/test/CodeGen/AArch64/bcmp-inline-small.ll
@@ -11,12 +11,12 @@ entry:
   ret i1 %ret

 ; CHECK-LABEL: test_b2:
-; CHECK-NOT: bl bcmp
+; CHECKN-NOT: bl bcmp
 ; CHECKN: ldr x
 ; CHECKN-NEXT: ldr x
 ; CHECKN-NEXT: ldur x
 ; CHECKN-NEXT: ldur x
-; CHECKS-COUNT-30: ldrb w
+; CHECKS: bl bcmp
 }

 define i1 @test_b2_align8(i8* align 8 %s1, i8* align 8 %s2) {
@@ -26,19 +26,13 @@ entry:
   ret i1 %ret

 ; CHECK-LABEL: test_b2_align8:
-; CHECK-NOT: bl bcmp
+; CHECKN-NOT: bl bcmp
 ; CHECKN: ldr x
 ; CHECKN-NEXT: ldr x
 ; CHECKN-NEXT: ldur x
 ; CHECKN-NEXT: ldur x
-; CHECKS: ldr x
-; CHECKS-NEXT: ldr x
-; CHECKS-NEXT: ldr w
-; CHECKS-NEXT: ldr w
-; CHECKS-NEXT: ldrh w
-; CHECKS-NEXT: ldrh w
-; CHECKS-NEXT: ldrb w
-; CHECKS-NEXT: ldrb w
+; TODO: Four loads should be within the limit, but the heuristic isn't implemented.
+; CHECKS: bl bcmp
 }

 define i1 @test_bs(i8* %s1, i8* %s2) optsize {