Conversation
@ojhunt ojhunt commented Oct 28, 2025

When -fstrict-vtable-pointers is set, we can devirtualise calls to
virtual functions that are made indirectly through a separate function
which does not locally know the exact type it is operating on.

This only permits the optimization for regular methods, not for any kind
of constructor or destructor.
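
For illustration, a minimal sketch of the pattern this enables (hypothetical names; it mirrors the directCall test added in this PR). Leaf is final, so on entry to its methods the vtable pointer of *this is known, and once Base::sum is inlined its virtual calls can be devirtualised:

    struct Base {
      virtual long get(unsigned long i) const = 0;
      // sum() does not locally know the exact type it is operating on;
      // compiled on its own, the get() calls go through the vtable.
      virtual long sum(unsigned long n) const {
        long r = 0;
        for (unsigned long i = 0; i < n; ++i)
          r += get(i);
        return r;
      }
    };

    struct Leaf final : Base {
      long get(unsigned long i) const override { return i; }
      // Leaf is final, so the vtable of *this is known here; after
      // Base::sum is inlined, its get() calls devirtualise to Leaf::get.
      long directSum(unsigned long n) { return sum(n); }
    };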

@ojhunt ojhunt requested a review from ChuanqiXu9 October 28, 2025 00:41
@ojhunt ojhunt self-assigned this Oct 28, 2025

ojhunt commented Oct 28, 2025

@ChuanqiXu9 I'm sure this will change codegen for existing tests so I'm running all of them atm, but I thought you should have a look at this.

@ChuanqiXu9 ChuanqiXu9 left a comment

LGTM generally, thanks.

// if the current method is not virtual, it may be calling another method
// that calls a virtual function.
if (IsPolymorphicObject && !IsStructor && IsFinal)
EmitVTableAssumptionLoads(ThisRecordDecl, LoadCXXThisAddress());
Member

If we use EmitVTableAssumptionLoads, we should add a test for cases where the class has multiple virtual tables.

Contributor Author

good call, will do, and virtual inheritance
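
For reference, a minimal sketch (hypothetical names) of the multiple-vtable case: a class with two polymorphic bases carries two vtable pointers, so the emitted assumptions need to cover both.

    struct A { virtual int f() const = 0; };
    struct B { virtual int g() const = 0; };

    // D has two vtable pointers: one for the A subobject at offset 0 and
    // one for the B subobject, so the vtable assumptions must cover each
    // of them.
    struct D final : A, B {
      int f() const override { return 1; }
      int g() const override { return 2; }
    };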

Comment on lines 1 to 2
// RUN: %clang_cc1 -std=c++26 %s -emit-llvm -O3 -o - | FileCheck %s
// RUN: %clang_cc1 -std=c++26 %s -emit-llvm -O3 -fstrict-vtable-pointers -o - | FileCheck %s --check-prefix=STRICT
Member

Maybe we don't need -std=c++26?

@ChuanqiXu9

BTW, have you tested it with a bootstrapped build?


ojhunt commented Oct 28, 2025

BTW, have you tested it with a bootstrapped build?

no, but honestly my success rate at simply getting a trivial bootstrap build to work has not been great :-/

@ChuanqiXu9

BTW, have you tested it with a bootstrapped build?

no, but honestly my success rate at simply getting a trivial bootstrap build to work has not been great :-/

I can try to make it.

New tests for multiple and virtual inheritance to verify that
we still perform 'this' adjustments even when devirtualising the
calls

ojhunt commented Oct 28, 2025

Ok, I've added multiple and virtual inheritance tests, and included tests that force static and dynamic 'this' adjustments.

We currently do not appear to devirtualise dynamic 'this' adjustments, which means the tests currently use the vtable to perform the dynamic 'this' adjustment, and then immediately inline the virtual functions \o/
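
For context, a dynamic 'this' adjustment arises with virtual bases: the offset from the virtual-base subobject back to the complete object is only known at run time, so the _ZTv thunks checked in the tests load a vcall offset out of the vtable before dispatching. A minimal sketch (hypothetical names) of a hierarchy that produces such a thunk:

    struct VBase { virtual long get() const = 0; };
    struct Mid : virtual VBase {};

    // Overriding get(), which is inherited through a virtual base, makes
    // the compiler emit a "_ZTv" thunk that loads the this-adjustment
    // offset from the vtable at run time before entering Final::get.
    struct Final final : Mid {
      long get() const override { return 1; }
    };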


ojhunt commented Oct 28, 2025

BTW, have you tested it with a bootstrapped build?

no, but honestly my success rate at simply getting a trivial bootstrap build to work has not been great :-/

I can try to make it.

got a stage2 build actually working, and am running tests atm; assuming this passes I'll redo the stage2 with -fstrict-vtable-pointers


ojhunt commented Oct 28, 2025

a normal stage2 worked fine, time for strict vtable pointers


ojhunt commented Oct 28, 2025

@ChuanqiXu9 I'm seeing bootstrap failure on main :-/

@ChuanqiXu9

@ChuanqiXu9 I'm seeing bootstrap failure on main :-/

Will it fail with unmodified -fstrict-vtable-pointers? Or will it still fail if we give up on the complex cases with multiple vtables? You can look at my patch for that (it limits the number of vtables).

Comment on lines +333 to +347
// STRICT-LABEL: i64 @_ZN8Derived110directCallEmm(
// STRICT-NOT: call
// STRICT: ret i64

// STRICT-LABEL: i64 @_ZNK8Derived19getSumLenEmm(
// STRICT-NOT: call
// STRICT: ret i64

// STRICT-LABEL: i64 @_ZN8Derived114directBaseCallEmm(
// STRICT-NOT: call
// STRICT: ret i64

// STRICT-LABEL: i64 @_ZNK8Derived18getBatchEmmPl
// STRICT-NOT: call
// STRICT: ret i64
Member

Weird. In my environment, these checks failed with "tail call @llvm.assume"

Contributor Author

With the -O3? What host are you on? (wondering if I need a more precise triple)

Member

Yeah, I am using // RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3 -fstrict-vtable-pointers -o - | FileCheck %s --check-prefix=STRICT

Member

I succeeded with // STRICT-NOT: call {{[^(]*}}%

Contributor Author

What targets do you have configured? (I want to see if I can repro locally, and then make the tests robust against it)

Member

I am using X86. I am happy to have // STRICT-NOT: call {{[^(]*}}% even though it is more verbose.

Member

Can we replace // STRICT-NOT: call with // STRICT-NOT: call {{[^(]*}}%? I feel it makes it clearer that we're checking that the dynamic calls are erased.

Contributor Author

I'll see if I can make them a bit clearer as well

@ChuanqiXu9

And also, when bootstrapping, have you encountered errors like undefined symbol: clang::ImplicitConceptSpecializationDecl::~ImplicitConceptSpecializationDecl()? There are other similar cases.

The common point of these decls is that they have a virtual destructor but no other virtual functions. This is why my previous patch wrote: ChuanqiXu9@40ceabe#diff-9f23818ed51d0b117b5692129d0801721283d0f128a01cbc562353da0266d7adR1375-R1382
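
A hypothetical reduction of that shape (the real cases are clang's Decl classes; this sketch only illustrates the hierarchy, not the exact declarations):

    // The destructor is the only virtual member and is declared but never
    // defined: such objects are arena-allocated and never deleted.
    struct ArenaNode {
      virtual ~ArenaNode(); // no definition in any TU
      int kind = 0;
    };

    struct ConcreteNode final : ArenaNode {
      int getKind() { return kind; }
    };

    // Emitting vtable assumption loads in ConcreteNode::getKind references
    // ConcreteNode's vtable; materialising that vtable pulls in every slot,
    // including the never-defined destructor, which then surfaces as an
    // undefined symbol at link time.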


ojhunt commented Oct 29, 2025

And also, when bootstrapping, have you encountered errors like undefined symbol: clang::ImplicitConceptSpecializationDecl::~ImplicitConceptSpecializationDecl()? There are other similar cases.

The common point of these decls is that they have a virtual destructor but no other virtual functions. This is why my previous patch wrote: ChuanqiXu9@40ceabe#diff-9f23818ed51d0b117b5692129d0801721283d0f128a01cbc562353da0266d7adR1375-R1382

I haven't, but I may try an assertion-free build because I see assertion failures in the baseline.


ojhunt commented Oct 29, 2025

Ok, a bootstrap without assertions passes all tests for me; what is your bootstrap config?

@ChuanqiXu9

After I added the following, the bootstrap build and tests succeeded for me:

    if (CGM.getCodeGenOpts().StrictVTablePointers && CXXThisValue) {
      const CXXRecordDecl *ThisRecordDecl = MD->getParent();
      bool IsPolymorphicObject = ThisRecordDecl->isPolymorphic();
      bool IsStructor = isa<CXXDestructorDecl, CXXConstructorDecl>(MD);
      bool IsFinal = ThisRecordDecl->isEffectivelyFinal();

      // A workaround for cases where the base class defines a virtual dtor
      // but the derived class doesn't provide one. Otherwise the current
      // implementation may introduce undefined symbols, as it references
      // all symbols in the vtable.
      bool BaseHasDtor = false;
      ThisRecordDecl->forallBases([&](const CXXRecordDecl *Base) -> bool {
        if (Base->getDestructor() && !Base->getDestructor()->isImplicit())
            BaseHasDtor = true;
  
        return true;
      });
      bool HasDtor = ThisRecordDecl->getDestructor() &&
          !ThisRecordDecl->getDestructor()->isImplicit();

      // We do not care about whether this is a virtual method, because even
      // if the current method is not virtual, it may be calling another method
      // that calls a virtual function.
      if (IsPolymorphicObject && !IsStructor && IsFinal && (!BaseHasDtor || HasDtor))
        EmitVTableAssumptionLoads(ThisRecordDecl, LoadCXXThisAddress());
    }

@ChuanqiXu9

Ok, a bootstrap without assertions passes all tests for me; what is your bootstrap config?

LDFLAGS="-fuse-ld=lld -Wl,-q -Wl,-znow -Wl,-build-id=sha1" cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -DCMAKE_C_COMPILER=... -DCMAKE_CXX_COMPILER=... -DCMAKE_C_FLAG="-fstack-clash-protection -fcf-protection -Wno-backend-plugin" -DCMAKE_CXX_FLAGS="-fstack-clash-protection -fcf-protection -Wno-backend-plugin -fstrict-vtable-pointers" -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra;lld;openmp" -DLLVM_ENABLE_RUNTIMES="compiler-rt" -DLLVM_LIBDIR_SUFFIX=64 -DCLANG_LIBDIR_SUFFIX=64 -DOPENMP_LIBDIR_SUFFIX=64 -DLLVM_TARGETS_TO_BUILD="X86;AArch64;BPF;WebAssembly" -DLLVM_ENABLE_LIBCXX=ON -DLLVM_STATIC_LINK_CXX_STDLIB=ON -DCOMPILER_RT_BUILD_LIBFUZZER=OFF -DCOMPILER_RT_BUILD_SANITIZERS=OFF -DCOMPILER_RT_BUILD_XRAY=OFF -DCOMPILER_RT_BUILD_ORC=OFF -DLLVM_USE_LINKER=lld ../llvm

@ChuanqiXu9

LGTM generally, so what's the problem for now?

Comment on lines +401 to +403
// STRICT: [[TRUE_THIS:%.*]] = getelementptr inbounds nuw i8, ptr %this, i64 16
// STRICT: [[INVARIANT_THIS:%.*]] = tail call ptr @llvm.strip.invariant.group.p0(ptr nonnull [[TRUE_THIS]])
// STRICT: [[VALUE_PTR:%.*]] = getelementptr inbounds nuw i8, ptr [[INVARIANT_THIS]], i64 8
Member

I am slightly confused about the intention of these checks. Why do we check that there is a GEP?

Contributor Author

These checks ensure that we are still performing the correct 'this' adjustment after adding the type assumption, i.e. they make sure that even though we have identified the concrete type we don't accidentally skip the adjustment; the value load makes sure that after the 'this' adjustment we use the correct field offset.


ojhunt commented Oct 30, 2025

LGTM generally, so what's the problem for now?

Review :D

@ojhunt ojhunt marked this pull request as ready for review October 30, 2025 07:17
@llvmbot llvmbot added the clang and clang:codegen labels Oct 30, 2025

llvmbot commented Oct 30, 2025

@llvm/pr-subscribers-clang

Author: Oliver Hunt (ojhunt)

Changes

When -fstrict-vtable-pointers is set, we can devirtualise calls to
virtual functions that are made indirectly through a separate function
which does not locally know the exact type it is operating on.

This only permits the optimization for regular methods, not for any kind
of constructor or destructor.


Full diff: https://github.com/llvm/llvm-project/pull/165341.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+11-1)
  • (added) clang/test/CodeGenCXX/indirect-final-vcall-through-base.cpp (+414)
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 88628530cf66b..73ce40739d581 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -1316,7 +1316,17 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
       // fast register allocator would be happier...
       CXXThisValue = CXXABIThisValue;
     }
-
+    if (CGM.getCodeGenOpts().StrictVTablePointers) {
+      const CXXRecordDecl *ThisRecordDecl = MD->getParent();
+      bool IsPolymorphicObject = ThisRecordDecl->isPolymorphic();
+      bool IsStructor = isa<CXXDestructorDecl, CXXConstructorDecl>(MD);
+      bool IsFinal = ThisRecordDecl->isEffectivelyFinal();
+      // We do not care about whether this is a virtual method, because even
+      // if the current method is not virtual, it may be calling another method
+      // that calls a virtual function.
+      if (IsPolymorphicObject && !IsStructor && IsFinal)
+        EmitVTableAssumptionLoads(ThisRecordDecl, LoadCXXThisAddress());
+    }
     // Check the 'this' pointer once per function, if it's available.
     if (CXXABIThisValue) {
       SanitizerSet SkippedChecks;
diff --git a/clang/test/CodeGenCXX/indirect-final-vcall-through-base.cpp b/clang/test/CodeGenCXX/indirect-final-vcall-through-base.cpp
new file mode 100644
index 0000000000000..4203c869acaea
--- /dev/null
+++ b/clang/test/CodeGenCXX/indirect-final-vcall-through-base.cpp
@@ -0,0 +1,414 @@
+// Actual triple does not matter, just ensuring that the ABI being used for
+// mangling and similar is consistent. Choosing x86_64 as that seems to be a
+// configured target for most build configurations
+// RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3                          -o - | FileCheck %s
+// RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3 -fstrict-vtable-pointers -o - | FileCheck %s --check-prefix=STRICT
+
+using size_t = unsigned long;
+using int64_t = long;
+
+struct Base {
+    virtual int64_t sharedGet(size_t i) const = 0;
+    virtual int64_t get(size_t i) const = 0;
+    virtual int64_t getBatch(size_t offset, size_t len, int64_t arr[]) const {
+      int64_t result = 0;
+      for (size_t i = 0; i < len; ++i) {
+        result += get(offset + i);
+        arr[i] = get(offset + i);
+      }
+      return result;
+    }
+    virtual int64_t getSumLen(size_t offset, size_t len) const {
+      int64_t result = 0;
+      for (size_t i = 0; i < len; ++i) {
+        result += get(offset + i);
+      }
+      return result;
+    }
+    virtual int64_t useSharedGet(size_t i) {
+      return sharedGet(i);
+    }
+};
+  
+struct Derived1 final : public Base {
+public:
+    virtual int64_t sharedGet(size_t i) const override { return 17; }
+    int64_t get(size_t i) const override {
+        return i;
+    }
+    
+    int64_t getBatch(size_t offset, size_t len, int64_t arr[]) const override;
+    virtual int64_t getSumLen(size_t offset, size_t len) const override;
+    int64_t directCall(size_t offset, size_t len);
+    int64_t directBaseCall(size_t offset, size_t len);
+    virtual int64_t useSharedGet(size_t i) override;
+};
+
+struct Base2 {
+    unsigned value = 0;
+    virtual int64_t sharedGet(size_t i) const = 0;
+    virtual int64_t get2(size_t i) const = 0;
+    virtual int64_t getBatch2(size_t offset, size_t len, int64_t arr[]) const {
+      int64_t result = 0;
+      for (size_t i = 0; i < len; ++i) {
+        result += get2(offset + i);
+        arr[i] = get2(offset + i);
+      }
+      return result;
+    }
+    virtual int64_t getValue() = 0;
+    virtual int64_t callGetValue() {
+      return getValue();
+    }
+    virtual int64_t useBase(Base *b) {
+      return b->get(0);
+    }
+};
+
+struct Derived2 final : Base, Base2 {
+    virtual int64_t sharedGet(size_t i) const override { return 19; };
+    virtual int64_t get(size_t i) const override {
+      return 7;
+    };
+    virtual int64_t get2(size_t i) const override {
+      return 13;
+    };
+    int64_t getBatch(size_t offset, size_t len, int64_t arr[]) const override;
+    virtual int64_t getSumLen(size_t offset, size_t len) const override;
+    int64_t getBatch2(size_t offset, size_t len, int64_t arr[]) const override;
+    virtual int64_t useSharedGet(size_t i) override;
+    virtual int64_t useBase(Base *b) override;
+    virtual int64_t getValue() override { return value; }
+    virtual int64_t callGetValue() override;
+};
+
+struct IntermediateA: virtual Base {
+
+};
+struct IntermediateB: virtual Base2 {
+
+};
+
+struct Derived3Part1: IntermediateA {
+
+};
+
+struct Derived3Part2: IntermediateB {
+
+};
+
+struct Derived3 final: Derived3Part1, Derived3Part2 {
+    virtual int64_t sharedGet(size_t i) const override { return 23; }
+    virtual int64_t get(size_t i) const override { return 27; }
+    virtual int64_t getBatch(size_t offset, size_t len, int64_t arr[]) const override;
+    virtual int64_t get2(size_t i) const override { return 29; }
+    virtual int64_t getBatch2(size_t offset, size_t len, int64_t arr[]) const override;
+    virtual int64_t useSharedGet(size_t i) override;
+    virtual int64_t useBase(Base *b) override;
+    virtual int64_t getValue() override { return value; }
+    virtual int64_t callGetValue() override;
+};
+
+int64_t Derived1::directCall(size_t offset, size_t len) {
+  return getSumLen(offset, len);
+}
+
+int64_t Derived1::directBaseCall(size_t offset, size_t len) {
+  return Base::getSumLen(offset, len);
+}
+
+int64_t Derived1::getBatch(size_t offset, size_t len, int64_t arr[]) const {
+    return Base::getBatch(offset, len, arr);
+}
+
+int64_t Derived1::getSumLen(size_t offset, size_t len) const {
+  return Base::getSumLen(offset, len);
+}
+
+int64_t Derived1::useSharedGet(size_t i) {
+  return Base::useSharedGet(i);
+}
+
+int64_t Derived2::getBatch(size_t offset, size_t len, int64_t arr[]) const {
+    return Base::getBatch(offset, len, arr);
+}
+
+int64_t Derived2::getBatch2(size_t offset, size_t len, int64_t arr[]) const {
+    return Base2::getBatch2(offset, len, arr);
+}
+
+int64_t Derived2::getSumLen(size_t offset, size_t len) const {
+  return Base::getSumLen(offset, len);
+}
+
+int64_t Derived2::useSharedGet(size_t i) {
+  return Base::useSharedGet(i);
+}
+
+int64_t Derived2::useBase(Base *b) {
+  return Base2::useBase(this);
+}
+
+int64_t Derived2::callGetValue() {
+  return Base2::callGetValue();
+}
+
+int64_t Derived3::getBatch(size_t offset, size_t len, int64_t arr[]) const {
+  return Base::getBatch(offset, len, arr);
+}
+int64_t Derived3::getBatch2(size_t offset, size_t len, int64_t arr[]) const {
+  return Base2::getBatch2(offset, len, arr);
+}
+
+int64_t Derived3::useSharedGet(size_t i) {
+  return Base::useSharedGet(i);
+}
+int64_t Derived3::useBase(Base *b) {
+  return Base2::useBase(this);
+}
+
+int64_t Derived3::callGetValue() {
+  return Base2::callGetValue();
+}
+
+// CHECK-LABEL: i64 @_ZN8Derived110directCallEmm(
+// CHECK: for.body
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE]]
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VFN_SLOT]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived19getSumLenEmm(
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE]]
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VFN_SLOT]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZN8Derived114directBaseCallEmm(
+// CHECK: for.body
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE]]
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VFN_SLOT]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived18getBatchEmmPl(
+// CHECK: for.
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: tail call noundef i64 [[VFN1]](
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived28getBatchEmmPl(
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: tail call noundef i64 [[VFN1]](
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived29getBatch2EmmPl(
+// CHECK: [[OFFSETBASE:%.*]] = getelementptr inbounds nuw i8, ptr %this
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr [[OFFSETBASE]]
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: tail call noundef i64 [[VFN1]](
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr [[OFFSETBASE]]
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZThn8_NK8Derived29getBatch2EmmPl(
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: tail call noundef i64 [[VFN1]](
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived29getSumLenEmm(
+// CHECK: for.body
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE]]
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VFN_SLOT]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZN8Derived212useSharedGetEm(
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VTABLE]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived38getBatchEmmPl
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: tail call noundef i64 [[VFN1]](
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZTv0_n40_NK8Derived38getBatchEmmPl
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[THISOFFSET_VSLOT:%.*]] = getelementptr inbounds i8, ptr [[VTABLE]], i64 -40
+// CHECK: [[THIS_OFFSET:%.*]] = load i64, ptr [[THISOFFSET_VSLOT]]
+// CHECK: [[THIS:%.*]] = getelementptr inbounds i8, ptr %this, i64 [[THIS_OFFSET]]
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr [[THIS]]
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr [[THIS]]
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZNK8Derived39getBatch2EmmPl
+// CHECK: [[OFFSETBASE:%.*]] = getelementptr inbounds nuw i8, ptr %this
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr [[OFFSETBASE]]
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: tail call noundef i64 [[VFN1]](
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr [[OFFSETBASE]]
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZTv0_n40_NK8Derived39getBatch2EmmPl
+// CHECK: entry:
+  // %vtable = load ptr, ptr %this, align 8, !tbaa !6
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+  // %0 = getelementptr inbounds i8, ptr %vtable, i64 -40
+// CHECK: [[THISOFFSET_VSLOT:%.*]] = getelementptr inbounds i8, ptr [[VTABLE]], i64 -40
+  // %1 = load i64, ptr %0, align 8
+// CHECK: [[THIS_OFFSET:%.*]] = load i64, ptr [[THISOFFSET_VSLOT]]
+  // %2 = getelementptr inbounds i8, ptr %this, i64 %1
+// CHECK: [[BASE:%.*]] = getelementptr inbounds i8, ptr %this, i64 [[THIS_OFFSET]]
+         // %add.ptr.i = getelementptr inbounds nuw i8, ptr %2, i64 16
+// CHECK: [[THIS:%.*]] = getelementptr inbounds nuw i8, ptr [[BASE]], i64 16
+// CHECK: {{for.body.*:}}
+// CHECK: [[VTABLE1:%.*]] = load ptr, ptr [[THIS]]
+// CHECK: [[VFN_SLOT1:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE1]]
+// CHECK: [[VFN1:%.*]] = load ptr, ptr [[VFN_SLOT1]]
+// CHECK: [[VTABLE2:%.*]] = load ptr, ptr [[THIS]]
+// CHECK: [[VFN_SLOT2:%.*]] = getelementptr inbounds nuw i8, ptr [[VTABLE2]]
+// CHECK: [[VFN2:%.*]] = load ptr, ptr [[VFN_SLOT2]]
+// CHECK: tail call noundef i64 [[VFN2]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZN8Derived312useSharedGetEm
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VTABLE]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+// CHECK-LABEL: i64 @_ZTv0_n56_N8Derived312useSharedGetEm
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr %this
+// CHECK: [[THISOFFSET_VSLOT:%.*]] = getelementptr inbounds i8, ptr [[VTABLE]], i64 -56
+// CHECK: [[THIS_OFFSET:%.*]] = load i64, ptr [[THISOFFSET_VSLOT]]
+// CHECK: [[THIS:%.*]] = getelementptr inbounds i8, ptr %this, i64 [[THIS_OFFSET]]
+// CHECK: [[VTABLE:%.*]] = load ptr, ptr [[THIS]]
+// CHECK: [[VFN:%.*]] = load ptr, ptr [[VTABLE]]
+// CHECK: tail call noundef i64 [[VFN]](
+// CHECK: ret i64
+
+
+// STRICT-LABEL: i64 @_ZN8Derived110directCallEmm(
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZNK8Derived19getSumLenEmm(
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived114directBaseCallEmm(
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZNK8Derived18getBatchEmmPl
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZNK8Derived29getSumLenEmm(
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived212useSharedGetEm(
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived27useBaseEP4Base
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived212callGetValueEv(
+// STRICT: [[OFFSET_THIS:%.*]] = getelementptr inbounds nuw i8, ptr %this, i64 8
+// STRICT: [[INVARIANT_THIS:%.*]] = tail call ptr @llvm.strip.invariant.group.p0(ptr nonnull [[OFFSET_THIS]])
+// STRICT: [[VALUE_PTR:%.*]] = getelementptr inbounds nuw i8, ptr [[INVARIANT_THIS]], i64 8
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZThn8_N8Derived212callGetValueEv
+// STRICT: [[INVARIANT_THIS:%.*]] = tail call ptr @llvm.strip.invariant.group.p0(ptr nonnull readonly %this)
+// STRICT: [[VALUE_PTR:%.*]] = getelementptr inbounds nuw i8, ptr [[INVARIANT_THIS]], i64 8
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZNK8Derived38getBatchEmmPl
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZTv0_n40_NK8Derived38getBatchEmmPl
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZNK8Derived39getBatch2EmmPl
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZTv0_n40_NK8Derived39getBatch2EmmPl
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived312useSharedGetEm
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZTv0_n56_N8Derived312useSharedGetEm
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived37useBaseEP4Base
+// STRICT-NOT: call
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZN8Derived312callGetValueEv(
+// STRICT: [[TRUE_THIS:%.*]] = getelementptr inbounds nuw i8, ptr %this, i64 16
+// STRICT: [[INVARIANT_THIS:%.*]] = tail call ptr @llvm.strip.invariant.group.p0(ptr nonnull [[TRUE_THIS]])
+// STRICT: [[VALUE_PTR:%.*]] = getelementptr inbounds nuw i8, ptr [[INVARIANT_THIS]], i64 8
+// STRICT: ret i64
+
+// STRICT-LABEL: i64 @_ZTv0_n56_N8Derived312callGetValueEv
+// STRICT: [[VTABLE:%.*]] = load ptr, ptr %this
+// STRICT: [[THISOFFSET_VSLOT:%.*]] = getelementptr inbounds i8, ptr [[VTABLE]], i64 -48
+// STRICT: [[THIS_OFFSET:%.*]] = load i64, ptr [[THISOFFSET_VSLOT]]
+// STRICT: [[VIRTUAL_BASE:%.*]] = getelementptr inbounds i8, ptr %this, i64 [[THIS_OFFSET]]
+// STRICT: [[TRUE_THIS:%.*]] = getelementptr inbounds nuw i8, ptr [[VIRTUAL_BASE]], i64 16
+// STRICT: [[INVARIANT_THIS:%.*]] = tail call ptr @llvm.strip.invariant.group.p0(ptr nonnull [[TRUE_THIS]])
+// STRICT: [[VALUE_PTR:%.*]] = getelementptr inbounds nuw i8, ptr [[INVARIANT_THIS]], i64 8
+// STRICT: ret i64


llvmbot commented Oct 30, 2025

@llvm/pr-subscribers-clang-codegen

Author: Oliver Hunt (ojhunt)



Comment on lines +4 to +5
// RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3 -o - | FileCheck %s
// RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3 -fstrict-vtable-pointers -o - | FileCheck %s --check-prefix=STRICT
Member

nit: I still feel it's better to remove the -std=c++26 option.

Contributor Author

👍

Comment on lines +4 to +5
// RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3 -o - | FileCheck %s
// RUN: %clang_cc1 -triple=x86_64 -std=c++26 %s -emit-llvm -O3 -fstrict-vtable-pointers -o - | FileCheck %s --check-prefix=STRICT
Member

nit2: for CodeGen changes, it is better to add another check with -disable-llvm-passes to check the generated code.

Contributor Author

Oh, so we're verifying that the correct IR is being generated initially?

Member

Yeah.

nitpicking: not "correct" but "expected" (although this doesn't matter much). The idea is that we only change/control the CodeGen part, but we need the middle end to optimize it. The newly added check is helpful because if the test fails someday, we can quickly tell whether the problem is due to CodeGen or due to the middle end.
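
For instance, such an extra check might look like the following (a sketch only; the UNOPT prefix and the exact assume pattern are illustrative):

    // RUN: %clang_cc1 -triple=x86_64 %s -emit-llvm -disable-llvm-passes \
    // RUN:   -fstrict-vtable-pointers -o - | FileCheck %s --check-prefix=UNOPT
    //
    // UNOPT-LABEL: define{{.*}} @_ZN8Derived110directCallEmm(
    // UNOPT: call void @llvm.assume(i1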
