Skip to content

[Clang] Add __builtin_invoke and use it in libc++ #116709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jun 29, 2025

Conversation

philnik777
Copy link
Contributor

@philnik777 philnik777 commented Nov 18, 2024

std::invoke is currently quite heavy compared to a function call, since it involves quite heavy SFINAE. This can be done significantly more efficient by the compiler, since most calls to std::invoke are simple function calls and 6 out of the seven overloads for std::invoke exist only to support member pointers. Even these boil down to a few relatively simple checks.

Some real-world testing with this patch revealed some significant results. For example, instantiating std::format("Banane") (and its callees) went down from ~125ms on my system to ~104ms.

Copy link

github-actions bot commented Nov 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@philnik777
Copy link
Contributor Author

philnik777 commented Nov 22, 2024

@ldionne I think this is the more appropriate place to discuss this.

(From #116637 (comment))

Chris had a patch on Clang at some point where we discussed this and I voiced concerns. Here it is: https://reviews.llvm.org/D130867

Here are some of my concerns:

  1. Implementing more stuff in the compiler adds complexity to the compiler, and does not decrease the complexity of the library because libc++ still needs to implement invoke on other compilers.

  2. It duplicates the tests, since we need to test the builtin and the standard library interface to the same level.

  3. Duplicating functionality in libc++ and in the compiler is confusing. For instance, it sets up the stage for people not knowing where a bug in std::invoke should be fixed. Is it in the compiler? In the library? In both? And the set of folks who can fix a bug in the compiler is a lot smaller than the set of folks who can do that in the library.

  4. It creates another level of coupling between the library and the compiler version-wise. For example, if we have a LWG issue fixing std::invoke, we must now fix it in Clang and whether the issue is fixed is not something we can determine from the library, it becomes something that depends on the compiler. This is technically a pre-existing issue with the builtin type traits, but in practice the type traits have a simple API and they don't change much, unlike std::invoke which has a complex simplification specification and could be the target of changes.

  1. I'm not convinced the library part is true. The reality is that we support Clang and GCC, and if they both support the builtins (or provide different ones for the same feature) we remove our fallback implementations and thus reducing the complexity for libc++. GCC 15 already adds __is_invocable, so we would be able to at least simplify our traits. Whether the complexity is too high for the compiler is something the compiler maintainers should judge IMO. FWIW IMO The compiler implementations is actually much easier to understand than the mess we have in the library.

  2. True. I'm not sure this is a huge burden though.

  3. IMO that's not the case. We call a builtin. If that builtin is wrong we file a bug against the compiler. I don't really see the confusion here. It's exactly the same as with type traits.

  4. I don't think this is just "technically pre-existing". There are bugs in the compiler builtins. In some cases we've waited until these bugs are fixed to use a builtin trait, but when the bug was pre-existing for a significant amount of time we didn't bother trying to work around it in the library. I'm not sure what you mean by "complex simplification".

@ldionne
Copy link
Member

ldionne commented Nov 22, 2024

  1. I'm not convinced the library part is true. The reality is that we support Clang and GCC, and if they both support the builtins (or provide different ones for the same feature) we remove our fallback implementations and thus reducing the complexity for libc++. GCC 15 already adds __is_invocable, so we would be able to at least simplify our traits. Whether the complexity is too high for the compiler is something the compiler maintainers should judge IMO. FWIW IMO The compiler implementations is actually much easier to understand than the mess we have in the library.

Does GCC implement (or have plans to implement) __builtin_invoke?

  1. True. I'm not sure this is a huge burden though.

Well, our tests for std::invoke are not simple, and we'd have to do the same in Clang if we want to have the same amount of coverage.

  1. IMO that's not the case. We call a builtin. If that builtin is wrong we file a bug against the compiler. I don't really see the confusion here. It's exactly the same as with type traits.

Your PR description says that the compiler recognizes std::invoke itself as a builtin, is that right? It makes a difference to me whether std::invoke is a builtin directly or std::invoke calls __builtin_invoke.

  1. I don't think this is just "technically pre-existing". There are bugs in the compiler builtins. In some cases we've waited until these bugs are fixed to use a builtin trait, but when the bug was pre-existing for a significant amount of time we didn't bother trying to work around it in the library. I'm not sure what you mean by "complex simplification".

Sorry, I meant complex specification. I mean that invoke is not like is_constructible -- it may be updated in the future. The updates would be to the library wording, and I'd find it really weird to have to go change the compiler builtin to account for what's effectively a library change.

@philnik777
Copy link
Contributor Author

  1. I'm not convinced the library part is true. The reality is that we support Clang and GCC, and if they both support the builtins (or provide different ones for the same feature) we remove our fallback implementations and thus reducing the complexity for libc++. GCC 15 already adds __is_invocable, so we would be able to at least simplify our traits. Whether the complexity is too high for the compiler is something the compiler maintainers should judge IMO. FWIW IMO The compiler implementations is actually much easier to understand than the mess we have in the library.

Does GCC implement (or have plans to implement) __builtin_invoke?

Not that I'm aware of.

  1. True. I'm not sure this is a huge burden though.

Well, our tests for std::invoke are not simple, and we'd have to do the same in Clang if we want to have the same amount of coverage.

AFAICT we have one test for invoke plus a constexpr test which should really just be rolled into the generic invoke test. That test is ~400 LoC, which is definitely not nothing, but also not unmanageable IMO. We could probably also simplify that by not explicitly listing every cv combination but instead using types::for_each.

  1. IMO that's not the case. We call a builtin. If that builtin is wrong we file a bug against the compiler. I don't really see the confusion here. It's exactly the same as with type traits.

Your PR description says that the compiler recognizes std::invoke itself as a builtin, is that right? It makes a difference to me whether std::invoke is a builtin directly or std::invoke calls __builtin_invoke.

Yes, it does. If that is the main point of contention we can move that into a separate patch. I don't expect the compile times to differ significantly between recognizing std::invoke as a builtin itself and having std::invoke use __builtin_invoke as the implementation. It would still be nice for improved debug performance and reduced debug info though.

What exactly is the difference you perceive between the two? I mean, sure, there is one, but if std::invoke just always calls __builtin_invoke there really isn't a meaningful one.

  1. I don't think this is just "technically pre-existing". There are bugs in the compiler builtins. In some cases we've waited until these bugs are fixed to use a builtin trait, but when the bug was pre-existing for a significant amount of time we didn't bother trying to work around it in the library. I'm not sure what you mean by "complex simplification".

Sorry, I meant complex specification. I mean that invoke is not like is_constructible -- it may be updated in the future. The updates would be to the library wording, and I'd find it really weird to have to go change the compiler builtin to account for what's effectively a library change.

Is your main concern here that we'd be limited in the library what we can change without the changing the compiler? If that is the case, would your concerns be alleviated if we could opt out of the builtin recognition for specific overloads, e.g. through an attribute?

@ldionne
Copy link
Member

ldionne commented Nov 26, 2024

AFAICT we have one test for invoke plus a constexpr test which should really just be rolled into the generic invoke test. That test is ~400 LoC, which is definitely not nothing, but also not unmanageable IMO. We could probably also simplify that by not explicitly listing every cv combination but instead using types::for_each.

As long as libc++ keeps testing its public API fully, I'm OK with this. What I don't want is to start shifting test burden towards Clang because the compiler actually implements most of the functionality. These builtins should be implementation details, so we should still be testing the full public API in libc++ itself.

The testing question from the Clang side is interesting and I would expect Clang folks to have an opinion. From the Clang side, I'd expect some concerns about adding a builtin with such a complex and user-visible "API". But those concerns are not really for me to raise, if Clang folks are fine with it, that's not an issue for me.

Yes, it does. If that is the main point of contention we can move that into a separate patch. I don't expect the compile times to differ significantly between recognizing std::invoke as a builtin itself and having std::invoke use __builtin_invoke as the implementation. It would still be nice for improved debug performance and reduced debug info though.

What exactly is the difference you perceive between the two? I mean, sure, there is one, but if std::invoke just always calls __builtin_invoke there really isn't a meaningful one.

[...]

Is your main concern here that we'd be limited in the library what we can change without the changing the compiler? If that is the case, would your concerns be alleviated if we could opt out of the builtin recognition for specific overloads, e.g. through an attribute?

As discussed just now in the libc++ monthly:

  • Silently hijacking a libc++ function by the compiler is surprising and certainly unexpected unless you already know what's going on. I can imagine someone trying to fix a bug in std::invoke and spending hours scratching their heads not understanding why std::invoke is still not doing the right thing after their changes, only to find out that the compiler was really implementing the function under the hood. These things should be explicit.
  • Silently hijacking also means that we lose control over what's happening from the library side, so a bug or an improvement requires changes to the compiler.

My view is that compiler builtins should be tools to implement an API more simply and more efficiently. They shouldn't be the API itself.

Personally, I think the best way to move this forward would be to remove the hijacking from this patch to make it less contentious, and then to potentially pursue an attribute to formalize how hijacking works. This could (and should) be used beyond std::invoke, for example with std::forward and std::move. That would actually be an improvement to the status-quo in relation to my concerns with "discoverability".

@philnik777 philnik777 changed the title [Clang] Add __builtin_invoke and detect std::invoke as a builtin [Clang] Add __builtin_invoke Dec 14, 2024
@philnik777 philnik777 changed the title [Clang] Add __builtin_invoke [Clang] Add __builtin_invoke and use it in libc++ Dec 14, 2024
@philnik777 philnik777 force-pushed the builtin_invoke branch 2 times, most recently from ddcc918 to c52c8f3 Compare March 30, 2025 09:40
@philnik777 philnik777 marked this pull request as ready for review May 20, 2025 12:16
@philnik777 philnik777 requested a review from a team as a code owner May 20, 2025 12:16
@llvmbot llvmbot added clang Clang issues not falling into any other category libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. clang:frontend Language frontend issues, e.g. anything involving "Sema" labels May 20, 2025
@llvmbot
Copy link
Member

llvmbot commented May 20, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-libcxx

Author: Nikolas Klauser (philnik777)

Changes

std::invoke is currently quite heavy compared to a function call, since it involves quite heavy SFINAE. This can be done significantly more efficient by the compiler, since most calls to std::invoke are simple function calls and 6 out of the seven overloads for std::invoke exist only to support member pointers. Even these boil down to a few relatively simple checks.

Some real-world testing with this patch revealed some significant results. For example, instantiating std::format("Banane") (and its callees) went down from ~125ms on my system to ~104ms.


Patch is 29.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/116709.diff

9 Files Affected:

  • (modified) clang/include/clang/Basic/Builtins.td (+6)
  • (modified) clang/include/clang/Sema/Sema.h (+9)
  • (modified) clang/lib/Parse/ParseDeclCXX.cpp (+1-23)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+97)
  • (modified) clang/lib/Sema/SemaExprCXX.cpp (+54-51)
  • (added) clang/test/CodeGenCXX/builtin-invoke.cpp (+61)
  • (added) clang/test/SemaCXX/builtin-invoke.cpp (+133)
  • (modified) libcxx/include/__type_traits/invoke.h (+127-28)
  • (modified) libcxx/include/__type_traits/is_core_convertible.h (+11)
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 187d3b5ed24a7..58cc35088c40a 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -4272,6 +4272,12 @@ def MoveIfNsoexcept : CxxLibBuiltin<"utility"> {
   let Namespace = "std";
 }
 
+def Invoke : Builtin {
+  let Spellings = ["__builtin_invoke"];
+  let Attributes = [CustomTypeChecking, Constexpr];
+  let Prototype = "void(...)";
+}
+
 def Annotation : Builtin {
   let Spellings = ["__builtin_annotation"];
   let Attributes = [NoThrow, CustomTypeChecking];
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 5ec67087aeea4..22d66e8688906 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -2594,6 +2594,8 @@ class Sema final : public SemaBase {
                                SourceLocation BuiltinLoc,
                                SourceLocation RParenLoc);
 
+  ExprResult BuiltinInvoke(CallExpr *TheCall);
+
   static StringRef GetFormatStringTypeName(FormatStringType FST);
   static FormatStringType GetFormatStringType(StringRef FormatFlavor);
   static FormatStringType GetFormatStringType(const FormatAttr *Format);
@@ -15220,11 +15222,18 @@ class Sema final : public SemaBase {
                                SourceLocation Loc);
   QualType BuiltinRemoveReference(QualType BaseType, UTTKind UKind,
                                   SourceLocation Loc);
+
+  QualType BuiltinRemoveCVRef(QualType BaseType, SourceLocation Loc) {
+    return BuiltinRemoveReference(BaseType, UTTKind::RemoveCVRef, Loc);
+  }
+
   QualType BuiltinChangeCVRQualifiers(QualType BaseType, UTTKind UKind,
                                       SourceLocation Loc);
   QualType BuiltinChangeSignedness(QualType BaseType, UTTKind UKind,
                                    SourceLocation Loc);
 
+  bool BuiltinIsBaseOf(SourceLocation RhsTLoc, QualType LhsT, QualType RhsT);
+
   /// Ensure that the type T is a literal type.
   ///
   /// This routine checks whether the type @p T is a literal type. If @p T is an
diff --git a/clang/lib/Parse/ParseDeclCXX.cpp b/clang/lib/Parse/ParseDeclCXX.cpp
index 316bc30edf1f0..aeb1112bad8b4 100644
--- a/clang/lib/Parse/ParseDeclCXX.cpp
+++ b/clang/lib/Parse/ParseDeclCXX.cpp
@@ -1611,29 +1611,7 @@ void Parser::ParseClassSpecifier(tok::TokenKind TagTokKind,
       Tok.isOneOf(
 #define TRANSFORM_TYPE_TRAIT_DEF(_, Trait) tok::kw___##Trait,
 #include "clang/Basic/TransformTypeTraits.def"
-          tok::kw___is_abstract,
-          tok::kw___is_aggregate,
-          tok::kw___is_arithmetic,
-          tok::kw___is_array,
-          tok::kw___is_assignable,
-          tok::kw___is_base_of,
-          tok::kw___is_bounded_array,
-          tok::kw___is_class,
-          tok::kw___is_complete_type,
-          tok::kw___is_compound,
-          tok::kw___is_const,
-          tok::kw___is_constructible,
-          tok::kw___is_convertible,
-          tok::kw___is_convertible_to,
-          tok::kw___is_destructible,
-          tok::kw___is_empty,
-          tok::kw___is_enum,
-          tok::kw___is_floating_point,
-          tok::kw___is_final,
-          tok::kw___is_function,
-          tok::kw___is_fundamental,
-          tok::kw___is_integral,
-          tok::kw___is_interface_class,
+          tok::kw___is_convertible, // Last use in libc++ was removed in 925a11a
           tok::kw___is_literal,
           tok::kw___is_lvalue_expr,
           tok::kw___is_lvalue_reference,
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index a960b9931ddfd..26579de25bdf0 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2368,6 +2368,8 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     return BuiltinShuffleVector(TheCall);
     // TheCall will be freed by the smart pointer here, but that's fine, since
     // BuiltinShuffleVector guts it, but then doesn't release it.
+  case Builtin::BI__builtin_invoke:
+    return BuiltinInvoke(TheCall);
   case Builtin::BI__builtin_prefetch:
     if (BuiltinPrefetch(TheCall))
       return ExprError();
@@ -5406,6 +5408,101 @@ ExprResult Sema::ConvertVectorExpr(Expr *E, TypeSourceInfo *TInfo,
                                    RParenLoc, CurFPFeatureOverrides());
 }
 
+ExprResult Sema::BuiltinInvoke(CallExpr *TheCall) {
+  auto Loc = TheCall->getBeginLoc();
+  auto Args = MutableArrayRef(TheCall->getArgs(), TheCall->getNumArgs());
+  assert(llvm::none_of(Args,
+                       [](Expr *Arg) { return Arg->isTypeDependent(); }));
+
+  if (Args.size() == 0) {
+    Diag(TheCall->getBeginLoc(), diag::err_typecheck_call_too_few_args_at_least)
+        << 0 << 1 << 0 << 0 << TheCall->getSourceRange();
+    return ExprError();
+  }
+
+  auto FuncT = Args[0]->getType();
+
+  if (auto *MPT = FuncT->getAs<MemberPointerType>()) {
+    if (Args.size() < 2) {
+      Diag(TheCall->getBeginLoc(),
+            diag::err_typecheck_call_too_few_args_at_least)
+          << 0 << 2 << 1 << 0 << TheCall->getSourceRange();
+      return ExprError();
+    }
+
+    auto *MemPtrClass = MPT->getQualifier()->getAsType();
+    auto ObjectT = Args[1]->getType();
+
+
+    if (MPT->isMemberDataPointer() && Args.size() != 2) {
+      Diag(TheCall->getBeginLoc(), diag::err_typecheck_call_too_many_args)
+          << 0 << 2 << Args.size() << 0 << TheCall->getSourceRange();
+      return ExprError();
+    }
+
+    ExprResult ObjectArg = [&]() -> ExprResult {
+      // (1.1): (t1.*f)(t2, …, tN) when f is a pointer to a member function of a
+      // class T and is_same_v<T, remove_cvref_t<decltype(t1)>> ||
+      // is_base_of_v<T, remove_cvref_t<decltype(t1)>> is true;
+      // (1.4): t1.*f when N=1 and f is a pointer to data member of a class T
+      // and is_same_v<T, remove_cvref_t<decltype(t1)>> ||
+      // is_base_of_v<T, remove_cvref_t<decltype(t1)>> is true;
+      if (Context.hasSameType(QualType(MemPtrClass, 0),
+                              BuiltinRemoveCVRef(ObjectT, Loc)) ||
+          BuiltinIsBaseOf(Args[1]->getBeginLoc(), QualType(MemPtrClass, 0),
+                          BuiltinRemoveCVRef(ObjectT, Loc))) {
+        return Args[1];
+      }
+
+      // (t1.get().*f)(t2, …, tN) when f is a pointer to a member function of
+      // a class T and remove_cvref_t<decltype(t1)> is a specialization of
+      // reference_wrapper;
+      if (auto *RD = ObjectT->getAsCXXRecordDecl()) {
+        if (RD->isInStdNamespace() &&
+            RD->getDeclName().getAsString() == "reference_wrapper") {
+          CXXScopeSpec SS;
+          IdentifierInfo *GetName = &Context.Idents.get("get");
+          UnqualifiedId GetID;
+          GetID.setIdentifier(GetName, Loc);
+
+          auto MemExpr = ActOnMemberAccessExpr(
+              getCurScope(), Args[1], Loc, tok::period, SS,
+              /*TemplateKWLoc=*/SourceLocation(), GetID, nullptr);
+
+          if (MemExpr.isInvalid())
+            return ExprError();
+
+          return ActOnCallExpr(getCurScope(), MemExpr.get(), Loc, {}, Loc);
+        }
+      }
+
+      // ((*t1).*f)(t2, …, tN) when f is a pointer to a member function of a
+      // class T and t1 does not satisfy the previous two items;
+
+      return ActOnUnaryOp(getCurScope(), Loc, tok::star, Args[1]);
+    }();
+
+    if (ObjectArg.isInvalid())
+      return ExprError();
+
+    auto BinOp = ActOnBinOp(getCurScope(), TheCall->getBeginLoc(),
+                            tok::periodstar, ObjectArg.get(), Args[0]);
+    if (BinOp.isInvalid())
+      return ExprError();
+
+    if (MPT->isMemberDataPointer())
+      return BinOp;
+
+    auto *MemCall = new (Context)
+        ParenExpr(SourceLocation(), SourceLocation(), BinOp.get());
+
+    return ActOnCallExpr(getCurScope(), MemCall, TheCall->getBeginLoc(),
+                         Args.drop_front(2), TheCall->getRParenLoc());
+  }
+  return ActOnCallExpr(getCurScope(), Args.front(), TheCall->getBeginLoc(),
+                       Args.drop_front(), TheCall->getRParenLoc());
+}
+
 bool Sema::BuiltinPrefetch(CallExpr *TheCall) {
   unsigned NumArgs = TheCall->getNumArgs();
 
diff --git a/clang/lib/Sema/SemaExprCXX.cpp b/clang/lib/Sema/SemaExprCXX.cpp
index b071c98051bbe..e945b446953c7 100644
--- a/clang/lib/Sema/SemaExprCXX.cpp
+++ b/clang/lib/Sema/SemaExprCXX.cpp
@@ -6540,67 +6540,70 @@ ExprResult Sema::ActOnTypeTrait(TypeTrait Kind, SourceLocation KWLoc,
   return BuildTypeTrait(Kind, KWLoc, ConvertedArgs, RParenLoc);
 }
 
-static bool EvaluateBinaryTypeTrait(Sema &Self, TypeTrait BTT, const TypeSourceInfo *Lhs,
-                                    const TypeSourceInfo *Rhs, SourceLocation KeyLoc) {
-  QualType LhsT = Lhs->getType();
-  QualType RhsT = Rhs->getType();
+bool Sema::BuiltinIsBaseOf(SourceLocation RhsTLoc, QualType LhsT,
+                           QualType RhsT) {
+  // C++0x [meta.rel]p2
+  // Base is a base class of Derived without regard to cv-qualifiers or
+  // Base and Derived are not unions and name the same class type without
+  // regard to cv-qualifiers.
+
+  const RecordType *lhsRecord = LhsT->getAs<RecordType>();
+  const RecordType *rhsRecord = RhsT->getAs<RecordType>();
+  if (!rhsRecord || !lhsRecord) {
+    const ObjCObjectType *LHSObjTy = LhsT->getAs<ObjCObjectType>();
+    const ObjCObjectType *RHSObjTy = RhsT->getAs<ObjCObjectType>();
+    if (!LHSObjTy || !RHSObjTy)
+      return false;
 
-  assert(!LhsT->isDependentType() && !RhsT->isDependentType() &&
-         "Cannot evaluate traits of dependent types");
+    ObjCInterfaceDecl *BaseInterface = LHSObjTy->getInterface();
+    ObjCInterfaceDecl *DerivedInterface = RHSObjTy->getInterface();
+    if (!BaseInterface || !DerivedInterface)
+      return false;
 
-  switch(BTT) {
-  case BTT_IsBaseOf: {
-    // C++0x [meta.rel]p2
-    // Base is a base class of Derived without regard to cv-qualifiers or
-    // Base and Derived are not unions and name the same class type without
-    // regard to cv-qualifiers.
-
-    const RecordType *lhsRecord = LhsT->getAs<RecordType>();
-    const RecordType *rhsRecord = RhsT->getAs<RecordType>();
-    if (!rhsRecord || !lhsRecord) {
-      const ObjCObjectType *LHSObjTy = LhsT->getAs<ObjCObjectType>();
-      const ObjCObjectType *RHSObjTy = RhsT->getAs<ObjCObjectType>();
-      if (!LHSObjTy || !RHSObjTy)
-        return false;
+    if (RequireCompleteType(RhsTLoc, RhsT,
+                            diag::err_incomplete_type_used_in_type_trait_expr))
+      return false;
 
-      ObjCInterfaceDecl *BaseInterface = LHSObjTy->getInterface();
-      ObjCInterfaceDecl *DerivedInterface = RHSObjTy->getInterface();
-      if (!BaseInterface || !DerivedInterface)
-        return false;
+    return BaseInterface->isSuperClassOf(DerivedInterface);
+  }
 
-      if (Self.RequireCompleteType(
-              Rhs->getTypeLoc().getBeginLoc(), RhsT,
-              diag::err_incomplete_type_used_in_type_trait_expr))
-        return false;
+  assert(Context.hasSameUnqualifiedType(LhsT, RhsT) ==
+         (lhsRecord == rhsRecord));
 
-      return BaseInterface->isSuperClassOf(DerivedInterface);
-    }
+  // Unions are never base classes, and never have base classes.
+  // It doesn't matter if they are complete or not. See PR#41843
+  if (lhsRecord && lhsRecord->getDecl()->isUnion())
+    return false;
+  if (rhsRecord && rhsRecord->getDecl()->isUnion())
+    return false;
+
+  if (lhsRecord == rhsRecord)
+    return true;
 
-    assert(Self.Context.hasSameUnqualifiedType(LhsT, RhsT)
-             == (lhsRecord == rhsRecord));
+  // C++0x [meta.rel]p2:
+  //   If Base and Derived are class types and are different types
+  //   (ignoring possible cv-qualifiers) then Derived shall be a
+  //   complete type.
+  if (RequireCompleteType(RhsTLoc, RhsT,
+                          diag::err_incomplete_type_used_in_type_trait_expr))
+    return false;
 
-    // Unions are never base classes, and never have base classes.
-    // It doesn't matter if they are complete or not. See PR#41843
-    if (lhsRecord && lhsRecord->getDecl()->isUnion())
-      return false;
-    if (rhsRecord && rhsRecord->getDecl()->isUnion())
-      return false;
+  return cast<CXXRecordDecl>(rhsRecord->getDecl())
+    ->isDerivedFrom(cast<CXXRecordDecl>(lhsRecord->getDecl()));
+}
 
-    if (lhsRecord == rhsRecord)
-      return true;
+static bool EvaluateBinaryTypeTrait(Sema &Self, TypeTrait BTT, const TypeSourceInfo *Lhs,
+                                    const TypeSourceInfo *Rhs, SourceLocation KeyLoc) {
+  QualType LhsT = Lhs->getType();
+  QualType RhsT = Rhs->getType();
 
-    // C++0x [meta.rel]p2:
-    //   If Base and Derived are class types and are different types
-    //   (ignoring possible cv-qualifiers) then Derived shall be a
-    //   complete type.
-    if (Self.RequireCompleteType(
-            Rhs->getTypeLoc().getBeginLoc(), RhsT,
-            diag::err_incomplete_type_used_in_type_trait_expr))
-      return false;
+  assert(!LhsT->isDependentType() && !RhsT->isDependentType() &&
+         "Cannot evaluate traits of dependent types");
+
+  switch(BTT) {
+  case BTT_IsBaseOf:
+    return Self.BuiltinIsBaseOf(Rhs->getTypeLoc().getBeginLoc(), LhsT, RhsT);
 
-    return cast<CXXRecordDecl>(rhsRecord->getDecl())
-      ->isDerivedFrom(cast<CXXRecordDecl>(lhsRecord->getDecl()));
-  }
   case BTT_IsVirtualBaseOf: {
     const RecordType *BaseRecord = LhsT->getAs<RecordType>();
     const RecordType *DerivedRecord = RhsT->getAs<RecordType>();
diff --git a/clang/test/CodeGenCXX/builtin-invoke.cpp b/clang/test/CodeGenCXX/builtin-invoke.cpp
new file mode 100644
index 0000000000000..af66dfd4dae30
--- /dev/null
+++ b/clang/test/CodeGenCXX/builtin-invoke.cpp
@@ -0,0 +1,61 @@
+// RUN: %clang_cc1 -triple=x86_64-linux-gnu -emit-llvm -o - %s | FileCheck %s
+
+extern "C" void* memcpy(void*, const void*, decltype(sizeof(int)));
+void func();
+
+namespace std {
+  template <class T>
+  class reference_wrapper {
+    T* ptr;
+
+  public:
+    T& get() { return *ptr; }
+  };
+} // namespace std
+
+struct Callable {
+  void operator()() {}
+
+  void func();
+};
+
+extern "C" void call1() {
+  __builtin_invoke(func);
+  __builtin_invoke(Callable{});
+  __builtin_invoke(memcpy, nullptr, nullptr, 0);
+
+  // CHECK:      define dso_local void @call1
+  // CHECK-NEXT: entry:
+  // CHECK-NEXT:   %ref.tmp = alloca %struct.Callable, align 1
+  // CHECK-NEXT:   call void @_Z4funcv()
+  // CHECK-NEXT:   call void @_ZN8CallableclEv(ptr noundef nonnull align 1 dereferenceable(1) %ref.tmp)
+  // CHECK-NEXT:   call void @llvm.memcpy.p0.p0.i64(ptr align 1 null, ptr align 1 null, i64 0, i1 false)
+  // CHECK-NEXT:   ret void
+}
+
+extern "C" void call_memptr(std::reference_wrapper<Callable> wrapper) {
+  __builtin_invoke(&Callable::func, wrapper);
+
+  // CHECK:      define dso_local void @call_memptr
+  // CHECK-NEXT: entry:
+  // CHECK-NEXT:   %wrapper = alloca %"class.std::reference_wrapper", align 8
+  // CHECK-NEXT:   %coerce.dive = getelementptr inbounds nuw %"class.std::reference_wrapper", ptr %wrapper, i32 0, i32 0
+  // CHECK-NEXT:   store ptr %wrapper.coerce, ptr %coerce.dive, align 8
+  // CHECK-NEXT:   %call = call noundef nonnull align 1 dereferenceable(1) ptr @_ZNSt17reference_wrapperI8CallableE3getEv(ptr noundef nonnull align 8 dereferenceable(8) %wrapper)
+  // CHECK-NEXT:   %0 = getelementptr inbounds i8, ptr %call, i64 0
+  // CHECK-NEXT:   br i1 false, label %memptr.virtual, label %memptr.nonvirtual
+  // CHECK-EMPTY:
+  // CHECK-NEXT: memptr.virtual:
+  // CHECK-NEXT:   %vtable = load ptr, ptr %0, align 8
+  // CHECK-NEXT:   %1 = getelementptr i8, ptr %vtable, i64 sub (i64 ptrtoint (ptr @_ZN8Callable4funcEv to i64), i64 1), !nosanitize !2
+  // CHECK-NEXT:   %memptr.virtualfn = load ptr, ptr %1, align 8, !nosanitize !2
+  // CHECK-NEXT:   br label %memptr.end
+  // CHECK-EMPTY:
+  // CHECK-NEXT: memptr.nonvirtual:
+  // CHECK-NEXT:   br label %memptr.end
+  // CHECK-EMPTY:
+  // CHECK-NEXT: memptr.end:
+  // CHECK-NEXT:   %2 = phi ptr [ %memptr.virtualfn, %memptr.virtual ], [ @_ZN8Callable4funcEv, %memptr.nonvirtual ]
+  // CHECK-NEXT:   call void %2(ptr noundef nonnull align 1 dereferenceable(1) %0)
+  // CHECK-NEXT:   ret void
+}
diff --git a/clang/test/SemaCXX/builtin-invoke.cpp b/clang/test/SemaCXX/builtin-invoke.cpp
new file mode 100644
index 0000000000000..5b156b5ff75c4
--- /dev/null
+++ b/clang/test/SemaCXX/builtin-invoke.cpp
@@ -0,0 +1,133 @@
+// RUN: %clang_cc1 -verify -fsyntax-only %s
+
+void func() { // expected-note {{'func' declared here}}
+  __builtin_invoke(); // expected-error {{too few arguments to function call, expected at least 1, have 0}}
+}
+
+void nfunc() noexcept {}
+
+struct S {};
+void argfunc(int, S) {} // expected-note {{'argfunc' declared here}}
+
+struct Callable {
+  void operator()() {}
+
+  void func() {}
+
+  int var;
+};
+
+void* malloc(decltype(sizeof(int)));
+
+template <class T>
+struct pointer_wrapper {
+  T* v;
+
+  T& operator*() {
+    return *v;
+  }
+};
+
+namespace std {
+  template <class T>
+  class reference_wrapper {
+    T* ptr;
+
+  public:
+    reference_wrapper(T& ref) : ptr(&ref) {}
+
+    T& get() { return *ptr; }
+  };
+
+  template <class T>
+  reference_wrapper<T> ref(T& v) {
+    return reference_wrapper<T>(v);
+  }
+} // namespace std
+
+struct InvalidSpecialization1 {
+  void func() {}
+
+  int var;
+};
+
+template <>
+class std::reference_wrapper<InvalidSpecialization1> {
+public:
+  reference_wrapper(InvalidSpecialization1&) {}
+};
+
+struct InvalidSpecialization2 {
+  void func() {}
+
+  int var;
+};
+
+template <>
+class std::reference_wrapper<InvalidSpecialization2> {
+public:
+  reference_wrapper(InvalidSpecialization2&) {}
+
+private:
+  InvalidSpecialization2& get(); // expected-note 2 {{declared private here}}
+};
+
+void call() {
+  __builtin_invoke(func);
+  __builtin_invoke(nfunc);
+  static_assert(!noexcept(__builtin_invoke(func)));
+  static_assert(noexcept(__builtin_invoke(nfunc)));
+  __builtin_invoke(func, 1); // expected-error {{too many arguments to function call, expected 0, have 1}}
+  __builtin_invoke(argfunc, 1); // expected-error {{too few arguments to function call, expected 2, have 1}}
+  __builtin_invoke(Callable{});
+  __builtin_invoke(malloc, 0);
+  __builtin_invoke(__builtin_malloc, 0); // expected-error {{builtin functions must be directly called}}
+
+  // Member functiom pointer
+  __builtin_invoke(&Callable::func); // expected-error {{too few arguments to function call, expected at least 2, have 1}}
+  __builtin_invoke(&Callable::func, 1); // expected-error {{indirection requires pointer operand ('int' invalid)}}
+  __builtin_invoke(&Callable::func, Callable{});
+  __builtin_invoke(&Callable::func, Callable{}, 1); // expected-error {{too many arguments to function call, expected 0, have 1}}
+
+  Callable c;
+  __builtin_invoke(&Callable::func, &c);
+  __builtin_invoke(&Callable::func, std::ref(c));
+  __builtin_invoke(&Callable::func, &c);
+  __builtin_invoke(&Callable::func, &c, 2); // expected-error {{too many arguments to function call, expected 0, have 1}}
+  __builtin_invoke(&Callable::func, pointer_wrapper<Callable>{&c});
+  __builtin_invoke(&Callable::func, pointer_wrapper<Callable>{&c}, 2); // expected-error {{too many arguments to function call, expected 0, have 1}}
+
+  InvalidSpecialization1 is1;
+  InvalidSpecialization2 is2;
+  __builtin_invoke(&InvalidSpecialization1::func, std::ref(is1)); // expected-error {{no member named 'get' in 'std::reference_wrapper<InvalidSpecialization1>'}}
+  __builtin_invoke(&InvalidSpecialization2::func, std::ref(is2)); // expected-error {{'get' is a private member of 'std::reference_wrapper<InvalidSpecialization2>'}}
+
+  // Member data pointer
+  __builtin_invoke(&Callable::var); // expected-error {{too few arguments to function call, expected at least 2, have 1}}
+  __builtin_invoke(&Callable::var, 1); // expected-error {{indirection requires pointer operand ('int' invalid)}}
+  (void)__builtin_invoke(&Callable::var, Callable{});
+  __builtin_invoke(&Callable::var, Callable{}, 1); // expected-error {{too many arguments to function call, expected 2, have 3}}
+
+  (void)__builtin_invoke(&Callable::var, &c);
+  (void)__builtin_invoke(&Callable::var, std::ref(c));
+  (void)__builtin_invoke(&Callable::var, &c);
+  __builtin_invoke(&Callable::var, &c, 2); // expected-error {{too many arguments to function call, expected 2,...
[truncated]

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of nits, else I'm reasonably ok with this. Interested to hear what @AaronBallman has to say, else I think this is beneficial.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy as-is. Please give Aaron a chance to poke if he wants to.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Precommit CI failures look related to this patch:

/var/lib/buildkite-agent/builds/linux-56-59b8f5d88-g6cdk-1/llvm-project/github-pull-requests/build-runtimes/include/c++/v1/__type_traits/invoke.h:123:29: error: variable has incomplete type 'struct _LIBCPP_TEMPLATE_VIS'
  123 | struct _LIBCPP_TEMPLATE_VIS is_invocable : bool_constant<__is_invocable_v<_Fn, _Args...> > {};
      |                             ^

@philnik777
Copy link
Contributor Author

Thanks for working on that. Do you have any benchmarks?

I haven't benchmarked extensively, but as stated in the commit message, The instantiation of std::format is ~20ms faster. IMO that's a significant enough speedup on its own that it grants adding a builtin.

I've just tested with ranges::adjacent_find, and there the instantiation goes down from ~14.9ms to ~13.5ms. Not as impressive, but still quite nice. (And I've also noticed that common_reference takes an absurd amount of time to instantiate on ranges::adjacent_find)

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with the libc++ part of this patch. I like this new approach a lot more. I think it gives us most of the benefits without raising difficult questions.

@philnik777
Copy link
Contributor Author

Gentle ping @AaronBallman @cor3ntin

@philnik777
Copy link
Contributor Author

@AaronBallman @cor3ntin any more thoughts on this?

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything wrong with the changes, but I think we're missing test coverage.

Comment on lines +2271 to +2272
if (RD->isInStdNamespace() &&
RD->getDeclName().getAsString() == "reference_wrapper") {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not opposed, but I'm also not delighted for more special knowledge about STL interfaces leaking into the compiler implementation.

if (!BaseInterface || !DerivedInterface)
return false;

if (RequireCompleteType(RhsTLoc, RhsT,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this diagnostic is tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test for the case below. If anybody can inform me how to create an ObjCObjectType I'm happy to add a test for this as well (though I'm not sure testing this is that useful).

@philnik777
Copy link
Contributor Author

Ping @AaronBallman

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from testing nits.

@AaronBallman
Copy link
Collaborator

Precommit CI failures look related

@philnik777 philnik777 merged commit 7138397 into llvm:main Jun 29, 2025
111 of 118 checks passed
@philnik777 philnik777 deleted the builtin_invoke branch June 29, 2025 15:52
@EricWF
Copy link
Member

EricWF commented Jun 29, 2025

@philnik777 I'm encountering a number of test failures related to libc++'s clang-tidy module when running the libc++ test suite with the new compiler:

# .---command stdout------------
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:68:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    68 | using __invoke_result_t = decltype(__builtin_invoke(std::declval<_Args>()...));
# |       | ^
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:79:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    79 | using __invoke_result = __invoke_result_impl<void, _Args...>;
# |       | ^

Could you please take a look at these?

template <class... Args>
concept invocable = requires(Args... args) { __builtin_invoke(args...); };

static_assert(!invocable<std::reference_wrapper<InvalidSpecialization1>>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a positive test for invocable?


namespace std {
template <class... Args>
auto invoke(Args&&... args) -> decltype(__builtin_invoke(args...));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere in the tests?

@philnik777
Copy link
Contributor Author

@philnik777 I'm encountering a number of test failures related to libc++'s clang-tidy module when running the libc++ test suite with the new compiler:

# .---command stdout------------
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:68:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    68 | using __invoke_result_t = decltype(__builtin_invoke(std::declval<_Args>()...));
# |       | ^
# | /home/eric/llvm-project/build/libcxx/libcxx/test-suite-install/include/c++/v1/__type_traits/invoke.h:79:1: error: Internal aliases should always be marked _LIBCPP_NODEBUG [libcpp-nodebug-on-aliases,-warnings-as-errors]
# |    79 | using __invoke_result = __invoke_result_impl<void, _Args...>;
# |       | ^

Could you please take a look at these?

Urgh, looks like clang-tidy doesn't run in the bootstrapping build. I'll update my local clang and fix any issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants