Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clang][HTO] Add clang attribute for propagating llvm-level information #83059

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wsmoses
Copy link
Member

@wsmoses wsmoses commented Feb 26, 2024

This PR adds functionality for specifying an LLVM function attribute within clang. This is necessary for transfering information from C++ into LLVM which doens't have a C++-level attribute. This functionality has been demonstrated to be helpful in a variety of ways, such as LLVM HTO.

However, this attribute is only intended for advanced users who need to specify specific information only available in LLVM attributes. Use of attributes which are not tied to a specific version of Clang (e.g. pure) should be preferred when available.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen labels Feb 26, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 26, 2024

@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: William Moses (wsmoses)

Changes

This PR adds functionality for specifying an LLVM function attribute within clang. This is necessary for transfering information from C++ into LLVM which doens't have a C++-level attribute. This functionality has been demonstrated to be helpful in a variety of ways, such as LLVM HTO.

However, this attribute is only intended for advanced users who need to specify specific information only available in LLVM attributes. Use of attributes which are not tied to a specific version of Clang (e.g. pure) should be preferred when available.


Full diff: https://github.com/llvm/llvm-project/pull/83059.diff

5 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+7)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+13)
  • (modified) clang/lib/CodeGen/CGCall.cpp (+8)
  • (modified) clang/lib/CodeGen/CodeGenFunction.cpp (+10)
  • (added) clang/test/CodeGen/attr-llvmfn.c (+14)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index fa191c7378dba4..044f4fada3590f 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -2211,6 +2211,13 @@ def AllocAlign : InheritableAttr {
   let Documentation = [AllocAlignDocs];
 }
 
+def LLVMFuncAttr : InheritableAttr {
+  let Spellings = [Clang<"llvm_fn_attr">];
+  let Args = [StringArgument<"LLVMAttrName">, StringArgument<"LLVMAttrValue">];
+  let Documentation = [LLVMFuncAttrDocs];
+  let InheritEvenIfAlreadyPresent = 1;
+}
+
 def NoReturn : InheritableAttr {
   let Spellings = [GCC<"noreturn">, Declspec<"noreturn">];
   // FIXME: Does GCC allow this on the function instead?
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index b96fbddd51154c..3f93bb6d6fc648 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -781,6 +781,19 @@ pointer is not sufficiently aligned.
   }];
 }
 
+def LLVMFuncAttrDocs : Documentation {
+  let Category = DocCatFunction;
+  let Content = [{
+Use ``__attribute__((llvm_fn_attr(attr_name, attr_value)))`` on a function to specify an LLVM function attribute that will be added to this function. 
+
+Note that uses of this attribute are highly compiler specific as the meaning and availability of LLVM attributes may change between compiler versions.
+
+This attribute is only intended for advanced users who need to specify specific information only available in LLVM attributes. Use of attributes which are not tied to a specific version of Clang (e.g. pure) should be preferred when available.
+
+The first arugment is a string representing the name of the LLVM attribute to be applied and the second arugment is a string representing its value.
+  }];
+}
+
 def EnableIfDocs : Documentation {
   let Category = DocCatFunction;
   let Content = [{
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index d05cf1c6e1814e..cb960bf7a3af6f 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2465,6 +2465,14 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
 
     if (TargetDecl->hasAttr<ArmLocallyStreamingAttr>())
       FuncAttrs.addAttribute("aarch64_pstate_sm_body");
+
+    for(auto Attr: TargetDecl->specific_attrs<LLVMFuncAttr>()) {
+      auto name = Attr->getLLVMAttrName();
+      auto value = Attr->getLLVMAttrValue();
+
+      auto Attr = llvm::Attribute::parseAttr(getLLVMContext(), AttributeAndValue.first, AttributeAndValue.second);
+      FuncAttrs.addAttribute(Attr);
+    }
   }
 
   // Attach "no-builtins" attributes to:
diff --git a/clang/lib/CodeGen/CodeGenFunction.cpp b/clang/lib/CodeGen/CodeGenFunction.cpp
index 1ad905078d349c..1d8031c1a27900 100644
--- a/clang/lib/CodeGen/CodeGenFunction.cpp
+++ b/clang/lib/CodeGen/CodeGenFunction.cpp
@@ -831,6 +831,16 @@ void CodeGenFunction::StartFunction(GlobalDecl GD, QualType RetTy,
         FD->getBody()->getStmtClass() == Stmt::CoroutineBodyStmtClass)
       SanOpts.Mask &= ~SanitizerKind::Null;
 
+  if (D) {
+    for(auto Attr: TargetDecl->specific_attrs<LLVMFuncAttr>()) {
+      auto name = Attr->getLLVMAttrName();
+      auto value = Attr->getLLVMAttrValue();
+
+      auto Attr = llvm::Attribute::parseAttr(getLLVMContext(), AttributeAndValue.first, AttributeAndValue.second);
+      Fn->addFnAttr(Attr);
+    }
+  }
+
   // Apply xray attributes to the function (as a string, for now)
   bool AlwaysXRayAttr = false;
   if (const auto *XRayAttr = D ? D->getAttr<XRayInstrumentAttr>() : nullptr) {
diff --git a/clang/test/CodeGen/attr-llvmfn.c b/clang/test/CodeGen/attr-llvmfn.c
new file mode 100644
index 00000000000000..72f14a726f795e
--- /dev/null
+++ b/clang/test/CodeGen/attr-llvmfn.c
@@ -0,0 +1,14 @@
+// RUN: %clang_cc1 -debug-info-kind=limited -emit-llvm -o - | FileCheck %s
+
+void t1() __attribute__((llvm_fn_attr("custom_attr", "custom_value"), llvm_fn_attr("second_attr", "second_value")));
+
+void t1()
+{
+}
+
+void t2();
+
+void t3() {
+	t2() ____attribute__((llvm_fn_attr("custom_attr", "custom_value"), llvm_fn_attr("second_attr", "second_value")));
+}
+

Copy link

github-actions bot commented Feb 26, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@wsmoses wsmoses force-pushed the htoattr branch 2 times, most recently from 6efba29 to cb87c49 Compare February 27, 2024 00:56
Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this, but I'm generally not in favor of this patch. I do not think we should expose all LLVM IR (function) attributes directly in Clang for a whole host of reasons:

  • User experience is very poor. Some LLVM IR attributes have specific requirements and those won't be checked by the compiler frontend. For example, what does it mean to put the alignstack LLVM IR attribute on a function that's marked with the naked attribute (so no prolog or epilog is generated)? What does it mean to put an alloc-family LLVM IR attribute on a function that returns void? Etc.
  • LLVM IR attributes are implementation details, Clang attributes are part of our contract with users. LLVM IR attributes can change or be removed specifically because they're not user facing, the same is not true of attributes exposed in Clang.
  • This isn't feature-testable (__has_c|cpp_attribute will return true for llvm_fn_attr but there's no way to test for specific LLVM IR attributes.
  • The feature is not ergonomic with attribute argument handling. For example, the LLVM IR allocsize attribute accepts multiple values, but we don't support that (we could, but that has other issues like accepting more values than the IR attribute allows), the memory attribute uses an enumeration of values, but we don't support that, etc.
  • This duplicates functionality already exposed via other Clang attributes, such as convergent, hot, cold, etc but there may be subtle differences in behavior (the frontend will look at ReturnsTwiceAttr for more than just codegen, but it won't for llvm_fn_attr("returns_twice")).

In general, I think we want to expose LLVM IR attributes manually and only when they have a known use case that warrants their support as an extension in Clang.

Copy link
Collaborator

@erichkeane erichkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few things in how the handle function works that are incorrect/not necessary, but like Aaron, I'm pretty solidly against this patch in concept. Aaron had some good reasoning, and mine opinion reflects all of it.

@wsmoses
Copy link
Member Author

wsmoses commented Feb 27, 2024

@AaronBallman @erichkeane, do you have any suggestions for paths forward, for use cases where it is guaranteed that the attribute is valid and the user (or perhaps more specifically, another Clang-tool) needs to provide information to LLVM through Clang AST/source.

For example, I have a clang plugin that should apply specific LLVM (string) attributes to functions. Unfortunately, without a way to modify codegen in the clang plugin, this prevents this workflow from working without significant hacks. Presently those "hacks" are essentially making a new global variable that takes the function and an LLVM pass that parses those globals into attributes -- which is quite poor, and has issues for templates, etc.

In the case of the original research study that inspired this (HTO), we built an experimental LTO replacement tool that emitted headers that contained the derived LLVM attributes that were missing for the source and found significant speedup as a result. This has even more of the implementation detail requirement that you mention, but again has the guarantee that it is emitted by the same clang.

Would this be permissible as a hidden attribute, or perhaps it takes an additional argument for the LLVM version, and errs if the version doesn't match the present compiler?

@wsmoses
Copy link
Member Author

wsmoses commented Feb 27, 2024

As another option, what if this always emits string attributes prefixed with "clang_". And any other tools that semantic assurances for what they do with it. Thus the attribute won't be tied to any LLVM attributes.

@erichkeane
Copy link
Collaborator

@AaronBallman @erichkeane, do you have any suggestions for paths forward, for use cases where it is guaranteed that the attribute is valid and the user (or perhaps more specifically, another Clang-tool) needs to provide information to LLVM through Clang AST/source.

For example, I have a clang plugin that should apply specific LLVM (string) attributes to functions. Unfortunately, without a way to modify codegen in the clang plugin, this prevents this workflow from working without significant hacks. Presently those "hacks" are essentially making a new global variable that takes the function and an LLVM pass that parses those globals into attributes -- which is quite poor, and has issues for templates, etc.

In the case of the original research study that inspired this (HTO), we built an experimental LTO replacement tool that emitted headers that contained the derived LLVM attributes that were missing for the source and found significant speedup as a result. This has even more of the implementation detail requirement that you mention, but again has the guarantee that it is emitted by the same clang.

Would this be permissible as a hidden attribute, or perhaps it takes an additional argument for the LLVM version, and errs if the version doesn't match the present compiler?

The one option we HAVE for this sort of thing is the annotate and annotate_type attributes, which can provide (IIRC) arbitrary string attributes into LLVM-IR.

@wsmoses
Copy link
Member Author

wsmoses commented Feb 27, 2024

So my understanding was that the annotate attribute didn't modify codegen (and thus LLVM string attributes), but perhaps I didn't use it properly.

Is there any reference for that?

@erichkeane
Copy link
Collaborator

So my understanding was that the annotate attribute didn't modify codegen (and thus LLVM string attributes), but perhaps I didn't use it properly.

Is there any reference for that?

I'm not sure what you mean by that? the 'annotate' attribute just ends up in an LLVM-string-attribute being placed on the decl/type/etc. Unfortunately it isn't documented (as it is the 'hidden' attribute you suggested!), but you can see some examples in clang/test/CodeGen (just grep for annotate).

@wsmoses
Copy link
Member Author

wsmoses commented Feb 28, 2024

Okay, the AnnotateAttr doesn't create LLVM string attributes, it creates LLVM metadata, but that should sufice, closing.

@wsmoses wsmoses closed this Feb 28, 2024
@wsmoses
Copy link
Member Author

wsmoses commented Feb 28, 2024

Hm actually reopening, the metadata isn't emitted if the defnition isn't available [e.g. for extern int X; when given an annotation

@wsmoses wsmoses reopened this Feb 28, 2024
@AaronBallman
Copy link
Collaborator

Hm actually reopening, the metadata isn't emitted if the defnition isn't available [e.g. for extern int X; when given an annotation

That seems like a bug (so long as the declaration is actually emitted to LLVM IR at all).

@AaronBallman @erichkeane, do you have any suggestions for paths forward, for use cases where it is guaranteed that the attribute is valid and the user (or perhaps more specifically, another Clang-tool) needs to provide information to LLVM through Clang AST/source.

If it's a Clang-based tool, that might open up other options. I think it could be reasonable to have an internal-use-only Clang attribute (one with no Spelling so users have no way to access it) that wraps LLVM IR attributes. You could use MySpecialLLVMAttr::CreateImplicit() to create the attributes as-needed from within the tool, rather than modifying source code and feeding it in to the compiler. Would that perhaps be workable for you? (No idea what @erichkeane thinks of this idea, but if we went this route, I would want a nice comment on the attribute definition explaining why it intentionally has no spelling so nobody comes along later and adds a spelling for it without realizing the concerns.)

@erichkeane
Copy link
Collaborator

Hm actually reopening, the metadata isn't emitted if the defnition isn't available [e.g. for extern int X; when given an annotation

That seems like a bug (so long as the declaration is actually emitted to LLVM IR at all).

@AaronBallman @erichkeane, do you have any suggestions for paths forward, for use cases where it is guaranteed that the attribute is valid and the user (or perhaps more specifically, another Clang-tool) needs to provide information to LLVM through Clang AST/source.

If it's a Clang-based tool, that might open up other options. I think it could be reasonable to have an internal-use-only Clang attribute (one with no Spelling so users have no way to access it) that wraps LLVM IR attributes. You could use MySpecialLLVMAttr::CreateImplicit() to create the attributes as-needed from within the tool, rather than modifying source code and feeding it in to the compiler. Would that perhaps be workable for you? (No idea what @erichkeane thinks of this idea, but if we went this route, I would want a nice comment on the attribute definition explaining why it intentionally has no spelling so nobody comes along later and adds a spelling for it without realizing the concerns.)

I think I'd rather fix the extern declaration issues first, I'm guessing we just didn't add it to somewhere.

As far as the 'spelling-less' attribute, I'm not really sure how usable that would be, but I've done that trick in the past for a downstream. Its probably OK, though definitely something we need to make sure no one touches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants