Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[clang][Sema] Add noinline check for __builtin_frame_address and __builtin_return_address #82966

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

anematode
Copy link
Contributor

@anematode anematode commented Feb 26, 2024

Will resolve #66059 . GCC's behavior in the case of inlining (https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html) is that a caller's return/frame address may be returned, if the function in question gets inlined. Their docs encourage the function to be marked noinline if this behavior is not desired. In my opinion, it's confusing for the behavior to change depending on optimization levels, so it would be courteous to produce a warning if the function containing the call to __builtin_frame_address and __builtin_return_address is not marked noinline.

Copy link

github-actions bot commented Feb 26, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@anematode anematode force-pushed the frame-addr-diagnostic branch 2 times, most recently from a22cb7e to cd68484 Compare February 26, 2024 20:22
…ddress

Resolves llvm#66059. GCC's behavior in the case of inlining (https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html) is that a caller's return/frame address may be returned, if the function in question gets inlined. Their docs encourage the function to be marked noinline to prevent this behavior. Therefore, produce a warning if the function containing the call to __builtin_frame_address and __builtin_return_address is not marked noinline.
@anematode anematode marked this pull request as ready for review February 26, 2024 20:51
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Feb 26, 2024
@llvmbot
Copy link
Collaborator

llvmbot commented Feb 26, 2024

@llvm/pr-subscribers-clang

Author: Timothy Herchen (anematode)

Changes

Will resolve #66059 . GCC's behavior in the case of inlining (https://gcc.gnu.org/onlinedocs/gcc/Return-Address.html) is that a caller's return/frame address may be returned, if the function in question gets inlined. Their docs encourage the function to be marked noinline. Therefore, it would be courteous to produce a warning if the function containing the call to __builtin_frame_address and __builtin_return_address is not marked noinline.


Full diff: https://github.com/llvm/llvm-project/pull/82966.diff

3 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+4)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+25-9)
  • (modified) clang/test/Sema/builtin-returnaddress.c (+38-6)
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 57784a4ba2e388..21cfd69dc69fc4 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -2029,6 +2029,10 @@ def warn_frame_address : Warning<
   "calling '%0' with a nonzero argument is unsafe">,
   InGroup<FrameAddress>, DefaultIgnore;
 
+def warn_frame_address_missing_noinline: Warning<
+  "calling '%0' in function not marked __attribute__((noinline)) may return a caller's %1 address">,
+  InGroup<FrameAddress>, DefaultIgnore;
+
 def warn_cxx98_compat_nontrivial_union_or_anon_struct_member : Warning<
   "%select{anonymous struct|union}0 member %1 with a non-trivial "
   "%sub{select_special_member_kind}2 is incompatible with C++98">,
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 984088e345c806..3c0a5a8a402ae9 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -2730,17 +2730,33 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     if (SemaBuiltinConstantArgRange(TheCall, 0, 0, 0xFFFF))
       return ExprError();
 
-    // -Wframe-address warning if non-zero passed to builtin
-    // return/frame address.
     Expr::EvalResult Result;
     if (!TheCall->getArg(0)->isValueDependent() &&
-        TheCall->getArg(0)->EvaluateAsInt(Result, getASTContext()) &&
-        Result.Val.getInt() != 0)
-      Diag(TheCall->getBeginLoc(), diag::warn_frame_address)
-          << ((BuiltinID == Builtin::BI__builtin_return_address)
-                  ? "__builtin_return_address"
-                  : "__builtin_frame_address")
-          << TheCall->getSourceRange();
+        TheCall->getArg(0)->EvaluateAsInt(Result, getASTContext())) {
+      const char *BuiltinName =
+          (BuiltinID == Builtin::BI__builtin_return_address)
+              ? "__builtin_return_address"
+              : "__builtin_frame_address";
+
+      // -Wframe-address warning if non-zero passed to builtin
+      // return/frame address.
+      if (Result.Val.getInt() != 0) {
+        Diag(TheCall->getBeginLoc(), diag::warn_frame_address)
+            << BuiltinName << TheCall->getSourceRange();
+      }
+
+      // -Wframe-address warning if enclosing function is not marked noinline.
+      if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(CurContext)) {
+        if (!FD->hasAttr<NoInlineAttr>() && !FD->isMain()) {
+          const char *ShortName =
+              (BuiltinID == Builtin::BI__builtin_return_address) ? "return"
+                                                                 : "frame";
+          Diag(TheCall->getBeginLoc(),
+               diag::warn_frame_address_missing_noinline)
+              << BuiltinName << ShortName << TheCall->getSourceRange();
+        }
+      }
+    }
     break;
   }
 
diff --git a/clang/test/Sema/builtin-returnaddress.c b/clang/test/Sema/builtin-returnaddress.c
index 16d2a517ac12f2..0b6733f9381c9f 100644
--- a/clang/test/Sema/builtin-returnaddress.c
+++ b/clang/test/Sema/builtin-returnaddress.c
@@ -2,24 +2,32 @@
 // RUN: %clang_cc1 -fsyntax-only -Wmost -verify %s
 // RUN: %clang_cc1 -x c++ -fsyntax-only -Wframe-address -verify %s
 
-void* a(unsigned x) {
+__attribute__((noinline)) void* a(unsigned x) {
 return __builtin_return_address(0);
 }
 
-void* b(unsigned x) {
+__attribute__((noinline)) void* b(unsigned x) {
 return __builtin_return_address(1); // expected-warning{{calling '__builtin_return_address' with a nonzero argument is unsafe}}
 }
 
-void* c(unsigned x) {
+__attribute__((noinline)) void* c(unsigned x) {
 return __builtin_frame_address(0);
 }
 
-void* d(unsigned x) {
+__attribute__((noinline)) void* d(unsigned x) {
 return __builtin_frame_address(1); // expected-warning{{calling '__builtin_frame_address' with a nonzero argument is unsafe}}
 }
 
+void* e(unsigned x) {
+  return __builtin_frame_address(0); // expected-warning{{calling '__builtin_frame_address' in function not marked __attribute__((noinline)) may return a caller's frame address}}
+}
+
+void* f(unsigned x) {
+  return __builtin_return_address(0); // expected-warning{{calling '__builtin_return_address' in function not marked __attribute__((noinline)) may return a caller's return address}}
+}
+
 #ifdef __cplusplus
-template<int N> void *RA()
+template<int N> __attribute__((noinline)) void *RA()
 {
   return __builtin_return_address(N); // expected-warning{{calling '__builtin_return_address' with a nonzero argument is unsafe}}
 }
@@ -28,4 +36,28 @@ void *foo()
 {
  return RA<2>(); // expected-note{{in instantiation of function template specialization 'RA<2>' requested here}}
 }
-#endif
+
+void* f() {
+  return ([&] () {
+    return __builtin_frame_address(0); // expected-warning{{calling '__builtin_frame_address' in function not marked __attribute__((noinline)) may return a caller's frame address}}
+  })();
+}
+
+void* g() {
+  return ([&] () __attribute__((noinline)) {
+    return __builtin_frame_address(0);
+  })();
+}
+
+void* h() {
+  return ([&] () {
+    return __builtin_return_address(0); // expected-warning{{calling '__builtin_return_address' in function not marked __attribute__((noinline)) may return a caller's return address}}
+  })();
+}
+
+void* i() {
+  return ([&] () __attribute__((noinline)) {
+    return __builtin_return_address(0);
+  })();
+}
+#endif
\ No newline at end of file

@@ -2029,6 +2029,10 @@ def warn_frame_address : Warning<
"calling '%0' with a nonzero argument is unsafe">,
InGroup<FrameAddress>, DefaultIgnore;

def warn_frame_address_missing_noinline: Warning<
"calling '%0' in function not marked __attribute__((noinline)) may return a caller's %1 address">,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a minor thing, but you should probably use %select for %1 here instead of passing in e.g. "frame" as a string. The %0 is fine imo because it’s the name of a builtin function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"calling '%0' in function not marked __attribute__((noinline)) may return a caller's %1 address">,
"calling '%0' in function not marked '__attribute__((noinline))' may return a caller's %1 address">,

You should also add single quotes here because it's using a syntactic construct.

@Sirraide
Copy link
Contributor

Sirraide commented Mar 2, 2024

In my opinion, it's confusing for the behavior to change depending on optimization levels

I will say, if you’re calling __builtin_frame_address when optimisations are enabled then I’m not quite sure what you’re expecting, candidly; I’m not convinced we should warn on this, but I’m also not that familiar w/ how people typically use this builtin.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -2029,6 +2029,10 @@ def warn_frame_address : Warning<
"calling '%0' with a nonzero argument is unsafe">,
InGroup<FrameAddress>, DefaultIgnore;

def warn_frame_address_missing_noinline: Warning<
"calling '%0' in function not marked __attribute__((noinline)) may return a caller's %1 address">,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"calling '%0' in function not marked __attribute__((noinline)) may return a caller's %1 address">,
"calling '%0' in function not marked '__attribute__((noinline))' may return a caller's %1 address">,

You should also add single quotes here because it's using a syntactic construct.

Comment on lines +2736 to +2739
const char *BuiltinName =
(BuiltinID == Builtin::BI__builtin_return_address)
? "__builtin_return_address"
: "__builtin_frame_address";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this, you should be able to use ASTContext::BuiltinInfo to call getName(BuiltinID).

Comment on lines +2749 to +2750
if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(CurContext)) {
if (!FD->hasAttr<NoInlineAttr>() && !FD->isMain()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(CurContext)) {
if (!FD->hasAttr<NoInlineAttr>() && !FD->isMain()) {
if (const auto *FD = dyn_cast<FunctionDecl>(CurContext);
FD && !FD->hasAttr<NoInlineAttr>() && !FD->isMain()) {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be looking at FunctionDecl::isMSVCRTEntryPoint() in addition to isMain()?

return __builtin_return_address(0);
})();
}
#endif
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the newline back to the end of the file.

@nickdesaulniers
Copy link
Member

I'm seeing evidence that this might be a chatty diagnostic in practice:

https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/kernel/fork.c?L311 https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/mm/util.c?L644 https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/arch/arm/mm/nommu.c?L224 https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/kernel/scs.c?L48 (and quite a few others).

CC @nathanchance @nickdesaulniers @rjmccall for opinions

If we continue to move forward with the patch, you should add a release note to clang/docs/ReleaseNotes.rst so users know about the new diagnostic.

There's definitely cases where __builtin_frame_address is used in callers attributes __attribute__((always_inline)) within the Linux kernel sources. Perhaps we should diagnose when the caller is neither noinline or always_inline, since then inline substitution will depend on optimizations?

@rjmccall
Copy link
Contributor

rjmccall commented Mar 4, 2024

I'm seeing evidence that this might be a chatty diagnostic in practice:

https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/kernel/fork.c?L311 https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/mm/util.c?L644 https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/arch/arm/mm/nommu.c?L224 https://sourcegraph.com/github.com/torvalds/linux@90d35da658da8cff0d4ecbb5113f5fac9d00eb72/-/blob/kernel/scs.c?L48 (and quite a few others).

CC @nathanchance @nickdesaulniers @rjmccall for opinions

Does anyone understand why Linux uses __builtin_return_address there? On first glance, I'd think this is exactly the sort of thing that the warning ought to be warning about — a static function that could easily be inlined into its callers. But I assume it doesn't actually matter because Linux's use doesn't really depend on the exact pointer value.

Have we consider the alternative of just disabling inlining when a function uses __builtin_return_address? The middle-end already knows that there are things it can't inline.

Although I seem to remember having seen code that uses always_inline in order to force __builtin_return_address to actually apply to its caller.

@anematode
Copy link
Contributor Author

Although I seem to remember having seen code that uses always_inline in order to force __builtin_return_address to actually apply to its caller.

Right; that's why I thought just disabling inlining would be a suboptimal choice, although I guess you could have always_inline override it.

If the diagnostic is so noisy then maybe I don't see the point of it. Is there any case in Linux where it's a correctness issue to use a caller's address? (Presumably in this case it would just give a confusing call site when you're debugging.)

@nickdesaulniers
Copy link
Member

Does anyone understand why Linux uses __builtin_return_address there?

Grep turns up 257 callers of __builtin_return_address in the linux kernel sources. Hard to say definitely for all cases. It's possible that some are incorrect, or should have their callers explicitly marked always_inline or noinline.

Have we consider the alternative of just disabling inlining when a function uses __builtin_return_address?

Is that what GCC does (Looks like no)? Perhaps useful to match behavior of this compiler builtin, if so. Or document ways in which we intentionally diverge.

Is there any case in Linux where it's a correctness issue to use a caller's address?

257 call sites. hard to say

@anematode
Copy link
Contributor Author

It's used 100+ more times through the macro _RET_IP_. https://elixir.bootlin.com/linux/latest/source/include/linux/instruction_pointer.h#L7

https://elixir.bootlin.com/linux/latest/source/include/linux/kasan.h#L164 has some example uses where always_inline is important for correctness.

https://elixir.bootlin.com/linux/latest/source/kernel/kcsan/core.c#L912 might be interesting? (This is way out of my depth.) Function isn't marked noinline and the return pointer is actually important. Also, many other functions in that file seem to be marked noinline, e.g. https://elixir.bootlin.com/linux/latest/source/kernel/kcsan/core.c#L1327, but not that one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Inliner] Should we inline callee containing llvm.frameaddress?
6 participants