New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang] Set correct FPOptions if attribute 'optnone' presents #85605
Conversation
@llvm/pr-subscribers-clang Author: Serge Pavlov (spavloff) ChangesAttribute Full diff: https://github.com/llvm/llvm-project/pull/85605.diff 5 Files Affected:
diff --git a/clang/include/clang/Basic/LangOptions.h b/clang/include/clang/Basic/LangOptions.h
index 08fc706e3cbf74..19c60a8cb5e946 100644
--- a/clang/include/clang/Basic/LangOptions.h
+++ b/clang/include/clang/Basic/LangOptions.h
@@ -842,6 +842,8 @@ class FPOptions {
/// Return difference with the given option set.
FPOptionsOverride getChangesFrom(const FPOptions &Base) const;
+ void applyChanges(FPOptionsOverride FPO);
+
// We can define most of the accessors automatically:
#define OPTION(NAME, TYPE, WIDTH, PREVIOUS) \
TYPE get##NAME() const { \
@@ -923,6 +925,11 @@ class FPOptionsOverride {
setAllowFPContractAcrossStatement();
}
+ void setDisallowOptimizations() {
+ setFPPreciseEnabled(true);
+ setDisallowFPContract();
+ }
+
storage_type getAsOpaqueInt() const {
return (static_cast<storage_type>(Options.getAsOpaqueInt())
<< FPOptions::StorageBitSize) |
@@ -979,6 +986,10 @@ inline FPOptionsOverride FPOptions::getChangesFrom(const FPOptions &Base) const
return getChangesSlow(Base);
}
+inline void FPOptions::applyChanges(FPOptionsOverride FPO) {
+ *this = FPO.applyOverrides(*this);
+}
+
/// Describes the kind of translation unit being processed.
enum TranslationUnitKind {
/// The translation unit is a complete translation unit.
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 95ea5ebc7f1ac1..ccc2ded67589ec 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -3222,6 +3222,7 @@ class Sema final {
Decl *ActOnStartOfFunctionDef(Scope *S, Decl *D,
SkipBodyInfo *SkipBody = nullptr,
FnBodyKind BodyKind = FnBodyKind::Other);
+ void applyFunctionAttributesBeforeParsingBody(Decl *FD);
/// Determine whether we can delay parsing the body of a function or
/// function template until it is used, assuming we don't care about emitting
diff --git a/clang/lib/Parse/ParseStmt.cpp b/clang/lib/Parse/ParseStmt.cpp
index 76a3fa8f2627de..489ae9f167b95d 100644
--- a/clang/lib/Parse/ParseStmt.cpp
+++ b/clang/lib/Parse/ParseStmt.cpp
@@ -2508,6 +2508,10 @@ Decl *Parser::ParseFunctionStatementBody(Decl *Decl, ParseScope &BodyScope) {
Sema::PragmaStackSentinelRAII
PragmaStackSentinel(Actions, "InternalPragmaState", IsCXXMethod);
+ // Some function attributes (like OptimizeNoneAttr) affect FP options.
+ Sema::FPFeaturesStateRAII SaveFPFeatures(Actions);
+ Actions.applyFunctionAttributesBeforeParsingBody(Decl);
+
// Do not enter a scope for the brace, as the arguments are in the same scope
// (the function body) as the body itself. Instead, just read the statement
// list and put it into a CompoundStmt for safe keeping.
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 5850cd0ab6b9aa..59c327515dba70 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -15919,6 +15919,16 @@ Decl *Sema::ActOnStartOfFunctionDef(Scope *FnBodyScope, Decl *D,
return D;
}
+void Sema::applyFunctionAttributesBeforeParsingBody(Decl *FD) {
+ if (FD->hasAttr<OptimizeNoneAttr>()) {
+ FPOptionsOverride FPO;
+ FPO.setDisallowOptimizations();
+ CurFPFeatures.applyChanges(FPO);
+ FpPragmaStack.CurrentValue =
+ CurFPFeatures.getChangesFrom(FPOptions(LangOpts));
+ }
+}
+
/// Given the set of return statements within a function body,
/// compute the variables that are subject to the named return value
/// optimization.
diff --git a/clang/test/AST/ast-dump-fpfeatures.cpp b/clang/test/AST/ast-dump-fpfeatures.cpp
index da0011602a728e..f6e9d3637417e6 100644
--- a/clang/test/AST/ast-dump-fpfeatures.cpp
+++ b/clang/test/AST/ast-dump-fpfeatures.cpp
@@ -187,3 +187,14 @@ float func_18(float x, float y) {
// CHECK: CompoundStmt {{.*}} ConstRoundingMode=downward
// CHECK: ReturnStmt
// CHECK: BinaryOperator {{.*}} ConstRoundingMode=downward
+
+#pragma float_control(precise, off)
+__attribute__((optnone))
+float func_19(float x, float y) {
+ return x + y;
+}
+
+// CHECK-LABEL: FunctionDecl {{.*}} func_19 'float (float, float)'
+// CHECK: CompoundStmt {{.*}} MathErrno=1
+// CHECK: ReturnStmt
+// CHECK: BinaryOperator {{.*}} 'float' '+' MathErrno=1
|
Hmm. Is there some sort of optimization in IRGen that we need to suppress here, or is it something in LLVM code gen? Presumably normal LLVM optimization passes all just skip Mostly I'm wondering how far we're expected to go with |
The issue #62098 demonstrates such case. Anyway, It is not good to have bogus attributes in AST.
It impedes implementation of pragma FENV_ROUND. The pragma requires setting FP options inside CompoundStmt and it interacts with the fictious attributes. This is the reason of this patch. |
Okay. So if I understand correctly, this substitution of libcalls is a frontend optimization, and the bug is arguing that it ought to be suppressed by
I think it's a fair question whether this is a bogus attribute. From the bug, it sounds more like the user wants some way to disable fast-math for a specific function, and apparently we don't have one. That seems like a reasonable feature request. Because we don't have that feature, they're flailing around trying essentially every pragma and attribute that we do have and arguing that at least one of them ought to have that impact, but it's not at all clear to me that they should. And presumably they don't actually want to disable all optimization of their function anyway; they just want to turn off fast math.
That doesn't make sense as a reason for this patch; |
I have uploaded your patch and compared the attributes generated from your patch and the little test case from #62098. The attributes generated are different. Therefore, the expected optimizations from this patch are going to be different than what we currently have. I think that's a problem. Attributes with this patch (only the ones that differ): attributes #0 = { "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "unsafe-fp-math"="false" } and without the patch: |
They can use
Exactly. User specifies attribute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright. I don't really have a problem with doing this, and this seems like basically the right approach. You're applying the attributes in the wrong place, though.
clang/lib/Parse/ParseStmt.cpp
Outdated
@@ -2508,6 +2508,10 @@ Decl *Parser::ParseFunctionStatementBody(Decl *Decl, ParseScope &BodyScope) { | |||
Sema::PragmaStackSentinelRAII | |||
PragmaStackSentinel(Actions, "InternalPragmaState", IsCXXMethod); | |||
|
|||
// Some function attributes (like OptimizeNoneAttr) affect FP options. | |||
Sema::FPFeaturesStateRAII SaveFPFeatures(Actions); | |||
Actions.applyFunctionAttributesBeforeParsingBody(Decl); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attributes should cover the entire function body, so you're also going to need to do this before parsing function-try-blocks and constructor initializers. That probably just means pushing it a level up across the board.
clang/lib/Parse/Parser.cpp
Outdated
@@ -1495,6 +1495,10 @@ Decl *Parser::ParseFunctionDefinition(ParsingDeclarator &D, | |||
return Actions.ActOnFinishFunctionBody(Res, nullptr, false); | |||
} | |||
|
|||
// Some function attributes (like OptimizeNoneAttr) affect FP options. | |||
Sema::FPFeaturesStateRAII SaveFPFeatures(Actions); | |||
Actions.applyFunctionAttributesBeforeParsingBody(Res); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. There are actually a lot of paths that call into ParseFunctionStatementBody
like this. Rather than changing all of them, can we just do this in ActOnStartOfFunctionDef
/ ActOnFinishFunctionBody
? We do a lot of similar context push/pop operations there already.
Please test an inline C++ method (which are delayed and use a different path) and ObjC method parsing (which also use a different path).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sema.h
changes look good to me.
✅ With the latest revision this PR passed the C/C++ code formatter. |
✅ With the latest revision this PR passed the Python code formatter. |
Attribute `optnone` must turn off all optimizations including fast-math ones. Actually AST nodes in the 'optnone' function still had fast-math flags. This change implements fixing FP options before function body is parsed.
Ping. |
A little about the significance of this fix. To implement pragma FENV_ROUND, we need to apply the FP options stored in the instance of CompoundStmt to the builder object, so that it uses the specified rounding mode. It can be done using the class CGFPOptionsRAII. However in this case a couple of issues rise, both related to fast-math. This patch is intended to fix one of them an is a prerequisite for pragma FENV_ROUND implementation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems fine. Is there any existing bookkeeping we no longer need to do if we're going to have this RAII object in scope during function parsing?
Thanks!
It seems handling fast-math is the only case that prevented using this RAII object. |
Hi, we have an internal test that checks that compiling with and without optnone at O0 does not change the code generated by the compiler, and one of the tests we have did change which I bisected back to your change. The test consists of the following code: #include <math.h>
extern int printf(const char *format, ...);
template <typename T> OPTNONE T &pi(int num_terms) {
static T approx = 0;
for (int i = 0; i <= num_terms; ++i) {
approx += 4 * (pow(-1, i) / ((2 * i) + 1));
}
return approx;
}
OPTNONE
int main() {
int const num_terms = 5;
float f1000 = pi<float>(num_terms);
printf("%f\n", f1000);
return 0;
} When compiled with vcvtsi2sd xmm1, xmm1, eax
vdivsd xmm0, xmm0, xmm1
vmovss xmm2, dword ptr [rip + pi<float>(int)::approx] # xmm2 = mem[0],zero,zero,zero
vcvtss2sd xmm1, xmm1, xmm2
vmovsd xmm2, qword ptr [rip + .LCPI1_0] # xmm2 = [4.0E+0,0.0E+0]
vmulsd xmm0, xmm0, xmm2
vaddsd xmm1, xmm0, xmm1
vcvtsd2ss xmm0, xmm0, xmm1
vmovss dword ptr [rip + pi<float>(int)::approx], xmm0 When compiled with vcvtsi2sd xmm1, xmm1, eax
vdivsd xmm1, xmm0, xmm1
vmovsd xmm0, qword ptr [rip + .LCPI1_0] # xmm0 = [4.0E+0,0.0E+0]
vmulsd xmm1, xmm0, xmm1
vmovss xmm2, dword ptr [rip + pi<float>(int)::approx] # xmm2 = mem[0],zero,zero,zero
vcvtss2sd xmm0, xmm0, xmm2
vaddsd xmm1, xmm0, xmm1
vcvtsd2ss xmm0, xmm0, xmm1
vmovss dword ptr [rip + pi<float>(int)::approx], xmm0 You can see this change on godbolt here: https://godbolt.org/z/WTWPoqY51 Is this change intended? |
Hi @dyung. The observed difference is due to the FP contraction turned off if optnone is specified. In O0 this optimization is still applied. As a result, the function with optnone contains separate fadd and fmul, while without this attribute the function contains combined operatin fmuladd. This optimization is turned off in |
Thanks @spavloff. I can confirm that if I remove the call you mentioned the assembly generated is identical. I am slightly confused why we are running FP contraction at O0. I would have thought no optimization at all be done at O0? |
In an ideal world, an When we devised With that goal in mind, having |
Hi @spavloff. Regarding:
There's no need for them to behave differently here. And in fact, we want them to behave the same. There's a subtle point about FP contraction, in that doing an FP contraction (within a statement in C/C++) isn't considered an "optimization". It's a bit counterintuitive, but there's a good reason. A Fused Multiply Add (FMA) will, in general, produce a slightly different bit-wise floating point result than a multiply/add sequence. For architectures that have FMA instructions, if we enabled FMA at (for example) The legal values and semantics in C/C++ for
See: https://clang.llvm.org/docs/UsersManual.html#cmdoption-ffp-contract (There is also one other legal setting: So setting To put it another way, for a target that has FMA instructions, a valid program will produce identical results for As you said earlier:
In short,
|
Hi @spavloff. Given my post above was fairly long, I want to make a short, clarifying comment. There was a sense in the earlier discussion that there is a problem in that the spec of
For
but if
That will, in general, produce a slightly different result. Both results are correct/legal according to the IEEE Standard. But regardless, we don't want applying Here is a godbolt link illustrating: https://godbolt.org/z/abfz41f9W |
Previously treatment of the attribute `optnone` was modified in llvm#85605 ([clang] Set correct FPOptions if attribute 'optnone' presents). As a side effect FPContract was disabled for optnone. It created unneeded divergence with the behavior of -O0, which enables this optimization. In the discussion llvm#85605 (comment) it was pointed out that FP contraction should be enabled even if all optimizations are turned off, otherwise results of calculations would be different. This change enables FPContract at optnone.
Previously treatment of the attribute `optnone` was modified in #85605 ([clang] Set correct FPOptions if attribute 'optnone' presents). As a side effect FPContract was disabled for optnone. It created unneeded divergence with the behavior of -O0, which enables this optimization. In the discussion #85605 (comment) it was pointed out that FP contraction should be enabled even if all optimizations are turned off, otherwise results of calculations would be different. This change enables FPContract at optnone.
It looks like this is causing crashes when including precompiled headers that contain optnone functions; this can be reproduced with a very short header file and any source file, even a completely empty one. Reproducer follows:
|
Hi @SLTozer, Thank you for the report. There is an issue somewhere around the code that reads/wtites records FP_PRAGMA_OPTIONS and/or FLOAT_CONTROL_PRAGMA_OPTIONS, which sometimes manifest itseld in broken synchronization of FP options. I will prepare fix soon. |
After llvm#85605 ([clang] Set correct FPOptions if attribute 'optnone' presents) the current FP options in Sema are saved during parsing function because Sema can modify them if optnone is present. However they were saved too late, it caused fails in some cases when precompiled headers are used. This patch moves the storing earlier.
PR #92146 does not solve the problem of serialization mentioned above, but it is simple and quick solution for the reported crash. |
After #85605 ([clang] Set correct FPOptions if attribute 'optnone' presents) the current FP options in Sema are saved during parsing function because Sema can modify them if optnone is present. However they were saved too late, it caused fails in some cases when precompiled headers are used. This patch moves the storing earlier.
After llvm#85605 ([clang] Set correct FPOptions if attribute 'optnone' presents) the current FP options in Sema are saved during parsing function because Sema can modify them if optnone is present. However they were saved too late, it caused fails in some cases when precompiled headers are used. This patch moves the storing earlier.
Attribute
optnone
must turn off all optimizations including fast-math ones. Actually AST nodes in the 'optnone' function still had fast-math flags. This change implements fixing FP options before function body is parsed.