-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix a bug in implementation of Smith's algorithm used in complex div. #78330
Conversation
in command line rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's bad that we don't have a IR test that breaks when you change the IR output like this. Please add one.
clang/lib/CodeGen/CGExprComplex.cpp
Outdated
// '-ffast-math' is used in the command line but followed by an | ||
// '-fno-cx-limited-range'. | ||
(CGF.getLangOpts().FastMath && | ||
Op.FPFeatures.getComplexRange() == LangOptions::CX_Full)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FastMath
is tautologically true here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
std::string ComplexRangeStr = RenderComplexRangeOption("limited"); | ||
if (!ComplexRangeStr.empty()) | ||
CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr)); | ||
ComplexRangeStr = RenderComplexRangeOption("limited"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to move the rendering of this option to the bottom of this function (around line 3124) to avoid inserting multiple, conflicting options if more than one -f[no-]cx-* option is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang Author: Zahira Ammarguellat (zahiraam) ChangesThis patch fixes a bug in Smith's algorithm (thanks to @andykaylor who detected it) and makes sure that last option in command line rules. Full diff: https://github.com/llvm/llvm-project/pull/78330.diff 4 Files Affected:
diff --git a/clang/lib/CodeGen/CGExprComplex.cpp b/clang/lib/CodeGen/CGExprComplex.cpp
index e532794b71bdb4a..6fbd8f19eeb50a4 100644
--- a/clang/lib/CodeGen/CGExprComplex.cpp
+++ b/clang/lib/CodeGen/CGExprComplex.cpp
@@ -936,7 +936,7 @@ ComplexPairTy ComplexExprEmitter::EmitRangeReductionDiv(llvm::Value *LHSr,
llvm::Value *RC = Builder.CreateFMul(CdD, RHSr); // rc
llvm::Value *DpRC = Builder.CreateFAdd(RHSi, RC); // tmp=d+rc
- llvm::Value *T7 = Builder.CreateFMul(LHSr, RC); // ar
+ llvm::Value *T7 = Builder.CreateFMul(LHSr, CdD); // ar
llvm::Value *T8 = Builder.CreateFAdd(T7, LHSi); // ar+b
llvm::Value *DSTFr = Builder.CreateFDiv(T8, DpRC); // (ar+b)/tmp
@@ -978,7 +978,11 @@ ComplexPairTy ComplexExprEmitter::EmitBinDiv(const BinOpInfo &Op) {
return EmitRangeReductionDiv(LHSr, LHSi, RHSr, RHSi);
else if (Op.FPFeatures.getComplexRange() == LangOptions::CX_Limited)
return EmitAlgebraicDiv(LHSr, LHSi, RHSr, RHSi);
- else if (!CGF.getLangOpts().FastMath) {
+ else if (!CGF.getLangOpts().FastMath ||
+ // '-ffast-math' is used in the command line but followed by an
+ // '-fno-cx-limited-range'.
+ (CGF.getLangOpts().FastMath &&
+ Op.FPFeatures.getComplexRange() == LangOptions::CX_Full)) {
LHSi = OrigLHSi;
// If we have a complex operand on the RHS and FastMath is not allowed, we
// delegate to a libcall to handle all of the complexities and minimize
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 997ec2d491d02c8..7e25347677e23e8 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2765,6 +2765,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
StringRef Float16ExcessPrecision = "";
StringRef BFloat16ExcessPrecision = "";
LangOptions::ComplexRangeKind Range = LangOptions::ComplexRangeKind::CX_Full;
+ std::string ComplexRangeStr = "";
if (const Arg *A = Args.getLastArg(options::OPT_flimited_precision_EQ)) {
CmdArgs.push_back("-mlimit-float-precision");
@@ -2780,24 +2781,24 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
case options::OPT_fcx_limited_range: {
EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Limited);
Range = LangOptions::ComplexRangeKind::CX_Limited;
- std::string ComplexRangeStr = RenderComplexRangeOption("limited");
- if (!ComplexRangeStr.empty())
- CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+ ComplexRangeStr = RenderComplexRangeOption("limited");
break;
}
case options::OPT_fno_cx_limited_range:
+ EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Full);
Range = LangOptions::ComplexRangeKind::CX_Full;
+ ComplexRangeStr = RenderComplexRangeOption("full");
break;
case options::OPT_fcx_fortran_rules: {
EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Fortran);
Range = LangOptions::ComplexRangeKind::CX_Fortran;
- std::string ComplexRangeStr = RenderComplexRangeOption("fortran");
- if (!ComplexRangeStr.empty())
- CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+ ComplexRangeStr = RenderComplexRangeOption("fortran");
break;
}
case options::OPT_fno_cx_fortran_rules:
+ EmitComplexRangeDiag(D, Range, LangOptions::ComplexRangeKind::CX_Full);
Range = LangOptions::ComplexRangeKind::CX_Full;
+ ComplexRangeStr = RenderComplexRangeOption("full");
break;
case options::OPT_ffp_model_EQ: {
// If -ffp-model= is seen, reset to fno-fast-math
@@ -3068,9 +3069,7 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
SeenUnsafeMathModeOption = true;
// ffast-math enables fortran rules for complex multiplication and
// division.
- std::string ComplexRangeStr = RenderComplexRangeOption("limited");
- if (!ComplexRangeStr.empty())
- CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+ ComplexRangeStr = RenderComplexRangeOption("limited");
break;
}
case options::OPT_fno_fast_math:
@@ -3227,6 +3226,8 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
options::OPT_fstrict_float_cast_overflow, false))
CmdArgs.push_back("-fno-strict-float-cast-overflow");
+ if (!ComplexRangeStr.empty())
+ CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
if (Args.hasArg(options::OPT_fcx_limited_range))
CmdArgs.push_back("-fcx-limited-range");
if (Args.hasArg(options::OPT_fcx_fortran_rules))
diff --git a/clang/test/CodeGen/cx-complex-range.c b/clang/test/CodeGen/cx-complex-range.c
index 8368fa611335cca..2d8507c710f2021 100644
--- a/clang/test/CodeGen/cx-complex-range.c
+++ b/clang/test/CodeGen/cx-complex-range.c
@@ -15,9 +15,25 @@
// RUN: -ffast-math -complex-range=limited -emit-llvm -o - %s \
// RUN: | FileCheck %s --check-prefix=LMTD-FAST
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu \
+// RUN: -ffast-math -complex-range=full -emit-llvm -o - %s \
+// RUN: | FileCheck %s --check-prefix=FULL
+
// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
// RUN: -fno-cx-fortran-rules -o - | FileCheck %s --check-prefix=FULL
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
+// RUN: -fcx-limited-range -fno-cx-limited-range -o - \
+// RUN: | FileCheck %s --check-prefix=FULL
+
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
+// RUN: -fno-cx-limited-range -fcx-limited-range -o - \
+// RUN: | FileCheck %s --check-prefix=FULL
+
+// RUN: %clang_cc1 %s -O0 -emit-llvm -triple x86_64-unknown-unknown \
+// RUN: -fno-cx-fortran-rules -fcx-fortran-rules -o - \
+// RUN: | FileCheck %s --check-prefix=FULL
+
_Complex float div(_Complex float a, _Complex float b) {
// LABEL: define {{.*}} @div(
// FULL: call {{.*}} @__divsc3
diff --git a/clang/test/Driver/range.c b/clang/test/Driver/range.c
index 8d456a997d6967e..045d9c7d3d802ae 100644
--- a/clang/test/Driver/range.c
+++ b/clang/test/Driver/range.c
@@ -6,6 +6,9 @@
// RUN: %clang -### -target x86_64 -fno-cx-limited-range -c %s 2>&1 \
// RUN: | FileCheck %s
+// RUN: %clang -### -target x86_64 -fcx-limited-range -fno-cx-limited-range \
+// RUN: -c %s 2>&1 | FileCheck --check-prefix=FULL %s
+
// RUN: %clang -### -target x86_64 -fcx-fortran-rules -c %s 2>&1 \
// RUN: | FileCheck --check-prefix=FRTRN %s
@@ -29,7 +32,11 @@
// RUN: %clang -### -target x86_64 -fcx-limited-range -ffast-math -c %s 2>&1 \
// RUN: | FileCheck --check-prefix=LMTD %s
+// RUN: %clang -### -target x86_64 -ffast-math -fno-cx-limited-range -c %s 2>&1 \
+// RUN: | FileCheck --check-prefix=FULL %s
+
// LMTD: -complex-range=limited
+// FULL: -complex-range=full
// LMTD-NOT: -complex-range=fortran
// CHECK-NOT: -complex-range=limited
// FRTRN: -complex-range=fortran
|
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM as far as my requests; please wait for approval from the other reviewers.
This is unrelated, but I wonder if we should proactively outline this sequence; it's quite long.
Not sure what you mean. |
I mean that we could lazily emit a helper function like To be clear, I'm not suggesting you need to do this, and certainly not in this patch. |
That's a reasonable suggestion. We may also want to add the ability to emit a helper function to handle the full range instead of calling the compiler-rt function when -ffreestanding has been used. Are you aware of the PR that @jcranmer-intel posted to add intrinsics for complex multiplication and division (#68742)? That would allow us to move this problem out of the front end, which seems good. |
If LLVM wants to provide intrinsics that cover patterns that the frontend wants to emit, that'd be great. The frontend will still have to make decisions about e.g. whether to request Smith's algorithm, of course. |
@rjmccall thanks for the review and clarification. @andykaylor do you have any other comments about this? |
@AaronBallman any comments about this? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the current changes are reasonable to land for Clang 18 so we get the fix out to users. I don't have strong opinions on pushing this down into LLVM or outlining the function like John suggested; either approach is fine by me. I think outlining the function might make sense but I don't think it's a requirement either (perhaps add a FIXME comment suggesting it?).
ComplexRangeStr += "fortran"; | ||
break; | ||
case LangOptions::ComplexRangeKind::CX_None: | ||
ComplexRangeStr = ""; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it valid to pass -complex-range=
without anything after the equals sign? Based on how this is called, I think this case should be an assertion instead, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh! you are right. We should actually never land here because I am calling RenderComplexRangeOption with this condition:
if (Range != LangOptions::ComplexRangeKind::CX_None)
ComplexRangeStr = RenderComplexRangeOption(Range);
I guess an llvm-unreachable" should be added here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
This patch fixes a bug in Smith's algorithm (thanks to @andykaylor who detected it) and makes sure that last option in command line rules.