Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix susy gg_tt builds and add it to the repo #625

Merged
merged 98 commits into from
Mar 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
98 commits
Select commit Hold shift + click to select a range
838d0b1
[susy2] add susy_gg_tt.sa to the repository to allow code fixes
valassi Apr 3, 2023
0fd793b
[susy2] in susyggtt.sa add constexpr to cxsmmpl::conj function
valassi Apr 3, 2023
bd5db8f
[susy2] in susyggtt.sa Parameters.h, fix constexpr fixes for Majorana…
valassi Apr 3, 2023
36dfe05
[susy2] in susyggtt.sa Parameters.h, fix mdl_G__exp__2 as in SM ggtt.…
valassi Apr 3, 2023
586195d
[susy2] in susyggtt.sa Parameters.h, move non-constexper Majorana fix…
valassi Apr 3, 2023
01f5bbc
[susy2] in CODEGEN, add SUSY process susy_gq_ttllq
valassi Apr 3, 2023
182ae67
[susy2] in CODEGEN, remove SUSY process susy_gq_ttllq
valassi Apr 3, 2023
373607f
Merge remote-tracking branch 'upstream/master' into susy2
valassi Feb 13, 2024
e9c7b07
[susy2] regenerate susy_gg_tt.sa with the latest master - will then a…
valassi Feb 13, 2024
55bfb54
[susy2] add many missing files to the newly generated susy_gg_tt.sa
valassi Feb 14, 2024
bc66488
[susy2] in susyggtt.sa add constexpr to cxsmmpl::conj function (now v…
valassi Feb 14, 2024
098fc73
[susy2] in susyggtt.sa Parameters.h, fix constexpr fixes for Majorana…
valassi Feb 14, 2024
415e27b
[susy2] in susyggtt.sa Parameters.h, fix mdl_G__exp__2 as in SM ggtt.…
valassi Feb 14, 2024
d5ded87
[susy2] add reference log file in susy_gg_tt.sa
valassi Feb 14, 2024
717a5de
[susy2] in susy_gg_tt.sa Parameters_MSSM_SLHA2.h, temporarely(?) comm…
valassi Feb 14, 2024
df8b87a
[susy2] in susyggtt.sa Parameters_MSSM_SLHA2.h, make Parameters_MSSM_…
valassi Feb 14, 2024
4f3b83a
[susy2] in susyggtt.sa Parameters_MSSM_SLHA2.cc, declare "indices" th…
valassi Feb 14, 2024
3295142
[susy2] in susyggtt.sa, fix HRDCOD=0 cuda tests by including mdl_I51x…
valassi Feb 14, 2024
94408b8
[susy2] in susyggtt.sa, test an alternative approach to fix HRDCOD=0 …
valassi Feb 14, 2024
e4bf343
[susy2] in susyggtt.sa, revert the previous commit with an alternativ…
valassi Feb 14, 2024
ac1843c
[susy2] in susyggtt.sa Parameters.h, change mdl_G__exp__2 from conste…
valassi Feb 14, 2024
94f61a6
[susy2] in susyggtt.sa Parameters.h, use fptype instead of fptype_sv …
valassi Feb 14, 2024
c4f0465
[susy2] in CODEGEN (backport from susyggtt.sa) add constexpr to cxsmm…
valassi Feb 27, 2024
a7bf616
[susy2] in CODEGEN, backport susy_gg_tt.sa Parameters_MSSM_SLHA2.h: c…
valassi Feb 27, 2024
a28025e
[susy2] in susyggtt.sa Parameters.h, use cxtype[_sv] for mdl_G__exp__…
valassi Feb 27, 2024
42141b3
[susy2] in CODEGEN, add reference log file for susy_gg_tt
valassi Feb 27, 2024
32cd140
[susy2] in susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.cc, fix clang form…
valassi Feb 27, 2024
812b9a6
[susy2] in CODEGEN for Parameters.cc, declare "indices" that was prev…
valassi Feb 27, 2024
3a6dd0a
[susy2] in susyggtt.sa Parameters.h, go back to fptype[_sv] for mdl_G…
valassi Feb 27, 2024
464e896
[susy2] in susyggtt.sa, various improvements to resync with latest CO…
valassi Feb 27, 2024
b07d597
[susy2] in CODEGEN, almost complete the backport from susy_gg_tt.sa (…
valassi Feb 27, 2024
e3fcb60
[susy2] progress in CODEGEN: implement a mechanism to identify and pr…
valassi Feb 27, 2024
ddbe240
[susy2] in susyggtt.sa, first implementation of constexpr trig functi…
valassi Feb 28, 2024
5aca390
[susy2] in susyggtt.sa, complete(?) the implementation of constexpr t…
valassi Feb 28, 2024
d374ede
[susy2] in susyggtt.sa, fix the last pending issue for HRDCOD=1 cuda …
valassi Feb 28, 2024
ddedfb7
[susy2] in susyggtt.sa testmisc.cc, add the basic test I had used for…
valassi Feb 28, 2024
0728117
[susy2] in susyggtt.sa constexpr_math.h, many fixes in sin/cos/tan af…
valassi Feb 29, 2024
9fd060b
[susy2] in susyggtt.sa, add many additional tests for constexpr_math …
valassi Feb 29, 2024
052b073
[susy2] in susyggtt.sa testmisc.cc, fix tolerances to bypass problema…
valassi Feb 29, 2024
5147422
[susy2] in susyggtt.sa constexpr_math.h, fix atan after more intense …
valassi Feb 29, 2024
6c55ff8
[susy2] in susyggtt.sa, complete testmisc.cc with tests for atan
valassi Feb 29, 2024
e1e62eb
[susy2] in CODEGEN, add constexpr_math.h from susyggtt.sa, with the i…
valassi Feb 29, 2024
8cdd587
[susy2] in CODEGEN, fix clang formatting in constexpr_math.h
valassi Feb 29, 2024
4354adf
[susy2] in CODEGEN, add to testmisc.cc the tests for constexpr_math.h…
valassi Feb 29, 2024
edc5005
[susy2] in CODEGEN cudacpp.mk, add optional quadmath dependency for c…
valassi Feb 29, 2024
26295a6
[susy2] in CODEGEN, fix clang formatting in testmisc.cc
valassi Feb 29, 2024
8671c27
[susy2] in susy_gg_tt.sa, fix clang formatting and minor details as i…
valassi Feb 29, 2024
de9075d
[susy2] in CODEGEN cpp_model_parameters_h.inc, remove constexpr sqrt …
valassi Feb 29, 2024
93cf463
[susy2] in CODEGEN model.py, ensure that constexpr math functions are…
valassi Feb 29, 2024
ccd719c
[susy2] progress in CODEGEN: mark mdl_I51x11 as device parameter (thi…
valassi Feb 29, 2024
2ffa7e5
[susy2] in CODEGEN, add constexpr_math.h to the Parameters.h template
valassi Feb 29, 2024
59c80b6
[susy2] in CODEGEN model.py, fix the 'tan(' replacement with constexpr
valassi Feb 29, 2024
5fc2898
[susy2] in susy_gg_tt.sa, restructure the handling of BSM parameters …
valassi Feb 29, 2024
8c1f38c
[susy2] in susy_gg_tt.sa, failed attempt at further restructuring the…
valassi Feb 29, 2024
322b423
[susy2] in susy_gg_tt.sa, final(?) restructuring of the handling of B…
valassi Feb 29, 2024
8521dd8
[susy2] in susy_gg_tt.sa, further change to ease code generation: add…
valassi Feb 29, 2024
a4cac2b
[susy2] in susy_gg_tt.sa, further changes to ease code generation: ad…
valassi Feb 29, 2024
fbeee37
[susy2] in CODEGEN, finally complete the backport from susy_gg_tt.sa!
valassi Feb 29, 2024
fdd44a8
[susy2] regenerate susy_gg_tt.sa, all ok at last!
valassi Feb 29, 2024
21cec55
[susy2] fixes in CODEGEN for ee_mumu.sa (after completing the backpor…
valassi Feb 29, 2024
3149315
[susy2] in CODEGEN model_handling.py, improve a variable name to indi…
valassi Feb 29, 2024
df071f0
[susy2] regenerate all processes (and add constexpr_math.h) - all ok,…
valassi Feb 29, 2024
8c40ac4
[susy2] in ee_mumu.sa CPPProcess.cc, fix undefined bsmIndepParam by s…
valassi Feb 29, 2024
2f398a9
[susy2] in CODEGEN, fix the build of ee_mumu.sa by adding the undefin…
valassi Feb 29, 2024
319b13f
[susy2] regenerate ee_mumu.sa and susy_gg_tt.sa, they build and test …
valassi Feb 29, 2024
f778ae0
[susy2] regenerate all processes
valassi Feb 29, 2024
022e2f6
[susy2] in susy_gg_tt.sa, fix FPTYPE=f builds
valassi Feb 29, 2024
dcb55e4
[susy2] in CODEGEN, backport from susy_gg_tt.sa the fix for FPTYPE=f …
valassi Feb 29, 2024
81476cf
[susy2] regenerate susy_gg_tt.sa, all ok
valassi Feb 29, 2024
d5a2936
[susy2] regenerate all processes after fixing FPTYPE=f builds
valassi Feb 29, 2024
7e7c99a
[susy2] in susy_gg_tt.sa, use literals for pi constants as M_PIl is n…
valassi Feb 29, 2024
4fb276c
[susy2] in CODEGEN backporting from susy_gg_tt.sa, use literals for p…
valassi Feb 29, 2024
a3719ce
[susy2] regenerate susy_gg_tt.sa, all ok
valassi Feb 29, 2024
1c1d9a9
[susy2] manually copy the new constexpr_math.h with the fix for Mac t…
valassi Feb 29, 2024
49d4003
[susy2] in susy_gg_tt.sa, use literals for pi constants also in testm…
valassi Feb 29, 2024
3ef6e56
[susy2] in CODEGEN backporting susy_gg_tt.sa, use literals for pi con…
valassi Feb 29, 2024
4b48614
[susy2] regenerate susy_gg_tt.sa, all ok
valassi Feb 29, 2024
aee94c3
[susy2] manually fix testmisc.cc in all processes
valassi Feb 29, 2024
26ee2fe
[susy2] in susy_gg_tt.sa testmisc.cc, increase tolerances for Mac
valassi Feb 29, 2024
9c1623e
[susy2] in CODEGEN backporting susy_gg_tt.sa testmisc.cc, increase to…
valassi Feb 29, 2024
ff7c9f6
[susy2] regenerate susy_gg_tt.sa, all ok
valassi Feb 29, 2024
e7f0969
[susy2] manually fix testmisc.cc in all processes again
valassi Feb 29, 2024
9601eea
[susy2] first, go back to rocrand logs on itscrd90 for easier compari…
valassi Mar 1, 2024
cc421be
[susy2] rerun 78 tput tests on itscrd90, all ok
valassi Mar 1, 2024
f4d951c
[susy2] rerun 18 tmad tests on itscrd90, all ok
valassi Mar 1, 2024
ca46e82
[susy2] bug fix in tput/teeThroughputX.sh: rename -rorhst as -hirhst …
valassi Mar 3, 2024
a143159
[susy2] bug fix in tput/throughputX.sh: comment out -hirhst (which is…
valassi Mar 3, 2024
01096bb
[susy2] rerun 72 tput tests on LUMI - all ok (with known errors on gq…
valassi Mar 3, 2024
a77fb34
[susy2] rerun 18 tmad tests on LUMI, all ok (execpt for gqttq as usual)
valassi Mar 3, 2024
ed81e86
[susy2] go back to itscrd90 test logs
valassi Mar 3, 2024
b0863f0
[susy2] rerun 78 tput tests on itgold91 for the first time, all ok - …
valassi Mar 3, 2024
ca849bd
[susy2] rerun 18 tmad tests on itgold91 for the first time, all ok - …
valassi Mar 3, 2024
4b09db3
[susy2] go back to itscrd90 test logs
valassi Mar 3, 2024
0fc6567
[susy2] in CODEGEN, add reference test file for susy_gg_tttt
valassi Mar 3, 2024
22c4ed0
[susy2] in CODEGEN, add reference test file for susy_gg_gogo
valassi Mar 3, 2024
7643469
[susy2] ** COMPLETE SUSY2 ** in CODEGEN, add reference test file for …
valassi Mar 3, 2024
5afacab
Merge remote-tracking branch 'upstream/master' (with PR #822 fixing #…
valassi Mar 14, 2024
6b66787
[susy2] in CODEGEN model_handling.py, add comments in write_hardcoded…
valassi Mar 14, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
Original file line number Diff line number Diff line change
Expand Up @@ -38,9 +38,9 @@ Parameters_%(model_name)s::getInstance()
void
Parameters_%(model_name)s::setIndependentParameters( SLHAReader& slha )
{
zero = 0; // define "zero"
ZERO = 0; // define "zero"
//std::vector<int> indices(2, 0); // prepare a vector for indices
zero = 0; // define "zero"
ZERO = 0; // define "zero"
std::vector<int> indices( 2, 0 ); // prepare a vector for indices
%(set_independent_parameters)s
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,9 +18,16 @@
#include "mgOnGpuCxtypes.h"
#include "mgOnGpuVectors.h"

#include "constexpr_math.h"

//==========================================================================

#ifndef MGONGPU_HARDCODE_PARAM // this is only supported in SM processes (e.g. not in EFT models) for the moment (#439)%(efterror)s
// AV Jan 2024 (PR #625): this ugly #define was the only way I found to avoid creating arrays[nBsm] in CPPProcess.cc if nBsm is 0
// The problem is that nBsm is determined when generating Parameters.h, which happens after CPPProcess.cc has already been generated
// For simplicity, keep this code hardcoded also for SM processes (a nullptr is needed as in the case nBsm == 0)
%(bsmdefine)s

#ifndef MGONGPU_HARDCODE_PARAM%(eftwarn0)s

#include "read_slha.h"

Expand Down Expand Up @@ -70,7 +77,7 @@ namespace mg5amcCpu
//void printDependentParameters(); // now computed event-by-event (running alphas #373)

// Print couplings that are changed event by event
//void printDependentCouplings(); // now computed event-by-event (running alphas #373)
//void printDependentCouplings(); // now computed event-by-event (running alphas #373)%(bsmip0)s

private:

Expand All @@ -79,7 +86,7 @@ namespace mg5amcCpu

} // end namespace mg5amcGpu/mg5amcCpu

#else
#else%(eftwarn1)s

#include <cassert>
#include <limits>
Expand All @@ -94,37 +101,6 @@ namespace mg5amcCpu
// Hardcoded constexpr physics parameters
namespace Parameters_%(model_name)s // keep the same name rather than HardcodedParameters_%(model_name)s for simplicity
{
// Constexpr implementation of sqrt (see https://stackoverflow.com/a/34134071)
double constexpr sqrtNewtonRaphson( double x, double curr, double prev )
{
return curr == prev ? curr : sqrtNewtonRaphson( x, 0.5 * ( curr + x / curr ), curr );
}
double constexpr constexpr_sqrt( double x )
{
return x >= 0 // && x < std::numeric_limits<double>::infinity() // avoid -Wtautological-constant-compare warning in fast math
? sqrtNewtonRaphson( x, x, 0 )
: std::numeric_limits<double>::quiet_NaN();
}

// Constexpr implementation of floor (see https://stackoverflow.com/a/66146159)
constexpr int constexpr_floor( double d )
{
const int i = static_cast<int>( d );
return d < i ? i - 1 : i;
}

// Constexpr implementation of pow
constexpr double constexpr_pow( double base, double exp )
{
// NB(1): this implementation of constexpr_pow requires exponent >= 0
assert( exp >= 0 ); // NB would fail at compile time with "error: call to non-‘constexpr’ function ‘void __assert_fail'"
// NB(2): this implementation of constexpr_pow requires an integer exponent
const int iexp = constexpr_floor( exp );
assert( static_cast<double>( iexp ) == exp ); // NB would fail at compile time with "error: call to non-‘constexpr’ function ‘void __assert_fail'"
// Iterative implementation of pow if exp is a non negative integer
return iexp == 0 ? 1 : base * constexpr_pow( base, iexp - 1 );
}

// Model parameters independent of aS
constexpr double zero = 0;
constexpr double ZERO = 0;
Expand All @@ -145,7 +121,7 @@ namespace mg5amcCpu
//void printDependentParameters(); // now computed event-by-event (running alphas #373)

// Print couplings that are changed event by event
//void printDependentCouplings(); // now computed event-by-event (running alphas #373)
//void printDependentCouplings(); // now computed event-by-event (running alphas #373)%(bsmip1)s
}

} // end namespace mg5amcGpu/mg5amcCpu
Expand All @@ -170,16 +146,19 @@ namespace mg5amcCpu
%(dcoupdecl)s
};
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable" // e.g. <<warning: unused variable ‘mdl_G__exp__2’ [-Wunused-variable]>>
#pragma GCC diagnostic ignored "-Wunused-parameter" // e.g. <<warning: unused parameter ‘G’ [-Wunused-parameter]>>
#pragma GCC diagnostic ignored "-Wunused-parameter" // e.g. <<warning: unused parameter ‘G’ [-Wunused-parameter]>>
#pragma GCC diagnostic ignored "-Wunused-variable" // e.g. <<warning: unused variable ‘mdl_G__exp__2’ [-Wunused-variable]>>
#pragma GCC diagnostic ignored "-Wunused-but-set-variable" // e.g. <<warning: variable ‘mdl_G__exp__2’ set but not used [-Wunused-but-set-variable]>>
#ifdef MGONGPUCPP_GPUIMPL
#pragma nv_diagnostic push
#pragma nv_diag_suppress 177 // e.g. <<warning #177-D: variable "mdl_G__exp__2" was declared but never referenced>>
#endif
__host__ __device__ inline const DependentCouplings_sv computeDependentCouplings_fromG( const fptype_sv& G_sv )
__host__ __device__ inline const DependentCouplings_sv computeDependentCouplings_fromG( const fptype_sv& G_sv, const double* bsmIndepParamPtr )
{
#ifdef MGONGPU_HARDCODE_PARAM
using namespace Parameters_%(model_name)s;
#else
%(eftspecial0)s
#endif
// NB: hardcode cxtype cI(0,1) instead of cxtype (or hardcoded cxsmpl) mdl_complexi (which exists in Parameters_%(model_name)s) because:
// (1) mdl_complexi is always (0,1); (2) mdl_complexi is undefined in device code; (3) need cxsmpl conversion to cxtype in code below
Expand Down Expand Up @@ -220,12 +199,13 @@ namespace mg5amcCpu
template<class G_ACCESS, class C_ACCESS>
__device__ inline void
G2COUP( const fptype gs[],
fptype couplings[] )
fptype couplings[],
const double* bsmIndepParamPtr )
{
mgDebug( 0, __FUNCTION__ );
using namespace Parameters_%(model_name)s_dependentCouplings;
const fptype_sv& gs_sv = G_ACCESS::kernelAccessConst( gs );
DependentCouplings_sv couplings_sv = computeDependentCouplings_fromG( gs_sv );
DependentCouplings_sv couplings_sv = computeDependentCouplings_fromG( gs_sv, bsmIndepParamPtr );
%(dcoupaccessbuffer)s%(dcoupkernelaccess)s%(dcoupcompute)s
mgDebug( 1, __FUNCTION__ );
return;
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,232 @@
// Copyright (C) 2020-2024 CERN and UCLouvain.
// Licensed under the GNU Lesser General Public License (version 3 or later).
// Created by: A. Valassi (Feb 2024) for the MG5aMC CUDACPP plugin.
// Further modified by: A. Valassi (2024) for the MG5aMC CUDACPP plugin.

#ifndef constexpr_math_h
#define constexpr_math_h 1

#include "mgOnGpuConfig.h"

#include <cassert>
#include <cmath>
#include <limits>

// FOR DEBUGGING!
#undef CONSTEXPR_MATH_DEBUG // no-debug
//#define CONSTEXPR_MATH_DEBUG 1 // debug
#ifdef CONSTEXPR_MATH_DEBUG
#define constexpr const
#endif

// NB: namespaces mg5amcGpu and mg5amcCpu includes types which are defined in different ways for CPU and GPU builds (see #318 and #725)
#ifdef MGONGPUCPP_GPUIMPL
namespace mg5amcGpu
#else
namespace mg5amcCpu
#endif
{
// Constexpr implementation of sqrt (see https://stackoverflow.com/a/34134071)
constexpr long double sqrtNewtonRaphson( const long double xx, const long double curr, const long double prev )
{
return curr == prev ? curr : sqrtNewtonRaphson( xx, 0.5 * ( curr + xx / curr ), curr );
}
constexpr long double constexpr_sqrt( const long double xx )
{
return xx >= 0 // && x < std::numeric_limits<double>::infinity() // avoid -Wtautological-constant-compare warning in fast math
? sqrtNewtonRaphson( xx, xx, 0 )
: std::numeric_limits<long double>::quiet_NaN();
}

// Constexpr implementation of floor (see https://stackoverflow.com/a/66146159)
constexpr int constexpr_floor( const long double xx )
{
const int i = static_cast<int>( xx );
return xx < i ? i - 1 : i;
}

// Constexpr implementation of pow
constexpr long double constexpr_pow( const long double base, const long double exp )
{
// NB(1): this implementation of constexpr_pow requires exponent >= 0
assert( exp >= 0 ); // NB would fail at compile time with "error: call to non-‘constexpr’ function ‘void __assert_fail'"
// NB(2): this implementation of constexpr_pow requires an integer exponent
const int iexp = constexpr_floor( exp );
assert( static_cast<long double>( iexp ) == exp ); // NB would fail at compile time with "error: call to non-‘constexpr’ function ‘void __assert_fail'"
// Iterative implementation of pow if exp is a non negative integer
return iexp == 0 ? 1 : base * constexpr_pow( base, iexp - 1 );
}

// PI constants
// NB1: M_PIl from from cmath is not defined on Mac
// NB2: std::numbers::pi needs c++20 but we are still using c++17
// NB3: I could use my constexpr_atan(1)*4... but a literal is better?
//constexpr long double constexpr_pi = M_PIl; // pi
//constexpr long double constexpr_pi_by_2 = M_PI_2l; // pi/2
//constexpr long double constexpr_pi_by_4 = M_PI_4l; // pi/4
constexpr long double constexpr_pi = 3.141592653589793238462643383279502884L; // same as M_PIl in gcc
constexpr long double constexpr_pi_by_2 = 1.570796326794896619231321691639751442L; // same as M_PI_2l in gcc
constexpr long double constexpr_pi_by_4 = 0.785398163397448309615660845819875721L; // same as M_PI_4l in gcc
static_assert( constexpr_pi_by_4 * 4 == constexpr_pi );
static_assert( constexpr_pi_by_4 * 2 == constexpr_pi_by_2 );
static_assert( constexpr_pi_by_2 * 2 == constexpr_pi );

// Constexpr implementation of sin for 0<x<pi/4 (long double signature)
// Taylor expansion : x - x**3/3! + x**5/5!
constexpr long double sinTaylor( const long double xx )
{
assert( xx >= 0 && "The argument of sinTaylor is assumed to be in [0,pi/4)" );
assert( xx < constexpr_pi_by_4 && "The argument of sinTaylor is assumed to be in [0,pi/4)" );
long double sinx = 0;
int ipow = 1;
long double delta = xx;
while( true )
{
long double sinxlast = sinx;
sinx += delta;
#ifdef CONSTEXPR_MATH_DEBUG
std::cout << "ipow=" << ipow << ", delta=" << delta << ", sinx=" << sinx << std::endl; // for debugging (not constexpr)
#endif
if( sinx == sinxlast ) break;
// Next iteration
ipow += 2;
delta *= -xx * xx / ( ipow - 1 ) / ipow;
}
return sinx;
}

// Mapping to [0,2*pi) range (long double signature)
constexpr long double mapIn0to2Pi( const long double xx )
{
return xx - constexpr_floor( xx / 2 / constexpr_pi ) * 2 * constexpr_pi;
}

// Constexpr implementation of cos (long double signature)
constexpr long double constexpr_cos_quad( const long double xx, const bool assume0to2Pi = false )
{
if( assume0to2Pi )
{
assert( xx >= 0 && "The argument of constexpr_cos_quad is assumed to be in [0,2*pi)" );
assert( xx < 2 * constexpr_pi && "The argument of constexpr_cos_quad is assumed to be in [0,2*pi)" );
}
if( xx < 0 )
return constexpr_cos_quad( mapIn0to2Pi( xx ), true );
else if( xx < constexpr_pi_by_4 ) // [0/4*pi, 1/4*pi)
return constexpr_sqrt( 1 - constexpr_pow( sinTaylor( xx ), 2 ) );
else if( xx < constexpr_pi_by_2 ) // [1/4*pi, 2/4*pi)
return sinTaylor( constexpr_pi_by_2 - xx );
else if( xx < 3 * constexpr_pi_by_4 ) // [2/4*pi, 3/4*pi)
return -sinTaylor( xx - constexpr_pi_by_2 );
else if( xx < constexpr_pi ) // [3/4*pi, 4/4*pi)
return -constexpr_sqrt( 1 - constexpr_pow( sinTaylor( constexpr_pi - xx ), 2 ) );
else if( xx < 2 * constexpr_pi ) // [4/4*pi, 8/4*pi)
return constexpr_cos_quad( 2 * constexpr_pi - xx, true );
else // [8/4*pi, +inf)
return constexpr_cos_quad( mapIn0to2Pi( xx ), true );
}

// Constexpr implementation of cos (double signature, internally implemented as long double)
constexpr double constexpr_cos( const double x )
{
return constexpr_cos_quad( x );
}

// Constexpr implementation of sin (long double signature)
constexpr long double constexpr_sin_quad( const long double xx, const bool assume0to2Pi = false )
{
if( assume0to2Pi )
{
assert( xx >= 0 && "The argument of constexpr_sin_quad is assumed to be in [0,2*pi)" );
assert( xx < 2 * constexpr_pi && "The argument of constexpr_sin_quad is assumed to be in [0,2*pi)" );
}
if( xx < 0 )
return constexpr_sin_quad( mapIn0to2Pi( xx ), true );
else if( xx < constexpr_pi_by_4 ) // [0/4*pi, 1/4*pi)
return sinTaylor( xx );
else if( xx < constexpr_pi_by_2 ) // [1/4*pi, 2/4*pi)
return constexpr_sqrt( 1 - constexpr_pow( sinTaylor( constexpr_pi_by_2 - xx ), 2 ) );
else if( xx < 3 * constexpr_pi_by_4 ) // [2/4*pi, 3/4*pi)
return constexpr_sqrt( 1 - constexpr_pow( sinTaylor( xx - constexpr_pi_by_2 ), 2 ) );
else if( xx < constexpr_pi ) // [3/4*pi, 4/4*pi)
return sinTaylor( constexpr_pi - xx );
else if( xx < 2 * constexpr_pi ) // [4/4*pi, 8/4*pi)
return -constexpr_sin_quad( 2 * constexpr_pi - xx, true );
else // [8/4*pi, +inf)
return constexpr_sin_quad( mapIn0to2Pi( xx ), true );
}

// Constexpr implementation of sin (double signature, internally implemented as long double)
constexpr double constexpr_sin( const double x )
{
return constexpr_sin_quad( x );
}

// Constexpr implementation of tan (long double signature)
constexpr long double constexpr_tan_quad( const long double xx, const bool assume0to2Pi = false )
{
if( assume0to2Pi )
{
assert( xx >= 0 && "The argument of constexpr_sin_quad is assumed to be in [0,2*pi)" );
assert( xx < 2 * constexpr_pi && "The argument of constexpr_sin_quad is assumed to be in [0,2*pi)" );
}
if( xx < 0 )
return constexpr_tan_quad( mapIn0to2Pi( xx ), true );
else if( xx < 2 * constexpr_pi ) // [0, 2*pi)
return constexpr_sin_quad( xx, assume0to2Pi ) / constexpr_cos_quad( xx, assume0to2Pi );
else // [8/4*pi, +inf)
return constexpr_tan_quad( mapIn0to2Pi( xx ), true );
}

// Constexpr implementation of tan (double signature, internally implemented as long double)
constexpr double constexpr_tan( const double x )
{
return constexpr_tan_quad( x );
}

// Constexpr implementation of atan for -1<x<1 (long double signature)
// Taylor expansion : x - x**3/3 + x**5/5...
constexpr long double atanTaylor( const long double xx )
{
assert( xx >= -1 && "The argument of atanTaylor is assumed to be in (-1,+1)" );
assert( xx < 1 && "The argument of atanTaylor is assumed to be in (-1,+1)" );
long double atanx = 0;
int ipow = 1;
long double xpow = xx;
while( true )
{
long double atanxlast = atanx;
atanx += xpow / ipow;
#ifdef CONSTEXPR_MATH_DEBUG
std::cout << "ipow=" << ipow << ", xpow=" << xpow << ", atanx=" << atanx << std::endl; // for debugging (not constexpr)
#endif
if( atanx == atanxlast ) break;
// Next iteration
ipow += 2;
xpow *= -xx * xx;
}
return atanx;
}

// Constexpr implementation of atan (long double signature)
constexpr long double constexpr_atan_quad( const long double xx )
{
if( xx > 1 )
return constexpr_pi_by_2 - atanTaylor( 1 / xx );
else if( xx == 1 )
return constexpr_pi_by_4;
else if( xx > -1 )
return atanTaylor( xx );
else if( xx == -1 )
return -constexpr_pi_by_4;
else // if( xx < -1 )
return -constexpr_pi_by_2 - atanTaylor( 1 / xx );
}

// Constexpr implementation of atan (double signature, internally implemented as long double)
constexpr double constexpr_atan( const double x )
{
return constexpr_atan_quad( x );
}
}

#endif // constexpr_math_h
Original file line number Diff line number Diff line change
Expand Up @@ -847,6 +847,9 @@ $(testmain): LIBFLAGS += -lgomp
endif
endif

# Test quadmath in testmisc.cc tests for constexpr_math #627
###$(testmain): LIBFLAGS += -lquadmath

# Bypass std::filesystem completely to ease portability on LUMI #803
#ifneq ($(findstring hipcc,$(GPUCC)),)
#$(testmain): LIBFLAGS += -lstdc++fs
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,8 @@ namespace mgOnGpu /* clang-format off */
};

template<typename FP>
inline __host__ __device__ cxsmpl<FP> // (NB: cannot be constexpr as a constexpr function cannot have a nonliteral return type "mgOnGpu::cxsmpl")
constexpr // (NB: now valid code? in the past this failed as "a constexpr function cannot have a nonliteral return type mgOnGpu::cxsmpl")
inline __host__ __device__ cxsmpl<FP>
conj( const cxsmpl<FP>& c )
{
return cxsmpl<FP>( c.real(), -c.imag() );
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@
#include <algorithm>
#include <array>
#include <cstring>
#include <iomanip>
#include <iostream>
#include <memory>

Expand Down
Loading
Loading