SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr (only CUDA fails, while C++ succeeds) #627

Closed
valassi opened this issue Apr 3, 2023 · 3 comments · Fixed by #625
@valassi

valassi commented Apr 3, 2023

SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr

See for instance susy gg to tt in PR #625, and in particular 586195d

The log says

The HRDCOD=1 build now fails with

ccache /usr/local/cuda-12.0/bin/nvcc  -O3  -lineinfo -I. -I../../src -I../../../../../tools -I/usr/local/cuda-12.0/include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /usr/lib64/ccache/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -DMGONGPU_HARDCODE_PARAM -Xcompiler -fPIC -c gcheck_sa.cu -o gcheck_sa.o
../../src/Parameters_MSSM_SLHA2.h(310): error: expression must have a constant value
../../src/Parameters_MSSM_SLHA2.h(310): note #2703-D: cannot call non-constexpr function "atan(double) noexcept(true)"
/usr/local/cuda-12.0/include/crt/math_functions.h(4137): here
../../src/Parameters_MSSM_SLHA2.h(725): error: expression must have a constant value
../../src/Parameters_MSSM_SLHA2.h(725): note #2703-D: cannot call non-constexpr function "cos(double) noexcept(true)"
/usr/local/cuda-12.0/include/crt/math_functions.h(553): here
../../src/Parameters_MSSM_SLHA2.h(726): error: expression must have a constant value
../../src/Parameters_MSSM_SLHA2.h(726): note #2703-D: cannot call non-constexpr function "sin(double) noexcept(true)"
/usr/local/cuda-12.0/include/crt/math_functions.h(520): here
3 errors detected in the compilation of "gcheck_sa.cu".
@valassi valassi changed the title from "SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr" to "SUSY HRDCOD=1 builds fail because sin/cos/atan are not constexpr (only CUDA fails, while C++ succeeds)" on Feb 14, 2024
@valassi

valassi commented Feb 14, 2024

Update Feb 2024. Actually, it is only nvcc that complains that sin/cos/atan are not constexpr. The C++ only build (CUDA_HOME=none using the old setup) succeeds, and the runtime tests are ok.

Maybe we can try to hardcode the gcc implementation of constexpr sin/cos/atan into our nvcc code?

See #625 (comment)

valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 14, 2024
…for mdl_G__exp__2 in FPTYPE=f

'make HRDCOD=0 FPTYPE=f -j' builds now succeed and the tests succeed too.

'make HRDCOD=0 FPTYPE=m -j' builds now also succeed and the tests succeed too.

'CUDA_HOME=none make HRDCOD=1' builds and tests also succeed with FPTYPE=d,f,m

The only pending problem is the CUDA HRDCOD=1 builds, which fail because of the missing constexpr sin/cos/atan (madgraph5#627)

This is probably ready to be backported to CODEGEN and eventually merged
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…ons for HRDCOD=1 cuda builds (madgraph5#627)

Builds still fail because assert is not constexpr (I need to extend the function to cover any value and not just 0 to pi/2)
@valassi valassi self-assigned this Feb 28, 2024
@valassi

valassi commented Feb 28, 2024

I have made a first prototype in PR #625 here 46694af

Still need some fixes (extend the range beyond 0 to pi/2, add arctan).

The implementation is simply based on a Taylor series expansion, which is easy to write as a constexpr function.
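As an illustration of the idea, here is a minimal sketch of a Taylor-series sine that the compiler can evaluate at build time. This is not the code from PR #625: the function name, the number of terms and the checks are all assumptions made for this example, and it is only meant for arguments between 0 and pi/2 (C++14 or later is assumed because of the loop inside the constexpr function).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not the code from PR #625.
// Taylor series sin(x) = x - x^3/3! + x^5/5! - ..., intended for 0 <= x <= pi/2.
constexpr double sketch_sin_taylor( double x )
{
  double term = x; // first term of the series
  double sum = x;
  for( int n = 1; n <= 12; ++n )
  {
    // Each term is the previous one times -x^2 / ( 2n * ( 2n + 1 ) )
    term *= -x * x / ( ( 2 * n ) * ( 2 * n + 1 ) );
    sum += term;
  }
  return sum;
}

// Forced compile-time evaluation: sin(1) = 0.8414709848...
constexpr double kSinOne = sketch_sin_taylor( 1. );
static_assert( kSinOne > 0.8414709 && kSinOne < 0.8414710, "unexpected value for sin(1)" );

// Runtime cross-check against the (non-constexpr) library function
inline void sketch_check_against_libm()
{
  for( double x = 0.; x <= 1.5; x += 0.25 )
    assert( std::fabs( sketch_sin_taylor( x ) - std::sin( x ) ) < 1e-12 );
}
```

Because the function only uses arithmetic and a loop, it does not depend on the math library being constexpr, which is the property that nvcc was missing in the log above.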

@valassi valassi linked a pull request Feb 28, 2024 that will close this issue
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…ons for HRDCOD=1 cuda builds (madgraph5#627)

Builds still fail because assert is not constexpr (I need to extend the function to cover any value and not just 0 to pi/2).
Also missing is atan.
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…rig functions for HRDCOD=1 cuda builds (madgraph5#627)

Extend coverage of sin/cos/tan to any value, and add an atan function.

Now the build fails with the following:
HRDCOD=1 make
ccache /usr/local/cuda-12.0/bin/nvcc   -Xcompiler -O3 -lineinfo -I. -I../../src -I/usr/local/cuda-12.0/include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /usr/lib64/ccache/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -DMGONGPU_HARDCODE_PARAM -Xcompiler -fPIC -c -x cu CPPProcess.cc -o CPPProcess_cu.o
../../src/Parameters_MSSM_SLHA2.h(818): error: identifier "mg5amcGpu::Parameters_MSSM_SLHA2::mdl_I51x11" is undefined in device code
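The extension of sin/cos/tan to any argument mentioned in this commit can be sketched with standard symmetry and periodicity identities. The code below is an assumption-laden illustration, not the implementation in the repository: all names are hypothetical, atan is left out, and very large arguments (where the number of 2*pi turns would not fit in a long long) are not handled.

```cpp
#include <cassert>
#include <cmath>

constexpr double kSketchPi = 3.14159265358979323846;

// Taylor series for sin(x), accurate for 0 <= x <= pi/2 (hypothetical helper)
constexpr double sketch_sin_quadrant( double x )
{
  double term = x;
  double sum = x;
  for( int n = 1; n <= 12; ++n )
  {
    term *= -x * x / ( ( 2 * n ) * ( 2 * n + 1 ) );
    sum += term;
  }
  return sum;
}

// sin(x) for any x whose number of 2*pi turns fits in a long long
constexpr double sketch_sin( double x )
{
  double sign = 1.;
  if( x < 0 ) { x = -x; sign = -1.; } // sin(-x) = -sin(x)
  const long long nturns = static_cast<long long>( x / ( 2 * kSketchPi ) );
  x -= nturns * ( 2 * kSketchPi ); // periodicity: now 0 <= x < 2*pi, up to rounding
  if( x > kSketchPi ) { x -= kSketchPi; sign = -sign; } // sin(x) = -sin(x - pi)
  if( x > kSketchPi / 2 ) x = kSketchPi - x; // sin(x) = sin(pi - x)
  return sign * sketch_sin_quadrant( x );
}

// cos and tan follow from sin
constexpr double sketch_cos( double x ) { return sketch_sin( x + kSketchPi / 2 ); }
constexpr double sketch_tan( double x ) { return sketch_sin( x ) / sketch_cos( x ); }

// Small constexpr comparison helper for the compile-time checks below
constexpr bool sketch_near( double a, double b ) { return a - b < 1e-12 && b - a < 1e-12; }

static_assert( sketch_near( sketch_sin( 7 * kSketchPi / 6 ), -0.5 ), "third quadrant" );
static_assert( sketch_near( sketch_cos( kSketchPi ), -1. ), "cos(pi)" );
static_assert( sketch_near( sketch_sin( -kSketchPi / 2 ), -1. ), "odd symmetry" );
```

Reducing the argument first keeps the Taylor series in the range where a fixed, small number of terms is enough.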
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 28, 2024
…builds (madgraph5#627): it was enough to define mdl_I51x11 as a "__device__" constexpr.

NB: Now the build fully succeeds and runTest.exe also succeeds for HRDCOD=1!
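The declaration style used in this fix can be sketched as follows. The macros, the namespace and the function are hypothetical names introduced only for this example; the sketch has been written so that it also compiles as plain C++, where the CUDA attributes expand to nothing, and it makes no claim about how the real Parameters_MSSM_SLHA2.h is organised.

```cpp
#include <cassert>

// Expand to CUDA attributes only when compiled by nvcc; plain C++ compilers see nothing
#ifdef __CUDACC__
#define SKETCH_DEVICE __device__
#else
#define SKETCH_DEVICE
#endif

namespace sketch_params // hypothetical namespace standing in for the parameters namespace
{
  // Same declaration style as the fixed line: a constexpr that is also marked __device__ under nvcc
  SKETCH_DEVICE constexpr double mdl_I51x11 = 1.;
}

// Hypothetical function that reads the parameter (a device function under nvcc)
SKETCH_DEVICE inline double sketch_scaled_coupling( double g )
{
  return g * sketch_params::mdl_I51x11;
}

static_assert( sketch_params::mdl_I51x11 == 1., "value is still usable at compile time" );
```

The value stays a compile-time constant, while the annotation makes the variable visible to device code when nvcc is the compiler.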
@valassi

valassi commented Feb 28, 2024

The fix for HRDCOD=1 builds in SUSY is now complete

(SUSY itself is almost complete; only some minor pieces still need to be added to CODEGEN)

cc @oliviermattelaer @roiser

@valassi valassi closed this as completed Feb 28, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
…mplementation of constexpr trig functions for HRDCOD=1 cuda builds (madgraph5#627)

Also add it to output.py in CODEGEN
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
valassi added a commit to valassi/madgraph4gpu that referenced this issue Feb 29, 2024
… used madgraph5#627

Also add a comment about std::complex (replace is done twice)

The differences in generated code are now the following:

______________________________________________________________________________
git diff /data/avalassi/GPU2023/madgraph4gpuBis/epochX/cudacpp/susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/CPPProcess.cc
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
epochX/cudacpp/susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/CPPProcess.cc /tmp/git-blob-dFUXLJ/CPPProcess.cc 0651b576632e9d50c3ed2bf9461bd41309c90821 100644 epochX/cudacpp/susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/CPPProcess.cc 0000000000000000000000000000000000000000 100644
79c79
<   __device__ const fptype cIPD[3] = { (fptype)Parameters_MSSM_SLHA2::mdl_MT, (fptype)Parameters_MSSM_SLHA2::mdl_WT, (fptype)Parameters_MSSM_SLHA2::mdl_I51x11 };
---
>   __device__ const fptype cIPD[2] = { (fptype)Parameters_MSSM_SLHA2::mdl_MT, (fptype)Parameters_MSSM_SLHA2::mdl_WT };
83c83
<   __device__ __constant__ fptype cIPD[3];
---
>   __device__ __constant__ fptype cIPD[2];
86c86
<   static fptype cIPD[3];
---
>   static fptype cIPD[2];
505c505
<     const fptype tIPD[3] = { (fptype)m_pars->mdl_MT, (fptype)m_pars->mdl_WT, (fptype)m_pars->mdl_I51x11 };
---
>     const fptype tIPD[2] = { (fptype)m_pars->mdl_MT, (fptype)m_pars->mdl_WT };
508c508
<     gpuMemcpyToSymbol( cIPD, tIPD, 3 * sizeof( fptype ) );
---
>     gpuMemcpyToSymbol( cIPD, tIPD, 2 * sizeof( fptype ) );
511c511
<     memcpy( cIPD, tIPD, 3 * sizeof( fptype ) );
---
>     memcpy( cIPD, tIPD, 2 * sizeof( fptype ) );
514c514
<     //for ( i=0; i<3; i++ ) std::cout << std::setprecision(17) << "tIPD[i] = " << tIPD[i] << std::endl;
---
>     //for ( i=0; i<2; i++ ) std::cout << std::setprecision(17) << "tIPD[i] = " << tIPD[i] << std::endl;

______________________________________________________________________________
git diff /data/avalassi/GPU2023/madgraph4gpuBis/epochX/cudacpp/susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.h
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
epochX/cudacpp/susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.h /tmp/git-blob-fVhC4J/Parameters_MSSM_SLHA2.h 2805ef7d81bd8340b2a77b2ec16df7639165e30c 100644 epochX/cudacpp/susy_gg_tt.sa/src/Parameters_MSSM_SLHA2.h 0000000000000000000000000000000000000000 100644
23,24d22
< #include "constexpr_math.h"
<
295c293
<     __device__ constexpr double mdl_I51x11 = 1.;
---
>     constexpr double mdl_I51x11 = 1.;
370c368
<     constexpr cxsmpl<double> mdl_bb = ( ( -mdl_mHd2 + mdl_mHu2 - mdl_MZ__exp__2 * constexpr_cos( 2. * mdl_beta ) ) * constexpr_tan( 2. * mdl_beta ) ) / 2.;
---
>     constexpr cxsmpl<double> mdl_bb = ( ( -mdl_mHd2 + mdl_mHu2 - mdl_MZ__exp__2 * constexpr_cos( 2. * mdl_beta ) ) * tan( 2. * mdl_beta ) ) / 2.;
801,802c799
<       //const double mdl_I51x11 = Parameters_MSSM_SLHA2::getInstance()->mdl_I51x11; // fix HRDCOD=0 susy builds
<       const fptype mdl_I51x11 = cIPD[2]; // fix HRDCOD=0 susy builds
---
>       const fptype mdl_I51x11;