Upgrade to Olivier's latest gpucpp branch upstream ("split_nonidentical_grouping") #619

Merged
valassi merged 11 commits into madgraph5:master on Apr 5, 2023

Conversation

valassi
Member

@valassi valassi commented Mar 31, 2023

Upgrade to Olivier's latest gpucpp branch upstream

The only change is that in eemumu.mad the P1_ll_ll subdirectory is now P1_epem_mupmum

This is related to #272, see #272 (comment). The last change by @oliviermattelaer now prevents directories with nprocesses>1. See the discussion at the meeting earlier this week: https://indico.cern.ch/event/1263518/

@valassi valassi marked this pull request as draft April 3, 2023 14:52
@valassi valassi changed the title Upgrade to Olivier's latest gpucpp branch upstream WIP: Upgrade to Olivier's latest gpucpp branch upstream Apr 3, 2023
@valassi valassi changed the title WIP: Upgrade to Olivier's latest gpucpp branch upstream WIP: Upgrade to Olivier's latest gpucpp branch upstream ("split_nonidentical_grouping") Apr 3, 2023
@valassi
Member Author

valassi commented Apr 3, 2023

This PR in itself is complete and could be merged, but I would like to do a few things before merging it. I am keeping it in draft for the moment.

Olivier's changes I refer to are those that introduce the "split_nonidentical_grouping" mode: mg5amcnlo/mg5amcnlo@cd272f8 and mg5amcnlo/mg5amcnlo@aa37739

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 3, 2023
…l ok with no change

This completes the first "susy" patch: now susy_gg_tt can be generated correctly (but it does not build).
In practice, the main (only?) issue it addresses is madgraph5#622

Further patches (susy2 and possibly more) will attempt to fix these builds.

NB: At this stage, CODEGEN is still using the upstream mg5amcnlo without "split_nonidentical_grouping" (PR madgraph5#619 and madgraph5#272)
…(revert to previous 11 codegen logs for easier rebasing)

Revert "[susy] ** COMPLETE SUSY (PART 1) ** regenerate five processes mad, all ok with no change"
This reverts commit acbe689.

Revert "[susy] regenerate 6 processes SA, all ok with no change"
This reverts commit 806e7d7.
…w gpucpp upstream

This will make it easier to check if there are any other differences...
git mv ee_mumu.mad/SubProcesses/P1_ll_ll/ ee_mumu.mad/SubProcesses/P1_epem_mupmum/
STARTED  AT Fri Mar 31 10:10:01 AM CEST 2023
./tput/teeThroughputX.sh -mix -hrd -makej -eemumu -ggtt -ggttg -ggttgg -ggttggg -makeclean
ENDED(1) AT Fri Mar 31 10:26:06 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -flt -hrd -makej -eemumu -ggtt -ggttgg -inlonly -makeclean
ENDED(2) AT Fri Mar 31 10:30:57 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -makej -eemumu -ggtt -ggttg -ggttgg -ggttggg -flt -bridge -makeclean
ENDED(3) AT Fri Mar 31 10:32:03 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -rmbhst
ENDED(4) AT Fri Mar 31 10:32:52 AM CEST 2023 [Status=0]
./tput/teeThroughputX.sh -eemumu -ggtt -ggttgg -flt -curhst
ENDED(5) AT Fri Mar 31 10:33:40 AM CEST 2023 [Status=0]

(Later rerun the tests as the GPU was not configured correctly)
…nly for eemumu

Revert "[gpucpp] TEMPORARY hack to only run eemumu testst in allTees from tput and tmad"
This reverts commit 6634d0f.
@valassi
Member Author

valassi commented Apr 3, 2023

I have rebased to after the first susy patch #622

valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 3, 2023
NB this has a single P1 subdirectory gq_ttxq! (but nprocesses=1 in each see madgraph5#272)

The Fortran code has DSIG1 and also DSIG2.
Clearly the DSIG2 code is not correctly interfaced to our cudacpp

The code fails to build:
...
ccache g++  -O3  -std=c++17 -I. -I../../src -I../../../../../tools -DUSE_NVTX -Wall -Wshadow -Wextra -ffast-math  -fopenmp -march=skylake-avx512 -mprefer-vector-width=256  -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -I/usr/local/cuda-12.0/include/ -fPIC -c CPPProcess.cc -o CPPProcess.o
In file included from CPPProcess.cc:25:
coloramps.h:18:3: error: too many initializers for ‘const bool [5][4]’
   18 |   };
      |   ^
...
ccache /usr/local/cuda-12.0/bin/nvcc  -O3  -lineinfo -I. -I../../src -I../../../../../tools -I/usr/local/cuda-12.0/include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /usr/lib64/ccache/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -Xcompiler -fPIC -c gCPPProcess.cu -o gCPPProcess.o
coloramps.h(13): error: too many initializer values

NB this is still using the old upstream mg5amcnlo before Olivier's split_nonidentical_grouping (see madgraph5#619)
I have checked in advance that, after merging it, there will be two separate P1 subdirectories in .mad as well (as in .sa)
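
As an illustration of the build error quoted above, here is a minimal C++ sketch. It is not the generated coloramps.h: it only assumes, from the error message, a table declared as `const bool [5][4]` (read here as 5 rows by 4 columns). The names `kTable`, `rowCount` and `trueCount` and all flag values are hypothetical. A table of this type accepts at most 5 rows of initializers, so a generator that emits the rows of two subprocesses into it would plausibly trigger "too many initializers".

```cpp
#include <cstddef>

// Assumed dimensions, taken from the error message "const bool [5][4]".
constexpr std::size_t kRows = 5;
constexpr std::size_t kCols = 4;

// Hypothetical table with exactly kRows rows: this compiles.
// The flag values are illustrative only, not taken from any real process.
constexpr bool kTable[kRows][kCols] = {
  { true, true, false, false },
  { false, false, true, true },
  { true, false, true, false },
  { false, true, false, true },
  { true, true, true, true },
  // Adding a sixth row here is rejected by the compiler with the same
  // class of error as in the log above ("too many initializers").
};

// Number of rows of a 2D bool table, deduced from its type at compile time.
template<std::size_t NR, std::size_t NC>
constexpr std::size_t rowCount( const bool ( & )[NR][NC] )
{
  return NR;
}

// Number of flags set to true in a 2D bool table.
template<std::size_t NR, std::size_t NC>
constexpr std::size_t trueCount( const bool ( &table )[NR][NC] )
{
  std::size_t n = 0;
  for( std::size_t i = 0; i < NR; ++i )
    for( std::size_t j = 0; j < NC; ++j )
      if( table[i][j] ) ++n;
  return n;
}

// Compile-time check that the declared shape matches the assumed one.
static_assert( rowCount( kTable ) == kRows, "unexpected number of rows" );
```

The `static_assert` documents the expected shape next to the data, but the compiler enforces the upper bound on initializers regardless.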
@valassi valassi self-assigned this Apr 5, 2023
@valassi valassi marked this pull request as ready for review April 5, 2023 05:40
@valassi
Member Author

valassi commented Apr 5, 2023

I have completed several tests of the effect of this MR on a process with DSIG2 (gq to ttq, in #626, which will include this as a merge halfway through). This can now be merged.

Self-merging.

cc: @oliviermattelaer @roiser @zeniheisser

@valassi valassi merged commit 5f646a2 into madgraph5:master Apr 5, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
… (prepare to regenerate with Olivier's madgraph5#619 patch that splits it in two)

git mv gq_ttq.mad/SubProcesses/P1_gq_ttxq gq_ttq.mad/SubProcesses/P1_gu_ttxu
cp -dpr gq_ttq.mad/SubProcesses/P1_gu_ttxu gq_ttq.mad/SubProcesses/P1_gux_ttxux/
git add gq_ttq.mad/SubProcesses/P1_gux_ttxux/
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
This merges the contents of MR madgraph5#619 "split_nonidentical_grouping"
I will now regenerate gq_ttq with the new upstream code by Olivier, which removes DSIG2 and fixes the issues
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
…tical_grouping madgraph5#619

Two (different) P1 subdirectories are now generated, each with only one DSIG1 (i.e. with MAXSPROC=1).
Previously a single P1 was generated, with DSIG1 and DSIG2 (i.e. with MAXSPROC=2).

Note also that a single LOGICAL ICOLAMP(4,5,2) is now replaced by two separate LOGICAL ICOLAMP(4,5,1)

Note however that nprocesses is always 1 in the cudacpp code in all P1 before and after.

The code builds and check.exe runs successfully in the two P1 directories (previously the build was failing)
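
The change of shape described above can be pictured with a minimal C++ sketch. It is not the code produced by the generator. Because Fortran arrays are column-major, a Fortran `ICOLAMP(4,5,2)` has the same memory layout as a C++ array indexed `[2][5][4]`, so splitting by subprocess amounts to taking one `[5][4]` slice per subprocess. The names `ColorAmpTable`, `tableForSubprocess` and `countEnabled` are hypothetical.

```cpp
#include <array>
#include <cstddef>

// Dimensions taken from ICOLAMP(4,5,*) in the commit message above.
constexpr std::size_t kNColor = 4;
constexpr std::size_t kNDiag = 5;

// One per-subprocess table, i.e. the analogue of ICOLAMP(4,5,1).
using ColorAmpTable = std::array<std::array<bool, kNColor>, kNDiag>;

// Extract the table of subprocess iproc from a grouped table holding
// NProc subprocesses, i.e. the analogue of ICOLAMP(4,5,NProc).
template<std::size_t NProc>
constexpr ColorAmpTable tableForSubprocess( const std::array<ColorAmpTable, NProc>& grouped,
                                            std::size_t iproc )
{
  return grouped[iproc];
}

// Count the flags that are set in one per-subprocess table.
constexpr std::size_t countEnabled( const ColorAmpTable& table )
{
  std::size_t n = 0;
  for( const auto& row : table )
    for( bool flag : row )
      if( flag ) ++n;
  return n;
}
```

With two P1 directories, each one only ever sees its own `[5][4]` slice, which is the shape the cudacpp code with nprocesses=1 already expects.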
valassi added a commit to valassi/madgraph4gpu that referenced this pull request Apr 5, 2023
…1_gux_ttxux to P1_gu_ttxu

The gqttq tests fail anyway and will need to be fixed (madgraph5#630).
However, this completes the addition of gq_ttq as a new process to the repo.
In particular it includes proof that Olivier's "split_nonidentical_grouping" madgraph5#619 fixes the gqttq builds.
It also includes a lot of cleanup for "nprocesses" (madgraph5#272 and madgraph5#343)

Revert "[gqttq] retry the tmad gqttq test with the P1_gu_ttxu directory - the test continues to fail (madgraph5#630)"
This reverts commit 2dea1f7.

Revert "[gqttq] temporarely use P1_gu_ttxu instead of P1_gux_ttxux for gqttq tmad tests"
This reverts commit ea23a9a.
@valassi valassi changed the title WIP: Upgrade to Olivier's latest gpucpp branch upstream ("split_nonidentical_grouping") Upgrade to Olivier's latest gpucpp branch upstream ("split_nonidentical_grouping") Apr 6, 2023
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023
valassi added a commit to mg5amcnlo/mg5amcnlo_cudacpp that referenced this pull request Aug 16, 2023