Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wrong type argument to unary minus (FFV functions should not use "-COUPs" as argument) #628

Closed
valassi opened this issue Apr 3, 2023 · 17 comments · Fixed by #764
Closed
Assignees

Comments

@valassi
Copy link
Member

valassi commented Apr 3, 2023

wrong type argument to unary minus (FFV functions should not use "-COUPs" as argument)

I have seen this issue in two places

For SUSY for instance
586195d

The CUDA_HOME=none HRDCOD=0 build also fails with
ccache g++  -O3  -std=c++17 -I. -I../../src -I../../../../../tools  -Wall -Wshadow -Wextra -ffast-math  -fopenmp -march=skylake-avx512 -mprefer-vector-width=256  -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -DMGONGPU_HARDCODE_PARAM -DMGONGPU_HAS_NO_CURAND  -fPIC -c CPPProcess.cc -o CPPProcess.o
CPPProcess.cc: In function ‘void mg5amcCpu::calculate_wavefunctions(int, const fptype*, const fptype*, mgOnGpu::fptype*, fptype_sv*, int)’:
CPPProcess.cc:241:81: error: wrong type argument to unary minus
  241 |       FFV1_0<W_ACCESS, A_ACCESS, CD_ACCESS>( w_fp[3], w_fp[2], w_fp[4], -COUPs[1], &amp_fp[0] );
      |                                                                          ~~~~~~~^
CPPProcess.cc:251:62: error: wrong type argument to unary minus
  251 |       FFV1_1<W_ACCESS, CD_ACCESS>( w_fp[2], w_fp[0], -COUPs[1], cIPD[0], cIPD[1], w_fp[4] );
      |                                                       ~~~~~~~^
CPPProcess.cc:254:81: error: wrong type argument to unary minus
  254 |       FFV1_0<W_ACCESS, A_ACCESS, CD_ACCESS>( w_fp[3], w_fp[4], w_fp[1], -COUPs[1], &amp_fp[0] );
      |                                                                          ~~~~~~~^
CPPProcess.cc:263:62: error: wrong type argument to unary minus
  263 |       FFV1_2<W_ACCESS, CD_ACCESS>( w_fp[3], w_fp[0], -COUPs[1], cIPD[0], cIPD[1], w_fp[4] );
      |                                                       ~~~~~~~^
CPPProcess.cc:266:81: error: wrong type argument to unary minus
  266 |       FFV1_0<W_ACCESS, A_ACCESS, CD_ACCESS>( w_fp[4], w_fp[2], w_fp[1], -COUPs[1], &amp_fp[0] );
      |                                                                          ~~~~~~~^

This is potentially tough to solve. I see two ways

  • either we keep the APIs the same, but then the value inside COUPs[1] must be replaced by their opposite... this whoever only works if ALL FFV functions use the same sign for the coupling
  • alternatively (ugly, cumbersome - and worse performance), must pass an extra parameter to the FFV functions to say that the couplings must be taken with the opposite sign

Copying @zeniheisser as I understand this might be a showstopper for SUSY studies. Sopying also @oliviermattelaer

@zeniheisser
Copy link
Contributor

Took a quick look at other processes where I thought this might occur, and found a minimal example in the electroweak sector of the SM:

The compilation fails with the following output message:
    makefile:77: CUDA_HOME was not set: using "/usr/local/cuda-11.8/"
    OMPFLAGS=-fopenmp
    AVX=512y
    FPTYPE=d
    HELINL=0
    HRDCOD=0
    RNDGEN=hasCurand
    Building in BUILDDIR=. for tag=512y_d_inl0_hrd0_hasCurand (USEBUILDDIR is not set)
    make -C ../../src  -f cudacpp_src.mk
    make[1]: Entering directory '/data/zwetters/modded5amc/PROC_PLUGIN_sm_0/src'
    AVX=512y
    mkdir -p ../lib
    g++  -O3  -std=c++17 -I.  -fPIC -Wall -Wshadow -Wextra -ffast-math  -fopenmp -march=skylake-avx512 -mprefer-vector-width=256  -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -c Parameters_sm.cc -o Parameters_sm.o
    g++  -O3  -std=c++17 -I.  -fPIC -Wall -Wshadow -Wextra -ffast-math  -fopenmp -march=skylake-avx512 -mprefer-vector-width=256  -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -c read_slha.cc -o read_slha.o
    g++ -shared -o../lib/libmg5amc_common.so ./Parameters_sm.o ./read_slha.o
    make[1]: Leaving directory '/data/zwetters/modded5amc/PROC_PLUGIN_sm_0/src'
    /usr/local/cuda-11.8//bin/nvcc  -O3  -lineinfo -I. -I../../src -I../../../../../tools -I/usr/local/cuda-11.8//include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /opt/rh/gcc-toolset-11/root/usr/bin/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -Xcompiler -fPIC -c gcheck_sa.cu -o gcheck_sa.o
    /usr/local/cuda-11.8//bin/nvcc  -O3  -lineinfo -I. -I../../src -I../../../../../tools -I/usr/local/cuda-11.8//include/ -DUSE_NVTX -gencode arch=compute_70,code=compute_70 -gencode arch=compute_70,code=sm_70 -use_fast_math -std=c++17  -ccbin /opt/rh/gcc-toolset-11/root/usr/bin/g++ -DMGONGPU_FPTYPE_DOUBLE -DMGONGPU_FPTYPE2_DOUBLE -Xcompiler -fPIC -c gCPPProcess.cu -o gCPPProcess.o
    gCPPProcess.cu(269): error: expression must have arithmetic or unscoped enum type
    
    1 error detected in the compilation of "gCPPProcess.cu".
    make: *** [makefile:422: gCPPProcess.o] Error 1

Same error appears when trying to build from the P1_ directory.

@valassi
Copy link
Member Author

valassi commented Apr 4, 2023

Took a quick look at other processes where I thought this might occur, and found a minimal example in the electroweak sector of the SM:

Thanks Zenny! Which process is this? As mentioned above, I saw this in gq to ttllq SM (and in a SUSY process)

This is clearly high priority - but complex potentially. Need to analyse the Fortran/Python code and discuss with @oliviermattelaer

@zeniheisser
Copy link
Contributor

Thanks Zenny! Which process is this? As mentioned above, I saw this in gq to ttllq SM (and in a SUSY process)

Ah, forgot to mention that! It's c s~ > w+ z.

Just to make sure, I also checked that the charge-conjugated process c~ s > w- z has the same issue, and it does.

@roiser
Copy link
Member

roiser commented May 19, 2023

Note I stumbled also over this in the "CMS process" from Sapta (Drell Yan). I worked around it with turning the signs inside the array.

@oliviermattelaer
Copy link
Member

Yes even in pure fortran, the compiler complains about unary operation, but this is just a warning.

So should I write this then?
FFV1_2<W_ACCESS, CD_ACCESS>( w_fp[3], w_fp[0], -1d0*COUPs[1], cIPD[0], cIPD[1], w_fp[4] );

Cheers,

Olivier

@valassi
Copy link
Member Author

valassi commented May 19, 2023

Hi Olivier, thanks for looking at this.

No it is more complex than that (and not very obvious from the code manybe, sorry). The point is that "COUPs[1]" is a pointer, not a value. The value would be in "COUPs[1][i]", where i is an event id if I remember correctly. It must be a pointer because we have the running G and so all couplings are event-dependent, they are not constants.

What would be needed is to make sure that the minus sign goes INSIDE the FFV implementation, rather than in the function calling the FFV. But I am not sure this is possible.

Various alternatives I considered (just brainstorming) could be

  • we add one input value to the FFV function signature, to mean whether internally COUP[][] or -COUP[][] should be used... but this is ugly
  • otherwise I even thought about having two arrays for couplings and -couplings, but this is even uglier

For a given coupling in a given FFV function, would all callers of that function always use one of COUP or -COUP, or are there multiple callers and some may have COUP and some -COUP?

Thanks
Andrea

@roiser
Copy link
Member

roiser commented May 19, 2023

There may be a third option, the problem why negating the coupling constant doesn't work right now is because its an array which represents a complex number. IIRC each function parameter represents one complex number (an array of 2). The proposal would be to create the complex number outside the function in a global and then let the complex unary minus operator do the job if needed when the global is passed as a function parameter.

@valassi
Copy link
Member Author

valassi commented May 22, 2023

There may be a third option, the problem why negating the coupling constant doesn't work right now is because its an array which represents a complex number.

Hi, thanks for the feedback. This is useful but not correct. The problem, in general, is not that these arrays represent complex numbers. The problem is that all FFV functions now take ONLY arguments which are arrays. This is by design, to make sure that all FFVs have uniform signatures and can be handled using similar decoding mechanisms.

Now, in one example I checked (the gq_ttllq), it turns out that the "-COUPs[3]" refers to an alphas-independent coupling. So you are kind of right that in this case it is a single complex number, because the same value is used for all events. Note however that the design of this thing is presently meant to use a pointer even if we had a single real number, not even a complex. So, again, the problem is that FFVs take pointers by design, not values. And you cannot do a unary minus of a pointer.

Let me rephrase: in designing this I was assuming (assumption which was correct for all of our initial processes) that the FFVs would be called with a coupling pointer, and that then it would be the FFV itself that would do "minus the coupling" if needed. This is now shown to be a wrong assumption.

Now there were three options, and the first one is wrong, but there are still two possibilities

  1. I was assuming FFVs would be called with coupling pointer for all couplings, dependent and independent. This is wrong.
  2. Maybe, maybe, FFVs are sometimes called with "-coupling" only for alphas-independent couplings? This is what I see in gq_ttllq (I have not even checked susy). If this was the case, then maybe the best option is to treat independent couplings differently from dependent couplings, and use values for independent couplings, pointers for dependent couplings. This would be quite ugly, but would work. But is this assumption correct?
  3. If, conversely, also alphas-dependent couplings are sometimes called as "-COUPs" when invoking FFVs, then we need a more general solution. Investing time for some quick patch along option 2 would just be a waste of time.

@oliviermattelaer can you try to clarify please?

Concerning a "more general" solution, I saw that gq_ttllq actually has some FFV calls with "COUPs[3]" an donly some other FFV calls (always the same ones, eg FFV2_5_0) with "-COUPs[3]". So one option I was considering, namely having only "-coupling" in the COUPs[3] pointer, would not work, as we need both positive and negative couplings.

Also about a more genearl solution, I checked that presently the template argiuments for FFVs include only one CI_ACCESS accessor. The same is used for instance for COUPs[3] and COUPs[5] in FFV2_5_0<W_ACCESS, A_ACCESS, CI_ACCESS>( w_fp[3], w_fp[14], w_fp[11], -COUPs[3], COUPs[5], &amp_fp[0] );. This is a problem because this call mixes one negative "-COUPs[3] and one positive 'COUPs[5]'. So one cannot use the template argument CI_ACCESS to give one - and one +, as it is the same CI_ACCESS for both. It really seems that we need one extra argument to the FFV calls, either in the template list or in the function arguments.

Again, the easiest would be if we were sure that FFV_2_5_0 calls always come with "-COUPs[3]" or similar (the first coupling) with a minus sign, then the minus sign could be moved inside the implementation. But this is very unlikely.

I think we most likely need an extra argument. I can "easily" do that in the cudacpp code generation, but its tedious and error prone, I would need around one week I think. Essentially:

FFV2_5_0<W_ACCESS, A_ACCESS, CI_ACCESS>( w_fp[3], w_fp[14], w_fp[11], -COUPs[3], COUPs[5], &amp_fp[0] );

should become either this, with one extra template argument, for instance differemt accessors if the coupling must get an extra minus sign or not,

FFV2_5_0<W_ACCESS, A_ACCESS, CI_ACCESSm, CI_ACCESSp>( w_fp[3], w_fp[14], w_fp[11], COUPs[3], COUPs[5], &amp_fp[0] );

or something like

FFV2_5_0<W_ACCESS, A_ACCESS, CI_ACCESS>( w_fp[3], w_fp[14], w_fp[11], -1, COUPs[3], +1, COUPs[5], &amp_fp[0] );

Any other thoughts? Thanks
Andrea

@oliviermattelaer
Copy link
Member

oliviermattelaer commented May 22, 2023 via email

@valassi
Copy link
Member Author

valassi commented May 22, 2023

Hi Olivier, thanks a lot. This clarifies.

The presence of the "-" is technically an optimization trick. In the past, we never had such minus sign but had two coupling for example: GC_11 = aS GC_12 = -aS and in that way FFV was called with either GC_11 or GC_12. In modern computation GC_12 does not exist anymore and we pass "-GC_11" instead. This can happen both for coupling dependent and/or independent of alpha_s.

Ok then we definitely cannot assume that this only happens for independent couplings. So my option 2 would not work, we need some moregeneral solution as in my option 3.

That's one solution, the other one is to "undo" the optimization and to restore those "GC_12" within the model... (This should be simple but will have a cost in term of memory)

Yes I agree. I think this is probably the best solution. It is essentially equivalent to one of the ideas I threw on the table (keep both +coupling and -coupling in two separate arrays), but the advantage is that you can do that with the physics in mind and maybe this still has some optimizations?

I mean, if you know that in a process you need only GC11=+aS and you do not need GC12=-aS, I guess you would only use GC11 and would not introduce GC12?

Is there a quick way I can 'disable' the optimization at code generation time?

Thanks!
Andrea

@oliviermattelaer
Copy link
Member

oliviermattelaer commented May 22, 2023 via email

@valassi
Copy link
Member Author

valassi commented May 22, 2023

Here is a dirty patch (that will impact everything... so we clearly want something cleaner but as a quick work around it should be ok)

Thanks Olivier, I will give it a try. Just speculating, in order to make a "cleaner" patch can we pass somehow an option to the code in import_ufo? I see there is this for instance

if 'allow_qed_cms' in options 

Are these options something you specify when you import a model? It would force users to add something to the import model, but it would already be better I think.

Even better, it would be nice to do this only inside the cudacpp (and not in the fortran only part), but I imagine this is really difficult. I guess we import the model once and that's it, so for instance when we create both fortran and cudacpp MEs then the sam emodel must be used for both...?

@valassi
Copy link
Member Author

valassi commented Jun 2, 2023

Hi @oliviermattelaer I am keeping this for later, but while working on something else I noticed this in ggtt logs for instance

�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_100', 1), ('GC_104', 1), ('GC_108', 1), ('GC_40', 1), ('GC_41', 1), ('GC_45', 1), ('GC_49', 1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_21', 1), ('GC_27', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_15', 1), ('GC_30', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_38', 1), ('GC_39', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_3', 1), ('GC_4', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_50', 1), ('GC_51', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_54', 1), ('GC_56', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_66', 1), ('GC_67', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_70', 1), ('GC_73', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_74', 1), ('GC_75', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_77', 1), ('GC_78', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_76', 1), ('GC_79', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_7', 1), ('GC_9', -1)  �[0m
�[1;32mDEBUG:  Fuse the Following coupling (they have the same value): ('GC_96', 1), ('GC_97', -1)  �[0m

Is this 'fuse the coupling' the feature you were describing above, or is it something else? (By the way is this 'fusing' something that only applies to independent couplings or also to alphas-dependent couplings?)
Thanks

@oliviermattelaer
Copy link
Member

Yes it is that feature.
And yes this can apply to all type of coupling

@valassi
Copy link
Member Author

valassi commented Jun 5, 2023

Great, thanks a lot

valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
…las" and "cms" process examples added by Stefan but move them to the end of the list

Note: the net effect of this MR is to add atlas and cms to the generateAndCompare.sh script, without fixing their builds
- the cms process is affected by the "unary minus" issue (madgraph5#628)
- the atlas process must be built with HRDCOD=1 because it has a modified SM with zero b mass (similar to madgraph5#616)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jun 12, 2023
…las" and "cms" process examples added by Stefan but move them to the end of the list

Note: the net effect of this MR is to add atlas and cms to the generateAndCompare.sh script, without fixing their builds
- the cms process is affected by the "unary minus" issue (madgraph5#628)
- the atlas process must be built with HRDCOD=1 because it has a modified SM with zero b mass (similar to madgraph5#616)
@valassi
Copy link
Member Author

valassi commented Aug 11, 2023

Discussion today:

  1. For the moment, a quick way to recover functionality (while losing performance) is to disable the coupling fusing optimization. @oliviermattelaer points out that this must be knwon at the time of importing the model. So it must either as an option to pass to the 'import' command (possible already??) or as a set command to execute before import (to do Olivier)
  2. Later on, we can think if we can port the couling fusing optimization also to GPU and SIMD... but nt for the release.

valassi added a commit to valassi/madgraph4gpu that referenced this issue Oct 28, 2023
…Olivier's patch for unary minus madgraph5#628

All FFV functions now have an extra argument Ccoeff in their signature.

I checked with a quick test that tput and tmad tests look ok for gg_tt (logs not saved)
  ./tput/teeThroughputX.sh -ggtt -makeclean -makej
  ./tmad/teeMadX.sh -ggtt +10x
There appears to be a ~10% improvement in performance in ggtt tests? To be tested on the other processes...

NB: I also quickly tried to see if the unary minus isse itself has disappeared.
This was failing with 'unary minus' issues before Olivier' patch, it now fails elsewhere after the patch.
It seems that the 'unary minus' issue madgraph5#628 itself has been fixed.
  ./CODEGEN/generateAndCompare.sh susy_gg_tt
  cd susy_gg_tt.sa/SubProcesses/P1_Sigma_MSSM_SLHA2_gg_ttx/
  make -j HRDCOD=1 |& grep unary
valassi added a commit to valassi/madgraph4gpu that referenced this issue Oct 28, 2023
@valassi valassi linked a pull request Nov 24, 2023 that will close this issue
@valassi
Copy link
Member Author

valassi commented Nov 24, 2023

This was fixed in Olivier's patch a062c0f as part of PR #764. It can be closed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants