Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

master_june24: wrong memory access for channelIds in C++ sigmakin (missing ieventAccessRecordConst) #911

Closed
valassi opened this issue Jul 15, 2024 · 4 comments · Fixed by #882
Assignees
Milestone

Comments

@valassi
Copy link
Member

valassi commented Jul 15, 2024

Another issue introduced in #830 and being reviewed in #882.

This is a bug in sigmaKin. It is the same as #899 in calculate_wavefunctions. It needs a separate fix at a later time, Ikeep this separate.
https://github.com/valassi/madgraph4gpu/blob/8e312bc02d9d072615fcb1052b5db54754498517/epochX/cudacpp/gg_tt.mad/SubProcesses/P1_gg_ttx/CPPProcess.cc#L1035


#ifdef MGONGPU_SUPPORTS_MULTICHANNEL
      calculate_wavefunctions( ihel, allmomenta, allcouplings, allMEs, channelIds, allNumerators, allDenominators, jamp2_sv );
#else
      calculate_wavefunctions( ihel, allmomenta, allcouplings, allMEs, jamp2_sv );
#endif
      MEs_ighel[ighel] = allMEs[ievt];
    }
    // Event-by-event random choice of helicity #403
    //printf( "sigmaKin: ievt=%4d rndhel=%f\n", ievt, allrndhel[ievt] );
    for( int ighel = 0; ighel < cNGoodHel; ighel++ )
    {
      if( allrndhel[ievt] < ( MEs_ighel[ighel] / MEs_ighel[cNGoodHel - 1] ) )
      {
        const int ihelF = cGoodHel[ighel] + 1; // NB Fortran [1,ncomb], cudacpp [0,ncomb-1]
        allselhel[ievt] = ihelF;
        //printf( "sigmaKin: ievt=%4d ihel=%4d\n", ievt, ihelF );
        break;
      }
    }
#ifdef MGONGPU_SUPPORTS_MULTICHANNEL
    // Event-by-event random choice of color #402
    if( channelIds[0] != 0 ) // no event-by-event choice of color if channelId == 0 (fix FPE #783)
    {
      const unsigned int iconfigC = mgOnGpu::channelId_to_iconfigC[CID_ACCESS::kernelAccessConst( channelIds )]; // coloramps.h uses a channel ordering not the diagram id
@oliviermattelaer
Copy link
Member

Ok I guess this is the same issue that @roiser assumed nb_warp=1 for SIMD case (which is in principle fine, but better not rellying on that, especially for the "m" mode)

Thanks for pointing that issue,

Olivier

valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…n channelIds and allChannelIds also in sigmaKin (fix madgraph5#899; wip on madgraph5#911)
valassi added a commit to valassi/madgraph4gpu that referenced this issue Jul 17, 2024
…the fact madgraph5#898 that channelids are the same inside each warp

This also fixes all pending CID_ACCESS::ieventAccessRecordConst calls madgraph5#899 and madgraph5#911

Note: CPPProcess.cc is now again very close to upstream/master (most of master_june24 changes from madgraph5#830 have been undone)
@oliviermattelaer oliviermattelaer added this to the warp milestone Jul 18, 2024
@valassi
Copy link
Member Author

valassi commented Jul 19, 2024

Thanks for pointing that issue,

Olivier

Thanks to you Olivier.

Ok I guess this is the same issue that @roiser assumed nb_warp=1 for SIMD case (which is in principle fine, but better not rellying on that, especially for the "m" mode)

Well I am not sure that it can be described like this, I think it is much more complex.

Technically, the code from 830 was totally missing a distinction between 'allChannels' and 'channels' and the call to ieventAccessRecordConst (which could have been copied from the other accessors). My first implementation, of which I was not sure myself (ievt00 or something else?), was essentially here ea2146b

Also, I think that assuming nb_warp=1 is a very bad idea. And at the very least, any assumption in the code must be accompanied by a sanity check of that assumption. Otherwise the code just becomes a mess, and may "randomly" (i.e. by sheer luck) seem to pass some tests.

Specifically, you are completely right that the mixed case is more complex. I added several tests in #924 #925 and realised that my first implementation with ievt00 was incomplete. In the mixed case the code (at least, in the version I implemented which is as-is that from upstream/master, there are other choices but unnecessarily complex) assumes that TWO SIMD vectors have the same channelId, not just one. There were two extra things needed, done in #924, essentially here valassi@9e77b2b

  • added a second ieventAccessRecordConst accessor, at ievt00+neppV and not at ievt00
  • added a SANITY CHECK (see my comment above about 'assumptions') that two vectors of channels use the same channelid
  • and I also used my newly developed tests to see that the fix transforms a test FAILED (i.e. a silent physics corruption in user usage) into an assertion (i.e. a warning that the input parameters must be changed by users)

This can be closed. (Note also that this is related to but different from #899, where another ieventAccessRecordConst call was missing, but this time in calculate_wavefunctions)

@valassi valassi closed this as completed Jul 19, 2024
@oliviermattelaer
Copy link
Member

assumes that TWO SIMD vectors have the same channelId, not just one.

Would be better to not have such assumptions (I do not see a real need for it so I'm curious of why you are doing it)
(sorry to react here, not sure where to write it)

@valassi
Copy link
Member Author

valassi commented Jul 19, 2024

Olivier, it takes WORK to do this. Lets move this after master_june24 can we? I opened #926.

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants