-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
master_june24: missing sanity checks that all channelids are the same in a "warp" (or at least inside each SIMD vector) #898
Comments
PS On second thought, what I will do is add a sanity check that channelids are all the same inside a SIMD vector. For GPUs, this is not strictly necessary: the implementation can be made to be functional anyway, but it will be slower. Example. Take hardware SIMD vectors of 4 elements and hardware GPU warps of 32 threads. If every channelid is different for those elements, then:
|
Small complication/question: for mixed fptype the check must ensure that TWO double vectors have the same channelId? Or probably not, I must check. |
As discussed in #892 (comment), I am moving to an implementation where calculate_wavefunction and sigmaKin have two differenbt interfaces, in line with what they do
With respect to FPTYPE=m, the point to notice is the following, eg take avx2 with 8 floats and 4 doubles
|
NB: the complication here is that uint_v has 4 ints, while there is the need to check 8 ints, so probably this needs to be done with two separate vectors. I will check... |
Concerning:
I do not see any problem if the two vector will have different channelID in the "m" type. On the contrary it is better to have different one. The point is that the color computation (only part done in float) does not depend of the channelId. Obviously in that case, you can not have "nb_warp" set to 1 but this is the only limitation. |
…use channels 1,2,3,1,2,3... for different events (previously it was 1 for all events) NB1: the cuda test now fails, the reference file must be recreated NB2: I expect the SIMD tests to fail using the CUDA reference, due to the different bugs in the current channelId implementation NB3: eventually madgraph5#898 the implementation should enforce that all events in a warp use the same channelid
…channel test madgraph5#896 to use channels 1,2,3,1,2,3... for different events (previously it was 1 for all events) NB1: the cuda test now fails, the reference file must be recreated NB2: I expect the SIMD tests to fail using the CUDA reference, due to the different bugs in the current channelId implementation NB3: eventually madgraph5#898 the implementation should enforce that all events in a warp use the same channelid
…ph5#894 is SIMD handling of numertaors and denominators Now the tests madgraph5#898 succeed when using channels123123123... for bck in none sse4 avx2 512y 512z cuda; do echo BACKEND=$bck; ./build.${bck}_d_inl0_hrd0/runTest_*.exe; done | egrep '(BACKEND| ME |r.ME> BACKEND=none BACKEND=sse4 BACKEND=avx2 BACKEND=512y BACKEND=512z BACKEND=cuda Note: the debug printout gives DEBUG: MEKB processed 512 events across 3 channels { 1 : 172, 2 : 170, 3 : 170 }
…cks madgraph5#898 that channelids are the same inside each warp As expected, the tests madgraph5#898 using 123123123 etc fail the assertions for bck in none sse4 avx2 512y 512z cuda; do echo BACKEND=$bck; ./build.${bck}_d_inl0_hrd0/runTest_*.exe; done |& egrep '(BACKEND|Assert)' BACKEND=none BACKEND=sse4 runTest_cpp.exe: CPPProcess.cc:335: void mg5amcCpu::calculate_wavefunctions(int, const fptype*, const fptype*, mgOnGpu::fptype*, const unsigned int*, mgOnGpu::fptype*, mgOnGpu::fptype*, mg5amcCpu::fptype_sv*, int): Assertion `channelId == channelIds_sv[i]' failed. BACKEND=avx2 runTest_cpp.exe: CPPProcess.cc:335: void mg5amcCpu::calculate_wavefunctions(int, const fptype*, const fptype*, mgOnGpu::fptype*, const unsigned int*, mgOnGpu::fptype*, mgOnGpu::fptype*, mg5amcCpu::fptype_sv*, int): Assertion `channelId == channelIds_sv[i]' failed. BACKEND=512y runTest_cpp.exe: CPPProcess.cc:335: void mg5amcCpu::calculate_wavefunctions(int, const fptype*, const fptype*, mgOnGpu::fptype*, const unsigned int*, mgOnGpu::fptype*, mgOnGpu::fptype*, mg5amcCpu::fptype_sv*, int): Assertion `channelId == channelIds_sv[i]' failed. BACKEND=512z runTest_cpp.exe: CPPProcess.cc:335: void mg5amcCpu::calculate_wavefunctions(int, const fptype*, const fptype*, mgOnGpu::fptype*, const unsigned int*, mgOnGpu::fptype*, mgOnGpu::fptype*, mg5amcCpu::fptype_sv*, int): Assertion `channelId == channelIds_sv[i]' failed. BACKEND=cuda
…the fact madgraph5#898 that channelids are the same inside each warp This also fixes all pending CID_ACCESS::ieventAccessRecordConst calls madgraph5#899 and madgraph5#911 Note: CPPProcess.cc is now again very close to upstream/master (most of master_june24 changes from madgraph5#830 have been undone)
…h5#898 for all processes (using the CODEGEN/recreateRefs.sh script) Note: some processes fail (some of the fixes in upstream/master may be needed...) [ RUN ] SIGMA_SM_GG_TTXGG_GPU_MULTICHANNEL.compareMomAndME DEBUG: MatrixElementKernelDevice::computeMatrixElements 0x1420650 T 256 runTest_cuda.exe: GpuRuntime.h:26: void assertGpu(cudaError_t, const char*, int, bool): Assertion `code == gpuSuccess' failed. [ RUN ] SIGMA_SM_GG_TTXGGG_GPU_MULTICHANNEL.compareMomAndME DEBUG: MatrixElementKernelDevice::computeMatrixElements 0x3a1fe00 T 256 runTest_cuda.exe: GpuRuntime.h:26: void assertGpu(cudaError_t, const char*, int, bool): Assertion `code == gpuSuccess' failed. [----------] 1 test from SIGMA_SM_GG_TTXGG_GPU_MULTICHANNEL [ RUN ] SIGMA_SM_GG_TTXGG_GPU_MULTICHANNEL.compareMomAndME DEBUG: MatrixElementKernelDevice::computeMatrixElements 0x15e77f0 T 256 runTest_cuda.exe: GpuRuntime.h:26: void assertGpu(cudaError_t, const char*, int, bool): Assertion `code == gpuSuccess' failed.
The sanity check on th emain/first channelId SIMD vector is now implemented, essentially here (with many other things) ea2146b Note: in #924 I added the sanity check that TWO SIMD vectors have the same channelId in mixed mode, essentially here valassi@9e77b2b
Thanks Olivier. See the extensive discussion I added in #911 (comment) I was going to make many comments here about the mixed fptype, but in the end I made them here #926 (comment) The sanity checks are done (in PR #882), both for one vector and (in mixed mode) for two vectors. This can be closed. |
Another issue introduced in#830 and being reviewed in #882
This is related to (almost a sub-component of) the SIMD issue #894 but I prefer to address it separately.
As clarified during the meeting today and #894 (comment) (thanks @roiser), the cudacpp implementation should assume that all channelids are the same in a warp. To avoid all unexpected behaviour due to hidden assumption, a sanity check (an asser essentially is needed). I think we had discussed this in the past, maybe I will dig that out. Anyway, this is a prerequisite to simplifying the SIMD implementation.
I file this as a separate issue also because it may be appropriate to expose these assumptions in the fortan-cudacpp bridge, so that they have the same assumptions. (Example: warp_size can be modified in the runcard, there must be a sanity check on allowed values).
The text was updated successfully, but these errors were encountered: