This repository was archived by the owner on Dec 9, 2024. It is now read-only.

Add T5 DNN #291

Merged
VourMa merged 8 commits into SegmentLinking:master from jkguiang:t5-dnn
May 26, 2023

Conversation

@jkguiang (Contributor) commented May 23, 2023

Summary

Added a hard-coded T5 DNN to the T5-building algorithm in place of the chi-squared cuts (see these slides for more info). The DNN gives a significant reduction in the T5 fake rate (particularly in the barrel) with only a slight change to the efficiency; compare the efficiency plots on slide 8 (original master) and slide 9 (this PR) in the aforementioned slides. Finally, the current implementation of the DNN does significantly affect the overall runtime of the T5-building step (see the "Timing" section below).

Timing

This PR

Total Timing Summary
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Event      Short           Rate
   avg      1.6      2.0      1.4      3.3      6.1      1.4      1.3      1.0      2.1      20.3      17.3+/-  3.9      21.9   explicit_cache[s=1]
   avg      4.7      3.7      2.0      5.4      8.1      1.5      3.0      1.9      3.6      33.8      27.7+/-  6.4      19.5   explicit_cache[s=2]
   avg      5.5      5.4      3.3     10.5     12.8      1.7      7.0      3.3      6.3      55.8      48.5+/- 10.2      15.1   explicit_cache[s=4]
   avg     11.6      7.3      4.7     14.9     17.8      2.2     11.0      5.4      9.6      84.5      70.7+/- 15.8      15.1   explicit_cache[s=6]
   avg     17.1      8.7      5.9     20.0     24.0      2.8     15.1      7.5     12.9     114.1      94.2+/- 18.8      15.1   explicit_cache[s=8]

Original master

Total Timing Summary
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Event      Short           Rate
   avg      1.6      2.0      1.5      3.4      3.4      1.4      1.4      1.0      2.2      17.9      14.9+/-  3.5      19.1   explicit_cache[s=1]
   avg      3.4      3.1      1.8      4.9      4.6      1.5      2.7      1.7      3.5      27.2      22.3+/-  5.9      16.2   explicit_cache[s=2]
   avg      5.8      5.1      3.0      8.9      8.3      1.8      5.8      3.0      6.3      47.9      40.4+/-  7.8      12.9   explicit_cache[s=4]
   avg     10.8      6.3      3.9     12.2     11.6      2.1      8.7      4.7      9.1      69.5      56.5+/- 12.2      12.6   explicit_cache[s=6]
   avg     15.2      8.3      5.3     16.6     16.6      2.3     12.7      6.1     12.1      95.1      77.6+/- 11.1      12.3   explicit_cache[s=8]

@VourMa (Contributor) commented May 23, 2023

Quick comment by just scrolling through the changes: could we add the hardcoded arrays in SDL/Constants.cu? I feel like it is much more appropriate to include them there rather than injecting them in the kernel.

Comment thread SDL/Quintuplet.cu Outdated

// (0): Linear(in_features=38, out_features=32, bias=True) => x = x*W_T + b
float bias_0[32] = {
-0.453155100345611572265625000000,-3.414395570755004882812500000000, 0.461477220058441162109375000000,-0.154097318649291992187500000000,-0.470577239990234375000000000000, 2.228851079940795898437500000000,-0.163815096020698547363281250000,-2.296737670898437500000000000000,-0.098085217177867889404296875000, 2.589857339859008789062500000000, 2.105398178100585937500000000000,-0.525056004524230957031250000000, 1.790417790412902832031250000000,-2.162935256958007812500000000000, 0.012643844820559024810791015625,-1.595005750656127929687500000000, 0.222240477800369262695312500000,-3.097779273986816406250000000000,-0.748212277889251708984375000000, 1.027869462966918945312500000000,-0.504925668239593505859375000000, 0.079979300498962402343750000000,-0.680445253849029541015625000000, 1.331984400749206542968750000000, 2.027832508087158203125000000000,-0.063643321394920349121093750000,-0.304876327514648437500000000000,-0.020376743748784065246582031250, 2.847964286804199218750000000000,-1.884062528610229492187500000000, 2.168398141860961914062500000000,-2.673130989074707031250000000000
Contributor:

For the longer term, we should see how to load these from a file.
For this PR, I'd suggest formatting the values in float precision to have more compact code.

For the hackathon, I'm curious how much of this can be sped up by using half-precision and/or tensor operations
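The longer-term suggestion above — loading the weights from a file instead of compiling them in — could look roughly like this on the host side. The file name and the one-value-per-line layout are assumptions for illustration, not part of the PR:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical sketch of the "load from a file" idea: read a flat list
// of whitespace-separated float weights at startup, then copy them to
// the device. Layout and file name are assumptions.
std::vector<float> loadWeights(const std::string& path) {
    std::vector<float> w;
    std::ifstream in(path);
    for (float v; in >> v;) w.push_back(v);
    return w;
}
```

A runtime load like this would also let the network be retrained without recompiling the kernels.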

@GNiendorf (Member):

> Quick comment by just scrolling through the changes: could we add the hardcoded arrays in SDL/Constants.cu? I feel like it is much more appropriate to include them there rather than injecting them in the kernel.

Could we keep them in their own separate (new) file maybe? Would make my life a lot easier when rebasing, and would probably be cleaner <3

@jkguiang (Contributor, Author):

Just saw Gavin's comment, I will add another .cu file to contain the hard-coded weights and biases.
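Splitting the arrays into their own translation unit might look like the sketch below; the namespace and identifier names are assumptions modeled on the SDL/NeuralNetworkWeights.cu file this thread mentions, and the four values are just the first entries of the quoted bias_0 array shortened to float precision, as suggested above:

```cpp
// Sketch of the agreed refactor: define the weights in one translation
// unit and expose them through a declaration that kernels can include.
// All names here are hypothetical.
namespace t5dnn {
// Declaration (would live in a shared header):
extern const float kBias0[4];
// Definition (would live in the new NeuralNetworkWeights.cu):
const float kBias0[4] = {-0.4531551f, -3.4143956f, 0.4614772f, -0.1540973f};
}  // namespace t5dnn
```

Keeping the definitions in a single new file also achieves Gavin's goal: rebases touch one self-contained file instead of the kernel bodies.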

Comment thread SDL/NeuralNetworkWeights.cu Outdated
Comment thread SDL/Quintuplet.cu Outdated
Comment thread SDL/Quintuplet.cu Outdated
Comment thread SDL/Quintuplet.cu Outdated
mdsInGPU.anchorZ[fifthMDIndex], // outer (hit 5) t3_4_z
sqrtf(x5*x5 + y5*y5), // outer (hit 5) t3_4_r
float(modulesInGPU.layers[lowerModuleIndex5] + 6*is_endcap5), // outer (hit 5) t3_4_layer
log10((innerRadius+outerRadius)*3.8f*1.602f/(2*100*5.39f)), // t5_pt
Contributor:

What are those numbers? If they are constants, could they be added in the proper file and included here? If they are some temporary, magic numbers, could they be added in this file with a descriptive name (and possibly a descriptive comment), so that one can follow the calculations?
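For context, one plausible reading of the quoted t5_pt expression — my interpretation, not confirmed in the thread — is pT = k·B·R: 3.8 is the CMS solenoid field in tesla, (innerRadius+outerRadius)/(2·100) is the average circle radius converted from cm to m, and 1.602/5.39 ≈ 0.297 approximates the usual ~0.3 GeV/(T·m) momentum conversion factor. A sketch with those constants named:

```cpp
#include <cmath>

// Hedged reading of the quoted t5_pt expression (an interpretation, not
// confirmed by the PR): log10 of pT = k * B * R, with R the average of
// the inner/outer radii converted from cm to m.
float t5LogPt(float innerRadiusCm, float outerRadiusCm) {
    const float kBField = 3.8f;            // magnetic field [T]
    const float kConv = 1.602f / 5.39f;    // ~0.3 GeV/(T*m) conversion
    float avgRadiusM = (innerRadiusCm + outerRadiusCm) / (2.0f * 100.0f);
    return std::log10(kConv * kBField * avgRadiusM);  // log10(pT [GeV])
}
```

This is algebraically identical to the quoted one-liner, so naming the constants would answer the reviewer's question without changing behavior.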

Comment thread SDL/Quintuplet.cu
Comment thread code/core/write_sdl_ntuple.cc Outdated
Comment thread SDL/Quintuplet.cu
@VourMa (Contributor) left a review comment:

Thanks for the updates! PR looks good to me, so I am approving.

I saw that the validation is included in the slides, but may I ask you to add a summary comment about the performance in the PR description, so that one doesn't have to go through the full set of slides to understand the changes?

Same goes for the timing, which could also be improved in the slides by removing the "explicit"-only results (which we don't usually look at), mentioning which timing corresponds to this PR and which to master, and explaining which numbers are used in the ΔT5 column. Thanks!

@VourMa VourMa merged commit 3c0ba9a into SegmentLinking:master May 26, 2023
@GNiendorf (Member) commented May 26, 2023

Is the SDL::passChiSquaredConstraint function no longer used now that this PR is merged in? If so, should we remove it from the code? Along those lines, is there any other code that can be removed (e.g. variables that were only used by the previous cuts and no longer need to be calculated)?

@slava77 (Contributor) commented May 26, 2023

> Is the SDL::passChiSquaredConstraint function no longer used now that this PR is merged in? If so, should we remove it from the code? Along those lines, is there any other code that can be removed from this (maybe some variables that no longer need to be calculated that were used for the previous code)?

There was also a point about getting rid of all the regression computation, since it's not really used (or not supposed to be).

@VourMa (Contributor) commented May 29, 2023

Timing on lnx7188 (V100) - DNN timing is not really worse here...

Before DNN:

Total Timing Summary
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Event      Short           Rate
   avg      4.4      3.2      4.1      3.2      3.3      1.2      1.9      1.4      3.6      26.1      20.5+/-  3.3      29.5   explicit_cache[s=1]
   avg      7.5      5.3      4.7      6.0      5.1      1.5      3.7      2.4      7.1      43.3      34.3+/-  6.4      25.8   explicit_cache[s=2]
   avg     16.3      8.6      5.9     12.8     10.0      2.1      8.3      4.9     13.5      82.4      63.9+/-  9.9      22.0   explicit_cache[s=4]
   avg     24.7     11.5      7.9     17.8     14.5      2.7     12.4      7.2     18.2     116.9      89.6+/- 17.7      22.0   explicit_cache[s=6]
   avg     37.0     15.2     10.2     22.0     20.3      3.5     18.1     10.3     24.8     161.3     120.8+/- 21.8      22.2   explicit_cache[s=8]

After DNN:

Total Timing Summary
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Event      Short           Rate
   avg      4.4      3.1      3.4      3.0      3.5      1.2      1.8      1.4      3.5      25.4      19.9+/-  3.5      29.2   explicit_cache[s=1]
   avg      7.0      4.8      4.0      5.6      4.9      1.4      3.4      2.1      6.1      39.3      30.9+/-  6.8      26.1   explicit_cache[s=2]
   avg     15.8      9.1      7.8     14.0     11.8      2.1      8.5      4.9     13.0      87.0      69.0+/- 11.0      23.5   explicit_cache[s=4]
   avg     24.7     11.8      8.0     17.5     14.1      2.7     13.5      7.6     17.5     117.5      90.1+/- 19.4      22.4   explicit_cache[s=6]
   avg     37.0     14.9      9.7     23.2     20.5      3.7     18.0     10.6     23.9     161.6     120.8+/- 22.0      21.9   explicit_cache[s=8]

@VourMa (Contributor) commented May 29, 2023

DNN performance within CMSSW:
LST w/o DNN -> Blue
LST w/ DNN -> Red

PU200 sample: https://uaf-10.t2.ucsd.edu/~evourlio/SDL/DNNv1InCMSSW_LSTvsLSTDNNOnly/
Cube5 sample: https://uaf-10.t2.ucsd.edu/~evourlio/SDL/DNNv1InCMSSW_cube5_LSTvsLSTDNNOnly/

@slava77 (Contributor) commented May 30, 2023

> Timing on lnx7188 (V100) - DNN timing is not really worse here...

could it be that the code updates during the review are also responsible?

@jkguiang (Contributor, Author):

I re-ran sdl_timing PU200 today on cgpu-1 with a fresh clone of the current master branch:

Total Timing Summary
   Evt    Hits       MD       LS      T3       T5       pLS       pT5      pT3      TC       Event      Short           Rate
   avg      1.7      2.0      1.5      3.4      3.7      1.4      1.4      1.0      2.1      18.1      15.1+/-  3.5      19.7   explicit_cache[s=1]
   avg      2.6      3.5      1.8      4.9      5.0      1.5      2.8      1.6      3.5      27.2      23.1+/-  4.9      14.3   explicit_cache[s=2]
   avg      4.9      5.2      3.1      9.3      8.9      1.7      6.4      3.2      6.1      48.8      42.1+/-  7.6      12.7   explicit_cache[s=4]
   avg      9.7      6.7      4.3     12.8     12.8      2.2      9.6      4.8      9.2      72.1      60.1+/- 12.0      12.7   explicit_cache[s=6]
   avg     16.9      9.1      6.3     18.5     18.2      2.6     13.3      6.7     12.1     103.6      84.1+/- 16.6      13.7   explicit_cache[s=8]

These seem more consistent with what Manos sees on the Cornell machines. I'm not sure what is different from when I ran the timing last time, but I'm not complaining 😄

@VourMa (Contributor) commented May 30, 2023

I don't see any smoking gun in the updates for the decrease in the timing. @jkguiang could it be that you measured the timing for the master and the DNN-version at very different times?
