ParticleNetAK4 jet tagger #31570
The code-checks are being triggered in jenkins. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31570/18600
|
A new Pull Request was created by @hqucms (Huilin Qu) for master. It involves the following packages: PhysicsTools/PatAlgos @perrotta, @jpata, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
please test |
The tests are being triggered in jenkins.
|
@cmsbuild please abort (we need to include the data PR, we were just discussing this PR in the reco chat) |
Jenkins tests are aborted. |
test parameters:
|
@cmsbuild please test |
The tests are being triggered in jenkins.
|
Comparison is ready Comparison Summary:
|
Reco differences are only in the jet ID variables pairDiscriVector 19-24, as expected. Timing on the reMINIAOD phase2 ttbar workflow 1325.518 (1000 local events) goes up by ~4% (0.453879 s/ev -> 0.471795 s/ev). This is fine, but it is not negligible, so it should be noted. TimeReports are attached here: |
Looking at the report, the particleNetAK4 adds up to about 28 ms, which would be about 6%. (it would be nice to see it in the summary quoted above). |
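The two percentages quoted above measure different things, and a short calculation makes that explicit. This is only a sketch: the two per-event timings and the 28 ms figure are copied from the comments above, and the variable names are made up for illustration.

```python
# Per-event timings quoted in the comments above (seconds per event).
before_s = 0.453879
after_s = 0.471795
# Module time read off the TimeReport, as quoted above (about 28 ms).
module_s = 0.028

# Relative change of the total per-event time.
total_increase = (after_s - before_s) / before_s
# Module time expressed as a fraction of the baseline per-event time.
module_share = module_s / before_s

print(f"total increase: {total_increase:.1%}")            # 3.9%
print(f"module share of baseline: {module_share:.1%}")    # 6.2%
```

The ~4% figure is the change in total time per event, while the ~6% figure is the module's own time relative to the baseline.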
I don't see it as a blocker for this PR today, but it would be good to see an effort for all DNNs entering production to improve inference time by trimming the network size, as well as by doing inference with reduced precision (as Slava suggests). |
@slava77 @jpata For now, the easiest way to speed up the inference is to enable the dynamic architecture feature of ONNXRuntime. We then get a ~1.5-2x speed-up for free on all models using ONNXRuntime whenever AVX/AVX2/AVX512 is available, while still being able to run on older machines with only SSE. The price to pay is precision-level numerical differences in the results, due to the use of different instructions. For future developments one can try, e.g., applying some preselection on the jet constituent particles, or doing a systematic network architecture search to reduce the inference time, but all of these take a substantial amount of time and resources and go beyond the scope of this PR. |
+reconstruction
|
@hqucms can you clarify if a backport of this is planned? |
@jpata Yes, we plan to backport this to 106X for UL. |
urgent |
@silviodonato |
@andrzejnovak I think rounding to 1e-4 should cover the difference in most cases, but there can still be some exceptions. |
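A small sketch of why rounding to 1e-4 covers most cases but not all: two values that differ by far less than 1e-4 can still fall on opposite sides of a rounding boundary. The helper name and the discriminator values below are hypothetical and chosen only to illustrate the point.

```python
import math


def agree_after_rounding(a, b, decimals=4):
    """Return True if a and b are equal once rounded to `decimals` places."""
    return round(a, decimals) == round(b, decimals)


# Typical case: a tiny difference disappears after rounding.
print(agree_after_rounding(0.123412, 0.123418))  # True

# Exception: a difference of 2e-6 straddles the 0.12345 boundary,
# so the rounded values are 0.1234 and 0.1235.
print(agree_after_rounding(0.123449, 0.123451))  # False

# An absolute-tolerance comparison does not have this boundary effect.
print(math.isclose(0.123449, 0.123451, abs_tol=1e-4))  # True
```

If exact agreement after rounding is not required, a tolerance-based check like the last line may be the more robust comparison.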
merge |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged. |
PR description:
This PR adds the ParticleNet tagger for AK4 jets.
ParticleNetAK4 is a multi-class tagger for AK4 jets. The current version is trained on standard AK4 CHS jets using UL18 MC. The tagger is based on the ParticleNet graph neural network architecture, which is also used in CMSSW for boosted AK8 jet tagging.
The new ParticleNetAK4 tagger shows significant performance improvements. More details and comparisons can be found in the presentations in the BTV [1, 2] and the JME [1, 2] groups.
Requires:
cms-data/RecoBTag-Combined#35
PR validation:
Implementation of this PR has been verified with the training framework and shows consistent results.
[Timing]
Evaluated by running 1k ttbar events using RecoBTag/ONNXRuntime/test/test_particle_net_ak4_cfg.py. For comparison, below is for DeepJet:
[Timing for 1325.518]: #31570 (comment)
Timing for UL reMINIAOD workflow 1325.518 increases by ~4% (0.453879 s/ev -> 0.471795 s/ev); the particleNetAK4 module itself takes about 28 ms/ev, roughly 6% of the baseline.
[Memory]
Model init (1.8MB) + execution (3.5MB):
FYI @camclean @alefisico