New ParticleNet training for UL #31036

Merged
merged 1 commit into from Aug 6, 2020

hqucms
Contributor

@hqucms hqucms commented Aug 3, 2020

PR description:

This PR updates the ParticleNet tagger to the new training [V01] developed for the UL re-MiniAOD. The training is derived from UL17+UL18 samples using Puppi tune V14. It improves the performance for UL samples and the new Puppi tune. More information can be found in the JME talks [1, 2] and the BTV talk.

Requires: cms-data/RecoBTag-Combined#34

The new ParticleNet models use the "dynamic axis" feature of ONNX to avoid zero-padding the particle/SV sequence, which reduces the inference time by more than a factor of two compared to the V00 models.
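
To illustrate why avoiding zero padding saves time, here is a minimal sketch in plain Python. It is not the ParticleNet or ONNX Runtime code: the padded length of 100, the 30-particle jet, and the toy per-particle operation are all hypothetical choices made for illustration.

```python
# Toy illustration only: the padded length and the per-particle "work" are
# hypothetical and are not taken from the ParticleNet models.
PADDED_LENGTH = 100  # assumed fixed sequence length for the padded variant


def pad_to_fixed_length(values, length=PADDED_LENGTH):
    """Zero-pad (or truncate) a sequence to a fixed length and build a 0/1 mask."""
    kept = list(values[:length])
    n_pad = length - len(kept)
    return kept + [0.0] * n_pad, [1] * len(kept) + [0] * n_pad


def masked_sum_of_squares(values, mask):
    """Stand-in for a per-particle computation; visits every slot, padded or not."""
    total, n_ops = 0.0, 0
    for v, m in zip(values, mask):
        total += m * v * v
        n_ops += 1
    return total, n_ops


def sum_of_squares(values):
    """Same computation on the unpadded (dynamic-length) input."""
    total, n_ops = 0.0, 0
    for v in values:
        total += v * v
        n_ops += 1
    return total, n_ops


particle_pts = [float(i) for i in range(1, 31)]  # a hypothetical jet with 30 particles

padded, mask = pad_to_fixed_length(particle_pts)
padded_result, padded_ops = masked_sum_of_squares(padded, mask)
dynamic_result, dynamic_ops = sum_of_squares(particle_pts)

print(padded_result == dynamic_result)  # both variants give the same answer
print(padded_ops, dynamic_ops)          # the padded variant visits every padded slot
```

In this toy setup the padded variant does work proportional to the fixed length, while the dynamic-length variant only touches the particles that are actually present.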

[V00]

TimeReport   0.071098     0.071098     0.071098  pfMassDecorrelatedParticleNetJetTags
TimeReport   0.068718     0.068718     0.068718  pfParticleNetJetTags

[V01]

TimeReport   0.029899     0.029899     0.029899  pfMassDecorrelatedParticleNetJetTags
TimeReport   0.029965     0.029965     0.029965  pfParticleNetJetTags

(Measured on a ZprimeToTT_M1000_W10_TuneCP2_13TeV-madgraphMLM-pythia8 sample.)
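
The speed-up implied by the TimeReport numbers above can be checked with a few lines of Python. This is only arithmetic on the quoted values; the dictionary layout is an illustrative choice.

```python
# Times copied from the TimeReport lines above (same units for V00 and V01).
timings = {
    "pfMassDecorrelatedParticleNetJetTags": {"V00": 0.071098, "V01": 0.029899},
    "pfParticleNetJetTags": {"V00": 0.068718, "V01": 0.029965},
}

# Ratio of old to new time for each module.
speedups = {name: t["V00"] / t["V01"] for name, t in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: V00/V01 time ratio = {factor:.2f}")
```

Both ratios come out above two, consistent with the "more than a factor of two" statement.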

PR validation:

The CMSSW implementation is compared to the training framework and consistent results are obtained.

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-31036/17530

  • This PR adds an extra 12KB to repository

@cmsbuild
Contributor

cmsbuild commented Aug 3, 2020

A new Pull Request was created by @hqucms (Huilin Qu) for master.

It involves the following packages:

RecoBTag/ONNXRuntime

@perrotta, @jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks.
@emilbols, @smoortga, @JyothsnaKomaragiri, @mverzett, @ferencek, @andrzejnovak this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@silviodonato
Contributor

please test with cms-data/RecoBTag-Combined#34

@cmsbuild
Contributor

cmsbuild commented Aug 4, 2020

The tests are being triggered in jenkins.
Tested with other pull request(s) cms-data/RecoBTag-Combined#34

@cmsbuild
Contributor

cmsbuild commented Aug 4, 2020

+1
Tested at: 33099a5
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4c1bb6/8554/summary.html
CMSSW: CMSSW_11_2_X_2020-08-03-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Contributor

cmsbuild commented Aug 4, 2020

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Aug 4, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4c1bb6/8554/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 1114 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2525448
  • DQMHistoTests: Total failures: 5
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2525395
  • DQMHistoTests: Total skipped: 47
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 33 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 144 log files, 17 edm output root files, 34 DQM output files
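
As a quick sanity check on the DQMHistoTests counts above, the sketch below verifies that the listed outcomes add up to the total and computes the failure fraction. It uses only the numbers quoted in the summary; treating the four outcome categories as a partition of the total is an observation from these numbers, not a documented property of the comparison tool.

```python
# Counts copied from the DQMHistoTests lines in the comparison summary above.
total_compared = 2525448
outcomes = {"failures": 5, "nulls": 1, "successes": 2525395, "skipped": 47}

accounted_for = sum(outcomes.values())
failure_fraction = outcomes["failures"] / total_compared

print(accounted_for == total_compared)
print(f"failure fraction: {failure_fraction:.2e}")
```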

@slava77
Contributor

slava77 commented Aug 6, 2020

> The new ParticleNet models use the "dynamic axis" feature of ONNX to avoid zero padding the particle/SV sequence, thus reduce the inference time by more than a factor of two compared to the V00 models.

what about memory?

I noticed that the .onnx file size is larger now
2.26 MB ParticleNetAK8/General/V01/particle-net.onnx
.. vs 1.47 MB ParticleNetAK8/General/V00/ParticleNet.onnx

@slava77
Contributor

slava77 commented Aug 6, 2020

+1

for #31036 33099a5

  • code changes are in line with the PR description
  • jenkins tests pass and comparisons with the baseline show differences only in miniAOD AK8 jets discriminants. Manual check in wf 136.88811 (100 events JetHT 2018 wf) shows differences only in particleNet tags (as expected) with some of the larger differences in jet Hbb and Hcc discriminators, in line with the slides provided in the PR description
    e.g. pfParticleNetJetTags:probHcc
    [Figure: old vs. new comparison plot of this discriminant for slimmedJetsAK8 in the JetHT 2018 re-MiniAOD UL workflow 136.88811]

@cmsbuild
Contributor

cmsbuild commented Aug 6, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @silviodonato, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Contributor

+1

cmsbuild merged commit 244938c into cms-sw:master on Aug 6, 2020
@hqucms
Contributor Author

hqucms commented Aug 7, 2020

> > The new ParticleNet models use the "dynamic axis" feature of ONNX to avoid zero padding the particle/SV sequence, thus reduce the inference time by more than a factor of two compared to the V00 models.
>
> what about memory?
>
> I noticed that the .onnx file size is larger now
> 2.26 MB ParticleNetAK8/General/V01/particle-net.onnx
> .. vs 1.47 MB ParticleNetAK8/General/V00/ParticleNet.onnx

@slava77
The memory usage is indeed a bit higher: V01 now takes 6.7 MB (init) + 33 MB (runtime) [1], compared to 6 MB + 15 MB for V00 [2]. However, since the runtime memory of ONNXRuntime does not scale with the number of threads [3], I think the increase should be fine.

[1] http://hqu.web.cern.ch/hqu/dev/cgi-bin/igprof-navigator/ParticleNet-V01-TTM1000-CMSSW_11_2_X_2020-08-03-1100-PR31036-MEM_LIVE_99/367
http://hqu.web.cern.ch/hqu/dev/cgi-bin/igprof-navigator/ParticleNet-V01-TTM1000-CMSSW_11_2_X_2020-08-03-1100-PR31036-MEM_LIVE_99/100

[2] #30599 (comment)

[3] tested w/ RecoBTag/ONNXRuntime/test/test_particle_net_cfg.py

  • peak RSS w/ 1 thread: 656MB
  • peak RSS w/ 8 threads: ~720MB
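
The numbers in this comment can be put side by side with a short Python sketch. It only does arithmetic on the quoted values; the units are taken as written, and the 8-thread peak RSS is an approximate figure in the original.

```python
# Numbers copied from the comment above; units are taken as written there.
model_memory = {
    "V00": {"init": 6.0, "runtime": 15.0},
    "V01": {"init": 6.7, "runtime": 33.0},
}
totals = {version: sum(parts.values()) for version, parts in model_memory.items()}
increase = totals["V01"] - totals["V00"]

# threads -> peak RSS in MB (the 8-thread value is quoted as approximate)
peak_rss_mb = {1: 656.0, 8: 720.0}
extra_per_thread = (peak_rss_mb[8] - peak_rss_mb[1]) / (8 - 1)

print(f"V00 total: {totals['V00']:.1f}, V01 total: {totals['V01']:.1f}, increase: {increase:.1f}")
print(f"extra peak RSS per additional thread: about {extra_per_thread:.1f} MB")
```

The extra peak RSS per additional thread is well below the 33 quoted for the V01 runtime memory, which is in line with the statement that the runtime memory is not duplicated per thread. Peak RSS covers the whole process, so this is only a rough cross-check.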
