Updating SONIC ParticleNet Producer and Config Files #43138

Merged

wpmccormack
Contributor

PR description:

The latest release cycle of CMSSW (CMSSW_13) includes new versions of ParticleNet and a new Higgs interaction network, all of which can be accelerated with the current SONIC infrastructure. That infrastructure was introduced in a previously merged PR, #37964. The new models have different input features, which requires a more robust way of determining the shapes of the tensors sent to the model-hosting servers. I introduce a scheme that reads each model's preprocessing JSON file to determine how many of the features belong to particle flow candidates (pf), secondary vertices (sv), or the newly introduced lost tracks (lt).
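
As a rough illustration of the idea, the sketch below counts how many entries in a preprocessing JSON belong to each group by looking at a name prefix. It is written in Python for brevity, while the producer itself is C++. The JSON layout (a top-level `input_names` list whose entries start with `pf`, `sv`, or `lt`) and the function name `count_input_groups` are assumptions made for this example and may not match the actual model files or the logic in this PR.

```python
import json

# Assumed group prefixes, taken from the abbreviations in the PR description.
GROUP_PREFIXES = ("pf", "sv", "lt")


def count_input_groups(preprocess_json_text):
    """Count inputs per group (pf, sv, lt) from a preprocessing JSON string.

    Assumes a top-level "input_names" list; this layout is hypothetical.
    """
    config = json.loads(preprocess_json_text)
    counts = {prefix: 0 for prefix in GROUP_PREFIXES}
    for name in config.get("input_names", []):
        for prefix in GROUP_PREFIXES:
            if name.startswith(prefix):
                counts[prefix] += 1
                break
    return counts


# Made-up example input, not a real model file.
example = json.dumps(
    {
        "input_names": [
            "pf_points", "pf_features", "pf_mask",
            "sv_points", "sv_features", "sv_mask",
        ]
    }
)
print(count_input_groups(example))
```

In this sketch a model without lost tracks simply reports a count of 0 for `lt`, so the same counting step could serve both older and newer models.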

Tagging @kpedro88 @yongbinfeng @violatingcp

PR validation:

The model outputs have been checked locally and match those of the non-SONIC versions of the models. I also followed the pre-PR instructions here: https://cms-sw.github.io/PRWorkflow.html. Some tests failed due to xrootd issues, which are not related to this PR as far as I can tell.

This PR will depend on a PR in the RecoBTag-Combined repository. Once I open that PR, I will comment on this one.

…ersions of particlenet and higgs interaction network
@wpmccormack
Contributor Author

As promised, the accompanying model file PR is here: cms-data/RecoBTag-Combined#53

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43138/37417

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @wpmccormack (Patrick McCormack) for master.

It involves the following packages:

  • RecoBTag/ONNXRuntime (reconstruction)

@jfernan2, @cmsbuild, @mandrenguyen can you please review it and eventually sign? Thanks.
@missirol, @andrzejnovak, @demuller, @AnnikaStein, @Ming-Yan, @AlexDeMoor, @emilbols, @Senphy, @JyothsnaKomaragiri this is something you requested to watch as well.
@rappoccio, @sextonkennedy, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

@jfernan2
Contributor

enable profiling

@jfernan2
Contributor

please test

@@ -86,6 +93,8 @@ ParticleNetSonicJetTagsProducer::ParticleNetSonicJetTagsProducer(const edm::Para
  for (const auto &flav_name : flav_names_) {
    produces<JetTagCollection>(flav_name);
  }

  emptyJets_.clear();
Contributor

should not be necessary; vectors are initialized to empty by default

Contributor Author

deleted


emptyJets_.clear();

if (!countedInputs) {
Contributor

is there a reason this can't be done in the constructor?

Contributor Author

Hi Kevin, do you mean moving this to the ParticleNetConstructor function here

void ParticleNetConstructor(const edm::ParameterSet &Config_,
?
I need the emptyJets_ vector to be cleared for every event, which is why I added the call here, rather than relying only on the call at line 97, which you said was not needed.

Contributor

This comment was about the initialization of the constants that happens in this if block. (Indeed, the clear() call should happen in acquire().)

Contributor Author

Ahhh I see what you mean. Yes, moved that block into the constructor
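
The outcome of this exchange is a common split: values that depend only on configuration are computed once at construction, while per-event scratch state is reset at the start of each event. The sketch below shows that split in Python purely as an analogy. The real producer is C++, and the class name `SketchProducer`, the configuration keys, and the `acquire` signature are hypothetical stand-ins, not the code in this PR.

```python
class SketchProducer:
    """Hypothetical stand-in for a producer; not the CMSSW class."""

    def __init__(self, config):
        # One-time work: these constants depend only on configuration,
        # so they are set here instead of behind a "counted yet?" flag
        # inside the per-event method.
        self.input_shape = (config["n_features"], config["n_points"])
        # Containers start out empty; no explicit clear is needed here.
        self.empty_jets = []

    def acquire(self, jets):
        # Per-event work: drop whatever the previous event left behind.
        self.empty_jets.clear()
        for index, constituents in enumerate(jets):
            if not constituents:
                self.empty_jets.append(index)
        # Return a copy so callers cannot alias the scratch list.
        return list(self.empty_jets)


producer = SketchProducer({"n_features": 20, "n_points": 100})
print(producer.acquire([[1], [], []]))
print(producer.acquire([[], [2]]))
```

Without the per-event `clear()`, indices recorded for one event would leak into the next one, which is why that call stays in the per-event method even though the constants move to the constructor.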

@@ -46,6 +46,7 @@
)
)


Contributor

delete unnecessary change

Contributor Author

deleted

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ae5ab4/35496/summary.html
COMMIT: 6855d27
CMSSW: CMSSW_13_3_X_2023-10-29-2300/el8_amd64_gcc12
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43138/35496/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 28 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3362691
  • DQMHistoTests: Total failures: 1070
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3361599
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 214 log files, 167 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@kpedro88
Contributor

test parameters:
pull_requests = cms-data/RecoBTag-Combined#53
workflows = 11824.9001,24834.9001

@kpedro88
Contributor

please test

@cmsbuild
Contributor

-1

Failed Tests: RelVals RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ae5ab4/35504/summary.html
COMMIT: 6855d27
CMSSW: CMSSW_13_3_X_2023-10-30-1100/el8_amd64_gcc12
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43138/35504/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

ValueError: Undefined workflows: 11824.9001, 24834.9001

----- Begin Fatal Exception 30-Oct-2023 15:21:44 CET-----------------------
An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named step2_PAT_DQM.py
Exception Message:
 unknown python problem occurred.
RuntimeError: An exception of category 'FileInPathError' occurred.
Exception Message:
Path /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43138/35504/CMSSW_13_3_X_2023-10-30-1100/external/el8_amd64_gcc12/data/RecoBTag/Combined/data/ParticleNetFromMiniAODAK4/CHS/Central/particle-net.onnx is a symbolic link, not a file


At:
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-10-30-1100/src/FWCore/ParameterSet/python/Types.py(881): insertInto
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-10-30-1100/src/FWCore/ParameterSet/python/Mixins.py(381): insertContentsInto
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-10-30-1100/src/FWCore/ParameterSet/python/Mixins.py(516): insertInto
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-10-30-1100/src/FWCore/ParameterSet/python/Modules.py(161): insertInto
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-10-30-1100/src/FWCore/ParameterSet/python/Config.py(1216): _insertManyInto
  /cvmfs/cms-ib.cern.ch/sw/x86_64/week1/el8_amd64_gcc12/cms/cmssw-patch/CMSSW_13_3_X_2023-10-30-1100/src/FWCore/ParameterSet/python/Config.py(1490): fillProcessDesc
  <string>(2): <module>

----- End Fatal Exception -------------------------------------------------

RelVals-INPUT

  • 4.6: 4.6_MinimumBias2010A/step2_MinimumBias2010A.log
  • 136.72411: 136.72411_RunJetHT2016B_reminiaodUL/step2_RunJetHT2016B_reminiaodUL.log
  • 136.72412: 136.72412_RunJetHT2016B_reminiaodUL/step2_RunJetHT2016B_reminiaodUL.log

@kpedro88
Contributor

test parameters:
pull_requests = cms-data/RecoBTag-Combined#53
workflows = 11824.9001,24834.9001
relvals_opt = --what cleanedupgrade,standard,highstats,pileup,generator,extendedgen,production,identity,ged,machine,premix,nano,gpu,2017,2026

@kpedro88
Contributor

@wpmccormack need to update the paths here

@wpmccormack
Contributor Author

gahhh, I forgot to push the changes here that sync with recent changes to the structure of cms-data/RecoBTag-Combined#53. Updated now
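
The `FileInPathError` in the failed test above complained that the model path was a symbolic link rather than a regular file. When updating such paths, a quick local check can show what a path actually is before the configuration is loaded. The sketch below uses only the Python standard library. The helper name `describe_path` is hypothetical, and this is not how `FileInPath` itself performs its check.

```python
import os


def describe_path(path):
    """Classify a path as 'symlink', 'file', 'directory', or 'missing'.

    The symlink test comes first because os.path.isfile follows links
    and would report a link to a regular file as a file.
    """
    if os.path.islink(path):
        return "symlink"
    if os.path.isfile(path):
        return "file"
    if os.path.isdir(path):
        return "directory"
    return "missing"
```

Calling `describe_path` on each model path referenced by a configuration would flag both stale paths (reported as `missing`) and links (reported as `symlink`).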

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43138/37471

  • This PR adds an extra 20KB to repository

@cmsbuild
Contributor

Pull request #43138 was updated. @mandrenguyen, @cmsbuild, @jfernan2 can you please check and sign again.

@kpedro88
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented Nov 1, 2023

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-ae5ab4/35540/summary.html
COMMIT: a9244c5
CMSSW: CMSSW_13_3_X_2023-10-31-1400/el8_amd64_gcc12
Additional Tests: PROFILING
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43138/35540/install.sh to create a dev area with all the needed externals and cmssw changes.

  • DAS Queries: The DAS query tests failed, see the summary page for details.

Comparison Summary

Summary:

  • You potentially removed 3 lines from the logs
  • Reco comparison results: 21 differences found in the comparisons
  • DQMHistoTests: Total files compared: 52
  • DQMHistoTests: Total histograms compared: 3560437
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3560409
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 51 files compared)
  • Checked 223 log files, 176 edm output root files, 52 DQM output files
  • TriggerResults: no differences found
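
The DQMHistoTests counts in the two comparison summaries in this thread can be sanity-checked with a little arithmetic: failures, successes, and skipped entries should add up to the total number of histograms compared. The sketch below does that for the figures quoted in the two `+1` reports. The function name `tally_is_consistent` is only for illustration and is not part of any cms-bot tooling.

```python
def tally_is_consistent(total, failures, successes, skipped):
    """Return True if the per-category counts add up to the total."""
    return failures + successes + skipped == total


# Figures copied from the two comparison summaries in this thread.
first_report = dict(total=3362691, failures=1070, successes=3361599, skipped=22)
second_report = dict(total=3560437, failures=6, successes=3560409, skipped=22)

for report in (first_report, second_report):
    fraction_failed = report["failures"] / report["total"]
    print(tally_is_consistent(**report), f"{fraction_failed:.2e}")
```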

@jfernan2
Contributor

jfernan2 commented Nov 2, 2023

+1
It would have been nice to check the timing in this PR, but the igprof test is broken: #43166

@cmsbuild
Contributor

cmsbuild commented Nov 2, 2023

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @rappoccio, @sextonkennedy (and backports should be raised in the release meeting by the corresponding L2)

@kpedro88
Contributor

kpedro88 commented Nov 2, 2023

@jfernan2 beyond igprof being broken, this PR only directly impacts the special workflows ending in .9001, which aren't picked up by the profiling option anyway.

@rappoccio
Contributor

+1

@cmsbuild merged commit 39b6c59 into cms-sw:master on Nov 3, 2023
18 checks passed