ONNX-based Higgs to bb Interaction Network Tagger #30072

Merged
merged 8 commits into from Jul 7, 2020


jmduarte
Member

@jmduarte jmduarte commented Jun 2, 2020

PR description:

This PR adds the Higgs to bb Interaction Network tagger to CMSSW. The model was originally developed using open simulation (https://arxiv.org/abs/1909.12285), and the version here corresponds to that original version. A re-training with mass-varying samples will be performed in the future. The model has been presented in several BTV meetings (e.g. https://indico.cern.ch/event/833434/contributions/3495704/) and JMAR meetings (e.g. https://indico.cern.ch/event/856106/contributions/3613884/).

The model was developed in PyTorch and converted to ONNX for use in CMSSW.

Execution time with 4 threads on 2017 JetHT events, with the other deep taggers shown for reference:

TimeReport ---------- Module Summary ---[Real sec]----
TimeReport   0.000274     0.000274     0.000274  pfDeepDoubleXTagInfosSlimmedAK8DeepTags
TimeReport   0.000739     0.000739     0.000739  pfDeepBoostedJetTagInfosSlimmedAK8DeepTags
TimeReport   0.000738     0.000738     0.000738  pfParticleNetTagInfosSlimmedAK8DeepTags
TimeReport   0.000633     0.000633     0.000633  pfHiggsInteractionNetTagInfosSlimmedAK8DeepTags
TimeReport   0.000512     0.000512     0.000512  pfDeepDoubleBvLJetTagsSlimmedAK8DeepTags
TimeReport   0.004054     0.004054     0.004054  pfDeepBoostedJetTagsSlimmedAK8DeepTags
TimeReport   0.104109     0.104109     0.104109  pfParticleNetJetTagsSlimmedAK8DeepTags
TimeReport   0.007841     0.007841     0.007841  pfHiggsInteractionNetTagsSlimmedAK8DeepTags
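
To make the comparison above easier to read, here is a minimal sketch that parses the TimeReport rows and sums the TagInfos and Tags modules for each tagger. It assumes the first numeric column is the per-event real time (the three columns are identical in this report, so the choice does not change the result); the helper name `parse_time_report` is hypothetical and not part of CMSSW.

```python
TIME_REPORT = """\
TimeReport ---------- Module Summary ---[Real sec]----
TimeReport   0.000274     0.000274     0.000274  pfDeepDoubleXTagInfosSlimmedAK8DeepTags
TimeReport   0.000739     0.000739     0.000739  pfDeepBoostedJetTagInfosSlimmedAK8DeepTags
TimeReport   0.000738     0.000738     0.000738  pfParticleNetTagInfosSlimmedAK8DeepTags
TimeReport   0.000633     0.000633     0.000633  pfHiggsInteractionNetTagInfosSlimmedAK8DeepTags
TimeReport   0.000512     0.000512     0.000512  pfDeepDoubleBvLJetTagsSlimmedAK8DeepTags
TimeReport   0.004054     0.004054     0.004054  pfDeepBoostedJetTagsSlimmedAK8DeepTags
TimeReport   0.104109     0.104109     0.104109  pfParticleNetJetTagsSlimmedAK8DeepTags
TimeReport   0.007841     0.007841     0.007841  pfHiggsInteractionNetTagsSlimmedAK8DeepTags
"""


def parse_time_report(text):
    """Return {module label: real seconds} for each data row of the report."""
    timings = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have the shape: TimeReport <t1> <t2> <t3> <module label>
        if len(fields) != 5 or fields[0] != "TimeReport":
            continue
        try:
            # Assumption: the first numeric column is the per-event time.
            timings[fields[4]] = float(fields[1])
        except ValueError:
            continue  # not a numeric row
    return timings


timings = parse_time_report(TIME_REPORT)

# Each tagger runs a TagInfos producer followed by a Tags producer.
higgs_in = (timings["pfHiggsInteractionNetTagInfosSlimmedAK8DeepTags"]
            + timings["pfHiggsInteractionNetTagsSlimmedAK8DeepTags"])
particle_net = (timings["pfParticleNetTagInfosSlimmedAK8DeepTags"]
                + timings["pfParticleNetJetTagsSlimmedAK8DeepTags"])
deep_boosted = (timings["pfDeepBoostedJetTagInfosSlimmedAK8DeepTags"]
                + timings["pfDeepBoostedJetTagsSlimmedAK8DeepTags"])

print(f"Higgs IN total:    {higgs_in * 1e3:.3f} ms per event")
print(f"DeepBoosted total: {deep_boosted * 1e3:.3f} ms per event")
print(f"ParticleNet total: {particle_net * 1e3:.3f} ms per event")
print(f"ParticleNet / Higgs IN: {particle_net / higgs_in:.1f}x")
```

Summing the two modules per tagger gives a fairer comparison than the inference step alone, since the feature-building cost differs between taggers.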

Needs cms-data/RecoBTag-Combined#30

PR validation:

The output of this PR was checked against the training framework and shows the same results.

[Images: probHbb, probQCD]

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30072/15799

  • This PR adds an extra 60KB to repository

  • Found files with invalid states:

    • RecoBTag/ONNXRuntime/plugins/HiggsINONNXJetTagsProducer.cc:
    • RecoBTag/ONNXRuntime/python/pfHiggsIN_cff.py:
    • RecoBTag/ONNXRuntime/python/Parameters/HiggsIN/V00/pfHiggsINPreprocessParams_cfi.py:
    • RecoBTag/FeatureTools/plugins/HiggsINTagInfoProducer.cc:
    • DataFormats/BTauReco/interface/HiggsINTagInfo.h:
    • DataFormats/BTauReco/interface/HiggsINFeatures.h:
  • There are other open Pull requests which might conflict with changes you have proposed:

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

A new Pull Request was created by @jmduarte (Javier Duarte) for master.

It involves the following packages:

DataFormats/BTauReco
PhysicsTools/PatAlgos
RecoBTag/Configuration
RecoBTag/FeatureTools
RecoBTag/ONNXRuntime

@perrotta, @cmsbuild, @santocch, @slava77 can you please review it and eventually sign? Thanks.
@rappoccio, @gouskos, @hatakeyamak, @emilbols, @peruzzim, @seemasharmafnal, @mmarionncern, @ahinzmann, @smoortga, @jdolen, @hqucms, @ferencek, @rovere, @jdamgov, @nhanvtran, @gkasieczka, @schoef, @mariadalfonso, @clelange, @riga, @JyothsnaKomaragiri, @mverzett, @gpetruc, @andrzejnovak this is something you requested to watch as well.
@silviodonato, @dpiparo you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-30072/15800

  • This PR adds an extra 56KB to repository

  • There are other open Pull requests which might conflict with changes you have proposed:

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

The code-checks are being triggered in jenkins.

@cmsbuild
Contributor

cmsbuild commented Jun 2, 2020

The tests are being triggered in jenkins.
Tested with other pull request(s) cms-data/RecoBTag-Combined#30
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/6745/console Started: 2020/06/02 18:29

@cmsbuild
Contributor

cmsbuild commented Jul 4, 2020

-1

Tested at: 49d1b08

CMSSW: CMSSW_11_2_X_2020-07-03-2300
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c596d4/7688/summary.html

I found the following errors while testing this PR

Failed tests: AddOn

  • AddOn:

I found errors in the following addon tests:

Preparing to run ['cmsDriver.py TTbar_Tauola_13TeV_TuneCUETP8M1_cfi -s GEN,SIM,DIGI,L1,DIGI2RAW --mc --scenario=pp -n 10 --conditions auto:run3_mc_PIon --relval 9000,50 --datatier "GEN-SIM-RAW" --eventcontent RAWSIM --customise=HLTrigger/Configuration/CustomConfigs.L1T --era Run3 --fileout file:RelVal_Raw_PIon_MC.root', 'cmsRun /cvmfs/cms-ib.cern.ch/week1/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_X_2020-07-03-2300/src/HLTrigger/Configuration/test/OnLine_HLT_PIon.py realData=False globalTag=@ inputFiles=@ ', 'cmsDriver.py RelVAddOnTest might have timed out: FAILED - secs

@cmsbuild
Contributor

cmsbuild commented Jul 4, 2020

Comparison job queued.

@cmsbuild
Contributor

cmsbuild commented Jul 4, 2020

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c596d4/7688/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 3262 differences found in the comparisons
  • DQMHistoTests: Total files compared: 37
  • DQMHistoTests: Total histograms compared: 2784120
  • DQMHistoTests: Total failures: 5
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 2784064
  • DQMHistoTests: Total skipped: 50
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: -0.004 KiB( 36 files compared)
  • DQMHistoSizes: changed ( 10224.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 154 log files, 17 edm output root files, 37 DQM output files
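
As a quick sanity check on the DQMHistoTests numbers above, the sketch below verifies that the reported categories add up to the total number of histograms compared and computes the failure fraction. Treating failures, nulls, successes, and skipped as disjoint and exhaustive is an assumption; the reported numbers are consistent with it.

```python
# Counts copied from the comparison summary.
summary = {
    "compared": 2784120,
    "failures": 5,
    "nulls": 1,
    "successes": 2784064,
    "skipped": 50,
}

# Assumption: every compared histogram lands in exactly one category.
accounted = sum(summary[key] for key in ("failures", "nulls", "successes", "skipped"))
failure_fraction = summary["failures"] / summary["compared"]

print("categories add up to total:", accounted == summary["compared"])
print(f"failure fraction: {failure_fraction:.2e}")
```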

@slava77
Contributor

slava77 commented Jul 7, 2020

+1

for #30072 49d1b08

  • code changes are in line with the PR description and the follow up review
  • jenkins tests pass and comparisons with the baseline show differences only in the
    • the addOn failure is unrelated (" RelVAddOnTest might have timed out: FAILED - secs")
  • local test with 1K events in wf 136.88811 confirms that only the two new tags, pfHiggsInteractionNetTagsSlimmedAK8DeepTags probQCD and probHbb, were added, while the existing tags for AK8 jets are unchanged
    • time per event is roughly consistent with the PR description; it adds up to about 0.8% per miniAOD workflow.
    • memory use increases by about 8 MiB (4.6 MiB constant per job, mostly in the model global cache, and 3.3 MiB per event in DeepBoostedJetONNXJetTagsProducer::produce ONNX runtime)

In the JetHT data 1K-event wf 136.88811, the probHbb distribution (black histogram; ignore the red one) looks roughly as expected, showing mostly background.
[Image: jet_origVStest30072_RunJetHT2018DreMINIAODULwf136p88811c_min2,max-2,patJets_slimmedJetsAK8__PAT_obj___pairDiscriVector__83__second84]

This PR should be merged together with or after cms-data/RecoBTag-Combined#30

@silviodonato
Contributor

merge

@Dr15Jones
Contributor

This pull request seems to be responsible for the 700+ failures in the RelVal IBs with the error

An exception of category 'ConfigFileReadError' occurred while
   [0] Processing the python configuration file named step3_RAW2DIGI_L1Reco_RECO_RECOSIM_EI_PAT.py
Exception Message:
 unknown python problem occurred.
RuntimeError: An exception of category 'FileInPathError' occurred.
Exception Message:
edm::FileInPath unable to find file RecoBTag/Combined/data/HiggsInteractionNet/V00/IN.onnx anywhere in the search path.
The search path is defined by: CMSSW_SEARCH_PATH
${CMSSW_SEARCH_PATH} is: /data/cmsbld/jenkins/workspace/ib-run-relvals/CMSSW_11_2_X_2020-07-07-2300/poison:/data/cmsbld/jenkins/workspace/ib-run-relvals/CMSSW_11_2_X_2020-07-07-2300/src:/data/cmsbld/jenkins/workspace/ib-run-relvals/CMSSW_11_2_X_2020-07-07-2300/external/slc7_amd64_gcc820/data:/cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/src:/cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/external/slc7_amd64_gcc820/data
Current directory is: /data/cmsbld/jenkins/workspace/ib-run-relvals/CMSSW_11_2_X_2020-07-07-2300/pyRelval/1.0_ProdMinBias+ProdMinBias+DIGIPROD1+RECOPROD1


At:
  /cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/python/FWCore/ParameterSet/Types.py(808): insertInto
  /cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/python/FWCore/ParameterSet/Mixins.py(373): insertContentsInto
  /cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/python/FWCore/ParameterSet/Mixins.py(502): insertInto
  /cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/python/FWCore/ParameterSet/Modules.py(162): insertInto
  /cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/python/FWCore/ParameterSet/Config.py(1100): _insertManyInto
  /cvmfs/cms-ib.cern.ch/nweek-02636/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_X_2020-07-07-2300/python/FWCore/ParameterSet/Config.py(1315): fillProcessDesc
  <string>(2): <module>


@slava77
Contributor

slava77 commented Jul 8, 2020

> This pull request seems to be responsible for the 700+ failures in the RelVal IBs with the error

it's not the PR, it's the incomplete merge of the external that's the issue
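
For readers unfamiliar with the FileInPathError in the log, the following is a simplified sketch of a colon-separated search-path lookup. It illustrates why the configuration fails to load until the model file is present under one of the search-path entries. This is not the actual edm::FileInPath implementation; the function `find_in_search_path` and the temporary directory layout are hypothetical.

```python
import os
import tempfile


def find_in_search_path(relative_path, search_path):
    """Return the first existing file for relative_path under the
    colon-separated search_path, or None if no entry contains it."""
    for base in search_path.split(":"):
        if not base:
            continue
        candidate = os.path.join(base, relative_path)
        if os.path.isfile(candidate):
            return candidate
    return None


# Relative path taken from the error message.
MODEL = "RecoBTag/Combined/data/HiggsInteractionNet/V00/IN.onnx"

with tempfile.TemporaryDirectory() as area:
    # Hypothetical stand-ins for the src and external data entries.
    src = os.path.join(area, "src")
    external = os.path.join(area, "external", "data")
    os.makedirs(src)
    os.makedirs(external)
    search_path = ":".join([src, external])

    # Without the data external in place, the lookup finds nothing.
    missing = find_in_search_path(MODEL, search_path)

    # Once the file exists under any entry, the same lookup succeeds.
    target = os.path.join(external, MODEL)
    os.makedirs(os.path.dirname(target))
    open(target, "wb").close()
    found = find_in_search_path(MODEL, search_path)

print("before external is available:", missing)
print("after external is available: ", found is not None)
```

In this picture the code in the PR is unchanged between the two lookups; only the availability of the data file differs.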

@santocch

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing).


7 participants