Updated MLPF producer with ONNX #36841

Merged
merged 11 commits into from Feb 11, 2022

jpata
Contributor

@jpata jpata commented Jan 31, 2022

PR description:

Following the presentation at the PPD general meeting, at ACAT2021, and as outlined in the PPD workshop, we are updating the MLPF integration in CMSSW to facilitate further development and scrutiny.

Note that MLPF is off by default and thus no changes are expected in any of the standard workflows.

This PR mainly updates the ML model and switches the inference from TensorFlow to ONNX. MLPF-specific event content is removed because, when enabled, MLPF now produces PFCandidates instead of PFAlgo, rather than running in parallel to it.

Here are the igprof results:

PR validation:

For physics validation, please see the slides linked above. The integration can be tested in the workflows 11843.13 and 11834.13.

@jpata
Contributor Author

jpata commented Jan 31, 2022

test parameters:

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36841/28032

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36841/28033

@jpata
Contributor Author

jpata commented Feb 4, 2022

adding @cms-sw/pf-l2 @laurenhay here, just to make sure everyone is informed.

@cmsbuild
Contributor

cmsbuild commented Feb 5, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-bafd5c/22230/summary.html
COMMIT: 539f744
CMSSW: CMSSW_12_3_X_2022-02-04-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/36841/22230/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-bafd5c/22230/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-bafd5c/22230/git-merge-result

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-bafd5c/11834.13_TTbar_14TeV+2021PU_mlpf+TTbar_14TeV_TuneCP5_GenSim+DigiPU+RecoNanoPU+HARVESTNanoPU
  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-bafd5c/11843.13_QCD_FlatPt_15_3000HS_14+2021PU_mlpf+QCDForPF_14TeV_TuneCP5_GenSim+DigiPU+RecoNanoPU+HARVESTNanoPU

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 46
  • DQMHistoTests: Total histograms compared: 3766018
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3765994
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 45 files compared)
  • Checked 193 log files, 42 edm output root files, 46 DQM output files
  • TriggerResults: no differences found

@jfernan2
Contributor

jfernan2 commented Feb 7, 2022

+1

@clacaputo
Contributor

+reconstruction

  • no changes are expected since MLPF is off by default
  • WFs 11834.13 and 11843.13 run without problems

@srimanob
Contributor

srimanob commented Feb 7, 2022

+Upgrade

From the code related to Upgrade, updating the MLPF workflow to allow the QCD sample is fine.

@kskovpen
Contributor

kskovpen commented Feb 7, 2022

+pdmv

@cmsbuild
Contributor

cmsbuild commented Feb 7, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

tensorflow::Tensor input(tensorflow::DT_FLOAT, shape);
input.flat<float>().setZero();
#ifdef MLPF_DEBUG
std::cout << "tensor_size=" << tensor_size << std::endl;
Contributor

Is this needed, given the previous asserts?
If it is really needed for debugging, then it may be better to have this cout before L55.

Contributor Author

True, it would perhaps be more informative (and take fewer lines) to print the value before the assert while debugging. Can we address this in a follow-up version, or would you prefer that this PR be re-signed?

Contributor

OK, let's not request 4+1 signatures just to move one comment line: let's postpone it to a follow-up PR.
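
For reference, the reordering discussed in this thread could look roughly like the sketch below: the debug print goes before the size check, so the value is visible even if the assertion aborts. This is only an illustration and not the producer code. `makeInput`, `kNumElements` and `kNumFeatures` are hypothetical names, and a plain `std::vector<float>` stands in for the input tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-ins for the producer's own constants (assumed values).
constexpr std::size_t kNumElements = 4;  // assumed padded number of input elements
constexpr std::size_t kNumFeatures = 3;  // assumed number of features per element

// Builds a zero-initialised flat input buffer. When MLPF_DEBUG is defined,
// the size is printed *before* the assertion, so it is still shown on failure.
std::vector<float> makeInput(std::size_t tensor_size, std::ostream& log = std::cout) {
  (void)log;  // avoids an unused-parameter warning when MLPF_DEBUG is not defined
#ifdef MLPF_DEBUG
  log << "tensor_size=" << tensor_size << std::endl;
#endif
  assert(tensor_size == kNumElements * kNumFeatures);
  return std::vector<float>(tensor_size, 0.0f);
}
```

Calling `makeInput(kNumElements * kNumFeatures)` returns a buffer of zeros; compiling with `-DMLPF_DEBUG` additionally prints the size first.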

cand.setPdgId(pred_pid);
cand.setCharge(charge);
reco::PFCandidate::ParticleType particleType(reco::PFCandidate::X);
if (pred_pid == 211)
Contributor

@perrotta perrotta Feb 10, 2022

It looks like this pred_pid is unsigned (here and everywhere else in this code), i.e. it does not take the charge into account: can you confirm, just to check?

Contributor Author

Yes, by construction, the underlying model reconstructs the absolute value of the PID separately from the charge.

So far, we haven't presented detailed results on the charge prediction (hence //cand.setCharge(charge); downstream), but it should not be a major problem, as the charge should be driven by the track information.
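
To make the "absolute PID plus separate charge" convention concrete, here is a minimal sketch of how the two outputs could be combined downstream. It is not part of this PR, which sets the unsigned `pred_pid` directly. The `ParticleType` enum is a hypothetical local stand-in that only borrows names from `reco::PFCandidate::ParticleType`, and `typeFromAbsPid` and `signedPdgId` are illustrative helpers. The sign handling follows the standard PDG convention, in which the positive codes 11 and 13 denote the negatively charged leptons.

```cpp
#include <cassert>

// Hypothetical stand-in for reco::PFCandidate::ParticleType (illustration only).
enum class ParticleType { X, h, e, mu, gamma, h0 };

// Maps the absolute PDG code predicted by the model to a particle type.
// Only a few standard codes are covered; anything else falls back to X.
ParticleType typeFromAbsPid(int pred_pid) {
  assert(pred_pid >= 0);  // the model output is an absolute value by construction
  switch (pred_pid) {
    case 211: return ParticleType::h;      // charged hadron
    case 130: return ParticleType::h0;     // neutral hadron
    case 22:  return ParticleType::gamma;  // photon
    case 11:  return ParticleType::e;      // electron
    case 13:  return ParticleType::mu;     // muon
    default:  return ParticleType::X;      // undefined
  }
}

// Combines the absolute PDG code with a separately predicted charge into a
// signed PDG code. For charged leptons the positive code is the negative particle.
int signedPdgId(int pred_pid, int charge) {
  if (charge == 0)
    return pred_pid;
  const bool isLepton = (pred_pid == 11 || pred_pid == 13);
  const int signForPositiveCharge = isLepton ? -1 : +1;
  return (charge > 0 ? signForPositiveCharge : -signForPositiveCharge) * pred_pid;
}
```

For example, `signedPdgId(211, -1)` gives `-211` (a negative pion), while `signedPdgId(11, -1)` gives `11` (an electron).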

@perrotta
Contributor

+1


7 participants