Introduce (Robust)ParTAK4 jet tagger, DeepJet model update for Run 3, remove DeepCSV from nano #41275
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41275/35038
ERROR: Unable to merge PR. See log https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41275/35038/cms-checkout-topic.log
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41275/35041
ERROR: Build errors found during clang-tidy run.
It seems that some changes to the input producers in recent PRs led to these errors: #40803. We are investigating this and adjusting our implementation to account for those new elements.
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41275/35049
Code check has found code style and quality issues which could be resolved by applying following patch(s)
@cmsbuild, please test
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d29304/32316/summary.html
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see more details here:
Comparison Summary:
NANO Comparison Summary:
Nano size comparison Summary:
+1
@AnnikaStein @AlexDeMoor thank you for getting this one done. Let's try to finalize the backport today.
+reconstruction
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)
+1
By the way, the model's PR has not yet been merged: cms-data/RecoBTag-Combined#51
Thank you for noticing it, @AlexDeMoor
Backport to 13_0_X of #41275 Introduce (Robust)ParTAK4 jet tagger, DeepJet model update for Run 3, remove DeepCSV from nano
Backport to 13_1_X of #41275 Introduce (Robust)ParTAK4 jet tagger, DeepJet model update for Run 3, remove DeepCSV from nano
PR description:
This PR adds (Robust)ParticleTransformerAK4, in particular for Mini and Nano, updates the DeepJet (aka DeepFlavour) model, and removes outdated DeepCSV (aka bTagDeep…) entries for Run 3. The new models have been trained on 122X samples (PUPPI). This addresses multiple issues on GitLab: https://gitlab.cern.ch/cms-nanoAOD/xpog-coordination/-/issues/61 and https://gitlab.cern.ch/cms-nanoAOD/xpog-coordination/-/issues/56
It is a follow-up to PR #40706, where the DataFormats were defined; they are now filled with meaning.
We use the model files from cms-data/RecoBTag-Combined#51. As discussed in https://indico.cern.ch/event/1263008/#10-ak4-jets-tagging-algorithms, we were able to introduce a well-performing, robust ParT model that is also quicker at inference.
One of the most comprehensive overviews of the new tagger, along with the developments that went into it, was given recently: https://indico.cern.ch/event/1218506/#6-particle-transformer-run-3-t
The new concept we want to introduce in RecoBTag is adversarial training. It is explained in the jet tagging context in https://doi.org/10.1007/s41781-022-00087-1, and the first full CMS application is documented in https://cds.cern.ch/record/2839919. Implications and advantages have already been studied for earlier versions of ParTAK4 (e.g. https://indico.cern.ch/event/1218501/#4-adversarial-training-for-par, https://indico.cern.ch/event/1218499/#5-adversarial-training-for-par). More (written) documentation will follow.
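For readers unfamiliar with the idea, here is a minimal sketch of one common way to build adversarial inputs, the Fast Gradient Sign Method (FGSM). It is an illustration only: the toy logistic model, the weights, the inputs and the `epsilon` value are all assumptions made up for this example, and it is not the training code used for ParTAK4. In adversarial training, the model is trained on such perturbed inputs rather than (or in addition to) the nominal ones.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def bce_loss(w, b, x, y):
    """Binary cross-entropy of a toy logistic model for one example."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def fgsm_perturb(w, b, x, y, epsilon):
    """Shift each input by +/- epsilon in the direction that increases the loss."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    grad_x = [(p - y) * wi for wi in w]
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]


# Hypothetical toy values, chosen only for illustration.
w = [0.8, -1.2, 0.5]
b = 0.1
x = [1.0, 0.3, -0.7]
y = 1
epsilon = 0.05

x_adv = fgsm_perturb(w, b, x, y, epsilon)
loss_nominal = bce_loss(w, b, x, y)
loss_adversarial = bce_loss(w, b, x_adv, y)
```

The perturbed input stays within `epsilon` of the nominal one in every feature, yet the loss goes up; a model trained on such inputs is pushed to be less sensitive to small input distortions.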
Collaboration with @AlexDeMoor
PR validation:
The comparisons between CMSSW and PyTorch were run recently; the results are presented here: https://indico.cern.ch/event/1272115/#1-news
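As a rough illustration of what such a framework-to-framework comparison can look like, the sketch below checks that two sets of per-jet discriminator scores agree within a tolerance. The helper name `max_abs_diff`, the tolerance and the score values are hypothetical and made up for this example; they are not results from this PR, and the actual procedure is the one described in the linked slides.

```python
def max_abs_diff(reference, candidate):
    """Largest absolute difference between two equally long score lists."""
    if len(reference) != len(candidate):
        raise ValueError("score lists must have the same length")
    return max(abs(r - c) for r, c in zip(reference, candidate))


# Made-up illustrative numbers, not results from this PR.
pytorch_scores = [0.9132, 0.0418, 0.5071]
cmssw_scores = [0.9131, 0.0419, 0.5071]

tolerance = 1e-3  # assumed threshold for this example
agrees = max_abs_diff(pytorch_scores, cmssw_scores) <= tolerance
```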
A set of basic checks was run:
Custom tests (which we think should not go into the official repo, but are available if needed: AnnikaStein@07aac71) were run for the AOD -> MiniAOD step ("step1_PAT.py") and for the Mini -> Nano step ("step1_NANO.py"). Both ran without errors and added the new branches as expected.
If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:
This is for master, but a backport to 130X is planned.