Electron MVA V2 and general code changes #23473
The code-checks are being triggered in jenkins. |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23473/5020 Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23473/5020/git-diff.patch You can run |
The code-checks are being triggered in jenkins. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-23473/5023 |
A new Pull Request was created by @guitargeek (Jonas Rembser) for master. It involves the following packages: RecoEgamma/EgammaTools @perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
@cmsbuild please test with cms-sw/cmsdist#4056 |
The tests are being triggered in jenkins. |
-1 Tested at: fd07032 You can see the results of the tests here: I found the following errors while testing this PR Failed tests: UnitTests RelVals
I found errors in the following unit tests: ---> test runtestRecoEgammaElectronIdentification had ERRORS
When I ran the RelVals I found an error in the following workflows:
4.53 step3 runTheMatrix-results/4.53_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT/step3_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT.log
136.7611 step2 runTheMatrix-results/136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM/step2_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM.log
136.731 step3 runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log
136.8311 step2 runTheMatrix-results/136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017/step2_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017.log
135.4 step3 runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step3_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log
136.788 step3 runTheMatrix-results/136.788_RunSinglePh2017B+RunSinglePh2017B+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017/step3_RunSinglePh2017B+RunSinglePh2017B+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017.log
9.0 step3 runTheMatrix-results/9.0_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST/step3_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST.log
1325.7 step2 runTheMatrix-results/1325.7_TTbar_13_94Xv2NanoAODINPUT+TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2/step2_TTbar_13_94Xv2NanoAODINPUT+TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2.log
25.0 step3 runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step3_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log
1306.0 step3 runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step3_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log
1330.0 step3 runTheMatrix-results/1330.0_ZMM_13+ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM/step3_ZMM_13+ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM.log
136.85 step3 runTheMatrix-results/136.85_RunEGamma2018A+RunEGamma2018A+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Prompt_L1TEgDQM+HARVEST2018_L1TEgDQM/step3_RunEGamma2018A+RunEGamma2018A+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Prompt_L1TEgDQM+HARVEST2018_L1TEgDQM.log
10042.0 step3 runTheMatrix-results/10042.0_ZMM_13+ZMM_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017/step3_ZMM_13+ZMM_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017.log
25202.0 step3 runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25/step3_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25.log
10024.0 step3 runTheMatrix-results/10024.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017.log
10224.0 step3 runTheMatrix-results/10224.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU.log
10824.0 step3 runTheMatrix-results/10824.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018.log
11624.0 step3 runTheMatrix-results/11624.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2019_GenSimFull+DigiFull_2019+RecoFull_2019+HARVESTFull_2019+ALCAFull_2019/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2019_GenSimFull+DigiFull_2019+RecoFull_2019+HARVESTFull_2019+ALCAFull_2019.log
20034.0 step3 runTheMatrix-results/20034.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D17_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D17+RecoFullGlobal_2023D17+HARVESTFullGlobal_2023D17/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D17_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D17+RecoFullGlobal_2023D17+HARVESTFullGlobal_2023D17.log
21234.0 step3 runTheMatrix-results/21234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D21_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D21+RecoFullGlobal_2023D21+HARVESTFullGlobal_2023D21/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D21_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D21+RecoFullGlobal_2023D21+HARVESTFullGlobal_2023D21.log
20434.0 step3 runTheMatrix-results/20434.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D19_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D19+RecoFullGlobal_2023D19+HARVESTFullGlobal_2023D19/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D19_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D19+RecoFullGlobal_2023D19+HARVESTFullGlobal_2023D19.log
250202.181 step4 runTheMatrix-results/250202.181_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25/step4_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25.log |
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
Comparison job queued. |
Comparison is ready Comparison Summary:
|
I would like to understand the new eleID output as it got updated by this PR, e.g. https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_10_2_X_2018-06-22-2300+23473/27222/validateJR/all_mini_OldVSNew_RunSinglePh2017B136p788/
Overall: could you please check the output, confirm that it corresponds to what you expect, and possibly also explain here what you actually expect? |
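Hi Andrea, thanks for pointing out these plots. Yes, 7 new IDs is what I expected. The iso V2 has 4 working points because it includes the HZZ working point: https://github.com/guitargeek/cmssw/blob/ElectronID_MVA2017_V2_102X/RecoEgamma/ElectronIdentification/python/Identification/mvaElectronID_Fall17_iso_V2_cff.py#L80 I didn't include it in NanoAOD though because the HZZ analysis doesn't really need it. What is unexpected for me is that some of the old IDs changed; this should not be the case. At least the MVA values didn't change, otherwise we would have seen that too. I think it's just related to the way the IDs are ordered in MiniAOD: https://github.com/cms-sw/cmssw/pull/23473/files#diff-76dede28e7751d7bb6c5c9542de4155cR282 I'm still trying to understand what's going on exactly, but how are the IDs ordered in MiniAOD? Alphabetically, or in the order in which they appear in miniAOD_tools.py? I'm investigating, but I anticipate that I will come to the conclusion that the IDs just got reshuffled. |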
On 6/25/18 3:30 PM, Jonas Rembser wrote:
Hi Andrea, thanks for pointing out these plots. Yes, 7 new IDs is what I
expected. The iso V2 has 4 working points because it's including the HZZ
working point:
https://github.com/guitargeek/cmssw/blob/ElectronID_MVA2017_V2_102X/RecoEgamma/ElectronIdentification/python/Identification/mvaElectronID_Fall17_iso_V2_cff.py#L80
I didn't include it in NanoAOD though because the HZZ analysis doesn't
really need it.
What is unexpected for me is that some of the old IDs got changed, this
should not be the case. At least the MVA values didn't change, otherwise
we would have seen that too. I think it's just related to the way the
IDs are ordered in MiniAOD:
https://github.com/cms-sw/cmssw/pull/23473/files#diff-76dede28e7751d7bb6c5c9542de4155cR282
I'm still trying to understand what's going on exactly, but how are the
IDs ordered in MiniAOD? Alphabetically, or in the order in which they
appear in miniAOD_tools.py? I'm investigating, but I anticipate that I
will come to the conclusion that the IDs just got reshuffled.
The order is most likely the same as the order in which they are inserted
into the list of IDs.
So, if the new IDs were inserted in the middle, everything after these
new ones will look different in the fwlite-based plots.
|
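The effect described in the reply above can be illustrated with a minimal sketch. It assumes the comparison plots match IDs slot by slot (by position) rather than by name; the ID names and the helper `positional_mismatches` are hypothetical and not taken from the PR or the comparison tooling.

```python
def positional_mismatches(old_ids, new_ids):
    """Return the slot indices where a position-by-position comparison disagrees."""
    mismatches = []
    for index in range(max(len(old_ids), len(new_ids))):
        # A slot that exists in only one list is treated as a disagreement.
        old = old_ids[index] if index < len(old_ids) else None
        new = new_ids[index] if index < len(new_ids) else None
        if old != new:
            mismatches.append(index)
    return mismatches


# Hypothetical ID names, for illustration only.
old_ids = ["id_a", "id_b", "id_c", "id_d"]
new_ids = ["id_a", "id_b", "id_new", "id_c", "id_d"]

# One insertion in the middle makes every later slot look different,
# even though no existing ID changed its value.
print(positional_mismatches(old_ids, new_ids))  # [2, 3, 4]
```

Under this assumption, a single inserted ID is enough to flag all following slots as changed, which is consistent with the reshuffling suspected in the thread.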
I didn't know that we do this kind of comparison, otherwise I wouldn't have inserted them in the middle... Slava, the reason why I'm alarmed is that the number of IDs before the insertion is 1+4+4+3+3 = 15, while in the plots the first 16 IDs agree, which is one too many. So either one additional ID agrees by chance (unlikely), or there is something going on which I don't understand. |
Do you have an edmConfigDump for this module (the one which fills the IDs)
before and after?
|
I don't know which module I should run edmConfigDump on, since miniAOD_tools.py is not a cfg file... Can you please help me out and tell me more specifically what you'd do? Otherwise I'll just try to figure it out myself. |
The easiest way is to rerun the workflow discussed here with -j 0
(runTheMatrix.py -l xxx -j 0)
and then run edmConfigDump on the corresponding config that runs PAT.
|
OK, thank you, but in the meantime I already dumped the pairs directly from there: Let's check this file here from the tests: So if I loop over the vector<pair<string, float>> electronIDs and print it out I get:
So it's indeed alphabetically sorted and the new IDs were inserted exactly in the place that we anticipated from the plots. Here is my code for reproducing:
For comparison, here is the same thing for the equivalent file from a random PR where the electron ID was not changed:
|
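The conclusion that the IDs are sorted alphabetically can be illustrated with a small sketch. It is not the code from the comment above: the ID names and the helpers `insertion_slots` and `leading_agreement` are hypothetical, and the only assumption carried over from the thread is that the stored list is ordered by name rather than by position in the configuration.

```python
import bisect


def insertion_slots(old_names, added_names):
    """Map each added name to the index it would take in the sorted old list."""
    old_sorted = sorted(old_names)
    return {name: bisect.bisect_left(old_sorted, name) for name in added_names}


def leading_agreement(old_names, new_names):
    """Count how many leading slots agree once both lists are sorted by name."""
    count = 0
    for old, new in zip(sorted(old_names), sorted(new_names)):
        if old != new:
            break
        count += 1
    return count


# Hypothetical ID names, listed in an arbitrary "config" order.
old_names = ["other_id", "mva_V1_wp90", "cutBased_tight",
             "mva_V1_wp80", "cutBased_loose"]
added_names = ["mva_V2_wp80", "mva_V2_wp90"]

print(insertion_slots(old_names, added_names))
# {'mva_V2_wp80': 4, 'mva_V2_wp90': 4}
print(leading_agreement(old_names, old_names + added_names))
# 4
```

With name-based ordering, the number of leading slots that still agree is set by where the new names fall alphabetically, not by how many IDs precede them in the configuration file, which is one way a count such as 16 instead of 15 can arise.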
Thank you @guitargeek and @slava77 : everything seems to be as it should, then! |
+1
|
merge |
The changes go together with cms-data/RecoEgamma-ElectronIdentification#9.
It may be of interest to @Sam-Harper, @lsoffi, @michelif, @UAEDF-tomc.