New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bin lookup in the JetCorrectorParameters class #17812
Bin lookup in the JetCorrectorParameters class #17812
Conversation
…ds which holds the correct parameters was using a linear search through all of the possible bins. Not only that, but inside of that loop was an additional loop through all of the binned variables. This is a O(N^2) process. For one binned dimension this is fine and maybe even acceptable for two dimensions, but the slowdown was huge for three dimensions of binned variables. There is now a new way of searching for the right mRecords bin when there are three binned dimensions. This is a O(ln(n_d1)+ln(n_d2)+ln(n_d3)+2) process where n_di is the number of bins in each dimension. Dimensions d<3 must have overlapping bin boundaries (although they don't need the same exact number of bins). The last dimension (3) can have bin boundaries which aren't sharred among the (d1,d2) bins. In addition to this change there was a bug fix to how the code reads the upper and lower bounds of the binned dimensions.
…nd remove extraneous comments.
…function. Create a new helper class called JetCorrectorParametersHelper and put most of the new code in there. This version of the code will not compile due to a serialization error.
…d remove an extraneous std::endl.
…rrectorParameters class. Also untemplate the JetCorrectorParametersHelper class and set a maximum number of binned variables to 3.
A new Pull Request was created by @aperloff (Alexx Perloff) for master. It involves the following packages: CondFormats/JetMETObjects @ggovi, @cmsbuild, @monttj, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here #13028 |
@cmsbuild please test |
The tests are being triggered in jenkins. |
-1 Tested at: c7427a4 You can see the results of the tests here: I found follow errors while testing this PR Failed tests: UnitTests RelVals AddOn
I found errors in the following unit tests: ---> test runtestPhysicsToolsPatAlgos had ERRORS
When I ran the RelVals I found an error in the following worklfows: runTheMatrix-results/5.1_TTbar+TTbarFS+HARVESTFS/step1_TTbar+TTbarFS+HARVESTFS.log140.53 step2 runTheMatrix-results/140.53_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI/step2_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM.log1003.0 step2 runTheMatrix-results/1003.0_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM/step2_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM.log1001.0 step2 runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5/step2_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5.log1000.0 step2 runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step2_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log136.731 step2 runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step2_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log4.53 step3 runTheMatrix-results/4.53_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT/step3_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT.log135.4 step1 runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step1_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log1306.0 step2 runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step2_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log1330.0 step2 runTheMatrix-results/1330.0_ZMM_13+ZMM_13+DIGIUP15+RECOUP15+HARVESTUP15/step2_ZMM_13+ZMM_13+DIGIUP15+RECOUP15+HARVESTUP15.log10021.0 step2 runTheMatrix-results/10021.0_TenMuE_0_200+TenMuE_0_200_pythia8_2017_GenSimFull+DigiFull_2017+RecoFull_2017+ALCAFull_2017+HARVESTFull_2017/step2_TenMuE_0_200+TenMuE_0_200_pythia8_2017_GenSimFull+DigiFull_2017+RecoFull_2017+ALCAFull_2017+HARVESTFull_2017.log20034.0 step3 runTheMatrix-results/20034.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D7_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D7+RecoFullGlobal_2023D7+HARVESTFullGlobal_2023D7/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D7_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D7+RecoFullGlobal_2023D7+HARVESTFullGlobal_2023D7.log25202.0 step2 runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25/step2_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25.log10024.0 step2 runTheMatrix-results/10024.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+ALCAFull_2017+HARVESTFull_2017/step2_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+ALCAFull_2017+HARVESTFull_2017.log10824.0 step3 runTheMatrix-results/10824.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+ALCAFull_2018+HARVESTFull_2018/step3_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST.log9.0 step3 runTheMatrix-results/9.0_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST/step3_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST.log25.0 step3 runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step3_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log50202.0 step3 runTheMatrix-results/50202.0_TTbar_13+TTbar_13+DIGIUP15_PU50+RECOUP15_PU50+HARVESTUP15_PU50/step3_TTbar_13+TTbar_13+DIGIUP15_PU50+RECOUP15_PU50+HARVESTUP15_PU50.log21234.0 step3 runTheMatrix-results/21234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D4_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D4+RecoFullGlobal_2023D4+HARVESTFullGlobal_2023D4/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D4_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D4+RecoFullGlobal_2023D4+HARVESTFullGlobal_2023D4.log23234.0 step3 runTheMatrix-results/23234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D8_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D8+RecoFullGlobal_2023D8+HARVESTFullGlobal_2023D8/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023D8_GenSimHLBeamSpotFull14+DigiFullTrigger_2023D8+RecoFullGlobal_2023D8+HARVESTFullGlobal_2023D8.log
I found errors in the following addon tests: cmsDriver.py TTbar_8TeV_TuneCUETP8M1_cfi --conditions auto:run1_mc --fast -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,EI,VALIDATION --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot Realistic8TeVCollision : FAILED - time: date Tue Mar 7 22:44:31 2017-date Tue Mar 7 22:41:50 2017 s - exit: 35584 |
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
The tests are being triggered in jenkins. |
Comparison job queued. |
Comparison is ready There are some workflows for which there are errors in the baseline: |
@@ -38,7 +38,8 @@ float SimpleJetCorrector::correction(const std::vector<float>& fX,const std::vec | |||
float result = 1.; | |||
float tmp = 0.0; | |||
float cor = 0.0; | |||
int bin = mParameters.binIndex(fX); | |||
int bin = -1; | |||
bin = (fX.size()<=3 && fX.size()>0) ? mParameters.binIndexN(fX) : mParameters.binIndex(fX); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a quick question.
Isn't is necessary to have this conditional statement?
I don't see the difference.
Regards,
Taejeong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first case uses the new, efficient algorithm binIndexN
(note the N
at the end); the second case uses the old, brute-force algorithm binIndex
.
The new algorithm requires the maximum number of dimensions to be defined at compile time, so this logic is necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to second what @kpedro88 said. An additional reason is that we want to use the new algorithm for all the dimensions it has been tested for (1,2,3), but right now we are using the original, brute-force algorithm as a catch-all for untested numbers of dimensions. I would expect the new algorithm, binIndexN, to have much better performance than this original, so this limit can always be raised/removed if and when we find it necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I am sorry I missed "N" at the end.
Thank you for the details.
Taejeong
+1 |
@ggovi please sign |
+1 |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @Muzaffar, @davidlange6, @smuzaffar |
+1 |
if (value < min || value > max) | ||
{ | ||
std::stringstream sserr; | ||
sserr<<"Value for dimension "<<dim<<" is outside of the bin boundaries"<<std::endl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HI @aperloff - looks like our testing logs are full of this message. Presumably its a sign of a problem missed in validation, so we'll revert this PR for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @davidlange6 - This is not a new error mode. The same issue (being outside the bin boundaries) happened in the original binIndex function. The only difference here is that in the new code an error message is printed. So while this is annoying for the log files, there is no change in the functionality of the binIndex(N) codes.
I've commented out this error message and committed the code to my cmssw repo/branch. How would you like me to proceed? Should I do a new pull request and reference this one and the revert? I'm not sure of the procedure now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @aperloff - just make a new pull request that includes this one + your fix. Thanks! (4MB/event of printout is a bit more than "annoying"...)
Summary:
Problem:
The more binned parameters you add to the condition, the slower the correction lookup is. As a concrete example, when you go from being binned in (eta) to being binned in (eta,pt) the condition goes from from 82 records in the lookup to almost 3000 records. If moving from being binned in (eta) to being binned in (eta,rho,pt), you now have 64000 records. Before the fix this slowed down the lookup table substantially.
Solution/Benchmarking:
With the changes there is a speedup for every jet correction lookup, even with one dimension of binning. For 1D corrections (L2L3Residual like) we see a speedup of ~2.3x (assuming binned with 82 eta bins), for 2D (L2L3 MC truth like) we see ~96.6x, and for 3D we see a speedup of ~12000x . All of these were measured using igprof. Also beneficial is the fact that there should be no noticeable change in behavior for the users or to any existing code. Everything is backwards compatible and requires no change to the database (i.e. all of the new code was made transient).
Documentation:
All of this work is documented in a report I gave to the JERC group (https://indico.cern.ch/event/617090/contributions/2490770/attachments/1420000/2175622/2017_02_23-Alexx-JetCorrectorParameters.pdf).
@kpedro88 @schoef