Fix module run order consistency check #15740
Fixed a bug in the check that the module schedule is consistent, which compares the ordering of modules on Paths with the dependencies between modules implied by their consumes calls. The old code did not handle consumesMany and sometimes missed inconsistencies when multiple Paths were involved. The new code no longer uses the old modulesDependentUpon method and instead uses the more accurate modulesWhoseProductsAreConsumed.
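To illustrate why consumesMany matters for the dependency calculation, here is a minimal Python sketch, not the framework's C++ code. It assumes a hypothetical registry of (module label, type name) records; the function borrows the name modulesWhoseProductsAreConsumed from the commit message, but its signature and every label below are invented for illustration. It also assumes, as a simplification, that a module is never listed as its own dependency.

```python
from collections import namedtuple

# Hypothetical record of one registered product: the producing module and its type name.
Product = namedtuple("Product", ["module", "type_name"])


def modules_whose_products_are_consumed(consumer, explicit, many_types, registry):
    """Return the set of module labels that `consumer` depends on.

    consumer   -- label of the consuming module (hypothetical)
    explicit   -- module labels named directly in consumes calls
    many_types -- type names requested through consumesMany
    registry   -- every Product known to the process
    """
    deps = set(explicit)
    for product in registry:
        # A consumesMany request matches every producer of the requested type.
        if product.type_name in many_types:
            deps.add(product.module)
    # Simplifying assumption of this sketch: a module is not its own dependency.
    deps.discard(consumer)
    return deps


registry = [
    Product("tracksA", "TrackCollection"),
    Product("tracksB", "TrackCollection"),
    Product("jets", "JetCollection"),
]
deps = modules_whose_products_are_consumed(
    "merger", {"jets"}, {"TrackCollection"}, registry
)
print(sorted(deps))  # ['jets', 'tracksA', 'tracksB']
```

A check that looked only at the explicit consumes calls would see just `jets` here and miss both track producers.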
The SewerModule wants to drop all output by default while other OutputModules want to keep them.
We now properly enforce that one cannot ask for a data product which is produced later on the same Path. This includes not being able to ask for this process' TriggerResults from a module on a Path (EndPaths are fine). These tests and modules were violating that rule.
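The same-Path rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a Path is modelled as a plain list of module labels in run order, dependencies as a dict, and the function name and all labels are hypothetical rather than taken from the framework.

```python
def find_order_violations(path, consumes):
    """Return (consumer, producer) pairs where the producer is later on the Path.

    path     -- list of module labels in run order
    consumes -- dict mapping a module label to the labels it consumes from
    """
    position = {label: i for i, label in enumerate(path)}
    violations = []
    for label in path:
        for producer in consumes.get(label, ()):
            # Producers that are not on this Path are not constrained by its order.
            if producer in position and position[producer] > position[label]:
                violations.append((label, producer))
    return violations


# 'filterA' asks for data from 'producerB', which is scheduled after it.
bad_path = ["filterA", "producerB"]
good_path = ["producerB", "filterA"]
consumes = {"filterA": ["producerB"]}

print(find_order_violations(bad_path, consumes))   # [('filterA', 'producerB')]
print(find_order_violations(good_path, consumes))  # []
```

Swapping the two modules on the Path, as in `good_path`, is the kind of fix the affected tests needed.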
A new Pull Request was created by @Dr15Jones (Chris Jones) for CMSSW_8_1_X. It involves the following packages: FWCore/Framework. @cmsbuild, @smuzaffar, @Dr15Jones, @davidlange6 can you please review it and eventually sign? Thanks. cms-bot commands are listed here #13028
please test
+1
The tests are being triggered in jenkins.
This pull request is fully signed and it will be integrated in one of the next CMSSW_8_1_X IBs after it passes the integration tests. This pull request requires discussion in the ORP meeting before it's merged. @slava77, @davidlange6, @smuzaffar
-1
Tested at: 4b9e0a2
The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:
You can see the results of the tests here:
I found the following errors while testing this PR
Failed tests: UnitTests RelVals AddOn
I found errors in the following unit tests: ---> test TestFWCoreFrameworkGlobalStreamOne had ERRORS
When I ran the RelVals I found an error in the following workflows:
runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step2_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log
5.1 step1 runTheMatrix-results/5.1_TTbar+TTbarFS+HARVESTFS/step1_TTbar+TTbarFS+HARVESTFS.log
140.53 step2 runTheMatrix-results/140.53_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI/step2_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI.log
136.731 step2 runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_HIPM+HARVESTDR2/step2_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_HIPM+HARVESTDR2.log
8.0 step3 runTheMatrix-results/8.0_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS/step3_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS.log
1306.0 step2 runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step2_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log
1000.0 step2 runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step2_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log
1001.0 step2 runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5/step2_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5.log
1330.0 step2 runTheMatrix-results/1330.0_ZMM_13+ZMM_13+DIGIUP15+RECOUP15+HARVESTUP15/step2_ZMM_13+ZMM_13+DIGIUP15+RECOUP15+HARVESTUP15.log
1003.0 step2 runTheMatrix-results/1003.0_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM/step2_RunMinBias2012A+RunMinBias2012A+RECODDQM+HARVESTDDQM.log
135.4 step1 runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step1_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log
10021.0 step3 runTheMatrix-results/10021.0_TenMuE_0_200+TenMuE_0_200_pythia8_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017/step3_TenMuE_0_200+TenMuE_0_200_pythia8_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017.log
9.0 step3 runTheMatrix-results/9.0_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST/step3_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST.log
25.0 step3 runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step3_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log
25202.0 step2 runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25/step2_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25.log
10424.0 step3 runTheMatrix-results/10424.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2023D1_GenSimFull+DigiFull_2023D1+RecoFullGlobal_2023D1+HARVESTFullGlobal_2023D1/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2023D1_GenSimFull+DigiFull_2023D1+RecoFullGlobal_2023D1+HARVESTFullGlobal_2023D1.log
50202.0 step3 runTheMatrix-results/50202.0_TTbar_13+TTbar_13+DIGIUP15_PU50+RECOUP15_PU50+HARVESTUP15_PU50/step3_TTbar_13+TTbar_13+DIGIUP15_PU50+RECOUP15_PU50+HARVESTUP15_PU50.log
10024.0 step3 runTheMatrix-results/10024.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017.log
11224.0 step3 runTheMatrix-results/11224.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2023D3_GenSimFull+DigiFull_2023D3+RecoFullGlobal_2023D3+HARVESTFullGlobal_2023D3/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2023D3_GenSimFull+DigiFull_2023D3+RecoFullGlobal_2023D3+HARVESTFullGlobal_2023D3.log
I found errors in the following addon tests:
cmsDriver.py TTbar_8TeV_TuneCUETP8M1_cfi --conditions auto:run1_mc --fast -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,EI,HLT:@Fake,VALIDATION --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot Realistic8TeVCollision : FAILED - time: date Tue Sep 6 07:43:30 2016-date Tue Sep 6 07:42:16 2016 s - exit: 20736
Moved the implementation of determining if the Path order and module data product dependencies are consistent to its own files. This allows better sharing as well as the opportunity for standalone tests.
Corrected problems in the algorithm used to find when module interdependencies clash with Path specifications. Added a unit test to allow quick checking of the algorithm.
The parameter splitLevel is not used by DQMRootOutputModule but configurations in the RelVal are failing because they were originally written to use PoolOutputModule which does use that parameter. By adding this parameter we provide backwards compatibility.
Pull request #15740 was updated. @smuzaffar, @Dr15Jones, @dmitrijus, @cmsbuild, @vanbesien, @davidlange6 can you please check and sign again.
please test
The tests are being triggered in jenkins.
please test
The tests are being triggered in jenkins.
+1
@davidlange6 @dmitrijus @vanbesien the DQM changes do not change the behavior of the DQMRootOutputModule at all. The changes only tell the framework the proper information it needs to work correctly now.
ping
ping
The framework requires that a module earlier on a Path cannot request data from a module later on the same Path. It also requires that no modules form a data dependency loop. Unfortunately, the code which was intended to catch these problems was incomplete. This change should fix those oversights.
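Both requirements can be viewed as one question: does the combined graph of data dependencies and Path ordering contain a cycle? The Python sketch below is an illustration of that idea using a depth-first search, not the algorithm implemented in this pull request. It assumes Paths are plain lists of labels and that each module on a Path must wait for the module just before it; the function name and all labels are hypothetical.

```python
def find_cycle(paths, consumes):
    """Return one list of module labels forming a cycle, or None.

    An edge u -> v means "u needs v to have run first". Edges come from
    data dependencies and from each module following its predecessor on a Path.
    """
    graph = {}
    for consumer, producers in consumes.items():
        graph.setdefault(consumer, set()).update(producers)
    for path in paths:
        for earlier, later in zip(path, path[1:]):
            graph.setdefault(later, set()).add(earlier)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS stack, finished
    color = {}
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # Reached a node already on the stack: that closes a loop.
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Each Path is fine on its own, but together they form a loop.
p1 = ["A", "B"]
p2 = ["C", "D"]
consumes = {"A": ["D"], "C": ["B"]}

print(find_cycle([p1], consumes))      # None
print(find_cycle([p2], consumes))      # None
print(find_cycle([p1, p2], consumes))  # ['A', 'D', 'C', 'B', 'A']
```

The example shows the multi-Path case: neither Path is inconsistent alone, so a check that examines Paths one at a time would not report the loop.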
These rules must be properly enforced in order to proceed with the next step of the threaded framework (which will allow multiple Paths to run simultaneously).