
Updated root to tip of branch v6-22-00-patches #6314

Conversation

mrodozov
Contributor

please test

@cmsbuild
Contributor

cmsbuild commented Oct 14, 2020

The tests are being triggered in jenkins.

@cmsbuild
Contributor

A new Pull Request was created by @mrodozov (Mircho Rodozov) for branch IB/CMSSW_11_2_X/rootnext.

@cmsbuild, @smuzaffar, @mrodozov can you please review it and eventually sign? Thanks.
cms-bot commands are listed here

@cmsbuild
Contributor

-1

Tested at: 540e8a6

CMSSW: CMSSW_11_2_ROOT622_X_2020-10-13-2300
SCRAM_ARCH: slc7_amd64_gcc820
You can see the results of the tests here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-997530/9940/summary.html

I found the following errors while testing this PR:

Failed tests: UnitTests RelVals AddOn

  • Unit Tests:

I found errors in the following unit tests:

---> test testAlignmentOfflineValidation had ERRORS
---> test testCalibTrackerSiStripCommon had ERRORS
---> test testSSTGainPCL_fromRECO had ERRORS
---> test CalibCalorimetryEcalLaserSortingRunStreamer had ERRORS
---> test DQMServicesStreamerIORunStreamer had ERRORS
---> test TestIntegrationSeriesOfProcesses had ERRORS
---> test TestIntegrationThinningTests had ERRORS
---> test NewStreamerUNCOMPRESSED had ERRORS
---> test NewStreamerLZMA had ERRORS
---> test NewStreamerZLIB had ERRORS
---> test NewStreamerZSTD had ERRORS
---> test testJetMETCorrectionsType1MET had ERRORS
---> test runtestPhysicsToolsPatAlgos had ERRORS
---> test runtestUtilAlgos had ERRORS
---> test runtestRecoEgammaPhotonIdentification had ERRORS
---> test runtestRecoEgammaElectronIdentification had ERRORS
---> test runtestTqafExamples had ERRORS
---> test runtestTqafTopEventProducers had ERRORS
---> test runtestTqafTopHitFit had ERRORS
---> test runtestTqafTopEventSelection had ERRORS
---> test TestPoolInput had ERRORS
---> test runtestTqafTopKinFitter had ERRORS
---> test runtestTqafTopJetCombination had ERRORS
---> test runtestTqafTopSkimming had ERRORS
---> test runtestTqafTopTools had ERRORS
---> test testPhase2PixelNtuple had ERRORS
---> test checkMultiRunHarvestingOutput had ERRORS
---> test testPVPlotting had ERRORS
---> test testRecoMETMETProducers had ERRORS
---> test testTauEmbeddingProducers had ERRORS

  • RelVals:

When I ran the RelVals I found an error in the following workflows:
136.88811 step2

runTheMatrix-results/136.88811_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL/step2_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL.log

4.22 step2
runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step2_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log

136.7611 step2
runTheMatrix-results/136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM/step2_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM.log

136.8311 step2
runTheMatrix-results/136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017/step2_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017.log

8.0 step3
runTheMatrix-results/8.0_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS/step3_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS.log

140.53 step2
runTheMatrix-results/140.53_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI/step2_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI.log

158.01 step2
runTheMatrix-results/158.01_HydjetQ_reminiaodPbPb2018_INPUT+HydjetQ_reminiaodPbPb2018_INPUT+REMINIAODHI2018PPRECO+HARVESTHI2018PPRECOMINIAOD/step2_HydjetQ_reminiaodPbPb2018_INPUT+HydjetQ_reminiaodPbPb2018_INPUT+REMINIAODHI2018PPRECO+HARVESTHI2018PPRECOMINIAOD.log

136.731 step3
runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log

136.793 step3
runTheMatrix-results/136.793_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017/step3_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017.log

140.56 step2
runTheMatrix-results/140.56_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18/step2_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18.log

136.874 step3
runTheMatrix-results/136.874_RunEGamma2018C+RunEGamma2018C+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM/step3_RunEGamma2018C+RunEGamma2018C+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM.log

135.4 step3
runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step3_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log

7.3 step3
runTheMatrix-results/7.3_CosmicsSPLoose_UP18+CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18/step3_CosmicsSPLoose_UP18+CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18.log

158.0 step2
runTheMatrix-results/158.0_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO/step2_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO.log

1001.0 step3
runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5/step3_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5.log

1000.0 step3
runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step3_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log

10042.0 step3
runTheMatrix-results/10042.0_ZMM_13+2017+ZMM_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_ZMM_13+2017+ZMM_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log

25.0 step5
runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step5_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log

11634.0 step2
runTheMatrix-results/11634.0_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/step2_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA.log

10024.0 step3
runTheMatrix-results/10024.0_TTbar_13+2017+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_TTbar_13+2017+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log

12434.0 step2
runTheMatrix-results/12434.0_TTbar_14TeV+2023+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/step2_TTbar_14TeV+2023+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA.log

10824.0 step3
runTheMatrix-results/10824.0_TTbar_13+2018+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_TTbar_13+2018+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log

10224.0 step3
runTheMatrix-results/10224.0_TTbar_13+2017PU+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoFakeHLTPU+HARVESTFakeHLTPU+Nano/step3_TTbar_13+2017PU+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoFakeHLTPU+HARVESTFakeHLTPU+Nano.log

250202.181 step4
runTheMatrix-results/250202.181_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25/step4_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25.log

  • AddOn:

I found errors in the following addon tests:

cmsRun /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/PhysicsTools/PatAlgos/test/IntegrationTest_cfg.py : FAILED - time: date Wed Oct 14 15:22:26 2020-date Wed Oct 14 15:22:11 2020 s - exit: 22016
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_Fake2.py realData=True globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:22:57 2020-date Wed Oct 14 15:22:13 2020 s - exit: 21504
cmsDriver.py RelVal -s HLT:Fake2,RAW2DIGI,L1Reco,RECO --data --scenario=pp -n 10 --conditions auto:run2_data_Fake2 --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run2_2016 --processName=HLTRECO --filein file:RelVal_Raw_Fake2_DATA.root --fileout file:RelVal_Raw_Fake2_DATA_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:22:57 2020-date Wed Oct 14 15:22:13 2020 s - exit: 21504
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_PRef.py realData=True globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:23:11 2020-date Wed Oct 14 15:22:16 2020 s - exit: 21504
cmsDriver.py RelVal -s HLT:PRef,RAW2DIGI,L1Reco,RECO --data --scenario=pp -n 10 --conditions auto:run3_data_PRef --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3 --processName=HLTRECO --filein file:RelVal_Raw_PRef_DATA.root --fileout file:RelVal_Raw_PRef_DATA_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:23:11 2020-date Wed Oct 14 15:22:16 2020 s - exit: 21504
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_HIon.py realData=True globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:23:21 2020-date Wed Oct 14 15:22:21 2020 s - exit: 21504
cmsDriver.py RelVal -s HLT:HIon,RAW2DIGI,L1Reco,RECO --data --scenario=pp -n 10 --conditions auto:run3_data_HIon --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3_pp_on_PbPb --processName=HLTRECO --filein file:RelVal_Raw_HIon_DATA.root --fileout file:RelVal_Raw_HIon_DATA_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:23:21 2020-date Wed Oct 14 15:22:21 2020 s - exit: 21504
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_GRun.py realData=False globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:28:10 2020-date Wed Oct 14 15:22:25 2020 s - exit: 22016
cmsDriver.py RelVal -s HLT:GRun,RAW2DIGI,L1Reco,RECO --mc --scenario=pp -n 10 --conditions auto:run3_mc_GRun --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3 --processName=HLTRECO --filein file:RelVal_Raw_GRun_MC.root --fileout file:RelVal_Raw_GRun_MC_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:28:10 2020-date Wed Oct 14 15:22:25 2020 s - exit: 22016
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_PRef.py realData=False globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:27:36 2020-date Wed Oct 14 15:22:32 2020 s - exit: 22016
cmsDriver.py RelVal -s HLT:PRef,RAW2DIGI,L1Reco,RECO --mc --scenario=pp -n 10 --conditions auto:run3_mc_PRef --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3 --processName=HLTRECO --filein file:RelVal_Raw_PRef_MC.root --fileout file:RelVal_Raw_PRef_MC_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:27:36 2020-date Wed Oct 14 15:22:32 2020 s - exit: 22016
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_PIon.py realData=True globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:23:56 2020-date Wed Oct 14 15:22:38 2020 s - exit: 21504
cmsDriver.py RelVal -s HLT:PIon,RAW2DIGI,L1Reco,RECO --data --scenario=pp -n 10 --conditions auto:run3_data_PIon --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3 --processName=HLTRECO --filein file:RelVal_Raw_PIon_DATA.root --fileout file:RelVal_Raw_PIon_DATA_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:23:56 2020-date Wed Oct 14 15:22:38 2020 s - exit: 21504
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_HIon.py realData=False globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:27:48 2020-date Wed Oct 14 15:22:41 2020 s - exit: 22016
cmsDriver.py RelVal -s HLT:HIon,RAW2DIGI,L1Reco,RECO --mc --scenario=pp -n 10 --conditions auto:run3_mc_HIon --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3_pp_on_PbPb --processName=HLTRECO --filein file:RelVal_Raw_HIon_MC.root --fileout file:RelVal_Raw_HIon_MC_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:27:48 2020-date Wed Oct 14 15:22:41 2020 s - exit: 22016
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_GRun.py realData=True globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:27:33 2020-date Wed Oct 14 15:23:02 2020 s - exit: 21504
cmsDriver.py RelVal -s HLT:GRun,RAW2DIGI,L1Reco,RECO --data --scenario=pp -n 10 --conditions auto:run3_data_GRun --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run3 --processName=HLTRECO --filein file:RelVal_Raw_GRun_DATA.root --fileout file:RelVal_Raw_GRun_DATA_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:27:33 2020-date Wed Oct 14 15:23:02 2020 s - exit: 21504
cmsRun /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw-patch/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/HLTrigger/Configuration/test/OnLine_HLT_Fake1.py realData=True globalTag=@ inputFiles=@ : FAILED - time: date Wed Oct 14 15:25:41 2020-date Wed Oct 14 15:23:06 2020 s - exit: 21504
cmsDriver.py RelVal -s HLT:Fake1,RAW2DIGI,L1Reco,RECO --data --scenario=pp -n 10 --conditions auto:run2_data_Fake1 --relval 9000,50 --datatier "RAW-HLT-RECO" --eventcontent FEVTDEBUGHLT --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --era Run2_25ns --processName=HLTRECO --filein file:RelVal_Raw_Fake1_DATA.root --fileout file:RelVal_Raw_Fake1_DATA_HLT_RECO.root : FAILED - time: date Wed Oct 14 15:25:41 2020-date Wed Oct 14 15:23:06 2020 s - exit: 21504

@cmsbuild
Contributor

Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped)

@smuzaffar
Contributor

@makortel , we are trying to update ROOT here and noticed that there are many failures like [a]. Any idea what could cause such an issue?

[a]

---- Begin Fatal Exception 14-Oct-2020 14:14:54 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing input source of type PoolSource
   [2] While initializing meta data for branch: SiPixelRecHitedmNewDetSetVector_siPixelRecHits__RECO.
   Additional Info:
      [a] Fatal Root Error: @SUB=TProtoClass::FindDataMember
data member with index 0 is not found in class boost::any

----- End Fatal Exception -------------------------------------------------

----- Begin Fatal Exception 14-Oct-2020 14:15:00 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing input source of type PoolSource
   Additional Info:
      [a] Fatal Root Error: @SUB=TProtoClass::FindDataMember
data member with index 0 is not found in class edm::TypeWithDict

----- End Fatal Exception ------------------------------------------------- 
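
As an illustrative aside, the dictionary information ROOT holds for a class named in such an error can be dumped with a short macro; this is only a minimal sketch assuming a plain interactive ROOT session in the same release area (the macro name listMembers is hypothetical):

#include "TClass.h"
#include "TDataMember.h"
#include "TList.h"
#include <cstdio>

// Print the data members ROOT knows for a class (e.g. "boost::any"), to compare
// against the member index that TProtoClass::FindDataMember fails to resolve.
void listMembers(const char* name = "boost::any") {
   if (TClass* cl = TClass::GetClass(name)) {
      TIter next(cl->GetListOfDataMembers());
      while (auto* dm = static_cast<TDataMember*>(next())) {
         std::printf("  %s (offset %ld)\n", dm->GetName(), dm->GetOffset());
      }
   }
}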

@makortel
Contributor

I have no idea. @Dr15Jones, @pcanal would you have any suggestions?

@Dr15Jones

FYI @pcanal

@smuzaffar
Contributor

By the way, we have seen the same errors while testing the ROOT master branch.

@smuzaffar
Contributor

the changes we are testing are root-project/root@d6156de...e4cd9d3

@Dr15Jones

@pcanal

pcanal commented Oct 14, 2020

There was a change in the way we iterate through the data members, so it looks like we overlooked something. @smuzaffar can you tell me how to reproduce this with a debug version of ROOT? Thanks.

@Dr15Jones

@pcanal I believe all ROOT specific builds are done with debug. Try CMSSW_11_2_ROOT6_X_2020-10-12-2300.

@smuzaffar
Contributor

@Dr15Jones , @pcanal these changes are not in the IBs; one needs to build them locally. @mrodozov can you please build (on cmsdevXX) this PR plus ROOT in debug mode and provide instructions to @pcanal so that he can test it?

@mrodozov
Contributor Author

Yes, of course.

@mrodozov
Contributor Author

@pcanal

  • log in to cmsdev20
scram p CMSSW_11_2_ROOT622_X_2020-10-13-2300
cp /build/mrodozov/root622/build_rootmaster/slc7_amd64_gcc820/lcg/root-toolfile/2.1-cms/etc/scram.d/* CMSSW_11_2_ROOT622_X_2020-10-13-2300/config/toolbox/slc7_amd64_gcc820/tools/selected/
cd CMSSW_11_2_ROOT622_X_2020-10-13-2300 ; cmsenv ; scram setup ; scram b

This way the CMSSW release is set up with ROOT 6.22 built with -DCMAKE_BUILD_TYPE=Debug.
It's still building; it should be ready in maybe 30 minutes.

@pcanal

pcanal commented Oct 15, 2020

I was able to log in and set up the build area. Could you give an example of a command line leading to the error message? Thanks.

@makortel
Contributor

E.g. runTheMatrix.py -l 11634.0, although it gives a slightly different error (in step2):

----- Begin Fatal Exception 14-Oct-2020 14:58:04 CEST-----------------------
An exception of category 'FatalRootError' occurred while
   [0] Constructing the EventProcessor
   [1] Constructing module: class=L2MuonProducer label='hltL2Muons'
   Additional Info:
      [a] Fatal Root Error: @SUB=TProtoClass::FindDataMember
data member with index 0 is not found in class tbb::internal::atomic_impl<unsigned long>

----- End Fatal Exception -------------------------------------------------

but it should not require a grid certificate to run.

@pcanal

pcanal commented Oct 15, 2020

I must have missed an important setup step:

[pcanal@cmsdev20 src]$ pwd
/build/pcanal/cms-6314/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src
[pcanal@cmsdev20 src]$ less 11634.0_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/step1_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA.log
----- Begin Fatal Exception 15-Oct-2020 16:08:07 CEST-----------------------
An exception of category 'DictionaryNotFound' occurred while
   [0] Constructing the EventProcessor
   [1] Calling ProductRegistryHelper::addToRegistry, checking dictionaries for produced types
Exception Message:
No data dictionary found for the following classes:

  edm::Wrapper<edm::TriggerResults>

What should I do to fix that?

@pcanal

pcanal commented Oct 29, 2020

I found and resolved the problem (mostly); see root-project/root#6728. This patch is sufficient to solve the problem if all enums that are stored as the key of an associative container use the default size. If some use a non-default size, we also need the (upcoming) fix for root-project/root#6725.
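
As a minimal illustration of that distinction (the enum and map names below are hypothetical, not CMSSW types):

#include <cstdint>
#include <map>

enum class DefaultKey { A, B };                // default underlying type (int), sizeof == 4
enum class NarrowKey : std::uint8_t { A, B };  // non-default underlying type, sizeof == 1

// Per the comment above, root-project/root#6728 is sufficient for keys like
// DefaultKey, while keys like NarrowKey also need the fix for root-project/root#6725.
std::map<DefaultKey, double> coveredByFirstFix;
std::map<NarrowKey, double>  alsoNeedsSecondFix;

int main() { return 0; }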

pcanal added a commit to pcanal/root that referenced this pull request Oct 30, 2020
This fixes root-project#6726

As reported by CMSSW tests (for example: cms-sw/cmsdist#6314 (comment)), where the data appear odd/corrupted, there is an issue in TStreamerInfo::GenerateInfoForPair (which is almost always used for std::pair in the tip of v6.22 and master).

The problem is that, when calculating the offset of the second data member, TStreamerInfo::GenerateInfoForPair uses (unwittingly, of course :) ) the value zero for the size of enums.
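
As a rough sketch of why a zero enum size corrupts the layout (Key and PairLike below are hypothetical stand-ins, not ROOT or CMSSW code):

#include <cstddef>
#include <cstdio>

enum class Key : unsigned short { A };  // sizeof(Key) == 2 on common ABIs

// Stand-in for the in-memory layout of std::pair<Key, double> that
// TStreamerInfo::GenerateInfoForPair has to describe when streaming.
struct PairLike {
   Key    first;
   double second;  // placed at sizeof(Key) plus alignment padding
};

int main() {
   // Taking the enum size as 0 (the bug) computes the wrong offset for 'second',
   // so the stored pair data appears odd/corrupted when read back.
   std::printf("offsetof(PairLike, second) = %zu\n", offsetof(PairLike, second));
   return 0;
}
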
mrodozov pushed a commit to cms-sw/root that referenced this pull request Oct 30, 2020
mrodozov pushed a commit to cms-sw/root that referenced this pull request Nov 2, 2020
pcanal added a commit to root-project/root that referenced this pull request Nov 2, 2020
pcanal added a commit to pcanal/root that referenced this pull request Nov 2, 2020
pcanal added a commit to root-project/root that referenced this pull request Nov 2, 2020
@pcanal

pcanal commented Nov 2, 2020

The related PR has been merged into the main branch and the v6.22 patch branch.

@smuzaffar
Contributor

Thanks @pcanal , I am testing the latest v6.22 here:
CMSSW master: #6359
CMSSW ROOT622: #6358

@smuzaffar
Contributor

@pcanal , it looks like the latest ROOT 6.22 updates broke our tests again [a] (e.g. see https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-eac189/10459/runTheMatrix-results/23234.0_TTbar_14TeV+2026D49+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal/step3_TTbar_14TeV+2026D49+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal.log ). Things were in good shape with ROOT v6.22 commit root-project/root@db0b0f7 plus your change for the non-zero enum size. But with the latest ROOT v6.22 ( https://github.com/root-project/root/commits/v6-22-00-patches ) we have a few crashes. I have integrated #6358 for our ROOT622 IBs and in a few hours we will have the full test results.

[a]

#3  0x00002af504960a59 in sig_dostack_then_abort () from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-11-02-2300/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00002af4f3e84290 in TStreamerInfo::AddReadAction (this=0x2af506a31000, readSequence=0x2af536e87740, i=0, compinfo=0x2af59072d588) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/io/io/src/TStreamerInfoActions.cxx:3317
#6  0x00002af4f3e830c9 in TStreamerInfo::Compile (this=0x2af506a31000) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/io/io/src/TStreamerInfoActions.cxx:3191
#7  0x00002af4f3e6c1ae in TStreamerInfo::BuildOld (this=0x2af506a31000) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/io/io/src/TStreamerInfo.cxx:2547
#8  0x00002af4f45f22ad in TClass::GetStreamerInfo (this=0x2af536bcd800, version=6) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/core/meta/src/TClass.cxx:4619
#9  0x00002af4f3e6fd3e in TStreamerInfo::ForceWriteInfo (this=0x2af531d13300, file=0x2af4fb7ede00, force=false) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/io/io/src/TStreamerInfo.cxx:3156
#10 0x00002af4f35f6aa1 in TTreeCloner::CopyStreamerInfos (this=0x7ffc12abb970) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/tree/tree/src/TTreeCloner.cxx:465
#11 0x00002af4f35f5ad0 in TTreeCloner::Exec (this=0x7ffc12abb970) at /data/cmsbld/jenkins/workspace/ib-run-pr-tests/testBuildDir/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-1718c3698c5b60720e5f35ab8ee4efc5/root-6.22.03/tree/tree/src/TTreeCloner.cxx:202
#12 0x00002af551da8999 in edm::RootOutputTree::fastCloneTTree(TTree*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-11-02-2300/lib/slc7_amd64_gcc820/libIOPoolOutput.so
#13 0x00002af551da8fa4 in edm::RootOutputTree::maybeFastCloneTree(bool, bool, TTree*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) () from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-11-02-2300/lib/slc7_amd64_gcc820/libIOPoolOutput.so
#14 0x00002af551da2ec9 in edm::RootOutputFile::beginInputFile(edm::FileBlock const&, int) () from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-11-02-2300/lib/slc7_amd64_gcc820/libIOPoolOutput.so
#15 0x00002af4f30ce9d8 in edm::Schedule::openOutputFiles(edm::FileBlock&) () from /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-11-02-2300/lib/slc7_amd64_gcc820/libFWCoreFramework.so

@pcanal

pcanal commented Nov 3, 2020

@smuzaffar let me know when the build is ready to debug (and please remind me of one of the non-certificate reproducers :) ).

@makortel
Contributor

makortel commented Nov 3, 2020

@pcanal For the workflow @smuzaffar pointed to, runTheMatrix.py -l 23234.0 should be sufficient (it runs the GEN-SIM locally and does not use any non-local input files).

@smuzaffar
Contributor

@pcanal , the new IB CMSSW_11_2_ROOT622_X_2020-11-03-1100 is available and only one workflow, 4.17, failed ( https://cmssdt.cern.ch/SDT/cgi-bin/buildlogs/raw/slc7_amd64_gcc820/CMSSW_11_2_ROOT622_X_2020-11-03-1100/pyRelValMatrixLogs/run/4.17_RunMinBias2011A+RunMinBias2011A+HLTD+RECODR1reHLT+HARVESTDR1reHLT+SKIMDreHLT/step5_RunMinBias2011A+RunMinBias2011A+HLTD+RECODR1reHLT+HARVESTDR1reHLT+SKIMDreHLT.log ). Strangely, for the next IB (CMSSW_11_2_ROOT622_X_2020-11-03-2300) workflow 4.17 passed. Maybe it is some thread-related issue. Tests for the 23h00 IB are still running; I will update again once we have its full results.

@Dr15Jones

Dr15Jones commented Nov 4, 2020

Strangely, for the next IB (CMSSW_11_2_ROOT622_X_2020-11-03-2300) workflow 4.17 passed. Maybe it is some thread-related issue.

The issue looks like a memory-overwrite sort of problem to me. Such problems can 'move around' when using multiple threads, since the order of new/delete is not consistent from process to process.

The problem seems to be a corrupted virtual table called when a data product read from file is being deleted.

@pcanal

pcanal commented Nov 9, 2020

@smuzaffar root-project/root#6768 solves the problems and has been merged into the master and v6.22 branches.

@smuzaffar
Contributor

smuzaffar commented Nov 9, 2020

Thanks @pcanal , I have already tested and integrated it for today's 11h00 ROOT622 IB.
I confirm that the errors I was getting during the PR tests are fixed. The IB is now running the tests and I already see that one workflow fails [a]. I will report back if there is a simple way to reproduce this.

[a] workflow 1002.0 step3
https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc820/CMSSW_11_2_ROOT622_X_2020-11-09-1100/pyRelValMatrixLogs/run/1002.0_RRD+RunMinBias2011A+RECODR1+COPYPASTE/step3_RRD+RunMinBias2011A+RECODR1+COPYPASTE.log#/274-274

#3  0x00002b53f04a0a59 in sig_dostack_then_abort () from /cvmfs/cms-ib.cern.ch/week0/slc7_amd64_gcc820/cms/cmssw/CMSSW_11_2_ROOT622_X_2020-11-09-1100/lib/slc7_amd64_gcc820/pluginFWCoreServicesPlugins.so
#4  <signal handler called>
#5  0x00002b53e7e48fb6 in std::type_info::name (this=0x384b180000000000) at /data/cmsbld/jenkins/workspace/build-any-ib/w/slc7_amd64_gcc820/external/gcc/8.2.0-bcolbf/include/c++/8.4.0/typeinfo:100
#6  0x00002b53e8e1c377 in TClass::GetClass (typeinfo=..., load=true) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/core/meta/src/TClass.cxx:3185
#7  0x00002b53e8e45078 in TIsAProxy::operator() (this=0x2b540ba13680, obj=0x2b5425981240) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/core/meta/src/TIsAProxy.cxx:117
#8  0x00002b53e8e1a607 in TClass::GetActualClass (this=0x2b5414f1f400, object=0x2b5425981240) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/core/meta/src/TClass.cxx:2573
#9  0x00002b53e85cdd84 in TBufferIO::WriteObjectAny (this=0x2b542ef9bc60, obj=0x2b5425981240, ptrClass=0x2b5414f1f400, cacheReuse=true) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/io/io/src/TBufferIO.cxx:504
#10 0x00002b53e8662500 in TGenCollectionStreamer::WriteObjects (this=0x2b5426c40800, nElements=14, b=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/io/io/src/TGenCollectionStreamer.cxx:984
#11 0x00002b53e866382e in TGenCollectionStreamer::Streamer (this=0x2b5426c40800, b=...) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/io/io/src/TGenCollectionStreamer.cxx:1449
#12 0x00002b53e8627cc8 in TCollectionStreamer::Streamer (this=0x2b540ba4d448, buff=..., pObj=0x2b543fa90760, onFileClass=0x0) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/io/io/src/TCollectionProxyFactory.cxx:166
#13 0x00002b53e85b7079 in TCollectionClassStreamer::Stream (this=0x2b540ba4d410, b=..., obj=0x2b543fa90760, onfileClass=0x0) at include/TCollectionProxyFactory.h:183
#14 0x00002b53e8e2681c in TClass::StreamerExternal (pThis=0x2b543780d800, object=0x2b543fa90760, b=..., onfile_class=0x0) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/core/meta/src/TClass.cxx:6613
#15 0x00002b53e7dc6cf1 in TClass::Streamer (this=0x2b543780d800, obj=0x2b543fa90760, b=..., onfile_class=0x0) at /data/cmsbld/jenkins/workspace/build-any-ib/w/BUILD/slc7_amd64_gcc820/lcg/root/6.22.03-b5939d52a0322c380ff1dab7b0540fed/root-6.22.03/core/meta/inc/TClass.h:568

@pcanal

pcanal commented Nov 9, 2020

@mrodozov which version did you import? There was a fatal bug solved today. In the function TStreamerInfo::SetClass:


void TStreamerInfo::SetClass(TClass *newcl)
{
   if (newcl) {
      // This is mostly (but not only) for the artificial "This" streamerElement for an stl collection.
      Update(fClass, newcl);
   }
   fClass = newcl;
}

The above is the correct code.

@smuzaffar
Contributor

I took your PR https://github.com/cms-sw/root/pull/146/commits on top of ROOT's v6.22 branch. It looks like your PR was updated after I merged it.
OK, let me re-sync from the ROOT 6.22 branch then.

@pcanal

pcanal commented Nov 9, 2020

I can't verify 1002.0 due to a DAS error. I would need the input file and step3 script on cmsdev20.

@smuzaffar
Contributor

@pcanal , let me rebuild the IB using the latest ROOT v6.22 branch and then we will see how it goes.

@smuzaffar
Contributor

@pcanal , the wf 1002.0 step2 input and config files are available under /afs/cern.ch/user/m/muzaffar/public/root622/1002.0
cmsRun step3_NONE.py randomly produces this crash.

As I wrote, this IB includes https://github.com/cms-sw/root/pull/146/commits which might not have all of your changes, so feel free to test the above, or otherwise wait till tomorrow when we have a new IB based on the latest ROOT v6.22.

@pcanal

pcanal commented Nov 11, 2020

For the record, it was reported elsewhere that these problems are solved but there are other issues (cms-sw/cmssw#30359 (comment)).

pcanal added a commit to pcanal/root that referenced this pull request Apr 16, 2021
pcanal added a commit to pcanal/root that referenced this pull request Apr 29, 2021
pcanal added a commit to pcanal/root that referenced this pull request Oct 4, 2022
pcanal added a commit to root-project/root that referenced this pull request Oct 5, 2022