New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert to CUDA 10.1 Update 1 (10.1.168), add support for Power #5418
Revert to CUDA 10.1 Update 1 (10.1.168), add support for Power #5418
Conversation
please test |
The tests are being triggered in jenkins. |
A new Pull Request was created by @fwyzard (Andrea Bocci) for branch IB/CMSSW_11_1_X/master. @cmsbuild, @smuzaffar, @mrodozov can you please review it and eventually sign? Thanks. |
5b43c86
to
0d7b06c
Compare
Pull request #5418 was updated. |
0d7b06c
to
c559c69
Compare
please test |
@mrodozov @smuzaffar how do I ask to run the tests on Power ? |
The tests are being triggered in jenkins. |
Pull request #5418 was updated. |
please test for slc7_ppc64le_gcc820 |
please test for slc7_aarch64_gcc820 |
The tests are being triggered in jenkins.
|
The CUDA part on Power seems to have completed fine ? |
-1 Tested at: c559c69 You can see the results of the tests here: I found follow errors while testing this PR Failed tests: UnitTests
I found errors in the following unit tests: ---> test test_pycurl had ERRORS |
Comparison job queued. |
-1 Tested at: c559c69 You can see the results of the tests here: I found follow errors while testing this PR Failed tests: UnitTests RelVals AddOn
I found errors in the following unit tests: ---> test test_calo_rechit had ERRORS
When I ran the RelVals I found an error in the following workflows: runTheMatrix-results/136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM/step2_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM.log136.8311 step2 runTheMatrix-results/136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017/step2_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017.log1325.7 step2 runTheMatrix-results/1325.7_TTbar_13_94Xv2NanoAODINPUT+TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2/step2_TTbar_13_94Xv2NanoAODINPUT+TTbar_13_94Xv2NanoAODINPUT+NANOEDMMC2017_94XMiniAODv2+HARVESTNANOAODMC2017_94XMiniAODv2.log5.1 step1 runTheMatrix-results/5.1_TTbar+TTbarFS+HARVESTFS/step1_TTbar+TTbarFS+HARVESTFS.log135.4 step1 runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step1_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log1000.0 step2 runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step2_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log101.0 step1 runTheMatrix-results/101.0_SingleElectronE120EHCAL+SingleElectronE120EHCAL/step1_SingleElectronE120EHCAL+SingleElectronE120EHCAL.log1001.0 step2 runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5/step2_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5.log140.56 step2 runTheMatrix-results/140.56_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18/step2_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18.log4.53 step3 runTheMatrix-results/4.53_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT/step3_RunPhoton2012B+RunPhoton2012B+HLTD+RECODR1reHLT+HARVESTDR1reHLT.log136.85 step3 runTheMatrix-results/136.85_RunEGamma2018A+RunEGamma2018A+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM/step3_RunEGamma2018A+RunEGamma2018A+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM.log136.731 step3 runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log136.788 step3 runTheMatrix-results/136.788_RunSinglePh2017B+RunSinglePh2017B+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017/step3_RunSinglePh2017B+RunSinglePh2017B+HLTDR2_2017+RECODR2_2017reHLT_skimSinglePh_Prompt+HARVEST2017.log1306.0 step3 runTheMatrix-results/1306.0_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15/step3_SingleMuPt1_UP15+SingleMuPt1_UP15+DIGIUP15+RECOUP15+HARVESTUP15.log1330.0 step3 runTheMatrix-results/1330.0_ZMM_13+ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM+NANOUP15/step3_ZMM_13+ZMM_13+DIGIUP15+RECOUP15_L1TMuDQM+HARVESTUP15_L1TMuDQM+NANOUP15.log4.22 step5 runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step5_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log8.0 step5 runTheMatrix-results/8.0_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS/step5_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS.log10042.0 step3 runTheMatrix-results/10042.0_ZMM_13+ZMM_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017/step3_ZMM_13+ZMM_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017.log7.3 step5 runTheMatrix-results/7.3_CosmicsSPLoose_UP18+CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18/step5_CosmicsSPLoose_UP18+CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18.log9.0 step3 runTheMatrix-results/9.0_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST/step3_Higgs200ChargedTaus+Higgs200ChargedTaus+DIGI+RECO+HARVEST.log25.0 step3 runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step3_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log158.0 step3 runTheMatrix-results/158.0_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO/step3_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO.log140.53 step3 runTheMatrix-results/140.53_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI/step3_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI.log11634.0 step3 runTheMatrix-results/11634.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2021_GenSimFull+DigiFull_2021+RecoFull_2021+HARVESTFull_2021+ALCAFull_2021/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2021_GenSimFull+DigiFull_2021+RecoFull_2021+HARVESTFull_2021+ALCAFull_2021.log10024.0 step3 runTheMatrix-results/10024.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017_GenSimFull+DigiFull_2017+RecoFull_2017+HARVESTFull_2017+ALCAFull_2017+NanoFull_2017.log10824.0 step3 runTheMatrix-results/10824.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018+NanoFull_2018/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2018_GenSimFull+DigiFull_2018+RecoFull_2018+HARVESTFull_2018+ALCAFull_2018+NanoFull_2018.log12434.0 step3 runTheMatrix-results/12434.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023_GenSimFull+DigiFull_2023+RecoFull_2023+HARVESTFull_2023+ALCAFull_2023/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2023_GenSimFull+DigiFull_2023+RecoFull_2023+HARVESTFull_2023+ALCAFull_2023.log25202.0 step3 runTheMatrix-results/25202.0_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25+NANOUP15_PU25/step3_TTbar_13+TTbar_13+DIGIUP15_PU25+RECOUP15_PU25+HARVESTUP15_PU25+NANOUP15_PU25.log20034.0 step3 runTheMatrix-results/20034.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D35_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D35+RecoFullGlobal_2026D35+HARVESTFullGlobal_2026D35/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D35_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D35+RecoFullGlobal_2026D35+HARVESTFullGlobal_2026D35.log10224.0 step3 runTheMatrix-results/10224.0_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU+NanoFull_2017PU/step3_TTbar_13+TTbar_13TeV_TuneCUETP8M1_2017PU_GenSimFull+DigiFullPU_2017PU+RecoFullPU_2017PU+HARVESTFullPU_2017PU+NanoFull_2017PU.log20434.0 step3 runTheMatrix-results/20434.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D41_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D41+RecoFullGlobal_2026D41+HARVESTFullGlobal_2026D41/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D41_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D41+RecoFullGlobal_2026D41+HARVESTFullGlobal_2026D41.log23234.0 step3 runTheMatrix-results/23234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D49_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D49+RecoFullGlobal_2026D49+HARVESTFullGlobal_2026D49/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D49_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D49+RecoFullGlobal_2026D49+HARVESTFullGlobal_2026D49.log21234.0 step3 runTheMatrix-results/21234.0_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D44_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D44+RecoFullGlobal_2026D44+HARVESTFullGlobal_2026D44/step3_TTbar_14TeV+TTbar_14TeV_TuneCUETP8M1_2026D44_GenSimHLBeamSpotFull14+DigiFullTrigger_2026D44+RecoFullGlobal_2026D44+HARVESTFullGlobal_2026D44.log250202.181 step4 runTheMatrix-results/250202.181_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25/step4_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25.log
I found errors in the following addon tests: cmsDriver.py TTbar_8TeV_TuneCUETP8M1_cfi --conditions auto:run1_mc --fast -n 100 --eventcontent AODSIM,DQM --relval 100000,1000 -s GEN,SIM,RECOBEFMIX,DIGI:pdigi_valid,L1,DIGI2RAW,L1Reco,RECO,EI,VALIDATION --customise=HLTrigger/Configuration/CustomConfigs.L1THLT --datatier GEN-SIM-DIGI-RECO,DQMIO --beamspot Realistic8TeVCollision : FAILED - time: date Thu Dec 12 12:52:39 2019-date Thu Dec 12 12:52:30 2019 s - exit: 256 |
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
Yes, cuda part of cmssw build fine for ppc64le |
-1 Tested at: c559c69 You can see the results of the tests here: I found follow errors while testing this PR Failed tests: UnitTests RelVals
I found errors in the following unit tests: ---> test import-tables had ERRORS
|
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
Comparison job queued. |
Comparison is ready Comparison Summary:
|
+externals |
This pull request is fully signed and it will be integrated in one of the next IB/CMSSW_11_1_X/master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2) |
I know... but there is no newer version officially available, yet :-( |
CUDA 10.1 Update 2 and later show a problem with running under MPS (NVIDIA Multi Process Server) or using CUDA Dynamic Parallelism within CMSSW.
While awaiting feedback from NVIDIA, the only solution seems to revert to the latest working version, which was 10.1 Update 1.
Drop the Nsight Compute and Nsigh System tools from the CUDA package, because they are released much more often as external packages.
Add support for CUDA on IBM Power architecture (ppc64le) on Linux.