fix VectorHits workflows with PU #38468

Merged
merged 1 commit into from Jun 27, 2022

Conversation

Contributor

@mmusich mmusich commented Jun 22, 2022

PR description:

@JanFSchulte noticed that wf 39634.9 crashes in the harvesting step with a segmentation fault in Phase2ITRecHitHarvester:Phase2OTTrackingRechitHarvester_PS.
This happens because this protection:

from Configuration.ProcessModifiers.vectorHits_cff import vectorHits
vectorHits.toReplaceWith(trackerphase2ValidationHarvesting, trackerphase2ValidationHarvesting.copyAndExclude([Phase2OTTrackingRechitHarvester_PS,Phase2OTTrackingRechitHarvester_2S]))

was not picked up, because the vectorHits modifier was not applied in the harvesting step.
This PR fixes the problem by adding the modifier to that step.
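
To illustrate why the protection had no effect, here is a minimal sketch in plain Python. The classes ToySequence and ToyModifier and the function build_harvesting are hypothetical stand-ins written for this comment; they only mimic the idea that toReplaceWith acts when the modifier is active, and they are not the real FWCore.ParameterSet.Config classes. The module name otherHarvester is a placeholder.

```python
# Toy model only: hypothetical stand-ins, NOT the real CMSSW configuration API.

class ToySequence:
    """Holds an ordered list of module names."""

    def __init__(self, modules):
        self.modules = list(modules)

    def copyAndExclude(self, excluded):
        # Return a new sequence without the excluded modules.
        return ToySequence(m for m in self.modules if m not in excluded)


class ToyModifier:
    """A switch that only applies its changes when it is active."""

    def __init__(self, name, active):
        self.name = name
        self.active = active

    def toReplaceWith(self, target, replacement):
        # The replacement happens only if the modifier was requested.
        if self.active:
            target.modules = list(replacement.modules)


def build_harvesting(proc_modifiers):
    """Build the toy harvesting sequence for a given list of modifier names."""
    vectorHits = ToyModifier("vectorHits", "vectorHits" in proc_modifiers)
    harvesting = ToySequence([
        "otherHarvester",  # placeholder name, not from the real sequence
        "Phase2OTTrackingRechitHarvester_PS",
        "Phase2OTTrackingRechitHarvester_2S",
    ])
    vectorHits.toReplaceWith(
        harvesting,
        harvesting.copyAndExclude([
            "Phase2OTTrackingRechitHarvester_PS",
            "Phase2OTTrackingRechitHarvester_2S",
        ]),
    )
    return harvesting.modules
```

In this toy, build_harvesting([]) still contains both rechit harvesters (the situation before this PR), while build_harvesting(["vectorHits"]) leaves only the placeholder module.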

PR validation:

Running wf 39634.9, I now get:

39634.9 2026D88PU_vectorHits+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTriggerPU+RecoGlobalPU+HARVESTGlobalPU [1]: cmsDriver.py TTbar_14TeV_TuneCP5_cfi  -s GEN,SIM -n 10 --conditions auto:phase2_realistic_T21 --beamspot HLLHC14TeV --datatier GEN-SIM --eventcontent FEVTDEBUG --geometry Extended2026D88 --era Phase2C17I13M9 --relval 9000,100 
                                           [2]: cmsDriver.py step2  -s DIGI:pdigi_valid,L1TrackTrigger,L1,DIGI2RAW,HLT:@fake2 --conditions auto:phase2_realistic_T21 --datatier GEN-SIM-DIGI-RAW -n 10 --eventcontent FEVTDEBUGHLT --geometry Extended2026D88 --era Phase2C17I13M9 --pileup AVE_200_BX_25ns --pileup_input das:/RelValMinBias_14TeV/CMSSW_12_3_0_pre5-123X_mcRun4_realistic_v4_2026D88noPU-v1/GEN-SIM
                                           [3]: cmsDriver.py step3  -s RAW2DIGI,RECO,RECOSIM,PAT,VALIDATION:@phase2Validation+@miniAODValidation,DQM:@phase2+@miniAODDQM --conditions auto:phase2_realistic_T21 --datatier GEN-SIM-RECO,MINIAODSIM,DQMIO -n 10 --eventcontent FEVTDEBUGHLT,MINIAODSIM,DQM --geometry Extended2026D88 --era Phase2C17I13M9 --procModifiers vectorHits --pileup AVE_200_BX_25ns --pileup_input das:/RelValMinBias_14TeV/CMSSW_12_3_0_pre5-123X_mcRun4_realistic_v4_2026D88noPU-v1/GEN-SIM
                                           [4]: cmsDriver.py step4  -s HARVESTING:@phase2Validation+@phase2+@miniAODValidation+@miniAODDQM --conditions auto:phase2_realistic_T21 --mc  --geometry Extended2026D88 --scenario pp --filetype DQM --era Phase2C17I13M9 --procModifiers vectorHits -n 10 --pileup AVE_200_BX_25ns --pileup_input das:/RelValMinBias_14TeV/CMSSW_12_3_0_pre5-123X_mcRun4_realistic_v4_2026D88noPU-v1/GEN-SIM

as expected (note that the modifier is now added to step4).
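
A quick way to check this on a dumped command is sketched below. The helper proc_modifiers is hypothetical (it is not part of cmsDriver.py or runTheMatrix), and it assumes that the value of --procModifiers is a comma-separated list. The command string in the usage note is an abbreviated version of step4 above.

```python
import shlex


def proc_modifiers(command):
    """Return the list of process modifiers passed to a cmsDriver-style command."""
    tokens = shlex.split(command)
    if "--procModifiers" not in tokens:
        return []
    idx = tokens.index("--procModifiers")
    if idx + 1 >= len(tokens):
        # Flag given without a value: treat as no modifiers.
        return []
    # Assumption: several modifiers are given as a comma-separated list.
    return tokens[idx + 1].split(",")


# Abbreviated step4 command from the dump above.
step4 = (
    "cmsDriver.py step4 -s HARVESTING:@phase2Validation "
    "--era Phase2C17I13M9 --procModifiers vectorHits -n 10"
)
has_vector_hits = "vectorHits" in proc_modifiers(step4)
```

With the abbreviated command, has_vector_hits is True; for a harvesting command without the flag the helper returns an empty list.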

If this PR is a backport, please specify the original PR and why you need to backport that PR:

Not a backport but will be backported.

@mmusich
Contributor Author

mmusich commented Jun 22, 2022

type bug-fix

@mmusich
Contributor Author

mmusich commented Jun 22, 2022

test parameters:

  • workflows = 39634.9
  • relvals_opt= -w standard,highstats,pileup,generator,extendedgen,production,upgrade,cleanedupgrade,ged

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-38468/30681

@cmsbuild
Contributor

A new Pull Request was created by @mmusich (Marco Musich) for master.

It involves the following packages:

  • Configuration/PyReleaseValidation (pdmv, upgrade)

@jordan-martins, @bbilin, @cmsbuild, @AdrianoDee, @srimanob, @kskovpen can you please review it and eventually sign? Thanks.
@makortel, @kpedro88, @fabiocos, @Martin-Grunewald, @missirol, @trtomei, @beaucero, @slomeo this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor Author

mmusich commented Jun 22, 2022

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9b0760/25700/summary.html
COMMIT: ce7d818
CMSSW: CMSSW_12_5_X_2022-06-22-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38468/25700/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 2 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659099
  • DQMHistoTests: Total failures: 7
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3659069
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 49 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@mmusich
Contributor Author

mmusich commented Jun 24, 2022

please test workflow 39634.9

@cmsbuild
Contributor

-1

Failed Tests: RelVals-INPUT
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9b0760/25752/summary.html
COMMIT: ce7d818
CMSSW: CMSSW_12_5_X_2022-06-23-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38468/25752/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals-INPUT

The relvals timed out after 4 hours.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659307
  • DQMHistoTests: Total failures: 14
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3659271
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@mmusich
Contributor Author

mmusich commented Jun 24, 2022

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9b0760/25764/summary.html
COMMIT: ce7d818
CMSSW: CMSSW_12_5_X_2022-06-24-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38468/25764/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659307
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3659277
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@mmusich
Contributor Author

mmusich commented Jun 26, 2022

Hi @cms-sw/orp-l2, @smuzaffar,
I still do not see wf 39634.9 in https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9b0760/25764/runTheMatrix-results/
so it seems that #38468 (comment) is not the right command to use. Can you clarify which command should be used?
Thank you!

@kpedro88
Contributor

@mmusich the problem is that you are using both "upgrade" and "cleanedupgrade". The latter exists so that the former does not have to be used, because it would cause duplicate workflow conflicts with "standard". Instead, using both of them together causes different duplicate workflow conflicts, as the relvals console log indicates:

ValueError: Duplicated workflows: 39634.9

The correct option is: relvals_opt= -w standard,highstats,pileup,generator,extendedgen,production,cleanedupgrade,ged
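
The conflict can be pictured with a small sketch. The function collect_workflows and the contents of TOY_SETS are hypothetical and purely illustrative; they are not the real runTheMatrix code or workflow lists, and only the error text is taken from the console log quoted above.

```python
from collections import Counter

# Hypothetical contents of each workflow set, for illustration only.
TOY_SETS = {
    "standard": ["1.0", "2.0"],
    "upgrade": ["39634.0", "39634.9"],
    "cleanedupgrade": ["39634.9"],
}


def collect_workflows(selected, sets=TOY_SETS):
    """Merge the selected sets and fail if any workflow id appears more than once."""
    counts = Counter(wf for name in selected for wf in sets[name])
    duplicated = sorted(wf for wf, n in counts.items() if n > 1)
    if duplicated:
        raise ValueError("Duplicated workflows: " + ", ".join(duplicated))
    return sorted(counts)
```

In this toy, selecting "standard" and "cleanedupgrade" succeeds, while selecting "upgrade" together with "cleanedupgrade" raises ValueError: Duplicated workflows: 39634.9.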

@mmusich
Contributor Author

mmusich commented Jun 26, 2022

test parameters:

  • workflow = 39634.9
  • relvals_opt= -w standard,highstats,pileup,generator,extendedgen,production,cleanedupgrade,ged

@mmusich
Contributor Author

mmusich commented Jun 26, 2022

@cmsbuild, please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-9b0760/25809/summary.html
COMMIT: ce7d818
CMSSW: CMSSW_12_5_X_2022-06-26-0000/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/38468/25809/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
39634.9 step 4
The results for the comparisons for these workflows could be incomplete
This most likely means that the IB is having errors in the relvals. The error does NOT come from this pull request.

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-9b0760/39634.9_TTbar_14TeV+2026D88PU_vectorHits+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTriggerPU+RecoGlobalPU+HARVESTGlobalPU

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3659995
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3659965
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 212 log files, 49 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@mmusich
Contributor Author

mmusich commented Jun 27, 2022

@cms-sw/pdmv-l2 @cms-sw/upgrade-l2 now that the tests finally show that the harvesting step for wf 39634.9 runs fine (see log), please consider signing this.
Thank you.

@kskovpen
Contributor

+pdmv

@srimanob
Contributor

+Upgrade

This PR just adds HarvestGlobal to the vectorHits workflow.

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Contributor

qliphy commented Jun 27, 2022

+1
