Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime error from SiPixelQualityESProducer using cmsDriver.py -s .. HLT:[db]:[config] .. #42492

Open
missirol opened this issue Aug 7, 2023 · 14 comments

Comments

@missirol
Copy link
Contributor

missirol commented Aug 7, 2023

The script in [1] contains two tests ("test1" and "test2") which only differ in how the HLT step is loaded by cmsDriver.py. In CMSSW_13_0_11, "test1" and "test2" are in principle equivalent, since GRun/V140 corresponds to the "GRun" HLT menu available in the release. The issue is that "test1" fails at runtime with the error in [2], while "test2" works as intended. This was reported by @mmusich in relation to PDMVRELVALS-204 (in the latter JIRA ticket, GRun/V151 was used, while in [1] I'm using GRun/V140 just to take the same HLT menu as the one of HLT:GRun in CMSSW_13_0_11).

The problem is triggered by the use of the syntax -s HLT:[db]:[config], which allows to download a given HLT configuration ([config]) from one of the ConfDB databases ([db]), as opposed to using one of the HLT menus available in CMSSW (the latter menus can be loaded with the more common syntax -s HLT:[menu], where [menu] is a keyword associated to a given HLT menu, e.g. -s HLT:GRun for the latest pp menu available in the release). The main difference between -s HLT:[menu] and -s HLT:[db]:[config] is in the order in which the HLT configuration and other cff files are loaded into the cms.Process in the python file produced by cmsDriver.py.

This issue summarises a few findings in relation to the runtime error in [2].

[1]

#!/bin/bash

# cmsrel CMSSW_13_0_11
# cd CMSSW_13_0_11/src
# cmsenv

EXE='cmsDriver.py step2 --conditions 130X_mcRun3_2023_realistic_relvals2023D_v1 --datatier GEN-SIM-DIGI-RAW'
EXE+=' --era Run3_2023 --eventcontent FEVTDEBUGHLT --filein dbs:/RelValTTbar_14TeV/CMSSW_13_0_11-130X_mcRun3_2023_realistic_relvals2023D_v1_RV204-v1/GEN-SIM'
EXE+=' --fileout file:step2.root --geometry DB:Extended --nThreads 1 --number 10 --pileup 2023_Fills_8807_8901_ProjectedPileup_PoissonOOTPU'
EXE+=' --pileup_input das:/RelValMinBias_14TeV/CMSSW_13_0_11-130X_mcRun3_2023_realistic_relvals2023D_v1_RV204-v1/GEN-SIM'
#--customise_commands='process.siPixelQualityESProducer.siPixelQualityLabel = "forDigitizer"'

### TEST-1
foo=test1
${EXE} --python_filename "${foo}".py --step DIGI:pdigi_valid,L1,DIGI2RAW,HLT:run3:/dev/CMSSW_13_0_0/GRun/V140 --no_exec
cmsRun "${foo}".py &> "${foo}".log

### TEST-2
foo=test2
${EXE} --python_filename "${foo}".py --step DIGI:pdigi_valid,L1,DIGI2RAW,HLT:GRun --no_exec
cmsRun "${foo}".py &> "${foo}".log

[2] Error from test1.log

----- Begin Fatal Exception 07-Aug-2023 13:10:15 CEST-----------------------
An exception of category 'NoProxyException' occurred while
   [0] Processing  Event run: 1 lumi: 792 event: 158201 stream: 0
   [1] Running path 'HLTAnalyzerEndpath'
   [2] Prefetching for module L1TRawToDigi/'hltGtStage2Digis'
   [3] Prefetching for module RawDataCollectorByLabel/'rawDataCollector'
   [4] Prefetching for module SiStripDigiToRawModule/'SiStripDigiToRaw'
   [5] Calling method for module MixingModule/'mix'
Exception Message:
No data of type "SiPixelQuality" with label "forDigitizer" in record "SiPixelQualityRcd"
 Please add an ESSource or ESProducer to your job which can deliver this data.
----- End Fatal Exception -------------------------------------------------
@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 7, 2023

A new Issue was created by @missirol Marino Missiroli.

@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

The main difference between -s HLT:[menu] and -s HLT:[db]:[config] is in the order in which the HLT configuration and other cff files are loaded into the cms.Process in the python file produced by cmsDriver.py.

-s HLT:[menu] case

The "test2" example in the description works at runtime, and the python output of cmsDriver.py contains the following lines.

from Configuration.Eras.Era_Run3_2023_cff import Run3_2023

process = cms.Process('HLT',Run3_2023)

# import of standard configurations
process.load('Configuration.StandardSequences.Services_cff')
process.load('SimGeneral.HepPDTESSource.pythiapdt_cfi')
process.load('FWCore.MessageService.MessageLogger_cfi')
process.load('Configuration.EventContent.EventContent_cff')
process.load('SimGeneral.MixingModule.mix_2023_Fills_8807_8901_ProjectedPileup_PoissonOOTPU_cfi')
process.load('Configuration.StandardSequences.GeometryRecoDB_cff')
process.load('Configuration.StandardSequences.MagneticField_cff')
process.load('Configuration.StandardSequences.Digi_cff')
process.load('Configuration.StandardSequences.DigiToRaw_cff')
process.load('HLTrigger.Configuration.HLT_GRun_cff')
process.load('Configuration.StandardSequences.EndOfProcess_cff')
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')

What (I think) happens is that

  • the loading of mix_2023_Fills_8807_8901_ProjectedPileup_PoissonOOTPU_cfi modifies the object siPixelQualityESProducer from SiPixelQualityESProducer_cfi in SiPixelSimParameters_cfi,
  • the loading of HLT_GRun_cff overwrites the module siPixelQualityESProducer with the one defined in the HLT_GRun_cff file (meaning, the ES module as defined in ConfDB), and
  • the loading of FrontierConditions_GlobalTag_cff overwrites the module siPixelQualityESProducer again, re-introducing the one from the SiPixelQualityESProducer_cfi (in its final version post modifiers).

In short, this returns the correct configuration because the loading of FrontierConditions_GlobalTag_cff is after that of HLT_GRun_cff.

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

The main difference between -s HLT:[menu] and -s HLT:[db]:[config] is in the order in which the HLT configuration and other cff files are loaded into the cms.Process in the python file produced by cmsDriver.py.

-s HLT:[db]:[config] case

The "test1" example in the description fails at runtime, and the python output of cmsDriver.py contains the following lines.

from Configuration.Eras.Era_Run3_2023_cff import Run3_2023

process = cms.Process('HLT',Run3_2023)

# import of standard configurations
process.load('Configuration.StandardSequences.Services_cff')
process.load('SimGeneral.HepPDTESSource.pythiapdt_cfi')
process.load('FWCore.MessageService.MessageLogger_cfi')
process.load('Configuration.EventContent.EventContent_cff')
process.load('SimGeneral.MixingModule.mix_2023_Fills_8807_8901_ProjectedPileup_PoissonOOTPU_cfi')
process.load('Configuration.StandardSequences.GeometryRecoDB_cff')
process.load('Configuration.StandardSequences.MagneticField_cff')
process.load('Configuration.StandardSequences.Digi_cff')
process.load('Configuration.StandardSequences.SimL1Emulator_cff')
process.load('Configuration.StandardSequences.DigiToRaw_cff')
process.load('Configuration.StandardSequences.EndOfProcess_cff')
process.load('Configuration.StandardSequences.FrontierConditions_GlobalTag_cff')

[..]

import HLTrigger.Configuration.Utilities
process.loadHltConfiguration("run3:/dev/CMSSW_13_0_0/GRun/V140",type='GRun')

What (I think) happens is that

  • the loading of mix_2023_Fills_8807_8901_ProjectedPileup_PoissonOOTPU_cfi modifies the object siPixelQualityESProducer from SiPixelQualityESProducer_cfi in SiPixelSimParameters_cfi,
  • the loading of FrontierConditions_GlobalTag_cff adds the module siPixelQualityESProducer to the process, but then
  • the call to loadHltConfiguration overwrites siPixelQualityESProducer with the module as defined in ConfDB, thus losing the changes induced by the modifier (i.e. the "forDigitizer" label).

One more note on cmsDriver.py. For what I understand ..

cmsDriver.py [..]

is equivalent to

cmsDriver.py [..] --no_exec && cmsRun [python output of cmsDriver.py]

, meaning that what matters at runtime is the content of the python file.

ConfigBuilder.py writes the python file while building in parallel a "transient" version of the cms.Process in self.process. In the case of -s HLT:[db]:[config], self.process goes out-of-sync with the content of the python file, because in

self.executeAndRemember('process.loadHltConfiguration("%s",%s)'%(stepSpec.replace(',',':'),optionsForHLTConfig))

executeAndRemember loads HLT into self.process right away (so before loading FrontierConditions_GlobalTag_cff), while the python config calls loadHltConfiguration after loading FrontierConditions_GlobalTag_cff.

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

To my knowledge, the easiest way to solve this is to remove this ESProducer from the HLT cff files (while keeping it, of course, in the full HLT dumps used to run HLT standalone, like during actual data-taking).

This ad-hoc removal is normally done in confdb.py (more or less here).

Such removal is okay as long as the process loads FrontierConditions_GlobalTag_cff (this is always the case in python files that load HLT menus via a cff file), as FrontierConditions_GlobalTag_cff provides the module siPixelQualityESProducer with its parameters correctly configured.

This change is done in cms-tsg-storm@cf289c7, and will be integrated via one of the next HLT-menu PRs.

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

An alternative way would be to simplify the implementation of SiPixelQualityESProducer, and use separate instances of it in the python configuration file (1 for label "", 1 for label "forDigitizer" when needed, and 1 for label "forRawToDigi" when needed).

This was tried in missirol@50bf3da.

The downside of this approach is the use makeProcessModifier to avoid loading 'extra' instances of the ES module when not needed. I didn't find examples of makeProcessModifier being used this way (to remove ES modules from the process), so I wonder if this is an abuse of makeProcessModifier. Is there a better way ?

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

Trying to summarise.

  • The handling of ES modules at HLT could be improved. In several cases, HLT defines an ES module with the same label (name of the module) as used in other "steps" of the simulation/reconstruction chain. This normally does not create problems because (1) HLT removes ad-hoc said modules from the cff files used by cmsDriver.py, or (2) the module used at HLT is identical to the one used in other steps, so it might get silently overwritten without really changing the overall configuration. HLT could use a more systematic way to find (ES) modules that can lead to this kind of clashes with other steps.

  • The cms.Process created at runtime by ConfigBuilder.py (i.e. self.process) can go out-of-sync with the content of the python config when -s HLT:[db]:[config] is used.

Feel free to add feedback/corrections/suggestions.

@mmusich
Copy link
Contributor

mmusich commented Aug 7, 2023

he downside of this approach is the use makeProcessModifier to avoid loading 'extra' instances of the ES module when not needed. I didn't find examples of makeProcessModifier being used this way (to remove ES modules from the process), so I wonder if this is an abuse of makeProcessModifier. Is there a better way ?

at least for

from Configuration.Eras.Modifier_run2_SiPixel_2018_cff import run2_SiPixel_2018
removeSiPixelQualityForDigitizerESProducer_ = (~run2_SiPixel_2018).makeProcessModifier( _removeSiPixelQualityForDigitizerESProducer )

to be honest I don't remember exactly why we chose to use it only for 2018 onwards in #25885 (perhaps to spare including the labeled payload in too many Global Tags), perhaps @tsusa remembers better.
In principle with some patience from our @cms-sw/alca-l2 colleagues we could introduce the record in all the needed Global Tags (where it doesn't need to differ from the regular one, one can use the same tag as the unlabeled record), thus removing the need of the modifier.

@mmusich
Copy link
Contributor

mmusich commented Aug 7, 2023

other than the above

This was tried in missirol@50bf3da.

looks a good improvement. I would go ahead with a PR (@cms-sw/trk-dpg-l2 FYI).

@mmusich
Copy link
Contributor

mmusich commented Aug 7, 2023

assign alca, trk-dpg

  • for lack of a better choice.

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 7, 2023

New categories assigned: trk-dpg,alca

@perrotta,@connorpa,@consuegs,@francescobrivio,@sroychow,@saumyaphor4252,@tvami you have been requested to review this Pull request/Issue and eventually sign? Thanks

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

assign hlt

  • The handling of ES modules at HLT could be improved. In several cases, HLT defines an ES module with the same label (name of the module) as used in other "steps" of the simulation/reconstruction chain. This normally does not create problems because (1) HLT removes ad-hoc said modules from the cff files used by cmsDriver.py, or (2) the module used at HLT is identical to the one used in other steps, so it might get silently overwritten without really changing the overall configuration. HLT could use a more systematic way to find (ES) modules that can lead to this kind of clashes with other steps.

It would be useful to have some kind of HLT unit test to spot this kind of issues earlier, or a better way to remove the relevant ES modules (as opposed to having an hard-coded list).

@cmsbuild
Copy link
Contributor

cmsbuild commented Aug 7, 2023

New categories assigned: hlt

@missirol,@Martin-Grunewald you have been requested to review this Pull request/Issue and eventually sign? Thanks

@Martin-Grunewald
Copy link
Contributor

Sorry, but wouldn't it be safer to make the two ways of loading simply provide identical output and order?

@missirol
Copy link
Contributor Author

missirol commented Aug 7, 2023

If you mean having

process.loadHltConfiguration(..)

in the same place as

process.load('HLTrigger.Configuration.HLT_[..]_cff')

in the python file produced by cmsDriver.py.. I agree it would be good, but I didn't see an easy way to update ConfigBuilder.py to obtain this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants