[GPU] 11634.X RelVal failures during processing of Python configuration #35624
Comments
A new Issue was created by @iarspider . @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign heterogeneous |
I took a quick look at 11634.502, and it seems to me that
|
can this be caused by #35566 ? |
Just by eye, #35566 looks like a good candidate. Operations of that kind in customize functions can lead a "sequence type" to contain |
Yes, that's the idea. |
assign core |
New categories assigned: core @Dr15Jones,@smuzaffar,@makortel you have been requested to review this Pull request/Issue and eventually sign? Thanks |
Given that e.g.
process = cms.Process("Test")
process.a = cms.EDProducer("A")
process.b = cms.EDProducer("B")
process.s = cms.Sequence(process.a)
process.p = cms.Path(process.s)
process.schedule = cms.Schedule(process.p)
process.s = cms.Sequence(process.b)
and
process = cms.Process("Test")
process.a = cms.EDProducer("A")
process.b = cms.EDProducer("B")
process.t = cms.Task(process.a)
process.schedule = cms.Schedule(tasks=cms.Task(process.t))
process.t = cms.Task(process.b)
lead to |
I should have a fix in #35630. |
@fwyzard I was wondering if there was a way to catch this in the (GPU-enabled) PR tests. Looking at the GPU outputs of the tests of #35566 , I see those run only the |
Good point. Looks like the GPU tests are still running the Run 2 workflows (10824.5xx), we should change them to use the Run 3 workflows (11634.5xx) that include the GRun menu. |
@smuzaffar how can we change the workflows that the bot runs when one asks "enable gpu" from the Run 2 ones
to the Run-3 ones
? This includes switching from |
@fwyzard, the bot currently does not know anything about Run 2 or Run 3. The PR workflows are currently defined in https://github.com/cms-sw/cms-bot/blob/master/cmssw-pr-test-config ; we can convert this to a Python/shell script which can dynamically generate different workflows based on the run. But we still somehow need to feed the run information to the bot. Is there anything in runTheMatrix which the bot can use to find out the run information? |
No,
PR_TEST_MATRIX_EXTRAS=1306.0,101.0,9.0,25202.0,10224.0,250202.181
-PR_TEST_MATRIX_EXTRAS_GPU=10824.502,10824.512,10824.522
+PR_TEST_MATRIX_EXTRAS_GPU=11634.506,11634.512,11634.522
PR_TEST_MATRIX_EXTRAS_PROFILING=23434.21,11834.21 |
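The idea mentioned earlier in the thread, generating the workflow list from a script instead of a static file, could look roughly like the sketch below. It is only an illustration: `GPU_WORKFLOWS_BY_RUN` and `gpu_matrix_extras` are hypothetical names that do not exist in cms-bot, the "run2"/"run3" labels are assumptions, and the workflow numbers are simply the ones from the diff above.

```python
# Hypothetical helper, NOT actual cms-bot code.
# The workflow numbers are copied from the diff above; the run labels are assumed.
GPU_WORKFLOWS_BY_RUN = {
    "run2": ["10824.502", "10824.512", "10824.522"],
    "run3": ["11634.506", "11634.512", "11634.522"],
}


def gpu_matrix_extras(run):
    """Return the PR_TEST_MATRIX_EXTRAS_GPU line for the given run label."""
    try:
        workflows = GPU_WORKFLOWS_BY_RUN[run]
    except KeyError:
        raise ValueError("unknown run label: %r" % (run,))
    return "PR_TEST_MATRIX_EXTRAS_GPU=" + ",".join(workflows)


if __name__ == "__main__":
    # Print the line that a generated config would contain for Run 3
    print(gpu_matrix_extras("run3"))
```

Such a script would still need the run label from somewhere, which is exactly the open question raised above.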
Done in cms-sw/cms-bot#1648 . |
Thanks, Andrea.
@makortel , just noting here that the fix might need to be backported to |
Seems that #35630 was not enough. @smuzaffar, could you reopen this issue? |
I investigated the issue a bit further. For a reason that is not yet clear to me, the 11634.502 step2 config file has
process.digi2raw_step = cms.Path(process.DigiToRaw)
process.AlCa_LumiPixelsCounts_Random_v1 = cms.Path(process.HLTBeginSequenceRandom+process.hltScalersRawToDigi+process.hltPreAlCaLumiPixelsCountsRandom+process.hltPixelTrackerHVOn+process.HLTDoLocalPixelSequence+process.hltAlcaPixelClusterCounts+process.HLTEndSequence)
process.AlCa_LumiPixelsCounts_ZeroBias_v1 = cms.Path(process.HLTBeginSequence+process.hltScalersRawToDigi+process.hltL1sZeroBias+process.hltPreAlCaLumiPixelsCountsZeroBias+process.hltPixelTrackerHVOn+process.HLTDoLocalPixelSequence+process.hltAlcaPixelClusterCounts+process.HLTEndSequence)
process.endjob_step = cms.EndPath(process.endOfProcess)
The |
cmssw/HLTrigger/Configuration/python/customizeHLTforPatatrack.py Lines 232 to 253 in aefd5fd
to work around a problem in the definition of the paths (that do not use the But they should be created after the HLT menu has been imported, not before ? |
Sorry, I was not fully clear. IIUC the |
With @Dr15Jones we traced the cause of the error. The cmssw/HLTrigger/Configuration/python/customizeHLTforCMSSW.py Lines 144 to 149 in 59b5b6e
that itself is called at the end of e.g. HLT_GRun_cff.py
from HLTrigger.Configuration.customizeHLTforCMSSW import customizeHLTforCMSSW
fragment = customizeHLTforCMSSW(fragment,"GRun")
There the
The two Paths get added to the generated configuration file because of this logic in
cmssw/Configuration/Applications/python/ConfigBuilder.py Lines 1575 to 1576 in 59b5b6e
cmssw/Configuration/Applications/python/ConfigBuilder.py Lines 2232 to 2240 in 59b5b6e
The Paths from the HLT menu get added to self.process.paths , but the two Paths that appear as None in the HLTSchedule are not added to self.blacklist_paths , and therefore lines to define them explicitly are generated. Also, the two Path definitions disappear from `step2*.py` with #35722.
|
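The effect described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not the ConfigBuilder code: `Path` and `paths_to_dump` are stand-ins invented for the illustration, the path labels are made up, and the `isinstance` filter is assumed to be the reason why `None` entries never reach the blacklist.

```python
# Toy model only: Path and paths_to_dump are stand-ins, not CMSSW classes/functions.
class Path:
    def __init__(self, label):
        self.label = label


def paths_to_dump(process_paths, hlt_schedule):
    """Return the labels of paths that would be written out explicitly.

    process_paths: dict label -> Path (stand-in for process.paths)
    hlt_schedule:  list of Path or None (stand-in for process.HLTSchedule)
    """
    # Assumed filter: only real Path objects are blacklisted, None is skipped
    blacklist = [p for p in hlt_schedule if isinstance(p, Path)]
    return [label for label, p in process_paths.items()
            if not any(p is b for b in blacklist)]


# Made-up labels; two of the three paths show up as None in the schedule
paths = {name: Path(name) for name in ("HLT_A_v1", "AlCa_X_v1", "AlCa_Y_v1")}
schedule = [paths["HLT_A_v1"], None, None]

dumped = paths_to_dump(paths, schedule)
print(dumped)  # the two paths that were None in the schedule
```

In this model a path is dumped explicitly whenever it is known to the process but its schedule entry is not a real Path object, which matches the symptom of the two extra Path definitions in the step2 config.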
@makortel thanks for the proposed fix.
Mhm, the whole idea of the |
In order to support the same functionality as |
+1 |
This issue is fully signed and ready to be closed. |
class ProcessFragment(object):
    def __init__(self, process):
        if isinstance(process, Process):
            self.__process = process
        elif isinstance(process, str):
            self.__process = Process(process)
            #make sure we do not override the defaults
            del self.__process.options
            del self.__process.maxEvents
            del self.__process.maxLuminosityBlocks
        else:
            raise TypeError('a ProcessFragment can only be constructed from an existig Process or from process name')
    def __dir__(self):
        return [ x for x in dir(self.__process) if isinstance(getattr(self.__process, x), _ConfigureComponent) ]
    def __getattribute__(self, name):
        if name == '_ProcessFragment__process':
            return object.__getattribute__(self, '_ProcessFragment__process')
        else:
            return getattr(self.__process, name)
    def __setattr__(self, name, value):
        if name == '_ProcessFragment__process':
            object.__setattr__(self, name, value)
        else:
            setattr(self.__process, name, value)
    def __delattr__(self, name):
        if name == '_ProcessFragment__process':
            pass
        else:
            return delattr(self.__process, name)
Do you have any idea why the fix made for the |
Hmm, that is a good point. I think what happened is that the I opened a new issue #35842 to figure out how we could eventually enforce this rule on |
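The delegation pattern used by the ProcessFragment class quoted above can be tried out in isolation. The following is a minimal sketch with made-up classes (`Target` and `Proxy` are not CMSSW names): every attribute access except the name-mangled private member is forwarded to the wrapped object, so an assignment made through the wrapper ends up going through the wrapped object's own attribute handling.

```python
# Stand-alone illustration; Target and Proxy are hypothetical, not CMSSW classes.
class Target:
    pass


class Proxy:
    def __init__(self, target):
        # Name mangling stores this as '_Proxy__target'
        self.__target = target

    def __getattribute__(self, name):
        if name == '_Proxy__target':
            # The wrapped object itself lives on the proxy
            return object.__getattribute__(self, name)
        # Everything else is read from the wrapped object
        return getattr(self.__target, name)

    def __setattr__(self, name, value):
        if name == '_Proxy__target':
            object.__setattr__(self, name, value)
        else:
            # Assignments are forwarded to the wrapped object
            setattr(self.__target, name, value)


target = Target()
proxy = Proxy(target)

proxy.a = 1   # lands on target, not on proxy
target.b = 2  # visible through the proxy as well
print(target.a, proxy.b)
```

In this toy the proxy holds no state of its own besides the wrapped object, which is the same design the quoted class follows with its `_ProcessFragment__process` member.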
Example log file
Only occurs in GPU IBs