Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for duplicate cfi filenames while generating cfi files #19729

Merged
merged 4 commits into from Jul 24, 2017

Conversation

wddgit
Copy link
Contributor

@wddgit wddgit commented Jul 13, 2017

This includes fixes for the problems this new check finds in CMSSW. In most of these cases the fix was to change the call from add to addDefault. With addDefault there is no cfi file generated so the problem is eliminated. In all these case, the generated cfi was not used anywhere in CMSSW and fixing them in another way would require expert knowledge of how the modules should be configured.

The fact that nothing used these generated cfi files also means that these problems were probably not causing any actual problems for users trying to run these modules.

I also added an extra option to the edmWriteConfigs function that will create cfi files for a single plugin.

This should resolve issue #19690

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @wddgit (W. David Dagenhart) for master.

It involves the following packages:

CommonTools/RecoAlgos
DQMOffline/Trigger
DQMServices/FileIO
EventFilter/Utilities
FWCore/Framework
FWCore/Modules
FWCore/ParameterSet
IOPool/Output
RecoJets/JetProducers

@perrotta, @smuzaffar, @Dr15Jones, @vazzolini, @kmaeshima, @emeschi, @dmitrijus, @cmsbuild, @slava77, @mommsen, @vanbesien, @davidlange6 can you please review it and eventually sign? Thanks.
@rappoccio, @barvic, @jdolen, @battibass, @makortel, @mtosi, @abbiendi, @TaiSakuma, @jhgoh, @Martin-Grunewald, @calderona, @nhanvtran, @HuguesBrun, @gkasieczka, @yslai, @ahinzmann, @schoef, @trocino, @mariadalfonso, @seemasharmafnal, @rociovilar this is something you requested to watch as well.
@davidlange6 you are the release manager for this.

cms-bot commands are listed here

@wddgit
Copy link
Contributor Author

wddgit commented Jul 13, 2017

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 13, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/21446/console Started: 2017/07/13 18:45

@Dr15Jones
Copy link
Contributor

+1

@Martin-Grunewald
Copy link
Contributor

@wddgit

Sorry, no wait: how did you decide where to switch to addDefault when there is a collision?
The base class vs derived class? Random order?

Also, the (overwriting/overwritten) cfi file of the two is now gone, but remember ConfDB parsing looks at all cfi files, so will have picked up the surviving in the past and now pick up the single one generated, which may not be the "correct" one. IOW, that the cfi file is not used in CMSSW is only part of the use-case of the cfi file.

@wddgit
Copy link
Contributor Author

wddgit commented Jul 13, 2017

@Martin-Grunewald

With the current version of this PR and in the cases where I converted to using addDefault (which is all cases except for a couple in the Core), ALL generated versions of the cfi file for the affected plugins will be gone. There will not be even one left, they will all be gone. How do we proceed forward with this?

Is it possible to determine which plugins types need to be processed by ConfDB? Can you determine that? It seems to me that many plugins are never run in HLT and therefore ConfDB does not need to see them at all.

If there is an existing cfi file that is not generated but was manually put into the python subdirectory, then is the generated cfi file needed by ConfDB? In some of the cases, there is one or more of these. If we use the generated cfi, should we delete the other one?

One problem with me fixing this is that I know little or nothing about these plugins and to properly write a fillDescriptions function one needs to know what the module labels should be and all the possible parameters and their default values ... There is a question whether it is better for me to fix these or the expert. One way forward would be that someone is identified that would be available to answer questions for each plugin needed by ConfDB and then I could take care of the technical details of coding the fillDescriptions function. (Or the expert could do this directly without my involvement). I could try to read the code and guess what the default values should be or read the existing cfi if there is not more than one ...

Also note that if this PR is merged, someone could follow up with subsequent PRs that make further changes to these fillDescriptions functions.

There are 8 separate cases and the details associated with each case are different. Only two of these involve inheritance. Below are the complicated details on all these cases. If you can give me advice on what changes I need to make to get ConfDB to work with these I would appreciate it.

Case 1. VirtualJetProducer has 3 classes that inherit from it and do not define their own fillDescriptions function. The derived classes are CMSInsideOutJetProducer, SubEventGenJetProducer, and SubjetFilterJetProducer. Prior to this PR, all 4 of these classes generated a file named VirtualJetProducer_cfi.py and these overwrote one another with the last one getting saved (I don't know which). The cfi files were the same except the plugin name in the cfi file. Is VirtualJetProducer only there to be a base class and never actually run as a plugin? I see no reference to it in any python file. There is also no reference to CMSInsideOutJetProducer in any python file in the release. RecoHI/HiJetAlgos/python/HiGenJets_cff.py creates 3 modules of type SubEventGenJetProducer. SubjetFilterJetProducer is defined in 2 non-generated cfi files caSubjetFilterGenJets_cfi.py and caSubjetFilterPFJets_cfi.py. Does ConfDB need the plugin types not used anywhere? Does ConfDB use the non-generated cfi files for the other 2 plugin types? If I wrote a fillDescriptions function for them, which of the existing cfi files should I use as a pattern to get the values from?

Case 2 JetConstituentSelector. This is a template with 3 plugin types declared.

using PFJetConstituentSelector = JetConstituentSelectorreco::PFJet;

using PatJetConstituentSelector = JetConstituentSelector<pat::Jet, std::vector<edm::FwdPtrpat::PackedCandidate>>;

using MiniAODJetConstituentSelector = JetConstituentSelector<reco::PFJet, std::vector<edm::FwdPtrpat::PackedCandidate>>;

The fillDescriptions function is not specialized for each of these so the same function gets used for all 3 template instances. So the cfi JetConsituentSelector_cfi.py (yes it is really misspelled) is written 3 times, overwriting until the last is saved. Does ConfDB somehow know how to use this? What plugin does it get associated with? RecoJets/JetProducers/python/ak8PFJets_cfi.py defines 3 different instances of PFJetConstituentSelector (which one should go in fillDescriptions?). The other two appear to not be used anywhere in CMSSW python configurations.

Case 3 HLTTagAndProbeOfflineSource is another template class and there are 5 plugins declared with different template parameters. The fillDescriptions function is not specialized so all 5 plugins use the same function and generate a cfi named hltTagAndProbeOfflineSource_cfi.py. What does ConfDB currently do with that? Does it connect it to the 5 plugin types?

using HLTEleTagAndProbeOfflineSource = HLTTagAndProbeOfflineSourcereco::GsfElectron,reco::GsfElectronCollection;

using HLTPhoTagAndProbeOfflineSource = HLTTagAndProbeOfflineSourcereco::Photon,reco::PhotonCollection;

using HLTElePhoTagAndProbeOfflineSource = HLTTagAndProbeOfflineSourcereco::GsfElectron,reco::GsfElectronCollection,reco::Photon,reco::PhotonCollection;

using HLTMuEleTagAndProbeOfflineSource = HLTTagAndProbeOfflineSourcereco::Muon,reco::MuonCollection,reco::GsfElectron,reco::GsfElectronCollection;

using HLTMuTagAndProbeOfflineSource = HLTTagAndProbeOfflineSourcereco::Muon,reco::MuonCollection;

DQMOffline/Trigger/python/HLTEGTnPMonitor_cfi.py defines a module for HLTEleTagAndProbeOfflineSource and HLTElePhoTagAndProbeOfflineSource. The other 3 appear do not appear anywhere in CMSSW python code. Does ConfDB use either the generated or non-generated cfi files in this case? Or neither?

Case 4 DQMFileSaverOnline and DQMFileSaverPB both define their own fillDescriptions function and there is no template or inheritance issue. They both just use the same module label. And both classes try to generate a file saver_fi.py and one gets overwritten. There are non-generated files DQMFileSaverPB_cfi.py and DQMFileSaverOnline_cfi.py. Does ConfDB use the non-generated files? Does ConfDB use saver_fi.py?

Case 5 EvFOutputModule is a unique case. The same type is given two different plugin names.

In EventFilter/Utilities/plugins/SealModule.cc
...
//legacy name for ConfDB compatibility
typedef EvFOutputModule ShmStreamConsumer;
...
DEFINE_FWK_MODULE(EvFOutputModule);
DEFINE_FWK_MODULE(ShmStreamConsumer);

With the current design this always runs the same fillDescriptions function twice and produces two files named EvFOutputModule_cfi.py, with one overwriting the other. ShmStreamConsumer_cfi.py and OnlineOutput_cfi.py are both non-generated and both define modules with plugin type ShmStreamConsumer. What does ConfDB currently do with this case? Is the generated cfi file needed.

Case 6 There was one case in the Framework unit tests in ToyAnalyzers.cc. I am almost positive ConfDB does not need that one.

Case 7 FWCore/Modules/src/RunLumiEventChecker.cc was using the same module label as another Framework module. I gave it a new unique module label runLumiEventIDChecker_cfi.py. I do not know how ConfDB associates the new cfi name with the plugin, but I assume that should be OK.

Case 8 TimeoutPoolOutputModule inherits from PoolOutputModule and was using the fillDescriptions function from the base class. I defined fillDescriptions function in the derived class and the new generated cfi filename will be TimeoutPoolOutputModule_cfi.py.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-19729/21446/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 23
  • DQMHistoTests: Total histograms compared: 2021747
  • DQMHistoTests: Total failures: 44271
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 1977310
  • DQMHistoTests: Total skipped: 166
  • DQMHistoTests: Total Missing objects: 0
  • Checked 93 log files, 14 edm output root files, 23 DQM output files

@Martin-Grunewald
Copy link
Contributor

Martin-Grunewald commented Jul 14, 2017

@wddgit

Hi David,

The file names and module label themselves are actually not used, the plugin class name is taken from the file content (1st parameter of the cfi set of parameters). So the file names matter only when they are identical and thus files overwrite each other.

ShmStreamConsumer/EvFOutputModule is actually a messy special case which is handled differently in ConfDB and no one remembers how or why - perhaps it is better to not touch anything here for this case!

For the VirtualJetProducer, its fillDescription is split into the parameter definition part and the actual descriptor call, to allow, say, the subclasses FastjetJetProducer and HTTopJetProducer (as fixed by Andrea) to put together a new set with using the base class parameters without repeating them verbatim.

The other plugins do not seem to be used in HLT menus in ConfDb at this time.
If they are needed in the future, I guess one could ask the requestor to fix the underlying issue.

BTW, for templated classes, we have the same issue in HLT code, but use the following construct:

template<typename T>
void
HLTSinglet<T>::fillDescriptions(edm::ConfigurationDescriptions& descriptions) {
  edm::ParameterSetDescription desc;
  makeHLTFilterDescription(desc);
  desc.add<edm::InputTag>("inputTag",edm::InputTag("hltCollection"));
  desc.add<int>("triggerType",0);
  desc.add<double>("MinE",-1.0);
  desc.add<double>("MinPt",-1.0);
  desc.add<double>("MinMass",-1.0);
  desc.add<double>("MinEta",-1.0);
  desc.add<double>("MaxEta",-1.0);
  desc.add<int>("MinN",1);
  descriptions.add(defaultModuleLabel<HLTSinglet<T>>(), desc);
}

where defaultModuleLabel contructs a template-T-dependent string to be used as filename.

@dmitrijus
Copy link
Contributor

+1

@wddgit
Copy link
Contributor Author

wddgit commented Jul 14, 2017

Hi Martin. Progress. Only 2 of the 8 cases need more discussion.

For ShmStreamConsumer/EvFOutputModule, does ConfDB use one of the non-generated cfi files which are EventFilter/Utilities/python/ShmStreamConsumer_cfi.py or DQM/SiStripCommissioningSources/python/OnlineOutput_cfi.py? Or does it actually need the generated cfi file with the problem?

I cannot just leave ShmStreamConsumer/EvFOutputModule alone because
it fails the new check if something is not done.

I tried building RecoJets/JetProducers in an unmodified working area without
the changes in this PR. There was a cfi file generated with the name
VirtualJetProducer_cfi.py. It contained this line:

VirtualJetProducer = cms.EDProducer('SubEventGenJetProducer',

It contains the plugin name SubEventGenJetProducer. I do not know
if this plugin name is consistently in this file or if the order of overwriting
varies. Does ConfDB actually use this file? Or does ConfDB use the non-generated
file RecoHI/HiJetAlgos/python/HiGenJets_cff.py which also defines SubEventGenJetProducer?
Or does ConfDB need SubEventGenJetProducer at all?

Given that the other three plugin types related to VirtualJetProducer are not mentioned
in the generated cfi file, can I assume that ConfDB does not need the generated cfi file
for them?

@Martin-Grunewald
Copy link
Contributor

Hmm, the file EventFilter/Utilities/EvFOutputModule_cfi.py is parsed but not used as ShmStreamConsumer is hardwired in ConfDB - so it is not actually needed.

The VirtualJetProducer = cms.EDProducer('SubEventGenJetProducer', is also parsed but also not used at the moment.

So OK, let's go ahead like this.

@wddgit
Copy link
Contributor Author

wddgit commented Jul 18, 2017

Just a comment that might help the daq and reco reviewers. The only effect the changes outside the core will have is that some cfi files that were being automatically generated will no longer be generated. None of these cfi files were being used by any configurations in the CMSSW release.

@perrotta
Copy link
Contributor

+1
Rather technical.
No effect on reco objects or configs.
The several cases listed in #19729 (comment) should guarantee that there is also no negative effect on configurations to be parsed in ConfDB for HLT.

@davidlange6
Copy link
Contributor

merge

@cmsbuild cmsbuild merged commit a2d46b6 into cms-sw:master Jul 24, 2017
@wddgit wddgit deleted the checkForDuplicatesInWriteConfigs branch November 29, 2017 18:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants