New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restructure VID Python/FWLite code and avoid generating dictionaries for Reco*/*Identification
packages
#35953
Restructure VID Python/FWLite code and avoid generating dictionaries for Reco*/*Identification
packages
#35953
Conversation
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35953/26361
|
A new Pull Request was created by @guitargeek (Jonas Rembser) for master. It involves the following packages:
@jpata, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35953/26362
|
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35953/26367
|
Pull request #35953 was updated. @smuzaffar, @Dr15Jones, @makortel, @cmsbuild, @slava77, @jpata can you please check and sign again. |
Reco*/*Identification
packages
Sorry, I should have made the PR a draft before. I'm done with the force pushes now |
namespace PhysicsTools_SelectorUtils { | ||
struct dictionary { | ||
pat::strbitset strbitset; | ||
edm::Wrapper<pat::strbitset> wstrbitset; | ||
std::vector<pat::strbitset> vstrbitset; | ||
edm::Wrapper<std::vector<pat::strbitset> > wvstrbitset; | ||
|
||
VersionedSelector<edm::Ptr<reco::Photon> > vPhotonSelector; | ||
VersionedSelector<edm::Ptr<pat::Photon> > vPatPhotonSelector; | ||
VersionedSelector<edm::Ptr<reco::Muon> > vMuonSelector; | ||
VersionedSelector<edm::Ptr<reco::GsfElectron> > vGsfElectronSelector; | ||
VersionedSelector<edm::Ptr<pat::Electron> > vPatElectronSelector; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This struct is not needed anymore (for a long time).
namespace PhysicsTools_SelectorUtils { | |
struct dictionary { | |
pat::strbitset strbitset; | |
edm::Wrapper<pat::strbitset> wstrbitset; | |
std::vector<pat::strbitset> vstrbitset; | |
edm::Wrapper<std::vector<pat::strbitset> > wvstrbitset; | |
VersionedSelector<edm::Ptr<reco::Photon> > vPhotonSelector; | |
VersionedSelector<edm::Ptr<pat::Photon> > vPatPhotonSelector; | |
VersionedSelector<edm::Ptr<reco::Muon> > vMuonSelector; | |
VersionedSelector<edm::Ptr<reco::GsfElectron> > vGsfElectronSelector; | |
VersionedSelector<edm::Ptr<pat::Electron> > vPatElectronSelector; | |
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, thanks for the suggestion!
@@ -9,4 +9,5 @@ | |||
<class name="MuonVPlusJetsIDSelectionFunctor"/> | |||
<class name="PVSelector"/> | |||
<class name="RunLumiSelector"/> | |||
<class pattern="VersionedSelector<*>"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The classes_def.xml
should list all types explicitly (we had not so long ago a campaign to remove all patterns).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright! I fixed this
namespace edmplugin { | ||
template <class T> | ||
class PluginFactory; | ||
} | ||
typedef edmplugin::PluginFactory<AnyMVAEstimatorRun2Base*(const edm::ParameterSet&)> AnyMVAEstimatorRun2Factory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me this sounds more of a reason to move the plugin factory definition out from this file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I don't see why that's better, but it's also fine for me :) I have now put it in a separate file called AnyMVAEstimatorRun2Factory.h
. Are you fine with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I don't see why that's better, but it's also fine for me :)
The users of AnyMVAEstimatorRun2Base
do not need to know about the factory (only the code creating the object needs the factory). This separation is actually a common pattern in CMSSW (I'd even say vast majority of plugin factories are defined that way, but I didn't do a thorough analysis). The split also avoids the need to explicitly forward-declare types from the framework (code rule 4.7).
I have now put it in a separate file called
AnyMVAEstimatorRun2Factory.h
. Are you fine with that?
That's good, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, got it!
@@ -47,13 +71,13 @@ class ElectronMVAEstimatorRun2 : public AnyMVAEstimatorRun2Base { | |||
std::vector<int> nVariables_; | |||
|
|||
// Data members | |||
std::vector<std::unique_ptr<const GBRForest>> gbrForests_; | |||
std::vector<const GBRForest*> gbrForests_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the smart pointer removed? (was not clear to me in the commit message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because you can't use a unique_ptr<T>
if T is only forward declared. But that was an over-optimization anyway, I included now GBRForest.h
again, and generating the dictionaries for the ElectronMVAEstimatorRun2
is still very fast.
Hi @makortel, thanks for the review! I have addressed your comments, let me know if I should follow up on anything. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35953/26396
|
-1 Failed Tests: RelVals-INPUT RelVals-INPUT
Comparison SummarySummary:
|
All the unit tests pass, and there are no relevant reco differences. 🚀 |
+core Thanks @guitargeek! |
+reconstruction
|
This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2) |
+1 |
merge |
@guitargeek There appear in IB (and also PR tests) some warnings about duplicated dictionaries or dictionaries defined in the wrong place. Would you please have a look?: https://cmssdt.cern.ch/SDT/cgi-bin/showDupDict.py/slc7_amd64_gcc10/www/tue/12.3-tue-11/CMSSW_12_3_X_2022-01-25-1100/testLogs/dupDict-lostDefs.log Searching for 'classes_def.xml' in '/data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_12_3_X_2022-01-20-1100/src'. ./PhysicsTools/SelectorUtils/src/classes_def.xml |
CMSSW_12_3_X_2022-01-25-1100 appears to be a patch release. IIUC, we need a full build for the duplicate dictionaries to go away. |
CMSSW_12_3_X_2022-01-25-1700 is a full build, and the complaint of dictionaries in wrong package is still there I think in this case we should update the checker script instead
In general all dictionaries of DataFormat classes should be declared in the package that defines the class, but in this case |
@makortel , I will update the dict rule defined here https://github.com/cms-sw/cmssw/blob/master/Utilities/ReleaseScripts/scripts/duplicateReflexLibrarySearch.py#L32 |
PR description:
I took issue reported in #35754 as an opportunity to simplify the VID code a bit on the Python side.
The key class for the Python/FWLite support of VID is the
VersionedSelector<>
in PhysicsTools/SeletorUtils. The dictionaries for the different template instances of this class were split over the Electron/Muon/Photon identification packages. Now, they are all in thePhysicsTools/SeletorUtils
library.Then, there were dictionaries in the 3 identification packages for almost trivial classes used in
VIDSelectorBase.py
. However, these classes are tiny template classes that are only used in said python scipt, so I don't think we need dictionaries for them. The classes are small enough such that including them viagInterpreter.Declare
is very fast.Finally, there was a dictionary produced for the
ElectronMVAEstimatorRun2
class, which I suggest to also not do anymore. There is only one use case for this: the class is used in theElectronIdentification/python/FWLite.py
utility file. But again, declaring theElectronMVAEstimatorRun2
is very quick (after avoiding some includes), so it's not necessary to generate dictionaries in CMSSW just for this one usecase (which is outdated anyway because FWLite usage is going down since NanoAOD I guess).More information can also be found in the commit messages.
PR validation:
test_eleid_fwlite.py
runsVIDElectronSelector
,VIDPhotonSelector
,VIDMuonSelector
if this PR is a backport please specify the original PR and why you need to backport that PR:
No backport intended.