Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pack triggerobject standalone and photons/muon updated packing for miniaod #18739

Merged
merged 14 commits into from
Jun 1, 2017

Conversation

arizzi
Copy link
Contributor

@arizzi arizzi commented May 15, 2017

This PR packs further the TriggerObjectStandalone for miniaod (from 6Kb/ev on ttbar to 3.8kb/ev) as well as photon and muon objects.

The label packing in trigger object is now thread-safe and can be eventually further improved once the new function EventBase::getParameterSet is integrated in the release.

We could merge this as-is and then improve the access via the new function once available

@gpetruc @Dr15Jones

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @arizzi for master.

It involves the following packages:

DataFormats/PatCandidates
PhysicsTools/PatAlgos

@perrotta, @cmsbuild, @slava77, @monttj, @davidlange6 can you please review it and eventually sign? Thanks.
@TaiSakuma, @gouskos, @JyothsnaKomaragiri, @imarches, @ahinzmann, @acaudron, @mmarionncern, @rappoccio, @jdolen, @nhanvtran, @gpetruc, @gkasieczka, @schoef, @ferencek, @mverzett, @mariadalfonso, @pvmulder, @cbernet this is something you requested to watch as well.
@davidlange6 you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor

slava77 commented May 15, 2017

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 15, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/19839/console Started: 2017/05/15 21:25

@@ -25,6 +31,7 @@ TriggerObjectStandAlone::TriggerObjectStandAlone() :
TriggerObject()
{
filterLabels_.clear();
filterLabelIndices_.clear();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you calling clear in a constructor?

/// Vector of labels of all HLT filters or names od L1 conditions the trigger objects has been used in
std::vector< std::string > filterLabels_;
/// Vector of labels of all HLT filters or names of L1 conditions the trigger objects has been used in
mutable std::vector< std::string > filterLabels_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread safe and therefore could not be put into the Event.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see a classes_def.xml rule that prevents this from being written out. So if it gets filled in someones job they would pay the storage cost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well it is cleared before when packing, anyhow this topic of what we should write and when we should compress (what we should make mutable etc. etc..) is still to be finalized.


//if (0!=(pset=psetRegistry->getMapped(config.parameterSetID()))) {

edm::pset::Registry* psetRegistry = edm::pset::Registry::instance();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will almost certainly fail if someone attempts to use TTree::Draw since the registry will not have been filled. I also do not know if fwlite::Event calls would have filled the Registry.

@@ -49,6 +52,9 @@ namespace pat {
/// The vector is empty for data (size 0), if the according information is not available.
std::vector< bool > pathL3FilterAccepted_;

edm::ParameterSetID psetId_;
std::string processName_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the point of processName_? It is set but can not be accessed and nothing reads it.

@@ -306,3 +328,120 @@ void TriggerObjectStandAlone::unpackPathNames(const edm::TriggerNames &names) {
pathNames_.swap(paths);
}

void TriggerObjectStandAlone::packFilterLabels(const std::vector<std::string> &names) const {
if (!filterLabelIndices_.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not thread safe

@@ -260,6 +274,14 @@ bool TriggerObjectStandAlone::checkIfPathsAreUnpacked(bool throwIfPacked) const
if (!unpacked && throwIfPacked) throw cms::Exception("RuntimeError", "This TriggerObjectStandAlone object has packed trigger path names. Before accessing path names you must call unpackPathNames with an edm::TriggerNames object. You can get the latter from the edm::Event or fwlite::Event and the TriggerResults\n");
return unpacked;
}
bool TriggerObjectStandAlone::checkIfFiltersAreUnpacked(bool unpack) const {
bool unpacked = filterLabelIndices_.empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread safe since filterLabelIndices_ can mutate during a const call.

@@ -260,6 +274,14 @@ bool TriggerObjectStandAlone::checkIfPathsAreUnpacked(bool throwIfPacked) const
if (!unpacked && throwIfPacked) throw cms::Exception("RuntimeError", "This TriggerObjectStandAlone object has packed trigger path names. Before accessing path names you must call unpackPathNames with an edm::TriggerNames object. You can get the latter from the edm::Event or fwlite::Event and the TriggerResults\n");
return unpacked;
}
bool TriggerObjectStandAlone::checkIfFiltersAreUnpacked(bool unpack) const {
bool unpacked = filterLabelIndices_.empty();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not thread safe since filterLabelIndices_ can mutate during a const call.

std::vector< std::string > filterLabels_;
/// Vector of labels of all HLT filters or names of L1 conditions the trigger objects has been used in
mutable std::vector< std::string > filterLabels_;
mutable std::vector< uint16_t > filterLabelIndices_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intended relationship between filterLabels_ and filterLabelIndices_? On a quick look, it appears you only intend one of these to be filled at a time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original design was to have both possibly filled with the first being a fallback when the name cannot be mapped and at the same time acting as the "unpacked" container.
probably we do not need this flexibility and we can stick to one mutable and one persistent but a decision has not been made yet.


const unsigned int n(triggerNames.size());
std::set<std::string> saveTags;
for (unsigned int i=0;i!=n; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I do not know what this code is doing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using namespace trigger;

const ParameterSet& HLTPSet(pset->getParameterSet("@trigger_paths"));
auto triggerNames= HLTPSet.getParameter<vector<string> >("@trigger_paths");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to match what I see in edm::EventBase::triggerNames_ and edm::TriggerNames::TriggerNames(edm::ParameterSet cons&).

      edm::pset::Registry* psetRegistry = edm::pset::Registry::instance();
      edm::ParameterSet const* pset=0;
      if (0!=(pset=psetRegistry->getMapped(triggerResults.parameterSetID()))) {
	 if (pset->existsAs<std::vector<std::string> >("@trigger_paths", true)) {
            TriggerNames triggerNames(*pset);
             ...

and the constructor does

  TriggerNames::TriggerNames(edm::ParameterSet const& pset) {
    triggerNames_ = pset.getParameter<Strings>("@trigger_paths");
...

So I think what you want is

auto triggerNames=pset->getParameter<std::vector<std::string>>("@trigger_paths");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code fragment is correct. What I originally looked at was starting from the PSetID for the trigger results PSet while your code is starting from the top level PSet.

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@arizzi
Copy link
Contributor Author

arizzi commented May 15, 2017

The mutable are just a temporary fix before we decide the final way to go (i.e. allow both version of the object with packed and unpacked or force one of the two to be only transient)

The code in eventBase accesses triggerNames and this is NOT what we need here.
Here we want the module names (see code in HLTConfigData class)...otherwise I would just use Event::triggerNames()!!

Can you comment on how we can be sure that fwlite loads the registry? Is there a way to force it to do so?
In fact while can't we have a proper (api) way to access the provenance in fwlite?

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18739/19839/summary.html

Comparison Summary:

  • You potentially added 3522 lines to the logs
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 1723 differences found in the comparisons
  • DQMHistoTests: Total files compared: 24
  • DQMHistoTests: Total histograms compared: 1829394
  • DQMHistoTests: Total failures: 27216
  • DQMHistoTests: Total nulls: 39
  • DQMHistoTests: Total successes: 1801959
  • DQMHistoTests: Total skipped: 180
  • DQMHistoTests: Total Missing objects: 0
  • Checked 98 log files, 14 edm output root files, 24 DQM output files

@arizzi
Copy link
Contributor Author

arizzi commented May 16, 2017

@Dr15Jones we are adapting it (I'll push the commit asap) to get the Event as input so that we are sure it internally calls triggerNames and hence the registry is filled... but... wouldn't it be better if EventBase has some
EventBase::parameterSet(psetId)
or even
EventBase::registry()
function that is properly implemented in fwlite::Event and edm::Event ?

@Dr15Jones
Copy link
Contributor

I'm actually working on EventBase::parameterSet(psetId) right now :).

@cmsbuild
Copy link
Contributor

Pull request #18739 was updated. @perrotta, @monttj, @cmsbuild, @franzoni, @slava77, @davidlange6 can you please check and sign again.

@arizzi
Copy link
Contributor Author

arizzi commented May 26, 2017

@slava77 all set from our side, please trigger the compile and test again

@perrotta
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented May 26, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/20143/console Started: 2017/05/26 16:52

@slava77
Copy link
Contributor

slava77 commented May 26, 2017

@arizzi
I'm missing an answer to #18739 (comment) about the PATPhotonProducer

@arizzi
Copy link
Contributor Author

arizzi commented May 26, 2017

@gpetruc can you have a look?

@slava77
Copy link
Contributor

slava77 commented May 26, 2017

@cmsbuild please test

the last attempt failed in the infrastructure somewhere

ssl.SSLError: ('The read operation timed out',)
...
Finished: FAILURE

@cmsbuild
Copy link
Contributor

cmsbuild commented May 26, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/20148/console Started: 2017/05/27 01:57

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-18739/20148/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • ROOTFileChecks: Some differences in event products or their sizes found
  • Reco comparison results: 3359 differences found in the comparisons
  • DQMHistoTests: Total files compared: 24
  • DQMHistoTests: Total histograms compared: 1833626
  • DQMHistoTests: Total failures: 19544
  • DQMHistoTests: Total nulls: 65
  • DQMHistoTests: Total successes: 1813837
  • DQMHistoTests: Total skipped: 180
  • DQMHistoTests: Total Missing objects: 0
  • Checked 98 log files, 14 edm output root files, 24 DQM output files

@arizzi
Copy link
Contributor Author

arizzi commented May 29, 2017

@slava77 anything else missing?

@slava77
Copy link
Contributor

slava77 commented May 29, 2017 via email

@slava77
Copy link
Contributor

slava77 commented May 31, 2017

+1

for #18739 163fdb6

  • code changes are in line with the description: modifications are in miniAOD.
  • jenkins tests pass and comparisons with baseline show differences only in miniAOD photons (zeroed-out values) and muons (round-offs introduce small bin-to-bin shifts in a few places)
  • local tests with more events confirm the expected behaviour:
    • significant reduction in the trigger data from 10.6K to 7.2K per event using 200 events in 136.761; about 4% reduction in size of slimmedMuons (3K events in ZMM PU from wf 10242) and about 12% reduction in size in slimmedPhotons (3K events in Hgg PU from wf 10248).
    • CPU changes are somewhat negligible: slimmedPatTrigger takes 1.2 ms/evt compared to 1.5 s/evt spent in miniAOD processing total.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants