Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HGCal] Run HGCalValidator only on the hard scatterer event and correcting the mess from merging #27478

Merged
merged 7 commits into from Jul 14, 2019
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
24 changes: 21 additions & 3 deletions Validation/HGCalValidation/plugins/HGCalValidator.cc
Expand Up @@ -192,6 +192,18 @@ void HGCalValidator::dqmAnalyze(const edm::Event& event,
histoProducerAlgo_->fill_info_histos(histograms.histoProducerAlgo, totallayers_to_monitor_);
}

//Consider CaloParticles coming from the hard scatterer, excluding the PU contribution.
std::vector<CaloParticle> caloParticlesFromHardScat;
for (auto& it_caloPart : caloParticles) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be const auto?

if (it_caloPart.g4Tracks()[0].eventId().event() != 0 or it_caloPart.g4Tracks()[0].eventId().bunchCrossing() != 0) {
LogDebug("HGCalValidator") << "Excluding CaloParticles from event: "
<< it_caloPart.g4Tracks()[0].eventId().event()
<< " with BX: " << it_caloPart.g4Tracks()[0].eventId().bunchCrossing() << std::endl;
continue;
}
caloParticlesFromHardScat.push_back(it_caloPart);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is making a copy of each hard-scatter caloparticle. Some alternatives:

  • store pointers to hard-scatter caloparticles
  • store vector<size_t> with a list of hard-scatter indices, and then select only those indices from the original vector
    Both of these alternatives have worse memory locality, but reduce memory usage and copying overhead. It would be interesting to see if these improve the performance any further.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kpedro88 ,

Using pointers or the selected indices has far reaching consequences to the whole package since the input to the vast
majority of the methods in Validation/HGCalValidation/src/HGVHistoProducerAlgo.cc should be changed and subsequently
a lot of code. Could we please go on with the emplace_back?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine for now, but if the memory usage becomes concerning, these alternatives should be revisited.

Copy link
Contributor

@rovere rovere Jul 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apsallid move the filter on the CaloParticles where you actually use them, i.e. one level down the stack.
In this way, you do not have to copy anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rovere i tested that but for some reason it is really slow. Maybe because we need that check in 5 different places while the code in these places will loop in any case through all CaloParticles which is of the order of 10000. By reducing the collection to ~200 CaloParticles once in the beginning we gain speed. I do not know what else to think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apsallid doesn't sound like a good explanation. Are you sure, by any chance, you left some part of the code w/o the selection?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@apsallid otherwise use the vector of indices down the stack, so that you don't need to loop over and over and make the very same selection.

}

// ##############################################
// fill caloparticles histograms
// ##############################################
Expand Down Expand Up @@ -224,7 +236,8 @@ void HGCalValidator::dqmAnalyze(const edm::Event& event,
w,
clusters,
densities,
caloParticles,
// caloParticles,
caloParticlesFromHardScat,
hitMap,
cummatbudg,
totallayers_to_monitor_,
Expand All @@ -237,8 +250,13 @@ void HGCalValidator::dqmAnalyze(const edm::Event& event,

if (domulticlustersPlots_) {
w++;
histoProducerAlgo_->fill_multi_cluster_histos(
histograms.histoProducerAlgo, w, multiClusters, caloParticles, hitMap, totallayers_to_monitor_);
histoProducerAlgo_->fill_multi_cluster_histos(histograms.histoProducerAlgo,
w,
multiClusters,
// caloParticles,
caloParticlesFromHardScat,
hitMap,
totallayers_to_monitor_);
}

//General Info
Expand Down
48 changes: 0 additions & 48 deletions Validation/HGCalValidation/src/HGVHistoProducerAlgo.cc
Expand Up @@ -2156,54 +2156,6 @@ void HGVHistoProducerAlgo::multiClusters_to_CaloParticles(const Histograms& hist
}
} //end of loop through sim hits of current calo particle

LogDebug("HGCalValidator") << std::setw(8) << "LayerId:\t" << std::setw(12) << "caloparticle\t" << std::setw(15)
<< "cp total energy\t" << std::setw(15) << "cpEnergyOnLayer\t" << std::setw(14)
<< "CPNhitsOnLayer\t" << std::setw(18) << "mclWithMaxEnergyInCP\t" << std::setw(15)
<< "maxEnergyMCLinCP\t" << std::setw(20) << "CPEnergyFractionInMCL"
<< "\n";
LogDebug("HGCalValidator") << std::setw(8) << layerId << "\t" << std::setw(12) << cpId << "\t" << std::setw(15)
<< cP[cpId].energy() << "\t" << std::setw(15) << CPenergy << "\t" << std::setw(14)
<< CPNumberOfHits << "\t" << std::setw(18) << mclWithMaxEnergyInCP << "\t"
<< std::setw(15) << maxEnergyMCLperlayerinCP << "\t" << std::setw(20)
<< CPEnergyFractionInMCLperlayer << "\n";

for (unsigned int i = 0; i < CPNumberOfHits; ++i) {
auto& cp_hitDetId = cPOnLayer[cpId][layerId].hits_and_fractions[i].first;
auto& cpFraction = cPOnLayer[cpId][layerId].hits_and_fractions[i].second;

bool hitWithNoMCL = false;
if (cpFraction == 0.f)
continue; //hopefully this should never happen
auto hit_find_in_MCL = detIdToMultiClusterId_Map.find(cp_hitDetId);
if (hit_find_in_MCL == detIdToMultiClusterId_Map.end())
hitWithNoMCL = true;
auto itcheck = hitMap.find(cp_hitDetId);
const HGCRecHit* hit = itcheck->second;
float hitEnergyWeight = hit->energy() * hit->energy();
for (auto& lcPair : cPOnLayer[cpId][layerId].layerClusterIdToEnergyAndScore) {
unsigned int multiClusterId = lcPair.first;
float mclFraction = 0.f;

if (!hitWithNoMCL) {
auto findHitIt = std::find(detIdToMultiClusterId_Map[cp_hitDetId].begin(),
detIdToMultiClusterId_Map[cp_hitDetId].end(),
HGVHistoProducerAlgo::detIdInfoInMultiCluster{multiClusterId, 0, 0.f});
if (findHitIt != detIdToMultiClusterId_Map[cp_hitDetId].end())
mclFraction = findHitIt->fraction;
}
if (mclFraction == 0.) {
mclFraction = -1.;
}
//Observe here that we do not divide as before by the layer cluster energy weight. We should sum first
//over all layers and divide with the total CP energy over all layers.
lcPair.second.second += (mclFraction - cpFraction) * (mclFraction - cpFraction) * hitEnergyWeight;
LogDebug("HGCalValidator") << "multiClusterId:\t" << multiClusterId << "\t"
<< "mclfraction,cpfraction:\t" << mclFraction << ", " << cpFraction << "\t"
<< "hitEnergyWeight:\t" << hitEnergyWeight << "\t"
<< "currect score numerator:\t" << lcPair.second.second << "\n";
}
} //end of loop through sim hits of current calo particle

if (cPOnLayer[cpId][layerId].layerClusterIdToEnergyAndScore.empty())
LogDebug("HGCalValidator") << "CP Id: \t" << cpId << "\t MCL id:\t-1 "
<< "\t layer \t " << layerId << " Sub score in \t -1"
Expand Down