
Lossy Calorimeter Truth Compression via PHG4Shower #101

Merged


mccumbermike
Contributor

This calorimeter branch is running well enough now that I would like Jin to exercise the code, comment on the implementation, and look for bugs. I still need to add his sub-showers, and I expect to hear about those again, but that can be done in a later push to this branch. There is likely an edge use case or two that I have not explored.

I will pull some slides together to explain the various changes at Tuesday's simulation meeting, but for posterity here is a summary of my thoughts:

This is a deep and extensive rewrite of the underlying mechanics of the calorimeter truth information, and it will require more extensive testing before we merge it, but I wish to start that discussion now. This pull request impacts files in g4main, g4detectors, g4cemc, and g4eval. It will bring our DST size below 2 MB/event, down from hundreds of MB/event (a compression factor of 50-200 is expected, depending on the scenario). It does this by deleting g4hits, g4cells, secondary g4particles, and their g4vertexes for the non-tracking subsystems. The compression can be extended to forward calorimetry with a few macro lines. It preserves the truth ancestry by introducing a new truth structure, the PHG4Shower, created by the PHG4TruthSubsystem, which summarizes the energy deposits from primary particles. A new ancestry path is created between the towers and these objects, and then all the redundant substructure is removed by PHG4DstCompressReco, which lives in the g4eval library.
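
As a rough illustration of the lossy step described above, here is a minimal standalone sketch. `ToyHit`, `ToyShower`, and `compress_hits` are hypothetical names invented for this example and are not the PHG4Hit, PHG4Shower, or PHG4DstCompressReco interfaces. The sketch only assumes that each energy deposit can be attributed to one primary particle.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for a g4hit: one energy deposit tagged with
// the primary particle it descends from.
struct ToyHit {
  int hit_id;
  int primary_id;
  double edep;  // deposited energy, arbitrary units
};

// Hypothetical stand-in for a shower summary: one entry per primary.
struct ToyShower {
  int primary_id = -1;
  double edep = 0.0;
  unsigned int nhits = 0;
};

// Summarize the hits per primary particle, then drop the hits.
// Dropping the hits is what makes the compression lossy: only the
// per-primary totals survive.
std::map<int, ToyShower> compress_hits(std::vector<ToyHit>& hits) {
  std::map<int, ToyShower> showers;
  for (const ToyHit& hit : hits) {
    assert(hit.edep >= 0.0);
    ToyShower& shower = showers[hit.primary_id];
    shower.primary_id = hit.primary_id;
    shower.edep += hit.edep;
    ++shower.nhits;
  }
  hits.clear();
  return showers;
}
```

Calling `compress_hits` on a vector of deposits returns one summary per primary and leaves the vector empty, which mirrors the idea of keeping the ancestry to the primary while discarding the per-hit detail.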

On the CPU side, many things have been done to optimize the processing. Some primary particle indexing was previously done at the end of the truth event action with a tree-spanning search. That has now been incorporated into the truth track user info and is simply passed down during PHG4TruthInfoTrackingAction::PostUserTrackingAction(). The showers are created the same way, as all G4Tracks now own a pointer in the user info object to the shower that they participate in. This allows passing information between this part of the code and all the subsystem SteppingActions in g4detectors. All stepping actions have been modified to add newly created g4hit ids to the shower objects. That's a trick that could be expanded upon for other uses in the future, so others should be made aware of the functionality of GEANT's GimmeSecondaries(). Shower summary fields are then calculated in PHG4TruthInfoEventAction at the end of the event, while the g4hits and other objects are still extant. In a central event all of this truth processing consumes <5% CPU, so it is not exploding the time needed for PHG4Reco to complete. Meanwhile, the calorimeter evaluation is much faster. It used to be comparable to the PHG4Reco module in CPU requirements (and that was a very hard-fought benchmark), but now the trace between cluster and particle via the shower has far fewer levels of hierarchy and consumes much less memory. Each of the calorimeter evals is <2% CPU in those central events, which is faster than individual parts of the tracking reconstruction and orders of magnitude faster than the GEANT4 calls in PHG4Reco.
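
The hand-down of user info can be pictured with the following toy sketch. It does not use the Geant4 or sPHENIX classes. `ToyTrack`, `ToyTrackInfo`, `hand_down_info`, and `record_hit` are hypothetical names, and the sketch assumes only what the paragraph above states: secondaries inherit the parent's primary id and shower pointer, and the stepping code registers new hit ids on that shower.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical shower object that collects the ids of its hits.
struct ToyShower {
  int primary_id;
  std::set<int> hit_ids;
};

// Hypothetical per-track user info: which primary and which shower
// the track belongs to.
struct ToyTrackInfo {
  int primary_id = -1;
  ToyShower* shower = nullptr;
};

struct ToyTrack {
  int track_id;
  ToyTrackInfo info;
};

// Analogue of the post-tracking step: copy the parent's user info to
// every secondary, so no tree-spanning search is needed later.
void hand_down_info(const ToyTrack& parent, std::vector<ToyTrack>& secondaries) {
  assert(parent.info.shower != nullptr);
  for (ToyTrack& secondary : secondaries) {
    secondary.info = parent.info;
  }
}

// Analogue of a stepping action: register a newly created hit id on
// the shower the track participates in.
void record_hit(const ToyTrack& track, int hit_id) {
  if (track.info.shower != nullptr) {
    track.info.shower->hit_ids.insert(hit_id);
  }
}
```

Because every track carries the pointer, a hit made by any descendant lands on the same shower object in constant time, which is the property the paragraph credits for the CPU savings.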

I've added the PHG4Showers as a new map on the truth info container, so this will be another backwards-compatibility-shattering change. If it sounds like I'm doing this a lot these days, that is actually intentional: these things were put aside until after the pCDR review and are now rather pressing before we run a large batch of sims. I'm not planning another of these anytime soon (I'm sick of massive rewrites anyway), so g4main should quiet down after this merge.

Jin, I add these lines after the calorimeter processing to activate the deletions:
PHG4DstCompressReco* compress = new PHG4DstCompressReco("PHG4DstCompressReco");
compress->AddHitContainer("G4HIT_CEMC_ELECTRONICS");
compress->AddHitContainer("G4HIT_CEMC");
compress->AddHitContainer("G4HIT_ABSORBER_CEMC");
compress->AddHitContainer("G4HIT_CEMC_SPT");
compress->AddHitContainer("G4HIT_ABSORBER_HCALIN");
compress->AddHitContainer("G4HIT_HCALIN");
compress->AddHitContainer("G4HIT_HCALIN_SPT");
compress->AddHitContainer("G4HIT_MAGNET");
compress->AddHitContainer("G4HIT_ABSORBER_HCALOUT");
compress->AddHitContainer("G4HIT_HCALOUT");
compress->AddHitContainer("G4HIT_BH_1");
compress->AddHitContainer("G4HIT_BH_FORWARD_PLUS");
compress->AddHitContainer("G4HIT_BH_FORWARD_NEG");
compress->AddCellContainer("G4CELL_CEMC");
compress->AddCellContainer("G4CELL_HCALIN");
compress->AddCellContainer("G4CELL_HCALOUT");
compress->AddTowerContainer("TOWER_SIM_CEMC");
compress->AddTowerContainer("TOWER_RAW_CEMC");
compress->AddTowerContainer("TOWER_CALIB_CEMC");
compress->AddTowerContainer("TOWER_SIM_HCALIN");
compress->AddTowerContainer("TOWER_RAW_HCALIN");
compress->AddTowerContainer("TOWER_CALIB_HCALIN");
compress->AddTowerContainer("TOWER_SIM_HCALOUT");
compress->AddTowerContainer("TOWER_RAW_HCALOUT");
compress->AddTowerContainer("TOWER_CALIB_HCALOUT");
se->registerSubsystem(compress);

@mccumbermike
Contributor Author

We decided at last week's meeting to merge this without the sub-shower modification and to do that in a future pull request. I don't know when I'll find time for that; perhaps we can pull time from both of us on it. But first, can you exercise this pull request and comment on the quality of the evaluation output?

@blackcathj
Member

@mccumbermike Thanks again for constructing this truth compression scheme. Unfortunately my crosscheck progress was delayed by travel.

Since then, I went through the proposed code changes. They look very reasonable to me. I then tried a few tests, including:

  1. Single electron simulation with compression. The evaluation result looks reasonable.

  2. Two-stage embedding production without compression (similar to the test in Storage Revision for PHG4TruthInfoContainer #86). The result looks reasonable apart from requiring one fix:

The new version of PHG4HitContainer has the default constructor moved under the "protected" tag to avoid it being accidentally used by users. However, this also prevents ROOT from reading the container back from a DST, as ROOT calls the default constructor. Therefore, with the proposed version, I could not read back PHG4Hits from a production DST, which required a simple change:

diff --git a/simulation/g4simulation/g4main/PHG4HitContainer.h b/simulation/g4simulation/g4main/PHG4HitContainer.h
index 26c5e47..7aa054c 100644
--- a/simulation/g4simulation/g4main/PHG4HitContainer.h
+++ b/simulation/g4simulation/g4main/PHG4HitContainer.h
@@ -20,6 +20,7 @@ class PHG4HitContainer: public PHObject
   typedef std::pair<ConstIterator, ConstIterator> ConstRange;
   typedef std::set<unsigned int>::const_iterator LayerIter;

+  PHG4HitContainer(); //< default constructor for ROOT IO only. 
   PHG4HitContainer(std::string nodename);

   virtual ~PHG4HitContainer() {}
@@ -58,7 +59,6 @@ class PHG4HitContainer: public PHObject
   PHG4HitDefs::keytype getmaxkey(const unsigned int detid);

  protected:
-  PHG4HitContainer();

   int id; //< unique identifier from hash of node name
   Map hitmap;
  3. Two-stage embedding production with compression. More problems show up when reading back towers, as our tower maker and tower geometry are not yet patched for DST readback plus embedding. I will be happy to work on that, but this should NOT stop us from merging this pull request.

Therefore, I suggest we merge this pull request once we resolve the PHG4HitContainer readback.
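
The readback problem above can be illustrated without ROOT. The sketch below is only an analogy: the class names and `make_for_readback` are hypothetical, and it assumes nothing about ROOT's dictionary machinery beyond what is reported in this thread, namely that the reader needs to default-construct the object from outside the class.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>

// Mirrors the problematic layout: default constructor is protected.
class ProtectedCtorContainer {
 public:
  explicit ProtectedCtorContainer(const std::string& nodename) : name(nodename) {}

 protected:
  ProtectedCtorContainer() {}

 private:
  std::string name;
};

// Mirrors the patched layout: default constructor is public.
class PublicCtorContainer {
 public:
  PublicCtorContainer() {}  // for I/O-style construction only
  explicit PublicCtorContainer(const std::string& nodename) : name(nodename) {}
  const std::string& get_name() const { return name; }

 private:
  std::string name;
};

// Code outside the class cannot default-construct the first type.
static_assert(!std::is_default_constructible<ProtectedCtorContainer>::value,
              "protected default constructor is not reachable from outside");
static_assert(std::is_default_constructible<PublicCtorContainer>::value,
              "public default constructor is reachable from outside");

// Hypothetical stand-in for a reader that must create an empty object
// before filling it from storage.
template <typename T>
std::unique_ptr<T> make_for_readback() {
  return std::unique_ptr<T>(new T());
}
```

`make_for_readback<PublicCtorContainer>()` compiles and returns an empty object, while the same call with `ProtectedCtorContainer` is rejected at compile time, which is the same access restriction the diff above lifts.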

@mccumbermike
Contributor Author

Riiiight... I've used this trick for DST objects and containers before, but
those were fully versioned and the un-versioned class isn't written out to
the DST. Here we are. Good catch. I will patch and merge shortly.

Michael P. McCumber, PhD
Los Alamos National Laboratory
505-709-8161

mccumbermike added a commit that referenced this pull request Jan 13, 2016
Lossy Calorimeter Truth Compression via PHG4Shower
@mccumbermike mccumbermike merged commit 0b1b050 into sPHENIX-Collaboration:master Jan 13, 2016
@mccumbermike mccumbermike deleted the calo_truth_compress branch January 13, 2016 21:59