Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track Kaboom #243

Merged
merged 4 commits into from
Sep 25, 2019
Merged

Track Kaboom #243

merged 4 commits into from
Sep 25, 2019

Conversation

osschar
Copy link
Collaborator

@osschar osschar commented Sep 18, 2019

  • Track is broken into TrackBase and Track
  • TrackCand is used during track finding, implementing common hit-on-track storage for all candidates stemming from the same seed
  • Track now uses std::vector for hit storage
  • Hit has been extended with bit-packed module-id-in-layer, charge_per_cm, and span_X/Y
  • Bin-file-writer has been modified accordingly
  • This requires a new bin file version (now at 4, samples have to be recreated, some are available at phi1 and phi3 in /data2)

Add HitOnTrack::operator< to allow making sets and maps of them.
…ize.

Make Track and TrackCand ctors from TrackBase explicit.

Modify Hit and mem-file-writer to store charge-per-cm and cluster spans according to expected needs.
@kmcdermo
Copy link
Collaborator

Cool! @osschar , can you share the validation plots here?

@osschar
Copy link
Collaborator Author

osschar commented Sep 18, 2019

There are difficulties creating a new version of our standard tkNtuple as input EDM files are not accessible. New tkNtuples are required as the old ones do not contain cluster size information.

  1. For the old sample validation has been run just before actual storing of cluster sizes has been added. This was the last step and this info is not used during tracking -- so in principle there should be no changes compared to this.
    http://xrd-cache-1.t2.ucsd.edu/matevz/PKF/666-TrackCand-for-HitStorage-v3/
  2. New sample PU70 2018 sim
    http://xrd-cache-1.t2.ucsd.edu/matevz/PKF/666-TrackCand-for-HitStorage-v4-2018-sim-pu70/
  3. New sample PU50 2018 sim
    http://xrd-cache-1.t2.ucsd.edu/matevz/PKF/666-TrackCand-for-HitStorage-v4-2018-sim-pu50/

@cerati
Copy link
Collaborator

cerati commented Sep 19, 2019

So TrackCand contains a pointer to CombCandidate, which in turn is a vector of TrackCand. Do I understand correctly this vector contains the TrackCand itself and all its siblings?
Trying to digest the changes, I may have more questions later...
Thanks!

@makortel
Copy link
Collaborator

If I understand correctly the CMSSW side needs to be updated to call for a constructed Hit object

  • hit.setupAsPixel(...) for pixel hits
  • hit.setupAsStrip(...) for strip hits

(for @cerati I believe) Can you show how exactly the TrackingNtuple as filled for the arguments of these functions (i.e. pix_clustSizeCol, pix_clustSizeRow, str_clustSize branches)? (it would also be nice to get these changes in the ntuple contributed back to upstream CMSSW)

@cerati
Copy link
Collaborator

cerati commented Sep 19, 2019

@makortel, this went into @slava77's fork: slava77/cmssw#102
It should be straightforward to replicate or cherry-pick for the upstream CMSSW.

unsigned int imoduleid;
{
auto ii = module_shortId_hash[ilay].emplace(pix_detId->at(ipix), (unsigned int) module_shortId_hash[ilay].size());
imoduleid = ii.first->second;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(For CMSSW side) Is it enough to provide a unique 12-bit number for each module on a layer? Or does the "short id" as implemented here carry any other meaning?

(I'm just thinking that for CMSSW it would be more natural to derive this kind of mapping from DetId to a "short id" once from the geometry instead of loops over hits)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it would be preferable to extract it from DetId and store that. I wasn't exactly sure how this changes with different tracker phases so I picked a lazy way out. We also have more bits if we need them.

Also, don't forget that we use stereo/mono as different layers.

Do you know how to do it in a reasonable way?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(In lossyspeak) I was thinking to just take the geometry, ask for all modules, loop over them and use a similar module_shortId_has_ trick to map DetId to a "sequence number".

By very quick look the GeomDets actually have a global indexing
https://github.com/cms-sw/cmssw/blob/43c5f0689f3839d0e430dcf63cbf78c2ece55adc/Geometry/CommonDetUnit/interface/GeomDet.h#L86-L88
that is used at least for tracker
https://github.com/cms-sw/cmssw/blob/43c5f0689f3839d0e430dcf63cbf78c2ece55adc/Geometry/TrackerGeometryBuilder/src/TrackerGeometry.cc#L141
One option would be to use that directly (the geometry seems to internally map DetIds to const GeomDet* with a std::unordered_map so it should not be much slower than going through std::unordered_map<unsigned, unsigned>). I'm not sure if everything would fit in 12 bits though (in the available 18 for sure, I don't know on the top of my head whether phase2 tracker could hit 260k modules in total). In a sense this index would be overkill (it would be unique through the tracker), but it would still be unique within a layer :)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slava once pointed me to this: https://github.com/cms-sw/cmssw/blob/02d4198c0b6615287fd88e9a8ff650aea994412e/Geometry/TrackerNumberingBuilder/README.md

And it scared me enough to go the hash_map way :) The largest number of modules I saw for my scheme was between 860 and 880 ... for the outer TOB layer.

But you are right, we can have an id that is unique across more than just "our" layer.

On the other hand, we could also have an id, that is only unique across potentially overlapping modules ... but this then becomes a "map coloring problem" and is probably even harder.

Now, if we ever need to access per module state (orientation, dead/always-on strips), this id would also be a "natural" index into the module_state vector.

Sigh, I guess I'm not helping. :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slava once pointed me to this: https://github.com/cms-sw/cmssw/blob/02d4198c0b6615287fd88e9a8ff650aea994412e/Geometry/TrackerNumberingBuilder/README.md

I also stared that page a lot while writing the comments above. While the idea of taking the within-layer data from the DetId itself (with bit masks) sounds intriguing, it won't work for phase2 outer tracker, which really seems to use all 20 lowest bits (thus not fitting in the available 18 bits). Everything else seems to fit in 18 bits (note that for phase1/2 pixel barrel layer starts at bit 20, but has the lowest 2 bits unused).

I'm really fine with the current solution (CMSSW tracking uses similar "sequence numbering" as well instead of DetIds in various places). I was really after that (for now at least) the only requirements for this number are

  • unique within mkFit layer
  • fits in 12 bits

and I interpret your comment ("unique across potentially overlapping modules ") such that this is indeed the case.

@osschar
Copy link
Collaborator Author

osschar commented Sep 19, 2019

@cerati Yes, CombCand is a vector of TrackCands (as it was before a vector of Tracks). We need a back pointer TrackCand->ComCand as ComCand now implements hit storage for all TrackCands it manages. Well, it could eventually also access seed-related info, if needed, but I don't think it does right now.

@osschar
Copy link
Collaborator Author

osschar commented Sep 23, 2019

The 10mu and PU50 and 70 samples are on UCSD phi machines in /data2.

Validation for all UCSD machines (and proper PR number):
http://xrd-cache-1.t2.ucsd.edu/matevz/PKF/243-TrackKaboom-2018-sim-pu50/
Note: the titles of histograms still say PU70 ... but this is 2018 PU50.

@kmcdermo, can you please check these titles and whatever else might need to be modified for validation/benchmarking, I only changed the data-sample locations on UCSD machines. Oh, and you'll probably want to copy these samples over to Cornell and modify the locations for LNX machines, too.

@osschar
Copy link
Collaborator Author

osschar commented Sep 25, 2019

I fixed the PU70 on histo titles to PU50 -- the plots are on the same location:
http://xrd-cache-1.t2.ucsd.edu/matevz/PKF/243-TrackKaboom-2018-sim-pu50/

@kmcdermo
Copy link
Collaborator

Attaching some quick slides regarding this PR and then will merge: track_kaboom_validation.pdf

Overall, when apples-to-apples with the old ttbar+PU70 2017 samples, speedup increases by 5-10% in building on a single thread, and maybe 2-3% at full load for full loop time. Efficiency also increases, mostly at low pT, in the endcaps, and at low nLayers.

With the new 2018 samples, given the 40-50% loss of quadruplets, it is a bit hard to make definitive statements. For sure, the overall time goes down when compared to devel (by about 40-50%). It seems that the PU50 samples actually take slightly longer to run than the PU70. N.B. The tracking performance also improves very nicely at high eta and at low nLayers with PU50.

@kmcdermo kmcdermo merged commit e864ed9 into devel Sep 25, 2019
@cerati
Copy link
Collaborator

cerati commented Sep 25, 2019 via email

@osschar
Copy link
Collaborator Author

osschar commented Sep 25, 2019

Not significantly :) There are two possible causes:

  1. there is no more limit on N_hits (also for cmssw sim tracks)
  2. score is now stored as float (not quantized into Status bits)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants