Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Timing propagation in HGCAL RECO #28740

Merged
merged 22 commits into from Feb 4, 2020
Merged

Conversation

amartelli
Copy link
Contributor

@amartelli amartelli commented Jan 15, 2020

This PR

  • Introduces an error on the time for recHits (currently only the time value is available)

    • for recHits in the Silicon, this is done taking the measured resolution on the single cell as function of S/N (parameters are configurable)
    • for Scintillators, -1 is forced as error to evidence the fact that the value of the time injected there is currently a dummy value, and not to be used in the following reco chain.
    • update for this the HGCRecHit.h DataFormat that was cloned from an ECAL one
  • The time and error is propagated to layer clusters, computed as weighted mean of the input recHits

    • for 2Dcl the valueMap <ID, time> is changed into valueMap <ID, pair<time, error>>
    • the function to compute the time given a set of inputs (in ComputeClusterTime.h) is updated accordingly and moved from the previous RecoParticleFlow/PFClusterProducer package to RecoLocalCalo/HGCalRecProducers
  • Time and error are propagated to tracksters, computed as weighted mean of the input layerClusters

    • the time compatibility criterium to build doublets is also updated, in terms of n. sigmas, where the time errors of the layerClusters in each doublet are combined.

A validation of the performance was presented at the HGCAL DPG
https://indico.cern.ch/event/866332/contributions/3650459/attachments/1952623/3242126/HGCALtiming_updateNov2019_amartell.pdf

@felicepantaleo @rovere @Shameena01 for reference

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28740/13368

  • This PR adds an extra 80KB to repository

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 29, 2020

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/4409/console Started: 2020/01/29 16:26

@amartelli
Copy link
Contributor Author

change in timing of the RECO process: +1% tested on 1k events particle gun 0PU with RECO+TICL

1% of the whole reco?
Could you please also provide an estimate with some 200 PU (or at least high PU) sample?

Dear Reco conveners,
while recognizing the usefulness of measuring the impact of a PR on the overall timing budget of typical CMS reconstruction job, I think it would be quite unpractical to ask every developer to measure timing changes at every single PR, especially if a 200 PU scenario is involved. Stated in a more positive and propositive way, I'd suggest monitoring the timing directly while testing the PR, so that you will have all the numbers you need to approve the PR at hand, or suggest modifications.
As a side note, the check on timing in your instruction page (link here) is listed under the Reconstruction coordinators section, not under the Reco contant one.

@rovere , as you certainly know reco conveners quite often check the timing of the proposed PRs by themselves before approving, normally even without asking the help of the original developers. In a few circumstances we ask the developers to provide their measurements in order to speed up the procedure. Complicate cases (as for example running on a 200 PU sample) were likely already considered by the author of the code while preparing the PR, or while providing its physics validation: at that point it would be rather straightforward to relaunch the job with the customization needed to setup the timing measurement.

We can make the test by ourselves, of course. Please allow some delay in the conclusion of the review for it. You can deal with the comments already sent, in the meanwhile.

Concerning this, I run it few times at 0PU, and find variations in both directions (+ and - few%)
With 200PU, same effect. I see the timing being in the same ballpark (PR wrt without).

@cmsbuild
Copy link
Contributor

+1
Tested at: 231b930
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-78e3c2/4409/summary.html
CMSSW: CMSSW_11_1_X_2020-01-29-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-78e3c2/4409/summary.html

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/results/JR-comparison/PR-78e3c2/136.793_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 10 differences found in the comparisons
  • DQMHistoTests: Total files compared: 34
  • DQMHistoTests: Total histograms compared: 2697068
  • DQMHistoTests: Total failures: 1
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2696721
  • DQMHistoTests: Total skipped: 346
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 33 files compared)
  • Checked 147 log files, 16 edm output root files, 34 DQM output files

@perrotta
Copy link
Contributor

I've made some timing measurement, not with a PU sample but with wf 21234 (TTbar_14TeV with geometry 2026D44). Looking at the HGCal related modules, I confirm what was suggested by the comment of @amartelli, that is there is no significant cpu time increase due to the additional code:

   +0.065563      +0.02%        11.61 ms/ev ->        12.39 ms/ev hgcalMultiClusters
   +0.015903      +0.01%        19.17 ms/ev ->        19.48 ms/ev hgcalLayerClustersHFNose
   +0.013623      +0.02%        57.47 ms/ev ->        58.26 ms/ev hgcalLayerClusters

for an overall time of

Job total:  3.29485 s/ev ==> 3.36672 s/ev

(numbers roughly compatible within the precision of the method)

@perrotta
Copy link
Contributor

The comparisons of the reco outputs in the automated tests show some missing component, e.g:
image

which is of course due to the type modification in the DataFormat's to allow storing the error together with the time:

-----------------------------------------------------------------
   or, B         new, B      delta, B   delta, %   deltaJ, %    branch 
-----------------------------------------------------------------
      0.0 ->      1789.6       1790     NEWO   0.14     floatfloatstdpairedmValueMap_hgcalLayerClusters_timeLayerCluster_RECO.
    916.9 ->         0.0       -917     OLDO  -0.07     floatedmValueMap_hgcalLayerClusters_timeLayerCluster_RECO.
      0.0 ->       600.9        601     NEWO   0.05     floatfloatstdpairedmValueMap_hgcalLayerClustersHFNose_timeLayerCluster_RECO.
    300.4 ->         0.0       -300     OLDO  -0.02     floatedmValueMap_hgcalLayerClustersHFNose_timeLayerCluster_RECO.
  17603.8 ->     18658.6       1055      5.8   0.08     HGCRecHitsSorted_HGCalRecHit_HGCHEFRecHits_RECO.
 141933.0 ->    149738.0       7805      5.4   0.60     HGCRecHitsSorted_HGCalRecHit_HGCEERecHits_RECO.
    904.2 ->       918.8         15      1.6   0.00     HGCRecHitsSorted_HGCalRecHit_HGCHEBRecHits_RECO.
  84341.6 ->     86831.7       2490      2.9   0.19     HGCRecHitsSorted_HGCalRecHit_HGCHFNoseRecHits_RECO.
-------------------------------------------------------------
  1295036 ->     1307513      12477             1.0     ALL BRANCHES
``

@perrotta
Copy link
Contributor

+1

  • Error on time added to the central value in HGCal recHits, so that they can be propagated to layer clusters and tracksters, as planned
  • Default jenkins tests show the disappearance of the old types stored, while the new ones are present in the new reco outputs

@kpedro88
Copy link
Contributor

+upgrade

@perrotta
Copy link
Contributor

perrotta commented Feb 4, 2020

@Dr15Jones,@smuzaffar,@makortel : is this PR ok for "core"? Please sign, or comment in case you have remarks.

@makortel
Copy link
Contributor

makortel commented Feb 4, 2020

+1

for DataFormats/Common/src/classes_def.xml

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 4, 2020

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @silviodonato, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@silviodonato
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 2290c13 into cms-sw:master Feb 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants