Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove all the ocurrences of SetCanExtend from TrackAnalyzer #27535

Merged
merged 1 commit into from
Jul 17, 2019

Conversation

hbecerri
Copy link
Contributor

PR description:

This PR removes all the occurrences of SetCanExtend from TrackAnalyzer.cc. Solves the issue discussed in #27531

PR validation:

tested using runTheMatrix.py -l 10024

@cmsbuild
Copy link
Contributor

The code-checks are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-27535/10911

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @hbecerri for master.

It involves the following packages:

DQM/TrackingMonitor

@andrius-k, @kmaeshima, @schneiml, @cmsbuild, @jfernan2, @fioriNTU can you please review it and eventually sign? Thanks.
@hdelanno, @makortel, @mtosi, @fioriNTU, @jandrea, @idebruyn, @threus this is something you requested to watch as well.
@davidlange6, @slava77, @fabiocos you are the release manager for this.

cms-bot commands are listed here

@jfernan2
Copy link
Contributor

jfernan2 commented Jul 16, 2019

@hbecerri thanks for this PR, however, despite the discussion in #27535 it would be nice to know why the wf was creating eta histograms from first: (31, -15.000000, 9.000000), second: (31, -9.000000, 15.000000) as shown by @fabiocos on #27528 (comment)
if you are actually setting 26 bins from -2.5 to 2.5 in:
https://cmssdt.cern.ch/lxr/source/DQM/TrackingMonitor/python/TrackingMonitor_cfi.py#0258
Is this line the conflicting one and a right setting?
https://cmssdt.cern.ch/lxr/source/DQM/TrackingMonitor/python/TrackingMonitor_cfi.py#0452

@mmusich
Copy link
Contributor

mmusich commented Jul 16, 2019

@jfernan2

know why the wf was creating eta histograms from first: (31, -15.000000, 9.000000), second: (31, -9.000000, 15.000000)

I think this happened exactly because of the fact that SetCanExtend was activated for this profile (while there is no reason for it to be)

Is this line the conflicting one and a right setting?

What do you mean? It's an era modifier. The phase0 detector had smaller pseudorapidity coverage hence the |eta|<2.5 axis limits, which was extended to |eta|<3 for the new phase1 detector.

@mmusich
Copy link
Contributor

mmusich commented Jul 16, 2019

Can tests be started here please?

@fioriNTU
Copy link
Contributor

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 17, 2019

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/1499/console Started: 2019/07/17 07:42

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-d27f61/1499/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 32
  • DQMHistoTests: Total histograms compared: 2622600
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 44
  • DQMHistoTests: Total successes: 2622254
  • DQMHistoTests: Total skipped: 302
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 31 files compared)
  • Checked 133 log files, 14 edm output root files, 32 DQM output files

@jfernan2
Copy link
Contributor

hold

@cmsbuild
Copy link
Contributor

Pull request has been put on hold by @jfernan2
They need to issue an unhold command to remove the hold state or L1 can unhold it for all

@cmsbuild cmsbuild added the hold label Jul 17, 2019
@jfernan2
Copy link
Contributor

jfernan2 commented Jul 17, 2019

@mmusich It is also disappointing that you skim my comments
What I meant is that the histogram is created with 26 bins from -2.5 to 2.5 and then the crash reports 31 bins, exactly as the Era modifier...
We need to understand what is going on before moving onward, even if the histogram is allowed to be extended, what module is causing the extension?

@mmusich
Copy link
Contributor

mmusich commented Jul 17, 2019

@jfernan2

skim my comments

what does "skim" mean in this context?

created with 26 bins from -2.5 to 2.5

for my education, how can you say it is created with 26 bins?
With this PR it gets correctly booked with |eta|<3 see https://tinyurl.com/y5behksq

image

even if the histogram is allowed to be extended

there is no point in extending this particular histogram, I don't follow your logic.
At any rate I think that David provided a plausible explanation of what is happening in #27531 (comment)

@jfernan2
Copy link
Contributor

jfernan2 commented Jul 17, 2019

what does "skim" mean in this context?

overlook, but never mind

for my education, how can you say it is created with 26 bins?

https://cmssdt.cern.ch/lxr/source/DQM/TrackingMonitor/python/TrackingMonitor_cfi.py#0258

With this PR it gets correctly booked with |eta|<3 see https://tinyurl.com/y5behksq

I was talking about Phase0 Pixel plots

there is no point in extending this particular histogram, I don't follow your logic.

I don't say it is logic to extend it but that the code was allowing to extend it in first instance, and it gets actually extended if you allow it: https://tinyurl.com/y4gaqlsg

At any rate I think that David provided a plausible explanation of what is happening in #27531 > :(comment)

My point is WHERE they get extended in your code, to avoid problems in the future

@mmusich
Copy link
Contributor

mmusich commented Jul 17, 2019

for my education, how can you say it is created with 26 bins?

https://cmssdt.cern.ch/lxr/source/DQM/TrackingMonitor/python/TrackingMonitor_cfi.py#0258

but this is valid only for phase-0 plots and not for phase-1 plots, where the era modifier prevails.

With this PR it gets correctly booked with |eta|<3 see https://tinyurl.com/y5behksq

I was talking about Phase0 Pixel plots

right, but I am still not getting your point. Sorry I must be very dense :(

there is no point in extending this particular histogram, I don't follow your logic.

I don't say it is logic to extend it but that the code was allowing to extend it in first instance, and it gets actually extended if you allow it: https://tinyurl.com/y4gaqlsg

but isn't it true that any plot can be extended?
In my understanding that's a ROOT TH1 feature that we pass onto CMSSW DQM MonitorElements:
https://root.cern.ch/doc/master/classTH1.html#a9f608543655d452dafd1cf6bce83c683

At any rate I think that David provided a plausible explanation of what is happening in #27531 > :(comment)

My point is WHERE they get extended in your code, to avoid problems in the future

They get extended all over the place where it makes sense, like in time or LS series: (see e.g. https://github.com/cms-sw/cmssw/blob/master/DQM/TrackingMonitor/src/TrackingMonitor.cc#L348 for an example).
In my understanding the advantage of using the SetCanExtend is that users do not need to code their own version of the histogram merging, relying instead on the one provided by ROOT as it has been commented yesterday by @fioriNTU here: #27528 (comment)

@fabiocos
Copy link
Contributor

@jfernan2 @mmusich @hbecerri this PR is indeed addressing an issue that is worthwhile to investigate. But in practice #27531 is solving a problem that has broken a pre-release making it unusable, while the other instances of SetCanExtend are not for what we can see. So I consider solving the former issue as urgent. Given that the discussion on this PR might require a bit more time, I will move forward with merging #27531, this PR can just be rebased on that, and possibly update it in case

@mmusich
Copy link
Contributor

mmusich commented Jul 17, 2019

Given that the discussion on this PR might require a bit more time, I will move forward with merging #27531, this PR can just be rebased on that, and possibly update it in case.

OK, provided that @hbecerri is fine in rebasing. Frankly I don't understand why the due investigations on the issue are a reason enough to hold this PR, which is basically doing the same thing as #27531 (which is already DQM signed) just more cleanly.
It seems to me PRs are reviewed differently based on the submitter...

@jfernan2
Copy link
Contributor

jfernan2 commented Jul 17, 2019

@mmusich the line you point only allows it CAN be extended, and it is a LS based plot which makes sense since you don't know the limits a priori, but not the ones we are talking about which have a known range from the beginning.
You are not saying WHERE the code was extended it in a random manner in the range: -15 < eta < 15
Just browse in the 4 plots here and you will see what I am talking about: https://tinyurl.com/y3latly4

And not, I am not reviewing PRs based on the submitter but on the urgency. Just to avoid disappointments to you in the future ;-)

@mmusich
Copy link
Contributor

mmusich commented Jul 17, 2019

You are not saying WHERE the code was extended it in a random manner in the range: -15 < eta < 15

Now I finally see what you mean! I think it's ROOT that eventually decides where to put the boundary, could it be that thee are some nans propagating here?
I think it can debugged by printing the x and y axis ranges of the profile every time the plot is filled.
To reproduce the plot in https://tinyurl.com/y3latly4 which wf shall I use?

@jfernan2
Copy link
Contributor

jfernan2 commented Jul 17, 2019

This is in particular is wf 4.53 but you can see it is happening in half of the workflows, those with a non zero number in:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/baseLineComparisons/CMSSW_11_0_X_2019-07-16-2300+d27f61/32673/dqm-histo-comparison-summary.html
Left column is original code, central PR, right comparison using common bin range.
Thanks

@mmusich
Copy link
Contributor

mmusich commented Jul 17, 2019

Hi @jfernan2 so I re-run step3 of wf 4.53 by adding these changes:

diff --git a/DQM/TrackingMonitor/src/TrackAnalyzer.cc b/DQM/TrackingMonitor/src/TrackAnalyzer.cc
index f1abe45ff75..67e628e20fd 100644
--- a/DQM/TrackingMonitor/src/TrackAnalyzer.cc
+++ b/DQM/TrackingMonitor/src/TrackAnalyzer.cc
@@ -1287,6 +1287,9 @@ void TrackAnalyzer::analyze(const edm::Event& iEvent, const edm::EventSetup& iSe
 
     if (Folder == "Tr") {
       DistanceOfClosestApproachToBSdz->Fill(track.dz(bs.position()));
+      std::cout << "====================================================================="<< std::endl;
+      std::cout << "track.eta():" << track.eta() << "  track.dxy(bs.position())):" <<  track.dxy(bs.position()) << std::endl;
+      std::cout<< "DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmax(): " << DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmax () <<  "    |  DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmin(): " << DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmin () << std::endl;
       DistanceOfClosestApproachToBSVsEta->Fill(track.eta(), track.dxy(bs.position()));
     }

and I identified two transitions where the x-axis range got updated:

=====================================================================
track.eta():2.50979  track.dxy(bs.position())):0.0306698
DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmax(): 2.5    |  DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmin(): -2.5
=====================================================================
track.eta():2.34284  track.dxy(bs.position())):-0.0247759
DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmax(): 7.5    |  DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmin(): -2.5
=====================================================================

and:

=====================================================================
track.eta():-2.50028  track.dxy(bs.position())):0.0215483
DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmax(): 7.5    |  DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmin(): -2.5
=====================================================================
track.eta():-2.22177  track.dxy(bs.position())):-0.0364012
DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmax(): 7.5    |  DistanceOfClosestApproachToBSVsEta->getTH1()->GetXaxis()->GetXmin(): -12.5
=====================================================================

so it seems that indeed ROOT automatically updated the range (by some internal heuristic I guess) when we hit a track with a pseudo-rapidity very close to the boundary eta = 2.5 (which is the correct one for a phase-0 pixel detector workflow such as 4.53 (2012B)).
I don't know exactly what else we can learn from this, a part from the fact that we should in general put an epsilon more than the physical detector boundary to be able to capture tracks whose pseudo-rapidity is enhanced probably by BeamSpot z - widths effects.
Please let me know if you need more debugging...

@jfernan2
Copy link
Contributor

ok @mmusich that explains why the phi plots are not affected: there are no boundaries there...
Thanks!

@jfernan2
Copy link
Contributor

unhold

@jfernan2
Copy link
Contributor

+1

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @davidlange6, @slava77, @smuzaffar, @fabiocos (and backports should be raised in the release meeting by the corresponding L2)

@fabiocos
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants