Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBHE M2: add missing photoStatistics uncertainties #17311

Merged
merged 2 commits into from Feb 3, 2017

Conversation

mariadalfonso
Copy link
Contributor

Added missing photoStatistics uncertainties in the M2 chi2.
clear benefits in both HPD and siPM
http://dalfonso.web.cern.ch/dalfonso/dalfonso_PhotoStatisticsFIX.pdf

This PR will be transparent for the plan0, as the kept the HPD fix commented for now.
a good plus if can be merged in the pre3 as facilitate the next developments.

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @mariadalfonso for CMSSW_9_0_X.

It involves the following packages:

RecoLocalCalo/HcalRecAlgos

@cmsbuild, @cvuosalo, @slava77, @davidlange6 can you please review it and eventually sign? Thanks.
@argiro this is something you requested to watch as well.
@davidlange6, @smuzaffar you are the release manager for this.

cms-bot commands are listed here #13028

if(!channelData.hasTimeInfo() && (charge-ped)>channelData.tsPedestalWidth(ip)) {
// the fcByPE is not in the DB, but is hard coded in python
// https://github.com/cms-sw/cmssw/blob/CMSSW_8_1_0/SimCalorimetry/HcalSimProducers/python/hcalSimParameters_cfi.py#L62
// Note in DIGI (from kPedro): the output number of photoelectrons after smearing is treated very differently for SiPMs: *each* pe is assigned a different time based on a random generation from the Y11 pulse plus the SimHit time. In HPDs, the overall pulse is shaped all at once using just the SimHit time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add some linebreaks in the comment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slava77 sure done

// the fcByPE is not in the DB, but is hard coded in python
// https://github.com/cms-sw/cmssw/blob/CMSSW_8_1_0/SimCalorimetry/HcalSimProducers/python/hcalSimParameters_cfi.py#L62
// Note in DIGI (from kPedro): the output number of photoelectrons after smearing is treated very differently for SiPMs: *each* pe is assigned a different time based on a random generation from the Y11 pulse plus the SimHit time. In HPDs, the overall pulse is shaped all at once using just the SimHit time.
noisePHArr[ip] = (charge-ped)/sqrt(0.3305*16);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by the time we can comment it out, please define a common constant in python and use it both in hcalSimParameters_cfi and in m2 configuration.

in photoelectronsToAnalog = cms.vdouble([0.3305]*16) is a 16-times repeat of 0.3305. do these all add up to lead to "0.3305*16"?


// Note2. (from kPedro): the output number of photoelectrons after smearing is treated very differently for SiPMs: *each* pe is assigned a different time based on a random generation from the Y11 pulse plus the SimHit time. In HPDs, the overall pulse is shaped all at once using just the SimHit time.

noisePHArr[ip] = (charge-ped)/sqrt(0.3305*16);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[reposted, since the earlier one became hidden]

by the time we can comment it out, please define a common constant in python and use it both in hcalSimParameters_cfi and in m2 configuration.

in photoelectronsToAnalog = cms.vdouble([0.3305]*16) is a 16-times repeat of 0.3305. do these all add up to lead to "0.3305*16"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slava77 it is a per-tower (ieta) value in the hcalSimParameters config.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@igv4321, @kpedro88 (including Igor and Kevin in the thread)

Just would like to mention that this is supposed to be ieta-dependent HB constant/factor. That's why there is an array of 16 in the photoelectronsToAnalog = cms.vdouble([0.3305]*16)
So the actual constant is just 0.3305 (sic), as described in the HB QIE8 digitization description:
https://indico.cern.ch/event/603091/contributions/2441161/attachments/1397849/2131705/HPD_QIE8_digitization.pdf

Currently this HB constant(s) is in SIM config, which must not be accessed by RECO, so as Slava mentioned, it should be somewhere on a "neutral ground" (e.g. Calibcalorimetry/HcalPlugins)
So. I'd suggest to make a small config in
http://cmslxr.fnal.gov/source/CalibCalorimetry/HcalPlugins/python/
accessible both by
(1) http://cmslxr.fnal.gov/source/SimCalorimetry/HcalSimProducers/python/hcalSimParameters_cfi.py
(2) http://cmslxr.fnal.gov/source/RecoLocalCalo/HcalRecProducers/python/HBHEMethod2Parameters_cfi.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slava77 I replaced the slides in the PR description with results from noisePHArr[ip] = (charge-ped)/sqrt(0.3305); things get even better, I will update the code (but keep commented) when the bot finish his tests)

@slava77
Copy link
Contributor

slava77 commented Jan 29, 2017

@cmsbuild please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 29, 2017

The tests are being triggered in jenkins.
https://cmssdt.cern.ch/jenkins/job/ib-any-integration/17482/console Started: 2017/01/29 12:32

@cmsbuild
Copy link
Contributor

@cmsbuild
Copy link
Contributor

Comparison job queued.

@abdoulline
Copy link

abdoulline commented Jan 29, 2017 via email

@cmsbuild
Copy link
Contributor

@kpedro88
Copy link
Contributor

@abdoulline as an alternative to rearranging configs, if we are going to be relying on the pe2fc value for reconstruction (rather than just simulation), we may want to specify it per-channel in the database. Supposedly the pe2fc value was initially measured for each HPD by someone at Minnesota (not sure if we can expect the values to be constant over time). We could repurpose the SiPMParameters database object for this - just define an "SiPM type" value corresponding to HPDs. (A bit kludgey, but we've done worse.)

@abdoulline
Copy link

abdoulline commented Jan 29, 2017

@kpedro88 - right, I also thought about using SiPMParameters earlier today, but it looked not quite logical (indeed) at the beginning... But once you also suggest it, I start thinking ("Il n'y a que les imbéciles qui ne changent pas d'avis"(c)) that it well may be the simplest way of "unification" of fCbyPE usage (in HPD case) for M2.
What everybody wants to avoid, I believe, is to have it hardcoded ...

Actually we have right away for non-SiPMs: 0 HcalNoSiPM as SiPMS type.
And M2 is not used neither for HO nor for HF (the latter is also of SiPMS type 0). So defining 3d column in SiPMParameters as 0.3305 for HB/HE HPD readouts should be enough to unambiguously use fCbyPE for SiPMS/HPDs in HB/HE in M2...
And to be used in digitization as well.

@abdoulline
Copy link

abdoulline commented Jan 30, 2017

@kpedro88
after looking closer to what would be needed to update all the existing SiPMs conditions
(actually making a series of new tags with all their IOVs and including them both in data and MC GTs)
https://twiki.cern.ch/twiki/bin/view/CMS/HcalSiPMParameters
and what needs to be changed on the code side (in DIGI to use DB instead of config value),
I believe we have to pursue a simple change of a couple of configs (suggested above) to move fast with HPD photoelectronsToAnalog = cms.vdouble([0.3305]*16) usage in M2
We may come to DB update later...

@kpedro88
Copy link
Contributor

@abdoulline my thought was to define a new "HcalHPD" type (separate from HcalNoSiPM = 0), then HcalSimParameters would pick up the DB value without any code changes needed. But of course you're right that updating all the DB conditions would take some work, so a config change can be pursued first if we want to have it quickly.

@mariadalfonso
Copy link
Contributor Author

mariadalfonso commented Jan 30, 2017

@abdoulline @kpedro88
The HPD fix cannot be pushed beyond the scene.
Given the potential physics impact of this fix, HCAL-dpg jetMET and PPD need to be on board for both the legacy re-reco and 2017 data taking. As I wrote in the PR description, we can put in the release only the fix on the siPM for now.
There is enough time to find a clean solution so that is substainable on long term: especially if this pe->fC is eta/time dependent (can you confirm this ?) and especially now that we will not do the HE-phase1 replacement.

@kpedro88
Copy link
Contributor

@mariadalfonso I tend to agree; if it won't be used for 2017 startup, we have more time to consider. I don't think anyone knows if HPD pe2fC is dependent on time, but there may be some channel dependence. @hatakeyamak @deguio this could be a topic for the DPG to investigate if the initial HPD measurements can be found.

@abdoulline
Copy link

abdoulline commented Jan 30, 2017

@mariadalfonso
you mean "behind the scene" - I agree. It should be accessed and digested by downstream consumers.

In principle, this HPD property can vary (to some small extent) from HPD to HPD.
But this is another story if it can be (re-) done this "detailed" way.
fCperPE depends on HPD gain and QE (quantum/photodetection efficiency) . The first one is changing with time at the level of several % (monitored by LED), while QE was measured some ~15 years ago...
So 0.3305 (!) was probably considered as a good approximation for fCperPE at that time.
Given all other uncertainties, I'd not pay much of attention to a variation of this value right now. It's not a primary parameter (unlike "gain"=GeV/fC), but an estimate of photostatistics variation scale.

@mariadalfonso
Copy link
Contributor Author

@kpedro88 @abdoulline

Given that

  1. the HPD data behave much better than the HPD MC (chi2/timing)
    http://dalfonso.web.cern.ch/dalfonso/dalfonso_PhotoStatisticsFIX.pdf

  2. "the output number of photoelectrons after smearing is treated very differently for SiPMs: each pe is assigned a different time based on a random generation from the Y11 pulse plus the SimHit time. In HPDs, the overall pulse is shaped all at once using just the SimHit time"

Can you put me in the condition to test the HPD simulation with the same siPM approach ?

@kpedro88
Copy link
Contributor

The SiPM approach is needed in order to handle the nonlinearity correctly - the time of each photon hitting the SiPM must be known. It is otherwise statistically equivalent to the HPD approach: random generation (Y11) + smearing (photodetector pulse) vs. smearing (convolution of Y11 + photodetector pulses).

Maybe the ingredients in the HPD pulse shape calculation https://github.com/cms-sw/cmssw/blob/CMSSW_9_0_X/CalibCalorimetry/HcalAlgos/src/HcalPulseShapes.cc#L138 should be revised, but I have no special knowledge on that topic.

@abdoulline
Copy link

abdoulline commented Jan 31, 2017 via email

@abdoulline
Copy link

Kevin, I'm late again :-)
<...>
It is otherwise statistically equivalent to the HPD approach: random generation (Y11) + smearing
(photodetector pulse) vs. smearing (convolution of Y11 + photodetector pulses)
<...>

  • That's what I meant in my last sentence.

@mariadalfonso
Copy link
Contributor Author

@abdoulline
Accidentally increasing the noise term get you "by definition" to a lower chi2 as you change the denominator (so that is a pretty bad way to approach a physics problem).
All the HPD energy quantities were completely badly measured and this should be our primary goal.

I'm left to keep doing "reverse engineering" of what come to me as input hopefully we learn more stuff and take better data that in 2015.

@abdoulline
Copy link

BTW, some time ago, when several ways were considered of improving the HPD DIGI/RECO (MC) situation, one option (second to 105->106 move) was listed as modification of DIGI pulse 125 to deliver DIGIs better suited for existing RECO pulse 105. But it's considered as medium-term task. Hopefully we'll discuss these matters tomorrow morning at the meeting called by Federico.

@igv4321
Copy link
Contributor

igv4321 commented Jan 31, 2017 via email

@kpedro88
Copy link
Contributor

The Poisson smearing for photostatistics occurs before the pulse shape smearing/random generation for both HPDs and SiPMs.

@abdoulline
Copy link

abdoulline commented Jan 31, 2017 via email

@igv4321
Copy link
Contributor

igv4321 commented Jan 31, 2017 via email

@abdoulline
Copy link

Imagine you have one SimHit for simplicity:
E_simhit -> fC -> Np.e. -> Poisson Np.e. -> fC -> distribute over CaloSamples using shape 125 (convolution of all timing components) -> TimeSlewSim -> add electronics noise...

@igv4321
Copy link
Contributor

igv4321 commented Jan 31, 2017 via email

@kpedro88
Copy link
Contributor

Classic pileup mixing just adds more SimHits to the event, which are processed by the digi simulation in the exact same way as signal hits.

Premixing instead creates a simulated RAW dataset with some distribution of interactions per event, and then converts those pileup digis back into CaloSamples to add to the CaloSamples generated in the digi simulation of the signal event.

@abdoulline
Copy link

Before its beginning ("accumulation" of SimHits for given readout) in the regular mixing

@igv4321
Copy link
Contributor

igv4321 commented Jan 31, 2017

OK, thanks. In the "classic pileup mixing", what determines the SimHit timing? Is the time slice chosen at random?

How does one distinguish the MC samples generated by "classic pileup mixing" and by "premixing"?

@kpedro88
Copy link
Contributor

The SimHits come from minbias gen-sim. I think their times get shifted to correspond to the bunch crossing for which they're used, but I'm not 100% sure of the details.

Premixing has only recently been used for central MC production. Usually some part of the campaign will have "premix" in the name.

@abdoulline
Copy link

Indeed, each added PU event (from <N_PU> simulated in each -N1 +N2 BXs = -12
+5 typically ) naturally has its own t_zero +/- N*25ns wrt "main physics
event" in SOI.

As to the premixing... Just would like to mention that DIGI code is filled
with "premixing" pieces (with bifurcations/forks), e.g.
http://cmslxr.fnal.gov/source/SimCalorimetry/HcalSimAlgos/src/HcalElectronicsSim.cc#0011
but admittedly I don't know well enough this quite "intrusive" part developed by M.Hildreth.

@abdoulline
Copy link

Admittedly we've filled this PR thread with quite extended discussions, but now it's probably too late to move to e-mail.,,

Igor, once you continue thinking how to improve signal simulation, I'd like to add a couple of comments:
(1) shape 125 includes several light emission components, but not WLS fiber Y11 explicitly.
http://cmslxr.fnal.gov/source/CalibCalorimetry/HcalAlgos/src/HcalPulseShapes.cc#0138
And since quite long time (if not "from the beginning", at least from 1998?) it was rather empirical mixture to play with the sim pulse shape. Initially - just to best fit TB CaloSamples, as there was only one pulse shape both for DIGI and RECO (and for the latter - only for out-of-reco-window correction).
I agree that HPD and preamp components (perhaps best known two in this "mixture") may not be quite relevant for p.e. considerations, indeed.

(2) As I've already mentioned previously/elsewhere, in the current situation
I'm more concerned by shape TimeSlewSim per se and also by its (non-equivalent) unfolding http://cmslxr.fnal.gov/source/RecoLocalCalo/HcalRecAlgos/src/PulseShapeFitOOTPileupCorrection.cc#0377
And neither of the two may correspond to what's going on in the detector. So at the end M2 deals with the MC shape which is certainly not 105 (neither 125).

As we've discussed on several occasions, in MC we may try (i) to maximally mimic hardware (apparently neither Chris Tully nor Jeremy Mans nor Rick Wilkinson, who developed/maintained DIGI part over ~15 years, achieved that) or (ii) to make MC DIGI-RECO kind of self-consistent to put data vs MC "on par" at the very end.

@mdhildreth
Copy link
Contributor

Just to comment: the reason there are all of those "Branches" in the code for PreMixing is that the hits have to be processed into fC with no additional smearing, noise, slew correction, ion-feedback, etc., etc. All that is required is the energy->charge conversion. It was difficult to do this without some ugly mangling of the code. Sorry!

@abdoulline
Copy link

abdoulline commented Feb 1, 2017 via email

@slava77
Copy link
Contributor

slava77 commented Feb 3, 2017

plots are from CMSSW_9_0_X_2017-01-27-1100, before the plan-0

pi 5-5000 gun

Time in the endcap is quite nicely around 0 without bumps (this is all hits)

chi2 tail is gone
all_sign842vsorig_pie5to5000in2017c_log10hbherechitssorted_hbhereco__reco_obj_obj_chi220

the spike at chi2~1 looks quite suspicious
it apparently mostly comes from hits with energy < 0.1 GeV (70%)
he_chi2_energym2lt01
this doesn't look particularly normal, but probably good enough for now for E<0.1 which is noise in this sample.

in PU35 the same E<0.1
has a much more significant "flattish" component, which seems to be more reasonable for non-noise part
he_chi2_energym2lt01

the change in PU35
all_sign842vsorig_ttbar13tevpu2017wf10224p0c_log10hbherechitssorted_hbhereco__reco_obj_obj_chi220

The reduction is clearly expected given the increase in the chi2 denominator.

Back to pi 5-5000 gun:
low-energy hits are coming out at even lower energy .. it's unclear if knowledge of the noise pulse is that important, but it looks like most of them are pushed below ~30 MeV, which is pretty good if it turns out to be real for data, since we can cut out the noise more effectively
all_sign842vsorig_pie5to5000in2017c_log10hbherechitssorted_hbhereco__reco_obj_obj_energy13

In PU35
all_sign842vsorig_ttbar13tevpu2017wf10224p0c_log10hbherechitssorted_hbhereco__reco_obj_obj_energy12
it looks like the number of in-time hit around 1GeV is ~20-30% smaller (handed over to out-of-time; some simhit info will be needed to see if there is proper loss of signal energy or proper OOT subtraction).

pretty large chunk of neutral hadrons is gone as a downstream effect
all_sign842vsorig_ttbar13tevpu2017wf10224p0c_log10recopfcandidates_particleflow__reco_obj_pt73

Looking at response to gen-level

"jets" from pi5-5000 show some 7% decrease in response in endcap for pt>~300 GeV
wfpi5to5000_ak4calo_corogen_genpt_he

the resolution apparently doesn't suffer. So, this could be calibrated away
wfpi5to5000_ak4calo_recoogen_600to1500
These are consistent with Maria's plots
Still, it looks like some inspection of TS pulse actual vs fitted is necessary for high pt [I seem to recall that M2/M0 was somewhat flat above the 3-1 switch transition.

@slava77
Copy link
Contributor

slava77 commented Feb 3, 2017

+1

for #17311 24ca41c

  • changes are as described, affecting only HBHE simulation with SiPMs at this point
  • jenkins tests pass and show some changes in workflows with SiPMs: these begin at HE or HB where appropriate and propagate downstream (where effects are somewhat smallish)
  • local test in the phase-1 setup of 2017 scenario (posted earlier) show roughly expected behavior. Some checks of high-energy response will be needed in the follow up work.

On the technical side: in PU35 ttbar hbheprereco time is down from 3.4 to 1.9 s/event, indicating a faster convergence.

@cmsbuild
Copy link
Contributor

cmsbuild commented Feb 3, 2017

This pull request is fully signed and it will be integrated in one of the next CMSSW_9_0_X IBs (tests are also fine). This pull request requires discussion in the ORP meeting before it's merged. @davidlange6, @smuzaffar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants