Skip to content

Latest commit

 

History

History
271 lines (199 loc) · 18.8 KB

ConfrontPredictions.rst

File metadata and controls

271 lines (199 loc) · 18.8 KB
.. index:: Confronting Theory Predictions with Data

Confronting Predictions with Experimental Limits

Once the relevant signal cross sections (or :ref:`theory predictions <theoryPredictions>`) have been computed for the input model, these must be compared to the respective upper limits. The upper limits for the signal are stored in the SModelS :ref:`Database <database>` and depend on the type of |ExpRes|: |UL| or |EM|.

In the case of a |ULr|, the theory predictions typically consist of a list of signal cross sections (one for each cluster) for the single |dataset| (see :ref:`Theory Predictions for Upper Limit Results <thePredUL>` for more details). Each theory prediction must then be compared to its corresponding upper limit. This limit is simply the cross section upper limit provided by the experimental publication or conference note and is extracted from the corresponding UL map (see |ULrs|).

For |EMrs| there is a single cluster for each |dataset| (or signal region), and hence a single signal cross section value. This value must be compared to the upper limit for the corresponding signal region. This upper limit is easily computed using the number of observed and expected events for the |dataset| and their uncertainties and is typically stored in the :ref:`Database <database>`. Since most |EMrs| have several signal regions (|datasets|), there will be one theory prediction/upper limit for each |dataset|. By default SModelS keeps only the best |dataset|, i.e. the one with the largest ratio r_\mathrm{exp}=(\mathrm{theory\,prediction})/(\mathrm{expected\, limit}). (See below for |covariace|) Thus each |EMr| will have a single theory prediction/upper limit, corresponding to the best |dataset| (based on the expected limit). If the user wants to have access to all the |datasets|, the default behavior can be disabled by setting useBestDataset=False in theoryPredictionsFor (see :ref:`Example.py <exampleCode>`).

The procedure described above can be applied to all the |ExpRess| in the database, resulting in a list of theory predictions and upper limits for each |ExpRes|. A model can then be considered excluded by the experimental results if, for one or more predictions, we have theory prediction > upper limit [1].

Likelihood Computation

In the case of |EMrs|, additional statistical information about the constrained model can be provided by the SModelS output. Most importantly, we can compute a likelihood, which describes the plausibility of a signal strength \mu given the data D:

\mathcal{L}(\mu,\theta|D) =  P\left(D|\mu s + b + \theta \right) p(\theta)

Here, \theta denotes the nuisance parameter that describes the variations in the signal and background contribtions due to systematic effects, s is the number of predicted signal events.

If no information about the correlation of signal regions is available (or if its usage is turned off, see :doc:`Using SModelS <RunningSModelS>`), we compute a simplified likelihood for the most sensitive (a.k.a. best) signal region, i.e. the signal region with the highest r_\mathrm{exp}=(\mathrm{theory\,prediction})/(\mathrm{expected\, limit}), following the procedure detailed in CMS-NOTE-2017-001.

This means we assume p(\theta) to follow a Gaussian distribution centered around zero and with a variance of \delta^2, whereas P(D) corresponds to a counting variable and is thus properly described by a Poissonian. The complete likelihood thus reads:

\mathcal{L}(\mu,\theta|D) = \frac{(\mu s + b + \theta)^{n_{obs}} e^{-(\mu s + b + \theta)}}{n_{obs}!} exp \left( -\frac{\theta^2}{2\delta^2} \right)

where n_{obs} is the number of observed events in the signal region. From this likelihood we compute a 95% confidence level limit on \mu using the CL_s (CL_{sb}/CL_{b}) limit from the test statistic q_\mu, as described in Eq. 14 in G. Cowan et al., Asymptotic formulae for likelihood-based tests. We then search for the value CL_s = 0.95 using the Brent bracketing technique available through the scipy optimize library. This is used for the computation the observed and expected r values.

In addition, SModelS reports for each |EMr| :
  • the likelihood for the hypothezised signal, \mathcal{L}_{\mathrm{BSM}} given by \mu=n_{\mathrm{signal}},
  • the likelihood for the SM, \mathcal{L}_{\mathrm{SM}} given by \mu=0, and
  • the maximum likelihood \mathcal{L}_{\mathrm{max}} obtained by setting \mu=n_{\mathrm{obs}}-b.

The values for n_{\mathrm{obs}}, b and \delta_{b} are directly extracted from the data set (coined observedN, expectedBG and bgError, respectively), while n_{\mathrm{signal}} is obtained from the calculation of the theory predictions. A default 20% systematical uncertainty is assumed for n_{\mathrm{signal}}, resulting in \delta^2 = \delta_{b}^2 + \left(0.2 n_{\mathrm{signal}}\right)^2.

We note that in the general case analyses may be correlated, so the likelihoods from different analyses cannot straightforwardly be combined into a global one. Therefore, for a conservative interpretation, only the result with the best expected limit should be used. Moreover, for a statistically rigorous usage in scans, it is recommended to check that the analysis giving the best expected limit does not wildly jump within continuous regions of parameter space that give roughly the same phenomenology.

Finally, note that in earlier SModelS versions, a \chi^2 value, defined as \chi^2=-2 \ln \frac{\mathcal{L}_{\mathrm{BSM}}}{\mathcal{L}_{\mathrm{max}}} was reported instead of \mathcal{L}_{\mathrm{max}} and \mathcal{L}_{\mathrm{SM}}. From v2.1 onwards, the definition of a test statistic for, e.g., likelihood ratio tests, is left to the user.

Combination of Signal Regions

The likelihoods from individual signal regions can be combined, if the experimental analysis provides the required information about the correlation between distinct signal regions. Currently two approaches are available, depending on which type of information is provided by the experimental collaborations. When using runSModelS.py, the combination of signal regions is turned on or off with the parameter options:combineSRs, see :ref:`parameter file <parameterFileCombineSRs>`. Its default value is False, in which case only the result from the best expected signal region (best SR) is reported. If combineSRs = True, the combined result is shown instead.

Simplified Likelihood Approach

The first approach is applicable when information about the (background) correlations is available in the form of a covariance matrix. Some CMS analyses provides such matrix together with the efficiency maps. The usage of such covariance matrices is implemented in SModelS v1.1.3 onwards, following as above the simplified likelihood approach described in CMS-NOTE-2017-001.

Since CPU performance is a concern in SModelS, we try to aggregate the official results, which can comprise >100 signal regions, to an acceptable number of aggregate regions. Here acceptable means as few aggregate regions as possible without loosing in precision or constraining power. The CPU time scales roughly linearly with the number of signal regions, so aggregating e.g. from 80 to 20 signal regions means gaining a factor 4 in computing time.

Under the assumptions described in CMS-NOTE-2017-001, the likelihood for the signal hypothesis when combining signal regions is given by:

\mathcal{L}(\mu,\theta|D) = \prod_{i=1}^{N} \frac{(\mu s_i^r + b_i + \theta_i)^{n_{obs}^i} e^{-(\mu s_i^r + b_i + \theta_i)}}{n_{obs}^i!} exp \left( -\frac{1}{2} \vec{\theta}^T V^{-1} \vec{\theta} \right)

where the product is over all N signal regions, \mu is the overall signal strength, s_i^r the signal strength in each signal region and V represents the covariance matrix. Note, however, that unlike the case of a single signal region, we do not include any signal uncertainties, since this should correspond to a second-order effect.

As above, we compute a 95% confidence level limit on \mu using the CL_s (CL_{sb}/CL_{b}) limit from the test statistic q_\mu, as described in Eq. 14 in G. Cowan et al., Asymptotic formulae for likelihood-based tests. We then search for the value CL_s = 0.95 using the Brent bracketing technique available through the scipy optimize library. Note that the limit computed through this procedure applies to the total signal yield summed over all signal regions and assumes that the relative signal strengths in each signal region are fixed by the signal hypothesis. As a result, the above limit has to be computed for each given input model (or each :ref:`theory prediction <theoryPredictions>`), thus considerably increasing CPU time.

In the :ref:`figure below <combinedSRfig>` we show the constraints on the simplified model T2bbWWoff when using the best signal region (left), all the 44 signal regions considered in CMS-PAS-SUS-16-052 (center) and the aggregated signal regions included in the SModelS database (right). As we can see, while the curve obtained from the combination of all 44 signal regions is much closer to the official exclusion than the one obtained using only the best SR. Finally, the aggregated result included in the SModelS database (total of 17 aggregate regions) similarly matches the official result, but considerably reduces the run time.

images/T2bbWWoff_bestSR.png

Best signal region

images/T2bbWWoff_44.png

44 signal regions

images/T2bbWWoff_17.png

17 aggregate regions

Figure: Comparison of exclusion curves for CMS-PAS-SUS-16-052 using only the best signal region (left), the combination of all 44 signal regions (center), and the combination of 17 aggregate signal regions (right).

Simplified Likelihood v2

From SModelS v2.3 on, simplified likelihoods with a skew (a.k.a. SLv2), as described in The Simplified Likelihood Framework paper, are supported. The simplified likelihood above (which we now call SLv1) is extended to account for a skewness in the nuisances:

\mathcal{L}(\mu,\theta|D) = \prod_{i=1}^{N} \mathrm{Pois}(n_{obs}^i|\mu s_i^r + a_i + b_i \theta_i + c_i \theta_i^2) \cdot \frac{e^{-\frac{1}{2}\vec{\theta}^{\mathrm{T}}{\rho}^{-1}\vec{\theta}}}{\sqrt{(2\pi)^{N}}}

Here, \rho denotes the correlation matrix, and a_i, b_i, and c_i are functions of the background expectations, and their second and third statistical momenta. In order for a result to be declared SLv2 in SModelS, third statistical momenta must be provided in addition to background expectations and their errors.

The :ref:`figure below <combinedSRfigV2>` shows a comparison for TChiHH of CMS-SUS-20-004, using a simplified likelihood but ignoring the third momenta, i.e. SLv1 (left), and a simplified likelihood result v2 (right).

images/SLv2_v1.png

SLv1

images/SLv2_v2.png

SLv2

Figure: Comparison of CMS-SUS-20-004 using SLv1 (left), and SLv2 (right).

Full Likelihoods (pyhf) Approach

In early 2020, following ATL-PHYS-PUB-2019-029, ATLAS has started to provide full likelihoods for results with full Run 2 luminosity (139/fb), using a JSON serialisation of the likelihood. This JSON format describes the HistFactory family of statistical models, which is used by the majority of ATLAS analyses. Thus background estimates, changes under systematic variations, and observed data counts are provided at the same fidelity as used in the experiment.

SModelS supports the usage of these JSON likelihoods from v1.2.4 onward via an interface to the pyhf package, a pure-python implementation of the HistFactory statistical model. This means that for |EMr| from ATLAS, for which a JSON likelihood is available and when the combination of signal regions is turned on, the evaluation of the likelihood is relegated to pyhf. Internally, the whole calculation is based on the asymptotic formulas of Asymptotic formulae for likelihood-based tests of new physics, arXiv:1007.1727.

The :ref:`figure below <combinedSRfigPyhf>` examplifies how the constraints improve from using the best signal region (left) to using the full likelihood (right).

images/TChiWH_bestSR.png

Best signal region

images/TChiWH_pyhf.png

pyhf combining 9 signal regions

Figure: Comparison of exclusion curves for ATLAS-SUSY-2019-08 using only the best signal region (left), and the combination of all 9 signal regions with pyhf (right).

Combination of different Analyses

Starting with SModelS v2.2, it is possible to combine likelihoods from different analyses, assuming that their signal regions are approximately uncorrelated. As of now, the information of which analyses meet this criterion is not given by SModelS itself, and has to be provided by the user (see option :ref:`combineAnas <parameterFileCombineAnas>`).

For these combinations, the combined likelihood \mathcal{L}_{C} is simply the product of the individual analysis likelihoods, \mathcal{L}_{i}. Furthermore, we assume a common signal strength \mu for all analyses:

\mathcal{L}_{C}(\mu) = \prod_{i=1} \mathcal{L}_{i}(\mu\; s^{i})

The individual likelihoods can correspond to the best signal region, if the combination of signal regions is turned off, or to the combined signal region likelihood otherwise. For the determination of the maximum likelihood, scipy.optimize.minimize is used with the method BFGS in order to compute \hat{\mu} and \mathcal{L}_{max} = \mathcal{L}_{C}(\hat{\mu}).

The resulting likelihood and r-values for the combination are displayed in the :ref:`output <outputDescription>` along with the individual results for each analysis.

[1]The statistical significance of the exclusion statement is difficult to quantify exactly, since the model is being tested by a large number of results simultaneously.