# Limits with asymptotic formulae

_Valerio Ippolito - INFN Sezione di Roma_

In this part we set limits on a signal parameter of interest (POI) using the asymptotic calculation, which is much faster than running toys.

Let's first make sure CommonStatTools is compiled:

In [1]:
!cd ../CommonStatTools && mkdir -p build && cd build && cmake .. && make

-- Configuring done
-- Generating done
-- Build files have been written to: /eos/home-v/vippolit/SWAN_projects/ATLAS_statistics_tutorial/statistics-tutorial/CommonStatTools/build
Scanning dependencies of target CommonStatTools
Consolidate compiler generated dependencies of target CommonStatTools
[ 50%] Built target CommonStatTools
Consolidate compiler generated dependencies of target runSig
[ 60%] Built target runSig
Consolidate compiler generated dependencies of target getGlobalP0
[ 70%] Built target getGlobalP0
Consolidate compiler generated dependencies of target exampleSignificance
[ 80%] Built target exampleSignificance
Consolidate compiler generated dependencies of target getCorrMatrix
[ 90%] Built target getCorrMatrix
Consolidate compiler generated dependencies of target runAsymptoticsCLs
[100%] Built target runAsymptoticsCLs

We then load the compiled library and the header of the class that handles the asymptotic limit setting:

In [2]:
#include "../CommonStatTools/AsymptoticsCLsRunner.h"

In [3]:
R__ADD_LIBRARY_PATH(../CommonStatTools/build)

In [4]:
R__LOAD_LIBRARY(libCommonStatTools.so)

RooFit v3.60 -- Developed by Wouter Verkerke and David Kirkby
                Copyright (C) 2000-2013 NIKHEF, University of California & Stanford University
                All rights reserved, please read http://roofit.sourceforge.net/license.txt



Limits are run on a given workspace, stored in an input file. The workspace is expected to contain a ModelConfig, which specifies how the content of the workspace should be used to perform a statistical analysis. One of the datasets in the workspace is treated as the observed data.

In [5]:
inputFile = TString("../create_data/ws/ATLASIT_prova_combined_ATLASIT_prova_model.root");
workspaceName = TString("combined"); 
modelConfigName = TString("ModelConfig"); 
dataName = TString("obsData"); 

The output of the limit setting is saved in a folder of your choice, under an associated nickname:

In [6]:
workspaceTag = TString("my_test");
outputFolder = TString("tmp"); 

More precisely, the output consists of a ROOT file containing a TTree with the (many) results of the limit setting. In real life you often want to run over multiple signal hypotheses, so CommonStatTools lets you add a branch to the output TTree, which may represent for example the mass of a resonance. This is quite convenient, as you can then merge multiple output files with `hadd` and get a single tree with the results of the limit setting on all signals. If you do not need it, just set it to a dummy value.

In [7]:
paramName = TString("my_resonance_mass");
paramValue = 1.0;
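
When scanning several signal hypotheses, each run needs its own `paramValue`. Judging from the output file name that appears later in this notebook (built only from the folder, the tag and the CL), each run presumably also needs its own `workspaceTag`, so that files do not overwrite one another before being merged with `hadd`. Below is a minimal sketch, in standard C++, of how such a job list could be prepared. `LimitJob`, `makeJobs` and the `_m<mass>` suffix are hypothetical names chosen for illustration, not part of CommonStatTools.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One limit job per signal hypothesis (hypothetical helper type, not part of CommonStatTools).
struct LimitJob {
    double paramValue;        // value stored in the extra branch, e.g. the resonance mass
    std::string workspaceTag; // nickname used to name the output file
};

// Build a distinct tag for every mass point, e.g. "my_test_m1000".
// The "_m<mass>" suffix is an arbitrary convention chosen for this sketch.
std::vector<LimitJob> makeJobs(const std::string &baseTag, const std::vector<int> &massPoints)
{
    assert(!baseTag.empty());
    std::vector<LimitJob> jobs;
    jobs.reserve(massPoints.size());
    for (int mass : massPoints) {
        jobs.push_back({static_cast<double>(mass), baseTag + "_m" + std::to_string(mass)});
    }
    return jobs;
}
```

Each job's `paramValue` and `workspaceTag` would then be passed to the limit-setting call, typically together with the input file of the corresponding signal.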

In terms of how the limit is set, we also need to specify:
- whether we want to keep the data blind
- whether we want to inject any signal, and if so the injection strength (i.e. how much larger the signal cross-section is with respect to its nominal value)
- the confidence level CL for our limits (often 0.95, sometimes 0.90)

In [8]:
keepDataBlind = Bool_t(kFALSE);
doInjection = Bool_t(kFALSE);
muInjection = 1.0;
CL = 0.95;
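
For reference, in the CLs method a confidence level CL translates into a threshold on CLs: a signal strength is excluded when its CLs falls below 1 - CL, and the upper limit is the signal strength at which CLs equals that threshold. With CL = 0.95 the threshold is 0.05, which is the `target_CLs` value printed in the settings when the limit is run. Here is a small sketch of this relation. The function names are made up for this example and are not part of CommonStatTools.

```cpp
#include <cassert>

// CLs threshold corresponding to a confidence level CL (e.g. 0.95 -> 0.05).
double targetCLs(double cl)
{
    assert(cl > 0.0 && cl < 1.0);
    return 1.0 - cl;
}

// A signal strength is excluded at confidence level CL if its CLs value falls below 1 - CL.
bool isExcluded(double cls, double cl)
{
    return cls < targetCLs(cl);
}
```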

If the workspace already contains an Asimov dataset, it is used; otherwise one is created. Its name should be (or will be):

In [9]:
asimovDataName = TString("asimovData_0");

Finally, we set the verbosity we wish for the output of the code:

In [10]:
debugLevel = 2;

Let's put everything together using the `AsymptoticsCLsRunner`, the class that will actually run the limits for us.

In [11]:
EXOSTATS::AsymptoticsCLsRunner limitRunner;

In [12]:
limitRunner.setBlind(keepDataBlind);
limitRunner.setInjection(doInjection);
limitRunner.setInjectionStrength(muInjection);
limitRunner.setDebugLevel(debugLevel);

In [13]:
limitRunner.run(inputFile, workspaceName, modelConfigName, dataName, paramName, paramValue, workspaceTag,
                   outputFolder, CL, asimovDataName);

Settings:
  - betterBands is set to 1
  - betterNegativeBands is set to 0
  - profileNegativeAtZero is set to 0
  - defaultMinimizer is set to Minuit2
  - defaultPrintLevel is set to 1
  - defaultStrategy is set to 2
  - killBelowFatal is set to 1
  - doBlind is set to 0
  - conditionalExpected is set to 1
  - doTilde is set to 1
  - doExp is set to 1
  - doObs is set to 1
  - doInj is set to 0
  - muInjection is set to 1
  - precision is set to 0.005
  - debugLevel is set to 2
  - usePredictiveFit is set to 1
  - extrapolateSigma is set to 1
  - maxRetries is set to 3
  - doPvals is set to 1
  - NumCPU is set to 4

  - target_CLs is set to 0.05
Creating asimov data at mu = 1, profiling at mu = 1
Pairing nui: alpha_lumi, with glob: nom_alpha_lumi, from constraint: alpha_lumiConstraint
Pairing nui: alpha_ttXsec, with glob: nom_alpha_ttXsec, from constraint: alpha_ttXsecConstraint
Pairing nui: alpha_JES_Scenario1_NP1, with glob: nom_alpha_JES_Scenario1_NP1, from constraint: alpha_JES_Sce

Info in <Minuit2>: MnSeedGenerator Initial state: FCN =        3.67575184 Edm =        2.91109019 NCalls =     21
Info in <Minuit2>: MnSeedGenerator run Hesse - new state: 
  Minimum value : 3.67575184
  Edm           : 1.19716438
  Internal parameters:
                0
                0
                0
                0
     -1.119769515
  Internal gradient  :
     -10.22184374
     -6.296757433
    -0.3484778665
     -7.378490353
     -98.47113348
  Internal covariance matrix:
    0.056745618  -0.0021746408  0.00028176382  -0.0029842015  -0.0046646515
  -0.0021746408    0.069780919  0.00013141943  -0.0022421351  -0.0038349398
  0.00028176382  0.00013141943    0.079750708 -0.00012706195 -0.00024171452
  -0.0029842015  -0.0022421351 -0.00012706195    0.065523911  -0.0042603606
  -0.0046646515  -0.0038349398 -0.00024171452  -0.0042603606   0.0014238006
Info in <Minuit2>: VariableMetricBuilder Start iterating until Edm is < 0.001 with call limit = 2500
Info in <Minuit2>: VariableMetri

Let's check the output, which is located in the folder we created. Note that each piece of code in CommonStatTools creates its own subfolder of the folder you specify, which is convenient if you want to keep the full output of your statistical analysis in a single directory.

In [14]:
!ls tmp/asymptotics

my_test_CL95.root



In [17]:
f = new TFile(outputFolder + "/asymptotics/"
              + workspaceTag +
              "_CL" + TString::Format("%d", int(CL*100))
              + ".root");
f->ls()

TFile**		tmp/asymptotics/my_test_CL95.root	
 TFile*		tmp/asymptotics/my_test_CL95.root	
  KEY: TTree	stats;1	runAsymptoticsCLs
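
The file name assembled in the cell above follows the pattern `<outputFolder>/asymptotics/<workspaceTag>_CL<percent>.root`. The sketch below builds the same string in standard C++, as a hypothetical helper `asymptoticsOutputPath` (not part of CommonStatTools). It rounds `CL*100` rather than truncating it as `int(CL*100)` does. The two agree for 0.95, but truncation can be off by one when the product lands just below an integer. Whether the tool itself rounds or truncates is not verified here.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Path of the asymptotic-limit output file, following the layout seen in this notebook:
//   <outputFolder>/asymptotics/<workspaceTag>_CL<percent>.root
std::string asymptoticsOutputPath(const std::string &outputFolder, const std::string &workspaceTag, double cl)
{
    assert(cl > 0.0 && cl < 1.0);
    // Round instead of truncating, so floating-point noise in cl * 100 cannot shift the label by one.
    const long percent = std::lround(cl * 100.0);
    return outputFolder + "/asymptotics/" + workspaceTag + "_CL" + std::to_string(percent) + ".root";
}
```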


In [18]:
t = dynamic_cast<TTree*>(f->Get("stats"));
t->Show(0)

 my_resonance_mass = 1
 CLb_med         = 0.881606
 pb_med          = 0.118394
 CLs_med         = 0.567147
 CLsplusb_med    = 0.5
 CLb_obs         = 0.890854
 pb_obs          = 0.109146
 CLs_obs         = 0.540039
 CLsplusb_obs    = 0.481096
 obs_upperlimit  = 2.7827
 inj_upperlimit  = 0
 exp_upperlimit  = 1.83371
 exp_upperlimit_plus1 = 2.65335
 exp_upperlimit_plus2 = 3.82642
 exp_upperlimit_minus1 = 1.32129
 exp_upperlimit_minus2 = 0.984197
 fit_status      = 0
 mu_hat_obs      = 1.03389
 mu_hat_exp      = 0
 param_alpha_JES_Scenario1_NP1_hat = 0.206932
 param_alpha_lumi_hat = 0.0160748
 param_alpha_stXsec_hat = 0.0186162
 param_alpha_ttXsec_hat = -0.00143785
 param_mu_tt_hat = 1.087
 param_alpha_JES_Scenario1_NP1_med = 0.26309
 param_alpha_lumi_med = 0.0326332
 param_alpha_stXsec_med = 0.0187767
 param_alpha_ttXsec_med = 0.016653
 param_mu_tt_med = 1.13662


Each entry (row) of the TTree corresponds to a single run of the limit, and includes:
- the value of the parameter (in our example, the resonance mass)
- the observed upper limit `obs_upperlimit`
- the expected median upper limit `exp_upperlimit` and its plus/minus 1/2 sigma equivalents
- the fit status (`0` is good!)
- the upper limit obtained when a signal was injected (if it was), `inj_upperlimit`
- the best fit value for the parameter of interest on data, `mu_hat_obs`
- the best fit value for the parameter of interest on the background-only hypothesis, `mu_hat_exp`
- the values of CLs, CLs+b, CLb and pb for the median expected hypothesis (e.g. `CLb_med`) and for the actual data (e.g. `CLb_obs`)
- the best-fit values of the various parameters in the fit to data (e.g. `param_mu_tt_hat`)
- the best-fit values of the various parameters in the fit to the median expected hypothesis (e.g. `param_mu_tt_med`)
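
As a cross-check of how these branches relate, the sketch below recomputes CLs as CLs+b / CLb and locates the observed limit within the expected band. The helper names (`computeCLs`, `ExpectedBand`, `bandPosition`) are invented for this example, and the numbers in the usage note are copied from the `t->Show(0)` printout above.

```cpp
#include <cassert>
#include <string>

// CLs is the ratio of the signal-plus-background and background-only confidence levels.
double computeCLs(double clsPlusB, double clB)
{
    assert(clB > 0.0);
    return clsPlusB / clB;
}

// Expected-limit band, in the same units as the upper limit on the POI.
struct ExpectedBand {
    double minus2, minus1, median, plus1, plus2;
};

// Say where an observed upper limit sits relative to the expected band.
std::string bandPosition(double observed, const ExpectedBand &band)
{
    if (observed < band.minus2) return "below -2 sigma";
    if (observed < band.minus1) return "between -2 and -1 sigma";
    if (observed <= band.plus1) return "within +-1 sigma";
    if (observed <= band.plus2) return "between +1 and +2 sigma";
    return "above +2 sigma";
}
```

With the values printed above, `computeCLs(0.481096, 0.890854)` gives about 0.540039, matching `CLs_obs`, and `bandPosition(2.7827, {0.984197, 1.32129, 1.83371, 2.65335, 3.82642})` returns `"between +1 and +2 sigma"`: the observed limit is weaker than the median expected one. Likewise, `pb_obs` equals 1 - `CLb_obs` = 0.109146.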