# Limits with asymptotic formulae

_Valerio Ippolito - INFN Sezione di Roma_

In this part we set upper limits on a signal parameter of interest (POI) using the asymptotic calculation, which is much faster than running pseudo-experiments (toys).

Let's first make sure CommonStatTools is compiled:

In [None]:
!cd ../CommonStatTools && mkdir -p build && cd build && cmake .. && make

We then include the header of the class that handles the asymptotic limit setting, and load the compiled library:

In [None]:
#include "../CommonStatTools/AsymptoticsCLsRunner.h"

In [None]:
R__ADD_LIBRARY_PATH(../CommonStatTools/build)

In [None]:
R__LOAD_LIBRARY(libCommonStatTools.so)

Limits are run on a given workspace, stored in an input file. The workspace is expected to contain a ModelConfig, which specifies how the content of the workspace should be used to perform a statistical analysis. The limits are computed treating one of the datasets in the workspace as the observed data.

In [None]:
inputFile = TString("../ws/ATLASIT_prova_combined_ATLASIT_prova_model.root");
workspaceName = TString("combined");
modelConfigName = TString("ModelConfig");
dataName = TString("obsData");

The output of the limit setting is saved in a folder of our choice, with an associated nickname (tag):

In [None]:
workspaceTag = TString("my_test");
outputFolder = TString("tmp");

The output consists of a ROOT file containing a TTree with the (many) results of the limit setting. In practice you often run over multiple signal hypotheses, so CommonStatTools lets you add a branch to the output TTree that labels each run, for example with the mass of a resonance. This is convenient because you can then merge multiple output files with `hadd` and get a single tree with the limit-setting results for all signals. If you only have one signal, just set the branch to a dummy value.

In [None]:
paramName = TString("my_resonance_mass");
paramValue = 1.0;

To define how the limit is set, we also need to specify:
- whether we want to keep the data blind
- whether we want to inject a signal and, if so, the injection strength (i.e. how much larger the injected signal cross-section is than its nominal value)
- the confidence level CL of our limits (often 0.95, sometimes 0.90)
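
For reference, these quantities can be written compactly. This is the standard CLs convention, given here as background rather than as a description of the CommonStatTools internals. The symbols $\sigma_{\text{inj}}$, $\sigma_{\text{nominal}}$ and $\mu_{\text{up}}$ are notation introduced only for this note.

$$
\mu_{\text{inj}} = \frac{\sigma_{\text{inj}}}{\sigma_{\text{nominal}}}, \qquad
\mathrm{CL}_s(\mu) = \frac{\mathrm{CL}_{s+b}(\mu)}{\mathrm{CL}_b(\mu)}, \qquad
\mathrm{CL}_s(\mu_{\text{up}}) = 1 - \mathrm{CL}
$$

With CL = 0.95, the upper limit $\mu_{\text{up}}$ is therefore the signal strength at which CLs drops to 0.05.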

In [None]:
keepDataBlind = Bool_t(kFALSE);
doInjection = Bool_t(kFALSE);
muInjection = 1.0;
CL = 0.95;

If the workspace already contains an Asimov dataset with the name given below, it is used; otherwise one is created under that name:

In [None]:
asimovDataName = TString("asimovData_0");

Finally, we set the verbosity of the code's output:

In [None]:
debugLevel = 2;

Let's put everything together using `AsymptoticsCLsRunner`, the class that actually runs the limits for us.

In [None]:
EXOSTATS::AsymptoticsCLsRunner limitRunner;

In [None]:
limitRunner.setBlind(keepDataBlind);
limitRunner.setInjection(doInjection);
limitRunner.setInjectionStrength(muInjection);
limitRunner.setDebugLevel(debugLevel);

In [None]:
limitRunner.run(inputFile, workspaceName, modelConfigName, dataName, paramName, paramValue, workspaceTag,
                   outputFolder, CL, asimovDataName);

Let's check the output, which is located in the folder we specified. Note that each piece of code in CommonStatTools creates its own subfolder of the folder you specify, which is convenient if you want to keep the full output of your statistical analysis in a single directory.
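
The file name opened in the cells that follow has the pattern `<outputFolder>/asymptotics/<workspaceTag>_CL<percent>.root`. A minimal sketch of that pattern in plain C++ is shown here; `asymptoticsOutputPath` is a hypothetical helper written for this note and is not part of CommonStatTools.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of CommonStatTools): rebuilds the output
// file name following the pattern used later in this notebook.
std::string asymptoticsOutputPath(const std::string& outputFolder,
                                  const std::string& workspaceTag,
                                  double CL) {
    assert(CL > 0.0 && CL < 1.0);
    // Same conversion as int(CL*100) in the notebook cell: truncation, not rounding.
    const int clPercent = static_cast<int>(CL * 100);
    return outputFolder + "/asymptotics/" + workspaceTag +
           "_CL" + std::to_string(clPercent) + ".root";
}
```

With the values chosen above, `asymptoticsOutputPath("tmp", "my_test", 0.95)` returns `tmp/asymptotics/my_test_CL95.root`.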

In [None]:
!ls tmp/asymptotics

In [None]:
f = new TFile(outputFolder + "/asymptotics/"
              + workspaceTag +
              "_CL" + TString::Format("%d", int(CL*100))
              + ".root");
f->ls()

In [None]:
t = dynamic_cast<TTree*>(f->Get("stats"));
t->Show(0)

Each entry (row) of the TTree corresponds to a single run of the limit, and includes:
- the value of the parameter (in our example, the resonance mass)
- the observed upper limit `obs_upperlimit`
- the median expected upper limit `exp_upperlimit` and its ±1σ and ±2σ counterparts
- the fit status (`0` is good!)
- the upper limit obtained when a signal was injected (if it was), `inj_upperlimit`
- the best fit value for the parameter of interest on data, `mu_hat_obs`
- the best fit value for the parameter of interest under the background-only hypothesis, `mu_hat_exp`
- the values of CLs, CLb and pb for the median expected hypothesis (e.g. `CLb_med`) and for the actual data
- the best-fit values of the fit parameters in the fit to data (e.g. `param_mu_tt_hat`)
- the best-fit values of the fit parameters in the fit to the median expected hypothesis (e.g. `param_mu_tt_med`)
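
To get a feeling for where the median expected limit and its ±1σ and ±2σ bands come from, here is a minimal sketch of the textbook Gaussian (Wald) approximation for CLs limits. It assumes that the best-fit signal strength is Gaussian-distributed with standard deviation `sigma` under the background-only hypothesis. The function names are hypothetical and written for this note. The values stored in the tree may be refined numerically by the runner, so this illustrates the idea rather than the library's implementation.

```cpp
#include <cassert>
#include <cmath>

// Standard normal cumulative distribution function Phi(x).
double normalCdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Inverse of Phi, found by bisection (accurate enough for an illustration).
double normalQuantile(double p) {
    assert(p > 0.0 && p < 1.0);
    double lo = -10.0, hi = 10.0;
    for (int i = 0; i < 200; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (normalCdf(mid) < p) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// Hypothetical helper: approximate expected CLs upper limit at the
// N-sigma band (N = 0 is the median), using
//   mu_up(N) = sigma * ( Phi^{-1}(1 - alpha * Phi(N)) + N ),  alpha = 1 - CL.
double expectedLimitApprox(double sigma, int nSigma, double CL) {
    assert(sigma > 0.0 && CL > 0.0 && CL < 1.0);
    const double alpha = 1.0 - CL;
    return sigma * (normalQuantile(1.0 - alpha * normalCdf(nSigma)) + nSigma);
}
```

For example, `expectedLimitApprox(1.0, 0, 0.95)` gives about 1.96: in this approximation the median expected 95% CL limit is roughly 1.96 times `sigma`, and calling the function with `nSigma` equal to -2, -1, 1 and 2 gives the edges of the bands.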