# Workspace creation (minimal)

Simplified version of `create_workspace`.

We will create a simple example workspace based on existing histogram input, with just one channel (the "signal region") and one systematic uncertainty.

We first import ROOT (we'll be using it interactively via PyROOT) and `os`, to perform operations on the filesystem.

In [None]:
import ROOT
import os

The goal of this notebook is to create a workspace, which means:

* a file containing the actual ROOT workspace, one per "configuration" of the likelihood model (single-channel, all-channels);
* a file containing the specification of how we built that workspace from the input ROOT files, in XML format.

We then create two directories to store the workspaces and the XML files:

In [None]:
!mkdir -p ../ws
!mkdir -p ../xml
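
The same directories can also be created from Python with the `os` module imported above, which avoids relying on a shell. The helper name `ensure_dirs` below is hypothetical and only serves as a minimal sketch:

```python
import os

def ensure_dirs(*paths):
    """Create each directory (and any missing parents); do nothing if it exists."""
    for path in paths:
        os.makedirs(path, exist_ok=True)
    # Return absolute paths so it is easy to see where things ended up
    return [os.path.abspath(path) for path in paths]
```

In this notebook it would be called as `ensure_dirs('../ws', '../xml')`.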

Our likelihood model, and the meaning we give to it, is stored within a measurement - a HistFactory concept which needs to know:

* how we want to nickname it;
* where output files should be stored;
* what's the parameter of interest (POI) of this measurement;
* which parameters, if any, should be held constant - we typically include in this "blacklist" the default luminosity nuisance parameter created by HistFactory, called Lumi;
* what the settings of that default luminosity parameter are; HistFactory uses it whenever you specify that a sample should be normalized by luminosity (see `SetNormalizeByTheory`).

We are also nice people who like to decouple logical steps, so we ask HistFactory to kindly do nothing other than export the workspace into a ROOT file (i.e. please, HistFactory, do not perform any statistical analysis without our consent).

Create the measurement object, set the prefix for outputs, set the parameter of interest, plus a number of needed settings:

In [None]:
meas = ROOT.RooStats.HistFactory.Measurement('ICTPws_minimal', 'ICTPws_minimal')
meas.SetOutputFilePrefix('../ws/ICTPws_minimal')

meas.SetPOI('mu_ttH')

meas.AddConstantParam('Lumi')
meas.SetLumi(1.0)
meas.SetLumiRelErr(0.0)
meas.SetExportOnly(True)

We then follow this logic:

* we first create a channel (corresponding to some set of statistically-independent data)
* we tell HistFactory where (meaning: in which file, under which subdirectory path and more specifically in which histogram) to find the data for this channel
* we may indulge in specifying how uncertainties related to the limited MC statistics in signal/background histograms should be dealt with, in this channel
* we then add the samples which contribute to this channel, specifying where to find their nominal histograms, and which normalisation-only (AddOverallSys) and also-shape uncertainties (AddHistoSys) should be considered (keeping in mind that variations of any kind which share the same name are correlated)
* we also add free parameters to fit for determining the normalisation of our signal (and sometimes background) samples
* we add each sample to the channel

Create the channel:

In [None]:
chan_sr = ROOT.RooStats.HistFactory.Channel( "ljets_Mbb_ge6jge4b" )

Store the input file name:

In [None]:
InputFile_sr = "../data/ttH2015_forATLASit_ljets_Mbb_ge6jge4b_histos.root"

Set the data, and set the MC-stat uncertainty threshold to 5%:

In [None]:
chan_sr.SetData( "ljets_Mbb_ge6jge4b_Data", InputFile_sr, "ljets_Mbb_ge6jge4b/Data/nominal/" )
chan_sr.SetStatErrorConfig(0.05, 'Poisson')
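
To get a feeling for what a 5% threshold means, the sketch below flags the bins whose relative MC-statistical uncertainty reaches the threshold. It is plain Python and only an illustration: the function name, the bin contents and the errors are all made up, and the exact rule HistFactory applies internally should be checked in its documentation.

```python
def bins_above_threshold(contents, errors, threshold=0.05):
    """Return indices of bins whose relative stat uncertainty is >= threshold."""
    selected = []
    for index, (content, error) in enumerate(zip(contents, errors)):
        if content <= 0:
            continue  # empty bins are simply skipped in this sketch
        if error / content >= threshold:
            selected.append(index)
    return selected

# Hypothetical total-MC bin contents and their statistical errors
contents = [100.0, 40.0, 10.0]
errors = [2.0, 4.0, 3.0]
print(bins_above_threshold(contents, errors))  # prints [1, 2]
```

Here the relative uncertainties are 2%, 10% and 30%, so only the last two bins pass the 5% threshold.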

Add signal sample, adding the POI as normalization factor to it:

In [None]:
signal_sr = ROOT.RooStats.HistFactory.Sample( "ttH", "ljets_Mbb_ge6jge4b_ttH", InputFile_sr, "ljets_Mbb_ge6jge4b/ttH/nominal/" )
# The normalization factor must carry the same name as the POI passed to meas.SetPOI
signal_sr.AddNormFactor( "mu_ttH", 1, -10, 10 )
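
The effect of the normalization factor attached to the signal sample can be sketched in plain Python: it multiplies the signal prediction in every bin, while the backgrounds are left untouched. The function name and all the yields below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def expected_yields(mu, signal, backgrounds):
    """Per-bin expectation: mu * signal + sum of all backgrounds."""
    total_background = [sum(bins) for bins in zip(*backgrounds)]
    return [mu * s + b for s, b in zip(signal, total_background)]

# Hypothetical per-bin yields for one signal and two background samples
signal = [1.0, 2.0, 3.0]
ttbar = [50.0, 30.0, 10.0]
single_top = [5.0, 3.0, 1.0]

print(expected_yields(0.0, signal, [ttbar, single_top]))  # background only
print(expected_yields(1.0, signal, [ttbar, single_top]))  # nominal signal
```

A value of 0 for the factor corresponds to the background-only hypothesis, and 1 to the nominal signal prediction; the allowed range given to `AddNormFactor` bounds the values the fit may explore.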

Add the background samples:

In [None]:
ttbar_sr = ROOT.RooStats.HistFactory.Sample( "ttbar", "ljets_Mbb_ge6jge4b_ttbar", InputFile_sr, "ljets_Mbb_ge6jge4b/ttbar/nominal/" )
stop_sr = ROOT.RooStats.HistFactory.Sample( "stop", "ljets_Mbb_ge6jge4b_singleTop", InputFile_sr, "ljets_Mbb_ge6jge4b/singleTop/nominal/" )

Assign a simple systematic to all three samples (luminosity uncertainty, set to +/-5%):

In [None]:
signal_sr.AddOverallSys( "lumi",  0.95, 1.05 )
ttbar_sr.AddOverallSys( "lumi",  0.95, 1.05 )
stop_sr.AddOverallSys( "lumi",  0.95, 1.05 )
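
Because the three variations share the name "lumi", they are driven by a single nuisance parameter and move together. The sketch below illustrates this with a simplified piecewise-linear response, which is an assumption made for readability and not necessarily the interpolation HistFactory uses internally; the function names and nominal yields are hypothetical.

```python
def overall_sys_factor(alpha, low, high):
    """Normalisation factor: 1 at alpha=0, `high` at alpha=+1, `low` at alpha=-1."""
    if alpha >= 0:
        return 1.0 + alpha * (high - 1.0)
    return 1.0 + alpha * (1.0 - low)

def total_yield(alpha_lumi, nominal_yields, low=0.95, high=1.05):
    """Total yield when every sample is scaled by the same shared parameter."""
    factor = overall_sys_factor(alpha_lumi, low, high)
    return sum(factor * value for value in nominal_yields.values())

# Hypothetical nominal yields for the three samples
nominal = {"ttH": 10.0, "ttbar": 200.0, "stop": 20.0}
for alpha in (-1.0, 0.0, 1.0):
    print(alpha, total_yield(alpha, nominal))
```

With a shared parameter the total yield shifts by the full 5% at alpha = +/-1; had the three variations been given different names, each sample would fluctuate independently.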

Add the three samples to the channel:

In [None]:
chan_sr.AddSample( signal_sr )
chan_sr.AddSample( ttbar_sr )
chan_sr.AddSample( stop_sr )

Add channel to measurement:

In [None]:
meas.AddChannel( chan_sr )

We then ask HistFactory to actually go and check the histograms, do its magic and create The Likelihood Model. We also persist this likelihood model in XML format, for our afternoons of debugging.

In [None]:
meas.CollectHistograms()
meas.PrintTree()
meas.PrintXML('../xml', meas.GetOutputFilePrefix())
ROOT.RooStats.HistFactory.MakeModelAndMeasurementFast(meas)
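
Once the cell above has run, it can be handy to check what was actually written. The sketch below lists every file whose path starts with a given prefix, without assuming any particular naming scheme for the outputs; the helper name `list_outputs` is hypothetical.

```python
import glob
import os

def list_outputs(prefix):
    """Return the sorted paths of all regular files whose path starts with `prefix`."""
    return sorted(path for path in glob.glob(prefix + "*") if os.path.isfile(path))
```

In this notebook it could be called as `list_outputs('../ws/ICTPws_minimal')`, using the same prefix passed to `SetOutputFilePrefix`.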