In [1]:
import ROOT
import os
%jsroot on
ROOT.gSystem.Load("/cvmfs/sft.cern.ch/lcg/releases/LCG_85swan3/vdt/0.3.6/x86_64-slc6-gcc49-opt/lib/libvdt.so")
ROOT.gSystem.Load("../../lib/libHiggsAnalysisCombinedLimit.so")

Welcome to JupyROOT 6.07/07


0

# Simple Counting Experiment

Let's look at the simplest possible datacard:

In [2]:
os.system('cat simple-counting-experiment.txt')

0

# Simple counting experiment, with one signal and one background process
# Extremely simplified version of the 35/pb H->WW analysis for mH = 200 GeV,
# for 4th generation exclusion (EWK-10-009, arxiv:1102.5429v1)
imax 1  number of channels
jmax 1  number of backgrounds 
kmax 2  number of nuisance parameters (sources of systematical uncertainties)
------------
# we have just one channel, in which we observe 0 events
bin         1
observation 0
------------
# now we list the expected events for signal and all backgrounds in that bin
# the second 'process' line must have a positive number for backgrounds, and 0 for signal
# then we list the independent sources of uncertainties, and give their effect (syst. error)
# on each process and bin
bin             1      1
process       ggh4G  Bckg 
process         0      1
rate           4.76  1.47
------------
deltaS  lnN    1.20    -    20% uncertainty on signal
deltaB  lnN      -   1.50   50% uncertainty on background 


This datacard has a single channel with one signal process and one background process, and two nuisance parameters. It is easy to understand how combine reconstructs the likelihood function from this datacard.

The combine tool converts datacards into a RooWorkspace to build the likelihood. In your terminal, you can run "text2workspace.py simple-counting-experiment.txt", which should produce a file called simple-counting-experiment.root, or you can use the one that has already been produced. Let's open this file and inspect it.

## From Datacard to Likelihood

In [3]:
f_simple = ROOT.TFile("simple-counting-experiment.root","READ")
w_simple = f_simple.Get("w")
w_simple.Print()


RooWorkspace(w) w contents

variables
---------
(deltaB,deltaB_In,deltaS,deltaS_In,n_obs_binbin1,r)

p.d.f.s
-------
SimpleGaussianConstraint::deltaB_Pdf[ x=deltaB mean=deltaB_In sigma=1 ] = 1
SimpleGaussianConstraint::deltaS_Pdf[ x=deltaS mean=deltaS_In sigma=1 ] = 1
RooProdPdf::modelObs_b[ pdf_binbin1_bonly ] = 0.229925
RooProdPdf::modelObs_s[ pdf_binbin1 ] = 0.00196945
RooProdPdf::model_b[ modelObs_b * nuisancePdf ] = 0.229925
RooProdPdf::model_s[ modelObs_s * nuisancePdf ] = 0.00196945
RooProdPdf::nuisancePdf[ deltaS_Pdf * deltaB_Pdf ] = 1
RooPoisson::pdf_binbin1[ x=n_obs_binbin1 mean=n_exp_binbin1 ] = 0.00196945
RooPoisson::pdf_binbin1_bonly[ x=n_obs_binbin1 mean=n_exp_binbin1_bonly ] = 0.229925

functions
--------
RooAddition::n_exp_binbin1[ n_exp_binbin1_proc_ggh4G + n_exp_binbin1_proc_Bckg ] = 6.23
RooAddition::n_exp_binbin1_bonly[ n_exp_binbin1_proc_Bckg ] = 1.47
ProcessNormalization::n_exp_binbin1_proc_Bckg[ thetaList=(deltaB) asymmThetaList=() otherFactorList=() ] = 1.47
Pr

By expanding out the "model_s" pdf, it is easy to see that it follows the generic definition of a likelihood function.
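
To make this concrete, the expansion can be reproduced by hand. The following is a minimal sketch in plain Python, not combine's own code. It assumes the usual log-normal convention in which a nuisance parameter theta scales a rate by kappa**theta, and it uses unnormalized unit Gaussians for the constraints so that the values can be compared with those printed by the workspace. The function names are made up for this illustration.

```python
import math

# Numbers taken from simple-counting-experiment.txt
N_OBS = 0
RATE_S, RATE_B = 4.76, 1.47      # expected signal and background yields
KAPPA_S, KAPPA_B = 1.20, 1.50    # lnN uncertainties deltaS and deltaB

def poisson(n, mu):
    """Poisson probability of observing n events when mu are expected."""
    return mu ** n * math.exp(-mu) / math.factorial(n)

def constraint(theta, theta_in=0.0):
    """Unit-width Gaussian constraint, unnormalized (equals 1 at theta == theta_in)."""
    return math.exp(-0.5 * (theta - theta_in) ** 2)

def expected_events(r, delta_s, delta_b):
    """Assumed lnN convention: each rate is multiplied by kappa**theta."""
    return r * RATE_S * KAPPA_S ** delta_s + RATE_B * KAPPA_B ** delta_b

def model_s(r, delta_s=0.0, delta_b=0.0):
    """Poisson term for the single bin times the two nuisance constraints."""
    return (poisson(N_OBS, expected_events(r, delta_s, delta_b))
            * constraint(delta_s) * constraint(delta_b))

print(model_s(r=1.0))  # signal+background, compare with model_s in the workspace
print(model_s(r=0.0))  # background only, compare with model_b
```

At the nominal nuisance values this gives about 0.00197 for r=1 and about 0.230 for r=0, in line with the model_s and model_b values printed above.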

# Shape Experiment with Templates

Let's look at a datacard which uses binned histograms for the observable. Each bin of the histogram can be thought of as a separate channel in a counting experiment.
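
The statement that each bin acts as its own counting channel can be sketched in a few lines of plain Python. The bin contents below are invented for illustration and are not read from input-shapes-TH1.root; the helper name is likewise hypothetical.

```python
import math

def binned_nll(observed, signal, background, r=1.0):
    """Negative log-likelihood of a histogram: one Poisson term per bin."""
    nll = 0.0
    for n, s, b in zip(observed, signal, background):
        mu = r * s + b                       # expected events in this bin
        # -log Poisson(n | mu) = mu - n*log(mu) + log(n!)
        nll += mu - n * math.log(mu) + math.lgamma(n + 1)
    return nll

# Hypothetical three-bin templates (not taken from the tutorial files)
signal     = [1.0, 6.0, 3.0]
background = [40.0, 35.0, 25.0]
observed   = [38, 43, 27]

print(binned_nll(observed, signal, background, r=0.0))
print(binned_nll(observed, signal, background, r=1.0))
```

Nuisance parameters would enter exactly as in the counting experiment: they modify the per-bin expectations and contribute one constraint term each.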

In [4]:
os.system('cat simple-shapes-TH1.txt')

0

imax 1
jmax 1
kmax *
---------------
shapes * * input-shapes-TH1.root $PROCESS $PROCESS_$SYSTEMATIC
---------------
bin 1
observation 85
------------------------------
bin             1          1
process         signal     background
process         0          1
rate            10         100
--------------------------------
lumi     lnN    1.10       1.0
bgnorm   lnN    1.00       1.3
alpha  shapeN2    -           1   uncertainty on background shape and normalization
sigma  shapeN2    0.5         -   uncertainty on signal resolution. Assume the histogram is a 2 sigma shift, 
#                                so divide the unit gaussian by 2 before doing the interpolation




Compared to the counting experiment there is a new line, the one that starts with "shapes". Let's open the file it points to and see what is inside.

In [5]:
f_input_shapes_TH1 = ROOT.TFile("input-shapes-TH1.root","READ")
f_input_shapes_TH1.ls()

TFile**		input-shapes-TH1.root	
 TFile*		input-shapes-TH1.root	
  KEY: TH1F	signal;1	Histogram of signal__x
  KEY: TH1F	signal_sigmaUp;1	Histogram of signal__x
  KEY: TH1F	signal_sigmaDown;1	Histogram of signal__x
  KEY: TH1F	background;1	Histogram of background__x
  KEY: TH1F	background_alphaUp;1	Histogram of background__x
  KEY: TH1F	background_alphaDown;1	Histogram of background__x
  KEY: TH1F	data_obs;1	Histogram of data_obs__x
  KEY: TH1F	data_sig;1	Histogram of data_sig__x


This file contains 1D histograms for the signal, background, and observed data. There are also separate histograms corresponding to shifted values of the nuisance parameters. Let's inspect those histograms for the background.

In [6]:
h_background = f_input_shapes_TH1.Get("background")
h_background.SetLineColor(1)

h_background_alphaUp = f_input_shapes_TH1.Get("background_alphaUp")
h_background_alphaUp.SetLineColor(2)

h_background_alphaDown = f_input_shapes_TH1.Get("background_alphaDown")
h_background_alphaDown.SetLineColor(4)

c = ROOT.TCanvas()
h_background_alphaDown.Draw()
h_background.Draw("same")
h_background_alphaUp.Draw("same")

c.Draw()

As you can see, the shape of the histogram differs between the Up and Down variations of the nuisance parameter alpha. The normalization of the histograms also differs, which means this nuisance parameter affects both the shape and the rate of the background process:

In [7]:
print h_background.Integral()

100.000002265


In [8]:
print h_background_alphaUp.Integral()

115.000002146


In [9]:
print h_background_alphaDown.Integral()

90.0000001192
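
The three integrals translate directly into the size of the rate effect of alpha. A quick check in plain Python, with the printed values copied by hand and rounded:

```python
# Integrals printed above, rounded
nominal, up, down = 100.0, 115.0, 90.0

# Relative change of the background yield for the Up and Down templates
effect_up = up / nominal - 1.0
effect_down = down / nominal - 1.0

print("alphaUp:   {:+.0%}".format(effect_up))
print("alphaDown: {:+.0%}".format(effect_down))
```

The rate effect is therefore asymmetric: roughly +15% for the Up template and -10% for the Down template.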


Let's now convert the datacard to a RooWorkspace. Again, you can run "text2workspace.py simple-shapes-TH1.txt" in the terminal, or use the already created simple-shapes-TH1.root file. Let's open it and print the contents of the workspace:

In [10]:
f_shapes_TH1 = ROOT.TFile("simple-shapes-TH1.root","READ")
w_shapes_TH1 = f_shapes_TH1.Get("w")
w_shapes_TH1.Print()


RooWorkspace(w) w contents

variables
---------
(CMS_channel,CMS_th1x,alpha,alpha_In,bgnorm,bgnorm_In,lumi,lumi_In,r,sigma,sigma_In)

p.d.f.s
-------
SimpleGaussianConstraint::alpha_Pdf[ x=alpha mean=alpha_In sigma=1 ] = 1
SimpleGaussianConstraint::bgnorm_Pdf[ x=bgnorm mean=bgnorm_In sigma=1 ] = 1
SimpleGaussianConstraint::lumi_Pdf[ x=lumi mean=lumi_In sigma=1 ] = 1
RooSimultaneousOpt::model_b[ indexCat=CMS_channel bin1=pdf_binbin1_bonly extraConstraints=() channelMasks=() ] = 0.0821543
RooSimultaneousOpt::model_s[ indexCat=CMS_channel bin1=pdf_binbin1 extraConstraints=() channelMasks=() ] = 0.0864602
RooProdPdf::nuisancePdf[ lumi_Pdf * bgnorm_Pdf * alpha_Pdf * sigma_Pdf ] = 1
RooProdPdf::pdf_binbin1[ lumi_Pdf * bgnorm_Pdf * alpha_Pdf * sigma_Pdf * pdf_binbin1_nuis ] = 0.0864602
RooProdPdf::pdf_binbin1_bonly[ lumi_Pdf * bgnorm_Pdf * alpha_Pdf * sigma_Pdf * pdf_binbin1_bonly_nuis ] = 0.0821543
RooAddPdf::pdf_binbin1_bonly_nuis[ n_exp_final_binbin1_proc_background * shapeBkg_bin1_backgr

As you can see, the workspace contains "model_s", i.e. the likelihood function, which is built from the observable "CMS_th1x" (split into channels), the POI "r" and several nuisance parameters. Let's see how combine smoothly morphs the shape of the background as a function of the nuisance parameter alpha:

In [11]:
shapeBkg = w_shapes_TH1.pdf("shapeBkg_bin1_background_morph")
th1x = w_shapes_TH1.var("CMS_th1x")
plot_th1x = th1x.frame()
alpha = w_shapes_TH1.var("alpha")
alpha.setVal(0.0)
shapeBkg.plotOn(plot_th1x,ROOT.RooFit.LineColor(1))
alpha.setVal(0.5)
shapeBkg.plotOn(plot_th1x,ROOT.RooFit.LineColor(3))
alpha.setVal(1.0)
shapeBkg.plotOn(plot_th1x,ROOT.RooFit.LineColor(2))
alpha.setVal(2.0)
shapeBkg.plotOn(plot_th1x,ROOT.RooFit.LineColor(6))

c2 = ROOT.TCanvas()
plot_th1x.Draw()
c2.Draw()

As you can see the shape at alpha=1.0 is the same as the input histogram. In addition there is now a shape for any value of the nuisance parameter alpha.
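
The idea behind the morphing can be illustrated with a deliberately simplified sketch: a piecewise-linear, bin-by-bin interpolation between the nominal and the shifted templates. This is not the interpolation combine itself uses (which is smoother), and the bin contents are invented. It only shows how a single parameter can continuously deform a template and reproduce the shifted histogram at alpha=1.

```python
def morph(nominal, up, down, alpha):
    """Piecewise-linear vertical interpolation between templates, bin by bin."""
    shifted = up if alpha >= 0 else down
    return [n + abs(alpha) * (s - n) for n, s in zip(nominal, shifted)]

# Hypothetical four-bin templates (not taken from input-shapes-TH1.root)
nominal = [40.0, 30.0, 20.0, 10.0]
up      = [42.0, 33.0, 25.0, 15.0]
down    = [39.0, 27.0, 16.0,  8.0]

for alpha in (-1.0, 0.0, 0.5, 1.0, 2.0):
    print("alpha = {}: {}".format(alpha, morph(nominal, up, down, alpha)))
```

Values of alpha beyond 1 simply extrapolate along the same direction, which is why a curve for alpha=2.0 can be drawn even though no such histogram exists in the input file.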

# Parametric Shape Experiment

Let's now look at a shape experiment where the shapes are described by PDFs instead of templates. We will consider a realistic example from the CMS search for H->gamma gamma at 8 TeV. Let's first look at the datacard:

In [12]:
os.system('cat hgg_8TeV_MVA_cat0145.txt')

0

Combination of .=../couplings/hgg/hgg_8TeV_MVA.txt
imax 4 number of bins
jmax 5 number of processes minus 1
kmax * number of nuisance parameters
----------------------------------------------------------------------------------------------------------------------------------
shapes WH        cat0      hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_wh_cat0
shapes ZH        cat0      hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_zh_cat0
shapes bkg_mass  cat0      hgg.inputbkgdata_8TeV_MVA.root cms_hgg_workspace:pdf_data_pol_model_8TeV_cat0
shapes data_obs  cat0      hgg.inputbkgdata_8TeV_MVA.root cms_hgg_workspace:roohist_data_mass_cat0
shapes ggH       cat0      hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_ggh_cat0
shapes qqH       cat0      hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_vbf_cat0
shapes ttH       cat0      hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_tth_cat0
shapes WH        cat1      hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_wh_cat1
shapes ZH        cat1    

As you can see, there are a lot of nuisance parameters and channels, as is typical of a realistic analysis. Let's focus on the "shapes" lines. You can see the syntax is a bit different: for a parametric shape analysis, the shapes are RooAbsPdf objects stored in RooWorkspaces, and each one is referenced as "workspace:object".
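
To make the syntax explicit, here is a small sketch that splits such a "shapes" line into its fields. The field names are labels chosen for this example, and the parser is only an illustration, not the one used by text2workspace.py.

```python
def parse_shapes_line(line):
    """Split a datacard 'shapes' line into named fields (illustrative only)."""
    tokens = line.split()
    if len(tokens) < 5 or tokens[0] != "shapes":
        raise ValueError("not a complete shapes line: %r" % line)
    entry = {"process": tokens[1], "channel": tokens[2], "file": tokens[3]}
    # Parametric shapes are given as workspace:object, templates as a plain name
    if ":" in tokens[4]:
        entry["workspace"], entry["object"] = tokens[4].split(":", 1)
    else:
        entry["workspace"], entry["object"] = None, tokens[4]
    # Optional extra field: naming pattern for the systematic variations
    entry["systematics"] = tokens[5] if len(tokens) > 5 else None
    return entry

print(parse_shapes_line(
    "shapes ggH cat0 hgg.inputsig_8TeV_MVA.root wsig_8TeV:hggpdfrel_ggh_cat0"))
print(parse_shapes_line(
    "shapes * * input-shapes-TH1.root $PROCESS $PROCESS_$SYSTEMATIC"))
```

The first call shows the parametric form used in this datacard, the second the template form from the previous section.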

Let's look at the background shape. First we plot the observed data in the "cat0" channel together with the background-only shape.

In [13]:
f_hgg_bkgdata = ROOT.TFile("hgg.inputbkgdata_8TeV_MVA.root","READ")
w_hgg_bkgdata = f_hgg_bkgdata.Get("cms_hgg_workspace")

CMS_hgg_mass = w_hgg_bkgdata.var("CMS_hgg_mass")
hgg_plot_cat0 = CMS_hgg_mass.frame()

data_cat0 = w_hgg_bkgdata.data("roohist_data_mass_cat0")
data_cat0.plotOn(hgg_plot_cat0)

pdf_bkg_cat0 = w_hgg_bkgdata.pdf("pdf_data_pol_model_8TeV_cat0")
pdf_bkg_cat0.plotOn(hgg_plot_cat0)

c3 = ROOT.TCanvas()
hgg_plot_cat0.Draw()
c3.Draw()

In this datacard the dataset is a binned dataset and the background shape is a smooth pdf. The observed dataset could also be unbinned. 

Let's now look at the signal shapes. First we convert the datacard to a workspace ("text2workspace.py hgg_8TeV_MVA_cat0145.txt"):

In [14]:
f_hgg = ROOT.TFile("hgg_8TeV_MVA_cat0145.root","READ")
w_hgg = f_hgg.Get("w")
#w_hgg.Print()

You can print the workspace to see what is inside, but the output is quite long. If you do, you will see that the shape of the ggH signal in cat0 is called "shapeSig_ggH_cat0". Let's see which parameters it depends on:

In [15]:
data_obs = w_hgg.data("data_obs")
data_obs.Print()

pdf_ggH = w_hgg.pdf("shapeSig_ggH_cat0")
ggH_params = pdf_ggH.getParameters(data_obs)
ggH_params.Print()

RooDataHist::data_obs[CMS_channel,CMS_hgg_mass] = 640 bins (8700 weights)
RooArgSet::parameters = (CMS_hgg_globalscale,CMS_hgg_nuissancedeltamcat0,CMS_hgg_nuissancedeltasmearcat0,MH)


As you can see, the observable is called "CMS_hgg_mass" and the shape of the ggH signal depends on some nuisance parameters as well as MH. Let's draw the ggH PDF for different values of MH and of the nuisance parameter "CMS_hgg_globalscale".

In [16]:

CMS_hgg_mass = w_hgg.var("CMS_hgg_mass")
hgg_plot_sig = CMS_hgg_mass.frame()
w_hgg.var("MH").setVal(122.0)
pdf_ggH.plotOn(hgg_plot_sig,ROOT.RooFit.LineColor(2))
w_hgg.var("MH").setVal(125.0)
pdf_ggH.plotOn(hgg_plot_sig,ROOT.RooFit.LineColor(1))
w_hgg.var("MH").setVal(128.0)
pdf_ggH.plotOn(hgg_plot_sig,ROOT.RooFit.LineColor(4))

w_hgg.var("MH").setVal(125.0)
w_hgg.var("CMS_hgg_globalscale").setVal(0.023585)
pdf_ggH.plotOn(hgg_plot_sig,ROOT.RooFit.LineColor(1),ROOT.RooFit.LineStyle(2))
w_hgg.var("CMS_hgg_globalscale").setVal(-0.023585)
pdf_ggH.plotOn(hgg_plot_sig,ROOT.RooFit.LineColor(1),ROOT.RooFit.LineStyle(3))

c4 = ROOT.TCanvas()
hgg_plot_sig.Draw()
c4.Draw()

As you can see, the position of the peak changes as a function of MH, and also as a function of the nuisance parameter. In combine the user can encode any dependence of the shape of the PDFs on the model parameters.

Now let's try to draw the combined signal+background PDF:

In [17]:
w_hgg.var("MH").setVal(125.0)
w_hgg.var("CMS_hgg_globalscale").setVal(0.0)

hgg_plot = CMS_hgg_mass.frame()
pdf_bincat0 = w_hgg.pdf("pdf_bincat0")

data_cat0 = data_obs.reduce(ROOT.RooFit.Cut("CMS_channel==CMS_channel::cat0"))
data_cat0.plotOn(hgg_plot)


pdf_bincat0.plotOn(hgg_plot,ROOT.RooFit.ProjWData(data_cat0,True),ROOT.RooFit.LineColor(2))
w_hgg.var("r").setVal(5.0)
pdf_bincat0.plotOn(hgg_plot,ROOT.RooFit.ProjWData(data_cat0,True),ROOT.RooFit.LineColor(3))

c5 = ROOT.TCanvas()
hgg_plot.Draw()
c5.Draw()

As you can see, the total signal+background PDF can now be drawn for any value of the model parameters. In this case we have drawn the total PDF for different values of the signal strength (r=1 and r=5).  
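
Schematically, the total PDF is a yield-weighted mixture of the signal and background shapes, with the signal yield scaled by r. The sketch below uses a Gaussian peak on a falling exponential with invented yields and shape parameters; none of the numbers or function names come from the hgg workspace.

```python
import math

M_LO, M_HI = 100.0, 180.0   # hypothetical mass range in GeV

def signal_pdf(m, mean=125.0, width=1.5):
    """Gaussian peak, normalized over the full real line (a good
    approximation here because the peak is far from the range edges)."""
    return math.exp(-0.5 * ((m - mean) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))

def background_pdf(m, slope=0.03):
    """Falling exponential normalized to unity on [M_LO, M_HI]."""
    norm = (math.exp(-slope * M_LO) - math.exp(-slope * M_HI)) / slope
    return math.exp(-slope * m) / norm

def total_pdf(m, r, n_sig=50.0, n_bkg=5000.0):
    """Mixture of signal and background weighted by their expected yields."""
    n_s = r * n_sig
    return (n_s * signal_pdf(m) + n_bkg * background_pdf(m)) / (n_s + n_bkg)

for r in (0.0, 1.0, 5.0):
    print("r = {}: pdf at 125 GeV = {:.5f}".format(r, total_pdf(125.0, r)))
```

Raising r increases the weight of the peak relative to the smooth background, which is the behaviour seen in the two curves drawn above.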

# Asymptotic Limits

Let's go through some of the basic functionality of Combine using the H->gamma gamma workspace, starting with asymptotic limits. We first compute the expected and observed limits for m(H) = 125 GeV. You will need to switch to the terminal and run the following command, which is the one the script further below runs for every mass point: "combine -n LimitTest -M Asymptotic -m 125 hgg_8TeV_MVA_cat0145.root --run both"

The "-n" option gives the output root file a custom name.
The "-m" option sets the value of "MH" in the workspace, and also changes the name of the output root file.
The "-M" option tells combine which statistical method to use, in this case Asymptotic limits.
The "hgg_8TeV_MVA_cat0145.root" is the input workspace
The option "--run both" tells combine to run both expected and observed limits. 
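
The pieces of the command described above, and the way "-n", "-M" and "-m" end up in the output file name, can be summarized in a short sketch. The helper names are hypothetical, and the naming pattern is simply inferred from the files that appear in this tutorial (for example higgsCombineLimitTest.Asymptotic.mH125.root).

```python
def combine_command(workspace, method, mass, name, extra=()):
    """Assemble the argument list for a combine call (illustration only)."""
    return ["combine", "-n", name, "-M", method, "-m", str(mass), workspace] + list(extra)

def output_file(name, method, mass):
    """File name pattern inferred from the outputs seen in this tutorial."""
    return "higgsCombine{0}.{1}.mH{2}.root".format(name, method, mass)

cmd = combine_command("hgg_8TeV_MVA_cat0145.root", "Asymptotic", 125,
                      "LimitTest", extra=["--run", "both"])
print(" ".join(cmd))
print(output_file("LimitTest", "Asymptotic", 125))
```

If you prefer to stay inside the notebook, the argument list can be handed to subprocess.call instead of being typed in the terminal.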

You should get an output like this:

In [21]:
os.system('cat limit125.txt')

0

At r = 1.538055:	q_mu = 2.74791	q_A  = 13.71681	CLsb = 0.04869	CLb  = 0.97962	CLs  = 0.04970

 -- Asymptotic -- 
Observed Limit: r < 1.5381
Expected  2.5%: r < 0.3840
Expected 16.0%: r < 0.5175
Expected 50.0%: r < 0.7363
Expected 84.0%: r < 1.0680
Expected 97.5%: r < 1.4934

Done in 0.09 min (cpu), 0.10 min (real)


You can see we have computed an observed limit on r, as well as the median expected limit and its +/- 1 and 2 sigma bands. These results are also stored in an output .root file. Let's look at the file and see what's inside:

In [22]:
f_hgg_limit_125 = ROOT.TFile("higgsCombineLimitTest.Asymptotic.mH125.root")
f_hgg_limit_125.ls()

TFile**		higgsCombineLimitTest.Asymptotic.mH125.root	
 TFile*		higgsCombineLimitTest.Asymptotic.mH125.root	
  KEY: TDirectoryFile	toys;1	toys
  KEY: TTree	limit;1	limit


There is a TTree called limit which stores the results of the limit computation. Let's look at it:

In [23]:
limit = f_hgg_limit_125.Get("limit")
for i in xrange(limit.GetEntries()):
    limit.GetEntry(i)
    print i,limit.mh,limit.limit

0 125.0 0.383983612061
1 125.0 0.517489135265
2 125.0 0.736328125
3 125.0 1.06798708439
4 125.0 1.49341821671
5 125.0 1.53805452175


So you can see that the entries 0-4 are the expected limit and entry 5 is the observed limit. 
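
Relying on the entry order works here, but it is a little fragile. The limit tree also carries a quantileExpected branch, which is -1 for the observed limit and holds the quantile (0.025, 0.16, 0.5, 0.84, 0.975) for the expected ones. The sketch below labels the values by quantile. To keep it runnable without ROOT, the (quantileExpected, limit) pairs are typed in by hand from the printout above rather than read from the file, and the helper name is made up.

```python
# (quantileExpected, limit) pairs copied from the printout above
entries = [
    (0.025, 0.383983612061),
    (0.16,  0.517489135265),
    (0.5,   0.736328125),
    (0.84,  1.06798708439),
    (0.975, 1.49341821671),
    (-1.0,  1.53805452175),
]

LABELS = {0.025: "-2 sigma", 0.16: "-1 sigma", 0.5: "median",
          0.84: "+1 sigma", 0.975: "+2 sigma"}

def label_limits(pairs):
    """Map each limit to 'observed' or to its expected-band label."""
    result = {}
    for quantile, limit in pairs:
        if quantile < 0:
            result["observed"] = limit
        else:
            # quantiles are stored as floats in the tree, so round before the lookup
            result[LABELS[round(quantile, 3)]] = limit
    return result

limits = label_limits(entries)
# Half-widths of the 1 sigma band around the median, as used for graph error bars
print("down: {:.4f}, up: {:.4f}".format(limits["median"] - limits["-1 sigma"],
                                        limits["+1 sigma"] - limits["median"]))
print("observed: {:.4f}".format(limits["observed"]))
```

With a real tree, the pairs would come from reading limit.quantileExpected and limit.limit for each entry.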

We can run similar commands for a range of mass values using a simple script and then make a limit plot from the results. Here is such a script to run the limits:

In [24]:
os.system('cat run_hgg_asymptotic.sh')

0

#!/bin/bash 
MASS=115
while [  $MASS -lt 146 ]; do
    echo Running Asymptotic limits for MH = $MASS
    combine -n LimitTest -M Asymptotic -m $MASS hgg_8TeV_MVA_cat0145.root --run both > limit$MASS.txt
    let MASS=MASS+1 
done

mv higgsCombineLimitTest*.root results_hgg_asymptotic/
mv limit*.txt results_hgg_asymptotic/



Let's execute this script. Go to the terminal, make the file executable (for example with "chmod +x run_hgg_asymptotic.sh") and then run it with "./run_hgg_asymptotic.sh".

This should take ~3 minutes and you should end up with a bunch of .root files and .txt files for different masses in a folder called results_hgg_asymptotic.

Here is an example of a simple script to plot the results:

In [25]:
# import CMS tdrStyle
from tdrStyle import *
setTDRStyle()

# import some more python modules
import sys,glob
from array import array

# create some arrays to hold the results values
mass = array('d',[])
zeros = array('d',[])
exp_p2 = array('d',[])
exp_p1 = array('d',[])
exp = array('d',[])
exp_m1 = array('d',[])
exp_m2 = array('d',[])
obs = array('d',[])

# gather all the results files and sort the mass values
sortedmass = []
files=glob.glob("results_hgg_asymptotic/higgsCombineLimitTest.Asymptotic.mH*.root")
for afile in files:
    m = afile.split('mH')[1].replace('.root','')
    sortedmass.append(float(m))
sortedmass.sort()

# loop over the mass values and fill the arrays
for m in sortedmass:
    # mass value
    mass.append(m)
    # get the limit tree for this mass value
    f = ROOT.TFile("results_hgg_asymptotic/higgsCombineLimitTest.Asymptotic.mH"+str(m).replace('.0','')+".root","READ")
    t = f.Get("limit")
    # expected limit
    t.GetEntry(2)
    thisexp = t.limit
    exp.append(thisexp)
    #-2 sigma
    t.GetEntry(0)
    exp_m2.append(thisexp-t.limit)
    #-1 sigma
    t.GetEntry(1)
    exp_m1.append(thisexp-t.limit)
    #+1 sigma 
    t.GetEntry(3)
    exp_p1.append(t.limit-thisexp)
    #+2 sigma
    t.GetEntry(4)
    exp_p2.append(t.limit-thisexp)
    # observed limit
    t.GetEntry(5)
    obs.append(t.limit)
    # dummy array with 0.0 (for mass-uncertainty)
    zeros.append(0.0)

# convert arrays to TVectorD
v_mass = ROOT.TVectorD(len(mass),mass)
v_zeros = ROOT.TVectorD(len(zeros),zeros)
v_exp_p2 = ROOT.TVectorD(len(exp_p2),exp_p2)
v_exp_p1 = ROOT.TVectorD(len(exp_p1),exp_p1)
v_exp = ROOT.TVectorD(len(exp),exp)
v_exp_m1 = ROOT.TVectorD(len(exp_m1),exp_m1)
v_exp_m2 = ROOT.TVectorD(len(exp_m2),exp_m2)
v_obs = ROOT.TVectorD(len(obs),obs)

#new canvas

c6 = ROOT.TCanvas("c6","c6",800,800)
c6.SetGridx()
c6.SetGridy()
c6.SetRightMargin(0.06)
c6.SetLeftMargin(0.2)

# dummy histogram for axes labels, ranges, etc.
dummy = ROOT.TH1D("","", 1, 115,145)
dummy.SetBinContent(1,0.0)
dummy.GetXaxis().SetTitle('m(H) [GeV]')
dummy.GetYaxis().SetTitle('#sigma / #sigma(SM)')
dummy.SetLineColor(0)
dummy.SetLineWidth(0)
dummy.SetFillColor(0)
dummy.SetMinimum(0.0)
dummy.SetMaximum(5.0)
dummy.Draw()

gr_exp2 = ROOT.TGraphAsymmErrors(v_mass,v_exp,v_zeros,v_zeros,v_exp_m2,v_exp_p2)
gr_exp2.SetLineColor(ROOT.kYellow)
gr_exp2.SetFillColor(ROOT.kYellow)
gr_exp2.Draw("e3same")

gr_exp1 = ROOT.TGraphAsymmErrors(v_mass,v_exp,v_zeros,v_zeros,v_exp_m1,v_exp_p1)
gr_exp1.SetLineColor(ROOT.kGreen)
gr_exp1.SetFillColor(ROOT.kGreen)
gr_exp1.Draw("e3same")

gr_exp = ROOT.TGraphAsymmErrors(v_mass,v_exp,v_zeros,v_zeros,v_zeros,v_zeros)
gr_exp.SetLineColor(1)
gr_exp.SetLineWidth(2)
gr_exp.SetLineStyle(2)
gr_exp.Draw("Lsame")

gr_obs = ROOT.TGraphAsymmErrors(v_mass,v_obs,v_zeros,v_zeros,v_zeros,v_zeros)
gr_obs.SetLineColor(1)
gr_obs.SetLineWidth(2)
gr_obs.Draw("CPsame")

latex2 = ROOT.TLatex()
latex2.SetNDC()
latex2.SetTextSize(0.5*c6.GetTopMargin())
latex2.SetTextFont(42)
latex2.SetTextAlign(31) # align right
latex2.DrawLatex(0.87, 0.95,"19.6 fb^{-1} (8 TeV)")
latex2.SetTextSize(0.9*c6.GetTopMargin())
latex2.SetTextFont(62)
latex2.SetTextAlign(11) # align left
latex2.DrawLatex(0.25, 0.85, "CMS")
latex2.SetTextSize(0.7*c6.GetTopMargin())
latex2.SetTextFont(52)
latex2.SetTextAlign(11)
latex2.DrawLatex(0.25, 0.8, "Tutorial")

legend = ROOT.TLegend(.60,.70,.90,.90)
legend.AddEntry(gr_obs , "Observed 95% CL", "l")
legend.AddEntry(gr_exp , "Expected 95% CL", "l")
legend.AddEntry(gr_exp1 , "#pm 1#sigma", "f")
legend.AddEntry(gr_exp2 , "#pm 2#sigma", "f")
legend.SetShadowColor(0)
legend.SetFillColor(0)
legend.SetLineColor(0)
legend.Draw("same")

ROOT.gPad.RedrawAxis()

c6.Draw()


# Computing Significance

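Now let's compute the expected and observed significance of the signal as a function of m(H). For the expected significance, this can be done for mH = 125 GeV by running the following command, which is the one the script further below runs for every mass point: "combine -n SignifExp -M ProfileLikelihood --signif -m 125 hgg_8TeV_MVA_cat0145.root -t -1 --expectSignal=1 --pvalue"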

You should end up with the following output:

In [26]:
os.system('cat results_hgg_pvalue/expsignif125.txt')

0


 -- Profile Likelihood -- 
p-value of background: 0.00317242
       (Significance = 2.72941)
Done in 0.01 min (cpu), 0.01 min (real)


For the observed, we just remove the "-t -1 --expectSignal=1":

In [27]:
os.system('cat results_hgg_pvalue/obssignif125.txt')

0


 -- Profile Likelihood -- 
p-value of background: 0.012346
       (Significance = 2.24619)
Done in 0.01 min (cpu), 0.01 min (real)


So we observe a little less signal than we expected. Let's plot the observed and expected significance as a function of mH. Here is a script to run the command for each mass point:

In [28]:
os.system('cat run_hgg_pvalue.sh')

0

#!/bin/bash 

MASS=115
while [  $MASS -lt 146 ]; do
    echo Computing p-value for MH = $MASS
    #combine -n SignifExp -M ProfileLikelihood --signif -m $MASS hgg_8TeV_MVA_cat0145.root -t -1 --expectSignal=1 --toysFreq --pvalue > expsignif$MASS.txt
    combine -n SignifExp -M ProfileLikelihood --signif -m $MASS hgg_8TeV_MVA_cat0145.root -t -1 --expectSignal=1 --pvalue > expsignif$MASS.txt
    combine -n SignifObs -M ProfileLikelihood --signif -m $MASS hgg_8TeV_MVA_cat0145.root --pvalue > obssignif$MASS.txt
    let MASS=MASS+1 
done

mv higgsCombineSignif*.root results_hgg_pvalue/                                                                                  
mv *signif*.txt results_hgg_pvalue/     


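You can run it from the terminal in the same way as the previous script, for example with "chmod +x run_hgg_pvalue.sh" followed by "./run_hgg_pvalue.sh".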

Again you will end up with a bunch of .txt and .root files in the folder results_hgg_pvalue. 
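
The p-values and significances quoted in these outputs are two views of the same number: the p-value is the one-sided Gaussian tail probability beyond Z standard deviations. A short sketch of the conversion in plain Python, using only the math module (the function names are chosen for this example):

```python
import math

def pvalue_from_significance(z):
    """One-sided Gaussian tail probability beyond z standard deviations."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def significance_from_pvalue(p, lo=-10.0, hi=40.0):
    """Invert the relation above by bisection (the tail probability falls with z)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pvalue_from_significance(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for z in (1.0, 2.0, 3.0):
    print("{} sigma -> p = {:.5g}".format(z, pvalue_from_significance(z)))
print(significance_from_pvalue(0.00317242))  # expected p-value at mH = 125 GeV
print(significance_from_pvalue(0.012346))    # observed p-value at mH = 125 GeV
```

The first three lines reproduce the thresholds 0.15866, 0.02275 and 0.0013499 that the plotting script below draws as the 1, 2 and 3 sigma lines, and the last two give back roughly 2.73 and 2.25.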

Here is an example script to plot the results:

In [29]:
unsortedmass = []

mass = array('d',[])
zeros = array('d',[])
exp = array('d',[])
obs = array('d',[])

files=glob.glob("results_hgg_pvalue/higgsCombineSignifExp.ProfileLikelihood.mH*.root")
for afile in files:
    m = afile.split('mH')[1].replace('.root','')
    unsortedmass.append(float(m))
unsortedmass.sort()

for m in unsortedmass:
    # mass value
    mass.append(m)
    # get the expected p-value
    f_exp = ROOT.TFile("results_hgg_pvalue/higgsCombineSignifExp.ProfileLikelihood.mH"+str(m).replace('.0','')+".root","READ")
    t_exp = f_exp.Get("limit")
    t_exp.GetEntry(0)
    exp.append(t_exp.limit)
    # get the observed p-value
    f_obs = ROOT.TFile("results_hgg_pvalue/higgsCombineSignifObs.ProfileLikelihood.mH"+str(m).replace('.0','')+".root","READ")
    t_obs = f_obs.Get("limit")
    t_obs.GetEntry(0)
    obs.append(t_obs.limit)
    # dummy, for mass error
    zeros.append(0.0)

# convert array to TVector
v_mass = ROOT.TVectorD(len(mass),mass)
v_zeros = ROOT.TVectorD(len(zeros),zeros)
v_exp = ROOT.TVectorD(len(exp),exp)
v_obs = ROOT.TVectorD(len(obs),obs)
# new canvas
c7 = ROOT.TCanvas("c7","c7",800, 800)
c7.SetLogy()
c7.SetRightMargin(0.06)
c7.SetLeftMargin(0.2)
# dummy histogram, for axis labels, ranges, etc.
dummy = ROOT.TH1D("","", 1, 115,145)
dummy.SetBinContent(1,0.0)
dummy.GetXaxis().SetTitle('m(H) [GeV]')
dummy.GetYaxis().SetTitle('Local p-value')
dummy.SetLineColor(0)
dummy.SetLineWidth(0)
dummy.SetFillColor(0)
dummy.SetMinimum(0.0001)
dummy.SetMaximum(1.0)
dummy.Draw()

# Draw some lines corresponding to 1,2,3 sigma 
latexf = ROOT.TLatex()
latexf.SetTextSize(0.4*c7.GetTopMargin())
latexf.SetTextColor(2)
f1 = ROOT.TF1("f1","0.15866",115,145)
f1.SetLineColor(2)
f1.SetLineWidth(2)
f1.Draw("lsame")
latexf.DrawLatex(116, 0.15866*1.1,"1#sigma")
f2 = ROOT.TF1("f2","0.02275",115,145)
f2.SetLineColor(2)
f2.SetLineWidth(2)
f2.Draw("lsame")
latexf.DrawLatex(116, 0.02275*1.1,"2#sigma")
f3 = ROOT.TF1("f3","0.0013499",115,145)
f3.SetLineColor(2)
f3.SetLineWidth(2)
f3.Draw("lsame")
latexf.DrawLatex(116, 0.0013499*1.1,"3#sigma")

# Draw the expected p-value graph
gr_exp = ROOT.TGraphAsymmErrors(v_mass,v_exp,v_zeros,v_zeros,v_zeros,v_zeros)
gr_exp.SetLineColor(4)
gr_exp.SetLineWidth(2)
gr_exp.SetLineStyle(2)
gr_exp.Draw("Lsame")
# Draw the observed p-value graph
gr_obs = ROOT.TGraphAsymmErrors(v_mass,v_obs,v_zeros,v_zeros,v_zeros,v_zeros)
gr_obs.SetLineColor(1)
gr_obs.SetLineWidth(2)
gr_obs.Draw("CPsame")

latex2 = ROOT.TLatex()
latex2.SetNDC()
latex2.SetTextSize(0.5*c7.GetTopMargin())
latex2.SetTextFont(42)
latex2.SetTextAlign(31) # align right
latex2.DrawLatex(0.87, 0.95,"19.6 fb^{-1} (8 TeV)")
latex2.SetTextSize(0.7*c7.GetTopMargin())
latex2.SetTextFont(62)
latex2.SetTextAlign(11) # align left
latex2.DrawLatex(0.20, 0.95, "CMS")
latex2.SetTextSize(0.6*c7.GetTopMargin())
latex2.SetTextFont(52)
latex2.SetTextAlign(11)
latex2.DrawLatex(0.32, 0.95, "Tutorial")

c7.Draw()

# Maximum Likelihood Fits

Suppose now that we want to measure the signal strength at M(H)=125 GeV. We can use the MaxLikelihoodFit and MultiDimFit (maximum likelihood fit for an arbitrary number of POIs) methods.

We can get the best fit for the signal strength by running this command:

You should get an output like this:

In [30]:
os.system('cat rfit.txt')

0

Running Minos for POI 
Real time 0:00:00, CP time 0.480


 --- MaxLikelihoodFit ---
Best fit r: 0.834356  -0.376971/+0.40773  (68% CL)
nll S+B -> -2.52972  nll B -> -0.00680221
Done in 0.02 min (cpu), 0.02 min (real)


We can also create likelihood scans by manually defining a range for the POI and computing the deltaNLL at each point. Let's do this for both the expected and the observed case:

Here is an example script to plot the scans:

In [34]:
# create arrays
r_exp = array('d',[])
nll_exp = array('d',[])
r_obs = array('d',[])
nll_obs = array('d',[])
zeros = array('d',[])
# get expected scan
f_exp = ROOT.TFile("higgsCombineExp.MultiDimFit.mH125.root","READ")
t_exp = f_exp.Get("limit")
for i in xrange(1,t_exp.GetEntries()):
    t_exp.GetEntry(i)
    r_exp.append(t_exp.r)
    nll_exp.append(2.0*t_exp.deltaNLL)
# get observed scan
f_obs = ROOT.TFile("higgsCombineObs.MultiDimFit.mH125.root","READ")
t_obs = f_obs.Get("limit")
for i in xrange(1,t_obs.GetEntries()):
    t_obs.GetEntry(i)
    r_obs.append(t_obs.r)
    nll_obs.append(2.0*t_obs.deltaNLL)
    zeros.append(0.0)
# convert arrays to TVectorD
v_r_exp = ROOT.TVectorD(len(r_exp),r_exp)
v_r_obs = ROOT.TVectorD(len(r_obs),r_obs)
v_nll_exp = ROOT.TVectorD(len(nll_exp),nll_exp)
v_nll_obs = ROOT.TVectorD(len(nll_obs),nll_obs)
v_zeros = ROOT.TVectorD(len(zeros),zeros)
# new canvas
c9 = ROOT.TCanvas("c9","c9",800, 800)
c9.SetRightMargin(0.06)
c9.SetLeftMargin(0.2)
# dummy for axis labels, ranges, etc.
dummy = ROOT.TH1D("","", 1, 0.0,3.0)
dummy.SetBinContent(1,0.0)
dummy.GetXaxis().SetTitle('#sigma/#sigma_{SM}')
dummy.GetYaxis().SetTitle('-2 #Delta lnL')
dummy.SetLineColor(0)
dummy.SetLineWidth(0)
dummy.SetFillColor(0)
dummy.SetMinimum(0.0)
dummy.SetMaximum(5.0)
dummy.Draw()
# Draw some lines for 68% CL and 95% CL
latexf = ROOT.TLatex()
latexf.SetTextSize(0.4*c9.GetTopMargin())
latexf.SetTextColor(2)
f1 = ROOT.TF1("f1","1.0",0.0,3.0)
f1.SetLineColor(2)
f1.SetLineWidth(2)
f1.Draw("lsame")
latexf.DrawLatex(2.5, 1.1,"68% CL")
f2 = ROOT.TF1("f2","3.84",0.0,3.0)
f2.SetLineColor(2)
f2.SetLineWidth(2)
f2.Draw("lsame")
latexf.DrawLatex(2.5, 3.94,"95% CL")
# draw expected scan
gr_exp = ROOT.TGraphAsymmErrors(v_r_exp,v_nll_exp,v_zeros,v_zeros,v_zeros,v_zeros)
gr_exp.SetLineColor(1)
gr_exp.SetLineWidth(2)
gr_exp.SetLineStyle(2)
gr_exp.Draw("Lsame")
# draw observed scan
gr_obs = ROOT.TGraphAsymmErrors(v_r_obs,v_nll_obs,v_zeros,v_zeros,v_zeros,v_zeros)
gr_obs.SetLineColor(1)
gr_obs.SetLineWidth(2)
gr_obs.Draw("Lsame")

latex2 = ROOT.TLatex()
latex2.SetNDC()
latex2.SetTextSize(0.5*c9.GetTopMargin())
latex2.SetTextFont(42)
latex2.SetTextAlign(31) # align right
latex2.DrawLatex(0.87, 0.95,"19.6 fb^{-1} (8 TeV)")
latex2.SetTextSize(0.7*c9.GetTopMargin())
latex2.SetTextFont(62)
latex2.SetTextAlign(11) # align left
latex2.DrawLatex(0.20, 0.95, "CMS")
latex2.SetTextSize(0.6*c9.GetTopMargin())
latex2.SetTextFont(52)
latex2.SetTextAlign(11)
latex2.DrawLatex(0.32, 0.95, "Tutorial")

legend = ROOT.TLegend(.60,.14,.90,.26)
legend.AddEntry(gr_obs , "Observed", "l")
legend.AddEntry(gr_exp , "Expected", "l")
legend.SetShadowColor(0)
legend.SetFillColor(0)
legend.SetLineColor(0)
legend.Draw("same")

ROOT.gPad.RedrawAxis()

c9.Draw()
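
The 68% and 95% CL intervals can be read off such a scan as the points where -2 Delta lnL crosses 1.0 and 3.84. The sketch below finds these crossings by linear interpolation between neighbouring scan points, assuming the points are ordered in r. To stay runnable without the MultiDimFit output it uses an invented parabolic scan; in the notebook the same hypothetical helper could be applied to the r_obs and nll_obs arrays filled above.

```python
def crossings(xs, ys, level):
    """x values where the curve through (xs, ys) crosses the given level,
    found by linear interpolation between neighbouring points."""
    found = []
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if (y0 - level) * (y1 - level) < 0:
            found.append(x0 + (level - y0) * (x1 - x0) / (y1 - y0))
    return found

# Hypothetical parabolic scan: minimum at r = 0.83, width 0.39
r_values = [0.05 * i for i in range(61)]
scan = [((r - 0.83) / 0.39) ** 2 for r in r_values]

lo68, hi68 = crossings(r_values, scan, 1.0)
print("68% CL interval: [{:.2f}, {:.2f}]".format(lo68, hi68))
print("95% CL crossings: {}".format(crossings(r_values, scan, 3.84)))
```

A finer scan grid reduces the interpolation error; for a scan that is not parabolic the two sides of the interval are in general asymmetric around the minimum.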




# Channel Compatibility

Let's now compute the best-fit signal strength in each category separately, while keeping all the nuisance parameters fully correlated. To do that, switch to the terminal and run the following command:

You should get an output like this:

In [32]:
os.system('cat ccc.txt')

0


 --- ChannelCompatibilityCheck --- 
Nominal fit  : r =  0.8344  -0.3771/+0.4076
Alternate fit: r =  2.1026  -0.7644/+0.9305   in channel cat0
Alternate fit: r =  0.0519  -0.0511/+0.6898   in channel cat1
Alternate fit: r =  0.2614  -0.2611/+0.7012   in channel cat4
Alternate fit: r =  0.9415  -0.9402/+1.1099   in channel cat5
Chi2-like compatibility variable: 415.939
Done in 0.04 min (cpu), 0.04 min (real)


Here is a script to plot the results:

In [33]:
import ROOT

poi = "r"; rMax = 4
infile="higgsCombineTest.ChannelCompatibilityCheck.mH125.root"

filein = ROOT.TFile(infile,"READ")
fit_nominal = filein.Get("fit_nominal")
fit_alternate = filein.Get("fit_alternate")
rFit = fit_nominal.floatParsFinal().find(poi)

prefix = "_ChannelCompatibilityCheck_"+poi+"_"

nChann = 0
iter = fit_alternate.floatParsFinal().createIterator()

# first pass: count the per-channel signal strength parameters
while True:
  a = iter.Next()
  if a==None: break
  if (prefix in a.GetName()): nChann+=1

frame = ROOT.TH2F("frame",";best fit #sigma/#sigma(SM);",1,rFit.getMin(),min(rFit.getMax(),rMax),nChann,0,nChann)

iter.Reset()
iChann = 0
points = ROOT.TGraphAsymmErrors(nChann)

# second pass: one point per channel, labelled with the channel name
while True:
  a = iter.Next()
  if a==None: break
  if (prefix in a.GetName()):
    # strip the prefix to recover the channel name
    channel = a.GetName().replace(prefix,"")
    points.SetPoint(iChann, a.getVal(), iChann+0.5)
    points.SetPointError(iChann, -a.getAsymErrorLo(), a.getAsymErrorHi(), 0, 0)
    iChann+=1
    frame.GetYaxis().SetBinLabel(iChann, channel)
c8 = ROOT.TCanvas("c8","c8",800,800)
c8.cd()
c8.SetTopMargin(0.07)
c8.SetBottomMargin(0.12)
c8.SetLeftMargin(0.12)

points.SetLineColor(ROOT.kRed)
points.SetLineWidth(3)
points.SetMarkerStyle(21)
frame.GetXaxis().SetTitleSize(0.05)
frame.GetXaxis().SetLabelSize(0.04)
frame.GetYaxis().SetLabelSize(0.06)
frame.Draw()
ROOT.gStyle.SetOptStat(0)
globalFitBand = ROOT.TBox(rFit.getVal()+rFit.getAsymErrorLo(), 0, rFit.getVal()+rFit.getAsymErrorHi(), nChann)
globalFitBand.SetFillColor(65)
globalFitBand.SetLineStyle(0)
globalFitBand.Draw("SAME")
globalFitLine = ROOT.TLine(rFit.getVal(), 0, rFit.getVal(), nChann)
globalFitLine.SetLineWidth(4)
globalFitLine.SetLineColor(214)
globalFitLine.Draw("SAME")
points.Draw("PSAME")

latex2 = ROOT.TLatex()
latex2.SetNDC()
latex2.SetTextSize(0.5*c8.GetTopMargin())                                                                                              
latex2.SetTextFont(42)                                                                                                                
latex2.SetTextAlign(31) # align right                                                                                                 
latex2.DrawLatex(0.87, 0.95,"19.6 fb^{-1} (8 TeV)")                                                                                   
latex2.SetTextSize(0.7*c8.GetTopMargin())                                                                                              
latex2.SetTextFont(62)                                                                                                                
latex2.SetTextAlign(11) # align left
latex2.DrawLatex(0.15, 0.95, "CMS")                                                                                                   
latex2.SetTextSize(0.6*c8.GetTopMargin())                                                                                              
latex2.SetTextFont(52)                                                                                                                
latex2.SetTextAlign(11)                                                                                                               
latex2.DrawLatex(0.27, 0.95, "Tutorial")                                                                                              
                                                                                                                                      
ROOT.gPad.RedrawAxis()                                                                                                                
c8.Draw()                  
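
The two bookkeeping steps in the loop above (turning a fit-parameter name into an axis label, and turning signed asymmetric errors into error-bar lengths) can be checked without ROOT. The following is a minimal sketch: the helper names `strip_prefix` and `to_graph_errors`, the prefix value and the numbers are all hypothetical, chosen only for illustration.

```python
def strip_prefix(name, prefix):
    """Return name without its leading prefix (hypothetical helper).

    Python strings are immutable, so the result must be assigned or returned;
    calling name.replace(...) on its own line has no effect.
    """
    return name[len(prefix):] if name.startswith(prefix) else name


def to_graph_errors(err_lo, err_hi):
    """Convert signed asymmetric errors (err_lo <= 0 <= err_hi) into the
    positive lengths that a graph with asymmetric error bars expects."""
    return abs(err_lo), abs(err_hi)


# Illustrative values only: the real prefix is defined earlier in the notebook.
prefix = "example_prefix_"
label = strip_prefix(prefix + "hgg", prefix)
exl, exh = to_graph_errors(-0.4, 0.6)
print(label, exl, exh)
```

The sketch uses `startswith` rather than the `in` test from the loop, so only a leading prefix is removed; whether that stricter match is appropriate depends on how the fit parameters are named.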